[task #7896] Support zero-copy drivers

[task #7896] Support zero-copy drivers

Simon Goldschmidt

URL:
  <http://savannah.nongnu.org/task/?7896>

                 Summary: Support zero-copy drivers
                 Project: lwIP - A Lightweight TCP/IP stack
            Submitted by: jifl
            Submitted on: Thursday 03/27/2008 at 22:28
                Category: None
         Should Start On: Thursday 03/27/2008 at 00:00
   Should be Finished on: Friday 06/27/2008 at 00:00
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
        Percent Complete: 0%
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: 1.4.0
                  Effort: 0.00

    _______________________________________________________

Details:

This is an off-shoot from task #6735 which is now invalid. Here's what I
wrote there:


There's a lot to do to support zero-copy transmits and receives. For
receives, there are usually requirements for things such as using specific
memory regions (PCI windows, or dedicated DMA regions, or on-chip RAM, etc.),
as well as alignment and buffer size constraints. One port I've already done
with my own zero-copy rx mods is to AT91SAM7X which uses fixed 128-byte
buffers, thus requiring fixed 128-byte pool pbufs. Fortunately I could still
keep the existing pool structure, but that won't be true on other systems - it
will need to be possible to separate the struct pbuf from the following pool
memory itself. The simplest way I can think of is to keep the struct pbuf and
the pbuf memory itself in parallel arrays. So if you have the pbuf start, then
since the pbufs are of constant size and you know the base, you can easily
calculate the index of this pbuf, and therefore the index of the corresponding
struct pbuf.
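
A minimal sketch of the parallel-array idea above, in C. Everything here is an assumption for illustration: `struct rx_pbuf` is a cut-down stand-in rather than lwIP's real `struct pbuf`, and the names `rx_payload`, `rx_desc`, `RX_BUF_SIZE` and `RX_BUF_COUNT` are made up. It only shows the index arithmetic, with the payload array kept separate so it could be placed in a DMA region by the linker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE  128u  /* fixed buffer size, as on the AT91SAM7X */
#define RX_BUF_COUNT 8u    /* illustrative pool depth (assumption) */

/* Hypothetical, trimmed-down descriptor; NOT lwIP's real struct pbuf. */
struct rx_pbuf {
    void    *payload;
    uint16_t len;
    uint8_t  in_use;
};

/* Payload memory stands alone, so nothing else is mixed into it. */
static uint8_t rx_payload[RX_BUF_COUNT][RX_BUF_SIZE];
/* Descriptors live in a parallel array, e.g. in ordinary RAM. */
static struct rx_pbuf rx_desc[RX_BUF_COUNT];

/* Map a pointer anywhere inside a payload buffer to its descriptor.
 * Returns NULL if the pointer is not inside the payload pool. */
static struct rx_pbuf *rx_desc_from_payload(const void *p)
{
    uintptr_t base = (uintptr_t)&rx_payload[0][0];
    uintptr_t addr = (uintptr_t)p;

    if (addr < base || addr - base >= sizeof(rx_payload)) {
        return NULL;
    }
    /* Buffers have constant size, so the index is a plain division. */
    return &rx_desc[(addr - base) / RX_BUF_SIZE];
}

/* Map a descriptor back to the start of its payload buffer. */
static void *rx_payload_from_desc(const struct rx_pbuf *d)
{
    size_t idx = (size_t)(d - rx_desc);

    assert(idx < RX_BUF_COUNT);
    return rx_payload[idx];
}
```

With this layout the descriptor array never shares memory with packet data, at the cost of one division per lookup (a shift when the buffer size is a power of two).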

Another requirement is that drivers need to be informed when more of that
pool memory becomes available, as often the hardware has its own circular
buffer lists, and these will need to be marked as available again, once a pool
pbuf has been freed.
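
One way the "tell the driver when a pool buffer is free again" requirement could look, as a hedged sketch. The pool, the callback type `pool_avail_fn` and the `fake_mac` structure are all hypothetical; lwIP's actual pool code is not shown or implied here.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4u  /* illustrative pool depth (assumption) */

/* Hypothetical callback: invoked whenever buffer 'idx' becomes free. */
typedef void (*pool_avail_fn)(void *ctx, unsigned idx);

static unsigned char pool_used[POOL_SIZE];
static pool_avail_fn pool_on_avail;
static void *pool_on_avail_ctx;

/* The driver registers its interest in freed buffers once at init. */
static void pool_set_avail_callback(pool_avail_fn fn, void *ctx)
{
    pool_on_avail = fn;
    pool_on_avail_ctx = ctx;
}

/* Returns the index of a free buffer, or -1 if the pool is empty. */
static int pool_alloc(void)
{
    unsigned i;

    for (i = 0; i < POOL_SIZE; i++) {
        if (!pool_used[i]) {
            pool_used[i] = 1;
            return (int)i;
        }
    }
    return -1;
}

/* Free a buffer and let the driver re-arm its hardware descriptor. */
static void pool_free(unsigned idx)
{
    assert(idx < POOL_SIZE && pool_used[idx]);
    pool_used[idx] = 0;
    if (pool_on_avail != NULL) {
        pool_on_avail(pool_on_avail_ctx, idx);
    }
}

/* Stand-in for a MAC's circular RX descriptor list. */
struct fake_mac {
    unsigned char hw_owns[POOL_SIZE];
    unsigned refills;
};

/* Driver-side handler: mark the descriptor as available to hardware. */
static void fake_mac_refill(void *ctx, unsigned idx)
{
    struct fake_mac *mac = ctx;

    mac->hw_owns[idx] = 1;
    mac->refills++;
}
```

A single global callback matches the "one ethernet driver at a time" limitation; several drivers would need per-pool or per-buffer ownership information.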

So far I only have a solution for one ethernet driver at a time. Implementing
this for multiple ones (efficiently) is more tricky since of course it's
possible for drivers to have different requirements. The easiest solution is
probably to say they can't do it then.






    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/task/?7896>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/



_______________________________________________
lwip-devel mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-devel

[task #7896] Support zero-copy drivers

Simon Goldschmidt

Follow-up Comment #1, task #7896 (project lwip):

Ok, seems like this is one of the last tasks left before releasing 1.4.0
(together with bug #29361, which is kind of related to this one).

I think there is not much coding involved here but we rather need to know
what we need to support zero copy. For that, we would need input from many
developers/users telling us what they need. Personally, the only requirements
for my implementation are:

- delayed freeing of TX pbufs (i.e. transmit might take some time / pbuf is
*not* transmitted when netif->linkoutput returns - this is already taken care
of by adjusting the docs to say "applications may not reuse pbufs after
passing them to send/output functions")
- RX pbufs should be properly aligned (depending on the cache line size -
this currently doesn't work as there is only one MEM_ALIGNMENT)

As Jonathan pointed out some more requirements below, I think the most
portable solution would be to allow the driver to allocate pbufs in its own
memory and to let pbuf_free() call a port-defined function to free those pbufs
- we have plenty of numbers left in the 'enum pbuf_type'.
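
A rough sketch of that idea: an extra buffer type whose release is delegated to a port-defined function. The `xpbuf` names are invented for this example and do not claim to match lwIP's real `struct pbuf`, `enum pbuf_type` or `pbuf_free()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer types; XPBUF_DRIVER is the "new enum value". */
enum xpbuf_type { XPBUF_HEAP, XPBUF_DRIVER };

struct xpbuf {
    enum xpbuf_type type;
    unsigned ref;                          /* reference count */
    void *payload;                         /* may point into DMA memory */
    void (*driver_free)(struct xpbuf *p);  /* only for XPBUF_DRIVER */
};

/* Drop one reference; on the last one, release via the right owner. */
static void xpbuf_free(struct xpbuf *p)
{
    assert(p != NULL && p->ref > 0);
    if (--p->ref > 0) {
        return;  /* still referenced elsewhere in the stack */
    }
    if (p->type == XPBUF_DRIVER) {
        assert(p->driver_free != NULL);
        p->driver_free(p);  /* the port decides how to recycle it */
    } else {
        free(p);            /* stack-owned buffer */
    }
}

/* Example port hook: counts calls and detaches the payload. */
static unsigned driver_free_calls;

static void demo_driver_free(struct xpbuf *p)
{
    p->payload = NULL;
    driver_free_calls++;
}
```

Because the hook runs only when the last reference is dropped, it also covers the delayed-free case for TX: the driver keeps a reference until the hardware is done with the buffer.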

This would also help to solve bug #29361. However, the downside is that the
whole stack would have to be reviewed in order to remove assumptions made on
different pbuf types (e.g. pbuf_header fails on PBUF_REF in some cases).


[task #7896] Support zero-copy drivers

Simon Goldschmidt

Follow-up Comment #2, task #7896 (project lwip):

I am wondering how zero-copy transmission could be implemented with the Coldfire
v2 Ethernet controller. The limitations of its DMA are that transmit buffers
must be 32-bit aligned and each buffer's size must be a multiple of 16 bytes.
The latter limitation is strange since in the very same microcontroller manual
they suggest to allocate one buffer for IP header, another for TCP header and
so on, but those headers' sizes are not multiples of 16 bytes...
Anyway, with those limitations I cannot see how zero-copy transmission could
work without imposing severe restrictions on how the application supplies data
to lwip. Basically, it would not be practical.
Am I right in this?


Re: [task #7896] Support zero-copy drivers

goldsimon@gmx.de
Mike Kleshov wrote:
> I am wondering how zero-copy transmission could be implemented with the Coldfire
> v2 Ethernet controller. The limitations of its DMA are that transmit buffers
> must be 32-bit aligned and each buffer's size must be a multiple of 16 bytes.
> The latter limitation is strange since in the very same microcontroller manual
> they suggest to allocate one buffer for IP header, another for TCP header and
> so on, but those headers' sizes are not multiples of 16 bytes...
>    
That looks strange, indeed. I've had a quick look at the PDF
provided at www.freescale.com and I couldn't find a note about the size
having to be a multiple of 16 bytes. Instead, I found that the
TX-buffers have to be 4-byte-aligned while the RX-buffers have to be
16-byte aligned. This brings us to the fact that there is at least a
different alignment requirement for TX- or RX-buffers than for the rest
of the stack.
> Anyway, with those limitations I cannot see how zero-copy transmission could
> work without imposing severe restrictions on how the application supplies data
> to lwip. Basically, it would not be practical.
> Am I right in this?
>    
Zero-copy from the application to the wire would indeed not be practical
unless you would write the application to generate data into a given
pbuf queue and in your port make sure the pbuf queue is allocated in a
way that fits your MAC. However, zero-copy from lwIP's internal TCP
buffers to the wire could still be worth trying.

Simon


Re: Re: [task #7896] Support zero-copy drivers

Mike Kleshov-2
On 22 May 2010 23:25, [hidden email] <[hidden email]> wrote:
> That looks strange, indeed. I've had a quick look at the PDF provided at
> www.freescale.com and I couldn't find a note about the size having to be a
> multiple of 16 bytes. Instead, I found that the TX-buffers have to be
> 4-byte-aligned while the RX-buffers have to be 16-byte aligned. This brings
> us to the fact that there is at least a different alignment requirement for
> TX- or RX-buffers than for the rest of the stack.

I just had another look. It gets more confusing. I have designs based
on MCF5223x as well as MCF5225x. The manual for MCF5223x says that
only the bits [15..5] of the length field of the transmit buffer
descriptor are used by the DMA, while bits [4..0] are ignored. That
led me to believe that the length must always be a multiple of 32 (not
16, as I said initially.) But the manual for MCF5225x does not have
this restriction.
So there are 2 likely explanations:
1) The Ethernet controllers in those MCUs are in fact different, and
the one in MCF5225x is more flexible since it doesn't impose the
multiple-of-32-bytes restriction on transmit buffers.
2) The Ethernet controllers in those MCUs are identical, and there is
an error in the manual for MCF5223x. That would explain the suggestion
of using separate transmit buffers for IP, TCP and Ethernet headers.
I am leaning towards explanation 2.
As soon as I have a chance, I'll test this in hardware. I'll also
submit a service request to Freescale.
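
To make the arithmetic concrete: if the MCF5223x manual were taken literally and the DMA dropped bits [4..0] of the 16-bit length field, the length actually used would be rounded down to a multiple of 32. The helper names below are made up for this sketch and say nothing about how the real hardware behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Length the DMA would use if only bits [15..5] were honoured. */
static uint16_t txbd_effective_len(uint16_t len)
{
    return (uint16_t)(len & 0xFFE0u);  /* clear bits [4..0] */
}

/* Non-zero if 'len' survives that truncation unchanged. */
static int txbd_len_ok(uint16_t len)
{
    return (len & 0x001Fu) == 0;
}
```

For example, a 54-byte Ethernet + IP + TCP header buffer would be cut to 32 bytes, which is why the manual's own advice to use separate header buffers looks inconsistent with the stated restriction.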


Re: Re: [task #7896] Support zero-copy drivers

vespaman
On Saturday 22 May 2010 23.44.14 Mike Kleshov wrote:
> I just had another look. It gets more confusing. I have designs based
> on MCF5223x as well as MCF5225x. The manual for MCF5223x says that
> only the bits [15..5] of the length field of the transmit buffer
> descriptor are used by the DMA, while bits [4..0] are ignored.

While I haven't used the 5223x in any design (yet), this sounds like a typo.
Normally the buffer ring start address pointer needs to be aligned, but not
the length field in the actual transmit descriptors. Or maybe it is an erratum
that was never fixed.

The FEC is very similar in the devices I have been using so far (mainly 5282
and 52259), but of course there may be minor changes that I have missed.
The FEC on the first coldfire with ethernet MAC, the 5272, was a different
beast though.
On the 5282 I did zero-copy DMA receive but left out zero-copy transmit for
another stack (interniche), since it was much trickier, as the buffer
management of (that version of) interniche did not lend itself to this
easily.

And while I did not do tx, I have no recollection of the size being rounded to
any particular boundary, and this was on the 5282, which I think is the second
Coldfire V2 device with embedded FEC.

> As soon as I have a chance, I'll test this in hardware. I'll also
> submit a service request to Freescale.

Keep us updated!
 
 Micael


RE: [task #7896] Support zero-copy drivers

Bill Auerbach
In reply to this post by Simon Goldschmidt
I solved this in the Ethernet driver.  I used a chained DMA.  I copied 8 bytes to a stack-based array (which is aligned) and set up DMA to copy for 4 plus what it takes to align the second segment.  So it's not 0 copy, but takes only 4 instructions to do the copy to the array.  Then I set up the second DMA transfer to start at the aligned address in the data for count minus what is in the first DMA segment.  For a 1k or larger transmit, this was far more efficient than copying the whole packet to an aligned stack-based array (which is what the driver I started with was doing).  I think it would still be more efficient if you don't have chained DMA if you had to do this in 2 separate transfers to the MAC.
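
A generalized sketch of the split described above, not the actual driver code. It assumes the MAC accepts one frame as two DMA segments and only constrains each segment's start address; the exact byte counts from the description (8 and 4) are not reproduced, and `tx_split`, `struct dma_seg` and `DMA_ALIGN` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DMA_ALIGN 4u  /* assumed start-address alignment for DMA */

struct dma_seg {
    const void *addr;
    size_t len;
};

/* Split 'data' into at most two DMA segments with aligned starts.
 * 'bounce' must be DMA-aligned and hold at least DMA_ALIGN - 1 bytes.
 * Returns the number of segments written to 'seg' (1 or 2). */
static int tx_split(const uint8_t *data, size_t len,
                    uint8_t *bounce, struct dma_seg seg[2])
{
    /* Bytes from 'data' up to the next aligned address. */
    size_t head = (DMA_ALIGN - ((uintptr_t)data & (DMA_ALIGN - 1u)))
                  & (DMA_ALIGN - 1u);

    assert(((uintptr_t)bounce & (DMA_ALIGN - 1u)) == 0);
    if (head > len) {
        head = len;
    }
    if (head == 0) {
        seg[0].addr = data;  /* already aligned: true zero-copy */
        seg[0].len = len;
        return 1;
    }
    memcpy(bounce, data, head);  /* copy only the few leading bytes */
    seg[0].addr = bounce;
    seg[0].len = head;
    if (head == len) {
        return 1;
    }
    seg[1].addr = data + head;   /* the rest starts on an aligned address */
    seg[1].len = len - head;
    return 2;
}
```

At most `DMA_ALIGN - 1` bytes are copied per frame, so the cost is independent of the packet size.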

Bill

>Follow-up Comment #2, task #7896 (project lwip):
>Anyway, with those limitations I cannot see how zero-copy transmission
>could work without imposing severe restrictions on how the application
>supplies data to lwip. Basically, it would not be practical.
>Am I right in this?




Re: Re: [task #7896] Support zero-copy drivers

Mike Kleshov-2
In reply to this post by vespaman
>> As soon as I have a chance, I'll test this in hardware. I'll also
>> submit a service request to Freescale.
>
> Keep us updated!
>
>  Micael

Well, I haven't done any testing in hardware yet. But I have got a
reply from Freescale support. Basically, they are saying that this
must be a mistake in the manual. They don't know where it came from,
probably some unintended carry-over from the transmit buffer alignment
requirement. They also say that they didn't modify any ethernet
drivers for the MCF5223x. The fact that the drivers still work is
strong evidence in support of the doc bug theory.
This is good enough for me. I don't think I'll be testing this in hardware...

- mike


Re: [task #7896] Support zero-copy drivers

Jonathan Larmour
In reply to this post by Bill Auerbach
On 24/05/10 13:44, Bill Auerbach wrote:
> I solved this in the Ethernet driver.  I used a chained DMA.  I copied
> 8 bytes to a stack-based array (which is aligned) and set up DMA to
> copy for 4 plus what it takes to align the second segment.  So it's not
> 0 copy, but takes only 4 instructions to do the copy to the array.
> Then I set up the second DMA transfer to start at the aligned address
> in the data for count minus what is in the first DMA segment.  For a 1k
> or larger transmit, this was far more efficient than copying the whole
> packet to an aligned stack-based array (which is what the driver I
> started with was doing).

I'm coming to this late because I've not been in a position to work on
lwIP for over a year. But, out of interest, I have already implemented
zero-copy for coldfire m68k and lwIP, though not in a sufficiently generic way.

> I think it would still be more efficient if
> you don't have chained DMA if you had to do this in 2 separate
> transfers to the MAC.

Not all MACs would be able to do that.

>> Follow-up Comment #2, task #7896 (project lwip): Anyway, with those
>> limitations I cannot see how zero-copy transmission could work
>> without imposing severe restrictions on how the application supplies
>> data to lwip. Basically, it would not be practical. Am I right in
>> this?

It's not true that it isn't practical. It's true that if an application
sends non-aligned data, a lower layer (perhaps the driver) will have to
fall back to copying it before transmission.

What can be done is to change how pbufs are allocated so that they _do_
fit the alignment requirements of the MAC. So when the user (via raw API
or netconn API) allocates space for their data, it will already be copied
in correctly (although I'm glossing over a few issues to do with
scatter-gather even then, and then what if you have multiple devices with
different constraints). We can also provide a means for the user to get
the alignment constraints from the stack if they want zero-copy.

It's pretty much intrinsic with the BSD sockets API that it can almost
never be zero-copy. That is one of its limitations and why netconn/raw
APIs can be superior (unless you're using e.g. netbuf_ref() ).

I note that there was also discussion about the alignment requirements of
the MAC. But that is not the only issue - with DMA in use, you also have
to consider the alignment requirements of the processor data cache (for
those processors with data cache anyway, which is more likely if they're
high-end enough to have DMA, but isn't true for all coldfires e.g. 5272).
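
A small sketch of how the two constraints might be combined when sizing DMA buffers: take the stricter of the MAC and cache-line alignments, and pad the buffer so that it covers whole cache lines (otherwise a cache invalidate or flush for the buffer can touch neighbouring data). The constants and helper names are assumptions for the example, not values from any particular part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 16u  /* assumed data cache line size */
#define MAC_DMA_ALIGN    4u  /* assumed MAC alignment requirement */

/* Round v up to the next multiple of a (a must be a power of two). */
static uintptr_t align_up(uintptr_t v, uintptr_t a)
{
    assert(a != 0 && (a & (a - 1u)) == 0);
    return (v + a - 1u) & ~(a - 1u);
}

/* The stricter of the two constraints wins. */
static uintptr_t dma_buf_align(void)
{
    return CACHE_LINE_SIZE > MAC_DMA_ALIGN ? CACHE_LINE_SIZE
                                           : MAC_DMA_ALIGN;
}

/* Buffer size padded so the buffer ends on a cache line boundary. */
static size_t dma_buf_size(size_t payload_len)
{
    return (size_t)align_up((uintptr_t)payload_len, dma_buf_align());
}

/* Non-zero if both start and size respect the combined alignment. */
static int dma_buf_ok(const void *start, size_t size)
{
    uintptr_t mask = dma_buf_align() - 1u;

    return (((uintptr_t)start | (uintptr_t)size) & mask) == 0;
}
```

With these assumed values, a 1518-byte frame buffer would be padded to 1520 bytes.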

Jifl
--
eCosCentric Limited      http://www.eCosCentric.com/     The eCos experts
Barnwell House, Barnwell Drive, Cambridge, UK.       Tel: +44 1223 245571
Registered in England and Wales: Reg No 4422071.
------["Si fractum non sit, noli id reficere"]------       Opinions==mine


Re: [task #7896] Support zero-copy drivers

goldsimon@gmx.de
Wow, you're alive, hehe. Nice hearing from you!


Jonathan Larmour wrote:
>> I think it would still be more efficient if
>> you don't have chained DMA if you had to do this in 2 separate
>> transfers to the MAC.
>>      
> Not all MACs would be able to do that.
>    
For example the last FPGA-based MAC I used only allowed single-buffer
ethernet frames that are aligned on a 32-bit boundary.

So reading the rest of the mail, is this (task #7896) all about
alignment only? Because if so, I wouldn't want to let that hold back
1.4.0 if it's only new code, not changing the API.

The only thing I can add is that for my other MAC, the only limitation
is alignment, too. There might be other things though (like having all
pbufs of a specific maximum length for RX), but I guess these could be
also solved by allocating pbufs correctly.

Simon


Re: [task #7896] Support zero-copy drivers

Jonathan Larmour
On 17/06/10 21:10, [hidden email] wrote:
> Wow, you're alive, hehe. Nice hearing from you!

I had a year off, and now my time is directed by the way the wind blows
from my work. Which may involve improving lwIP... but as ever time is money.

> Jonathan Larmour wrote:
>>> I think it would still be more efficient if
>>> you don't have chained DMA if you had to do this in 2 separate
>>> transfers to the MAC.
>>>      
>> Not all MACs would be able to do that.
>>    
> For example the last FPGA-based MAC I used only allowed single-buffer
> ethernet frames that are aligned on a 32-bit boundary.
>
> So reading the rest of the mail, is this (task #7896) all about
> alignment only? Because if so, I wouldn't want to let that hold back
> 1.4.0 if it's only new code, not changing the API.

I would expect supporting zero-copy to involve some API changes (or at
least extensions) and some major surgery to the packet buffer code. But if
you're talking about a 1.4.0 release in the near future, I don't think
there's much chance of this being done and dusted by then, so if I were
you I wouldn't let it block a release. Just my opinion :).

> The only thing I can add is that for my other MAC, the only limitation
> is alignment, too. There might be other things though (like having all
> pbufs of a specific maximum length for RX), but I guess these could be
> also solved by allocating pbufs correctly.

More widely, there can be other issues beyond alignment, such as the
location of the payload contents - it may need to be within a PCI window
in memory for example. Or it may need to reside in one particular bank of
memory[1]. Again this points to changing the way pbufs are allocated.

Also in some cases we can't put arbitrary data amongst the packet data. So
we would have to allow pool pbufs where the struct pbuf _doesn't_ precede
the pbuf payload. That bit isn't hard at least.

There's the question of multiple devices - different MACs could have
different requirements. Covering all bases may be tricky, certainly given
a straightforward pbuf_alloc with no indication which MAC will eventually
be used with it. I don't know if we could come up with a usage model which
could do it. We may have to say we can't guarantee zero copy with multiple
different devices; and that isn't very common at least. We may have to
have per-device pbuf pools.

In all cases, drivers would probably have to be written so that they
didn't _require_ zero-copy, but would be able to fall back to copying if
their various requirements weren't met.
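
A hedged sketch of such a fallback on the transmit side: use the caller's buffer directly when it meets the (assumed) alignment requirement, otherwise copy it into a driver-owned, aligned bounce buffer. `tx_prepare`, `TX_ALIGN`, `TX_MAX` and the copy counter are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TX_ALIGN 4u     /* assumed MAC alignment requirement */
#define TX_MAX   1536u  /* assumed maximum frame buffer size */

/* uint32_t storage gives the bounce buffer 4-byte alignment. */
static uint32_t tx_bounce_words[TX_MAX / 4u];
static unsigned tx_copies;  /* how often the slow path was taken */

/* Returns the pointer to hand to the MAC, or NULL if the frame is
 * too large. Zero-copy when 'data' is aligned, copy otherwise. */
static const void *tx_prepare(const void *data, size_t len)
{
    if (len > TX_MAX) {
        return NULL;
    }
    if (((uintptr_t)data & (TX_ALIGN - 1u)) == 0) {
        return data;  /* requirements met: no copy needed */
    }
    memcpy(tx_bounce_words, data, len);  /* fall back to copying */
    tx_copies++;
    return tx_bounce_words;
}
```

A real driver would check every requirement it has (memory region, size granularity, number of segments), and would need one bounce buffer per in-flight frame instead of the single static buffer used here.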

Jifl
[1] The NXP LPC24xx series is an example, certainly the LPC2468. It has
16 KB of directly addressable on-chip ethernet RAM. It _can_ also address
normal external off-chip RAM, but that's much slower, and in fact we found
it caused packets to get dropped readily.


[task #7896] Support zero-copy drivers

Simon Goldschmidt
In reply to this post by Simon Goldschmidt

Update of task #7896 (project lwip):

         Planned Release:                   1.4.0 => 1.4.1                  

    _______________________________________________________

Follow-up Comment #3:

I'm setting this to 1.4.1 (if not later): we need people with bug reports who
can tell us in which way they need this fixed. For me, it just works (after
the changes to IP_FRAG and to the docs that say applications may not reuse
pbufs).


Re: [task #7896] Support zero-copy drivers

Kieran Mansley
In reply to this post by Jonathan Larmour
On Fri, 2010-06-18 at 01:36 +0100, Jonathan Larmour wrote:
> if
> you're talking about a 1.4.0 release in the near future, I don't think
> there's much chance of this being done and dusted by then, so if I
> were
> you I wouldn't let it block a release. Just my opinion :).

I agree.  I'm going to go through the current bugs/tasks/patches today
(I hope) and see what's remaining for 1.4.0

Kieran



[task #7896] Support zero-copy drivers

Simon Goldschmidt
In reply to this post by Simon Goldschmidt
Follow-up Comment #4, task #7896 (project lwip):

Just a link to 2 mails on the mailing list from users using DMA:

http://lists.gnu.org/archive/html/lwip-users/2003-03/msg00085.html

http://lists.gnu.org/archive/html/lwip-devel/2011-09/msg00057.html (lower half
of the mail)


[task #7896] Support zero-copy drivers

Simon Goldschmidt
Update of task #7896 (project lwip):

         Planned Release:                   1.4.1 => None                  

    _______________________________________________________

Follow-up Comment #5:

This isn't needed for 1.4.1


[task #7896] Support zero-copy drivers

Simon Goldschmidt
Update of task #7896 (project lwip):

                Category:                    None => Network drivers        



[task #7896] Support zero-copy drivers

Simon Goldschmidt
Follow-up Comment #6, task #7896 (project lwip):

And another user mail requesting support for this:
https://lists.gnu.org/archive/html/lwip-users/2014-02/msg00003.html



[task #7896] Support zero-copy drivers

Simon Goldschmidt
Follow-up Comment #7, task #7896 (project lwip):

We're actually working on this. It turned out that we still need to:
- fully support a REF type pbuf for RX (fails at least on IPv6 reassembly, I
think)
- DMA-align TX buffers (the resulting payload pointers that get sent
on the link). This might be more tricky as it would mean the 'header' of a
pbuf is a different one for UDP/TCP and IPv4/IPv6...


[task #7896] Support zero-copy drivers

Simon Goldschmidt
Follow-up Comment #8, task #7896 (project lwip):

Using PBUF_REF for RX should now work.

DMA-alignment of TX buffers remains to be done...


[task #7896] Support zero-copy drivers

Simon Goldschmidt
Update of task #7896 (project lwip):

        Percent Complete:                      0% => 50%                    

