mem_malloc(): memory fragmentation

mem_malloc(): memory fragmentation

Goldschmidt Simon
Hi,
 
I'm currently working on an embedded product which should use the lwIP stack. Since the application should be running for a _very_ long time, the memory allocation used by mem_malloc() is not really appropriate, since it seems to use a normal malloc()-like heap.
 
My question is: has anyone ever bothered to avoid using a heap and to use pools instead?
 
I would favor solving this problem by 'fixing' savannah bug #3031 submitted by Leon Woestenberg, which basically proposes getting rid of PBUF_RAM and using pools instead. As other modules (dhcp/snmp/loopif) also use mem_malloc(), maybe a better solution would be to implement mem_malloc() on top of different pools and leave the PBUF_RAM implementation as it is, since it would then be allocated from pools anyway.
 
The downside to this is that more RAM is needed, since only three or four pools with different block sizes would be created; but this way you can calculate the memory needs based on application throughput and e.g. TCP_WND, which you can't do if your memory gets fragmented...
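As a purely illustrative back-of-the-envelope sketch (all numbers and macro names below are made up, they are not real lwIP options), with pools the worst case can be written down at compile time:

  /* illustrative only - pick your own numbers */
  #define EX_TCP_MSS          1460
  #define EX_TCP_WND          (4 * EX_TCP_MSS)  /* 5840 bytes of window per connection */
  #define EX_MAX_CONNECTIONS  4
  #define EX_POOL_BLOCK_SIZE  512               /* payload bytes per pool element */

  /* pool elements needed to buffer one full receive window, rounded up */
  #define EX_BLOCKS_PER_WND \
      ((EX_TCP_WND + EX_POOL_BLOCK_SIZE - 1) / EX_POOL_BLOCK_SIZE)    /* = 12 */

  /* worst-case RX pool size, known at compile time: 4 * 12 = 48 elements */
  #define EX_RX_POOL_ELEMENTS (EX_MAX_CONNECTIONS * EX_BLOCKS_PER_WND)

With a heap, fragmentation makes any such bound unreliable.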
 
Any comments?
 
Simon


Re: mem_malloc(): memory fragmentation

Jonathan Larmour
Goldschmidt Simon wrote:

> Hi,
>  
> I'm currently working on an embedded product which should use the
> lwIP-stack. Since the application should be running for _very_ long
> time, the memory allocation used by mem_malloc() is not really
> appropriate since it seems to be using a normal malloc()-like heap.
>  
> My question is: has anyone ever bothered to somehow avoid using a heap
> and using pools instead?
>  
> I would favor solving this problem by 'fixing' savannah bug #3031
> submitted by Leon Woestenberg, which basically proposes getting rid of
> PBUF_RAM and using pools instead. As other modules use mem_malloc() also
> (dhcp/snmp/loopif),

DHCP and loopif at least hardly use any space.

> maybe a better solution might be to implement
> mem_malloc() as different pools and leave the PBUF_RAM implementation
> since it would be allocated from pools then.

Note that despite what's implied in that bug, IMHO you can't actually let
it be the current pbuf_alloc(..., PBUF_POOL), otherwise if you use up all
the pbufs with TX data, you won't have room for any RX packets, including
the TCP ACK packets which will allow you to free some of your TX packets.
So either RX and TX packets should be allocated from different pools, or
there should be a low water mark on the pool for TX allocations in order to
reserve a minimum number of packets for RX.
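Something like this minimal sketch is what I mean by the low water mark (pool_free_count() and the numbers are hypothetical, not existing lwIP API):

  #include "lwip/pbuf.h"

  #define TX_LOW_WATER 8   /* keep at least this many pool pbufs free for RX */

  /* hypothetical helper: number of elements currently free in the pbuf pool */
  extern u16_t pool_free_count(void);

  struct pbuf *tx_pbuf_alloc(u16_t len)
  {
    if (pool_free_count() <= TX_LOW_WATER) {
      return NULL;   /* refuse TX; RX packets (and TCP ACKs) can still get pbufs */
    }
    return pbuf_alloc(PBUF_RAW, len, PBUF_POOL);
  }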

> The downside to this is that more RAM is needed, since only three or
> four pools with different block sizes would be created, but this way,
> you can calculate the memory needs based on application throughput and
> e.g. TCP_WND, what you can't if your memory gets fragmented...
>  
> Any comments?

I agree that if there were just a set of fixed size pbuf pools of various
sizes, it could waste a lot of memory.

One good solution if using 2^n sized pools is to use a buddy allocator[1]
to divide up larger contiguous space, so it may not be as wasteful as you
think. One difference from a normal buddy allocator is that a normal one
would, for example, return a 2Kbyte buffer if you request 1025 bytes. An
lwIP implementation could instead go for maximum efficiency and allocate
that as a 1024 byte buffer plus a 64 byte buffer (or whatever the lowest
granularity would be) chained together.
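As a rough sketch of the splitting (the granularity and helper names are made up, this is not lwIP code):

  #include <stddef.h>

  #define MIN_CHUNK 64   /* assumed lowest granularity */

  /* largest power-of-two chunk (>= MIN_CHUNK) not exceeding 'remaining' */
  static size_t next_chunk(size_t remaining)
  {
      size_t chunk = MIN_CHUNK;
      while ((chunk << 1) <= remaining) {
          chunk <<= 1;
      }
      return chunk;
  }

  /* Split a request into chunk sizes, e.g. 1025 bytes -> 1024 + 64 (the
     remainder rounded up to the minimum granularity).  Each chunk would
     come from the buddy allocator and the pieces be chained as pbufs. */
  static size_t plan_chunks(size_t len, size_t out[], size_t max_chunks)
  {
      size_t n = 0;
      while (len > 0 && n < max_chunks) {
          size_t c = (len < MIN_CHUNK) ? MIN_CHUNK : next_chunk(len);
          out[n++] = c;
          len = (len > c) ? (len - c) : 0;
      }
      return n;
  }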

But all this would be a non-trivial bit of coding so I'm sure people would
be grateful if you have time to do it (unfortunately I don't as I have a
lot of other things to address in my own lwIP work).

I could also believe the result will use a fair bit more code space than
the present mem_malloc.

Jifl
[1] Just in case: http://en.wikipedia.org/wiki/Buddy_memory_allocation
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine




Re: mem_malloc(): memory fragmentation

Kieran Mansley
On Mon, 2006-10-23 at 13:42 +0100, Jonathan Larmour wrote:

> Goldschmidt Simon wrote:
> > maybe a better solution might be to implement
> > mem_malloc() as different pools and leave the PBUF_RAM implementation
> > since it would be allocated from pools then.
>
> Note that despite what's implied in that bug, IMHO you can't actually let
> it be the current pbuf_alloc(..., PBUF_POOL), otherwise if you use up all
> the pbufs with TX data, you won't have room for any RX packets, including
> the TCP ACK packets which will allow you to free some of your TX packets.
> So either RX and TX packets should be allocated from different pools, or
> there should be a low water mark on the pool for TX allocations in order to
> reserve a minimum number of packets for RX.

The latter would be my preference, as it is much more efficient on memory
usage where you have unidirectional traffic (which, to a first
approximation, is quite common for bulk transfers).

> One good solution if using 2^n sized pools is to use a buddy allocator[1]
> to divide up larger contiguous space, so it may not be as wasteful as you
> think. One difference with a normal buddy allocator is that a normal one
> would normally e.g. return a 2Kbyte buffer if you request 1025 bytes. An
> lwIP implementation could work for maximum efficiency instead and allocate
> that as a 1024 byte buffer plus a 64 byte buffer (or whatever the lowest
> granularity would be) chained together.

I think that would definitely be the only way such a change would be
acceptable given that lwIP tries to have a low memory footprint, and
indeed it's the only way I'd even thought of it being done.

> But all this would be a non-trivial bit of coding so I'm sure people would
> be grateful if you have time to do it

Exactly. These days, changes to lwIP generally only get included when
someone who uses it takes the time to write the code - the maintainers
do their best to maintain it and fix bugs, but a rework such as this is
unlikely to ever reach the top of their list of things to do.

> I could also believe the result will use more a fair bit more code space
> than the present mem_malloc.

Certainly true.

Kieran




Re: mem_malloc(): memory fragmentation

Christiaan Simons
In reply to this post by Goldschmidt Simon

Simon,

> My question is: has anyone ever bothered to somehow avoid using a
> heap and using pools instead?

I can remember being a bit conservative about using mem_malloc for
24/7 use myself. Most literature about RT systems is quite biased against
malloc, for many good reasons. However, you shouldn't be too frightened of
mem_malloc fragmentation unless you have a malloc which performs
very poorly in this respect.

We had this discussion before on this list, and frankly I'm not the least
bit afraid of using malloc for non/soft RT, 24/7 system stuff now. You'll
certainly run into other troubles if you go to great lengths to avoid it.

Let's quote (and learn from) McKusick and Karels (from the BSD kernel
malloc paper):

"A generalized memory allocator is needed to reduce the complexity of
writing code inside the kernel.
Rather than providing many semi-specialized ways of allocating memory, the
kernel should provide a single
general purpose allocator. With only a single interface, programmers do not
need to figure out the most
appropriate way to allocate memory. If a good general purpose allocator is
available, it helps avoid the syndrome
of creating yet another special purpose allocator."

Search'n'replace "kernel" with "lwip" and you get my point.
Too many hackers here, with too many good ideas for their own lwIP setup,
I'm afraid.

> I would favor solving this problem by 'fixing' savannah bug #3031
> submitted by Leon Woestenberg, which basically proposes getting rid
> of PBUF_RAM and using pools instead.

Different issue. #3031 is nice to have, and may solve some other odd
corners. It doesn't solve fragmentation.

I'm open to fragmentation test results and test cases for our
mem_malloc() though...

Christiaan Simons

Hardware Designer
Axon Digital Design

http://www.axon.tv





Re: mem_malloc(): memory fragmentation

Jonathan Larmour
Christiaan Simons wrote:

>
> Let's quote (and learn from) McKusick and Karels (from the BSD kernel
> malloc paper):
>
> "A generalized memory allocator is needed to reduce the complexity of
> writing code inside the kernel.
> Rather than providing many semi-specialized ways of allocating memory, the
> kernel should provide a single
> general purpose allocator. With only a single interface, programmers do not
> need to figure out the most
> appropriate way to allocate memory. If a good general purpose allocator is
> available, it helps avoid the syndrome
> of creating yet another special purpose allocator."
>
> Search'n'replace "kernel" with "lwip" and you get my point.
> Too many hackers here, with too many good ideas for their own lwIP setup,
> I'm afraid.

Generalised memory allocators only work sensibly with enough spare space
left over that fragmentation is an infrequent occurrence. Unlike the BSD
kernel, lwIP users are unlikely to have the freedom of lots of space for
the heap - quite the opposite. Maybe that's not too bad for PBUF_RAM
allocations - apps can possibly handle not being able to send data for a
little while (in the hope that more space will free itself later), but for
failures in, say, renewing a DHCP lease, you could lose your interface. And
that could happen just because the SNMP agent was feeling a bit hungry with
its own stuff. If there are more users, the problem compounds - a low
priority subsystem can deprive a high priority subsystem of memory, either
directly due to using it, or indirectly due to fragmentation.

Also in the above quote, the emphasis is on "good". The mem_malloc
allocator is not "good". It is simple. This is only right in the context of
lwIP. But it does not resist fragmentation in the way most other memory
allocators endeavour to do (however it does avoid the larger code and space
penalty).

>> I would favor solving this problem by 'fixing' savannah bug #3031
>> submitted by Leon Woestenberg, which basically proposes getting rid
>> of PBUF_RAM and using pools instead.
>
> Different issue. #3031 is nice to have, and may solve some other odd
> corners.
> Doesn't solve fragmentation.

Since pbufs can be chained, I'm not sure what you mean?

> I'm open for  fragmentation test results and test cases for our
> mem_malloc() though...

It entirely depends on the size of MEM_SIZE, compared with the allocation
pressure.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine




RE: mem_malloc(): memory fragmentation

Goldschmidt Simon
In reply to this post by Kieran Mansley

> On Mon, 2006-10-23 at 13:42 +0100, Jonathan Larmour wrote:
> > Goldschmidt Simon wrote:
> > > maybe a better solution might be to implement
> > > mem_malloc() as different pools and leave the PBUF_RAM
> > > implementation since it would be allocated from pools then.
> >
> > Note that despite what's implied in that bug, IMHO you can't actually
> > let it be the current pbuf_alloc(..., PBUF_POOL), otherwise if you use
> > up all the pbufs with TX data, you won't have room for any RX packets,
> > including the TCP ACK packets which will allow you to free some of
> > your TX packets.
> > So either RX and TX packets should be allocated from different pools,
> > or there should be a low water mark on the pool for TX allocations in
> > order to reserve a minimum number of packets for RX.
>
> The later would be my preference as it is much more efficient
> on memory usage where you have unidirectional traffic (which
> is, to a first approximation, quite common for bulk transfers).

That would imply that the current lwIP implementation uses PBUF_POOL for
RX frames only, whereas all TX frames would use PBUF_RAM or no-copy
pbufs. As far as I know, that's not true: SNMP & PPP use PBUF_POOL in
places where I don't think it's on the input side. But certainly this
would be a good idea! If PBUF_RAM were allocated from pools as well,
there would be no difference between using PBUF_POOL and PBUF_RAM.

>
> > One good solution if using 2^n sized pools is to use a buddy
> > allocator[1] to divide up larger contiguous space, so it may not be as
> > wasteful as you think. One difference with a normal buddy allocator is
> > that a normal one would normally e.g. return a 2Kbyte buffer if you
> > request 1025 bytes. An lwIP implementation could work for maximum
> > efficiency instead and allocate that as a 1024 byte buffer plus a 64
> > byte buffer (or whatever the lowest granularity would be) chained
> > together.
>
> I think that would definitely be the only way such a change
> would be acceptable given that lwIP tries to have a low
> memory footprint, and indeed it's the only way I'd even
> though of it being done.

I don't really understand what you mean by the buddy allocator... Do
you mean chaining 2 pbufs together? The way it's done in the Wikipedia
example does not really suppress fragmentation, or does it? Chaining a
pbuf from two fragments would be easy (implementing pools), but I don't
know whether it would be slower or cause any other problems.

>
> > But all this would be a non-trivial bit of coding so I'm sure people
> > would be grateful if you have time to do it

Don't know about the triviality... I implemented 5 pools of different
sizes (using the memp.c interface, just adding some pools). Normal calls
to mem_malloc() get the size they want (maybe too big, but only DHCP,
SNMP & loopif are using that, and they could be reprogrammed to use
memp_malloc, I suggest). pbuf_alloc(PBUF_RAM) would construct a
pbuf chain of sizes just like you suggested.
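Roughly along these lines (the sizes and the helper below are only an illustration, not the actual code):

  #include "lwip/arch.h"   /* u8_t, u16_t */

  /* five pool block sizes - illustrative values only */
  static const u16_t pool_sizes[] = { 128, 256, 512, 1024, 1536 };
  #define NUM_POOLS (sizeof(pool_sizes) / sizeof(pool_sizes[0]))

  /* smallest pool block that still fits 'len'; if nothing fits, take the
     largest block and let pbuf_alloc(PBUF_RAM) chain another pbuf for the
     remainder */
  static u16_t pick_pool_size(u16_t len)
  {
    u8_t i;
    for (i = 0; i < NUM_POOLS; i++) {
      if (len <= pool_sizes[i]) {
        return pool_sizes[i];
      }
    }
    return pool_sizes[NUM_POOLS - 1];
  }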

>
> Exactly.  These day changes to lwIP generally only get
> included by someone who uses it taking the time to write the
> code - the maintainers do their best to maintain it and fix
> bugs, but a rework such as this is unlikely to ever reach the
> top of their list of things to do.
>
> > I could also believe the result will use more a fair bit more code
> > space than the present mem_malloc.
>
> Certainly true.

About code size, I'm not sure, but you certainly can better calculate
the amount of data RAM you need when suppressing (external)
fragmentation (and thus save space compared to the current heap), and
that should make up for the bigger code size. Last but not least you get
a _much_ better feeling running lwIP applications for some years without
rebooting (at least I do).

I would be open to fragmentation tests, too (like Christiaan suggested),
but I'm not sure that this makes much sense. Since the memory usage is
almost always caused by clients contacting my board, fragmentation would
likely depend on the network traffic my app sees... So I'd rather think
about the fragmentation issue theoretically instead of trying to prove it
with examples.

Simon.



Re: mem_malloc(): memory fragmentation

Jonathan Larmour
Goldschmidt Simon wrote:

>
>>> One good solution if using 2^n sized pools is to use a buddy
>>> allocator[1] to divide up larger contiguous space, so it may not be as
>>> wasteful as you think. One difference with a normal buddy allocator is
>>> that a normal one would normally e.g. return a 2Kbyte buffer if you
>>> request 1025 bytes. An lwIP implementation could work for maximum
>>> efficiency instead and allocate that as a 1024 byte buffer plus a 64
>>> byte buffer (or whatever the lowest granularity would be) chained
>>> together.
>> I think that would definitely be the only way such a change
>> would be acceptable given that lwIP tries to have a low
>> memory footprint, and indeed it's the only way I'd even
>> though of it being done.
>
> I don't really understand what you mean with the buddy allocator... Do
> you mean chaining 2 pbufs together?

Two or more. A buddy allocator is a simple, quick and efficient way of
allocating chunks of memory of size 2^N for a range of N.
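To illustrate the core trick (simplified sketch, not lwIP code): a block's buddy is found by XOR-ing its offset with its size, so freeing can cheaply merge two free 2^N blocks back into one 2^(N+1) block:

  #include <stddef.h>

  /* offsets are relative to the start of the managed region;
     block_size is the 2^N size of the block */
  static size_t buddy_of(size_t offset, size_t block_size)
  {
      return offset ^ block_size;
  }

  /* e.g. the 128-byte block at offset 384 has its buddy at 384 ^ 128 = 256;
     if both are free, they merge into the 256-byte block at offset 256 */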

> The way it's done in the wikipedia
> example does not really suppress fragmentation, or does it?

Since you're able to chain pbufs together, it resists fragmentation.

> Chaining a
> pbuf from two fragments would be easy (implementing pools), but I don't
> know if it would be less speedy or causing any other problem.

A buddy allocator should be pretty speedy. It would have some interesting
effects on the application though - netbufs would be far more likely to
consist of multiple pbufs.

>>> But all this would be a non-trivial bit of coding so I'm sure people
>>> would be grateful if you have time to do it
>
> Don't know about the triviality... I implemented 5 pools of different
> sizes (using the memp.c interface, just added some pools). Normal calls
> to mem_malloc() get the size they want (maybe too big, but only DHCP,
> SNMP & loopif are using that and they could be reprogrammed using
> memp_malloc, I suggest). Pbuf_alloc(PBUF_RAM) would construct a
> pbuf-chain of sizes just like you suggested.

It's the "maybe too big" bit that I think wants to be solved. What sizes
are the packets in your pools?

> About code size, I'm not sure, but you certainly can better calculate
> the amount of data RAM you need when suppressing (external)
> fragmentation (and thus save space compared to the current heap), and
> that should make up for the bigger code size. Last but not least you get
> a _much_ better feeling running lwIP applications for some years without
> rebooting (at least I do).

ROM is usually much cheaper than RAM anyway. So adding more code if it
saves RAM is usually good.

> I would be open to fragmentation tests, too (like Christiaan suggested),
> but I'm not sure that this makes much sense. Since the memory usage is
> almost always caused by clients contacting my board, fragmentation would
> possibly depend on the network traffic my app is running on... So I'd
> rather theoretically think about the fragmentation issue instead of
> trying to prove it in examples.

It absolutely would depend on network traffic, and especially on whether
you send packets of wildly different sizes. Things like adding a DHCP
renewal in the middle may have the potential to fragment the space a bit,
as that uses small but persistent allocations. The upcoming SNMP agent
would have a major effect on fragmentation as it makes lots of mem_malloc
calls of various sizes.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


