Re:TCP transmission error

classic Classic list List threaded Threaded
42 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Leon Woestenberg
Hello all,

this is very late reply.

On Sun, 2006-10-22 at 07:44 +0100, Kieran Mansley wrote:

> On Fri, Oct 20, 2006 at 11:37:23PM +0100, Jonathan Larmour wrote:
> > With the current code, I think the only solution is SYS_LIGHTWEIGHT_PROT.
> > My opinion is that the !SYS_LIGHTWEIGHT_PROT stuff should probably just be
> > removed.
> >
>
> Thanks for the detailed post - good to see folks getting to grips with
> the stack.  The above seemed to summarise quite nicely and I agree
> completely with that.
>
I completely agree with you; It's good to see there is interest in the
lwIP stack from developers that are concerned abouts its correctness.

The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
developers using the stack to protect *ONLY* between interrupt context
and single-thread user-space context if I am not mistaken.

I am all for removing it, because the locking solution does not scale
across different platforms.

Regards,

Leon.





_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
Leon Woestenberg wrote:

> Hello all,
>
> this is very late reply.
>
> On Sun, 2006-10-22 at 07:44 +0100, Kieran Mansley wrote:
>
>>On Fri, Oct 20, 2006 at 11:37:23PM +0100, Jonathan Larmour wrote:
>>
>>>With the current code, I think the only solution is SYS_LIGHTWEIGHT_PROT.
>>>My opinion is that the !SYS_LIGHTWEIGHT_PROT stuff should probably just be
>>>removed.
>>>
>>
>>Thanks for the detailed post - good to see folks getting to grips with
>>the stack.  The above seemed to summarise quite nicely and I agree
>>completely with that.
>>
>
> I completely agree with you; It's good to see there is interest in the
> lwIP stack from developers that are concerned abouts its correctness.
>
> The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
> developers using the stack to protect *ONLY* between interrupt context
> and single-thread user-space context if I am not mistaken.
>
> I am all for removing it, because the locking solution does not scale
> across different platforms.

I was actually arguing the other way round I'm afraid :-). I was arguing
that it looks to me that SYS_LIGHTWEIGHT_PROT is essentially mandatory
because packets arrive asynchronously and so pbuf allocation needs to have
some protection no matter what since the allocation will take place from a
device driver. That would be true whatever the context, or whether single
or multi-threaded.

The !SYS_LIGHTWEIGHT_PROT code simply doesn't give any thread or interrupt
safety at all. So in fact the !SYS_LIGHTWEIGHT_PROT code does not appear
to solve the problem it seems intended to solve. Hence me saying that that
code should be removed entirely since as far as I can tell it doesn't
achieve anything. If protection is needed, something using
SYS_LIGHTWEIGHT_PROT is the solution.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Peter Graf
In reply to this post by Leon Woestenberg
Leon Woestenberg wrote:

>> Thanks for the detailed post - good to see folks getting to grips with
>> the stack.  The above seemed to summarise quite nicely and I agree
>> completely with that.
>>
> I completely agree with you; It's good to see there is interest in the
> lwIP stack from developers that are concerned abouts its correctness.
>
> The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
> developers using the stack to protect *ONLY* between interrupt context
> and single-thread user-space context if I am not mistaken.

I think you are mistaken.

> I am all for removing it, because the locking solution does not scale
> across different platforms.

I have to use SYS_LIGHTWEIGHT_PROT in a _multithreaded_ environment with
interrupt-triggered device driver. Removing it would render lwIP
unusable for me. I guess it lies in the nature of a simple locking
mechanism to be platform specific, but that makes it "lightweight".

I vote against a removal.

All the best
Peter



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

RE: Bug in pbuf.c regarding PBUF_POOL

JamminJimP

I recall hashing this about previously; again I believe it is required
when the OS does not provide a semaphore mechanism to assure the pbuf
pool does not get corrupted. Whether a multi-threaded or single-threaded
implementation, if the ethif interrupt routine is allocating and moving
received packets using pbufs, there has to be a way to prevent that from
happening in the middle of a critical section of the stack's code. THIS
is why SYS_LIGHTWEIGHT_PROT exists and is needed.

Read the comments in sys.h for more background... And someone correct me
if I'm wrong, but I think this was Adam's original protection scheme and
not something 'some programmer' added.

I am using it and must also vote to retain this code.

- Jim


-----Original Message-----
From: lwip-users-bounces+jim.pettinato=[hidden email]
[mailto:lwip-users-bounces+jim.pettinato=[hidden email]] On Behalf
Of Peter Graf
Sent: Sunday, November 12, 2006 5:25 AM
To: Mailing list for lwIP users
Subject: Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL


Leon Woestenberg wrote:

>> Thanks for the detailed post - good to see folks getting to grips
>> with the stack.  The above seemed to summarise quite nicely and I
>> agree completely with that.
>>
> I completely agree with you; It's good to see there is interest in the

> lwIP stack from developers that are concerned abouts its correctness.
>
> The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
> developers using the stack to protect *ONLY* between interrupt context

> and single-thread user-space context if I am not mistaken.

I think you are mistaken.

> I am all for removing it, because the locking solution does not scale
> across different platforms.

I have to use SYS_LIGHTWEIGHT_PROT in a _multithreaded_ environment with
interrupt-triggered device driver. Removing it would render lwIP
unusable for me. I guess it lies in the nature of a simple locking
mechanism to be platform specific, but that makes it "lightweight".

I vote against a removal.

All the best
Peter



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

RE: Bug in pbuf.c regarding PBUF_POOL

Goldschmidt Simon
Uhm, that's exactly what I meant when starting this thread. I just think
the pbuf-pool-code NOT using  SYS_LEIGHTWEIGHT_PROT should be removed
since I discovered it is buggy and wanted to protect other lwip-users
from running into this.

Of course, the SYS_LEIGHTWEIGHT_PROT must not be removed! This would
break the stack!

Simon


> -----Original Message-----
> From:
> lwip-users-bounces+sgoldschmidt=[hidden email]
> g
> [mailto:lwip-users-bounces+sgoldschmidt=de.pepperl-fuchs.com@n
ongnu.org] On Behalf Of Pettinato, Jim

> Sent: Monday, November 13, 2006 2:55 PM
> To: Mailing list for lwIP users
> Subject: RE: [lwip-users] Bug in pbuf.c regarding PBUF_POOL
>
>
> I recall hashing this about previously; again I believe it is
> required when the OS does not provide a semaphore mechanism
> to assure the pbuf pool does not get corrupted. Whether a
> multi-threaded or single-threaded implementation, if the
> ethif interrupt routine is allocating and moving received
> packets using pbufs, there has to be a way to prevent that
> from happening in the middle of a critical section of the
> stack's code. THIS is why SYS_LIGHTWEIGHT_PROT exists and is needed.
>
> Read the comments in sys.h for more background... And someone
> correct me if I'm wrong, but I think this was Adam's original
> protection scheme and not something 'some programmer' added.
>
> I am using it and must also vote to retain this code.
>
> - Jim
>
>
> -----Original Message-----
> From: lwip-users-bounces+jim.pettinato=[hidden email]
> [mailto:lwip-users-bounces+jim.pettinato=[hidden email]]
>  On Behalf Of Peter Graf
> Sent: Sunday, November 12, 2006 5:25 AM
> To: Mailing list for lwIP users
> Subject: Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL
>
>
> Leon Woestenberg wrote:
>
> >> Thanks for the detailed post - good to see folks getting to grips
> >> with the stack.  The above seemed to summarise quite nicely and I
> >> agree completely with that.
> >>
> > I completely agree with you; It's good to see there is
> interest in the
>
> > lwIP stack from developers that are concerned abouts its
> correctness.
> >
> > The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
> > developers using the stack to protect *ONLY* between
> interrupt context
>
> > and single-thread user-space context if I am not mistaken.
>
> I think you are mistaken.
>
> > I am all for removing it, because the locking solution does
> not scale
> > across different platforms.
>
> I have to use SYS_LIGHTWEIGHT_PROT in a _multithreaded_
> environment with interrupt-triggered device driver. Removing
> it would render lwIP unusable for me. I guess it lies in the
> nature of a simple locking mechanism to be platform specific,
> but that makes it "lightweight".
>
> I vote against a removal.
>
> All the best
> Peter
>
>
>
> _______________________________________________
> lwip-users mailing list
> [hidden email]
> http://lists.nongnu.org/mailman/listinfo/lwip-users
>
>
> _______________________________________________
> lwip-users mailing list
> [hidden email]
> http://lists.nongnu.org/mailman/listinfo/lwip-users
>


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Leon Woestenberg
In reply to this post by Jonathan Larmour
Hello all,

On Sun, 2006-11-12 at 01:43 +0000, Jonathan Larmour wrote:

> Leon Woestenberg wrote:
> > The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
> > developers using the stack to protect *ONLY* between interrupt context
> > and single-thread user-space context if I am not mistaken.
> >
> > I am all for removing it, because the locking solution does not scale
> > across different platforms.
>
> I was actually arguing the other way round I'm afraid :-). I was arguing
>
As said, I have been away from the lwIP stack too long, so I should have
shut up in the first place. However...

> that it looks to me that SYS_LIGHTWEIGHT_PROT is essentially mandatory
> because packets arrive asynchronously and so pbuf allocation needs to have
> some protection no matter what since the allocation will take place from a
> device driver. That would be true whatever the context, or whether single
> or multi-threaded.
>
In a fully pre-emptive environment (like full-blown eCos), this
light-weight locking mechanism must then disable the Ethernet chip
receive interrupt, to prevent concurrent access to the pbufs, during
LIGHTWEIGHT_PROT locks from non-interrupt context.

I would certainly design a pre-emptive system by NOT entering the pbuf
layer from interrupt context, but instead schedule a high-priority
real-time task to move the packets of the chip from the interrupt
handler, and nothing more.

We have a proven real-time pre-emptive design (even with nested
pre-emptable interrupts) that does not use SYS_LIGHTWEIGHT_PROT but uses
the "bottom-half" approach.

This is using the open-source, non-free uCOS-II RTOS.

If I am completely mistaken, I will probably be corrected by my
colleague Christiaan :-)

> The !SYS_LIGHTWEIGHT_PROT code simply doesn't give any thread or interrupt
> safety at all. So in fact the !SYS_LIGHTWEIGHT_PROT code does not appear
> to solve the problem it seems intended to solve. Hence me saying that that
> code should be removed entirely since as far as I can tell it doesn't
> achieve anything. If protection is needed, something using
> SYS_LIGHTWEIGHT_PROT is the solution.

I am NOT sure if SYS_LIGHTWEIGHT_PROT protects against all concurrent
acccess possible, I thought this was exactly the reason why it was
called lightweight when introduced.

Pushing a packet into the stack will have a chain reaction where all
higher-level layers are called into. I would not run this from interrupt
context.

I am against removing the !SYS_LIGHTWEIGHT_PROT code for the above
reasons. :-)

Regards,

Leon.



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
Leon Woestenberg wrote:

>
>> that it looks to me that SYS_LIGHTWEIGHT_PROT is essentially mandatory
>> because packets arrive asynchronously and so pbuf allocation needs to have
>> some protection no matter what since the allocation will take place from a
>> device driver. That would be true whatever the context, or whether single
>> or multi-threaded.
>>
> In a fully pre-emptive environment (like full-blown eCos), this
> light-weight locking mechanism must then disable the Ethernet chip
> receive interrupt, to prevent concurrent access to the pbufs, during
> LIGHTWEIGHT_PROT locks from non-interrupt context.

That would be sufficient if the only access to the PBUF_POOL buffers was
from the ethernet interrupt driver. Unfortunately as I described in my
previous mail, it is not. In the current trunk, there are pool allocations
from ip_frag.c, snmp/msg_out.c, ppp.c, vj.c, slipif.c, ethernetif.c, and
via pbuf_take, etharp.c.

Most of these can happen in different contexts, either TCPIP thread, eth
processing thread or PPP or SLIP threads.

As I described in my earlier thread, the only way the pbuf
!SYS_LIGHTWEIGHT_PROT support could theoretically work as it stands is if
all pbuf_alloc (... PBUF_POOL) calls were guaranteed to only be called from
the same context.

> I would certainly design a pre-emptive system by NOT entering the pbuf
> layer from interrupt context, but instead schedule a high-priority
> real-time task to move the packets of the chip from the interrupt
> handler, and nothing more.

There is no difference in practice with a pre-emptive system - if a lower
priority thread could call pbuf_alloc and be interrupted by a higher
priority thread which does the same, there's still a problem.


> We have a proven real-time pre-emptive design (even with nested
> pre-emptable interrupts) that does not use SYS_LIGHTWEIGHT_PROT but uses
> the "bottom-half" approach.

I would be surprised if it were truly thread-safe given my understanding of
the current code. Maybe it used to be with different earlier code. But I
haven't seen the code for that port.

>> The !SYS_LIGHTWEIGHT_PROT code simply doesn't give any thread or interrupt
>> safety at all. So in fact the !SYS_LIGHTWEIGHT_PROT code does not appear
>> to solve the problem it seems intended to solve. Hence me saying that that
>> code should be removed entirely since as far as I can tell it doesn't
>> achieve anything. If protection is needed, something using
>> SYS_LIGHTWEIGHT_PROT is the solution.
>
> I am NOT sure if SYS_LIGHTWEIGHT_PROT protects against all concurrent
> acccess possible, I thought this was exactly the reason why it was
> called lightweight when introduced.

I thought it just meant that it was a slightly sledgehammer approach
(blocking all tasking or maybe even all interrupts) and only to be used for
guaranteed brief periods.

Anyway, my beef is with the !SYS_LIGHTWEIGHT_PROT support which appears to
me to be not merely non-functional (e.g. no incr of pbuf_pool_free_lock),
but incapable of providing protection, in which case why have it at all if
you are _guaranteed_ only one context.

> Pushing a packet into the stack will have a chain reaction where all
> higher-level layers are called into. I would not run this from interrupt
> context.

Out of interest, the approach I have taken with my own improvements to the
eCos port is to remove the separate ethernet input processing thread
entirely, and instead make the tcpip thread suck the packets in (courtesy
of mods to api/tcpip.c). This also removes nasty race conditions affecting
the etharp data structures. The etharp code (in the old eCos port) could
also otherwise be called from two different thread contexts, therefore
potentially unsafely. There are other serious problems with the port I
won't go into here.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Christiaan Simons
@Jonathan:

> > We have a proven real-time pre-emptive design (even with nested
> > pre-emptable interrupts) that does not use SYS_LIGHTWEIGHT_PROT but
uses
> > the "bottom-half" approach.
>
> I would be surprised if it were truly thread-safe given my understanding
of
> the current code. Maybe it used to be with different earlier code. But I
> haven't seen the code for that port.

What we do is truly thread safe,
since we protect the calls to the _raw-API_.

As Leon said the interrupt top half merely
wakes a high priority task (bottom half) on eth rx.

This task locks a (uCOS) mutex, copies data from the
ethernet controller and pushes this into the lwIP stack.
After lwIP is done we unlock this mutex.

Other tasks accessing to any lwIP structure
are allowed when the mutex is available.

We completely avoid the lwIP socket or sequential API,
therefore we don't require the troublesome lwIP
"protection".

Any finer grained locking would require
improved thread safety many places in the current
code as many observed correctly before.

BTW direct pbuf pool access is desirable
for raw-API applications, such as SNMP.

Christiaan Simons

Hardware Designer
Axon Digital Design

http://www.axon.tv

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system manager.
This message contains confidential information and is intended only for the
individual named.  If you are not the named addressee you should not
disseminate, distribute or copy this e-mail.



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Mumtaz Ahmad
Dear Simons

Thats a better way and i have also used it and created my own sequential
APIs using this locking mechanism .This helped me to get a BSD style socket
api with the approximate performance of raw api .

Mumtaz Ahmad
Network Systems Group


----- Original Message -----
From: "Christiaan Simons" <[hidden email]>
To: "Mailing list for lwIP users" <[hidden email]>
Sent: Friday, November 17, 2006 5:39 PM
Subject: Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL


> @Jonathan:
>
> > > We have a proven real-time pre-emptive design (even with nested
> > > pre-emptable interrupts) that does not use SYS_LIGHTWEIGHT_PROT but
> uses
> > > the "bottom-half" approach.
> >
> > I would be surprised if it were truly thread-safe given my understanding
> of
> > the current code. Maybe it used to be with different earlier code. But I
> > haven't seen the code for that port.
>
> What we do is truly thread safe,
> since we protect the calls to the _raw-API_.
>
> As Leon said the interrupt top half merely
> wakes a high priority task (bottom half) on eth rx.
>
> This task locks a (uCOS) mutex, copies data from the
> ethernet controller and pushes this into the lwIP stack.
> After lwIP is done we unlock this mutex.
>
> Other tasks accessing to any lwIP structure
> are allowed when the mutex is available.
>
> We completely avoid the lwIP socket or sequential API,
> therefore we don't require the troublesome lwIP
> "protection".
>
> Any finer grained locking would require
> improved thread safety many places in the current
> code as many observed correctly before.
>
> BTW direct pbuf pool access is desirable
> for raw-API applications, such as SNMP.
>
> Christiaan Simons
>
> Hardware Designer
> Axon Digital Design
>
> http://www.axon.tv
>
> This email and any files transmitted with it are confidential and intended
> solely for the use of the individual or entity to whom they are addressed.
> If you have received this email in error please notify the system manager.
> This message contains confidential information and is intended only for
the
> individual named.  If you are not the named addressee you should not
> disseminate, distribute or copy this e-mail.
>
>
>
> _______________________________________________
> lwip-users mailing list
> [hidden email]
> http://lists.nongnu.org/mailman/listinfo/lwip-users



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

RE: Bug in pbuf.c regarding PBUF_POOL

JamminJimP
In reply to this post by Leon Woestenberg

Leon et al, I agree completely... If you have a pre-emptive OS, the
interface need only signal it needs serviced; a high-priority task can
handle feeding packets into the stack. Without such a prioritized
platform though, performance would be severely degraded if handled any
other way than directly in the ISR.

I also agree that I wouldn't push a packet all the way up the stack
directly in the ISR... but allocating a pbuf in the ISR to hold the
packet (and getting it cleared from the hardware queue) prior to
signaling the task level driver is a valuable ability. It makes the
task-level driver code much simpler and avoids double-buffering input
packets. I acually have a small array of type (pbuf *) used as a ring
buffer - so if the non-deterministic scheduler holds up my task-level
driver, all the packets can be processed without loss quite easily.

Some of the reason I'm faced with doing this is the fact that my
Ethernet hardware (SC91C111) only has 4 2K buffers in the hardware
queue, so you don't have much time on a 100M link before they fill up.

Now I could have used a permanently allocated array of packet buffers
for the ISR... but you know how memory constraints go...

- Jim
__
 

James M. Pettinato, Jr.
Software Engineer
FMC Technologies Measurement Solutions Inc.
1602 Wagner Avenue | Erie PA | 16510 USA
Phone: 814 898 5000 | Fax: 814 899-3414
www.fmctechnologies.com



-----Original Message-----
From: lwip-users-bounces+jim.pettinato=[hidden email]
[mailto:lwip-users-bounces+jim.pettinato=[hidden email]] On Behalf
Of Leon Woestenberg
Sent: Thursday, November 16, 2006 2:15 PM
To: Mailing list for lwIP users
Subject: Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL


Hello all,

On Sun, 2006-11-12 at 01:43 +0000, Jonathan Larmour wrote:

> Leon Woestenberg wrote:
> > The SYS_LIGHTWEIGHT_PROT protection was introduced by one of the
> > developers using the stack to protect *ONLY* between interrupt
> > context and single-thread user-space context if I am not mistaken.
> >
> > I am all for removing it, because the locking solution does not
> > scale across different platforms.
>
> I was actually arguing the other way round I'm afraid :-). I was
> arguing
>
As said, I have been away from the lwIP stack too long, so I should have
shut up in the first place. However...

> that it looks to me that SYS_LIGHTWEIGHT_PROT is essentially mandatory

> because packets arrive asynchronously and so pbuf allocation needs to
> have some protection no matter what since the allocation will take
> place from a device driver. That would be true whatever the context,
> or whether single or multi-threaded.
>
In a fully pre-emptive environment (like full-blown eCos), this
light-weight locking mechanism must then disable the Ethernet chip
receive interrupt, to prevent concurrent access to the pbufs, during
LIGHTWEIGHT_PROT locks from non-interrupt context.

I would certainly design a pre-emptive system by NOT entering the pbuf
layer from interrupt context, but instead schedule a high-priority
real-time task to move the packets of the chip from the interrupt
handler, and nothing more.

We have a proven real-time pre-emptive design (even with nested
pre-emptable interrupts) that does not use SYS_LIGHTWEIGHT_PROT but uses
the "bottom-half" approach.

This is using the open-source, non-free uCOS-II RTOS.

If I am completely mistaken, I will probably be corrected by my
colleague Christiaan :-)

> The !SYS_LIGHTWEIGHT_PROT code simply doesn't give any thread or
> interrupt
> safety at all. So in fact the !SYS_LIGHTWEIGHT_PROT code does not
appear
> to solve the problem it seems intended to solve. Hence me saying that
that
> code should be removed entirely since as far as I can tell it doesn't
> achieve anything. If protection is needed, something using
> SYS_LIGHTWEIGHT_PROT is the solution.

I am NOT sure if SYS_LIGHTWEIGHT_PROT protects against all concurrent
acccess possible, I thought this was exactly the reason why it was
called lightweight when introduced.

Pushing a packet into the stack will have a chain reaction where all
higher-level layers are called into. I would not run this from interrupt
context.

I am against removing the !SYS_LIGHTWEIGHT_PROT code for the above
reasons. :-)

Regards,

Leon.



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
In reply to this post by Christiaan Simons
Christiaan Simons wrote:

> @Jonathan:
>
>>> We have a proven real-time pre-emptive design (even with nested
>>> pre-emptable interrupts) that does not use SYS_LIGHTWEIGHT_PROT but
> uses
>>> the "bottom-half" approach.
>> I would be surprised if it were truly thread-safe given my understanding
> of
>> the current code. Maybe it used to be with different earlier code. But I
>> haven't seen the code for that port.
>
> What we do is truly thread safe,
> since we protect the calls to the _raw-API_.
>
> As Leon said the interrupt top half merely
> wakes a high priority task (bottom half) on eth rx.
>
> This task locks a (uCOS) mutex, copies data from the
> ethernet controller and pushes this into the lwIP stack.

So it calls etharp_ip_input or etharp_arp_input from that task context with
the mutex still locked?

> After lwIP is done we unlock this mutex.

Do you mean a mutex or a semaphore? The current pbuf.c code only uses a
semaphore, and therefore cannot prevent priority inversion.

> Other tasks accessing to any lwIP structure
> are allowed when the mutex is available.

So you mean that you lock and unlock a mutex from the application *every
time* you make a raw API call?

> We completely avoid the lwIP socket or sequential API,
> therefore we don't require the troublesome lwIP
> "protection".

Well, I want the socket and sequential API to be thread safe too, and I
think that is achievable. You appear to be describing a different more cut
down scenario.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
In reply to this post by JamminJimP
Pettinato, Jim wrote:
> Leon et al, I agree completely... If you have a pre-emptive OS, the
> interface need only signal it needs serviced; a high-priority task can
> handle feeding packets into the stack. Without such a prioritized
> platform though, performance would be severely degraded if handled any
> other way than directly in the ISR.
>
> I also agree that I wouldn't push a packet all the way up the stack
> directly in the ISR...

Just in case, no-one so far has been suggesting passing the packet into the
stack from an ISR, but I think the rest of your mail shows you do know that :).

I'm not sure a high priority task is sufficient, if that task can block for
an indeterminate time waiting for a lower priority task (semaphores suffer
from priority inversion). So it really makes no difference whether it's
called from a high priority thread or a bottom half - we don't want to
block either way round.

If PBUF_POOL pbufs were only allocated from a single context (and different
bottom halfs would be sufficient for this, for example), this could all be
avoided; but they aren't. It's not clear to me why many of these calls are
from PBUF_POOL, rather than PBUF_RAM.

Incidentally, having to have a bunch of threads around to handle these
sorts of issues is a big waste due to the extra thread control blocks and
thread stacks - precious memory in the context in which lwIP is intended to
be used. Personally, I'm changing lwIP not to rely on such a model for
ethernet at least - for me it's a simple change in fact.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
In reply to this post by Jonathan Larmour
Jonathan Larmour wrote:
[big ker-snip]

Actually I allowed myself to lose focus about what's really under
discussion. There are multiple issues in play here. Christian does have a
scenario in which the !SYS_LIGHTWEIGHT_PROT might be useful. In that case,
the code should stay and that discussion can end.

Separately, there's the issue that the !SYS_LIGHTWEIGHT_PROT code is
broken. Someone interested in that code (Christian?) may want to take that up.

And separately again, there's the issue that the stack code itself is not
thread-safe, possibly if there is a ethernet input thread/bottom half
(depending on implementation), and definitely so in the presence of SLIP or
PPP, and the only present way to solve that is implementing
SYS_LIGHTWEIGHT_PROT. At least in the current code.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Peter Graf
In reply to this post by JamminJimP
Leon wrote:

> I would certainly design a pre-emptive system by NOT entering the pbuf
> layer from interrupt context, but instead schedule a high-priority
> real-time task to move the packets of the chip from the interrupt
> handler, and nothing more.

To have a pre-emptive multitasking system doesn't necessarily mean a
high priority task is never interrupted by a lower priority one. In my
case, higher priority just means that it gets more timeslices. (Unlike
eCos where it wouldn't be interrupted by lower priority threads, at
least with the standard scheduling behaviour).

> I am NOT sure if SYS_LIGHTWEIGHT_PROT protects against all concurrent
> acccess possible,

In my case it certainly does, and I think it's generally meant to be.
But it's several years ago that I dealt with this issue.

> I thought this was exactly the reason why it was
> called lightweight when introduced.

I think not. It's lightweight, because it has very little overhead and
is easy to implement (which makes it "low-level", non-portabe and
rigorous. Usually disabling all interrupts below a certain level.)

All the best
Peter



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Leon Woestenberg
In reply to this post by Jonathan Larmour
Hello all,

first of all thanks Christiaan for reminding me of the details of how we
implemented this.

On Fri, 2006-11-17 at 20:03 +0000, Jonathan Larmour wrote:
> Jonathan Larmour wrote:
> [big ker-snip]
>
> Actually I allowed myself to lose focus about what's really under
> discussion. There are multiple issues in play here. Christian does have a
> scenario in which the !SYS_LIGHTWEIGHT_PROT might be useful. In that case,
> the code should stay and that discussion can end.
>
All users which access the raw (core) API do not need the
SYS_LIGHTWEIGHT_PROT per se, and can solve locking issues more
elegantly, depending on their OS. Many users approach the lwIP stack
like this, some have replied on this thread.

We ourselves chose the 'one-giant-lock' mechanism. Each call is
protected by acquiring a mutex. The lwIP calls are all O(1) or O(n)
and do not busy-delay or block. They are really not that giant, with n
being typically low for the targetted systems.

> Separately, there's the issue that the !SYS_LIGHTWEIGHT_PROT code is
> broken. Someone interested in that code (Christian?) may want to take that up.
>
I think it the raw (core) API level is not broken. Also, the core is
intentionally designed to be void of any locking mechanisms.

Adam Dunkels (the original author of lwIP) surrounded the core functions
with a sys arch wrapper and tried to make it abstract enough to map to
any OS.

I *do* think the sys (wrapper) level is broken, and has always been
broken, in corner case of the sys (wrapper) level. I have given examples
of this years ago, but as no user seemed to hit, or care, about these
corner cases, the code was mostly kept as-is. I've seen several fixes to
the sys API code addressing the corner cases.

> And separately again, there's the issue that the stack code itself is not
> thread-safe, possibly if there is a ethernet input thread/bottom half
>
The core function calls into the stack are thread-safe in the sense of
re-enterable (or "pure"), i.e. not keeping global state etc. I always I
have carefully reviewed that this has remained so.

However, this does not go for SLIP and PPP.

> (depending on implementation), and definitely so in the presence of SLIP or
> PPP, and the only present way to solve that is implementing
> SYS_LIGHTWEIGHT_PROT. At least in the current code.
>
This might be true for the sys arch API, as said, I doubt the raw API
needs any fixing.

I know it is a nuisance to manage code that has several API's and/or can
be used in multiple ways. My personal preference would be to throw away
the sys arch abstraction and rewrite it altogether, merging it with a
BSD alike socket interface. However, we never made it that far.


Regards,

Leon.



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Andrew Lentvorski
In reply to this post by Mumtaz Ahmad
Mumtaz Ahmad wrote:
> Dear Simons
>
> Thats a better way and i have also used it and created my own sequential
> APIs using this locking mechanism .This helped me to get a BSD style socket
> api with the approximate performance of raw api .

Do you have code available to share about this?

A BSD-style API which does not require a threading system would be a
nice addition to lwIP.

-a


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Kieran Mansley
In reply to this post by Jonathan Larmour
On Fri, Nov 17, 2006 at 08:03:04PM +0000, Jonathan Larmour wrote:
> Jonathan Larmour wrote:
> [big ker-snip]
>
> Actually I allowed myself to lose focus about what's really under
> discussion. There are multiple issues in play here. Christian does have a
> scenario in which the !SYS_LIGHTWEIGHT_PROT might be useful. In that case,
> the code should stay and that discussion can end.

...and be commented so that there isn't the same confusion in six
months time about whether it is useful!

Kieran


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
In reply to this post by Christiaan Simons
Christiaan Simons wrote:
>
> BTW direct pbuf pool access is desirable
> for raw-API applications, such as SNMP.

Actually I've changed my mind, and out of curiousity will go back to this
one after all. If such applications can only be single threaded anyway,
why bother with any protection at all?

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Jonathan Larmour
In reply to this post by Leon Woestenberg
Leon Woestenberg wrote:
>
>>Separately, there's the issue that the !SYS_LIGHTWEIGHT_PROT code is
>>broken. Someone interested in that code (Christian?) may want to take that up.
>>
>
> I think it the raw (core) API level is not broken. Also, the core is
> intentionally designed to be void of any locking mechanisms.

Go back to the start of this thread: pbuf_pool_free_lock is never changed,
and pbuf_pool_free_sem is only used by PBUF_POOL_FREE and nowhere else.
There is therefore no protection between allocs and frees.

Failure scenario with two threads, FREE_T1 doing free, ALLOC_T2 doing alloc:

FREE_T1 calls PBUF_POOL_FAST_FREE
FREE_T1 does p->next = pbuf_pool (where p is pbuf to free)
ALLOC_T2 does p = pbuf_pool
ALLOC_T2 does pbuf_pool = p->next (where p was from previous line)
then later when FREE_T1 is rescheduled:
FREE_T1 does pbuf_pool = p (where p is pbuf to free)

This results in pbuf_pool->next pointing to a buffer that has been
allocated. If there are two more pbuf allocations, then the second one
will go horribly wrong, and disaster is inevitable since there will be a
single pbuf in the pbuf_pool list mentioned twice.

Or if you are saying "void of any locking mechanism", then why even have
the locks and semaphores that are there for the !SYS_LIGHTWEIGHT_PROT
code. That's what I had been arguing for removing, after all.

This may overlap with the other mail I sent earlier, but your comment
seems to agree with the hypothesis that you couldn't have multiple threads
using the stack simultaneously when using the raw API (in which case there
needn't be any code to protect it).

> I *do* think the sys (wrapper) level is broken, and has always been
> broken, in corner case of the sys (wrapper) level. I have given examples
> of this years ago, but as no user seemed to hit, or care, about these
> corner cases, the code was mostly kept as-is. I've seen several fixes to
> the sys API code addressing the corner cases.

I'd be interested to know what corner cases remain. Other than the obvious
finite resources, the abstraction appears crude (which for a lightweight
stack is not necessarily a bad thing) but functional to me.

>>(depending on implementation), and definitely so in the presence of SLIP or
>>PPP, and the only present way to solve that is implementing
>>SYS_LIGHTWEIGHT_PROT. At least in the current code.
>
> This might be true for the sys arch API, as said, I doubt the raw API
> needs any fixing.

You mean sequential rather than sys arch?

> I know it is a nuisance to manage code that has several API's and/or can
> be used in multiple ways. My personal preference would be to throw away
> the sys arch abstraction and rewrite it altogether, merging it with a
> BSD alike socket interface. However, we never made it that far.

I don't know quite what you mean by "alike", but the differences of the
sequential API from the BSD API are, to me, a very good thing for a
lightweight stack.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: Bug in pbuf.c regarding PBUF_POOL

Mumtaz Ahmad
In reply to this post by Andrew Lentvorski
Well probably i quoted it wrong . Infact I have implemented BSD style API
that are much simpler than lwip BSD layer but they do require a
multithreading environment for sure. These API use simple thread safe
mechanism instead of lwip message boxes. The system is yet in testing .

Regards

----- Original Message -----
From: "Andrew Lentvorski" <[hidden email]>
To: "Mailing list for lwIP users" <[hidden email]>
Sent: Sunday, November 19, 2006 8:02 AM
Subject: Re: [lwip-users] Bug in pbuf.c regarding PBUF_POOL


> Mumtaz Ahmad wrote:
> > Dear Simons
> >
> > Thats a better way and i have also used it and created my own sequential
> > APIs using this locking mechanism .This helped me to get a BSD style
socket

> > api with the approximate performance of raw api .
>
> Do you have code available to share about this?
>
> A BSD-style API which does not require a threading system would be a
> nice addition to lwIP.
>
> -a
>
>
> _______________________________________________
> lwip-users mailing list
> [hidden email]
> http://lists.nongnu.org/mailman/listinfo/lwip-users



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
123