netconn_delete and _recv not thread safe

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

netconn_delete and _recv not thread safe

Jonathan Larmour
I have most of the lwip stack running in a "tcpip" thread, and stack users
in various threads synchronise with it using the netconn* API. But there is
a problem with netconn_delete and netconn_recv when they delete mboxes. A
packet, or a connection to a listening socket, may arrive in the gap
between removing mbox items, and deleting the mboxes. For example:

   /* Drain the recvmbox. */
   if (conn->recvmbox != SYS_MBOX_NULL) {
     while (sys_arch_mbox_fetch(conn->recvmbox, &mem, 1) != SYS_ARCH_TIMEOUT) {
       if (conn->type == NETCONN_TCP) {
         if(mem != NULL)
           pbuf_free((struct pbuf *)mem);
       } else {
         netbuf_delete((struct netbuf *)mem);
       }
     }

*** PACKET CAN ARRIVE HERE ***
     sys_mbox_free(conn->recvmbox);
     conn->recvmbox = SYS_MBOX_NULL;
   }

In that case a sys_mbox_free can be called on a non-empty mbox. This may
either leak a pbuf, or cause an assertion in the OS from the underlying
mbox implementation. I've been thinking about how to fix this. Either new
sys_arch primitives could be added. Or, better, we could try to use
SYS_LIGHTWEIGHT_PROT, but this seems a bit dubious as we need to protect
from before the sys_arch_mbox_fetch, until the conn->recvmbox = SYS_MBOX_NULL.

This may mean calling sys_arch_mbox_fetch, which has a timeout of 1, with
the protection (of whatever form) on, which is a bad idea.

It may be possible to solve this if it is ok to assign mboxes:
     SYS_ARCH_PROTECT(old_level);
     tmp_mbox = conn->recvmbox;
     conn->recvmbox = SYS_MBOX_NULL;
     SYS_ARCH_UNPROTECT(old_level);

Is assuming mboxes are assignable a valid approach? Any other thoughts? I
can't think of any other alternative than adding to the sys_arch API.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: netconn_delete and _recv not thread safe

Kieran Mansley
On Thu, 2006-12-14 at 20:59 +0000, Jonathan Larmour wrote:

> It may be possible to solve this if it is ok to assign mboxes:
>      SYS_ARCH_PROTECT(old_level);
>      tmp_mbox = conn->recvmbox;
>      conn->recvmbox = SYS_MBOX_NULL;
>      SYS_ARCH_UNPROTECT(old_level);
>
> Is assuming mboxes are assignable a valid approach? Any other thoughts?

That would be exactly what I'd try first.  Should be easy to experiment
with at least.

Kieran



_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: netconn_delete and _recv not thread safe

Jonathan Larmour
Kieran Mansley wrote:

> On Thu, 2006-12-14 at 20:59 +0000, Jonathan Larmour wrote:
>
>> It may be possible to solve this if it is ok to assign mboxes:
>>      SYS_ARCH_PROTECT(old_level);
>>      tmp_mbox = conn->recvmbox;
>>      conn->recvmbox = SYS_MBOX_NULL;
>>      SYS_ARCH_UNPROTECT(old_level);
>>
>> Is assuming mboxes are assignable a valid approach? Any other thoughts?
>
> That would be exactly what I'd try first.  Should be easy to experiment
> with at least.

Oh it works for me because of how I define sys_mbox_t - for me it's only a
handle. But if I was to contribute the patch, someone else's sys_mbox_t
might be a direct typedef of a larger structure, in which case this would
assign the whole structure. Not necessarily bad, but not straightforward.
I'm wondering if there are any situations where assigning the whole
structure could be the wrong thing to do.

So I'd like to get the _right_ fix - getting it to work for me is easy :).

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

RE: netconn_delete and _recv not thread safe

Krueger, Don
In reply to this post by Jonathan Larmour
> I have most of the lwip stack running in a "tcpip" thread, and stack
> users in various threads synchronise with it using the netconn* API.
But
> there is a problem with netconn_delete and netconn_recv when they
delete
> mboxes. A packet, or a connection to a listening socket, may arrive in
> the gap between removing mbox items, and deleting the mboxes. For
> example:
>
>    /* Drain the recvmbox. */
>    if (conn->recvmbox != SYS_MBOX_NULL) {
>      while (sys_arch_mbox_fetch(conn->recvmbox, &mem, 1) !=
SYS_ARCH_TIMEOUT) {

>        if (conn->type == NETCONN_TCP) {
>          if(mem != NULL)
>            pbuf_free((struct pbuf *)mem);
>        } else {
>          netbuf_delete((struct netbuf *)mem);
>        }
>      }
>
> *** PACKET CAN ARRIVE HERE ***
>      sys_mbox_free(conn->recvmbox);
>      conn->recvmbox = SYS_MBOX_NULL;
>    }
>
> In that case a sys_mbox_free can be called on a non-empty mbox. This
may
> either leak a pbuf, or cause an assertion in the OS from the
underlying
> mbox implementation. I've been thinking about how to fix this. Either
> new sys_arch primitives could be added. Or, better, we could try to
use
> SYS_LIGHTWEIGHT_PROT, but this seems a bit dubious as we need to
protect
> from before the sys_arch_mbox_fetch, until the conn->recvmbox =
> SYS_MBOX_NULL.

Doesn't the API_MSG_DELCONN message posted just prior to the above code
already prevent any further packets from being placed in the recvmbox?

Don Krueger

======================================================================
 This electronic message transmission and any attachments are
 confidential and/or proprietary and may constitute legally privileged information of
 Meso Scale Diagnostics, LLC. The information is intended for solely
 the use of Mailing list for lwIP users ([hidden email]).  If you are not
 the intended recipient, you are hereby notified that any
 disclosure, copying, distribution or the taking of any action in
 reliance of this information is strictly prohibited. You are not
 authorized to retain it in any form nor to re-transmit it, and
 you should destroy this email immediately.

 If you have received this electronic transmission in error,
 please notify us by telephone (240-631-2522) or by electronic
 mail to the sender of this email, Krueger, Don ([hidden email]),
 immediately.
 =====================================================================




_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users
Reply | Threaded
Open this post in threaded view
|

Re: netconn_delete and _recv not thread safe

Jonathan Larmour
Krueger, Don wrote:
 >Jifl wrote:
>>  A packet, or a connection to a listening socket, may arrive in
>> the gap between removing mbox items, and deleting the mboxes. For
>> example:
[snip]
>
> Doesn't the API_MSG_DELCONN message posted just prior to the above code
> already prevent any further packets from being placed in the recvmbox?

It seems that you're right. After having thought I "fixed" the problem, it
then showed itself up again a bit later.

The problem I was having evidently wasn't what I had thought it was - the
real problem was related to sys_arch_mbox_fetch being called with a timeout
of 1. My OS does timeouts by providing absolute tick values for the kernel
clock, not relative timeouts. A timeout of 1 can mean that a context swap
can cause the timeout to appear to be in the past, which causes an
immediate "timeout" failure. This means that not all messages in the mbox
get removed before the mbox is deleted (I had been thinking that they must
have just arrived).

I'm now treating a timeout of 1 specially as if it were a poll, and so far
all is well.

Sorry for the false alarm.

Jifl
--
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
------["The best things in life aren't things."]------      Opinions==mine


_______________________________________________
lwip-users mailing list
[hidden email]
http://lists.nongnu.org/mailman/listinfo/lwip-users