[bug #57445] LWIP_NETCONN_FULLDUPLEX: Assertion failed: sockets[i].select_waiting == 0

Simon Goldschmidt

                 Summary: LWIP_NETCONN_FULLDUPLEX: Assertion failed: sockets[i].select_waiting == 0
                 Project: lwIP - A Lightweight TCP/IP stack
            Submitted by: pschlang
            Submitted on: Thu 19 Dec 2019 09:42:42 AM UTC
                Category: sockets/netconn
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None
            lwIP version: git head




I've discovered a possible issue with lwip_select() when
LWIP_NETCONN_FULLDUPLEX is enabled.

When closing a socket which is being lwip_select()ed on from another
task/thread, the socket might end up in a state where it is closed but
select_waiting is not decremented properly. This will trigger the assertion
"sockets[i].select_waiting == 0" when re-allocating that socket.

I'll try to explain the failure mechanism:

1. A thread calls lwip_select() to wait for events on a specific socket.

2. lwip_select() will increment the used count for each socket via
lwip_select_inc_sockets_used() to ensure it's not freed during the select.

3. After increasing the select_waiting for the socket but before decrementing
it again (i.e. while waiting for events), the socket is closed from another
thread/task. Since the socket is still in use by lwip_select(),
fd_free_pending will be set.

4. In lwip_select(), the loop to decrease select_waiting is entered. In the
loop, tryget_socket_unconn_locked() is used to retrieve the socket structure.
For the socket closed in (3), tryget_socket_unconn_locked() will return NULL
because fd_free_pending is set (checked in sock_inc_used_locked()). Since
tryget_socket_unconn_locked() returned NULL, lwip_select() will correctly set
nready to -1 and errno to EBADF, but it never decrements select_waiting for
that socket.

5. lwip_select_dec_sockets_used() is used to decrement the used count before
returning -1 from lwip_select(). The used count of the closed socket will
become 0 and the socket is actually freed, but select_waiting is still 1.

6. Later, when the socket structure is re-used in alloc_socket(), the
assertion "sockets[i].select_waiting == 0" fails.

Is this an issue in lwIP or am I just using it in an unsupported way?

I've prepared a small patch which fixes the problem in my tests. Since I'm not
an expert on lwIP internals, I'd appreciate it if somebody could double-check
whether the fix is valid.




File Attachments:

Date: Thu 19 Dec 2019 09:42:42 AM UTC  Name:
0001-Fix-select_waiting-not-being-decremented-for-sockets.patch  Size: 1KiB  
By: pschlang




  Message sent via Savannah
