LWIP_ASSERT on pbuf_free function

lwip-users mailing list

Hi everyone,


I'm using MCUXpresso 10.2.1 and a custom board based on the MK65F2M0 chip.
I have a problem using lwIP: it sometimes crashes on an assertion, apparently because of an invalid parameter.


The workflow is:

  • The board is configured with a main connection over a GPRS modem and an optional second LAN connection
  • About every 15 seconds the board connects to a server and sends one of two types of packet, one smaller and one slightly bigger. The bigger one is typically sent once every 3 or 4 times
  • Most of the time the server closes the connection (when no operation on the device is requested), but sometimes it continues the session and exchanges some data (when an operation on the device is requested). In this debug session the server closes the connection every time
  • The problem is that after a while (5 minutes on average, but sometimes after 20 minutes), the board hits an assertion while releasing a network buffer, apparently because it is trying to free a buffer that has already been freed


Here is a snippet of code:


   int sock;
   int opt;
   struct sockaddr_in outerdata;
   int rn;
   int i;   /* loop counter for the receive loop below */


   outerdata.sin_addr.s_addr = htonl( address );
   outerdata.sin_family = AF_INET;
   outerdata.sin_port = htons( port );
   sock = socket( AF_INET, SOCK_STREAM, 0 );


   //set non blocking
   opt = 1;
   ioctl( sock, FIONBIO, &opt );


   connect( sock, ( struct sockaddr * )&outerdata, sizeof( outerdata ));


   //set blocking
   opt = 0;
   ioctl( sock, FIONBIO, &opt );


   send( sock, &data, sizeof( data ), 0 );


   struct timeval tv;
   tv.tv_sec = 2; /* 2 Secs Timeout */
   tv.tv_usec = 0;


   setsockopt( sock, SOL_SOCKET, SO_RCVTIMEO, ( const void * )&tv, sizeof( struct timeval ));


   //total timeout: 10 s
   for( i = 0; i < 5; i++ )
   {
      rn = recv( sock, ( void * )&rxdata, sizeof( rxdata ), 0 );
      if(( rn < 0 ) && ( errno == EAGAIN ))   /* errno is only meaningful after a failed call */
      {
         PRINTF( "WAIT SERVER\n" );
         continue;
      }
      if( rn < 0 )
      {
         PRINTF( "ERROR %d ON RECV\n", errno );
         break;
      }
      if( rn == 0 )
      {
         PRINTF( "CONNECTION CLOSED\n" );
         break;
      }
      goto rx_ok;
   }
   closesocket( sock );
   return( rn );


rx_ok:;

   //exchange data
   ...


The problem occurs only when the bigger packet is sent and the first recv() call times out (I see the "WAIT SERVER" message on the debug monitor exactly once), but it does not occur every time these conditions are met.
The assertion fires in the pbuf_free() function, at the statement LWIP_ASSERT("pbuf_free: p->ref > 0", p->ref > 0);


The image below shows the Wireshark captured session of the smaller packet:


The image below shows the Wireshark captured session of the bigger packet:


The image below shows the Wireshark captured session of the error condition; note that the TCP Retransmission packets are present because the device is stopped on the breakpoint:


I enabled lwIP statistics, and all resource usage is below the configured limits.


How can I debug this situation?


Here is another situation that may be caused by the same issue:


Something to notice in the memp_TCP_SEG variable: the stats struct shows that no element is currently in use and that the maximum number of elements ever in use since the application started is 3. The tab variable points to the linked list of all free elements. With no elements in use, the list should link all 26 elements, but as can be seen, it contains only one.
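The consistency rule described above (elements in use plus elements on the free list should equal the pool size) can be sketched as a small check. This is only a model under stated assumptions: `demo_elem`, `demo_free_list_len` and `demo_pool_consistent` are hypothetical names, not lwIP types or functions, and the real pool layout should be taken from the lwIP sources in use. lwIP's opt.h also describes MEMP_SANITY_CHECK and MEMP_OVERFLOW_CHECK options that may be worth enabling, depending on the version.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-list node; a stand-in for a pool element, not lwIP's own type. */
struct demo_elem {
    struct demo_elem *next;
};

/* Walk the free list and count its elements.
 * Returns -1 if more than max_elems nodes are seen, which suggests a cycle
 * or a pointer that has wandered outside the pool. */
static int demo_free_list_len(const struct demo_elem *head, int max_elems)
{
    int n = 0;

    assert(max_elems >= 0);
    while (head != NULL) {
        if (++n > max_elems) {
            return -1;
        }
        head = head->next;
    }
    return n;
}

/* Returns 1 when used + free == total, 0 otherwise. */
static int demo_pool_consistent(const struct demo_elem *head, int used, int total)
{
    int free_len = demo_free_list_len(head, total);

    return (free_len >= 0) && (used + free_len == total);
}
```

With a pool of 26 elements, 0 in use and a free list of length 1, `demo_pool_consistent()` returns 0. Running such a check periodically, or from a debugger hook, could narrow down when the list gets truncated.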


How can this happen?


Many thanks

Biafra

 

_______________________________________________
lwip-users mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-users
Re: LWIP_ASSERT on pbuf_free function

Sergio R. Caprile
I'm not well versed in sockets here, but since you are using sockets,
there shouldn't be application-related pbuf free issues if threading
rules are respected.
Somewhere in your vendor provided code, someone did not play by the
rules. You have an OS, you have a port, you have a driver, you have many
places to search.
You should check for modified code and for threading misuse; the rule is
that all code for one socket runs in one thread, and all low-level code
runs in one thread.
I would probably start by tracing the free operation to its caller and
working out why it is trying to free the buffer, or why it has been freed
before, and by whom. Most of the time, when it is not a threading issue,
it is a driver issue.
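
One way to trace a free back to its caller is to record where each release happened, so that a second release can report both call sites. The following is a minimal sketch of that idea, not lwIP code: `traced_buf`, `traced_free_at` and `TRACED_FREE` are hypothetical names, and in a real port the equivalent bookkeeping would have to be added around the actual free calls in the driver and port layers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reference-counted buffer with bookkeeping for the last release. */
struct traced_buf {
    int         ref;             /* remaining references */
    const char *last_free_file;  /* call site of the last successful release */
    int         last_free_line;
};

/* Release one reference and remember the call site.
 * Returns 1 when the last reference was dropped, 0 when references remain,
 * and -1 when the buffer had no references left (a double free). */
static int traced_free_at(struct traced_buf *b, const char *file, int line)
{
    assert(b != NULL);
    if (b->ref <= 0) {
        fprintf(stderr, "double free at %s:%d, previously freed at %s:%d\n",
                file, line,
                b->last_free_file ? b->last_free_file : "(unknown)",
                b->last_free_line);
        return -1;
    }
    b->ref--;
    b->last_free_file = file;
    b->last_free_line = line;
    return (b->ref == 0) ? 1 : 0;
}

/* Capture the caller's file and line automatically. */
#define TRACED_FREE(b) traced_free_at((b), __FILE__, __LINE__)
```

Calling `TRACED_FREE()` twice on a buffer with one reference prints both call sites on the second call, which answers the "by whom" question without a debugger attached.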

