TCP retransmission flooding at end of stream


TCP retransmission flooding at end of stream

Michael Steinecke
Hello Folks,

I'm currently struggling with an application for a custom STM32F429ZG based board using lwIP.
According to our requirements, I need to achieve a throughput of at least 2 MBit/s. Given the hardware, this should be possible.

The application is based on STM32CubeMX V4.3.0 and FW Library 1.3.0. However, I've written my own zero-copy ethernetif driver for both RX and TX.

The MCU based device acts as a TCP/IP server speaking our own application protocol, implemented using the raw TCP API.
The implementation is inspired by the HTTP server example, and the throughput guidelines from the lwIP wiki have been followed.
The FW Library bug in the Ethernet IRQ that eats fast packets is fixed. I'm using custom memory pools, so there is most likely no memory issue.
All task priorities are also set correctly.

To achieve maximum throughput, I have made some further, potentially dangerous changes:
- The MSS, TCP_SND_BUF, pbuf len, etc. have been widened from u16_t to u32_t. The MTU is still 1500. The problem with u16_t is that it limits the send buffer to a maximum of 0xFFFE bytes: in some lwIP checks, 0xFFFF is interpreted as 0x0000, but I need segments of 0xFFFF bytes for fast processing of SD card pages.
- I decreased the TCP timer intervals from 250 ms to 10 ms. An even higher rate tends to produce a lot of retransmissions.
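The u16_t limitation from the first point can be illustrated with plain C integer truncation. This is a standalone sketch (not lwIP source) showing why a 16-bit length field cannot hold a 0x10000-byte buffer and why 0xFFFF is one increment away from wrapping to 0:

```c
#include <assert.h>
#include <stdint.h>

/* Standalone illustration (not lwIP code): a 16-bit length field
 * silently truncates modulo 65536, so a 0x10000-byte length becomes 0,
 * and off-by-one arithmetic on 0xFFFF wraps back to 0 as well. */
static uint16_t to_u16_len(uint32_t len)
{
    return (uint16_t)len; /* truncated to the low 16 bits */
}
```

For example, to_u16_len(0xFFFF) is still 0xFFFF, but to_u16_len(0x10000) is 0, which is why the send buffer effectively tops out below 64 KiB with 16-bit types.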

Currently I can achieve almost the desired speed for a single file transfer. However, at the end of the transfer there is one strange bug, and I couldn't find a similar case in the forums so far.
Before the last frame of the last segment is sent, lwIP starts to retransmit a bunch of TCP packets. The actual number varies, but is usually somewhere between 10 and 25 frames.

Interestingly there is always this pattern:
MCU -> PC Packet 1460 Bytes [ACK]
MCU -> PC Packet 1460 Bytes [ACK]
PC -> MCU ACK on both, 0 Bytes
MCU -> PC Packet 1460 Bytes [ACK]
MCU -> PC Packet 1460 Bytes [ACK]
PC -> MCU ACK on both, 0 Bytes
...
MCU -> PC Packet 1460 Bytes [ACK]
MCU -> PC Packet 1460 Bytes [ACK, PSH]
PC -> MCU ACK on both, 0 Bytes
...


Then the following occurs on the last segment:
(for this example, the Seq is relative to the beginning of the segment)
MCU -> PC Packet 1460 Bytes [ACK] [Seq: 0000]
MCU -> PC Packet 1460 Bytes [ACK] [Seq: 1460]
PC -> MCU ACK on both, 0 Bytes [Ack: 2920]
MCU -> PC Packet 1460 Bytes [ACK] [Seq: 2920]
MCU -> PC Packet 1460 Bytes [ACK] [Seq: 4380]
PC -> MCU ACK on both, 0 Bytes [Ack: 4380]
...
MCU -> PC Packet 1460 Bytes [TCP Retransmission] [ACK] [Seq: 0000]
PC -> MCU 0 Bytes [TCP DUP ACK] [Ack: 4380]
MCU -> PC Packet 1460 Bytes [TCP Retransmission] [ACK] [Seq: 2920] (Packet with Seq 1460 is NOT retransmitted!)
PC -> MCU 0 Bytes [TCP DUP ACK] [Ack: 4380]
MCU -> PC Packet 1460 Bytes [TCP Retransmission] [ACK] [Seq: 4380]
PC -> MCU 0 Bytes [TCP DUP ACK] [Ack: 4380]
...
MCU -> PC Packet 1198 Bytes [ACK, PSH] [Seq: 65700]
PC -> MCU ACK on last one, sending new command, 38 Bytes [ACK, PSH] [Seq: 66898]
MCU -> PC Packet 1198 Bytes [TCP Retransmission] [ACK, PSH] [Seq: 65700]
PC -> MCU 0 Bytes [TCP DUP ACK #1] [ACK, PSH] [Seq: 66898]
MCU -> PC Packet 1198 Bytes [TCP Retransmission] [ACK, PSH] [Seq: 65700]
PC -> MCU 0 Bytes [TCP DUP ACK #2] [ACK, PSH] [Seq: 66898]
MCU -> PC Packet 1198 Bytes [TCP Retransmission] [ACK, PSH] [Seq: 65700]
PC -> MCU 0 Bytes [TCP DUP ACK #3] [ACK, PSH] [Seq: 66898]
PC -> MCU ACK on last one, sending new command, 38 Bytes [TCP Retransmission] [ACK, PSH] [Seq: 66898]
...

Afterwards, nearly every packet is transmitted at least twice in both directions; effectively, no data transfer is possible anymore.
In the pcap file, 192.168.111.200 is the MCU and .63 the PC. The interesting part starts with frame 2056, the first retransmitted frame. Frame 2057 is not retransmitted - that means frame 2058 must have been processed by lwIP, right?

The data is passed to LwIP by tcp_write().
One segment consists of two writes:
tcp_write(pcb, [ptr to 8 bytes of header], 8, TCP_WRITE_FLAG_COPY | TCP_WRITE_FLAG_MORE)
tcp_write(pcb, [ptr to up to 65535 bytes of data], len, 0) // the data is stored in elements of the memory pool and freed by the tcp_sent callback
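For context, that two-call pattern looks roughly like this against the raw API (a sketch with a hypothetical send_page wrapper; it cannot run outside an lwIP build, and error handling is abbreviated):

```c
#include "lwip/tcp.h"

/* Sketch of the two-write pattern described above (send_page is a
 * hypothetical name). The 8-byte header is copied (TCP_WRITE_FLAG_COPY)
 * and flagged TCP_WRITE_FLAG_MORE so it can be coalesced with the
 * payload into one segment. The payload is queued by reference
 * (zero-copy), so the pool element must stay valid until the
 * tcp_sent() callback reports it acknowledged. */
static err_t send_page(struct tcp_pcb *pcb, const u8_t header[8],
                       const void *data, u16_t len)
{
    err_t err = tcp_write(pcb, header, 8,
                          TCP_WRITE_FLAG_COPY | TCP_WRITE_FLAG_MORE);
    if (err != ERR_OK) {
        return err;  /* typically ERR_MEM: retry later from tcp_sent() */
    }
    return tcp_write(pcb, data, len, 0);  /* no copy: free after ACK */
}
```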

According to lwip_stats there is no memory leak and no packet drop. Has anyone seen something similar? Any clue?
There are some other scenarios leading to similar behavior.


SD_download_breaks.pcap


Re: TCP retransmission flooding at end of stream

goldsimon@gmx.de
Michael Steinecke wrote:
> currently I'm struggling while creating an application for a custom
> STM32F429ZG based custom board using LwIP.
Too bad the F429 Discovery board doesn't have an Ethernet connector, or I could try to reproduce this :-)
> To achieve maximum throughput, I have done some further, potential dangerous
> changes:
> - The MSS, TCP_SND_BUF, pbuf len, etc is increased from u16_t to u32_t. The
> MTU is still 1500. The problem with u16_t is the limitation the send buffer
> to a maximum of 0xFFFE bytes. For some checks in LwIP 0xFFFF is interpreted
> as 0x0000, but is need to have segments of 0xFFFF bytes for a fast
> processing of SD Card pages.

Which version of lwIP are you using? Do you know that we support TCP
window scaling by now (LWIP_WND_SCALE)?
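For reference, window scaling is enabled through lwipopts.h in versions that support it; the values below are a purely illustrative sketch, not settings taken from this thread:

```c
/* lwipopts.h -- illustrative sketch, not values from the thread.
 * LWIP_WND_SCALE enables the TCP window scale option; TCP_RCV_SCALE is
 * the shift applied to the advertised receive window, allowing windows
 * larger than the 64 KB limit of an unscaled 16-bit field. */
#define LWIP_WND_SCALE 1
#define TCP_RCV_SCALE  2   /* advertise the receive window scaled by 1 << 2 */
```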

> - I decreased the TCP timer intervals from 250 ms to 10 ms. A even higher
> rate tends to produce a lot of retransmissions.
You should really not need to do this! I'd rather expect it to cause more
problems than it solves, especially since your main issue is sending
data, not receiving.

I can't help you much on the rest, I'm afraid.


Simon

_______________________________________________
lwip-users mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-users

Re: TCP retransmission flooding at end of stream

Krzysztof Wesołowski
> I need to achieve a throughput of at least 2 MBit/s, according our
> requirements. Due to the HW this should be possible.

I am not sure why you decided to go in such an extreme direction with your changes.

We are almost able to saturate a 100 MBit connection (>8 MB/s) and upload about 2 MB/s from SD card on an STM32F407 with an RMII-attached PHY (some Micrel KSZ...).

Are you using WiFi in your setup? With Ethernet networks we only needed to tune memory settings in lwipopts; there was no need to change types and/or polling intervals.

Have you benchmarked whether the bottleneck really lies within lwIP-related code?

Regards,
Krzysztof Wesołowski,



Re: TCP retransmission flooding at end of stream

Michael Steinecke
goldsimon@gmx.de wrote
Which version of lwIP are you using? Do you know that we support TCP
window scaling by now (LWIP_WND_SCALE)?
Indeed, I forgot about this one. It's the version provided by the STM32CubeMX tool; a diff shows it's identical to lwIP 1.4.1. I didn't know that. I guess you refer to patch 7858? I will apply it.

goldsimon@gmx.de wrote
> - I decreased the TCP timer intervals from 250 ms to 10 ms. A even higher
> rate tends to produce a lot of retransmissions.
You should really not need to do this! I rather expect more problems
than anything being solved. Especially when your main issue is sending
data, not receiving.
I've tested again with 250 ms - the only difference in behavior seems to be a much lower transmission rate. I achieve about 200 kB/s.

Krzysztof Wesołowski wrote
I am not sure why you decided to go in such extreme direction with your
changes.

We are almost able to saturate 100MBit connection (>8 MB/s) and upload
about 2MB/s from SD Card on STM32F407 with RMII attached PHY (Some Micrels
KSZ...)

Are you using some WiFi in your setup? With Ethernet networks we only
needed to tune memory in lwipopts, and there was no need to change types
and/or polling interval.

Have you benchmarked if the need for optimization really is within LwIP
related code?
I started porting our STM32F1-based board to the new MCU about a month ago. The old design had a WizNet W5300 Ethernet IC, which implements the TCP/IP stack in hardware, so I'm absolutely not sure my changes are going in the right direction. However, initially I struggled with very high round-trip times and a low throughput of about 5 kBit/s: the PC seemed to resend unacknowledged packets after about 200 ms. Also, I'm using the tcp_poll callback to enqueue new data to the stack, in the context of the tcpip_thread. For both reasons it seemed natural to reduce the intervals. On the other hand, a bigger send window means less memory management outside of the stack, which seems to be quite efficient. Now I can achieve 2 MBit/s (until the error occurs), so yes, it seems to be influenced by the stack. (5 kB/s -> 70 kB/s was achieved thanks to the zero-copy driver.)

On the other hand, I guess there are still several other areas for improvement.

Thanks!

Re: TCP retransmission flooding at end of stream

Sergio R. Caprile
In reply to this post by Michael Steinecke
Is this the (in)famous ST-lost-frames bug again?
Translation: is your port running on an RTOS with an Rx task fired from
the Eth interrupt and taking only the first frame out of the chip?



Re: TCP retransmission flooding at end of stream

goldsimon@gmx.de
In reply to this post by Michael Steinecke
Sergio,


Michael Steinecke wrote:

> The FW Library bug in the Ethernet IRQ, eating fast packets is fixed.

So no, this does not seem to be the standard STM issue...


Simon


Re: TCP retransmission flooding at end of stream

Sergio R. Caprile
In reply to this post by Michael Steinecke
>> The FW Library bug in the Ethernet IRQ, eating fast packets is fixed.
>So no, this does not seem to be the standard STM issue...

Oh, I see, I missed that part. Should we believe the vendor? (terrified face)
Anyway, here are my 2 cents:

- Frame 16: bad FCS on the ARP response from MCU to PC - why?
- Your DHCP on UDP port 55555: turn it off, just in case; you don't seem to be using it.
- Frame 2094: yes, 2058's ACK has been seen, but 2057's not. Then the Seq#s sometimes jump by more than 1460, so some frames were lost and some not.
- Frames 2162(3,5): an ARP request is not seen by lwIP - frame loss.
	You are definitely having an event that triggers frame losses. Where it is, I can't tell.
	You said this is a custom board. I once had something like this where my driver went out of sync with the Eth chip by incorrectly reading the number of available bytes.
		Please run known-to-work code first; this looks to me like an Eth driver problem.
- You say you are using tcp_poll() to enqueue data. Don't do that if you aim for performance; it is just there to avoid state machines on connection closures and some other good stuff, not for streaming data.
	You should start sending your data from tcp_recv() after parsing the request, and then keep steadily sending from tcp_sent().

> According to lwip_stats there is no memory leak and no packet drop
Well, lwIP can only count *packet* drops, not *frame* loss.
And a memory leak is tricky: is it possible you are freeing a wrong pointer, or freeing in the wrong place? Try sending and freeing at the same place, that is tcp_sent(), and leave tcp_poll() aside for now.
Check the web server or SMTP client sending functions.
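The tcp_sent()-driven approach suggested above can be sketched as follows; transfer_t and its fields are hypothetical names for the per-connection state, and a real implementation would free the acked pool pages where the comment indicates:

```c
#include "lwip/tcp.h"
#include "lwip/def.h"

/* Hypothetical per-connection transfer state for the sketch below. */
typedef struct {
    const u8_t *data;  /* whole payload to stream */
    u32_t len;         /* total length */
    u32_t off;         /* bytes already handed to tcp_write() */
} transfer_t;

/* Sketch of a tcp_sent() callback that keeps the send buffer full:
 * acked data would be freed here, then as much new data as fits is
 * queued, stopping when the buffer is full and resuming on the next
 * tcp_sent() call. */
static err_t on_sent(void *arg, struct tcp_pcb *pcb, u16_t acked)
{
    transfer_t *x = (transfer_t *)arg;
    LWIP_UNUSED_ARG(acked);  /* a real driver frees acked pool pages here */

    while (x->off < x->len) {
        u16_t chunk = (u16_t)LWIP_MIN(x->len - x->off, tcp_sndbuf(pcb));
        if (chunk == 0 ||
            tcp_write(pcb, x->data + x->off, chunk, 0) != ERR_OK) {
            break;  /* buffer full; the next ACK re-enters this callback */
        }
        x->off += chunk;
    }
    tcp_output(pcb);  /* push queued segments out now */
    return ERR_OK;
}
```

It would be registered with tcp_sent(pcb, on_sent), with tcp_recv() making the first tcp_write() after parsing the request.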

  



Re: TCP retransmission flooding at end of stream

Michael Steinecke
Sergio R. Caprile wrote
>>/ The FW Library bug in the Ethernet IRQ, eating fast packets is fixed./
>So no, this does not seem to be the standard STM issue...

Oh, I see, missed that part. Should we believe the vendor ? (terrified face)
I think there is another related bug as well. The semaphore used to signal new packets is a binary one, but it should be a counting one. At least once I observed that the semaphore was already full due to really fast packets, and the later packet eventually got lost.
ethernetif.c Line 280 in low_level_init():
  s_xSemaphore = osSemaphoreCreate(osSemaphore(SEM) , 1 );
should be:
  s_xSemaphore = xSemaphoreCreateCounting(ETH_RXBUFNB, 0);
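For completeness, the interrupt side then gives that counting semaphore once per receive interrupt. This is a sketch assuming a plain FreeRTOS port; the handler name and the split of the ISR are hypothetical:

```c
#include "FreeRTOS.h"
#include "semphr.h"

extern SemaphoreHandle_t s_xSemaphore;  /* counting, created as above */

/* Sketch: signal the rx task once per receive interrupt, so a wakeup
 * is queued for every frame in a back-to-back burst rather than being
 * collapsed into a single binary-semaphore give. */
void eth_rx_irq_handler(void)  /* hypothetical name for the ISR body */
{
    BaseType_t higher_prio_woken = pdFALSE;
    xSemaphoreGiveFromISR(s_xSemaphore, &higher_prio_woken);
    portYIELD_FROM_ISR(higher_prio_woken);
}
```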


Sergio R. Caprile wrote
- Frame 16: bad FCS on ARP response from MCU to PC, why ?
Good one! I had an old version of Wireshark that didn't report this. I'm currently doing a bit of research, and it seems that I have an erroneous calculation of the total pbuf length in my RX driver. Later on, the packet is reused by etharp for the response; I guess this could lead to the wrong CRC value. On the other hand, the CRC is calculated by hardware, so it should always be correct, right? What happens when lwIP or the ETH peripheral detects a corrupt CRC on RX - is the packet lost or dropped?

Sergio R. Caprile wrote
- Your DHCP on UDP port 55555, turn it off, just in case, you don't seem to be using it
The UDP 55555 broadcast is a discovery broadcast of our application protocol and is in use. Turning off DHCP made no difference.
Sergio R. Caprile wrote
- Frame 2094: Yes, 2058's ACK has been seen, but 2057's not. Then, Seq#s jump at sometimes more than 1460, so some frames were lost, some not.

- Frame 2162(3,5) ARP request is not seen by lwIP, frame loss
        You are definitely having an event that triggers frame losses. Where is it, I can tell.
        You said this is a custom board, I had once something like this where my driver went out of sync with the eth chip by incorrectly reading available bytes.
                Please run known to work code first, this looks to me like an eth driver problem
I will have a closer look and eventually switch back to the original driver.


Sergio R. Caprile wrote
- You say you are using tcp_poll() to enqueue data. Don't do that if you aim for performance, that is just to avoid state machines on connection closures and some other good stuff, not for streaming data.
        You should start sending your data from your tcp_recv() parsing the request and then keep steady sending from your tcp_sent()
So, what is the best way to handle the following scenarios?
The device is a measurement device, continuously acquiring data from external ADCs at 5-20 kHz, normally 18 bytes per sample. The data is transferred by SPI and DMA to a cache in FMC-connected SDRAM. The pointer handling must be done at the highest IRQ priority level, even higher than ETH.

a) The client (PC) requests a large file from the SD card. Currently, I pass the request from tcp_recv() to the SD card gatekeeper task, which reads the data page by page (64k). As soon as one page has been read, it queues a structure with some commands and control variables, as well as the pointer to the memory. That command is read from the queue by tcp_poll() and the data is written using tcp_write(). Eventually, the data is freed in tcp_sent(). Also, if the send buffer was full, tcp_sent() would enqueue the missing data. Between the initial request from the client and the end of the transfer, I don't get any new client requests.

b) Nearly the same scenario, but the data comes not from the SD card but directly from the SDRAM of the current acquisition. As soon as a big block of data is available, a command like in case a) is posted to the same queue. Scenario b) can be triggered by a client request or by an external trigger signal.

The responsible tasks run at a lower priority than the tcpip_thread.

Sergio R. Caprile wrote
> According to lwip_stats there is no memory leak and no packet drop
Well, lwIP can only count *packet* drops, not *frame* loss.
And memory leak is tricky, is it possible you are freeing a wrong pointer or in the wrong place ? Try sending and freeing at the same place, that is tcp_sent(), let tcp_poll() aside for now.
Check the web server or smtp client sending functions.
Currently I don't believe there is a memory issue; the behavior is independent of the amount of data involved. However, for some reason there are lost frames, I agree. I'll have a closer look at the drivers again, then go back to the examples.

Thanks for your input!

Re: TCP retransmission flooding at end of stream

goldsimon@gmx.de
Michael Steinecke wrote:
> Sergio R. Caprile wrote
>>>>/ The FW Library bug in the Ethernet IRQ, eating fast packets is fixed./
>>>So no, this does not seem to be the standard STM issue...
>>
>> Oh, I see, missed that part. Should we believe the vendor ? (terrified
>> face)
>
> I think there is another related bug as well. The semaphore to signal new
> packets is a binary one. It should be a counting one. I had it at least

Ehrm, wasn't that the STM bug we were talking about? At least I think I remember something like that being discussed on this list...

So you haven't fixed it? That could explain retransmissions...


Simon


Re: TCP retransmission flooding at end of stream

Michael Steinecke
Simon Goldschmidt wrote
Michael Steinecke wrote:
> Sergio R. Caprile wrote
>>>>/ The FW Library bug in the Ethernet IRQ, eating fast packets is fixed./
>>>So no, this does not seem to be the standard STM issue...
>>
>> Oh, I see, missed that part. Should we believe the vendor ? (terrified
>> face)
>
> I think there is another related bug as well. The semaphore to signal new
> packets is a binary one. It should be a counting one. I had it at least

Ehrm, wasn't that the STM bug we were talking about? At least I think I remember something like that being discussed on this list...

So you haven't fixed it? That could explain retransmissions...
No, I had fixed that one the entire time. The post I'd read regarding the famous bug talked about http://lists.nongnu.org/archive/html/lwip-users/2014-03/msg00033.html
That has been fixed by ST in the meantime.

Re: TCP retransmission flooding at end of stream

Jens Nielsen
Hi

Just to clarify, there are two flavours of the STM bug:

The first and most common one is where you receive one packet that triggers an
interrupt, and before the rx thread wakes up and services the packet, you
receive another packet which triggers a new interrupt. Since ST only used a
binary semaphore and services one packet per taken semaphore, you would miss
the second packet. This can be solved with a counting semaphore.

The second and more obscure one is where you receive one packet, but before
the interrupt flag is cleared you receive another packet, i.e. you only get
one interrupt for two packets. In that case your counting semaphore will only
be given once and (assuming you still have the "service one packet per taken
semaphore" code) you will only service one packet. My solution was to loop
and poll the DMA for several packets each time the rx thread wakes up (and by
doing this you actually don't need a counting semaphore).
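The loop-and-poll approach can be sketched like this; low_level_input() and rx_sem stand in for the port's own functions and are assumptions, not quotes from any actual driver:

```c
#include "lwip/pbuf.h"
#include "lwip/netif.h"
#include "lwip/sys.h"

extern sys_sem_t rx_sem;                       /* given by the Eth ISR */
extern struct pbuf *low_level_input(struct netif *netif);

/* Sketch of the rx thread body: after each wakeup, drain ALL pending
 * frames from the DMA descriptors instead of servicing exactly one.
 * low_level_input() is assumed to return NULL when nothing is pending,
 * which also covers one interrupt arriving for two frames. */
static void ethernetif_input_thread(void *arg)
{
    struct netif *netif = (struct netif *)arg;
    for (;;) {
        sys_arch_sem_wait(&rx_sem, 0);         /* block until the ISR fires */
        struct pbuf *p;
        while ((p = low_level_input(netif)) != NULL) {
            if (netif->input(p, netif) != ERR_OK) {
                pbuf_free(p);                  /* stack refused the frame */
            }
        }
    }
}
```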

BR /Jens



Re: TCP retransmission flooding at end of stream

Michael Steinecke
deft wrote
Just to clarify, there are two flavours of the STM bug:
GOTCHA!
When I looked for the fix, I somehow missed the loop in low_level_input(). I'm really blind...
I've also fixed the bad ARP checksum, which was caused by my driver.

The throughput now reaches 22 MBit/s!
Thank you all

Re: TCP retransmission flooding at end of stream

Sergio R. Caprile
In reply to this post by Michael Steinecke
For any Dragon Ball Z fans out there, this STM bug looks like Majin Buu
to me...
Anyway, glad you managed to solve your issue, Michael. The next user with an
STM bug will be charged ;^) I wonder if the SICS can take donations...

As for tcp_poll() vs tcp_sent() in your scenario, it depends on what
is more important for your main task. The fastest way to transmit is by
filling the TCP buffer from tcp_sent(), just after freeing the acked
pbuf. This way you can keep the buffer full and respond quickly to
window changes (if any). However, you'll keep your hardware dedicated to
this task, and if it is not your highest priority, that might not be in
your main interest. If you instead just fill the current buffer and let
tcp_poll() wake you up to keep sending, you probably won't saturate
your link, but you will use less processing power. The send loops in
servers mostly work from tcp_sent(); this is OK when serving "regular"
amounts of data. For CGI handlers that might serve long logs, I like to
insert a pause once in a while in case the rest of the tasks need to
breathe, but I use bare metal, not an RTOS. In any case, I think it is
clearer if you handle data sending in the tcp_sent() callback and keep
the polling for closure or for resuming after a pause (if any).




Re: TCP retransmission flooding at end of stream

goldsimon@gmx.de
In reply to this post by Michael Steinecke
Sergio R. Caprile wrote:
> Anyway, glad you managed to solve your issue Michael, next user with an
> STM bug will be charged ;^) I wonder if the SICS can take donations...

I would take donations as well :-) I'm not getting paid for this, and my slooow 2007 MacBook is one of the reasons I've been disliking development lately ;-)

However, sadly, ST doesn't produce notebooks, or do they?

Simon