Throughput benchmark question

Dave Nadler

Hi - Newbie here trying to do some basic throughput tests.
LwIP 2.1.2 on FreeRTOS 9, STM32F429, IPv4, TCP.

I want to see how much I can consistently push through the stack.
Made a simple test server (sockets API) which repeatedly outputs 101-character lines.
I access the server via PuTTY raw mode on Winbloze over a local network.
I can usually send 3 lines per msec for a second (3000 lines in 1 second), but...
Sometimes, I get ~ 1-second pauses (as seen in Putty or TeraTerm).
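
For scale, the target rate above (3 lines per millisecond, 101 characters each) comes to roughly 2.4 Mbit/s of payload. A minimal sketch of that arithmetic, assuming header overhead is ignored; the function name is illustrative only:

```c
#include <stdint.h>

/* Payload throughput in bits per second for a stream of fixed-size lines.
 * TCP/IP/Ethernet header overhead is deliberately ignored. */
static uint32_t payload_bits_per_second(uint32_t bytes_per_line,
                                        uint32_t lines_per_second)
{
    return bytes_per_line * lines_per_second * 8u;
}
```

Calling payload_bits_per_second(101, 3000) gives 2,424,000 bit/s, which is small compared with a 100 Mbit/s link, so the link itself is unlikely to be the limit here.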

How should I go about understanding where the pauses come from?

Thanks in advance for any hints,
Best Regards, Dave

-- 
Dave Nadler, USA East Coast voice (978) 263-0097, [hidden email], Skype 
 Dave.Nadler1

_______________________________________________
lwip-users mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-users

Re: Throughput benchmark question

Johan Borkhuis
Dave,

The first thing would be to sniff the network using Wireshark and see what
happens there when the traffic pauses. This usually gives a good
indication of what happened.

Regards,
Johan


Re: Throughput benchmark question - nasty pauses

Dave Nadler
I figured out how to get the Wireshark trace,
but how to get the Wireshark GUI to output the summary below as text baffles me; I hope the pic is OK:
Everything is going swimmingly until frame 4316.
I don't understand the meaning of "previous segment not captured" here; presumably something got dropped.
And then it takes a second to get going again.
Any pointers appreciated!
Thanks,
Best Regards, Dave

Re: Throughput benchmark question - nasty pauses

Dave Nadler
I naively expected that after receiving the "duplicate ACK" signalling a dropped packet,
LwIP would immediately retransmit the dropped packet.
Instead there is a 1.5 second pause (see Wireshark trace below).
Why is that?
Sorry if that's a dumb question; I'm a newbie with this...
Thanks,
Best Regards, Dave

PS: Might this be related to the pauses seen by UAZ ?
I'm also using FreeRTOS but with preemption enabled (unlike UAZ).
Test application uses sockets interface as follows:

        const int testCnt = 1000;
        const int linesPerCycle = 3;
        static const TickType_t xDelay = pdMS_TO_TICKS(1); /* Block for 1 ms. */
        static char testData[] = "xxx123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456\n\r";
        const size_t lineLen = sizeof(testData) - 1; /* 101 characters; do not send the terminating NUL */
        char saveMe = testData[3]; /* sprintf("%03d") writes 3 digits, then its NUL at index 3 */
        TickType_t startTick = xTaskGetTickCount();
        uint32_t elapsedTicks = 0;
        for(int i=0; i<testCnt; i++) {
            sprintf(testData,"%03d", i); /* overwrite the leading "xxx" with the line counter */
            testData[3] = saveMe; // replace terminating 0 written by sprintf
            for(int j=0; j<linesPerCycle; j++) _send(user->socketFD, testData, lineLen);
            // Delay to simulate desired throughput, but try not to fall behind...
            // (comparison below assumes a 1 ms tick, so one loop pass per tick)
            elapsedTicks = xTaskGetTickCount()-startTick;
            if(elapsedTicks > (uint32_t)(i+1)) continue; // skip delay if we've got behind
            vTaskDelay( xDelay ); // may delay for a lot longer than requested, for example delayed by processing in LwIP thread...
        }
        char buf[96];
        snprintf(buf, sizeof(buf), "Elapsed ticks=%lu for %d lines (%f msec per line, using %d-line sets)\n\r", (unsigned long)elapsedTicks, linesPerCycle*testCnt, ((float)elapsedTicks)/(linesPerCycle*testCnt), linesPerCycle);
        _send(user->socketFD, buf, strlen(buf));
        return;




Re: Throughput benchmark question - nasty pauses

goldsimon@gmx.de
On 28.02.2019 at 17:42, Dave Nadler wrote:
> I naively expected that after receiving the "duplicate ack" signalling a
> packet dropped,
> LwIP would immediately re-transmit the dropped packet.

No, TCP fast retransmission starts after 3 dupacks only.
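
The three-dupack rule can be sketched as a small counter. This is a simplified illustration, not lwIP's actual code: the struct and function names are made up, and real stacks apply extra conditions (no payload, unchanged window, outstanding data, sequence wraparound) before counting an ACK as a duplicate.

```c
#include <stdbool.h>
#include <stdint.h>

#define DUPACK_THRESHOLD 3u  /* fast retransmit threshold, as in RFC 5681 */

struct sender_state {
    uint32_t last_ack;   /* most recent cumulative ACK number seen */
    uint8_t  dupacks;    /* consecutive duplicates of last_ack */
};

/* Feed one incoming ACK number.
 * Returns true exactly when the third duplicate arrives. */
static bool on_ack(struct sender_state *s, uint32_t ackno)
{
    if (ackno == s->last_ack) {
        if (s->dupacks < UINT8_MAX) {
            s->dupacks++;
        }
        return s->dupacks == DUPACK_THRESHOLD;
    }
    /* A different ACK number: restart the duplicate count. */
    s->last_ack = ackno;
    s->dupacks = 0;
    return false;
}
```

With a single duplicate ACK, as in the trace discussed here, on_ack() never returns true, so only the retransmission timer can recover the lost segment.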

> Instead there is a 1.5 second pause (see Wireshark trace below).
> Why is that?

Yeah, that's a little strange. I would have expected ~250ms for the RTO
retransmission, but then again, it depends on your roundtrip time. But
it could well be 1.5 seconds because your timers are not coming at the
expected interval?
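
To illustrate how timer granularity alone can stretch a retransmission: if the timeout is counted in whole ticks of a periodic timer, the real delay depends on the tick period and on where in the period the segment was sent. The sketch below is generic arithmetic, not lwIP code. The 500 ms figure used in the note afterwards is an assumption (lwIP's slow TCP timer is, as far as I know, 2 x TCP_TMR_INTERVAL with a 250 ms default), and the tick count is hypothetical.

```c
#include <stdint.h>

/* Delay until a tick-driven retransmission timer expires.
 * tick_ms:        period of the periodic timer
 * rto_ticks:      timeout expressed in whole ticks (must be >= 1)
 * sent_offset_ms: how far into the current tick period the segment was
 *                 sent (0 <= sent_offset_ms < tick_ms)
 */
static uint32_t tick_timer_delay_ms(uint32_t tick_ms, uint32_t rto_ticks,
                                    uint32_t sent_offset_ms)
{
    uint32_t until_first_tick = tick_ms - sent_offset_ms;
    return until_first_tick + (rto_ticks - 1u) * tick_ms;
}
```

With tick_ms = 500 and a hypothetical timeout of 3 ticks, the delay falls between just over 1.0 s and 1.5 s. Whether that is what happens in this trace has not been established in the thread.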

Regards,
Simon


Re: Throughput benchmark question - nasty pauses

Dave Nadler
Thanks Simon. The FreeRTOS tick is running at 1 ms.
The test setup is a laptop and a Nucleo board on my desk, connected through a hub, so the round trip should be fast, no?
What other timers could be involved?
Where should I look for the LwIP <-> low-level timer setup/use?
Possibly ST's implementation is sub-optimal.

Thanks for the help,
Best Regards, Dave


Re: Throughput benchmark question - nasty pauses

uaz
In reply to this post by Dave Nadler
Hey Dave,

Your problem sounds quite similar to mine, except that mine gets completely stuck.
Have you checked your FreeRTOS trace to determine where your application
stops sending packets or freezes in any thread?

I verified mine using Segger SystemView and can clearly see that it
pauses/freezes in a certain thread for a very long time without any systick or
ETH interrupt.
I'll be trying the RAW API since I've been unable to debug this issue after spending
some time on it.

Btw, I'm using an NXP MCU with their supplied LwIP and FreeRTOS ports (upgraded
to the latest version myself, but the driver layer remains the same).

Regards,
UAZ




Re: Throughput benchmark question - nasty pauses

Dave Nadler
On 2/28/2019 6:44 PM, uaz wrote:
> Hey Dave,
>
> Your problem sounds quite similar to mine, except that mine gets completely stuck.
> Have you checked your FreeRTOS trace to determine where your application
> stops sending packets or freezes in any thread?
>
> I verified mine using Segger SystemView and can clearly see that it
> pauses/freezes in a certain thread...

What thread?
Anything that might be breakpointable?
Mine's not stuck, just unreasonable delay.
Let me know and I'll dig into it here in the next couple of days...

> for a very long time without any systick or
> ETH interrupt.
> I'll be trying the RAW API since I've been unable to debug this issue after spending
> some time on it.
>
> Btw, I'm using an NXP MCU with their supplied LwIP and FreeRTOS ports (upgraded
> to the latest version myself, but the driver layer remains the same).
>
> Regards,
> UAZ

Thanks,
Best Regards, Dave



Re: Throughput benchmark question - nasty pauses

uaz
It is stuck in tcpip_thread() in tcpip.c, or in my own UDP server thread.
I'm not sure if 'stuck' is the exact description, but it is spending too
much time in the above threads without switching or interrupts.
For now, I'm unable to find the exact location of the freeze.

If your other threads are able to run normally during the pause, it means
that our problem is not the same.

Below is my Segger trace for your reference:
<http://lwip.100.n7.nabble.com/file/t2097/sysview.png>
During the pause, systick which is supposed to trigger every 1ms did not
trigger.




Re: Throughput benchmark question - nasty ~1.5 second pauses

Dave Nadler
In reply to this post by uaz
A bit more info about where it stops. Any help where to look next appreciated!

I paused the application in the debugger during the extensive (~1.5 second) pause:
- tcp thread is waiting here:
    /* wait for a message, timeouts are processed while waiting */
    TCPIP_MBOX_FETCH(&tcpip_mbox, (void **)&msg);
  Inside tcpip_timeouts_mbox_fetch that's waiting with a 250msec timeout in:
  UNLOCK_TCPIP_CORE();
  res = sys_arch_mbox_fetch(mbox, msg, sleeptime);

- application thread is waiting inside lwip_netconn_do_write  at  sys_arch_sem_wait(LWIP_API_MSG_SEM(msg), 0);
  That's called by tcpip_send_msg_wait_sem here:
  LOCK_TCPIP_CORE();
  fn(apimsg);
- the ethernetif_input thread is waiting in  if (osSemaphoreWait( s_xSemaphore, TIME_WAITING_FOR_INPUT)==osOK)
  This is an ST-provided module that just waits for packets and dispatches them to netif->input

Thanks in advance for any pointers,
Best Regards, Dave

On 3/1/2019 3:43 PM, Dave Nadler wrote:
> Hi UAZ - I'm using ST not NXP, so different driver and LwIP glue layer.
> Can you tell *which* systick did not count (i.e. ARM core or some other timer)?
> Can you see where in the code it's stuck?
> I'm using the sockets interface and TCP for my server.
> I'll add an "LED blinky" thread to ensure other threads are running OK in my case.
>
> If I could figure out how to breakpoint the code that's pausing I could probably sort this...
>
> Thanks,
> Best Regards, Dave
>
> PS: Hope to work on this a bit more next week. Will stick with sockets API.
> Raw API would make coding harder in my case.



Re: Throughput benchmark question - nasty ~1.5 second pauses

Sergio R. Caprile
ST driver code wasn't freeing all frames in the Ethernet controller on
ints, do you have the "working" version ? (can't tell you which one)
I particularly don't like watching screen snapshots, if you post a
capture I can try to look and let you know if I see something.
At birdseye, looks like you are missing frames, I would start looking there.


Re: Throughput benchmark question - nasty ~1.5 second pauses

Dave Nadler
Thanks for the help Sergio, I'm a newbie...

On 3/15/2019 8:20 AM, Sergio R. Caprile wrote:
> ST driver code wasn't freeing all frames in the Ethernet controller on ints,

What's an "int"?

> do you have the "working" version ? (can't tell you which one)

No idea - I'm using whatever ST has circulated with the recent version of their "CubeMX" package.
How can I find out if it's the "working" version?

> I particularly don't like watching screen snapshots, if you post a
> capture I can try to look and let you know if I see something.

Please let me know what format you prefer and I'll try to obtain that from Wireshark on the PC.

> At birdseye, looks like you are missing frames, I would start looking there.

The initial problem is a missing frame, which should not cause grief.
But the BIG problem is that it takes ~1.5 seconds for the LwIP app to retransmit, no?
Thanks again for the help!
Best Regards, Dave



Re: Throughput benchmark question - nasty ~1.5 second pauses

Sergio R. Caprile
In reply to this post by Sergio R. Caprile
OK, I'll try to be more precise.
ST driver code wasn't properly handling receive interrupts from the
Ethernet controller. They just took the first frame in the buffer
without thinking more could have arrived since the interrupt fired.
Those frames remained there sleeping until a new one arrived, causing
delays and frame loss.
This is supposed to have been fixed, but from time to time I see people
telling the story that revision X for hw A has the problem that was
fixed in revision W for hw B, so...
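
The receive-side fix described above amounts to looping until no completed frame is left, instead of taking one frame per interrupt. A hedged sketch with a simulated descriptor ring follows; the types and names are invented for illustration and do not correspond to ST's HAL or driver code.

```c
#include <stdbool.h>
#include <stddef.h>

#define RX_RING_SIZE 4u

/* Hypothetical RX descriptor: 'ready' would be set by the DMA engine. */
struct rx_desc {
    bool ready;
    int  frame_id;
};

struct rx_ring {
    struct rx_desc desc[RX_RING_SIZE];
    size_t next;   /* index of the next descriptor to inspect */
};

/* Called once per receive interrupt. Keeps taking frames until the next
 * descriptor is not ready (or 'out' is full), because several frames may
 * have completed before the handler runs.
 * Returns the number of frame ids written to 'out'. */
static size_t rx_drain(struct rx_ring *r, int *out, size_t out_cap)
{
    size_t count = 0;
    while (count < out_cap && r->desc[r->next].ready) {
        out[count++] = r->desc[r->next].frame_id;
        r->desc[r->next].ready = false;          /* hand the slot back */
        r->next = (r->next + 1u) % RX_RING_SIZE;
    }
    return count;
}
```

A handler that stops after the first frame would leave the second one sitting in the ring until the next interrupt, which is the delay pattern described above.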

I would first analyze a traffic capture to determine the reason for the
low throughput, there can be delays, lost frames; you could have a
broken driver, a broken port, a broken application...
You should use a known-to-work application (one of those in the apps
directory or in the contrib tree, I used a netio long ago and I guess
there is an iperf there now) so you can rule that out and check for
driver/port issues.
Most people use Wireshark, pcap or pcapng is OK. Post a link if possible.

Cheers


Re: Throughput benchmark question - nasty ~1.5 second pauses

Dave Nadler
Thanks Sergio, more info below...

On 3/15/2019 10:57 AM, Sergio R. Caprile wrote:
> OK, I'll try to be more precise.
> ST driver code wasn't properly handling receive interrupts from the
> Ethernet controller. They just took the first frame in the buffer
> without thinking more could have arrived since the interrupt fired.
> Those frames remained there sleeping until a new one arrived, causing
> delays and frame loss.
> This is supposed to have been fixed, but from time to time I see people
> telling the story that revision X for hw A has the problem that was
> fixed in revision W for hw B, so...

I'll go study this. At first look, the ISR seems to assume one buffer received per
DMA-completion interrupt, but the documentation clearly says there
can be more than one. Looks bad... I'll research and report back.

> I would first analyze a traffic capture to determine the reason for the
> low throughput, there can be delays, lost frames; you could have a
> broken driver, a broken port, a broken application...
> You should use a known-to-work application (one of those in the apps
> directory or in the contrib tree, I used a netio long ago and I guess
> there is an iperf there now) so you can rule that out and check for
> driver/port issues.
> Most people use Wireshark, pcap or pcapng is OK. Post a link if possible.

Here's the capture (same session as screen image posted earlier):
http://www.nadler.com/backups/20190227_Lwip_pause.pcapng

To recap: LwIP 2.1.2 on FreeRTOS 9, STM32F429, IPv4, TCP.
I want to see how much I can consistently push through the stack.
Made a simple test server (sockets API) which repeatedly outputs 101-character lines.
I access the server via PuTTY raw mode on Winbloze over a local network.
I can usually send 3 lines per msec for a second (3000 lines in 1 second), but...
Sometimes, I get ~ 1-second pauses (as seen in PuTTY or TeraTerm).

Everything is going swimmingly until frame 4316.
The Windows client notes a missing segment and issues a duplicate ACK as expected.
This exact pattern is quite repeatable.
FreeRTOS is running happily during the evil pause (LED blinky task uninterrupted).

Why does the LwIP application take ~1.5 seconds to retransmit the data?

> Cheers

Again, thanks for your time and any hints...
Best Regards, Dave



Re: Throughput benchmark question - nasty ~1.5 second pauses

goldsimon@gmx.de
On 15.03.2019 at 22:56, Dave Nadler wrote:

> Why does the LwIP application take ~1.5 seconds to retransmit the data?

In the absence of more dup-acks (there's only one at that time, packet
#4317), only the RTO timer triggers retransmissions.
Without having a look at the spec or at the code, I can't tell if these
1.5 seconds are expected or not. They certainly seem too long for a
decent, robust communication flow, but then again, you have a somewhat
"atypical" ping-pong of very short data. Maybe TCP is just not up to
what you want (in terms of "real-time"). Protocols like Modbus-TCP
implement retries on top of TCP, maybe that's an option for you?

Regards,
Simon


Re: Throughput benchmark question - nasty ~1.5 second pauses

antonio
Hi Simon,
I am using LwIP 2.1.2. I read through this thread, and I am having a problem
similar to the one described here. Capture:
TCP_Dup_Ack_FIN_ACK_RST.pcapng
<http://lwip.100.n7.nabble.com/file/t1901/TCP_Dup_Ack_FIN_ACK_RST.pcapng>

Description:
"At some point the LwIP stack on the client skips an expected segment (a TCP packet
of 1460 bytes) and the server sends a dup ACK requesting the missing
segment. However, the LwIP stack still does not send it and remains silent for
over 500 ms. At this point, the server triggers a connection close. More than a
second after the connection-close handshake, the client sends the missing
segment, which triggers the server to reset the connection. The LwIP stack
invokes callbacks into the iperf_client, which is what is seen in the console
logs."




Re: Throughput benchmark question - nasty ~1.5 second pauses

Sergio R. Caprile
In reply to this post by Sergio R. Caprile
mmm...
The bug I mentioned is on the rx side, I see you are losing frames on
the tx side. There should be a frame between #4206 and #4207 that is
either lost inside your device or on its way to your PC.
The same pattern repeats where you mention, there is a missing frame
between #4315 and #4316.
I would hunt this instead.
You don't seem to lose ACKs.
The retransmission triggers a bit late for what I like, but as Simon
points out, you have an unusual pattern. TCP always wants to wait before
sending, if you don't cram its buffer and just sit waiting for the
response, it will retransmit when it gets tired of waiting. But if you
want to keep sending, it will probably "insist" earlier.
As I mentioned before, try a known application. I would chase for those
missing frames on the output side, though.
You could also check your port is providing correct timing, but I guess
we can consider the FreeRTOS port as "standard" and "working" ?
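
Spotting a missing frame in a capture comes down to checking that each segment starts where the previous one ended. A minimal sketch, assuming the (sequence number, length) pairs have already been extracted from the capture in arrival order; the struct and function names are illustrative, and retransmitted or reordered segments would also be reported as a gap by this simple check.

```c
#include <stddef.h>
#include <stdint.h>

struct seg {
    uint32_t seq;   /* TCP sequence number of the first payload byte */
    uint32_t len;   /* payload length in bytes */
};

/* Returns the index of the first segment that does not start where the
 * previous one ended, or n if the stream is contiguous. */
static size_t first_gap(const struct seg *segs, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        /* unsigned addition wraps modulo 2^32, like TCP sequence numbers */
        uint32_t expected = segs[i - 1].seq + segs[i - 1].len;
        if (segs[i].seq != expected) {
            return i;
        }
    }
    return n;
}
```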

Cheers



Re: Throughput benchmark question - nasty ~1.5 second pauses

Dave Nadler
On 3/18/2019 9:15 AM, Sergio R. Caprile wrote:
> mmm...
> The bug I mentioned is on the rx side, I see you are losing frames on
> the tx side. There should be a frame between #4206 and #4207 that is
> either lost inside your device or on its way to your PC.
> The same pattern repeats where you mention, there is a missing frame
> between #4315 and #4316.
> I would hunt this instead.

To debug this, can you suggest a convenient place in LwIP to record the most
recent sequence# transmitted (or at least passed to the dubious driver)?
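
One low-overhead way to record this is a small ring buffer of the last few sequence numbers, inspected from the debugger when the pause occurs. The sketch below is generic C with invented names. Where to call it is an assumption on my part: candidates would be lwIP's tcp_output_segment() in tcp_out.c, or the port's low_level_output() in ethernetif.c after parsing the TCP header, and neither was confirmed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

#define TXLOG_SIZE 8u   /* power of two, so indexing survives counter wrap */

struct txlog {
    uint32_t seq[TXLOG_SIZE];
    uint32_t count;          /* total entries ever recorded */
};

/* Record one sequence number; call wherever a segment is handed down. */
static void txlog_record(struct txlog *l, uint32_t seqno)
{
    l->seq[l->count % TXLOG_SIZE] = seqno;
    l->count++;
}

/* Fetch the entry 'age' records back (0 = most recent).
 * Returns 0 on success, -1 if that entry is not (or no longer) stored. */
static int txlog_get(const struct txlog *l, uint32_t age, uint32_t *out)
{
    if (age >= l->count || age >= TXLOG_SIZE) {
        return -1;
    }
    *out = l->seq[(l->count - 1u - age) % TXLOG_SIZE];
    return 0;
}
```

Comparing the logged numbers against the Wireshark capture would show whether the missing segment ever reached the point where the log is written.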

> You don't seem to lose ACKs.
> The retransmission triggers a bit late for what I like, but as Simon
> points out, you have an unusual pattern. TCP always wants to wait before
> sending, if you don't cram its buffer and just sit waiting for the
> response, it will retransmit when it gets tired of waiting. But if you
> want to keep sending, it will probably "insist" earlier.
> As I mentioned before, try a known application. I would chase for those
> missing frames on the output side, though.
>
> You could also check your port is providing correct timing, but I guess
> we can consider the FreeRTOS port as "standard" and "working" ?

FreeRTOS and timers appear solid; I've got a blinky task that keeps
blinking as expected during this nasty pause.

The ST-provided glue (driver, FreeRTOS-LwIP-driver binding code),
well that's a whole nuther terrible mess...

Thanks Sergio,
Best Regards, Dave


Re: Throughput benchmark question - nasty ~1.5 second pauses

Sergio R. Caprile
In reply to this post by Sergio R. Caprile
If I were to debug this, I would use UDP, you can move a pin when
sending and be sure the msg will at least get to the driver (where we
think it is lost). TCP... well...
Then, every entry to the driver should end with a safe exit and due to
DMA a later "done" interrupt where the driver frees the buffer. (with no
DMA, the buffer is freed when sent to the chip and things are way simpler)

A driver framework is simple, but a DMA driver gets complicated. I would
try putting breakpoints at troublesome decision points and try to catch
errors or unexpected conditions.
I mean, most of the time frames go through, so something happens once in
a while; what is this that happens ? What can be ?
Is it copying memory to DMA ring buffers ?
Most of the times the driver should get an available ring buffer, what
happens when it doesn't ? Is it happening ?
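
The "no ring buffer available" case is easy to make visible with a counter. The sketch below simulates a transmit descriptor ring in plain C; the names and the ownership flag are hypothetical and do not reflect ST's actual descriptor layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE 4u

struct tx_ring {
    bool     owned_by_dma[TX_RING_SIZE]; /* true while the DMA owns the slot */
    size_t   head;                       /* next slot the driver will fill */
    uint32_t dropped;                    /* frames refused: no free descriptor */
};

/* Try to queue one frame. Returns true if a descriptor was available.
 * On failure the drop counter is incremented so the condition is visible;
 * a breakpoint on that line catches the rare case. */
static bool tx_enqueue(struct tx_ring *r)
{
    if (r->owned_by_dma[r->head]) {
        r->dropped++;
        return false;
    }
    r->owned_by_dma[r->head] = true;
    r->head = (r->head + 1u) % TX_RING_SIZE;
    return true;
}

/* Called from the "transmit done" interrupt: release one slot. */
static void tx_complete(struct tx_ring *r, size_t slot)
{
    r->owned_by_dma[slot] = false;
}
```

If the real driver returns silently in the equivalent of the failure branch, a nonzero counter after a pause would point at the transmit path as the place where frames disappear.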
I'd suggest asking for help in ST forums.
Good luck!
