Poor RX performance, misconfigured lwipopts?

josephjah
Hello everyone,

I've been troubleshooting an RX performance issue for a couple of weeks now
and while I've made incremental gains and improved my own driver-side code,
I'm still missing something big here and I suspect that I'm likely just
misusing lwIP or have a bad config. Any help would be greatly appreciated.


Setup:

 - Port: Unix
 - Hardware: Modern desktop-class hardware for both RX and TX
 - API: Socket (I know the raw API is more performant).
 - Protocol of concern: TCP


Test setup:

For each of the following tests I'm sending and receiving blobs of data of
exponentially increasing size (from 4 MB to 256 MB). All sizes exhibit roughly
the same problem over the course of their TCP stream.

 - (1) OS native sockets TX -> OS native sockets RX (for baseline
comparison)
 - (2) lwIP socket API TX -> OS native RX (no issue here)
 - (3) OS native TX -> lwIP socket RX (poor performance)
 - (4) lwIP socket TX -> lwIP socket RX (exceedingly poor performance)

 - In the attached capture file, the native OS stack (11.7.7.15) is sending
to lwIP (11.7.7.130).

 I have tested against the macOS and Linux stacks; both react similarly to
the odd lwIP behavior.


Observations:

In all test configurations (1/2/3/4) at home on my LAN I get roughly
comparable throughput and everything behaves just fine. However when I test
from a workstation at my office to home things change. For test (2) I see
about 90% of baseline, but for test (3) I usually see 25% of baseline (very
rarely it performs around 90% like test (2)). And test (4) is absolutely
horrendous with about 5% of baseline.

When looking at a packet capture (attached), I see a smattering of duplicate
ACKs, retransmissions, and in some cases ACKs that seem out of order or
extremely old but that are not duplicates!

It appears that eventually the sending side decides to reduce the segment
size.

I've turned on DEBUG types and observed delayed ACKs from lwIP.

I've also turned on STATS and am not noticing any errors whatsoever.

Conclusions:

 - Given that I only see this from my office, I suspect that either packet
loss or the increased latency is creating a situation that my current lwIP
config isn't handling well.
 - Since I can observe >95% of baseline when on my LAN, I doubt there is a
bottleneck in my code or lwIP itself.
 - Since the performance seems to get worse with the addition of (lwIP RX)
on either side, I suspect that if I fix my issue in test (3) it should also
fix test (4).


Theories:

 - After much reading, a common theme of dropped frames comes up. I've
inspected and simplified my ethernet driver to the point where I don't
believe this is a possibility, especially given how well it performs on my
LAN.
 - I've read that TCP_TMR_INTERVAL is tick based and not based on an actual
timer. I've toyed around with this value, lowering it to 1 and raising it to
4000, but I feel this is a shortsighted approach that is asking for trouble.
I've looked at the unix port and it does look like it's using a clock, so
I'm somewhat confused on this point.
 - Initially I thought this could be a window scaling issue so I've bumped
it up to a pretty high value for testing.
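
A rough way to sanity-check the window theory: a single TCP connection can
have at most one receive window in flight per round trip, so its throughput
is capped at window / RTT. The C sketch below uses hypothetical helper names
(tcp_max_scaled_window, tcp_window_ceiling), and any RTT fed into it is an
assumption, since the thread does not state the office-to-home round-trip
time.

```c
#include <assert.h>
#include <stdint.h>

/* Largest window that can be advertised with a given window-scale shift:
 * the 16-bit TCP window field shifted left by the scale factor
 * (the shift is limited to 14 by RFC 7323). */
static uint32_t tcp_max_scaled_window(unsigned shift)
{
    assert(shift <= 14);
    return (uint32_t)0xFFFFu << shift;
}

/* Upper bound on single-connection throughput in bytes per second:
 * at most one receive window can be in flight per round trip. */
static double tcp_window_ceiling(uint32_t rcv_wnd_bytes, double rtt_seconds)
{
    assert(rtt_seconds > 0.0);
    return (double)rcv_wnd_bytes / rtt_seconds;
}
```

With a scale shift of 3, tcp_max_scaled_window(3) gives 0x7fff8 (524280)
bytes, which matches the TCP_WND in the lwipopts.h posted below. At an
assumed 20 ms RTT, tcp_window_ceiling(524280, 0.020) comes out at roughly
26 MB/s, so the window only becomes the limit if the baseline is faster than
that.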

Questions:

 - Am I out in the weeds?
 - What else can I do to narrow down the issue?
 - What are some reasonable values to explore in lwipopts.h given my setup?

Thanks in advance for any assistance!
 - Joseph

lwipopts.h:

#define LWIP_MTU          2800
#define LWIP_CHKSUM_ALGORITHM          2
// memory
#define MEMP_NUM_NETCONN          1024
#define MEMP_NUM_NETBUF          2
#define MEMP_NUM_TCPIP_MSG_API          64
#define MEMP_NUM_TCPIP_MSG_INPKT        64
#define PBUF_POOL_SIZE                  128
#define TCP_DEFAULT_LISTEN_BACKLOG      0xff
// arp
#define ARP_TABLE_SIZE                  64
#define ARP_MAXAGE                      300
#define ARP_QUEUEING                    1
#define ARP_QUEUE_LEN                   3
// ip
#define IP_REASS_MAXAGE                 15
#define IP_REASS_MAX_PBUFS              32
// tcp
#define TCP_TMR_INTERVAL                100
#define TCP_WND                         0x7fff8
#define TCP_MAXRTX                      12
#define TCP_SYNMAXRTX                   12
#define LWIP_TCP_SACK_OUT               1
#define LWIP_TCP_MAX_SACK_NUM           4
#define TCP_MSS                         (LWIP_MTU - 40)
#define TCP_SND_BUF                     (64 * TCP_MSS)
#define TCP_SND_QUEUELEN                (64 * (2 * (TCP_SND_BUF/TCP_MSS)))
#define TCP_SNDLOWAT                    (0xffff - (4*TCP_MSS) - 1)
#define TCP_SNDQUEUELOWAT               LWIP_MAX(((TCP_SND_QUEUELEN)/2), 5)
#define TCP_WND_UPDATE_THRESHOLD        LWIP_MIN((TCP_WND / 4), (TCP_MSS * 4))
#define LWIP_WND_SCALE                  1
#define TCP_RCV_SCALE                   3
// tcpip
#define TCPIP_MBOX_SIZE                 0
#define LWIP_TCPIP_CORE_LOCKING         1
#define LWIP_TCPIP_CORE_LOCKING_INPUT   1
// netconn
#define LWIP_NETCONN_FULLDUPLEX         0
// netif
#define LWIP_SINGLE_NETIF               0
#define LWIP_NETIF_HWADDRHINT           1
#define LWIP_NETIF_TX_SINGLE_PBUF       0
#define TCPIP_THREAD_PRIO               1
 

lwip_poor_rx_perf_subset.pcapng
<http://lwip.100.n7.nabble.com/file/t1811/lwip_poor_rx_perf_subset.pcapng>  



--
Sent from: http://lwip.100.n7.nabble.com/lwip-users-f3.html

_______________________________________________
lwip-users mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-users

Re: Poor RX performance, misconfigured lwipopts?

josephjah
I forgot to mention that I'm running 2.1.2, but I doubt that will make a
difference.

Thanks.




Re: Poor RX performance, misconfigured lwipopts?

Sergio R. Caprile
Your msg is too long for me, I'm too lazy to read it and too dumb to
keep focus at the same time.
Your capture file is long too, but fortunately retransmissions happen
right at the beginning.

I see you are ACKing 100 ms later, several frames later.
I see (at least once) that you ACK a frame and milliseconds later you ACK
again, even several times in a row (frame #177 and starting at #182).
That looks (to me) like a time base problem; check your sys_now() and
your port. I'm more of the bare metal type so I can't tell you much more
on how to set up an OS port. I saw the unix port long ago and used it
as bare metal; I don't know how it handles timing info for lwIP (nor
sockets, btw).
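
For reference when checking the time base: lwIP expects sys_now() to return
the current time in milliseconds and derives its timeouts from it. The
following is a minimal sketch for a POSIX host, assuming a monotonic clock is
available; it is not necessarily what the unix port does, and uint32_t stands
in for lwIP's u32_t so that it compiles without the lwIP headers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime() in strict C modes */
#endif

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Milliseconds since an arbitrary start point, taken from a monotonic
 * clock so that wall-clock adjustments cannot move lwIP's timers.
 * The value wraps around after about 49.7 days. */
uint32_t sys_now(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc; /* silences the unused warning when NDEBUG is defined */
    return (uint32_t)((uint64_t)ts.tv_sec * 1000u +
                      (uint64_t)ts.tv_nsec / 1000000u);
}
```

A quick check is to call it twice around a known sleep and compare the
difference with the sleep time, using unsigned subtraction so a wrap-around
between the two calls still gives the right result.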

And... 2814 bytes per frame? Jumbo frames? Can you try with an MTU that is
more common over the Internet? Just in case.
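
The 2814-byte frames are consistent with the posted lwipopts.h: an LWIP_MTU
of 2800 plus a 14-byte Ethernet header. A possible test configuration, as a
sketch only, would drop to the standard Ethernet MTU of 1500 and keep the
poster's own TCP_MSS formula (LWIP_MTU is the macro name used in that file;
the 1500 value is the assumption here):

```c
/* Hypothetical test values: standard Ethernet MTU instead of 2800. */
#define LWIP_MTU    1500
/* MTU minus a 20-byte IPv4 header and a 20-byte TCP header = 1460. */
#define TCP_MSS     (LWIP_MTU - 40)
```

Options derived from TCP_MSS in the same file, such as TCP_SND_BUF, shrink
accordingly.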

Try running a perf test over UDP; this takes the timers out of the
picture so you can check for possible frame loss. The UDP datagrams
should be numbered, though.
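
A minimal sketch of the receiver-side bookkeeping for such a numbered UDP
test, assuming the sender puts a 32-bit sequence number starting at 0 in
every datagram. The struct and function names are hypothetical, and the
socket code that extracts the number is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct seq_stats {
    uint32_t next_expected; /* sequence number we expect to see next */
    uint32_t received;      /* datagrams seen in total */
    uint32_t gaps;          /* numbers skipped: presumed lost (upper bound) */
    uint32_t late_or_dup;   /* arrived behind next_expected: late or duplicate */
};

/* Feed the sequence number of each received datagram, in arrival order. */
static void seq_stats_update(struct seq_stats *s, uint32_t seq)
{
    assert(s != NULL);
    s->received++;
    if (seq == s->next_expected) {
        s->next_expected = seq + 1;
    } else if ((int32_t)(seq - s->next_expected) > 0) {
        /* Jumped ahead: everything in between is missing so far. */
        s->gaps += seq - s->next_expected;
        s->next_expected = seq + 1;
    } else {
        /* Behind the expected number: reordered or duplicated datagram. */
        s->late_or_dup++;
    }
}
```

Start from a zero-initialized struct seq_stats. Because a late datagram is
not subtracted from gaps, gaps minus late_or_dup gives a rough loss estimate
when duplicates are rare.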

In any case, violating threading rules causes lots of strange artifacts;
make sure you don't.


Re: Poor RX performance, misconfigured lwipopts?

josephjah
Sergio,

Thank you for taking a look and thank you for the suggestions. After
considering your idea about testing with just UDP I decided to try with ICMP
packet loss measurements and found out that indeed it was my driver that was
dropping frames on occasion. Ugh.

Fixing my driver-side code has resolved all of the issues previously
mentioned.

 - Joseph


