LWIP problems

LWIP problems

Trampas Stern
We finally shipped the product using LWIP and now customers are complaining that the network interface is not working properly on their network.  They cannot load web pages; the connection appears to be too slow.

The only explanation I can come up with is that their network carries enough broadcast and ARP packets to overwhelm LWIP's ability to handle the messages, but I am not sure.

Has anyone encountered these problems before? 

Thanks
Trampas

_______________________________________________
lwip-users mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-users

Re: LWIP problems

Yasir Arafat
I would suggest checking whether the MAC ID is unique. A duplicate MAC was one of the main reasons for a slow network interface in my case.


Re: LWIP problems

Trampas Stern
The MAC address is unique unless someone copied our company MAC range. 


Re: LWIP problems

Stephen Cowell
In reply to this post by Trampas Stern
We have asked customers to reduce the mask and create a smaller subnet... this has worked for us. 
--
Stephen Cowell
Project Manager/Engineer
Plasmability LLC
Office (512) 267-7087
Cell  (512) 632-8593
www.plasmability.com

Re: LWIP problems

Trampas Stern
I cannot access the customer network to find the root cause, so I am working on modifying LWIP to collect some stats on the various packet types, to see whether the issue is the broadcast messages or something else.

I just find the LWIP code so confusing that it takes me forever to work through the N layers of abstraction and understand what is going on.
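
For readers attempting the same kind of instrumentation, a minimal sketch of per-type receive counters follows. It is not lwIP code: `struct rx_counters` and `rx_count_frame` are hypothetical names, and the sketch assumes it is called from the Ethernet driver's receive path with an untagged frame (destination MAC first, no VLAN header) before the frame is handed to the stack.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-category receive counters; not part of lwIP. */
struct rx_counters {
    uint32_t unicast;
    uint32_t broadcast;
    uint32_t multicast;
    uint32_t arp;
    uint32_t ipv4;
    uint32_t other_type;
    uint32_t runt;
};

#define ETH_HDR_LEN    14u
#define ETHERTYPE_IPV4 0x0800u
#define ETHERTYPE_ARP  0x0806u

/* Classify one raw Ethernet frame by destination address and EtherType. */
static void rx_count_frame(struct rx_counters *c, const uint8_t *frame, size_t len)
{
    static const uint8_t bcast[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
    uint16_t ethertype;

    if (len < ETH_HDR_LEN) {
        c->runt++;               /* too short to carry an Ethernet header */
        return;
    }

    if (memcmp(frame, bcast, 6) == 0) {
        c->broadcast++;
    } else if (frame[0] & 0x01u) {
        c->multicast++;          /* group bit set in the first address octet */
    } else {
        c->unicast++;
    }

    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype == ETHERTYPE_ARP) {
        c->arp++;
    } else if (ethertype == ETHERTYPE_IPV4) {
        c->ipv4++;
    } else {
        c->other_type++;
    }
}
```

Comparing the broadcast and ARP counts against the unicast count over a known interval would show whether background traffic really dominates what the device receives.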

Trampas


Re: LWIP problems

Terry Barnaby-2
Hi,

We did see an issue like this on one of our LWIP systems. Our system's networking was going very slowly at a customer's site.

In our case it was due to the customer's network carrying a lot of video multicast packets, combined with a simplistic LWIP Ethernet driver. The driver enabled reception of multicast packets (which we were using for auto discovery), but it did so by allowing any packet through for LWIP processing.
We fixed this by modifying the LWIP Ethernet driver to allow only the particular multicast packets we needed through (the appropriate IP address).
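
The kind of filter described here can be sketched in software as follows. This is an illustration under assumptions, not the driver change that was actually made: `ipv4_mcast_to_mac` and `rx_accept_frame` are hypothetical names, a single multicast group is assumed, and the group address is taken in host byte order. The mapping from an IPv4 group to its Ethernet address (01:00:5e plus the low 23 bits of the group) is the standard one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Map an IPv4 multicast group (host byte order) to its Ethernet MAC:
 * 01:00:5e followed by the low 23 bits of the group address. */
static void ipv4_mcast_to_mac(uint32_t group, uint8_t mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (uint8_t)((group >> 16) & 0x7f);
    mac[4] = (uint8_t)((group >> 8) & 0xff);
    mac[5] = (uint8_t)(group & 0xff);
}

/* Decide whether a received frame should be passed on to the stack. */
static bool rx_accept_frame(const uint8_t *dst_mac,
                            const uint8_t own_mac[6],
                            const uint8_t wanted_mcast[6])
{
    static const uint8_t bcast[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};

    if (memcmp(dst_mac, own_mac, 6) == 0) {
        return true;             /* unicast to this device */
    }
    if (memcmp(dst_mac, bcast, 6) == 0) {
        return true;             /* broadcast is still needed for ARP and DHCP */
    }
    if (memcmp(dst_mac, wanted_mcast, 6) == 0) {
        return true;             /* the one multicast group in use */
    }
    return false;                /* all other multicast traffic is discarded */
}
```

Where the MAC offers hardware address or hash filtering, doing the same thing there would presumably be cheaper, since unwanted frames then never reach the CPU at all.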

Terry
-- 
Dr Terry Barnaby            BEAM Ltd
Phone: +44 1454 324512      Northavon Business Center,
Email: [hidden email]    Dean Rd, Yate
Web: www.beam.ltd.uk        Bristol, BS37 5NH, UK
BEAM Engineering: Instrumentation, Electronics/Software/Systems


Re: LWIP problems

Trampas Stern
I got a Wireshark dump from the customer and it looks like there are a lot of ARP messages on their network.

Does the ARP broadcast consume a connection? 

In the image below our device is 10.2.65.250, and it looks like we are getting duplicate requests (lines 116 and 129) from the client.
image.png

The network has lots of ARP requests... 
image.png

It appears that a device is asking for 10.2.65.1, which might be a mistake as the gateway is 10.2.64.1 with  netmask of 255.255.254.0 
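
As a quick check of the addressing quoted above: with a netmask of 255.255.254.0, the addresses 10.2.64.1, 10.2.65.1 and 10.2.65.250 all fall in the same /23 (10.2.64.0 to 10.2.65.255), so a host sending ARP requests for 10.2.65.1 is not necessarily misconfigured. Whether anything actually lives at that address is a separate question. A small sketch of the comparison, where the `IP4` macro and `same_subnet` are illustrative names only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from dotted-quad octets. */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Two addresses are on the same link when their network parts match. */
static bool same_subnet(uint32_t a, uint32_t b, uint32_t netmask)
{
    return (a & netmask) == (b & netmask);
}
```

For example, `same_subnet(IP4(10,2,65,1), IP4(10,2,64,1), IP4(255,255,254,0))` is true, while replacing the first address with 10.2.66.1 makes it false.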

I am wondering if the ARP traffic is overwhelming lwip... 

Thanks





Re: LWIP problems

Patrick Klos-2
On 6/22/2020 2:31 PM, Trampas Stern wrote:
> I got a wireshark dump from customer and it looks like there is a lot of ARP messages on their network.
>
> Does the ARP broadcast consume a connection?

No.  ARPs do not consume a [TCP] connection.

> In the image below our device is 10.2.65.250 and it is looking like we are getting duplicate requests (line 116,129)  from the client.
> image.png

It looks like packet 116 is a resend of packet 115, and then packet 117 is the ACK for packet 116.

Packet 129 is a resend of packet 112, and then packet 130 is the SYN-ACK for that [second] connection.

In both cases, your device seems to be losing a packet and recovering after the resend?

How many simultaneous connections does your device support?

> The network has lots of ARP requests...
> image.png
>
> It appears that a device is asking for 10.2.65.1, which might be a mistake as the gateway is 10.2.64.1 with  netmask of 255.255.254.0

Do you have any idea what devices have the "MRVCommu" OUI in their MAC addresses?  Maybe those devices are misconfigured?

> I am wondering if the ARP traffic is overwhelming lwip...

I haven't had a chance to review the ARP code to see whether getting overwhelmed would affect the TCP connections.  Regardless, there doesn't appear to be a high enough rate of ARPs to be troublesome.  Many of them are 1 second or more apart.

Please share any other clues and maybe we can find something...

Patrick



Re: LWIP problems

Trampas Stern
The maximum number of TCP connections is 20; the configuration is below.  Note this is using HTTP, not HTTPS.

What I am seeing in the code is that incoming ARP packets do allocate memory for the packet, hence I could be running out of memory.

Either way it looks like I am not processing packets fast enough or losing them. 
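
One way to test the out-of-memory theory without reading through the stack internals might be lwIP's own counters. The fragment below is only a sketch of lwipopts.h settings, assuming a stock lwIP 2.x statistics module; how the counters are then reported (SNMP, a debug page, a serial console) is left open.

```c
/* Sketch only: switch on the counters relevant to dropped or unallocatable packets. */
#define LWIP_STATS          1
#define LWIP_STATS_LARGE    1   /* 32-bit counters, so a busy network does not wrap them quickly */
#define LINK_STATS          1   /* link-layer receive and drop counts */
#define ETHARP_STATS        1   /* ARP receive and drop counts */
#define IP_STATS            1
#define TCP_STATS           1
#define MEM_STATS           1   /* heap usage and allocation errors */
#define MEMP_STATS          1   /* per-pool usage, including the pbuf pool */
```

If the error counts in the heap or pbuf-pool statistics stay at zero while pages are loading slowly, memory exhaustion becomes a less likely explanation and packet loss in the driver a more likely one.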


/**
 * \file
 *
 * \brief LwIP configuration.
 *
 * Copyright (c) 2013-2018 Microchip Technology Inc. and its subsidiaries.
 *
 * \asf_license_start
 *
 * \page License
 *
 * Subject to your compliance with these terms, you may use Microchip
 * software and any derivatives exclusively with Microchip products.
 * It is your responsibility to comply with third party license terms applicable
 * to your use of third party software (including open source software) that
 * may accompany Microchip software.
 *
 * THIS SOFTWARE IS SUPPLIED BY MICROCHIP "AS IS". NO WARRANTIES,
 * WHETHER EXPRESS, IMPLIED OR STATUTORY, APPLY TO THIS SOFTWARE,
 * INCLUDING ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY,
 * AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL MICROCHIP BE
 * LIABLE FOR ANY INDIRECT, SPECIAL, PUNITIVE, INCIDENTAL OR CONSEQUENTIAL
 * LOSS, DAMAGE, COST OR EXPENSE OF ANY KIND WHATSOEVER RELATED TO THE
 * SOFTWARE, HOWEVER CAUSED, EVEN IF MICROCHIP HAS BEEN ADVISED OF THE
 * POSSIBILITY OR THE DAMAGES ARE FORESEEABLE.  TO THE FULLEST EXTENT
 * ALLOWED BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL CLAIMS IN ANY WAY
 * RELATED TO THIS SOFTWARE WILL NOT EXCEED THE AMOUNT OF FEES, IF ANY,
 * THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THIS SOFTWARE.
 *
 * \asf_license_stop
 *
 */

#ifndef __LWIPOPTS_H__
#define __LWIPOPTS_H__

/* Include ethernet configuration first */
#include "conf_eth.h"
#include "board.h"
/*
   -----------------------------------------------
   -------------- LwIP API Support ---------------
   -----------------------------------------------
*/

#define SNMP_LWIP_ENTERPRISE_OID  TRIMM_ENTERPRISE_OID

#define LWIP_HTTPD_DYNAMIC_FILE_READ 1
#define HTTPD_ENABLE_HTTPS 1
#define LWIP_ALTCP_TLS 1
#define LWIP_ALTCP_TLS_MBEDTLS 1
#define LWIP_ALTCP 1
#define LWIP_HTTPD_SUPPORT_POST 1

#define HTTPD_MAX_RETRIES  10 //tbs 3-19-2020 increased from 4 to prevent time out on bad connections

#define PPP_SUPPORT 0
#define PPPOE_SUPPORT 0

#define SNMP_LWIP_MIB2 1
#define LWIP_SNMP 1
#define LWIP_SNMP_V3 1


#define MIB2_STATS 1
#define SNMP_USE_RAW 1

//#define HTTPD_DEBUG LWIP_DBG_ON
#define ALTCP_MBEDTLS_DEBUG  LWIP_DBG_ON
//#define TCP_OUTPUT_DEBUG LWIP_DBG_ON | LWIP_DBG_LEVEL_SEVERE
//#define DHCP_DEBUG LWIP_DBG_ON

/**
 * NO_SYS==1: Provides VERY minimal functionality. Otherwise,
 * use lwIP facilities.
 * Uses Raw API only.
 */
#define NO_SYS                 1

/**
 * LWIP_NETIF_STATUS_CALLBACK==1: Support a callback function whenever an interface
 * changes its up/down status (e.g., due to DHCP IP acquisition)
 */
#define LWIP_NETIF_STATUS_CALLBACK 1

/**
 * LWIP_RAW==1: Enable application layer to hook into the IP layer itself.
 * Used to implement custom transport protocol (!= than Raw API).
 */
#define LWIP_RAW                   0

/**
 * SYS_LIGHTWEIGHT_PROT==1: if you want inter-task protection for certain
 * critical regions during buffer allocation, deallocation and memory
 * allocation and deallocation.
 */
#define SYS_LIGHTWEIGHT_PROT        0

/* These are not available when using "NO_SYS" */
#define LWIP_NETCONN             0
#define LWIP_SOCKET             0

/* Uncomment following line to use DHCP instead of fixed IP */
#define DHCP_USED

/*
   ------------------------------------
   ---------- Memory options ----------
   ------------------------------------
*/

/**
 * MEM_ALIGNMENT: should be set to the alignment of the CPU
 *    4 byte alignment -> #define MEM_ALIGNMENT 4
 *    2 byte alignment -> #define MEM_ALIGNMENT 2
 */
#define MEM_ALIGNMENT           4

/**
 * MEM_SIZE: the size of the heap memory. If the application will send
 * a lot of data that needs to be copied, this should be set high.
 */
#define MEM_SIZE                 (27 * 1024)

/**
 * MEMP_NUM_UDP_PCB: the number of UDP protocol control blocks. One
 * per active UDP "connection".
 * (requires the LWIP_UDP option)
 */
#define MEMP_NUM_UDP_PCB                2

/**
 * MEMP_NUM_TCP_PCB: the number of simultaneously active TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB                20

/**
 * MEMP_NUM_TCP_PCB_LISTEN: the number of listening TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB_LISTEN        6

/**
 * MEMP_NUM_TCP_SEG: the number of simultaneously queued TCP segments.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_SEG               20

/**
 * MEMP_NUM_REASSDATA: the number of IP packets simultaneously queued for
 * reassembly (whole packets, not fragments!)
 */
#define MEMP_NUM_REASSDATA              4

/**
 * MEMP_NUM_FRAG_PBUF: the number of IP fragments simultaneously sent
 * (fragments, not whole packets!).
 * This is only used with IP_FRAG_USES_STATIC_BUF==0 and
 * LWIP_NETIF_TX_SINGLE_PBUF==0 and only has to be > 1 with DMA-enabled MACs
 * where the packet is not yet sent when netif->output returns.
 */
#define MEMP_NUM_FRAG_PBUF              6

/**
 * MEMP_NUM_PBUF: the number of memp struct pbufs (used for PBUF_ROM and PBUF_REF).
 * If the application sends a lot of data out of ROM (or other static memory),
 * this should be set high.
 */
#define MEMP_NUM_PBUF                   10

/**
 * MEMP_NUM_NETBUF: the number of struct netbufs.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETBUF                 0

/**
 * MEMP_NUM_NETCONN: the number of struct netconns.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETCONN                0

/**
 * PBUF_POOL_SIZE: the number of buffers in the pbuf pool.
 */
#define PBUF_POOL_SIZE                 50

/**
 * PBUF_POOL_BUFSIZE: the size of each pbuf in the pbuf pool.
 */
#define PBUF_POOL_BUFSIZE               GMAC_FRAME_LENTGH_MAX

/*
   ----------------------------------
   ---------- DHCP options ----------
   ----------------------------------
*/

#if defined(DHCP_USED)
/**
 * LWIP_DHCP==1: Enable DHCP module.
 */
#define LWIP_DHCP               1
#endif

/*
   ---------------------------------
   ---------- UDP options ----------
   ---------------------------------
*/

/**
 * LWIP_UDP==1: Turn on UDP.
 */
#define LWIP_UDP                1

/*
   ---------------------------------
   ---------- TCP options ----------
   ---------------------------------
*/

/**
 * LWIP_TCP==1: Turn on TCP.
 */
#define LWIP_TCP                1

/**
 * TCP_MSS: The maximum segment size controls the maximum amount of
 * payload bytes per packet. For maximum throughput, set this as
 * high as possible for your network (i.e. 1460 bytes for standard
 * ethernet).
 * For the receive side, this MSS is advertised to the remote side
 * when opening a connection. For the transmit side, this MSS sets
 * an upper limit on the MSS advertised by the remote host.
 */
#define TCP_MSS                 (1460)

/**
 * TCP_WND: The size of a TCP window.  This must be at least
 * (2 * TCP_MSS) for things to work well
 */
#define TCP_WND               (32 * TCP_MSS) //should be more than 16k for TLS/HTTPS

/**
 * TCP_SND_BUF: TCP sender buffer space (bytes).
 * To achieve good performance, this should be at least 2 * TCP_MSS.
 */
#define TCP_SND_BUF             (4 * TCP_MSS)

/*
   ------------------------------------
   ---------- Thread options ----------
   ------------------------------------
*/

/** The stack size allocated to the netif stack: (256 * 4) = 1024 bytes. */
#define netifINTERFACE_TASK_STACK_SIZE    256

/** The priority of the netif stack. */
#define netifINTERFACE_TASK_PRIORITY      (tskIDLE_PRIORITY + 4)

/** The stack size allocated to the TCPIP stack: (256 * 4) = 1024 bytes. */
#define TCPIP_THREAD_STACKSIZE            256

/** The priority of the TCPIP stack. */
#define TCPIP_THREAD_PRIO                 (tskIDLE_PRIORITY + 5)

/** The mailbox size for the tcpip thread messages */
#define TCPIP_MBOX_SIZE                   16
#define DEFAULT_ACCEPTMBOX_SIZE           16
#define DEFAULT_RAW_RECVMBOX_SIZE         16
#define DEFAULT_TCP_RECVMBOX_SIZE         16

/*
   ----------------------------------------
   ---------- Statistics options ----------
   ----------------------------------------
*/


/**
 * LWIP_STATS==1: Enable statistics collection in lwip_stats.
 */
#define LWIP_STATS                        1


/**
 * LWIP_STATS_DISPLAY==1: Compile in the statistics output functions.
 */
#define LWIP_STATS_DISPLAY                0

/**
 * LWIP_STATS_LARGE==1: Use 32 bits counter instead of 16.
 */
#define LWIP_STATS_LARGE                  0

#if LWIP_STATS
#define LINK_STATS                        0
#define IP_STATS                          0
#define IPFRAG_STATS                      0
#define ICMP_STATS                        0
#define IGMP_STATS                        0
#define UDP_STATS                         0
#define TCP_STATS                         0
#define MEM_STATS                         0
#define MEMP_STATS                        0
#define SYS_STATS                         0
#endif
/* Left outside to avoid warning. */
#define ETHARP_STATS                      0

/*
   ---------------------------------------
   ---------- Debugging options ----------
   ---------------------------------------
*/

//#define LWIP_NOASSERT

#define LWIP_DEBUG
#define LWIP_DBG_MIN_LEVEL              LWIP_DBG_LEVEL_WARNING
#define LWIP_DBG_TYPES_ON               LWIP_DBG_ON



// \note For a list of all possible lwIP configurations, check http://lwip.wikia.com/wiki/Lwipopts.h

#endif /* __LWIPOPTS_H__ */
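
A rough back-of-the-envelope check on the numbers in this configuration, offered as a sketch rather than a diagnosis. It compares the advertised receive window with what the pbuf pool can hold. The value of GMAC_FRAME_LENTGH_MAX is not shown in the thread, so 1536 bytes is an assumption here, as is the 54-byte per-frame header overhead (Ethernet 14 + IPv4 20 + TCP 20). The `SK_` names are made up for the sketch.

```c
#include <stdint.h>

#define SK_TCP_MSS            1460u
#define SK_TCP_WND            (32u * SK_TCP_MSS)   /* TCP_WND from the configuration */
#define SK_PBUF_POOL_SIZE     50u                  /* PBUF_POOL_SIZE from the configuration */
#define SK_PBUF_POOL_BUFSIZE  1536u                /* ASSUMPTION: value of GMAC_FRAME_LENTGH_MAX */
#define SK_HDR_OVERHEAD       54u                  /* ASSUMPTION: Ethernet 14 + IPv4 20 + TCP 20 */

/* Payload bytes the whole pbuf pool holds if every buffer carries one full frame. */
static uint32_t pool_payload_capacity(void)
{
    return SK_PBUF_POOL_SIZE * (SK_PBUF_POOL_BUFSIZE - SK_HDR_OVERHEAD);
}

/* Number of connections that could fill their advertised window before the pool runs dry. */
static uint32_t full_windows_supported(void)
{
    return pool_payload_capacity() / SK_TCP_WND;
}
```

Under those assumptions the pool holds 74100 payload bytes against a 46720-byte window, so it covers one full receive window, not the 20 connections MEMP_NUM_TCP_PCB allows. An HTTP server mostly receives small requests, so this may never bite in practice, but it is one place where several simultaneous connections plus background broadcast traffic could compete for the same buffers.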

Re: LWIP problems

Jens Nielsen-2
Hi

As Patrick said, that's not a high enough rate of ARPs to cause any trouble. There are 11 in total in 16 seconds there? If that overwhelms your system, you have other issues...

I will eat my socks if your problem isn't caused by the two fast SYNs in 111 and 112. How sure are you that your driver isn't dropping packets at high load? Is everything fine after 131 or what happens next?
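
To illustrate the kind of driver behaviour in question: a receive routine that hands only one frame to the stack per interrupt or poll can leave the second of two back-to-back frames (such as two fast SYNs) sitting in the ring until something else arrives, or lose it once the ring fills. The sketch below is a toy model, not the actual GMAC driver. `struct rx_ring`, `rx_ring_push` and `rx_poll_drain` are hypothetical names, and the ring stores only frame lengths.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_RING_LEN 8u

/* Toy stand-in for a MAC receive descriptor ring; all names are hypothetical. */
struct rx_ring {
    uint16_t frame_len[RX_RING_LEN];
    unsigned head;      /* next slot the simulated hardware fills */
    unsigned tail;      /* next slot the driver reads */
    unsigned dropped;   /* frames lost because the ring was full */
};

/* Called by the simulated hardware: the frame is dropped if the ring is full. */
static bool rx_ring_push(struct rx_ring *r, uint16_t len)
{
    if (r->head - r->tail == RX_RING_LEN) {
        r->dropped++;
        return false;
    }
    r->frame_len[r->head % RX_RING_LEN] = len;
    r->head++;
    return true;
}

/* Poll routine: hand every pending frame to the stack, not just the first one. */
static unsigned rx_poll_drain(struct rx_ring *r, void (*deliver)(uint16_t len))
{
    unsigned n = 0;

    while (r->tail != r->head) {
        if (deliver != NULL) {
            deliver(r->frame_len[r->tail % RX_RING_LEN]);
        }
        r->tail++;
        n++;
    }
    return n;
}
```

Counting `dropped` (or the real hardware's overrun flag) while reproducing the two-connection page load would show fairly directly whether frames are being lost before lwIP ever sees them.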

Based on your screenshot, it seems extremely unlikely to me that the issue is caused by something in your customer's network. If I were you, I'd try to reproduce and debug.

//Jens

On 2020-06-23 12:53, Trampas Stern wrote:
The maximum number of TCP connections is 20, below is the configuration.  Note this is using http not https. 

What I am seeing in the code is the ARP packets do allocate memory for the packet, hence I could be running out of memory. 

Either way it looks like I am not processing packets fast enough or losing them. 


/**
 * \file
 *
 * \brief LwIP configuration.
 *
 * Copyright (c) 2013-2018 Microchip Technology Inc. and its subsidiaries.
 *
 * \asf_license_start
 *
 * \page License
 *
 * Subject to your compliance with these terms, you may use Microchip
 * software and any derivatives exclusively with Microchip products.
 * It is your responsibility to comply with third party license terms applicable
 * to your use of third party software (including open source software) that
 * may accompany Microchip software.
 *
 * THIS SOFTWARE IS SUPPLIED BY MICROCHIP "AS IS". NO WARRANTIES,
 * WHETHER EXPRESS, IMPLIED OR STATUTORY, APPLY TO THIS SOFTWARE,
 * INCLUDING ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY,
 * AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL MICROCHIP BE
 * LIABLE FOR ANY INDIRECT, SPECIAL, PUNITIVE, INCIDENTAL OR CONSEQUENTIAL
 * LOSS, DAMAGE, COST OR EXPENSE OF ANY KIND WHATSOEVER RELATED TO THE
 * SOFTWARE, HOWEVER CAUSED, EVEN IF MICROCHIP HAS BEEN ADVISED OF THE
 * POSSIBILITY OR THE DAMAGES ARE FORESEEABLE.  TO THE FULLEST EXTENT
 * ALLOWED BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL CLAIMS IN ANY WAY
 * RELATED TO THIS SOFTWARE WILL NOT EXCEED THE AMOUNT OF FEES, IF ANY,
 * THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THIS SOFTWARE.
 *
 * \asf_license_stop
 *
 */

#ifndef __LWIPOPTS_H__
#define __LWIPOPTS_H__

/* Include ethernet configuration first */
#include "conf_eth.h"
#include "board.h"
/*
   -----------------------------------------------
   -------------- LwIP API Support ---------------
   -----------------------------------------------
*/

#define SNMP_LWIP_ENTERPRISE_OID  TRIMM_ENTERPRISE_OID

#define LWIP_HTTPD_DYNAMIC_FILE_READ 1
#define HTTPD_ENABLE_HTTPS 1
#define LWIP_ALTCP_TLS 1
#define LWIP_ALTCP_TLS_MBEDTLS 1
#define LWIP_ALTCP 1
#define LWIP_HTTPD_SUPPORT_POST 1

#define HTTPD_MAX_RETRIES  10 //tbs 3-19-2020 increased from 4 to prevent time out on bad connections

#define PPP_SUPPORT 0
#define PPPOE_SUPPORT 0

#define SNMP_LWIP_MIB2 1
#define LWIP_SNMP 1
#define LWIP_SNMP_V3 1


#define MIB2_STATS 1
#define SNMP_USE_RAW 1

//#define HTTPD_DEBUG LWIP_DBG_ON
#define ALTCP_MBEDTLS_DEBUG  LWIP_DBG_ON
//#define TCP_OUTPUT_DEBUG LWIP_DBG_ON | LWIP_DBG_LEVEL_SEVERE
//#define DHCP_DEBUG LWIP_DBG_ON

/**
 * NO_SYS==1: Provides VERY minimal functionality. Otherwise,
 * use lwIP facilities.
 * Uses Raw API only.
 */
#define NO_SYS                 1

/**
 * LWIP_NETIF_STATUS_CALLBACK==1: Support a callback function whenever an interface
 * changes its up/down status (e.g., due to DHCP IP acquisition)
 */
#define LWIP_NETIF_STATUS_CALLBACK 1

/**
 * LWIP_RAW==1: Enable application layer to hook into the IP layer itself.
 * Used to implement custom transport protocol (!= than Raw API).
 */
#define LWIP_RAW                   0

/**
 * SYS_LIGHTWEIGHT_PROT==1: if you want inter-task protection for certain
 * critical regions during buffer allocation, deallocation and memory
 * allocation and deallocation.
 */
#define SYS_LIGHTWEIGHT_PROT        0

/* These are not available when using "NO_SYS" */
#define LWIP_NETCONN             0
#define LWIP_SOCKET             0

/* Uncomment following line to use DHCP instead of fixed IP */
#define DHCP_USED

/*
   ------------------------------------
   ---------- Memory options ----------
   ------------------------------------
*/

/**
 * MEM_ALIGNMENT: should be set to the alignment of the CPU
 *    4 byte alignment -> #define MEM_ALIGNMENT 4
 *    2 byte alignment -> #define MEM_ALIGNMENT 2
 */
#define MEM_ALIGNMENT           4

/**
 * MEM_SIZE: the size of the heap memory. If the application will send
 * a lot of data that needs to be copied, this should be set high.
 */
#define MEM_SIZE                 (27 * 1024)

/**
 * MEMP_NUM_UDP_PCB: the number of UDP protocol control blocks. One
 * per active UDP "connection".
 * (requires the LWIP_UDP option)
 */
#define MEMP_NUM_UDP_PCB                2

/**
 * MEMP_NUM_TCP_PCB: the number of simultaneously active TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB                20

/**
 * MEMP_NUM_TCP_PCB_LISTEN: the number of listening TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB_LISTEN        6

/**
 * MEMP_NUM_TCP_SEG: the number of simultaneously queued TCP segments.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_SEG               20

/**
 * MEMP_NUM_REASSDATA: the number of IP packets simultaneously queued for
 * reassembly (whole packets, not fragments!)
 */
#define MEMP_NUM_REASSDATA              4

/**
 * MEMP_NUM_FRAG_PBUF: the number of IP fragments simultaneously sent
 * (fragments, not whole packets!).
 * This is only used with IP_FRAG_USES_STATIC_BUF==0 and
 * LWIP_NETIF_TX_SINGLE_PBUF==0 and only has to be > 1 with DMA-enabled MACs
 * where the packet is not yet sent when netif->output returns.
 */
#define MEMP_NUM_FRAG_PBUF              6

/**
 * MEMP_NUM_PBUF: the number of memp struct pbufs (used for PBUF_ROM and PBUF_REF).
 * If the application sends a lot of data out of ROM (or other static memory),
 * this should be set high.
 */
#define MEMP_NUM_PBUF                   10

/**
 * MEMP_NUM_NETBUF: the number of struct netbufs.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETBUF                 0

/**
 * MEMP_NUM_NETCONN: the number of struct netconns.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETCONN                0

/**
 * PBUF_POOL_SIZE: the number of buffers in the pbuf pool.
 */
#define PBUF_POOL_SIZE                 50

/**
 * PBUF_POOL_BUFSIZE: the size of each pbuf in the pbuf pool.
 */
#define PBUF_POOL_BUFSIZE               GMAC_FRAME_LENTGH_MAX

/*
   ----------------------------------
   ---------- DHCP options ----------
   ----------------------------------
*/

#if defined(DHCP_USED)
/**
 * LWIP_DHCP==1: Enable DHCP module.
 */
#define LWIP_DHCP               1
#endif

/*
   ---------------------------------
   ---------- UDP options ----------
   ---------------------------------
*/

/**
 * LWIP_UDP==1: Turn on UDP.
 */
#define LWIP_UDP                1

/*
   ---------------------------------
   ---------- TCP options ----------
   ---------------------------------
*/

/**
 * LWIP_TCP==1: Turn on TCP.
 */
#define LWIP_TCP                1

/**
 * TCP_MSS: The maximum segment size controls the maximum amount of
 * payload bytes per packet. For maximum throughput, set this as
 * high as possible for your network (i.e. 1460 bytes for standard
 * ethernet).
 * For the receive side, this MSS is advertised to the remote side
 * when opening a connection. For the transmit size, this MSS sets
 * an upper limit on the MSS advertised by the remote host.
 */
#define TCP_MSS                 (1460)

/**
 * TCP_WND: The size of a TCP window.  This must be at least
 * (2 * TCP_MSS) for things to work well
 */
#define TCP_WND               (32 * TCP_MSS) //should be more than 16k for TLS/HTTPS

/**
 * TCP_SND_BUF: TCP sender buffer space (bytes).
 * To achieve good performance, this should be at least 2 * TCP_MSS.
 */
#define TCP_SND_BUF             (4 * TCP_MSS)

/*
   ------------------------------------
   ---------- Thread options ----------
   ------------------------------------
*/

/** The stack size allocated to the netif stack: (256 * 4) = 1024 bytes. */
#define netifINTERFACE_TASK_STACK_SIZE    256

/** The priority of the netif stack. */
#define netifINTERFACE_TASK_PRIORITY      (tskIDLE_PRIORITY + 4)

/** The stack size allocated to the TCPIP stack: (256 * 4) = 1024 bytes. */
#define TCPIP_THREAD_STACKSIZE            256

/** The priority of the TCPIP stack. */
#define TCPIP_THREAD_PRIO                 (tskIDLE_PRIORITY + 5)

/** The mailbox size for the tcpip thread messages */
#define TCPIP_MBOX_SIZE                   16
#define DEFAULT_ACCEPTMBOX_SIZE           16
#define DEFAULT_RAW_RECVMBOX_SIZE         16
#define DEFAULT_TCP_RECVMBOX_SIZE         16

/*
   ----------------------------------------
   ---------- Statistics options ----------
   ----------------------------------------
*/


/**
 * LWIP_STATS==1: Enable statistics collection in lwip_stats.
 */
#define LWIP_STATS                        1


/**
 * LWIP_STATS_DISPLAY==1: Compile in the statistics output functions.
 */
#define LWIP_STATS_DISPLAY                0

/**
 * LWIP_STATS_LARGE==1: Use 32 bits counter instead of 16.
 */
#define LWIP_STATS_LARGE                  0

#if LWIP_STATS
#define LINK_STATS                        0
#define IP_STATS                          0
#define IPFRAG_STATS                      0
#define ICMP_STATS                        0
#define IGMP_STATS                        0
#define UDP_STATS                         0
#define TCP_STATS                         0
#define MEM_STATS                         0
#define MEMP_STATS                        0
#define SYS_STATS                         0
#endif
/* Left outside to avoid warning. */
#define ETHARP_STATS                      0

/*
   ---------------------------------------
   ---------- Debugging options ----------
   ---------------------------------------
*/

//#define LWIP_NOASSERT

#define LWIP_DEBUG
#define LWIP_DBG_MIN_LEVEL              LWIP_DBG_LEVEL_WARNING
#define LWIP_DBG_TYPES_ON               LWIP_DBG_ON



// \note For a list of all possible lwIP configurations, check http://lwip.wikia.com/wiki/Lwipopts.h

#endif /* __LWIPOPTS_H__ */

On Mon, Jun 22, 2020 at 9:43 PM Patrick Klos <[hidden email]> wrote:
On 6/22/2020 2:31 PM, Trampas Stern wrote:
I got a Wireshark dump from the customer and it looks like there are a lot of ARP messages on their network.

Does the ARP broadcast consume a connection?

No.  ARPs do not consume a [TCP] connection.

In the image below our device is 10.2.65.250, and it looks like we are getting duplicate requests (lines 116 and 129) from the client.
image.png

It looks like packet 116 is a resend of packet 115, and then packet 117 is the ACK for packet 116.

Packet 129 is a resend of packet 112, and then packet 130 is the SYN-ACK for that [second] connection.

In both cases, your device seems to be losing a packet and recovering after the resend?

How many simultaneous connections does your device support?

The network has lots of ARP requests... 
image.png

It appears that a device is asking for 10.2.65.1, which might be a mistake, as the gateway is 10.2.64.1 with a netmask of 255.255.254.0.

Do you have any idea what devices have the "MRVCommu" OUI in their MAC addresses?  Maybe those devices are misconfigured?

I am wondering if the ARP traffic is overwhelming lwip...

I haven't had a chance to review the ARP code to see if getting overwhelmed would affect the TCP connections.  Regardless, there doesn't appear to be a high enough rate of ARPs to be troublesome.  Many of them are 1 second or more apart.

Please share any other clues and maybe we can find something...

Patrick


Re: LWIP problems

Trampas Stern
I have tried to reproduce the problem.  We have run the system for hours on 4 different networks, from zero traffic to high traffic, and have not seen any issues. 

I am trying to put in more debug information to see if we can isolate the failure.  So far the only differences appear to be that the customer's network has high latency and a lot of ARP packets.  I have attempted to replicate the issue using Chrome's throttling, but with no success.  

The system does poll the Ethernet MAC, which is how the reference code we had worked: when a packet comes in on the GMAC hardware, it is not processed until we poll.  I am wondering whether the issue is with the polling or with the way the GMAC driver works.  

Trampas


On Tue, Jun 23, 2020 at 10:09 AM Jens Nielsen <[hidden email]> wrote:
Hi

As Patrick said, that's not a high enough rate of ARPs to cause any trouble; there are 11 in total in 16 seconds there? If that overwhelms your system, you have other issues...

I will eat my socks if your problem isn't caused by the two fast SYNs in 111 and 112. How sure are you that your driver isn't dropping packets at high load? Is everything fine after 131 or what happens next?

Based on your screenshot it seems to me extremely unlikely that the issue is caused by something in your customer's network. If I were you, I'd try to reproduce and debug.

//Jens

On 2020-06-23 12:53, Trampas Stern wrote:
The maximum number of TCP connections is 20; below is the configuration.  Note this is using HTTP, not HTTPS. 

What I am seeing in the code is that ARP packets do allocate memory for the packet, hence I could be running out of memory. 

Either way it looks like I am not processing packets fast enough or losing them. 


/**
 * \file
 *
 * \brief LwIP configuration.
 *
 * Copyright (c) 2013-2018 Microchip Technology Inc. and its subsidiaries.
 *
 * \asf_license_start
 *
 * \page License
 *
 * Subject to your compliance with these terms, you may use Microchip
 * software and any derivatives exclusively with Microchip products.
 * It is your responsibility to comply with third party license terms applicable
 * to your use of third party software (including open source software) that
 * may accompany Microchip software.
 *
 * THIS SOFTWARE IS SUPPLIED BY MICROCHIP "AS IS". NO WARRANTIES,
 * WHETHER EXPRESS, IMPLIED OR STATUTORY, APPLY TO THIS SOFTWARE,
 * INCLUDING ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY,
 * AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL MICROCHIP BE
 * LIABLE FOR ANY INDIRECT, SPECIAL, PUNITIVE, INCIDENTAL OR CONSEQUENTIAL
 * LOSS, DAMAGE, COST OR EXPENSE OF ANY KIND WHATSOEVER RELATED TO THE
 * SOFTWARE, HOWEVER CAUSED, EVEN IF MICROCHIP HAS BEEN ADVISED OF THE
 * POSSIBILITY OR THE DAMAGES ARE FORESEEABLE.  TO THE FULLEST EXTENT
 * ALLOWED BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL CLAIMS IN ANY WAY
 * RELATED TO THIS SOFTWARE WILL NOT EXCEED THE AMOUNT OF FEES, IF ANY,
 * THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THIS SOFTWARE.
 *
 * \asf_license_stop
 *
 */

#ifndef __LWIPOPTS_H__
#define __LWIPOPTS_H__

/* Include ethernet configuration first */
#include "conf_eth.h"
#include "board.h"
/*
   -----------------------------------------------
   -------------- LwIP API Support ---------------
   -----------------------------------------------
*/

#define SNMP_LWIP_ENTERPRISE_OID  TRIMM_ENTERPRISE_OID

#define LWIP_HTTPD_DYNAMIC_FILE_READ 1
#define HTTPD_ENABLE_HTTPS 1
#define LWIP_ALTCP_TLS 1
#define LWIP_ALTCP_TLS_MBEDTLS 1
#define LWIP_ALTCP 1
#define LWIP_HTTPD_SUPPORT_POST 1

#define HTTPD_MAX_RETRIES  10 //tbs 3-19-2020 increased from 4 to prevent time out on bad connections

#define PPP_SUPPORT 0
#define PPPOE_SUPPORT 0

#define SNMP_LWIP_MIB2 1
#define LWIP_SNMP 1
#define LWIP_SNMP_V3 1


#define MIB2_STATS 1
#define SNMP_USE_RAW 1

//#define HTTPD_DEBUG LWIP_DBG_ON
#define ALTCP_MBEDTLS_DEBUG  LWIP_DBG_ON
//#define TCP_OUTPUT_DEBUG LWIP_DBG_ON | LWIP_DBG_LEVEL_SEVERE
//#define DHCP_DEBUG LWIP_DBG_ON

/**
 * NO_SYS==1: Provides VERY minimal functionality. Otherwise,
 * use lwIP facilities.
 * Uses Raw API only.
 */
#define NO_SYS                 1

/**
 * LWIP_NETIF_STATUS_CALLBACK==1: Support a callback function whenever an interface
 * changes its up/down status (i.e., due to DHCP IP acquisition)
 */
#define LWIP_NETIF_STATUS_CALLBACK 1

/**
 * LWIP_RAW==1: Enable application layer to hook into the IP layer itself.
 * Used to implement custom transport protocol (!= than Raw API).
 */
#define LWIP_RAW                   0

/**
 * SYS_LIGHTWEIGHT_PROT==1: if you want inter-task protection for certain
 * critical regions during buffer allocation, deallocation and memory
 * allocation and deallocation.
 */
#define SYS_LIGHTWEIGHT_PROT        0

/* These are not available when using "NO_SYS" */
#define LWIP_NETCONN             0
#define LWIP_SOCKET             0

/* Uncomment following line to use DHCP instead of fixed IP */
#define DHCP_USED

/*
   ------------------------------------
   ---------- Memory options ----------
   ------------------------------------
*/

/**
 * MEM_ALIGNMENT: should be set to the alignment of the CPU
 *    4 byte alignment -> #define MEM_ALIGNMENT 4
 *    2 byte alignment -> #define MEM_ALIGNMENT 2
 */
#define MEM_ALIGNMENT           4

/**
 * MEM_SIZE: the size of the heap memory. If the application will send
 * a lot of data that needs to be copied, this should be set high.
 */
#define MEM_SIZE                 (27 * 1024)

/**
 * MEMP_NUM_UDP_PCB: the number of UDP protocol control blocks. One
 * per active UDP "connection".
 * (requires the LWIP_UDP option)
 */
#define MEMP_NUM_UDP_PCB                2

/**
 * MEMP_NUM_TCP_PCB: the number of simultaneously active TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB                20

/**
 * MEMP_NUM_TCP_PCB_LISTEN: the number of listening TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB_LISTEN        6

/**
 * MEMP_NUM_TCP_SEG: the number of simultaneously queued TCP segments.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_SEG               20

/**
 * MEMP_NUM_REASSDATA: the number of IP packets simultaneously queued for
 * reassembly (whole packets, not fragments!)
 */
#define MEMP_NUM_REASSDATA              4

/**
 * MEMP_NUM_FRAG_PBUF: the number of IP fragments simultaneously sent
 * (fragments, not whole packets!).
 * This is only used with IP_FRAG_USES_STATIC_BUF==0 and
 * LWIP_NETIF_TX_SINGLE_PBUF==0 and only has to be > 1 with DMA-enabled MACs
 * where the packet is not yet sent when netif->output returns.
 */
#define MEMP_NUM_FRAG_PBUF              6

/**
 * MEMP_NUM_PBUF: the number of memp struct pbufs (used for PBUF_ROM and PBUF_REF).
 * If the application sends a lot of data out of ROM (or other static memory),
 * this should be set high.
 */
#define MEMP_NUM_PBUF                   10

/**
 * MEMP_NUM_NETBUF: the number of struct netbufs.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETBUF                 0

/**
 * MEMP_NUM_NETCONN: the number of struct netconns.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETCONN                0

/**
 * PBUF_POOL_SIZE: the number of buffers in the pbuf pool.
 */
#define PBUF_POOL_SIZE                 50

/**
 * PBUF_POOL_BUFSIZE: the size of each pbuf in the pbuf pool.
 */
#define PBUF_POOL_BUFSIZE               GMAC_FRAME_LENTGH_MAX

/*
   ----------------------------------
   ---------- DHCP options ----------
   ----------------------------------
*/

#if defined(DHCP_USED)
/**
 * LWIP_DHCP==1: Enable DHCP module.
 */
#define LWIP_DHCP               1
#endif

/*
   ---------------------------------
   ---------- UDP options ----------
   ---------------------------------
*/

/**
 * LWIP_UDP==1: Turn on UDP.
 */
#define LWIP_UDP                1

/*
   ---------------------------------
   ---------- TCP options ----------
   ---------------------------------
*/

/**
 * LWIP_TCP==1: Turn on TCP.
 */
#define LWIP_TCP                1

/**
 * TCP_MSS: The maximum segment size controls the maximum amount of
 * payload bytes per packet. For maximum throughput, set this as
 * high as possible for your network (i.e. 1460 bytes for standard
 * ethernet).
 * For the receive side, this MSS is advertised to the remote side
 * when opening a connection. For the transmit size, this MSS sets
 * an upper limit on the MSS advertised by the remote host.
 */
#define TCP_MSS                 (1460)

/**
 * TCP_WND: The size of a TCP window.  This must be at least
 * (2 * TCP_MSS) for things to work well
 */
#define TCP_WND               (32 * TCP_MSS) //should be more than 16k for TLS/HTTPS

/**
 * TCP_SND_BUF: TCP sender buffer space (bytes).
 * To achieve good performance, this should be at least 2 * TCP_MSS.
 */
#define TCP_SND_BUF             (4 * TCP_MSS)

/*
   ------------------------------------
   ---------- Thread options ----------
   ------------------------------------
*/

/** The stack size allocated to the netif stack: (256 * 4) = 1024 bytes. */
#define netifINTERFACE_TASK_STACK_SIZE    256

/** The priority of the netif stack. */
#define netifINTERFACE_TASK_PRIORITY      (tskIDLE_PRIORITY + 4)

/** The stack size allocated to the TCPIP stack: (256 * 4) = 1024 bytes. */
#define TCPIP_THREAD_STACKSIZE            256

/** The priority of the TCPIP stack. */
#define TCPIP_THREAD_PRIO                 (tskIDLE_PRIORITY + 5)

/** The mailbox size for the tcpip thread messages */
#define TCPIP_MBOX_SIZE                   16
#define DEFAULT_ACCEPTMBOX_SIZE           16
#define DEFAULT_RAW_RECVMBOX_SIZE         16
#define DEFAULT_TCP_RECVMBOX_SIZE         16

/*
   ----------------------------------------
   ---------- Statistics options ----------
   ----------------------------------------
*/


/**
 * LWIP_STATS==1: Enable statistics collection in lwip_stats.
 */
#define LWIP_STATS                        1


/**
 * LWIP_STATS_DISPLAY==1: Compile in the statistics output functions.
 */
#define LWIP_STATS_DISPLAY                0

/**
 * LWIP_STATS_LARGE==1: Use 32 bits counter instead of 16.
 */
#define LWIP_STATS_LARGE                  0

#if LWIP_STATS
#define LINK_STATS                        0
#define IP_STATS                          0
#define IPFRAG_STATS                      0
#define ICMP_STATS                        0
#define IGMP_STATS                        0
#define UDP_STATS                         0
#define TCP_STATS                         0
#define MEM_STATS                         0
#define MEMP_STATS                        0
#define SYS_STATS                         0
#endif
/* Left outside to avoid warning. */
#define ETHARP_STATS                      0

/*
   ---------------------------------------
   ---------- Debugging options ----------
   ---------------------------------------
*/

//#define LWIP_NOASSERT

#define LWIP_DEBUG
#define LWIP_DBG_MIN_LEVEL              LWIP_DBG_LEVEL_WARNING
#define LWIP_DBG_TYPES_ON               LWIP_DBG_ON



// \note For a list of all possible lwIP configurations, check http://lwip.wikia.com/wiki/Lwipopts.h

#endif /* __LWIPOPTS_H__ */


Re: Trampas LWIP problems

Dave Nadler
In reply to this post by Jens Nielsen-2
Hi Trampas - I may be completely wrong here, but...
I notice you are using Microchip stuff...

Are you by any chance using Microchip's Harmony framework?
Last time we looked at Harmony it was completely unusable for our work,
as it used internal timeslicing without proper RTOS integration.
This can lead to extremely poor performance for network and USB,
as it can be a LONG time between a driver receiving a packet and
that packet getting processed.
Unfortunately, after shipping many products using Microchip parts,
as a consequence of Harmony we stopped using Microchip.

Are you by any chance seeing this issue?
Hope that helps,
Best Regards, Dave


-- 
Dave Nadler, USA East Coast voice (978) 263-0097, [hidden email], Skype 
 Dave.Nadler1


Re: Trampas LWIP problems

Trampas Stern
Dave,  Thanks for the note.  I do not use Harmony as I have found too many bugs.  I based the GMAC driver on the AS studio example code, and then fixed issues like the data caching and such.  

I basically write my own drivers for all the peripherals where possible, so that I can build them with error logging and timeouts.  I have found that MCHP/Atmel often uses SysTick for timeouts in drivers, when you are often already using it for system ticks or an RTOS.  Hence I have developed solid reentrant error logging and time management libraries that the entire firmware gets built upon, including drivers.  So in theory nothing can happen without logging an error.  I am now making it so that the error log is not just sent out the serial port but also written to the file system on the SD card, so that I can hopefully find more details about the root cause of this issue. 

Trampas




On Tue, Jun 23, 2020 at 10:42 AM Dave Nadler <[hidden email]> wrote:
Hi Trampas - I may be completely wrong here, but...
I notice you are using Microchip stuff...

Are you by any chance using Microchip's Harmony framework?
Last time we looked at Harmony it was completely unusable for our work,
as it used internal timeslicing without proper RTOS integration.
This can lead to extremely poor performance for network and USB,
as it can be a LONG time between a driver receiving a packet and
that packet getting processed.
Unfortunately, after shipping many products using Microchip parts,
as a consequence of Harmony we stopped using Microchip.

Are you by any chance seeing this issue?
Hope that helps,
Best Regards, Dave

On 6/23/2020 10:10 AM, Jens Nielsen wrote:
Hi

As Patrick said, that's not a high enough rate of ARPs to cause any trouble; there are 11 in total in 16 seconds there? If that overwhelms your system, you have other issues...

I will eat my socks if your problem isn't caused by the two fast SYNs in 111 and 112. How sure are you that your driver isn't dropping packets at high load? Is everything fine after 131, or what happens next?

Based on your screenshot, it seems extremely unlikely to me that the issue is caused by something in your customer's network. If I were you, I'd try to reproduce and debug.

//Jens

On 2020-06-23 12:53, Trampas Stern wrote:
The maximum number of TCP connections is 20; the configuration is below. Note that this is using HTTP, not HTTPS.

What I am seeing in the code is that ARP packets do allocate memory for each packet, so I could be running out of memory.

Either way, it looks like I am either not processing packets fast enough or losing them.


/**
 * \file
 *
 * \brief LwIP configuration.
 *
 * Copyright (c) 2013-2018 Microchip Technology Inc. and its subsidiaries.
 *
 * \asf_license_start
 *
 * \page License
 *
 * Subject to your compliance with these terms, you may use Microchip
 * software and any derivatives exclusively with Microchip products.
 * It is your responsibility to comply with third party license terms applicable
 * to your use of third party software (including open source software) that
 * may accompany Microchip software.
 *
 * THIS SOFTWARE IS SUPPLIED BY MICROCHIP "AS IS". NO WARRANTIES,
 * WHETHER EXPRESS, IMPLIED OR STATUTORY, APPLY TO THIS SOFTWARE,
 * INCLUDING ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY,
 * AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL MICROCHIP BE
 * LIABLE FOR ANY INDIRECT, SPECIAL, PUNITIVE, INCIDENTAL OR CONSEQUENTIAL
 * LOSS, DAMAGE, COST OR EXPENSE OF ANY KIND WHATSOEVER RELATED TO THE
 * SOFTWARE, HOWEVER CAUSED, EVEN IF MICROCHIP HAS BEEN ADVISED OF THE
 * POSSIBILITY OR THE DAMAGES ARE FORESEEABLE.  TO THE FULLEST EXTENT
 * ALLOWED BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL CLAIMS IN ANY WAY
 * RELATED TO THIS SOFTWARE WILL NOT EXCEED THE AMOUNT OF FEES, IF ANY,
 * THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THIS SOFTWARE.
 *
 * \asf_license_stop
 *
 */

#ifndef __LWIPOPTS_H__
#define __LWIPOPTS_H__

/* Include ethernet configuration first */
#include "conf_eth.h"
#include "board.h"
/*
   -----------------------------------------------
   -------------- LwIP API Support ---------------
   -----------------------------------------------
*/

#define SNMP_LWIP_ENTERPRISE_OID  TRIMM_ENTERPRISE_OID

#define LWIP_HTTPD_DYNAMIC_FILE_READ 1
#define HTTPD_ENABLE_HTTPS 1
#define LWIP_ALTCP_TLS 1
#define LWIP_ALTCP_TLS_MBEDTLS 1
#define LWIP_ALTCP 1
#define LWIP_HTTPD_SUPPORT_POST 1

#define HTTPD_MAX_RETRIES  10 //tbs 3-19-2020 increased from 4 to prevent time out on bad connections

#define PPP_SUPPORT 0
#define PPPOE_SUPPORT 0

#define SNMP_LWIP_MIB2 1
#define LWIP_SNMP 1
#define LWIP_SNMP_V3 1


#define MIB2_STATS 1
#define SNMP_USE_RAW 1

//#define HTTPD_DEBUG LWIP_DBG_ON
#define ALTCP_MBEDTLS_DEBUG  LWIP_DBG_ON
//#define TCP_OUTPUT_DEBUG LWIP_DBG_ON | LWIP_DBG_LEVEL_SEVERE
//#define DHCP_DEBUG LWIP_DBG_ON

/**
 * NO_SYS==1: Provides VERY minimal functionality. Otherwise,
 * use lwIP facilities.
 * Uses Raw API only.
 */
#define NO_SYS                 1

/**
 * LWIP_NETIF_STATUS_CALLBACK==1: Support a callback function whenever an interface
 * changes its up/down status (e.g., due to DHCP IP acquisition)
 */
#define LWIP_NETIF_STATUS_CALLBACK 1

/**
 * LWIP_RAW==1: Enable application layer to hook into the IP layer itself.
 * Used to implement custom transport protocol (!= than Raw API).
 */
#define LWIP_RAW                   0

/**
 * SYS_LIGHTWEIGHT_PROT==1: if you want inter-task protection for certain
 * critical regions during buffer allocation, deallocation and memory
 * allocation and deallocation.
 */
#define SYS_LIGHTWEIGHT_PROT        0

/* These are not available when using "NO_SYS" */
#define LWIP_NETCONN             0
#define LWIP_SOCKET             0

/* Uncomment following line to use DHCP instead of fixed IP */
#define DHCP_USED

/*
   ------------------------------------
   ---------- Memory options ----------
   ------------------------------------
*/

/**
 * MEM_ALIGNMENT: should be set to the alignment of the CPU
 *    4 byte alignment -> #define MEM_ALIGNMENT 4
 *    2 byte alignment -> #define MEM_ALIGNMENT 2
 */
#define MEM_ALIGNMENT           4

/**
 * MEM_SIZE: the size of the heap memory. If the application will send
 * a lot of data that needs to be copied, this should be set high.
 */
#define MEM_SIZE                 (27 * 1024) /* parenthesized so it expands safely inside larger expressions */

/**
 * MEMP_NUM_UDP_PCB: the number of UDP protocol control blocks. One
 * per active UDP "connection".
 * (requires the LWIP_UDP option)
 */
#define MEMP_NUM_UDP_PCB                2

/**
 * MEMP_NUM_TCP_PCB: the number of simultaneously active TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB                20

/**
 * MEMP_NUM_TCP_PCB_LISTEN: the number of listening TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB_LISTEN        6

/**
 * MEMP_NUM_TCP_SEG: the number of simultaneously queued TCP segments.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_SEG               20

/**
 * MEMP_NUM_REASSDATA: the number of IP packets simultaneously queued for
 * reassembly (whole packets, not fragments!)
 */
#define MEMP_NUM_REASSDATA              4

/**
 * MEMP_NUM_FRAG_PBUF: the number of IP fragments simultaneously sent
 * (fragments, not whole packets!).
 * This is only used with IP_FRAG_USES_STATIC_BUF==0 and
 * LWIP_NETIF_TX_SINGLE_PBUF==0 and only has to be > 1 with DMA-enabled MACs
 * where the packet is not yet sent when netif->output returns.
 */
#define MEMP_NUM_FRAG_PBUF              6

/**
 * MEMP_NUM_PBUF: the number of memp struct pbufs (used for PBUF_ROM and PBUF_REF).
 * If the application sends a lot of data out of ROM (or other static memory),
 * this should be set high.
 */
#define MEMP_NUM_PBUF                   10

/**
 * MEMP_NUM_NETBUF: the number of struct netbufs.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETBUF                 0

/**
 * MEMP_NUM_NETCONN: the number of struct netconns.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETCONN                0

/**
 * PBUF_POOL_SIZE: the number of buffers in the pbuf pool.
 */
#define PBUF_POOL_SIZE                 50

/**
 * PBUF_POOL_BUFSIZE: the size of each pbuf in the pbuf pool.
 */
#define PBUF_POOL_BUFSIZE               GMAC_FRAME_LENTGH_MAX

/*
   ----------------------------------
   ---------- DHCP options ----------
   ----------------------------------
*/

#if defined(DHCP_USED)
/**
 * LWIP_DHCP==1: Enable DHCP module.
 */
#define LWIP_DHCP               1
#endif

/*
   ---------------------------------
   ---------- UDP options ----------
   ---------------------------------
*/

/**
 * LWIP_UDP==1: Turn on UDP.
 */
#define LWIP_UDP                1

/*
   ---------------------------------
   ---------- TCP options ----------
   ---------------------------------
*/

/**
 * LWIP_TCP==1: Turn on TCP.
 */
#define LWIP_TCP                1

/**
 * TCP_MSS: The maximum segment size controls the maximum amount of
 * payload bytes per packet. For maximum throughput, set this as
 * high as possible for your network (e.g. 1460 bytes for standard
 * Ethernet).
 * For the receive side, this MSS is advertised to the remote side
 * when opening a connection. For the transmit side, this MSS sets
 * an upper limit on the MSS advertised by the remote host.
 */
#define TCP_MSS                 (1460)

/**
 * TCP_WND: The size of a TCP window.  This must be at least
 * (2 * TCP_MSS) for things to work well
 */
#define TCP_WND               (32 * TCP_MSS) //should be more than 16k for TLS/HTTPS

/**
 * TCP_SND_BUF: TCP sender buffer space (bytes).
 * To achieve good performance, this should be at least 2 * TCP_MSS.
 */
#define TCP_SND_BUF             (4 * TCP_MSS)

/*
   ------------------------------------
   ---------- Thread options ----------
   ------------------------------------
*/

/** The stack size allocated to the netif stack: (256 * 4) = 1024 bytes. */
#define netifINTERFACE_TASK_STACK_SIZE    256

/** The priority of the netif stack. */
#define netifINTERFACE_TASK_PRIORITY      (tskIDLE_PRIORITY + 4)

/** The stack size allocated to the TCPIP stack: (256 * 4) = 1024 bytes. */
#define TCPIP_THREAD_STACKSIZE            256

/** The priority of the TCPIP stack. */
#define TCPIP_THREAD_PRIO                 (tskIDLE_PRIORITY + 5)

/** The mailbox size for the tcpip thread messages */
#define TCPIP_MBOX_SIZE                   16
#define DEFAULT_ACCEPTMBOX_SIZE           16
#define DEFAULT_RAW_RECVMBOX_SIZE         16
#define DEFAULT_TCP_RECVMBOX_SIZE         16

/*
   ----------------------------------------
   ---------- Statistics options ----------
   ----------------------------------------
*/


/**
 * LWIP_STATS==1: Enable statistics collection in lwip_stats.
 */
#define LWIP_STATS                        1


/**
 * LWIP_STATS_DISPLAY==1: Compile in the statistics output functions.
 */
#define LWIP_STATS_DISPLAY                0

/**
 * LWIP_STATS_LARGE==1: Use 32 bits counter instead of 16.
 */
#define LWIP_STATS_LARGE                  0

#if LWIP_STATS
#define LINK_STATS                        0
#define IP_STATS                          0
#define IPFRAG_STATS                      0
#define ICMP_STATS                        0
#define IGMP_STATS                        0
#define UDP_STATS                         0
#define TCP_STATS                         0
#define MEM_STATS                         0
#define MEMP_STATS                        0
#define SYS_STATS                         0
#endif
/* Left outside to avoid warning. */
#define ETHARP_STATS                      0

/*
   ---------------------------------------
   ---------- Debugging options ----------
   ---------------------------------------
*/

//#define LWIP_NOASSERT

#define LWIP_DEBUG
#define LWIP_DBG_MIN_LEVEL              LWIP_DBG_LEVEL_WARNING
#define LWIP_DBG_TYPES_ON               LWIP_DBG_ON



// \note For a list of all possible lwIP configurations, check http://lwip.wikia.com/wiki/Lwipopts.h

#endif /* __LWIPOPTS_H__ */
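
Since the thread is trying to decide whether packets are being lost to memory exhaustion, it may be worth noting that the configuration above sets LWIP_STATS to 1 but leaves every per-module counter at 0, so nothing is actually counted. One possible change, offered only as a sketch using standard lwIP option names, is:

```c
/* Enable only the counters relevant to dropped frames and exhausted pools. */
#define LWIP_STATS      1
#define LINK_STATS      1   /* link-layer counters: drop, chkerr, memerr, ... */
#define ETHARP_STATS    1   /* ARP counters */
#define MEM_STATS       1   /* heap: used, max, err */
#define MEMP_STATS      1   /* per-pool: used, max, err (includes the pbuf pool) */
```

With these set, the raw counters live in the global `lwip_stats` structure (for example `lwip_stats.link.drop` and `lwip_stats.mem.err`) and can be inspected from a debugger even while LWIP_STATS_DISPLAY remains 0. The link counters are only meaningful if the GMAC driver increments them, for example with `LINK_STATS_INC(link.drop)` when it discards a frame; whether this driver does so is not stated in the thread.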

On Mon, Jun 22, 2020 at 9:43 PM Patrick Klos <[hidden email]> wrote:
On 6/22/2020 2:31 PM, Trampas Stern wrote:
I got a Wireshark dump from the customer, and it looks like there are a lot of ARP messages on their network.

Does the ARP broadcast consume a connection?

No.  ARPs do not consume a [TCP] connection.

In the image below, our device is 10.2.65.250, and it looks like we are getting duplicate requests (lines 116 and 129) from the client.
image.png

It looks like packet 116 is a resend of packet 115, and then packet 117 is the ACK for packet 116.

Packet 129 is a resend of packet 112, and then packet 130 is the SYN-ACK for that [second] connection.

In both cases, your device seems to be losing a packet and recovering after the resend?

How many simultaneous connections does your device support?

The network has lots of ARP requests... 
image.png

It appears that a device is asking for 10.2.65.1, which might be a mistake, as the gateway is 10.2.64.1 with a netmask of 255.255.254.0.
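
The arithmetic behind that question can be checked directly. The sketch below (the helper names `ip4` and `same_subnet` are made up for illustration) masks each address with 255.255.254.0 and compares the results. Under that mask, 10.2.65.1, the gateway 10.2.64.1, and the device at 10.2.65.250 all fall in 10.2.64.0/23, so an ARP request for 10.2.65.1 is at least addressed to an on-link host; whether any machine actually owns that address cannot be told from the capture excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack four octets into a host-order 32-bit IPv4 address. */
static uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Two addresses share a subnet when their masked network parts are equal. */
static bool same_subnet(uint32_t x, uint32_t y, uint32_t mask)
{
    return (x & mask) == (y & mask);
}
```

For example, `same_subnet(ip4(10, 2, 65, 1), ip4(10, 2, 64, 1), ip4(255, 255, 254, 0))` evaluates to true, while the same check for 10.2.66.1 is false.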

Do you have any idea what devices have the "MRVCommu" OUI in their MAC addresses?  Maybe those devices are misconfigured?

I am wondering if the ARP traffic is overwhelming lwip...

I haven't had a chance to review the ARP code to see whether getting overwhelmed would affect the TCP connections. Regardless, there doesn't appear to be a high enough rate of ARPs to be troublesome; many of them are 1 second or more apart.

Please share any other clues and maybe we can find something...

Patrick


Re: Trampas LWIP problems

Dave Nadler
Thanks Trampas - Glad you're not trying to use Harmony.
Related though, what OS are you using?
How are you ensuring (and verifying) that packets are processed promptly after receipt by the driver?
Thanks,
Best Regards, Dave

-- 
Dave Nadler, USA East Coast voice (978) 263-0097, [hidden email], Skype 
 Dave.Nadler1

Re: LWIP problems

Trampas Stern
As a follow-up, we have found that we are getting CRC errors at layer 1. It seems that the customer's switch is not playing well with our device, and packets are being dropped due to CRC errors.

I have not noticed any dropped packets or other problems locally, and I have ordered a managed switch (one that monitors CRC errors) to see if I can replicate the problem.

Thanks
Trampas

On Tue, Jun 23, 2020 at 10:39 AM Trampas Stern <[hidden email]> wrote:
I have tried to reproduce the problem. We have run the system for hours on 4 different networks, from zero traffic to high traffic, and have not seen any issues.

I am trying to add more debug information to see if we can isolate the failure. So far the only differences appear to be that this system has high latency and the ARP packets. I have attempted to replicate the issue using Chrome's throttling, but with no success.

The system does poll the Ethernet MAC; that is how the reference code we had works: when a packet comes in on the GMAC hardware, it is not processed until we poll. I am wondering whether the issue lies with the polling or with the way the GMAC driver works.
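
One way a polled driver can lose the second of two back-to-back frames, such as the two fast SYNs Jens points to, is by handling a single frame per poll while the receive ring fills up. The sketch below is a host-side model, not the actual GMAC driver: the ring, `rx_ring_push`, `rx_ring_pop`, and `poll_drain` are hypothetical stand-ins. It shows a poll routine that drains every pending frame before returning and counts frames dropped when the ring is full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_RING_SLOTS 4u  /* hypothetical ring depth */

struct rx_ring {
    uint16_t frame_len[RX_RING_SLOTS]; /* length of each queued frame */
    size_t head;                       /* next slot the "hardware" fills */
    size_t tail;                       /* next slot the poller reads */
    size_t count;                      /* frames currently queued */
    unsigned dropped;                  /* frames lost because the ring was full */
};

/* Model of the MAC receiving a frame between two polls. */
static void rx_ring_push(struct rx_ring *r, uint16_t len)
{
    assert(r != NULL);
    if (r->count == RX_RING_SLOTS) {
        r->dropped++;              /* overrun: the sender must retransmit */
        return;
    }
    r->frame_len[r->head] = len;
    r->head = (r->head + 1u) % RX_RING_SLOTS;
    r->count++;
}

/* Remove one frame; returns false when the ring is empty. */
static bool rx_ring_pop(struct rx_ring *r, uint16_t *len)
{
    if (r->count == 0u) {
        return false;
    }
    *len = r->frame_len[r->tail];
    r->tail = (r->tail + 1u) % RX_RING_SLOTS;
    r->count--;
    return true;
}

/* Poll routine: keep reading until the ring is empty, so frames that arrived
 * back to back are all handled in the same pass. Returns frames handled.
 * In a real NO_SYS build each frame would be copied into a pbuf and handed
 * to the netif's input function at this point. */
static unsigned poll_drain(struct rx_ring *r)
{
    unsigned handled = 0;
    uint16_t len;
    while (rx_ring_pop(r, &len)) {
        handled++;
    }
    return handled;
}
```

If a counter like `dropped` stays at zero while retransmissions still occur, the loss is probably happening somewhere other than the receive ring.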

Trampas


On Tue, Jun 23, 2020 at 10:09 AM Jens Nielsen <[hidden email]> wrote:
Hi

As Patrick said, that's not a high enough rate of ARPs to cause any trouble; there are 11 in total in 16 seconds there? If that overwhelms your system, you have other issues...

I will eat my socks if your problem isn't caused by the two fast SYNs in 111 and 112. How sure are you that your driver isn't dropping packets at high load? Is everything fine after 131, or what happens next?

Based on your screenshot, it seems extremely unlikely to me that the issue is caused by something in your customer's network. If I were you, I'd try to reproduce and debug.

//Jens

On 2020-06-23 12:53, Trampas Stern wrote:
The maximum number of TCP connections is 20; the configuration is below. Note that this is using HTTP, not HTTPS.

What I am seeing in the code is that ARP packets do allocate memory for each packet, so I could be running out of memory.

Either way, it looks like I am either not processing packets fast enough or losing them.


/**
 * \file
 *
 * \brief LwIP configuration.
 *
 * Copyright (c) 2013-2018 Microchip Technology Inc. and its subsidiaries.
 *
 * \asf_license_start
 *
 * \page License
 *
 * Subject to your compliance with these terms, you may use Microchip
 * software and any derivatives exclusively with Microchip products.
 * It is your responsibility to comply with third party license terms applicable
 * to your use of third party software (including open source software) that
 * may accompany Microchip software.
 *
 * THIS SOFTWARE IS SUPPLIED BY MICROCHIP "AS IS". NO WARRANTIES,
 * WHETHER EXPRESS, IMPLIED OR STATUTORY, APPLY TO THIS SOFTWARE,
 * INCLUDING ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY,
 * AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL MICROCHIP BE
 * LIABLE FOR ANY INDIRECT, SPECIAL, PUNITIVE, INCIDENTAL OR CONSEQUENTIAL
 * LOSS, DAMAGE, COST OR EXPENSE OF ANY KIND WHATSOEVER RELATED TO THE
 * SOFTWARE, HOWEVER CAUSED, EVEN IF MICROCHIP HAS BEEN ADVISED OF THE
 * POSSIBILITY OR THE DAMAGES ARE FORESEEABLE.  TO THE FULLEST EXTENT
 * ALLOWED BY LAW, MICROCHIP'S TOTAL LIABILITY ON ALL CLAIMS IN ANY WAY
 * RELATED TO THIS SOFTWARE WILL NOT EXCEED THE AMOUNT OF FEES, IF ANY,
 * THAT YOU HAVE PAID DIRECTLY TO MICROCHIP FOR THIS SOFTWARE.
 *
 * \asf_license_stop
 *
 */

#ifndef __LWIPOPTS_H__
#define __LWIPOPTS_H__

/* Include ethernet configuration first */
#include "conf_eth.h"
#include "board.h"
/*
   -----------------------------------------------
   -------------- LwIP API Support ---------------
   -----------------------------------------------
*/

#define SNMP_LWIP_ENTERPRISE_OID  TRIMM_ENTERPRISE_OID

#define LWIP_HTTPD_DYNAMIC_FILE_READ 1
#define HTTPD_ENABLE_HTTPS 1
#define LWIP_ALTCP_TLS 1
#define LWIP_ALTCP_TLS_MBEDTLS 1
#define LWIP_ALTCP 1
#define LWIP_HTTPD_SUPPORT_POST 1

#define HTTPD_MAX_RETRIES  10 //tbs 3-19-2020 increased from 4 to prevent time out on bad connections

#define PPP_SUPPORT 0
#define PPPOE_SUPPORT 0

#define SNMP_LWIP_MIB2 1
#define LWIP_SNMP 1
#define LWIP_SNMP_V3 1


#define MIB2_STATS 1
#define SNMP_USE_RAW 1

//#define HTTPD_DEBUG LWIP_DBG_ON
#define ALTCP_MBEDTLS_DEBUG  LWIP_DBG_ON
//#define TCP_OUTPUT_DEBUG LWIP_DBG_ON | LWIP_DBG_LEVEL_SEVERE
//#define DHCP_DEBUG LWIP_DBG_ON

/**
 * NO_SYS==1: Provides VERY minimal functionality. Otherwise,
 * use lwIP facilities.
 * Uses Raw API only.
 */
#define NO_SYS                 1

/**
 * LWIP_NETIF_STATUS_CALLBACK==1: Support a callback function whenever an interface
 * changes its up/down status (i.e., due to DHCP IP acquisition)
 */
#define LWIP_NETIF_STATUS_CALLBACK 1

/**
 * LWIP_RAW==1: Enable application layer to hook into the IP layer itself.
 * Used to implement custom transport protocol (!= than Raw API).
 */
#define LWIP_RAW                   0

/**
 * SYS_LIGHTWEIGHT_PROT==1: if you want inter-task protection for certain
 * critical regions during buffer allocation, deallocation and memory
 * allocation and deallocation.
 */
#define SYS_LIGHTWEIGHT_PROT        0

/* These are not available when using "NO_SYS" */
#define LWIP_NETCONN             0
#define LWIP_SOCKET             0

/* Uncomment following line to use DHCP instead of fixed IP */
#define DHCP_USED

/*
   ------------------------------------
   ---------- Memory options ----------
   ------------------------------------
*/

/**
 * MEM_ALIGNMENT: should be set to the alignment of the CPU
 *    4 byte alignment -> #define MEM_ALIGNMENT 4
 *    2 byte alignment -> #define MEM_ALIGNMENT 2
 */
#define MEM_ALIGNMENT           4

/**
 * MEM_SIZE: the size of the heap memory. If the application will send
 * a lot of data that needs to be copied, this should be set high.
 */
#define MEM_SIZE                 (27 * 1024)

/**
 * MEMP_NUM_UDP_PCB: the number of UDP protocol control blocks. One
 * per active UDP "connection".
 * (requires the LWIP_UDP option)
 */
#define MEMP_NUM_UDP_PCB                2

/**
 * MEMP_NUM_TCP_PCB: the number of simultaneously active TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB                20

/**
 * MEMP_NUM_TCP_PCB_LISTEN: the number of listening TCP connections.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_PCB_LISTEN        6

/**
 * MEMP_NUM_TCP_SEG: the number of simultaneously queued TCP segments.
 * (requires the LWIP_TCP option)
 */
#define MEMP_NUM_TCP_SEG               20

/**
 * MEMP_NUM_REASSDATA: the number of IP packets simultaneously queued for
 * reassembly (whole packets, not fragments!)
 */
#define MEMP_NUM_REASSDATA              4

/**
 * MEMP_NUM_FRAG_PBUF: the number of IP fragments simultaneously sent
 * (fragments, not whole packets!).
 * This is only used with IP_FRAG_USES_STATIC_BUF==0 and
 * LWIP_NETIF_TX_SINGLE_PBUF==0 and only has to be > 1 with DMA-enabled MACs
 * where the packet is not yet sent when netif->output returns.
 */
#define MEMP_NUM_FRAG_PBUF              6

/**
 * MEMP_NUM_PBUF: the number of memp struct pbufs (used for PBUF_ROM and PBUF_REF).
 * If the application sends a lot of data out of ROM (or other static memory),
 * this should be set high.
 */
#define MEMP_NUM_PBUF                   10

/**
 * MEMP_NUM_NETBUF: the number of struct netbufs.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETBUF                 0

/**
 * MEMP_NUM_NETCONN: the number of struct netconns.
 * (only needed if you use the sequential API, like api_lib.c)
 */
#define MEMP_NUM_NETCONN                0

/**
 * PBUF_POOL_SIZE: the number of buffers in the pbuf pool.
 */
#define PBUF_POOL_SIZE                 50

/**
 * PBUF_POOL_BUFSIZE: the size of each pbuf in the pbuf pool.
 */
#define PBUF_POOL_BUFSIZE               GMAC_FRAME_LENTGH_MAX

/*
   ----------------------------------
   ---------- DHCP options ----------
   ----------------------------------
*/

#if defined(DHCP_USED)
/**
 * LWIP_DHCP==1: Enable DHCP module.
 */
#define LWIP_DHCP               1
#endif

/*
   ---------------------------------
   ---------- UDP options ----------
   ---------------------------------
*/

/**
 * LWIP_UDP==1: Turn on UDP.
 */
#define LWIP_UDP                1

/*
   ---------------------------------
   ---------- TCP options ----------
   ---------------------------------
*/

/**
 * LWIP_TCP==1: Turn on TCP.
 */
#define LWIP_TCP                1

/**
 * TCP_MSS: The maximum segment size controls the maximum amount of
 * payload bytes per packet. For maximum throughput, set this as
 * high as possible for your network (i.e. 1460 bytes for standard
 * ethernet).
 * For the receive side, this MSS is advertised to the remote side
 * when opening a connection. For the transmit side, this MSS sets
 * an upper limit on the MSS advertised by the remote host.
 */
#define TCP_MSS                 (1460)

/**
 * TCP_WND: The size of a TCP window.  This must be at least
 * (2 * TCP_MSS) for things to work well
 */
#define TCP_WND               (32 * TCP_MSS) //should be more than 16k for TLS/HTTPS

/**
 * TCP_SND_BUF: TCP sender buffer space (bytes).
 * To achieve good performance, this should be at least 2 * TCP_MSS.
 */
#define TCP_SND_BUF             (4 * TCP_MSS)

/*
   ------------------------------------
   ---------- Thread options ----------
   ------------------------------------
*/

/** The stack size allocated to the netif stack: (256 * 4) = 1024 bytes. */
#define netifINTERFACE_TASK_STACK_SIZE    256

/** The priority of the netif stack. */
#define netifINTERFACE_TASK_PRIORITY      (tskIDLE_PRIORITY + 4)

/** The stack size allocated to the TCPIP stack: (256 * 4) = 1024 bytes. */
#define TCPIP_THREAD_STACKSIZE            256

/** The priority of the TCPIP stack. */
#define TCPIP_THREAD_PRIO                 (tskIDLE_PRIORITY + 5)

/** The mailbox size for the tcpip thread messages */
#define TCPIP_MBOX_SIZE                   16
#define DEFAULT_ACCEPTMBOX_SIZE           16
#define DEFAULT_RAW_RECVMBOX_SIZE         16
#define DEFAULT_TCP_RECVMBOX_SIZE         16

/*
   ----------------------------------------
   ---------- Statistics options ----------
   ----------------------------------------
*/


/**
 * LWIP_STATS==1: Enable statistics collection in lwip_stats.
 */
#define LWIP_STATS                        1


/**
 * LWIP_STATS_DISPLAY==1: Compile in the statistics output functions.
 */
#define LWIP_STATS_DISPLAY                0

/**
 * LWIP_STATS_LARGE==1: Use 32 bits counter instead of 16.
 */
#define LWIP_STATS_LARGE                  0

#if LWIP_STATS
#define LINK_STATS                        0
#define IP_STATS                          0
#define IPFRAG_STATS                      0
#define ICMP_STATS                        0
#define IGMP_STATS                        0
#define UDP_STATS                         0
#define TCP_STATS                         0
#define MEM_STATS                         0
#define MEMP_STATS                        0
#define SYS_STATS                         0
#endif
/* Left outside to avoid warning. */
#define ETHARP_STATS                      0

/*
   ---------------------------------------
   ---------- Debugging options ----------
   ---------------------------------------
*/

//#define LWIP_NOASSERT

#define LWIP_DEBUG
#define LWIP_DBG_MIN_LEVEL              LWIP_DBG_LEVEL_WARNING
#define LWIP_DBG_TYPES_ON               LWIP_DBG_ON



// \note For a list of all possible lwIP configurations, check http://lwip.wikia.com/wiki/Lwipopts.h

#endif /* __LWIPOPTS_H__ */
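
As a rough cross-check of the memory question, the numbers in the configuration above can be compared directly. The sketch below is plain arithmetic, not lwIP code. The EX_ names and helper functions are hypothetical, and the value used for PBUF_POOL_BUFSIZE is an assumption, because GMAC_FRAME_LENTGH_MAX comes from the GMAC driver and is not shown in this post.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the lwipopts.h above. */
#define EX_TCP_MSS         1460u
#define EX_TCP_WND         (32u * EX_TCP_MSS)
#define EX_PBUF_POOL_SIZE  50u
#define EX_NUM_TCP_PCB     20u

/* ASSUMPTION: placeholder for GMAC_FRAME_LENTGH_MAX, which is defined in
 * the GMAC driver and not shown in the post. Substitute the real value. */
#define EX_PBUF_POOL_BUFSIZE 1536u

/* Upper bound on the payload bytes the pbuf pool can hold at once. */
uint32_t pool_payload_bytes(uint32_t pool_size, uint32_t buf_size)
{
    return pool_size * buf_size;
}

/* Bytes the peers may have in flight if every connection fills its window. */
uint32_t advertised_window_bytes(uint32_t connections, uint32_t tcp_wnd)
{
    return connections * tcp_wnd;
}

/* How many completely full receive windows fit in the pool. */
uint32_t full_windows_in_pool(uint32_t pool_bytes, uint32_t tcp_wnd)
{
    assert(tcp_wnd != 0u);
    return pool_bytes / tcp_wnd;
}
```

Under that assumption the pool holds at most 76800 bytes of payload, one connection advertises a 46720-byte window, and 20 connections could in principle advertise 934400 bytes. Only one completely full window fits in the pool, so broadcast frames and several busy connections would be competing for the same buffers. That would be consistent with dropped packets, but it is a hypothesis to check against the statistics counters, not a diagnosis.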

On Mon, Jun 22, 2020 at 9:43 PM Patrick Klos <[hidden email]> wrote:
On 6/22/2020 2:31 PM, Trampas Stern wrote:
I got a Wireshark dump from the customer, and it looks like there are a lot of ARP messages on their network.

Does the ARP broadcast consume a connection?

No.  ARPs do not consume a [TCP] connection.

In the image below our device is 10.2.65.250, and it looks like we are getting duplicate requests (lines 116 and 129) from the client.
image.png

It looks like packet 116 is a resend of packet 115, and then packet 117 is the ACK for packet 116.

Packet 129 is a resend of packet 112, and then packet 130 is the SYN-ACK for that [second] connection.

In both cases, your device seems to be losing a packet and recovering after the resend?

How many simultaneous connections does your device support?

The network has lots of ARP requests... 
image.png

It appears that a device is asking for 10.2.65.1, which might be a mistake, as the gateway is 10.2.64.1 with a netmask of 255.255.254.0.

Do you have any idea what devices have the "MRVCommu" OUI in their MAC addresses?  Maybe those devices are misconfigured?
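
For what it is worth, the addresses quoted above can be checked against the netmask with a few lines of C. This is a minimal sketch. The helper names ip4() and same_subnet() are made up for illustration and are not part of lwIP.

```c
#include <assert.h>
#include <stdint.h>

/* Pack four dotted-quad octets into a host-order 32-bit value. */
uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Returns 1 if both addresses fall in the same subnet under `mask`. */
int same_subnet(uint32_t x, uint32_t y, uint32_t mask)
{
    assert(mask != 0u);
    return (x & mask) == (y & mask);
}
```

With a mask of 255.255.254.0, both 10.2.65.1 and the gateway 10.2.64.1 reduce to the network 10.2.64.0, as does the device at 10.2.65.250. So 10.2.65.1 is an ordinary on-link address on that network. Whether any host actually owns it cannot be told from the arithmetic alone.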

I am wondering if the ARP traffic is overwhelming lwip...

I haven't had a chance to review the ARP code to see whether getting overwhelmed would affect the TCP connections.  Regardless, the rate of ARPs doesn't appear high enough to be troublesome.  Many of them are 1 second or more apart.

Please share any other clues and maybe we can find something...

Patrick


_______________________________________________
lwip-users mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-users
