[bug #56431] LWIP_ASSERT in sys_timeout because SNTP use so much timeouts

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[bug #56431] LWIP_ASSERT in sys_timeout because SNTP use so much timeouts

Wilfred
URL:
  <https://savannah.nongnu.org/bugs/?56431>

                 Summary: LWIP_ASSERT in sys_timeout because SNTP use so much
timeouts
                 Project: lwIP - A Lightweight TCP/IP stack
            Submitted by: dgirault
            Submitted on: lun. 03 juin 2019 13:46:47 UTC
                Category: apps
                Severity: 3 - Normal
              Item Group: Crash Error
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None
            lwIP version: git head

    _______________________________________________________

Details:

Hi all,

Few of our customers got reboot because of LWIP_ASSERT() in sys_timeout()
because not enough timeout are available.

In our configuration, we have MEMP_NUM_SYS_TIMEOUT defined as
((LWIP_NUM_SYS_TIMEOUT_INTERNAL + 42 + 7) & ~7), which result in 56 timeouts
to be available.

During tests of our codebase, we use between 20-25 timeout MAX.

I've added some log (just before call of sys_check_timeouts()) in our code to
detect who is consuming so much timeouts and got the following results:


June 2nd 2019, 00:48:40.100 small amount of free timeout lwip small amount of
free timeout remain=0 max=56 now=15401490 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.101 timeout lwip timeout time=15401300 cb=0x08093205
arg=0x0818f8bc WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.102 timeout lwip timeout time=15401300 cb=0x08093205
arg=0x0818f8c4 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.103 timeout lwip timeout time=15401300 cb=0x08093205
arg=0x0818f8e4 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.104 timeout lwip timeout time=15401320 cb=0x0809f219
arg=0xc0329594 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.105 timeout lwip timeout time=15401450 cb=0x080932b1
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.106 timeout lwip timeout time=15401700 cb=0x08093205
arg=0x0818f8b4 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.107 timeout lwip timeout time=15402200 cb=0x08093205
arg=0x0818f89c WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.108 timeout lwip timeout time=15402200 cb=0x08093205
arg=0x0818f8a4 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.109 timeout lwip timeout time=15402200 cb=0x08093205
arg=0x0818f8cc WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.110 timeout lwip timeout time=15402200 cb=0x08093205
arg=0x0818f8d4 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.111 timeout lwip timeout time=15402200 cb=0x08093205
arg=0x0818f8dc WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.112 timeout lwip timeout time=15404330 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.113 timeout lwip timeout time=15404450 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.114 timeout lwip timeout time=15404450 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.115 timeout lwip timeout time=15404450 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.116 timeout lwip timeout time=15404450 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.117 timeout lwip timeout time=15406440 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.118 timeout lwip timeout time=15406440 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.119 timeout lwip timeout time=15406440 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.120 timeout lwip timeout time=15406450 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.121 timeout lwip timeout time=15406450 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.122 timeout lwip timeout time=15406450 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.123 timeout lwip timeout time=15416200 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.124 timeout lwip timeout time=15416210 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.125 timeout lwip timeout time=15416210 cb=0x0809fb4d
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.126 timeout lwip timeout time=15419780 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.127 timeout lwip timeout time=15419780 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.128 timeout lwip timeout time=15420290 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.129 timeout lwip timeout time=15428420 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.130 timeout lwip timeout time=15428420 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.131 timeout lwip timeout time=15428420 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.132 timeout lwip timeout time=15428420 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.133 timeout lwip timeout time=15429460 cb=0x08093205
arg=0x0818f8ac WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.134 timeout lwip timeout time=15433300 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.135 timeout lwip timeout time=15433320 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.136 timeout lwip timeout time=15433320 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.137 timeout lwip timeout time=15433410 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.138 timeout lwip timeout time=15438190 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.139 timeout lwip timeout time=15438190 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.140 timeout lwip timeout time=15438190 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.141 timeout lwip timeout time=15449370 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.142 timeout lwip timeout time=15449370 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.143 timeout lwip timeout time=15449370 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:40.144 timeout lwip timeout time=15449370 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.671 timeout lwip timeout time=15449370 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.672 timeout lwip timeout time=15468210 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.673 timeout lwip timeout time=15493760 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.674 timeout lwip timeout time=15496790 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.675 timeout lwip timeout time=15496790 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.676 timeout lwip timeout time=15504330 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.677 timeout lwip timeout time=15504690 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.678 timeout lwip timeout time=15511450 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.679 timeout lwip timeout time=15511450 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.680 timeout lwip timeout time=15521210 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.681 timeout lwip timeout time=15521210 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:48:55.682 timeout lwip timeout time=15521490 cb=0x0809fc89
arg=0000000000 WARN June 2nd 2019, 08:48:39.175
June 2nd 2019, 00:49:22.685 halt fataldump halt msg="sys_halt in lwip @
0x0809fb5a (Failed at timeouts.c:190, sys_timeout: timeout != NULL, pool
MEMP_SYS_TIMEOUT is empty) version=v0.69.15 upc0=0x80db5ee upc1=0x8028a6c
upc2=0x809fb5a upc3=0x809ff28 upc4=0x8093590 upc5=0x80976f4 upc6=0x809b848
upc7=0x8028f5e upc8=0x80cadac upc9=0x80ca8ae"
early_fatal_errors=0 FATAL January 1st 1970, 00:00:18.000


According to addr2line, the two addresses listed by next_timeout linked list
dumping is in sntp.c :


$ arm-none-eabi-addr2line -e prog.elf -fC
0x0809fb4d
sntp_retry
../../lib/lwip/lwip/src/apps/sntp/sntp.c:367
0x0809fc89
sntp_request
../../lib/lwip/lwip/src/apps/sntp/sntp.c:604


This bug seems to appears only when SNTP client never got time successfully
(system date still 01/01/1970).

sntp.c is build using SNTP_CHECK_RESPONSE = 2.





    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?56431>

_______________________________________________
  Message posté via Savannah
  https://savannah.nongnu.org/


_______________________________________________
lwip-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-devel
Reply | Threaded
Open this post in threaded view
|

[bug #56431] LWIP_ASSERT in sys_timeout because SNTP use so much timeouts

Wilfred
Follow-up Comment #1, bug #56431 (project lwip):

That should really not be happening: as stated in the file header, sntp should
only need one timeout at a time. Any chance you can debug this? At least a
translation of the 'cb' and 'arg' addresses might help.

    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?56431>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/


_______________________________________________
lwip-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-devel
Reply | Threaded
Open this post in threaded view
|

[bug #56431] LWIP_ASSERT in sys_timeout because SNTP use so much timeouts

Wilfred
Follow-up Comment #2, bug #56431 (project lwip):

Hi simon,

I have already include a translation for two of the cb in my previous comment.


cb=0x0809fc89 -> sntp_request
cb=0x0809fb4d -> sntp_retry

the other cb are mostly for cyclic timers.

cb=0x08093205 -> lwip_cyclic_timer
cb=0x0809f219 -> mqtt_cyclic_timer
cb=0x080932b1 -> tcpip_tcp_timer

All non-null args seems to be entries in lwip_cyclic_timers table.

After analysis, sntp.c, I found that sys_untimeout() wasn't called at some
place. Not sure what is the initial cause.

See attached patch. It isn't included yet in our customer's version, so I
don't have result yet.

Regards,
David

(file #47037)
    _______________________________________________________

Additional Item Attachment:

File name: 0001-sntp-remove-existing-timeout-before-creating-new.patch Size:2
KB
   
<https://savannah.nongnu.org/file/0001-sntp-remove-existing-timeout-before-creating-new.patch?file_id=47037>



    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?56431>

_______________________________________________
  Message posté via Savannah
  https://savannah.nongnu.org/


_______________________________________________
lwip-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-devel