URL:
  <https://savannah.nongnu.org/bugs/?57790>

Summary: Fragmented UDP packets leads to crash on reassembly
Project: lwIP - A Lightweight TCP/IP stack
Submitted by: jmalmari
Submitted on: Tue 11 Feb 2020 04:05:23 PM UTC
Category: IPv4
Severity: 3 - Normal
Item Group: Crash Error
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Planned Release: None
lwIP version: git head

_______________________________________________________

Details:

= Test setup =

== Device under test ==

Ran into a crash with a custom ST H7 board. Reproducible with NUCLEO-H743ZI2.

== Software ==

Reproducible with ST's project example:
STM32Cube_FW_H7_V1.6.0/Projects/NUCLEO-H743ZI/Applications/LwIP/LwIP_HTTP_Server_Netconn_RTOS
(only changed IP and disabled dhcp)

Available at https://www.st.com/en/embedded-software/stm32cubeh7.html

== LwIP ==

lwipopts.h attached

Versions tested: 2.0.3, 2.2.0

Not sure if relevant, but peculiarities of the H7 include its multiple SRAMs. In the example, program data is in AXI SRAM, to which the ethernet DMA has no access. Therefore, the _LWIP_RAM_HEAP_POINTER_ is relocated to SRAM3. Ethernet RX buffers are also in SRAM3. CPU cache for these memory regions is configured (disabled) by the Memory Protection Unit (MPU_Config()).

= Test run =

From Linux shell:

  bs=6000
  ip=<device ip>
  dd if=/dev/urandom bs=$bs count=5 |socat -b $bs -u stdin UDP4-DATAGRAM:$ip:12345

Expect a hardfault after a few seconds.

= Debug =

The source of the hardfault is the function _ip_reass_free_complete_datagram_, dereferencing the invalid pointer p (p->payload).

The issue seems to be a combination of fragmented packets and the receive buffer filling up. Therefore, depending on memory settings, the amount of UDP bombardment may need adjusting.
Typical output with IP_REASS_DEBUG=LWIP_DBG_ON:

  ip_reass_pbufcount: 1 out
  ip4_reass: matching previous fragment ID=8778
  ip4_reass: last fragment seen, total len 1508
  ip_reass_pbufcount: 2 out
  ip_reass_tmr: timer dec 14
  ip_reass_pbufcount: 3 out
  ip4_reass: matching previous fragment ID=8779
  ip4_reass: last fragment seen, total len 1508
  ip_reass_pbufcount: 4 out
  ip_reass_pbufcount: 5 out
  ip4_reass: matching previous fragment ID=877a
  ip4_reass: last fragment seen, total len 1508
  ip_reass_pbufcount: 6 out
  ip4_reass: last fragment seen, total len 1508
  ip_reass_pbufcount: 7 out
  ip4_reass: last fragment seen, total len 1508
  ip_reass_pbufcount: 8 out
  ip_reass_tmr: timer dec 14
  ip_reass_tmr: timer dec 14
  ip_reass_tmr: timer dec 14
  ip_reass_tmr: timer dec 14
  ip_reass_tmr: timer dec 13
  ip_reass_tmr: timer dec 13
  ip_reass_tmr: timer dec 13
  ip_reass_tmr: timer dec 13
  <...counting down from 13 to 1...>
  ip_reass_tmr: timer dec 1
  ip_reass_tmr: timer dec 1
  ip_reass_tmr: timer dec 0
  ip_reass_tmr: timer dec 0
  ip_reass_tmr: timer dec 0
  ip_reass_tmr: timer dec 0
  ip_reass_tmr: timer dec 0
  ip_reass_tmr: timer timed out
  <hardfault>

= Workaround =

  #define IP_REASSEMBLY 0

_______________________________________________________

File Attachments:

-------------------------------------------------------
Date: Tue 11 Feb 2020 04:05:23 PM UTC  Name: lwipopts.h  Size: 9KiB  By: jmalmari

<http://savannah.nongnu.org/bugs/download.php?file_id=48393>

_______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?57790>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/

_______________________________________________
lwip-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-devel
Follow-up Comment #1, bug #57790 (project lwip):
You won't get me to reproduce this with an H7 board, sorry.

The drivers ST provides are known to be buggy. Why are you sure this is a bug in lwIP, not in your port?
Follow-up Comment #2, bug #57790 (project lwip):
[comment #1:]
> You won't get me to reproduce this with an H7 board, sorry.

Understandable.

> The drivers ST provides are known to be buggy. Why are you sure this is a bug in lwIP, not in your port?

I am not. During my short exposure to ST drivers, I agree, they seem not well tested. However, this test consistently fails for me while no other stress test does.

On the other hand, I do have ST's F4 with LwIP that doesn't fail. The biggest difference between the two is the memory setup and layout.

Are there any restrictions on what type of memory access must be available for LwIP's rx/tx to work? Is it ok for LWIP_RAM_HEAP_POINTER to be only for writing and RX only for reading?
Follow-up Comment #3, bug #57790 (project lwip):
Are you at least sure it's not a threading issue?

If you could send a simple udp API level application to test against, I could try to test that on another target...

> Is it ok for LWIP_RAM_HEAP_POINTER to be only for writing and RX only for reading?

I'm afraid I don't understand that. But I would have thought the memory setup of the H7 itself should not be a problem, unless configured wrong.
Follow-up Comment #4, bug #57790 (project lwip):
> Are you at least sure it's not a threading issue?

Yes. My app is single threaded. The public example happened to be multi-threaded.

> If you could send a simple udp API level application to test against, I could try to test that on another target...

No udp api code is even needed. A simple loop running just the lwip stack is enough.

> > Is it ok for LWIP_RAM_HEAP_POINTER to be only for writing and RX only for reading?
>
> I'm afraid I don't understand that.

Sorry, I wasn't clear. I mean that lwip is given a pool that may be write-only (which the eth dma will read). For lwip input, a custom pbuf is allocated from an external pool that may be read-only (which the eth dma wrote). I just wonder whether it's safe to assume lwip doesn't use pbuf payloads in a conflicting way under the hood.
Follow-up Comment #5, bug #57790 (project lwip):
> Yes. My app is single threaded. The public example happened to be multi.

Your app is not the only thing that can cause problems. It's mainly the port (are interrupts used, how are timers executed, do interrupts call into lwIP code, etc) that causes problems.

As to the read/write of memory: lwIP *does* write into the payload of received packets. They are *not* readonly. TCP does that (there's a bug that this might need to be changed) and IPv4 reassembly does that, too: it stores reassembly information in the pbuf where the IP header would be (after copying the header once). So this might be the fault you're seeing.
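To make the point about reassembly writing into received payloads concrete, here is a minimal stand-alone sketch. It does not use lwIP itself: `struct rx_buf`, `struct reass_helper` and the `reass_*` functions are hypothetical names, loosely modelled on the `struct ip_reass_helper` cast that appears in ip4_frag.c. It only illustrates the idea that the link to the next fragment lives inside the fragment's own buffer, so the buffer has to be writable and must stay valid until the datagram completes or times out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an RX buffer holding one IPv4 fragment. */
struct rx_buf {
    uint8_t payload[64];   /* begins with the 20-byte IPv4 header */
};

/* Simplified model of a reassembly helper record. It is written over
 * the start of the payload, where the IP header used to be. */
struct reass_helper {
    struct rx_buf *next;   /* next fragment of the same datagram */
    uint16_t start;        /* offset of the first byte of this fragment */
    uint16_t end;          /* offset one past the last byte */
};

/* Link a fragment into the chain by overwriting its header area.
 * This is a write into the RX buffer itself. */
static void reass_store(struct rx_buf *frag, struct rx_buf *next,
                        uint16_t start, uint16_t end)
{
    struct reass_helper h = { next, start, end };
    assert(sizeof h <= sizeof frag->payload);
    memcpy(frag->payload, &h, sizeof h);   /* memcpy avoids alignment issues */
}

/* Read the helper back, as a free or timeout path would when walking
 * the chain of not-yet-complete fragments. */
static struct reass_helper reass_load(const struct rx_buf *frag)
{
    struct reass_helper h;
    memcpy(&h, frag->payload, sizeof h);
    return h;
}

/* Count the fragments in a chain by following the stored next pointers. */
static int reass_chain_len(const struct rx_buf *first)
{
    int n = 0;
    while (first != NULL) {
        n++;
        first = reass_load(first).next;
    }
    return n;
}
```

In this model, if a fragment's buffer is read-only, reused by DMA, or freed while it is still queued, the `next` pointer read back by `reass_load()` is garbage, and the walk dereferences an invalid pointer on a later iteration.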
Follow-up Comment #6, bug #57790 (project lwip):
> As to the read/write of memory: lwIP *does* write into the payload of received packets. They are *not* readonly. TCP does that (there's a bug that this might need to be changed) and IPv4 reassembly does that, too: it stores reassembly information in the pbuf where the IP header would be (after copying the header once).

This is good to know. I assumed otherwise, so this is something I need to get back to. I won't be able to check it right away so feel free to close this for now.

Can you point me to that tcp bug so I can follow up any discussion there?

Thanks
Follow-up Comment #7, bug #57790 (project lwip):
Yes, that's task #14807 (not a bug). I'll still leave this open as it might hit others and I think IP reassembly and TCP might be the only parts that write to rx pbufs.
Update of bug #57790 (project lwip):
Status: None => Invalid
Assigned to: None => goldsimon
Open/Closed: Open => Closed

_______________________________________________________

Follow-up Comment #8:

Changed the summary of task #14807 to include IPv4 reassembly, so this can be closed. Also, I've added code to the windows port to test readonly RX.
Follow-up Comment #9, bug #57790 (project lwip):
I can confirm that bug. I'm using an STM32F767II MCU with FreeRTOS and LwIP 2.1.2. I'm not working with a real hardware network interface; I've implemented a virtual network interface without any DMA access etc.

In ip_reass_free_complete_datagram:

  [...]
  /* First, free all received pbufs.  The individual pbufs need to be released
     separately as they have not yet been chained */
  p = ipr->p;
  while (p != NULL) {
    struct pbuf *pcur;
*->* iprh = (struct ip_reass_helper *)p->payload;

It "crashed" in the second iteration of that while-loop.

Setting IP_REASSEMBLY to 0 helps to prevent that fault; however, that is not a solution.

Is there any news on how to fix that?
Follow-up Comment #10, bug #57790 (project lwip):
Re comment #9: You're using 2.1.2 but this report is set to "git head". Can you check if the issue is still present in git head? Thanks.
Follow-up Comment #11, bug #57790 (project lwip):
I compared ip4_frag.c version 2.1.2 with git head and there were no relevant changes made. I skipped further testing of that version since I had already burned a full week realizing that the bug was not in our product. I will put that on my todo list and if I get some time I will test it again. I just want to inform the community that there seems to be some kind of problem.
Follow-up Comment #12, bug #57790 (project lwip):
Maybe the time is better invested debugging the error you have instead of going to git head.
- What is the value of p or p->payload when it crashes?
- Are you sure threading requirements are met in your setup?
- Are you sure the STM MAC driver is not buggy?

I'm not aware of any problems here. Also, extended fuzzing tests did not show such bugs (and they showed quite a lot, lately).
Follow-up Comment #13, bug #57790 (project lwip):
Mentioning the threading requirements is a good point; I will check that in detail. But I think I found at least one issue in my source.

In my interface's low level input method:

  struct pbuf* p = pbuf_alloc(PBUF_RAW, Msg->Size, PBUF_REF);
  if (p != NULL)
  {
    p->payload = GetPayloadBuffer(Msg);
    p->len = Msg->Size;
    if (Interface->input(p, Interface) != ERR_OK )
    {
      pbuf_free(p); // On fail, we have to free the buffer
    }
    else
    {
      CallbackResult = 1; // buffer is already freed internally
    }
  }

My Msg was actually a stack variable. I was assuming that Interface->input() was blocking and that the content of that stack variable gets cloned by that call. However, this isn't the case, especially with PBUF_REF set.
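The lifetime problem described above can be illustrated without lwIP. The sketch below uses hypothetical names throughout (`struct queued_pkt`, `pkt_ref`, `pkt_copy`, `pkt_release`) and only contrasts two ownership models: a reference-style packet that keeps the caller's pointer, which is roughly what PBUF_REF does, and a copy-style packet that owns private storage. In a real lwIP port the copy-style route would, to my understanding, be to allocate the pbuf as PBUF_POOL or PBUF_RAM and fill it with pbuf_take(). Treat that as a suggestion to verify against your lwIP version, not as the confirmed fix for this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet that a stack may keep queued after input() returns,
 * for example while waiting for the remaining fragments of a datagram. */
struct queued_pkt {
    uint8_t *data;
    size_t   len;
    int      owns_data;   /* 1 if data was allocated by pkt_copy() */
};

/* Reference-style: stores only the caller's pointer. The caller's buffer
 * must outlive the packet, which a stack variable does not. */
static struct queued_pkt pkt_ref(uint8_t *buf, size_t len)
{
    struct queued_pkt p = { buf, len, 0 };
    return p;
}

/* Copy-style: takes a private copy, so the caller's buffer may be reused
 * or go out of scope right away. Returns 0 on success, -1 on failure. */
static int pkt_copy(struct queued_pkt *out, const uint8_t *buf, size_t len)
{
    assert(out != NULL);
    out->data = malloc(len > 0 ? len : 1);
    if (out->data == NULL) {
        return -1;
    }
    memcpy(out->data, buf, len);
    out->len = len;
    out->owns_data = 1;
    return 0;
}

/* Release a packet; only copy-style packets own memory to free. */
static void pkt_release(struct queued_pkt *p)
{
    if (p->owns_data) {
        free(p->data);
    }
    p->data = NULL;
    p->len = 0;
    p->owns_data = 0;
}
```

With the reference-style packet, anything that overwrites the original buffer after the call shows up in the queued packet; with the copy-style packet it does not. That is the difference that matters when fragments sit in the reassembly queue for several timer ticks.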