[bug #57346] TCP stop working

[bug #57346] TCP stop working

Simon Goldschmidt
URL:
  <https://savannah.nongnu.org/bugs/?57346>

                 Summary: TCP stop working
                 Project: lwIP - A Lightweight TCP/IP stack
            Submitted by: hs4smd
            Submitted on: Tue 03 Dec 2019 09:30:39 AM UTC
                Category: TCP
                Severity: 3 - Normal
              Item Group: Crash Error
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None
            lwIP version: CVS Head

    _______________________________________________________

Details:

The following text refers to the attached picture, which shows my test
environment. I ran 6 tests: tests 1 to 4 do not seem to cause any problems,
while tests 5 and 6 stop _any_ further TCP traffic.
On both PCs (#1 and #2) I started "ping" and kept it running during all tests.
"Appl." on the PCs is my application, which establishes 2 TCP connections each
(drawn in red).

Test 1:
1.) The "Appl." is _not_ running on either PC
2.) The network overview is shown on PC #1
3.) The network used for the connection is deactivated
4.) As a result, "ping" stops responding
5.) A few seconds later the network used for the connection is activated
again
6.) A short time later "ping" responds again

Test 2: Same as Test 1, but deactivating/reactivating is done on PC #2

Test 3:
1.) The "Appl." is running _only_ on PC #1
2.) The network overview is shown on PC #2
3.) The network used for the connection is deactivated
4.) As a result, "ping" stops responding
5.) A few seconds later the network used for the connection is activated
again
6.) A short time later "ping" responds again
7.) "Appl." on PC #1 is unaffected - working all the time

Test 4: Same as Test 3, but "Appl." running on PC #2 and
deactivating/reactivating is done on PC #1

Test 5:
1.) The "Appl." is running on _both_ PCs (#1 and #2)
2.) The network overview is shown on PC #2
3.) The network used for the connection is deactivated
4.) As a result, "Appl." on PC #2 _and_ PC #1 (yes, #1 !) stops working
5.) As a result, "ping" stops responding
6.) A few seconds later the network used for the connection is activated
again
7.) A short time later "ping" responds again
8.) "Appl." on _both_ PCs doesn't reconnect anymore

Test 6: Same as Test 5, but deactivating/reactivating is done on PC #1

My question: Is the cause of this behaviour already known? And if so, is
there a solution or workaround?



    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Tue 03 Dec 2019 09:30:39 AM UTC  Name: Topology.png  Size: 303KiB   By:
hs4smd
Topology of my test environment
<http://savannah.nongnu.org/bugs/download.php?file_id=47971>
-------------------------------------------------------
Date: Tue 03 Dec 2019 09:30:39 AM UTC  Name: WireShark.pdf  Size: 70KiB   By:
hs4smd
Topology of my test environment
<http://savannah.nongnu.org/bugs/download.php?file_id=47972>

    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?57346>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/


_______________________________________________
lwip-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-devel

[bug #57346] TCP stop working

Simon Goldschmidt
Follow-up Comment #1, bug #57346 (project lwip):

I did one more test, and it revealed a huge problem:

I turned off the power of switch B for approx. 2 minutes while the Appl. on
PC #2 was _not_ stopped.
After power was restored to switch B, the traffic was completely
confused.

After resetting the CM3, both applications on PC #1 and #2 reconnected
successfully. But this is neither a solution nor a workaround for the problem.

The captured network traffic is in the attached text file.

(file #47974)
    _______________________________________________________

Additional Item Attachment:

File name: Errors.txt                     Size:35 KB
    <https://savannah.nongnu.org/file/Errors.txt?file_id=47974>



    _______________________________________________________


[bug #57346] TCP stop working

Simon Goldschmidt
Update of bug #57346 (project lwip):

              Item Group:             Crash Error => None                  
            lwIP version:                CVS Head => git head              

    _______________________________________________________

Follow-up Comment #2:

How is this a crash error? How are you even sure this is a bug in lwIP?

Aside from that, you are staying much too vague. Get a debugger and see what
your target is doing. And when sending wireshark captures, send pcap files,
not pdf or txt.

That aside, you don't even say what your application looks like. Oh, and I do
remember the MAC driver coming with SmartFusion2 was kind of hard to use when
it comes to using multiple interrupt levels.

    _______________________________________________________


[bug #57346] TCP stop working

Simon Goldschmidt
Follow-up Comment #3, bug #57346 (project lwip):

Yes, I'm sure that this is something in lwIP. Why? Because "ping" is not
affected at all and I saw Dup ACKs and spurious packets after restoring the
power of switch B.

My application on PC #1, which should not be affected by cutting the
traffic to PC #2 (by powering off the switch), disconnected after an internal
timeout and tried to reconnect, but there was no answer from lwIP. As said
before, "ping" works the whole time without problems (the reported response
time does not even change).

OK, I'll add a serial output for the ASSERT strings and capture them in PuTTY.
So far I have started the target from within "Softconsole" and got no break
in any ASSERT.

I didn't know that wireshark is so common. Because the behaviour is so easy to
reproduce in my test environment, I'll capture the traffic while switching the
power of the switch and send it as pcap.

I added "main" (.c and .h) and "lwipopts.h" of my application. You'll see, the
application is really very simple. In principle, one connection just copies
data from TCP to a buffer in the PLC controller and vice versa.
Another connection reads a DP-RAM in the FPGA fabric and sends the data over
TCP.


(file #47980, file #47981, file #47982)
    _______________________________________________________

Additional Item Attachment:

File name: main.c                         Size:19 KB
    <https://savannah.nongnu.org/file/main.c?file_id=47980>

File name: main.h                         Size:2 KB
    <https://savannah.nongnu.org/file/main.h?file_id=47981>

File name: lwipopts.h                     Size:56 KB
    <https://savannah.nongnu.org/file/lwipopts.h?file_id=47982>



    _______________________________________________________


[bug #57346] TCP stop working

Simon Goldschmidt
Follow-up Comment #4, bug #57346 (project lwip):

OK, so you're using the netconn API. That means you can run into a bunch of
problems when threading is done wrong.

From your previous posts, I couldn't tell that ping is still working. Maybe
you should concentrate on one failing scenario first instead of using so many
words that they distract from the problem :-)

Your lwipopts.h is wrong. It seems to be a copy of opt.h, but it should really
only be a file that overrides the options you need to change.
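
As a rough illustration of that point, a minimal lwipopts.h might look like
the sketch below. The option names are standard lwIP options from opt.h, but
every value here is a placeholder assumption and not a recommendation for this
target; anything not listed simply keeps its opt.h default.

```c
/* lwipopts.h: only the options that differ from the defaults in opt.h.
 * All values are illustrative placeholders, not tuned for any target. */
#ifndef LWIPOPTS_H
#define LWIPOPTS_H

/* An OS is present, so the sequential (netconn) API can be used. */
#define NO_SYS                  0
#define LWIP_NETCONN            1
#define LWIP_SOCKET             0

/* Memory pools: sized here only as an example. */
#define MEM_SIZE                (16 * 1024)
#define MEMP_NUM_TCP_PCB        4
#define MEMP_NUM_TCP_SEG        32
#define PBUF_POOL_SIZE          16

/* TCP buffers, expressed in multiples of the MSS. */
#define TCP_MSS                 1460
#define TCP_WND                 (4 * TCP_MSS)
#define TCP_SND_BUF             (4 * TCP_MSS)

/* Statistics, so that memory and protocol errors can be printed. */
#define LWIP_STATS              1
#define LWIP_STATS_DISPLAY      1

#endif /* LWIPOPTS_H */
```

With LWIP_STATS and LWIP_STATS_DISPLAY enabled, the counters in lwip_stats are
maintained and can be dumped with stats_display(), provided the port defines
LWIP_PLATFORM_DIAG.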

All that aside, have you checked that you don't run out of memory? A
perfect explanation would be that the TCP connections where the "wire" is
broken hog all buffers so that the remaining connections at some point just
cannot send anything.

To check that, you can print lwip_stats from your main task, e.g. every second
or so. I wouldn't have expected an ASSERT though.

I'm still not convinced this is a *bug* in lwIP.

    _______________________________________________________
