[bug #58142] mDNS: RFC violation after recent changes - affecting probing

Ashley Duncan
URL:
  <https://savannah.nongnu.org/bugs/?58142>

                 Summary: mDNS: RFC violation after recent changes - affecting
probing
                 Project: lwIP - A Lightweight TCP/IP stack
            Submitted by: jasperv
            Submitted on: Wed 08 Apr 2020 12:54:11 PM CEST
                Category: apps
                Severity: 3 - Normal
              Item Group: Faulty Behaviour
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None
            lwIP version: git head

    _______________________________________________________

Details:

Commit introducing wrong behavior:

e85e4738
mdns: remove service TXT record from probe packets
   TXT records isn't required to be unique in network, so it shouldn't be
included in probe packets.
   Additionnaly, when TXT record is present, the Bonjour Conformance Test from
Apple Inc. always fail because generated probe nevert have TXT record.

change done:

-      outmsg->serv_replies[i] = REPLY_SERVICE_SRV | REPLY_SERVICE_TXT;
+      outmsg->serv_replies[i] = REPLY_SERVICE_SRV;


This breaks the Simultaneous Probe Tiebreaking feature in mDNS.
RFC6762 section 8.2 states:
   When a host is probing for a group of related records with the same
   name (e.g., the SRV and TXT record describing a DNS-SD service), only
   a single question need be placed in the Question Section, since query
   type "ANY" (255) is used, which will elicit answers for all records
   with that name.  However, for tiebreaking to work correctly in all
   cases, the Authority Section must contain *all* the records and
   proposed rdata being probed for uniqueness.

If we receive an ANY question for a service, we reply with the SRV and TXT
records in the answer section.
We add the A record in the additional section. This is still the case after
the above change.
If we answer with both records there, the probe should carry both as well.

I compared this to the behavior of a "well-known" implementation -
avahi-daemon.
On my network I have a device called 'device1' that has an http service using
a version of lwIP prior to the above commit.
If I use avahi-daemon to also host that same service I can see the probes and
the response of my unit.

$ avahi-publish-service device1 _http._tcp 80 []

Avahi sends out this query:
  Multicast Domain Name System (query)
    Transaction ID: 0x0000
    Flags: 0x0000 Standard query
    Questions: 1
    Answer RRs: 0
    Authority RRs: 2
    Additional RRs: 0
    Queries
        device1._http._tcp.local: type ANY, class IN, "QM" question
    Authoritative nameservers
        device1._http._tcp.local: type SRV, class IN, priority 0, weight 0,
port 80, target jasper.local
        device1._http._tcp.local: type TXT, class IN
    [Response In: 19]

Including TXT and SRV.

The response of my device:
  Multicast Domain Name System (response)
    Transaction ID: 0x0000
    Flags: 0x8400 Standard query response, No error
    Questions: 0
    Answer RRs: 2
    Authority RRs: 0
    Additional RRs: 1
    Answers
        device1._http._tcp.local: type SRV, class IN, cache flush, priority 0,
weight 0, port 80, target device1.local
        device1._http._tcp.local: type TXT, class IN, cache flush
    Additional records
        device1.local: type A, class IN, cache flush, addr 192.168.0.121
    [Unsolicited: True]

You can see above that my PC's hostname doesn't match the service name
(jasper != device1).
Giving both the same hostname isn't possible, as that would cause a hostname
conflict, which would be resolved at that level already.
But that doesn't matter for my point.

So I request to revert this commit.
We should, however, check why Apple's Bonjour Conformance Test has an issue
with this.
Maybe something else is wrong?
How are the Bonjour Conformance Tests executed?




    _______________________________________________________

Reply to this item at:

  <https://savannah.nongnu.org/bugs/?58142>

_______________________________________________
  Message sent via Savannah
  https://savannah.nongnu.org/


_______________________________________________
lwip-devel mailing list
[hidden email]
https://lists.nongnu.org/mailman/listinfo/lwip-devel
Follow-up Comment #1, bug #58142 (project lwip):

Hi jasper,


> However, for tiebreaking to work correctly in all
> cases, the Authority Section must contain all the records and
> proposed rdata being probed for uniqueness.

The key word here is uniqueness. Since the TXT record doesn't require uniqueness
in the network, it should not be used during tiebreaking.

Perhaps I have misunderstood the RFC, but BCT always fails for the `WINNING
SIMULTANEOUS PROBES` tests without this fix.

Maybe the probe tiebreaking checks in lwIP should not use this function at all.

You may revert this commit, but a better fix must be added to replace it.

Perhaps removing the TXT record after the call to
`mdns_define_probe_rrs_to_send` in `mdns_handle_probe_tiebreaking` ?

The BCT never has a TXT record in its conflicting probe, even if one exists in
the probe sent by lwIP. It may be a BCT bug, since according to the Apple
mDNSResponder source code I've just checked, the TXT record is typed
`kDNSRecordTypeUnique` like the SRV record and is included in the probes sent.

Additionally, you should try using the dnssd daemon from Apple instead of
Avahi ;-). RFC6762 is an Apple RFC, and the reference implementation is in
mDNSResponder which was open-source. See here:
https://opensource.apple.com/tarballs/mDNSResponder/.

Regards,
David

Follow-up Comment #2, bug #58142 (project lwip):

I just recovered a capture taken just before my patch. See frames 239 and 241
in the attached capture.

Frame 239 is the probe sent by lwIP. Frame 241 is the conflicting probe sent
by the BCT. This one should result in lwIP winning the tiebreak (port 79 is
lower than the port 80 sent by lwIP). All previous conflicting probes result
in lwIP losing the tiebreak.

But lwIP lost!

Frame 243: lwIP re-probes. Frame 245: BCT conflicting probe that should lose.
Same for 246 and 248, the third probe and conflict.

lwIP should announce right after that, but instead it probes again in
249/251/253.

lwIP only announces in frame 256, after 3 additional probes during which the
BCT sent no conflict (the test had already failed).
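
The "port 79 < port 80" expectation comes from the rdata comparison rule in
RFC6762 section 8.2: rdata is compared byte by byte as unsigned values, and
the later data wins. Below is a minimal sketch of that rule applied to the SRV
records alone. The function names are made up for illustration and are not
lwIP's, and only the fixed SRV fields are compared, on the assumption that the
target name (lwip-180.local) is identical in both probes, as the capture shows.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare two raw rdata buffers byte by byte as unsigned values.
 * Returns >0 if a is lexicographically later, <0 if earlier, 0 if identical.
 * If one buffer is a prefix of the other, the longer one is later. */
int rdata_compare(const uint8_t *a, size_t a_len,
                  const uint8_t *b, size_t b_len)
{
  size_t n = (a_len < b_len) ? a_len : b_len;
  size_t i;

  for (i = 0; i < n; i++) {
    if (a[i] != b[i]) {
      return (a[i] > b[i]) ? 1 : -1;
    }
  }
  if (a_len == b_len) {
    return 0;
  }
  return (a_len > b_len) ? 1 : -1;
}

/* Hypothetical helper: write the fixed part of SRV rdata (priority, weight,
 * port) in network byte order. The target name is left out because it is
 * the same in both probes of the capture. */
void srv_rdata_fixed(uint8_t out[6], uint16_t priority, uint16_t weight,
                     uint16_t port)
{
  out[0] = (uint8_t)(priority >> 8);
  out[1] = (uint8_t)(priority & 0xff);
  out[2] = (uint8_t)(weight >> 8);
  out[3] = (uint8_t)(weight & 0xff);
  out[4] = (uint8_t)(port >> 8);
  out[5] = (uint8_t)(port & 0xff);
}
```

Filling one buffer with (0, 0, 80) and the other with (0, 0, 79) and calling
rdata_compare() on them gives a positive result for the port 80 side, so on
SRV data alone lwIP holds the later record.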


  239  63.088615 169.254.61.58 → 224.0.0.251  MDNS 197 Standard query 0x0000
ANY lwip-180.local, "QU" question ANY Cobra [938]._cobra-eci._tcp.local, "QU"
question A 169.254.61.58 AAAA fe80::f8f0:5ff:fe36:3a3c SRV 0 0 80
lwip-180.local TXT
  241  63.261466 169.254.182.99 → 224.0.0.251  MDNS 160 Standard query
0x0000 ANY CoBra [938]._cObrA-eCi._Tcp.LOcal, "QU" question SRV 0 0 79
lwip-180.local
  243  64.281103 169.254.61.58 → 224.0.0.251  MDNS 197 Standard query 0x0000
ANY lwip-180.local, "QU" question ANY Cobra [938]._cobra-eci._tcp.local, "QU"
question A 169.254.61.58 AAAA fe80::f8f0:5ff:fe36:3a3c SRV 0 0 80
lwip-180.local TXT
  245  64.480271 169.254.182.99 → 224.0.0.251  MDNS 160 Standard query
0x0000 ANY Cobra [938]._cobra-ecI._tcP.lOCAl, "QU" question SRV 0 0 79
lwip-180.local
  246  65.491614 169.254.61.58 → 224.0.0.251  MDNS 197 Standard query 0x0000
ANY lwip-180.local, "QU" question ANY Cobra [938]._cobra-eci._tcp.local, "QU"
question A 169.254.61.58 AAAA fe80::f8f0:5ff:fe36:3a3c SRV 0 0 80
lwip-180.local TXT
  248  65.710155 169.254.182.99 → 224.0.0.251  MDNS 160 Standard query
0x0000 ANY cobrA [938]._CObRa-eci._TcP.lOcaL, "QU" question SRV 0 0 79
lwip-180.local
  249  66.735742 169.254.61.58 → 224.0.0.251  MDNS 197 Standard query 0x0000
ANY lwip-180.local, "QU" question ANY Cobra [938]._cobra-eci._tcp.local, "QU"
question A 169.254.61.58 AAAA fe80::f8f0:5ff:fe36:3a3c SRV 0 0 80
lwip-180.local TXT
  251  66.979057 169.254.61.58 → 224.0.0.251  MDNS 197 Standard query 0x0000
ANY lwip-180.local, "QU" question ANY Cobra [938]._cobra-eci._tcp.local, "QU"
question A 169.254.61.58 AAAA fe80::f8f0:5ff:fe36:3a3c SRV 0 0 80
lwip-180.local TXT
  253  67.275080 169.254.61.58 → 224.0.0.251  MDNS 197 Standard query 0x0000
ANY lwip-180.local, "QU" question ANY Cobra [938]._cobra-eci._tcp.local, "QU"
question A 169.254.61.58 AAAA fe80::f8f0:5ff:fe36:3a3c SRV 0 0 80
lwip-180.local TXT
  256  67.566714 169.254.61.58 → 224.0.0.251  MDNS 358 Standard query
response 0x0000 A, cache flush 169.254.61.58 PTR, cache flush lwip-180.local
AAAA, cache flush fe80::f8f0:5ff:fe36:3a3c PTR, cache flush lwip-180.local PTR
_cobra-eci._tcp.local PTR Cobra [938]._cobra-eci._tcp.local SRV, cache flush 0
0 80 lwip-180.local TXT, cache flush
  258  68.512479 169.254.61.58 → 224.0.0.251  MDNS 358 Standard query
response 0x0000 A, cache flush 169.254.61.58 PTR, cache flush lwip-180.local
AAAA, cache flush fe80::f8f0:5ff:fe36:3a3c PTR, cache flush lwip-180.local PTR
_cobra-eci._tcp.local PTR Cobra [938]._cobra-eci._tcp.local SRV, cache flush 0
0 80 lwip-180.local TXT, cache flush



I hope this helps resolve the tiebreaking issue the right way.

Regards,
David

(file #48793)
    _______________________________________________________

Additional Item Attachment:

File name: capture_bct2.pcap              Size:50 KB
    <https://savannah.nongnu.org/file/capture_bct2.pcap?file_id=48793>



Follow-up Comment #3, bug #58142 (project lwip):

Hi David,

I think that this topic is a big discussion point between implementers.
I have checked a few different implementations and see that this is not
consistent.
Some include TXT record in the authority section and some don't.
This to me is a big issue since tie-breaking will not work between these
different implementations.
So there is some work to do before mDNS can be really stable.
That contradicts the whole point of zero configuration.

I must say that some parts of the RFC are sensitive to interpretation.
This part is definitely one of them.

You suggest we fix this in another place but keep the txt record in the
probe.
This I think isn't a good idea since the tie-breaking is looking for *all*
answers which includes txt records.
If we do not take into account txt records for tie-breaking but include it in
the authority section we might confuse the other host.
This is also an implementation that is not described in the RFC.

So for me the question is simple: do we include TXT or not in the probe
authority section.
Or in other words is your patch good or not.

I have taken the devices I have at home and evaluated what they include in the
probe message:
* Google Chromecast Ultra - includes the TXT record in the authority section
* Apple TV 1st gen - does not include TXT record
* Apple TV 2nd gen - does not include TXT record
* Brother printer - does not include TXT record
* avahi-daemon - includes TXT record
* Honeywell Lyric thermostat - does not include TXT record

Since Apple is probably the most trustworthy implementation, I agree we
should follow them.
Strange that a company as big as Google does it differently.
They might still be annoyed they didn't develop it :).

So I think this bug report might be incorrect.
Please give me a little more time to discuss this with others before rejecting
this bug.

Kind regards,
Jasper

Follow-up Comment #4, bug #58142 (project lwip):

Hi David,

I went through the RFC6762 probe tiebreaking section again and noticed
something that proves the bug report is valid.

In RFC6762 section 8.2.1., Simultaneous Probe Tiebreaking for Multiple
Records, you can read the following:

   The records are sorted using the same lexicographical order as
   described above, that is, if the record classes differ, the record
   with the lower class number comes first.  If the classes are the same
   but the rrtypes differ, the record with the lower rrtype number comes
   first.  If the class and rrtype match, then the rdata is compared
   bytewise until a difference is found.  For example, in the common
   case of advertising DNS-SD services with a TXT record and an SRV
   record, the TXT record comes first (the rrtype value for TXT is 16)
   and the SRV record comes second (the rrtype value for SRV is 33).

The example given states that for a DNS-SD service both the TXT and SRV
records are found in the authority section.
This proves that the TXT record should be in.
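
The ordering quoted above, followed by the pairwise comparison that section
8.2.1 describes, can be sketched as follows. This is an illustration under
stated assumptions, not lwIP code: `struct probe_rr`, `probe_rr_compare()` and
`probe_tiebreak()` are hypothetical names, the class is assumed to have the
cache-flush bit already masked off, and both record lists are assumed to
belong to the same name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record view used only for this sketch. */
struct probe_rr {
  uint16_t rrclass;      /* cache-flush bit already masked off */
  uint16_t rrtype;       /* e.g. 16 = TXT, 33 = SRV */
  const uint8_t *rdata;  /* raw, uncompressed rdata */
  size_t rdata_len;
};

/* Order by class, then rrtype, then rdata bytes (memcmp compares bytes as
 * unsigned char). A record whose rdata is a prefix of the other is earlier.
 * Returns <0 if a is earlier, >0 if a is later, 0 if equal. */
int probe_rr_compare(const struct probe_rr *a, const struct probe_rr *b)
{
  size_t n;
  int c;

  if (a->rrclass != b->rrclass) {
    return (a->rrclass < b->rrclass) ? -1 : 1;
  }
  if (a->rrtype != b->rrtype) {
    return (a->rrtype < b->rrtype) ? -1 : 1;
  }
  n = (a->rdata_len < b->rdata_len) ? a->rdata_len : b->rdata_len;
  c = (n > 0) ? memcmp(a->rdata, b->rdata, n) : 0;
  if (c != 0) {
    return (c < 0) ? -1 : 1;
  }
  if (a->rdata_len == b->rdata_len) {
    return 0;
  }
  return (a->rdata_len < b->rdata_len) ? -1 : 1;
}

static int probe_rr_qsort_cb(const void *a, const void *b)
{
  return probe_rr_compare((const struct probe_rr *)a,
                          (const struct probe_rr *)b);
}

/* Sort both lists, then compare pairwise; the first differing pair decides.
 * If one list runs out first, the list with records remaining wins.
 * Returns >0 if ours wins, <0 if ours loses, 0 if the sets are identical.
 * Both arrays are sorted in place. */
int probe_tiebreak(struct probe_rr *ours, size_t n_ours,
                   struct probe_rr *theirs, size_t n_theirs)
{
  size_t n = (n_ours < n_theirs) ? n_ours : n_theirs;
  size_t i;

  qsort(ours, n_ours, sizeof(*ours), probe_rr_qsort_cb);
  qsort(theirs, n_theirs, sizeof(*theirs), probe_rr_qsort_cb);
  for (i = 0; i < n; i++) {
    int c = probe_rr_compare(&ours[i], &theirs[i]);
    if (c != 0) {
      return c;
    }
  }
  if (n_ours == n_theirs) {
    return 0;
  }
  return (n_ours > n_theirs) ? 1 : -1;
}
```

With the record sets from the capture in comment #2 (our probe carrying TXT
plus SRV with port 80, the conflicting probe carrying only SRV with port 79),
this sketch reports a loss for our side: our TXT record sorts first and is
compared against the other side's SRV record, and rrtype 16 is lower than 33.
When both sides carry the same record types, the SRV port decides and port 80
wins. So the outcome depends on both hosts putting the same set of records in
the authority section.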

If we go back to the previous discussed section 8.2 Simultaneous Probe
Tiebreaking:

   When a host is probing for a group of related records with the same
   name (e.g., the SRV and TXT record describing a DNS-SD service), only
   a single question need be placed in the Question Section, since query
   type "ANY" (255) is used, which will elicit answers for all records
   with that name.  However, for tiebreaking to work correctly in all
   cases, the Authority Section must contain *all* the records and
   proposed rdata being probed for uniqueness.

The key word discussed in the previous comments is *uniqueness*.
I think uniqueness refers to the group of related records with the same name
being probed for,
not to the records answering that query.
So it's not about the rdata being unique but about the service name being
unique.
As they again give an example of a group made up of an SRV and a TXT record,
this tells me the TXT record should be in.

To explain this differently, let's use an example:
We host a DNS-SD service, being http.
Our hostname is device1.
So the service name is: "device1._http._tcp.local"
We want that service name to be unique on the network.
So we probe that service name.
It is asked to use query type ANY, so we only have one query.
I now see two scenarios, depending on the TXT record being in the authority
section or not.

1. The TXT record is not in. This means we do not think the TXT record is a
valid answer to the query we just sent out.
We also do not think that this TXT record answer needs to be unique.
This means that if a host on the network says it has a TXT record for the
service name you want to own, you would ignore it.
This does not sound correct.
2. The TXT record is in. This means we do want to be the only one having a TXT
record for this service name.
This makes sense to me.
By including the TXT record we are now certain that no other user on the
network will respond to a TXT record query with our name.
And that is the complete point of the probing step.

It was previously said that the TXT record shouldn't be unique.
But actually it needs to be unique.
The TXT record is directly related to the service name containing the device
hostname.
I do not see why we make a difference between the SRV and the TXT record.
They both need to be unique.
Their contents / rdata, however, are indeed a different matter.
The SRV record contents again refer to the hostname, which we want to be
unique.
The TXT record contents on their own do not need to be unique in the network.

It just contains some text information, if it contains anything at all.
So not the TXT content but the TXT record needs to be unique.

I think Apple is following their own standard here and this deviates from the
standard defined by RFC6762.
The legacy implementation isn't always correct, and we shouldn't copy its
mistakes.

I think the question boils down to: does lwIP follow the RFC or does it follow
Apple.
Since we refer to the RFC a lot and we call it mDNS and not Apple Bonjour, I
suggest we follow the RFC and revert this commit.

Kind regards,
Jasper

Follow-up Comment #5, bug #58142 (project lwip):

Hi Jasper,

I read all your arguments and they are all valid.

> I think Apple is following their own standard here and this deviates from
> the standard defined by RFC6762.
> The legacy implementation isn't always correct, and we shouldn't copy its
> mistakes.
> I think the question boils down to: does lwIP follow the RFC or does it
> follow Apple.
> Since we refer to the RFC a lot and we call it mDNS and not Apple Bonjour, I
> suggest we follow the RFC and revert this commit.

Now, I agree with you that the RFC should be followed and we should not copy
their mistake.

But not doing it the way Apple's "Bonjour Conformance Test" expects will
prevent all devices using lwIP from passing the tests.

It may be useful to contact Apple directly through their Bonjour-dev mailing
list (link in Bonjour page here:
https://developer.apple.com/softwarelicensing/bonjour/) so
they can either fix the BCT code or the RFC.

Best regards,
David


Follow-up Comment #6, bug #58142 (project lwip):

Hi David,

BCT is indeed an important reference test.
I find it so strange that there is almost nothing to find on this subject
online.

The Bonjour-dev mailing list is closed.
I added a question on the developer forum.
https://developer.apple.com/forums/thread/655832

I'm not sure what the response will be...
In the meantime, I have reverted the commit for my project.

Kind regards,
Jasper
