DNS problem (protection.outlook.com)

DNS problem (protection.outlook.com)

mrobti
For the last few days I've been seeing a large number of failures in the log
for domains using protection.outlook.com:

to=<[hidden email]>, relay=none, delay=13190, delays=13187/0.08/2.2/0,
dsn=4.4.3, status=deferred (Host or domain name not found. Name service
error for name=example-com.mail.protection.outlook.com type=AAAA: Host
not found, try again)

These domains do have A records, but some of them take anywhere from
0.75 to 3 seconds to return a result from a DNS lookup (using dig).

When Postfix reports that it cannot find an AAAA record, can I assume that
every time it retries it also looks for the A record?

Is the problem a lookup timeout? Never seen this before the last few
days, so am inclined to think it's mostly their problem, or is there
something I could do?
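
As a rough aid to reading the log line above: Postfix documents the delays=a/b/c/d field as time before the queue manager, time in the queue manager, connection setup time (which includes DNS lookups), and message transmission time. The sketch below only splits the field into those four stages; the stage names and the sample address are made up for illustration, since the real address is hidden in the post.

```python
import re

# Illustrative labels for the four delays=a/b/c/d stages (names are ours).
STAGES = ("before_qmgr", "in_qmgr", "conn_setup", "transmission")


def parse_delays(log_line):
    """Return the four delays=a/b/c/d values from a Postfix delivery log line."""
    match = re.search(r"delays=([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+)", log_line)
    if match is None:
        raise ValueError("no delays= field found")
    return dict(zip(STAGES, (float(value) for value in match.groups())))


# Hypothetical recipient; the rest mirrors the log line quoted in the post.
line = ("to=<user@example.com>, relay=none, delay=13190, "
        "delays=13187/0.08/2.2/0, dsn=4.4.3, status=deferred")
print(parse_delays(line))
```

Read this way, only the 2.2 seconds of connection setup belongs to the latest attempt (DNS included); most of the 13190 seconds is time the message had already spent waiting before that attempt.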

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni

> On Dec 6, 2016, at 1:44 PM, MRob <[hidden email]> wrote:
>
> Last few days, I'm seeing large amount of failures in a log file for domains using protection.outlook.com:
>
> to=<[hidden email]>, relay=none, delay=13190, delays=13187/0.08/2.2/0, dsn=4.4.3, status=deferred (Host or domain name not found. Name service error for name=example-com.mail.protection.outlook.com type=AAAA: Host not found, try again)

Some of the MX hosts behind *.mail.protection.outlook.com have AAAA
records, and there is not generally a problem resolving these:

nist-gov.mail.protection.outlook.com. 10 IN AAAA 2a01:111:f400:7c0c::11
nist-gov.mail.protection.outlook.com. 10 IN AAAA 2a01:111:f400:7c09::11
nist-gov.mail.protection.outlook.com. 10 IN AAAA 2a01:111:f400:7c10::10

The nameservers for this zone don't support EDNS0, and there've been
recent reports of some resolvers not handling FORMERR correctly
(alpha versions of PowerDNS recursor on Debian/Ubuntu systems IIRC).

Make sure your resolver handles EDNS0 rejection correctly.

--
        Viktor.


Re: DNS problem (protection.outlook.com)

Wietse Venema
In reply to this post by mrobti
MRob:
> Last few days, I'm seeing large amount of failures in a log file for
> domains using protection.outlook.com:
>
> to=<[hidden email]>, relay=none, delay=13190, delays=13187/0.08/2.2/0,
> dsn=4.4.3, status=deferred (Host or domain name not found. Name service
> error for name=example-com.mail.protection.outlook.com type=AAAA: Host
> not found, try again)

Do you need IPv6 support? If not, disable it and avoid useless lookups.

> These domains do have A records, but some of them can take anywhere from
> .75 of a second to 3 seconds to return a result from DNS lookup (using
> dig).
>
> When postfix reports it cannot find AAAA record, can I assume every time
> it retries it also looks for the A record?

If you enable both IPv4 and IPv6, then Postfix must look for both
A and AAAA records. There is no IP protocol field in MX records.
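
To make the point concrete, here is a toy mapping from the inet_protocols setting (the Postfix parameter that enables or disables IPv4/IPv6) to the address record types that have to be looked up for each MX host. The function name is invented and this is not how Postfix is implemented internally; it only illustrates why disabling IPv6 removes the AAAA lookups.

```python
def address_query_types(inet_protocols):
    """Map an inet_protocols value to the address record types to query."""
    enabled = {part.strip().lower() for part in inet_protocols.split(",")}
    types = []
    if enabled & {"all", "ipv4"}:
        types.append("A")      # IPv4 addresses
    if enabled & {"all", "ipv6"}:
        types.append("AAAA")   # IPv6 addresses
    return types


print(address_query_types("all"))   # both protocols enabled
print(address_query_types("ipv4"))  # IPv6 disabled
```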

The current Postfix default is to randomize equal-preference A and
AAAA lookups, so I am surprised that the last failure is always for
AAAA lookups.

> Is the problem a lookup timeout? Never seen this before the last few
> days, so am inclined to think it's mostly their problem, or is there
> something I could do?

This could be a messed-up DNS resolver anywhere in the path, including
a bad resolv.conf file under /var/spool/postfix/etc, or some
'security' filter that breaks connectivity to some DNS server.
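
One way to check the resolv.conf possibility is to compare the nameserver lines of the system file with the copy under /var/spool/postfix/etc. The helper names below are invented for this sketch, and it only compares nameserver entries, ignoring search/options lines.

```python
def nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Comment lines start with '#' or ';', so their first field never matches.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def resolver_mismatch(system_text, chroot_text):
    """True when the chroot copy lists different nameservers than the system file."""
    return nameservers(system_text) != nameservers(chroot_text)
```

On a real host the two arguments would be the contents of /etc/resolv.conf and /var/spool/postfix/etc/resolv.conf, read with open(); a mismatch would explain dig and a chrooted Postfix process seeing different results.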

For me, A and AAAA lookups of example-com.mail.protection.outlook.com
are instantaneous (reply: NXDOMAIN).

        Wietse

Re: DNS problem (protection.outlook.com)

mrobti
Viktor, Wietse,

On 2016-12-06 11:16, [hidden email] wrote:

> MRob:
>> Last few days, I'm seeing large amount of failures in a log file for
>> domains using protection.outlook.com:
>>
>> to=<[hidden email]>, relay=none, delay=13190,
>> delays=13187/0.08/2.2/0,
>> dsn=4.4.3, status=deferred (Host or domain name not found. Name
>> service
>> error for name=example-com.mail.protection.outlook.com type=AAAA: Host
>> not found, try again)
>
> Do you need IPv6 support? If not, disable it and avoid useless lookups.

No, it was only enabled because that is Postfix default. I have disabled
this to reduce contributing factors. Unfortunately, the issue persists.

>> These domains do have A records, but some of them can take anywhere
>> from
>> .75 of a second to 3 seconds to return a result from DNS lookup (using
>> dig).
>>
>> When postfix reports it cannot find AAAA record, can I assume every
>> time
>> it retries it also looks for the A record?
>
> If you enable both IPv4 and IPv6, then Postfix must look for both
> A and AAAA records. There is no IP protocol field in MX records.
>
> The current Postfix default is to randomize equal-preference A and
> AAAA lookups, so I am surprised that the last failure is always for
> AAAA lookups.

This is strange, then, because until I disabled IPv6, the logs for these
problem domains only showed errors looking up AAAA. Only after disabling
IPv6 do I now see this:

status=deferred (Host or domain name not found. Name service error for
name=example-com.mail.protection.outlook.com type=A: Host not found, try
again)

>> Is the problem a lookup timeout? Never seen this before the last few
>> days, so am inclined to think it's mostly their problem, or is there
>> something I could do?
>
> This could be a messed-up DNS resolver anywhere in the path, including
> a bad resolv.conf file under /var/spool/postfix/etc, or some
> 'security' filter that breaks connectivity to some DNS server.

Viktor suggested in a mail prior to yours (Viktor, please correct me if
I misunderstand) that it could have been due to Microsoft providing ipv6
responses for some domains, but some of those responses being EDNS0,
which our local resolver may not know how to handle. This seemed
plausible, but now that I took out ipv6 and the error continues with A,
I am less certain.

Having removed ipv6 from the question, I get the error I quoted above
even for domains that do resolve using "dig" from the CLI of the same
host. Why would there be that kind of discrepancy?

> For me, A and AAAA lookups of example-com.mail.protection.outlook.com
> are instantaneous (reply: NXDOMAIN).

Of course I changed the real domain to protect the innocent
(example-com). Is it appropriate to give a real, live example? It may or
may not help, because A is resolving fine with dig, but postfix is
having trouble itself.

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni
On Tue, Dec 06, 2016 at 04:02:21PM -0800, MRob wrote:

> > This could be a messed-up DNS resolver anywhere in the path, including
> > a bad resolv.conf file under /var/spool/postfix/etc, or some
> > 'security' filter that breaks connectivity to some DNS server.
>
> Viktor suggested in a mail prior to yours (Viktor, please correct me if I
> misunderstand) that it could have been due to Microsoft providing ipv6
> responses for some domains, but some of those responses being EDNS0, which
> our local resolver may not know how to handle.

Quite the opposite: your resolver probably uses EDNS0, but Microsoft's
rather skimpy implementation of DNS in the load-balancers for that
zone doesn't grok EDNS0 (among other problems).  So your resolver needs
to be able to back off to non-EDNS0 on FORMERR, but may not, depending
on the software you're running.
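
The back-off described above can be sketched as follows. This is a simplified illustration, not the code of any real resolver: the function names are invented, the transport is passed in as a `send` callable so the logic can be exercised without a network, and only the FORMERR case is handled.

```python
import struct

FORMERR = 1   # RCODE 1, "format error" (RFC 1035)
OPT_TYPE = 41 # EDNS0 OPT pseudo-record type (RFC 6891)


def build_query(name, qtype, use_edns0, query_id=0x1234):
    """Build a minimal DNS query, optionally with an EDNS0 OPT record."""
    arcount = 1 if use_edns0 else 0
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS = IN
    if not use_edns0:
        return header + question
    # OPT RR: root name, TYPE=41, CLASS=UDP payload size, TTL=0, RDLEN=0.
    opt = b"\x00" + struct.pack("!HHIH", OPT_TYPE, 4096, 0, 0)
    return header + question + opt


def rcode(response):
    """Extract the 4-bit RCODE from the flags word of a DNS message."""
    flags = struct.unpack("!H", response[2:4])[0]
    return flags & 0x000F


def query_with_fallback(send, name, qtype):
    """Try EDNS0 first; if the server answers FORMERR, retry as plain DNS."""
    response = send(build_query(name, qtype, use_edns0=True))
    if rcode(response) == FORMERR:
        response = send(build_query(name, qtype, use_edns0=False))
    return response
```

A resolver that skips the retry step would treat the FORMERR as a failed lookup for every query type, which is consistent with the same error showing up for both A and AAAA.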

> This seemed plausible, but
> now that I took out ipv6 and the error continues with A, I am less certain.

The query type is irrelevant.

--
        Viktor.

Re: DNS problem (protection.outlook.com)

Wietse Venema
In reply to this post by mrobti
MRob:
> Having removed ipv6 from the question, I get the error I quoted above
> even for domains that do resolve using "dig" from the CLI of the same
> host. Why would there be that kind of discrepancy?

Not at all, just some intermediate resolver that messes up as I
suggested in my first reply. Perhaps powerdns resolvers that assume
everyone supports EDNS0, and that break all kinds of lookups.

        Wietse

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni
On Tue, Dec 06, 2016 at 07:20:41PM -0500, Wietse Venema wrote:

> > Having removed ipv6 from the question, I get the error I quoted above
> > even for domains that do resolve using "dig" from the CLI of the same
> > host. Why would there be that kind of discrepancy?
>
> Not at all, just some intermediate resolver that messes up as I
> suggested in my first reply. Perhaps powerdns resolvers that assume
> everyone supports EDNS0, and that break all kinds of lookups.

To be fair to the good folks at PowerDNS, the software in question
was an alpha version, that Ubuntu should probably not have shipped
in a prod release.  I don't know of any similar issues in actual
releases of PowerDNS.

--
        Viktor.

Re: DNS problem (protection.outlook.com)

mrobti
On 2016-12-06 16:23, Viktor Dukhovni wrote:

> On Tue, Dec 06, 2016 at 07:20:41PM -0500, Wietse Venema wrote:
>
>> > Having removed ipv6 from the question, I get the error I quoted above
>> > even for domains that do resolve using "dig" from the CLI of the same
>> > host. Why would there be that kind of discrepancy?
>>
>> Not at all, just some intermediate resolver that messes up as I
>> suggested in my first reply. Perhaps powerdns resolvers that assume
>> everyone supports EDNS0, and that break all kinds of lookups.
>
> To be fair to the good folks at PowerDNS, the software in question
> was an alpha version, that Ubuntu should probably not have shipped
> in a prod release.  I don't know of any similar issues in actual
> releases of PowerDNS.

Thank you both for the help. Looks like the only recourse is to stop
using PDNS or install it from source until Ubuntu can provide a
non-alpha product. I'm shocked that they have done such a thing. I
wonder if a post to their mailing list would get the attention of the
right person.

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni
On Tue, Dec 06, 2016 at 04:56:58PM -0800, MRob wrote:

> > To be fair to the good folks at PowerDNS, the software in question
> > was an alpha version, that Ubuntu should probably not have shipped
> > in a prod release.  I don't know of any similar issues in actual
> > releases of PowerDNS.
>
> Thank you both for the help. Looks like the only recourse is to stop using
> PDNS or install it from source until Ubuntu can provide a non-alpha product.
> I'm shocked that they have done such a thing. I wonder if a post to their
> mailing list would get the attention of the right person.

I take it then that you too are using a PDNS resolver on (a suitably
recent) Ubuntu? In that case the problem is rather expected, and
matches exactly the same issue reported here a week or two back.

Search the list archives.  You may find that the original poster
of that older message has already opened an Ubuntu ticket  for this
issue.

In the meantime, unbound works pretty well, and of course you can
install a more stable PDNS from the upstream source.

I expect we'll now be seeing repeated reports of this particular
issue from time to time.  I might not always be inspired to come
forward with the standard answer, so if anyone else wants to help
out the next user with this problem, go for it.

--
        Viktor.

Re: DNS problem (protection.outlook.com)

mrobti
On 2016-12-06 17:14, Viktor Dukhovni wrote:

> On Tue, Dec 06, 2016 at 04:56:58PM -0800, MRob wrote:
>
>> > To be fair to the good folks at PowerDNS, the software in question
>> > was an alpha version, that Ubuntu should probably not have shipped
>> > in a prod release.  I don't know of any similar issues in actual
>> > releases of PowerDNS.
>>
>> Thank you both for the help. Looks like the only recourse is to stop
>> using
>> PDNS or install it from source until Ubuntu can provide a non-alpha
>> product.
>> I'm shocked that they have done such a thing. I wonder if a post to
>> their
>> mailing list would get the attention of the right person.
>
> I take it then that you too are using a PDNS resolver on (a suitably
> recent) Ubuntu? In that case the problem is rather expected, and
> matches exactly the same issue reported here a week or two back.
>
> Search the list archives.  You may find that the original poster
> of that older message has already opened an Ubuntu ticket  for this
> issue.
>
> In the meantime, unbound works pretty well, and of course you can
> install a more stable PDNS from the upstream source.
>
> I expect we'll now be seeing repeated reports of this particular
> issue from time to time.  I might not always be inspired to come
> forward with the standard answer, so if anyone else wants to help
> out the next user with this problem, go for it.

You're correct. I had searched prior to posting, but maybe the last
thread is too recent to have made it into the right coffers. For the
record, here is the official bug (to which I am adding my voice) and the
original thread from this list:

https://bugs.launchpad.net/ubuntu/+source/pdns-recursor/+bug/1646538
http://postfix.1071664.n5.nabble.com/EDNS-DANE-trouble-with-Microsoft-mail-protection-outlook-com-td87331.html#a87353

I know it might irk you, but it does appear that changing this setting
will fix the problem for now. Hope Ubuntu can get this fixed quickly.

smtp_tls_dane_insecure_mx_policy=may

Re: DNS problem (protection.outlook.com)

Scott Kitterman-4
In reply to this post by mrobti
On Tuesday, December 06, 2016 04:56:58 PM MRob wrote:

> On 2016-12-06 16:23, Viktor Dukhovni wrote:
> > On Tue, Dec 06, 2016 at 07:20:41PM -0500, Wietse Venema wrote:
> >> > Having removed ipv6 from the question, I get the error I quoted above
> >> > even for domains that do resolve using "dig" from the CLI of the same
> >> > host. Why would there be that kind of discrepancy?
> >>
> >> Not at all, just some intermediate resolver that messes up as I
> >> suggested in my first reply. Perhaps powerdns resolvers that assume
> >> everyone supports EDNS0, and that break all kinds of lookups.
> >
> > To be fair to the good folks at PowerDNS, the software in question
> > was an alpha version, that Ubuntu should probably not have shipped
> > in a prod release.  I don't know of any similar issues in actual
> > releases of PowerDNS.
>
> Thank you both for the help. Looks like the only recourse is to stop
> using PDNS or install it from source until Ubuntu can provide a
> non-alpha product. I'm shocked that they have done such a thing. I
> wonder if a post to their mailing list would get the attention of the
> right person.

There is almost certainly not a right person.  PDNS is in the Universe section
of their archive that's community maintained.  Unfortunately, the
non-Canonical development community in Ubuntu has almost vanished (I'm a former
member of it myself).

Scott K

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni
On Tue, Dec 06, 2016 at 08:47:27PM -0500, Scott Kitterman wrote:

> > I'm shocked that they have done such a thing. I
> > wonder if a post to their mailing list would get the attention of the
> > right person.
>
> There is almost certainly not a right person.  PDNS is in the Universe section
> of their archive that's community maintained.  Unfortunately, the
> non-Canonical development community in Ubuntu has almost vanished (I'm a former
> member of it myself).

I wonder why it is that multiple users are brave enough to go with
a resolver that is not a mainstream part of the O/S?  Is there some
HOWTO somewhere that's encouraging them to stray away from more
mainstream options?

Instead of partly disabling DANE support, it seems to make more
sense to switch to unbound or BIND.

--
        Viktor.

Re: DNS problem (protection.outlook.com)

Scott Kitterman-4
On Wednesday, December 07, 2016 02:08:24 AM Viktor Dukhovni wrote:

> On Tue, Dec 06, 2016 at 08:47:27PM -0500, Scott Kitterman wrote:
> > > I'm shocked that they have done such a thing. I
> > > wonder if a post to their mailing list would get the attention of the
> > > right person.
> >
> > There is almost certainly not a right person.  PDNS is in the Universe
> > section of their archive that's community maintained.  Unfortunately, the
> > non-Canonical development community in Ubuntu has almost vanished (I'm a
> > former member of it myself).
>
> I wonder why it is that multiple users are brave enough to go with
> a resolver that is not a mainstream part of the O/S?  Is there some
> HOWTO somewhere that's encouraging them to stray away from more
> mainstream options?
>
> Instead of partly disabling DANE support, it seems to make more
> sense to switch to unbound or BIND.

I agree.  I think most users don't understand the distinction between the
parts of the archive.

Scott K

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni

> On Dec 7, 2016, at 12:34 AM, Scott Kitterman <[hidden email]> wrote:
>
>> Instead of partly disabling DANE support, it seems to make more
>> sense to switch to unbound or BIND.
>
> I agree.  I think most users don't understand the distinction between the
> parts of the archive.

Let's hope the word gets out somehow.  In the meantime, are you still
involved in packaging Postfix for Debian?  I am still hoping to some
day see a Postfix package with a correct "postfix-files" file (that
does not list files the package does not include) and does not leave
out the makedefs.out file that shows the build settings...

With a broken "postfix-files" none of the multi-instance features work.
It should be possible to create secondary instances on Debian via
"postmulti -e create -I postfix-mumble" and "postfix check" should
not fail for lack of files to match the content of "postfix-files".

I was hoping this would be fixed with the 3.1 packages in stretch,
but no luck so far.

--
        Viktor.


Re: DNS problem (protection.outlook.com)

Scott Kitterman-4
On Wednesday, December 07, 2016 12:50:05 AM Viktor Dukhovni wrote:
> > On Dec 7, 2016, at 12:34 AM, Scott Kitterman <[hidden email]> wrote:

> >> Instead of partly disabling DANE support, it seems to make more
> >> sense to switch to unbound or BIND.
> >
> > I agree.  I think most users don't understand the distinction between the
> > parts of the archive.
>
> Let's hope the word gets out somehow.  In the meantime, are you still
> involved in packaging Postfix for Debian?  I am still hoping to some
> day see a Postfix package with a correct "postfix-files" file (that
> does not list files the package does not include) and does not leave
> out the makedefs.out file that shows the build settings...
>
> With a broken "postfix-files" none of the multi-instance features work.
> It should be possible to create secondary instances on Debian via
> "postmulti -e create -I postfix-mumble" and "postfix check" should
> not fail for lack of files to match the content of "postfix-files".
>
> I was hoping this would be fixed with the 3.1 packages in stretch,
> but no luck so far.

As I recall from our previous discussions on the topic (and what I read in the
documentation), since we split the various dynamic map types into their own
binary packages, we need to make sure that the basic postfix package doesn't
reference those files and that the binary includes a postfix-files snippet to
drop in a postfix-files.d directory.  Is that right?

As an example, for mysql (the html docs are in a doc package):

$shlib_directory/${LIB_PREFIX}mysql${LIB_SUFFIX}:f:root:-:755
$manpage_directory/man5/mysql_table.5:f:root:-:644

is what goes in /etc/postfix/postfix-files.d/mysql (for lack of a better name)

Is there an upstream method to split those lines out into separate files based
on the build type?  I can come up with something Debian specific to do it, but
if it's already covered, I don't want to re-invent the wheel.

Scott K

Re: DNS problem (protection.outlook.com)

Viktor Dukhovni

> On Dec 11, 2016, at 11:44 AM, Scott Kitterman <[hidden email]> wrote:
>
> As I recall from our previous discussions on the topic (and what I read in the
> documentation), since we split the various dynamic map types into their own
> binary packages, we need to make sure that the basic postfix package doesn't
> reference those files and that the binary includes a postfix-files snippet to
> drop in a postfix-files.d directory.  Is that right?

Not just "those" files, but rather *any* files not included in
the base package.  And the filenames included need to be correct.
IIRC the Debian manpages are compressed, but the filenames in
the postfix-files file have no ".gz" suffix.  There are likely
some additional anomalies.

> As an example, for mysql (the html docs are in a doc package):
>
> $shlib_directory/${LIB_PREFIX}mysql${LIB_SUFFIX}:f:root:-:755
> $manpage_directory/man5/mysql_table.5:f:root:-:644
>
> is what goes in /etc/postfix/postfix-files.d/mysql (for lack of a better name)

Sure provided both are installed by the same package, or both
packages merge the relevant entry into a common file.  And of
course, deal with ".gz" extensions as needed.

> Is there an upstream method to split those lines out into separate files based
> on the build type?  I can come up with something Debian specific to do it, but
> if it's already covered, I don't want to re-invent the wheel.

Nothing built-in, since packaging details are out of scope.

The idea would be to *intersect* the content of each package with
matching names from the upstream postfix-files file, and create the
appropriate per-package files for postfix-files.d/.
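
A minimal sketch of that intersection, under stated assumptions: the function name is invented, the variable values (directories, LIB_PREFIX, LIB_SUFFIX) are placeholders that a packager would take from the actual build, and entries are assumed to have the form path:type:owner:group:mode as in the mysql example. It also covers the compressed-manpage case by emitting the entry with a ".gz" suffix when only the compressed file is shipped.

```python
from string import Template


def per_package_entries(postfix_files_lines, package_paths, variables):
    """Return postfix-files entries whose expanded path is shipped by the package."""
    shipped = set(package_paths)
    selected = []
    for line in postfix_files_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, rest = line.split(":", 1)
        # Expand $shlib_directory, ${LIB_PREFIX}, etc. with the build's values.
        path = Template(name).safe_substitute(variables)
        if path in shipped:
            selected.append(line)
        elif path + ".gz" in shipped:
            # Package ships the file compressed: record the name it really has.
            selected.append(name + ".gz:" + rest)
    return selected


# Assumed example values, not taken from any real Debian package.
upstream = [
    "$shlib_directory/${LIB_PREFIX}mysql${LIB_SUFFIX}:f:root:-:755",
    "$manpage_directory/man5/mysql_table.5:f:root:-:644",
    "$manpage_directory/man5/pgsql_table.5:f:root:-:644",
]
variables = {
    "shlib_directory": "/usr/lib/postfix",
    "manpage_directory": "/usr/share/man",
    "LIB_PREFIX": "postfix-",
    "LIB_SUFFIX": ".so",
}
mysql_package = [
    "/usr/lib/postfix/postfix-mysql.so",
    "/usr/share/man/man5/mysql_table.5.gz",
]
for entry in per_package_entries(upstream, mysql_package, variables):
    print(entry)
```

The output lines are what would go into the package's own file under postfix-files.d/; entries for files the package does not ship (the pgsql manpage here) are left out.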

--
        Viktor.


Re: DNS problem (protection.outlook.com)

Scott Kitterman-4


On December 11, 2016 1:24:22 PM EST, Viktor Dukhovni <[hidden email]> wrote:

>
>> On Dec 11, 2016, at 11:44 AM, Scott Kitterman <[hidden email]> wrote:
>>
>> As I recall from our previous discussions on the topic (and what I read in
>> the documentation), since we split the various dynamic map types into their
>> own binary packages, we need to make sure that the basic postfix package
>> doesn't reference those files and that the binary includes a postfix-files
>> snippet to drop in a postfix-files.d directory.  Is that right?
>
>Not just "those" files, but rather *any* files not included in
>the base package.  And the filenames included need to be correct.
>IIRC the Debian manpages are compressed, but the filenames in
>the postfix-files file have no ".gz" suffix.  There are likely
>some additional anomalies.
>
>> As an example, for mysql (the html docs are in a doc package):
>>
>> $shlib_directory/${LIB_PREFIX}mysql${LIB_SUFFIX}:f:root:-:755
>> $manpage_directory/man5/mysql_table.5:f:root:-:644
>>
>> is what goes in /etc/postfix/postfix-files.d/mysql (for lack of a better
>> name)
>
>Sure provided both are installed by the same package, or both
>packages merge the relevant entry into a common file.  And of
>course, deal with ".gz" extensions as needed.
>
>> Is there an upstream method to split those lines out into separate files
>> based on the build type?  I can come up with something Debian specific to do
>> it, but if it's already covered, I don't want to re-invent the wheel.
>
>Nothing built-in, since packaging details are out of scope.
>
>The idea would be to *intersect* the content of each package with
>matching names from the upstream postfix-files file, and create the
>appropriate per-package files for postfix-files.d/.

Thanks.  That makes it clear.  I probably would have missed the compressed man page detail.

Scott K


Re: DNS problem (protection.outlook.com)

Viktor Dukhovni

> On Dec 11, 2016, at 3:02 PM, Scott Kitterman <[hidden email]> wrote:
>
> Thanks.  That makes it clear.  I probably would have missed the compressed man page detail.

You'll know you're done when (as root):

        1.  "postfix check" logs no errors.
            "postfix set-permissions" logs no errors.
        2.  "postmulti -I postfix-test -e create" logs no errors
        3.  You can add/remove dictionary modules without breaking the
            above.
        4.  Optional dictionary modules are seen by all Postfix instances.
        5.  "postfix start/stop" work across multiple instances.
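
The first two items of the checklist above lend themselves to a small script. The sketch below is only an assumption about how one might automate them: it treats a non-zero exit status or any output on stderr as "logged an error", which may miss problems that only reach syslog, and the function name is invented. Note that these commands modify the system (set-permissions changes file modes, postmulti creates an instance), so this belongs on a test machine.

```python
import subprocess

# Commands taken from items 1 and 2 of the checklist; run as root.
CHECKS = [
    ["postfix", "check"],
    ["postfix", "set-permissions"],
    ["postmulti", "-I", "postfix-test", "-e", "create"],
]


def run_checks(run=subprocess.run):
    """Run each check and return the commands that appear to have failed."""
    failures = []
    for argv in CHECKS:
        result = run(argv, capture_output=True, text=True)
        # Assumption: failure shows up as a non-zero exit or stderr output.
        if result.returncode != 0 or result.stderr.strip():
            failures.append(" ".join(argv))
    return failures
```

Calling run_checks() with no arguments executes the real commands; an empty list back means none of them reported a problem by the criteria above. Items 3 to 5 still need to be verified by hand.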

--
        Viktor.


Postfix packages for Debian Stretch (was: Re: DNS problem (protection.outlook.com))

Scott Kitterman-4
On Sunday, December 11, 2016 03:45:26 PM Viktor Dukhovni wrote:

> > On Dec 11, 2016, at 3:02 PM, Scott Kitterman <[hidden email]> wrote:
> >
> > Thanks.  That makes it clear.  I probably would have missed the compressed
> > man page detail.
>
> You'll know you're done when (as root):
>
> 1.  "postfix check" logs no errors.
>     "postfix set-permissions" logs no errors.
> 2.  "postmulti -I postfix-test -e create" logs no errors
> 3.  You can add/remove dictionary modules without breaking the
>     above.
> 4.  Optional dictionary modules are seen by all Postfix instances.
> 5.  "postfix start/stop" work across multiple instances.

I believe we've got this done with the version we uploaded to Debian Unstable
today (3.1.4-2).  Barring significant issues, it will be in Debian Stretch in
10 days.

I didn't actually test item 4, but have no reason to believe it would be a
problem.

Since this is distro specific, if Debian users encounter problems, please file
bugs in the Debian Bug Tracking System (bugs.debian.org) instead of having a
long discussion about it here.  Of particular interest is testing from users
of the various dictionary modules (since neither of the Debian Postfix
maintainers use them).

Thanks,

Scott K