how to debug TLS certificate verification error?

postfix-6
I am unable to receive mail from my Comcast friends at my Postfix server
(postfix-3.2.0-2.6.1 on openSUSE 42.3 with openssl-1.0.2j). As far as I
know only Comcast has a problem sending me mail. I have tried asking
Comcast for help, but they are useless. I am hoping someone on this list
can suggest debugging advice to figure out what the problem might be.

Comcast claims a TLS certificate verify failure. I have checked the TLS
connection process with

openssl s_client -connect maple.killian.com:25 -starttls smtp

and it looks good. I also checked with https://www.checktls.com and got
all 100%. The certificate being used was issued by the EFF's certbot /
Let's Encrypt project and passed to postfix with smtpd_tls_key_file and
smtpd_tls_cert_file.

Here is the Comcast bounce message my friend received (some deletions
for privacy):

From: [hidden email] [mailto:[hidden email]]
Sent: Sunday, February 09, 2020 10:59 PM
To: [snip]
Subject: Temporary Failure

     This is an automatically generated Delivery Status Notification.     

Delivery to the following recipients was aborted after 6.5 hour(s):

 * [snip]

Reason: Temporary Failure

Reporting-MTA: dns; resqmta-ch2-07v.sys.comcast.net [69.252.207.39]
Received-From-MTA: dns; resomta-ch2-16v.sys.comcast.net [69.252.207.112]
Arrival-Date: Sun, 09 Feb 2020 21:32:03 +0000


Final-recipient: rfc822; [snip]
Diagnostic-Code: smtp; TLS negotiation: certificate verify failed
Last-attempt-Date: Mon, 10 Feb 2020 03:59:23 +0000

Here is some of the server log from a different connection attempt
(after I set debug_peer_list = 69.252.207.0/24 and debug_peer_level = 2)
(deletions of smtpd_client_event_limit_exceptions lines for privacy):

Feb 14 08:53:16 maple kernel: FW-ACC-TCP IN=eth0 OUT=
MAC=00:30:48:62:9c:18:7c:1c:f1:8e:5a:42:08:00 SRC=69.252.207.44
DST=199.165.155.8 LEN=60 TOS=0x00 PREC=0x00 TTL=52 ID=45494 DF PROTO=TCP
SPT=41255 DPT=25 WINDOW=14600 RES=0x00 SYN URGP=0 OPT
(020405B40101080A16FC2A350000000001030303)
Feb 14 08:53:17 maple postfix/smtpd[14512]: connect from
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]
Feb 14 08:53:17 maple postfix/smtpd[14512]: smtp_stream_setup:
maxtime=300 enable_deadline=0
[snipped]
Feb 14 08:53:17 maple postfix/smtpd[14512]: match_list_match:
resqmta-ch2-12v.sys.comcast.net: no match
Feb 14 08:53:17 maple postfix/smtpd[14512]: match_list_match:
69.252.207.44: no match
Feb 14 08:53:17 maple postfix/smtpd[14512]: auto_clnt_open: connected to
private/anvil
Feb 14 08:53:17 maple postfix/smtpd[14512]: send attr request = connect
Feb 14 08:53:17 maple postfix/smtpd[14512]: send attr ident =
smtp:69.252.207.44
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/anvil: wanted
attribute: status
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: status
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute value: 0
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/anvil: wanted
attribute: count
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: count
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute value: 1
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/anvil: wanted
attribute: rate
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: rate
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute value: 1
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/anvil: wanted
attribute: (list terminator)
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: (end)
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 220 maple.killian.com
ESMTP By proceeding, you agree to the terms and conditions in
http://www.killian.com/spam.html.  If you do not agree, quit
immediately.  In particular, DO NOT send unsolicited commercial email
(i.e. spam) to this site.  We reserve the right to charge US$5000 per
violation.
Feb 14 08:53:17 maple postfix/smtpd[14512]: watchdog_pat: 0x555db41d0c10
Feb 14 08:53:17 maple postfix/smtpd[14512]: <
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: EHLO
resqmta-ch2-12v.sys.comcast.net
Feb 14 08:53:17 maple postfix/smtpd[14512]: match_list_match:
resqmta-ch2-12v.sys.comcast.net: no match
Feb 14 08:53:17 maple postfix/smtpd[14512]: match_list_match:
69.252.207.44: no match
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-maple.killian.com
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-PIPELINING
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-SIZE 80000000
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-ETRN
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-STARTTLS
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-ENHANCEDSTATUSCODES
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250-8BITMIME
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 250 DSN
Feb 14 08:53:17 maple postfix/smtpd[14512]: watchdog_pat: 0x555db41d0c10
Feb 14 08:53:17 maple postfix/smtpd[14512]: <
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: STARTTLS
Feb 14 08:53:17 maple postfix/smtpd[14512]: >
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 220 2.0.0 Ready to start TLS
Feb 14 08:53:17 maple postfix/smtpd[14512]: send attr request = seed
Feb 14 08:53:17 maple postfix/smtpd[14512]: send attr size = 32
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/tlsmgr: wanted
attribute: status
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: status
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute value: 0
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/tlsmgr: wanted
attribute: seed
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: seed
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute value:
UyM2p2Rixq0C0knqtSxx8pfYa5Vm5ijixD9+YOoXGJM=
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/tlsmgr: wanted
attribute: (list terminator)
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: (end)
Feb 14 08:53:17 maple postfix/smtpd[14512]: SSL_accept error from
resqmta-ch2-12v.sys.comcast.net[69.252.207.44]: 0
Feb 14 08:53:17 maple postfix/smtpd[14512]: warning: TLS library
problem: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad
certificate:s3_pkt.c:1487:SSL alert number 42:
[snip]
Feb 14 08:53:17 maple postfix/smtpd[14512]: match_list_match:
resqmta-ch2-12v.sys.comcast.net: no match
Feb 14 08:53:17 maple postfix/smtpd[14512]: match_list_match:
69.252.207.44: no match
Feb 14 08:53:17 maple postfix/smtpd[14512]: send attr request = disconnect
Feb 14 08:53:17 maple postfix/smtpd[14512]: send attr ident =
smtp:69.252.207.44
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/anvil: wanted
attribute: status
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: status
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute value: 0
Feb 14 08:53:17 maple postfix/smtpd[14512]: private/anvil: wanted
attribute: (list terminator)
Feb 14 08:53:17 maple postfix/smtpd[14512]: input attribute name: (end)
Feb 14 08:53:17 maple postfix/smtpd[14512]: lost connection after
STARTTLS from resqmta-ch2-12v.sys.comcast.net[69.252.207.44]
Feb 14 08:53:17 maple postfix/smtpd[14512]: disconnect from
resqmta-ch2-12v.sys.comcast.net[69.252.207.44] ehlo=1 starttls=0/1
commands=1/2



Re: how to debug TLS certificate verification error?

Viktor Dukhovni
On Sun, Feb 16, 2020 at 10:26:45AM -0800, Earl Killian wrote:

> I am unable to receive mail from my Comcast friends at my Postfix server
> (postfix-3.2.0-2.6.1 on openSUSE 42.3 with openssl-1.0.2j). As far as I
> know only Comcast has a problem sending me mail. I have tried asking
> Comcast for help, but they are useless. I am hoping someone on this list
> can suggest debugging advice to figure out what the problem might be.

As luck would have it, you've come to the right place.  Your domain is
DNSSEC-signed, and your MX host has DANE TLSA records:

    $ hsdig -t a maple.killian.com
    maple.killian.com. IN A 199.165.155.8 ; NoError AD=1

    $ hsdig -t tlsa _25._tcp.maple.killian.com
    _25._tcp.maple.killian.com. IN TLSA 3 0 1 4ca6fa3e1b53c809442cf7db22227e3f4a6bf51074305dbcf0a4593c30d1b723 ; NoError AD=1
    _25._tcp.maple.killian.com. IN TLSA 3 0 1 7a668f4b7f418a618a9e1043b644c282d55e5ead0ff20acaa4db5357a9764a2f ; NoError AD=1
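
For readers unfamiliar with the notation: the three numbers in each TLSA record are the certificate usage, selector and matching type defined in RFC 6698, followed by the hex digest. A minimal Python sketch that decodes them; the names `parse_tlsa` and `describe` are illustrative and not part of any tool used in this thread.

```python
from typing import NamedTuple

# Field meanings per RFC 6698 (acronyms from RFC 7218).
USAGES = {0: "PKIX-TA", 1: "PKIX-EE", 2: "DANE-TA", 3: "DANE-EE"}
SELECTORS = {0: "full certificate", 1: "public key (SPKI)"}
MATCHING = {0: "Full (no hash)", 1: "SHA2-256", 2: "SHA2-512"}


class Tlsa(NamedTuple):
    usage: int
    selector: int
    mtype: int
    data: str  # lower-case hex, whitespace removed


def parse_tlsa(rdata: str) -> Tlsa:
    """Parse TLSA rdata in presentation form, e.g. '3 0 1 4ca6...'."""
    usage, selector, mtype, *hex_parts = rdata.split()
    return Tlsa(int(usage), int(selector), int(mtype),
                "".join(hex_parts).lower())


def describe(rec: Tlsa) -> str:
    """Return a human-readable summary of the three numeric fields."""
    return f"{USAGES[rec.usage]}, {SELECTORS[rec.selector]}, {MATCHING[rec.mtype]}"
```

Applied to either record above, `describe` gives "DANE-EE, full certificate, SHA2-256": each record pins the SHA-256 digest of one complete end-entity certificate.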

> Comcast claims a TLS certificate verify failure. I have checked the TLS
> connection process with

Comcast (and not only they) support and enforce DANE.

> Diagnostic-Code: smtp; TLS negotiation: certificate verify failed

Which is expected, since your certificate chain DOES NOT match your
DANE TLSA record:

    ; 1. Get rid of all the "sha1" DS records, they're useless, the "sha2"
    ;    hashes are universally supported and quite sufficient.
    ; 2. You probably don't need DS RRs for four different KSKs, at most two
    ;    at a time is enough to support a reasonable key rollover process.
    ;
    killian.com. IN DS 1396 14 1 <...> ; AD=1 NoError
    killian.com. IN DS 1396 14 2 <...> ; AD=1 NoError
    killian.com. IN DS 10651 14 1 <...> ; AD=1 NoError
    killian.com. IN DS 10651 14 2 <...> ; AD=1 NoError
    killian.com. IN DS 29048 14 1 <...> ; AD=1 NoError
    killian.com. IN DS 29048 14 2 <...> ; AD=1 NoError
    killian.com. IN DS 33864 14 1 <...> ; AD=1 NoError
    killian.com. IN DS 33864 14 2 <...> ; AD=1 NoError

    ; And of course four KSKs is too many, at most two will do.
    ;
    killian.com. IN DNSKEY 257 3 14 <...> ; AD=1 NoError
    killian.com. IN DNSKEY 257 3 14 <...> ; AD=1 NoError
    killian.com. IN DNSKEY 257 3 14 <...> ; AD=1 NoError
    killian.com. IN DNSKEY 257 3 14 <...> ; AD=1 NoError

    ; And ditto for the ZSKs
    ;
    killian.com. IN DNSKEY 256 3 14 <...> ; AD=1 NoError
    killian.com. IN DNSKEY 256 3 14 <...> ; AD=1 NoError
    killian.com. IN DNSKEY 256 3 14 <...> ; AD=1 NoError
    killian.com. IN DNSKEY 256 3 14 <...> ; AD=1 NoError

    ; Your MX host promises IPv6:
    ;
    killian.com. IN MX 10 maple.killian.com. ; AD=1 NoError
    maple.killian.com. IN A 199.165.155.8 ; AD=1 NoError
    maple.killian.com. IN AAAA 2607:f358:10:27::8 ; AD=1 NoError

    ; But refuses IPv6 connections
    ;
    _25._tcp.maple.killian.com. IN TLSA 3 0 1 4ca6fa3e1b53c809442cf7db22227e3f4a6bf51074305dbcf0a4593c30d1b723 ; AD=1 NoError
    _25._tcp.maple.killian.com. IN TLSA 3 0 1 7a668f4b7f418a618a9e1043b644c282d55e5ead0ff20acaa4db5357a9764a2f ; AD=1 NoError

      ; Most importantly, its just-replaced Let's Encrypt certificate
      ; does not match its TLSA record
      ;
      ; Suggested more robust TLSA record management approaches can be found via:

        https://github.com/internetstandards/toolbox-wiki/blob/master/DANE-for-SMTP-how-to.md
        https://mail.sys4.de/pipermail/dane-users/2018-February/000440.html
        https://community.letsencrypt.org/t/please-avoid-3-0-1-and-3-0-2-dane-tlsa-records-with-le-certificates/7022/17
        https://mail.sys4.de/pipermail/dane-users/2017-August/000417.html
        https://github.com/baknu/DANE-for-SMTP/wiki/2.-Implementation-resources

      maple.killian.com[199.165.155.8]: tlsa-mismatch
      maple.killian.com[2607:f358:10:27::8]: connection refused
        TLS = TLS12 with ECDHE-RSA-AES256GCM-SHA384,P384
        name = killian.com
        name = maple.killian.com
        name = pine.killian.com
        name = puffleservices.com
        name = smtp.killian.com
        name = smtp.puffleservices.com
        name = smtp1.killian.com
        name = smtp1.puffleservices.com
        name = smtp2.killian.com
        name = smtp2.puffleservices.com
        depth = 0
          Issuer CommonName = Let's Encrypt Authority X3
          Issuer Organization = Let's Encrypt
          notBefore = 2020-02-12T23:29:02Z
          notAfter = 2020-05-12T23:29:02Z
          Subject CommonName = smtp.killian.com
          cert sha256 [nomatch] <- 3 0 1 ca2be3cf3e0f13fec3860bc6a54a21f3d51deea640fe8695c83c9fd817de02a6
          pkey sha256 [nomatch] <- 3 1 1 b7f1cd36893e5a884a3c4c70853e87089ea8b65e07c9c7996181d1b3b48ceb39
        depth = 1
          Issuer CommonName = DST Root CA X3
          Issuer Organization = Digital Signature Trust Co.
          notBefore = 2016-03-17T16:40:46Z
          notAfter = 2021-03-17T16:40:46Z
          Subject CommonName = Let's Encrypt Authority X3
          Subject Organization = Let's Encrypt
          cert sha256 [nomatch] <- 2 0 1 25847d668eb4f04fdd40b12b6b0740c567da7d024308eb6c2c96fe41d9de218d
          pkey sha256 [nomatch] <- 2 1 1 60b87575447dcba2a36b7d11ac09fb24a9db406fee12d2cc90180517616e8a18

       * Potential matching TLSA records:

        3 1 1 b7f1cd36893e5a884a3c4c70853e87089ea8b65e07c9c7996181d1b3b48ceb39
        2 1 1 60b87575447dcba2a36b7d11ac09fb24a9db406fee12d2cc90180517616e8a18
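
The [nomatch] verdicts above come down to comparing each published record with the digest computed for the same usage/selector/matching-type triple. A simplified sketch using only values quoted in this thread. It assumes one digest per triple, whereas a real DANE verifier checks every certificate in the chain and validates the DNSSEC signatures; `matching_records` is a hypothetical helper.

```python
# TLSA records published in DNS at the time of the failure.
PUBLISHED = [
    "3 0 1 4ca6fa3e1b53c809442cf7db22227e3f4a6bf51074305dbcf0a4593c30d1b723",
    "3 0 1 7a668f4b7f418a618a9e1043b644c282d55e5ead0ff20acaa4db5357a9764a2f",
]

# Digests reported for the chain the server actually presents.
CHAIN = {
    "3 0 1": "ca2be3cf3e0f13fec3860bc6a54a21f3d51deea640fe8695c83c9fd817de02a6",
    "3 1 1": "b7f1cd36893e5a884a3c4c70853e87089ea8b65e07c9c7996181d1b3b48ceb39",
    "2 0 1": "25847d668eb4f04fdd40b12b6b0740c567da7d024308eb6c2c96fe41d9de218d",
    "2 1 1": "60b87575447dcba2a36b7d11ac09fb24a9db406fee12d2cc90180517616e8a18",
}


def matching_records(published, chain):
    """Return the published records whose digest equals the chain's digest."""
    hits = []
    for record in published:
        usage, selector, mtype, digest = record.split()
        # Hex digests are compared case-insensitively.
        if chain.get(f"{usage} {selector} {mtype}") == digest.lower():
            hits.append(record)
    return hits
```

With the data above, `matching_records(PUBLISHED, CHAIN)` returns an empty list, which is the tlsa-mismatch condition; publishing either of the two potential records would make it non-empty.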

--
    Viktor.

Re: how to debug TLS certificate verification error?

Bernardo Reino
On Sun, 16 Feb 2020, Viktor Dukhovni wrote:

> As luck would have it, you've come to the right place.  Your domain is
> DNSSEC-signed, and your MX host has DANE TLSA records:
>
>    $ hsdig -t a maple.killian.com
>    maple.killian.com. IN A 199.165.155.8 ; NoError AD=1
>
> [...]

May I ask you where to find/download your hsdig tool?

(a quick search indicates that it's some Haskell tool written by yourself,
but I can't seem to find it :)

Cheers.

Re: how to debug TLS certificate verification error?

Viktor Dukhovni
> On Feb 16, 2020, at 3:18 PM, Bernardo Reino <[hidden email]> wrote:
>
> May I ask you where to find/download your hsdig tool?
>
> (a quick search indicates that it's some Haskell tool written by yourself, but I can't seem to find it :)

I've not made it available to the public.  You can get essentially
similar information from dig, albeit with much more verbose output
by default.

I use "hsdig" because it can also do parallel batch DNS queries at
very high concurrency, but that's also why I'm reluctant to share
it.  It is too easily misused.

--
        Viktor.


Re: how to debug TLS certificate verification error?

Viktor Dukhovni
On Sun, Feb 16, 2020 at 01:41:16PM -0500, Viktor Dukhovni wrote:

>       ; Suggested more robust TLSA record management approaches can be found via:
>
>         https://github.com/internetstandards/toolbox-wiki/blob/master/DANE-for-SMTP-how-to.md
>         https://mail.sys4.de/pipermail/dane-users/2018-February/000440.html
>         https://community.letsencrypt.org/t/please-avoid-3-0-1-and-3-0-2-dane-tlsa-records-with-le-certificates/7022/17
>         https://mail.sys4.de/pipermail/dane-users/2017-August/000417.html
>         https://github.com/baknu/DANE-for-SMTP/wiki/2.-Implementation-resources

No matter how hard I try, it seems people are just too distracted to
heed (or read) sound advice (e.g. the 3rd link above).  A band-aid has
been applied and the published TLSA record is now:

    @maple.killian.com.[199.165.155.8]
    @pine.killian.com.[35.167.26.164]
    _25._tcp.maple.killian.com. TLSA 3 0 1 CA2BE3CF3E0F13FEC3860BC6A54A21F3D51DEEA640FE8695C83C9FD817DE02A6

which does match the certificate just at the moment, but will promptly
break in ~60 days when the next Let's Encrypt certificate rollover
happens.
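
The ~60 days figure can be reproduced from the certificate dates quoted in this thread. The sketch assumes certbot renews once fewer than 30 days of validity remain; that is its usual default, but the renewal schedule on this host is not stated in the thread.

```python
from datetime import date, timedelta

NOT_AFTER = date(2020, 5, 12)        # notAfter of the current certificate
TODAY = date(2020, 2, 16)            # date of this message
RENEW_WINDOW = timedelta(days=30)    # assumed certbot default, not from the thread


def days_until_renewal(today, not_after, window=RENEW_WINDOW):
    """Days until the certificate enters the renewal window."""
    return (not_after - window - today).days
```

For the dates above this gives 56 days, after which a freshly issued certificate would no longer match a "3 0 1" record.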

>         depth = 0
>           Issuer CommonName = Let's Encrypt Authority X3
>           Issuer Organization = Let's Encrypt
>           notBefore = 2020-02-12T23:29:02Z
>           notAfter = 2020-05-12T23:29:02Z
>           Subject CommonName = smtp.killian.com
>           cert sha256 [nomatch] <- 3 0 1 ca2be3cf3e0f13fec3860bc6a54a21f3d51deea640fe8695c83c9fd817de02a6
>           pkey sha256 [nomatch] <- 3 1 1 b7f1cd36893e5a884a3c4c70853e87089ea8b65e07c9c7996181d1b3b48ceb39

This is why the right TLSA RR type to use with Let's Encrypt is "3 1 1"
(pinning the key, not the certificate), and by using

    certbot renew --reuse-key

the "3 1 1" record continues to work across certificate rollovers,
but even then, before implementing DANE:

    1.  Implement monitoring, so that if your setup breaks, you'll
        know it before I do.

    2.  Automate a robust cert/key rollover process.  See suggested
        "3 1 1 + 3 1 1" approach or "3 1 1 + 2 1 1".

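Item 1 could be approximated with a small probe run from cron. The sketch below fetches the certificate the server presents after STARTTLS, derives its "3 0 1" and "3 1 1" digests, and checks them against the records expected in DNS. Assumptions: the published records are passed in by the caller (the lookup through a validating resolver is left out), the DER walk handles only the standard X.509 layout, and all function names are illustrative.

```python
import hashlib
import smtplib
import ssl


def _read_tlv(buf, pos):
    """Return (tag, value_start, end) of the DER element starting at pos."""
    tag = buf[pos]
    length = buf[pos + 1]
    start = pos + 2
    if length & 0x80:                      # long-form length
        count = length & 0x7F
        length = int.from_bytes(buf[start:start + count], "big")
        start += count
    return tag, start, start + length


def spki_from_cert(der):
    """Extract the DER-encoded subjectPublicKeyInfo from a certificate."""
    _, cert_body, _ = _read_tlv(der, 0)    # Certificate SEQUENCE
    _, pos, _ = _read_tlv(der, cert_body)  # tbsCertificate SEQUENCE
    tag, _, end = _read_tlv(der, pos)
    if tag == 0xA0:                        # optional [0] version field
        pos = end
    for _ in range(5):  # serialNumber, signature, issuer, validity, subject
        _, _, pos = _read_tlv(der, pos)
    _, _, end = _read_tlv(der, pos)
    return der[pos:end]


def tlsa_candidates(der):
    """Return the '3 0 1' and '3 1 1' records that would match this cert."""
    return {
        "3 0 1 " + hashlib.sha256(der).hexdigest(),
        "3 1 1 " + hashlib.sha256(spki_from_cert(der)).hexdigest(),
    }


def fetch_smtp_cert(host, port=25, timeout=30):
    """Return the DER certificate presented after STARTTLS."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False             # DANE check happens separately
    ctx.verify_mode = ssl.CERT_NONE
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.starttls(context=ctx)
        return smtp.sock.getpeercert(binary_form=True)


def tlsa_ok(der, published):
    """True if any published record matches the presented certificate."""
    normalized = {" ".join(rec.lower().split()) for rec in published}
    return bool(tlsa_candidates(der) & normalized)
```

A False result from `tlsa_ok(fetch_smtp_cert("maple.killian.com"), published)` would be the signal to alert before DANE-enforcing senders start deferring mail.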
--
    Viktor.

Re: how to debug TLS certificate verification error?

postfix-6
Viktor, thank you for your two helpful replies.

I do intend to read through the approaches you suggested, and most
likely implement them. My top priority was to get the mail flowing
again, which your first helpful reply let me do. Indeed, I postponed
replying because I wanted to read the items you suggested before I did so.

I should point out that this problem arose from a bug in one of my
scripts that generates both the certificates and the DNS entries. I
failed to use my new subroutine in one place to select the certificate
to publish in DNS when I went from self-signed to Let's Encrypt (it was
used in postfix main.cf, but not the named zone file), so it was still
publishing the self-signed certificate. I thought all was OK after the
changeover because https://www.checktls.com and openssl s_client both
said so.
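
This kind of drift could be caught by comparing the digest of the certificate file Postfix actually serves with what the zone file publishes. A minimal sketch, assuming the file named by smtpd_tls_cert_file is PEM with the server certificate first; `tlsa_3_0_1` and `zone_is_current` are hypothetical names, and for "3 1 1" records the public key would have to be hashed instead of the whole certificate.

```python
import hashlib
import re
import ssl

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL)


def tlsa_3_0_1(pem_text):
    """Return the '3 0 1' record for the first certificate in a PEM file."""
    match = PEM_BLOCK.search(pem_text)
    if match is None:
        raise ValueError("no PEM certificate found")
    der = ssl.PEM_cert_to_DER_cert(match.group(0))
    return "3 0 1 " + hashlib.sha256(der).hexdigest()


def zone_is_current(pem_text, published):
    """True if the zone publishes the digest of the served certificate."""
    expected = tlsa_3_0_1(pem_text)
    return any(" ".join(rec.lower().split()) == expected for rec in published)
```

Neither openssl s_client without DANE options nor a plain STARTTLS test consults the TLSA records, which is consistent with both reporting success while DANE-enforcing senders failed.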

-Earl
