TLS changes and breakage after 3.3.2 -> 3.4.0 upgrade

Mike Kazantsev
Hello,

I recently updated Postfix on Arch to 3.4.0 and hit an interesting,
hard-to-debug (with my limited knowledge) problem: it fails to deliver
any mail to the relayhost via TLS.

main.cf configuration file for that host looks like this:
  https://gist.github.com/mk-fg/f9ac42ff34a5694ce24cd9a925b32721#file-main-cf

master.cf is the default one (available on the same gist as main.cf
above), with tlsproxy commented-out.

And logs only show two kinds of messages on delivery:

  postfix/smtp[16394]: initializing the client-side TLS engine
  postfix/smtp[16393]: 869C7A23AD: TLS is required, but our TLS engine is unavailable

Neither of these tells me what the problem with TLS engine was, and why
it stopped working in 3.4.0, which I think is the main problem here.

I've tried using smtp_tls_loglevel=2 and the usual debug_peer_list=...
to get more information on what exactly is failing, but neither of them
provides anything else about the problem.

What is the expected way for a Postfix user to find out why Postfix
starts failing here after the upgrade?
I.e. which option it rejects or lacks, for what reason, etc.
(while it worked perfectly and without any warnings in 3.3.2)


On IRC it was pointed out to me that there are multiple TLS-related
changes in 3.4.0. I have yet to look into these, but the complete lack
of information here looks like a bug in itself.

Expected logging would be something like:

  postfix/smtp: ERROR: smtp_tls_X requires tlsproxy enabled in master.cf
or
  postfix/smtp: ERROR: failed to use certificate ... - openssl error: ...
or
  postfix/smtp: WARNING: smtp_tls_eccert_file is deprecated and will be removed in 3.4.0

But as mentioned, there don't seem to be any such hints.

I would also appreciate advice on how to fix the current configuration.

I suspect I might be able to figure it out after looking through the
3.4.0 changes, though, and altering the configuration to use new features.


Thanks!

--
Mike Kazantsev // fraggod.net

Re: TLS changes and breakage after 3.3.2 -> 3.4.0 upgrade

Viktor Dukhovni
On Tue, Mar 05, 2019 at 04:02:36AM +0500, Mike Kazantsev wrote:

> And logs only show two kinds of messages on delivery:
>
>   postfix/smtp[16394]: initializing the client-side TLS engine
>   postfix/smtp[16393]: 869C7A23AD: TLS is required, but our TLS engine is unavailable
>
> Neither of these tells me what the problem with TLS engine was, and why
> it stopped working in 3.4.0, which I think is the main problem here.

The reason there's no logging of a problem, is that there is no
problem to log! :-(  The certificate initialization was successful,
but due to a bug in reporting success/failure to the caller, the
successful outcome was treated as a failure (whose reason would
have already been logged if it were a real failure).

The patch is a one liner, below.

--- a/src/tls/tls_certkey.c
+++ b/src/tls/tls_certkey.c
@@ -589,7 +589,7 @@ static int set_cert_stuff(SSL_CTX *ctx, const char *cert_type,
      * single pass, avoiding potential race conditions during key rollover.
      */
     if (strcmp(cert_file, key_file) == 0)
- return (load_mixed_file(ctx, cert_file));
+ return (load_mixed_file(ctx, cert_file) == 0);
 
     /*
      * We need both the private key (in key_file) and the public key

A work-around is to make a symlink to the cert file and use that as
the keyfile.

    # ln -s /etc/ssl/mail-relay-auth.pem /etc/ssl/mail-relay-key.pem
    # postconf -e 'smtp_tls_eckey_file = /etc/ssl/mail-relay-key.pem'
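
Why the symlink helps, as far as the patch context above shows: the
single-file branch is selected by comparing the two path strings with
strcmp(), so a key path that differs only in name (even though it
resolves to the same file) skips that branch. A minimal sketch of just
that comparison follows; uses_mixed_loader() is a hypothetical helper
name for illustration, not a Postfix function.

```c
#include <string.h>

/*
 * Hypothetical helper, not Postfix code. It mirrors the strcmp() test
 * visible in the patch context: returns 1 when the cert and key paths
 * are the same string (the single-file branch would be taken), and 0
 * otherwise. Only the strings are compared; no file is opened and no
 * symlink is resolved.
 */
static int uses_mixed_loader(const char *cert_file, const char *key_file)
{
    return strcmp(cert_file, key_file) == 0;
}
```

Called with the same path twice, the helper returns 1; called with
/etc/ssl/mail-relay-auth.pem and the symlink path
/etc/ssl/mail-relay-key.pem, it returns 0, so the affected return
statement is never reached.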

The bug is in the code that tries to import a single file with both
the key and the certificate in one go.  Or rather, in the code that
calls that code (the underlying code was carefully tested, but the
calling code escaped unscrutinized).  The official patch will have
a more comprehensive test.

--
        Viktor.

Re: TLS changes and breakage after 3.3.2 -> 3.4.0 upgrade

Wietse Venema
Viktor Dukhovni:
> The reason there's no logging of a problem, is that there is no
> problem to log! :-(  The certificate initialization was successful,
> but due to a bug in reporting success/failure to the caller, the
> successful outcome was treated as a failure (whose reason would
> have already been logged if it were a real failure).
>
> The patch is a one liner, below.

I have uploaded postfix-3.4.1-RC1 (stable) and postfix-3.5-20190304 (unstable).

        Wietse

Re: TLS changes and breakage after 3.3.2 -> 3.4.0 upgrade

Mike Kazantsev
In reply to this post by Viktor Dukhovni
On Mon, 4 Mar 2019 18:43:32 -0500
Viktor Dukhovni <[hidden email]> wrote:

> > Neither of these tells me what the problem with TLS engine was, and why
> > it stopped working in 3.4.0, which I think is the main problem here.  
>
> The reason there's no logging of a problem, is that there is no
> problem to log! :-(  The certificate initialization was successful,
> but due to a bug in reporting success/failure to the caller, the
> successful outcome was treated as a failure (whose reason would
> have already been logged if it were a real failure).
...
> A work-around is to make a symlink to the cert file and use that as
> the keyfile.

Thanks for testing this one-file case, such a quick patch and releases.

It's rare that such things are bugs and not my misconfiguration, so
apologies for maybe being too long-winded in describing the wrong problem.


--
Mike Kazantsev // fraggod.net

Re: TLS changes and breakage after 3.3.2 -> 3.4.0 upgrade

Viktor Dukhovni
On Tue, Mar 05, 2019 at 06:03:47AM +0500, Mike Kazantsev wrote:

> Thanks for testing this one-file case, such a quick patch and releases.

You're welcome.

> It's rare that such things are bugs and not my misconfiguration, so
> apologies for maybe being too long-winded in describing the wrong problem.

No apologies needed.  Thanks for the prompt bug report, it had all
the right details.  The sooner the bug is fixed, the fewer users
end up with systems running the problem code.

--
        Viktor.

Re: TLS changes and breakage after 3.3.2 -> 3.4.0 upgrade

Viktor Dukhovni
In reply to this post by Wietse Venema
On Mon, Mar 04, 2019 at 07:43:17PM -0500, Wietse Venema wrote:

> Viktor Dukhovni:
> > The reason there's no logging of a problem, is that there is no
> > problem to log! :-(  The certificate initialization was successful,
> > but due to a bug in reporting success/failure to the caller, the
> > successful outcome was treated as a failure (whose reason would
> > have already been logged if it were a real failure).
> >
> > The patch is a one liner, below.
>
> I have uploaded postfix-3.4.1-RC1 (stable) and postfix-3.5-20190304 (unstable).

Below is a more complete patch (relative to 3.4.0, including the same
one-liner), with tests that exercise the problem code-path.

We could refactor functions in tls_certkey.c to be consistent with
respect to whether 0 or non-zero is a successful return code.  The
lower-level internal functions return 0 on success, but the external
API returns 0 on failure.  I failed to handle one case of impedance
mismatch.  :-(
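
The mismatch can be reproduced in a few lines. This is only a sketch
with hypothetical stand-ins (load_mixed_file_stub, set_cert_stuff_buggy
and set_cert_stuff_fixed are illustrative names, not the Postfix
sources). It assumes nothing beyond the two conventions stated above:
the lower-level loader returns 0 on success, while its caller must
return 0 on failure.

```c
/* Hypothetical stand-ins for illustration; not the Postfix sources. */

/* Lower-level convention: 0 means success, non-zero means failure. */
static int load_mixed_file_stub(int simulate_failure)
{
    return simulate_failure ? -1 : 0;
}

/*
 * Caller convention: non-zero means success, 0 means failure.
 * Passing the loader's result through unchanged inverts its meaning.
 */
static int set_cert_stuff_buggy(int simulate_failure)
{
    return load_mixed_file_stub(simulate_failure);
}

/* Comparing against 0 translates between the two conventions. */
static int set_cert_stuff_fixed(int simulate_failure)
{
    return load_mixed_file_stub(simulate_failure) == 0;
}
```

With the buggy wrapper, a successful load (0) reaches the caller as
"failure" with nothing logged, which matches the symptom in this
thread; the fixed wrapper returns 1 for a successful load and 0 for a
failed one.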

--
        Viktor.

--- a/Makefile.in
+++ b/Makefile.in
@@ -58,6 +58,15 @@ tls_certkey_tests: test
     $(SHLIB_ENV) $(VALGRIND) ./tls_certkey -m $$pem > $$pem.out 2>&1 || exit 1; \
     diff $$pem.ref $$pem.out || exit 1; \
     echo "  $$pem: OK"; \
+    $(SHLIB_ENV) $(VALGRIND) ./tls_certkey -k $$pem $$pem > $$pem.out 2>&1 || exit 1; \
+    diff $$pem.ref $$pem.out || exit 1; \
+    echo "  $$pem (with key in $$pem): OK"; \
+    case $$pem in good-*) \
+ ln -sf $$pem tmpkey.pem; \
+ $(SHLIB_ENV) $(VALGRIND) ./tls_certkey -k tmpkey.pem $$pem > $$pem.out 2>&1 || exit 1; \
+ diff $$pem.ref $$pem.out || exit 1; \
+ echo "  $$pem (with key in tmpkey.pem): OK";; \
+    esac; \
  done; \
  for pem in bad-*.pem; do \
     $(SHLIB_ENV) $(VALGRIND) ./tls_certkey $$pem > $$pem.out 2>&1 && exit 1 || : ok; \
--- a/tls_certkey.c
+++ b/tls_certkey.c
@@ -589,7 +589,7 @@ static int set_cert_stuff(SSL_CTX *ctx, const char *cert_type,
      * single pass, avoiding potential race conditions during key rollover.
      */
     if (strcmp(cert_file, key_file) == 0)
- return (load_mixed_file(ctx, cert_file));
+ return (load_mixed_file(ctx, cert_file) == 0);
 
     /*
      * We need both the private key (in key_file) and the public key
@@ -690,6 +690,7 @@ int     main(int argc, char *argv[])
     int     ch;
     int     mixed = 0;
     int     ret;
+    char   *key_file = 0;
     SSL_CTX *ctx;
 
 #if OPENSSL_VERSION_NUMBER < 0x10100000L
@@ -707,8 +708,11 @@ int     main(int argc, char *argv[])
  tls_print_errors();
  exit(1);
     }
-    while ((ch = GETOPT(argc, argv, "m")) > 0) {
+    while ((ch = GETOPT(argc, argv, "mk:")) > 0) {
  switch (ch) {
+ case 'k':
+    key_file = optarg;
+    break;
  case 'm':
     mixed = 1;
     break;
@@ -722,7 +726,9 @@ int     main(int argc, char *argv[])
     if (argc < 1)
  usage();
 
-    if (mixed)
+    if (key_file)
+ ret = set_cert_stuff(ctx, "any", argv[0], key_file) == 0;
+    else if (mixed)
  ret = load_mixed_file(ctx, argv[0]);
     else
  ret = load_chain_files(ctx, argv[0]);