Caching issues when using LDAP lookups for transports

Ralph Seichter-3
In a new server setup, I use two consecutive transport lookups:

  transport_maps = ldap:/etc/postfix/foo.cf ldap:/etc/postfix/bar.cf

The lookup defined in foo.cf MAY return a result for a given recipient,
while using bar.cf MUST return a result. This works, but with a caveat:
Adding or removing the relevant attribute for the foo.cf based lookup is
not recognised by my Postfix tests until I run "postfix reload".

ldap_table(5) mentions that all cache* LDAP parameters are ignored, so I
assume I cannot affect result caching there. As a consequence, I tried
setting the following queue manager parameters:

  qmgr_message_recipient_limit = 1
  qmgr_message_recipient_minimum = 1

However, this does not resolve the issue, even if I use changing
recipient addresses in an attempt to flush the qmgr in-memory status
cache.

I have two questions which I hope you guys can answer:

1. How do I force Postfix to perform an LDAP lookup every time a new
inbound message arrives (i.e., how to disable caching lookup results)?

2. Can I configure a single LDAP lookup instead of two sequential ones,
which behaves according to the following pseudocode:

  x = ldap_lookup_recipient_record(envelope_to_address)
  if x.has_attribute(alpha)
      return x.value_of_attribute(alpha)
  else
      return x.value_of_attribute(beta)

Your help is appreciated.

-Ralph

Re: Caching issues when using LDAP lookups for transports

Viktor Dukhovni
On Thu, Feb 18, 2021 at 07:52:07AM +0100, Ralph Seichter wrote:

> In a new server setup, I use two consecutive transport lookups:
>
>   transport_maps = ldap:/etc/postfix/foo.cf ldap:/etc/postfix/bar.cf

I strongly do not recommend using LDAP for per-user transport lookups.
Instead:

    - Use virtual(5) LDAP tables to *rewrite* recipient addresses
      to transport-specific domains

    - Resolve these domains via a stable (ideally indexed) table of
      domain -> transport mappings

    - Where needed, use smtp_generic_maps to rewrite the
      transport-specific recipient domain back to the original
      address (something similar to canonical_maps, but on output).
      The definitions of smtp_generic_maps can be transport-specific,
      via master.cf overrides.

Yes, this is more complex, but:

    - Your single-threaded queue manager is no longer blocked waiting
      on potentially rather expensive LDAP lookups.

    - Postfix can continue to process already queued mail even when
      LDAP is down; it just won't take in new mail.

    - Logically, your configuration is more modular, rewrite users
      from (typically) virtual_alias domains to mailstore domains,
      leaving the transport to be defined indirectly.

      Then separately from assigning the user to a mailstore domain,
      configure Postfix to route each domain to an appropriate
      transport (or just send to the MX host of that domain).
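
A minimal sketch of that layout follows. The file names, the
"storeN.mail.invalid" domains, the "storeAddress" attribute and the
"tostore" service name are all hypothetical; none of them appear in
this thread. It also assumes a single original domain (example.com)
and unchanged local parts, so the output rewrite can be one regexp.

    main.cf:
        # LDAP rewrites user@example.com -> user@store1.mail.invalid
        virtual_alias_maps = proxy:ldap:/etc/postfix/ldap-valias.cf
        # Local, indexed domain -> transport table; no LDAP involved
        transport_maps = hash:/etc/postfix/transport

    ldap-valias.cf:
        server_host = ...
        search_base = ...
        query_filter = mail=%s
        result_attribute = storeAddress

    transport:
        store1.mail.invalid    tostore:[store1.example.com]
        store2.mail.invalid    tostore:[store2.example.com]

    master.cf:
        tostore   unix  -       -       n       -       -       smtp
            -o smtp_generic_maps=regexp:/etc/postfix/generic-store

    generic-store:
        # Map the mailstore domain back to the original domain on output
        /^(.+)@store[0-9]+\.mail\.invalid$/    ${1}@example.com

The transport file is a hash table, so it has to be rebuilt with
postmap after each edit. Moving a user then only means changing the
LDAP attribute, while the domain -> transport table stays static.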

> However, this does not resolve the issue, even if I use changing
> recipient addresses in an attempt to flush the qmgr in-memory status
> cache.

The queue manager has a one-element transport lookup cache: when
a stream of back-to-back messages (usually when testing, rather
than in real life) all go to the same recipient, there's only
one transport lookup.

> I have two questions which I hope you guys can answer:
>
> 1. How do I force Postfix to perform an LDAP lookup every time a new
> inbound message arrives (i.e., how to disable caching lookup results)?

You can't; the built-in transport-resolution cache is not
dictionary-specific.

> 2. Can I configure a single LDAP lookup instead of two sequential ones,
> which behaves according to the following pseudocode:
>
>   x = ldap_lookup_recipient_record(envelope_to_address)
>   if x.has_attribute(alpha)
>       return x.value_of_attribute(alpha)
>   else
>       return x.value_of_attribute(beta)

Possibly, yes, via a suitable combination of leaf_result_attribute,
terminal_result_attribute and result_attribute.  See ldap_table(5).

    terminal_result_attribute = alpha
    result_attribute = beta
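
Filled out into a complete table file, this might look like the sketch
below. The file name, server, search base and the "mail" filter
attribute are assumptions; "alpha" and "beta" are the placeholder
attribute names from the question.

    /etc/postfix/combined.cf:
        server_host = ldap://ldap.example.com
        search_base = dc=example,dc=com
        query_filter = mail=%s
        # If the entry has "alpha", return only its value and ignore
        # all other result attributes.
        terminal_result_attribute = alpha
        # Otherwise return the value of "beta".
        result_attribute = beta

Comments must stay on their own lines as shown; text after a value on
the same line would become part of the value.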

But this will not change your original issue.  Again, DO NOT
burden the queue manager with LDAP lookups.  Make transport
lookups purely local and largely static.

--
    Viktor.

Re: Caching issues when using LDAP lookups for transports

Ralph Seichter-3
* Viktor Dukhovni:

> I strongly do not recommend using LDAP for per-user transport lookups.

Shame that it does not scale, because it works. I have tried using a
combination of LDAP-based virtual_alias_maps and hashed transport_maps
as per your suggestion, but have not yet quite achieved the result I am
looking for.

Let me modify the pseudocode to describe my goal in more detail:

  x = ldap_lookup_recipient_record(envelope_to_address)
  if x.has_attribute(alpha)
      reject_with_code_4xx(message=value_of_attribute(alpha))
  else
      relay_message(nexthop=value_of_attribute(beta))

When I use transport lookups, this is possible, for example by setting
attributes alpha="retry:Unavailable" or beta="smtp:[somehost]".

I tried to emulate this by using virtual alias lookups which return
alpha="pause.domain.tld" or beta="route.domain.tld", combined with a
transports hash map containing

  pause.domain.tld  retry:Unavailable
  route.domain.tld  smtp:[somehost]

Alas, when I do it this way, Postfix accepts email for users with the
alpha attribute and then stores them as deferred in the queue. I need to
reject with "4xx Unavailable" instead.

I feel like I am close but overlooking something.

-Ralph

Re: Caching issues when using LDAP lookups for transports

Wietse Venema
A Postfix process won't look up transport_maps if the same query
repeats, but when I look at the code, there is a 30-second time
limit on reusing the cached response. Is that not sufficient?

        Wietse

Re: Caching issues when using LDAP lookups for transports

Ralph Seichter-3
* Wietse Venema:

> A Postfix process won't look up transport_maps if the same query
> repeats, but when I look at the code, there is a 30-second time
> limit on reusing the cached response. Is that not sufficient?

Maybe it would help if I described the scenario in more detail. Consider
an LDAP server listing every valid recipient address, and for each
recipient an attribute which identifies one member of a set of "next
hop" servers, i.e. the user's home server.

User accounts need to be moved between home servers, which can take some
time, depending on the amount of data that needs to be moved. My goal is
to add an LDAP attribute which indicates that an account is undergoing
maintenance. This attribute is set before maintenance starts, and while
it is active, incoming email for the account should be rejected with
code 4xx to avoid queueing issues.

In order to keep the window for temporary message rejection as small as
possible, the LDAP attribute is set immediately before maintenance
starts, and is removed immediately after maintenance ends. Any caching
interferes when incoming traffic volume is high, even 30 seconds matter.

If messages are not rejected during maintenance, they end up in the
Postfix queue. However, mail queued for next hop someserver.domain.tld
will no longer be accepted by that server once maintenance ends. All
mail, including the messages queued during maintenance, must only be
sent to otherserver.domain.tld after maintenance finishes. The actual
value of the new home server can only be determined via LDAP lookups,
after maintenance finishes.

My first attempt was to solve this with transport lookups, but Viktor
pointed out that it does not scale well. I am now trying to solve this
in a manner which does not block any given Postfix process.

-Ralph

Re: Caching issues when using LDAP lookups for transports

Wietse Venema
In reply to this post by Wietse Venema
Wietse Venema:
> A Postfix process won't look up transport_maps if the same query
> repeats, but when I look at the code, there is a 30-second time
> limit on reusing the cached response. Is that not sufficient?

Sorry, that is the resolve client cache, not the transport_map
cache.  But the result is that the transport map won't be queried
for a repeated resolve client request, for 30s.

Finally, the transport map lookup client caches the "*" result
indefinitely.

        Wietse

Re: Caching issues when using LDAP lookups for transports

Wietse Venema
In reply to this post by Ralph Seichter-3
Ralph Seichter:
> In order to keep the window for temporary message rejection as small as
> possible, the LDAP attribute is set immediately before maintenance
> starts, and is removed immediately after maintenance ends. Any caching
> interferes when incoming traffic volume is high, even 30 seconds matter.

Now that you know about the 30s, how would that make a difference?
The safe sequence is to

1) Stop accepting email (reply with 4xx).

2) Update LDAP

3) Wait until caches and queues have drained (at least 30s).

4) Start accepting email.

        Wietse

> If messages are not rejected during maintenance, they end up in the
> Postfix queue. However, mail queued for next hop someserver.domain.tld
> will no longer be accepted by that server once maintenance ends. All
> mail, including the messages queued during maintenance, must only be
> sent to otherserver.domain.tld after maintenance finishes. The actual
> value of the new home server can only be determined via LDAP lookups,
> after maintenance finishes.
>
> My first attempt was to solve this with transport lookups, but Viktor
> pointed out that it does not scale well. I am now trying to solve this
> in a manner which does not block any given Postfix process.
>
> -Ralph
>

Re: Caching issues when using LDAP lookups for transports

Wietse Venema
Wietse Venema:

> Ralph Seichter:
> > In order to keep the window for temporary message rejection as small as
> > possible, the LDAP attribute is set immediately before maintenance
> > starts, and is removed immediately after maintenance ends. Any caching
> > interferes when incoming traffic volume is high, even 30 seconds matter.
>
> Now that you know about the 30s, how would that make a difference?
> The safe sequence is to
>
> 1) Stop accepting email (reply with 4xx).
>
> 2) Update LDAP
>
> 3) Wait until caches and queues have drained (at least 30s).
>
> 4) Start accepting email.

Actually, drain caches and queues BEFORE updating LDAP, so that
LDAP is not changing while Postfix is still processing email.

  Wietse
 

> > If messages are not rejected during maintenance, they end up in the
> > Postfix queue. However, mail queued for next hop someserver.domain.tld
> > will no longer be accepted by that server once maintenance ends. All
> > mail, including the messages queued during maintenance, must only be
> > sent to otherserver.domain.tld after maintenance finishes. The actual
> > value of the new home server can only be determined via LDAP lookups,
> > after maintenance finishes.
> >
> > My first attempt was to solve this with transport lookups, but Viktor
> > pointed out that it does not scale well. I am now trying to solve this
> > in a manner which does not block any given Postfix process.
> >
> > -Ralph
> >
>

Re: Caching issues when using LDAP lookups for transports

Viktor Dukhovni
In reply to this post by Ralph Seichter-3
On Thu, Feb 18, 2021 at 02:00:11PM +0100, Ralph Seichter wrote:

> > I strongly do not recommend using LDAP for per-user transport lookups.
>
> Shame that it does not scale, because it works. I have tried using a
> combination of LDAP-based virtual_alias_maps and hashed transport_maps
> as per your suggestion, but have not yet quite achieved the result I am
> looking for.
>
> Let me modify the pseudocode to describe my goal in more detail:
>
>   x = ldap_lookup_recipient_record(envelope_to_address)
>   if x.has_attribute(alpha)
>       reject_with_code_4xx(message=value_of_attribute(alpha))
>   else
>       relay_message(nexthop=value_of_attribute(beta))

You should not be using the transport(5) table for SMTP access control,
that's what access(5) is for.  LDAP used in access(5) tables works just
fine.  And scales better because while there's only one queue-manager,
there are many smtpd(8) processes, whose LDAP queries are concurrent,
(typically via multiple instances of proxymap, which scales up on
demand).

> When I use transport lookups, this is possible, for example by setting
> attributes alpha="retry:Unavailable" or beta="smtp:[somehost]".

The mistake is using transport(5) for access control.

> I tried to emulate this by using virtual alias lookups which return
> alpha="pause.domain.tld" or beta="route.domain.tld", combined with a
> transports hash map containing
>
>   pause.domain.tld  retry:Unavailable
>   route.domain.tld  smtp:[somehost]

Sorry, that was for actually delivering email, not for simulating
access control (square peg, round hole).

--
    Viktor.

Re: Caching issues when using LDAP lookups for transports

Viktor Dukhovni
On Thu, Feb 18, 2021 at 10:56:24AM -0500, Viktor Dukhovni wrote:

> > Let me modify the pseudocode to describe my goal in more detail:
> >
> >   x = ldap_lookup_recipient_record(envelope_to_address)
> >   if x.has_attribute(alpha)
> >       reject_with_code_4xx(message=value_of_attribute(alpha))
> >   else
> >       relay_message(nexthop=value_of_attribute(beta))
>
> You should not be using the transport(5) table for SMTP access control,
> that's what access(5) is for.  LDAP used in access(5) tables works just
> fine.  And scales better because while there's only one queue-manager,
> there are many smtpd(8) processes, whose LDAP queries are concurrent,
> (typically via multiple instances of proxymap, which scales up on
> demand).

In fact you have two potential mechanisms for this:

    main.cf:
        # Filter out unauthorised access before recipient checks
        #
        smtpd_client_restrictions =
            permit_mynetworks,
            reject_unauth_destination
            # ... RBL lookups ...

        ldap = proxy:ldap:${config_directory}/
        smtpd_recipient_restrictions =
            check_recipient_access ${ldap}ldap-rcpt.cf

        smtpd_relay_restrictions =
            permit_mynetworks,
            # permit_sasl_authenticated,
            reject_unauth_destination

    ldap-rcpt.cf:
        server_host = ...
        ...
        query_filter = mail=%s
        result_attribute = reject_action

This assumes that the "reject_action" attribute holds a fully formed
access(5) value starting with "REJECT" or "450" or "550".  You can also
start with a keyword and use a regexp "pipemap" to map the keyword to an
access action.
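
A sketch of the keyword variant, assuming Postfix 3.0 or later (where
pipemap is available). The "mailState" attribute, its keyword values
and the rcpt-actions file name are hypothetical.

    main.cf:
        smtpd_recipient_restrictions =
            check_recipient_access
                pipemap:{ldap:/etc/postfix/ldap-rcpt.cf,regexp:/etc/postfix/rcpt-actions}

    ldap-rcpt.cf:
        server_host = ...
        ...
        query_filter = mail=%s
        # Holds a bare keyword such as "maintenance", not an action
        result_attribute = mailState

    rcpt-actions:
        /^maintenance$/    450 4.2.1 Mailbox temporarily unavailable
        /^disabled$/       550 5.2.1 Mailbox disabled

The LDAP result becomes the lookup key for the regexp table. If the
entry has no "mailState" attribute, or the keyword matches no pattern,
the lookup returns nothing and the remaining restrictions apply.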

Bottom line, use the transport(5) table for routing, and access(5) for
access control.

--
    Viktor.

Re: Caching issues when using LDAP lookups for transports

Wietse Venema
Viktor Dukhovni:
> Bottom line, use the transport(5) table for routing, and access(5) for
> access control.

These are queried at different points in time. Is this race-condition
safe, i.e. can LDAP responses change while an email message is in
flight inside Postfix?

        Wietse

Re: Caching issues when using LDAP lookups for transports

Ralph Seichter-3
In reply to this post by Wietse Venema
* Wietse Venema:

> Actually, drain caches and queues BEFORE updating LDAP, so that
> LDAP is not changing while Postfix is still processing email.

The maintenance service and Postfix only intersect in LDAP, and moving
an account between servers can happen at any time. That's why I can only
rely on the LDAP query results.

I have gone through many tests based on Viktor's and your suggestions,
and found the following combination promising:

  virtual_alias_maps = ldap:/etc/postfix/virtual_alias.cf

  smtpd_recipient_restrictions = [... reject_* here ...]
    check_recipient_access ldap:/etc/postfix/recipient_access.cf

The lookups of course use different result attributes with matching
result data: An email address for virtual alias, and DEFER_IF_PERMIT for
access while an account is undergoing maintenance.
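
A sketch of how the two lookup files might differ, assuming both find
the same entry by its "mail" attribute. The attribute names
"routeAddress" and "accessAction" are hypothetical, and the server
settings are left out.

    virtual_alias.cf:
        server_host = ...
        search_base = ...
        query_filter = mail=%s
        # Always present; holds the rewritten recipient address
        result_attribute = routeAddress

    recipient_access.cf:
        server_host = ...
        search_base = ...
        query_filter = mail=%s
        # Present only during maintenance, for example:
        #   DEFER_IF_PERMIT Account temporarily unavailable
        result_attribute = accessAction

While "accessAction" is absent the access lookup returns no result, so
the recipient passes this check and only the virtual alias rewrite
applies.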

-Ralph

Re: Caching issues when using LDAP lookups for transports

Viktor Dukhovni
In reply to this post by Wietse Venema
On Thu, Feb 18, 2021 at 11:53:56AM -0500, Wietse Venema wrote:
> Viktor Dukhovni:
> > Bottom line, use the transport(5) table for routing, and access(5) for
> > access control.
>
> These are queried at different points in time. Is this race-condition
> safe, i.e. can LDAP responses change while an email message is in
> flight inside Postfix?

There's no specific issue here.  There's never a guarantee that an
address accepted initially is still valid by the time the message lands
in the active queue (possibly after being deferred).

If there are access(5) rules they run in real time to reject some email
that is deemed invalid at that time.

Later, the accepted recipients (rewritten via virtual(5)) are mapped
to transports, and the delivery agent may discover that a recipient is
no longer valid (not in /etc/passwd, rejected by a remote SMTP or LMTP
server, ...).  That's normal; some mail may bounce.

--
    Viktor.