possible problem with postfix/local??

satishkumarp2k1
Hi,

We are noticing a couple of strange problems with Postfix in our
environment. They are as follows:

We have a relay server that is used extensively inside our
organization. This server receives email for one email domain (let's
say domain.com for example) and uses alias tables to route the emails
to several backend mailbox servers.

More precisely: all the emails are sent to @domain.com addresses.
This server finds the corresponding alias entry in one of the alias
hash tables and then sends the email to the address defined in
that entry.

Example alias table entries:
USER1: [hidden email]
USER2: [hidden email] etc.

 a. postfix/smtpd: receives the email properly. Since we have smtpd
restrictions configured, emails are rejected for unknown
recipients.
 b. postfix queues the email.
 c. postfix/local: tries to find the alias in the alias tables before
sending the email. Here is the actual problem. Once in a while (one or
two emails a month or so), postfix/local fails to find the USER
entry in the alias tables (though the entry exists) and bounces the email
with either of the following messages:

1. "unknown user" (this is really strange: if the user were unknown,
postfix/smtpd would have rejected the recipient during the SMTP session
itself)
2. "mail forwarding loop" for [hidden email] (though we are pretty
sure that the mail came to this server only once; I mean it was not
looping between the servers)

In all the cases we observed, postfix/local fails to find the entry in
the alias tables. This server handles almost 70000 emails daily and works
perfectly except for the nagging issue I mentioned above. A few details
regarding our environment are as follows:

postfix-2.2.9-10.25.3 (O.S: SLES 10)

Any inputs to resolve this would be appreciated. Thanks in advance.

Thanks,
Satish

Re: possible problem with postfix/local??

Sahil Tandon
On Mon, 28 Dec 2009, Satish Kumar P wrote:

> We are noticing a couple of strange problems with Postfix in our
> environment. They are as follows:

[ ... ]

Your problem description is useful, but actual logging that corresponds
to your situation and the output of 'postconf -n' are required.  Please
see DEBUG_README (a document to which you were linked upon joining this
mailing list) for tips on seeking help here.

> postfix-2.2.9-10.25.3 (O.S: SLES 10)

As an aside, you might consider upgrading to a more recent version of
Postfix.

--
Sahil Tandon <[hidden email]>

Re: possible problem with postfix/local??

satishkumarp2k1
> Your problem description is useful, but actual logging that corresponds
> to your situation and the output of 'postconf -n' are required.  Please
> see DEBUG_README (a document to which you were linked upon joining this
> mailing list) for tips on seeking help here.

Thanks for the response.

-> Corresponding postfix logs:

SUCCESSFUL EMAIL

Dec 27 18:34:03 SERVER postfix/smtpd[30149]: CFD0A1000084: client=CLIENT[172.22.23.21]
Dec 27 18:34:04 SERVER postfix/cleanup[29796]: CFD0A1000084: message-id=<20091227183402.EA54A37A1D@CLIENT>
Dec 27 18:34:04 SERVER postfix/qmgr[32069]: CFD0A1000084: from=<USER1@DOMAIN.com>, size=874, nrcpt=1 (queue active)
Dec 27 18:34:04 SERVER postfix/local[30215]: CFD0A1000084: to=<USER1@DOMAIN.com>, relay=local, delay=1, status=sent (forwarded as 1CBAF100008C)
Dec 27 18:34:04 SERVER postfix/qmgr[32069]: CFD0A1000084: removed

Dec 27 18:34:04 SERVER postfix/cleanup[29800]: 1CBAF100008C: message-id=<20091227183402.EA54A37A1D@CLIENT>
Dec 27 18:34:04 SERVER postfix/qmgr[32069]: 1CBAF100008C: from=<USER1@DOMAIN.com>, size=1019, nrcpt=1 (queue active)
Dec 27 18:34:04 SERVER postfix/smtp[29014]: 1CBAF100008C: to=<USER1@sub8.DOMAIN.com>, orig_to=<USER1@DOMAIN.com>, relay=INTERNAL-SERVER1[172.17.34.37], delay=0, status=sent (250 2.6.0  <20091227183402.EA54A37A1D@CLIENT> Queued mail for delivery)
Dec 27 18:34:04 SERVER postfix/qmgr[32069]: 1CBAF100008C: removed


BOUNCED EMAIL

Dec 27 18:30:03 SERVER postfix/smtpd[29854]: 946F8100008C: client=CLIENT[172.22.23.21]
Dec 27 18:30:03 SERVER postfix/cleanup[29863]: 946F8100008C: message-id=<20091227183002.AF64B37A1D@CLIENT>
Dec 27 18:30:03 SERVER postfix/qmgr[32069]: 946F8100008C: from=<USER1@DOMAIN.com>, size=874, nrcpt=1 (queue active)
Dec 27 18:30:04 SERVER postfix/local[30128]: 946F8100008C: to=<USER1@DOMAIN.com>, relay=local, delay=1, status=bounced (unknown user: "USER1")
Dec 27 18:30:04 SERVER postfix/qmgr[32069]: 946F8100008C: removed

EXAMPLE REJECTED EMAIL FOR UNKNOWN RECIPIENT by postfix/smtpd:

Dec 27 16:16:29 SERVER postfix/smtpd[24805]: NOQUEUE: reject: RCPT from CLIENT[172.22.23.21]: 550 <abc@DOMAIN.com>: Recipient address rejected: User unknown in local recipient table; from=<USER1@CLIENT> to=<abc@DOMAIN.com> proto=ESMTP helo=<CLIENT>

Kindly compare the postfix/local lines for the two emails. Sorry, I have disguised the client, server, domain and user names in the logs (all other portions are intact). If the 'USER1' user really were not found, postfix/smtpd should have rejected the recipient, but that is not the case. Thousands of emails get through for all the users, but once in a while postfix/local fails to find the user in the alias tables (as shown above).

-> Output of the "postconf -n" command:

alias_database = hash:/etc/postfix/aliases        hash:/etc/postfix/aliases.shared        hash:/etc/postfix/aliases.users        hash:/etc/postfix/aliases.lists hash:/etc/postfix/aliases.dllists
alias_maps = hash:/etc/postfix/aliases        hash:/etc/postfix/aliases.shared        hash:/etc/postfix/aliases.users        hash:/etc/postfix/aliases.lists hash:/etc/postfix/aliases.dllists
biff = no
canonical_maps = regexp:/etc/postfix/canonical
config_directory = /etc/postfix
daemon_directory = /usr/lib/postfix
html_directory = /usr/share/doc/packages/postfix/html
local_header_rewrite_clients = permit_mynetworks
local_recipient_maps = $alias_maps
mail_owner = postfix
mailq_path = /usr/bin/mailq
manpage_directory = /usr/share/man
masquerade_domains = !sub1.DOMAIN.com !sub2.DOMAIN.com !sub3.DOMAIN.com !sub4.DOMAIN.com !sub5.DOMAIN.com !sub6.DOMAIN.com !sub7.DOMAIN.com !sub8.DOMAIN.com DOMAIN.com
message_size_limit = 41943040
mydestination = $myhostname, $mydomain, sub1.DOMAIN.com, sub2.DOMAIN.com, localhost, localhost.localdomain
mydomain = DOMAIN.com
mynetworks = 127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, /etc/postfix/relay_from
mynetworks_style = subnet
myorigin = DOMAIN.com
newaliases_path = /usr/bin/newaliases
notify_classes = resource, software, policy
readme_directory = /usr/share/doc/packages/postfix/README_FILES
recipient_delimiter = +
relay_domains =
sample_directory = /usr/share/doc/packages/postfix/samples
sendmail_path = /usr/sbin/sendmail
setgid_group = maildrop
transport_maps = hash:/etc/postfix/transport

Thanks,
Satish

Re: possible problem with postfix/local??

Wietse Venema
When the recipient domain matches mydestination (or any IP address
literal that matches inet_interfaces or proxy_interfaces):

1) local(8) will accept the recipient ONLY if the username is
   found. It does not look for username@domain.

2) However, smtpd(8) will accept the recipient if either the
   username@domain or the username are found.

So, be sure that you don't have user@domain forms in $alias_maps.
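
The difference between the two checks can be illustrated with a toy model. This is only a sketch of the rule as stated above, not Postfix code: the function names and the example table are hypothetical, and details such as case folding and recipient_delimiter handling are ignored.

```python
def smtpd_accepts(recipient, table):
    """Toy model of the smtpd(8) check: full address or bare username may match."""
    user, _, _ = recipient.partition("@")
    return recipient in table or user in table


def local_accepts(recipient, table):
    """Toy model of the local(8) check: only the bare username is looked up."""
    user, _, _ = recipient.partition("@")
    return user in table


# Hypothetical table: one entry keyed by user@domain, one keyed by username.
table = {
    "user2@example.com": "user2@sub1.example.com",
    "user1": "user1@sub8.example.com",
}

for rcpt in ("user1@example.com", "user2@example.com"):
    print(rcpt, smtpd_accepts(rcpt, table), local_accepts(rcpt, table))
```

In this model the user@domain key is accepted by the first check and missed by the second, which is the mismatch to rule out.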

        Wietse

Re: possible problem with postfix/local??

satishkumarp2k1

> So, be sure that you don't have user@domain forms in $alias_maps.

Thanks. Every line in the alias files defined in $alias_maps is of the following form:

USER:   USER@subX.DOMAIN.com

We are not using the "user@domain" form for alias keys (just USER). I would like to stress again that the setup works for all users most of the time; the failures happen at random (around 5-10 emails out of roughly 70000*30 emails in a month get bounced).

My guess is that "local" might be unable to perform the lookups in the alias tables under heavy load, but I might be wrong. I need advice from the experts.

Re: possible problem with postfix/local??

Terry Carmen
In reply to this post by satishkumarp2k1
On 12/27/2009 11:28 PM, Satish Kumar P wrote:

> 1. "unknown user" (this is really strange: if the user were unknown,
> postfix/smtpd would have rejected the recipient during the SMTP session
> itself)
> 2. "mail forwarding loop" for [hidden email] (though we are pretty
> sure that the mail came to this server only once; I mean it was not
> looping between the servers)
>
> In all the cases we observed, postfix/local fails to find the entry in
> the alias tables. This server handles almost 70000 emails daily and works
> perfectly except for the nagging issue I mentioned above. A few details
> regarding our environment are as follows:

Is the alias table generated dynamically? Is it possible that it's not
readable (still being written) at the time the lookup happens?

Terry


Re: possible problem with postfix/local??

Victor Duchovni
On Mon, Dec 28, 2009 at 12:32:51PM -0500, Terry Carmen wrote:

> On 12/27/2009 11:28 PM, Satish Kumar P wrote:
>> 1. "unknown user" (this is really strange: if the user were unknown,
>> postfix/smtpd would have rejected the recipient during the SMTP session
>> itself)
>> 2. "mail forwarding loop" for [hidden email] (though we are pretty
>> sure that the mail came to this server only once; I mean it was not
>> looping between the servers)
>>
>> In all the cases we observed, postfix/local fails to find the entry in
>> the alias tables. This server handles almost 70000 emails daily and works
>> perfectly except for the nagging issue I mentioned above. A few details
>> regarding our environment are as follows:
>
> Is the alias table generated dynamically? Is it possible that it's not
> readable (still being written) at the time the lookup happens?

Not much point speculating without logs. It should also be noted that
the most frequent explanation for "disappearing" local users is lookup
timeouts in remote nsswitch.conf mechanisms, and the C-library lying
by returning "no such user".

--
        Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
<mailto:[hidden email]?body=unsubscribe%20postfix-users>

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.

Re: possible problem with postfix/local??

Wietse Venema
In reply to this post by satishkumarp2k1
satishkumarp2k1:

>
>
> > So, be sure that you don't have user@domain forms in $alias_maps.
>
> Thanks. Every line in the alias files defined in $alias_maps is of the
> following form:
>
> USER:   [hidden email]
>
> We are not using the "user@domain" form for alias keys (just USER). I
> would like to stress again that the setup works for all users most of
> the time; the failures happen at random (around 5-10 emails out of
> roughly 70000*30 emails in a month get bounced).
>
> My guess is that "local" might be unable to perform the lookups in the
> alias tables under heavy load, but I might be wrong. I need advice from
> the experts.

According to the main.cf information in the problem report,
local_recipient_maps=$alias_maps and all those maps are local files,
so that would exclude the usual nsswitch foul-ups.

There are no confirmed reports on this list that Postfix "forgets"
users in local alias database files. For this reason I will assume
with confidence that you have some buggy database library.

Postfix uses the same system library routines to access alias_maps
in the smtpd(8) and local(8) programs. If smtpd(8) finds users that
local(8) does not find, then I suggest that you consider using a
more robust database implementation.

        Wietse

Re: possible problem with postfix/local??

/dev/rob0
On Mon, Dec 28, 2009 at 05:56:48PM -0500, Wietse Venema wrote:
> satishkumarp2k1:
...
> According to the main.cf information in the problem report,
> local_recipient_maps=$alias_maps and all those maps are local
> files, so that would exclude the usual nsswitch foul-ups.
>
> There are no confirmed reports on this list that Postfix "forgets"
> users in local alias database files. For this reason I will assume
> with confidence that you have some buggy database library.

Just a suggestion, this sounds like a good case for a Makefile to
compile the multiple source files into a single DB?
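
As a rough illustration of the "single source" idea, the alias source files could be merged before building one table. This is a sketch in Python rather than a Makefile. The function name and file names are hypothetical, keys are lowercased as an assumption about how lookups are folded, and on a duplicate key the first definition is kept, mirroring the order in which alias_maps lists the tables.

```python
def merge_alias_sources(sources):
    """Merge alias source texts into one list of "key: value" lines.

    sources maps a file name to that file's text, in search order.
    Returns (merged_lines, duplicates); each duplicate is a tuple of
    (key, first_file, later_file).
    """
    merged = {}      # key -> value, in first-seen order
    origin = {}      # key -> file that first defined it
    duplicates = []
    for name, text in sources.items():
        current = None  # key that a continuation line would extend
        for raw in text.splitlines():
            # Blank lines and comment lines are ignored.
            if not raw.strip() or raw.lstrip().startswith("#"):
                continue
            # A line starting with whitespace continues the previous entry.
            if raw[0].isspace():
                if current is not None:
                    merged[current] += " " + raw.strip()
                continue
            key, sep, value = raw.partition(":")
            if not sep:
                current = None  # not a "key: value" line; skip it
                continue
            key = key.strip().lower()
            if key in merged:
                duplicates.append((key, origin[key], name))
                current = None  # drop the later definition and its continuations
                continue
            merged[key] = value.strip()
            origin[key] = name
            current = key
    return [f"{k}: {v}" for k, v in merged.items()], duplicates
```

The merged text would then be written to a single file and indexed once, so every lookup goes through one table instead of five.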

Another suggestion: try adding CDB support and using that; see
CDB_README.html.
--
    Offlist mail to this address is discarded unless
    "/dev/rob0" or "not-spam" is in Subject: header

Re: possible problem with postfix/local??

satishkumarp2k1
In reply to this post by Terry Carmen

Terry Carmen wrote:
> Is the alias table generated dynamically? Is it possible that it's not
> readable (still being written) at the time the lookup happens?

Yes, correct. All the alias files are generated using Perl scripts, which run periodically. The scripts write temporary alias files while generating the aliases and then use the "mv" command to move them over the actual alias files. Do you still think a lookup might fail even in this case?

Re: possible problem with postfix/local??

Stan Hoeppner
satishkumarp2k1 put forth on 12/28/2009 9:29 PM:

> Yes, correct. All the alias files are generated using Perl scripts, which
> run periodically. The scripts write temporary alias files while generating
> the aliases and then use the "mv" command to move them over the actual
> alias files. Do you still think a lookup might fail even in this case?

How big is the alias file and how busy is this server?  If the answers are big
and busy, then the chances of querying while the file is locked for write access
are higher.  Check your logs and see if this is the case.  Compare the script run
timestamps with the error timestamps.  If the times match, you've probably found
the cause.
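
That comparison could be scripted. A minimal sketch, assuming the log line format shown earlier in this thread; the function names, the list of rebuild times, and the 60-second window are hypothetical, and the year must be supplied because syslog timestamps do not carry one.

```python
import re
from datetime import datetime, timedelta

# Matches postfix/local delivery lines with status=bounced, in the
# format of the log excerpts posted earlier in this thread.
LOCAL_BOUNCE = re.compile(
    r'^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ postfix/local\[\d+\]: '
    r'(?P<qid>[0-9A-F]+): to=<(?P<rcpt>[^>]*)>.*status=bounced \((?P<reason>.*)\)$'
)


def parse_stamp(stamp, year):
    """Turn a syslog timestamp such as 'Dec 27 18:30:04' into a datetime."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def bounces_near_rebuilds(log_lines, rebuild_times, year, window_seconds=60):
    """Return (queue_id, bounce_time, rebuild_time) for each local(8)
    bounce logged within window_seconds of an alias rebuild."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in log_lines:
        match = LOCAL_BOUNCE.match(line.strip())
        if not match:
            continue
        when = parse_stamp(match.group("stamp"), year)
        for rebuilt in rebuild_times:
            if abs(when - rebuilt) <= window:
                hits.append((match.group("qid"), when, rebuilt))
    return hits
```

The rebuild times would have to come from wherever the Perl scripts record their runs, for example cron logs. An empty result over a month of bounces would argue against the rebuild as the cause.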

A seriously ugly hack to get around this would be to stop postfix at the top of
your script and start postfix at the end of the script after the "mv", inserting
a few waits after the "mv" to make sure it completes before postfix starts
again.  Like I said, this is a very ugly solution and fraught with other
potential problems, but it should solve this immediate issue.

I'm guessing your best option moving forward would be to switch to a database
driven alias setup with something like mysql or postgresql.  That would pretty
much eliminate the possibility of the scenario you're currently running into.

--
Stan


Re: possible problem with postfix/local??

Victor Duchovni
On Mon, Dec 28, 2009 at 10:51:47PM -0600, Stan Hoeppner wrote:

> satishkumarp2k1 put forth on 12/28/2009 9:29 PM:
>
> > Yes, correct. All the alias files are generated using Perl scripts, which
> > run periodically. The scripts write temporary alias files while generating
> > the aliases and then use the "mv" command to move them over the actual
> > alias files. Do you still think a lookup might fail even in this case?

The "mv" is unsafe if it moves files across file systems. Perl or C code
that runs system("mv $old $new") to rename a file, instead of calling
rename(2) directly (perldoc -f rename), is written by programmers
who should not be trusted with system code.

> How big is the alias file and how busy is this server?

This is not relevant. In-place rename(2) is atomic, and unless dbm(3)
files are used instead of Berkeley DB, smtpd(8) will not fail to find
recipients when an old indexed table is replaced by a new table, and the
recipient is present in both.

Programmers who use system("mv ...") cannot be trusted to write
robust code to update critical system configuration files.
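
For illustration, a same-directory rename might look like the following. This is a sketch in Python rather than Perl, with a hypothetical function name. It creates the temporary file next to the target so the final step cannot cross a file system boundary, and it relies on os.replace, which performs a rename(2) on POSIX systems. For an indexed table the file to rename into place would be the built index rather than the text source; that step is left out here.

```python
import os
import tempfile


def replace_atomically(target_path, new_contents):
    """Write new_contents to a temporary file in the target's own
    directory, then rename it over target_path in a single step."""
    directory = os.path.dirname(os.path.abspath(target_path))
    # Same directory as the target, hence the same file system.
    # Note: mkstemp creates the file with mode 0600; adjust the
    # permissions if other users must be able to read the result.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_contents)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, target_path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Readers that open the target path see either the complete old file or the complete new file, never a partially written one.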

--
        Viktor.


Re: possible problem with postfix/local??

Wietse Venema
In reply to this post by satishkumarp2k1
satishkumarp2k1:

>
>
>
> > Is the alias table generated dynamically? Is it possible that it's not
> > readable (still being written) at the time the lookup happens?
>
> Yes, correct. All the alias files are generated using Perl scripts, which
> run periodically. The scripts write temporary alias files while generating
> the aliases and then use the "mv" command to move them over the actual
> alias files. Do you still think a lookup might fail even in this case?

The "rename into place" approach is generally safe (except with
moves across file system boundaries, or with DBM files
which come in pairs).

Renaming a database is definitely safer than overwriting.

However, if a user is added then there is a brief time during which some
Postfix program may still have a handle to the old database copy
that does not have that user. That amount of time is the time needed
to handle a mail delivery request.
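
The "handle to the old copy" effect can be reproduced with plain files. This is a sketch under stated assumptions: it uses ordinary text files rather than a Berkeley DB table, it needs a POSIX system, and the function and file names are hypothetical.

```python
import os
import tempfile


def read_via_old_handle_after_rename(old_text, new_text):
    """Open a file, rename a replacement over it, then read through the
    handle opened before the rename and through a fresh open.
    Returns (seen_by_old_handle, seen_by_new_open)."""
    with tempfile.TemporaryDirectory() as workdir:
        live = os.path.join(workdir, "table")
        staged = os.path.join(workdir, "table.new")
        with open(live, "w") as f:
            f.write(old_text)
        with open(staged, "w") as f:
            f.write(new_text)
        with open(live) as old_handle:
            # Swap the new file into place while the old one is still open.
            os.replace(staged, live)
            seen_by_old_handle = old_handle.read()
            with open(live) as fresh:
                seen_by_new_open = fresh.read()
        return seen_by_old_handle, seen_by_new_open


print(read_via_old_handle_after_rename("old table\n", "new table\n"))
```

The handle opened before the rename keeps reading the old contents, while a fresh open sees the new file. A process holding the old handle therefore cannot see a newly added user until it reopens the table.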

        Wietse

Re: possible problem with postfix/local??

satishkumarp2k1

Thanks a lot to everyone for the suggestions. A couple of questions:

1. I noticed that Postfix restarts the appropriate daemons/programs (smtpd/local) whenever it notices changes in the alias files. How does it determine that (based on the file's attributes, etc.)?

2. Does Postfix load the alias tables into memory? I am just trying to understand whether Postfix searches a memory-resident copy of the data or makes system calls to read the hash tables.

Thanks