Delays in writing to INBOX

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Delays in writing to INBOX

Durga Prasad Malyala
Hello all,
I am seeing consistent delays in writing to disk (my System redhat 7.2
 using GFS2 file system cluster)

May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
IIt4Ejji61o3LgAAuUaIWw Saved)

during major bursts of receiving mailq delays goes upto 600+ also.

GFS2 writing is OK. Not as great as XFS but OK. Disk bottleneck is
ruled out as it is a 3PAR SAN with 16 Gbps FC.

ocassionally we get  "timed out while sending end of data -- message
may be sent more than once)" message also and Message goes to deferred
queue.

Thanks
DP
Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Mike Guelfi
Sounds like GFS2 operating normally.

Do you have any metrics on for the performance of the SAN during these events?

Quoting Durga Prasad Malyala <[hidden email]>:

> Hello all,
> I am seeing consistent delays in writing to disk (my System redhat 7.2
>  using GFS2 file system cluster)
>
> May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
> delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
> IIt4Ejji61o3LgAAuUaIWw Saved)
>
> during major bursts of receiving mailq delays goes upto 600+ also.
>
> GFS2 writing is OK. Not as great as XFS but OK. Disk bottleneck is
> ruled out as it is a 3PAR SAN with 16 Gbps FC.
>
> ocassionally we get  "timed out while sending end of data -- message
> may be sent more than once)" message also and Message goes to deferred
> queue.
>
> Thanks
> DP


Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Bastian Blank-3
In reply to this post by Durga Prasad Malyala
On Fri, May 04, 2018 at 10:11:39AM +0530, Durga Prasad Malyala wrote:
> I am seeing consistent delays in writing to disk (my System redhat 7.2
>  using GFS2 file system cluster)

I think 7.5 is current.  Don't you want to upgrade?

The Red Hat support you pay for covers GFS2, Postfix and Dovecot.  What
did they tell you?

> May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
> delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
> IIt4Ejji61o3LgAAuUaIWw Saved)
>
> during major bursts of receiving mailq delays goes upto 600+ also.

I don't think Dovecot is part of Postfix.  Do you?

> GFS2 writing is OK. Not as great as XFS but OK. Disk bottleneck is
> ruled out as it is a 3PAR SAN with 16 Gbps FC.

How did you check that writing is OK?  What did your load tests show as
maximum possible mail throughput?  Mail needs a lot of fsync, which in
fact is sensitive to latency.

> ocassionally we get  "timed out while sending end of data -- message
> may be sent more than once)" message also and Message goes to deferred
> queue.

Well, check your Dovecot log.  Postfix is just the messenger.

Bastian

--
Vulcans never bluff.
                -- Spock, "The Doomsday Machine", stardate 4202.1
Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Wietse Venema
In reply to this post by Durga Prasad Malyala
Durga Prasad Malyala:
> Hello all,
> I am seeing consistent delays in writing to disk (my System redhat 7.2
>  using GFS2 file system cluster)
>
> May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
> delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
> IIt4Ejji61o3LgAAuUaIWw Saved)

The 0.02/0/0/50 means that Postfix spends 50s waiting for DOVECOT
to accept the mail.

Look in your DOVECOT logs for warnings or errors.

The format of the "delays=a/b/c/d" logging is as follows:

    a = time from message arrival to last active queue entry
    b = time from last active queue entry to connection setup
    c = time in connection setup, including DNS, EHLO and STARTTLS
    d = time in message transmission

http://www.postfix.org/postconf.5.html#delay_logging_resolution_limit

        Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Durga Prasad Malyala
Thank you for your reply Wietse,
You are my Yoda Jedi Master and I have greatest regard for you - Sincerely.

Ive seen the above link earlier too and am at my wits end on this. I
was wondering if there is any issue in LMTP handing mail over to
dovecot.
I've mailed this to the Dovecot group as well but no-one is
volunteering to get involved.
This mail is to get responses if anyone has faced similar issues
earlier. Since Dovecot is very common and maybe someone can give
pointers.

Rgds/DP

On 4 May 2018 at 16:54, Wietse Venema <[hidden email]> wrote:

> Durga Prasad Malyala:
>> Hello all,
>> I am seeing consistent delays in writing to disk (my System redhat 7.2
>>  using GFS2 file system cluster)
>>
>> May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
>> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
>> delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
>> IIt4Ejji61o3LgAAuUaIWw Saved)
>
> The 0.02/0/0/50 means that Postfix spends 50s waiting for DOVECOT
> to accept the mail.
>
> Look in your DOVECOT logs for warnings or errors.
>
> The format of the "delays=a/b/c/d" logging is as follows:
>
>     a = time from message arrival to last active queue entry
>     b = time from last active queue entry to connection setup
>     c = time in connection setup, including DNS, EHLO and STARTTLS
>     d = time in message transmission
>
> http://www.postfix.org/postconf.5.html#delay_logging_resolution_limit
>
>         Wietse
Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Wietse Venema
Durga Prasad Malyala:
> Thank you for your reply Wietse,
> You are my Yoda Jedi Master and I have greatest regard for you - Sincerely.
>
> Ive seen the above link earlier too and am at my wits end on this. I
> was wondering if there is any issue in LMTP handing mail over to
> dovecot.

I suppose that Postfix is overwhelming Dovecot, or that Dovecot is
overwhelming the file system, or both.

Have you looked at Dovecot logs? If not, then you may be missing
important information.

Have you looked at file system performance indicators? If not, then
you may be missing important information.

No-one can help you you aren't willing to do some of the work.
Although Postfix and Dovecot are common, you are unlikely to fix
this with some cut and paste.

You can configure Postfix to make fewer Dovecot connections, for
example by reducing in master.cf the number of processes that can
talk to Dovecot (this change requires "postfix reload"). This will
slow things down a bit and move the bottleneck to the Postfix queue.

You really need to make sure that Dovecot is not logging errors,
and that your file system is performing properly.

        Wietse

> I've mailed this to the Dovecot group as well but no-one is
> volunteering to get involved.
> This mail is to get responses if anyone has faced similar issues
> earlier. Since Dovecot is very common and maybe someone can give
> pointers.
>
> Rgds/DP
>
> On 4 May 2018 at 16:54, Wietse Venema <[hidden email]> wrote:
> > Durga Prasad Malyala:
> >> Hello all,
> >> I am seeing consistent delays in writing to disk (my System redhat 7.2
> >>  using GFS2 file system cluster)
> >>
> >> May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
> >> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
> >> delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
> >> IIt4Ejji61o3LgAAuUaIWw Saved)
> >
> > The 0.02/0/0/50 means that Postfix spends 50s waiting for DOVECOT
> > to accept the mail.
> >
> > Look in your DOVECOT logs for warnings or errors.
> >
> > The format of the "delays=a/b/c/d" logging is as follows:
> >
> >     a = time from message arrival to last active queue entry
> >     b = time from last active queue entry to connection setup
> >     c = time in connection setup, including DNS, EHLO and STARTTLS
> >     d = time in message transmission
> >
> > http://www.postfix.org/postconf.5.html#delay_logging_resolution_limit
> >
> >         Wietse
>
Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Durga Prasad Malyala
Dear Wietse,
I understand your point. I am doing the checking on all of these.
iostat -xz shows nearly zero queue. w_await is in the low single
digits. In maillog - I am getting the following errors.

May  4 21:48:26 mail1 postfix/lmtp[32281]: 2830940E7A0B:
to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp],
conn_use=4, delay=4688, delays=3184/904/0/600, dsn=4.4.2,
status=deferred (conversation with mail.xyz.com[private/dovecot-lmtp]
timed out while sending end of data -- message may be sent more than
once)
This after I've moved some mails from hold queue to active queue.
Im using GFS2 as the file system.

Thanks for any help
DP



On 4 May 2018 at 20:46, Wietse Venema <[hidden email]> wrote:

> Durga Prasad Malyala:
>> Thank you for your reply Wietse,
>> You are my Yoda Jedi Master and I have greatest regard for you - Sincerely.
>>
>> Ive seen the above link earlier too and am at my wits end on this. I
>> was wondering if there is any issue in LMTP handing mail over to
>> dovecot.
>
> I suppose that Postfix is overwhelming Dovecot, or that Dovecot is
> overwhelming the file system, or both.
>
> Have you looked at Dovecot logs? If not, then you may be missing
> important information.
>
> Have you looked at file system performance indicators? If not, then
> you may be missing important information.
>
> No-one can help you you aren't willing to do some of the work.
> Although Postfix and Dovecot are common, you are unlikely to fix
> this with some cut and paste.
>
> You can configure Postfix to make fewer Dovecot connections, for
> example by reducing in master.cf the number of processes that can
> talk to Dovecot (this change requires "postfix reload"). This will
> slow things down a bit and move the bottleneck to the Postfix queue.
>
> You really need to make sure that Dovecot is not logging errors,
> and that your file system is performing properly.
>
>         Wietse
>
>> I've mailed this to the Dovecot group as well but no-one is
>> volunteering to get involved.
>> This mail is to get responses if anyone has faced similar issues
>> earlier. Since Dovecot is very common and maybe someone can give
>> pointers.
>>
>> Rgds/DP
>>
>> On 4 May 2018 at 16:54, Wietse Venema <[hidden email]> wrote:
>> > Durga Prasad Malyala:
>> >> Hello all,
>> >> I am seeing consistent delays in writing to disk (my System redhat 7.2
>> >>  using GFS2 file system cluster)
>> >>
>> >> May  4 10:03:34 mail1 postfix/lmtp[11662]: E4EB75048C19:
>> >> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp], delay=50,
>> >> delays=0.02/0/0/50, dsn=2.0.0, status=sent (250 2.0.0 <[hidden email]>
>> >> IIt4Ejji61o3LgAAuUaIWw Saved)
>> >
>> > The 0.02/0/0/50 means that Postfix spends 50s waiting for DOVECOT
>> > to accept the mail.
>> >
>> > Look in your DOVECOT logs for warnings or errors.
>> >
>> > The format of the "delays=a/b/c/d" logging is as follows:
>> >
>> >     a = time from message arrival to last active queue entry
>> >     b = time from last active queue entry to connection setup
>> >     c = time in connection setup, including DNS, EHLO and STARTTLS
>> >     d = time in message transmission
>> >
>> > http://www.postfix.org/postconf.5.html#delay_logging_resolution_limit
>> >
>> >         Wietse
>>
Reply | Threaded
Open this post in threaded view
|

Re: Delays in writing to INBOX

Wietse Venema
Durga Prasad Malyala:

> Dear Wietse,
> I understand your point. I am doing the checking on all of these.
> iostat -xz shows nearly zero queue. w_await is in the low single
> digits. In maillog - I am getting the following errors.
>
> May  4 21:48:26 mail1 postfix/lmtp[32281]: 2830940E7A0B:
> to=<[hidden email]>, relay=mail.xyz.com[private/dovecot-lmtp],
> conn_use=4, delay=4688, delays=3184/904/0/600, dsn=4.4.2,
> status=deferred (conversation with mail.xyz.com[private/dovecot-lmtp]
> timed out while sending end of data -- message may be sent more than
> once)

Postfix tells you that Dovecot is slow. We already know that.

You need to find out why Dovecot is slow. You won't find that
by looking at Postfix logs.

        Wietse