Message-id logging (include rfc822-comments?)

Victor Duchovni

When a message-id is followed by rfc822 comment text:

     Message-Id: <test@test> (test)

     2008-11-06T13:13:35-0500 amnesiac postfix/cleanup[10832]: AF24675A3D:
         message-id=<test@test> (test)

Postfix logs both the "id" and the "comment". This is perhaps more
"robust" in case the header is mangled and most of the unique data
is in the comment. On the other hand, for well-formed headers, the
comment is not part of the message-id. For example:

    2008-11-06T01:11:19-0500 amnesiac postfix/cleanup[13756]: AE620EF8001:
    message-id=<[hidden email]> (added by [hidden email])

Should Postfix make any effort to log the above message differently?

--
        Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

To unsubscribe from the postfix-users list, visit
http://www.postfix.org/lists.html or click the link below:
<mailto:[hidden email]?body=unsubscribe%20postfix-users>

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.

Re: Message-id logging (include rfc822-comments?)

Wietse Venema
Victor Duchovni:

>
> When a message-id is followed by rfc822 comment text:
>
>      Message-Id: <test@test> (test)
>
>      2008-11-06T13:13:35-0500 amnesiac postfix/cleanup[10832]: AF24675A3D:
>          message-id=<test@test> (test)
>
> Postfix logs both the "id" and the "comment". This is perhaps more
> "robust" in case the header is mangled and most of the unique data
> is in the comment.

Indeed, the current implementation is conservative; it does not
"lose" information in the event of malformed content (it does,
however, neutralize non-printable characters before logging).
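
To illustrate the neutralizing step mentioned above, here is a minimal
Python sketch. It is not Postfix's C code: the function name `neutralize`
and the choice of "?" as the replacement character are assumptions made
for this example only.

```python
def neutralize(value: str, replacement: str = "?") -> str:
    """Replace every character outside printable ASCII with a placeholder."""
    # Printable ASCII runs from space (0x20) to tilde (0x7e).
    return "".join(ch if " " <= ch <= "~" else replacement for ch in value)


# A tab and a BEL character are replaced; the rest is logged unchanged.
print(neutralize("<test@test>\t(te\x07st)"))
```

The point of the design is that nothing is dropped: each unsafe character
is replaced one for one, so the logged value keeps the length and shape of
the original header.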

> On the other hand, for well-formed headers, the
> comment is not part of the message-id. For example:
>
>     2008-11-06T01:11:19-0500 amnesiac postfix/cleanup[13756]: AE620EF8001:
>     message-id=<[hidden email]> (added by [hidden email])
>
> Should Postfix make any effort to log the above message differently?

How would one decide that a (message-id) header is not mangled?
This would require parsing the string, counting the "address"
tokens, and if there is only one "address" token, use that as the
logged message ID, otherwise log the entire original string.
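
The decision rule described above can be sketched in Python as follows.
This is only an illustration, not Postfix code: the pattern for an
"address" token is deliberately strict, comments are removed with a
simple regular expression that ignores backslash escapes, and the name
`loggable_message_id` is hypothetical.

```python
import re

# Assumptions for this sketch: a strict "address" token of the form
# <left@right> without whitespace, and comments without backslash escapes.
ADDRESS_TOKEN = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")
COMMENT = re.compile(r"\([^()]*\)")


def loggable_message_id(value: str) -> str:
    """Return the lone address token, or the whole value if the count is not one."""
    text = value
    while COMMENT.search(text):
        # Remove innermost comments first so that nested ones disappear too.
        text = COMMENT.sub("", text)
    tokens = ADDRESS_TOKEN.findall(text)
    return tokens[0] if len(tokens) == 1 else value.strip()


for header in ("<test@test> (test)", "<2008-11-07 10:43:57 TheSystem@>"):
    print(repr(loggable_message_id(header)))
```

With a pattern this strict, anything that does not look like exactly one
clean token falls through to the conservative branch and is logged whole.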

But I wonder if it is really worth the trouble.

        Wietse

Re: Message-id logging (include rfc822-comments?)

Victor Duchovni
On Thu, Nov 06, 2008 at 04:38:41PM -0500, Wietse Venema wrote:

> >      Message-Id: <test@test> (test)
> >
> >      2008-11-06T13:13:35-0500 amnesiac postfix/cleanup[10832]: AF24675A3D:
> >          message-id=<test@test> (test)
> >
> > Postfix logs both the "id" and the "comment". This is perhaps more
> > "robust" in case the header is mangled and most of the unique data
> > is in the comment.
>
> Indeed, the current implementation is conservative; it does not
> "lose" information in the event of malformed content (it does,
> however, neutralize non-printable characters before logging).

I've seen Sendmail delete Message-Ids it believes to be malformed and
generate a new id. I did not look closely enough to determine what it
considered invalid.

> > On the other hand, for well-formed headers, the
> > comment is not part of the message-id. For example:
> >
> >     2008-11-06T01:11:19-0500 amnesiac postfix/cleanup[13756]: AE620EF8001:
> >     message-id=<[hidden email]> (added by [hidden email])
> >
> > Should Postfix make any effort to log the above message differently?
>
> How would one decide that a (message-id) header is not mangled?
> This would require parsing the string, counting the "address"
> tokens, and if there is only one "address" token, use that as the
> logged message ID, otherwise log the entire original string.

Real-life examples include:

Message-Id: News_03/11/2008 16:11:15_PR Newswire Brasil<[hidden email]>
Message-ID: <42M0XSEC17ENNJN27.1103.753798 @lowbehold.com>
Message-ID: <2008-11-07 10:43:57 TheSystem@>
Message-ID: <9e05ac0428f1df4dd4541888b64b73c1@Peek & Cloppenburg Website>
Message-Id: <[hidden email] >

So the "address" token parser would have to be fairly "liberal".

> But I wonder if it is really worth the trouble.

I was thinking that we could just trim comments.
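
As a rough idea of what "just trim comments" could mean, here is a small
Python sketch. It assumes RFC 822 style comments (parenthesized, possibly
nested, with backslash quoted pairs) and does not treat quoted strings
specially; the function name `trim_comments` is made up for this example
and does not correspond to any Postfix routine.

```python
def trim_comments(value: str) -> str:
    """Drop RFC 822 style (comments), including nested ones, from a header value."""
    kept = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # A quoted pair such as "\(" is literal text, not a delimiter.
            if depth == 0:
                kept.append(value[i:i + 2])
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
        i += 1
    return "".join(kept).strip()


print(trim_comments("<test@test> (test)"))
```

Note that a sketch like this also removes parenthesized text that sits
inside a malformed id, so it is not automatically safe for mangled
headers.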

--
        Viktor.


Re: Message-id logging (include rfc822-comments?)

Wietse Venema
Victor Duchovni:

> > > On the other hand, for well-formed headers, the
> > > comment is not part of the message-id. For example:
> > >
> > >     2008-11-06T01:11:19-0500 amnesiac postfix/cleanup[13756]: AE620EF8001:
> > >     message-id=<[hidden email]> (added by [hidden email])
> > >
> > > Should Postfix make any effort to log the above message differently?
> >
> > How would one decide that a (message-id) header is not mangled?
> > This would require parsing the string, counting the "address"
> > tokens, and if there is only one "address" token, use that as the
> > logged message ID, otherwise log the entire original string.
>
> Real-life examples include:
>
> Message-Id: News_03/11/2008 16:11:15_PR Newswire Brasil<[hidden email]>
> Message-ID: <42M0XSEC17ENNJN27.1103.753798 @lowbehold.com>
> Message-ID: <2008-11-07 10:43:57 TheSystem@>
> Message-ID: <9e05ac0428f1df4dd4541888b64b73c1@Peek & Cloppenburg Website>
> Message-Id: <[hidden email] >
>
> So the "address" token parser would have to be fairly "liberal".

I'm not sure if cosmetic concerns about Message-ID logging alone
would justify the implementation of another RFC822 parser.

The existing code is already as liberal as it gets; unlike a
compiler, it doesn't throw away "unexpected" tokens. But it also
doesn't represent whitespace between tokens, so it can't un-parse
three of the above examples.
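
The whitespace problem can be shown with a toy tokenizer in Python. This
is an illustration only and not the Postfix scanner: the token pattern,
the function names, and the choice to un-parse by plain concatenation
are all assumptions made for the example.

```python
import re

# Toy tokenizer: "<", ">" and "@" are tokens of their own, any other run of
# non-whitespace characters is one token, and whitespace is not recorded.
TOKEN = re.compile(r"[<>@]|[^<>@\s]+")


def tokenize(value: str) -> list:
    return TOKEN.findall(value)


def unparse(tokens: list) -> str:
    # Without whitespace information the tokens can only be glued together.
    return "".join(tokens)


def round_trips(value: str) -> bool:
    """True if tokenizing and un-parsing reproduces the original string."""
    return unparse(tokenize(value)) == value


for header in ("<test@test>", "<42M0XSEC17ENNJN27.1103.753798 @lowbehold.com>"):
    print(header, round_trips(header))
```

A well-formed id survives the round trip, while an id with embedded
whitespace comes back as a different string.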

        Wietse

> > But I wonder if it is really worth the trouble.
>
> I was thinking that we could just trim comments.

Re: Message-id logging (include rfc822-comments?)

Victor Duchovni
On Fri, Nov 07, 2008 at 04:16:02PM -0500, Wietse Venema wrote:

> Victor Duchovni:
> > > > On the other hand, for well-formed headers, the
> > > > comment is not part of the message-id. For example:
> > > >
> > > >     2008-11-06T01:11:19-0500 amnesiac postfix/cleanup[13756]: AE620EF8001:
> > > >     message-id=<[hidden email]> (added by [hidden email])
> > > >
> > > > Should Postfix make any effort to log the above message differently?
> > >
> > > How would one decide that a (message-id) header is not mangled?
> > > This would require parsing the string, counting the "address"
> > > tokens, and if there is only one "address" token, use that as the
> > > logged message ID, otherwise log the entire original string.
> >
> > Real-life examples include:
> >
> > Message-Id: News_03/11/2008 16:11:15_PR Newswire Brasil<[hidden email]>
> > Message-ID: <42M0XSEC17ENNJN27.1103.753798 @lowbehold.com>
> > Message-ID: <2008-11-07 10:43:57 TheSystem@>
> > Message-ID: <9e05ac0428f1df4dd4541888b64b73c1@Peek & Cloppenburg Website>
> > Message-Id: <[hidden email] >
> >
> > So the "address" token parser would have to be fairly "liberal".
>
> I'm not sure if cosmetic concerns about Message-ID logging alone
> would justify the implementation of another RFC822 parser.

The concerns are not entirely cosmetic, as some folks are contemplating
pulling logs into structured databases, and indexing on message-id,
queue-id, and so on. Do we want the log parsers to parse the raw header
value, or should we try to "help" by trimming comments, leaving just
the "real" message-id?

--
        Viktor.


Re: Message-id logging (include rfc822-comments?)

Wietse Venema
Victor Duchovni:

> > > > How would one decide that a (message-id) header is not mangled?
> > > > This would require parsing the string, counting the "address"
> > > > tokens, and if there is only one "address" token, use that as the
> > > > logged message ID, otherwise log the entire original string.
> > >
> > > Real-life examples include:
> > >
> > > Message-Id: News_03/11/2008 16:11:15_PR Newswire Brasil<[hidden email]>
> > > Message-ID: <42M0XSEC17ENNJN27.1103.753798 @lowbehold.com>
> > > Message-ID: <2008-11-07 10:43:57 TheSystem@>
> > > Message-ID: <9e05ac0428f1df4dd4541888b64b73c1@Peek & Cloppenburg Website>
> > > Message-Id: <[hidden email] >
> > >
> > > So the "address" token parser would have to be fairly "liberal".
> >
> > I'm not sure if cosmetic concerns about Message-ID logging alone
> > would justify the implementation of another RFC822 parser.
>
> The concerns are not entirely cosmetic, as some folks are contemplating
> pulling logs into structured databases, and indexing on message-id,
> queue-id, and so on. Do we want the log parsers to parse the raw header
> value, or should we try to "help" by trimming comments, leaving just
> the "real" message-id?

Even with comments removed, your logfile processor would still need
to use heuristics for dealing with malformed Message-ID strings.
Compared to such heuristics, stripping off the (text) seems trivial.
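
For a sense of what such a logfile processor might look like, here is a
hedged Python sketch. The log line layout is taken from the examples in
this thread (joined onto one line); the hexadecimal queue ID pattern, the
"first <...> span" heuristic, and the name `index_key` are assumptions
for illustration, not a description of any existing tool.

```python
import re

# Assumed layout, based on the sample lines in this thread:
#   ... postfix/cleanup[PID]: QUEUEID: message-id=VALUE
LOG_LINE = re.compile(
    r"postfix/cleanup\[\d+\]: (?P<queue_id>[0-9A-F]+): message-id=(?P<raw>.*)$"
)
BRACKETED = re.compile(r"<[^<>]*>")


def index_key(line: str):
    """Return (queue_id, key) for a cleanup log line, or None if it does not match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    raw = match.group("raw").strip()
    bracketed = BRACKETED.search(raw)
    # Heuristic: prefer the first <...> span, fall back to the raw value.
    key = bracketed.group(0) if bracketed else raw
    return match.group("queue_id"), key


sample = (
    "2008-11-06T13:13:35-0500 amnesiac postfix/cleanup[10832]: "
    "AF24675A3D: message-id=<test@test> (test)"
)
print(index_key(sample))
```

A processor along these lines needs the fallback branch whether or not
the comment has already been removed from the logged value.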

I definitely don't want yet another RFC822 parser just for the
purpose of Message-ID logging.  So, it would have to be done with
the existing RFC822 scanner/unparser, which does not preserve
whitespace that isn't supposed to be there.

        Wietse