KS2011: Structured error logging
Lennart Poettering and Kay Sievers got up to talk about their "journal" work. Their point is that syslog is getting quite old - it is essentially 1980's technology - and we need a better way to deal with messages. Output that is (only) human readable is not as useful as it could be.
Traditional syslog has a lot of problems. Messages are untrusted; they can come from anywhere. There are lot of different logging techniques. Parsing log files is a hard problem - tools that do so generally turn into "regex horrors." Kernel messages tend to have associated context which does not make it into the log file and must be reconstructed, unreliably, later on. Time zone information is missing, the log stream can be manipulated by an attacker, the use of UDP for network logging means that messages can be dropped, there is "no sane access control," and more. In response to all these problems, Lennart and Kay have been designing "the journal," which is intended to provide unified logging through the entire software stack. The user-space side of the journal exists now and is integrated with service management; the kernel code is not yet available.
The user-space code is able to augment kernel messages with associated context information; it can, for example, add the real device name that matches a message emitted from the kernel. Strings like "sda" make no sense when seen in isolation on today's dynamic systems, even one day later.
The most controversial part of the plan (so far as they were able to present it) was the idea of attaching a 128-bit randomized tag to each message. These tags would have no central management; developers are expected to simply generate a new tag when they create a new message. Tools could then be taught the meaning of specific tags and could quickly recognize relevant messages in the logs. Translators and documentation writers could attach information to specific tags. The tag database would be maintained on the net and run in user space; kernel developers would not have to care about the message tags at all.
Linus broke in to make the claim that UUIDs (which is essentially what these tags are) have never worked in the past; why, he asked, do people still try to use them? The IDs simply get copied around and reused; that will happen with kernel messages too, he predicted. Besides, kernel messages already have a random bit string attached to them - it's the string that is the message itself (or the format string from which it is created). Lennart's defense, which included the assertion that UUIDs are used successfully in the EFI BIOS interface, seemed to convince few people. From there, the conversation became rather loud and unstructured; those who have complained about the lack of contention in recent kernel summits should be less unhappy this time around.
Some of the pieces of a structured logging interface already exist, as Alan Cox pointed out; functions like dev_printk() and friends already encode associated device information in a set way. But Christoph Hellwig asserted that what users really care about is actions on objects; what should be done is to add a new interface catering to that need and leave printk() alone. As it happens, Lennart and Kay propose the addition of a journalk() logging function for just that purpose. Output from printk() would go to the journal, but structured messages with IDs would be created with journalk().
It was suggested that what we are really seeing here is a reinvention of tracepoints. Alan added that things have to be done carefully, since these messages, however they are created, become part of the ABI. He also noted that we have a good interface for logging now: netlink. All that is needed is to put everything together. It turns out that using netlink for the journal is part of the plan - but the code does not exist yet.
At the end of the session, Alan recommended that the developers not really even think about printk(); they should find a way to characterize what they really need, then come up with a mechanism to provide it. Kay and Lennart sat down without having had the time to present their full proposal; Kay promised that they would try again next year.
Next: Coming to love cgroups
Index entries for this article | |
---|---|
Kernel | Kernel messages |
Kernel | Messages |
Posted Oct 25, 2011 2:34 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
XML might be replaced by something better, but the principle stands.
Posted Oct 25, 2011 2:49 UTC (Tue)
by dlang (guest, #313)
[Link] (17 responses)
those who do not understand syslog are doomed to reinvent it, poorly
there are some real issues with syslog (the lack of any ability to know where the log message _really_ came from on a system for example), but most of the problems that are attributed to syslog are really failures in sane log formatting on the part of the application generating the log message, and what makes anyone thing that if you change to some new mechanism the application programmers will be consistent any more than they have in the past?
I am dealing with this at work as well where the application programmers declare syslog 'obsolete' and go on to create their own logging mechanism.
the end result after much effort?? someone finally writes a tool to get this custom log into syslog, and then things settle down.
Posted Oct 25, 2011 3:19 UTC (Tue)
by dlang (guest, #313)
[Link] (16 responses)
don't invent a new logging mechanism, just clean up the formatting of your logs
if the kernel could just adopt a rule along the lines of:
All new log messages would start at the beginning of a line, all continuations of a log message would start with whitespace
it would be a huge win.
this would be enough to let log daemons figure out what a complete log is, and from there log parsers can take it reasonably.
log parsers frequently devolve into regex hell, but they don't have to, and it doesn't require creating a new logging protocol to solve the problem.
Posted Oct 25, 2011 7:08 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (6 responses)
How should we parse it? Well, let's start with the date. It can be parsed by regexps, but it's already some amount of code.
Then there's user name. It can't be parsed by regexps at all (because of quoting, for example "Joe \"the mad\" User"). Then there's URL, which also can not be reliably parsed by regexps. And then finally the exit code which luckily is just a pre-defined string.
It's _really_ _really_ easy to make log unparseable accidentally. And given that quite a lot of log messages are printed only during exceptional/error conditions you might not discover it until it's too late.
So something which just CAN NOT be misused is sorely needed.
Posted Oct 25, 2011 7:50 UTC (Tue)
by dlang (guest, #313)
[Link]
the fact that your program outputs bogus logs now is a problem with your program, not with the logging protocol.
Posted Oct 25, 2011 7:56 UTC (Tue)
by liljencrantz (guest, #28458)
[Link] (4 responses)
Posted Oct 25, 2011 8:21 UTC (Tue)
by l0b0 (guest, #80670)
[Link] (3 responses)
Posted Oct 25, 2011 8:53 UTC (Tue)
by iq-0 (subscriber, #36655)
[Link] (2 responses)
Posted Oct 25, 2011 9:35 UTC (Tue)
by liljencrantz (guest, #28458)
[Link] (1 responses)
Posted Oct 25, 2011 10:02 UTC (Tue)
by nix (subscriber, #2304)
[Link]
Posted Oct 25, 2011 10:00 UTC (Tue)
by nix (subscriber, #2304)
[Link] (3 responses)
Posted Oct 25, 2011 16:34 UTC (Tue)
by dlang (guest, #313)
[Link] (2 responses)
however you need to note that the problematic formatting isn't done by syslog, but by the applications that are generating the log.
even in the old syslog RFC, the portions under the control of the syslog daemon are well defined and specified (although there is a LOT of stuff out there that violates these specs)
Posted Oct 26, 2011 14:01 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Oct 27, 2011 1:24 UTC (Thu)
by dlang (guest, #313)
[Link]
Posted Oct 25, 2011 16:53 UTC (Tue)
by erwbgy (subscriber, #4104)
[Link] (4 responses)
Posted Oct 25, 2011 17:04 UTC (Tue)
by dlang (guest, #313)
[Link] (2 responses)
Posted Oct 26, 2011 13:56 UTC (Wed)
by sorpigal (guest, #36106)
[Link] (1 responses)
Posted Oct 27, 2011 1:19 UTC (Thu)
by dlang (guest, #313)
[Link]
Posted Oct 26, 2011 6:56 UTC (Wed)
by sdalley (subscriber, #18550)
[Link]
Posted Oct 25, 2011 9:55 UTC (Tue)
by nix (subscriber, #2304)
[Link]
Posted Oct 25, 2011 19:21 UTC (Tue)
by raven667 (subscriber, #5198)
[Link] (3 responses)
Posted Oct 27, 2011 7:18 UTC (Thu)
by epierre (subscriber, #60841)
[Link]
Posted Oct 27, 2011 11:10 UTC (Thu)
by Felix.Braun (guest, #3032)
[Link] (1 responses)
Posted Nov 3, 2011 13:52 UTC (Thu)
by renox (guest, #23785)
[Link]
Totally wrong, just think about the use of Linux on desktops where each user is also a "sysadmin".
Posted Nov 1, 2011 5:04 UTC (Tue)
by jcm (subscriber, #18262)
[Link] (1 responses)
Posted Nov 1, 2011 14:32 UTC (Tue)
by nix (subscriber, #2304)
[Link]
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
============
[Sat 26 Oct 2011] "Joe User" requested lmpp://myserver.dc=some.dc=com/service , "SUCCESS" has been returned.
============
KS2011: Structured error logging
Minor nit:
You can easily parse quoted strings using something like:
KS2011: Structured error logging
"([^"]|\\.)*"
This will work since regexps choose the longest mathing string. Or am I missing something?
You also need to account for the fact that you might have an even or odd number of backslashes before the quote:
KS2011: Structured error logging
To fix it, we would need to check that any quotes are preceded by an *odd* number of backslashes:
echo '"foo \"bar\" baz"' | grep -E '"([^"]|\\.)*"' # Succeeds
echo '"foo \"bar\\" baz"' | grep -E '"([^"]|\\.)*"' # Ouch, that's a literal backslash, not an escaped quote!
Unfortunately this doesn't work with "([^"]|(?<=\\(\\\\)*)")*"
grep -P
("lookbehind assertion is not fixed length"). I don't know if any other regex engines support this.
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
Hmm. He does seem to have a serene disregard for the rule that sentences should start with a capital letter. On the other hand, he doesn't seem an up so floating many bells down kind of guy, at all ...
KS2011: Structured error logging
KS2011: Structured error logging
Parsing log files is a hard problem - tools that do so generally turn into [Kay Sievers and Lennart Poettering] "regex horrors."
This is why syslog-ng has had a proper parser, driven by a database of log message formats, for many years now. No regex horrors needed. It's really quite nice (and vastly underused).
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging
KS2011: Structured error logging