On Thu, Sep 20, 2012 at 08:23:59PM +0200, Miloslav Trmač wrote: > This patch set adds structured syslog support to libvirt. > > In short, the idea is to embed JSON in a syslog record. This format > is used in a new log output type called "syslog/json". > > Example "syslog/json" record (line-wrapped for readability): > > Sep 20 19:59:40 kulicka libvirt: @cee: {"msg":"libvirt version: 0.10.1", > "category":"../../src/test/test_driver.c","priority":"info", > "funcname":"virLogVMessage","line":769, > "timestamp":"2012-09-20 17:59:40.853+0000"} > > compared with existing "syslog" record format: > > Sep 20 20:02:24 kulicka libvirt: 5896: info : libvirt version: 0.10.1 > > > This JSON embedding format is defined by the Lumberjack project led > by the major Open Source syslog implementations > (https://fedorahosted.org/lumberjack/). The format is also a base > for a future MITRE CEE standard. Some of the fields are specified > at https://fedorahosted.org/lumberjack/wiki/FieldList , but applications > are free to use whatever fields make sense for them. > > The JSON structure makes easy to process logs by programs without writing > code that understands the specific format used by each application, > or using unreliable regexps (https://fedorahosted.org/ceelog/ is a > companion project that allows processing such logs). > > > Current systems will store and these records unmodified and present them > to users unmodified, but tools (and probably /var/log/messages) will > be able to show only the "msg" field; this allows us to add more detailed > information without worsening the user experience (the example above > already shows added information that is currently being thrown away when > logging to syslog). While I agree with the effort to try to standardize on metadata to be collected with log messages, I can't say I like the Lumberjack design, or this proposed usage in libvirt. First it is not possible for libvirt to detect whether messages it sends to syslog() are going to be processed by a syslog daemon which actually understands the JSON format. This means we can't turn on use of this JSON format automatically, without causing all existing distros to get their logs filled with JSON data. I also don't like the use of JSON for this purpose in the first place. Application logging facilities really need to be simple, reliable, and safe to use in any context. For example in between a fork() and execve() where POSIX requires only use of async signal safe functions. Now I know that technically speaking syslog() itself is not async signal safe in general, but the way libvirt uses it for logging makes it safe(ish) in our case. We can not, however, imagine that JSON (or really any general JSON library) is going to be async signal safe. Note, I don't care what data format is used server side for storing the logs - if syslogd wants to use JSON format, that's fine. Pushing this format into client applications is what I dislike. I can see the appeal of JSON in that it allows applications to send arbitrarily complex nested data structures into the logs, but that kind of generality for what is IMHO niche usage, really compromises the common case. What makes it worse is that the standard field list quoted above actually demonstrates that you don't actually even need the nesting in many cases, because it shows use of {"user!name":"bob"} as an alternative to {"user":{"name":"bob"}} This patch series goes to great lengths to introduce a whole series of new JSON APIs in libvirt, to avoid memory allocation from the heap, but this work is negated by the fact that the YAJL APIs used at the end all allocate from the heap anyway. Finally alot of the metadata items listed in the "standard fields" doc above would be better filled in by the server (from SCM_RIGHTS), than the client, so that the server can prevent an application from lying about its uid/gid/pid etc. By way of comparison the systemd journal APIs deal with all theses issues. The actual journald client side logging API is possible to implement and use without doing any memory allocation at all, in a completely async signal safe manner, using only socket() + sendmsg() with a stack allocated iovec[] for storing individual metadata fields to be sent. The API is soooo simple to use too, for example, a traditional syslog() call, can be replaced with something like: sd_journal_send("MESSAGE=%s", str, "PRIORITY=%d", prio, "CODE_LINE=%lld", linenr, "CODE_FUNC=%s", funcname, ...arbirary other key=value metadata... NULL); The journald communication uses a new UNIX socket, so apps can automatically determine whether journald is available and if not, fallback to plain syslog for old systems. It uses SCM_RIGHTS to automatically fill in a number of key fields in an unspoofable way. The way the data is transferred using sendmsg + iovec means there is no need for client applications to do any kind of magic escaping of special characters in strings. The journald log processing tools can easily output the log data in many different formats, including JSON, without the apps having to get involved in dealing with JSON themselves. As I mentioned right at the start, I think it is good to have structured standardized metadata associated with log messages. I just don't think that the way applications/libraries like libvirt are being asked to implement this is a good idea. It would be entirely practical for rsyslog or syslog-ng to provide a client/daemon protocol (+library) that worked in exactly the same way systemd's journal protocol (+library) do, and then turn the received raw data into their preferred JSON format on the server. FYI, I already have a local patch which adds support for journald to libvirt logging which I am yet to submit, but is so much simpler than this lumberjack support. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list