On 03/03/2014 12:18 PM, Daniel P. Berrange wrote:
> Currently the log filter strings are used in a string comparison
> against the source filename each time log message is emitted.
> If no log filters at all are set, there's obviously no string
> comparison to be done. If any single log filter is set though,
> this imposes a compute burden on every logging call even if logs
> from the file in question are disabled. This string comparison
> must also be done while the logging mutex is held, which has
> implications for concurrency when multiple threads are emitting
> log messages.
>
> This changes the log filtering to be done baed on the virLogSource

s/baed/based/

> object name. The virLogSource struct is extended to container

s/container/contain/

> 'serial' and 'priority' fields. Any time the global log filter
> rules are changed a global serial number is incremented. When a
> log message is emitted, the serial in the virLogSource instance
> is compared with the global serial number. If out of date, then
> the 'priority' field in the virLogSource instance is updated based
> on the new filter rules. The 'priority' field is checked to see
> whether the log message should be sent to the log outputs.
>
> The comparisons of the 'serial' and 'priority' fields are done
> with no locks held. So in the common case each logging call has
> an overhead of 2 integer comparisons, with no locks held. Only
> if the decision is made to forward the message to the log output,
> or if the 'serial' value is out of date do locks need to be
> acquired.
>
> Technically the comparisons of the 'serial' and 'priority' fields
> should be done with locks held, or using atomic operations.

Only if we care about strict happens-after relationships between code that
adjusts logging levels vs. code testing whether to emit a particular log
message.

> Both
> of these options have a notable performance impact, however, and
> it is believed that worst case behaviour where the fields are
> read concurrently with being written would merely result in an
> mistaken emissions or dropping of the log message in question.
> This is an acceptable tradeoff for the performance benefit of
> avoiding locking.
>

Almost. As long as writes are safe, the worst that can happen is that we
fail to emit a message that just got enabled, or emit a message that just
got disabled. But even if we had used locks to avoid this race, the locks
could have been obtained in the opposite order and we would see the same
behavior. So the locks add no protection, and eliding them in favor of
simpler non-atomic integer ops is safe.

If there were only a single writer, then writes would be automatically
safe. However, you have multiple writers. Thus, if two writers both try
to increment the global serial with no locks or atomic increments, you
can end up with the classic symptoms of over- or under-incrementing the
global serial. If three threads compete:

start: global is 1
thread one: read global to compute its increment
thread two: read global to compute its increment
thread two: write the increment, global is now 2
thread three: compare local 0 against global 2, recompute priority
thread one: write the increment, global is now 2
thread three: compare local 2 against global 2, no priority recompute

Oops - if thread one changed priority so that thread three should no
longer log, we've messed up, and may have LOTS of messages logged that
should not have been, rather than just one or two at the race boundary.
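Put differently, the unlocked virLogFiltersSerial++ is really a three-step
read/modify/write, which is where the lost bump comes from (illustrative
fragment only, not code from your patch, just using its names):

    /* what 'virLogFiltersSerial++' does without locks or atomics */
    int tmp = virLogFiltersSerial;   /* thread one reads 1                */
                                     /* thread two reads 1 and stores 2   */
    virLogFiltersSerial = tmp + 1;   /* thread one also stores 2: one
                                      * filter change is never reflected
                                      * in the serial, so a source that
                                      * already saw 2 never recomputes
                                      * its priority                      */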
So I think your _increments_ need to be atomic, to protect the writers
(those are not the hot path), while the READS of the global can remain
unprotected; those are the actual hot path you are optimizing in this
patch.

> As a quick benchmark, a demo program that registers 500 file
> descriptors with the event loop (eg equiv of 500 QEMU monitor
> commands), creates pending read I/O on every FD, and then runs
> virEventRunDefaultImpl() took 4.6 seconds todo 51200 iterations.

s/todo/to do/

> After this optimization it only takes 3.3 seconds, with the log
> APIs no longer being a relevant factor in the running time.
>
> Signed-off-by: Daniel P. Berrange <berrange@xxxxxxxxxx>
> ---
>  src/util/virlog.c | 81 +++++++++++++++++++++++++------------------------
>  src/util/virlog.h |  6 +++++
>  2 files changed, 43 insertions(+), 44 deletions(-)
>
> diff --git a/src/util/virlog.c b/src/util/virlog.c
> index 6a20b89..048d2c4 100644
> --- a/src/util/virlog.c
> +++ b/src/util/virlog.c
> @@ -95,6 +95,7 @@ struct _virLogFilter {
>  typedef struct _virLogFilter virLogFilter;
>  typedef virLogFilter *virLogFilterPtr;
>
> +static int virLogFiltersSerial = 1;

Good - starting at 1 while all per-file structs start at 0 means that the
first time any logging is attempted it recomputes priority in each file.

>  static virLogFilterPtr virLogFilters = NULL;
>  static int virLogNbFilters = 0;
>
> @@ -514,6 +515,7 @@ virLogResetFilters(void)
>          VIR_FREE(virLogFilters[i].match);
>      VIR_FREE(virLogFilters);
>      virLogNbFilters = 0;
> +    virLogFiltersSerial++;
>      return i;
>  }
>
> @@ -569,6 +571,7 @@ virLogDefineFilter(const char *match,
>      virLogFilters[i].priority = priority;
>      virLogFilters[i].flags = flags;
>      virLogNbFilters++;
> +    virLogFiltersSerial++;

These are the increments that are not on the hot path, and that I think
need to be serialized with an atomic increment.

>
> +static void
> +virLogSourceUpdate(virLogSourcePtr source)
> +{
> +    virLogLock();

The common case is that the per-source serial will match the global
serial; holding the lock only in the boundary case of an updated global
serial is not going to affect the hot path.

>      /*
> -     * check against list of specific logging patterns
> +     * 3 intentionally non-thread safe variable reads.
> +     * Worst case result is a log message is accidentally
> +     * dropped or emitted, if another thread is updating
> +     * log filter list concurrently with a log message

And again, even if we WERE thread-safe, the race to obtain the lock could
always have been won the other way, for the same end result, so the lack
of thread-safety here isn't critical.

>      */
> -    fprio = virLogFiltersCheck(filename, &filterflags);
> -    if (fprio == 0) {
> -        if (priority < virLogDefaultPriority)
> -            emit = false;
> -    } else if (priority < fprio) {
> -        emit = false;
> -    }
> -
> -    if (!emit)
> +    if (source->serial < virLogFiltersSerial)
> +        virLogSourceUpdate(source);
> +    if (priority < source->priority)
>          goto cleanup;
> +    filterflags = source->flags;

Makes sense.

> @@ -58,6 +61,9 @@ struct _virLogSource {
>  # define VIR_LOG_INIT(n)                        \
>      static virLogSource virLogSelf = {          \
>          .name = "" n "",                        \
> +        .priority = VIR_LOG_ERROR,              \
> +        .serial = 0,                            \
> +        .flags = 0,                             \

The last two are not strictly necessary; static initialization guarantees
zero-initialization. But it doesn't hurt.

I'll let you either spin a v2 to address my atomic increment concerns, or
provide a counterargument on why I should ACK as-is.
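To be concrete about the v2 I'm asking for, the writer-side change is
tiny. Roughly the following, untested, and assuming the virAtomicIntInc()
helper from viratomic.h is usable here:

    #include "viratomic.h"

    static int virLogFiltersSerial = 1;

    /* in virLogResetFilters() and virLogDefineFilter(): the writers are
     * rare and off the hot path, so a single atomic read-modify-write
     * means concurrent writers can no longer lose a bump ...           */
    virAtomicIntInc(&virLogFiltersSerial);

    /* ... while the reader check in virLogVMessage() stays exactly as in
     * your patch: a plain, unlocked comparison against the global serial. */

That keeps the two-integer-comparison fast path intact while closing the
writer/writer race.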
-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org