On Tue, Aug 12, 2008 at 12:57:21AM +0200, Sven Wegener wrote:
> Both schedulers have a race condition that happens in the following
> situation:
>
> We have an entry in our table that has already expired according to its
> last use time. Then we need to schedule a new connection that uses this
> entry.
>
> CPU 1                            CPU 2
>
> ip_vs_lblc_schedule()
>   ip_vs_lblc_get()
>     lock table for read
>     find entry
>     unlock table
>                                  ip_vs_lblc_check_expire()
>                                    lock table for write
>                                    kfree() expired entry
>                                    unlock table
>     return invalid entry
>
> The problem is that we assign the last use time outside of our critical
> region. We can make hitting this race more difficult, if not impossible,
> if we assign the last use time while still holding the lock for reading.
> That gives us six minutes during which it's safe to use the entry, which
> should be enough for our use case, as we're going to use it immediately
> and don't keep a long reference to it.
>
> We're holding the lock for reading and not for writing. The last use time
> is an unsigned long, so the assignment should be atomic by itself. And we
> don't care if some other user sets it to a slightly different value. The
> read_unlock() implies a barrier, so other CPUs see the new last use time
> during cleanup, even though we're only holding a read lock.
>
> Other solutions would be: 1) protect the whole ip_vs_lblc_schedule() by
> taking the lock with write_lock(), 2) add reference counting for the
> entries, 3) protect each entry with its own lock. And all of them are bad
> for performance.
>
> Comments? Ideas?

Is there a pathological case here if sysctl_ip_vs_lblc_expiration is set
to be very short and we happen to hit ip_vs_lblc_full_check()?

To be honest, I think I like the reference-count approach best, as it
seems safe and simple. Is it really going to be horrible for performance?

If so, I wonder if a workable solution would be to provide a more
fine-grained lock on tbl. Something like the way that ct_read_lock/unlock()
works.
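
For reference, the change you're describing would look roughly like the
sketch below: refresh the timestamp before dropping the read lock in the
lookup path. This is only an illustration, not a tested patch; the
table/entry layout, the bucket and list field names and
ip_vs_lblc_hashkey() are my assumptions about the lblc code rather than
quotes from the tree:

	/*
	 * Sketch only: refresh the last-use time while the read lock is
	 * still held, so an entry that is about to expire is refreshed
	 * before ip_vs_lblc_check_expire() can take the write lock and
	 * kfree() it.
	 */
	static inline struct ip_vs_lblc_entry *
	ip_vs_lblc_get(struct ip_vs_lblc_table *tbl, __be32 addr)
	{
		unsigned int hash = ip_vs_lblc_hashkey(addr);
		struct ip_vs_lblc_entry *en;

		read_lock(&tbl->lock);
		list_for_each_entry(en, &tbl->bucket[hash], list) {
			if (en->addr == addr) {
				/* still inside the critical region; the
				 * read_unlock() below is the barrier that
				 * publishes the new timestamp to the
				 * cleanup path */
				en->lastuse = jiffies;
				read_unlock(&tbl->lock);
				return en;
			}
		}
		read_unlock(&tbl->lock);

		return NULL;
	}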
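
And if we did go for reference counting instead, I picture it roughly
like this; again only a sketch, the refcnt field and the
ip_vs_lblc_entry_put() helper are hypothetical additions, not existing
code:

	struct ip_vs_lblc_entry {
		struct list_head	list;
		__be32			addr;		/* destination IP address */
		struct ip_vs_dest	*dest;		/* real server (cache) */
		unsigned long		lastuse;	/* last time the entry was used */
		atomic_t		refcnt;		/* hypothetical: one reference
							 * held by the hash table, plus
							 * one per in-flight user */
	};

	/* hypothetical helper: drop a reference, free on the last put */
	static inline void ip_vs_lblc_entry_put(struct ip_vs_lblc_entry *en)
	{
		if (atomic_dec_and_test(&en->refcnt))
			kfree(en);
	}

ip_vs_lblc_get() would then atomic_inc(&en->refcnt) before dropping the
read lock, the scheduler would put the reference once it has picked the
destination, and the expiry paths would unlink the entry under the write
lock and put the table's reference instead of calling kfree() directly.
That costs an extra atomic operation per lookup, which may be what you
meant by bad for performance.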