Broken locking in libuuid [Was: [RFC PATCH] libuuid: introduce safe variant of uuid_generate_time()]

Petr Uzel <petr.uzel@xxxxxxx> · Mon, 7 Feb 2011 14:08:46 +0100

Hi all,

On Mon, Jan 31, 2011 at 12:50:39PM +0100, Petr Uzel wrote:
> On Fri, Jan 28, 2011 at 08:53:55PM -0500, Ted Ts'o wrote:
> > On Fri, Jan 28, 2011 at 03:12:23PM +0100, Petr Uzel wrote:
> > > 
> > > Sometimes, when there are multiple processes generating time-based
> > > UUIDs in a loop (each of them using multiple threads), it can happen
> > > that the socket backlog in uuidd daemon is full and so the connection
> > > request is refused. uuid_generate_time() then fallbacks to unsafe way of
> > > generating UUID, which may cause duplicate time-based UUIDs to be
> > > generated.
> > 
> > It's probably also a good idea to increase the socket backlog queue
> > length in uuidd, which is currently set to 5 for no good reason.  We
> > should probably just change that to be SOMAXCONN.
> > 
> > I don't have a problem with engineering solutions which wait until you
> > can talk to the uuidd, and returning an error to the processes if it
> > fails, but it's better not to fail in the first place, and so
> > increasing the listen backlog parameter would be a good thing to do.
> 
> I agree; we should try to make libuuid not to fail in the first place.

After another week of staring at the code, I think I have finally
found the root cause of the problem.

First, I'd like to describe the tests I did. As I wrote in my previous
mail, I couldn't reproduce the problem anywhere else except the
VirtualBox VM (with SLES) at first. I tried also another virtual
machine (with different system) and it was even more weird - sometimes
I could reproduce, but after a reboot, I couldn't.

Comparing the 'dmesg's, I figured out that it depends on the
clocksource being used by the kernel. The problem is reproducible only
using 'jiffies' as the clocksource, but I haven't seen it on a
machine/VM using 'tsc' or any other clocksource.

The problem with jiffies is that the granularity of timestamps is
quite high (above 10ms on my testing VM). As far as I understand, the
kernel uses jiffies only as the last resort, if other possible
clocksources can not be used (TSC unstable), which sometimes happened,
sometimes not.

After booting my workstation with 'clocksource=jiffies' boot
parameter, the problem started to appear there as well.

As far as I understand the libuuid/uuidd code, the high clocksource
granularity should not cause troubles, as long as the
application/uuidd have access to the clock.txt (I ran all the tests as
root). But apparently, the clocksource=jiffies made the duplicate
UUIDs more likely.

Next, to make the tests as simple as possible, I also disabled the
uuidd daemon (removed the binary). If I understand the libuuid code
correctly, all the processes/threads generating UUIDs synchronize
using the clock.txt (if they have access), so the duplicate UUIDs
shouldn't appear even theoretically (is this true?) However, I still
got duplicates.

Now to the point finally: I think the locking of clock.txt actually
does not work as expected. Let's have one process, running as root,
with multiple threads, each of them calling uuid_generate_time() and
uuidd daemon isn't installed. The threads should serialize accesses to
clock.txt using fcntl(2) locks. However, due to how fcntl works, it is
not possible to use it for synchronization of threads that belong to
the same process
- it behaves as if no lock was there (it does not matter that the
threads access/lock the file using different - thread specific -
file descriptor). So the accesses to clock.txt are not serialized ->
race condition -> duplicate UUIDs.

OTOH, flock(2) can be used for this purpose. I have replaced the fcntl
with flock and can not reproduce it anymore. I looked into the list of
limitations of flock(2) - I don't see any disadvantages of using
flock() instead of fcntl() in this context, so I propose switching to
flock()-based locking of clock.txt. (I'll send the patch shortly).
What do you think?

I would appreciate any feedback on this matter.

Best regards,

Petr

--
Petr Uzel
IRC: ptr_uzl @ freenode
Attachment:
pgpHKQUEHqsVN.pgp

Description: PGP signature