Re: [PATCH 05/24] compatfd is included before, and it is compiled unconditionally

On 10/01/2009 04:23 PM, Anthony Liguori wrote:
Juan Quintela wrote:
Discussed it with Anthony.  signalfd is complicated for qemu
upstream (too difficult to use properly),

It's not an issue of being difficult.

To emulate signalfd, we need to create a thread that writes to a pipe from a signal handler. The problem is that a write() can return a partial result and following the partial result, we can end up getting an EAGAIN. We have no way to queue signals beyond that point and we have no sane way to deal with partial writes.

Pipe buffers are multiples of the signalfd size. As long as we read and write signalfd-sized blocks, we won't get partial writes. It's true that depending on an implementation detail is bad practice, but this is emulation code, and if it helps simplify everything else, I think it's fine to use it.

Hmm, pipe(7) says that writes of at most PIPE_BUF bytes are atomic:

       O_NONBLOCK enabled, n <= PIPE_BUF
              If there is room to write n bytes to the pipe, then write(2)
              succeeds immediately, writing all n bytes; otherwise write(2)
              fails, with errno set to EAGAIN.

So it seems this practice has been blessed by POSIX.

Instead, how we do this in upstream QEMU is that we install a signal handler and write one byte to the fd. If we get EAGAIN, that's fine because all we care about is that at least one byte exists in the fd's buffer. This requires that we use an fd-per-signal which means we end up with a different model than signalfd.

The reason to use signalfd over what we do in upstream QEMU is that signalfd allows us to mask the signals, which means fewer EINTRs. I don't think that's a huge advantage, and the inability to do backwards compatibility in a sane way means that emulated signalfd is not workable.

signalfd is several microseconds faster than signals + pipes. Do we have so much performance we can throw some of it away?

The same is generally true for eventfd.

eventfd emulation will also never get partial writes.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
