On Sun, Aug 31, 2014 at 5:58 PM, Rusty Russell <rusty@xxxxxxxxxxxxxxx> wrote: > Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes: >> The only unusual thing about virtio's use of scatterlists is that >> two of the APIs accept scatterlists that might not be terminated. >> Using function pointers to handle this case is overkill; for_each_sg >> can do it. >> >> There's a small subtlely here: for_each_sg assumes that the provided >> count is correct, but, because of the way that virtio_ring handles >> multiple scatterlists at once, the count is only an upper bound if >> there is more than one scatterlist. This is easily solved by >> checking the sg pointer for NULL on each iteration. > > Hi Andy, > > (Sorry for the delayed response; I was still catching up from > my week at KS.) No problem. In the grand scheme of maintainer response time, I think you're near the top. :) > > Unfortunately the reason we dance through so many hoops here is that > it has a measurable performance impact :( Those indirect calls get inlined. gcc inlines that? That must nearly double the size of the object file. :-/ > > There's only one place which actually uses a weirdly-formed sg now, > and that's virtio_net. It's pretty trivial to fix. > > However, vring_bench drops 15% when we do this. There's a larger > question as to how much difference that makes in Real Life, of course. > I'll measure that today. Weird. sg_next shouldn't be nearly that slow. Weird. > > Here are my two patches, back-to-back (it cam out of of an earlier > concern about reducing stack usage, hence the stack measurements). > I like your version better than mine, except that I suspect that your version will blow up for the same reason that my v2 patches blow up: you probably need the skb_to_sgvec_nomark fix, too. IOW, what happens if you apply patches 1-4 from my v3 series and then apply your patches on top of that? There'll be a hit on some virtio_pci machines due to use of the DMA API. I would argue that, if this is measurable, the right fix is to prod the DMA API maintainers, whoever they are, to fix it. The DMA API really out to be very fast on identity-mapped devices, but I don't know whether it is in practice. --Andy _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization