On 3/21/22 8:55 AM, Andrea Righi wrote:
> On Fri, Mar 18, 2022 at 02:34:29PM +0100, Claudio Fontana wrote:
> ...
>> I have lots of questions here, and I tried to involve Jiri and Andrea
>> Righi here, who a long time ago proposed a POSIX_FADV_NOREUSE
>> implementation.
>>
>> 1) What is the reason iohelper was introduced?
>>
>> 2) Was Jiri's comment about the missing Linux implementation of
>> POSIX_FADV_NOREUSE?
>>
>> 3) If using O_DIRECT is the only reason for iohelper to exist (...?),
>> would replacing it with posix_fadvise remove the need for iohelper?
>>
>> 4) What has stopped Andrea's or another POSIX_FADV_NOREUSE
>> implementation from landing in the kernel?
>
> From what I remember (it was a long time ago, sorry), I stopped pursuing
> the POSIX_FADV_NOREUSE idea because we thought that moving to a
> memcg-based solution was a better and more flexible approach, assuming
> memcg would have given some form of specific page cache control. As of
> today I think we still don't have any specific page cache control
> feature in memcg, so maybe we could reconsider the FADV_NOREUSE idea (or
> something similar)?
>
> Maybe even introduce a separate FADV_<something> flag if we don't want
> to bind a specific implementation of this feature to a standard POSIX
> flag (even if FADV_NOREUSE is still implemented as a no-op in the
> kernel).
>
> The thing I liked about the fadvise approach is its simplicity from an
> application perspective: it's just a syscall and that's it, without
> having to deal with any other subsystems (cgroups, sysfs, and similar).
>
> -Andrea

Thanks Andrea,

For this specific use case I am still missing some key understanding of
the role of iohelper in libvirt. Jiri Denemark's comment seems to suggest
that an implementation of FADV_NOREUSE would remove the need for iohelper
entirely; I assume this would also remove the extra copy of the data,
which seems to impose a substantial throughput penalty when migrating to
a file.

I am hoping that Jiri, or anyone else with a clear understanding of this
matter, will weigh in.

Ciao,

Claudio

>
>>
>> Lots of questions..
>>
>> Thanks for all your insight,
>>
>> Claudio
>>
>>>
>>> Dave
>>>
>>>> Ciao,
>>>>
>>>> C
>>>>
>>>>>>
>>>>>> In the above tests with libvirt, were you using the
>>>>>> --bypass-cache flag or not ?
>>>>>
>>>>> No, I was not. Tests with a ramdisk did not show a notable
>>>>> difference for me, but tests with /dev/null were not possible,
>>>>> since the command line is not accepted:
>>>>>
>>>>> # virsh save centos7 /dev/null
>>>>> Domain 'centos7' saved to /dev/null
>>>>> [OK]
>>>>>
>>>>> # virsh save centos7 /dev/null --bypass-cache
>>>>> error: Failed to save domain 'centos7' to /dev/null
>>>>> error: Failed to create file '/dev/null': Invalid argument
>>>>>
>>>>>
>>>>>>
>>>>>> Hopefully use of O_DIRECT doesn't make a difference for
>>>>>> /dev/null, since the I/O is being immediately thrown
>>>>>> away and so ought to never go into I/O cache.
>>>>>>
>>>>>> In terms of the comparison, we still have libvirt iohelper
>>>>>> giving QEMU a pipe, while your test above gives QEMU a
>>>>>> UNIX socket.
>>>>>>
>>>>>> So I still wonder if the delta is caused by the pipe vs socket
>>>>>> difference, as opposed to netcat vs libvirt iohelper code.
>>>>>
>>>>> I'll look into this aspect, thanks!
>>>>
>
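For reference, a minimal sketch of the fadvise approach Andrea describes,
from the application's point of view: stream data to a file without
leaving it in the page cache. Since Linux currently implements
POSIX_FADV_NOREUSE as a no-op (as noted above), the sketch falls back to
POSIX_FADV_DONTNEED after forcing writeback. The file name, chunk size,
and payload size are arbitrary placeholders; this is an illustration, not
libvirt code.

/* Stream data to a file, periodically dropping the written pages
 * from the page cache instead of relying on O_DIRECT. */
#define _GNU_SOURCE          /* for sync_file_range() */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (1 << 20)      /* 1 MiB per write, arbitrary */

int main(void)
{
    int fd = open("/tmp/vmstate.img", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    /* Declare the access pattern up front; on Linux this is currently
     * a no-op, but it is the portable way to express the intent. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

    char *buf = malloc(CHUNK);
    if (!buf) return 1;
    memset(buf, 0, CHUNK);
    off_t done = 0;

    for (int i = 0; i < 64; i++) {          /* 64 MiB of dummy payload */
        ssize_t n = write(fd, buf, CHUNK);
        if (n < 0) { perror("write"); return 1; }
        done += n;
        /* Every 16 MiB: force writeback, then tell the kernel the
         * pages written so far will not be reused, so the cache can
         * drop them. */
        if (done % (16 * CHUNK) == 0) {
            sync_file_range(fd, 0, done, SYNC_FILE_RANGE_WRITE |
                            SYNC_FILE_RANGE_WAIT_AFTER);
            posix_fadvise(fd, 0, done, POSIX_FADV_DONTNEED);
        }
    }
    free(buf);
    close(fd);
    return 0;
}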
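The "Invalid argument" failure quoted above is consistent with
--bypass-cache causing libvirt to open the output file with O_DIRECT:
/dev/null is a character device whose driver does not support direct
I/O, so the open() itself is expected to fail with EINVAL. A standalone
snippet (again, not libvirt code) that should reproduce the same error:

#define _GNU_SOURCE          /* for O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Opening a character device with O_DIRECT fails if the driver
     * provides no direct-I/O support. */
    int fd = open("/dev/null", O_WRONLY | O_DIRECT);
    if (fd < 0)
        printf("open(/dev/null, O_DIRECT) failed: %s\n", strerror(errno));
    else
        printf("open(/dev/null, O_DIRECT) unexpectedly succeeded\n");
    return 0;
}

On a typical Linux system this should print "Invalid argument", matching
the virsh error above.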
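One possible way to isolate the pipe-vs-UNIX-socket variable raised in
the quoted discussion: push the same payload through a pipe() and
through a socketpair() and compare throughput. This is a rough
micro-benchmark sketch, not a definitive measurement; chunk and payload
sizes are arbitrary, and results will vary with kernel version and
default pipe/socket buffer sizes.

#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define CHUNK  (1 << 16)                 /* 64 KiB per write */
#define TOTAL  (1ULL << 30)              /* 1 GiB total */

/* Fork a reader that drains rfd, write TOTAL bytes into wfd,
 * and return the achieved throughput in MiB/s. */
static double pump(int rfd, int wfd)
{
    pid_t pid = fork();
    if (pid == 0) {                      /* child: drain the read end */
        char buf[CHUNK];
        close(wfd);
        while (read(rfd, buf, sizeof(buf)) > 0)
            ;
        _exit(0);
    }
    close(rfd);

    static char buf[CHUNK];              /* zero-filled payload */
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned long long sent = 0; sent < TOTAL; sent += CHUNK)
        if (write(wfd, buf, CHUNK) < 0) { perror("write"); break; }
    close(wfd);                          /* EOF for the child */
    waitpid(pid, NULL, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    return (TOTAL / (1 << 20)) / secs;
}

int main(void)
{
    int p[2], s[2];

    if (pipe(p) < 0) { perror("pipe"); return 1; }
    printf("pipe:        %.0f MiB/s\n", pump(p[0], p[1]));

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, s) < 0) {
        perror("socketpair");
        return 1;
    }
    printf("unix socket: %.0f MiB/s\n", pump(s[0], s[1]));

    return 0;
}

This still differs from the real setup (netcat and iohelper each add a
read/write copy in userspace), but it separates the transport from the
helper-process overhead.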