On Thu, 31 Mar 2022 12:08:21 +0530 Charan Teja Kalla <quic_charante@xxxxxxxxxxx> wrote:

> From: Charan Teja Reddy <quic_charante@xxxxxxxxxxx>
>
> Currently fadvise(2) is supported only for files that are not
> associated with noop_backing_dev_info, so for files such as shmem,
> fadvise results in a NOP. But file_operations->fadvise() lets
> file systems provide their own fadvise implementation. Use this
> support to implement some of the POSIX_FADV_XXX functionality for
> shmem files.
>
> This patch implements the POSIX_FADV_WILLNEED and POSIX_FADV_DONTNEED
> advice for shmem files, which can be helpful for drivers that want
> to manage the shmem pages of files created through
> shmem_file_setup[_with_mnt](). An example use case: a driver
> creates a shmem file of the size it requires, maps the pages for
> DMA, and then passes the fd to the user. The user, who knows how
> these pages are used, can now decide when the pages are not
> required and push them to swap through DONTNEED, freeing memory well
> in advance rather than relying on reclaim, and can use WILLNEED when
> it decides that they will be useful in the near future. IOW, it lets
> clients free up/read in the memory when they want to.

Is there an actual userspace/driver combination which will use this?
Has the new feature been tested in such an arrangement? And if so,
which driver(s)?

> Another use case is GEM objects, which are currently allocated and
> managed through shmem files: the driver can use
> vfs_fadvise(DONT|WILLNEED) on the shmem fd when it comes to know
> (e.g. through hints from user space) that the GEM objects are not
> going to be used, or will be needed, in the near future.

Again, is this just a theoretical bright idea, or can we be assured
that adding this code to the kernel will end up having been useful to
our users?

> Some questions asked while reviewing this patch:
>
> Q) Can the same thing be achieved by mapping the fd to user space and
> using madvise?
> A) Not all drivers map all their shmem fds to user space; some want
> to manage them within the kernel. Ex: shmem memory can be mapped to
> other subsystems, which fill in the data and then hand it to another
> subsystem for further processing, where a user mapping is not
> required at all. A simple example is memory given to the gpu
> subsystem, which can be filled directly and given to the display
> subsystem. The respective drivers know well when to keep that memory
> in RAM or in swap, based on, for example, user activity.
>
> Q) Should we add a documentation section to the manual pages?
> A) Whatever the man page[1] says for fadvise() is also applicable
> to shmem files, so it did not feel correct to add a separate
> shmem-specific section.
> [1] https://linux.die.net/man/2/fadvise
>
> Q) The proposed semantics of POSIX_FADV_DONTNEED are actually similar
> to MADV_PAGEOUT and different from MADV_DONTNEED. This is a
> user-facing API; will this difference cause confusion?
> A) The man page [1] says that "POSIX_FADV_DONTNEED attempts to free
> cached pages associated with the specified region." This means that
> on issuing this FADV, the file cache pages are expected to be freed.
> It is implementation-defined whether writeback of dirty pages is
> attempted, and unwritten dirty pages will not be freed. So
> FADV_DONTNEED also covers the semantics of MADV_PAGEOUT for file
> pages, and there is no purpose for PAGEOUT for file pages.
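
For reference, the userspace side of the use case described above might look roughly like the sketch below. It is not taken from the patch. It assumes a shmem-backed fd obtained with memfd_create(2) as a stand-in for an fd handed out by a driver, and the helper name shmem_advise_roundtrip() is hypothetical. On kernels without shmem fadvise support both posix_fadvise() calls are expected to succeed as no-ops, so the sketch only demonstrates the call sequence and that the data survives it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): fill a shmem-backed fd
 * with len bytes, advise DONTNEED and then WILLNEED over the whole
 * range, and check that the contents are still readable.
 * Returns 0 on success or a positive errno-style value on failure.
 */
int shmem_advise_roundtrip(size_t len)
{
	char buf[4096];
	size_t done = 0;
	int ret = 0;
	/* memfd_create() gives an fd backed by shmem (assumed stand-in). */
	int fd = memfd_create("fadvise-demo", MFD_CLOEXEC);

	if (fd < 0)
		return errno;

	/* Populate the file so there are pages to act on. */
	memset(buf, 0xA5, sizeof(buf));
	while (done < len) {
		size_t chunk = len - done < sizeof(buf) ? len - done : sizeof(buf);
		ssize_t n = write(fd, buf, chunk);

		if (n <= 0) {
			ret = n < 0 ? errno : EIO;
			goto out;
		}
		done += (size_t)n;
	}

	/* "Not needed for now": with the proposed patch, a hint to push to swap. */
	ret = posix_fadvise(fd, 0, (off_t)len, POSIX_FADV_DONTNEED);
	if (ret)	/* posix_fadvise() returns the error number directly */
		goto out;

	/* "Needed soon": with the proposed patch, a hint to read back in. */
	ret = posix_fadvise(fd, 0, (off_t)len, POSIX_FADV_WILLNEED);
	if (ret)
		goto out;

	/* The advice must never change file contents. */
	if (pread(fd, buf, 1, 0) != 1 || (unsigned char)buf[0] != 0xA5)
		ret = EIO;
out:
	close(fd);
	return ret;
}
```

A caller would invoke it as shmem_advise_roundtrip(1 << 20) and treat a non-zero return as an errno value. Whether pages actually move to or from swap depends on the kernel and on swap being configured, which this sketch does not try to observe.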