John Damm Sørensen <john@xxxxxxxxxxxxx> wrote:

> I recently filed a bug for the cachefilesd used with Centos 7 but it seems
> like nobody has been assigned to the bug.
>
> Is cachefilesd no longer an active project?

I'm trying to overhaul the whole thing to make it less fragile inside the kernel.

The problem in the upstream kernel code is that it depends on delicate snooping of wake-up queues that the backing filesystem is supposed to poke when it finishes loading a page from storage. Unfortunately, it appears that this doesn't always work - and it's proving impossible to pin down so far (the problem being that the symptoms of the bugs I'm trying to find show up so long after the actual bug that the trace buffer can't be made big enough). The upstream code is also slow, as it introduces extra copies and VM pressure that the system has trouble accounting for.

I've also been using the bmap interface to probe the backing filesystem to find out whether a block is present in the cache file - but that's not viable with modern extent-based filesystems and can lead to corruption by interpolation of blocks of zeros from bridging blocks added into an extent.

However, since I wrote the cachefiles driver, a new interface inside the kernel (kiocb) has come along that allows me to perform asynchronous, direct I/O to the backing device - but to use it I have to completely change the cache I/O API used by network filesystems. The good news is that it's a lot faster, simpler, way less code (3-4000 lines less) and pressures the VM a lot less.

https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-iter

However, that's just the first phase. The second phase is going to involve changing the way space is managed and culling is done - and this is going to impact cachefilesd. The culling algorithm will need to move into the kernel, I think, and will be done through an index file.
It needs careful thought because I have situations where I need to handle anything from a few files to a couple of million. It may be that the best strategy in such a large case is not to cull automatically at all.

The third phase is going to involve adding disconnected operation support - and that will probably require yet more changes to cachefilesd.

David

--
Linux-cachefs mailing list
Linux-cachefs@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cachefs