Doesn't such a change increase the "dirty debt" held in the kernel, and therefore the probability of a deadlock (when the userspace filesystem's memory request ends up waiting on a dirty page writeout caused by this write-back feature)?

Why not implement the write-back inside the userspace filesystem and leave the kernel operating in write-through, which keeps things safer overall? Do you have performance numbers comparing write-back in the kernel vs. write-back in userspace?

Avati

----- Original Message -----
From: "Pavel Emelyanov" <xemul@xxxxxxxxxxxxx>
To: fuse-devel@xxxxxxxxxxxxxxxxxxxxx, "Miklos Szeredi" <miklos@xxxxxxxxxx>, "Alexander Viro" <viro@xxxxxxxxxxxxxxxxxx>, "linux-fsdevel" <linux-fsdevel@xxxxxxxxxxxxxxx>
Cc: "Kirill Korotaev" <dev@xxxxxxxxxxxxx>, "James Bottomley" <jbottomley@xxxxxxxxxxxxx>
Sent: Tuesday, July 3, 2012 8:53:18 AM
Subject: [fuse-devel] [PATCH 0/10] fuse: An attempt to implement a write-back cache policy

Hi everyone.

One of the problems with the existing FUSE implementation is that it uses a write-through cache policy, which results in performance problems on certain workloads. E.g., when copying a big file onto a FUSE filesystem, cp pushes every 128k to userspace synchronously. This becomes a problem when the userspace back-end uses networking for storing the data.

A good solution to this is switching the FUSE page cache to a write-back policy. With this, file data is pushed to userspace in big chunks (depending on the dirty memory limits, but much more than 128k), which lets the FUSE daemons handle the size updates in a more efficient manner.

The writeback feature is per-connection and is explicitly configurable at the init stage (is it worth making it CAP_SOMETHING protected?). When the writeback is turned ON:

* still copy writeback pages to a temporary buffer when sending a writeback request, and finish the page writeback immediately

* make the kernel maintain the inode's i_size to avoid frequent i_size synchronization with user space

* take NR_WRITEBACK_TEMP into account when making the balance_dirty_pages decision. This protects us from having too many dirty pages on FUSE.

The provided patchset survives the fsx test. Performance measurements are not all finished yet, but the mentioned copying of a huge file becomes noticeably faster even on machines with little RAM, and it doesn't make the system get stuck (the dirty pages balancer does its work OK).

Applies on top of v3.5-rc4.

We are currently exploring this with our own distributed storage implementation, which is heavily oriented towards storing big blobs of data with extremely rare meta-data updates (virtual machines' and containers' disk images). With the existing cache policy a typical usage scenario -- copying a big VM disk into a cloud -- takes far too long, much longer than if the image were simply scp-ed over the same network. The write-back policy (as I mentioned) noticeably improves this scenario.

Kirill (in Cc) can share more details about the performance and the storage concepts if required.

Thanks,
Pavel
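[For readers who want a concrete picture of "explicitly configurable at the init stage" from the cover letter above, here is a minimal, self-contained sketch of the kind of per-connection flag negotiation it describes: the kernel advertises the capability in FUSE_INIT and the daemon only gets write-back if it explicitly opts in. The flag name FUSE_WRITEBACK_CACHE, the struct, and the helper below are illustrative assumptions based on the prose, not the patchset's actual identifiers.]

    /*
     * Illustrative sketch of per-connection write-back negotiation.
     * The flag value and names are assumptions mirroring the general
     * FUSE_INIT scheme, not the identifiers used by this patchset.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define FUSE_WRITEBACK_CACHE (1u << 16)   /* hypothetical INIT flag */

    struct init_flags {
        uint32_t kernel_offers;   /* capabilities the kernel advertises in FUSE_INIT */
        uint32_t daemon_wants;    /* capabilities the daemon echoes back to enable */
    };

    /* Only features both sides agree on become active on the connection. */
    static uint32_t negotiate(const struct init_flags *f)
    {
        return f->kernel_offers & f->daemon_wants;
    }

    int main(void)
    {
        struct init_flags f = {
            .kernel_offers = FUSE_WRITEBACK_CACHE,
            .daemon_wants  = FUSE_WRITEBACK_CACHE,  /* daemon opts in explicitly */
        };

        if (negotiate(&f) & FUSE_WRITEBACK_CACHE)
            printf("connection uses the write-back cache policy\n");
        else
            printf("connection stays write-through\n");
        return 0;
    }

[A daemon that prefers to keep the kernel in write-through, as Avati suggests, would simply not set the bit and the connection would behave exactly as it does today.]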