On Sat, 5 Aug 2006 08:26:29 -0400 Chris Mason <mason@xxxxxxxx> wrote: > On Fri, Aug 04, 2006 at 06:31:49PM -0700, Andrew Morton wrote: > > > > Could we have more detail on what you're trying to do here? > > > > You've obviously made some decisions about how to handle under-writeback > > data, dirty data, integrity versus cleaning, etc. What were those > > decisions and what led you to them? > > The basic idea is that someone watching the blinking light on the usb > stick knows that when the light goes out, he can pull the stick out. Yes. > It's an alternative to mount -o sync that it closer to what users are > asking for. > > There's no attempt at integrity, it only tries to keep a fairly constant > flow of data to the device. It's a data-integrity operation, because we want that filesystem to be in good shape after the LED goes off. Not after a pdflush interval of 30 seconds. Consequently the semantics which we require of the library functions which you've added should be: /* * On return from this function all dirty data has been cleaned - it is either * on disk or is in flight in the I/O queues. */ These semantics are exactly provided by do_sync_file_range(SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE). I'd suggest that do_sync_file_range() could be used to sync the file's data in this application, if only for reasons of clarity and precision. That leaves the file metadata, the inode itself and the superblock, for which we'd ideally likely same semantics, but stronger sync semantics would of course meet our requirement. metadata: we don't have an SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE implementation of fsync_buffers_list() so unless we add such, we need to use fsync_buffers_list()'s (SYNC_FILE_RANGE_WAIT_BEFORE|SYNC_FILE_RANGE_WRITE|SYNC_FILE_RANGE_WAIT_AFTER) semantics. I'd suggest a call to sync_mapping_buffers() directly. The inode: write_inode_now(wait=0) will I think be appropriate here. superblock: it's possible that the proposed three-step scheme will leave the filesystem's superblock unwritten. Need to think about that. So I don't think any additional helper functions are needed. The above three are all exported to filesystems. To test all this I'd suggest that you verify that `-o flush' takes /rpoc/meminfo:Dirty reliably down to zero on a quiet system for various operations. If we're not adding new VFS functions then fatfs can generally hack around with these functions until it does what we want it to do. If we _do_ add new VFS functions then their exact data integrity behaviour should be carefully designed and documented, please. Otherwise we'll lose data or make people's machines slower than they need to be - we have a track record of screwing this stuff up. There was quite some duplication of code within the fat patch you sent - it looked like a little helper function was needed. I expect the `-o flush' behaviour will be useful on other filesystem types. Did you consider moving it into the generic mount options (MS_FLUSH?) rather than making it fat-specific? Sorry to complicate a little patch, but I'd prefer to get this right going-in. iirc, Hirofumi-san had a patch which did the same thing as this one 6-odd months ago, but it seemed to die off. - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html