On Wed, 26 Feb 2014, Dave Chinner wrote: > On Tue, Feb 25, 2014 at 09:25:40PM -0800, Hugh Dickins wrote: > > > But I wasn't really thinking of the offset > i_size case, just the > > offset + len >= i_size case: which would end with i_size at offset, > > and the areas you're worried about still beyond EOF - or am I confused? > > Right, offset beyond EOF is just plain daft. But you're not thinking > of the entire picture. What happens when a system crashes half way > through a collapse operation? On tmpfs you don't care - everything > is lost, but on real filesystems we have to care about. > > offset + len beyond EOF is just truncate(offset). > > From the point of view of an application offloading a data movement > operation via collapse range, any range that overlaps EOF is wrong - > data beyond EOF is not accessible and is not available for the > application to move. Hence EINVAL - it's an invalid set of > parameters. > > If we do allow it and implement it by block shifting (which, > technically, is the correct interpretation of the collapse range > behaviour because it preserves preallocation beyond > the collapsed region beyond EOF), then we have > thr problem of moving data blocks below EOF by extent shifting > before we change the EOF. That exposes blocks of undefined content > to the user if we crash and recover up to that point of the > operation. It's just plain dangerous, and if we allow this > possibility via the API, someone is going to make that mistake in a > filesystem because it's difficult to test and hard to get right. Again, I would have thought that this is a problem you are already having to solve in the case when offset + len is below EOF, with blocks of undefined content preallocated beyond EOF. But I don't know xfs, you do: so I accept there may be subtle reasons why the offset + len below EOF case is easier for you to handle - and please don't spend your time trying to hammer those into my head! I think I've been somewhat unreasonable: I insisted in the other thread that "Collapse is significantly more challenging than either hole-punch or truncation", so I should give you a break, not demand that you provide a perfect smooth implementation in all circumstances. None of our filesystems were designed with this operation in mind, each may have its own sound reasons to reject those peculiare cases which would pose more trouble and risk than they are worth. Whether that should be enforced at the VFS level is anther matter: if it turns out that the xfs and ext4 limitations match up, okay. I think we have different preferences, for whether to return error or success, when there is nothing to be done; but I notice now that fallocate fails on len 0, so you are being consistent with that. Reject "offset + len >= i_size" or "offset + len > i_size"? Hugh -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>