On Wed, Nov 30, 2011 at 12:01:16PM -0500, Ted Ts'o wrote:
> What Dave was talking about is something different.  He's suggesting a
> new call which reserves space, but which does not actually make the
> block allocation decision until the time of the write.  He suggested
> tying it to the file descriptor, but I wonder if it's actually more
> functional to tie it to the process --- that is, the process says,
> "guarantee that I will be able to write 5MB", and writes made by that
> process get counted against that 5MB reservation.  When the process
> exits, any reservation made by that process evaporates.

It needs to be tied to the inode in some way - there are metadata
reservations that need to be made per inode that delayed allocation
reservations are made for, to take into account the potential need to
allocate extent tree blocks as well. If we don't do this, then we'll
get ENOSPC reported for writes during writeback that should have
succeeded. And that is a Bad Thing.

Further, you need to track all the ranges that have space reserved
like a special type of delayed allocation extent. That way, when the
write() comes along into the reserved range, you don't account for it
a second time as delayed allocation as the space usage has already
been accounted for.

And then there is the problem of freeing space that you don't use.
Close the fd and you automatically terminate the reservation. fiemap
can be used to find unused reserved ranges. You could probably even
release them by punching the range.

If you have a per-process pool, how do you only use it for the write()
calls you want, on the file you want, over the range you wanted
reserved? And when you have finished writing to that file, how do you
release any unused reservation? How do you know that you've got
reservations remaining?

Then the interesting questions start - how does per-process
reservation interact with quotas?
The quota needs to be checked when the reservation is made, and
without knowing what file it is being made for this cannot be done
sanely. Especially for project quotas....

Also, per-process reservation pools can't really be managed through
existing APIs, so we'd need new ones. And then we'd be asking
application developers to use two different models for almost
identical functionality, which means they'll just use the one that is
most effective for their purpose (i.e. fallocate() because they
already have a fd open on the file they are going to write to).

IOWs, all I see from an implementation perspective of per-process
reservation pools is complexity and nasty corner cases. And from the
user perspective, an API that doesn't match up with the operations at
hand, i.e. that of writing a file....

> Whether we tie this space reservation to a fd or a process, we also
> would need to decide up front whether this space shows up as "missing"
> by statfs(2)/df or not.

IMO, reserved space is used space - it's not free for just anyone to
use anymore, and it has to be checked and accounted against quotas
even before it gets used....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
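
As a rough illustration of the range-tracking point above (a write into an already-reserved range must not be charged a second time as delayed allocation), here is a minimal userspace sketch. It is a toy model only, not kernel code: the names resv_range, inode_resv, resv_overlap() and write_new_charge() are hypothetical and exist purely for this example, and it assumes the reserved ranges are kept per inode as a sorted, non-overlapping array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one reserved byte range, [start, end). */
struct resv_range {
	long long start;	/* first reserved byte */
	long long end;		/* one past the last reserved byte */
};

/* Hypothetical per-inode reservation state (assumed sorted, non-overlapping). */
struct inode_resv {
	const struct resv_range *ranges;
	size_t nr;
};

/* Bytes of the write [off, off + len) that fall inside reserved ranges. */
long long resv_overlap(const struct inode_resv *r, long long off, long long len)
{
	long long end = off + len;
	long long covered = 0;
	size_t i;

	assert(off >= 0 && len >= 0);

	for (i = 0; i < r->nr; i++) {
		long long lo = r->ranges[i].start > off ? r->ranges[i].start : off;
		long long hi = r->ranges[i].end < end ? r->ranges[i].end : end;

		if (lo < hi)
			covered += hi - lo;
	}
	return covered;
}

/*
 * Bytes the write still has to charge as new delayed allocation.
 * The part that lands in a reserved range was accounted for when
 * the reservation was made, so it is not charged again.
 */
long long write_new_charge(const struct inode_resv *r, long long off, long long len)
{
	return len - resv_overlap(r, off, len);
}
```

With reservations over [0, 4096) and [8192, 12288), a 4096 byte write at offset 2048 only needs to charge the 2048 bytes that fall outside the first range. A per-process pool has no such per-file range information to consult, which is the accounting problem described above.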