On Fri, Apr 08, 2016 at 12:06:35PM +0200, Jan Tulak wrote:
> On Fri, Apr 8, 2016 at 2:09 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> > On Thu, Mar 24, 2016 at 12:15:34PM +0100, jtulak@xxxxxxxxxx wrote:
> > > From: Jan Tulak <jtulak@xxxxxxxxxx>
> > >
> > > Unify mkfs.xfs behaviour a bit and never truncate files. If the user
> > > is trying to mkfs an existing file, we don't want to destroy anything
> > > he did with the file before (sparse file, allocations...)
> >
> > Why not? We do that with discard-by-default to block devices,
> > O_TRUNC is exactly the same situation with a file - we completely
> > re-initialise the file from a known state if mkfs has been asked to
> > create the file.
> >
> But AFAIK, we don't zero-out entire spindle devices,

Unless the controller above them supports discard or whatever
implementation the storage protocol uses (e.g. UNMAP or WRITE_SAME).
E.g., the "spindle devices" often are big raid arrays that are using
thin provisioning, compression and dedupe internally, so running
discard on them does make a significant difference to their
behaviour.

> we don't ask if the drive skips some blocks (i.e. because they are bad),

That's irrelevant to the issue at hand.

> and we don't care
> about what an underlaying layer (like LVM) did with the block device.

Actually, we do, because users care about their storage stack doing
sane management operations automatically. That's why we issued a
discard - it tells the underlying devices to re-initialise the
storage on this device *if they care about such things*.

Stuff like thinly provisioned devices rely on mkfs behaviour like
this to recycle used storage efficiently and transparently. The user
expects things to "just work" and this is one of those things that
makes it "just work".

> From
> this point of view, we shouldn't care about the file either.
>
> I can be missing something, though.

I think you're missing the fact that we don't know what the
*underlying storage* cares about, so we need to tell them in some
way that a device or image file is being re-initialised from
scratch. Whether that is by truncating the image file (so the
filesystem can issue discards on the now unused space) or by issuing
discard ioctls ourselves, it really doesn't matter. The key point is
that we have a mechanism that allows us to notify the underlying
storage of the "this is re-initialised storage" intent of mkfs.

So from that perspective, the O_TRUNC behaviour should remain.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
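
To make the two notification paths above concrete, here is a rough
sketch in C. It is not the mkfs.xfs code: reinit_target() and its
image_size parameter are hypothetical names made up for illustration,
and the sketch assumes Linux (BLKDISCARD and BLKGETSIZE64 come from
<linux/fs.h>). For a block device it issues a discard over the whole
device and ignores failure, since discard is only a hint. For an image
file it opens with O_TRUNC, so the host filesystem frees the old blocks
and can discard them itself, then extends the file back to the
requested size as a sparse file.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/fs.h>	/* BLKDISCARD, BLKGETSIZE64 (Linux only) */

/*
 * Hypothetical helper, not taken from xfsprogs: open a mkfs target and
 * tell the layer below that its old contents are no longer needed.
 * Returns an open file descriptor, or -1 on error.
 */
int reinit_target(const char *path, off_t image_size)
{
	struct stat st;
	int is_blk = (stat(path, &st) == 0 && S_ISBLK(st.st_mode));
	int fd;

	if (is_blk) {
		/* range[0] = start offset, range[1] = length, in bytes */
		uint64_t range[2] = { 0, 0 };

		fd = open(path, O_RDWR);
		if (fd < 0)
			return -1;
		/* Discard is advisory: ignore devices that don't support it. */
		if (ioctl(fd, BLKGETSIZE64, &range[1]) == 0)
			(void)ioctl(fd, BLKDISCARD, &range);
		return fd;
	}

	/* Image file: O_TRUNC releases every old block to the host fs. */
	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0666);
	if (fd < 0)
		return -1;

	/* Grow back to the requested size; the file is now fully sparse. */
	if (ftruncate(fd, image_size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Either branch ends with the same result as far as the layers below are
concerned: whatever was stored there before has been declared dead, and
the storage is free to reclaim it if it cares about such things.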