On Thu, Jul 23 2015 at 1:10am -0400,
Dave Chinner <david@xxxxxxxxxxxxx> wrote:

> On Wed, Jul 22, 2015 at 11:28:06AM -0500, Eric Sandeen wrote:
> > On 7/22/15 8:34 AM, Mike Snitzer wrote:
> > > On Tue, Jul 21 2015 at 10:37pm -0400,
> > > Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> > >> On Tue, Jul 21, 2015 at 09:40:29PM -0400, Mike Snitzer wrote:
> > >>> I'm open to considering alternative interfaces for getting you the info
> > >>> you need. I just don't have a great sense for what mechanism you'd like
> > >>> to use. Do we invent a new block device operations table method that
> > >>> sets values in a 'struct no_space_strategy' passed in to the
> > >>> blockdevice?
> > >>
> > >> It's long been frowned on having the filesystems dig into block
> > >> device structures. We have lots of wrapper functions for getting
> > >> information from or performing operations on block devices (e.g.
> > >> bdev_read_only(), bdev_get_queue(), blkdev_issue_flush(),
> > >> blkdev_issue_zeroout(), etc) and so I think this is the pattern we'd
> > >> need to follow. If we do that - bdev_get_nospace_strategy() - then
> > >> how that information gets to the filesystem is completely opaque
> > >> at the fs level, and the block layer can implement it in whatever
> > >> way is considered sane...
> > >>
> > >> And, realistically, all we really need returned is an enum to tell us
> > >> how the bdev behaves on enospc:
> > >>	- bdev fails fast, (i.e. immediate ENOSPC)
> > >>	- bdev fails slow, (i.e. queue for some time, then ENOSPC)
> > >>	- bdev never fails (i.e. queue forever)
> > >>	- bdev doesn't support this (i.e. EOPNOTSUPP)
> >
> > I'm not sure how this is more useful than the bdev simply responding to
> > a query of "should we keep trying IOs?"
>
>	- bdev fails fast, (i.e. immediate ENOSPC)
>
> XFS should use a bounded retry behaviour to allow the possibility of
> the admin adding more space before we shut down the fs. i.e.
> XFS fails slow.
>
>	- bdev fails slow, (i.e. queue for some time, then ENOSPC)
>
> We know that IOs are going to be delayed before they are failed, so
> there's no point in retrying as the admin has already had a chance
> to resolve the ENOSPC condition before failure was reported. i.e.
> XFS fails fast.
>
>	- bdev never fails (i.e. queue forever)
>
> Block device will appear to hang when it runs out of space. Nothing
> XFS can do here because IOs never fail, but we need to note this in
> the log at mount time so that filesystem hangs are easily explained
> when reported to us.
>
>	- bdev doesn't support this (i.e. EOPNOTSUPP)
>
> XFS uses default "retry forever" behaviour.
>
> > > This 'struct no_space_strategy' would be invented purely for
> > > informational purposes for upper layers' benefit -- I don't consider it
> > > a "block device structure" in the traditional sense.
> > >
> > > I was thinking upper layers would like to know the actual timeout value
> > > for the "fails slow" case. As such the 'struct no_space_strategy' would
> > > have the enum and the timeout. And would be returned with a call:
> > > bdev_get_nospace_strategy(bdev, &no_space_strategy)
> >
> > Asking for the timeout value seems to add complexity. It could change after
> > we ask, and knowing it now requires another layer to be handling timeouts...
>
> I don't think knowing the bdev timeout is necessary because the
> default is most likely to be "fail fast" in this case. i.e. no
> retries, just shut down. IOWs, if we describe the configs and
> actions in neutral terms, then the default configurations are easy
> for users to understand. i.e:
>
>	bdev enospc		XFS default
>	-----------		-----------
>	Fail slow		Fail fast
>	Fail fast		Fail slow
>	Fail never		Fail never, Record in log
>	EOPNOTSUPP		Fail never
>
> With that in mind, I'm thinking I should drop the
> "permanent/transient" error classifications, and change it to "failure
> behaviour" with the options "fast slow [never]" and only the slow
> option has retry/timeout configuration options. I think the "never"
> option still needs the "fail at unmount" config variable, but we
> enable it by default rather than hanging unmount and requiring a
> manual shutdown like we do now....

This all sounds good to me. The simpler XFS configuration looks like a
nice improvement. If you just want to stub out the call to
bdev_get_nospace_strategy() I can crank through implementing it once I
get a few minutes.

Btw, not sure what I was thinking when suggesting XFS would benefit from
knowing the duration of the thinp no_space_timeout.
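
For illustration, the "bdev enospc -> XFS default" table quoted above
could be stubbed out on the filesystem side roughly as below. This is
only a user-space sketch under loud assumptions: every identifier in it
(the two enums, struct fs_enospc_policy, fs_default_enospc_policy()) is
hypothetical, and since bdev_get_nospace_strategy() does not exist yet,
the block layer's answer is simply passed in as an argument.

```c
#include <stdbool.h>

/*
 * Hypothetical: what the block device reports about its ENOSPC
 * behaviour. None of these names exist in the kernel.
 */
enum bdev_nospace_strategy {
	BDEV_NOSPACE_FAIL_FAST,		/* immediate ENOSPC */
	BDEV_NOSPACE_FAIL_SLOW,		/* queue for some time, then ENOSPC */
	BDEV_NOSPACE_FAIL_NEVER,	/* queue forever */
	BDEV_NOSPACE_UNSUPPORTED,	/* stands in for an EOPNOTSUPP reply */
};

/* Hypothetical: the filesystem's "failure behaviour" options. */
enum fs_enospc_behaviour {
	FS_ENOSPC_FAIL_FAST,		/* no retries, just shut down */
	FS_ENOSPC_FAIL_SLOW,		/* bounded retry, then shut down */
	FS_ENOSPC_FAIL_NEVER,		/* retry forever */
};

struct fs_enospc_policy {
	enum fs_enospc_behaviour behaviour;
	bool log_at_mount;		/* note possible hangs in the log */
};

/* Map the bdev's behaviour to the filesystem default, per the table. */
static struct fs_enospc_policy
fs_default_enospc_policy(enum bdev_nospace_strategy strategy)
{
	/* EOPNOTSUPP case: default "retry forever", nothing logged. */
	struct fs_enospc_policy policy = {
		.behaviour = FS_ENOSPC_FAIL_NEVER,
		.log_at_mount = false,
	};

	switch (strategy) {
	case BDEV_NOSPACE_FAIL_FAST:
		/* Give the admin a chance to add space before shutdown. */
		policy.behaviour = FS_ENOSPC_FAIL_SLOW;
		break;
	case BDEV_NOSPACE_FAIL_SLOW:
		/* The bdev already waited; retrying gains nothing. */
		policy.behaviour = FS_ENOSPC_FAIL_FAST;
		break;
	case BDEV_NOSPACE_FAIL_NEVER:
		/* IOs never fail, so only record it at mount time. */
		policy.log_at_mount = true;
		break;
	case BDEV_NOSPACE_UNSUPPORTED:
		break;
	}
	return policy;
}
```

A mount path would call fs_default_enospc_policy() once with whatever
the block layer reports and keep the result as the default, leaving
room for the admin to override it through the configuration options
discussed in the thread.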