On Fri, Jul 24, 2015 at 09:00:54AM +1000, Dave Chinner wrote: > On Thu, Jul 23, 2015 at 12:43:58PM -0400, Vivek Goyal wrote: > > On Thu, Jul 23, 2015 at 03:10:43PM +1000, Dave Chinner wrote: > > > > [..] > > > I don't think knowing the bdev timeout is necessary because the > > > default is most likely to be "fail fast" in this case. i.e. no > > > retries, just shut down. IOWs, if we describe the configs and > > > actions in neutral terms, then the default configurations easy for > > > users to understand. i.e: > > > > > > bdev enospc XFS default > > > ----------- ----------- > > > Fail slow Fail fast > > > Fail fast Fail slow > > > Fail never Fail never, Record in log > > > EOPNOTSUPP Fail never > > > > > > With that in mind, I'm thinking I should drop the > > > "permanent/transient" error classifications, and change it "failure > > > behaviour" with the options "fast slow [never]" and only the slow > > > option has retry/timeout configuration options. I think the "never" > > > option still needs to "fail at unmount" config variable, but we > > > enable it by default rather than hanging unmount and requiring a > > > manual shutdown like we do now.... > > > > I am wondering instead of 4 knobs (fast,slow,never,retry-timeout) can > > we just do with one knob per error type and that is retry-timout. > > "retry-timeout" == "fail slow". i.e. a 5 minute retry timeout is > configured as: > > # echo slow > fail_method > # echo 0 > max_retries > # echo 300 > retry_timeout Hi Dave, I am sure I am missing something but I will anyway ask. Why do we need this knob "fail_method". Isn't it sort of implied in other two knobs based on their values. max_retries=0 retry_timeout=0 implies fail_method=fast. A non-zero value of max_retries or retry_timeout implies fail_method=slow A very high value (-1) of either max_retries or retry_timeout implies fail_method="almost never". > > retry-timeout=0 (Fail fast) > > retry-timeout=X (Fail slow) > > retry-timeout=-1 (Never Give up). > > What do we do when we want to add a different failure type > with different configuration requirements? Ok, got it. So we are targettting something very generic so that other cases can be handled too. > > > Also do we really need this timeout per error type. > > I don't follow your logic here. What do need a timeout for with > either the "never" or "fast" failure configurations? Ignore this. I had misunderstood it. > > > Also would be nice if this timeout was configurable using a mount > > option. Then we can just specify it during mount time and be done > > with it. > > That way lies madness. The error configuration iinfrastructure we > need is not just for ENOSPC errors on metadata buffers. We need > configurable error behaviour for multiple different errors in > multiple different subsystems (e.g. data IO failure vs metadata > buffer IO failure vs memory allocation failure vs inode corruption > vs freespace corruption vs ....). > > And we still would need the sysfs interface for querying and > configuring at runtime, so mount options are just a bad idea. And > with sysfs, the potential future route for automatic configuration > at mount time is via udev events and configuration files, similar to > block devices. Agreed that sysfs provides lots of flexibility here. I guess I was just thinking in terms of solving this particular issue we are facing. > > > Idea of auto tuning based on what block device is doing sounds reasonable > > but that should not be a requirement for this patch and can go in even > > later. It is one of those nice to have features. > > "this patch"? Just the core infrastructure so far: I was referring to Mike's patch where we add additional method to block device operations. > > 11 files changed, 290 insertions(+), 60 deletions(-) > > and that will need to be split into 4-5 patches for review. There's > a bunch of cleanup that preceeds this, and then there's a patch per > error type we are going to handle in metadata buffer IO completion. > IOWs, the dm-thinp autotuning is just a simple, small patch at the > end of a much larger series - it's maybe 10 lines of code in XFS... Ok. I will wait for the final patches. Thanks Vivek -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html