Re: [RFC PATCH 4/4] btrfs: Moved repair code from inode.c to extent_io.c

Jan Schmidt <list.btrfs@xxxxxxxxxxxxx> · Mon, 25 Jul 2011 10:52:49 +0200

On 25.07.2011 01:01, Andi Kleen wrote:
>> I wasn't clear enough on that: We only track read errors, here. Ans
>> error correction can only happen on the read path. So if the write
>> attempt fails, we can't go into a loop.
> 
> Not in a loop, but you trigger more IO errors, which can be nasty 
> if the IO error logging triggers more IO (pretty common because
> syslogd calls fsync). And then your code does even more IO, floods
> more etc.etc. And the user will be unhappy if their
> console gets flooded.

Okay, I see your point now. Thanks for pointing that out.

> We've have a similar problems in the past with readahead causing
> error flooding.
> 
> Any time where an error can cause more IO you have to be extremly
> careful.
> 
> Right now this seems rather risky to me.

Hum. This brings up a lot of questions. Would you consider throttling an
appropriate solution to prevent error flooding? What would you use as a
base? A per device counter (which might be misleading if there are more
layers below)? A per filesystem counter (which might need
configurability)? Should those "counters" regenerate over time? Any
other approaches?

-Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html