On 17/11/17 10:20, Rui Hua wrote:
Hi, Stefan
2017-11-17 16:28 GMT+08:00 Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx>:
I'm getting the same xfs error message under high load. Does this patch fix
it?
Did you apply the patch "bcache: only permit to recovery read error
when cache device is clean" ?
If you did, maybe this patch can fix it. You should also check
/sys/fs/bcache/XXX/internal/cache_read_races in your environment;
it should be non-zero when you get that error message.
Hi all,
I have 3 servers running a very recent 4.9 stable release, with several
recent bcache patches cherry-picked, including V4 of "bcache: only
permit to recovery read error when cache device is clean".
In the 3 weeks since I started using these cherry picks, I've experienced a very
small number of isolated read errors in the layer above bcache, on all 3
servers.
On one of the servers, 2 out of the 6 bcache resources have a value of 1
in /sys/fs/bcache/XXX/internal/cache_read_races, and it is on these same
2 bcache resources where one read error has occurred on the upper layer.
The other 4 bcache resources have 0 in cache_read_races and I haven't
had any read errors on the layers above them.
On another server, I have 1 bcache resource out of 10 with a value of 5
in /sys/fs/bcache/XXX/internal/cache_read_races, and it is on that
bcache resource where a read error occurred on one occasion. The other 9
bcache resources have 0 in cache_read_races, and no read errors have
occurred on the layers above any of them.
On the 3rd server where some read errors occurred, I cannot verify if
there were positive values in cache_read_races as I moved the data from
there onto other storage, and shut down the bcache resources where the
errors occurred.
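For anyone who wants to collect the same counter on their own machines, here
is a minimal sketch in Python. It assumes the sysfs layout mentioned above
(/sys/fs/bcache/<cache set>/internal/cache_read_races) and that the file
holds a plain integer. The function name and the root parameter are my own
and are not part of bcache.

```python
from pathlib import Path


def read_cache_read_races(root="/sys/fs/bcache"):
    """Return {cache set directory name: cache_read_races value} under root.

    Assumes each cache set exposes internal/cache_read_races as a plain
    integer; entries that cannot be read or parsed are skipped.
    """
    counts = {}
    for counter in sorted(Path(root).glob("*/internal/cache_read_races")):
        # counter is <root>/<cache set>/internal/cache_read_races
        cache_set = counter.parent.parent.name
        try:
            counts[cache_set] = int(counter.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or non-numeric content: leave this cache set out.
            continue
    return counts


if __name__ == "__main__":
    for cache_set, races in read_cache_read_races().items():
        marker = "  <-- non-zero" if races else ""
        print(f"{cache_set}: {races}{marker}")
```

Run as a script, it prints one line per cache set and marks the ones with a
non-zero count, which can then be compared against where the read errors
showed up.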
If I can provide any other info which might help with this issue, please
let me know.
regards,
Eddie