On 2022-05-27 03:15, Eric Wheeler wrote:
On Mon, 23 May 2022, Coly Li wrote:
On 5/18/22 9:22 AM, Eric Wheeler wrote:
> Some time ago you ordered an SSD to test the 4k cache issue, has that
> been fixed? I've kept an eye out for the patch but not sure if it was
> released.
Yes, I got the Intel P3700 PCIe SSD to fix the 4Kn unaligned I/O issue
(borrowed from a hardware vendor). The new situation is that the current
kernel checks sector size alignment quite early in the bio layer: if the
LBA is not sector size aligned, it is rejected in the bio code, and the
underlying driver has no chance to see the bio anymore. So for now, an
unaligned LBA for a 4Kn device cannot reach the bcache code; that is to
say, the originally reported condition won't happen now.
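To make the alignment rule above concrete, here is a minimal sketch in C.
It is not the kernel's actual bio code: the helper name
io_is_lbs_aligned() and its parameters are hypothetical, and it only
assumes that the block layer addresses I/O in 512-byte sectors while a
4Kn device reports a 4096-byte logical block size.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* block-layer sectors are 512 bytes */

/*
 * Hypothetical helper, for illustration only.
 * Returns true when an I/O starting at `sector` (in 512-byte units) and
 * covering `nr_bytes` bytes begins and ends on a logical block boundary.
 * Assumes logical_block_size is a multiple of 512 and at least 512.
 */
static bool io_is_lbs_aligned(uint64_t sector, uint32_t nr_bytes,
                              uint32_t logical_block_size)
{
    uint64_t sectors_per_block = logical_block_size >> SECTOR_SHIFT;

    if (sector % sectors_per_block)
        return false; /* start is not on a logical block boundary */
    if (nr_bytes % logical_block_size)
        return false; /* length is not a whole number of logical blocks */
    return true;
}
```

With a 4096-byte logical block size, a request starting at sector 8
passes, while one starting at sector 7 is rejected. The same request at
sector 7 is fine on a device with 512-byte logical blocks, which is why
the problem only shows up on 4Kn devices.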
The issue is not with unaligned 4k IOs hitting /dev/bcache0 because you
are right, the bio layer will reject those before even getting to
bcache:
The issue is that the bcache cache metadata sometimes makes metadata or
journal requests from _inside_ bcache that are not 4k aligned. When
this happens the bio layer rejects the request from bcache (not from
whatever is above bcache).
Correct me if I misunderstood what you meant here, maybe it really was
fixed. Here is your response from that old thread that pointed at
unaligned key access where you said "Wow, the above lines are very
informative, thanks!"
It was not fixed, at least I didn't do it on purpose. Maybe it was
avoided by other fixes, e.g. the oversize bkey fix. But I don't have
evidence that the issue was fixed.
bcache: check_4k_alignment() KEY_OFFSET(&w->key) is not 4KB aligned: 15725385535
https://www.spinics.net/lists/linux-bcache/msg06076.html
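As a quick sanity check of the number in that log line, the sketch below
computes how far an offset sits past the previous 4 KiB boundary. It
assumes the reported KEY_OFFSET value is expressed in 512-byte sectors,
so that 4 KiB corresponds to 8 sectors; the helper name
sectors_past_4k_boundary() is hypothetical and is not the debug patch
from that thread.

```c
#include <stdint.h>

#define SECTORS_PER_4K 8u /* 4096 bytes / 512-byte sectors (assumed unit) */

/*
 * Hypothetical helper, for illustration only.
 * Returns the number of 512-byte sectors by which `offset_sectors`
 * exceeds the previous 4 KiB boundary; 0 means the offset is aligned.
 */
static unsigned int sectors_past_4k_boundary(uint64_t offset_sectors)
{
    return (unsigned int)(offset_sectors & (SECTORS_PER_4K - 1));
}
```

Under that assumption, 15725385535 leaves a remainder of 7, i.e. the
offset is one sector short of the next 4 KiB boundary.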
In that thread Kent sent a quick top-post asking "have you checked
extent merging?"
https://www.spinics.net/lists/linux-bcache/msg06077.html
It embarrassed me that I received your informative debug information
and glared very hard at the code for quite a long time, but didn't have
any clue how such a problem could happen in the extent-related code.
Since you reported the issue and I believe you, I will keep my eyes on
the non-aligned 4Kn issue for bcache internal I/O. I hope that someday I
may suddenly have an idea that points out where the problem is, and fix
it.
And after this observation, I stopped my investigation of the
unaligned sector size I/O on the 4Kn device, and returned the P3700 PCIe
SSD to the hardware vendor.
Hmm, sorry that it wasn't reproduced. I hope I'm wrong, but if bcache
is generating the 4k-unaligned requests against the cache meta then this
bug might still be floating around for "4Kn" cache users.
I don't think you were wrong, you are someone I believe :-) It just
needs time and luck...
Coly Li