Re: RAID6: "Bad block number requested"

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Mon, 11 Jun 2018 15:09:39 -0700

On Mon, 2018-06-11 at 17:56 -0400, Bryan Gurney wrote:
> On Mon, Jun 11, 2018 at 1:00 PM, Anthony Youngman
> <anthony@xxxxxxxxxxxxxxx> wrote:
> > On 11/06/18 16:06, James Bottomley wrote:
> > > Well, this is the problem: a 4k logical (presumably 4k physical)
> > > drive cannot be addressed in block sectors that are not divisible
> > > by 8.  This type of drive configuration is very unusual (although
> > > it was something we tested years ago before the industry realised
> > > it had to ship drives with 4k physical but 512 byte logical
> > > sectors because of the legacy problem).
> > 
> > I understood these drives were now becoming much more common,
> > especially enterprise-grade drives. I know there were problems
> > switching from 512/512 drives to 512/4096, but as you say I thought
> > they were pretty much addressed.
> 
> As soon as I saw the model number "HGST HUH721010AL", and did a
> search, I said, "Oh, it's _this_ drive."
> 
> The HGST Ultrastar He10 has both "512e Format" and "4K Native Format"
> part numbers, so it's easy to potentially buy the wrong type of drive
> (e.g.: accidentally buy a 4K Native drive, and discover some obscure
> I/O failures).
> 
> FYI, in my experience, when an application sends a
> smaller-than-4096-bytes I/O to a 4096-bytes block device, the usual
> error code that's sent by the driver is EINVAL (or "Invalid
> argument"), so see if there's a log message citing that error code.

We've done the work to make 4k-native drives function.  However, that
was a while ago and I don't believe anyone tests them regularly now
(particularly the corner cases), so errors can creep back into the
stack.

> > I think it must be a couple of years ago now though, that I heard
> > (on LWN) enterprise drives were apparently switching over to
> > 4096/4096. With NO 512 emulation fall-back.
> 
> Some drive manufacturers seem to be more eager than others, but
> there's still work to be done.  For example, try this with a 4K-
> native drive:
> 
> 1. Write an ISO image to the drive with the command "dd
> if=isofile.iso of=/dev/testdevice bs=4096 oflag=direct"
> 
> 2. Create a test directory (for example, "/mnt/testdir"), then
> attempt to mount the device with "mount /dev/testdevice /mnt/testdir"

This is a textbook case of something that can never work: the
requirement for a 4k drive is that the stack must be aligned, meaning a
block size of 4k or a multiple of 4k all the way up and down.  The isofs
you're copying only has a 2k block size.  You get the same failure with
any filesystem whose block size is not a multiple of 4k.  Fortunately,
most modern filesystems have had block sizes of 4k, or a multiple
thereof, for a while now, so you're unlikely to see this on your old
ext4 devices but, in principle, it could happen.
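
As a rough illustration of the alignment rule, here is a minimal Python
sketch.  The function names are made up for this example; the only
external detail it relies on is the sysfs attribute
/sys/block/<dev>/queue/logical_block_size, and the device name passed to
that helper is whatever your system calls the drive.

```python
import os

SECTOR_SIZE = 512  # unit the Linux block layer uses for sector numbers


def fs_block_size_fits(fs_block_size: int, logical_block_size: int) -> bool:
    """True if the filesystem block size is a positive whole multiple
    of the device's logical block size."""
    return fs_block_size > 0 and fs_block_size % logical_block_size == 0


def sector_is_addressable(sector: int, logical_block_size: int) -> bool:
    """True if a 512-byte sector number lands on a logical block boundary.
    For a 4096-byte device this means the sector is divisible by 8."""
    return (sector * SECTOR_SIZE) % logical_block_size == 0


def read_logical_block_size(dev_name: str, sysfs_root: str = "/sys/block") -> int:
    """Read the logical block size the kernel reports for a block device.
    sysfs_root is a parameter only so the helper can be pointed elsewhere."""
    path = os.path.join(sysfs_root, dev_name, "queue", "logical_block_size")
    with open(path) as f:
        return int(f.read().strip())


# A 2k isofs on a 4k-native drive does not fit; a 4k filesystem does.
print(fs_block_size_fits(2048, 4096))
print(fs_block_size_fits(4096, 4096))
```

On a 512e drive the same check passes for a 2k block size, because the
logical block size there is 512.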

James

> When I tried it on RHEL 7.5, I saw this: "kernel: isofs_fill_super:
> bread failed, dev=testdevice, iso_blknum=17, block=-2147483648"
> 
> Note that ISO filesystems have a 2048-byte block size (maximum), but
> in this test, it's stored on a block device with a block size of 4096
> bytes.
> 
> There may be more issues out there, but they have to be found first.
> And finding the issues is difficult, due to the obscurity of the
> error messages seen.
> 
> 
> Thanks,
> 
> Bryan
> 
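
To make the mismatch in the log line above concrete, the following
sketch works out where ISO block 17 falls on a device with 4096-byte
blocks.  The helper name is hypothetical, and this only shows the
arithmetic; it makes no attempt to explain the negative "block=" value
in the message.

```python
def locate(iso_blknum: int, fs_block_size: int = 2048,
           dev_block_size: int = 4096) -> tuple[int, int, int]:
    """Return (byte offset, device block index, offset within that block)
    for a filesystem block number."""
    byte_offset = iso_blknum * fs_block_size
    dev_block, remainder = divmod(byte_offset, dev_block_size)
    return byte_offset, dev_block, remainder


byte_offset, dev_block, remainder = locate(17)
# A non-zero remainder means the filesystem block starts partway
# through a device block, so it cannot be read as one aligned unit.
print(byte_offset, dev_block, remainder)  # prints: 34816 8 2048
```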
