On Mon, Jun 11, 2018 at 6:09 PM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 2018-06-11 at 17:56 -0400, Bryan Gurney wrote:
>> On Mon, Jun 11, 2018 at 1:00 PM, Anthony Youngman
>> <anthony@xxxxxxxxxxxxxxx> wrote:
>> > On 11/06/18 16:06, James Bottomley wrote:
>> > > Well, this is the problem: a 4k logical (presumably 4k physical)
>> > > drive cannot be addressed in block sectors that are not divisible
>> > > by 8.  This type of drive configuration is very unusual (although
>> > > it was something we tested years ago before the industry realised
>> > > it had to ship drives with 4k physical but 512 byte logical
>> > > sectors because of the legacy problem).
>> >
>> > I understood these drives were now becoming much more common,
>> > especially enterprise-grade drives. I know there were problems
>> > switching from 512/512 drives to 512/4096, but as you say I thought
>> > they were pretty much addressed.
>>
>> As soon as I saw the model number "HGST HUH721010AL", and did a
>> search, I said, "Oh, it's _this_ drive."
>>
>> The HGST Ultrastar He10 has both "512e Format" and "4K Native Format"
>> part numbers, so it's easy to potentially buy the wrong type of drive
>> (e.g.: accidentally buy a 4K Native drive, and discover some obscure
>> I/O failures).
>>
>> FYI, in my experience, when an application sends a
>> smaller-than-4096-bytes I/O to a 4096-bytes block device, the usual
>> error code that's sent by the driver is EINVAL (or "Invalid
>> argument"), so see if there's a log message citing that error code.
>
> We've done the work to make this function.  However, it was a while ago
> and I don't believe anyone tests regularly now (particularly with the
> corner cases) so errors can creep back into the stack.

Ah, okay.
I was thinking more in the context of the error itself being relatively
obscure to find, since the program trying to perform the I/O operation
may report the error in a way that makes it look as though an invalid
argument to a command was received.  (At least that's how I discovered
this, when I was wondering why I was seeing "invalid argument" after
trying a command that should have worked, but failed; a blktrace run
revealed a less-than-4096-byte read that was being attempted, but failed
with EINVAL.)

>> > I think it must be a couple of years ago now though, that I heard
>> > (on LWN) enterprise drives were apparently switching over to
>> > 4096/4096. With NO 512 emulation fall-back.
>>
>> Some drive manufacturers seem to be more eager than others, but
>> there's still work to be done.  For example, try this with a
>> 4K-native drive:
>>
>> 1. Write an ISO image to the drive with the command "dd
>> if=isofile.iso of=/dev/testdevice bs=4096 oflag=direct"
>>
>> 2. Create a test directory (for example, "/mnt/testdir"), then
>> attempt to mount the device with "mount /dev/testdevice /mnt/testdir"
>
> This is a textbook case of something that can never work: The
> requirement for a 4k drive is that the stack must be aligned, meaning
> 4k or multiple of 4k block size all the way up and down.  The isofs
> you're copying only has a 2k block size.  You get the same failure with
> any non 4k multiple filesystem block size.  Fortunately most modern
> filesystems have had 4k, or multiple thereof, block sizes for a while
> now, so you're unlikely to see this on your old ext4 devices but, in
> principle, it could happen.
>
> James

Then I hope that drive manufacturers don't start making 4K-native USB
flash drives; otherwise, we'll have a confusing situation on our hands.
Bryan

>
>> When I tried it on RHEL 7.5, I saw this: "kernel: isofs_fill_super:
>> bread failed, dev=testdevice, iso_blknum=17, block=-2147483648"
>>
>> Note that ISO filesystems have a 2048-byte block size (maximum), but
>> in this test, it's stored on a block device with a block size of 4096
>> bytes.
>>
>> There may be more issues out there, but they have to be found first.
>> And finding the issues is difficult, due to the obscurity of the
>> error messages seen.
>>
>>
>> Thanks,
>>
>> Bryan
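
For anyone following along, the alignment requirement discussed in this
thread can be illustrated with a little arithmetic. The sketch below is
only an illustration, not how the kernel performs the check: the helper
names (is_aligned, iso_block_io) are hypothetical, and the 2048-byte ISO
block size and iso_blknum=17 are taken from the messages above. On a
real system the logical block size would typically be read from
/sys/block/<dev>/queue/logical_block_size rather than hard-coded.

```python
# Minimal sketch: is an I/O request aligned to a device's logical block size?
# All names here are hypothetical helpers for illustration only.

def is_aligned(offset_bytes, length_bytes, logical_block_size):
    """True if the I/O starts on a logical-block boundary and its
    length is a whole number of logical blocks."""
    return (offset_bytes % logical_block_size == 0
            and length_bytes % logical_block_size == 0)


def iso_block_io(iso_blknum, iso_block_size=2048):
    """Byte offset and length of a read of one ISO filesystem block."""
    return iso_blknum * iso_block_size, iso_block_size


# The failing read from the log message: iso_blknum=17.
offset, length = iso_block_io(17)

# Expressed in 512-byte sectors, the start must be divisible by 8
# to land on a 4096-byte boundary.
sector_512 = offset // 512

print("byte offset:", offset, "length:", length)
print("512-byte sector:", sector_512, "divisible by 8:", sector_512 % 8 == 0)
print("aligned on 512-byte device: ", is_aligned(offset, length, 512))
print("aligned on 4096-byte device:", is_aligned(offset, length, 4096))
```

The same 2048-byte read that is fine on a 512-byte-logical (including
512e) drive is neither aligned in offset nor a full block in length on a
4K-native drive, which matches the failure mode described above.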