Re: Slow USB storage device?

Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> · Fri, 15 Jan 2010 08:21:38 -0800 (PST)

On Fri, 15 Jan 2010, Karel Zak wrote:
> 
> On Thu, Jan 14, 2010 at 04:28:11PM -0800, Linus Torvalds wrote:
> >   Except it takes about half a minute to show up as a disk, because udev 
> >   and blkid take _forever_, and try to read a hell of a lot more than 
> >   they should. Can somebody please fix this? ]
> 
>  I don't think they read more than they should. Almost all filesystem
>  superblocks are in the first 64kB of the disk, and if you ask for
>  RAID probing we also need to seek to the end of the disk ...

See my other email.

That "probing" reads more than a quarter of the whole disk!

There is no way that is acceptable.

Think about it this way: if it was a really fast 500GB _real_ disk, do you 
think it would be ok to read 100GB of data to find a raid volume?

I don't think so.

>  devkit-disks-part-id reads 512 bytes from the device to parse the
>  partition table. The libblkid library is trying to detect more than
>  50 filesystems/RAIDs.

Yes, and it's not acceptable.

It's f*cking _moronic_ to read 140kB (and yes, that's _without_ taking 
read-ahead into account, since all those small reads will all have to read 
4kB each. With read-ahead triggering, it ends up being closer to 200kB):

	_llseek(3, 0, [0], SEEK_SET)            = 0
	read(3, 69632) = 69632
	_llseek(3, 393216, [393216], SEEK_SET)  = 0
	read(3, 64) = 64
	_llseek(3, 495616, [495616], SEEK_SET)  = 0
	read(3, 64) = 64
	_llseek(3, 505344, [505344], SEEK_SET)  = 0
	read(3, 40) = 40
	_llseek(3, 374272, [374272], SEEK_SET)  = 0
	read(3, 40) = 40
	_llseek(3, 504832, [504832], SEEK_SET)  = 0
	read(3, 48) = 48
	_llseek(3, 505344, [505344], SEEK_SET)  = 0
	read(3, 6) = 6
	_llseek(3, 505344, [505344], SEEK_SET)  = 0
	read(3, 51) = 51
	_llseek(3, 505344, [505344], SEEK_SET)  = 0
	read(3, 292) = 292
	_llseek(3, 504832, [504832], SEEK_SET)  = 0
	read(3, 18) = 18
	_llseek(3, 473600, [473600], SEEK_SET)  = 0
	read(3, 24) = 24
	_llseek(3, 375296, [375296], SEEK_SET)  = 0
	read(3, 24) = 24
	_llseek(3, 374784, [374784], SEEK_SET)  = 0
	read(3, 24) = 24
	_llseek(3, 497664, [497664], SEEK_SET)  = 0
	read(3, 24) = 24
	_llseek(3, 301568, [301568], SEEK_SET)  = 0
	read(3, 24) = 24
	_llseek(3, 500224, [500224], SEEK_SET)  = 0
	read(3, 4) = 4
	_llseek(3, 505344, [505344], SEEK_SET)  = 0
	read(3, 512) = 512
	_llseek(3, 270336, [270336], SEEK_SET)  = 0
	read(3, 1024) = 1024
	_llseek(3, 262144, [262144], SEEK_SET)  = 0
	read(3, 1377) = 1377

See? That's just the result of crazy shit. This is a half-meg USB-1 
device. Not even USB-2. If somebody puts RAID on such a puppy, they are 
(a) insane and (b) should be forced to do some _manual_ configuration, 
rather than have blkid try to autodetect insane situations.
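
To make the arithmetic behind the 140kB figure explicit, here is a minimal sketch that adds up the read sizes from the trace above. It assumes a 4kB page size and, pessimistically, that every read() is served on its own with no page-cache hits between reads; the function names are illustrative only and not part of blkid or the kernel.

```python
# Read sizes copied from the strace output above, in order of appearance.
READ_SIZES = [69632, 64, 64, 40, 40, 48, 6, 51, 292, 18,
              24, 24, 24, 24, 24, 4, 512, 1024, 1377]

PAGE_SIZE = 4096  # assumed page size, matching the "4kB each" in the text


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Number of whole pages needed to cover nbytes (ceiling division)."""
    return -(-nbytes // page_size)


def bytes_requested(sizes):
    """Bytes the caller actually asked for."""
    return sum(sizes)


def bytes_transferred(sizes, page_size=PAGE_SIZE):
    """Bytes pulled off the device if each read is rounded up to whole pages.

    Assumption: offsets and cache hits are ignored, so every read costs
    at least one full page. No read-ahead is modelled.
    """
    return sum(pages_needed(n, page_size) * page_size for n in sizes)


if __name__ == "__main__":
    print("requested:  ", bytes_requested(READ_SIZES), "bytes")
    print("transferred:", bytes_transferred(READ_SIZES) // 1024, "KiB")
```

Under those assumptions the one big read accounts for 69632 bytes and the other 18 reads for one page each, which is where the 140kB comes from, even though the caller only asked for about 73kB.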

There is no excuse. Let me show what the kernel printed out:

	usb 3-1: new full speed USB device using uhci_hcd and address 8
	sd 10:0:0:0: [sdc] 988 512-byte logical blocks: (505 kB/494 KiB)
	 sdc: sdc1
	sdc: p1 size 987 exceeds device capacity, limited to end of disk

and the above three lines are enough to determine that it shouldn't be 
treated like some crazy RAID config:

 - "full speed USB" means 12 Mbit/s and 64-byte packets. IOW, we really 
   are talking floppy speeds even at the _best_ of times. Yes, 16kB/s is 
   really slow, but even at the best of times, you can't get over about
   500kB/s over that slow interface (yeah, 12Mbit/s theoretical, in 
   practice it's much less).

   It's good for floppies and serial lines, not RAID devices.

 - 494KiB! I bet you can't even _find_ a floppy that small any more. Sure, 
   they used to have 180kB SDSS 5.25" floppies, but afaik, the smallest 
   3.5" floppy you could buy is 720kB.

   And quite frankly, I'm not even sure of the above, because it's been 
   probably over 15 years since I saw a 5.25" floppy last. Even 3.5" ones 
   I haven't seen in years.

 - "sdc: sdc1". We found a perfectly good partition table. In that size, 
   there is simply absolutely no excuse for looking for anything else.
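
The three hints above could be combined into a simple check before any RAID autodetection is attempted. The following is only a sketch of that idea: the `Device` type, the function name and both thresholds are hypothetical, chosen for illustration, and none of it reflects what udev or libblkid actually implement.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical summary of what the kernel already reported."""
    size_bytes: int
    link_mbit_per_s: float
    has_partition_table: bool


# Hypothetical thresholds, not taken from any real tool.
MIN_RAID_SIZE_BYTES = 1440 * 1024   # floppy-sized or smaller
MIN_RAID_LINK_MBIT = 480.0          # anything slower than USB-2 high speed


def should_autoprobe_raid(dev):
    """Return False when probing for RAID signatures is clearly pointless."""
    # A full speed (12 Mbit/s) link is floppy territory.
    if dev.link_mbit_per_s < MIN_RAID_LINK_MBIT:
        return False
    # A tiny device with a valid partition table needs no further guessing.
    if dev.size_bytes < MIN_RAID_SIZE_BYTES and dev.has_partition_table:
        return False
    return True


if __name__ == "__main__":
    stick = Device(size_bytes=988 * 512, link_mbit_per_s=12.0,
                   has_partition_table=True)
    print(should_autoprobe_raid(stick))
```

Anything that fails such a check would be left for manual configuration.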

That's basically what it boils down to: reading lots of data off such a 
device is clearly insane. You wouldn't read a quarter of a real disk, why 
do you read a quarter of this one?
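
For reference, the "quarter" can be checked against the numbers in this mail. The sketch below uses the 988 512-byte blocks from the kernel log and the 140kB probing figure from above; treating 16kB/s as 16000 bytes per second is an assumption, and the helper names are illustrative only.

```python
LOGICAL_BLOCKS = 988        # from the kernel log line for sdc
BLOCK_SIZE = 512            # bytes per logical block, same log line
PROBE_BYTES = 140 * 1024    # the 140kB figure, without read-ahead
SLOW_RATE = 16 * 1000       # 16kB/s taken as 16000 bytes/s (assumption)


def device_bytes(blocks=LOGICAL_BLOCKS, block_size=BLOCK_SIZE):
    """Total capacity of the device in bytes."""
    return blocks * block_size


def probe_fraction(probe_bytes=PROBE_BYTES):
    """Share of the whole device that the probing reads."""
    return probe_bytes / device_bytes()


def transfer_seconds(nbytes=PROBE_BYTES, rate=SLOW_RATE):
    """Raw transfer time at the given rate, ignoring per-request overhead."""
    return nbytes / rate


if __name__ == "__main__":
    print(device_bytes() // 1024, "KiB total")
    print(f"{probe_fraction():.0%} of the device read while probing")
    print(f"{transfer_seconds():.1f} s of raw transfer at 16kB/s")
```

The capacity comes out at exactly 494 KiB, so the probing touches a bit more than a quarter of the device before read-ahead is even counted.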

> >  - Why do we try to even identify /dev/sdc, when the kernel has already 
> >    partitioned it and we'd be much better identifying just the partition?
> 
>  I had the same question one week ago. This "feature" has a pretty bad
>  side effect, see https://bugzilla.redhat.com/show_bug.cgi?id=543749
>  I hope we will fix this problem.

Yeah, that's the separate problem of RAID signatures just being crazy sh*t 
to begin with, and some of them being particularly bad. It's sad when you 
re-partition a disk, and some old crud at the end of the disk then makes 
various MD tools not see the new format, because they are so enamoured 
with the old one.

			Linus