On Wed, Aug 26, 2009 at 06:38:26AM +0200, Hans de Goede wrote:
> Hi All,
>
> While testing Intel BIOS RAID using mdraid I noticed that the kernel's
> view of the partition table never changes even though we successfully
> make Parted.Disk.commitToOs() calls.
>
> This has led to me diving into libparted's commit_to_os() code for Linux
> and there are multiple issues hiding in there:
>
> 1) Parted reads /sys/block/foo/range to determine how many partitions
>    the device type supports and then makes BLKPG ioctl's to update the
>    kernel's view of the partition table for partitions which fall into
>    this range. However for example /sys/block/sda/range contains 16,
>    there are 2 issues with libparted using this number:
>    1) scsi major's only support 15 partitions, 1 of the range of 16
>       is reserved for the whole device, yet libparted will try
>       to notify the kernel about 16 partitions if present
>    2) If the major's partition minor's run out, the kernel will switch
>       to the mdp major for the other partitions, iow range no longer limits
>       the number of partitions.
>
> 2) libparted assumes the user knows what he is doing, and will ignore
>    -EBUSY errors for partitions, assuming that the user is smart enough
>    to only change unused partitions (BAD, really really BAD)
>
> 3) because of 1) libparted will only sync 1 partition on /dev/md# devices
>    (would be 0 if not for the off by 1 bug as all md#p# partitions use the
>    mdp major), and it fails to even do that without reporting an error.
>
> ###
>
> Now we can fix 1) by simply not checking /sys/block/foo/range, but instead
> just syncing as many partitions as are in the table.

This is not a viable solution: if we only refresh the partitions that are
in the table, a deleted partition is never removed from the kernel's view.

> 2) is more troublesome,
> we could just make -EBUSY an error, but that may annoy / bug some users.

This is something that I was working on last week and is the reason I asked
you (Hans) to run a script. It indeed returns EBUSY (sometimes) when the
partition is not really mounted. I have tested a mechanism that retries the
call an arbitrary number of times, and the call seems to succeed after some
4 retries. But I don't like arbitrary values. Moreover, I don't think passing
this error on to the user is the solution, given that it is easily solved
with some retries. What I am trying to do at the moment is come up with a
mechanism that does not involve "arbitrary" numbers.

>
> An even bigger problem IMHO is the use of the BLKPG ioctl instead of BLKRRPART
> at all. What this does is tell the kernel parted's view of the partition table
> and make it use that, instead of telling the kernel to reread the partition table.
> According to the parted sources this is done for the case where the kernel does
> not know the disklabel type. However during initial scanning, when we don't modify
> a disk, and during boot and normal running of the system, we rely on the kernel's
> view. So IMHO it would be much better to always use the kernels view and just
> always call BLKRRPART in commit_to_os(), this would solve all of the above issues.
>

Well, the thing you are expressing here is the difference between
commit_to_os and commit_to_dev. One tells the kernel what parted has in
memory and the other writes to disk. I think a much better solution would be
a commit_kernel_read_dev function that implements just what you suggested.

> Regards,
>
> Hans
>
> _______________________________________________
> Anaconda-devel-list mailing list
> Anaconda-devel-list@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/anaconda-devel-list

-- 
Joel Andres Granados
Brno, Czech Republic, Red Hat.
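
To make the BLKRRPART-plus-retry idea discussed above concrete, here is a minimal sketch in Python (chosen because the callers mentioned are pyparted's Parted.Disk.commitToOs()). It is not libparted or pyparted code: the function name reread_partition_table, the time limit, and the back-off delays are all assumptions made for illustration. It swaps the fixed retry count for a deadline, which is still a tunable value, so it only narrows the "arbitrary number" concern rather than removing it. The ioctl, sleep, and clock callables are injectable only so the logic can be exercised without root or a real block device.

```python
import errno
import fcntl
import time

# _IO(0x12, 95) from <linux/fs.h>: ask the kernel to reread the partition table.
BLKRRPART = 0x125F


def reread_partition_table(fd, ioctl=fcntl.ioctl, timeout=5.0,
                           first_delay=0.05, sleep=time.sleep,
                           clock=time.monotonic):
    """Issue BLKRRPART on fd, retrying while the kernel reports EBUSY.

    Hypothetical helper: timeout and first_delay are illustrative values.
    Returns the number of attempts made. Any error other than EBUSY, or an
    EBUSY that persists past the deadline, is raised to the caller instead
    of being silently ignored.
    """
    deadline = clock() + timeout
    delay = first_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            ioctl(fd, BLKRRPART)
            return attempts
        except OSError as exc:
            # Only a transient EBUSY before the deadline is worth retrying.
            if exc.errno != errno.EBUSY or clock() >= deadline:
                raise
        sleep(delay)
        # Exponential back-off, capped at one second between attempts.
        delay = min(delay * 2, 1.0)
```

A caller would open the whole-disk device node (for example with os.open() on a path of its choosing, read-only) and pass the resulting file descriptor; the call needs sufficient privileges and fails with EBUSY for as long as any partition on the disk is genuinely in use.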