Re: Fwd: Re: Multiple issues with parted

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 26, 2009 at 09:13:34AM +0200, Hans de Goede wrote:
> Hi,
>
> On 08/26/2009 01:41 PM, Joel Granados wrote:
>> On Wed, Aug 26, 2009 at 06:38:26AM +0200, Hans de Goede wrote:
>>> Hi All,
>>>
>>> While testing Intel BIOS RAID using mdraid I noticed that the kernels
>>> view of the partition table never changes even though we successfully
>>> make Parted.Disk.commitToOs() calls.
>>>
>>> This has let to me diving into libparted's commit_to_os() code for Linux
>>> and there are multiple issues hiding in there:
>>>
>>> 1) Parted reads /sys/block/foo/range to determine how many partitions
>>>     the device type supports and then makes BLKPG ioctl's to update the
>>>     kernels view of the partition table for partitions which fall into
>>>     this range. However for example /sys/block/sda/range contains 16,
>>>     there are 2 issue with libparted using this number:
>>>     1) scsi major's only support 15 partitions, 1 of the range of 16
>>>        is reserved for the whole device, yet libparted will try
>>>        to notify the kernel about 16 partitions if present
>>>     2) If the major's partition minor's run out, the kernel will switch
>>>        to the mdp major for the other partitions, iow range no longer limits
>>>        the number of partitions.
>>>
>>> 2) libparted assumes the user knows what he is doing, and will ignore
>>>     -ebusy errors for partitions, assuming that the user is smart enough
>>>     to only change unused partitions (BAD, really really BAD)
>>>
>>> 3) because of 1) libparted will only sync 1 partition on /dev/md# devices
>>>     (would be 0 if not for the of by 1 bug as all md#p# partitions use the
>>>      mdp major), and it fails to even do that without reporting an error.
>>>
>>> ###
>>>
>>> Now we can fix 1) by simply not checking /sys/block/foo/range, but instead
>>> just syncing as many partitions as are in the table.
>>
>> This is not a plausable solution as deleting a partition would not show
>> up in the refresh if we only refresh the partitions that are in the
>> table.
>>
>
> Good point.
>
>>> 2) is more troublesome,
>>> we could just make -EBUSY an error, but that may annoy / bug some users.
>>
>> This is something that I was working on last week and is the reason I
>> asked you (hans) to run a script.  It indeed returns EBUSY (sometimes)
>> when the partition is not realy mounted.  I have tested a mechanism by
>> which I retry a arbitrary number of times and the call seems to work
>> after some 4 retries.  But, I don't like arbitrary values.  Moreover, I
>> don't thing getting this error to the user is the solution, given that
>> it is easyly solvable with some retries.  What I am trying to do ATM is
>> come up with a mechanism that does not involve "arbitrary" numbers.
>>
>
> Hmm, are you sure it really is randomly returning -EBUSY ?? Maybe some slow
> process is using the disk every now and then.

Unfortunatelly I'm not sure of what is causing the issue.  I call it
random because I do not understand the behavior yet.

>
>
>>>
>>> An even bigger problem IMHO is the use of the BLKPG ioctl instead of BLKRRPART
>>> at all. What this does is tell the kernel parted's view of the partition table
>>> and make it use that, instead of telling the kernel to reread the partition table.
>>> According to the parted sources this is done for the case where the kernel does
>>> not know the disklabel type. However during initial scanning, when we don't modify
>>> a disk, and during boot and normal running of the system, we rely on the kernel's
>>> view. So IMHO it would be much better to always use the kernels view and just
>>> always call BLKRRPART in commit_to_os(), this would solve all of the above issues.
>>>
>>
>> Well, the thing you are expressing here is the difference between
>> commit_to_os and commit_to_dev.  one tells the kernel what parted has in
>> memory and the other writes to disk.
>>
>
> Erm, no, these are both commit to os. The difference is BLKPG is modifying the kernels

where does ped_disk_commit_to_dev call BLKPG or BLKRRPART?  I see it
getting to BLKFLSBUF, but not to the others.  So ped_disk_commit_to_dev
does not commit to os. it simply wirtes to dev.

> partitiontable to represent parted's in memory view. Where as BLKRRPART tells the kernel
> to re-read the table from disk and use that, since on a normal boot the kernel reads the
> table from disk, telling it to re-read it seems best to me, as that leads to the kernel
> actually using the same view of the table as will be used during a normal boot of the
> system (although I admit the 2 views should not differ, currently they can due to multiple
> issues with parted's BLKPG code).
>
>> I think a much better solution would be a commit_kernel_read_dev
>> function, that implements just what you suggested.
>
> IMHO that is what commit_to_os is supposed to do, and what it does in some cases. Currently
> parted is not very consistend in some scenario's it will use BLKPG and ignore ebusy errors

Ack, not consistent...

> and in others it will use BLKRRPART and report EBUSY errors. It seems more consistent to
> me to just always use BLKRRPART.
>
> See the attached patch which I'm currently using for testing.
>
> Regards,
>
> Hans

> diff -up parted-1.9.0/libparted/arch/linux.c~ parted-1.9.0/libparted/arch/linux.c
> --- parted-1.9.0/libparted/arch/linux.c~	2009-08-26 06:40:11.000000000 +0200
> +++ parted-1.9.0/libparted/arch/linux.c	2009-08-26 06:51:21.000000000 +0200
> @@ -41,7 +41,6 @@
>  #include <libdevmapper.h>
>  #endif
>  
> -#include "blkpg.h"
>  #include "../architecture.h"
>  #include "dirname.h"
>  
> @@ -587,22 +586,6 @@ _get_linux_version ()
>          return kver = KERNEL_VERSION (major, minor, teeny);
>  }
>  
> -static int
> -_have_devfs ()
> -{
> -        static int have_devfs = -1;
> -        struct stat sb;
...
> @@ -2621,19 +2421,6 @@ linux_disk_commit (PedDisk* disk)
>                  return _dm_reread_part_table (disk);
>  #endif
>          if (disk->dev->type != PED_DEVICE_FILE) {
> -                /* The ioctl() command BLKPG_ADD_PARTITION does not notify
> -                 * the devfs system; consequently, /proc/partitions will not
> -                 * be up to date, and the proper links in /dev are not
> -                 * created.  Therefore, if using DevFS, we must get the kernel
> -                 * to re-read and grok the partition table.
> -                 */
> -                /* Work around kernel dasd problem so we really do BLKRRPART */
> -                if (disk->dev->type != PED_DEVICE_DASD &&
> -                    _have_blkpg () && !_have_devfs ()) {
> -                        if (_disk_sync_part_table (disk))
> -                                return 1;
> -                }
> -
>                  return _kernel_reread_part_table (disk->dev);
>          }
>  

Instead of removing a whole bunch of code I prefer addint an additional
function and deprecating these that you erased.  what do you think?


regards.



-- 
Joel Andres Granados
Brno, Czech Republic, Red Hat.

Attachment: pgpTHeK14puj0.pgp
Description: PGP signature

_______________________________________________
Anaconda-devel-list mailing list
Anaconda-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/anaconda-devel-list

[Index of Archives]     [Kickstart]     [Fedora Users]     [Fedora Legacy List]     [Fedora Maintainers]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [Yosemite Photos]     [KDE Users]     [Fedora Tools]
  Powered by Linux