Re: [PATCH 4/6] md: don't export log device

Neil Brown <neilb@xxxxxxx> · Wed, 14 Oct 2015 07:41:58 +1100

Christoph Hellwig <hch@xxxxxxxxxxxxx> writes:

> On Thu, Oct 08, 2015 at 05:04:54PM +1100, Neil Brown wrote:
>> Having two disks with ->raid_disk==0 does seem a little weird, but we do
>> already have that in some cases.
>> When you have a hot-replace going, both the original and the replacement
>> have the same ->raid_disk numbers.  They can be distinguished by the
>> Replacement flag.
>> I'm suggesting the same (sort of) for journals, and distinguish by the
>> Journal flag.
>> 
>> I did a quick audit and just found setup_conf, run() and md_update_sb().
>> If you could do an audit too, that would be good.  I'd be surprised if you
>> find many more places where Journal needs to be tested with ->raid_disk.
>
> Overloading positive numbers for the journal disk sounds like a bad idea
> to me as it will cause a lot of confusion.  I'd rather assign specific
> negative values to special roles outside the actual array.  This will
> require an initial audit, but give us nicely understandable rules later
> on.

The positive numbers are in different name-spaces:
 primary raid disks
 replacement raid disks
 journal disks

While confusion is always possible, I think keeping these separate is
not that hard.  For example, ->raid_disk is primarily used to place the
rdev in an array after which it is the position in the array which is
primarily used.  The value of ->raid_disk is mostly tested only for
whether it is <0 or not.

I'm certainly happy to do our best to remove sources of confusion from
the user-space view of these numbers (e.g. put 'journal' or 'none' in
the 'slot' file, not '0' (and definitely not "-5")).  But internally to
the kernel it is important to remember that ->raid_disk isn't a primary
key - and once you do that there is not much chance of confusion.
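
To make the name-space idea concrete, here is a rough sketch in plain
C.  It is not the md code: the struct, the flag bits and the function
names (sketch_rdev, SKETCH_JOURNAL, sketch_role_of, sketch_slot_show)
are all made up for illustration.  It only shows how a role can be
derived from the flags plus the sign of ->raid_disk, and how a 'slot'
style file could report 'journal' or 'none' rather than a number.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag bits; NOT the kernel's actual rdev flag values. */
enum sketch_flag {
	SKETCH_REPLACEMENT = 1u << 0,
	SKETCH_JOURNAL     = 1u << 1,
};

/* Hypothetical stand-in for the two fields under discussion. */
struct sketch_rdev {
	int raid_disk;		/* position within its own name-space, <0 if none */
	unsigned int flags;	/* selects which name-space raid_disk lives in */
};

enum sketch_role {
	ROLE_NONE,
	ROLE_PRIMARY,
	ROLE_REPLACEMENT,
	ROLE_JOURNAL,
};

/* The role comes from the flags; raid_disk is only tested for <0. */
static enum sketch_role sketch_role_of(const struct sketch_rdev *rdev)
{
	if (rdev->raid_disk < 0)
		return ROLE_NONE;
	if (rdev->flags & SKETCH_JOURNAL)
		return ROLE_JOURNAL;
	if (rdev->flags & SKETCH_REPLACEMENT)
		return ROLE_REPLACEMENT;
	return ROLE_PRIMARY;
}

/* User-space view: a word for journal/none, a number otherwise. */
static void sketch_slot_show(const struct sketch_rdev *rdev,
			     char *buf, size_t len)
{
	switch (sketch_role_of(rdev)) {
	case ROLE_JOURNAL:
		snprintf(buf, len, "journal");
		break;
	case ROLE_NONE:
		snprintf(buf, len, "none");
		break;
	default:
		snprintf(buf, len, "%d", rdev->raid_disk);
		break;
	}
}
```

With this, a journal device and a primary device can both carry
raid_disk == 0 and still be told apart, because nothing uses raid_disk
alone as the key.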

Thanks,
NeilBrown