Re: mdadm 3.3: issue with mdmon --takeover

Hi Neil,

On Wed, Sep 4, 2013 at 8:08 AM, NeilBrown <neilb@xxxxxxx> wrote:
> On Tue, 3 Sep 2013 17:54:55 +0200 Francis Moreau <francis.moro@xxxxxxxxx>
> wrote:
>
>> Hello Martin :)
>>
>> I gave 3.3 release a try and I have a first issue: basically starting
>> mdmon (3.3) with --takeover twice make mdmon failing on the second
>> run.
>>
>> Please find details below:
>>
>> # cat /proc/mdstat
>> Personalities : [raid1]
>> md126 : active raid1 sdb[1] sda[0]
>>       2064384 blocks super external:/md127/0 [2/2] [UU]
>>
>> md127 : inactive sdb[1](S) sda[0](S)
>>       65536 blocks super external:ddf
>>
>> # ps aux | grep dmon
>> root       311  0.4  1.0  80580 10944 ?        SLsl 17:46   0:00
>> @sbin/mdmon --takeover md127
>>
>> # ./mdmon --takeover --all
>>
>> # ps aux | grep dmon
>> root      3182  1.3  1.0  15156 11056 ?        SLsl 17:50   0:00
>> ./mdmon --takeover md127
>>
>> # ./mdmon --takeover --all
>> ...
>> monitor: wake ( )
>> monitor: wake ( )
>> monitor: wake ( )
>> monitor: wake ( )
>> monitor: wake ( )
>> monitor: wake ( 12:array_state )
>> read_and_act(0): 1378223477.512347 state:clean prev:clean action:idle
>> prev: idle start:18446744073709551615
>> ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (5) clean 18446744073709551615
>> manage_new: inst: 0 action: 11 state: 12
>> mdmon: ddf_open_new: subarray 0 doesn't exist
>> mdmon: failed to monitor external:/md127/0
>> free_aa: sys_name: md126
>> read_and_act(0): state:clean action:idle next( )
>> manage_new: inst: 0 action: 20 state: 21
>> ddf_open_new: new subarray 0, GUID: Linux-MDdeadbeef00000000?Ob79e0c8b1n
>> free_aa: sys_name: md126
>> caught sigterm, all clean... exiting
>> monitor: wake ( )
>> no arrays to monitor... exiting
>>
>> # ps aux | grep dmon
>> #
>>
>> Thanks
>
> I can't easily reproduce this.

This is weird; it's 100% reproducible here.

>
> Can you run "mdmon --takeover" in one window, then the next "mdmon
> --takeover" is a different window so we can clearly see which messages are
> coming from the mdmon which is exiting and which are coming from the mdmon
> which is starting.


Sure.

A note that I probably should have mentioned earlier: before I
manually start the first mdmon process, an old mdmon process is
already running. It was started by the system at boot and is
version 3.2.6.

###
### window 1: manually starting the first mdmon --takeover process ###
###

# ps aux | grep dmon
root       312  0.5  1.0  80580 10944 ?        SLsl 09:24   0:00
@sbin/mdmon --takeover md127

## Note: this mdmon process was started at system boot and is 3.2.6

# ./mdmon --takeover --all
...
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
manage_new: inst: 0 action: 11 state: 12
ddf_open_new: new subarray 0, GUID: Linux-MDdeadbeef00000000?Ob79e0c8b1n
monitor: caught signal
read_and_act(0): 1378279619.393600 state:clean prev:inactive
action:idle prev: idle start:18446744073709551615
pr_state/ddf_set_array_state: 0(s=10 i=02)
ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (5) dirty 18446744073709551615
pr_state/ddf_set_array_state: 0(s=00 i=02)
ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (5) clean 18446744073709551615
pr_state/__write_init_super_ddf: 0(s=00 i=02)
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
ddf: sync_metadata
read_and_act(0): state:clean action:idle next( )
monitor: wake ( 12:array_state )
read_and_act(0): 1378279621.980656 state:write-pending prev:clean
action:idle prev: idle start:18446744073709551615
pr_state/ddf_set_array_state: 0(s=10 i=02)
ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (7) dirty 18446744073709551615
pr_state/__write_init_super_ddf: 0(s=10 i=02)
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
ddf: sync_metadata
read_and_act(0): state:write-pending action:idle next( state:active )
monitor: wake ( 12:array_state )
read_and_act(0): 1378279622.381087 state:active prev:write-pending
action:idle prev: idle start:18446744073709551615
read_and_act(0): state:active action:idle next( )
monitor: wake ( 12:array_state )
read_and_act(0): 1378279626.520845 state:active-idle prev:active
action:idle prev: idle start:18446744073709551615
read_and_act(0): state:active-idle action:idle next( state:clean )
monitor: wake ( 12:array_state )
read_and_act(0): 1378279626.524532 state:clean prev:active-idle
action:idle prev: idle start:18446744073709551615
pr_state/ddf_set_array_state: 0(s=00 i=02)
ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (5) clean 18446744073709551615
pr_state/__write_init_super_ddf: 0(s=00 i=02)
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
ddf: sync_metadata
read_and_act(0): state:clean action:idle next( )
monitor: wake ( 12:array_state )
read_and_act(0): 1378279626.981157 state:write-pending prev:clean
action:idle prev: idle start:18446744073709551615
pr_state/ddf_set_array_state: 0(s=10 i=02)
ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (7) dirty 18446744073709551615
pr_state/__write_init_super_ddf: 0(s=10 i=02)
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk b342fbdc for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
writing conf record 0 on disk 2cf00056 for
Linux-MDdeadbeef00000000?Ob79e0c8b1n/0
ddf: sync_metadata
read_and_act(0): state:write-pending action:idle next( state:active )
monitor: wake ( 12:array_state )
read_and_act(0): 1378279627.376402 state:active prev:write-pending
action:idle prev: idle start:18446744073709551615
read_and_act(0): state:active action:idle next( )

[launching the second mdmon --takeover in window 2...]

monitor: wake ( 12:array_state )
read_and_act(0): 1378279678.858186 state:clean prev:clean action:idle
prev: idle start:18446744073709551615
ddf mark 0/Linux-MDdeadbeef00000000?Ob79e0c8b1n (5) clean 18446744073709551615
read_and_act(0): state:clean action:idle next( )
manage_new: inst: 0 action: 20 state: 21
ddf_open_new: new subarray 0, GUID: Linux-MDdeadbeef00000000?Ob79e0c8b1n
free_aa: sys_name: md126
caught sigterm, all clean... exiting

###
### window 2: starting the 2nd mdmon process ###
###

# ./mdmon --takeover --all
...
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
monitor: wake ( )
manage_new: inst: 0 action: 11 state: 12
mdmon: ddf_open_new: subarray 0 doesn't exist
mdmon: failed to monitor external:/md127/0
free_aa: sys_name: md126
monitor: wake ( )
no arrays to monitor... exiting
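
For anyone repeating the two-window test, here is a minimal sketch of one way to keep each run's output in its own file so the messages from the exiting and the starting mdmon cannot be mixed up. The helper name run_logged and the log file names are hypothetical, not part of mdadm, and it assumes mdmon prints its debug messages to the terminal as in the runs above.

```shell
# Hypothetical helper: run a command, show its output on the terminal,
# and keep a copy of both stdout and stderr in the given log file.
run_logged() {
    log="$1"
    shift
    # Merge stderr into stdout so debug messages end up in the same log.
    "$@" 2>&1 | tee "$log"
}

# Example use (file names are assumptions):
#   window 1: run_logged mdmon-first.log  ./mdmon --takeover --all
#   window 2: run_logged mdmon-second.log ./mdmon --takeover --all
```

The two log files can then be compared or attached to a report separately.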


Thanks
-- 
Francis