Re: [PATCH] Assemble: prevent segfault with faulty "best" devices

On Tue, Aug 08 2017, Andrea Righi wrote:

> I was able to trigger this curious problem, which seems to happen on only
> one of our servers:
>
> # mdadm --assemble /dev/md/10.4.237.12-volume --name 10.4.237.12-volume
> Segmentation fault
>
> This md volume is a raid1 volume made of 2 device mapper (dm-multipath)
> devices and the underlying LUNs are imported via iSCSI.
>
> Applying the following patch (see below) seems to fix the problem:
>
> # ./mdadm --assemble /dev/md/10.4.237.12-volume --name 10.4.237.12-volume
> mdadm: /dev/md/10.4.237.12-volume has been started with 2 drives.
>
> But I'm not sure whether it's the right fix, or whether there are other
> problems that I'm missing.
>
> More details about the md superblocks that might help to better
> understand the nature of the problem:
>
> # for i in 36001405a04ed0c104881{1,2}00000000000p2; do echo dev: ${i}; mdadm --examine /dev/mapper/${i}; done
> dev: 36001405a04ed0c104881100000000000p2
> /dev/mapper/36001405a04ed0c104881100000000000p2:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : 5f3e8283:7f831b85:bc1958b9:6f2787a4
>            Name : 10.4.237.12-volume
>   Creation Time : Thu Jul 27 14:43:16 2017
>      Raid Level : raid1
>    Raid Devices : 2
>
>  Avail Dev Size : 1073729503 (511.99 GiB 549.75 GB)
>      Array Size : 536864704 (511.99 GiB 549.75 GB)
>   Used Dev Size : 1073729408 (511.99 GiB 549.75 GB)
>     Data Offset : 8192 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=8104 sectors, after=95 sectors
>           State : clean
>     Device UUID : 16dae7e3:42f3487f:fbeac43a:71cf1f63
>
> Internal Bitmap : 8 sectors from superblock
>     Update Time : Tue Aug  8 11:12:22 2017
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : 518c443e - correct
>          Events : 167
>
>
>    Device Role : Active device 0
>    Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
> dev: 36001405a04ed0c104881200000000000p2
> /dev/mapper/36001405a04ed0c104881200000000000p2:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x1
>      Array UUID : 5f3e8283:7f831b85:bc1958b9:6f2787a4
>            Name : 10.4.237.12-volume
>   Creation Time : Thu Jul 27 14:43:16 2017
>      Raid Level : raid1
>    Raid Devices : 2
>
>  Avail Dev Size : 1073729503 (511.99 GiB 549.75 GB)
>      Array Size : 536864704 (511.99 GiB 549.75 GB)
>   Used Dev Size : 1073729408 (511.99 GiB 549.75 GB)
>     Data Offset : 8192 sectors
>    Super Offset : 8 sectors
>    Unused Space : before=8104 sectors, after=95 sectors
>           State : clean
>     Device UUID : ef612bdd:e475fe02:5d3fc55e:53612f34
>
> Internal Bitmap : 8 sectors from superblock
>     Update Time : Tue Aug  8 11:12:22 2017
>   Bad Block Log : 512 entries available at offset 72 sectors
>        Checksum : c39534fd - correct
>          Events : 167
>
>
>    Device Role : Active device 1
>    Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
>
> # for i in 36001405a04ed0c104881{1,2}00000000000p2; do echo dev: ${i}; hexdump -s 4096 -n 4189696 -C /dev/mapper/${i}; done
> dev: 36001405a04ed0c104881100000000000p2
> 00001000  fc 4e 2b a9 01 00 00 00  01 00 00 00 00 00 00 00  |.N+.............|
> 00001010  5f 3e 82 83 7f 83 1b 85  bc 19 58 b9 6f 27 87 a4  |_>........X.o'..|
> 00001020  31 30 2e 34 2e 32 33 37  2e 31 32 2d 76 6f 6c 75  |10.4.237.12-volu|
> 00001030  6d 65 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |me..............|
> 00001040  64 50 7a 59 00 00 00 00  01 00 00 00 00 00 00 00  |dPzY............|
> 00001050  80 cf ff 3f 00 00 00 00  00 00 00 00 02 00 00 00  |...?............|
> 00001060  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 00001070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 00001080  00 20 00 00 00 00 00 00  df cf ff 3f 00 00 00 00  |. .........?....|
> 00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000010a0  00 00 00 00 00 00 00 00  16 da e7 e3 42 f3 48 7f  |............B.H.|
> 000010b0  fb ea c4 3a 71 cf 1f 63  00 00 08 00 48 00 00 00  |...:q..c....H...|
> 000010c0  54 f0 89 59 00 00 00 00  a7 00 00 00 00 00 00 00  |T..Y............|
> 000010d0  ff ff ff ff ff ff ff ff  9c 43 8c 51 80 00 00 00  |.........C.Q....|
> 000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
> 00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
> *
> 00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00002000  62 69 74 6d 04 00 00 00  5f 3e 82 83 7f 83 1b 85  |bitm...._>......|
> 00002010  bc 19 58 b9 6f 27 87 a4  a7 00 00 00 00 00 00 00  |..X.o'..........|
> 00002020  a7 00 00 00 00 00 00 00  80 cf ff 3f 00 00 00 00  |...........?....|
> 00002030  00 00 00 00 00 00 00 01  05 00 00 00 00 00 00 00  |................|
> 00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00003100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
> *
> 00004000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 003ffe00
> dev: 36001405a04ed0c104881200000000000p2
> 00001000  fc 4e 2b a9 01 00 00 00  01 00 00 00 00 00 00 00  |.N+.............|
> 00001010  5f 3e 82 83 7f 83 1b 85  bc 19 58 b9 6f 27 87 a4  |_>........X.o'..|
> 00001020  31 30 2e 34 2e 32 33 37  2e 31 32 2d 76 6f 6c 75  |10.4.237.12-volu|
> 00001030  6d 65 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |me..............|
> 00001040  64 50 7a 59 00 00 00 00  01 00 00 00 00 00 00 00  |dPzY............|
> 00001050  80 cf ff 3f 00 00 00 00  00 00 00 00 02 00 00 00  |...?............|
> 00001060  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 00001070  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 00001080  00 20 00 00 00 00 00 00  df cf ff 3f 00 00 00 00  |. .........?....|
> 00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> 000010a0  01 00 00 00 00 00 00 00  ef 61 2b dd e4 75 fe 02  |.........a+..u..|
> 000010b0  5d 3f c5 5e 53 61 2f 34  00 00 08 00 48 00 00 00  |]?.^Sa/4....H...|
> 000010c0  54 f0 89 59 00 00 00 00  a7 00 00 00 00 00 00 00  |T..Y............|
> 000010d0  ff ff ff ff ff ff ff ff  5b 34 95 c3 80 00 00 00  |........[4......|
> 000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
> 00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
> *
> 00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00002000  62 69 74 6d 04 00 00 00  5f 3e 82 83 7f 83 1b 85  |bitm...._>......|
> 00002010  bc 19 58 b9 6f 27 87 a4  a7 00 00 00 00 00 00 00  |..X.o'..........|
> 00002020  a7 00 00 00 00 00 00 00  80 cf ff 3f 00 00 00 00  |...........?....|
> 00002030  00 00 00 00 00 00 00 01  05 00 00 00 00 00 00 00  |................|
> 00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 00003100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
> *
> 00004000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 003ffe00
>
> ---
> Assemble: prevent segfault with faulty "best" devices
>
> In Assemble(), after context reload, best[i] can be -1 in some cases,
> and before checking if this value is negative we use it to access
> devices[j].i.disk.raid_disk, potentially causing a segfault.
>
> Check if best[i] is negative before using it to prevent this potential
> segfault.
>
> Signed-off-by: Andrea Righi <andrea@xxxxxxxxxxxxxxx>
> Signed-off-by: Robert LeBlanc <robert@xxxxxxxxxxxxx>
> ---
>  Assemble.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Assemble.c b/Assemble.c
> index 3da0903..fc681eb 100644
> --- a/Assemble.c
> +++ b/Assemble.c
> @@ -1669,6 +1669,8 @@ try_again:
>  		int j = best[i];
>  		unsigned int desired_state;
>  
> +		if (j < 0)
> +			continue;
>  		if (devices[j].i.disk.raid_disk == MD_DISK_ROLE_JOURNAL)
>  			desired_state = (1<<MD_DISK_JOURNAL);
>  		else if (i >= content->array.raid_disks * 2)
> @@ -1678,8 +1680,6 @@ try_again:
>  		else
>  			desired_state = (1<<MD_DISK_ACTIVE) | (1<<MD_DISK_SYNC);
>  
> -		if (j<0)
> -			continue;
>  		if (!devices[j].uptodate)
>  			continue;
>  

Patch looks good to me, thanks.

The regression was caused by commit 69a481166be6 ("Assemble array with write
journal"), which introduced a use of 'j' before the test for j < 0.
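
To illustrate the ordering issue, here is a minimal standalone sketch, not
the actual mdadm code. The struct, the function name, the state bits and the
ROLE_JOURNAL value are all simplified stand-ins chosen for this example; only
the idea (test j before indexing devices[j]) comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder role value and state bits for this sketch only. */
#define ROLE_JOURNAL 0xfffd

enum {
	STATE_ACTIVE  = 1 << 0,
	STATE_SYNC    = 1 << 1,
	STATE_JOURNAL = 1 << 2
};

/* Simplified stand-in for the per-device information mdadm keeps. */
struct dev_sketch {
	int raid_disk;	/* role recorded for this device */
	int uptodate;	/* non-zero if the device is current */
};

/*
 * Walk best[] and record the desired state for every slot that has a
 * usable device. best[i] == -1 means "no device found for slot i".
 * Returns the number of slots that were processed.
 */
int count_usable_slots(const int *best, size_t nbest,
		       const struct dev_sketch *devices, size_t ndevices,
		       unsigned int *states)
{
	int used = 0;

	assert(best != NULL && devices != NULL && states != NULL);

	for (size_t i = 0; i < nbest; i++) {
		int j = best[i];
		unsigned int desired_state;

		/* Guard first: never index devices[] with a negative j. */
		if (j < 0 || (size_t)j >= ndevices)
			continue;

		if (devices[j].raid_disk == ROLE_JOURNAL)
			desired_state = STATE_JOURNAL;
		else
			desired_state = STATE_ACTIVE | STATE_SYNC;

		if (!devices[j].uptodate)
			continue;

		states[i] = desired_state;
		used++;
	}
	return used;
}
```

With best[] = {0, -1, 1, -1} and two up-to-date devices, the two empty slots
are skipped without touching devices[], and the function returns 2. Moving
the j < 0 test below the raid_disk comparison would read devices[-1], which
is the out-of-bounds access behind the reported segfault.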

Fixes: 69a481166be6 ("Assemble array with write journal")
Reviewed-by: NeilBrown <neilb@xxxxxxxx>

Thanks,
NeilBrown


