Re: [PATCH] RAID-6 check standalone suspend array

On Sun, 8 May 2011 20:54:08 +0200 Piergiorgio Sartor
<piergiorgio.sartor@xxxxxxxx> wrote:

> Hi Neil,
> 
> please find below a small patch which should suspend the
> array while reading the stripes in order to perform the
> check of the RAID-6.
> 
> This should complete the "check" part of the SW.
> Please let me know what else could be needed (docs,
> test or else).
> 
> Please have a careful look at it, since I did not know
> how to test it.
> 
> Thanks.
> 
> --- cut here ---
> 
> 
> diff -uNr a/raid6check.c b/raid6check.c
> --- a/raid6check.c	2011-05-07 20:35:18.693370007 +0200
> +++ b/raid6check.c	2011-05-07 21:00:07.713865939 +0200
> @@ -24,6 +24,7 @@
>  
>  #include "mdadm.h"
>  #include <stdint.h>
> +#include <signal.h>
>  
>  int geo_map(int block, unsigned long long stripe, int raid_disks,
>  	    int level, int layout);
> @@ -99,7 +100,7 @@
>  	return curr_broken_disk;
>  }
>  
> -int check_stripes(int *source, unsigned long long *offsets,
> +int check_stripes(struct mdinfo *info, int *source, unsigned long long *offsets,
>  		  int raid_disks, int chunk_size, int level, int layout,
>  		  unsigned long long start, unsigned long long length, char *name[])
>  {
> @@ -139,10 +140,22 @@
>  
>  		printf("pos --> %llu\n", start);
>  
> +		signal(SIGTERM, SIG_IGN);
> +		signal(SIGINT, SIG_IGN);
> +		signal(SIGQUIT, SIG_IGN);
> +		sysfs_set_num(info, NULL, "suspend_lo", start * data_disks);
> +		sysfs_set_num(info, NULL, "suspend_hi", (start + chunk_size) * data_disks);
>  		for (i = 0 ; i < raid_disks ; i++) {
>  			lseek64(source[i], offsets[i] + start * chunk_size, 0);
>  			read(source[i], stripes[i], chunk_size);
>  		}
> +		sysfs_set_num(info, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> +		sysfs_set_num(info, NULL, "suspend_hi", 0);
> +		sysfs_set_num(info, NULL, "suspend_lo", 0);
> +		signal(SIGQUIT, SIG_DFL);
> +		signal(SIGINT, SIG_DFL);
> +		signal(SIGTERM, SIG_DFL);
> +
>  		for (i = 0 ; i < data_disks ; i++) {
>  			int disk = geo_map(i, start, raid_disks, level, layout);
>  			blocks[i] = stripes[disk];
> @@ -343,7 +356,7 @@
>  		comp = comp->next;
>  	}
>  
> -	int rv = check_stripes(fds, offsets,
> +	int rv = check_stripes(info, fds, offsets,
>  			       raid_disks, chunk_size, level, layout,
>  			       start, length, disk_name);
>  	if (rv != 0) {
> 
> --- cut here ---
> 
> bye,
> 


Looks pretty good.  However:

 - You shouldn't blindly reset the signals to 'SIG_DFL'.  You should capture
   the return value from 'signal', and feed that back in to restore the
   previous setting.  Alternatively, use 'sigblock' to just block the signal
   rather than ignoring it, then unblock afterwards.

 - When suspending IO it is safest to call
        mlockall(MCL_CURRENT|MCL_FUTURE);
   before you start.  That ensures that if the device is used for swap there
   is no chance of deadlocking trying to swap-out while the device is locked.

 - You should check the return value from sysfs_set_num and at least report
   any error.  If it returns an error, you know something is wrong and
   can say so.

 - Finally, I think the numbers you are giving to suspend_{lo,hi} are wrong.
   'start' is a number of chunks, so you should write
           start * chunk_size * data_disks
   to suspend_lo, and make a similar change to the calculation for suspend_hi.
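A minimal sketch of the range arithmetic from the last point, assuming
'start' counts chunks per device and that the upper bound should sit
exactly one stripe above the lower one.  The helper stripe_range() and
struct suspend_range are hypothetical names used only for illustration;
they are not part of mdadm, and the upper-bound formula is one reading
of "a similar change", not a confirmed fix.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the two values written to sysfs. */
struct suspend_range {
	unsigned long long lo;	/* value for suspend_lo */
	unsigned long long hi;	/* value for suspend_hi */
};

/*
 * Hypothetical helper: compute the array range covered by one stripe.
 * 'start' is a chunk index on each device, so it is scaled by
 * chunk_size before being multiplied by the number of data disks.
 */
static struct suspend_range stripe_range(unsigned long long start,
					 unsigned long long chunk_size,
					 unsigned long long data_disks)
{
	struct suspend_range r;

	assert(chunk_size > 0 && data_disks > 0);
	r.lo = start * chunk_size * data_disks;
	/* Assumption: the range ends one full stripe after it begins. */
	r.hi = (start + 1) * chunk_size * data_disks;
	return r;
}
```

The result would then be passed to the two sysfs_set_num() calls in
place of the expressions in the patch, with each return value checked.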

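The signal and memory-locking points could be sketched roughly as
follows.  This uses POSIX sigprocmask() in place of the older
'sigblock' interface, which is a substitution on my part; the helper
names block_term_signals(), restore_signals() and lock_memory() are
hypothetical and exist only for this example.

```c
#define _DEFAULT_SOURCE		/* for mlockall() flags on glibc */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Block SIGTERM, SIGINT and SIGQUIT; the previous mask goes to *saved. */
static int block_term_signals(sigset_t *saved)
{
	sigset_t set;

	assert(saved != NULL);
	sigemptyset(&set);
	sigaddset(&set, SIGTERM);
	sigaddset(&set, SIGINT);
	sigaddset(&set, SIGQUIT);
	return sigprocmask(SIG_BLOCK, &set, saved);
}

/* Put back exactly the mask that was in effect before blocking. */
static int restore_signals(const sigset_t *saved)
{
	return sigprocmask(SIG_SETMASK, saved, NULL);
}

/* Pin current and future pages in RAM; report failure to the caller. */
static int lock_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
		return -1;
	}
	return 0;
}
```

The intent would be to call lock_memory() once before the first
suspend, then wrap each suspend/read/resume sequence between
block_term_signals() and restore_signals().  Blocked signals stay
pending and are delivered after the restore, so a Ctrl-C is deferred
rather than lost.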

Thanks,
NeilBrown