On Sun, 8 May 2011 20:54:08 +0200 Piergiorgio Sartor <piergiorgio.sartor@xxxxxxxx> wrote:

> Hi Neil,
>
> please find below a small patch which should suspend the
> array while reading the stripes in order to perform the
> check of the RAID-6.
>
> This should complete the "check" part of the SW.
> Please let me know what else could be needed (docs,
> test or else).
>
> Please have a careful look at it, since I did not know
> how to test it.
>
> Thanks.
>
> --- cut here ---
>
>
> diff -uNr a/raid6check.c b/raid6check.c
> --- a/raid6check.c	2011-05-07 20:35:18.693370007 +0200
> +++ b/raid6check.c	2011-05-07 21:00:07.713865939 +0200
> @@ -24,6 +24,7 @@
>  
>  #include "mdadm.h"
>  #include <stdint.h>
> +#include <signal.h>
>  
>  int geo_map(int block, unsigned long long stripe, int raid_disks,
>  	    int level, int layout);
> @@ -99,7 +100,7 @@
>  	return curr_broken_disk;
>  }
>  
> -int check_stripes(int *source, unsigned long long *offsets,
> +int check_stripes(struct mdinfo *info, int *source, unsigned long long *offsets,
>  		  int raid_disks, int chunk_size, int level, int layout,
>  		  unsigned long long start, unsigned long long length, char *name[])
>  {
> @@ -139,10 +140,22 @@
>  
>  		printf("pos --> %llu\n", start);
>  
> +		signal(SIGTERM, SIG_IGN);
> +		signal(SIGINT, SIG_IGN);
> +		signal(SIGQUIT, SIG_IGN);
> +		sysfs_set_num(info, NULL, "suspend_lo", start * data_disks);
> +		sysfs_set_num(info, NULL, "suspend_hi", (start + chunk_size) * data_disks);
>  		for (i = 0 ; i < raid_disks ; i++) {
>  			lseek64(source[i], offsets[i] + start * chunk_size, 0);
>  			read(source[i], stripes[i], chunk_size);
>  		}
> +		sysfs_set_num(info, NULL, "suspend_lo", 0x7FFFFFFFFFFFFFFFULL);
> +		sysfs_set_num(info, NULL, "suspend_hi", 0);
> +		sysfs_set_num(info, NULL, "suspend_lo", 0);
> +		signal(SIGQUIT, SIG_DFL);
> +		signal(SIGINT, SIG_DFL);
> +		signal(SIGTERM, SIG_DFL);
> +
>  		for (i = 0 ; i < data_disks ; i++) {
>  			int disk = geo_map(i, start, raid_disks, level, layout);
>  			blocks[i] = stripes[disk];
> @@ -343,7 +356,7 @@
>  		comp = comp->next;
>  	}
>  
> -	int rv = check_stripes(fds, offsets,
> +	int rv = check_stripes(info, fds, offsets,
>  			       raid_disks, chunk_size, level, layout,
>  			       start, length, disk_name);
>  	if (rv != 0) {
>
> --- cut here ---
>
> bye,
>

Looks pretty good.  However:

 - You shouldn't blindly reset the signals to 'SIG_DFL'.  You should
   capture the return value from 'signal' and feed that back in to
   restore the previous setting.
   Alternatively, use 'sigblock' to just block the signals rather than
   ignoring them, then unblock afterwards.

 - When suspending IO it is safest to call
       mlockall(MCL_CURRENT|MCL_FUTURE);
   before you start.  That ensures that if the device is used for swap
   there is no chance of deadlocking trying to swap out while the
   device is locked.

 - You should check the return value from sysfs_set_num and at least
   report any error.  If a call returns an error, you know something
   is wrong.

 - Finally, I think the numbers you are giving to suspend_{lo,hi} are
   wrong.  'start' is a number of chunks, so you should write
       start * chunk_size * data_disks
   to suspend_hi, and make a similar change to the calculation for
   suspend_lo.

Thanks,
NeilBrown