Re: Raid5 race patch (fwd)

On Mon, 25 Feb 2002, Neil Brown wrote:

> As he says, the patch is rather ugly and doesn't really address the
> root problem.  But if it works for you, that is good.
> 
I think it's ugly because it adds some structures to the raid5.c code
that should be reachable from structures the function already has.

I do think that the patch is right as far as the root of the problem is
concerned.

What I don't understand is why the ->faulty flag is used all through md.c
when we have mark_disk_faulty(sb->disks+disk->number); and a bitmapped
status for the same purpose. Do the two ever differ, or is it that the
mdp_disk_t structure used by disk_faulty is not accessible in those
places?

It seems that on SMP machines md_wakeup_thread gets executed on another
CPU without the ->faulty mark being set.
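
To make the ordering I mean concrete, here is a minimal user-space sketch
using pthreads, not kernel code. All names in it (demo_rdev, demo_error,
demo_worker, demo_run_ordered) are made up for illustration; the only
point is that the flag is set before the wakeup is posted, and that both
sides touch it under the same lock, so the woken thread cannot run
without seeing it.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device record; not the kernel's structure. */
struct demo_rdev {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    bool faulty;       /* plays the role of ->faulty */
    bool pending;      /* a wakeup has been posted */
    bool seen_faulty;  /* what the worker observed after waking */
};

/* Plays the role of the md thread: sleeps until woken, then reads the flag. */
static void *demo_worker(void *arg)
{
    struct demo_rdev *r = arg;

    pthread_mutex_lock(&r->lock);
    while (!r->pending)
        pthread_cond_wait(&r->wake, &r->lock);
    r->seen_faulty = r->faulty;
    pthread_mutex_unlock(&r->lock);
    return NULL;
}

/* Plays the role of the error handler: mark first, wake second. */
static void demo_error(struct demo_rdev *r)
{
    pthread_mutex_lock(&r->lock);
    r->faulty = true;               /* set the mark ... */
    r->pending = true;              /* ... and only then post the wakeup */
    pthread_cond_signal(&r->wake);
    pthread_mutex_unlock(&r->lock);
}

/* Returns true if the worker saw the faulty mark when it ran. */
static bool demo_run_ordered(void)
{
    struct demo_rdev r = { .faulty = false, .pending = false,
                           .seen_faulty = false };
    pthread_t t;

    pthread_mutex_init(&r.lock, NULL);
    pthread_cond_init(&r.wake, NULL);
    if (pthread_create(&t, NULL, demo_worker, &r) != 0)
        return false;
    demo_error(&r);
    pthread_join(t, NULL);
    pthread_cond_destroy(&r.wake);
    pthread_mutex_destroy(&r.lock);
    return r.seen_faulty;
}
```

If the mark were set only after the wakeup and outside the lock, the
worker on the other CPU could read it too early, which is what I suspect
is happening here.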

If there were a way to set ->faulty in raid5_error without calling
rrdev = find_rdev(mddev, dev); and friends, this would be quite the right
fix.

I also suspect that the same race exists in the mirror code (probably
others too), since I don't see any lock and the logic seems exactly the
same to me.

> I think that the "right" approach is to claim reconfig_sem (which is
> currently unused I think) while writing out the superblocks, and when
> releasing the per-device superblock, and probably when doing a few
> other things.
I wouldn't know about those. But if I look closer at raid5.c, we kill
ourselves on SMP machines by calling md_wakeup_thread in any case. Would a
call to wake_up(&thread->wqueue); honor this mutex and wait for md_error
to finish?
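
As far as I understand it, a wakeup by itself only makes the sleeper
runnable and knows nothing about any lock, so the woken thread would have
to take the same lock itself before looking at the shared state. Below is
a rough user-space sketch of that idea with pthreads and a POSIX
semaphore, not kernel code. Every name in it (demo_md, demo_md_error,
demo_md_thread, demo_run_locked) is hypothetical, and the mutex only plays
the role that reconfig_sem might play.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical stand-in for the array state; not the kernel's structure. */
struct demo_md {
    pthread_mutex_t reconfig;  /* plays the role of reconfig_sem */
    sem_t wakeup;              /* plays the role of the thread's wait queue */
    int failed_disks;          /* first half of the update */
    int sb_events;             /* second half of the update */
    int seen_failed;
    int seen_events;
};

/* Woken thread: takes the lock itself before reading shared state. */
static void *demo_md_thread(void *arg)
{
    struct demo_md *m = arg;

    sem_wait(&m->wakeup);               /* may return mid-update */
    pthread_mutex_lock(&m->reconfig);   /* blocks until the update is done */
    m->seen_failed = m->failed_disks;
    m->seen_events = m->sb_events;
    pthread_mutex_unlock(&m->reconfig);
    return NULL;
}

/* Error path: posts the wakeup in the middle of its update, on purpose. */
static void demo_md_error(struct demo_md *m)
{
    pthread_mutex_lock(&m->reconfig);
    m->failed_disks++;
    sem_post(&m->wakeup);               /* early wakeup, lock still held */
    m->sb_events++;
    pthread_mutex_unlock(&m->reconfig);
}

/* Returns true if the thread saw the complete update, not half of it. */
static bool demo_run_locked(void)
{
    struct demo_md m = { .failed_disks = 0, .sb_events = 0,
                         .seen_failed = -1, .seen_events = -1 };
    pthread_t t;

    pthread_mutex_init(&m.reconfig, NULL);
    sem_init(&m.wakeup, 0, 0);
    if (pthread_create(&t, NULL, demo_md_thread, &m) != 0)
        return false;
    demo_md_error(&m);
    pthread_join(t, NULL);
    sem_destroy(&m.wakeup);
    pthread_mutex_destroy(&m.reconfig);
    return m.seen_failed == 1 && m.seen_events == 1;
}
```

Even though the wakeup is posted before the update is finished, the thread
cannot read anything until the error path drops the lock. Without the lock
in demo_md_thread the early wakeup would let it see a half-done update.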

> I will have a closer look over the code and see how well this can
> work.
> 
Please do.
I'm holding back the release of a few servers into production until this
race is properly fixed, and I'm looking forward to the proper fix.

So we have test computers available for testing for at least this week.

	lp
		gody

__________________________________________________________________
|    Matjaz Godec    |    Agenda d.o.o.    |   ISP for business  |
|   Tech. Manager    |   Gosposvetska 84   |     WAN networks    |
|   gody@slon.net    |   si-2000 Maribor   |  Internet/Intranet  |
| tel:+386.2.2340860 |      Slovenija      | Application servers |
|http://www.slon.net |http://www.agenda.si |  Caldera OpenLinux  |
