Re: Problem w/ commit ac8fa4196d20 on older, slower hardware

> Per commit ac8fa4196d20:
> 
> > md: allow resync to go faster when there is competing IO.
> > 
> > When md notices non-sync IO happening while it is trying to resync (or
> > reshape or recover) it slows down to the set minimum.
> > 
> > The default minimum might have made sense many years ago but the drives have
> > become faster. Changing the default to match the times isn't really a long
> > term solution.
> 
> This holds true for modern hardware, but this commit is causing problems on
> older hardware, like SGI MIPS platforms, that use mdraid.  Namely, while trying
> to chase down an unrelated hardlock bug on an Onyx2, one of the arrays got out
> of sync, so on the next reboot, mdraid's attempt to resync at full speed
> absolutely murdered interactivity.  It took close to 30mins for the system to
> finally reach the login prompt.
> 
> Reverting this patch mitigated the problem at first, but in recent kernels
> this no longer appears to be the case: reverting this commit has no
> noticeable effect anymore.  I assume I'd have to hunt down newer commits
> to revert, but it's probably saner to just highlight the problem and test any
> proposed solutions.
> 
> Is there some way to resolve this in such a way that old hardware maintains
> some level of interactivity during a resync, but that won't inconvenience the
> more modern systems?
> 
> http://git.linux-mips.org/cgit/ralf/linux.git/commit/?id=ac8fa4196d20
> 
> Thanks!
>

Hmmm... this change shouldn't have that effect.
It should allow resync to soak up a bit more of the idle time, but when
there is any other IO, resync should still back off.

I wonder if there is some other change which has confused the event
counting for the particular hardware you are using.

How did you identify this commit as a possible cause?

The fact that reverting it no longer helps strongly suggests that some
other change is implicated.  I don't think there have been other changes
in md which could affect this.

Have you tried adjusting /proc/sys/dev/raid/speed_limit_min and
/proc/sys/dev/raid/speed_limit_max?
Did that have any noticeable effect?
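
For reference, here is a minimal sketch of reading and changing those two
limits from a script. The helper names (read_limits, set_limit) are made up
for illustration; only the two file names under /proc/sys/dev/raid come from
the question above. It assumes the values are whole numbers in KiB/sec per
device and that writing requires root.

```python
from pathlib import Path

# Directory holding the md resync speed limits (path taken from the mail above).
RAID_SYSCTL_DIR = Path("/proc/sys/dev/raid")
LIMIT_NAMES = ("speed_limit_min", "speed_limit_max")


def read_limits(base=RAID_SYSCTL_DIR):
    """Return the current limits as a dict of name -> integer value.

    Assumes each file holds a single integer (KiB/sec per device).
    """
    base = Path(base)
    return {name: int((base / name).read_text().strip()) for name in LIMIT_NAMES}


def set_limit(name, kib_per_sec, base=RAID_SYSCTL_DIR):
    """Write a new value to one of the two limit files.

    Hypothetical helper: it only validates the name and that the value is a
    positive integer, then writes it. Writing the real files needs root.
    """
    if name not in LIMIT_NAMES:
        raise ValueError(f"unknown limit: {name!r}")
    if not isinstance(kib_per_sec, int) or kib_per_sec <= 0:
        raise ValueError("limit must be a positive integer (KiB/sec)")
    (Path(base) / name).write_text(f"{kib_per_sec}\n")
```

On a slow machine one might call read_limits() to see the current values
and then, as root, try something like set_limit("speed_limit_max", 5000) to
cap the resync rate, checking whether interactivity improves. The value
5000 is only an example, not a recommendation.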

NeilBrown
