Re: Interesting feature request for linux raid, waking up drives

On Wed, 9 May 2012 23:16:33 +0200 Patrik Horník <patrik@xxxxxx> wrote:

> On Wed, May 9, 2012 at 10:06 PM, Larkin Lowrey
> <llowrey@xxxxxxxxxxxxxxxxx> wrote:
> > I second this suggestion but I don't think it's the job of the raid
> > layer to keep track of whether the member drives are spinning or not.
> 
> I also don't think it should be directly in the RAID layer, but it is
> a problem of Linux RAID, so the solution should be sought here.
> 
> > I have implemented a similar setup to this but am suffering from the
> > sequential spin-up problem you described. It would be nice to have a
> > solution.
> 
> My script is not perfect, but it eliminates the sequential spin-up
> problem completely. If you want, use it. The sequential spin-up problem
> was the reason I wrote it, and its main function is to detect woken
> drives and immediately wake the other drives in the RAID.
> 
> > A userspace daemon could probably do the job. I found that relying on
> > the drive's internal power management for spinning them down was
> > unreliable (especially for WDC "green" drives) so I implemented a script
> > that watches /sys/block/sdX/stat for activity and spins down the drive
> > directly (via hdparm) when no activity has been posted for a
> > configurable period of time. A daemon process that was responsible for
> > spinning down the constituent drives could also be responsible for
> > spinning them up by watching /sys/block/mdX/stat for pending transfers.
> > Perhaps you and I could work on such a project.
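[The stat-driven spin-down Larkin describes might be sketched roughly as the small shell loop below. The device name, the 10-second poll interval, and the 15-minute timeout are illustrative assumptions, and it relies on the standard /sys/block/<dev>/stat layout in which fields 1 and 5 are the 'reads completed' and 'writes completed' counters:]

```shell
#!/bin/sh
# Watch /sys/block/<dev>/stat and spin the drive down with hdparm once
# no I/O has been posted for IDLE_SECS.  Fields 1 and 5 of the stat
# file are the 'reads completed' and 'writes completed' counters.
# DEV, the 10 s poll interval and the 15 min timeout are illustrative;
# SYSFS is overridable only for testing against a fake tree.
SYSFS=${SYSFS:-/sys}
DEV=${DEV:-sda}
IDLE_SECS=${IDLE_SECS:-900}

activity() {    # print "<reads completed> <writes completed>" for $1
    set -- $(cat "$SYSFS/block/$1/stat")
    echo "$1 $5"
}

monitor() {
    last=$(activity "$DEV")
    idle=0
    while :; do
        sleep 10
        now=$(activity "$DEV")
        if [ "$now" = "$last" ]; then
            idle=$((idle + 10))
            if [ "$idle" -ge "$IDLE_SECS" ]; then
                hdparm -y "/dev/$DEV" >/dev/null   # force standby now
                idle=0
            fi
        else
            idle=0
            last=$now
        fi
    done
}

if [ "${1:-}" = "run" ]; then
    monitor
fi
```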
> 
> I added support for spinning down drives only as an addition, after I
> bought my first WD Greens. It is done in the wrong way: it relies on
> some drives in the array working correctly, and I guess your way is the
> correct one. Do you have a specification of /sys/block/sdX/stat?
> 
> Right now the script checks the power status of the drives with
> hdparm. I don't know yet what is in /sys/block/sdX/stat or which
> approach is better, but the basic principle behind my script works
> perfectly, at least in my setups: if at least one drive in the RAID
> array is awake, wake up all of them.
> 
> > One thing mdadm could do which would help greatly is to enumerate the
> > member disk block devices (not just partitions or member raid devices)
> > for a given array. This information is known since concurrent sync
> > operations are serialized so no two sync operations occur at the same
> > time on the same physical devices.
> 
> Maybe Neil can give us his thoughts on the best place / form for
> such functionality.

Maybe he could if he had any clear opinions.

I'm not strongly against including something like this in md.
We already have code which writes to every device on the first write after a
delay (to change the metadata from 'clean' to 'dirty').  Reading all the
metadata on the first read after a delay is at least a little bit symmetric
with that.  But I would need a separate buffer to read into, to avoid
confusion.
But that feels a little bit forced, so I'm not sure.

But then I don't like the idea of some script polling the devices either -
polling is bad.  If we could arrange for some uevent to be generated by a
device when it wakes from sleep, and then have a script run at that point,
that might be good.

Of course just reading the superblock isn't really enough (as has been
suggested).  If there are multiple levels of stacked devices then you really
want to find all the child devices and wake them, and that is really best
done in user-space.
To find all the child devices I would write a little script that searches the
  /sys/block/mdX/slaves
directory.  Everything in there should be either a block device or a
partition.  If the latter, it will contain a file 'partition' and the parent
device is found by adding '..' to the path.  If that directory contains
'slaves' you get to search deeper.
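[That search might be sketched as a small shell script like the one below. It assumes the layout just described (a 'partition' file inside partition directories, a nested 'slaves' directory for stacked md/dm devices); the SYSFS override is there only so the walk can be exercised against a fake tree:]

```shell
#!/bin/sh
# Enumerate the whole-disk members of an md array by walking
# /sys/block/<md>/slaves: a slave that contains a 'partition' file is
# mapped to its parent disk via '..', and a slave that itself has a
# 'slaves' directory (a stacked md/dm device) is searched recursively.
# SYSFS is overridable only for testing against a fake tree.
SYSFS=${SYSFS:-/sys}

find_disks() {                  # $1 = sysfs dir of a block device
    for slave in "$1"/slaves/*; do
        [ -e "$slave" ] || continue
        dev=$(readlink -f "$slave")
        if [ -e "$dev/partition" ]; then
            dev=$(readlink -f "$dev/..")   # partition -> whole disk
        fi
        if [ -d "$dev/slaves" ]; then
            find_disks "$dev"              # recurse into stacked device
        else
            basename "$dev"
        fi
    done
}

if [ $# -gt 0 ]; then
    find_disks "$SYSFS/block/$1" | sort -u   # e.g. ./slaves.sh md0
fi
```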

So on balance I think it is best if most of the work is done in user-space,
but there could well be a case for arranging for an alert of some sort to be
sent to user-space on the first read after a delay.
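[For completeness, the user-space wake-the-siblings loop discussed in this thread might look roughly like the sketch below. The member list, the 5-second poll interval, and the use of a direct read to force a spin-up are all illustrative assumptions; note that it polls, which is exactly the drawback noted above:]

```shell
#!/bin/sh
# If any member of the array has spun up, wake all the others so their
# spin-ups overlap instead of happening one after another.  DISKS and
# the poll interval are illustrative; needs hdparm and root.
DISKS=${DISKS:-"/dev/sda /dev/sdb /dev/sdc"}

is_awake() {        # $1 = output of 'hdparm -C /dev/sdX'
    case "$1" in *active*) return 0 ;; *) return 1 ;; esac
}

wake() {            # $1 = device node
    # A small uncached read is enough to force a spin-up.
    dd if="$1" of=/dev/null bs=512 count=1 iflag=direct 2>/dev/null &
}

if [ "${1:-}" = "run" ]; then
    while :; do
        awake=0
        for d in $DISKS; do
            is_awake "$(hdparm -C "$d" 2>/dev/null)" && awake=1
        done
        if [ "$awake" = 1 ]; then
            for d in $DISKS; do
                is_awake "$(hdparm -C "$d" 2>/dev/null)" || wake "$d"
            done
            wait    # let the parallel spin-ups finish
        fi
        sleep 5
    done
fi
```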

NeilBrown



> 
> Patrik
> 
> >
> > --Larkin
> >
> > On 5/9/2012 12:37 PM, Patrik Horník wrote:
> >> Hello Neil,
> >>
> >> I want to propose some functionality for the Linux RAID subsystem
> >> that I think will be very practical for many users: automatic waking
> >> of drives. I am using my own user-land script, written years ago, to
> >> do that, and I don't know whether there is a standard solution now.
> >> If there is one, please point me to it.
> >>
> >> I am using a couple of big RAID5 arrays in servers working like
> >> NASes in a small office and at home, which are in use for only a
> >> small part of the day. I am using low-power servers and aggressive
> >> power-saving settings on the HDDs to make the power consumption
> >> substantially lower; for example, the drives go to sleep after 15
> >> minutes of inactivity. Normally the problem with such settings is
> >> the extremely long wake-up time when the array is accessed. Software
> >> accessing the data often first requests only a chunk of data on the
> >> first drive in the array and waits about 20-30 seconds for it; after
> >> processing it, it accesses data on another drive and waits another
> >> 20-30 seconds, and so on.
> >>
> >> I solved it with my own script in PHP, which monitors the drives'
> >> status periodically. When it detects that a drive from a RAID array
> >> has woken up, it immediately wakes the other drives. So the total
> >> wake-up time is equal to the wake-up time of one drive plus a couple
> >> of seconds. It has worked perfectly and smoothly for years for me.
> >>
> >> I attached the script from one of my servers; it is a little crude
> >> and uses hdparm and smartctl to monitor and manipulate the drives.
> >> It is a little customized and specific to its server; for example,
> >> one drive, detected by model, is not used to wake up the other
> >> drives, and two drives also put one another to sleep, because I
> >> found out the standby timeout setting was not working reliably on
> >> one drive. But you will get the idea.
> >>
> >> I think it could be useful for some users if there were a way to
> >> use such a feature. Do you think it would be useful? Do you think
> >> there is some place in the Linux RAID infrastructure where it could
> >> be implemented? (Possibly as a user-land tool using some kernel
> >> APIs; I don't know.)
> >>
> >> Best regards,
> >>
> >> Patrik Horník
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
