Re: [PATCH 1/3] libata: avoid waking disk for several commands

On 1/8/24 03:02, Phillip Susi wrote:
> When a disk is in SLEEP mode it can not respond to any
> commands.  Several commands are simply a NOOP to a disk
> that is in standby mode, but when a disk is in SLEEP mode,
> they frequently cause the disk to spin up for no reason.
> To avoid this, complete these commands in libata without
> waking the disk.  These commands are:

As commented in patch 3/3, please use full 72-char lines for commit messages.

> 
> CHECK POWER MODE
> FLUSH CACHE
> SLEEP
> STANDBY IMMEDIATE
> IDENTIFY
> 
> If we know the disk is sleeping, we don't need to wake it up
> to find out if it is in standby, so just pretend it is in

Sleep and standby are different power states, so saying that a disk that is
sleeping is in standby does not make sense. And if you wake up a drive from
sleep mode, it will *not* be in standby (I need to re-check, but I think that
holds true even with PUIS enabled).
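
To illustrate why a faked count of 0 is misleading, here is a minimal userspace
sketch (not kernel code) of how the Count field of a CHECK POWER MODE reply is
typically decoded. The enum and helper names are made up for this example. The
0x00/0x80/0xff values are the ones I recall from ACS, so please double-check
them against the spec. Note that there is no value for "sleep": a sleeping
device cannot answer the command at all, so setting result_tf.nsect to 0 as the
patch does reports a state (standby) that the drive is not in.

```c
#include <assert.h>
#include <stddef.h>

/* Power modes a device can report through CHECK POWER MODE (illustrative). */
enum ata_power_mode {
	ATA_PM_STANDBY,
	ATA_PM_IDLE,
	ATA_PM_ACTIVE_OR_IDLE,
	ATA_PM_UNKNOWN,
};

/* Hypothetical helper: decode the Count field of a CHECK POWER MODE reply. */
enum ata_power_mode decode_check_power_mode(unsigned char count)
{
	switch (count) {
	case 0x00:
		return ATA_PM_STANDBY;
	case 0x80:
		return ATA_PM_IDLE;
	case 0xff:
		return ATA_PM_ACTIVE_OR_IDLE;
	default:
		/* Other values exist (e.g. vendor or NV cache states). */
		return ATA_PM_UNKNOWN;
	}
}

/* Hypothetical helper: human readable name for a decoded power mode. */
const char *ata_power_mode_name(enum ata_power_mode mode)
{
	static const char *const names[] = {
		"standby", "idle", "active/idle", "unknown",
	};

	assert((size_t)mode < sizeof(names) / sizeof(names[0]));
	return names[mode];
}
```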

> standby.  While asleep, there's no dirty pages in the cache,
> so there's no need to flush it.  There's no point in waking
> a disk from sleep just to put it back to sleep.  We also have
> a cache of the IDENTIFY information so just return that
> instead of waking the disk.

The problem here is that ATA_DFLAG_SLEEPING is a horrible hack to avoid ending
up with lots of timeout failures if the user executes "hdparm -Y". Executing
such a passthrough command on a disk that is being used by an FS (for instance)
is complete nonsense and should not be done.

So I would rather see this handled correctly, through the kernel runtime PM
suspend/resume:
1) Define a libata device sysfs attribute that allows going to sleep instead of
the default standby when the disk is runtime suspended. If sleep is used, set
ATA_DFLAG_SLEEPING.
2) With that, any command issued to the disk will trigger runtime resume. If
ATA_DFLAG_SLEEPING is set, then the drive can be woken up with a link reset from
EH, going through ata_port_runtime_resume(), exactly like with the default
standby state for suspend. ATA_DFLAG_SLEEPING being set or not will indicate if
a simple verify command can spin up the disk or if a link abort is needed (like
done now in ata_qc_issue(), which is really a nasty place to do that).
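
The two steps above can be modeled with a small userspace sketch. This is not
libata code: struct model_dev, the "standby"/"sleep" attribute values and all
function names are assumptions made up for the illustration. Only the idea is
taken from the proposal: the attribute selects the suspend state, the suspend
path sets the equivalent of ATA_DFLAG_SLEEPING, and the resume path uses that
flag to choose between a verify command and a link reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical values accepted by the proposed sysfs attribute. */
enum suspend_mode { SUSPEND_STANDBY, SUSPEND_SLEEP };

/* How the device has to be woken up on runtime resume. */
enum wake_action { WAKE_VERIFY_CMD, WAKE_LINK_RESET };

/* Toy stand-in for struct ata_device; only the relevant state is kept. */
struct model_dev {
	enum suspend_mode mode;	/* selected through the attribute */
	bool sleeping;		/* models ATA_DFLAG_SLEEPING */
	bool suspended;		/* runtime suspended or not */
};

/* Parse what userspace writes to the attribute. Returns 0 or -1. */
int model_store_suspend_mode(struct model_dev *dev, const char *buf)
{
	size_t len = strcspn(buf, "\n");	/* ignore a trailing newline */

	if (len == strlen("standby") && !strncmp(buf, "standby", len))
		dev->mode = SUSPEND_STANDBY;
	else if (len == strlen("sleep") && !strncmp(buf, "sleep", len))
		dev->mode = SUSPEND_SLEEP;
	else
		return -1;
	return 0;
}

/* Step 1: runtime suspend sets the flag only if sleep was requested. */
void model_runtime_suspend(struct model_dev *dev)
{
	dev->sleeping = (dev->mode == SUSPEND_SLEEP);
	dev->suspended = true;
}

/* Step 2: runtime resume picks the wake-up method from the flag. */
enum wake_action model_runtime_resume(struct model_dev *dev)
{
	enum wake_action action;

	assert(dev->suspended);
	action = dev->sleeping ? WAKE_LINK_RESET : WAKE_VERIFY_CMD;
	dev->sleeping = false;
	dev->suspended = false;
	return action;
}
```

With something like this, the decision is made once in the resume path instead
of per command in ata_qc_issue().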

Now, the annoying thing is the drive being randomly woken up due to commands
being issued, like the ones you mention. This is indeed bad, and seeing news
like this:

https://www.phoronix.com/news/Linux-PM-Regulatory-Bugs

I think we really should do better...

But I do not think the kernel is necessarily the right place to fix this, at
least in the case of commands issued from userspace by things like smartd or
udevd. Patching there is needed to avoid uselessly waking up disks in runtime
suspend. systemd already has power policies etc, so there is integration with
the kernel side power management. Your issues come from using a tool (hdparm)
that has no integration at all with the OS daemons.
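
As a rough idea of what such a userspace check could look like, here is a
minimal sketch that reads the runtime PM status of a device from sysfs before
deciding to issue a command. The function names are made up for the example,
and the path a caller would pass (something like
/sys/block/sdX/device/power/runtime_status) is an assumption to verify on the
target system. The status strings are the ones exposed by the generic runtime
PM sysfs interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if a runtime_status string says the device is suspended. */
bool runtime_status_is_suspended(const char *status)
{
	size_t len = strcspn(status, "\n");	/* sysfs adds a newline */

	return len == strlen("suspended") &&
	       !strncmp(status, "suspended", len);
}

/*
 * Read a runtime_status attribute file.
 * Returns 1 if the device is runtime suspended, 0 if it is not,
 * and -1 if the attribute cannot be read.
 */
int disk_is_runtime_suspended(const char *status_path)
{
	char buf[32];
	FILE *f;

	assert(status_path != NULL);

	f = fopen(status_path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return runtime_status_is_suspended(buf) ? 1 : 0;
}
```

A daemon would then skip its periodic IDENTIFY or SMART polling when this
returns 1, instead of relying on the kernel faking the reply.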

For commands issued by FSes, like flush, these are generally not random at all.
If you see them appearing randomly, then there is a problem with the FS and
patching the FS may be needed. Besides flush, there are other things to consider
here. E.g. FSes using zoned block devices (SMR disks) doing garbage collection
while idle. We cannot prevent this from happening, which is why I seriously
dislike the idea of faking any command for a sleeping disk.


> ---
>  drivers/ata/libata-core.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
> index 09ed67772fae..6c5269de4bf2 100644
> --- a/drivers/ata/libata-core.c
> +++ b/drivers/ata/libata-core.c
> @@ -5040,6 +5040,26 @@ void ata_qc_issue(struct ata_queued_cmd *qc)
>  
>  	/* if device is sleeping, schedule reset and abort the link */
>  	if (unlikely(qc->dev->flags & ATA_DFLAG_SLEEPING)) {
> +		switch (qc->tf.command)
> +		{
> +		case ATA_CMD_CHK_POWER:
> +		case ATA_CMD_SLEEP:
> +		case ATA_CMD_FLUSH:
> +		case ATA_CMD_FLUSH_EXT:
> +		case ATA_CMD_STANDBYNOW1:
> +			if (qc->tf.command == ATA_CMD_ID_ATA)
> +			{
> +				/* only fake the reply for IDENTIFY if it is from userspace */
> +				if (ata_tag_internal(qc->tag))
> +					break;
> +				sg_copy_from_buffer(qc->sg, 1, qc->dev->id, 2 * ATA_ID_WORDS);
> +			}
> +			/* fake reply to avoid waking drive */
> +			qc->flags |= ATA_QCFLAG_RTF_FILLED;
> +			qc->result_tf.nsect = 0;
> +			ata_qc_complete(qc);
> +			return;
> +		}
>  		link->eh_info.action |= ATA_EH_RESET;
>  		ata_ehi_push_desc(&link->eh_info, "waking up from sleep");
>  		ata_link_abort(link);

-- 
Damien Le Moal
Western Digital Research




