Re: [PATCH 13/14] ahci: convert to new EH

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2006-04-11 at 22:48 +0900, Tejun Heo wrote:
> Convert AHCI to new EH.  Unfortunately, ICH7 AHCI reacts badly if IRQ
> mask is diddled during operation.  So, freezing is implemented by
> unconditionally clearing interrupt conditions while frozen.

>  
> -static inline int ahci_host_intr(struct ata_port *ap, struct ata_queued_cmd *qc)
> +static inline void ahci_host_intr(struct ata_port *ap)
>  {
> +	struct ahci_port_priv *pp = ap->private_data;
>  	void __iomem *mmio = ap->host_set->mmio_base;
>  	void __iomem *port_mmio = ahci_port_base(mmio, ap->port_no);
> -	u32 status, serr, ci;
> -
> -	serr = readl(port_mmio + PORT_SCR_ERR);
> -	writel(serr, port_mmio + PORT_SCR_ERR);
> +	u32 status, serror, ci;
> +	unsigned int eh_flags;
>  
>  	status = readl(port_mmio + PORT_IRQ_STAT);
>  	writel(status, port_mmio + PORT_IRQ_STAT);
>  
> -	ci = readl(port_mmio + PORT_CMD_ISSUE);
> -	if (likely((ci & 0x1) == 0)) {
> -		if (qc) {
> -			WARN_ON(qc->err_mask);
> -			ata_qc_complete(qc);
> -			qc = NULL;
> -		}
> +	/* AHCI gets unhappy if IRQ mask is diddled with while the
> +	 * port is active, so we cannot disable IRQ when freezing.
> +	 * Clear IRQ conditions and hope screaming IRQs don't happen.
> +	 */
> +	if (ap->flags & ATA_FLAG_FROZEN) {
> +		/* some AHCI errors hang the controller until SError
> +		 * is cleared.  Store and clear it.
> +		 */
> +		serror = scr_read(ap, SCR_ERROR);
> +		scr_write(ap, SCR_ERROR, serror);
> +		pp->eh_irq_stat |= status;
> +		pp->eh_serror |= serror;
> +		return;
>  	}
>  
> -	if (status & PORT_IRQ_FATAL) {
> -		unsigned int err_mask;
> -		if (status & PORT_IRQ_TF_ERR)
> -			err_mask = AC_ERR_DEV;
> -		else if (status & PORT_IRQ_IF_ERR)
> -			err_mask = AC_ERR_ATA_BUS;
> -		else
> -			err_mask = AC_ERR_HOST_BUS;
> -
> -		/* command processing has stopped due to error; restart */
> -		ahci_restart_port(ap, status);
> -
> -		if (qc) {
> -			qc->err_mask |= err_mask;
> -			ata_qc_complete(qc);
> +	if (!(status & PORT_IRQ_ERROR)) {
> +		struct ata_queued_cmd *qc;
> +
> +		if ((qc = ata_qc_from_tag(ap, ap->active_tag))) {
> +			ci = readl(port_mmio + PORT_CMD_ISSUE);
> +			if ((ci & 0x1) == 0) {
> +				ata_qc_complete(qc);
> +				return;
> +			}
>  		}
> +
> +		if (ata_ratelimit())
> +			printk(KERN_INFO "ata%u: spurious interrupt "
> +			       "(irq_stat 0x%x active_tag %d)\n",
> +			       ap->id, status, ap->active_tag);
> +
> +		return;
>  	}
>  
> -	return 1;
> +	/* Something weird is going on.  Hand over to EH. */
> +	serror = scr_read(ap, SCR_ERROR);
> +	scr_write(ap, SCR_ERROR, serror);
> +
> +	pp->eh_irq_stat = status;
> +	pp->eh_serror = serror;
> +
> +	eh_flags = ATA_EH_ABORT;
> +	if (status & PORT_IRQ_FREEZE)
> +		eh_flags |= ATA_EH_FREEZE;
> +
> +	ata_eh_schedule_port(ap, eh_flags);
>  }
>  

Anther problem with this patch is that interrupts happened
when port is in ATA_FLAG_FROZEN state are recorded in pp->eh_irq_stat.
So a later user-initiated warm-unplug operation might trigger EH
to handle the error conditions previously recorded in pp->eh_irq_stat.

I observed this problem in our lab
1 execute many hot-plug/unplug operations, and finally disk B is
at port 2

2 echo "1" > /sys/class/scsi_host/host1/device/target1\:0\:0/1\:0\:0\:0/
delete

3 then the output in dmesg is:
ata2: dev 0 disabled
ata2: stat 0x50 err 0x0 SError 0x40d0000 action 0x2
      (irq_stat 0x00400040, connection status changed)
ata2: probing bus for new devices
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2: dev 0 cfg 49:2f00 82:7469 83:7f01 84:4023 85:7469 86:3c01 87:4023
88:407f
ata2: dev 0 ATA-7, max UDMA/133, 156301488 sectors: LBA48
ata2: dev 0 configured for UDMA/133
  Vendor: ATA       Model: WDC WD800JD-22LS  Rev: 06.0
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 156301488 512-byte hdwr sectors (80026 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
 sdb: unknown partition table
sd 1:0:0:0: Attached scsi disk sdb

4 I execute "echo "1" > /sys/class/scsi_host/host1/device/target1\:0
\:0/1\:0\:0\:0/delete" again, then the output in dmesg is:
ata2: dev 0 disabled
ata2: stat 0x50 err 0x0 SError 0x0 action 0x2
      (irq_stat 0x00000000)
ata2: soft resetting channel for error handling
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)


Thanks,
Forrest
-
: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux