Re: Possible bug cause by Commit 5b01e4b9efa0b78672cbbea830c9fbcc7f239e29 >libata: Implement NCQ autosense

On 10/17/2016 03:54 PM, Arthur Koziol wrote:
> Hannes,
>
> Hi, I'm Arthur. I recently discovered what might be a regression in the
> 4.8 kernel with ATA code. I was working with Patrick Verner of the
> Parted Magic distro but we have had no luck figuring out the issue. I am
> attaching a screenshot of the error I am seeing as it might make sense
> to you. Patrick said this has been an on again off again bug for years.
>
> This problem is happening on shutdown but it seems, for now, I am only
> seeing it on Dell Optiplex 7010 & 7020 models. Both machines have the
> most current BIOSes of course and the issue was not present previous to
> the 4.8 kernel. I can replicate the same behavior using KUbuntu 16.10
> which is from where I got this screen grab.
>
> I found this commit when looking through the changes listed for 4.8 on
> Kernel Newbies. I also saw that Linus pulled in a bunch of libata stuff
> as shown here:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e4f7bdc2ec0d0dcc27f7d70db27a620dfdc1f697
> so it could be one of those. I am just trying to maybe track down where
> the change happened that could be causing this issue and let someone
> know. I am not a Linux dev, just trying to report a problem.
>
> So, I hope you can help me with this info and if not, maybe you can
> point me to someone who can? I'm sure I am not the only one seeing this
> weird issue.

This is really an odd issue, but it is very unlikely to be caused by the mentioned patchset. Keep in mind that the patchset only affects the reporting of errors, not functionality in general. So an error has to occur first; only then would the patchset have any effect.

The error I'm seeing in the screenshot is an ATA exception due to a failed 'Get event status notification', which is typically only ever sent to a CD-ROM. The next one is a failed 'FLUSH CACHE' command, after which the host tries to do a COMRESET (i.e. a hardware reset), which fails, causing the system to disable the device.
So far, so good.

Only it doesn't give any indication of what happened prior to that.
Can you send the sequence of events leading to that error?
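
One possible way to collect that sequence is to filter the kernel log for libata messages. The following is only a minimal sketch: the helper name ata_events is hypothetical, and it assumes the relevant lines carry the usual "ataN:" or "ataN.NN:" prefix.

```python
import re

# Assumption: libata log lines are prefixed with a port ("ata1:") or a
# device on that port ("ata1.00:"). Other drivers are not matched.
ATA_LINE = re.compile(r"\bata\d+(?:\.\d+)?:")


def ata_events(log_text):
    """Return the lines of log_text that carry an ATA port/device prefix,
    in their original order."""
    return [line for line in log_text.splitlines() if ATA_LINE.search(line)]
```

The kernel log text could be fed in from the output of dmesg, for example by passing sys.stdin.read() to ata_events(). Since the failure happens at shutdown, the log of the previous boot (e.g. "journalctl -k -b -1", if the journal is kept persistently on that system) is probably the more useful input.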

Cheers,

Hannes
--
Dr. Hannes Reinecke		      zSeries & Storage
hare@xxxxxxx			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)