On 01/14/2016 03:48 PM, Emmanuel Florac wrote:
On a machine running a plain vanilla 3.10.23 kernel (old, I know) on
Debian Wheezy (7.3, not up to date either), with RAID arrays connected
through an Adaptec ASR-6445 RAID controller (aacraid driver, version
30200), I've seen messages like the ones below filling up /var/log/messages.
Of course some hardware fault occurred, but I don't understand why the
message is so mangled. Could it be a driver bug? Something else?
Jan 13 18:40:28 storiq -- MARK --
Jan 13 18:44:45 storiq kernel: <<3>sd 6:0:2:0: rejecting I/O to offline device
Jan 13 18:45:10 storiq kernel: quiet_error: 6 callbacks suppressed
Jan 13 18:45:10 storiq kernel: lost page write due to I/O error on dm-1
Jan 13 18:45:10 storiq last message repeated 9 times
Jan 13 18:45:15 storiq kernel: <3<3>sd<3>sd<3>s<3>s<3<3<3>sd<3><3>s<3><3>sd<3><3<3>s<<3><3><3>sd <3><<<3>sd <3><3<3>sd<<3><3>sd 6<3>sd<3>sd<3>sd<<3><3><3>s<3>s<3>sd <3><3>s<3>s<3><3>sd<3><3<3>s<<3>sd<3>sd<<3>s<3><3>s<3>sd<3>s<3<3>s<3>sd <3>sd<3><3><3><3>sd<3><3><3><3<3>s<<3<3><3<3<3><3><3><3><3>sd<3<3><3><3><3><<3>s<3>sd <3>sd <3>sd<3><3>sd<3>s<3>sd <3><3><<<3<3>s<3>sd <3>sd 6:0:2:0: re<3>s<3>s<3>s<3>sd<3>sd<3>sd<3>sd <<3>s<3>sd <3>sd <3>sd<3>sd <3>sd <<3>sd <3<3>sd<3><3>s<3<3>s<<3><3>sd<3>sd<3>sd<<3>sd<3>sd<3>sd<<3><3>s<3>s<3>sd<<3><3><3>sd<3<3><<3>sd <3>s<<3>s<<3>s<3>s<3>sd<3>sd <3>sd<<3>sd<3>sd<3><3>sd<<3>sd<3>sd<3>sd<3>sd<3><3>sd <3>s<3>sd <3>sd 6:<3><<3>s<3>s<3>s<3>sd<3>s<3>s<3>s<3>s<3>s<3>sd<3>s<3><3>s<3>sd<3><3>sd<3>s<3<3><3<3>sd<3><3<3>sd<<3>sd<3<3>s<3>sd <3><3><3>s<3>s<3>sd<3<3><3>s<3>s<3>sd 6:0:2:0: rejectin<3>sd <3>sd<<3><3<<3>s<3>sd <3><3>s<3<3><3>s<3>sd <3>sd <3>s<3>sd<3>sd 6<3>sd<3><3<3>sd <3>sd<3>s<3>sd <3><3><3>s<3<3>sd<3>s<3<3>s<3>sd <<3>sd<3><3><3>sd<3><3>s<3<<3><
3
<3>sd 6:<<<3>s<3>sd <3>sd<3>
Jan 13 18:45:15 storiq kernel: >sd <3>sd <3>s<3>sd<3>s<<3><3>s<3>sd<3<3<<3>sd<3>sd <3>sd<3>sd<3>sd <<3><3><<3>s<3>sd <<3>sd <3>s<3>s<3>sd <3>sd<3>sd <3><3><3>s<3>s<3><3><3<3><3>sd<3><3<3>s<3><3>sd<3>sd <3><3>sd<3><3>sd<3><3><3>s<3>sd <<3>sd 6:0:2:0: rejecting<3><3<3<3>s<3>sd<3>s<3>sd<3>sd <<3>s<3>sd <3<3><3>s<3>sd<3>sd<<3>sd<3>sd<3>sd <3>sd<3>s<3>sd <3>sd <<3>s<3>sd 6:<3>sd<3>sd <3>sd<3>sd <3>sd<<3><3><3><3>s<3>sd<<<3>sd<<3>sd<<3><3>sd<<3>sd<3>s<3>sd<3><3><3><3><<3>s<3>sd <3>sd<3>s<3>sd<3>sd<<3>s<3>sd <3><3><3>sd 6<<3>sd<3>sd<3>sd<3>sd <3><3><3>sd <3>sd<3>sd <3>sd <3>sd <3><3><3>sd<3>sd <3><3><3>sd <3>sd <3>sd <3><3>sd<3>s<3>s<3<<3>sd<3>sd<3><3>sd<<3><3<3>sd<3>s<3><3>s<3>sd<<3>sd<3>sd <<3><3><3>s<3>sd <3>sd 6<3>sd<<3>s<3>s<3><3>sd <3>s<3>sd<<3><<3>sd <3>sd <3>s<3<3>sd 6<<3>s<3>sd <3>sd <3>sd <3><3<3>sd<3<3><3>s<3>s<3>s<3<3>s<3>sd <<3>s<3>sd <3>sd<3<3>s<3><3<3>s<3>sd<3>sd <<3>sd<3><3>s<3>s<3>s<3><3>sd <3>s<3>s<3>sd<3<3><3>sd <3>sd<3<3><3>sd <3>sd 6<3>s<3>sd <3>s<3>sd <3><3><3>sd<<3>
s
d
<3>sd <3><3<3>sd<3>sd<3>sd<
This is an artifact of the Linux logging system.
The '<3>' is in fact the logging priority prefix, which _should_
have been evaluated and dropped by the call to printk().
However, printk() has this brilliant feature of 'line
continuation', which assumes the output line is to be continued
when no trailing newline is found.
If you add to that the fact that printk() might be called from
different contexts / threads / CPUs simultaneously, you can get
message interleaving under high load (i.e. when lots of messages are
printed simultaneously).
Then the line continuation kicks in, tries to print everything on
one line, and doesn't interpret the leading '<3>' of the fragments
it appends.
Which is what you see.
So I wouldn't classify this as a driver bug, but rather as a
shortcoming in the Linux logging system.
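As a rough illustration of the mechanism described above, here is a
small Python model (not kernel code). It assumes a much simplified
continuation buffer: the priority prefix ('<3>' corresponds to
KERN_ERR) is parsed and dropped only at the start of a new line, and
any fragment appended to an unfinished line is copied verbatim. The
names emit(), strip_stray_prefixes() and the sample fragments are
hypothetical and chosen only for this sketch.

```python
import re

# Hypothetical model only: this is NOT how the kernel implements printk().
# Priority prefixes look like "<0>" .. "<7>"; "<3>" is the error level.
PREFIX = re.compile(r"<[0-7]>")


def emit(fragments):
    """Merge fragments the way a naive continuation buffer would.

    The prefix is interpreted and dropped only when a fragment starts a
    new line. Fragments appended to an unfinished line are copied as is.
    """
    lines, current = [], None
    for frag in fragments:
        if current is None:
            # Start of a new line: parse and drop the leading prefix.
            m = PREFIX.match(frag)
            current = frag[m.end():] if m else frag
        else:
            # Continuation: the prefix is not interpreted, so it stays.
            current += frag
        if current.endswith("\n"):
            lines.append(current)
            current = None
    if current is not None:
        lines.append(current)
    return lines


def strip_stray_prefixes(line):
    """Remove leftover complete '<N>' markers to make a line more readable."""
    return PREFIX.sub("", line)


# Assumed input: several writers whose output arrives in pieces,
# interleaved, and mostly without a trailing newline.
interleaved = [
    "<3>sd",
    "<3>sd 6:0:2:0: re",
    "<3>s",
    "<3>sd 6:0:2:0: rejecting I/O to offline device\n",
]
mangled = emit(interleaved)[0]
print(mangled, end="")
print(strip_stray_prefixes(mangled), end="")
```

In this model, fragments that each end with a newline come out as
clean separate lines, while unterminated fragments are glued together
with their '<3>' markers still embedded. The strip helper only cleans
up complete markers for reading; it cannot recover the text that was
lost or cut short by the interleaving.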
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@xxxxxxx +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)