Re: [SCSI REGRESSION] 3.10.2 or 3.10.3: arcmsr failure at bootup / early userspace transition

On 29 Jul 2013, Bernd Schubert said:

> Hi Nick,
>
> On 07/29/2013 12:10 PM, Nick Alcock wrote:
>> arcmsr0: abort device command of scsi id = 0 lun = 1
>> arcmsr0: abort device command of scsi id = 0 lun = 0
>> arcmsr: executing bus reset eh.....num_resets=0, num_[...]
>>
>> arcmsr0: wait 'abort all outstanding command' timeout
>> arcmsr0: executing hw bus reset ....
>> arcmsr0: waiting for hw bus reset return, retry=0
>> arcmsr0: waiting for hw bus reset return, retry=1
>> Areca RAID Controller0: F/W V1.46 2009-01-06 & Model ARC-1210
>> arcmsr: scsi  bus reset eh returns with success
>> [and back to the top of the error messages again, apparently forever,
>>   not that the machine would be much use without its RAID array even
>>   if this loop terminated at some point, so I only gave it a couple
>>   of minutes]
>>
>> The failure happens precisely at the moment we transition to early
>> userspace, so presumably userspace I/O is failing (or something related
>> to raw device access, perhaps, since the first thing it does is a
>> vgscan).
>>
>> I haven't bisected yet (sorry, I have work to do which means this
>> machine must be running right now), but nothing has changed in the
>> arcmsr controller, nor in SCSI-land excepting
>>
>> commit 98dcc2946adbe4349ef1ef9b99873b912831edd4
>> Author: Martin K. Petersen <martin.petersen@xxxxxxxxxx>
>> Date:   Thu Jun 6 22:15:55 2013 -0400
[...]
>> Obviously, at this point, this machine has no modules loaded (it has
>> almost none loaded even when fully operational)
>
> I tested this patch with ARC-1260 and F/W V1.49, no issues. Also, this
> patch is only in 3.10.3, but not yet in 3.10.1.

... and I see this problem with 3.10.3 but not 3.10.1. (Haven't tried
3.10.2.)

>                                                 And I don't think this
> commit can cause your issue at all: a failing heuristic would enable
> WRITE SAME and cause issues with linux-md, but nothing should happen
> directly in the SCSI layer. Which was your last working kernel
> version?

3.10.1. :)

No changes to arcmsr between those versions... I suspect I'll have to
bisect, which will be a complete pig because every failure means a hard
powerdown of this box. Always-on servers rarely appreciate hard
powerdowns :(
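
For a rough idea of how many hard powerdowns that means, here is a minimal
sketch of the search a bisection performs. It is not git's actual
implementation, only the plain binary search behind it. The function name
first_bad, the is_bad callback (standing in for "boot it and see whether
arcmsr loops") and the outcome assumed for 3.10.2 are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of test boots needed).

    commits is ordered oldest to newest. commits[0] is already known
    good and commits[-1] is already known bad, so neither is re-tested.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    boots = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        boots += 1  # each test is one boot, and one hard powerdown if bad
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi], boots


# Hypothetical run over the three releases: 3.10.1 is known good,
# 3.10.3 is known bad, and we pretend 3.10.2 turns out to be fine.
releases = ["v3.10.1", "v3.10.2", "v3.10.3"]
culprit, boots = first_bad(releases, lambda v: v == "v3.10.3")
```

Under that assumption a single boot of 3.10.2 pins the regression to one
point release. Within a release the number of boots grows only with the
logarithm of the number of commits, but roughly half of them will be bad
ones.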

-- 
NULL && (void)