RE: [Fastboot] [RFC] [PATCH 2/2] kdump: cciss driver initialization issue fix

> -----Original Message-----
> From: Eric W. Biederman [mailto:ebiederm@xxxxxxxxxxxx] 
> Sent: Monday, June 26, 2006 11:38 AM
> To: Miller, Mike (OS Dev)
> Cc: vgoyal@xxxxxxxxxx; Maneesh Soni; Andrew Morton; 
> Neela.Kolli@xxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx; 
> fastboot@xxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [Fastboot] [RFC] [PATCH 2/2] kdump: cciss driver 
> initialization issue fix
> 
> "Miller, Mike (OS Dev)" <Mike.Miller@xxxxxx> writes:
> 
> > All,
> > Sorry to come in late and top post. I've been out of the office and 
> > I'm trying to get to the gist of this issue.
> > Exactly what is the problem? I'm not familiar with kdump so I don't 
> > have a clue about what's going on.
> > There are a couple of reset features supported by _some_ cciss 
> > controllers. I'd have to go back to the open spec to see what's in the
> > public domain. We're trying to get the open spec updated and more
> > complete but we're waiting on the lawyers. :(
> 
> 
> kdump, or taking crash dumps using the kexec on panic
> mechanism, could be called a driver's worst nightmare.  In the
> latest distros this is becoming the way crash dump style
> information is captured.
> 
> Because the initial kernel is broken we do a jump into 
> another kernel that is sufficient to record a crash dump.  
> That second kernel initializes the hardware from whatever 
> random state the first kernel left the drivers in.  That 
> first kernel is not permitted to do any device shutdown activities.
> 
> The problem is that a command the running instance of the
> driver did not initiate completes.  At least if I read Vivek's
> patch 2/2 correctly.
> 
> So we have three options.
> - reset the card during initialization.
> - handle the case of a command we did not initiate completing.
> - mark the driver/card as impossibly hopeless for use in a crash
>   dump scenario.
> 
> 
> Eric
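
To make the second option concrete, here is a rough sketch of what
"handle the case of a command we did not initiate completing" could look
like. It is a plain C model, not cciss code: the struct, the tag scheme,
the queue depth and both function names are made up for illustration.
The idea is only that the driver tracks which tags it issued itself and
counts and drops any completion that does not match.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_CMDS 8  /* hypothetical queue depth, not a cciss value */

/* Hypothetical bookkeeping; these are not the real cciss structures. */
struct cmd_tracker {
    bool     in_flight[MAX_CMDS]; /* true if this driver instance issued the tag */
    unsigned stale_completions;   /* completions we did not initiate */
};

/* Record that this driver instance issued the command with this tag. */
static bool tracker_submit(struct cmd_tracker *t, uint32_t tag)
{
    if (tag >= MAX_CMDS || t->in_flight[tag])
        return false;
    t->in_flight[tag] = true;
    return true;
}

/*
 * Handle one completion reported by the controller.
 * Returns true if it matched a command we issued; otherwise the
 * completion is counted as stale and dropped.
 */
static bool tracker_complete(struct cmd_tracker *t, uint32_t tag)
{
    if (tag >= MAX_CMDS || !t->in_flight[tag]) {
        t->stale_completions++;
        return false;
    }
    t->in_flight[tag] = false;
    return true;
}
```

In a real driver the stale path would also have to decide whether the
controller can be trusted at all after such a completion; the sketch
leaves that open.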

Thanks Eric, that helps me understand. Section 8.2.2 of the open cciss
spec describes a reset message; target 0x00 is the controller. We could
send this from the init routine to make sure the board is sane again, but
that would drastically increase init time under normal circumstances.
I also suspect this is a hard reset, and I'm not sure whether that would
negatively impact kdump. If there were some condition we could test for,
and we performed the reset only when that condition is met, the change
would not affect 99.9% of users.

Thoughts, comments, flames?

mikem