Re: [PATCH 0/8] AER fix & cleanup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thursday 03 September 2009 10:45:25 pm Hidetoshi Seto wrote:
> Bjorn Helgaas wrote:
> > Since you're doing a lot of cleanup in aerdrv_errprint.c anyway,
> > the format of the messages it prints is pretty strange compared
> > to typical Linux messages.  Here's a sample:
> > 
> >   +------ PCI-Express Device Error ------+
> >   Error Severity          : Corrected
> >   PCIE Bus Error type     : Physical Layer
> >   Receiver Error          : First
> >   Receiver ID             : 0f00
> >   VendorID=103ch, DeviceID=403bh, Bus=0fh, Device=00h, Function=00h
> > 
> > I'd like this a lot better if it used dev_printk and other Linux
> > conventions, e.g., "pci 0000:0f:00.0" instead of "Bus=0fh, Device=00h,
> > Function=00h" and "device[103c:403b]" instead of "VendorID=103ch,
> > DeviceID=403bh".
> 
> Good point.
> Such kind of format change is worth to do, I think.
> 
> > Also, it currently calls printk many times, so the output is often
> > interspersed with other messages, and there's no nice way to put it
> > back together or even tell where the end is.  There's no loop, so
> > there is a clear maximum output size; maybe everything could be
> > put into a buffer and printed with a single dev_printk to make it
> > atomic.
> 
> Unfortunately there could be some loop if multiple errors are found.
> I think the "interspersed message" is a common issue on where need
> to print long strings in multiple lines, and it will be more serious
> on where uses KERN_CONT and/or pr_cont().

Oh, I see, you got rid of the static errmsg_buff[] (which limited
the output size before), so the loop in aer_get_error_source_name()
determines the output size.  That's a nice change; there was
nothing in the the old code to keep us from overflowing errmsg_buff[]
and corrupting something.

> Buffering messages might be a bit over-engineered if it is just to
> avoid this problem.  Since there is no KERN_CONT in aerdrv_errprint.c,
> I think using dev_printk would be enough since message lines are printed
> each with prefix "pci 0000:00:00.0: ..." so it would not be so hard to
> put it back together.

Yep, if we have the dev_printk prefixes, it probably would be over-
engineering to make everything atomic.

> OK, I'll make one more patch to change the format.
> Following is a blueprint.  Feel free to comment on it:
> (Note: handwritten, actual error not used)
> 
> Before:
>   +------ PCI-Express Device Error ------+
>   Error Severity          : Corrected
>   PCIE Bus Error type     : Data Link Layer
>   Bad TLP                 :
>   Bad DLLP                : First
>   Receiver ID             : 0f00
>   VendorID=103ch, DeviceID=403bh, Bus=0fh, Device=00h, Function=00h
>   Error of this Agent(0f00) is reported first
>   +------ PCI-Express Device Error ------+
>   Error Severity          : Corrected
>   PCIE Bus Error type     : Data Link Layer
>   Bad TLP                 : First
>   Receiver ID             : 1700
>   VendorID=103ch, DeviceID=403bh, Bus=17h, Device=00h, Function=00h
> 
> After:
>   pci 0000:01:00.0: AER: Multiple Corrected error received: id=0f00
>   pci 0000:0f:00.0: PCIE Bus Error: severity=Corrected, type=Data Link Laye
>   r, id=0f00(Receiver ID)
>   pci 0000:0f:00.0:   device[103c:403b] error status/mask=000020c0/00002000

I should have put a space after "device".

>   pci 0000:0f:00.0:    [ 6] Bad TLP
>   pci 0000:0f:00.0:    [ 7] Bad DLLP               (First)
>   pci 0000:0f:00.0:   Error of this Agent(0f00) is reported first

Are "id=0f00" in the first line and "Agent(0f00)" references to PCI
device 0f:00?  If so, it'd be nice to print them in the conventional
Linux format, including the domain.

>   pci 0000:17:00.0: PCIE Bus Error: severity=Corrected, type=Data Link Laye
>   r, id=1700(Receiver ID)
>   pci 0000:17:00.0:   device[103c:403b] error status/mask=00000040/00002000
>   pci 0000:17:00.0:    [ 6] Bad TLP                (First)

I think that's great.  Thanks a lot for considering this change.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux