RE: MegaCli fails to communicate with Raid-Controller

>
> Interesting. What is considered old and new? I have a third machine, "Dell
> R515, MegaRAID SAS 2108"; is that considered new? It's running the same
> Xen/kernel/MegaCli versions as the other two, but the error does not
> occur.

No, this is also an old controller. When I say new controller, I mean one
that is still under active development, such as SAS3.0 and SAS3.5. The
driver-level changes related to DMA mask settings are FW dependent, so we
cannot enable them for all controllers.

>
> > There can be two possibilities.
> >
> > 1. This is actual memory allocation failure due to system resource
> > issue.
>
> I have not seen any OOMs on the two machines when/where the SGL-error
> occurs. According to "xl info" and our munin-graphs it all looks ok, with a
> couple of hundred MiB "free".
>
>
> > 2. The IOCTL provided a large memory length in iov, and the DMA buffer
> > allocation from the API below failed due to the large memory chunk
> > requested.
> >
> >                kbuff_arr[i] = dma_alloc_coherent(&instance->pdev->dev,
> >                                                    ioc->sgl[i].iov_len,
> >                                                    &buf_handle,
> >                                                    GFP_KERNEL);
> >
> > Can you add a *printk* to the driver code to dump iov_len? Just to confirm.
>
> Just did that on the “Dell R730xd, MegaRAID SAS-3 3108” and get the
> following output when the megacli works fine.
>
> ###
> Apr 23 09:31:37 xh643 kernel: [  368.319092] GD IOV-len: 2048
> Apr 23 09:31:37 xh643 kernel: [  368.319426] GD IOV-len: 32
> Apr 23 09:31:37 xh643 kernel: [  368.319563] GD IOV-len: 320
> Apr 23 09:31:37 xh643 kernel: [  368.319698] GD IOV-len: 616
> Apr 23 09:31:37 xh643 kernel: [  368.319887] GD IOV-len: 1664
> Apr 23 09:31:37 xh643 kernel: [  368.320040] GD IOV-len: 32
> Apr 23 09:31:37 xh643 kernel: [  368.320174] GD IOV-len: 8
> …
> ###
>
> Full output is attached in iov_len_megacli_works.txt. It also contains the
> output of /proc/buddyinfo, which might be important based on my research so
> far.

We need similar output whenever there is a dma_alloc_coherent() failure.
Did you add the new prints in the failure path of dma_alloc_coherent(), or
is it a generic print for all cases?



