Hi all,

after our latest kernel update from 4.6. to 4.14.14 we are having trouble getting data out of our LSI RAID controllers using the megacli binary.

Every execution of the megacli binary produces a line like this in kern.log:

###
[547216.425556] megaraid_sas 0000:03:00.0: Failed to alloc kernel SGL buffer for IOCTL
###

Stracing a megacli command shows that the ioctl fails with ENOMEM, which is expected given the error message above:

###
ioctl(3</dev/megaraid_sas_ioctl_node>, MCE_GET_RECORD_LEN or MTRRIOC_SET_ENTRY, 0x7c98d0) = -1 ENOMEM (Cannot allocate memory)
###

This does not happen on a freshly booted machine. After a reboot it usually takes roughly 2-3 days for the error to occur, but then it stays. After the first occurrence a megacli command occasionally works, seemingly at random, but only once; after that it keeps failing again.

Current hardware:

Dell R710, MegaRAID SAS 1078, Debian Jessie, Xen 4.10, Kernel 4.14.14
- virtual disk 1: 2x 600 GB SEAGATE ST3600057SS, RAID-1
- virtual disk 2: 4x 2 TB SEAGATE ST32000444SS, RAID-10

Dell R730xd, MegaRAID SAS-3 3108, Debian Jessie, Xen 4.10, Kernel 4.14.14
- same as above

Megaraid driver version on the new 4.14.14 kernel:

###
filename:       /lib/modules/4.14.14-2-xen0-he+/kernel/drivers/scsi/megaraid/megaraid_sas.ko
description:    Avago MegaRAID SAS Driver
author:         megaraidlinux.pdl@xxxxxxxxxxxxx
version:        07.702.06.00-rc1
license:        GPL
srcversion:     15F82F234414CB9CE82AE3D
###

Megaraid driver version on the current 4.6. kernel:

###
filename:       /lib/modules/4.4.74-1-xen0-he+/kernel/drivers/scsi/megaraid/megaraid_sas.ko
description:    Avago MegaRAID SAS Driver
author:         megaraidlinux.pdl@xxxxxxxxxxxxx
version:        06.808.16.00-rc1
license:        GPL
srcversion:     AAF4E2A6BAB0B1254F758CA
###

MegaCli version:

###
$ megacli-perc5 -v
MegaCLI SAS RAID Management Tool Ver 8.07.14 Dec 16, 2013
###

It may also be interesting that querying all controllers with "-aall" does not seem to find any controller, while querying a specific controller exits with an error, even though it is definitely there:

###
$ megacli-perc5 -ldpdinfo -aall
Exit Code: 0x00

$ megacli-perc5 -ldpdinfo -a0
User specified controller is not present.
Failed to get CpController object.
Exit Code: 0x01
###

Our monitoring script runs the following command sequence every 20 minutes:

###
megacli-perc5 -LDGetNum -a0 -NoLog
megacli-perc5 -adpallinfo -a0 -nolog
megacli-perc5 -adpgettime -a0 -nolog
megacli-perc5 -fwtermlog -bbuon -a0 -silent -nolog
megacli-perc5 -adpbbucmd -getbbucapacityinfo -a0 -nolog
megacli-perc5 -ldpdinfo -a0 -nolog
megacli-perc5 -ldinfo -l0 -a0 -nolog
megacli-perc5 -ldinfo -l1 -a0 -nolog
###

I failed to reproduce this on a secondary machine, so I'm looking for clues on how to debug this further. I have looked at the kernel's git log, but couldn't match any change to my problem. I have also looked at the fwtermlog of the controller, but there is nothing of interest in there.

Any ideas?

best regards
volker
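
To pin down when the failures begin after a boot, the kern.log lines quoted above could be collected with a small script. What follows is only a minimal sketch, not something that was run on the affected machines: the function names `find_sgl_failures` and `summarize` are made up for illustration, and the regular expression assumes every failure line has exactly the format of the one quoted above.

```python
import re

# Matches kernel log lines of the form quoted in this post, e.g.
# [547216.425556] megaraid_sas 0000:03:00.0: Failed to alloc kernel SGL buffer for IOCTL
# Assumption: the message text is identical for every failure.
SGL_FAILURE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+megaraid_sas\s+(?P<pci>[0-9a-fA-F:.]+):\s+"
    r"Failed to alloc kernel SGL buffer for IOCTL"
)


def find_sgl_failures(lines):
    """Return (uptime_seconds, pci_address) for each SGL allocation failure."""
    hits = []
    for line in lines:
        match = SGL_FAILURE.search(line)
        if match:
            hits.append((float(match.group("uptime")), match.group("pci")))
    return hits


def summarize(hits):
    """Return the uptime of the earliest failure in days and the failure count.

    The bracketed timestamp is seconds since boot, so the result is only
    meaningful if the input lines all come from a single boot.
    """
    if not hits:
        return None
    first = min(uptime for uptime, _ in hits)
    return {"first_failure_days": first / 86400.0, "count": len(hits)}
```

It could be fed a log file with something like `summarize(find_sgl_failures(open("/var/log/kern.log", errors="replace")))`, where the path is the usual Debian location and may differ on other setups. Comparing the first-failure uptime across several boots would show whether the 2-3 day figure is stable.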