On Wed, May 21, 2014 at 1:01 PM, Charalampos Pournaris <charpour@xxxxxxxxx> wrote:
> Hi Thomas,
>
> On Tue, May 20, 2014 at 10:44 PM, Thomas Glanzmann <thomas@xxxxxxxxxxxx> wrote:
>> Hello Harry,
>>
>>> http://pastebin.com/AqqJaYVX
>>
>> I checked my log from 11th October last year when this happened to me,
>> and it looks like the same error we're hitting:
>>
>> ...
>> Oct 11 11:53:56 node-62 kernel: [219465.151250] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 5488
>> Oct 11 11:53:56 node-62 kernel: [219465.151261] ABORT_TASK: Found referenced iSCSI task_tag: 5494
>> Oct 11 11:53:56 node-62 kernel: [219465.151264] ABORT_TASK: ref_tag: 5494 already complete, skipping
>> Oct 11 11:53:56 node-62 kernel: [219465.151267] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 5494
>> Oct 11 11:53:56 node-62 kernel: [219465.151271] ABORT_TASK: Found referenced iSCSI task_tag: 5495
>> Oct 11 11:53:56 node-62 kernel: [219465.151273] ABORT_TASK: ref_tag: 5495 already complete, skipping
>> Oct 11 11:53:56 node-62 kernel: [219465.151275] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 5495
>> Oct 11 11:54:09 node-62 kernel: [219478.744212] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000008
>> Oct 11 11:54:09 node-62 kernel: [219478.751738] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 5508
>> Oct 11 11:54:23 node-62 kernel: [219492.351282] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000013
>> Oct 11 11:54:23 node-62 kernel: [219492.358819] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 5514
>> Oct 11 11:54:23 node-62 kernel: [219492.630489] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x0000001d
>> ...
>>
>> Full log here: https://thomas.glanzmann.de/crash/extracted_crash_dmesg.log
>>
>> I'll try to reproduce it and let the list know when I find something.
>> Harry, can you let me know how many VMs you had running, whether they
>> were thin or thick provisioned, and whether they were idle or under a
>> heavy workload?
>>
>> In my case I did _not_ use jumbo frames, but 802.3ad bonding. My
>> switches are configured to be able to deliver jumbo frames.
>>
>> Cheers,
>> Thomas
>
> Indeed, it seems that we hit the same issue, as the log lines look
> pretty similar.
>
> Our setup comprised around 7-10 powered-on VMs with some activity (not
> too intense), and if I recall correctly some VM deployments (through
> OVF/OVA) had been made prior to the failure. Obviously, there are no
> discrete steps to reproduce the problem... I hope this helps you
> reproduce it in your environment, as it's difficult for me to make
> changes in our production one (e.g. to recompile the kernel with
> debugging enabled).
>
> Thanks!
>
> Regards,
> Harry

Forgot to mention here that the VMs were thin provisioned (most of them,
at least).

Regards,
Harry
--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
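[Editor's note: for anyone triaging a similar report, the ABORT_TASK /
NON_EXISTENT_LUN pattern quoted above can be pulled out of a saved kernel
log with a couple of greps. This is an illustrative sketch, not from the
thread; the inlined sample lines are copied from the excerpt, and in
practice you would point LOG at a real dmesg dump or /var/log/kern.log.]

```shell
# Count the two message types seen in the quoted logs. A sample log is
# inlined (via a temp file) so the snippet is self-contained; replace LOG
# with the path to your own dmesg capture.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
Oct 11 11:53:56 node-62 kernel: [219465.151250] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 5488
Oct 11 11:54:09 node-62 kernel: [219478.744212] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000008
Oct 11 11:54:23 node-62 kernel: [219492.351282] TARGET_CORE[iSCSI]: Detected NON_EXISTENT_LUN Access for 0x00000013
EOF
# Number of TMR_TASK_DOES_NOT_EXIST responses sent by the target:
grep -c 'ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST' "$LOG"
# Number of accesses to LUNs the target no longer knows about:
grep -c 'NON_EXISTENT_LUN' "$LOG"
rm -f "$LOG"
```

If both counts climb together around the time a session drops, the log is
likely showing the same failure mode discussed here.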