Hey Joseph,
Hey folks. Apologies if this message comes through twice: when I originally sent it, the list flagged it as too large because of the dmesg log attachments, and a coworker just told me they never saw it, so I don't think it made it through on the first attempt. Please see the last note above and the attached dmesg example. After more extensive testing with Max's patch, we can still produce cqe dump errors (at a much lower frequency), as well as a new failure mode involving a crash dump.
This is a different issue AFAICT. It looks like nvmet_sq_destroy() is stuck waiting for the final reference to drop (which never seems to happen). I'm trying to find a code path where this could happen. Can you tell whether the backend block device has completed all of its I/O when this happens? (You can check for active tags in debugfs.)
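In case it helps with the debugfs check, here is a rough sketch of how the active tags could be collected. It assumes debugfs is mounted at /sys/kernel/debug and that the kernel exposes per-hctx `tags` files containing `busy=<n>` lines under block/<dev>/hctx*/; both depend on the kernel version and config, so treat the paths and the file layout as assumptions. The helper names are made up for illustration.

```python
import glob
import os
import re

# Assumption: debugfs is mounted at the usual location.
DEBUGFS_BLOCK = "/sys/kernel/debug/block"


def busy_counts(tags_text):
    """Return every 'busy=<n>' value found in the text of an hctx 'tags' file."""
    pattern = r"^[ \t]*busy=(\d+)[ \t]*$"
    return [int(n) for n in re.findall(pattern, tags_text, re.MULTILINE)]


def active_tags(dev, root=DEBUGFS_BLOCK):
    """Map each hctx 'tags' file of block device `dev` to its total busy count."""
    result = {}
    for path in sorted(glob.glob(os.path.join(root, dev, "hctx*", "tags"))):
        with open(path) as f:
            result[path] = sum(busy_counts(f.read()))
    return result
```

Running `active_tags("nvme0n1")` as root (the device name is only an example) while nvmet_sq_destroy() is stuck should show whether any hardware queue still reports a non-zero busy count, i.e. whether the backend still has I/O in flight.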