On Wed, Mar 4, 2020 at 6:02 AM Keith Busch <kbusch@xxxxxxxxxx> wrote:
>
> On Mon, Mar 02, 2020 at 10:03:39AM +0800, Jason A. Donenfeld wrote:
> > Hi,
> >
> > My torrent client was doing some I/O when the below happened. I'm
> > wondering if this is a known thing that's been fixed during the rc
> > cycle, a regression, or if my (pretty new) NVMe drive is failing.
> >
> > Thanks,
> > Jason
> >
> > Feb 24 20:36:58 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, aborting
> > Feb 24 20:37:29 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, reset controller
> > Feb 24 20:37:59 thinkpad kernel: nvme nvme1: I/O 8 QID 0 timeout, reset controller
> > Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Device not ready; aborting reset
> > Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Abort status: 0x371
>
> Sorry to say, this indicates the controller has become unresponsive.
> You usually see "timeout" messages in batches, though, so I wonder if
> only the one IO command timed out or if the controller just doesn't
> support an abort command limit.
>
> You can try throttling the queue depth and see if the problem goes away.
> The lowest possible depth can be set with kernel param
> "nvme.io_queue_depth=2".

I was unfortunately never able to reproduce. This happened while
downloading a torrent, and torrent clients have a history of creating
"interesting" I/O patterns. Hardware is "Samsung SSD 970 EVO Plus 2TB".
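
In case it helps anyone who hits the same thing: a rough sketch for
checking whether timeouts arrive in batches or involve just one command,
which was the open question in the quoted reply. It groups the "timeout"
lines from a kernel log by controller, queue ID, and command tag. The
regular expression is my own guess at the message layout, based only on
the five lines quoted in this thread, and the name summarize_timeouts is
made up for illustration; other kernel versions may word these messages
differently.

```python
import re
from collections import Counter

# Log lines copied from the report in this thread.
LOG = """\
Feb 24 20:36:58 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, aborting
Feb 24 20:37:29 thinkpad kernel: nvme nvme1: I/O 852 QID 15 timeout, reset controller
Feb 24 20:37:59 thinkpad kernel: nvme nvme1: I/O 8 QID 0 timeout, reset controller
Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Device not ready; aborting reset
Feb 24 20:39:00 thinkpad kernel: nvme nvme1: Abort status: 0x371
"""

# Assumed message layout, inferred from the lines above only.
TIMEOUT_RE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): I/O (?P<tag>\d+) QID (?P<qid>\d+) timeout, (?P<action>.+)$"
)


def summarize_timeouts(log_text):
    """Count timeout messages per (controller, queue ID, command tag)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            key = (match["ctrl"], int(match["qid"]), int(match["tag"]))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (ctrl, qid, tag), n in sorted(summarize_timeouts(LOG).items()):
        # In NVMe, queue ID 0 is the admin queue; all others are I/O queues.
        kind = "admin" if qid == 0 else "I/O"
        print(f"{ctrl} qid={qid} ({kind} queue) tag={tag}: {n} timeout message(s)")
```

For the log above this finds only two distinct commands: tag 852 on I/O
queue 15 (logged twice, once for the abort and once for the reset) and
tag 8 on the admin queue.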