Can you capture a blktrace while performing the fstrim to record the discard
operations? A 1TB trim extent would have a huge impact, since it would
translate to approximately 262K IO requests to the OSDs (assuming 4MB backing
files).

On Fri, Nov 17, 2017 at 6:19 PM, Brendan Moloney <moloney@xxxxxxxx> wrote:
> Hi,
>
> I guess this isn't strictly about Ceph, but I feel like other folks here
> must have run into the same issues.
>
> I am trying to keep my thinly provisioned RBD volumes thin. I use
> virtio-scsi to attach the RBD volumes to my VMs with the "discard=unmap"
> option. The RBD is formatted as XFS and some of them can be quite large
> (16TB+). I have a cron job that runs "fstrim" commands twice a week in the
> evenings.
>
> The issue is that I see massive I/O stalls on the VM during the fstrim. To
> the point where I am getting kernel panics from hung tasks and other
> timeouts. I have tried a number of things to lessen the impact:
>
>    - Switching from deadline to CFQ (initially I thought this helped, but
>      now I am not convinced)
>    - Running fstrim with "ionice -c idle" (this doesn't seem to make a
>      difference)
>    - Chunking the fstrim with the offset/length options (helps reduce worst
>      case, but I can't trim less than 1TB at a time and that can still cause
>      a pause for several minutes)
>
> Is there anything else I can do to avoid this issue?
>
> Thanks,
> Brendan
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Jason
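
For reference, the 262K figure follows from dividing the trim extent by the
backing object size. Below is a minimal Python sketch of that arithmetic,
together with a helper that splits a filesystem into offset/length chunks for
fstrim. It assumes 4 MiB objects and one discard request per object. The
function names are hypothetical and not part of any Ceph or util-linux API,
and the mount point in the usage lines is a placeholder.

```python
# Rough estimate only: assumes each backing object touched by the trim
# extent receives one discard request (an assumption, not measured data).
OBJECT_SIZE = 4 * 1024**2  # assumed backing object size: 4 MiB


def estimated_requests(extent_bytes, object_size=OBJECT_SIZE):
    """Return ceil(extent_bytes / object_size) using integer arithmetic."""
    return -(-extent_bytes // object_size)


def fstrim_chunks(total_bytes, chunk_bytes):
    """Yield (offset, length) pairs that cover [0, total_bytes)."""
    offset = 0
    while offset < total_bytes:
        length = min(chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


def fstrim_commands(mountpoint, total_bytes, chunk_bytes):
    """Build one fstrim argument list per chunk (nothing is executed)."""
    return [
        ["fstrim", "--offset", str(off), "--length", str(length), mountpoint]
        for off, length in fstrim_chunks(total_bytes, chunk_bytes)
    ]


if __name__ == "__main__":
    one_tib = 1024**4
    print(estimated_requests(one_tib))  # 262144, i.e. roughly 262K
    # Placeholder mount point; 16 TiB filesystem trimmed in 1 TiB chunks.
    for cmd in fstrim_commands("/mnt/data", 16 * one_tib, one_tib):
        print(" ".join(cmd))
```

The command lists are only printed here. Running them one at a time with a
pause in between would be one way to spread the discard load, though the
thread reports that 1TB chunks can still stall for several minutes.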