On 12/09/2014 12:12 PM, Luis Periquito wrote:
> Hi Wido,
> thanks for sharing.
>
> Fortunately I'm still running Precise but planning on moving to Trusty.
>
> From what I'm aware it's not a good idea to be running discard on the FS,
> as it does have an impact on the delete operation, which some may even
> consider an unnecessary amount of work for the SSD.
>

The 'discard' mount option is a real performance killer. You shouldn't use it.

> OTOH we should be running TRIM to improve write performance (and the only
> reason we are running SSDs is for performance). Running it weekly seems to
> be killing it also.
>
> So what do you think will be the best way to do this?
>

I think that fstrim could still run if the proper ionice class is used. I haven't tested that yet, but next Sunday I'll know. We modified the cron jobs there and somebody will monitor how it works out.

ionice -c idle fstrim <mountpoint>

> And what about the journal? I'm using a raw partition for it, on an SSD.
> Will Ceph do a proper trimming of it?
>

No, Ceph will not. The best thing there is to partition just the beginning of the brand-new SSD and leave 80%~90% unused. The wear-leveling algorithm inside the SSD will do the rest.

Wido

> On Tue, Dec 9, 2014 at 9:21 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
>
>> Hi,
>>
>> Last Sunday I got a call early in the morning that a Ceph cluster was
>> having some issues: slow requests and OSDs marking each other down.
>>
>> Since this is a 100% SSD cluster I was a bit confused and started
>> investigating.
>>
>> It took me about 15 minutes to see that fstrim was running and was
>> utilizing the SSDs 100%.
>>
>> On Ubuntu 14.04 there is a weekly cron job which executes fstrim-all. It
>> detects all mountpoints which can be trimmed and starts to trim those.
>>
>> On the Intel SSDs used here it caused them to become 100% busy for a
>> couple of minutes. That was enough for them to no longer respond to
>> heartbeats, thus timing out and being marked down.
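The ionice-based approach discussed above could be sketched as a replacement for the stock weekly fstrim-all cron job: trim each OSD filesystem one at a time at idle I/O priority instead of saturating every SSD at once. This is an untested sketch; the mountpoint paths are assumptions, and the script only echoes the commands so they can be reviewed before the echo is removed.

```shell
#!/bin/sh
# Sketch: run fstrim per mountpoint at idle I/O scheduling class,
# rather than letting fstrim-all hammer all SSDs simultaneously.
# The OSD mountpoints below are assumptions; adjust to your layout.
TRIM="ionice -c 3 fstrim -v"   # class 3 = idle: only use otherwise-idle disk time
for mp in /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-1; do
    echo "would run: $TRIM $mp"   # drop the 'echo "would run:"' to actually trim
done
```

Even at idle priority a TRIM of a whole device can keep an SSD busy, so running the mountpoints sequentially (as above) rather than in parallel is the safer choice.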
>>
>> Luckily we had the "out interval" set to 1800 seconds on that cluster,
>> so no OSD was marked as "out".
>>
>> fstrim-all does not execute fstrim with an ionice priority. From what I
>> understand, but haven't tested yet, running fstrim with ionice -c idle
>> should solve this.
>>
>> It's weird that this issue didn't come up earlier on that cluster, but
>> after killing fstrim all problems were resolved and the cluster ran
>> happily again.
>>
>> So watch out for fstrim on early Sunday mornings on Ubuntu!
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>> Ceph trainer and consultant
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
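The "out interval" safeguard mentioned above corresponds to a Ceph monitor setting that controls how long an OSD may stay down before the cluster marks it out and starts rebalancing. A sketch of the relevant ceph.conf fragment follows; the option name here reflects the releases of that era, so verify it against your own Ceph version's documentation before use.

```
[mon]
# Seconds a down OSD may stay down before being marked "out".
# 1800 (as in the cluster above) rides out a multi-minute fstrim
# stall without triggering recovery traffic; the upstream default
# at the time was much lower.
mon osd down out interval = 1800
```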