Re: Watch for fstrim running on your Ubuntu systems

On 12/09/2014 12:12 PM, Luis Periquito wrote:
> Hi Wido,
> thanks for sharing.
> 
> fortunately I'm still running precise but planning on moving to trusty.
> 
> From what I'm aware, it's not a good idea to be running discard on the FS,
> as it has an impact on delete operations, which some may even consider
> an unnecessary amount of work for the SSD.
> 

The 'discard' mount option is a real performance killer. You shouldn't
use that.
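A quick way to check whether anything is currently mounted with that option (a sketch; the exact mount option string can vary by filesystem):

```shell
# List mounted filesystems whose mount options include 'discard';
# anything printed here is issuing a TRIM as part of every delete.
grep discard /proc/mounts || echo "no filesystems mounted with discard"
```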

> OTOH we should be running TRIM to improve write performance (and the only
> reason we are running SSDs is for performance). Running it weekly seems to
> be killing it also.
> 
> So what do you think will be the best way to do this?
> 

I think that fstrim could still run if the proper ionice is used. I
haven't tested that yet, but next Sunday I'll know.

We modified the crontabs there and somebody will monitor how it
works out.

ionice -c 3 fstrim <mountpoint>
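As a cron entry that could look something like this (a sketch; the filename, schedule and mountpoint are placeholder values, not what we actually deployed):

```shell
# /etc/cron.d/fstrim-idle -- hypothetical replacement for the weekly fstrim-all.
# Class 3 ("idle") tells the I/O scheduler to give fstrim disk time only
# when no other process is waiting on the device.
# m h dom mon dow user command
30 4 * * 0 root ionice -c 3 fstrim /var/lib/ceph/osd/osd-0
```

Note that the idle class only has an effect with an I/O scheduler that honours it, such as CFQ.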

> And what about the journal? I'm using a raw partition for it, on a SSD.
> Will ceph do a proper trimming of it?
> 

No, Ceph will not. The best thing there is to partition just the
beginning of the brand-new SSD and leave 80-90% unused. The wear
leveling algorithm inside the SSD will do the rest.
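One way to set that up on a fresh device (a sketch using sgdisk; /dev/sdX and the 10 GiB journal size are placeholder values, so adjust before running anything):

```shell
# Create a single 10 GiB journal partition at the start of a blank SSD
# and deliberately leave the rest of the device unpartitioned. The
# controller's wear-leveling can then treat the untouched space as
# spare area, so no trimming of the journal partition is needed.
sgdisk --new=1:0:+10G /dev/sdX   # partition 1: first 10 GiB only
sgdisk --print /dev/sdX          # verify: one small partition, rest free
```

This only helps on a drive that has never been written end-to-end; on a used SSD you would want to blkdiscard or secure-erase it first so the controller knows the remaining space is free.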

Wido

> On Tue, Dec 9, 2014 at 9:21 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> 
>> Hi,
>>
>> Last sunday I got a call early in the morning that a Ceph cluster was
>> having some issues. Slow requests and OSDs marking each other down.
>>
>> Since this is a 100% SSD cluster I was a bit confused and started
>> investigating.
>>
>> It took me about 15 minutes to see that fstrim was running and was
>> utilizing the SSDs 100%.
>>
>> On Ubuntu 14.04 there is a weekly CRON which executes fstrim-all. It
>> detects all mountpoints which can be trimmed and starts to trim those.
>>
>> On the Intel SSDs used here it caused them to become 100% busy for a
>> couple of minutes. That was enough for them to no longer respond on
>> heartbeats, thus timing out and being marked down.
>>
>> Luckily we had the "out interval" set to 1800 seconds on that cluster,
>> so no OSD was marked as "out".
>>
>> fstrim-all does not execute fstrim with an ionice priority. From what I
>> understand, but haven't tested yet, running fstrim with ionice -c Idle
>> should solve this.
>>
>> It's weird that this issue didn't come up earlier on that cluster, but
>> after killing fstrim all problems were resolved and the cluster ran
>> happily again.
>>
>> So watch out for fstrim on early Sunday mornings on Ubuntu!
>>
>> --
>> Wido den Hollander
>> 42on B.V.
>> Ceph trainer and consultant
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on



