Re: Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SSD

I already sent my configuration to the list about 3.5 hours ago, but here it is again:


[global]
  auth client required = cephx
  auth cluster required = cephx
  auth service required = cephx
  cluster network = 169.254.42.0/24
  fsid = 753c9bbd-74bd-4fea-8c1e-88da775c5ad4
  keyring = /etc/pve/priv/$cluster.$name.keyring
  public network = 169.254.42.0/24

[mon]
  mon allow pool delete = true
  mon data avail crit = 5
  mon data avail warn = 15

[osd]
  keyring = /var/lib/ceph/osd/ceph-$id/keyring
  osd journal size = 5120
  osd pool default min size = 2
  osd pool default size = 3
  osd max backfills = 6
  osd recovery max active = 12

[mon.px-golf-cluster]
  host = px-golf-cluster
  mon addr = 169.254.42.54:6789

[mon.px-hotel-cluster]
  host = px-hotel-cluster
  mon addr = 169.254.42.55:6789

[mon.px-india-cluster]
  host = px-india-cluster
  mon addr = 169.254.42.56:6789
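
As an aside: osd max backfills = 6 and osd recovery max active = 12 above are well above the Ceph defaults (1 and 3, respectively), which can add client latency on OSDs that are already struggling. If you want to experiment, both can be changed at runtime without restarting the OSDs; a minimal sketch (the values are only examples):

  # inject lower recovery/backfill limits into all running OSDs
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'

  # verify the running value via the admin socket (on the host that carries osd.0)
  ceph daemon osd.0 config get osd_max_backfills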



On 28.02.19 at 14:44, Matthew H wrote:
> Could you send your ceph.conf file over please? Are you setting any tunables for OSD or Bluestore currently?
> 
> ----------------------------------------------------------------------------------------------------------------------------------
> *From:* ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Uwe Sauter <uwe.sauter.de@xxxxxxxxx>
> *Sent:* Thursday, February 28, 2019 8:33 AM
> *To:* Marc Roos; ceph-users; vitalif
> *Subject:* Re: Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SSD
>  
> Do you have anything particular in mind? I'm using the mdb backend with maxsize = 1 GB, but currently the files are only about 23 MB.
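
For reference, back-mdb has no slapd-internal entry cache to tune (unlike back-bdb/hdb with their cachesize/idlcachesize directives); it memory-maps the database and relies on the OS page cache. A minimal back-mdb stanza along the lines described above might look like this (suffix and directory are placeholders):

  database  mdb
  suffix    "dc=example,dc=org"
  directory /var/lib/ldap
  # 1 GB map size, as mentioned above
  maxsize   1073741824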
> 
> 
>> 
>> I also have quite a few OpenLDAP servers (slaves) running; make sure
>> to use proper caching, which saves a lot of disk I/O.
>> 
>> 
>> 
>> 
>> -----Original Message-----
>> Sent: 28 February 2019 13:56
>> To: uwe.sauter.de@xxxxxxxxx; Uwe Sauter; Ceph Users
>> Subject: Re: Fwd: Re: Blocked ops after change from filestore on HDD to bluestore on SSD
>> 
>> "Advanced power loss protection" is in fact a performance feature, not a 
>> safety one.
>> 
>> 
>> On 28 February 2019 13:03:51 GMT+03:00, Uwe Sauter
>> <uwe.sauter.de@xxxxxxxxx> wrote:
>> 
>>        Hi all,
>>        
>>        thanks for your insights.
>>        
>>        Eneko,
>>        
>> 
>>                We tried to use a Samsung 840 Pro SSD as an OSD some time ago
>>                and it was a no-go; it wasn't that performance was bad, it just
>>                didn't work for that kind of OSD use. Any HDD was better than it
>>                (the disk was healthy and had been used in a software RAID-1 for
>>                a couple of years).
>>
>>                I suggest you first check that your Samsung 860 Pro disks work
>>                well for Ceph. Also, how much RAM do your hosts have?
>> 
>> 
>>        As already mentioned, the hosts each have 64 GB RAM. Each host has 3
>>        SSDs used for OSDs. Each OSD is using about 1.3 GB virtual memory /
>>        400 MB resident memory.
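
A quick way to see where each OSD's memory actually goes is the admin socket (osd.0 is just an example; run it on the host that carries that OSD):

  ceph daemon osd.0 dump_mempools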
>>        
>>        
>>        
>>        Joachim,
>>        
>> 
>>                I can only recommend the use of enterprise SSDs. We've tested
>>                many consumer SSDs in the past, including yours. Many of them
>>                are not suitable for long-term use, and some wore out within
>>                6 months.
>> 
>> 
>>        Unfortunately I couldn't afford enterprise-grade SSDs. But I suspect
>>        that my workload (about 20 VMs for our infrastructure; the most
>>        IO-demanding is probably LDAP) is light enough that wear-out won't be
>>        a problem.
>>
>>        The issue I'm seeing is then probably related to direct IO when using
>>        bluestore, whereas with filestore the file system cache probably hides
>>        the latency issues.
>>        
>>        
>>        Igor,
>>        
>> 
>>                AFAIR the Samsung 860 Pro isn't aimed at the enterprise market;
>>                you shouldn't use consumer SSDs for Ceph.
>>
>>                I had some experience with a Samsung 960 Pro a while ago, and
>>                it turned out that it handled fsync-ed writes very slowly
>>                (compared to its original/advertised performance), which can
>>                probably be explained by the lack of power loss protection on
>>                these drives. I suppose it's the same in your case.
>>
>>                Here are a couple of links on the topic:
>>
>>                https://www.percona.com/blog/2018/02/08/fsync-performance-storage-devices/
>>
>>                https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
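
For reference, the sync-write test described in the second link above is roughly the following single-job, queue-depth-1 fio run with O_SYNC writes; /dev/sdX is a placeholder, and writing to a raw device is destructive, so point it at a scratch device or file:

  fio --name=fsync-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 \
      --runtime=60 --time_based --group_reporting

Drives with power loss protection typically sustain thousands of such 4k sync writes per second, while consumer drives often drop to a few hundred or fewer.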
>> 
>> 
>>        Power loss protection wasn't a criterion for me, as the cluster hosts
>>        are distributed across two buildings with separate battery-backed
>>        UPSs. As mentioned above, I suspect the main difference in my case
>>        between filestore and bluestore is file system cache vs. direct IO,
>>        which means I will keep using filestore.
>>
>>        Regards,
>>
>>                Uwe
>> 
>> 
>> --
>> With best regards,
>> Vitaliy Filippov
>> 
>> 
> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



