Hello,

This is rather confusing, as cache tiers are just normal OSDs/pools and
thus should hold Ceph objects of around 4MB in size by default.

This matches what I see with Ext4 here (normal OSD, not a cache tier):
---
size:
/dev/sde1       2.7T  204G  2.4T   8% /var/lib/ceph/osd/ceph-0
inodes:
/dev/sde1      183148544 55654 183092890    1% /var/lib/ceph/osd/ceph-0
---

On a more fragmented cluster I see a 5:1 size to inode ratio.

I just can't fathom how there could be 3.3 million inodes (and thus a
close number of files) using 30G, making the average file size below 10
KBytes.

Something other than your choice of file system is probably at play here.

How fragmented are those SSDs?
What's your default Ceph object size?
Where _are_ those 3 million files in that OSD, are they actually in the
object files like:
-rw-r--r-- 1 root root 4194304 Jan  9 15:27 /var/lib/ceph/osd/ceph-0/current/3.117_head/DIR_7/DIR_1/DIR_5/rb.0.23a8f.238e1f29.000000027632__head_C4F3D517__3

What's your use case, RBD, CephFS, RadosGW?

Regards,

Christian

On Mon, 23 Mar 2015 10:32:55 +0300 Kamil Kuramshin wrote:

> Recently got a problem with OSDs based on SSD disks used in cache tier
> for EC-pool
>
> superuser@node02:~$ df -i
> Filesystem      Inodes   IUsed *IFree* IUse% Mounted on
> <...>
> /dev/sdb1      3335808 3335808     *0*  100% /var/lib/ceph/osd/ceph-45
> /dev/sda1      3335808 3335808     *0*  100% /var/lib/ceph/osd/ceph-46
>
> Now the OSDs are down on each ceph-node and cache tiering is not
> working.
>
> superuser@node01:~$ sudo tail /var/log/ceph/ceph-osd.45.log
> 2015-03-23 10:04:23.631137 7fb105345840  0 ceph version 0.87.1
> (283c2e7cfa2457799f534744d7d549f83ea1335e), process ceph-osd, pid 1453465
> 2015-03-23 10:04:23.640676 7fb105345840  0
> filestore(/var/lib/ceph/osd/ceph-45) backend generic (magic 0xef53)
> 2015-03-23 10:04:23.640735 7fb105345840 -1
> genericfilestorebackend(/var/lib/ceph/osd/ceph-45) detect_features:
> unable to create /var/lib/ceph/osd/ceph-45/fiemap_test: (28) No space
> left on device
> 2015-03-23 10:04:23.640763 7fb105345840 -1
> filestore(/var/lib/ceph/osd/ceph-45) _detect_fs: detect_features error:
> (28) No space left on device
> 2015-03-23 10:04:23.640772 7fb105345840 -1
> filestore(/var/lib/ceph/osd/ceph-45) FileStore::mount : error in
> _detect_fs: (28) No space left on device
> 2015-03-23 10:04:23.640783 7fb105345840 -1  ** ERROR: error converting
> store /var/lib/ceph/osd/ceph-45: (28) *No space left on device*
>
> At the same time, *df -h* is confusing:
>
> superuser@node01:~$ df -h
> Filesystem      Size  Used *Avail* Use% Mounted on
> <...>
> /dev/sda1        50G   29G   *20G*  60% /var/lib/ceph/osd/ceph-45
> /dev/sdb1        50G   27G   *21G*  56% /var/lib/ceph/osd/ceph-46
>
>
> The filesystem used on the affected OSDs is Ext4. All OSDs are deployed
> with ceph-deploy:
> $ ceph-deploy osd create --zap-disk --fs-type ext4 <node-name>:<device>
>
>
> What helped me out is that it was just a test deployment: all EC-pool
> data was lost, since I /can't start OSDs/ and the ceph cluster /became
> degraded/ until I removed all affected tiered pools (cache & EC).
> So this is just my observation of what kind of problems can be faced if
> you choose the wrong filesystem for the OSD backend.
> And now I *strongly* recommend you to choose *XFS* or *Btrfs* filesystems,
> because both support dynamic inode allocation and this problem
> can't arise with them.
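
A quick sanity check of the figures quoted in this thread, as a minimal
Python sketch. It only redoes the arithmetic on the df output shown above;
the function names (parse_size, avg_bytes_per_inode) are made up for this
example, sizes are treated as binary units the way df -h prints them, and
because df -h rounds its output the results are approximate.

```python
# Rough cross-check of the df figures quoted in this thread.
# Helper names are illustrative only; df -h rounds, so results are approximate.

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Convert a df -h style size such as '29G' or '2.7T' to bytes."""
    number, unit = text[:-1], text[-1].upper()
    return float(number) * UNITS[unit]


def avg_bytes_per_inode(size, inodes):
    """Average number of bytes per inode for the given size and inode count."""
    return parse_size(size) / inodes


# Normal Ext4 OSD from the reply: 204G used, 55654 inodes used.
normal = avg_bytes_per_inode("204G", 55654)

# Cache-tier OSD ceph-45 from the original report: 29G used, 3335808 inodes used.
cache = avg_bytes_per_inode("29G", 3335808)

# Inode density of the 50G cache-tier filesystem: total size / total inodes.
density = avg_bytes_per_inode("50G", 3335808)

print(f"normal OSD : {normal / 1024**2:.2f} MiB per used inode")
print(f"cache tier : {cache / 1024:.2f} KiB per used inode")
print(f"filesystem : {density / 1024:.2f} KiB of capacity per inode")
```

With these inputs the normal OSD comes out at a bit under 4 MiB per used
inode, in line with the default object size, while the cache-tier OSD is in
the single-digit KiB range. The 50G filesystem was created with roughly one
inode per 16 KiB of capacity, so files averaging well below that size will
exhaust inodes long before they exhaust blocks.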

--
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com