So I'm running into this issue again, and after spending a bit of time reading the XFS mailing lists, I believe the free space is too fragmented:

[root@den2ceph001 ceph-0]# xfs_db -r "-c freesp -s" /dev/sdb1
   from      to extents   blocks    pct
      1       1   85773    85773   0.24
      2       3  176891   444356   1.27
      4       7  430854  2410929   6.87
      8      15 2327527 30337352  86.46
     16      31   75871  1809577   5.16
total free extents 3096916
total free blocks 35087987
average free extent size 11.33

Compared to a drive which isn't reporting 'No space left on device':

[root@den2ceph008 ~]# xfs_db -r "-c freesp -s" /dev/sdc1
   from      to extents   blocks    pct
      1       1  133148   133148   0.15
      2       3  320737   808506   0.94
      4       7  809748  4532573   5.27
      8      15 4536681 59305608  68.96
     16      31   31531   751285   0.87
     32      63     364    16367   0.02
     64     127      90     9174   0.01
    128     255       9     2072   0.00
    256     511      48    18018   0.02
    512    1023     128   102422   0.12
   1024    2047     290   451017   0.52
   2048    4095     538  1649408   1.92
   4096    8191     851  5066070   5.89
   8192   16383     746  8436029   9.81
  16384   32767     194  4042573   4.70
  32768   65535      15   614301   0.71
  65536  131071       1    66630   0.08
total free extents 5835119
total free blocks 86005201
average free extent size 14.7392

What I'm wondering is whether reducing the block size from 4K to 2K (or 1K) would help? I'm pretty sure this would require re-running mkfs.xfs on every OSD if that's the case...
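If smaller blocks do turn out to be the answer, here's roughly what I'm picturing for each OSD. This is just a sketch -- the device and mount point are taken from my examples above, 2K is only the value I'm asking about (not a recommendation), and the OSD would obviously need to be marked out and fully drained before touching it:

# check the current block size (bsize) of the mounted OSD filesystem
xfs_info /var/lib/ceph/osd/ceph-0

# recreate the filesystem with 2K blocks -- this destroys everything on the
# partition, so the OSD has to be out of the cluster and rebalanced off first
umount /var/lib/ceph/osd/ceph-0
mkfs.xfs -f -b size=2048 /dev/sdb1
mount -o inode64 /dev/sdb1 /var/lib/ceph/osd/ceph-0

Does that seem sane, or is there something less drastic worth trying first?
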
Thanks,
Bryan

On Mon, Oct 14, 2013 at 5:28 PM, Bryan Stillwell <bstillwell@xxxxxxxxxxxxxxx> wrote:
>
> The filesystem isn't as full now, but the fragmentation is pretty low:
>
> [root@den2ceph001 ~]# df /dev/sdc1
> Filesystem     1K-blocks      Used Available Use% Mounted on
> /dev/sdc1      486562672 270845628 215717044  56% /var/lib/ceph/osd/ceph-1
> [root@den2ceph001 ~]# xfs_db -c frag -r /dev/sdc1
> actual 3481543, ideal 3447443, fragmentation factor 0.98%
>
> Bryan
>
> On Mon, Oct 14, 2013 at 4:35 PM, Michael Lowe <j.michael.lowe@xxxxxxxxx> wrote:
> >
> > How fragmented is that file system?
> >
> > Sent from my iPad
> >
> > > On Oct 14, 2013, at 5:44 PM, Bryan Stillwell <bstillwell@xxxxxxxxxxxxxxx> wrote:
> > >
> > > This appears to be more of an XFS issue than a ceph issue, but I've
> > > run into a problem where some of my OSDs failed because the filesystem
> > > was reported as full even though there was 29% free:
> > >
> > > [root@den2ceph001 ceph-1]# touch blah
> > > touch: cannot touch `blah': No space left on device
> > > [root@den2ceph001 ceph-1]# df .
> > > Filesystem     1K-blocks      Used Available Use% Mounted on
> > > /dev/sdc1      486562672 342139340 144423332  71% /var/lib/ceph/osd/ceph-1
> > > [root@den2ceph001 ceph-1]# df -i .
> > > Filesystem      Inodes   IUsed    IFree IUse% Mounted on
> > > /dev/sdc1     60849984 4097408 56752576    7% /var/lib/ceph/osd/ceph-1
> > > [root@den2ceph001 ceph-1]#
> > >
> > > I've tried remounting the filesystem with the inode64 option like a
> > > few people recommended, but that didn't help (probably because it
> > > doesn't appear to be running out of inodes).
> > >
> > > This happened while I was on vacation and I'm pretty sure it was
> > > caused by another OSD failing on the same node. I've been able to
> > > recover from the situation by bringing the failed OSD back online, but
> > > it's only a matter of time until I'll be running into this issue again
> > > since my cluster is still being populated.
> > >
> > > Any ideas on things I can try the next time this happens?
> > >
> > > Thanks,
> > > Bryan
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com