Re: Btrfs defragmentation

Lionel Bouton <lionel+ceph@xxxxxxxxxxx> · Thu, 07 May 2015 13:21:59 +0200

Hi,

On 05/07/15 12:30, Burkhard Linke wrote:
> [...]
> Part of the OSD boot up process is also the handling of existing
> snapshots and journal replay. I've also had several btrfs based OSDs
> that took up to 20-30 minutes to start, especially after a crash.
> During journal replay the OSD daemon creates a number of new snapshots
> for its operations (newly created snap_XYZ directories that vanish
> after a short time). This snapshotting probably also adds overhead to
> the OSD startup time.
> I have disabled snapshots in my setup now, since the stock ubuntu
> trusty kernel had some stability problems with btrfs.
>
> I also had to establish cron jobs for rebalancing the btrfs
> partitions. It compacts the extents and may reduce the total amount of
> space taken.

I'm not sure what you mean by "compacting" extents. I'm sure balance
doesn't defragment or compress files. It relocates extents, and
according to the Btrfs wiki it was used before 3.14 to reclaim allocated
but unused space.
This shouldn't affect performance, and with modern kernels it may no
longer be needed to reclaim unused space.
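
If you still want to run a periodic balance from cron, here is a
minimal sketch of what I mean by reclaiming allocated but unused space.
It assumes the usage filter of "btrfs balance start" (-dusage=N), which
only rewrites data block groups that are less than N percent full. The
helper names and the default of 50 are my own illustration, not
something from your setup:

```python
import subprocess


def balance_command(mount_point, data_usage_pct=50):
    """Build a filtered balance command.

    Only data block groups below data_usage_pct percent usage are
    rewritten, which frees mostly-empty allocations without moving
    every extent on the filesystem.
    """
    if not 0 <= data_usage_pct <= 100:
        raise ValueError("data_usage_pct must be between 0 and 100")
    return ["btrfs", "balance", "start",
            f"-dusage={data_usage_pct}", mount_point]


def run_balance(mount_point, data_usage_pct=50):
    # Needs root and a mounted btrfs filesystem; raises
    # CalledProcessError if the btrfs command fails.
    subprocess.run(balance_command(mount_point, data_usage_pct), check=True)
```

A cron job would only have to call run_balance() with the OSD mount
point.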

> Unfortunately this procedure is not a default in most distributions (it
> definitely should be!). The problems associated with unbalanced
> extents should have been solved in kernel 3.18, but I haven't had the
> time to check it yet.

I don't have any btrfs filesystems running on 3.17 or earlier anymore
(with one notable exception, see below), so I can't comment. I have
old btrfs filesystems that were created on 3.14 and are now on 3.18.x or
3.19.x (by the way, avoid 3.18.9 to 3.19.4 if there is any chance of a
power failure: there's a possibility of a mount deadlock which requires
btrfs-zero-log to solve...). btrfs fi usage doesn't show anything
suspicious on these old filesystems.
I have a Jolla Phone which comes with a btrfs filesystem and uses an
old, heavily patched 3.4 kernel. It hasn't had any problems yet, but I
don't stuff it with data (I've seen discussions about triggering a
balance before a SailfishOS upgrade).
I assume you shouldn't have any problems with filesystems that aren't
filled close to capacity, which should be the case for Ceph OSDs (for
example, our current alert level is at 75% space usage).
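
To illustrate the kind of check I mean, here is a rough sketch. It uses
the generic statvfs-based numbers from Python's shutil.disk_usage, which
on btrfs are only an approximation (btrfs fi usage gives the detailed
picture). The function names are made up for this example; only the 75%
level is ours:

```python
import shutil

ALERT_THRESHOLD = 0.75  # the 75% space usage alert level mentioned above


def usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem at path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def over_threshold(fraction, threshold=ALERT_THRESHOLD):
    """True when the used fraction has reached the alert level."""
    return fraction >= threshold


if __name__ == "__main__":
    # "/" is only a placeholder; point this at the OSD mount point.
    fraction = usage_fraction("/")
    state = "ALERT" if over_threshold(fraction) else "ok"
    print(f"{state}: {fraction:.0%} used")
```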

>
> As a side note: I had several OSDs with dangling snapshots (more than
> the two usually handled by the OSD). They are probably due to crashed
> OSD daemons. You have to remove them manually, otherwise they start to
> consume disk space.

Thanks a lot, I didn't think it could happen. I'll configure an alert
for this case.
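
Something along these lines is what I have in mind for the alert. It is
only a sketch: it assumes the snap_XYZ directories you describe sit
directly in the OSD data directory, and that more than two of them means
something was left behind. The function names and the example path are
placeholders, and it should not run during OSD startup, when extra
short-lived snapshots are expected:

```python
import os

EXPECTED_SNAPSHOTS = 2  # "the two usually handled by the OSD"


def find_snapshots(osd_dir):
    """Return the sorted names of snap_* directories inside osd_dir."""
    return sorted(
        name for name in os.listdir(osd_dir)
        if name.startswith("snap_")
        and os.path.isdir(os.path.join(osd_dir, name))
    )


def has_dangling_snapshots(osd_dir, expected=EXPECTED_SNAPSHOTS):
    """True when osd_dir holds more snapshots than the OSD normally keeps."""
    return len(find_snapshots(osd_dir)) > expected


if __name__ == "__main__":
    osd_dir = "/var/lib/ceph/osd/ceph-0"  # placeholder path, adjust per OSD
    if has_dangling_snapshots(osd_dir):
        print("ALERT: extra snapshots:", ", ".join(find_snapshots(osd_dir)))
```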

Best regards,

Lionel
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com