Hi,

On 05/07/2015 12:04 PM, Lionel Bouton wrote:
On 05/06/15 19:51, Lionel Bouton wrote:
*snipsnap*
We've seen progress on this front. Unfortunately for us we had 2 power outages and they seem to have damaged the disk controller of the system we are testing Btrfs on: we just had a system crash. On the positive side this gives us an update on the OSD boot time. With a freshly booted system without anything in cache:
- the first Btrfs OSD we installed loaded its pgs in ~1min30s, which is half of the previous time,
- the second Btrfs OSD, where defragmentation was disabled for some time and which was considered more fragmented by our tool, took nearly 10 minutes to load its pgs (and even spent 1 minute before starting to load them),
- the third Btrfs OSD, which was always defragmented, took 4min30s to load its pgs (it was considered more fragmented than the first and less fragmented than the second).

My current assumption is that the defragmentation process we use can't handle large spikes of writes (at least when originally populating the OSD with data through backfills), but that it can then at least partially repair the damage they cause to performance (it's still slower to boot than the 3 XFS OSDs on the same system, where loading pgs took 6-9 seconds). In the current setup defragmentation is very slow to make progress because I set it up to generate very little load on the filesystems it processes: there may be room for improvement.
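The actual defragmentation tool isn't shown in the thread, so as an illustration only, a deliberately low-impact pass along those lines might look like the sketch below (paths and the throttle interval are assumptions, not taken from the thread):

```shell
#!/bin/sh
# Hypothetical low-load defragmentation pass over one OSD's data directory.
# ionice class 3 (idle) plus a short sleep per file keeps the I/O impact
# small, at the cost of taking a long time to cover the whole filesystem.
OSD_DIR=/var/lib/ceph/osd/ceph-0/current   # assumed path, adjust to your setup

find "$OSD_DIR" -type f -print0 |
while IFS= read -r -d '' f; do
    ionice -c3 btrfs filesystem defragment -- "$f"
    sleep 0.1   # throttle between files to limit load
done
```

This matches the trade-off described above: a gentle pass repairs fragmentation only slowly, so a large burst of writes (e.g. backfills) can outrun it.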
Part of the OSD boot-up process is also the handling of existing snapshots and journal replay. I've also had several btrfs-based OSDs that took up to 20-30 minutes to start, especially after a crash. During journal replay the OSD daemon creates a number of new snapshots for its operations (newly created snap_XYZ directories that vanish after a short time). This snapshotting probably also adds overhead to the OSD startup time. I have disabled snapshots in my setup now, since the stock Ubuntu Trusty kernel had some stability problems with btrfs.
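For reference, disabling the filestore's btrfs snapshotting is done in ceph.conf; the option below existed in the Ceph releases of that era:

```ini
[osd]
; stop the filestore from creating/rolling back btrfs snapshots
filestore btrfs snap = false
```

The OSD then falls back to write-ahead journaling semantics instead of snapshot-based consistency, which trades some crash-recovery guarantees for stability on kernels with shaky btrfs snapshot support.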
I also had to establish cron jobs for rebalancing the btrfs partitions. It compacts the extents and may reduce the total amount of space taken. Unfortunately this procedure is not a default in most distributions (it definitely should be!). The problems associated with unbalanced extents should have been solved in kernel 3.18, but I haven't had the time to check that yet.
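A cron entry for such a rebalance might look like the fragment below. The schedule, OSD path, and usage thresholds are illustrative assumptions, not values from this thread; the `-dusage`/`-musage` filters restrict the balance to data/metadata block groups that are below the given usage percentage, which keeps the run much cheaper than a full balance:

```crontab
# /etc/cron.d/btrfs-balance -- weekly compaction of sparsely used block groups
# m  h dom mon dow user command
30 3  *   *   0   root /sbin/btrfs balance start -dusage=50 -musage=50 /var/lib/ceph/osd/ceph-0
```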
As a side note: I had several OSDs with dangling snapshots (more than the two usually handled by the OSD). They are probably due to crashed OSD daemons. You have to remove them manually, otherwise they start to consume disk space.
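Finding and removing such leftovers can be done with the standard btrfs subvolume commands; the OSD path and the snapshot name below are placeholders:

```shell
# List subvolumes under the OSD data directory; a healthy filestore OSD
# keeps the current/ subvolume plus at most two snap_* snapshots.
btrfs subvolume list /var/lib/ceph/osd/ceph-0

# Delete a dangling snapshot by path (with the OSD daemon stopped);
# snap_12345 is a placeholder for whatever the listing shows.
btrfs subvolume delete /var/lib/ceph/osd/ceph-0/snap_12345
```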
Best regards,
Burkhard
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com