Re: Ceph OSD very slow startup

Hi,

More information on our Btrfs tests.

On 14/10/2014 19:53, Lionel Bouton wrote:


> Current plan: wait at least a week to study 3.17.0 behavior and upgrade the 3.12.21 nodes to 3.17.0 if all goes well.


3.17.0 and 3.17.1 have a bug that remounts Btrfs filesystems read-only on some access patterns involving snapshots (no corruption, but the OSD goes down):
https://www.mail-archive.com/linux-btrfs@xxxxxxxxxxxxxxx/msg36483.html

The bug may be present in earlier kernels (the 3.16.4 code in fs/btrfs/qgroup.c, at least, doesn't handle the case differently than 3.17.0 and 3.17.1), but it seems at least less likely to show up there: I never saw it with 3.16.4 in several weeks, whereas it happened three times in just a few hours with 3.17.1. As far as I can tell from its changelog, 3.17.1 didn't patch any vfs/btrfs code path relative to 3.17.0, so I assume 3.17.0 has the same behaviour.

I switched all servers to 3.16.4, which I had previously tested without any problem.
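
For anyone who wants to check which nodes run one of the two releases mentioned above, here is a minimal Python sketch. The names AFFECTED_RELEASES, parse_release and is_affected are hypothetical, and the list only contains the releases reported in this message; as noted, earlier kernels may carry the same code without being listed.

```python
import platform
import re

# Releases reported in this thread as triggering the read-only remount.
# Assumption: this list is illustrative only; earlier kernels may share
# the same qgroup code but were not seen to trigger the bug.
AFFECTED_RELEASES = {(3, 17, 0), (3, 17, 1)}


def parse_release(release):
    """Return (major, minor, patch) from a string such as '3.17.1-gentoo'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    major, minor, patch = match.groups()
    # A missing patch level ('3.17') is treated as patch level 0.
    return int(major), int(minor), int(patch or 0)


def is_affected(release):
    """True if the release is one of the kernels listed above."""
    return parse_release(release) in AFFECTED_RELEASES


if __name__ == "__main__":
    running = platform.release()
    status = "in the reported list" if is_affected(running) else "not in the reported list"
    print("%s: %s" % (running, status))
```

Running it on each node prints the running kernel release and whether it matches one of the two listed versions.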

The performance problem is still there with 3.16.4. In fact, one of the 2 large OSDs was so slow that it was repeatedly marked out, and it generated a lot of latency whenever it was in. I just had to remove it: when this OSD is shut down with noout set (to avoid backfills slowing down the storage network), latencies are back to normal. I chose to reformat this one with XFS.
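
The noout step can be sketched in Python as follows. This is only an illustration: the helper names noout_command and noout are hypothetical, the sketch assumes the ceph CLI is in PATH with admin credentials, and stopping the OSD itself is left to the caller because it depends on the init system.

```python
import contextlib
import subprocess


def noout_command(action):
    """Build the ceph CLI call that sets or unsets the cluster-wide noout flag."""
    if action not in ("set", "unset"):
        raise ValueError("action must be 'set' or 'unset'")
    return ["ceph", "osd", action, "noout"]


@contextlib.contextmanager
def noout(run=subprocess.check_call):
    """Keep noout set for the duration of the block, then clear it.

    While the flag is set, a stopped OSD is not marked out, so no
    backfill starts. `run` is injectable so the sketch can be tried
    without a cluster.
    """
    run(noout_command("set"))
    try:
        yield
    finally:
        # Always clear the flag, even if the work inside the block fails.
        run(noout_command("unset"))
```

Typical use would be "with noout(): ..." around stopping the slow OSD and watching latencies. The flag is cleared on exit, so the block should only cover the observation window.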

The other "big" node has a nearly identical system (same hardware, same software configuration, same logical volume configuration, same weight in the crush map, comparable disk usage in the OSD fs, ...) but is behaving itself (maybe slower than our smaller XFS and Btrfs OSDs, but usable). The only notable difference is that it was formatted more recently, so the performance problem might be linked to the cumulative amount of data access on the OSD over time. If my suspicion is correct, I believe we might see performance problems on the other Btrfs OSDs later (we'll have to wait).

Is any Btrfs developer subscribed to this list? I could forward this information to linux-btrfs@vger if needed, but I can't offer much debugging help (the storage cluster is in production and I'm more inclined to migrate slow OSDs to XFS than to do invasive debugging with Btrfs).

Best regards,

Lionel Bouton
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
