bluestore smr support update

Hi everyone,

This PR has a big pile of bluestore updates for SMR drives:
   https://github.com/ceph/ceph/pull/42762

This is mostly a cleanup of the existing SMR support code, including a
reimplementation of the backrefs/zone refs/whatevers that the cleaner
will use to identify which objects are left in a victim zone.  There
is enough here that ceph_test_objectstore mostly passes (several tests
are skipped because they don't make sense on SMR, or in a few cases
because there are still issues to address).  fsck is updated to verify
zone metadata and allocations (though it doesn't repair most things
yet).

The main piece still missing is the cleaner itself (migrating objects
out of the zone with the most dead space).
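
To make the cleaner's first decision concrete, here is a minimal sketch of victim selection by dead space. The ZoneState struct and pick_victim_zone() are hypothetical names for illustration only, not the code in the PR, and a real cleaner would likely weigh other factors as well:

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical per-zone accounting; field names are illustrative only.
struct ZoneState {
  uint64_t write_pointer = 0;   // bytes written into the zone so far
  uint64_t num_dead_bytes = 0;  // written bytes no object references any more
};

// Return the index of the zone with the most dead bytes, or nullopt if
// no zone has any dead space to reclaim.  Ties go to the lowest index.
std::optional<size_t> pick_victim_zone(const std::vector<ZoneState>& zones) {
  std::optional<size_t> victim;
  uint64_t most_dead = 0;
  for (size_t i = 0; i < zones.size(); ++i) {
    if (zones[i].num_dead_bytes > most_dead) {
      most_dead = zones[i].num_dead_bytes;
      victim = i;
    }
  }
  return victim;
}
```

Once a victim is chosen, the zone refs are what would let the cleaner find the objects still living in that zone so it can migrate them elsewhere before the zone is reset.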

Next steps:

1- I think we should consider unconditionally linking in libzbd.  Do
we know what the status of libzbd is in el8 (or el9)?  This would let
us start testing this for real (e.g., by running ceph_test_objectstore
on a simulated zoned device as part of the rados suite).

2- Bluefs isn't yet running in the conventional region of the device.
This is an easyish next step, although it will make the current
sharing of the bluefs/bluestore allocator a bit weirder than it
already is (because in this case we *won't* share--bluefs will have a
conventional allocator for just the conventional region and the zoned
allocator will only handle the zoned regions).  Longer term, we
probably want something a bit different, though...

3- A few of the store_test tests that I disabled (the Spillover ones)
look important.  For example, the GC logic there only rewrites a small
region, but since SMR drives never overwrite in place, I think this
doesn't tend to collapse the layout at all, which is probably what's
making the test fail.  I have a feeling we need to modify this
to be a bit more aggressive (e.g., by rewriting a larger window of the
object).  (Or maybe I'm misunderstanding the test.)
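
For step 2, the allocator split can be sketched roughly as follows. This assumes the conventional zones sit at the start of the device and that every zone has the same size; ZonedLayout, Range, and both helper functions are hypothetical names for illustration, not existing bluestore types:

```cpp
#include <cstdint>

// Hypothetical description of a zoned device layout.
struct ZonedLayout {
  uint64_t zone_size;               // bytes per zone
  uint64_t num_zones;               // total zones on the device
  uint64_t num_conventional_zones;  // leading zones that allow in-place writes
};

struct Range {
  uint64_t offset;
  uint64_t length;
};

// Byte range a conventional allocator (e.g. for bluefs) would manage.
Range conventional_region(const ZonedLayout& l) {
  return {0, l.num_conventional_zones * l.zone_size};
}

// Byte range the zoned allocator would manage: everything after the
// conventional zones.
Range zoned_region(const ZonedLayout& l) {
  uint64_t start = l.num_conventional_zones * l.zone_size;
  return {start, (l.num_zones - l.num_conventional_zones) * l.zone_size};
}
```

With 256 MiB zones and 4 conventional zones, for example, the conventional allocator would own the first 1 GiB and the zoned allocator everything after it, so the two never hand out overlapping space.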

In the meantime, though, I think we can review this first batch of
changes and (assuming no issues) merge that before tackling the rest.

sage