> - Run bluestore in the same process as crimson-osd, but allocate some
>   dedicated threads (and CPU cores) to it. We could use
>   ceph::thread::ThreadPool for this purpose. For instance, we would
>   have 3 ConfigProxy backends:
>   1. The classic ConfigProxy used by the classic OSD and by other
>      daemons and command line utilities. This ConfigProxy normally
>      resides in a global CephContext.
>   2. The ceph::common::ConfigProxy used solely by the crimson OSD. It
>      is rewritten using seastar and is a sharded service. Normally we
>      just access the config proxy directly in crimson, like
>      'local_conf().get_val<uint64_t>("name")', instead of using
>      something like 'cct->_conf.get_val<uint64_t>("name")'.
>   3. The ConfigProxy used by bluestore, living in the alien world. Its
>      interface will be exactly the same as the classic one, but it
>      will call into its crimson counterpart using the
>      `seastar::alien::submit()` call.

I'm not sure this is quite right. I think that the seastar config would
have a reference over to the alien config machinery in order to inject
config changes and do the initial setup, but the alien side needn't
have a reference to the crimson one.

>   In addition to the WITH_SEASTAR macro, we can introduce yet another
>   macro allowing us to call into the facilities offered by
>   crimson-common. And we can use an inline namespace to differentiate
>   the 2nd from the 3rd implementation, as they will need to be
>   co-located in the same process, and without using different names
>   we'd violate the ODR.
>
> - Hide bluestore in a library which links against the ceph-common
>   library, where libbluestore won't expose any ceph-common symbols to
>   crimson-osd. But we need to figure out how to maintain the internal
>   state of ceph-common, as it is not quite self-contained: it needs to
>   access the logging, config and other facilities offered by
>   crimson-osd.

The library option seems promising to me if we go this direction.
It can even export an interface which is entirely agnostic of the
config machinery (maybe take a serialized representation of the config
values?) and write to a different log file at first.

> - Port rocksdb to seastar. To be specific, this approach would use
>   seastar's green threads to implement the Mutex, CondVar and Thread
>   abstractions in rocksdb, and implement all blocking calls using
>   seastar's counterparts. If this approach proves workable, the next
>   problem would be to upstream the change. And in the long run, the
>   rocksdb-backed bluestore would be replaced by seastore, if seastore
>   is capable of supporting relatively slow devices as well.

I've started to look at your rocksdb port. It does look like the parts
we'd need to adapt are appropriately factored out in rocksdb, and I bet
we'd get interest from upstream. We might want to take their
temperature sooner rather than later?

We'd also have to perform essentially the same refactor in bluestore in
order to break the bluestore logic apart from the IO/blocking/locking
portions. I guess this exists in some form with the BlockDevice
interface, but we'll also have to introduce something like rocksdb's
lock replacement. This path would get us a much more cooperative (and
probably more performant, particularly on high-density hosts) bluestore
in the long run, so it might be worth the work.

> - Seastore: a completely rewritten object store backend targeting
>   fast NVMe devices. But it will take longer to get there.

I think we're going to do this no matter what. I think the
alien/bluestore choice is about how we want to test crimson prior to
developing seastore, and possibly about handling devices inappropriate
for seastore?

-Sam
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx