On Sun, Sep 7, 2014 at 1:26 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
> Hi Ceph,
>
> There is a need for a cluster to share code such as cls https://github.com/ceph/ceph/tree/master/src/cls or erasure code plugins https://github.com/ceph/ceph/tree/master/src/erasure-code/.
>
> These plugins could have a life cycle independent of Ceph, as long as they comply with the supported API ( https://github.com/ceph/ceph/blob/master/src/erasure-code/ErasureCodeInterface.h ). For erasure code plugins it currently works this way (or it will as soon as https://github.com/ceph/ceph/pull/2397 is merged):
>
> a) upgrade half the OSD nodes from Hammer to I*. The new I* nodes have new erasure code plugins
> b) the MON will refuse to create an erasure coded pool using the new I* plugins, otherwise the Hammer nodes will find themselves unable to participate in the pool
>
> Instead it could work this way:
>
> a) upgrade half the OSD nodes from Hammer to I*. The new I* nodes have new erasure code plugins
> b) the new erasure code plugins are uploaded to a "plugins" pool
> c) an erasure coded pool is created using a new plugin from I*
> d) the Hammer OSD downloads the plugin from the pool and can participate in the pool
>
> It is easier said than done and there are a lot of details to consider. However it is not different from maintaining an Operating System that includes shared libraries, and the path to do so properly is well known.
>
> Thoughts ?

And here we go (almost) full circle. Originally the objclass (cls) mechanism worked somewhat similarly. The code would be injected into the monitors, and it was then distributed to the osds. When uploading the objects we'd also specify the architecture, and there was an embedded version in each, so that it was possible to disable one version and enable another.

The problem is that this specific method is very problematic when dealing with heterogeneous environments, where each osd can run on a different architecture or a different distribution.
We would also need to maintain a very well-contained API for the objclasses to use (which we don't have), and be very careful about versioning. In other words, it doesn't work, and it isn't worth the trouble.

I'm not too familiar with the erasure code plugins, but it seems to me that they need to be versioned. Then you could have multiple plugins with different versions installed, and you could specify which version to use. You could have a tool that would make sure that all the osds have access to the appropriate plugin version. But the actual installation of the plugin wouldn't be part of ceph's internal tasks.

It might be that the erasure code plugins are more contained than the objclasses, and something like you suggested might actually work. Though I have trouble seeing that happen when the resource that needs to be distributed is a compiled shared object. The first objclass implementation actually pushed python code to the nodes (it really did!), so maybe something like that could work for erasure code, given the appropriate environment and tools.

Yehuda
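
A rough sketch of the kind of check such a tool might do, assuming each osd can report which (plugin, version) pairs it has installed. Everything here is hypothetical and for illustration only: the inventory layout, the plugin name "example_plugin", and the function names are not part of ceph.

```python
from typing import Dict, List, Set, Tuple

# A plugin is identified by (name, version). Both fields are hypothetical
# labels chosen for this sketch, not identifiers used by ceph.
PluginId = Tuple[str, str]


def osds_missing_plugin(
    installed: Dict[str, Set[PluginId]], required: PluginId
) -> List[str]:
    """Return the sorted names of osds that do not have `required` installed."""
    return sorted(
        osd for osd, plugins in installed.items() if required not in plugins
    )


def can_create_pool(installed: Dict[str, Set[PluginId]], required: PluginId) -> bool:
    """A pool using `required` is only safe if no osd is missing the plugin."""
    return not osds_missing_plugin(installed, required)


# Hypothetical inventory: osd.0 was upgraded and has both versions,
# osd.1 still has only the old one.
inventory = {
    "osd.0": {("example_plugin", "1"), ("example_plugin", "2")},
    "osd.1": {("example_plugin", "1")},
}

if __name__ == "__main__":
    wanted = ("example_plugin", "2")
    print("osds missing", wanted, ":", osds_missing_plugin(inventory, wanted))
    print("safe to create pool:", can_create_pool(inventory, wanted))
```

The check only reports which osds lack the plugin. Installing it is left to whatever packaging or deployment tool the site already uses, in line with keeping installation outside of ceph itself.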