Adding reply-all this time...

On Tue, Sep 29, 2020 at 2:53 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>
> On Tue, Sep 29, 2020 at 4:47 PM Travis Nielsen <tnielsen@xxxxxxxxxx> wrote:
> >
> > On Tue, Sep 29, 2020 at 1:50 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Sep 29, 2020 at 3:33 PM Travis Nielsen <tnielsen@xxxxxxxxxx> wrote:
> > > >
> > > > Sebastian and fellow orchestrators,
> > > >
> > > > Some questions have come up recently about issues in the Rook
> > > > orchestrator module and its state of disrepair. Patrick, Varsha,
> > > > and I have been discussing these recently as Varsha has been
> > > > working on the module. Before we fix all the issues that are
> > > > being found, I want to start a higher-level conversation. I’ll
> > > > join the leads meeting tomorrow to discuss, and it would be good
> > > > to include this in the Monday orchestrator agenda as well, which
> > > > unfortunately I haven’t been able to attend recently...
> > > >
> > > > First, Rook is driven by the K8s APIs, including CRDs, an
> > > > operator, the CSI driver, etc. When admins need to configure the
> > > > Ceph cluster, they create the CRDs and other resources directly
> > > > with K8s tools such as kubectl. Rook does everything with K8s
> > > > patterns so that admins don’t need to leave their standard
> > > > administration sandbox in order to configure Rook or Ceph. If
> > > > any Ceph-specific command needs to be run, the Rook toolbox can
> > > > be used. However, we prefer to avoid the toolbox for common
> > > > scenarios that should have CRDs for declaring desired state.
> > > >
> > > > The fundamental question then is, **what scenarios require the
> > > > Rook orchestrator mgr module**? The module is not enabled by
> > > > default in Rook clusters, and I am not aware of upstream users
> > > > consuming it.
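[Editor's note: the CRD-driven flow described above looks roughly like this in practice. A minimal sketch using the standard Rook `CephBlockPool` resource; the pool name is illustrative, and `rook-ceph` is the conventional namespace:]

```yaml
# pool.yaml -- declare desired state; the Rook operator reconciles it,
# so no Ceph CLI or toolbox is needed for this common scenario.
# Applied with: kubectl apply -f pool.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool        # illustrative name
  namespace: rook-ceph     # conventional Rook namespace
spec:
  failureDomain: host
  replicated:
    size: 3                # three replicas across hosts
```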
> > > >
> > > > The purpose of the orchestrator module was originally to
> > > > provide a common entry point for either the Ceph CLI tools or
> > > > the dashboard. This would provide a consistent interface for
> > > > working with both Rook and cephadm clusters. Patrick pointed
> > > > out that the dashboard isn’t really a scenario anymore for the
> > > > orchestrator module.
> > >
> > > Is that true? [1]
> >
> > Perhaps I misunderstood. If the dashboard is still a requirement,
> > the bar for maintaining support will certainly be much higher.
> >
> > > > If so, the only remaining usage is for the CLI tools. And if we
> > > > only have the CLI scenario, this means that the CLI commands
> > > > would be run from the toolbox. But we are trying to avoid the
> > > > toolbox. We should be putting our effort into the CRDs, the CSI
> > > > driver, etc.
> > > >
> > > > If the orchestrator module is creating CRs, we are likely doing
> > > > something wrong. We expect the cluster admin to create CRs.
> > > >
> > > > Thus, I’d like to understand the scenarios where the Rook
> > > > orchestrator module is needed. If there isn’t a need anymore
> > > > since the dashboard requirements have changed, I’d propose that
> > > > the module be removed.
> > >
> > > I don't have a current stake in the outcome, but I could foresee
> > > a future need/desire to let the Ceph cluster itself spin up
> > > resources on-demand in k8s via Rook. Let's say that I want to
> > > convert an XFS-on-RBD image to CephFS: the MGR could instruct the
> > > orchestrator to kick off a job to translate between the two
> > > formats. I'd imagine the same could be argued for on-demand
> > > NFS/SMB gateways, or anywhere else there is a delta between a
> > > storage administrator setting up the basic Ceph system and Ceph
> > > attempting to self-regulate/optimize.
> >
> > If Ceph needs to self-regulate, I could certainly see the module
> > being useful, for example to auto-scale the daemons when load is
> > high.
> > But at the same time, the operator could watch for Ceph events,
> > metrics, or other indicators and perform the self-regulation
> > according to the CR settings, instead of it happening inside the
> > mgr module.
>
> But then wouldn't you be embedding low-level business logic about
> Ceph inside Rook? Or are you saying Rook would wait for a special
> event/alert hook from Ceph to perform some action? If that's the
> case, it sounds a lot like what the orchestrator purports to do (at
> least to me, and at least as an end-state goal).

Agreed, we don’t want to embed Ceph logic in Rook. But yes, if Rook
can have a hook into Ceph to perform the action, the operator could
handle it. Then if cephadm needed to handle the same scenario, it
might use a mgr module to implement it. But there would be no need
for a Rook module in that case.

> > At the end of the day, I want to make sure we actually need an
> > orchestrator interface. K8s and cephadm are very different
> > environments, and their features probably won't ever be at parity
> > with each other. It may be more appropriate to define the Rook and
> > cephadm modules separately. Or at least we need to be very clear
> > about why we need the common interface, and ensure that it's
> > tested and supported.
>
> Not going to disagree with that last point.
>
> > > > Thanks,
> > > > Travis
> > > > Rook
> > >
> > > [1] https://tracker.ceph.com/issues/46756
> > >
> > > --
> > > Jason
>
> --
> Jason
>
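[Editor's note: the operator-side self-regulation discussed above (watch load indicators, reconcile daemon counts against limits declared in the CR) could be sketched roughly as follows. This is purely illustrative -- the function name, metric, and thresholds are hypothetical, not real Rook or Ceph APIs:]

```python
# Hypothetical sketch: the operator (rather than a mgr module) reads a
# load indicator and decides what daemon count to reconcile toward,
# bounded by limits the admin declared in the CR. All names are
# illustrative.

def desired_rgw_count(current: int, avg_ops_per_daemon: float,
                      scale_up_threshold: float, max_count: int) -> int:
    """Return the RGW daemon count the operator should reconcile toward."""
    if avg_ops_per_daemon > scale_up_threshold and current < max_count:
        return current + 1   # load is high and the CR allows more: add one
    if avg_ops_per_daemon < scale_up_threshold / 2 and current > 1:
        return current - 1   # load is low: shrink back toward one daemon
    return current           # within bounds: leave as-is

# Example: two gateways averaging 950 ops each, threshold 800, cap 4.
print(desired_rgw_count(current=2, avg_ops_per_daemon=950.0,
                        scale_up_threshold=800.0, max_count=4))  # → 3
```

The point of the sketch is that the scaling policy lives in the operator's reconcile loop and is driven by CR settings, so Ceph only needs to expose the indicator, not the orchestration logic.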