Re: "issue pool application warning even if pool is empty" change

On Thu, Aug 31, 2023 at 10:42 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Thu, Aug 31, 2023 at 1:37 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >
> > On Thu, Aug 31, 2023 at 9:26 PM Prashant Dhange <pdhange@xxxxxxxxxx> wrote:
> > >
> > > Hi Ilya,
> > >
> > > We discussed this topic in yesterday's RADOS meeting. The overall sentiment is not to revert PR#47560 until we have a viable solution from the RGW and orchestrator side. Similar problems can be seen with applications that are built on top of the librados APIs and fail to enable an application for the pool. End users may find it difficult to debug why a pool is not writable.
> > >
> > > We believe the solution may lie outside RADOS, but the end solution should be less intrusive and backward-compatible. RGW was silently failing to create buckets. We had to debug the issue through the RGW debug logs, which was time-consuming and not at all user-friendly. Reference: https://bugzilla.redhat.com/show_bug.cgi?id=2028999. One of Ceph's users had a major production outage for more than 24 hours because RGW was failing to create buckets after a cluster upgrade due to enforcement of the tag in OSD caps. Alternatively, we can mute the POOL_APP_NOT_ENABLED warning if HEALTH_WARN is a bit annoying for newly created pools.
> >
> > Hi Prashant,
> >
> > Thanks for providing the context.
> >
> > I can't say I agree with the approach.  There are many other ways to
> > screw up OSD caps (especially if one tries to lock down as tight as
> > possible) none of which would be similarly highlighted in "ceph
> > status", so this doesn't address the general lack of user-friendliness
> > in this area.
> >
> > >
> > > Let us know if you would like to join the next RADOS meeting (or a separate meeting) to discuss a feasible solution with the RGW and cephadm teams. I will invite all the stakeholders to the meeting.
> >
> > That said, I don't really have a stake in the ground here.  Given that
> > creating pools is a rare operation, perhaps a bogus health alert that
> > shows up only briefly is acceptable.
>
> It is an alert that would only show up when a cluster admin is already
> sitting at the keyboard, and we've gone from "pool tags are a
> new-fangled thing" to "most of the project won't work right if you
> don't have tags set correctly".
> I think this is probably okay, although something more elegant would be nice.

Not particularly elegant, but perhaps we could treat each new
POOL_APP_NOT_ENABLED as muted behind the scenes for a couple of
minutes?  Essentially just delay raising the alert to give admins
enough time to run "ceph osd pool application enable", or to give our
own tools enough time to call the rados_application_enable() API.
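
To make the idea concrete, here is a rough sketch of the delay logic in
Python.  It is not the monitor's actual code: the class name, the
update() method and the 120-second grace period are all made up for
illustration, and a real implementation would live in the mon's health
check path.

```python
import time
from typing import Dict, List, Optional

# "A couple of minutes"; the exact value is an assumption for this sketch.
GRACE_SECONDS = 120.0


class PoolAppAlertTracker:
    """Hypothetical helper: delays POOL_APP_NOT_ENABLED per pool."""

    def __init__(self, grace_seconds: float = GRACE_SECONDS) -> None:
        self.grace_seconds = grace_seconds
        # Pool name -> time at which it was first seen without an application.
        self._first_seen: Dict[str, float] = {}

    def update(self, pools_without_app: List[str],
               now: Optional[float] = None) -> List[str]:
        """Record the current state and return the pools to alert on."""
        now = time.monotonic() if now is None else now
        current = set(pools_without_app)

        # Forget pools that gained an application (or were deleted).
        for name in list(self._first_seen):
            if name not in current:
                del self._first_seen[name]

        # Start the grace timer for pools seen for the first time.
        for name in current:
            self._first_seen.setdefault(name, now)

        # Only pools that outlived the grace period raise the alert.
        return sorted(
            name for name, seen in self._first_seen.items()
            if now - seen >= self.grace_seconds
        )
```

Each periodic health evaluation would pass in the pools that currently
have no application enabled; a pool tagged within the grace period never
shows up in the returned list, so the warning is never raised for it.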

Thanks,

                Ilya