Re: RGW versioned objects lost after Octopus 15.2.3 -> 15.2.4 upgrade

Hi Chris,

I've confirmed that the issue you're experiencing is addressed by
lifecycle commits that were required but were missed during the backport
to Octopus.  I'll work with the backport team to address this quickly.

Thanks for providing the detailed reproducer information; it was very
helpful in identifying the issue.
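
For readers following the thread, the non-current calculation under
discussion can be illustrated with a small sketch. It is not RGW code.
It assumes the usual S3 reading of NoncurrentDays, in which a version's
non-current age is counted from the moment its successor was written,
and it ignores any rounding to midnight. The function names and the
"stale reference" variant are hypothetical: the second function shows
only one way a stale value could reproduce the symptom Chris reported,
not the actual defect.

```python
from datetime import datetime, timedelta

NONCURRENT_DAYS = 18  # value taken from the policy quoted in this thread


def expired_versions(mtimes, now, noncurrent_days=NONCURRENT_DAYS):
    """Return mtimes of the versions that should be expired at `now`.

    Assumption: a version becomes non-current when its successor is
    written, so its age is measured from the successor's mtime.
    """
    ordered = sorted(mtimes, reverse=True)  # newest first; ordered[0] is current
    limit = timedelta(days=noncurrent_days)
    expired = []
    for newer, older in zip(ordered, ordered[1:]):
        if now - newer >= limit:
            expired.append(older)
    return expired


def expired_versions_stale(mtimes, now, noncurrent_days=NONCURRENT_DAYS):
    """Hypothetical faulty variant, for illustration only.

    Every non-current version is judged against one stale reference time
    (here: when the oldest version became non-current) instead of its
    own successor's mtime.
    """
    ordered = sorted(mtimes, reverse=True)
    if len(ordered) < 2:
        return []
    limit = timedelta(days=noncurrent_days)
    stale_reference = ordered[-2]  # successor of the oldest version
    return [older for older in ordered[1:] if now - stale_reference >= limit]


if __name__ == "__main__":
    first = datetime(2020, 7, 17, 1, 13)
    nightly = [first + timedelta(days=i) for i in range(20)]  # one version per night
    check_time = first + timedelta(days=19, hours=1)
    print("correct rule expires:", len(expired_versions(nightly, check_time)))
    print("stale rule expires:  ", len(expired_versions_stale(nightly, check_time)))
```

With one version written per night, the first function expires only the
oldest version each day (the rolling window seen on 15.2.3), while the
hypothetical stale variant expires nothing until the oldest version
crosses the threshold and then expires every non-current version at
once (the pattern seen on 15.2.4).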

Matt

On Wed, Aug 5, 2020 at 9:23 AM Matt Benjamin <mbenjami@xxxxxxxxxx> wrote:
>
> Hi Chris,
>
> It looks like new lifecycle processing logic was backported to
> Octopus in 15.2.3.  I'm looking at the non-current calculation to
> see whether it could incorrectly rely on a stale value (from an
> earlier entry).
>
> thanks,
>
> Matt
>
> On Wed, Aug 5, 2020 at 8:52 AM Chris Palmer <chris@xxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > This is starting to look like a regression in Octopus 15.2.4.
> >
> > After cleaning things up by deleting all old versions, and deleting and
> > recreating the bucket lifecycle policy (see below), I then let it run.
> > Each day a new version got created, dating back to 17 July (correct),
> > until this morning, when we went over the 18-day non-current expiration:
> > all but the latest version of each object disappeared, which is what
> > happened at the midnight following our 15.2.3->15.2.4 upgrade.
> >
> > So instead of a gently rolling 18 days of versions (on 15.2.3), we now
> > build up to 18 days, after which all non-current versions get deleted
> > (on 15.2.4).
> >
> > Anyone come across versioning problems on 15.2.4?
> >
> > Thanks, Chris
> >
> > On 17/07/2020 09:11, Chris Palmer wrote:
> > > This got worse this morning. An RGW daemon crashed at midnight with a
> > > segfault, and the backtrace hints that it was processing the
> > > expiration rule:
> > >
> > >     "backtrace": [
> > >         "(()+0x12730) [0x7f97b8c4e730]",
> > >         "(()+0x15878a) [0x7f97b862378a]",
> > >         "(std::__cxx11::basic_string<char, std::char_traits<char>,
> > > std::allocator<char> >::compare(std::__cxx11::basic_string<char,
> > > std::char_traits<char>, std::allocator<char> > const&) const+0x23)
> > > [0x7f97c25d3e43]",
> > >         "(LCOpAction_DMExpiration::check(lc_op_ctx&,
> > > std::chrono::time_point<ceph::time_detail::real_clock,
> > > std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >
> > > >*)+0x87) [0x7f97c283d127]",
> > >         "(LCOpRule::process(rgw_bucket_dir_entry&, DoutPrefixProvider
> > > const*)+0x1b8) [0x7f97c281cbc8]",
> > >         "(()+0x5b836d) [0x7f97c281d36d]",
> > >         "(WorkQ::entry()+0x247) [0x7f97c28302d7]",
> > >         "(()+0x7fa3) [0x7f97b8c43fa3]",
> > >         "(clone()+0x3f) [0x7f97b85c44cf]"
> > >
> > > One object version got removed when it should not have.
> > >
> > > In an attempt to clean things up I have manually deleted all
> > > non-current versions, and removed and recreated the (same) lifecycle
> > > policy. I will also create a new test bucket with a similar policy and
> > > test that in parallel. We will see what happens tomorrow....
> > >
> > > Thanks, Chris
> > >
> > >
> > > On 16/07/2020 08:22, Chris Palmer wrote:
> > >> I have an RGW bucket (backups) that is versioned. A nightly job
> > >> creates a new version of a few objects. There is a lifecycle policy
> > >> (see below) that keeps 18 days of versions. This has been working
> > >> perfectly and has not been changed. Until I upgraded Octopus...
> > >>
> > >> The nightly job creates separate log files, including a listing of
> > >> the object versions. From these I can see that:
> > >>
> > >> 13/7  02:14   versions from 13/7 01:13 back to 24/6 01:17 (correct)
> > >>
> > >> 14/7  02:14   versions from 14/7 01:13 back to 25/6 01:14 (correct)
> > >>
> > >> 14/7  10:00   upgrade Octopus 15.2.3 -> 15.2.4
> > >>
> > >> 15/7  02:14   versions from 15/7 01:13 back to 25/6 01:14 (would have
> > >> expected 25/6 to have expired)
> > >>
> > >> 16/7  02:14   versions from 16/7 01:13 back to 15/7 01:13 (now all
> > >> pre-upgrade versions have wrongly disappeared)
> > >>
> > >> It's not a big deal for me as they are only backups, provided it
> > >> continues to work correctly from now on. However, it may affect some
> > >> other people much more.
> > >>
> > >> Any ideas on the root cause? And is it likely to be stable again now?
> > >>
> > >> Thanks, Chris
> > >>
> > >> {
> > >>     "Rules": [
> > >>         {
> > >>             "Expiration": {
> > >>                 "ExpiredObjectDeleteMarker": true
> > >>             },
> > >>             "ID": "Expiration & incomplete uploads",
> > >>             "Prefix": "",
> > >>             "Status": "Enabled",
> > >>             "NoncurrentVersionExpiration": {
> > >>                 "NoncurrentDays": 18
> > >>             },
> > >>             "AbortIncompleteMultipartUpload": {
> > >>                 "DaysAfterInitiation": 1
> > >>             }
> > >>         }
> > >>     ]
> > >> }
> > >>
> > >> _______________________________________________
> > >> ceph-users mailing list -- ceph-users@xxxxxxx
> > >> To unsubscribe send an email to ceph-users-leave@xxxxxxx



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309