I just love the sound of my own typing...

See inline, below.

On Fri, 30 Sep 2016 12:18:48 +0900 Christian Balzer wrote:

>
> Hello,
>
> On Thu, 29 Sep 2016 20:15:12 +0200 Sascha Vogt wrote:
>
> > Hi Burkhard,
> >
> > On 29/09/16 15:08, Burkhard Linke wrote:
> > > AFAIK evicting an object also flushes it to the backing storage, so
> > > evicting a live object should be ok. It will be promoted again at the
> > > next access (or whatever triggers promotion in the caching mechanism).
> > >>
> > >> For the dead 0-byte files: Should I open a bug report?
> > > Not sure whether this is a bug at all. The objects should be evicted and
> > > removed if the cache pool hits the max object thresholds.
> > d'oh, Ceph and its hidden gems ;) That was it.
> >
> That's what I was alluding to when I wrote "maybe with some delay".
>
> > Yes, we have currently
> > no hard object limit (target_max_objects) as we have target_max_bytes
> > set and thought that would be enough. After setting target_max_objects
> > (even to a ridiculously high number, I used 200 million, so double the
> > amount we have) Ceph immediately started dropping objects (and
> > blocking all client IO :( )
> >
> Please refer to this page for the reminder:
> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/
>
> So, firstly, what are your ratios (dirty, full) set to?
> If it's at the defaults of 0.4 and 0.6 and you REALLY have only 100
> million objects, it should have started to flush stuff (which is likely a
> NOOP with these leftovers) and not evict stuff.
> What does "ceph df detail" tell you?
>
> Are you sure the blocking of client I/O is due to the object removal and
> your OSDs being too busy and not actually because Ceph thinks that the
> cache is full (object wise)?
> As in:
> "Note All client requests will be blocked only when target_max_bytes or
> target_max_objects reached"
>
> >
> > Is this behavior documented somewhere?
>
> Not that I'm aware of.
> OTOH, I'd expect even those 0-byte files/objects to be eventually the
> subject of removal when the space/size limits are reached and they are
> eligible (old enough).
> If that is NOT the case, then this is both a bug and at the very least
> needs to be put into the documentation.
>
Gotta love having (only a few years late) a test and staging cluster that
is actually usable and comparable to my real ones.

So I created a 500GB image and filled it up.
The cache pool is set to 500GB as well and will flush at 60% and evict at
80%.
Afterwards I rm'ed the image and had plenty of those orphan objects left in
the cache pool. Both the ones created initially AND the ones moved back up
to it from the base pool during the removal (all activity happens on the
cache tier after all).

Repeated that 2 more times, and with the flush and evict timers set to 10
and 20 minutes respectively it should have removed those, but it didn't.

Started like this:
---
    NAME      ID     CATEGORY     USED       %USED     MAX AVAIL     OBJECTS     DIRTY     READ      WRITE     RAW USED
    rbd       0      -            23445M      0.52         3579G        5951      5951     478k      3841k       46890M
    cache     2      -            11587M      0.26          661G       15778      2821     77955     2522k       23174M
---

and ended up like that:
---
    NAME      ID     CATEGORY     USED       %USED     MAX AVAIL     OBJECTS     DIRTY     READ      WRITE     RAW USED
    rbd       0      -              245G      5.61         3328G       63015     63015     505k      3953k         490G
    cache     2      -             3498M      0.08          669G      291626      213k     80552     7995k        6996M
---

Set max objects to 200k and that got rid of many (no particular death
throes were caused by this), but still left 150k floating around.

To remove the remaining ones (and of course clean out the cache entirely) a
"rados -p cache cache-try-flush-evict-all" did the trick.
Which is of course impractical in a production environment.

So yeah, it's definitely a bug, as these orphans will never expire, it seems.
And at the very least the documentation would need to reflect this.
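
For what it's worth, the numbers above can be sanity-checked with a bit of
arithmetic. The following is only a rough Python sketch: parse_size and
thresholds are helper names made up for this mail, the unit suffixes are
assumed to be binary (M = MiB, G = GiB), and the math is done pool-wide even
though cache tiering bases its decisions on PGs, so treat the results as
ballpark figures.

```python
# Back-of-the-envelope check of the cache pool numbers from this test.
# All names here are hypothetical helpers, not part of any Ceph API.

# Assumption: "ceph df detail" suffixes are binary units.
UNITS = {"k": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text):
    """Turn a value such as '3498M' or '15778' into a plain number."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def thresholds(limit, dirty_ratio, full_ratio):
    """Pool-wide points at which flushing and eviction would start."""
    return limit * dirty_ratio, limit * full_ratio


# Settings used in the test: 500GB cache, flush at 60%, evict at 80%.
max_bytes = parse_size("500G")
flush_bytes, evict_bytes = thresholds(max_bytes, 0.6, 0.8)

# Cache pool after the three create/fill/remove rounds (second table).
used = parse_size("3498M")
objects = 291626

avg_object_bytes = used / objects
fill_fraction = used / max_bytes

print(f"average object size: {avg_object_bytes / 1024:.1f} KiB")
print(f"cache fill by bytes: {fill_fraction:.2%} (evict starts at 80%)")

# With target_max_objects = 200k, the object-based evict point would be:
_, evict_objects = thresholds(200_000, 0.6, 0.8)
print(f"object-based evict point: {evict_objects:.0f} objects")
```

Byte-wise the cache is under 1% full, so target_max_bytes alone never gets
near the 80% evict point, however many near-empty objects pile up. With
target_max_objects at 200k the evict point works out to 160k objects, which
would at least be in line with roughly 150k being left behind, though that
is my guess rather than something verified.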

Christian

> > From the cache tiering doc it
> > looked like you either set target_max_bytes OR target_max_objects and
> > not both (although I always wondered what sense it makes to talk
> > about objects on a cache layer, as its nature is that it is space bound
> > and it is less than the backing pool. I even wondered why
> > target_max_bytes is even necessary, as Ceph knows how much space is
> > available.
> >
> As it says on that page, it doesn't.
> Ceph is notoriously bad (due to the potential complexity of setups) at
> figuring out what space is actually available and used.
>
> This isn't helped by cache-tiering basing things on PGs, not pools or
> OSDs or anything else that would help to make sizing guesses.
> See my old "Cache tier operation clarifications" thread here.
>
> Christian
>
> > I mean optionally restricting it further is ok, in case you
> > want to have two cache pools on the same set of fast disks / SSDs, but
> > IMHO it could be optional and in case of just one pool use what's there)
> >
> > Anyway, thanks a lot for the help. We will see how we can get some
> > downtime in order to set a limit and clean up the backlog of stale
> > objects from the cache.
> >
> > Greetings
> > -Sascha-
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>

-- 
Christian Balzer        Network/Systems Engineer
chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
http://www.gol.com/