Re: understanding the bluestore blob, chunk and compression params

Dan van der Ster <dan@xxxxxxxxxxxxxx> · Fri, 21 Jun 2019 11:44:34 +0200



http://tracker.ceph.com/issues/40480

On Thu, Jun 20, 2019 at 9:12 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> I will try to reproduce with logs and create a tracker once I find the
> smoking gun...
>
> It's very strange -- I had the osd mode set to 'passive', and pool
> option set to 'force', and the osd was compressing objects for around
> 15 minutes. Then suddenly it just stopped compressing, until I ran
> 'ceph daemon osd.130 config set bluestore_compression_mode force',
> at which point compression resumed immediately.
>
> FTR, it *should* compress with osd bluestore_compression_mode=none and
> the pool's compression_mode=force, right?
>
> -- dan
>
> On Thu, Jun 20, 2019 at 6:57 PM Igor Fedotov <ifedotov@xxxxxxx> wrote:
> >
> > I'd like to see more details (preferably backed with logs) on this...
> >
> > On 6/20/2019 6:23 PM, Dan van der Ster wrote:
> > > P.S. I know this has been discussed before, but the
> > > compression_(mode|algorithm) pool options [1] seem completely broken
> > > -- With the pool mode set to force, we see that sometimes the
> > > compression is invoked and sometimes it isn't. AFAICT,
> > > the only way to compress every object is to set
> > > bluestore_compression_mode=force on the osd.
> > >
> > > -- dan
> > >
> > > [1] http://docs.ceph.com/docs/master/rados/operations/pools/#set-pool-values
> > >
> > >
> > > On Thu, Jun 20, 2019 at 4:33 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> > >> Hi all,
> > >>
> > >> I'm trying to compress an rbd pool via backfilling the existing data,
> > >> and the allocated space doesn't match what I expect.
> > >>
> > >> Here is the test: I marked osd.130 out and waited for it to erase all its data.
> > >> Then I set (on the pool) compression_mode=force and compression_algorithm=zstd.
> > >> Then I marked osd.130 in to get its PGs/objects back (this time compressing them).
> > >>
> > >> After a few tens of minutes we have:
> > >>          "bluestore_compressed": 989250439,
> > >>          "bluestore_compressed_allocated": 3859677184,
> > >>          "bluestore_compressed_original": 7719354368,
> > >>
> > >> So the allocated size is exactly 50% of the original, yet the compressed
> > >> size is only 12.8% of the original, so we are wasting space.
> > >>
> > >> I don't understand why...
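For reference, the two percentages quoted above can be re-derived from the three counters. This is only a small arithmetic sketch in Python; the variable names are made up for illustration and are not part of any Ceph API.

```python
# Counter values copied from the perf counters quoted above.
compressed = 989_250_439      # bluestore_compressed
allocated = 3_859_677_184     # bluestore_compressed_allocated
original = 7_719_354_368      # bluestore_compressed_original

# Share of the original data size taken by each counter.
compressed_ratio = compressed / original
allocated_ratio = allocated / original

# How much more space is allocated than the compressed payload needs.
overhead_factor = allocated / compressed

print(f"compressed/original  = {compressed_ratio:.1%}")
print(f"allocated/original   = {allocated_ratio:.1%}")
print(f"allocated/compressed = {overhead_factor:.2f}x")
```

The allocated counter is exactly half of the original counter, while the compressed payload is roughly an eighth of it.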
> > >>
> > >> The rbd images all use 4MB objects, and we use the default chunk and
> > >> blob sizes (in v13.2.6):
> > >>     osd_recovery_max_chunk = 8MB
> > >>     bluestore_compression_max_blob_size_hdd = 512kB
> > >>     bluestore_compression_min_blob_size_hdd = 128kB
> > >>     bluestore_max_blob_size_hdd = 512kB
> > >>     bluestore_min_alloc_size_hdd = 64kB
> > >>
> > >> From my understanding, backfilling should read a whole 4MB object from
> > >> the src osd, then write it to osd.130's bluestore, compressing in
> > >> 512kB blobs. Those compress on average at 12.8% so I would expect to
> > >> see allocated being closer to bluestore_min_alloc_size_hdd /
> > >> bluestore_compression_max_blob_size_hdd = 12.5%.
> > >>
> > >> Does someone understand where the 0.5 ratio is coming from?
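The expectation above can be put into a small Python model. It is a sketch under stated assumptions, not a description of what BlueStore actually does: it assumes each compressed blob is rounded up to a whole number of bluestore_min_alloc_size_hdd units, and it treats "kB" as 1024 bytes. The function name and the 128kB case are illustrative only; the 128kB value is simply the bluestore_compression_min_blob_size_hdd setting listed above, tried as a "what if".

```python
import math

KIB = 1024  # assumption: the "kB" values in the config listing are KiB


def allocated_fraction(blob_size, compress_ratio, min_alloc_size):
    """Fraction of a blob's original size that ends up allocated,
    assuming the compressed blob is rounded up to whole allocation units."""
    compressed_bytes = blob_size * compress_ratio
    units = max(1, math.ceil(compressed_bytes / min_alloc_size))
    return units * min_alloc_size / blob_size


min_alloc = 64 * KIB  # bluestore_min_alloc_size_hdd

# Try the two blob sizes from the config listing at the average 12.8% ratio.
for blob_kib in (512, 128):
    frac = allocated_fraction(blob_kib * KIB, 0.128, min_alloc)
    print(f"{blob_kib} KiB blob at 12.8% -> {frac:.1%} allocated")
```

Under this model a 512kB blob at exactly 12.8% needs two 64kB units (25%), because 12.8% of 512kB is slightly more than 64kB, and only blobs compressing to 12.5% or better reach the 12.5% floor. A hypothetical 128kB blob at the same ratio occupies one 64kB unit, which is 50%. That happens to match the observed number, but it is only arithmetic and not a confirmed explanation.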
> > >>
> > >> Thanks!
> > >>
> > >> Dan
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com