Re: [ceph-users] Write-back mode cache-tier behavior

[ Moving to ceph-devel ]

On Sun, Jun 4, 2017 at 9:25 PM, TYLin <wooertim@xxxxxxxxx> wrote:
> Hi all,
>
> We’re using a cache tier in write-back mode, but the write throughput is not
> as good as we expected. We use CephFS and create a 20GB file in it. While the
> data is being written, we use iostat to collect disk statistics. From iostat,
> we saw that the SSD (cache tier) is idle most of the time while the HDD
> (storage tier) is busy all the time. From the documentation:
>
> “When admins configure tiers with writeback mode, Ceph clients write data to
> the cache tier and receive an ACK from the cache tier. In time, the data
> written to the cache tier migrates to the storage tier and gets flushed from
> the cache tier.”
>
> So the data is written to the cache tier and then flushed to the storage tier
> when the dirty ratio exceeds 0.4? The phrase “in time” in the documentation
> confused me.
>
> We found that the throughput when creating a new file is lower than when
> overwriting an existing file, and the SSD sees more writes during overwrites.
> We then looked into the source code and logs. A newly created file goes
> through proxy_write, which is followed by a promote_object. Does this mean
> that, when creating a new file, the object actually goes directly to the
> storage pool and is then promoted to the cache tier?
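
On the first question: yes, "in time" means the cache tier's agent
flushes dirty objects asynchronously once the pool's
cache_target_dirty_ratio (0.4 by default) is exceeded; flushing is not
a synchronous step in the client write path. Roughly, as a condensed
sketch with simplified names (the real logic lives in the PrimaryLogPG
agent code and is considerably more involved):

#include <cstdint>

struct CachePoolStats {
  uint64_t num_objects;        // objects currently in the cache pool
  uint64_t num_dirty_objects;  // objects modified but not yet flushed
};

enum class FlushMode { IDLE, LOW, HIGH };

// Sketch only: the agent periodically compares the dirty fraction
// against the pool options cache_target_dirty_ratio (lazy flushing)
// and cache_target_dirty_high_ratio (aggressive flushing).
FlushMode choose_flush_mode(const CachePoolStats &s,
                            double cache_target_dirty_ratio,
                            double cache_target_dirty_high_ratio) {
  if (s.num_objects == 0)
    return FlushMode::IDLE;
  double dirty = double(s.num_dirty_objects) / double(s.num_objects);
  if (dirty >= cache_target_dirty_high_ratio)
    return FlushMode::HIGH;    // flush as fast as we can
  if (dirty >= cache_target_dirty_ratio)
    return FlushMode::LOW;     // flush lazily in the background
  return FlushMode::IDLE;      // below target, nothing to flush
}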

So I skimmed this thread and thought it was very wrong, since we don't
need to proxy when we're doing fresh writes. But looking at current
master, that does indeed appear to be the case when creating new
objects: they always get proxied (I didn't follow the whole chain, but
PrimaryLogPG::maybe_handle_cache_detail unconditionally calls
do_proxy_write() if the OSD cluster supports proxying and we aren't
must_promote!).
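
In other words, the write path on a cache miss looks roughly like this
(a condensed sketch of the control flow, not the actual function body;
the helper names below are stand-ins for the real OSD machinery):

// Hypothetical stubs standing in for the real OSD internals.
void do_proxy_write() {}        // forward the write to the base pool
void maybe_promote_object() {}  // kick off async promotion into the cache
void promote_object() {}        // blocking promote, then restart the op

// Sketch of the branch I'm describing in
// PrimaryLogPG::maybe_handle_cache_detail: on a cache miss, the write
// is proxied to the storage pool and a promotion follows, rather than
// the write landing in the cache tier directly.
bool maybe_handle_cache_write(bool in_cache,
                              bool cluster_can_proxy,
                              bool must_promote) {
  if (in_cache)
    return false;               // object already in the cache tier

  if (cluster_can_proxy && !must_promote) {
    do_proxy_write();           // write goes to the base (storage) pool
    maybe_promote_object();     // then the object is pulled into the cache
    return true;                // op handled here
  }

  promote_object();             // must_promote (or no proxy support)
  return true;
}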

Was this intentional? I know we've flipped around a bit on ideal
tiering behavior but it seems like at the very least it should be
configurable — proxying then promoting is a very inefficient pattern
for workloads that involve generating lots of data, modifying it, and
then never reading it again.
-Greg