Re: question about bluefs log sync

On Tue, 29 Aug 2017, zengran zhang wrote:
> Thanks! I set debug_bluefs to 10, but did not see the
> "_should_compact_log" message while fio was writing to the rbd... so how
> often will BlueFS::sync_metadata() be called?

Not that often.  We only need to write to the bluefs log when new rocksdb 
files are created... and they are pretty big.  So the OSD will need to age 
for quite a while before anything happens.

You can get some sense of it by running 'ceph daemonperf osd.0' against a 
running OSD and watching the bluefs.wal column.  There is also a 
bluefs.log_bytes counter in the full perf dump, although you won't see 
the estimated size to compare it against.

sage

> 
> 2017-08-29 21:19 GMT+08:00 Sage Weil <sweil@xxxxxxxxxx>:
> > On Tue, 29 Aug 2017, zengran zhang wrote:
> >> Hi Sage,
> >>     I want to ask when the bluefs log gets compacted. I am only sure
> >> that it is compacted when the bluefs is unmounted...
> >> I see the log being compacted when rocksdb calls Dir.Fsync(), but
> >> want to know how to trigger this...
> >
> > There is a heuristic for when it gets "too big":
> >
> > bool BlueFS::_should_compact_log()
> > {
> >   uint64_t current = log_writer->file->fnode.size;
> >   uint64_t expected = _estimate_log_size();
> >   float ratio = (float)current / (float)expected;
> >   dout(10) << __func__ << " current 0x" << std::hex << current
> >            << " expected " << expected << std::dec
> >            << " ratio " << ratio
> >            << (new_log ? " (async compaction in progress)" : "")
> >            << dendl;
> >   if (new_log ||
> >       current < cct->_conf->bluefs_log_compact_min_size ||
> >       ratio < cct->_conf->bluefs_log_compact_min_ratio) {
> >     return false;
> >   }
> >   return true;
> > }
> >
> > and the estimate for the (compacted) size is
> >
> > uint64_t BlueFS::_estimate_log_size()
> > {
> >   int avg_dir_size = 40;  // fixme
> >   int avg_file_size = 12;
> >   uint64_t size = 4096 * 2;
> >   size += file_map.size() * (1 + sizeof(bluefs_fnode_t));
> >   for (auto& p : block_all)
> >     size += p.num_intervals() * (1 + 1 + sizeof(uint64_t) * 2);
> >   size += dir_map.size() + (1 + avg_dir_size);
> >   size += file_map.size() * (1 + avg_dir_size + avg_file_size);
> >   return ROUND_UP_TO(size, super.block_size);
> > }
> >
> > The default min_ratio is 5... so we compact when it's ~5x bigger than it
> > needs to be.
> >
> > sage
> >
> 
> 
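For anyone who wants to plug in numbers: below is a minimal standalone
sketch of the same decision as the quoted _should_compact_log(), with the
log size, the estimate, and the two thresholds passed in as plain values.
The CompactConfig struct, the free function, and the parameter names are
hypothetical stand-ins for illustration, not BlueFS code. The ratio of 5
comes from the thread; the 16 MiB minimum size used in the note below is
only an assumed value, so check your own config.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two config options named in the thread.
struct CompactConfig {
  uint64_t min_size;  // plays the role of bluefs_log_compact_min_size
  float min_ratio;    // plays the role of bluefs_log_compact_min_ratio
};

// Same three checks as the quoted heuristic: skip compaction if one is
// already running, if the log is still small in absolute terms, or if it
// is not yet min_ratio times larger than its estimated compacted size.
bool should_compact_log(uint64_t current, uint64_t expected,
                        bool compaction_in_progress,
                        const CompactConfig& conf)
{
  assert(expected > 0);  // guard the division below
  float ratio = (float)current / (float)expected;
  if (compaction_in_progress ||
      current < conf.min_size ||
      ratio < conf.min_ratio) {
    return false;
  }
  return true;
}
```

With min_size = 16 MiB (assumed) and min_ratio = 5, a 20 MiB log with an
8 MiB estimate is not compacted (ratio 2.5), while a 40 MiB log with the
same estimate is (ratio 5). A 4 MiB log is never compacted under these
values regardless of ratio, because it is below the minimum size.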