Re: Question about cephfs MDS journal

On Thu, Mar 31, 2016 at 4:20 AM, WuJaven <javen.wu@xxxxxxxxxx> wrote:
> Hi Team,
>
> I have a question about MDS journal trim policy.
> I found that the MDS journal is flushed only in the tick thread context, and only when the number of segments exceeds the predefined value (default 30).
> If the number of segments is not over the predefined value, the segments are not flushed to the OMAP even though the system is not busy.
> My question is: what is the intention behind this design?
> I can understand that journal replay is able to restore metadata into memory, there is no negative impact even though the journal is committed to the backend OMAP.
> Are there any other concerns?
> What if we flush the journal segments in every tick period as long as the journal is not empty?

In general, we don't want to flush sooner because it would generate a
higher number of metadata IOs overall.  Many workloads repeatedly
modify the same pieces of metadata, so it's beneficial to coalesce
these updates and only do the final IO when the dirty
inode/dentry/dirfrag "falls off" the end of the journal.  Note that
the dirty_inodes etc. members on LogSegments are `elist`s, so that when
we dirty an inode on a newer LogSegment, it gets implicitly
disassociated from the old LogSegment.
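
To make the coalescing effect concrete, here is a toy model of the
idea.  It is not MDS code: `ToyJournal`, `mark_dirty`,
`start_new_segment` and `trim` are names made up for this sketch, and a
std::set per segment stands in for the intrusive `elist`.  The only
thing it tries to show is that a dirty inode belongs to exactly one
segment at a time, so trimming an old segment only writes back what is
still attached to it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <set>
#include <unordered_map>

// Toy model only (hypothetical names, not the Ceph implementation).
struct ToyJournal {
    std::size_t max_segments;                      // stand-in for the trim threshold
    std::deque<std::set<std::uint64_t>> segments;  // oldest segment at the front
    std::unordered_map<std::uint64_t, std::size_t> owner;  // inode -> segment sequence number
    std::size_t first_seq = 0;                     // sequence number of segments.front()
    std::size_t writebacks = 0;                    // total inodes written back so far

    explicit ToyJournal(std::size_t max) : max_segments(max) {
        segments.emplace_back();
    }

    // Dirtying an inode attaches it to the newest segment and detaches it
    // from whichever older segment it was on.
    void mark_dirty(std::uint64_t ino) {
        auto it = owner.find(ino);
        if (it != owner.end()) {
            segments[it->second - first_seq].erase(ino);
        }
        segments.back().insert(ino);
        owner[ino] = first_seq + segments.size() - 1;
    }

    void start_new_segment() { segments.emplace_back(); }

    // Expire the oldest segments while over the threshold. Only inodes
    // still attached to an expiring segment are counted as written back.
    std::size_t trim() {
        std::size_t n = 0;
        while (segments.size() > max_segments) {
            for (std::uint64_t ino : segments.front()) {
                owner.erase(ino);
                ++n;
            }
            segments.pop_front();
            ++first_seq;
        }
        writebacks += n;
        return n;
    }
};
```

In this model, if an inode is dirtied once in each of three consecutive
segments, it is written back once, when the last of those segments
expires.  Flushing every segment as soon as it closes would have
written the same inode three times.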

However, I have also thought that it would be useful to flush the
journal on idle systems.  In particular, since our new fsck/recovery
functionality benefits from having a fully written-back metadata
store, it would be nice to do this opportunistically if we could
invent a suitable heuristic for detecting idleness.
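
Purely as an illustration of the kind of heuristic this could be (it is
an assumption for discussion, nothing like it is claimed to exist in
the MDS), one option is to count consecutive ticks in which no new
journal events were submitted and flush once that count reaches a
threshold.  `IdleFlushPolicy`, `on_tick` and `idle_ticks_needed` are
hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical idle detector: "idle" means the running count of journal
// events has not changed for idle_ticks_needed consecutive ticks.
struct IdleFlushPolicy {
    unsigned idle_ticks_needed;
    unsigned quiet_ticks = 0;
    std::uint64_t last_seen_events = 0;

    explicit IdleFlushPolicy(unsigned needed) : idle_ticks_needed(needed) {}

    // Call once per tick with the total number of journal events submitted
    // so far. Returns true when an opportunistic flush looks worthwhile.
    bool on_tick(std::uint64_t total_events, bool journal_nonempty) {
        if (total_events != last_seen_events) {
            // Activity since the previous tick: restart the quiet period.
            last_seen_events = total_events;
            quiet_ticks = 0;
            return false;
        }
        if (quiet_ticks < idle_ticks_needed) {
            ++quiet_ticks;
        }
        return journal_nonempty && quiet_ticks >= idle_ticks_needed;
    }
};
```

A counter like this only looks at journal activity; a real heuristic
would probably also need to consider client load and OSD load, which is
the hard part.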

John

>
> Thanks
> Javen
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html