On 09.09.2016 at 03:56, Kent Overstreet wrote:

Hi!

> On Wed, Sep 07, 2016 at 01:12:12PM -0800, Kent Overstreet wrote:
>> So, right now we're checking i_nlinks on every mount - mainly the dirents
>> implementation predates the transactional machinery we have now. That's almost
>> definitely what's taking so long, but I'll send you a patch to confirm later.
>
> I just pushed a patch to add printks for the various stages of recovery: use
> mount -o verbose_recovery to enable.
>
> How many files does this filesystem have? (df -i will tell you).
>> # time find /mnt/test/ -type d |wc -l
>> 10564259
>> real 10m30.305s
>> user 1m6.080s
>> sys 3m43.770s
>> # time find /mnt/test/ -type f |wc -l
>> 9145093
>> real 6m28.812s
>> user 1m3.940s
>> sys 3m46.210s
> As another data point, on my laptop mounting takes half a second - smallish
> filesystem though, 47 gb of data and 711k inodes (and it's on an SSD). My
> expectation is that mount times with the current code will be good enough as
> long as you're using SSDs (or tiering, where tier 0 is SSD) - but I could use
> more data points.
>
> Also, increasing the btree node size may help, if you're not already using max
> size btree nodes (256k). I may readd prefetching to metadata scans too, that
> should help a good bit on rotating disks...

I'm using the defaults from bcache format; the knobs have no description of
when I should change an option and when I should leave it alone. On this
particular filesystem, btree_node_size=128k according to sysfs.

> Mounting taking 12 minutes (and the amount of IO you were seeing) implies to me
> that a metadata isn't being cached as well as it should be though, which is odd
> considering outside of journal replay we aren't doing random access, all the
> metadata access is inorder scans. So yeah, definitely want that timing
> information...

As I mentioned in my email, the box has 1GB of RAM; maybe this is the bottleneck?

Timing from dmesg:

[ 375.537762] bcache (sde1): starting mark and sweep:
[ 376.220196] bcache (sde1): mark and sweep done
[ 376.220489] bcache (sde1): starting journal replay:
[ 376.220493] bcache (sde1): journal replay done, 0 keys in 1 entries, seq 133015
[ 376.220496] bcache (sde1): journal replay done
[ 376.220498] bcache (sde1): starting fs gc:
[ 575.205355] bcache (sde1): fs gc done
[ 575.205362] bcache (sde1): starting fsck:
[ 822.522269] bcache (sde1): fsck done

Marcin
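
P.S. To make the dmesg timing above easier to compare between runs, here is a
minimal sketch that turns the "starting X" / "X done" pairs into per-stage
durations. It is not part of bcache: the regular expression and the names
LINE_RE, LOG and stage_durations are my own assumptions, based only on the
message format shown in this log.

```python
import re

# Assumed line format, taken from the log in this mail:
#   "[ 375.537762] bcache (sde1): starting fs gc:"
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+bcache \([^)]+\):\s+(.*)")

# The verbose_recovery output quoted above, one message per line.
LOG = """\
[ 375.537762] bcache (sde1): starting mark and sweep:
[ 376.220196] bcache (sde1): mark and sweep done
[ 376.220489] bcache (sde1): starting journal replay:
[ 376.220493] bcache (sde1): journal replay done, 0 keys in 1 entries, seq 133015
[ 376.220496] bcache (sde1): journal replay done
[ 376.220498] bcache (sde1): starting fs gc:
[ 575.205355] bcache (sde1): fs gc done
[ 575.205362] bcache (sde1): starting fsck:
[ 822.522269] bcache (sde1): fsck done
"""


def stage_durations(log_text):
    """Return {stage name: seconds} for each 'starting X' / 'X done' pair."""
    started = {}
    durations = {}
    for match in LINE_RE.finditer(log_text):
        timestamp = float(match.group(1))
        message = match.group(2).strip()
        if message.startswith("starting "):
            # Drop the "starting " prefix and the trailing colon.
            stage = message[len("starting "):].rstrip(":")
            started[stage] = timestamp
        elif message.endswith(" done"):
            stage = message[: -len(" done")]
            # Lines with extra detail after "done" do not end in " done"
            # and are skipped; the plain "X done" line closes the stage.
            if stage in started:
                durations[stage] = round(timestamp - started[stage], 3)
    return durations


if __name__ == "__main__":
    for stage, seconds in stage_durations(LOG).items():
        print(f"{stage:16s} {seconds:10.3f} s")
```

On this log, fs gc comes out at roughly 199 s and fsck at roughly 247 s, which
together account for nearly all of the ~447 s between the first and last
message; mark and sweep and journal replay are negligible.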