On Mon, Jul 11, 2016 at 7:22 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: > [please don't top post] > > On Fri, Jul 08, 2016 at 01:48:43PM +0800, Kaho Ng wrote: >> On Fri, Jul 8, 2016 at 6:28 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote: >> > On Thu, Jul 07, 2016 at 07:01:35PM +0800, Kaho Ng wrote: >> >> I am trying to investigate how freelist allocator in xfs interacts >> >> with freespace B+Tree allocator. >> >> First I prepared a patch >> >> <https://gist.github.com/22ffca35929e67c08759b057779b7566> on >> >> linux-source/fs/xfs/libxfs/xfs_alloc.c to print debugging messages >> >> (The kernel version used is linux-3.10.0-327.22.2.el7). >> > ...... >> >> When reading the log output >> >> <https://gist.github.com/890076405e1c13c0a952a579e25e6afe> , I >> >> realised that there is no B+Tree split >> >> triggered by xfs_alloc_fix_freelist() when calling xfs_free_extent(). >> >> Isn't B+Tree split possible in by-size B+Tree even when truncating a >> >> longer freespace record to shorter one? But what I found in the log is >> >> only a few tree shrinks... And when reading the source code of >> >> freespace allocator I found that a B+Tree growth in this case is >> >> impossible at least... >> > >> > args->isfl doesn't mean what you think it means. >> > >> > args->isfl is only set when moving blocks from the freespace btree >> > to the AGFL, which only occurs when a previous operation allocated a >> > new freespace btree block and depleted the current freelist. i.e. >> > "AG Free List" != "AG freespace btree" - they are different >> > structures on disk... >> > >> > And when you consider that a freelist refill can only remove records >> > from the the freespace btree, it's should be clear that a btree >> > split won't occur during a freelist refill... >> >> Hmm, wouldn't xfs_alloc_ag_vextent_size() first remove the free extent >> record, and insert a new extent record into the freespace by-size >> btree if the found free extent record is longer than args->maxlen? > > Good, you went and looked to verify what I said(*). Indeed, when > you read xfs_alloc_fixup_trees() where the two trees are modified > after an extent has been selected for allocation: > > ... > * Delete the entry from the by-size btree. > ... > * Add new by-size btree entry(s). > ... > * Fix up the by-block btree entry(s) > > It's pretty clear what the answer is. IOWs, yes, you're correct - we > only ever modify or delete records from the by-bno freespace tree, > but the by-size tree does do a delete and insert and so can > split.(**) > > So, while it is possible for a split to occur, you still didn't see > one in your test. That's because a split is rather unlikely in your > scenario because once we have a large enough freespace btree records > stored for a split to occur, we almost always have exact matches in > the by-count tree for freelist fills and so record deletes are the > usual operationon the by-count tree. > > And even if we are doing a by-count insert, the chances the target > block for the insert is full is quite small, as are the chances the > target block for the insert is different to the block the original > record was deleted from (consider ordering, what record index the > initial lookup returns and that freelist fills usually only require > a block or two to be allocated). > > Cheers, > > Dave. > > (*) That was what my answer was aimed at getting you to do as I get > a lot of timewasters asking me via direct email to answer their > homework questions for them. Anyone who is trying to understand the > basics of the freespace btrees should have realised the by-count > tree needs a delete/insert operation pair if we modify the length of > a freespace record. > > (**) The lesson here is this: trust what people say, but always > verify it when you can. e.g. I now know you'll look at the > code to try to understand answers you are given, so my time > answering your questions is not going to be wasted. > > -- > Dave Chinner > david@xxxxxxxxxxxxx Thanks for the detailed explaination! Just wonders why we prefer failing the request of refilling freelist with XFS_WANT_CORRUPTED_RETURN(mp, i == 1) in some rare case, rather than returning NULLAGBLOCK and allowing the loop in xfs_alloc_ag_vextent_size() to try xfs_alloc_ag_vextent_small()... In such corner case there will always be a lot of small extents at the front of the by-count tree, and any truncation changes to the first entry in the tree will not result in tree splits and triggering assertion failure. _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs