Hi Andreas,

On Sun, 3 May 2015 12:05:13 +0200, Andreas Rohner wrote:
> Hello,
>
> This is an updated version based on the review of Ryusuke Konishi. It is
> a complete rewrite of the first version, and the implementation is much
> simpler and cleaner.
>
> I include here a copy of my cover letter from the first version:
>
> One of the biggest performance problems of NILFS is its inefficient
> Timestamp GC policy. This patch set introduces two new GC policies,
> namely Cost-Benefit and Greedy.
>
> The Cost-Benefit policy is nothing new. It has been around for a long
> time with log-structured file systems [1]. But it relies on accurate
> information about the number of live blocks in a segment, which NILFS
> currently does not provide. So this patch set extends the entries in
> the SUFILE to include a counter for the number of live blocks. This
> counter is decremented whenever a file is deleted or overwritten.
>
> Except for some tricky parts, the counting of live blocks is quite
> trivial. The problem is snapshots. At any time, a checkpoint can be
> turned into a snapshot or vice versa, so blocks that are reclaimable at
> one point in time may be protected by a snapshot a moment later.
>
> This patch set does not try to track snapshots at all. Instead it uses
> a heuristic approach to prevent the worst case scenario. The
> performance is still significantly better than the Timestamp policy in
> my benchmarks.
>
> The worst case scenario is the following:
>
> 1. Segment 1 is written.
> 2. A snapshot is created.
> 3. The GC tries to reclaim Segment 1, but all of its blocks are
>    protected by the snapshot. The GC has to set the number of live
>    blocks to the maximum to avoid reclaiming this segment again in the
>    near future.
> 4. The snapshot is deleted.
> 5. Segment 1 is reclaimable, but its counter is so high that the GC
>    will never try to reclaim it again.
>
> To prevent this kind of starvation I use another field in the SUFILE
> entry to store the number of blocks that are protected by a snapshot.
> This value is just a heuristic and is usually 0; it is only written to
> the SUFILE entry when the GC reclaims a segment. The GC has to check
> for snapshots anyway, so we get this information for free. By storing
> this information in the SUFILE we can avoid starvation in the following
> way:
>
> 1. Segment 1 is written.
> 2. A snapshot is created.
> 3. The GC tries to reclaim Segment 1, but all of its blocks are
>    protected by the snapshot. The GC has to set the number of live
>    blocks to the maximum to avoid reclaiming this segment again in the
>    near future.
> 4. The GC records the number of snapshot-protected blocks of Segment 1
>    in its SUFILE entry.
> 5. The snapshot is deleted.
> 6. On snapshot deletion we walk through every entry in the SUFILE and
>    reduce the number of live blocks to half the maximum, if the number
>    of snapshot blocks is bigger than half of the maximum.
> 7. Segment 1 is reclaimable and its live block counter is at half the
>    maximum, so the GC will try to reclaim this segment as soon as there
>    are no better choices.
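For reference, here is a rough user-space sketch of the snapshot-deletion
fixup described in steps 6 and 7 above. The structure layout, the field
names (nlive_blks, nsnapshot_blks) and the function name are invented for
illustration only; they do not reflect the actual patch or the on-disk
SUFILE format:

#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for an extended SUFILE entry. */
struct sufile_entry {
	uint32_t nblocks;        /* blocks written to the segment */
	uint32_t nlive_blks;     /* decremented on delete/overwrite */
	uint32_t nsnapshot_blks; /* set by the GC when it finds
				  * snapshot-protected blocks */
};

/*
 * Called after a snapshot is deleted: if most of a segment was counted
 * as snapshot-protected, cut its live block counter down to half the
 * maximum so the segment becomes a GC candidate again.
 */
void sufile_fixup_after_snapshot_delete(struct sufile_entry *entries,
					size_t nsegments,
					uint32_t max_blks_per_seg)
{
	for (size_t i = 0; i < nsegments; i++) {
		struct sufile_entry *su = &entries[i];

		if (su->nsnapshot_blks > max_blks_per_seg / 2 &&
		    su->nlive_blks > max_blks_per_seg / 2)
			su->nlive_blks = max_blks_per_seg / 2;
	}
}

The real code would of course have to do this inside the SUFILE
read/modify/write path with proper locking, but the halving rule itself
should be no more complicated than this.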
> BENCHMARKS:
> -----------
>
> My benchmark is quite simple. It consists of a process that replays
> real NFS traces at a faster speed, thereby creating relatively
> realistic patterns of file creation and deletion. At the same time,
> multiple snapshots are created and deleted in parallel. I use a 100GB
> partition of a Samsung SSD:
>
> WITH SNAPSHOTS EVERY 5 MINUTES:
> --------------------------------------------------------------------
>                Execution time    Wear (Data written to disk)
> Timestamp:     100%              100%
> Cost-Benefit:   80%               43%
>
> NO SNAPSHOTS:
> ---------------------------------------------------------------------
>                Execution time    Wear (Data written to disk)
> Timestamp:     100%              100%
> Cost-Benefit:   70%               45%
>
> I plan on adding more benchmark results soon.
>
> Best regards,
> Andreas Rohner

Thanks for your effort. I'm now reviewing the kernel patches.
Please wait for a while.

Regards,
Ryusuke Konishi