Re: [PATCH V4 00/13] MD: a caching layer for raid5/6

NeilBrown <neilb@xxxxxxxx> · Wed, 15 Jul 2015 12:12:34 +1000

On Tue, 14 Jul 2015 17:45:04 -0700 Shaohua Li <shli@xxxxxx> wrote:

> On Fri, Jul 10, 2015 at 02:36:56PM +1000, NeilBrown wrote:

> > Yes it does.  Having a single sort of metadata block is an important
> > part of the goal.  How the code actually chooses to use these is a
> > separate issue that can change harmlessly.
> 
> I took a close look at reusing the MD superblock for caching, and it turns
> out to be quite hacky. Suppose I use md_update_sb to update the superblock
> when we checkpoint the log, so I update the corresponding fields of mddev
> (resync_offset, recovery_offset). In md_update_sb, I must add a bunch of
> 'if (caching_disk) xxx' checks, as raid disks shouldn't store the
> resync_offset/recovery_offset. Or I can add a new cache_update_sb, but I
> think I would have to add the same hacky code unless we duplicate a lot
> of code.

In md_update_sb, in the loop:

	/* First make sure individual recovery_offsets are correct */
	rdev_for_each(rdev, mddev) {
		if (rdev->raid_disk >= 0 &&
		    mddev->delta_disks >= 0 &&
		    !test_bit(In_sync, &rdev->flags) &&
		    mddev->curr_resync_completed > rdev->recovery_offset)
				rdev->recovery_offset = mddev->curr_resync_completed;

	}

add something like:
		else if (rdev->is_cache)
			rdev->recovery_offset =
				mddev->cache->latest_checkpoint;

In super_1_sync, where the code:

	if (rdev->raid_disk >= 0 &&
	    !test_bit(In_sync, &rdev->flags)) {
		sb->feature_map |=
			cpu_to_le32(MD_FEATURE_RECOVERY_OFFSET);
		sb->recovery_offset =
			cpu_to_le64(rdev->recovery_offset);
		if (rdev->saved_raid_disk >= 0 && mddev->bitmap)
			sb->feature_map |=
				cpu_to_le32(MD_FEATURE_RECOVERY_BITMAP);
	}

is, add something like
	else if (rdev->is_a_cache_disk) {
		sb->feature_map |= cpu_to_le32(MD_FEATURE_IMA_CACHE);
		sb->recovery_offset = cpu_to_le64(rdev->recovery_offset);
	}

or just make the original code a little more general - I'm not sure
exactly how you flag the cache device.

You don't need to do this every time you checkpoint the log.  The
pointer just needs to point to somewhere in the log so that the
start/end can be found (each metadata block points to the next one).
You could leave it until the log wraps completely, though that probably
isn't ideal.
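
To illustrate why a stale pointer is still good enough, here is a rough
userspace sketch of following the chain from any valid metadata block to
the end of the log.  Everything in it is hypothetical: struct meta_block,
its seq and next fields, and find_log_end() are made up for illustration,
and it assumes a block is recognised as valid by carrying a sequence
number exactly one higher than its predecessor.  The real on-disk format
may detect the end of the log quite differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory model of a log metadata block. */
struct meta_block {
	uint64_t seq;	/* assumed: increases by one per block written */
	size_t next;	/* assumed: index of the next metadata block */
};

/*
 * Starting from any valid block, follow the ->next links until the
 * sequence numbers stop increasing by one.  Returns the index of the
 * last valid block.  The loop bound stops the walk after at most one
 * full pass over the log.
 */
static size_t find_log_end(const struct meta_block *log, size_t nblocks,
			   size_t start)
{
	size_t cur = start;
	size_t i;

	for (i = 1; i < nblocks; i++) {
		size_t next = log[cur].next;

		if (next >= nblocks || log[next].seq != log[cur].seq + 1)
			break;
		cur = next;
	}
	return cur;
}
```

Starting the walk from an older block only costs a few extra steps; it
lands on the same end of the log.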

So when you checkpoint the log, if the ->recovery_offset of the cache
device is more than (say) 25% behind the new checkpoint location, just
set MD_CHANGE_PENDING and wake the md thread.
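
A minimal sketch of that check, written as plain userspace C so it can be
tried in isolation.  The helper names log_distance() and
cache_sb_needs_update() are made up, the log is assumed to be circular
with positions in the range [0, log_size), and the 25% threshold is just
the example figure from above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: forward distance from 'from' to 'to' in a
 * circular log of 'log_size' sectors.
 */
static uint64_t log_distance(uint64_t from, uint64_t to, uint64_t log_size)
{
	return to >= from ? to - from : log_size - from + to;
}

/*
 * Return true when the checkpoint recorded in the superblock lags the
 * new checkpoint by more than a quarter of the log.
 */
static bool cache_sb_needs_update(uint64_t sb_checkpoint,
				  uint64_t new_checkpoint,
				  uint64_t log_size)
{
	return log_distance(sb_checkpoint, new_checkpoint, log_size) >
	       log_size / 4;
}
```

In the kernel, the caller would presumably do something like
set_bit(MD_CHANGE_PENDING, &mddev->flags) followed by
md_wakeup_thread(mddev->thread) when this returns true, and otherwise
leave the superblock alone.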

I don't see that as particularly hackish.

Thanks,
NeilBrown

> 
> Adding a superblock for caching looks better. A 4k page of data doesn't
> increase any complexity. If you think we shouldn't duplicate too many
> fields of the md superblock in the cache superblock, we can store only
> the most necessary data in the cache superblock.
> 
> Thanks,
> Shaohua
