Re: [PATCH 02/12] raid5-cache: free I/O units earlier


 



Christoph Hellwig <hch@xxxxxx> writes:

> There is no good reason to keep the I/O unit structures around after the
> stripe has been written back to the RAID array.  The only information
> we need is the log sequence number, and the checkpoint offset of the
> highest successful writeback.  Store those in the log structure, and
> free the IO units from __r5l_stripe_write_finished.
>
> Besides simplifying the code this also avoids having to keep the allocation
> for the I/O unit around for a potentially long time as superblock updates
> that checkpoint the log do not happen very often.

This only saves the allocation for two I/O units, doesn't it?  So not a
big saving, but still a good clean-up.

>
> This also fixes the previously incorrect calculation of 'free' in
r5l_do_reclaim as a side effect: previously it took the last unit which
> isn't checkpointed into account.

It took me a while to see that - so just to be sure I understand.

There is a list of I/O units which have been completely written to the
log and to the RAID.  The last one of these is used as a marker for the
start of the log, so it cannot be freed yet.  All the earlier ones can
be freed.
The previous code added up the sizes of all of the units in this list,
so it incorrectly included that last unit.
Your new code calculates the difference between the start of the first
unit and the start of the last unit, and so correctly excludes the last
unit from the free space calculation.

Did I get that right?

> @@ -690,60 +675,41 @@ static void r5l_do_reclaim(struct r5l_log *log)
>  	 * shouldn't reuse space of an unreclaimable io_unit
>  	 * */
>  	while (1) {
> -		struct list_head *target_list = NULL;
> -
> -		while (!list_empty(&log->stripe_end_ios)) {
> -			io = list_first_entry(&log->stripe_end_ios,
> -				struct r5l_io_unit, log_sibling);
> -			list_move_tail(&io->log_sibling, &list);
> -			free += r5l_ring_distance(log, io->log_start,
> -						io->log_end);
> -		}
> -
> -		if (free >= reclaim_target ||
> +		reclaimable = r5l_reclaimable_space(log);
> +		if (reclaimable >= reclaim_target ||
>  		    (list_empty(&log->running_ios) &&
>  		     list_empty(&log->io_end_ios) &&
>  		     list_empty(&log->flushing_ios) &&
>  		     list_empty(&log->flushed_ios)))
>  			break;
>  
> -		/* Below waiting mostly happens when we shutdown the raid */
> -		if (!list_empty(&log->flushed_ios))
> -			target_list = &log->flushed_ios;
> -		else if (!list_empty(&log->flushing_ios))
> -			target_list = &log->flushing_ios;
> -		else if (!list_empty(&log->io_end_ios))
> -			target_list = &log->io_end_ios;
> -		else if (!list_empty(&log->running_ios))
> -			target_list = &log->running_ios;
> -
> -		r5l_kick_io_unit(log);
> +		md_wakeup_thread(log->rdev->mddev->thread);
> +		wait_event_lock_irq(log->iounit_wait,
> +				    r5l_reclaimable_space(log) > reclaimable,
> +				    log->io_list_lock);
>  	}

This is ringing warning bells....  I'm not sure it's wrong, but I'm not
sure it is right.
I feel that if the thread gets woken and finds that
r5l_reclaimable_space is still too low, it should wake up the thread
again, just in case.
Also, it should test against reclaim_target, not reclaimable, shouldn't
it?

Instead of a wait_event_lock_irq inside a loop, could we just have a
wait_event_*?

wait_event_lock_irq_cmd(log->iounit_wait,
                        r5l_reclaimable_space(log) >= reclaim_target ||
                        (list_empty() && .... list_empty()),
                        log->io_list_lock,
                        md_wakeup_thread(log->rdev->mddev->thread));

??


Otherwise I like it.  Thanks.

NeilBrown

