Re: [PATCH 10/10] mm: Account for WRITEBACK_TEMP in balance_dirty_pages

Pavel Emelyanov <xemul@xxxxxxxxxxxxx> writes:

> On 07/17/2012 11:11 PM, Miklos Szeredi wrote:
>> 
>> Okay, maybe I'm blind but if this is true, then how is
>> balance_dirty_pages() supposed to ensure that the per-bdi limit is not
>> exceeded?
>
> The balance_dirty_pages logic is _very_ roughly the following:
>
> Let this_bdi be a bdi the current task is writing to
> Let D be the total amount of dirty and writeback memory (and writeback_tmp after this patch)
> Let L be the limit of dirty memory (L = ram_size * ratio)
> Let d be the amount of dirty and writeback on this_bdi
> And let l be the limit of dirty memory on this_bdi
>
> With that, the balancer logic looks like
>
> while (1) {
> 	if (D < L)
> 		return;
>
> 	start_background_writeback(this_bdi);
>
> 	if (d < l)
> 		return;
>
> 	timeout = get_sleep_timeout(d, l, D, L);
> 	schedule_timeout(timeout);
> }
>
> The d and l are calculated out of the D and L using this_bdi and
> global IO completions proportions (with more complexity, but still).
>
> Thus, since we throttle tasks looking at d and l only, we cannot affect
> all the bdis in the system by live-locking a single one of them.
>
> Accounting for writeback_tmp is required since the D should become
> high when there are lots of pages in-flight in FUSE. Otherwise, the
> balance_dirty_pages will not limit the task writing on a fuse mount.
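
For reference, the decision made in the quoted loop can be written as a small
standalone C function. This is only a sketch of the simplified pseudocode
above, not the kernel's balance_dirty_pages(); the enum and the name
throttle_decision() are made up for illustration, and the sleep-timeout
calculation is left out entirely.

```c
/* Hypothetical names; mirrors the simplified pseudocode above only. */
enum throttle_action {
	THROTTLE_NONE,		/* D < L: global limit not reached, return */
	THROTTLE_KICK_ONLY,	/* D >= L, d < l: start writeback, no sleep */
	THROTTLE_SLEEP,		/* D >= L, d >= l: the task has to sleep */
};

/*
 * D, L: global dirty+writeback pages and the global limit.
 * d, l: the same quantities for the bdi the task is writing to.
 */
static enum throttle_action throttle_decision(unsigned long D, unsigned long L,
					      unsigned long d, unsigned long l)
{
	if (D < L)
		return THROTTLE_NONE;
	if (d < l)
		return THROTTLE_KICK_ONLY;
	return THROTTLE_SLEEP;
}
```

If writeback_tmp pages are not counted in D, the first test can keep
succeeding for a task writing to a fuse mount, which is the case the patch is
meant to close.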

Okay, that makes sense, and it's certainly an improvement from the
current situation.

What I'm worried about is that with the above algorithm a filesystem's
"d" can grow as high as "L" if only that filesystem is dirtying memory.

If that filesystem is very slow or broken and other filesystems start
dirtying data then they are left with only a fraction of the original
limit.

They won't deadlock, but performance will be affected.  So ideally I'd
like to see more strict per-bdi limit enforcement for fuse (the per-bdi
limit is just 1% of "L" by default on fuse).
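
To put rough numbers on this, here is a sketch in C. It assumes made-up page
counts and treats the per-bdi limit as a plain percentage of "L"; the real
computation is more involved, as noted above. Both helper names are
hypothetical.

```c
/* Illustration only: hypothetical helpers, arbitrary page counts. */

/*
 * Dirty pages still available to other filesystems once a single
 * bdi holds d pages out of the global limit L.
 */
static unsigned long headroom_for_others(unsigned long L, unsigned long d)
{
	return d >= L ? 0 : L - d;
}

/* A strictly enforced per-bdi cap, given as a percentage of L. */
static unsigned long strict_bdi_cap(unsigned long L, unsigned int percent)
{
	return L / 100 * percent;
}
```

With L = 100000 pages, a slow fuse mount that has grown "d" to 90000 leaves
10000 pages for everyone else, while a cap strictly held at 1% of "L" would
leave 99000.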

Thanks,
Miklos