Re: [PATCH v2 06/10] ext4: update delalloc data reserve spcae in ext4_es_insert_extent()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024/8/8 1:41, Jan Kara wrote:
> On Fri 02-08-24 19:51:16, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@xxxxxxxxxx>
>>
>> Now that we update data reserved space for delalloc after allocating
>> new blocks in ext4_{ind|ext}_map_blocks(), and if bigalloc feature is
>> enabled, we also need to query the extents_status tree to calculate the
>> exact reserved clusters. This is complicated now and it appears that
>> it's better to do this job in ext4_es_insert_extent(), because
>> __es_remove_extent() have already count delalloc blocks when removing
>> delalloc extents and __revise_pending() return new adding pending count,
>> we could update the reserved blocks easily in ext4_es_insert_extent().
>>
>> Thers is one special case needs to concern is the quota claiming, when
>> bigalloc is enabled, if the delayed cluster allocation has been raced
>> by another no-delayed allocation(e.g. from fallocate) which doesn't
>> cover the delayed blocks:
>>
>>   |<       one cluster       >|
>>   hhhhhhhhhhhhhhhhhhhdddddddddd
>>   ^            ^
>>   |<          >| < fallocate this range, don't claim quota again
>>
>> We can't claim quota as usual because the fallocate has already claimed
>> it in ext4_mb_new_blocks(), we could notice this case through the
>> removed delalloc blocks count.
>>
>> Signed-off-by: Zhang Yi <yi.zhang@xxxxxxxxxx>
> ...
>> @@ -926,9 +928,27 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
>>  			__free_pending(pr);
>>  			pr = NULL;
>>  		}
>> +		pending = err3;
>>  	}
>>  error:
>>  	write_unlock(&EXT4_I(inode)->i_es_lock);
>> +	/*
>> +	 * Reduce the reserved cluster count to reflect successful deferred
>> +	 * allocation of delayed allocated clusters or direct allocation of
>> +	 * clusters discovered to be delayed allocated.  Once allocated, a
>> +	 * cluster is not included in the reserved count.
>> +	 *
>> +	 * When bigalloc is enabled, allocating non-delayed allocated blocks
>> +	 * which belong to delayed allocated clusters (from fallocate, filemap,
>> +	 * DIO, or clusters allocated when delalloc has been disabled by
>> +	 * ext4_nonda_switch()). Quota has been claimed by ext4_mb_new_blocks(),
>> +	 * so release the quota reservations made for any previously delayed
>> +	 * allocated clusters.
>> +	 */
>> +	resv_used = rinfo.delonly_cluster + pending;
>> +	if (resv_used)
>> +		ext4_da_update_reserve_space(inode, resv_used,
>> +					     rinfo.delonly_block);
> 
> I'm not sure I understand here. We are inserting extent into extent status
> tree. We are replacing resv_used clusters worth of space with delayed
> allocation reservation with normally allocated clusters so we need to
> release the reservation (mballoc already reduced freeclusters counter).
> That I understand. In normal case we should also claim quota because we are
> converting from reserved into allocated state. Now if we allocated blocks
> under this range (e.g. from fallocate()) without
> EXT4_GET_BLOCKS_DELALLOC_RESERVE, we need to release quota reservation here
> instead of claiming it. But I fail to see how rinfo.delonly_block > 0 is
> related to whether EXT4_GET_BLOCKS_DELALLOC_RESERVE was set when allocating
> blocks for this extent or not.
> 

Oh, this is really complicated due to the bigalloc feature, please let me
explain it more clearly by listing all related situations.

There are 2 types of paths of allocating delayed/reserved cluster:
1. Normal case, normally allocate delayed clusters from the write back path.
2. Special case, allocate blocks under this delayed range, e.g. from
   fallocate().

There are 4 situations below:

A. bigalloc is disabled. This case is simple, after path 2, we don't need
   to distinguish path 1 and 2, when calling ext4_es_insert_extent(), we
   set EXT4_GET_BLOCKS_DELALLOC_RESERVE after EXT4_MAP_DELAYED bit is
   detected. If the flag is set, we must be replacing a delayed extent and
   rinfo.delonly_block must be > 0. So rinfo.delonly_block > 0 is equal
   to set EXT4_GET_BLOCKS_DELALLOC_RESERVE.

B. bigalloc is enabled, there a 3 sub-cases of allocating a delayed
   cluster:
B0.Allocating a whole delayed cluster, this case is the same to A.

     |<         one cluster       >|
     ddddddd+ddddddd+ddddddd+ddddddd
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ allocating the whole range

B1.Allocating delayed blocks in a reserved cluster, this case is the same
   to A, too.

     |<         one cluster       >|
     hhhhhhh+hhhhhhh+ddddddd+ddddddd
                             ^^^^^^^
                             allocating this range

B2.Allocating blocks which doesn't cover the delayed blocks in one reserved
   cluster,

     |<         one cluster       >|
     hhhhhhh+hhhhhhh+hhhhhhh+ddddddd
     ^^^^^^^
     fallocating this range

  This case must from path 2, which means allocating blocks without
  EXT4_GET_BLOCKS_DELALLOC_RESERVE. In this case, rinfo.delonly_block must
  be 0 since we are not replacing any delayed extents, so
  rinfo.delonly_block == 0 means allocate blocks without EXT4_MAP_DELAYED
  detected, which further means that EXT4_GET_BLOCKS_DELALLOC_RESERVE is
  not set. So I think we could use rinfo.delonly_block to identify this
  case.

As the cases listed above, I think we could use rinfo.delonly_block to
determine whether the EXT4_GET_BLOCKS_DELALLOC_RESERVE is set, so I use it
to determine if we need to claim quota or release quota.

> At this point it would seem much clearer if we passed flag to
> ext4_es_insert_extent() whether EXT4_GET_BLOCKS_DELALLOC_RESERVE was set
> when allocating extent or not instead of computing delonly_block and
> somehow infering from that. But maybe I miss some obvious reason why that
> is correct.
> 

Yeah, I agree that infer from computing delonly_block is little obscure
and not clear enough, passing a flag is a clearer solution, but we have
to pass one more parameter to ext4_es_insert_extent() which could only be
set or not set in the allocating path in ext4_map_create_blocks(), other
5 callers don't care about it (so they should always have no
EXT4_GET_BLOCKS_DELALLOC_RESERVE flag set theoretically).

I have no strong feeling of which one is better, which one do you perfer
after reading my explanation above?

Thanks,
Yi.





[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux