Re: [Ext4 punch hole 1/6 v5] Ext4 Punch Hole Support: Add flag to ext4_has_free_blocks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 25, 2011 at 12:08 PM, Yongqiang Yang <xiaoqiangnk@xxxxxxxxx> wrote:
> On Mon, Apr 25, 2011 at 3:51 PM, Amir Goldstein <amir73il@xxxxxxxxx> wrote:
>> Hi Allison,
>>
>> Sorry for the late response.
>> I find it hard to digest yet another set of flags,
>> especially, since there is already an impressive set of
>> flags for allocation hints, which is what USE_ROOTBLKS flag really is.
>>
>> So I think it would be much better to pass the flag in ext4_allocation_request
>> and add the 'ar' argument to functions that don't have it, rather than adding
>> a 'flags' argument.
> It depends.  I had a look at Allison's patch and found that functions
> influenced by the patch can be divided into two groups:
>  1. one of which is extent related and led by ext4_ext_insert_extent()
>  2. while other one of which is very low level such as
> ext4_claim_free_blocks() which is used by ext4_da_reserve_space() in a
> relatively hight level.
>
> So I think maybe it's a good idea for extent related functions, but
> not for low level functions such as ext4_claim_free_blocks.
>
> Actually ext4_ext_insert_extent() seldom allocates blocks, because a
> block can contain a lot of extents.  Thus adding 'ar' parameter to
> ext4_ext_insert_extent() induces much unnecessary 'ar''s initializing.
>

Yeah, it's not clear what's the best way to handle this.
>From a system designer point of view, I would suggest to add a capability flag
for allocating from reserved space, but it is probably not going to be
an easy fix to push.

>
>>
>> If you do create a patch to pass 'ar' down to has_free_blocks()
>> I will also be able to use it to pass the HINT_COWING flag.
> HINT_COWING is being passed via 'ar'.
>

indeed, which is why if has_free_blocks gets 'ar' we will have that
flag, which we do not need anyway, since we have the IS_COWING(handle) flag.

> Yongqiang.
>>
>> Now here is another advise:
>> In ext4_mb_new_blocks() after ext4_claim_free_blocks(), there is a call to
>> dquot_alloc_block().
>> You need to call dquot_alloc_block_nofail() when allocating for punch hole,
>> otherwise punch hole can fail on quota limits.
>> We already have a patch for doing that with HINT_COWING flag.
>>
>> I think maybe it is best if our groups (punch hole and snapshots) have
>> a mutual 'next'
>> repository we can work with to prepare for the 2.6.40 merge window.
>> It would be even better, if Ted also collaborated his big alloc patches.
>>
>> What do you think guys?
>>
>> Amir.
>>
>> On Tue, Apr 19, 2011 at 10:41 AM, Allison Henderson
>> <achender@xxxxxxxxxxxxxxxxxx> wrote:
>>> This patch adds a flag to ext4_has_free_blocks which
>>> enables the use of reserved blocks.  This will allow
>>> a punch hole to proceed even if the disk is full.
>>> Punching a hole may require additional blocks to first
>>> split the extents.  The blocks will be reclaimed after
>>> the punch hole completes.
>>>
>>> Because ext4_has_free_blocks is a low level function,
>>> the flag needs to be passed down through several
>>> functions listed below:
>>>
>>> ext4_ext_insert_extent
>>> ext4_ext_create_new_leaf
>>> ext4_ext_grow_indepth
>>> ext4_ext_split
>>> ext4_ext_new_meta_block
>>> ext4_mb_new_blocks
>>> ext4_claim_free_blocks
>>> ext4_has_free_blocks
>>>
>>> Signed-off-by: Allison Henderson <achender@xxxxxxxxxx>
>>> ---
>>> :100644 100644 97b970e... 794c4d2... M  fs/ext4/balloc.c
>>> :100644 100644 4daaf2b... 6c1f415... M  fs/ext4/ext4.h
>>> :100644 100644 dd2cb50... 0b186d9... M  fs/ext4/extents.c
>>> :100644 100644 1a86282... ec890fd... M  fs/ext4/inode.c
>>> :100644 100644 a5837a8... db8b120... M  fs/ext4/mballoc.c
>>> :100644 100644 b545ca1... 2d9b12c... M  fs/ext4/xattr.c
>>>  fs/ext4/balloc.c  |   17 ++++++++++-------
>>>  fs/ext4/ext4.h    |   16 +++++++++++++---
>>>  fs/ext4/extents.c |   27 ++++++++++++++++-----------
>>>  fs/ext4/inode.c   |    6 +++---
>>>  fs/ext4/mballoc.c |    5 +++--
>>>  fs/ext4/xattr.c   |    2 +-
>>>  6 files changed, 46 insertions(+), 27 deletions(-)
>>>
>>> diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
>>> index 97b970e..794c4d2 100644
>>> --- a/fs/ext4/balloc.c
>>> +++ b/fs/ext4/balloc.c
>>> @@ -493,7 +493,8 @@ error_return:
>>>  * Check if filesystem has nblocks free & available for allocation.
>>>  * On success return 1, return 0 on failure.
>>>  */
>>> -static int ext4_has_free_blocks(struct ext4_sb_info *sbi, s64 nblocks)
>>> +static int ext4_has_free_blocks(struct ext4_sb_info *sbi,
>>> +                               s64 nblocks, int flags)
>>>  {
>>>        s64 free_blocks, dirty_blocks, root_blocks;
>>>        struct percpu_counter *fbc = &sbi->s_freeblocks_counter;
>>> @@ -522,7 +523,9 @@ static int ext4_has_free_blocks(struct ext4_sb_info *sbi, s64 nblocks)
>>>        /* Hm, nope.  Are (enough) root reserved blocks available? */
>>>        if (sbi->s_resuid == current_fsuid() ||
>>>            ((sbi->s_resgid != 0) && in_group_p(sbi->s_resgid)) ||
>>> -           capable(CAP_SYS_RESOURCE)) {
>>> +           capable(CAP_SYS_RESOURCE) ||
>>> +               (flags & EXT4_HAS_FREE_BLKS_USE_ROOTBLKS)) {
>>> +
>>>                if (free_blocks >= (nblocks + dirty_blocks))
>>>                        return 1;
>>>        }
>>> @@ -531,9 +534,9 @@ static int ext4_has_free_blocks(struct ext4_sb_info *sbi, s64 nblocks)
>>>  }
>>>
>>>  int ext4_claim_free_blocks(struct ext4_sb_info *sbi,
>>> -                                               s64 nblocks)
>>> +                                               s64 nblocks, int flags)
>>>  {
>>> -       if (ext4_has_free_blocks(sbi, nblocks)) {
>>> +       if (ext4_has_free_blocks(sbi, nblocks, flags)) {
>>>                percpu_counter_add(&sbi->s_dirtyblocks_counter, nblocks);
>>>                return 0;
>>>        } else
>>> @@ -554,7 +557,7 @@ int ext4_claim_free_blocks(struct ext4_sb_info *sbi,
>>>  */
>>>  int ext4_should_retry_alloc(struct super_block *sb, int *retries)
>>>  {
>>> -       if (!ext4_has_free_blocks(EXT4_SB(sb), 1) ||
>>> +       if (!ext4_has_free_blocks(EXT4_SB(sb), 1, 0) ||
>>>            (*retries)++ > 3 ||
>>>            !EXT4_SB(sb)->s_journal)
>>>                return 0;
>>> @@ -577,7 +580,7 @@ int ext4_should_retry_alloc(struct super_block *sb, int *retries)
>>>  * error stores in errp pointer
>>>  */
>>>  ext4_fsblk_t ext4_new_meta_blocks(handle_t *handle, struct inode *inode,
>>> -               ext4_fsblk_t goal, unsigned long *count, int *errp)
>>> +               ext4_fsblk_t goal, unsigned long *count, int *errp, int flags)
>>>  {
>>>        struct ext4_allocation_request ar;
>>>        ext4_fsblk_t ret;
>>> @@ -588,7 +591,7 @@ ext4_fsblk_t ext4_new_meta_blocks(handle_t *handle, struct inode *inode,
>>>        ar.goal = goal;
>>>        ar.len = count ? *count : 1;
>>>
>>> -       ret = ext4_mb_new_blocks(handle, &ar, errp);
>>> +       ret = ext4_mb_new_blocks(handle, &ar, errp, flags);
>>>        if (count)
>>>                *count = ar.len;
>>>        /*
>>> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
>>> index 4daaf2b..6c1f415 100644
>>> --- a/fs/ext4/ext4.h
>>> +++ b/fs/ext4/ext4.h
>>> @@ -512,6 +512,8 @@ struct ext4_new_group_data {
>>>        /* Convert extent to initialized after IO complete */
>>>  #define EXT4_GET_BLOCKS_IO_CONVERT_EXT         (EXT4_GET_BLOCKS_CONVERT|\
>>>                                         EXT4_GET_BLOCKS_CREATE_UNINIT_EXT)
>>> +       /* Punch out blocks of an extent */
>>> +#define EXT4_GET_BLOCKS_PUNCH_OUT_EXT          0x0020
>>>
>>>  /*
>>>  * Flags used by ext4_free_blocks
>>> @@ -521,6 +523,11 @@ struct ext4_new_group_data {
>>>  #define EXT4_FREE_BLOCKS_VALIDATED     0x0004
>>>
>>>  /*
>>> + * Flags used by ext4_has_free_blocks
>>> + */
>>> +#define EXT4_HAS_FREE_BLKS_USE_ROOTBLKS 0x0001
>>> +
>>> +/*
>>>  * ioctl commands
>>>  */
>>>  #define        EXT4_IOC_GETFLAGS               FS_IOC_GETFLAGS
>>> @@ -1638,8 +1645,10 @@ extern int ext4_bg_has_super(struct super_block *sb, ext4_group_t group);
>>>  extern unsigned long ext4_bg_num_gdb(struct super_block *sb,
>>>                        ext4_group_t group);
>>>  extern ext4_fsblk_t ext4_new_meta_blocks(handle_t *handle, struct inode *inode,
>>> -                       ext4_fsblk_t goal, unsigned long *count, int *errp);
>>> -extern int ext4_claim_free_blocks(struct ext4_sb_info *sbi, s64 nblocks);
>>> +                       ext4_fsblk_t goal, unsigned long *count,
>>> +                       int *errp, int flags);
>>> +extern int ext4_claim_free_blocks(struct ext4_sb_info *sbi,
>>> +                                       s64 nblocks, int flags);
>>>  extern void ext4_add_groupblocks(handle_t *handle, struct super_block *sb,
>>>                                ext4_fsblk_t block, unsigned long count);
>>>  extern ext4_fsblk_t ext4_count_free_blocks(struct super_block *);
>>> @@ -1696,7 +1705,8 @@ extern long ext4_mb_max_to_scan;
>>>  extern int ext4_mb_init(struct super_block *, int);
>>>  extern int ext4_mb_release(struct super_block *);
>>>  extern ext4_fsblk_t ext4_mb_new_blocks(handle_t *,
>>> -                               struct ext4_allocation_request *, int *);
>>> +                               struct ext4_allocation_request *,
>>> +                               int *, int flags);
>>>  extern int ext4_mb_reserve_blocks(struct super_block *, int);
>>>  extern void ext4_discard_preallocations(struct inode *);
>>>  extern int __init ext4_init_mballoc(void);
>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
>>> index dd2cb50..0b186d9 100644
>>> --- a/fs/ext4/extents.c
>>> +++ b/fs/ext4/extents.c
>>> @@ -192,12 +192,12 @@ static ext4_fsblk_t ext4_ext_find_goal(struct inode *inode,
>>>  static ext4_fsblk_t
>>>  ext4_ext_new_meta_block(handle_t *handle, struct inode *inode,
>>>                        struct ext4_ext_path *path,
>>> -                       struct ext4_extent *ex, int *err)
>>> +                       struct ext4_extent *ex, int *err, int flags)
>>>  {
>>>        ext4_fsblk_t goal, newblock;
>>>
>>>        goal = ext4_ext_find_goal(inode, path, le32_to_cpu(ex->ee_block));
>>> -       newblock = ext4_new_meta_blocks(handle, inode, goal, NULL, err);
>>> +       newblock = ext4_new_meta_blocks(handle, inode, goal, NULL, err, flags);
>>>        return newblock;
>>>  }
>>>
>>> @@ -793,7 +793,7 @@ static int ext4_ext_insert_index(handle_t *handle, struct inode *inode,
>>>  */
>>>  static int ext4_ext_split(handle_t *handle, struct inode *inode,
>>>                                struct ext4_ext_path *path,
>>> -                               struct ext4_extent *newext, int at)
>>> +                               struct ext4_extent *newext, int at, int flags)
>>>  {
>>>        struct buffer_head *bh = NULL;
>>>        int depth = ext_depth(inode);
>>> @@ -847,7 +847,7 @@ static int ext4_ext_split(handle_t *handle, struct inode *inode,
>>>        ext_debug("allocate %d blocks for indexes/leaf\n", depth - at);
>>>        for (a = 0; a < depth - at; a++) {
>>>                newblock = ext4_ext_new_meta_block(handle, inode, path,
>>> -                                                  newext, &err);
>>> +                                                  newext, &err, flags);
>>>                if (newblock == 0)
>>>                        goto cleanup;
>>>                ablocks[a] = newblock;
>>> @@ -1057,7 +1057,7 @@ cleanup:
>>>  */
>>>  static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
>>>                                        struct ext4_ext_path *path,
>>> -                                       struct ext4_extent *newext)
>>> +                                       struct ext4_extent *newext, int flags)
>>>  {
>>>        struct ext4_ext_path *curp = path;
>>>        struct ext4_extent_header *neh;
>>> @@ -1065,7 +1065,8 @@ static int ext4_ext_grow_indepth(handle_t *handle, struct inode *inode,
>>>        ext4_fsblk_t newblock;
>>>        int err = 0;
>>>
>>> -       newblock = ext4_ext_new_meta_block(handle, inode, path, newext, &err);
>>> +       newblock = ext4_ext_new_meta_block(handle, inode, path,
>>> +               newext, &err, flags);
>>>        if (newblock == 0)
>>>                return err;
>>>
>>> @@ -1141,7 +1142,7 @@ out:
>>>  */
>>>  static int ext4_ext_create_new_leaf(handle_t *handle, struct inode *inode,
>>>                                        struct ext4_ext_path *path,
>>> -                                       struct ext4_extent *newext)
>>> +                                       struct ext4_extent *newext, int flags)
>>>  {
>>>        struct ext4_ext_path *curp;
>>>        int depth, i, err = 0;
>>> @@ -1161,7 +1162,7 @@ repeat:
>>>        if (EXT_HAS_FREE_INDEX(curp)) {
>>>                /* if we found index with free entry, then use that
>>>                 * entry: create all needed subtree and add new leaf */
>>> -               err = ext4_ext_split(handle, inode, path, newext, i);
>>> +               err = ext4_ext_split(handle, inode, path, newext, i, flags);
>>>                if (err)
>>>                        goto out;
>>>
>>> @@ -1174,7 +1175,7 @@ repeat:
>>>                        err = PTR_ERR(path);
>>>        } else {
>>>                /* tree is full, time to grow in depth */
>>> -               err = ext4_ext_grow_indepth(handle, inode, path, newext);
>>> +               err = ext4_ext_grow_indepth(handle, inode, path, newext, flags);
>>>                if (err)
>>>                        goto out;
>>>
>>> @@ -1668,6 +1669,7 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
>>>        int depth, len, err;
>>>        ext4_lblk_t next;
>>>        unsigned uninitialized = 0;
>>> +       int free_blks_flags = 0;
>>>
>>>        if (unlikely(ext4_ext_get_actual_len(newext) == 0)) {
>>>                EXT4_ERROR_INODE(inode, "ext4_ext_get_actual_len(newext) == 0");
>>> @@ -1742,7 +1744,10 @@ repeat:
>>>         * There is no free space in the found leaf.
>>>         * We're gonna add a new leaf in the tree.
>>>         */
>>> -       err = ext4_ext_create_new_leaf(handle, inode, path, newext);
>>> +       if (flag & EXT4_GET_BLOCKS_PUNCH_OUT_EXT)
>>> +               free_blks_flags = EXT4_HAS_FREE_BLKS_USE_ROOTBLKS;
>>> +       err = ext4_ext_create_new_leaf(handle, inode, path,
>>> +               newext, free_blks_flags);
>>>        if (err)
>>>                goto cleanup;
>>>        depth = ext_depth(inode);
>>> @@ -3446,7 +3451,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
>>>        else
>>>                /* disable in-core preallocation for non-regular files */
>>>                ar.flags = 0;
>>> -       newblock = ext4_mb_new_blocks(handle, &ar, &err);
>>> +       newblock = ext4_mb_new_blocks(handle, &ar, &err, 0);
>>>        if (!newblock)
>>>                goto out2;
>>>        ext_debug("allocate new block: goal %llu, found %llu/%u\n",
>>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>>> index 1a86282..ec890fd 100644
>>> --- a/fs/ext4/inode.c
>>> +++ b/fs/ext4/inode.c
>>> @@ -640,7 +640,7 @@ static int ext4_alloc_blocks(handle_t *handle, struct inode *inode,
>>>                count = target;
>>>                /* allocating blocks for indirect blocks and direct blocks */
>>>                current_block = ext4_new_meta_blocks(handle, inode,
>>> -                                                       goal, &count, err);
>>> +                                                       goal, &count, err, 0);
>>>                if (*err)
>>>                        goto failed_out;
>>>
>>> @@ -686,7 +686,7 @@ static int ext4_alloc_blocks(handle_t *handle, struct inode *inode,
>>>                /* enable in-core preallocation only for regular files */
>>>                ar.flags = EXT4_MB_HINT_DATA;
>>>
>>> -       current_block = ext4_mb_new_blocks(handle, &ar, err);
>>> +       current_block = ext4_mb_new_blocks(handle, &ar, err, 0);
>>>        if (unlikely(current_block + ar.len > EXT4_MAX_BLOCK_FILE_PHYS)) {
>>>                EXT4_ERROR_INODE(inode,
>>>                                 "current_block %llu + ar.len %d > %d!",
>>> @@ -1930,7 +1930,7 @@ repeat:
>>>         * We do still charge estimated metadata to the sb though;
>>>         * we cannot afford to run out of free blocks.
>>>         */
>>> -       if (ext4_claim_free_blocks(sbi, md_needed + 1)) {
>>> +       if (ext4_claim_free_blocks(sbi, md_needed + 1, 0)) {
>>>                dquot_release_reservation_block(inode, 1);
>>>                if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
>>>                        yield();
>>> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
>>> index a5837a8..db8b120 100644
>>> --- a/fs/ext4/mballoc.c
>>> +++ b/fs/ext4/mballoc.c
>>> @@ -4276,7 +4276,8 @@ static int ext4_mb_discard_preallocations(struct super_block *sb, int needed)
>>>  * to usual allocation
>>>  */
>>>  ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
>>> -                               struct ext4_allocation_request *ar, int *errp)
>>> +                               struct ext4_allocation_request *ar, int *errp,
>>> +                               int flags)
>>>  {
>>>        int freed;
>>>        struct ext4_allocation_context *ac = NULL;
>>> @@ -4303,7 +4304,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle,
>>>                 * there is enough free blocks to do block allocation
>>>                 * and verify allocation doesn't exceed the quota limits.
>>>                 */
>>> -               while (ar->len && ext4_claim_free_blocks(sbi, ar->len)) {
>>> +               while (ar->len && ext4_claim_free_blocks(sbi, ar->len, flags)) {
>>>                        /* let others to free the space */
>>>                        yield();
>>>                        ar->len = ar->len >> 1;
>>> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
>>> index b545ca1..2d9b12c 100644
>>> --- a/fs/ext4/xattr.c
>>> +++ b/fs/ext4/xattr.c
>>> @@ -821,7 +821,7 @@ inserted:
>>>                                goal = goal & EXT4_MAX_BLOCK_FILE_PHYS;
>>>
>>>                        block = ext4_new_meta_blocks(handle, inode,
>>> -                                                 goal, NULL, &error);
>>> +                                                 goal, NULL, &error, 0);
>>>                        if (error)
>>>                                goto cleanup;
>>>
>>> --
>>> 1.7.1
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>
>
>
> --
> Best Wishes
> Yongqiang Yang
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux