Re: delalloc and reservation

I guess the list dropped this mail. Sending again.

-aneesh
--- Begin Message ---


Aneesh Kumar K.V wrote:
Hi All,

I looked at the delalloc and reservation differences that Valerie was observing. Below is my understanding. I am not sure whether this results in the higher fragmentation that Eric Sandeen is observing; I guess it should not. Even though the reservation gets discarded during clear_inode under memory pressure, the request for a new reservation should get blocks nearby and not break extents, right?


Anyhow, below is the simple case.

Without delalloc, the blocks are requested during prepare_write/write_begin.
That means we enter ext4_new_blocks_old, which calls ext4_try_to_allocate_with_rsv. If there is no reservation for this inode, a new one is allocated. After the blocks are used, this reservation is destroyed during close via ext4_release_file.

With delalloc, the blocks are not requested until we hit writeback/ext4_da_writepages. That means that if we create new files and close them, the reservation is discarded during close via ext4_release_file (actually, there is nothing to clear at that point). Later, when we do a sync or writeback and try to get the blocks, the inode requests a new reservation. This reservation is not discarded until we call clear_inode,
and that results in the behavior we are seeing:
Free blocks: 1440-8191, 8194-8199, 8202-8207, 8210-8215, 8218-8223, 8226-8231, 8234-8239, 8242-8247, 8250-8255, 8258-8263, 8266-8271, 8274-8279, 8282-8287, 8290-8295, 8298-8303, 8306-8311, 8314-8319, 8322-8327, 8330-8335, 8338-8343, 8346-12799
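
To make the pattern above concrete, here is a toy model in C. It is only a sketch under stated assumptions, not the ext4 allocator: the block-group size, the 8-block window, and the 2 blocks written per file are values picked to resemble the listing, and write_small_files() and count_free_extents() are hypothetical names. It compares a window whose unused tail stays reserved (as with delalloc until clear_inode) with a window that is released as soon as the file is done (as at close without delalloc).

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS  64  /* toy block group size (assumption) */
#define WINDOW    8  /* toy reservation window, in blocks (assumption) */
#define PER_FILE  2  /* blocks each small file actually writes (assumption) */

/*
 * Place nfiles small files. Each file gets a window starting at 'next'
 * and uses the first PER_FILE blocks of it.
 * keep_window != 0: the unused tail of the window stays reserved, so the
 *                   next file's window starts after it.
 * keep_window == 0: the tail is released at once, so the next file starts
 *                   right after the blocks that were written.
 * Returns the number of files that fit.
 */
static int write_small_files(unsigned char *used, int nfiles, int keep_window)
{
    int next = 0, placed = 0;

    assert(nfiles >= 0);
    memset(used, 0, NBLOCKS);
    for (int f = 0; f < nfiles && next + WINDOW <= NBLOCKS; f++) {
        for (int b = 0; b < PER_FILE; b++)
            used[next + b] = 1;
        next += keep_window ? WINDOW : PER_FILE;
        placed++;
    }
    return placed;
}

/* Count free extents, i.e. maximal runs of unused blocks. */
static int count_free_extents(const unsigned char *used)
{
    int extents = 0;

    for (int i = 0; i < NBLOCKS; i++)
        if (!used[i] && (i == 0 || used[i - 1]))
            extents++;
    return extents;
}
```

Calling write_small_files(used, 4, 1) on a 64-byte array and then count_free_extents(used) shows free space split into several 6-block holes plus the tail of the group, while the same call with keep_window set to 0 leaves a single free extent.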

So now the question is: where do we discard the reservation in the delalloc case?

-

With mballoc we are not seeing this, because we allocate
from the group prealloc list, which is per-CPU.
In most cases we have EXT4_MB_HINT_GROUP_ALLOC set in mballoc.

In ext4_mb_group_or_file I already have a FIXME!! regarding this.

Currently we have:

       /* request is so large that we don't care about
        * streaming - it overweights any possible seek */
       if (ac->ac_o_ex.fe_len >= sbi->s_mb_large_req)
               return;

       /* FIXME!!
        * is this  >=  considering the above ?
        */
       if (ac->ac_o_ex.fe_len >= sbi->s_mb_small_req)
               return;

       .....
       ......

      /* we're going to use group allocation */
       ac->ac_flags |= EXT4_MB_HINT_GROUP_ALLOC;
........
      .........

So for small sizes we have EXT4_MB_HINT_GROUP_ALLOC set. Now if
I change the comparison on the line below the FIXME!! to <=, that forces
small sizes to use inode prealloc, and that causes:

Free blocks: 1442-1443, 1446-1447, 1450-1451, 1454-1455, 1458-1459, 1462-1463, 1466-1467, 1470-1471, 1474-1475, 1478-1479, 1482-1483, 1486-1487, 1490-1491, 1494-1495, 1498-1499, 1502-1503, 1506-1507, 1510-1511, 1514-1515, 1518-12799
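
The effect of flipping that comparison can be sketched in isolation. The snippet below is a simplified, hypothetical model of the two checks quoted above, not the kernel function: struct toy_limits stands in for sbi->s_mb_small_req and sbi->s_mb_large_req, choose_prealloc() is an invented name, and the code elided in the excerpt is ignored. An early return is modelled as "inode prealloc", and reaching the end as "group prealloc".

```c
#include <assert.h>

enum prealloc_kind { INODE_PREALLOC, GROUP_PREALLOC };

/* Hypothetical stand-ins for sbi->s_mb_small_req / sbi->s_mb_large_req. */
struct toy_limits {
    unsigned small_req;
    unsigned large_req;
};

/*
 * len:              request length in blocks (ac->ac_o_ex.fe_len above)
 * flip_small_check: 0 keeps the quoted ">=" test against small_req,
 *                   1 uses the "<=" variant discussed in the mail
 */
static enum prealloc_kind choose_prealloc(unsigned len,
                                          const struct toy_limits *lim,
                                          int flip_small_check)
{
    assert(lim->small_req <= lim->large_req);

    /* Large requests return early in both variants. */
    if (len >= lim->large_req)
        return INODE_PREALLOC;

    /* The check under the FIXME!!, in its quoted or flipped form. */
    if (flip_small_check ? len <= lim->small_req : len >= lim->small_req)
        return INODE_PREALLOC;

    /* Otherwise the group allocation hint would be set. */
    return GROUP_PREALLOC;
}
```

With made-up limits of 16 and 64 blocks, a 2-block request goes to group prealloc under the quoted check and to inode prealloc once the check is flipped, which is the case that produced the listing above.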


So the problem is generic.


-aneesh




--- End Message ---
