Hi Greg,

On Fri, Jan 16, 2009 at 5:50 AM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
> On Thu, Jan 15, 2009 at 12:47 PM, Sandeep K Sinha
> <sandeepksinha@xxxxxxxxx> wrote:
>> Hey,
>>
>> On Thu, Jan 15, 2009 at 10:27 PM, Greg Freemyer
> <snip>
>>> I think I've said it before, but I would think the best real world
>>> implementation would be:
>>>
>>> ===
>>> pre-allocate destination data blocks
>>>
>>> For each block
>>>   prefetch source data block
>>>   lock inode
>>>   copy source data block to dest data block IN MEMORY ONLY and put in
>>>   block queue for delivery to disk
>>>   release lock
>>> end
>>>
>>> perform_inode_level_block_pointer_swap
>>> ===
>>>
>>
>> I would be more than very happy if I am able to accomplish this. Greg,
>> the only problem that I see here is somebody who has already opened
>> the file is making the size of the file to increase, once I
>> preallocate destination data blocks.
>> And I don't see a way to avoid that. But surely looking forward to.
>>
>> I have seen many similar implementations and most of them suffer from
>> this issue. But surely there can be a way to optimize it, if not avoid
>> it.
>
> The way ext4_defrag works I believe is to put a lock around the
> inode's block list every 64MB and I assume that under that lock it has
> a static list of inode block pointers to work with.
>
> At the conclusion of the 64MB chunk, it releases the lock and allows
> writes to occur. That includes writes that extend the file.
>

For us, this granularity is initially the size of the file, meaning
however many data blocks it has. We could also break up the relocation
of a file's blocks into 64MB chunks, but then my question would be: why
not 100MB, or 20MB? It's just a granularity that ext4_defrag happened to
choose, and I don't think there is any performance rationale behind it.
I would say it adds the extra cost of taking and releasing locks every
64MB. And what if someone else takes the lock in between and doesn't
give it up soon?
The relocation process would be delayed for that reason. I do know
that, above all, the lock should be held for as short a time as
possible.

> Then it locks the inode again and once again gets a full fresh list of
> the inode block pointers. If the file has grown between release and
> the next lock, then the new inode block pointer list will reflect
> those new blocks as well.
>

What if you don't get the lock again? And how is it that the Linux
kernel maintainers accept holding a lock for a 64MB block copy? If
that's allowed, why would they have issues with us locking for some
other granularity X? But first I will look at the performance metrics
of dividing the copy operation into chunks.

> I think you said ext4_defrag() is using 2 different locks. Maybe one
> is just to stop updates to the inode data block pointers, and the
> other is finer grained and deals with individual blocks being locked?
>

That's very true, they take two locks. But if the inode is locked, how
can the size of the file increase? Is that possible? Are you saying
that they re-check the size after every 64MB copy?

> That would make me happier and seems like a more reasonable
> implementation than locking the file for all writes for the full 64MB
> move.
>

No, they lock the inode with both locks in ext4_defrag. Since any
read/write goes through the inode, this protects against updates to the
inode and to all of its existing data blocks.

> This brings up a question. Are you always "moving" a data block, or
> do you have a test in the loop to verify it is not already on the
> correct tier of storage?

Let me explain in a bit more detail. We have two fields in the inode,
home_tier_id and destination_tier_id. home_tier_id is set if a file
matches a file allocation policy. If it doesn't match any of the
policies, its data can be allocated anywhere in the FS; we simply fall
back to the FS's original block allocation method.
If a file matches, we set its home_tier_id to the tier named in the
policy and restrict block allocation to that tier.

Now, at relocation time, suppose the policy (in the XML policy file)
was:

  SELECT *.mp3 from TIER 1, RELOCATE to TIER 4,
  When file Access temp(FAT) > 200

We do an FS scan and read each inode one by one. First we check whether
its home_tier_id != 0, since that means it was allocated by OHSM;
otherwise we skip that inode. Then we check the file type: if it's an
mp3, we set destination_tier_id to the destination tier from the policy
and pass the inode on for relocation. The relocation function fetches
destination_tier_id from the inode and allocates new blocks from that
tier. Finally it sets home_tier_id to the destination tier.

Does that answer your question, sir?

>
> Greg
> --
> Greg Freemyer
> Litigation Triage Solutions Specialist
> http://www.linkedin.com/in/gregfreemyer
> First 99 Days Litigation White Paper -
> http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
>
> The Norcross Group
> The Intersection of Evidence & Technology
> http://www.norcrossgroup.com
>

--
Regards,
Sandeep.

"To learn is to change. Education is a process that changes the learner."

--
To unsubscribe from this list: send an email with
"unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
Please read the FAQ at http://kernelnewbies.org/FAQ
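
For readers following the thread, the inode scan described in the reply above can be sketched roughly as below. This is a user-space model in Python, not kernel code and not the OHSM implementation. The field names home_tier_id and destination_tier_id come from the mail; the Inode and RelocationPolicy classes, the relocate() and scan_and_relocate() helpers, and the use of a filename pattern to stand in for "file type" are all assumptions made for illustration. The source-tier and access-temperature parts of the policy are left out because the mail does not say how they are evaluated.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

UNMANAGED = 0  # home_tier_id == 0: file was not placed by an allocation policy


@dataclass
class Inode:
    """Hypothetical in-memory stand-in for an inode with the two tier fields."""
    name: str
    home_tier_id: int = UNMANAGED
    destination_tier_id: int = UNMANAGED


@dataclass
class RelocationPolicy:
    """Hypothetical parsed form of one rule from the XML policy file."""
    pattern: str    # e.g. "*.mp3"
    dest_tier: int  # e.g. 4


def relocate(inode: Inode) -> None:
    """Placeholder for relocation: only the tier bookkeeping is modelled."""
    # Allocating blocks on the destination tier and copying data is omitted.
    inode.home_tier_id = inode.destination_tier_id


def scan_and_relocate(inodes, policy):
    """Walk every inode; relocate those that are managed and match the rule."""
    moved = []
    for inode in inodes:
        if inode.home_tier_id == UNMANAGED:
            continue  # not allocated under a policy, leave it alone
        if not fnmatch(inode.name, policy.pattern):
            continue  # wrong file type for this rule
        inode.destination_tier_id = policy.dest_tier
        relocate(inode)
        moved.append(inode.name)
    return moved


if __name__ == "__main__":
    files = [
        Inode("song.mp3", home_tier_id=1),
        Inode("notes.txt", home_tier_id=1),
        Inode("stray.mp3"),  # never placed by a policy
    ]
    print(scan_and_relocate(files, RelocationPolicy("*.mp3", dest_tier=4)))
```

Note that, as in the description, this loop does not test whether a file already sits on the destination tier before relocating it; whether the real code has such a check is not stated in the mail.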