On Wed, Jan 7, 2009 at 11:05 PM, Sandeep K Sinha <sandeepksinha@xxxxxxxxx> wrote:
> Hi Greg,
>
> Just to give you a context of the problem :
> refer:
> http://code.google.com/p/fscops/
>
> reply inline.
>
> On Wed, Jan 7, 2009 at 9:03 PM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
>> On Wed, Jan 7, 2009 at 5:48 AM, Rohit Sharma <imreckless@xxxxxxxxx> wrote:
>>> On Wed, Jan 7, 2009 at 12:44 PM, Manish Katiyar <mkatiyar@xxxxxxxxx> wrote:
>>>> On Wed, Jan 7, 2009 at 12:17 PM, Sandeep K Sinha
>>>> <sandeepksinha@xxxxxxxxx> wrote:
>>>>> Ok, Let me rephrase what rohit is exactly trying to question.
>>>>>
>>>>> There is an inode X which has say some N number of data blocks.
>>>>> Now, through his own kernel module and some changes to the file system,
>>>>> he wants to create a new inode Y in the FS and physically copy all the
>>>>> data from the old inode to the new inode.
>>>>
>>>> Errr....... I must be missing something...... For this why do you need
>>>> to copy the data blocks ? if you just copy the old inode to new inode,
>>>> you have already copied the direct and indirect block pointers right ?
>>>> That will not take much time, and now if you free the old inode, you
>>>> have virtually changed the ownership of old blocks to the new inode.
>>>>
>>> The problem is not replacing the inode, i want to physically move the data.
>>> That means if inode X and its data blocks are in block group 1, and
>>> new inode is in block group 100
>>> then i will allocate data blocks in block group 100 and copy the data
>>> from inode X to inode Y.
>>> So i will be able to physically relocate a file, and change the
>>> directory entry to contain inode Y.
>>>
>>>> The problems i can see with this approach is that if the new inode is
>>>> not in the same block group as old inode, you have *kind of broken*
>>>> the ext2's intelligence of allocating the blocks in the same block
>>>> group.
>>>>
>>>> CMIIW . btw this thread is interesting :-)
>>>
>>> Yes its interesting. :-)
>>>
>>>>
>>> I haven't actually broken ext2's intelligence completely, i have only put
>>> restrictions in allocation of inode and data blocks.
>>> And it works fine with existing optimizations.
>>>
>>> And the major issue is relocating files between different block group range.
>>
>> So if I understand your high level desire, you want to write a
>> filesystem re-org (or defrag or something) that works one file at a
>> time.
>>
> Yes kind of, you can say that.
>
>> You have to do it in the kernel because you want to control the inode
>> and data block allocation.
>>
>
> Because I want to keep control the allocation of data blocks of a file
> to a specific device underneath a LVM. And so the mapping of blocks
> from FS->LVM->DEVICES resides inside the kernel only.
>
>> Your current thought is to mount the entire filesystem readonly, do
>> the re-org, remount r/w.
>>
> Well, this should work for now but ya we will look for an alternative
> for sure. something like freeze/thaw.
>
>> If this is just for yourself, it might be acceptable. If this is for
>> the community, it is not (IMO).
>>
> Surely this is for our personal use.
>
>> To be of value to the community, you need to be more aggressive and
>> get this to work on a running filesystem.
>>
>
> Surely be a milestone, soon.
>
>> My first attempt at high-level pseudo code would be:
>>
>> ===========
>> re_org_file()
>> {
>>   read_orig_inode()
>>   set inode.re_org_in_progress = true
>>
>>   lock(inode.re_org_in_progress)
>>   allocate destination inode        // Do not initiate any real i/o
>>   allocate all destination indirect pointer blocks
>>                                     // Do not initiate any real i/o
>>   allocate all destination data blocks
>>                                     // Do not initiate any real i/o
>>
>>   allocate_file_re_org_done_array   // one bit per data block
>>   memset (file_re_org_done_array, false)
>>   release_lock(inode.re_org_in_progress)
>>
>>   for each bit in file_re_org_done_array[] {
>>     if (not file_re_org_done_array[block]) {
>>       lock(inode.re_org_in_progress)
>>       copy_block()   // I know, your question is how to do this
>>       set file_re_org_done_array[block] = true
>>       release_lock(inode.re_org_in_progress)
>>     }
>>   }
>>
>>   lock(inode.re_org_in_progress)
>>   copy_inode_info()
>
> Well, currently I dont intend to move the inode to a new location. I
> would prefer leave the original inode intact just updating the new
> data block pointers. This is still in debate, whether to relocate
> inode or not.
>
>>   update_directory_entries()
>>   release_lock(inode.re_org_in_progress)
>>
>>   set inode.re_org_in_progress = false
>> }
>>
>> Then insert logic into the write() code that does:
>>
>> // inserted write logic
>> if inode.re_org_in_progress == true {
>>   lock(inode.re_org_in_progress)
>>   send data to orig block       // no real i/o needed
>>   send data to new dest block   // no real i/o needed
>>   file_re_org_done_array[block] = true
>>   release_lock(inode.re_org_in_progress)
>> } else
>>   send data to normal block     // no real i/o needed
>>
>> // end of inserted write logic
>
> What exactly is this required for ? Is this for any kind of metadata updates ?

I assumed you needed to effectively "copy the file to a new
destination, then delete the original file".
To do that on a live file with minimal interference with user space
invoked i/o I created a ghost version which initially had empty data
blocks. Then I allowed normal file i/o to continue.

Look at the first chunk of code and see where I released the lock.
Whenever the lock is released normal user space file i/o is allowed
to occur.

Reads are easily handled by reading from the original file.

Writes on the other hand have to update both the original file data
blocks and the newly allocated data blocks.

And as I look again at the pseudo code, I forgot to do the file delete
of the original inode at the end.

Greg
--
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
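
For readers following the thread, the copy loop and the dual-write
path described in this message can be tried out in user space with
something like the sketch below. It is only a rough illustration, not
kernel code and not anyone's actual implementation. Everything in it
is an assumption made for the example: struct reorg_file, the function
names, the in-memory arrays standing in for on-disk blocks, and a
pthread mutex standing in for the inode lock. It also assumes every
write covers one whole block, and it checks the done flag while
holding the lock so a concurrent write cannot slip in between the
check and the copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define BLOCK_SIZE 16 /* tiny blocks, for illustration only */
#define NBLOCKS 4

/* Hypothetical in-memory stand-in for a file that is being relocated. */
struct reorg_file {
    pthread_mutex_t lock;           /* stands in for lock(inode.re_org_in_progress) */
    bool in_progress;               /* inode.re_org_in_progress */
    bool done[NBLOCKS];             /* file_re_org_done_array */
    char orig[NBLOCKS][BLOCK_SIZE]; /* original data blocks */
    char dest[NBLOCKS][BLOCK_SIZE]; /* newly allocated destination blocks */
    char (*cur)[BLOCK_SIZE];        /* blocks the "inode" currently points at */
};

void file_init(struct reorg_file *f)
{
    memset(f, 0, sizeof *f);
    pthread_mutex_init(&f->lock, NULL);
    f->cur = f->orig;
}

/* Start a re-org: the destination is an empty "ghost", nothing copied yet. */
void reorg_begin(struct reorg_file *f)
{
    pthread_mutex_lock(&f->lock);
    memset(f->dest, 0, sizeof f->dest);
    memset(f->done, 0, sizeof f->done);
    f->in_progress = true;
    pthread_mutex_unlock(&f->lock);
}

/* Copy one block unless a write already filled it; returns true if it copied. */
bool reorg_copy_block(struct reorg_file *f, size_t block)
{
    bool copied = false;

    assert(block < NBLOCKS);
    pthread_mutex_lock(&f->lock);
    if (!f->done[block]) {
        memcpy(f->dest[block], f->orig[block], BLOCK_SIZE);
        f->done[block] = true;
        copied = true;
    }
    pthread_mutex_unlock(&f->lock);
    return copied;
}

/* Whole-block write; during a re-org it updates both copies. */
void file_write(struct reorg_file *f, size_t block, const char *data)
{
    assert(block < NBLOCKS);
    pthread_mutex_lock(&f->lock);
    if (f->in_progress) {
        memcpy(f->orig[block], data, BLOCK_SIZE);
        memcpy(f->dest[block], data, BLOCK_SIZE);
        f->done[block] = true; /* the copy loop can skip this block */
    } else {
        memcpy(f->cur[block], data, BLOCK_SIZE);
    }
    pthread_mutex_unlock(&f->lock);
}

/* Reads always go to the current blocks (the original ones during a re-org). */
const char *file_read(struct reorg_file *f, size_t block)
{
    const char *p;

    assert(block < NBLOCKS);
    pthread_mutex_lock(&f->lock);
    p = f->cur[block];
    pthread_mutex_unlock(&f->lock);
    return p;
}

/* Switch over to the destination blocks once every block is marked done. */
void reorg_finish(struct reorg_file *f)
{
    pthread_mutex_lock(&f->lock);
    for (size_t i = 0; i < NBLOCKS; i++)
        assert(f->done[i]);
    f->cur = f->dest; /* a real filesystem would update block pointers here */
    f->in_progress = false;
    pthread_mutex_unlock(&f->lock);
}
```

A caller would run file_init(), then reorg_begin(), then
reorg_copy_block() for each block index while other threads keep
calling file_read() and file_write(), and finally reorg_finish().
Partial-block writes are not handled: marking a block done after
writing only part of it would lose the rest, so such a write would
have to copy the block first. Freeing the original blocks, which the
message notes was missing from the pseudo code, is also left out.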