On Wed, Jan 7, 2009 at 11:33 PM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
> On Wed, Jan 7, 2009 at 5:48 AM, Rohit Sharma <imreckless@xxxxxxxxx> wrote:
>> On Wed, Jan 7, 2009 at 12:44 PM, Manish Katiyar <mkatiyar@xxxxxxxxx> wrote:
>>> On Wed, Jan 7, 2009 at 12:17 PM, Sandeep K Sinha
>>> <sandeepksinha@xxxxxxxxx> wrote:
>>>> Ok, Let me rephrase what rohit is exactly trying to question.
>>>>
>>>> There is an inode X which has say some N number of data blocks.
>>>> Now, through his own kernel module and some changes to the file system,
>>>> he wants to create a new inode Y in the FS and physically copy all the
>>>> data from the old inode to the new inode.
>>>
>>> Errr....... I must be missing something...... For this why do you need
>>> to copy the data blocks ? if you just copy the old inode to new inode,
>>> you have already copied the direct and indirect block pointers right ?
>>> That will not take much time, and now if you free the old inode, you
>>> have virtually changed the ownership of old blocks to the new inode.
>>>
>> The problem is not replacing the inode, i want to physically move the data.
>> That means if inode X and its data blocks are in block group 1, and
>> new inode is in block group 100
>> then i will allocate data blocks in block group 100 and copy the data
>> from inode X to inode Y.
>> So i will be able to physically relocate a file, and change the
>> directory entry to contain inode Y.
>>
>>> The problems i can see with this approach is that if the new inode is
>>> not in the same block group as old inode, you have *kind of broken*
>>> the ext2's intelligence of allocating the blocks in the same block
>>> group.
>>>
>>> CMIIW . btw this thread is interesting :-)
>>
>> Yes its interesting. :-)
>>
>>>
>> I haven't actually broken ext2's intelligence completely, i have only put
>> restrictions in allocation of inode and data blocks.
>> And it works fine with existing optimizations.
>>
>> And the major issue is relocating files between different block group range.
>
> So if I understand your high level desire, you want to write a
> filesystem re-org (or defrag or something) that works one file at a
> time.
>
> You have to do it in the kernel because you want to control the inode
> and data block allocation.
>
> Your current thought is to mount the entire filesystem readonly, do
> the re-org, remount r/w.
>
> If this is just for yourself, it might be acceptable. If this is for
> the community, it is not (IMO).
>
> To be of value to the community, you need to be more aggressive and
> get this to work on a running filesystem.
>
> My first attempt at high-level pseudo code would be:
>
> ===========
> re_org_file()
> {
>     read_orig_inode()
>     set inode.re_org_in_progress = true
>
>     lock(inode.re_org_in_progress)
>     allocate destination inode                        // Do not initiate any real i/o
>     allocate all destination indirect pointer blocks  // Do not initiate any real i/o
>     allocate all destination data blocks              // Do not initiate any real i/o
>
>     allocate_file_re_org_done_array   // one bit per data block
>     memset (file_re_org_done_array, false)
>     release_lock(inode.re_org_in_progress)
>
>     for each bit in file_re_org_done_array[] {
>         if (not file_re_org_done_array[block]) {
>             lock(inode.re_org_in_progress)
>             copy_block()   // I know, your question is how to do this
>             set file_re_org_done_array[block] = true
>             release_lock(inode.re_org_in_progress)
>         }
>     }
>
>     lock(inode.re_org_in_progress)
>     copy_inode_info()
>     update_directory_entries()
>     release_lock(inode.re_org_in_progress)
>
>     set inode.re_org_in_progress = false
> }
>
> Then insert logic into the write() code that does:
>
> // inserted write logic
> if inode.re_org_in_progress == true {
>     lock(inode.re_org_in_progress)
>     send data to orig block        // no real i/o needed
>     send data to new dest block    // no real i/o needed
>     file_re_org_done_array[block] = true
>     release_lock(inode.re_org_in_progress)
> } else
>     send data to normal block      // no real i/o needed
>
> // end of inserted write logic
> ====================================
>
> Does that capture the essence of what you are trying to do?
>
> And I assume your first question is still, how do I write copy_block()?
>

Possibly you can use this function, which goes through the block I/O
API and so is filesystem independent (the buffer_head in this case
would be a linked list of buffers read into memory from each data
block):

fs/buffer.c:

/*
 * For a data-integrity writeout, we need to wait upon any in-progress I/O
 * and then start new I/O and then wait upon it.  The caller must have a ref on
 * the buffer_head.
 */
int sync_dirty_buffer(struct buffer_head *bh)
{
        int ret = 0;

        WARN_ON(atomic_read(&bh->b_count) < 1);
        lock_buffer(bh);
        if (test_clear_buffer_dirty(bh)) {
                get_bh(bh);
                bh->b_end_io = end_buffer_write_sync;
                ret = submit_bh(WRITE_SYNC, bh);
                wait_on_buffer(bh);
                if (buffer_eopnotsupp(bh)) {
                        clear_buffer_eopnotsupp(bh);
                        ret = -EOPNOTSUPP;
                }
                if (!ret && !buffer_uptodate(bh))
                        ret = -EIO;
        } else {
                unlock_buffer(bh);
        }
        return ret;
}

> I don't know the actual semantics for that, but maybe someone can take
> the above and either figure out a better way to accomplish the re-org
> or tell you how to implement copy_block() as needed in the above.
>
> Greg
> --
> Greg Freemyer
> Litigation Triage Solutions Specialist
> http://www.linkedin.com/in/gregfreemyer
> First 99 Days Litigation White Paper -
> http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
>
> The Norcross Group
> The Intersection of Evidence & Technology
> http://www.norcrossgroup.com
>
> --
> To unsubscribe from this list: send an email with
> "unsubscribe kernelnewbies" to ecartis@xxxxxxxxxxxx
> Please read the FAQ at http://kernelnewbies.org/FAQ
>
>

--
Regards,
Peter Teoh
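
P.S. To make the bookkeeping in the quoted re_org_file() pseudocode concrete, here is a rough userspace model in plain C. It is only a sketch under stated assumptions, not kernel code: struct reorg_file and the reorg_*() helpers are made-up names, blocks are tiny in-memory arrays, writes always cover a whole block, and the per-inode lock is only marked in comments because the model is single-threaded. It may still help to see how the done array lets the copy loop skip blocks that a concurrent write has already brought up to date.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLK_SIZE 8   /* tiny block size, for illustration only */
#define NBLOCKS  4   /* number of data blocks in the model file */

/* Hypothetical stand-in for a file whose data is being relocated. */
struct reorg_file {
    unsigned char orig[NBLOCKS][BLK_SIZE]; /* blocks owned by inode X */
    unsigned char dest[NBLOCKS][BLK_SIZE]; /* blocks preallocated for inode Y */
    bool done[NBLOCKS];                    /* file_re_org_done_array */
    bool in_progress;                      /* inode.re_org_in_progress */
};

/* Start a re-org; the destination blocks are assumed to be allocated. */
static void reorg_begin(struct reorg_file *f)
{
    memset(f->done, 0, sizeof f->done);
    f->in_progress = true;
}

/*
 * Copy one block unless a write has already updated the destination.
 * Returns true if this call performed the copy. In the pseudocode the
 * per-inode lock would be held around this body.
 */
static bool reorg_copy_block(struct reorg_file *f, size_t blk)
{
    assert(blk < NBLOCKS);
    if (f->done[blk])
        return false;
    memcpy(f->dest[blk], f->orig[blk], BLK_SIZE);
    f->done[blk] = true;
    return true;
}

/*
 * Write-path hook. Assumes whole-block writes: a partial write could not
 * mark the block done without copying the rest of the block first.
 */
static void reorg_write_block(struct reorg_file *f, size_t blk,
                              const unsigned char *data)
{
    assert(blk < NBLOCKS);
    memcpy(f->orig[blk], data, BLK_SIZE);
    if (f->in_progress) {
        memcpy(f->dest[blk], data, BLK_SIZE);
        f->done[blk] = true;
    }
}

/* Run the copy loop, end the re-org, and report how many blocks it copied. */
static size_t reorg_finish(struct reorg_file *f)
{
    size_t copied = 0;

    for (size_t blk = 0; blk < NBLOCKS; blk++)
        if (reorg_copy_block(f, blk))
            copied++;
    f->in_progress = false;
    return copied;
}
```

With four blocks, calling reorg_begin(), then reorg_write_block() on block 2, then reorg_finish() should copy only the other three blocks and leave dest identical to orig. In the kernel the memcpy() calls would be replaced by real block I/O, which is exactly the copy_block() question above.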