On Mon, Jan 12, 2009 at 11:31 PM, Sandeep K Sinha <sandeepksinha@xxxxxxxxx> wrote:
> Hi Peter,
>
> On Mon, Jan 12, 2009 at 9:49 PM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
>> On Mon, Jan 12, 2009 at 4:26 PM, Sandeep K Sinha
>> <sandeepksinha@xxxxxxxxx> wrote:
>>> Hi Peter,
>>>
>>> Don't you think that this will restrict it to a specific file system?
>>> Shouldn't the VFS inode be used rather than the FS in-core inode?
>>>
>>
>> The VFS has the APIs fsync_buffer_list() and invalidate_inode_buffers(),
>> and these seem to use a spinlock for syncing:
>>
>> void invalidate_inode_buffers(struct inode *inode)
>> {
>>         if (inode_has_buffers(inode)) {
>>                 struct address_space *mapping = &inode->i_data;
>>                 struct list_head *list = &mapping->private_list;
>>                 struct address_space *buffer_mapping = mapping->assoc_mapping;
>>
>>                 spin_lock(&buffer_mapping->private_lock);
>>                 while (!list_empty(list))
>>                         __remove_assoc_queue(BH_ENTRY(list->next));
>>                         ======> modify this line for writing out the data instead.
>>                 spin_unlock(&buffer_mapping->private_lock);
>>         }
>> }
>> EXPORT_SYMBOL(invalidate_inode_buffers);
>>
>>
>>> The purpose is to block all I/O while we are updating the i_data
>>> of the old inode from the new inode (i.e. updating the data blocks).
>>>
>>> I think i_alloc_sem should work here, but I could not find any instance
>>> of its use in the code.
>>
>> For ext3's block allocation, the lock seems to be truncate_mutex -
>> read the remark:
>>
>>         /*
>>          * From here we block out all ext3_get_block() callers who want to
>>          * modify the block allocation tree.
>>          */
>>         mutex_lock(&ei->truncate_mutex);
>>
>> So while it is building the tree, the mutex keeps it locked.
>>
>> And the remarks for ext3_get_blocks_handle() are:
>>
>> /*
>>  * Allocation strategy is simple: if we have to allocate something, we will
>>  * have to go the whole way to leaf. So let's do it before attaching anything
>>  * to tree, set linkage between the newborn blocks, write them if sync is
>>  * required, recheck the path, free and repeat if check fails, otherwise
>>  * set the last missing link (that will protect us from any truncate-generated
>> ...
>>
>> Reading the source... go down to the mutex_lock() (where multiblock
>> allocations are needed); after the lock is taken, all the block
>> allocation/merging etc. is done:
>>
>>         /* Next simple case - plain lookup or failed read of indirect block */
>>         if (!create || err == -EIO)
>>                 goto cleanup;
>>
>>         mutex_lock(&ei->truncate_mutex);
>> <snip>
>>         count = ext3_blks_to_allocate(partial, indirect_blks,
>>                                       maxblocks, blocks_to_boundary);
>> <snip>
>>         err = ext3_alloc_branch(handle, inode, indirect_blks, &count, goal,
>>                                 offsets + (partial - chain), partial);
>>
>>
>>> It's working fine currently with i_mutex, meaning if we hold an i_mutex
>>
>> As far as I know, i_mutex is used for modifying the inode's structural
>> information: grep for i_mutex in fs/ext3/ioctl.c - every time the inode's
>> structural info needs to be updated, i_mutex is taken.
>>
>>> lock on the inode while updating the i_data pointers
>>> and then try to perform I/O from user space, the I/O is queued. The file was
>>> opened in r/w mode prior to taking the lock inside the kernel.
>>>
>>> But I still feel i_alloc_sem would be the right option to go ahead with.
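
For what it's worth, here is a rough, untested sketch of the locking pattern
being discussed: holding the VFS i_mutex plus ext3's truncate_mutex while the
inode's block pointers are rewritten, so that ext3_get_block() callers and
user-space writers are held off. This is not the OHSM code; it assumes ext3 on
a 2.6.2x kernel, and ohsm_rewrite_i_data() is just a made-up placeholder for
the actual copy logic:

/*
 * Rough sketch only - not the OHSM code.  Assumes ext3 on a 2.6.2x kernel.
 */
#include <linux/fs.h>
#include <linux/mutex.h>
#include <linux/ext3_fs.h>
#include <linux/ext3_fs_i.h>

/* Hypothetical placeholder: re-point the inode's data blocks here. */
static void ohsm_rewrite_i_data(struct inode *inode)
{
	/* ... */
}

static void ohsm_update_inode_blocks(struct inode *inode)
{
	struct ext3_inode_info *ei = EXT3_I(inode);

	mutex_lock(&inode->i_mutex);       /* hold off write(2) and friends  */
	mutex_lock(&ei->truncate_mutex);   /* block ext3_get_block() callers */

	ohsm_rewrite_i_data(inode);

	mutex_unlock(&ei->truncate_mutex);
	mutex_unlock(&inode->i_mutex);
}

Whether i_mutex, truncate_mutex, or i_alloc_sem (or some combination) is the
right lock here is exactly the open question in this sub-thread.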
>>>
>>> On Mon, Jan 12, 2009 at 1:11 PM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
>>>> If you grep for spinlock, mutex, or "sem" in the fs/ext4 directory, you
>>>> can see that all three types of lock are used - for different classes of
>>>> object.
>>>>
>>>> For data blocks I guess it is a semaphore - read
>>>> fs/ext4/inode.c:ext4_get_branch():
>>>>
>>>> /**
>>>>  * ext4_get_branch - read the chain of indirect blocks leading to data
>>>> <snip>
>>>>  *
>>>>  * Need to be called with
>>>>  * down_read(&EXT4_I(inode)->i_data_sem)
>>>>  */
>>>>
>>>> I guess you have no choice: as it is a semaphore, you have to follow the
>>>> rest of the kernel for consistency - don't create your own semaphore :-).
>>>>
>>>> There is also i_lock, a spinlock, which as far as I know is used for
>>>> i_blocks accounting:
>>>>
>>>>         spin_lock(&inode->i_lock);
>>>>         inode->i_blocks += tmp_inode->i_blocks;
>>>>         spin_unlock(&inode->i_lock);
>>>>         up_write(&EXT4_I(inode)->i_data_sem);
>>>>
>>>> But for data it should be i_data_sem. Is that correct?
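
Just to illustrate the pattern quoted above (ext4, ~2.6.28 era): take
i_data_sem for writing around any change to the block mapping, and the i_lock
spinlock around the i_blocks update. This is only a rough, untested sketch,
not a real ext4 function; the EXT4_I()/i_data_sem definitions live in the ext4
inode header (include/linux/ext4_fs*.h or fs/ext4/ext4*.h, depending on the
kernel version), so a snippet like this would have to sit inside ext4 itself,
and ohsm_move_mapping() is a made-up placeholder:

#include <linux/fs.h>
#include <linux/spinlock.h>
#include <linux/rwsem.h>
#include "ext4.h"	/* or <linux/ext4_fs.h> before the headers moved */

/* Hypothetical placeholder: re-point dst's block mapping at src's data. */
static void ohsm_move_mapping(struct inode *dst, struct inode *src)
{
	/* ... */
}

static void ohsm_transfer_blocks(struct inode *dst, struct inode *src)
{
	/* Writers of the block-mapping tree take i_data_sem for writing. */
	down_write(&EXT4_I(dst)->i_data_sem);

	ohsm_move_mapping(dst, src);

	/* i_blocks accounting is protected by the i_lock spinlock. */
	spin_lock(&dst->i_lock);
	dst->i_blocks += src->i_blocks;
	spin_unlock(&dst->i_lock);

	up_write(&EXT4_I(dst)->i_data_sem);
}

The ordering here (i_data_sem outer, i_lock inner) just mirrors the snippet
quoted above.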
>>>>
>>>> On Mon, Jan 12, 2009 at 2:18 PM, Rohit Sharma <imreckless@xxxxxxxxx> wrote:
>>>>> Hi,
>>>>>
>>>>> I am having some issues with locking the inode while copying data blocks.
>>>>> We are trying to keep the file system live during this operation, so
>>>>> both read and write operations should keep working.
>>>>> In this case what type of lock should be taken on the inode: a semaphore,
>>>>> a mutex or a spinlock?
>>>>>
>>>>>
>>>>> On Sun, Jan 11, 2009 at 8:45 PM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
>>>>>> Sorry... some mistakes... a resend:
>>>>>>
>>>>>> Here are some tips on the block device API:
>>>>>>
>>>>>> http://lkml.org/lkml/2006/1/24/287
>>>>>> http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-01/msg09388.html
>>>>>>
>>>>>> As indicated, documentation is rather sparse in this area.
>>>>>>
>>>>>> Not sure if anyone else has a summary list of the block device API and
>>>>>> an explanation of it?
>>>>>>
>>>>>> Also, with respect to the following "cleanup patch", I am not sure how
>>>>>> the API will change:
>>>>>>
>>>>>> http://lwn.net/Articles/304485/
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On Tue, Jan 6, 2009 at 6:36 PM, Rohit Sharma <imreckless@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> I want to read data blocks from one inode
>>>>>>> and copy them to another inode.
>>>>>>>
>>>>>>> I mean to copy data from the data blocks associated with one inode
>>>>>>> to the data blocks associated with another inode.
>>>>>>>
>>>>>>> Is that possible in kernel space?
>>>>>>> --
>>>>
>>
>> Comments?
>
> That's very right!
>
> So, finally we were able to perform the copy operation successfully.
>
> We did something like this, and we named it "OHSM's tricky copy".
> Rohit will soon be uploading a new doc on the fscops page which
> will detail it further.

Thanks - let us know when the docs and the *source code* are available ;-)

> 1. Read the source inode.
> 2. Allocate a new ghost inode.
> 3. Take a lock on the source inode. /* a mutex, because nr_blocks
>    can change if a write comes in from user space now */
> 4. Read the number of blocks.
> 5. Allocate the same number of blocks for the dummy ghost inode. /*
>    the chain will be created automatically */
> 6. Read the source buffer head of each block of the source inode and
>    the destination buffer head of the corresponding block of the
>    destination inode.
> 7. dest_buffer->b_data = source_buffer->b_data; /* it's a char * and
>    this is where the trick is */
> 8. Mark the destination buffer dirty.
>
> Perform 6, 7 and 8 for all the blocks.
>
> 9. Swap src_inode->i_data[15] and dest_dummy_inode->i_data[15]. /*
>    This lets us simply avoid copying the block numbers back from the
>    destination dummy inode to the source inode */

I don't know anything about LVM, so this might be a dumb question: why is
this required? Did you mean swapping all the block numbers rather than just
entry [15]? And is src_inode here the VFS "struct inode" or the FS-specific
struct FS_inode_info? I didn't get this completely; can you explain this
point a bit more?

Thanks,
Manish

>    /* This also helps us to simply destroy the inode, which will eventually
>    free all the blocks, which we would otherwise have had to do
>    separately */
>
> 9.1 Release the mutex on the source inode.
>
> 10. Set the I_FREEING bit in dest_inode->i_state.
>
> 11. Call FS_delete_inode(dest_inode).
>
> Any application that has already opened this inode for read/write and
> tries to do I/O while the mutex is held will simply be queued.
>
> Thanks a lot Greg, Manish, Peter and all the others for all your valuable
> inputs and help.
>
>> --
>> Regards,
>> Peter Teoh
>
> --
> Regards,
> Sandeep.
>
> "To learn is to change. Education is a process that changes the learner."
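
For reference, here is a rough, untested sketch of the per-block copy and
i_data swap described in the steps above, assuming ext3 on a ~2.6.28 kernel.
Unlike step 7 above, which assigns the b_data pointers directly, this sketch
memcpy()s the block contents, and the error handling is simplified.
ohsm_block_of() is a made-up helper that returns the physical block number
backing logical block n of an inode:

#include <linux/fs.h>
#include <linux/buffer_head.h>
#include <linux/string.h>
#include <linux/ext3_fs.h>
#include <linux/ext3_fs_i.h>

/* Hypothetical: map logical block n of @inode to its physical block number. */
extern sector_t ohsm_block_of(struct inode *inode, unsigned long n);

static int ohsm_copy_blocks(struct inode *src, struct inode *ghost,
			    unsigned long nr_blocks)
{
	struct super_block *sb = src->i_sb;
	unsigned long n;

	for (n = 0; n < nr_blocks; n++) {
		struct buffer_head *src_bh, *dst_bh;

		src_bh = sb_bread(sb, ohsm_block_of(src, n));
		dst_bh = sb_bread(sb, ohsm_block_of(ghost, n));
		if (!src_bh || !dst_bh) {
			brelse(src_bh);
			brelse(dst_bh);
			return -EIO;
		}

		/* Steps 6-8: copy the block contents and dirty the copy. */
		memcpy(dst_bh->b_data, src_bh->b_data, sb->s_blocksize);
		mark_buffer_dirty(dst_bh);

		brelse(src_bh);
		brelse(dst_bh);
	}

	/*
	 * Step 9: swap the 15-slot i_data arrays of the two in-core inodes,
	 * so the source inode now points at the freshly written blocks and
	 * the ghost inode inherits the old ones.
	 */
	{
		__le32 tmp[EXT3_N_BLOCKS];

		memcpy(tmp, EXT3_I(src)->i_data, sizeof(tmp));
		memcpy(EXT3_I(src)->i_data, EXT3_I(ghost)->i_data, sizeof(tmp));
		memcpy(EXT3_I(ghost)->i_data, tmp, sizeof(tmp));
	}
	mark_inode_dirty(src);

	return 0;
}

After the swap the ghost inode owns the old blocks, so setting I_FREEING and
calling the FS delete_inode path (steps 10 and 11 above) frees them without a
separate block-by-block walk.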