Re: Copying Data Blocks

Peter Teoh <htmldeveloper@xxxxxxxxx> · Tue, 13 Jan 2009 00:19:26 +0800

On Mon, Jan 12, 2009 at 4:26 PM, Sandeep K Sinha
<sandeepksinha@xxxxxxxxx> wrote:
> Hi Peter,
>
> Don't you think that it will restrict this to a specific file system?
> Shouldn't the VFS inode be used rather than the FS in-core inode?
>

The VFS has an API for this: fsync_buffers_list() and
invalidate_inode_buffers(). Both appear to use a spinlock for
synchronization:

void invalidate_inode_buffers(struct inode *inode)
{
        if (inode_has_buffers(inode)) {
                struct address_space *mapping = &inode->i_data;
                struct list_head *list = &mapping->private_list;
                struct address_space *buffer_mapping = mapping->assoc_mapping;

                spin_lock(&buffer_mapping->private_lock);
                /* ======> modify this loop to write out the data instead */
                while (!list_empty(list))
                        __remove_assoc_queue(BH_ENTRY(list->next));
                spin_unlock(&buffer_mapping->private_lock);
        }
}
EXPORT_SYMBOL(invalidate_inode_buffers);
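
One caveat on the "write out the data instead" idea: writing a buffer
can block, and a spinlock must not be held across anything that sleeps,
so the loop would probably have to drop the lock around each write.
Below is a minimal userspace sketch of that pattern, not kernel code. A
pthread mutex stands in for private_lock, and struct buf, struct
buf_list, write_out() and drain_and_write() are made-up names used only
for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer queued on an inode's private list. */
struct buf {
        struct buf *next;
        int dirty;
        int written;            /* set by the fake "I/O" below */
};

struct buf_list {
        pthread_mutex_t lock;   /* plays the role of private_lock */
        struct buf *head;
};

/* Pretend to write one buffer out; real I/O could block here. */
static void write_out(struct buf *b)
{
        if (b->dirty) {
                b->written = 1;
                b->dirty = 0;
        }
}

/*
 * Drain the list, writing each entry instead of only unlinking it.
 * The lock is dropped around write_out() so that the (possibly
 * blocking) write never runs with the lock held.
 * Returns the number of entries drained.
 */
static int drain_and_write(struct buf_list *l)
{
        int n = 0;

        assert(l != NULL);
        pthread_mutex_lock(&l->lock);
        while (l->head != NULL) {
                struct buf *b = l->head;

                l->head = b->next;      /* unlink under the lock */
                b->next = NULL;
                pthread_mutex_unlock(&l->lock);

                write_out(b);           /* blocking work, lock not held */
                n++;

                pthread_mutex_lock(&l->lock);
        }
        pthread_mutex_unlock(&l->lock);
        return n;
}
```

Because each entry is unlinked before the lock is released, a concurrent
caller can never pick up the same buffer twice.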

> The purpose is to put all the I/O to sleep while we are updating the
> i_data from the new inode to the old inode (updating the data blocks).
>
> I think i_alloc_sem should work here, but could not find any instance
> of its use in the code.

For ext3's block allocation, the lock seems to be
truncate_mutex. Read the comment:

        /*
         * From here we block out all ext3_get_block() callers who want to
         * modify the block allocation tree.
         */
        mutex_lock(&ei->truncate_mutex);

So while the block allocation tree is being modified, the mutex keeps
other ext3_get_block() callers out.

And the comment for ext3_get_blocks_handle() is:

/*
 * Allocation strategy is simple: if we have to allocate something, we will
 * have to go the whole way to leaf. So let's do it before attaching anything
 * to tree, set linkage between the newborn blocks, write them if sync is
 * required, recheck the path, free and repeat if check fails, otherwise
 * set the last missing link (that will protect us from any truncate-generated
...

Reading further down the source, you will see the mutex_lock() (taken
where multiblock allocation is needed); after the lock is taken, all the
block allocation, merging, etc. is done:

        /* Next simple case - plain lookup or failed read of indirect block */
        if (!create || err == -EIO)
                goto cleanup;

        mutex_lock(&ei->truncate_mutex);
<snip>
        count = ext3_blks_to_allocate(partial, indirect_blks,
                                        maxblocks, blocks_to_boundary);
<snip>
        err = ext3_alloc_branch(handle, inode, indirect_blks, &count, goal,
                                offsets + (partial - chain), partial);
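
The shape of that code (return early for a plain lookup, take the mutex
only when something must be allocated, then recheck) can be sketched in
userspace as below. This is only an illustration under my own
assumptions, not the ext3 implementation: struct toy_inode, NBLOCKS,
next_free and toy_get_block() are invented names, a flat array replaces
the indirect block tree, and a pthread mutex replaces the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define NBLOCKS 8               /* hypothetical size of the toy block map */

/* Toy per-inode state; 0 in map[] means "no block mapped yet". */
struct toy_inode {
        pthread_mutex_t truncate_mutex; /* named after the ext3 lock */
        _Atomic long map[NBLOCKS];
        long next_free;                 /* next physical block to hand out */
};

/*
 * Return the physical block for logical block lblk, or 0 if it is
 * unmapped and create is 0. Allocation happens only under
 * truncate_mutex.
 */
static long toy_get_block(struct toy_inode *ino, int lblk, int create)
{
        long blk;

        assert(lblk >= 0 && lblk < NBLOCKS);

        /* Simple case: plain lookup, no lock taken. */
        blk = atomic_load(&ino->map[lblk]);
        if (blk != 0 || !create)
                return blk;

        pthread_mutex_lock(&ino->truncate_mutex);
        /* Recheck: another caller may have mapped it before we got the lock. */
        blk = atomic_load(&ino->map[lblk]);
        if (blk == 0) {
                blk = ino->next_free++;
                atomic_store(&ino->map[lblk], blk);
        }
        pthread_mutex_unlock(&ino->truncate_mutex);
        return blk;
}
```

Readers never wait on the mutex here; only callers that actually need to
allocate are serialized against each other.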

> It's working fine currently with i_mutex, meaning if we hold an i_mutex

As far as I know, i_mutex is used when modifying the inode's structural
information.

Grep for i_mutex in fs/ext3/ioctl.c: every time the inode's structural
info needs to be updated, i_mutex is taken.

> lock on the inode while updating the i_data pointers
> and try to perform I/O from user space, the requests are queued. The
> file was opened in r/w mode prior to taking the lock inside the kernel.
>
> But, I still feel i_alloc_sem would be the right option to go ahead with.
>
> On Mon, Jan 12, 2009 at 1:11 PM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
>> If you grep for spinlock, mutex, or "sem" in the fs/ext4 directory, you
>> can find all three types of lock in use, each for a different class of
>> object.
>>
>> For data blocks I guess it is a semaphore. Read this in
>> fs/ext4/inode.c:ext4_get_branch():
>>
>> /**
>>  *      ext4_get_branch - read the chain of indirect blocks leading to data
>> <snip>
>>  *
>>  *      Need to be called with
>>  *      down_read(&EXT4_I(inode)->i_data_sem)
>>  */
>>
>> I guess you have no choice: as it is a semaphore, you have to follow
>> the rest of the kernel for consistency. Don't create your own
>> semaphore :-).
>>
>> There is also i_lock, a spinlock, which as far as I know is for
>> i_blocks counting purposes:
>>
>>        spin_lock(&inode->i_lock);
>>        inode->i_blocks += tmp_inode->i_blocks;
>>        spin_unlock(&inode->i_lock);
>>        up_write(&EXT4_I(inode)->i_data_sem);
>>
>> But for data it should be i_data_sem.   Is that correct?
>>
>> On Mon, Jan 12, 2009 at 2:18 PM, Rohit Sharma <imreckless@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I am having some issues in locking inode while copying data blocks.
>>> We are trying to keep file system live during this operation, so
>>> both read and write operations should work.
>>> In this case what type of lock on inode should be used, semaphore,
>>> mutex or spinlock?
>>>
>>>
>>> On Sun, Jan 11, 2009 at 8:45 PM, Peter Teoh <htmldeveloper@xxxxxxxxx> wrote:
>>>> Sorry, some mistakes. A resend:
>>>>
>>>> Here are some tips on the blockdevice API:
>>>>
>>>> http://lkml.org/lkml/2006/1/24/287
>>>> http://linux.derkeiler.com/Mailing-Lists/Kernel/2006-01/msg09388.html
>>>>
>>>> As indicated, documentation is rather sparse in this area.
>>>>
>>>> Not sure if anyone else has a summary list of the blockdevice API and
>>>> its explanation?
>>>>
>>>> Now, wrt the following "cleanup patch", I am not sure how the API will change:
>>>>
>>>> http://lwn.net/Articles/304485/
>>>>
>>>> thanks.
>>>>
>>>> On Tue, Jan 6, 2009 at 6:36 PM, Rohit Sharma <imreckless@xxxxxxxxx> wrote:
>>>>>
>>>>> I want to read data blocks from one inode
>>>>> and copy it to other inode.
>>>>>
>>>>> I mean to copy data from data blocks associated with one inode
>>>>> to the data blocks associated with other inode.
>>>>>
>>>>> Is that possible in kernel space?
>>>>> --
>>

Comments?

-- 
Regards,
Peter Teoh
