Re: [ext2/ext3] Re-allocation of blocks for an inode

Greg Freemyer <greg.freemyer@xxxxxxxxx> · Fri, 27 Mar 2009 17:48:14 -0400

Sandeep,

I've looked at the code and made comments below.  I suspect the issue
is this extraneous call:

       dst_bhptr = sb_bread(ohsm_sb, dest_bh.b_blocknr);

If that is actually causing a disk read of the uninitialized
destination block, it is the culprit.

I would attack that first.  If it proves a false lead, then look at my
other comments below.
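
For reference, the shares can be re-derived from the raw tick counts in
the log quoted further down.  This is only a userspace sketch, not part
of the module, and the helper name share_percent is made up for
illustration:

```c
#include <assert.h>

/* Hypothetical helper: percentage of the total loop time spent in one step. */
static double share_percent(unsigned long long part, unsigned long long total)
{
        assert(total > 0);
        return 100.0 * (double)part / (double)total;
}

/* Tick counts copied from the log quoted later in this mail. */
static const unsigned long long LOOP_TICKS     = 119897126320ULL;
static const unsigned long long SRC_READ_TICKS = 60658269700ULL;
static const unsigned long long DST_READ_TICKS = 57773094420ULL;
```

Calling share_percent(DST_READ_TICKS, LOOP_TICKS) gives roughly 48%, so
the destination read alone costs about as much as reading the source
data, and the two sb_bread calls together cover nearly all of the loop.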

Greg

On Thu, Mar 19, 2009 at 8:05 AM, Sandeep K Sinha
<sandeepksinha@xxxxxxxxx> wrote:
> Hi Greg,
>
> On Sun, Mar 15, 2009 at 9:00 AM, Greg Freemyer <greg.freemyer@xxxxxxxxx> wrote:
>> On Sat, Mar 14, 2009 at 3:37 PM, Vineet Agarwal
>> <checkout.vineet@xxxxxxxxx> wrote:
>>> Hello Greg,
>>>
>>> During relocation we are copying data block by block.
>>>
>> Vineet,
>>
>> 1) Be advised that most Linux mailing lists do not like it when you
>> top post.  Answers should follow the questions.  Look up top posting
>> on Wikipedia if you don't know what I'm talking about.
>>
>> 2) Can you add some printk calls through the module such that they
>> only print once?  Then enable timestamps on the printk output and
>> verify where all the time is going.  It just does not make sense to
>> me that we are now slower than cp.
>>
>> 3) Please post the exact kernel patch you are testing now for the full
>> block copy and inode update.  I don't want to make assumptions about
>> how you redid it.
>>
>
> So, we are.
>
> I will not be sending the complete patch since it will confuse everyone more.
> Rather, I have exported some of the ext2 functions. And have written a
> kernel module to test the time for the re-allocation of blocks for a
> file.
>
> The above code is not aware of any tier information and so on. This
> is just re-allocation code for ext2. It is not even specific to OHSM,
> but quite specific to ext2 as of now.
>
> I am just copying the realloc code here:
>
>
> for (done = 0; done < nblocks; done++) {
>                memset(&dest_bh, 0, sizeof(struct buffer_head));
>                memset(&src_bh, 0, sizeof(struct buffer_head));
>                err = ext2_get_block (src_ind, done, &src_bh, 0);

>                if (err < 0) {
>                        printk (KERN_DEBUG "\n OHSM error getting blocks ret_val = %d",err);
>                        goto unlock;
>                }
>                if (!buffer_mapped(&src_bh)){
>                        printk (KERN_DEBUG "\nHOLE ");
>                        continue;
>                }
>
>                dest_bh.b_state = 0;
>                err = ext2_get_block (dest_ind, done, &dest_bh, 1);

I think what you have is fine, but ...

Have you looked at the block layout for a file copied via "cp" and one
done via your patch?  Is the on-disk layout of the blocks equally
efficient?  If not, it could cause the slowdown.  And the fact that
you are allocating one block at a time might cause such an inefficient
layout.
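
One way to put a number on "equally efficient" is to count the
contiguous runs of physical blocks for each file.  The following is a
rough userspace sketch, assuming you already have the physical block
numbers in logical order (for example from the FIBMAP ioctl).  The
function name count_extents and the convention that 0 means a hole are
assumptions of this sketch:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the contiguous runs ("extents") in a list
 * of physical block numbers given in logical file order.  A value of 0
 * is treated as a hole and ends the current run.
 */
static size_t count_extents(const unsigned long *phys, size_t n)
{
        size_t extents = 0;
        unsigned long prev = 0;
        int have_prev = 0;
        size_t i;

        assert(phys != NULL || n == 0);
        for (i = 0; i < n; i++) {
                if (phys[i] == 0) {             /* hole: breaks the run */
                        have_prev = 0;
                        continue;
                }
                if (!have_prev || phys[i] != prev + 1)
                        extents++;              /* a new run starts here */
                prev = phys[i];
                have_prev = 1;
        }
        return extents;
}
```

A file whose blocks are 100, 101, 102, 103 counts as one run, while
100, 101, 500, 501, 300 counts as three.  If the relocated file shows
far more runs than the cp copy, layout is worth a closer look.  The
filefrag tool from e2fsprogs reports much the same information without
any code.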

>                if (err < 0) {
>                        printk (KERN_DEBUG "\n OHSM error getting blocks ret_val = %d",err);
>                        goto unlock;
>                }
>                src_bhptr = sb_bread(ohsm_sb, src_bh.b_blocknr);

Does this allow the readahead logic to work?  It seems to me that
going through ext2_get_block may be too low level.

Trouble is I don't know the ext2 and vfs code well enough to know
where the read ahead logic is implemented.

Does anyone know if sb_bread will leverage readahead?

>                if (!src_bhptr)
>                        goto unlock;
>                dst_bhptr = sb_bread(ohsm_sb, dest_bh.b_blocknr);

Do you have to do this?  Seems like it might be causing the
uninitialized block to be read from the physical disk.  If so, this is
very time consuming.
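
To illustrate the concern, here is a toy userspace model, not kernel
code.  The names toy_disk, toy_bread, toy_getblk and toy_copy are all
made up.  toy_bread stands in for sb_bread, which reads the block from
disk if it is not already up to date, and toy_getblk stands in for
sb_getblk, which hands back a buffer without issuing a read.  Whether
this is really what makes the module slow is untested:

```c
#include <assert.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16
#define TOY_NBLOCKS    8

/* Toy stand-in for a disk plus buffer cache; not kernel code. */
struct toy_disk {
        unsigned char data[TOY_NBLOCKS][TOY_BLOCK_SIZE];
        unsigned long reads;    /* number of simulated physical reads */
};

/* Like sb_bread() in spirit: fetch the block contents from "disk". */
static unsigned char *toy_bread(struct toy_disk *d, unsigned blk,
                                unsigned char *buf)
{
        assert(blk < TOY_NBLOCKS);
        memcpy(buf, d->data[blk], TOY_BLOCK_SIZE);
        d->reads++;
        return buf;
}

/* Like sb_getblk() in spirit: return a buffer without reading the disk. */
static unsigned char *toy_getblk(struct toy_disk *d, unsigned blk,
                                 unsigned char *buf)
{
        assert(blk < TOY_NBLOCKS);
        (void)d;
        return buf;
}

/* Copy nblocks from the first half of the disk to the second half. */
static void toy_copy(struct toy_disk *d, unsigned nblocks, int read_dest)
{
        unsigned char src[TOY_BLOCK_SIZE], dst[TOY_BLOCK_SIZE];
        unsigned i;

        assert(2 * nblocks <= TOY_NBLOCKS);
        for (i = 0; i < nblocks; i++) {
                unsigned to = nblocks + i;
                unsigned char *s = toy_bread(d, i, src);
                unsigned char *t = read_dest ? toy_bread(d, to, dst)
                                             : toy_getblk(d, to, dst);

                memcpy(t, s, TOY_BLOCK_SIZE);           /* overwrite whole block */
                memcpy(d->data[to], t, TOY_BLOCK_SIZE); /* simulated write-back */
        }
}
```

Copying 4 blocks with read_dest set costs 8 simulated reads, and only 4
without it, with identical results on the "disk".  In the real module
the equivalent change would be sb_getblk for the destination, with
set_buffer_uptodate on the buffer after the memcpy, since the whole
block is overwritten anyway.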

>                if (!dst_bhptr)
>                        goto unlock;
>                lock_buffer(dst_bhptr);
>                memcpy(dst_bhptr->b_data,src_bhptr->b_data,src_bhptr->b_size);
>                unlock_buffer(dst_bhptr);
>
>                mark_buffer_dirty(dst_bhptr);
>                brelse(src_bhptr);
>                brelse(dst_bhptr);
>        }
>
>
>
> Now, the logs for a 512 MB file being tested.
>
> Here for the loop:
>
> The loop is taking 119897126320 ticks. Considering loop time as 100%,
>
> ext2_sync_inode = 778430
> memset ( both instances included) = 15102500
> memcpy = 693354060  = 00.57%
> source sb_bread = 60658269700 = 50.59%
> dest sb_bread = 57773094420 = 48.18%
> Source ext2_get_block = 178310240 = 00.148%
> Dest ext2_get_block = 391731590
>
>
> The output of the following command:
> time ./insmod inum='some_value'
>
> real:  1m50.000s
> user:  0m0.004s
> sys:   0m19.437s
>
> Whereas a dd of that same file takes:
>
> [/mnt]
> [17:30:51 sinhas]$ sudo dd if=/mnt/test of=/mnt/test1 count=1000000 (file size)
> 1000000+0 records in
> 1000000+0 records out
> 512000000 bytes (512 MB) copied, 27.6875 s, 18.5 MB/s
>
> CP takes:
>
> [17:32:28 sinhas]$ sudo time cp ./test ./test2
> 0.03user 2.13system 0:28.09elapsed 7%CPU (0avgtext+0avgdata 0maxresident)k
> 1001048inputs+1001184outputs (0major+244minor)pagefaults 0swaps
> [/mnt]
> [17:33:13 sinhas]$
>
>
> I have unmounted and remounted the FS in between all the operations.
>
> Looking at the above stats:
> sb_bread is eating up most of the time, I am looking into it.
>
>> Thanks
>> Greg
>> --
>> Greg Freemyer
>> Head of EDD Tape Extraction and Processing team
>> Litigation Triage Solutions Specialist
>> http://www.linkedin.com/in/gregfreemyer
>> First 99 Days Litigation White Paper -
>> http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
>>
>> The Norcross Group
>> The Intersection of Evidence & Technology
>> http://www.norcrossgroup.com
>>
>
>
>
> --
> Regards,
> Sandeep.
>
> “To learn is to change. Education is a process that changes the learner.”
>
