[Bug 41552] Performance of writing and reading from multiple drives decreases by 40% when going from Linux Kernel 2.6.36.4 to 2.6.37 (and beyond)

bugzilla-daemon@xxxxxxxxxxxxxxxxxxx · Mon, 22 Aug 2011 21:07:06 GMT

https://bugzilla.kernel.org/show_bug.cgi?id=41552

--- Comment #9 from Mark Petersen <mpete_06@xxxxxxxxxxx>  2011-08-22 21:07:03 ---
We were using the default CFQ scheduler.  I will change it to deadline and see
what happens.  Also, we are not using a file system to perform the writes;
we send SCSI commands directly to the devices.  We are not doing anything
special to the disks with device mapper either.  We simply write to each
drive from its own thread, one sector at a time.
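
For reference, the scheduler is selected per device through sysfs. The
following is only a minimal sketch, assuming a device named "sdb" and root
privileges; the device name and both helper names are hypothetical and not
taken from our application. The sysfs file lists the available schedulers
and marks the active one in square brackets, for example
"noop deadline [cfq]".

```python
from pathlib import Path


def active_scheduler(listing: str) -> str:
    """Return the scheduler marked with [brackets] in a sysfs listing."""
    for name in listing.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    raise ValueError(f"no active scheduler in {listing!r}")


def set_scheduler(device: str, scheduler: str,
                  sysfs_root: str = "/sys/block") -> str:
    """Select `scheduler` for `device` and return the one now active.

    `device` is a bare block device name such as "sdb" (hypothetical).
    Writing to the sysfs file requires root.
    """
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    path.write_text(scheduler)
    return active_scheduler(path.read_text())
```

Calling set_scheduler("sdb", "deadline") once for each drive under test
would apply the change to all of them; the setting does not persist across
reboots.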

I will attempt to get the trace and will add it if I can.
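
A 30-second capture of the kind requested could be scripted roughly as
below. This is a sketch under assumptions: the device path and output
basenames are hypothetical, blktrace must be installed, and it needs root
with debugfs mounted. The -d, -w and -o options are the standard blktrace
options for device, run time in seconds, and output basename.

```python
import subprocess
from typing import List


def blktrace_command(device: str, seconds: int, basename: str) -> List[str]:
    """Build a blktrace invocation that traces `device` for `seconds`."""
    return ["blktrace", "-d", device, "-w", str(seconds), "-o", basename]


def capture_trace(device: str, seconds: int, basename: str) -> None:
    """Run blktrace and raise CalledProcessError if it exits non-zero."""
    subprocess.run(blktrace_command(device, seconds, basename), check=True)
```

Running capture_trace("/dev/sdb", 30, "sdb-2.6.36.4") on the good kernel and
capture_trace("/dev/sdb", 30, "sdb-2.6.37") on the bad one would give two
traces that can be compared with blkparse.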

Thanks,
Mark

> Date: Mon, 22 Aug 2011 15:48:54 -0400
> From: vgoyal@xxxxxxxxxx
> To: mpete_06@xxxxxxxxxxx
> CC: bugme-daemon@xxxxxxxxxxxxxxxxxxx; axboe@xxxxxxxxx; linux-mm@xxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx
> Subject: Re: [Bugme-new] [Bug 41552] New: Performance of writing and reading from multiple drives decreases by 40% when going from Linux Kernel 2.6.36.4 to 2.6.37 (and beyond)
> 
> On Mon, Aug 22, 2011 at 12:24:43PM -0700, Andrew Morton wrote:
> > 
> > (switched to email.  Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> > 
> > On Mon, 22 Aug 2011 15:20:41 GMT
> > bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> > 
> > > https://bugzilla.kernel.org/show_bug.cgi?id=41552
> > > 
> > >            Summary: Performance of writing and reading from multiple
> > >                     drives decreases by 40% when going from Linux Kernel
> > >                     2.6.36.4 to 2.6.37 (and beyond)
> > >            Product: IO/Storage
> > >            Version: 2.5
> > >     Kernel Version: 2.6.37
> > >           Platform: All
> > >         OS/Version: Linux
> > >               Tree: Mainline
> > >             Status: NEW
> > >           Severity: normal
> > >           Priority: P1
> > >          Component: SCSI
> > >         AssignedTo: linux-scsi@xxxxxxxxxxxxxxx
> > >         ReportedBy: mpete_06@xxxxxxxxxxx
> > >         Regression: No
> > > 
> > > 
> > > We have an application that will write and read from every sector on a drive. 
> > > The application can perform these tasks on multiple drives at the same time. 
> > > It is designed to run on top of the Linux Kernel, which we periodically update
> > > so that we can get the latest device drivers.  When performing the last update
> > > from 2.6.33.2 to 2.6.37, we found that the performance of a set of drives
> > > decreased by some 40% (took 3 hours and 11 minutes to write and read from 5
> > > drives on 2.6.37 versus 2 hours and 12 minutes on 2.6.33.3).  I was able to
> > > determine that the issue was in the 2.6.37 Kernel as I was able to run it with
> > > the 2.6.36.4 kernel, and it had the better performance.   After seeing that I/O
> > > throttling was introduced in the 2.6.37 Kernel, I naturally suspected that. 
> > > However, by default, all the throttling was turned off (I attached the actual
> > > .config that was used to build the kernel).  I then tried to turn on the
> > > throttling and set it to a high number to see what would happen.  When I did
> > > that, I was able to reduce the time from 3 hours and 11 minutes to 2 hours and
> > > 50 minutes.  There seems to be something there that changed that is impacting
> > > performance on multiple drives.  When we do this same test with only one drive,
> > > the performance is identical between the systems.  This issue still occurs on
> > > Kernel 3.0.2.
> > > 
> > 
> > Are you able to determine whether this regression is due to slower
> > reading, to slower writing or to both?
> 
> Mark,
> 
> As your initial comment says that you see 40% regression even when block
> throttling infrastructure is not enabled, I think it is not related to
> throttling as blk_throtl_bio() is null when BLK_DEV_THROTTLING=n.
> 
> What IO scheduler are you using? Can you try switching the IO scheduler to
> deadline and see if the regression is still there? I am trying to figure out
> if it has anything to do with the IO scheduler.
> 
> What file system are you using with what options? Are you using device
> mapper to create some special configuration on multiple disks?
> 
> Also, can you take a trace (blktrace) of any one of the disks for 30 seconds,
> both without the regression and with it, and upload it somewhere?
> Staring at it might give some clues.
> 
> Thanks
> Vivek
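
As a sanity check on the "40%" figure, the run times quoted in the original
report can be compared directly. The sketch below uses only those three
timings; the helper names are made up for illustration. The size of the
regression depends on whether it is expressed as extra elapsed time or as
lost throughput.

```python
def minutes(hours: int, mins: int) -> int:
    """Convert an hours-and-minutes duration to minutes."""
    return hours * 60 + mins


def extra_time_pct(slow: int, fast: int) -> float:
    """How much longer the slow run took, relative to the fast run."""
    return (slow - fast) / fast * 100


def throughput_drop_pct(slow: int, fast: int) -> float:
    """How much throughput fell for the same amount of work."""
    return (1 - fast / slow) * 100


baseline = minutes(2, 12)    # older kernel, as reported: 132 minutes
regressed = minutes(3, 11)   # 2.6.37, throttling off: 191 minutes
throttled = minutes(2, 50)   # 2.6.37, throttling on, high limit: 170 minutes

print(round(extra_time_pct(regressed, baseline), 1))       # 44.7
print(round(throughput_drop_pct(regressed, baseline), 1))  # 30.9
print(round(extra_time_pct(throttled, baseline), 1))       # 28.8
```

So the 5-drive run takes about 45% longer on 2.6.37, which corresponds to
roughly a 31% drop in throughput, and is still about 29% longer with
throttling enabled and set to a high limit.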
