Re: [RFC PATCH 0/6] Understanding delays due to throttling under very heavy write load

On Thu, Feb 2, 2012 at 7:29 AM, Jim Schutt <jaschut@xxxxxxxxxx> wrote:
> I'm currently running 24 OSDs/server, one 1TB 7200 RPM SAS drive
> per OSD.  During a test I watch both OSD servers with both
> vmstat and iostat.
>
> During a "good" period, vmstat says the server is sustaining > 2 GB/s
> for multiple tens of seconds.  Since I use replication factor 2, that
> means that server is sustaining > 500 MB/s aggregate client throughput,
> right?  During such a period vmstat also reports ~10% CPU idle.
>
> During a "bad" period, vmstat says the server is doing ~200 MB/s,
> with lots of idle cycles.  It is during these periods that
> messages stuck in the policy throttler build up such long
> wait times.  Sometimes I see really bad periods with aggregate
> throughput per server < 100 MB/s.
>
> The typical pattern I see is that a run starts with tens of seconds
> of aggregate throughput > 2 GB/s.  Then it drops and bounces around
> 500 - 1000 MB/s, with occasional excursions under 100 MB/s.  Then
> it ramps back up near 2 GB/s again.

Hmm. 100MB/s is awfully low for this theory, but have you tried to
correlate the drops in throughput with the OSD journals running out of
space? I assume from your setup that they're sharing the disk with the
store (although it works either way), and your description makes me
think that throughput is initially constrained by sequential journal
writes but then the journal runs out of space and the OSD has to wait
for the main store to catch up (with random IO), and that sends the IO
patterns all to hell. (If you can say that random 4MB IOs are
hellish.)

I'm also curious about memory usage as a possible explanation for the
more dramatic drops.
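
For reference, the arithmetic behind the quoted figures (2 GB/s at the
disks versus 500 MB/s of client throughput) can be sketched as a rough
model. This is only a back-of-the-envelope estimate: it assumes the
journal shares the disk with the store, so every replica write hits the
disk twice, and the function and parameter names are made up for
illustration.

```python
def client_throughput_mb_s(disk_write_mb_s, replication=2,
                           journal_on_same_disk=True):
    """Estimate client throughput from one server's raw disk write rate.

    Assumption: with the journal on the same disk as the store, each
    replica write is written twice (journal first, then the store).
    """
    writes_per_replica = 2 if journal_on_same_disk else 1
    return disk_write_mb_s / (replication * writes_per_replica)


# "Good" period: ~2 GB/s reported by vmstat.
print(client_throughput_mb_s(2000))  # 500.0
# "Bad" period: ~200 MB/s reported by vmstat.
print(client_throughput_mb_s(200))   # 50.0
```

Under the same assumption, a separate journal device would halve the
write amplification seen by the data disks.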
-Greg