Re: Possible to change chunk size on RAID-1 without re-init or destructive result?

On Sun, Mar 31, 2013 at 8:56 AM, Stan Hoeppner <stan@xxxxxxxxxxxxxxxxx> wrote:
> On 3/27/2013 5:18 PM, Mark Knecht wrote:
<SNIP>
>> Is there a way for me to measure, say over a whole day or some fixed
>> time, what the workload really looks like?
>
> That's not the way to go about this.
>
OK

>> The machine is a basic Gentoo desktop machine running KDE. The only
>> workload where I really care about performance is that I run a bunch
>> of Virtualbox Win 7 & Win XP VMs where I need the performance to be
>> as good as I can reasonably get. The problem I have is these VMs are
>> either 1 huge file (40-50GB in a single file) or many 2GB files. I
>> haven't a clue how Windows & Virtualbox are accessing what they see as a
>> virtual drive and then underlying that how the vbox drivers are using
>> the system to get to the RAID.
>
> So you have a bunch of Windows VM guests that write to large sparse
> files residing on what, EXT4?  NTFS block size is 4KB so that's your
> smallest IO.
>

Currently EXT3, since that is what I started with 2 years ago and I
have never changed it. I'm open to EXT4 if this discussion shows me it
warrants the work. Would rather not deal with anything more exotic
right now.

>> It would be interesting to set some program running, probably on a
>> weekend or sometime when performance isn't so critical, and see what
>> sort of data gets collected, assuming there's a program that does that
>> sort of thing.
>
> Again, that's not the way to approach this.  What would be informative
> to know is what applications you're running in these Windows VMs.  The
> application dictates the write pattern.  You don't need a "collector" to
> tell you that.  You just need to know the application(s).  If you're
> just running productivity apps (web/mail/pdf/etc) inside these VMs then
> there's nothing to optimize WRT RAID stripe parameters as you have no
> sustained write IO.  So what are the Windows apps?

Currently 3 VMs, but only 2 matter for performance. The one that
doesn't matter is a VMware Player VM used for things like watching
Netflix & Hulu. Nothing much more than that. 1 CPU core dedicated. CPU
usage is generally low. I haven't paid much attention to disk usage
for this VM but will check it out.

Performance VMs:

1) This first VM primarily runs TradeStation, a rules-based trading
platform for trading stocks & futures. I generally run it with 2-4 CPU
cores and it almost never uses much computational power. The big deal in
this VM is stock data caching with years or even decades of data for
each stock or futures contract. Currently this cache appears to be
sitting in a single file which is about 3GB in size. This data streams
into the VM over the net when the markets are open (pretty much 24/7)
and the cache grows. Depending on the type of market and chart the
data might be as fine grained as each individual trade taking place
that day, or it might only be updated once every bar. (1 minute bar, 5
minute bar, daily bar, etc.) TradeStation reads the cache as it needs
data. I have no idea what the access looks like in real time but
generally I expect that it's accessing the data in date order. Whether
the data is sorted or not in this cache file I have no idea.

2) This second VM is more computational in nature. It primarily runs
two apps for long periods of time, although I don't think either app
is all that disk intensive. Both apps read market data once from disk,
cache it in memory and then compute for hours to days depending on
what I'm asking them to do. I will say I don't see a lot of disk
activity lights when either of these programs is running.

- Adaptrade Builder - a genetic optimization program that attempts to
generate TradeStation EasyLanguage trading strategies. I believe that
once it has the market data in memory it's using memory and disk to
store interesting strategies for me to look at later. The output of
the program is generally a single file ranging in size from 1MB to
maybe 50MB.

- TradingSolutions - a neural network program that attempts to
generate neural network models for trading markets. Each instance of
this program (I typically run 2-3 instances) generally has access to
one file sized 25MB-200MB plus a lot (50-100) of small files under 20K
in size. I have no idea how often any of these files are read or
written. The program runs for hours doing its work.

I suppose there are other things that happen in the VMs. I run Excel a
lot, but it's not a lot of data.
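
Since I don't know what the access pattern really looks like for any
of these, here is a rough sketch of how average request sizes could be
pulled from /proc/diskstats on the host. It is only an illustration
under some assumptions: "md0" is a placeholder device name, the
function names are made up for this example, and it relies on the
kernel's documented field layout (reads completed, sectors read,
writes completed, sectors written, with sectors counted in 512-byte
units). It reports averages only, so it says nothing about how
sequential or random the IO is.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats always counts 512-byte sectors


def parse_diskstats(text):
    """Map device name -> (reads, sectors_read, writes, sectors_written)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        stats[fields[2]] = (
            int(fields[3]),  # reads completed
            int(fields[5]),  # sectors read
            int(fields[7]),  # writes completed
            int(fields[9]),  # sectors written
        )
    return stats


def avg_io_kib(before, after, dev):
    """Average read and write request size in KiB between two snapshots."""
    r0, sr0, w0, sw0 = before[dev]
    r1, sr1, w1, sw1 = after[dev]
    reads, writes = r1 - r0, w1 - w0
    avg_r = (sr1 - sr0) * SECTOR_BYTES / 1024 / reads if reads else 0.0
    avg_w = (sw1 - sw0) * SECTOR_BYTES / 1024 / writes if writes else 0.0
    return avg_r, avg_w


def sample(dev="md0", seconds=60, path="/proc/diskstats"):
    """Take two snapshots `seconds` apart and return the averages for dev."""
    with open(path) as fh:
        before = parse_diskstats(fh.read())
    time.sleep(seconds)
    with open(path) as fh:
        after = parse_diskstats(fh.read())
    return avg_io_kib(before, after, dev)
```

Calling sample("md0", 3600) while the VMs are busy would give one pair
of numbers per hour. If the averages stay small, that would suggest
there is no sustained large-block write IO to tune for.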

Hopefully that gives you enough info to suggest a direction.

Thanks,
Mark