Re: high throughput storage server?

On Sun, Feb 27, 2011 at 3:30 PM, Ed W <lists@xxxxxxxxxxxxxx> wrote:
> Your application appears to be an implementation of a queue processing
> system?  i.e., each machine pulls a file down, processes it, gets the next
> file, etc.?

Sort of.  It's not so much "each machine" as it is "each job".  A
machine can have multiple jobs.

At this point I don't know the specifics of the jobs; that is, I'm not
sure whether a job reads a bunch of files at once and then processes
them, or reads one file and then processes it (as you described).

> Can you share some information on
> - the size of files you pull down (I saw something in another post)

They vary; they can be anywhere from about 100 MB to a few TB.
Average is probably on the order of a few hundred MB.

> - how long each machine takes to process each file

I'm not sure how long a job takes to process a file; I'm trying to get
these answers from the people who design and run the jobs.

> - whether there is any dependency between the processing machines? e.g. can
> each machine operate completely independently of the others and start its
> job when it wishes (or does it need to sync?)

I'm fairly sure the jobs are independent.

> Given the tentative assumption that
> - processing each file takes many multiples of the time needed to download
> the file, and
> - files are processed independently
>
> It would appear that you can use a much lower powered system to basically
> push jobs out to the processing machines in advance, this way your bandwidth
> basically only needs to be:
>    size_of_job * num_machines / time_to_process_jobs
>
> So if the time to process jobs is significant then you have quite some time
> to push out the next job to local storage ready?
>
> Firstly is this architecture workable?  If so then you have some new
> performance parameters to target for the storage architecture?
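
To put rough numbers on the formula above, here is a minimal Python
sketch.  The function name and every input value are placeholders made
up for illustration; none of them are measurements from the actual jobs.

```python
def required_bandwidth_mb_per_s(job_size_mb, num_workers, process_time_s):
    """Sustained bandwidth needed to hand every worker its next job in time.

    Implements size_of_job * num_machines / time_to_process_jobs from the
    quoted text; num_workers can be machines or concurrent jobs.
    """
    if process_time_s <= 0:
        raise ValueError("process_time_s must be positive")
    return job_size_mb * num_workers / process_time_s


# Hypothetical inputs: 300 MB per job, 50 workers, 10 minutes per job.
estimate = required_bandwidth_mb_per_s(job_size_mb=300, num_workers=50,
                                       process_time_s=600)
print(f"{estimate:.1f} MB/s")
```

The estimate only holds under the two assumptions stated in the quoted
text: processing takes much longer than the download, and the jobs are
independent.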

That might be workable, but it would require me (or someone) to
develop and deploy the job dispatching system.  That is certainly
doable, but it might meet some "political" resistance.  My boss
basically said, "find a system to buy or spec out a system to build
that meets [the requirements I've mentioned in this and other
emails]."
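
For reference, the worker side of such a dispatcher could be fairly
small.  The following is only a sketch of the "fetch the next file while
the current one is processed" idea, not a description of how the real
jobs behave: run_worker, fetch, process and prefetch_depth are all
hypothetical names, with fetch standing in for the copy to local storage
and process for the actual job.

```python
import queue
import threading


def run_worker(job_names, fetch, process, prefetch_depth=1):
    """Process jobs in order while a background thread fetches upcoming ones."""
    # A bounded queue limits how far ahead the fetcher can run.
    ready = queue.Queue(maxsize=prefetch_depth)
    done = object()  # sentinel marking the end of the job list

    def fetcher():
        try:
            for name in job_names:
                ready.put(fetch(name))  # blocks while the queue is full
        finally:
            # Always signal completion so the consumer cannot wait forever;
            # a real system would also report a failed fetch.
            ready.put(done)

    threading.Thread(target=fetcher, daemon=True).start()

    results = []
    while True:
        item = ready.get()
        if item is done:
            break
        results.append(process(item))
    return results


# Stand-in callables: "fetching" upper-cases the name, "processing" tags it.
print(run_worker(["a", "b", "c"], fetch=str.upper, process=lambda d: d + "!"))
```

Whether this helps depends on the open question above: it only pays off
if a job spends much longer processing a file than fetching it.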

