Ed W put forth on 2/27/2011 3:30 PM:
> Your application appears to be an implementation of a queue processing
> system? ie each machine: pulls a file down, processes it, gets the next
> file, etc?
>
> Can you share some information on
> - the size of files you pull down (I saw something in another post)
> - how long each machine takes to process each file
> - whether there is any dependency between the processing machines? eg
>   can each machine operate completely independently of the others and
>   start its job when it wishes (or does it need to sync?)
>
> Given the tentative assumption that
> - processing each file takes many multiples of the time needed to
>   download the file, and
> - files are processed independently
>
> It would appear that you can use a much lower powered system to
> basically push jobs out to the processing machines in advance, this way
> your bandwidth basically only needs to be:
>   size_of_job * num_machines / time_to_process_jobs
>
> So if the time to process jobs is significant then you have quite some
> time to push out the next job to local storage ready?
>
> Firstly is this architecture workable? If so then you have some new
> performance parameters to target for the storage architecture?
>
> Good luck

Ed, you stated this thought much more thoroughly and eloquently than I
did in my last rambling post. Thank you.

-- 
Stan
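
For anyone who wants to plug numbers into Ed's estimate, here is a rough sketch of the formula in Python. The variable names follow Ed's expression; the units (bytes and seconds) and the example figures are assumptions made up for illustration, not measurements from this thread. It only gives the average rate needed to keep every machine supplied with its next job, and it assumes jobs are independent and pushed out ahead of time as Ed describes.

```python
def required_bandwidth(size_of_job, num_machines, time_to_process_jobs):
    """Average bandwidth (bytes/s) needed to stage the next job on every machine.

    size_of_job          -- bytes per job (assumed unit)
    num_machines         -- number of processing machines
    time_to_process_jobs -- seconds a machine spends on one job (assumed unit)
    """
    if time_to_process_jobs <= 0:
        raise ValueError("time_to_process_jobs must be positive")
    return size_of_job * num_machines / time_to_process_jobs


if __name__ == "__main__":
    # Hypothetical figures: 1 GB jobs, 50 machines, 10 minutes per job.
    rate = required_bandwidth(1_000_000_000, 50, 600)
    print(f"{rate / 1e6:.1f} MB/s")  # prints "83.3 MB/s"
```

With those made-up figures the feeder needs to sustain only about 83 MB/s on average, which is the point of Ed's argument: the longer each job takes to process, the less the storage back end has to deliver.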