These files are not in a single directory; they are in a pyramid structure. There are 15 pyramids in total, and going from top to bottom the number of subdirectories and files is multiplied by a factor of 4 at each level.

The IO is scattered all over, and this is a single-disk file system!

Since the Python application is creating the files, it is writing multiple files to multiple subdirectories at a time.

On Sat, Oct 17, 2009 at 8:02 PM, Eric Sandeen <sandeen@xxxxxxxxxx> wrote:
> Viji V Nair wrote:
>>
>> Hi,
>>
>> System : Fedora 11 x86_64
>> Current Filesystem: 150G ext4 (formatted with "-T small" option)
>> Number of files: 50 Million, 1 to 30K png images
>>
>> We are generating these files using a python programme and getting very
>> slow IO performance. While generation there is only write, no read. After
>> generation there is heavy read and no write.
>>
>> I am looking for best practices/recommendation to get a better
>> performance.
>>
>> Any suggestions on the above are greatly appreciated.
>>
>> Viji
>>
>
> I would start with using blktrace and/or seekwatcher to see what your IO
> patterns look like when you're populating the disk; I would guess that
> you're seeing IO scattered all over.
>
> How you are placing the files in subdirectories will affect this quite a
> lot; sitting in 1 directory for a while, filling with images, before moving
> on to the next directory, will probably help. Putting each new file in a
> new subdirectory will probably give very bad results.
>
> -Eric
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
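
For illustration, Eric's suggestion of filling one directory before moving on to the next could be sketched roughly as below. This is only a sketch under assumptions, not the actual generator: it assumes Python 3, and the function names `group_by_directory` and `write_directory_at_a_time`, as well as the relative paths used, are hypothetical. It also holds all pending files in memory, so with 50 million files it would have to be applied to bounded batches (for example, one pyramid at a time).

```python
import os
from collections import defaultdict


def group_by_directory(paths):
    """Group file paths by parent directory, in first-seen directory order."""
    groups = defaultdict(list)
    for path in paths:
        groups[os.path.dirname(path)].append(path)
    return groups


def write_directory_at_a_time(root, items):
    """Write (relative_path, data) pairs so that each directory is filled
    completely before the next directory is touched.

    Returns the relative paths in the order they were written.
    """
    items = list(items)
    data_by_path = dict(items)
    groups = group_by_directory(path for path, _ in items)

    written = []
    for directory, paths in groups.items():
        # Create the target directory once, then write all of its files.
        os.makedirs(os.path.join(root, directory), exist_ok=True)
        for path in paths:
            with open(os.path.join(root, path), "wb") as handle:
                handle.write(data_by_path[path])
            written.append(path)
    return written
```

Given work that arrives interleaved across subdirectories, the function reorders it so that all files for one directory are written back to back. Whether this actually improves the on-disk layout would still need to be checked with blktrace or seekwatcher, as suggested above.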