On Sun, Oct 18, 2009 at 9:04 PM, Eric Sandeen <sandeen@xxxxxxxxxx> wrote:
> Viji V Nair wrote:
>>
>> On Sun, Oct 18, 2009 at 3:56 AM, Theodore Tso <tytso@xxxxxxx> wrote:
>>>
>>> On Sat, Oct 17, 2009 at 11:26:04PM +0530, Viji V Nair wrote:
>>>>
>>>> these files are not in a single directory, this is a pyramid
>>>> structure. There are a total of 15 pyramids, and coming down from top
>>>> to bottom the sub directories and files are multiplied by a factor of 4.
>>>>
>>>> The IO is scattered all over!!!! and this is a single disk file system.
>>>>
>>>> Since the python application is creating files, it is creating
>>>> multiple files in multiple sub directories at a time.
>>>
>>> What is the application trying to do, at a high level?  Sometimes it's
>>> not possible to optimize a filesystem against a badly designed
>>> application.  :-(
>>
>> The application is reading the GIS data from a data source and
>> plotting the map tiles (256x256, png images) for different zoom
>> levels. The tree output of the first zoom level is as follows
>>
>> /tiles/00
>> `-- 000
>>     `-- 000
>>         |-- 000
>>         |   `-- 000
>>         |       `-- 000
>>         |           |-- 000.png
>>         |           `-- 001.png
>>         |-- 001
>>         |   `-- 000
>>         |       `-- 000
>>         |           |-- 000.png
>>         |           `-- 001.png
>>         `-- 002
>>             `-- 000
>>                 `-- 000
>>                     |-- 000.png
>>                     `-- 001.png
>>
>> in each zoom level the fourth level directories are multiplied by a
>> factor of four. Also the number of png images is multiplied by the
>> same number.
>>>
>>> It sounds like it is generating files distributed in subdirectories in
>>> a completely random order.  How are the files going to be read
>>> afterwards?  In the order they were created, or some other order
>>> different from the order in which they were read?
>>
>> The applications we are using are modified versions of mapnik and
>> tilecache. These are single threaded, so we are running 4 processes at
>> a time. We can say only four images are created at a single point of
>> time. Sometimes a single image takes around 20 sec to create. I
>> can see lots of system resources are free, memory, processors etc
>> (these are 4G, 2 x 5420 XEON)
>>
>> I have checked the delay in the backend data source, it is on a 12Gbps
>> LAN and no delay at all.
>
> The delays are almost certainly due to the drive heads seeking like mad as
> they attempt to write data all over the disk; most filesystems are designed
> so that files in subdirectories are kept together, and new subdirectories
> are placed at relatively distant locations to make room for the files they
> will contain.
>
> In the past I've seen similar applications also slow down due to new inode
> searching heuristics in the inode allocator, but that was on ext3 and ext4
> is significantly different in that regard...
>
>> These images are also read in the same manner.
>>
>>> With sufficiently bad access patterns, there may not be a lot you
>>> can do, other than (a) throw hardware at the problem, or (b) fix or
>>> redesign the application to be more intelligent (if possible).
>>>
>>>                                              - Ted
>>>
>>
>> The file system is created with "-i 1024 -b 1024" for a larger inode
>> count; 50% of the total images are less than 10KB. I have disabled
>> access time and given a large value to the commit also. Do you have
>> any other recommendation for the file system creation?
>
> I think you'd do better to change, if possible, how the application behaves.
>
> I probably don't know enough about the app but rather than:
>
> /tiles/00
> `-- 000
>     `-- 000
>         |-- 000
>         |   `-- 000
>         |       `-- 000
>         |           |-- 000.png
>         |           `-- 001.png
>
> could it do:
>
> /tiles/00/000000000000000000.png
> /tiles/00/000000000000000001.png
>
> ...
>
> for example?  (or something similar)
>
> -Eric

The tilecache application is creating this directory structure, so we
need to change both it and our application to use a new directory tree.
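
A minimal sketch, in Python, of the flat naming Eric suggests above. It
assumes every tile sits exactly five directories below its zoom-level
directory, as in the tree shown, and that joining those five directory
names with the file name gives an acceptable flat name. The helper
flatten_tile_path is hypothetical; it is not part of tilecache or mapnik.

```python
from pathlib import PurePosixPath


def flatten_tile_path(nested: str) -> str:
    """Map <zoom dir>/aaa/bbb/ccc/ddd/eee/fff.png to <zoom dir>/aaabbbcccdddeeefff.png.

    Assumption (not confirmed by the thread): the five directories directly
    above the file, plus the file stem, together identify the tile.
    """
    path = PurePosixPath(nested)
    parts = path.parts
    # Need at least: zoom directory + five sub directories + file name.
    if len(parts) < 7:
        raise ValueError(f"path too short to be a nested tile path: {nested!r}")

    zoom_dir = PurePosixPath(*parts[:-6])   # e.g. /tiles/00
    sub_dirs = parts[-6:-1]                 # the five nested directory names
    flat_name = "".join(sub_dirs) + path.stem + path.suffix
    return str(zoom_dir / flat_name)


if __name__ == "__main__":
    print(flatten_tile_path("/tiles/00/000/000/000/000/000/001.png"))
```

Run on the second tile in the tree above, this prints the second path
from Eric's example. Whether one very large directory per zoom level
behaves better than the nested tree on this disk would still have to be
measured.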
>
>> Viji