On Sun, 02 Feb 2003, Andrew Morton wrote:

> > Here's the ext3 part. On another machine I got the following times to
> > build that list of 530,000 tokens, starting by creating empty db files
> > on a partition mounted as:
>
> How large are the generated output files?

A few tens of MB, around 30.

> > Some further findings:
> >
> > o It doesn't matter if the source (mbox) file is on ext3/ordered or
> >   ext2; the difference in time is insignificant.
> >
> > o Just processing a message to classify it takes about four times as
> >   long if the .db files are on ext3/ordered as it does if the .db
> >   files are on ext2.
>
> I would be suspecting that the database is opening the files with O_SYNC or
> is running fsync or such.

Maybe. The graphs on my web site were created by running strace against
bogofilter, taking the pwrite() offsets, dividing them by 4096 to give
the "page number", and plotting the page numbers over the line number in
the strace output. There is exactly one fsync(), and it directly
precedes the close().

In either case, the data base file is opened with O_RDWR|O_LARGEFILE, no
O_SYNC (I straced to check this). No fdatasync().

> > o Dumping the tokens and counts from the database in text form and
> >   reloading them into a new database file is not subject to serious
> >   performance problems; on the machine that needs 24 minutes to
> >   build from a 200-Mb mbox, rebuilding the database from a list of
> >   tokens took eight seconds -- this was on ext3 in ordered mode.
>
> How does this operation differ from the operation which is "slow"?

The access pattern changes a lot, because the data base is dumped in
traversal order, which gives reinserting the tokens into a fresh tree
MUCH better data locality.
Most writes are then in sequential order with respect to the file
offset, with some excursions to offset #0 and #4096 (pages #0 and #1),
as you can see on

http://mandree.home.pages.de/bogofilter/bogoutil.png   <- write positions
http://mandree.home.pages.de/bogofilter/bogoutil-f.png <- page frequency

There are also fewer write accesses altogether.

> > These data were obtained on machines running linux kernels
> > 2.4.21-pre3-ac4 and -ac5 and 2.4.21-pre4-ac1; kernel 2.4.20-ac2 appears
> > to give similar results though this has not been thoroughly tested.
> > Results like those reported were initially obtained with db-3.1.17; the
> > tests shown here used db-4.1.25.
> >
> > More info available on request; tuning hints most gratefully received
> > and tested.
>
> If you can suggest an easy way in which I can reproduce this, that would be
> efficient.

If it's acceptable for you to build the current bogofilter package
http://bogofilter.sourceforge.net/ then Greg could provide you with a
Perl script to create a proper input.

If that's too much of an effort for you, which I'd perfectly understand,
just say so and I'll ask Greg to send me an strace of his "slow" program
and create a short monolithic C or perhaps Perl program that exactly
reproduces the scattered pwrite() sequence pattern we observe in our
application.

Are you aware of a module that applies to recent kernel versions and
that traces block numbers of ll_rw_block()? It might turn up some useful
information -- then we'd more easily know the ext3 "output" to the hard
disk; we already know the "input" from the application at the syscall
level.

BTW: what's the status of the dirsync patches with respect to
2.4.21-pre? Is further testing needed or just a "ping" to get them
merged? Or does Marcelo not want the patch?

-- 
Matthias Andree
_______________________________________________
Ext3-users@redhat.com
https://listman.redhat.com/mailman/listinfo/ext3-users
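
To make the strace post-processing and the proposed reproducer more concrete, here is a minimal sketch in Python (not the C or Perl program mentioned above, and not the script that produced the graphs). It assumes strace lines of the form `pwrite(3, "..."..., 4096, 8192) = 4096`; the function names and the 4096-byte page size constant are illustrative choices. It extracts (offset, length) pairs, turns offsets into page numbers, and replays the same write pattern into a scratch file with a single fsync() right before close().

```python
import os
import re

PAGE_SIZE = 4096  # divisor used for the "page number" in the graphs

# Assumed strace line shape: pwrite(FD, "data"..., COUNT, OFFSET) = RESULT
# Failed calls ("= -1 E...") do not match because the result must be digits.
PWRITE_RE = re.compile(r'pwrite(?:64)?\((\d+), .*, (\d+), (\d+)\)\s+= \d+')


def parse_pwrites(lines):
    """Return a list of (offset, length) for each successful pwrite() line."""
    writes = []
    for line in lines:
        m = PWRITE_RE.search(line)
        if m:
            writes.append((int(m.group(3)), int(m.group(2))))
    return writes


def page_numbers(writes):
    """Map each write to the page its offset falls in, in trace order."""
    return [offset // PAGE_SIZE for offset, _length in writes]


def replay(writes, path):
    """Replay the write pattern into `path` with zero-filled buffers.

    Mirrors what the trace showed: no O_SYNC on open, and exactly one
    fsync() directly before close().
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for offset, length in writes:
            os.pwrite(fd, b"\0" * length, offset)
        os.fsync(fd)
    finally:
        os.close(fd)
```

Feeding it the lines of an strace log and timing replay() on an ext2 and an ext3/ordered partition would exercise only the write pattern, without Berkeley DB in the picture. Whether that alone reproduces the slowdown is exactly what would need testing.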