On Mon, 21 Nov 2011 17:18:24 +0800 Wu Fengguang <fengguang.wu@xxxxxxxxx> wrote: > The accounting code will be compiled in by default (CONFIG_READAHEAD_STATS=y), > and will remain inactive unless enabled explicitly with either boot option > > readahead_stats=1 > > or through the debugfs interface > > echo 1 > /debug/readahead/stats_enable It's unfortunate that these two things have different names. I'd have thought that the debugfs knob was sufficient - no need for the boot option. > The added overheads are two readahead_stats() calls per readahead. > Which is trivial costs unless there are concurrent random reads on > super fast SSDs, which may lead to cache bouncing when updating the > global ra_stats[][]. Considering that normal users won't need this > except when debugging performance problems, it's disabled by default. > So it looks reasonable to keep this debug code simple rather than trying > to improve its scalability. I may be wrong, but I don't think the CPU cost of this code matters a lot. People will rarely turn it on and disk IO is a lot slower than CPU actions and it's waaaaaaay more important to get high-quality info about readahead than it is to squeeze out a few CPU cycles. > > ... > > @@ -51,6 +62,182 @@ EXPORT_SYMBOL_GPL(file_ra_state_init); > > #define list_to_page(head) (list_entry((head)->prev, struct page, lru)) > > +#ifdef CONFIG_READAHEAD_STATS > +#include <linux/seq_file.h> > +#include <linux/debugfs.h> > + > +static u32 readahead_stats_enable __read_mostly; > + > +static int __init config_readahead_stats(char *str) > +{ > + int enable = 1; > + get_option(&str, &enable); > + readahead_stats_enable = enable; > + return 0; > +} > +early_param("readahead_stats", config_readahead_stats); Why use early_param() rather than plain old __setup()? > +enum ra_account { > + /* number of readaheads */ > + RA_ACCOUNT_COUNT, /* readahead request */ > + RA_ACCOUNT_EOF, /* readahead request covers EOF */ > + RA_ACCOUNT_CHIT, /* readahead request covers some cached pages */ I don't like chit :) "cache_hit" would be better. Or just "hit". > + RA_ACCOUNT_IOCOUNT, /* readahead IO */ > + RA_ACCOUNT_SYNC, /* readahead IO that is synchronous */ > + RA_ACCOUNT_MMAP, /* readahead IO by mmap page faults */ > + /* number of readahead pages */ > + RA_ACCOUNT_SIZE, /* readahead size */ > + RA_ACCOUNT_ASIZE, /* readahead async size */ > + RA_ACCOUNT_ACTUAL, /* readahead actual IO size */ > + /* end mark */ > + RA_ACCOUNT_MAX, > +}; > + > > ... > > +static void readahead_event(struct address_space *mapping, > + pgoff_t offset, > + unsigned long req_size, > + unsigned int ra_flags, > + pgoff_t start, > + unsigned int size, > + unsigned int async_size, > + unsigned int actual) > +{ > +#ifdef CONFIG_READAHEAD_STATS > + if (readahead_stats_enable) { > + readahead_stats(mapping, offset, req_size, ra_flags, > + start, size, async_size, actual); > + readahead_stats(mapping, offset, req_size, > + RA_PATTERN_ALL << READAHEAD_PATTERN_SHIFT, > + start, size, async_size, actual); > + } > +#endif > +} The stub should be inlined, methinks. The overhead of evaluating and preparing eight arguments is significant. I don't think the compiler is yet smart enough to save us. > > ... > > --- linux-next.orig/Documentation/kernel-parameters.txt 2011-11-21 17:08:38.000000000 +0800 > +++ linux-next/Documentation/kernel-parameters.txt 2011-11-21 17:08:51.000000000 +0800 > @@ -2251,6 +2251,12 @@ bytes respectively. Such letter suffixes > This default max readahead size may be overrode > in some cases, notably NFS, btrfs and software RAID. > > + readahead_stats[=0|1] > + Enable/disable readahead stats accounting. > + > + It's also possible to enable/disable it after boot: > + echo 1 > /sys/kernel/debug/readahead/stats_enable Can the current setting be read back? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html