- statistics-infrastructure.patch removed from -mm tree

akpm@xxxxxxxxxxxxxxxxxxxx · Fri, 24 Aug 2007 22:19:18 -0700

The patch titled
     statistics infrastructure
has been removed from the -mm tree.  Its filename was
     statistics-infrastructure.patch

This patch was dropped because it isn't in the present -mm lineup

------------------------------------------------------
Subject: statistics infrastructure
From: Martin Peschke <mp3@xxxxxxxxxx>

Add statistics infrastructure as common code.

Signed-off-by: Martin Peschke <mp3@xxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---
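
Note for readers: the five-line "utilisation" output described in the
Documentation/statistics.txt hunk below (samples, minimum, average, maximum,
variance) can be modelled in userspace roughly as follows. This is only a
sketch under stated assumptions: struct util_model and the util_model_*()
helpers are hypothetical names that do not appear in the patch, the variance
is assumed to be the population variance E[X^2] - E[X]^2, and floating point
is used for brevity, which kernel code would avoid. It is not the
implementation in lib/statistic.c.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical userspace model of a "utilisation" statistic (not kernel code). */
struct util_model {
	uint64_t num;	/* total of Y ("samples") */
	int64_t acc;	/* sum of X * Y */
	int64_t sqr;	/* sum of X * X * Y (overflow is ignored in this sketch) */
	int64_t min;	/* smallest X seen so far */
	int64_t max;	/* largest X seen so far */
};

/* Discard all accumulated data. */
static void util_model_reset(struct util_model *u)
{
	u->num = 0;
	u->acc = 0;
	u->sqr = 0;
	u->min = INT64_MAX;
	u->max = INT64_MIN;
}

/* Account for an (X, Y) pair: value X was observed incr (Y) times. */
static void util_model_add(struct util_model *u, int64_t value, uint64_t incr)
{
	if (!incr)
		return;
	u->num += incr;
	u->acc += value * (int64_t)incr;
	u->sqr += value * value * (int64_t)incr;
	if (value < u->min)
		u->min = value;
	if (value > u->max)
		u->max = value;
}

static double util_model_average(const struct util_model *u)
{
	return u->num ? (double)u->acc / (double)u->num : 0.0;
}

/* Population variance (an assumption; the patch text does not say which). */
static double util_model_variance(const struct util_model *u)
{
	double avg;

	if (!u->num)
		return 0.0;
	avg = util_model_average(u);
	return (double)u->sqr / (double)u->num - avg * avg;
}

/* Print one line per value, mirroring the documented output layout. */
static void util_model_print(const struct util_model *u, const char *name)
{
	printf("%s samples %" PRIu64 "\n", name, u->num);
	printf("%s minimum %" PRId64 "\n", name, u->num ? u->min : 0);
	printf("%s average %.3f\n", name, util_model_average(u));
	printf("%s maximum %" PRId64 "\n", name, u->num ? u->max : 0);
	printf("%s variance %.3f\n", name, util_model_variance(u));
}
```

Feeding the pairs (2, 1), (4, 1) and (6, 2) into this model yields four
samples with minimum 2 and maximum 6. Keeping only running sums means the
data area stays the same size no matter how many samples arrive.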

 Documentation/statistics.txt |  146 ++-
 MAINTAINERS                  |    7 
 arch/s390/Kconfig            |    2 
 arch/s390/oprofile/Kconfig   |    5 
 include/linux/jiffies.h      |    2 
 include/linux/statistic.h    |  281 +++++
 lib/Kconfig.statistic        |   11 
 lib/Makefile                 |    2 
 lib/statistic.c              | 1564 +++++++++++++++++++++++++++++++++
 9 files changed, 1978 insertions(+), 42 deletions(-)

diff -puN Documentation/statistics.txt~statistics-infrastructure Documentation/statistics.txt

--- a/Documentation/statistics.txt~statistics-infrastructure
+++ a/Documentation/statistics.txt
@@ -33,7 +33,7 @@ kernel code as well as users.
  USER	       :			KERNEL
 	       :
 	  user	       statistics	     programming
-	  interface    infrastructure	     interface	  exploiter
+	  interface    infrastructure	     interface	  client
 	       :       +------------------+	  :	  +-----------------+
 	       :       | process data and |	  :	  | collect and     |
  "data"        :       | provide output   |	(X, Y)	  | report data     |
@@ -62,13 +62,13 @@ compute and store, as well as display st
 current settings.
 
 
-	The role of exploiters
+	The role of clients
 
-It is the exploiter's (e.g. device driver's) responsibility to feed the
+It is the client's (e.g. device driver's) responsibility to feed the
 statistics infrastructure with sampled data for the statistics maintained by the
-statistics infrastructure on behalf of the exploiter.
+statistics infrastructure on behalf of the client.
 
-It would be nice of any exploiter to provide a default configuration for each
+It would be nice of any client to provide a default configuration for each
 statistic that most likely works best for general purpose use.
 
 
@@ -85,7 +85,7 @@ a quantity for the main characteristic o
 or request latency, and with Y being a qualifier for that characteristic,
 i.e. the occurrence of a particular X-value.
 
-Thus, the Y-part can be seen as an optimisation that allows exploiters
+Thus, the Y-part can be seen as an optimisation that allows clients
 to report a bunch of similar measurements in one call (see statistic_add()).
 For the programmer's convenience, Y can be omitted when it would be always 1
 (see statistic_inc()).
@@ -95,7 +95,7 @@ For the programmer's convenience, Y can 
 
 There are two methods how such data can be provided to the statistics
 infrastructure, a push interface and a pull interface. Each statistic
-is either a pull-type or push-type statistic as determined by the exploiter.
+is either a pull-type or push-type statistic as determined by the client.
 
 The push-interface is suitable for data feeds that report incremental updates
 to statistics, and where actual accumulation can be left to the statistics
@@ -104,8 +104,8 @@ infrastructure. New measurements usually
 
 The pull-interface is suitable for data that already comes in an aggregated
 form, like hardware measurement data or counters already maintained and
-used by exploiters for other purposes. Reading statistics data from files
-triggers an optional callback of the exploiter, which can update pull-type
+used by clients for other purposes. Reading statistics data from files
+triggers an optional callback of the client, which can update pull-type
 statistics then (see statistic_set()).
 
 
@@ -131,7 +131,7 @@ according to their needs.
 
 	How statistics are organised
 
-Statistics are grouped within "interfaces" (debugfs entries) by exploiters,
+Statistics are grouped within "interfaces" (debugfs entries) by clients,
 in order to reflect collections of related statistics of an entity,
 which is also quite efficient with regard to memory use.
 
@@ -199,7 +199,11 @@ has been implemented:
   size_write 0x14000 12			|
   ...					|
   size_write 0x9000 1			/
-  queue_used_depth 970 1 18.122 32		> num min avg max for a queue
+  queue_used_depth samples 970			\
+  queue_used_depth minimum 1			|
+  queue_used_depth average 18.122		> utilisation of a queue
+  queue_used_depth maximum 32			|
+  queue_used_depth variance 53.324		/
 
 Such output can grow as needed in debugfs files. It is human-readable and
 could be parsed and postprocessed by simple scripts that are aware of what the
@@ -208,7 +212,7 @@ output of the various data processing mo
 
 	State machine
 
-Each statistic has a state that should be initialised by exploiters.
+Each statistic has a state that should be initialised by clients.
 Users probably want to adjust this state, e.g. enable
 data gathering. Defined states and transitions are:
 
@@ -219,7 +223,7 @@ data gathering. Defined states and trans
 	V
   state=released	(mode of data processing has been defined, but memory
 	A		 required for data gathering has not yet been allocated
-	|		 - would be a good default setup provided by exploiters)
+	|		 - would be a good default setup provided by clients)
 	|
 	V
   state=off		(all memory required for the defined mode of data
@@ -245,7 +249,7 @@ FIXME
 
 	Per-CPU data
 
-Measurements reported by exploiters are accumulated into per-CPU data areas
+Measurements reported by clients are accumulated into per-CPU data areas
 in order to avoid the introduction of serialisation during the
 execution of statistic_add(). Locking of per-CPU data is done by disabling
 preemption and interrupts per CPU for the short time of a statistic update.
@@ -326,6 +330,7 @@ Provides a set of values comprising:
 - the minimum X
 - the average X
 - the maximum X
+- the variance of X
 
 This appears to be a useful fill level indicator for queues etc.
 
@@ -400,7 +405,7 @@ in the source code:
 The statistics infrastructure's user interface is in the
 /sys/kernel/debug/statistics directory, assuming debugfs has been mounted at
 /sys/kernel/debug.  The "statistics" directory holds interface subdirectories
-created on the behalf of exploiters, for example:
+created on the behalf of clients, for example:
 
   drwxr-xr-x 2 root root 0 Jul 28 02:16 zfcp-0.0.50d4
 
@@ -542,18 +547,26 @@ this:
   foo 0x1000 4
   foo 0x2000 1
   foo 0x5000 2
-  bar 961 1 42.000 128
+  bar samples 961
+  bar minimum 1
+  bar average 42.000
+  bar maximum 128
+  bar variance 149.254
 
 
 	Output formats of different statistic types
 
   Statistic Type	Output Format				Number of Lines
 
-  counter_inc		<name> <total of Y>				1
+  counter_inc		<name> <total of Y>			1
 
-  counter_prod		<name> <total of Xi*Yi>				1
+  counter_prod		<name> <total of Xi*Yi>			1
 
-  utilisation		<name> <total of Y> <min X> <avg X> <max X>	1
+  utilisation		<name> "samples" <total of Y>		5
+			<name> "minimum" <minimum X>
+			<name> "average" <average X>
+			<name> "maximum" <maximum X>
+			<name> "variance" <variance of X>
 
   sparse		<name> <Xn> <total of Y for Xn>		<= entries
 			...
@@ -590,6 +603,15 @@ representing some entity, the following 
 
 stat is an array of N statistics of various sorts.
 
+An enum that helps address individual statistics of an array comes in handy:
+
+  enum my_entity_stat_num {
+	  MY_ENTITY_STAT_REFUND,
+	  MY_ENTITY_STAT_FILL,
+	  ...
+	  N
+  };
+
 Since one might want to create several instances of struct my_entity
 each coming with its own set of statistics (stat[N]) setup using the
 same template, provisions for such a template have been made as part of the
@@ -597,20 +619,22 @@ programming interface. An array of struc
 array of struct statistic.
 
   struct statistic_info[] {
-	  { "refund", "cent", "bottle", 0, "type=counter_prod" },
-	  { "fill_level", "millilitre", "bottle", 1, "type=utilisation" },
+	  [MY_ENTITY_STAT_REFUND] = {
+		  .name     = "refund",
+		  .x_unit   = "cent",
+		  .y_unit   = "bottle",
+		  .defaults = "type=counter_prod"
+	  },
+	  [MY_ENTITY_STAT_FILL] = {
+		  .name     = "fill_level",
+		  .x_unit   = "millilitre",
+		  .y_unit   = "bottle",
+		  .flags    = STATISTIC_FLAGS_NOINCR,
+		  .defaults = "type=utilisation"
+	  },
 	  ...
   } my_entity_stat_info;
 
-An enum that helps addressing individual statistics of an array comes in handy:
-
-  enum my_entitiy_stat_num {
-	  MY_ENTITY_STAT_REFUND,
-	  MY_ENTITY_STAT_FILL,
-	  ...
-	  N
-  };
-
 Now, here is how to tie the knot for statistics and templates:
 
   {
@@ -635,6 +659,33 @@ Now, here is how to tie the knot for sta
 
 	Reporting statistics data
 
+In short, this is the complete list of functions that can be used
+to update a statistic:
+
+  _statistic_add()
+  _statistic_inc()
+
+   statistic_add()
+   statistic_inc()
+
+  _statistic_add_as()
+  _statistic_inc_as()
+
+   statistic_add_as()
+   statistic_inc_as()
+
+   statistic_set()
+
+Function names starting with an "_" indicate that the function leaves it to
+the calling code to make updates smp-safe (see details below).
+
+The *statistic_*_as() functions are stripped-down versions that are faster but
+less flexible from the user's perspective (see details below).
+
+While the add/inc-functions are used for accumulating incremental statistics
+data, the set-function is used for storing statistics coming as total numbers
+(see details below).
+
 Add statistic_add*() or statistic_inc*() calls where appropriate for
 reporting statistics data. Data to be reported through these functions has the
 form of (X, Y) as explained above:
@@ -663,7 +714,7 @@ Of course, this example is not optimal. 
 statistic_inc() compare. Sometimes statistic_inc() might be just what you need.
 
 If there is a bunch of statistics to be updated in one go, consider these
-flavours of statistic_add() which require the exploiter to lock per-CPU data
+flavours of statistic_add() which require the client to lock per-CPU data
 in one go for improved performance:
 
   {
@@ -672,20 +723,43 @@ in one go for improved performance:
 	  ...
 
 	  local_irq_save(flags);
-	  statistic_inc_nolock(&one->stat, MY_ENTITY_STAT_X, x);
-	  statistic_inc_nolock(&one->stat, MY_ENTITY_STAT_Y, y);
-	  statistic_add_nolock(&one->stat, MY_ENTITY_STAT_Z, z, number);
+	  _statistic_inc(&one->stat, MY_ENTITY_STAT_X, x);
+	  _statistic_inc(&one->stat, MY_ENTITY_STAT_Y, y);
+	  _statistic_add(&one->stat, MY_ENTITY_STAT_Z, z, number);
 	  ...
 	  local_irq_restore(flags);
   }
 
+You may use the *statistic_*_as() functions instead if you feel that - for your
+purposes - the performance gain outweighs the flexibility of statistic_add() &
+friends. The *statistic_*_as() functions do not allow users to change the way
+data processing is done (that is the "type=" attribute), but require the client
+to provide this information through an additional parameter passed to the
+*statistic_*_as() functions. For example, the counter named MY_ENTITY_STAT_O
+can't be inflated to a histogram at run time.
+
+  {
+	  struct my_entity *one;
+	  unsigned long flags;
+	  ...
+
+	  local_irq_save(flags);
+	  _statistic_inc_as(STAT_CNTR_INC, &one->stat, MY_ENTITY_STAT_O, o);
+	  _statistic_add_as(STAT_UTIL, &one->stat, MY_ENTITY_STAT_P, p, number);
+	  ...
+	  local_irq_restore(flags);
+  }
+
+Make sure you have set the STATISTIC_FLAGS_NOFLEX flag for statistics
+which are fed through *statistic_*_as() functions to prohibit the alteration
+of the "type=" attribute.
+
 The above examples show statistics that feed on incremental updates that
 get accumulated by the statistics infrastructure on top of data already
 gathered by the statistics infrastructure.
-That is why statistic_add() or statistic_inc() respectively are used.
 
 There might be statistics that come as total numbers, e.g. because they feed
-on counters already maintained by the exploiter or some hardware feature.
+on counters already maintained by the client or some hardware feature.
 These numbers can be exported through the statistics infrastructure along
 with any other statistic. In this case, use statistic_set() to report data.
 Usually it is sufficient to do so when the user opens the corresponding
diff -puN MAINTAINERS~statistics-infrastructure MAINTAINERS
--- a/MAINTAINERS~statistics-infrastructure
+++ a/MAINTAINERS
@@ -3381,6 +3381,13 @@ STARMODE RADIO IP (STRIP) PROTOCOL DRIVE
 W:	http://mosquitonet.Stanford.EDU/strip.html
 S:	Unsupported ?
 
+STATISTICS INFRASTRUCTURE
+P:	Martin Peschke
+M:	mpeschke@xxxxxxxxxx
+M:	linux390@xxxxxxxxxx
+W:	http://www.ibm.com/developerworks/linux/linux390/
+S:	Supported
+
 STRADIS MPEG-2 DECODER DRIVER
 P:	Nathan Laredo
 M:	laredo@xxxxxxx
diff -puN arch/s390/Kconfig~statistics-infrastructure arch/s390/Kconfig
--- a/arch/s390/Kconfig~statistics-infrastructure
+++ a/arch/s390/Kconfig
@@ -547,6 +547,8 @@ config KPROBES
 	  for kernel debugging, non-intrusive instrumentation and testing.
 	  If in doubt, say "N".
 
+source "lib/Kconfig.statistic"
+
 endmenu
 
 source "arch/s390/Kconfig.debug"
diff -puN arch/s390/oprofile/Kconfig~statistics-infrastructure arch/s390/oprofile/Kconfig
--- a/arch/s390/oprofile/Kconfig~statistics-infrastructure
+++ a/arch/s390/oprofile/Kconfig
@@ -1,6 +1,3 @@
-
-menu "Profiling support"
-
 config PROFILING
 	bool "Profiling support"
 	help
@@ -18,5 +15,3 @@ config OPROFILE
 
 	  If unsure, say N.
 
-endmenu
-
diff -puN include/linux/jiffies.h~statistics-infrastructure include/linux/jiffies.h
--- a/include/linux/jiffies.h~statistics-infrastructure
+++ a/include/linux/jiffies.h
@@ -278,7 +278,7 @@ extern u64 nsec_to_clock_t(u64 x);
 
 #define TIMESTAMP_SIZE	30
 
-static inline int nsec_to_timestamp(char *s, unsigned long long t)
+static inline int nsec_to_timestamp(char *s, u64 t)
 {
 	unsigned long nsec_rem = do_div(t, NSEC_PER_SEC);
 	return sprintf(s, "[%5lu.%06lu]", (unsigned long)t,
diff -puN /dev/null include/linux/statistic.h
--- /dev/null
+++ a/include/linux/statistic.h
@@ -0,0 +1,281 @@
+/*
+ * include/linux/statistic.h
+ *
+ * Statistics facility
+ *
+ * (C) Copyright IBM Corp. 2005, 2006
+ *
+ * Author(s): Martin Peschke <mpeschke@xxxxxxxxxx>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ */
+
+#ifndef STATISTIC_H
+#define STATISTIC_H
+
+#include <linux/fs.h>
+#include <linux/types.h>
+#include <linux/percpu.h>
+
+/**
+ * struct statistic_info - description of a class of statistics
+ * @name: pointer to name string
+ * @x_unit: pointer to string describing unit of X of (X, Y) data pair
+ * @y_unit: pointer to string describing unit of Y of (X, Y) data pair
+ * @flags: bits describing special settings
+ * @defaults: pointer to string describing default settings for attributes
+ *
+ * Clients must set up an array of struct statistic_info for a
+ * corresponding array of struct statistic, which are then pointed to
+ * by struct statistic_interface.
+ *
+ * Struct statistic_info, all its members and the addressed strings must stay
+ * valid for the lifetime of statistics created with statistic_create().
+ *
+ * Except for the name string, all other members may be left blank.
+ * It would be nice of clients to fill it out completely, though.
+ */
+struct statistic_info {
+/* public: */
+	char *name;
+	char *x_unit;
+	char *y_unit;
+	int  flags;
+#define STATISTIC_FLAGS_NOINCR	0x01	/* no incremental data */
+#define STATISTIC_FLAGS_NOFLEX	0x02	/* type can't be altered by user */
+	char *defaults;
+};
+
+enum statistic_state {
+	STATISTIC_STATE_INVALID,
+	STATISTIC_STATE_UNCONFIGURED,
+	STATISTIC_STATE_RELEASED,
+	STATISTIC_STATE_OFF,
+	STATISTIC_STATE_ON
+};
+
+enum statistic_type {
+	STAT_CNTR_INC,
+	STAT_CNTR_PROD,
+	STAT_UTIL,
+	STAT_HGRAM_LIN,
+	STAT_HGRAM_LOG2,
+	STAT_SPARSE,
+	STAT_NONE
+};
+
+/**
+ * struct statistic - any data required for gathering data for a statistic
+ */
+struct statistic {
+/* private: */
+	enum statistic_state	 state;
+	enum statistic_type	 type;
+	void			*data;
+	void			(*add)(struct statistic *, s64, u64);
+	u64			 started;
+	u64			 stopped;
+	u64			 age;
+	union {
+		struct {
+			s64 range_min;
+			u32 last_index;
+			u32 base_interval;
+		} histogram;
+		struct {
+			u32 entries_max;
+		} sparse;
+	} u;
+};
+
+/**
+ * struct statistic_interface - collection of statistics for an entity
+ * @stat: a struct statistic array
+ * @info: a struct statistic_info array describing the struct statistic array
+ * @number: number of entries in both arrays
+ * @pull: an optional function called when user reads data from file
+ * @pull_private: optional data pointer passed to pull function
+ *
+ * Clients must set up a struct statistic_interface prior to calling
+ * statistic_create().
+ */
+struct statistic_interface {
+/* private: */
+	struct list_head	 list;
+	struct dentry		*debugfs_dir;
+	struct dentry		*data_file;
+	struct dentry		*def_file;
+/* public: */
+	struct statistic	*stat;
+	struct statistic_info	*info;
+	int			 number;
+	int			(*pull)(void*);
+	void			*pull_private;
+};
+
+#ifdef CONFIG_STATISTICS
+
+extern int statistic_create(struct statistic_interface *, const char *);
+extern int statistic_remove(struct statistic_interface *);
+
+extern void statistic_set(struct statistic *, int, s64, u64);
+
+extern void _statistic_add(struct statistic *, int, s64, u64);
+extern void statistic_add(struct statistic *, int, s64, u64);
+
+/*
+ * Clients are not supposed to call these directly.
+ * The declarations are needed to allow optimisation of _statistic_add_as()
+ * at compile time.
+ */
+extern void statistic_add_counter_inc(struct statistic *, s64, u64);
+extern void statistic_add_counter_prod(struct statistic *, s64, u64);
+extern void statistic_add_util(struct statistic *, s64, u64);
+extern void statistic_add_histogram_lin(struct statistic *, s64, u64);
+extern void statistic_add_histogram_log2(struct statistic *, s64, u64);
+extern void statistic_add_sparse(struct statistic *, s64, u64);
+
+/**
+ * _statistic_add_as - update statistic with incremental data in (X, Y) pair
+ * @type: data processing mode to be used (must match statistic_info::defaults)
+ * @stat: struct statistic array
+ * @i: index of statistic to be updated
+ * @value: X
+ * @incr: Y
+ *
+ * The actual processing of the (X, Y) data pair is determined by the current
+ * definition applied to the statistic. See Documentation/statistics.txt.
+ *
+ * This function is faster than _statistic_add() because the data
+ * processing mode is already determined at compile time.
+ * Use this when you feel that the performance gain outweighs the loss
+ * of flexibility for your particular statistic.
+ *
+ * This variant leaves protecting per-cpu data to clients. It is preferred
+ * whenever clients update several statistics of the same entity in one go.
+ *
+ * You may want to use _statistic_inc_as() for (X, 1) data pairs.
+ */
+static inline void _statistic_add_as(int type, struct statistic *stat, int i,
+				     s64 value, u64 incr)
+{
+	if (stat[i].state == STATISTIC_STATE_ON) {
+		switch (type) {
+		case STAT_CNTR_INC:
+			statistic_add_counter_inc(&stat[i], value, incr);
+			break;
+		case STAT_CNTR_PROD:
+			statistic_add_counter_prod(&stat[i], value, incr);
+			break;
+		case STAT_UTIL:
+			statistic_add_util(&stat[i], value, incr);
+			break;
+		case STAT_HGRAM_LIN:
+			statistic_add_histogram_lin(&stat[i], value, incr);
+			break;
+		case STAT_HGRAM_LOG2:
+			statistic_add_histogram_log2(&stat[i], value, incr);
+			break;
+		case STAT_SPARSE:
+			statistic_add_sparse(&stat[i], value, incr);
+			break;
+		}
+	}
+}
+
+/**
+ * statistic_add_as - update statistic with incremental data in (X, Y) pair
+ * @type: data processing mode to be used (must match statistic_info::defaults)
+ * @stat: struct statistic array
+ * @i: index of statistic to be updated
+ * @value: X
+ * @incr: Y
+ *
+ * The actual processing of the (X, Y) data pair is determined by the current
+ * definition applied to the statistic. See Documentation/statistics.txt.
+ *
+ * This function is faster than statistic_add() because the data
+ * processing mode is already determined at compile time.
+ * Use this when you feel that the performance gain outweighs the loss
+ * of flexibility for your particular statistic.
+ *
+ * This variant takes care of protecting per-cpu data. It is preferred whenever
+ * clients don't update several statistics of the same entity in one go.
+ *
+ * You may want to use statistic_inc_as() for (X, 1) data pairs.
+ */
+static inline void statistic_add_as(int type, struct statistic *stat, int i,
+				    s64 value, u64 incr)
+{
+	unsigned long flags;
+	local_irq_save(flags);
+	_statistic_add_as(type, stat, i, value, incr);
+	local_irq_restore(flags);
+}
+
+#else /* !CONFIG_STATISTICS */
+/* These NOP functions unburden clients from handling !CONFIG_STATISTICS. */
+
+static inline int statistic_create(struct statistic_interface *interface,
+				   const char *name)
+{
+	return 0;
+}
+
+static inline int statistic_remove(struct statistic_interface *interface)
+{
+	return 0;
+}
+
+static inline void statistic_set(struct statistic *stat, int i,
+				 s64 value, u64 total)
+{
+}
+
+static inline void _statistic_add(struct statistic *stat, int i,
+				  s64 value, u64 incr)
+{
+}
+
+static inline void statistic_add(struct statistic *stat, int i,
+				 s64 value, u64 incr)
+{
+}
+
+static inline void _statistic_add_as(int type, struct statistic *stat, int i,
+				     s64 value, u64 incr)
+{
+}
+
+static inline void statistic_add_as(int type, struct statistic *stat, int i,
+				    s64 value, u64 incr)
+{
+}
+
+#endif /* CONFIG_STATISTICS */
+
+#define _statistic_inc(stat, i, value) \
+	_statistic_add(stat, i, value, 1)
+
+#define statistic_inc(stat, i, value) \
+	statistic_add(stat, i, value, 1)
+
+#define _statistic_inc_as(type, stat, i, value) \
+	_statistic_add_as(type, stat, i, value, 1)
+
+#define statistic_inc_as(type, stat, i, value) \
+	statistic_add_as(type, stat, i, value, 1)
+
+#endif /* STATISTIC_H */
diff -puN /dev/null lib/Kconfig.statistic
--- /dev/null
+++ a/lib/Kconfig.statistic
@@ -0,0 +1,11 @@
+config STATISTICS
+	bool "Statistics infrastructure"
+	depends on DEBUG_FS
+	help
+	  The statistics infrastructure provides a debugfs based user interface
+	  for statistics of kernel components. Statistics are available for
+	  components that have been instrumented to feed data into the
+	  statistics infrastructure.
+	  This feature is useful for performance measurements or performance
+	  debugging.
+	  If in doubt, say "N".
diff -puN lib/Makefile~statistics-infrastructure lib/Makefile
--- a/lib/Makefile~statistics-infrastructure
+++ a/lib/Makefile
@@ -60,6 +60,8 @@ obj-$(CONFIG_TEXTSEARCH_FSM) += ts_fsm.o
 obj-$(CONFIG_SMP) += percpu_counter.o
 obj-$(CONFIG_AUDIT_GENERIC) += audit.o
 
+obj-$(CONFIG_STATISTICS) += statistic.o
+
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
 obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 
diff -puN /dev/null lib/statistic.c
--- /dev/null
+++ a/lib/statistic.c
@@ -0,0 +1,1564 @@
+/*
+ *  lib/statistic.c
+ *    statistics facility
+ *
+ *    Copyright (C) 2005, 2006
+ *		IBM Deutschland Entwicklung GmbH,
+ *		IBM Corporation
+ *
+ *    Author(s): Martin Peschke (mpeschke@xxxxxxxxxx),
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ *    another bunch of ideas being pondered:
+ *	- define a set of agreed names or a naming scheme for
+ *	  consistency and comparability across clients;
+ *	  this entails an agreement about granularities
+ *	  as well (e.g. separate statistic for read/write/no-data commands);
+ *	  a common set of unit strings would be nice then, too, of course
+ *	  (e.g. "seconds", "milliseconds", "microseconds", ...)
+ *	- perf. opt. of array: table lookup of values, binary search for values
+ *	- another statistic discipline based on some sort of tree, but
+ *	  similar in semantics to list discipline (for high-perf. histograms of
+ *	  discrete values)
+ *	- allow for more than a single "view" on data at the same time by
+ *	  providing the capability to attach several (a list of) "definitions"
+ *	  to a struct statistic
+ *	  (e.g. show histogram of requests sizes and history of megabytes/sec.
+ *	  at the same time)
+ *	- multi-dimensional statistic (combination of two or more
+ *	  characteristics/discriminators); worth the effort??
+ *	  (e.g. a matrix of occurrences for latencies of requests of
+ *	  particular sizes)
+ *
+ *	FIXME:
+ *	- statistics file access when statistics are being removed
+ */
+
+#include <linux/fs.h>
+#include <linux/debugfs.h>
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/parser.h>
+#include <linux/time.h>
+#include <linux/sched.h>
+#include <linux/cpu.h>
+#include <linux/percpu.h>
+#include <linux/mutex.h>
+#include <linux/statistic.h>
+
+#include <asm/bug.h>
+#include <asm/uaccess.h>
+
+struct statistic_file_private {
+	struct list_head read_seg_lh;
+	struct list_head write_seg_lh;
+	size_t write_seg_total_size;
+};
+
+struct statistic_merge_private {
+	struct statistic *stat;
+	spinlock_t lock;
+	void *dst;
+};
+
+/**
+ * struct statistic_discipline - description of a data processing mode
+ * @parse: parses additional attributes specific to this mode (if any)
+ * @size: sizes a data area prior to allocation (mandatory)
+ * @reset: discards content of a data area (mandatory)
+ * @merge: merges content of a data area into another data area (mandatory)
+ * @fdata: prints content of a data area into buffer (mandatory)
+ * @fdef: prints additional attributes specific to this mode (if any)
+ * @add: updates a data area for a statistic fed incremental data (mandatory)
+ * @set: updates a data area for a statistic fed total numbers (mandatory)
+ * @name: pointer to name string (mandatory)
+ *
+ * Struct statistic_discipline describes a statistic infrastructure internal
+ * programming interface. Another data processing mode can be added by
+ * implementing these routines and appending an entry to the
+ * statistic_discs array.
+ *
+ * "Data area" in the above description usually means a chunk of memory,
+ * whether it is allocated for data gathering per CPU, shared by all
+ * CPUs, or used for other purposes, like merging per-CPU data when
+ * users read data from files. Implementers of data processing modes
+ * don't need to worry about the designation of a particular chunk of memory.
+ * A data area of a data processing mode always has to look the same.
+ */
+struct statistic_discipline {
+	int (*parse)(struct statistic *stat, struct statistic_info *info,
+		     int type, char *def);
+	size_t (*size)(struct statistic *stat);
+	void (*reset)(struct statistic *stat, void *ptr);
+	void (*merge)(struct statistic *stat, void *dst, void *src);
+	int (*fdata)(struct statistic *stat, const char *name,
+		     struct statistic_file_private *fpriv, void *data);
+	int (*fdef)(struct statistic *stat, char *line);
+	void (*add)(struct statistic *stat, s64 value, u64 incr);
+	void (*set)(struct statistic *stat, s64 value, u64 total);
+	char *name;
+};
+
+static struct statistic_discipline statistic_discs[];
+
+static int statistic_initialise(struct statistic *stat)
+{
+	stat->type = STAT_NONE;
+	stat->state = STATISTIC_STATE_UNCONFIGURED;
+	return 0;
+}
+
+static int statistic_uninitialise(struct statistic *stat)
+{
+	stat->state = STATISTIC_STATE_INVALID;
+	return 0;
+}
+
+static int statistic_define(struct statistic *stat)
+{
+	if (stat->type == STAT_NONE)
+		return -EINVAL;
+	stat->state = STATISTIC_STATE_RELEASED;
+	return 0;
+}
+
+static int statistic_free(struct statistic *stat, struct statistic_info *info)
+{
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	int cpu;
+
+	if (unlikely(info->flags & STATISTIC_FLAGS_NOINCR)) {
+		disc->reset(stat, stat->data);
+		kfree(stat->data);
+	} else {
+		for_each_online_cpu(cpu)
+			disc->reset(stat, percpu_ptr(stat->data, cpu));
+		percpu_free(stat->data);
+	}
+	stat->state = STATISTIC_STATE_RELEASED;
+	return 0;
+}
+
+static int statistic_alloc(struct statistic *stat,
+			   struct statistic_info *info)
+{
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	size_t size = disc->size(stat);
+	int cpu;
+
+	if (unlikely(info->flags & STATISTIC_FLAGS_NOINCR)) {
+		stat->data = kzalloc(size, GFP_KERNEL);
+		if (unlikely(!stat->data))
+			return -ENOMEM;
+		disc->reset(stat, stat->data);
+	} else {
+		stat->data = percpu_alloc(size, GFP_KERNEL);
+		if (unlikely(!stat->data))
+			return -ENOMEM;
+		for_each_online_cpu(cpu)
+			disc->reset(stat, percpu_ptr(stat->data, cpu));
+	}
+	stat->age = timestamp_clock();
+	stat->state = STATISTIC_STATE_OFF;
+	return 0;
+}
+
+static int statistic_start(struct statistic *stat)
+{
+	stat->started = timestamp_clock();
+	stat->state = STATISTIC_STATE_ON;
+	return 0;
+}
+
+static void _statistic_barrier(void *unused)
+{
+}
+
+static int statistic_stop(struct statistic *stat)
+{
+	stat->stopped = timestamp_clock();
+	stat->state = STATISTIC_STATE_OFF;
+	/* ensures that all CPUs have ceased updating statistics */
+	smp_mb();
+	on_each_cpu(_statistic_barrier, NULL, 0, 1);
+	return 0;
+}
+
+static int statistic_transition(struct statistic *stat,
+				struct statistic_info *info,
+				enum statistic_state requested_state)
+{
+	int z = requested_state < stat->state ? 1 : 0;
+	int retval = 0;
+
+	while (!retval && stat->state != requested_state) {
+		switch (stat->state) {
+		case STATISTIC_STATE_INVALID:
+			retval = z ? -EINVAL : statistic_initialise(stat);
+			break;
+		case STATISTIC_STATE_UNCONFIGURED:
+			retval = z ? statistic_uninitialise(stat)
+				   : statistic_define(stat);
+			break;
+		case STATISTIC_STATE_RELEASED:
+			retval = z ? statistic_initialise(stat)
+				   : statistic_alloc(stat, info);
+			break;
+		case STATISTIC_STATE_OFF:
+			retval = z ? statistic_free(stat, info)
+				   : statistic_start(stat);
+			break;
+		case STATISTIC_STATE_ON:
+			retval = z ? statistic_stop(stat) : -EINVAL;
+			break;
+		}
+	}
+	return retval;
+}
+
+static int statistic_reset(struct statistic *stat, struct statistic_info *info)
+{
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	enum statistic_state prev_state = stat->state;
+	int cpu;
+
+	if (unlikely(stat->state < STATISTIC_STATE_OFF))
+		return 0;
+	statistic_transition(stat, info, STATISTIC_STATE_OFF);
+	if (unlikely(info->flags & STATISTIC_FLAGS_NOINCR))
+		disc->reset(stat, stat->data);
+	else
+		for_each_online_cpu(cpu)
+			disc->reset(stat, percpu_ptr(stat->data, cpu));
+	stat->age = timestamp_clock();
+	statistic_transition(stat, info, prev_state);
+	return 0;
+}
+
+static void statistic_merge(void *__mpriv)
+{
+	struct statistic_merge_private *mpriv = __mpriv;
+	struct statistic *stat = mpriv->stat;
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	void *src = percpu_ptr(stat->data, smp_processor_id());
+
+	spin_lock(&mpriv->lock);
+	disc->merge(stat, mpriv->dst, src);
+	spin_unlock(&mpriv->lock);
+}
+
+struct sgrb_seg {
+	struct list_head list;
+	char *address;
+	int offset;
+	int size;
+};
+
+static struct sgrb_seg *sgrb_seg_find(struct list_head *lh, int size)
+{
+	struct sgrb_seg *seg;
+
+	/* only the last buffer, if any, may have spare bytes */
+	list_for_each_entry_reverse(seg, lh, list) {
+		if (likely((PAGE_SIZE - seg->offset) >= size))
+			return seg;
+		break;
+	}
+	seg = kzalloc(sizeof(struct sgrb_seg), GFP_KERNEL);
+	if (unlikely(!seg))
+		return NULL;
+	seg->size = PAGE_SIZE;
+	seg->address = (void*)__get_free_page(GFP_KERNEL);
+	if (unlikely(!seg->address)) {
+		kfree(seg);
+		return NULL;
+	}
+	list_add_tail(&seg->list, lh);
+	return seg;
+}
+
+static void sgrb_seg_release_all(struct list_head *lh)
+{
+	struct sgrb_seg *seg, *tmp;
+
+	list_for_each_entry_safe(seg, tmp, lh, list) {
+		list_del(&seg->list);
+		free_page((unsigned long)seg->address);
+		kfree(seg);
+	}
+}
+
+static char *statistic_state_strings[] = {
+	"undefined(BUG)",
+	"unconfigured",
+	"released",
+	"off",
+	"on",
+};
+
+static int statistic_fdef(struct statistic_interface *interface, int i,
+			  struct statistic_file_private *private)
+{
+	struct statistic *stat = &interface->stat[i];
+	struct statistic_info *info = &interface->info[i];
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	struct sgrb_seg *seg;
+	char t0[TIMESTAMP_SIZE], t1[TIMESTAMP_SIZE], t2[TIMESTAMP_SIZE];
+
+	seg = sgrb_seg_find(&private->read_seg_lh, 512);
+	if (unlikely(!seg))
+		return -ENOMEM;
+
+	seg->offset += sprintf(seg->address + seg->offset,
+			       "name=%s state=%s units=%s/%s",
+			       info->name, statistic_state_strings[stat->state],
+			       info->x_unit, info->y_unit);
+	if (stat->state == STATISTIC_STATE_UNCONFIGURED) {
+		seg->offset += sprintf(seg->address + seg->offset, "\n");
+		return 0;
+	}
+
+	seg->offset += sprintf(seg->address + seg->offset, " type=%s",
+			       disc->name);
+	if (info->flags & STATISTIC_FLAGS_NOFLEX)
+		seg->offset += sprintf(seg->address + seg->offset, "(fix)");
+
+	if (disc->fdef)
+		seg->offset += disc->fdef(stat, seg->address + seg->offset);
+	if (stat->state == STATISTIC_STATE_RELEASED) {
+		seg->offset += sprintf(seg->address + seg->offset, "\n");
+		return 0;
+	}
+
+	nsec_to_timestamp(t0, stat->age);
+	nsec_to_timestamp(t1, stat->started);
+	nsec_to_timestamp(t2, stat->stopped);
+	seg->offset += sprintf(seg->address + seg->offset,
+			       " data=%s started=%s stopped=%s\n", t0, t1, t2);
+	return 0;
+}
+
+static int statistic_fdata(struct statistic_interface *interface, int i,
+			   struct statistic_file_private *fpriv)
+{
+	struct statistic *stat = &interface->stat[i];
+	struct statistic_info *info = &interface->info[i];
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	struct statistic_merge_private mpriv;
+	size_t size = disc->size(stat);
+	int retval;
+
+	if (unlikely(stat->state < STATISTIC_STATE_OFF))
+		return 0;
+	if (unlikely(info->flags & STATISTIC_FLAGS_NOINCR))
+		return disc->fdata(stat, info->name, fpriv, stat->data);
+	mpriv.dst = kzalloc(size, GFP_KERNEL);
+	if (unlikely(!mpriv.dst))
+		return -ENOMEM;
+	disc->reset(stat, mpriv.dst);
+	spin_lock_init(&mpriv.lock);
+	mpriv.stat = stat;
+	on_each_cpu(statistic_merge, &mpriv, 0, 1);
+	retval = disc->fdata(stat, info->name, fpriv, mpriv.dst);
+	kfree(mpriv.dst);
+	return retval;
+}
+
+/* cpu hotplug handling for per-cpu data */
+
+static int _statistic_hotcpu(struct statistic_interface *interface,
+			     int i, unsigned long action, int cpu)
+{
+	struct statistic *stat = &interface->stat[i];
+	struct statistic_info *info = &interface->info[i];
+	struct statistic_discipline *disc = &statistic_discs[stat->type];
+	void *src, *dst;
+	size_t size;
+	unsigned long flags;
+
+	if (unlikely(info->flags & STATISTIC_FLAGS_NOINCR))
+		return NOTIFY_OK;
+	if (stat->state < STATISTIC_STATE_OFF)
+		return NOTIFY_OK;
+	switch (action) {
+	case CPU_UP_PREPARE:
+		size = disc->size(stat);
+		dst = percpu_populate(stat->data, size, GFP_KERNEL, cpu);
+		if (!dst)
+			return NOTIFY_BAD;
+		disc->reset(stat, dst);
+		break;
+	case CPU_UP_CANCELED:
+	case CPU_DEAD:
+		local_irq_save(flags);
+		dst = percpu_ptr(stat->data, smp_processor_id());
+		src = percpu_ptr(stat->data, cpu);
+		disc->merge(stat, dst, src);
+		local_irq_restore(flags);
+		percpu_depopulate(stat->data, cpu);
+		break;
+	}
+	return NOTIFY_OK;
+}
+
+static struct list_head statistic_list;
+static struct mutex statistic_list_mutex;
+
+static int __cpuinit statistic_hotcpu(struct notifier_block *notifier,
+				      unsigned long action, void *__cpu)
+{
+	int cpu = (unsigned long)__cpu, i, retval = NOTIFY_OK;
+	struct statistic_interface *interface;
+
+	mutex_lock(&statistic_list_mutex);
+	list_for_each_entry(interface, &statistic_list, list)
+		for (i = 0; i < interface->number; i++) {
+			retval = _statistic_hotcpu(interface, i, action, cpu);
+			if (retval == NOTIFY_BAD)
+				goto unlock;
+		}
+unlock:
+	mutex_unlock(&statistic_list_mutex);
+	return retval;
+}
+
+static struct notifier_block statistic_hotcpu_notifier =
+{
+	.notifier_call = statistic_hotcpu,
+};
+
+/* module startup / removal */
+
+static struct dentry *statistic_root_dir;
+
+int __init statistic_init(void)
+{
+	statistic_root_dir = debugfs_create_dir("statistics", NULL);
+	if (unlikely(!statistic_root_dir))
+		return -ENOMEM;
+	INIT_LIST_HEAD(&statistic_list);
+	mutex_init(&statistic_list_mutex);
+	register_cpu_notifier(&statistic_hotcpu_notifier);
+	return 0;
+}
+
+void __exit statistic_exit(void)
+{
+	unregister_cpu_notifier(&statistic_hotcpu_notifier);
+	debugfs_remove(statistic_root_dir);
+}
+
+/* parser used for configuring statistics */
+
+static int statistic_parse_single(struct statistic *stat,
+				  struct statistic_info *info,
+				  char *def, int type)
+{
+	struct statistic_discipline *disc = &statistic_discs[type];
+	int prev_state = stat->state, retval = 0;
+	char *copy;
+
+	if (info->flags & STATISTIC_FLAGS_NOFLEX && stat->type != type &&
+	    def != info->defaults)
+		return -EINVAL;
+	if (disc->parse) {
+		copy = kstrdup(def, GFP_KERNEL);
+		if (unlikely(!copy))
+			return -ENOMEM;
+		retval = disc->parse(stat, info, type, copy);
+		kfree(copy);
+	} else if (type != stat->type)
+		statistic_transition(stat, info, STATISTIC_STATE_UNCONFIGURED);
+	if (!retval) {
+		stat->type = type;
+		stat->add = disc->add;
+	}
+	statistic_transition(stat, info,
+			     max(prev_state, STATISTIC_STATE_RELEASED));
+	return retval;
+}
+
+static match_table_t statistic_match_type = {
+	{1, "type=%s"},
+	{9, NULL}
+};
+
+static int statistic_parse_match(struct statistic *stat,
+				 struct statistic_info *info, char *def)
+{
+	int type, len;
+	char *p, *copy, *twisted;
+	substring_t args[MAX_OPT_ARGS];
+	struct statistic_discipline *disc;
+
+	if (!def)
+		def = info->defaults;
+	twisted = copy = kstrdup(def, GFP_KERNEL);
+	if (unlikely(!copy))
+		return -ENOMEM;
+	while ((p = strsep(&twisted, " ")) != NULL) {
+		if (!*p)
+			continue;
+		if (match_token(p, statistic_match_type, args) != 1)
+			continue;
+		len = (args[0].to - args[0].from) + 1;
+		for (type = 0; type < STAT_NONE; type++) {
+			disc = &statistic_discs[type];
+			if (unlikely(strncmp(disc->name, args[0].from, len)))
+				continue;
+			kfree(copy);
+			return statistic_parse_single(stat, info, def, type);
+		}
+	}
+	kfree(copy);
+	if (unlikely(stat->type == STAT_NONE))
+		return -EINVAL;
+	return statistic_parse_single(stat, info, def, stat->type);
+}
+
+static match_table_t statistic_match_common = {
+	{STATISTIC_STATE_UNCONFIGURED, "state=unconfigured"},
+	{STATISTIC_STATE_RELEASED, "state=released"},
+	{STATISTIC_STATE_OFF, "state=off"},
+	{STATISTIC_STATE_ON, "state=on"},
+	{1001, "name=%s"},
+	{1002, "data=reset"},
+	{1003, "defaults"},
+	{9999, NULL}
+};
+
+static void statistic_parse_line(struct statistic_interface *interface,
+				 char *def)
+{
+	char *p, *copy, *twisted, *name = NULL;
+	substring_t args[MAX_OPT_ARGS];
+	int token, reset = 0, defaults = 0, i;
+	int state = STATISTIC_STATE_INVALID;
+	struct statistic *stat = interface->stat;
+	struct statistic_info *info = interface->info;
+
+	if (unlikely(!def))
+		return;
+	twisted = copy = kstrdup(def, GFP_KERNEL);
+	if (unlikely(!copy))
+		return;
+
+	while ((p = strsep(&twisted, " ")) != NULL) {
+		if (!*p)
+			continue;
+		token = match_token(p, statistic_match_common, args);
+		switch (token) {
+		case STATISTIC_STATE_UNCONFIGURED:
+		case STATISTIC_STATE_RELEASED:
+		case STATISTIC_STATE_OFF:
+		case STATISTIC_STATE_ON:
+			state = token;
+			break;
+		case 1001:
+			if (likely(!name))
+				name = match_strdup(&args[0]);
+			break;
+		case 1002:
+			reset = 1;
+			break;
+		case 1003:
+			defaults = 1;
+			break;
+		}
+	}
+	for (i = 0; i < interface->number; i++, stat++, info++) {
+		if (!name || !strcmp(name, info->name)) {
+			if (defaults)
+				statistic_parse_match(stat, info, NULL);
+			if (name)
+				statistic_parse_match(stat, info, def);
+			if (state != STATISTIC_STATE_INVALID)
+				statistic_transition(stat, info, state);
+			if (reset)
+				statistic_reset(stat, info);
+		}
+	}
+	kfree(copy);
+	kfree(name);
+}
+
+static void statistic_parse(struct statistic_interface *interface,
+			    struct list_head *line_lh, size_t line_size)
+{
+	struct sgrb_seg *seg, *tmp;
+	char *buf;
+	int offset = 0;
+
+	if (unlikely(!line_size))
+		return;
+	buf = kmalloc(line_size + 2, GFP_KERNEL);
+	if (unlikely(!buf))
+		return;
+	buf[line_size] = ' ';
+	buf[line_size + 1] = '\0';
+	list_for_each_entry_safe(seg, tmp, line_lh, list) {
+		memcpy(buf + offset, seg->address, seg->size);
+		offset += seg->size;
+		list_del(&seg->list);
+		kfree(seg);
+	}
+	statistic_parse_line(interface, buf);
+	kfree(buf);
+}
+
+/* sequential files comprising user interface */
+
+static int statistic_generic_open(struct inode *inode,
+		struct file *file, struct statistic_interface **interface,
+		struct statistic_file_private **private)
+{
+	*interface = inode->i_private;
+	BUG_ON(!*interface);
+	*private = kzalloc(sizeof(struct statistic_file_private), GFP_KERNEL);
+	if (unlikely(!*private))
+		return -ENOMEM;
+	INIT_LIST_HEAD(&(*private)->read_seg_lh);
+	INIT_LIST_HEAD(&(*private)->write_seg_lh);
+	file->private_data = *private;
+	return 0;
+}
+
+static int statistic_generic_close(struct inode *inode, struct file *file)
+{
+	struct statistic_file_private *private = file->private_data;
+	BUG_ON(!private);
+	sgrb_seg_release_all(&private->read_seg_lh);
+	sgrb_seg_release_all(&private->write_seg_lh);
+	kfree(private);
+	return 0;
+}
+
+static ssize_t statistic_generic_read(struct file *file,
+				char __user *buf, size_t len, loff_t *offset)
+{
+	struct statistic_file_private *private = file->private_data;
+	struct sgrb_seg *seg;
+	size_t seg_offset, seg_residual, seg_transfer;
+	size_t transferred = 0;
+	loff_t pos = 0;
+
+	BUG_ON(!private);
+	list_for_each_entry(seg, &private->read_seg_lh, list) {
+		if (unlikely(!len))
+			break;
+		if (*offset >= pos && *offset <= (pos + seg->offset)) {
+			seg_offset = *offset - pos;
+			seg_residual = seg->offset - seg_offset;
+			seg_transfer = min(len, seg_residual);
+			if (unlikely(copy_to_user(buf + transferred,
+						  seg->address + seg_offset,
+						  seg_transfer)))
+				return -EFAULT;
+			transferred += seg_transfer;
+			*offset += seg_transfer;
+			pos += seg_transfer + seg_offset;
+			len -= seg_transfer;
+		} else
+			pos += seg->offset;
+	}
+	return transferred;
+}
+
+static ssize_t statistic_generic_write(struct file *file,
+			const char __user *buf, size_t len, loff_t *offset)
+{
+	struct statistic_file_private *private = file->private_data;
+	struct sgrb_seg *seg;
+	size_t seg_residual, seg_transfer;
+	size_t transferred = 0;
+
+	BUG_ON(!private);
+	if (unlikely(*offset != private->write_seg_total_size))
+		return -EPIPE;
+	while (len) {
+		seg = sgrb_seg_find(&private->write_seg_lh, 1);
+		if (unlikely(!seg))
+			return -ENOMEM;
+		seg_residual = seg->size - seg->offset;
+		seg_transfer = min(len, seg_residual);
+		if (unlikely(copy_from_user(seg->address + seg->offset,
+					    buf + transferred, seg_transfer)))
+			return -EFAULT;
+		private->write_seg_total_size += seg_transfer;
+		seg->offset += seg_transfer;
+		transferred += seg_transfer;
+		*offset += seg_transfer;
+		len -= seg_transfer;
+	}
+	return transferred;
+}
+
+static int statistic_def_close(struct inode *inode, struct file *file)
+{
+	struct statistic_interface *interface = inode->i_private;
+	struct statistic_file_private *private = file->private_data;
+	struct sgrb_seg *seg, *seg_nl;
+	int offset;
+	LIST_HEAD(line_lh);
+	char *nl;
+	size_t line_size = 0;
+
+	list_for_each_entry(seg, &private->write_seg_lh, list) {
+		for (offset = 0; offset < seg->offset; offset += seg_nl->size) {
+			seg_nl = kmalloc(sizeof(struct sgrb_seg), GFP_KERNEL);
+			if (unlikely(!seg_nl))
+				goto out;
+			seg_nl->address = seg->address + offset;
+			nl = strnchr(seg_nl->address,
+				     seg->offset - offset, '\n');
+			if (nl) {
+				seg_nl->offset = nl - seg_nl->address;
+				if (seg_nl->offset)
+					seg_nl->offset--;
+			} else
+				seg_nl->offset = seg->offset - offset;
+			seg_nl->size = seg_nl->offset + 1;
+			line_size += seg_nl->size;
+			list_add_tail(&seg_nl->list, &line_lh);
+			if (nl) {
+				statistic_parse(interface, &line_lh, line_size);
+				line_size = 0;
+			}
+		}
+	}
+out:
+	if (!list_empty(&line_lh))
+		statistic_parse(interface, &line_lh, line_size);
+	return statistic_generic_close(inode, file);
+}
+
+static int statistic_def_open(struct inode *inode, struct file *file)
+{
+	struct statistic_interface *interface;
+	struct statistic_file_private *private;
+	int retval = 0;
+	int i;
+
+	retval = statistic_generic_open(inode, file, &interface, &private);
+	if (unlikely(retval))
+		return retval;
+	for (i = 0; i < interface->number; i++) {
+		retval = statistic_fdef(interface, i, private);
+		if (unlikely(retval)) {
+			statistic_def_close(inode, file);
+			break;
+		}
+	}
+	return retval;
+}
+
+static int statistic_data_open(struct inode *inode, struct file *file)
+{
+	struct statistic_interface *interface;
+	struct statistic_file_private *private;
+	int retval = 0;
+	int i;
+
+	retval = statistic_generic_open(inode, file, &interface, &private);
+	if (unlikely(retval))
+		return retval;
+	if (interface->pull)
+		interface->pull(interface->pull_private);
+	for (i = 0; i < interface->number; i++) {
+		retval = statistic_fdata(interface, i, private);
+		if (unlikely(retval)) {
+			statistic_generic_close(inode, file);
+			break;
+		}
+	}
+	return retval;
+}
+
+static struct file_operations statistic_def_fops = {
+	.owner		= THIS_MODULE,
+	.read		= statistic_generic_read,
+	.write		= statistic_generic_write,
+	.open		= statistic_def_open,
+	.release	= statistic_def_close,
+};
+
+static struct file_operations statistic_data_fops = {
+	.owner		= THIS_MODULE,
+	.read		= statistic_generic_read,
+	.open		= statistic_data_open,
+	.release	= statistic_generic_close,
+};
+
+/* code concerned with single value statistics */
+
+size_t statistic_size_counter(struct statistic *stat)
+{
+	return sizeof(u64);
+}
+
+static void statistic_reset_counter(struct statistic *stat, void *ptr)
+{
+	*(u64*)ptr = 0;
+}
+
+void statistic_add_counter_inc(struct statistic *stat, s64 value, u64 incr)
+{
+	*(u64*)percpu_ptr(stat->data, smp_processor_id()) += incr;
+}
+EXPORT_SYMBOL_GPL(statistic_add_counter_inc);
+
+void statistic_add_counter_prod(struct statistic *stat, s64 value, u64 incr)
+{
+	if (unlikely(value < 0))
+		value = -value;
+	*(u64*)percpu_ptr(stat->data, smp_processor_id()) += value * incr;
+}
+EXPORT_SYMBOL_GPL(statistic_add_counter_prod);
+
+static void statistic_set_counter_inc(struct statistic *stat,
+				      s64 value, u64 total)
+{
+	*(u64*)stat->data = total;
+}
+
+static void statistic_set_counter_prod(struct statistic *stat,
+				       s64 value, u64 total)
+{
+	if (unlikely(value < 0))
+		value = -value;
+	*(u64*)stat->data = value * total;
+}
+
+static void statistic_merge_counter(struct statistic *stat,
+				    void *dst, void *src)
+{
+	*(u64*)dst += *(u64*)src;
+}
+
+static int statistic_fdata_counter(struct statistic *stat, const char *name,
+				   struct statistic_file_private *fpriv,
+				   void *data)
+{
+	struct sgrb_seg *seg;
+	seg = sgrb_seg_find(&fpriv->read_seg_lh, 128);
+	if (unlikely(!seg))
+		return -ENOMEM;
+	seg->offset += sprintf(seg->address + seg->offset, "%s %Lu\n",
+			       name, *(unsigned long long *)data);
+	return 0;
+}
+
+/* code concerned with utilisation indicator statistic */
+
+struct statistic_entry_util {
+	u32 res;
+	u32 num;	/* FIXME: better 64 bit; do_div can't deal with it */
+	s64 acc;
+	s64 sqr;
+	s64 min;
+	s64 max;
+};
+
+size_t statistic_size_util(struct statistic *stat)
+{
+	return sizeof(struct statistic_entry_util);
+}
+
+static void statistic_reset_util(struct statistic *stat, void *ptr)
+{
+	struct statistic_entry_util *util = ptr;
+	util->num = 0;
+	util->acc = 0;
+	util->sqr = 0;
+	util->min = LLONG_MAX;
+	util->max = LLONG_MIN;
+}
+
+void statistic_add_util(struct statistic *stat, s64 value, u64 incr)
+{
+	struct statistic_entry_util *util;
+	util = percpu_ptr(stat->data, smp_processor_id());
+	util->num += incr;
+	util->acc += value * incr;
+	util->sqr += value * value * incr;
+	if (unlikely(value < util->min))
+		util->min = value;
+	if (unlikely(value > util->max))
+		util->max = value;
+}
+EXPORT_SYMBOL_GPL(statistic_add_util);
+
+static void statistic_set_util(struct statistic *stat, s64 value, u64 total)
+{
+	struct statistic_entry_util *util = stat->data;
+	util->num = total;
+	util->acc = value * total;
+	util->sqr = value * value * total;
+	if (unlikely(value < util->min))
+		util->min = value;
+	if (unlikely(value > util->max))
+		util->max = value;
+}
+
+static void statistic_merge_util(struct statistic *stat, void *_dst, void *_src)
+{
+	struct statistic_entry_util *dst = _dst, *src = _src;
+	dst->num += src->num;
+	dst->acc += src->acc;
+	dst->sqr += src->sqr;
+	if (unlikely(src->min < dst->min))
+		dst->min = src->min;
+	if (unlikely(src->max > dst->max))
+		dst->max = src->max;
+}
+
+static int statistic_div(signed long long *whole, unsigned long long *decimal,
+			 signed long long a, signed long b, int precision)
+{
+	unsigned long long p, rem, _decimal, _whole = a >= 0 ? a : -a;
+	unsigned long _b = b > 0 ? b : -b;
+	signed int sign = (a ^ (signed long long)b) & ~LLONG_MAX ? -1 : 1;
+	if (!b)
+		return -EINVAL;
+	for (p = 1; precision; precision--, p *= 10);
+	_decimal = do_div(_whole, _b) * p;
+	rem = do_div(_decimal, _b) << 1;
+	*whole = sign * _whole;
+	*decimal = _decimal + (rem >= _b ? 1 : 0);
+	return 0;
+}
+
+static int statistic_fdata_util(struct statistic *stat, const char *name,
+				struct statistic_file_private *fpriv,
+				void *data)
+{
+	struct sgrb_seg *seg;
+	struct statistic_entry_util *util = data;
+	unsigned long long mean_d = 0, var_d = 0,
+			   num = util->num, acc = util->acc, sqr = util->sqr;
+	signed long long mean_w = 0, var_w = 0, min = num ? util->min : 0,
+			 max = num ? util->max : 0;
+
+	seg = sgrb_seg_find(&fpriv->read_seg_lh, 512);
+	if (unlikely(!seg))
+		return -ENOMEM;
+	statistic_div(&mean_w, &mean_d, acc, num, 3);
+	statistic_div(&var_w, &var_d, sqr - acc * mean_w, num, 3);
+	seg->offset += sprintf(seg->address + seg->offset,
+			       "%s samples %Lu\n"
+			       "%s minimum %Ld\n"
+			       "%s average %Ld.%03Lu\n"
+			       "%s maximum %Ld\n"
+			       "%s variance %Ld.%03Lu\n",
+			       name, num,
+			       name, min,
+			       name, mean_w, mean_d,
+			       name, max,
+			       name, var_w, var_d);
+	return 0;
+}
+
+/* code concerned with histogram statistics */
+
+size_t statistic_size_histogram(struct statistic *stat)
+{
+	return sizeof(u64) * (stat->u.histogram.last_index + 1);
+}
+
+static inline s64 statistic_histogram_calc_value_lin(struct statistic *stat,
+						     int i)
+{
+	return stat->u.histogram.range_min +
+		stat->u.histogram.base_interval * i;
+}
+
+static inline s64 statistic_histogram_calc_value_log2(struct statistic *stat,
+						      int i)
+{
+	return stat->u.histogram.range_min +
+		(i ? (stat->u.histogram.base_interval << (i - 1)) : 0);
+}
+
+static s64 statistic_histogram_calc_value(struct statistic *stat, int i)
+{
+	if (stat->type == STAT_HGRAM_LIN)
+		return statistic_histogram_calc_value_lin(stat, i);
+	else
+		return statistic_histogram_calc_value_log2(stat, i);
+}
+
+static int statistic_histogram_calc_index_lin(struct statistic *stat, s64 value)
+{
+	unsigned long long i;
+	if (value <= stat->u.histogram.range_min)
+		return 0;
+	i = value - stat->u.histogram.range_min;
+	do_div(i, stat->u.histogram.base_interval);
+	return min_t(unsigned long long, i, stat->u.histogram.last_index);
+}
+
+static int statistic_histogram_calc_index_log2(struct statistic *stat,
+					       s64 value)
+{
+	unsigned long long i;
+	for (i = 0;
+	     i < stat->u.histogram.last_index &&
+	     value > statistic_histogram_calc_value_log2(stat, i);
+	     i++);
+	return i;
+}
+
+static void statistic_reset_histogram(struct statistic *stat, void *ptr)
+{
+	memset(ptr, 0, (stat->u.histogram.last_index + 1) * sizeof(u64));
+}
+
+void statistic_add_histogram_lin(struct statistic *stat, s64 value, u64 incr)
+{
+	int i = statistic_histogram_calc_index_lin(stat, value);
+	((u64*)percpu_ptr(stat->data, smp_processor_id()))[i] += incr;
+}
+EXPORT_SYMBOL_GPL(statistic_add_histogram_lin);
+
+void statistic_add_histogram_log2(struct statistic *stat, s64 value, u64 incr)
+{
+	int i = statistic_histogram_calc_index_log2(stat, value);
+	((u64*)percpu_ptr(stat->data, smp_processor_id()))[i] += incr;
+}
+EXPORT_SYMBOL_GPL(statistic_add_histogram_log2);
+
+static void statistic_set_histogram_lin(struct statistic *stat,
+					s64 value, u64 total)
+{
+	int i = statistic_histogram_calc_index_lin(stat, value);
+	((u64*)stat->data)[i] = total;
+}
+
+static void statistic_set_histogram_log2(struct statistic *stat,
+					 s64 value, u64 total)
+{
+	int i = statistic_histogram_calc_index_log2(stat, value);
+	((u64*)stat->data)[i] = total;
+}
+
+static void statistic_merge_histogram(struct statistic *stat,
+				      void *_dst, void *_src)
+{
+	u64 *dst = _dst, *src = _src;
+	int i;
+	for (i = 0; i <= stat->u.histogram.last_index; i++)
+		dst[i] += src[i];
+}
+
+static int statistic_fdata_histogram_line(const char *name,
+					struct statistic_file_private *private,
+					const char *prefix, s64 bound, u64 hits)
+{
+	struct sgrb_seg *seg;
+	seg = sgrb_seg_find(&private->read_seg_lh, 256);
+	if (unlikely(!seg))
+		return -ENOMEM;
+	seg->offset += sprintf(seg->address + seg->offset, "%s %s%Ld %Lu\n",
+			       name, prefix, (signed long long)bound,
+			       (unsigned long long)hits);
+	return 0;
+}
+
+static int statistic_fdata_histogram(struct statistic *stat, const char *name,
+				     struct statistic_file_private *fpriv,
+				     void *data)
+{
+	int i, retval;
+	s64 bound = 0;
+	for (i = 0; i < (stat->u.histogram.last_index); i++) {
+		bound = statistic_histogram_calc_value(stat, i);
+		retval = statistic_fdata_histogram_line(name, fpriv, "<=",
+							bound, ((u64*)data)[i]);
+		if (unlikely(retval))
+			return retval;
+	}
+	return statistic_fdata_histogram_line(name, fpriv, ">",
+					      bound, ((u64*)data)[i]);
+}
+
+static int statistic_fdef_histogram(struct statistic *stat, char *line)
+{
+	return sprintf(line, " range_min=%Li entries=%Lu base_interval=%Lu",
+		       (signed long long)stat->u.histogram.range_min,
+		       (unsigned long long)(stat->u.histogram.last_index + 1),
+		       (unsigned long long)stat->u.histogram.base_interval);
+}
+
+static match_table_t statistic_match_histogram = {
+	{1, "entries=%u"},
+	{2, "base_interval=%s"},
+	{3, "range_min=%s"},
+	{9, NULL}
+};
+
+static int statistic_parse_histogram(struct statistic *stat,
+				     struct statistic_info *info,
+				     int type, char *def)
+{
+	char *p;
+	substring_t args[MAX_OPT_ARGS];
+	int token, got_entries = 0, got_interval = 0, got_range = 0;
+	u32 entries, base_interval;
+	s64 range_min;
+
+	while ((p = strsep(&def, " ")) != NULL) {
+		if (!*p)
+			continue;
+		token = match_token(p, statistic_match_histogram, args);
+		switch (token) {
+		case 1:
+			if (!match_int(&args[0], &entries))
+				got_entries = 1;
+			break;
+		case 2:
+			if (!match_int(&args[0], &base_interval))
+				got_interval = 1;
+			break;
+		case 3:
+			match_s64(&args[0], &range_min, 0);
+			got_range = 1;
+			break;
+		}
+	}
+	if (unlikely(type != stat->type &&
+		     !(got_entries && got_interval && got_range)))
+		return -EINVAL;
+	statistic_transition(stat, info, STATISTIC_STATE_UNCONFIGURED);
+	if (got_entries)
+		stat->u.histogram.last_index = entries - 1;
+	if (got_interval)
+		stat->u.histogram.base_interval = base_interval;
+	if (got_range)
+		stat->u.histogram.range_min = range_min;
+	return 0;
+}
+
+/* code concerned with sparse (discrete value) histogram statistics */
+
+struct statistic_entry_sparse {
+	struct list_head list;
+	s64 value;
+	u64 hits;
+};
+
+struct statistic_sparse_list {
+	struct list_head entry_lh;
+	u32 entries;
+	u32 entries_max;
+	u64 hits_missed;
+};
+
+size_t statistic_size_sparse(struct statistic *stat)
+{
+	return sizeof(struct statistic_sparse_list);
+}
+
+static void statistic_reset_sparse(struct statistic *stat, void *ptr)
+{
+	struct statistic_entry_sparse *entry, *tmp;
+	struct statistic_sparse_list *slist = ptr;
+
+	if (!slist->entries) {
+		INIT_LIST_HEAD(&slist->entry_lh);
+		slist->entries_max = stat->u.sparse.entries_max;
+	} else {
+		list_for_each_entry_safe(entry, tmp, &slist->entry_lh, list) {
+			list_del(&entry->list);
+			kfree(entry);
+		}
+		slist->entries = 0;
+	}
+	slist->hits_missed = 0;
+}
+
+static void statistic_add_sparse_sort(struct list_head *head,
+				      struct statistic_entry_sparse *entry)
+{
+	struct statistic_entry_sparse *sort;
+
+	sort = list_prepare_entry(entry, head, list);
+	list_for_each_entry_continue_reverse(sort, head, list)
+		if (likely(sort->hits >= entry->hits))
+			break;
+	if (unlikely(sort->list.next != &entry->list &&
+		     (&sort->list == head || sort->hits >= entry->hits)))
+		list_move(&entry->list, &sort->list);
+}
+
+static int statistic_add_sparse_new(struct statistic_sparse_list *slist,
+				    s64 value, u64 incr)
+{
+	struct statistic_entry_sparse *entry;
+
+	if (unlikely(slist->entries == slist->entries_max))
+		return -ENOMEM;
+	entry = kmalloc(sizeof(struct statistic_entry_sparse), GFP_ATOMIC);
+	if (unlikely(!entry))
+		return -ENOMEM;
+	entry->value = value;
+	entry->hits = incr;
+	slist->entries++;
+	list_add_tail(&entry->list, &slist->entry_lh);
+	return 0;
+}
+
+static void _statistic_add_sparse(struct statistic_sparse_list *slist,
+				  s64 value, u64 incr)
+{
+	struct list_head *head = &slist->entry_lh;
+	struct statistic_entry_sparse *entry;
+
+	list_for_each_entry(entry, head, list) {
+		if (likely(entry->value == value)) {
+			entry->hits += incr;
+			statistic_add_sparse_sort(head, entry);
+			return;
+		}
+	}
+	if (unlikely(statistic_add_sparse_new(slist, value, incr)))
+		slist->hits_missed += incr;
+}
+
+void statistic_add_sparse(struct statistic *stat, s64 value, u64 incr)
+{
+	struct statistic_sparse_list *slist;
+	slist = percpu_ptr(stat->data, smp_processor_id());
+	_statistic_add_sparse(slist, value, incr);
+}
+EXPORT_SYMBOL_GPL(statistic_add_sparse);
+
+static void statistic_set_sparse(struct statistic *stat, s64 value, u64 total)
+{
+	struct statistic_sparse_list *slist = stat->data;
+	struct list_head *head = &slist->entry_lh;
+	struct statistic_entry_sparse *entry;
+
+	list_for_each_entry(entry, head, list) {
+		if (likely(entry->value == value)) {
+			entry->hits = total;
+			statistic_add_sparse_sort(head, entry);
+			return;
+		}
+	}
+	if (unlikely(statistic_add_sparse_new(slist, value, total)))
+		slist->hits_missed += total;
+}
+
+static void statistic_merge_sparse(struct statistic *stat,
+				   void *_dst, void *_src)
+{
+	struct statistic_sparse_list *dst = _dst, *src = _src;
+	struct statistic_entry_sparse *entry;
+	dst->hits_missed += src->hits_missed;
+	list_for_each_entry(entry, &src->entry_lh, list)
+		_statistic_add_sparse(dst, entry->value, entry->hits);
+}
+
+static int statistic_fdata_sparse(struct statistic *stat, const char *name,
+				  struct statistic_file_private *fpriv,
+				  void *data)
+{
+	struct sgrb_seg *seg;
+	struct statistic_sparse_list *slist = data;
+	struct statistic_entry_sparse *entry;
+
+	seg = sgrb_seg_find(&fpriv->read_seg_lh, 256);
+	if (unlikely(!seg))
+		return -ENOMEM;
+	seg->offset += sprintf(seg->address + seg->offset, "%s missed %Lu\n",
+			       name, (unsigned long long)slist->hits_missed);
+	list_for_each_entry(entry, &slist->entry_lh, list) {
+		seg = sgrb_seg_find(&fpriv->read_seg_lh, 256);
+		if (unlikely(!seg))
+			return -ENOMEM;
+		seg->offset += sprintf(seg->address + seg->offset,
+				       "%s 0x%Lx %Lu\n", name,
+				       (unsigned long long)entry->value,
+				       (unsigned long long)entry->hits);
+	}
+	return 0;
+}
+
+static int statistic_fdef_sparse(struct statistic *stat, char *line)
+{
+	return sprintf(line, " entries=%u", stat->u.sparse.entries_max);
+}
+
+static match_table_t statistic_match_sparse = {
+	{1, "entries=%u"},
+	{9, NULL}
+};
+
+static int statistic_parse_sparse(struct statistic *stat,
+				  struct statistic_info *info,
+				  int type, char *def)
+{
+	char *p;
+	substring_t args[MAX_OPT_ARGS];
+
+	while ((p = strsep(&def, " ")) != NULL) {
+		if (!*p)
+			continue;
+		if (match_token(p, statistic_match_sparse, args) == 1) {
+			statistic_transition(stat, info,
+					     STATISTIC_STATE_UNCONFIGURED);
+			match_int(&args[0], &stat->u.sparse.entries_max);
+			return 0;
+		}
+	}
+	return -EINVAL;
+}
+
+/* code mostly concerned with managing statistics */
+
+static struct statistic_discipline statistic_discs[] = {
+	[STAT_CNTR_INC] = {
+		.size	= statistic_size_counter,
+		.reset	= statistic_reset_counter,
+		.merge	= statistic_merge_counter,
+		.fdata	= statistic_fdata_counter,
+		.add	= statistic_add_counter_inc,
+		.set	= statistic_set_counter_inc,
+		.name	= "counter_inc",
+	},
+	[STAT_CNTR_PROD] = {
+		.size	= statistic_size_counter,
+		.reset	= statistic_reset_counter,
+		.merge	= statistic_merge_counter,
+		.fdata	= statistic_fdata_counter,
+		.add	= statistic_add_counter_prod,
+		.set	= statistic_set_counter_prod,
+		.name	= "counter_prod",
+	},
+	[STAT_UTIL] = {
+		.size	= statistic_size_util,
+		.reset	= statistic_reset_util,
+		.merge	= statistic_merge_util,
+		.fdata	= statistic_fdata_util,
+		.add	= statistic_add_util,
+		.set	= statistic_set_util,
+		.name	= "utilisation",
+	},
+	[STAT_HGRAM_LIN] = {
+		.parse	= statistic_parse_histogram,
+		.size	= statistic_size_histogram,
+		.reset	= statistic_reset_histogram,
+		.merge	= statistic_merge_histogram,
+		.fdata	= statistic_fdata_histogram,
+		.fdef	= statistic_fdef_histogram,
+		.add	= statistic_add_histogram_lin,
+		.set	= statistic_set_histogram_lin,
+		.name	= "histogram_lin",
+	},
+	[STAT_HGRAM_LOG2] = {
+		.parse	= statistic_parse_histogram,
+		.size	= statistic_size_histogram,
+		.reset	= statistic_reset_histogram,
+		.merge	= statistic_merge_histogram,
+		.fdata	= statistic_fdata_histogram,
+		.fdef	= statistic_fdef_histogram,
+		.add	= statistic_add_histogram_log2,
+		.set	= statistic_set_histogram_log2,
+		.name	= "histogram_log2",
+	},
+	[STAT_SPARSE] = {
+		.parse	= statistic_parse_sparse,
+		.size	= statistic_size_sparse,
+		.reset	= statistic_reset_sparse,
+		.merge	= statistic_merge_sparse,
+		.fdata	= statistic_fdata_sparse,
+		.fdef	= statistic_fdef_sparse,
+		.add	= statistic_add_sparse,
+		.set	= statistic_set_sparse,
+		.name	= "sparse",
+	},
+	[STAT_NONE] = {}
+};
+
+/* programming interface functions */
+
+/**
+ * statistic_create - setup statistics and create debugfs files
+ * @interface: struct statistic_interface provided by client
+ * @name: name of debugfs directory to be created
+ *
+ * Creates a debugfs directory in "statistics" as well as the "data" and
+ * "definition" files. Then sets up the statistics according to the
+ * definition provided by the client through struct statistic_interface.
+ *
+ * struct statistic_interface must have been set up prior to calling this.
+ *
+ * On success, 0 is returned.
+ *
+ * If some required memory could not be allocated, or the creation
+ * of debugfs entries failed, this routine fails, and -ENOMEM is returned.
+ */
+int statistic_create(struct statistic_interface *interface, const char *name)
+{
+	struct statistic *stat = interface->stat;
+	struct statistic_info *info = interface->info;
+	int i;
+
+	BUG_ON(!stat || !info || !interface->number);
+
+	interface->debugfs_dir =
+		debugfs_create_dir(name, statistic_root_dir);
+	if (unlikely(!interface->debugfs_dir))
+		return -ENOMEM;
+
+	interface->data_file = debugfs_create_file(
+		"data", S_IFREG | S_IRUSR, interface->debugfs_dir,
+		(void*)interface, &statistic_data_fops);
+	if (unlikely(!interface->data_file)) {
+		debugfs_remove(interface->debugfs_dir);
+		return -ENOMEM;
+	}
+
+	interface->def_file = debugfs_create_file(
+		"definition", S_IFREG | S_IRUSR | S_IWUSR,
+		interface->debugfs_dir, (void*)interface, &statistic_def_fops);
+	if (unlikely(!interface->def_file)) {
+		debugfs_remove(interface->data_file);
+		debugfs_remove(interface->debugfs_dir);
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < interface->number; i++, stat++, info++) {
+		statistic_transition(stat, info, STATISTIC_STATE_UNCONFIGURED);
+		statistic_parse_match(stat, info, NULL);
+	}
+
+	mutex_lock(&statistic_list_mutex);
+	list_add(&interface->list, &statistic_list);
+	mutex_unlock(&statistic_list_mutex);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(statistic_create);
+
+/**
+ * statistic_remove - remove unused statistics
+ * @interface: struct statistic_interface to clean up
+ *
+ * Remove a debugfs directory in "statistics" along with its "data" and
+ * "definition" files. Removing this user interface also causes the removal
+ * of all statistics attached to the interface.
+ *
+ * The client must have ceased reporting statistic data.
+ *
+ * Returns -EINVAL for attempted double removal, 0 otherwise.
+ */
+int statistic_remove(struct statistic_interface *interface)
+{
+	struct statistic *stat = interface->stat;
+	struct statistic_info *info = interface->info;
+	int i;
+
+	if (unlikely(!interface->debugfs_dir))
+		return -EINVAL;
+	mutex_lock(&statistic_list_mutex);
+	list_del(&interface->list);
+	mutex_unlock(&statistic_list_mutex);
+	for (i = 0; i < interface->number; i++, stat++, info++)
+		statistic_transition(stat, info, STATISTIC_STATE_INVALID);
+	debugfs_remove(interface->data_file);
+	debugfs_remove(interface->def_file);
+	debugfs_remove(interface->debugfs_dir);
+	interface->debugfs_dir = NULL;
+	return 0;
+}
+EXPORT_SYMBOL_GPL(statistic_remove);
+
+/**
+ * _statistic_add - update statistic with incremental data in (X, Y) pair
+ * @stat: struct statistic array
+ * @i: index of statistic to be updated
+ * @value: X
+ * @incr: Y
+ *
+ * The actual processing of the (X, Y) data pair is determined by the current
+ * definition applied to the statistic. See Documentation/statistics.txt.
+ *
+ * This variant leaves protecting per-cpu data to clients. It is preferred
+ * whenever clients update several statistics of the same entity in one go.
+ *
+ * You may want to use _statistic_inc() for (X, 1) data pairs.
+ */
+void _statistic_add(struct statistic *stat, int i, s64 value, u64 incr)
+{
+	if (stat[i].state == STATISTIC_STATE_ON)
+		stat[i].add(&stat[i], value, incr);
+}
+EXPORT_SYMBOL_GPL(_statistic_add);
+
+/**
+ * statistic_add - update statistic with incremental data in (X, Y) pair
+ * @stat: struct statistic array
+ * @i: index of statistic to be updated
+ * @value: X
+ * @incr: Y
+ *
+ * The actual processing of the (X, Y) data pair is determined by the current
+ * definition applied to the statistic. See Documentation/statistics.txt.
+ *
+ * This variant takes care of protecting per-cpu data. It is preferred whenever
+ * clients don't update several statistics of the same entity in one go.
+ *
+ * You may want to use statistic_inc() for (X, 1) data pairs.
+ */
+void statistic_add(struct statistic *stat, int i, s64 value, u64 incr)
+{
+	unsigned long flags;
+	local_irq_save(flags);
+	_statistic_add(stat, i, value, incr);
+	local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(statistic_add);
+
+/**
+ * statistic_set - set statistic using total numbers in (X, Y) data pair
+ * @stat: struct statistic array
+ * @i: index of statistic to be updated
+ * @value: X
+ * @total: Y
+ *
+ * The actual processing of the (X, Y) data pair is determined by the current
+ * definition applied to the statistic. See Documentation/statistics.txt.
+ *
+ * No distinction between a concurrency-protected and an unprotected
+ * flavour of statistic_set() is needed. statistic_set() may only
+ * be called when we pull statistic updates from clients. The statistics
+ * infrastructure guarantees serialisation for that. Clients must not
+ * intermix statistic_set() and statistic_add/inc() anyway. That is why
+ * concurrent updates won't happen and there is no additional protection
+ * required for statistics fed through statistic_set().
+ */
+void statistic_set(struct statistic *stat, int i, s64 value, u64 total)
+{
+	struct statistic_discipline *disc = &statistic_discs[stat[i].type];
+	if (stat[i].state == STATISTIC_STATE_ON)
+		disc->set(&stat[i], value, total);
+}
+EXPORT_SYMBOL_GPL(statistic_set);
+
+postcore_initcall(statistic_init);
+module_exit(statistic_exit);
+
+MODULE_LICENSE("GPL");
_
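
Editor's sketch (not part of the patch): the discipline table and the
state guard in statistic_set() can be modelled in plain userspace C as
shown here. Every demo_* name is hypothetical, the struct layouts are
simplified, and the counter_prod rule (accumulating X * Y) is an
assumption about what statistic_add_counter_prod does. Unlike
_statistic_add(), which calls a function pointer stored in the statistic
itself, this model always looks the handler up in the table, the way
statistic_set() does.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the STAT_* types and STATISTIC_STATE_* states. */
enum demo_type { DEMO_CNTR_INC, DEMO_CNTR_PROD, DEMO_NONE };
enum demo_state { DEMO_STATE_OFF, DEMO_STATE_ON };

/* Simplified statistic: one accumulator instead of per-cpu data. */
struct demo_stat {
	enum demo_type type;
	enum demo_state state;
	uint64_t total;
};

/* One entry per statistic type, like struct statistic_discipline. */
struct demo_discipline {
	void (*add)(struct demo_stat *stat, int64_t value, uint64_t incr);
	const char *name;
};

/* counter_inc: ignore X, accumulate Y. */
static void demo_add_counter_inc(struct demo_stat *stat, int64_t value,
				 uint64_t incr)
{
	(void)value;
	stat->total += incr;
}

/* counter_prod: accumulate X * Y (assumed behaviour, see lead-in). */
static void demo_add_counter_prod(struct demo_stat *stat, int64_t value,
				  uint64_t incr)
{
	stat->total += (uint64_t)value * incr;
}

/* Designated initializers index the table by type, as statistic_discs[] does. */
static const struct demo_discipline demo_discs[] = {
	[DEMO_CNTR_INC]  = { .add = demo_add_counter_inc,  .name = "counter_inc" },
	[DEMO_CNTR_PROD] = { .add = demo_add_counter_prod, .name = "counter_prod" },
	[DEMO_NONE]      = { NULL, NULL },
};

/* Updates are dropped unless the statistic is switched on. */
void demo_statistic_add(struct demo_stat *stat, int i, int64_t value,
			uint64_t incr)
{
	if (stat[i].state == DEMO_STATE_ON)
		demo_discs[stat[i].type].add(&stat[i], value, incr);
}
```

A caller would keep an array of struct demo_stat and pass the index of
the statistic to update, mirroring the (stat, i) convention used by
statistic_add() and statistic_set(). Indexing the table by type keeps
the update path free of per-type branches.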

Patches currently in -mm which might be from mp3@xxxxxxxxxx are

