On Wed, 28 May 2014 11:01:53 +0200 Heiko Carstens <heiko.carstens@xxxxxxxxxx> wrote:

> Currently, /proc/stat uses single_open() for showing information. This
> means all the data will be gathered and buffered at once into a (big)
> buffer.
>
> /proc/stat shows stats per cpu and stats per IRQ. To get the information
> in one shot, it allocates a big buffer (up to KMALLOC_MAX_SIZE).
>
> Eric Dumazet reported that the bufsize calculation doesn't take
> the number of IRQs into account, so the information cannot be
> read in one shot. (Because of this, seq_read() will allocate the buffer
> again and read the whole data again...)
>
> This patch changes /proc/stat to use seq_open() rather than single_open()
> and provides ->start(), ->next(), ->stop(), ->show().
>
> With this, /proc/stat will not need to take care of the size of the
> buffer.
>
> [heiko.carstens@xxxxxxxxxx]: This is the forward port of a patch
> from KAMEZAWA Hiroyuki (https://lkml.org/lkml/2012/1/23/41).
> I added a couple of simple changes, e.g. the cpu iterator
> handles 32 cpus in a batch to avoid lots of iterations.
>
> With this patch it should not happen anymore that reading /proc/stat
> fails because of a failing high order memory allocation.

So this deletes the problematic allocation which [1/2] kind-of fixed, yes?

I agree with Ian - there's a hotplugging race. And [1/2] doesn't do
anything to address the worst-case allocation size. So I think we may as
well do this all in a single patch.

Without having looked closely at the code I worry a bit about the effects.
/proc/stat is a complex thing and its contents will vary in strange ways
as the things it is reporting on undergo concurrent changes. This patch
will presumably replace one set of bizarre behaviour with a new set of
bizarre behaviour and there may be unforeseen consequences to established
userspace.

So we're going to need a lot of testing and a lot of testing time to
identify issues and weed them out.

So.. can we take this up for 3.16-rc1?
See if we can get some careful review done then, and test it for a couple
of months?

Meanwhile, the changelog looks a bit hastily thrown together - some
smoothing would be nice, and perhaps some work spent identifying possible
behavioural changes. Timing changes, locking changes, effects of
concurrent fork/exit activity etc?

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html