On Wed, 28 May 2014 11:01:53 +0200 Heiko Carstens <heiko.carstens@xxxxxxxxxx> wrote:

> Currently, /proc/stat uses single_open() for showing information. This
> means all the data will be gathered and buffered at once into a (big)
> buffer.
>
> /proc/stat shows stats per cpu and stats per IRQ. To get the information
> in one shot, it allocates a big buffer (up to KMALLOC_MAX_SIZE).
>
> Eric Dumazet reported that the bufsize calculation doesn't take
> the number of IRQs into account, so the information cannot be
> read in one shot. (Because of this, seq_read() will allocate the buffer
> again and read the whole data again...)
>
> This patch changes /proc/stat to use seq_open() rather than single_open()
> and provides ->start(), ->next(), ->stop(), ->show().
>
> With this, /proc/stat will not need to take care of the size of the
> buffer.
>
> [heiko.carstens@xxxxxxxxxx]: This is the forward port of a patch
> from KAMEZAWA Hiroyuki (https://lkml.org/lkml/2012/1/23/41).
> I added a couple of simple changes, e.g. the cpu iterator
> handles 32 cpus in a batch to avoid lots of iterations.
>
> With this patch it should not happen anymore that reading /proc/stat
> fails because of a failing high order memory allocation.

So this deletes the problematic allocation which [1/2] kind-of fixed, yes?

I agree with Ian - there's a hotplugging race. And [1/2] doesn't do
anything to address the worst-case allocation size. So I think we may as
well do this all in a single patch.

Without having looked closely at the code I worry a bit about the effects.
/proc/stat is a complex thing and its contents will vary in strange ways
as the things it is reporting on undergo concurrent changes. This patch
will presumably replace one set of bizarre behaviour with a new set of
bizarre behaviour and there may be unforeseen consequences to established
userspace.

So we're going to need a lot of testing and a lot of testing time to
identify issues and weed them out.

So.. can we take this up for 3.16-rc1?
See if we can get some careful review done then, and test it for a couple
of months?

Meanwhile, the changelog looks a bit hastily thrown together - some
smoothing would be nice, and perhaps some work spent identifying possible
behavioural changes. Timing changes, locking changes, effects of
concurrent fork/exit activity etc?

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html