RE: [PATCH] blk-mq: avoid sysfs buffer overflow by too many CPU cores

Hi Ming,

In the customer case, the cpu_list file was not needed. It was only read as part of a SAP HANA script that collects all the block device data (similar to sosreport). So they were just dumping everything, and the script picked up the mq-related files.

I know that with IRQs we have bitmaps/masks and can represent the list as a range such as "0-27", without listing every CPU. I'm sure there are lots of options to address this, and getting rid of cpu_list is one of them.
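
To illustrate the range-list idea, here is a minimal userspace sketch (not kernel code) that turns a bitmap into a string such as "0-27,32". The function name `format_range_list` and the `bool` array standing in for a cpumask are assumptions made for this example. In the kernel itself, the `%*pbl` printk format and `cpumap_print_to_pagebuf()` produce this kind of output, so nothing like this would need to be hand-written there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the set entries of mask[0..nbits-1] as a
 * range list such as "0-27,32". Returns the number of characters written
 * (excluding the NUL), or -1 if the buffer is too small, in which case
 * buf holds a truncated string.
 */
static int format_range_list(const bool *mask, unsigned int nbits,
			     char *buf, size_t size)
{
	size_t len = 0;
	unsigned int i = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	while (i < nbits) {
		unsigned int start, end;
		int n;

		if (!mask[i]) {
			i++;
			continue;
		}

		/* Extend the run while the next entry is also set. */
		start = i;
		while (i + 1 < nbits && mask[i + 1])
			i++;
		end = i++;

		if (start == end)
			n = snprintf(buf + len, size - len, "%s%u",
				     len ? "," : "", start);
		else
			n = snprintf(buf + len, size - len, "%s%u-%u",
				     len ? "," : "", start, end);

		/* snprintf() reports the length it needed; detect truncation. */
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}
	return (int)len;
}
```

With entries 0 through 27 and 32 set, this writes "0-27,32", so a contiguous set of CPUs takes only a handful of bytes regardless of how many cores there are.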

Best Regards,

Mark Ray
HPE Global Solutions Engineering
mark.ray@xxxxxxx



-----Original Message-----
From: Ming Lei [mailto:ming.lei@xxxxxxxxxx] 
Sent: Thursday, August 15, 2019 9:43 PM
To: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>; linux-block@xxxxxxxxxxxxxxx; stable@xxxxxxxxxxxxxxx; Ray, Mark C (Global Solutions Engineering (GSE)) <mark.ray@xxxxxxx>
Subject: Re: [PATCH] blk-mq: avoid sysfs buffer overflow by too many CPU cores

On Thu, Aug 15, 2019 at 02:35:35PM +0200, Greg KH wrote:
> On Thu, Aug 15, 2019 at 08:29:10PM +0800, Ming Lei wrote:
> > On Thu, Aug 15, 2019 at 02:24:19PM +0200, Greg KH wrote:
> > > On Thu, Aug 15, 2019 at 08:15:18PM +0800, Ming Lei wrote:
> > > > It is reported that sysfs buffer overflow can be triggered in 
> > > > case of too many CPU cores(>841 on 4K PAGE_SIZE) when showing 
> > > > CPUs in one hctx.
> > > > 
> > > > So use snprintf for avoiding the potential buffer overflow.
> > > > 
> > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > Cc: Mark Ray <mark.ray@xxxxxxx>
> > > > > Fixes: 676141e48af7 ("blk-mq: don't dump CPU -> hw queue map on driver load")
> > > > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
> > > > ---
> > > >  block/blk-mq-sysfs.c | 30 ++++++++++++++++++------------
> > > >  1 file changed, 18 insertions(+), 12 deletions(-)
> > > > 
> > > > > diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
> > > > > index d6e1a9bd7131..e75f41a98415 100644
> > > > --- a/block/blk-mq-sysfs.c
> > > > +++ b/block/blk-mq-sysfs.c
> > > > @@ -164,22 +164,28 @@ static ssize_t blk_mq_hw_sysfs_nr_reserved_tags_show(struct blk_mq_hw_ctx *hctx,
> > > >  	return sprintf(page, "%u\n", hctx->tags->nr_reserved_tags);  }
> > > >  
> > > > +/* avoid overflow by too many CPU cores */
> > > >  static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx 
> > > > *hctx, char *page)  {
> > > > -	unsigned int i, first = 1;
> > > > -	ssize_t ret = 0;
> > > > -
> > > > -	for_each_cpu(i, hctx->cpumask) {
> > > > -		if (first)
> > > > -			ret += sprintf(ret + page, "%u", i);
> > > > -		else
> > > > -			ret += sprintf(ret + page, ", %u", i);
> > > > -
> > > > -		first = 0;
> > > > +	unsigned int cpu = cpumask_first(hctx->cpumask);
> > > > +	ssize_t len = snprintf(page, PAGE_SIZE - 1, "%u", cpu);
> > > > +	int last_len = len;
> > > > +
> > > > +	while ((cpu = cpumask_next(cpu, hctx->cpumask)) < nr_cpu_ids) {
> > > > +		int cur_len = snprintf(page + len, PAGE_SIZE - 1 - len,
> > > > +				       ", %u", cpu);
> > > > +		if (cur_len >= PAGE_SIZE - 1 - len) {
> > > > +			len -= last_len;
> > > > +			len += snprintf(page + len, PAGE_SIZE - 1 - len,
> > > > +					"...");
> > > > +			break;
> > > > +		}
> > > > +		len += cur_len;
> > > > +		last_len = cur_len;
> > > >  	}
> > > >  
> > > > -	ret += sprintf(ret + page, "\n");
> > > > -	return ret;
> > > > +	len += snprintf(page + len, PAGE_SIZE - 1 - len, "\n");
> > > > +	return len;
> > > >  }
> > > >
> > > 
> > > What????
> > > 
> > > sysfs is "one value per file".  You should NEVER have to care 
> > > about the size of the sysfs buffer.  If you do, you are doing something wrong.
> > > 
> > > What exactly are you trying to show in this sysfs file?  I can't 
> > > seem to find the Documentation/ABI/ entry for it, am I just 
> > > missing it because I don't know the filename for it?
> > 
> > It is /sys/block/$DEV/mq/$N/cpu_list, all CPUs in this hctx($N) will 
> > be shown via sysfs buffer. The buffer size is one PAGE, how can it 
> > hold when there are too many CPUs(close to 1K)?
> 
> Looks like I only see 1 cpu listed on my machines in those files, what 
> am I doing wrong?

It depends on the machine. The issue was reported on a machine with 896 CPU cores, while the 4K buffer can hold only 841 cores.
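
The 841 figure can be checked with a small userspace sketch (not kernel code). It assumes the CPUs in the hctx are numbered consecutively from 0 and uses the same "%u" / ", %u" format as the old sprintf() loop, reserving two bytes for the trailing "\n" and the NUL. The function name `cpus_that_fit` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: count how many consecutive CPU numbers, starting
 * at 0, fit in a buffer of buf_size bytes when printed as "0, 1, 2, ..."
 * followed by "\n" and the terminating NUL.
 */
static unsigned int cpus_that_fit(size_t buf_size)
{
	size_t used = 0;
	unsigned int cpu = 0;

	assert(buf_size >= 2);	/* room for at least "\n" and NUL */

	for (;;) {
		/* snprintf(NULL, 0, ...) returns the length the output needs. */
		int n = snprintf(NULL, 0, cpu == 0 ? "%u" : ", %u", cpu);

		if (used + (size_t)n + 2 > buf_size)
			break;
		used += (size_t)n;
		cpu++;
	}
	return cpu;
}
```

Calling `cpus_that_fit(4096)` returns 841: CPUs 0 to 9 take 28 bytes, 10 to 99 take 360, and 100 to 840 take 3705, for 4093 bytes plus the newline and NUL. The next entry, ", 841", would need 5 more bytes and no longer fits.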

> 
> Also, I don't see cpu_list in any of the documentation files, so I 
> have no idea what you are trying to have this file show.
> 
> And again, "one value per file" is the sysfs rule.  "all cpus in the 
> system" is not "one value" :)

I agree, and this file shouldn't be there, given that each CPU will have one kobject dir under the hctx dir.

We may kill the 'cpu_list' attribute. Does anyone object?


Thanks,
Ming


