On Thu, 23 Sep 2021, Coly Li wrote:
> Hi all the kernel gurus, and folks in mailing lists,
>
> This is a question about exporting 4KB+ text information via sysfs
> interface. I need advice on how to handle the problem.

Why do you think there is a problem?
As documented in Documentation/admin-guide/md.rst, the truncation at 1
page is expected and by design.

The "unacknowledged_bad_blocks" file is the important one that is needed
for correct behaviour.  Being able to read a single block is sufficient,
though being able to read more than one could provide better performance
in some cases.

The "bad_blocks" file primarily exists to provide visibility into the
state of the system - useful during development.  It can be written to
in order to add bad blocks.  It never *needs* to be read from.

The authoritative source of information about the set of bad blocks is
the on-disk data, which can be and should be read directly...

Except that mdadm does.  That was a mistake.  check_for_cleared_bb() is
wrong.  I wonder why it was added.  The commit message doesn't give any
justification.

NeilBrown

>
> Recently I work on the bad blocks API (block/badblocks.c) improvement,
> there is a sysfs file to export the bad block ranges for md raid. E.g.
> for a md raid1 device, file
>     /sys/block/md0/md/rd0/bad_blocks
> may contain the following text content,
>     64 32
>     128 8
> The above lines mean there are two bad block ranges, one starts at LBA
> 64, length 32 sectors, another one starts at LBA 128 and length 8
> sectors. All the content is generated from the internal bad block
> records with 512 elements. In my testing the worst case only 185 from
> 512 records can be displayed via the sysfs file if the LBA string is
> very long, e.g. the following content,
>     17668164135030776 512
>     17668164135029776 512
>     17668164135028776 512
>     17668164135027776 512
>     ... ...
> The bad block ranges stored in internal bad blocks array are correct,
> but the output message is truncated. This is the problem I encountered.
>
> I don't see sysfs has seq_file support (correct me if I am wrong), and I
> know it is improper to transfer 4KB+ text via sysfs interface, but the
> code has been here for a long time already.
>
> There are 2 ideas for a fix showing up in my brain,
> 1) Do not fix the problem
>     Normally it is rare that a storage media has 100+ bad block ranges,
> maybe in real world all the existing bad blocks information won't exceed
> the page size limitation of sysfs file.
> 2) Add seq_file support to sysfs interface if there is none
>
> There is probably another, better solution. So I do want to
> get hint/advice from you.
>
> Thanks in advance for any comment :-)
>
> Coly Li
>
> On 9/14/21 12:36 AM, Coly Li wrote:
> > This is the second effort to improve badblocks code APIs to handle
> > multiple ranges in bad block table.
> >
> > There are 2 changes from previous version,
> > - Fixes 2 bugs in front_overwrite() which are detected by the user
> >   space testing code.
> > - Provide the user space testing code in last patch.
> >
> > There is NO in-memory or on-disk format change in the whole series, all
> > existing API and data structures are consistent. This series only
> > improves the code algorithm to handle more corner cases, the interfaces
> > are the same and consistent for all existing callers (md raid and nvdimm
> > drivers).
> >
> > The original motivation of the change is from the requirement from our
> > customer, that current badblocks routines don't handle multiple ranges.
> > For example if the bad block setting range covers multiple ranges from
> > bad block table, only the first two bad block ranges are merged and the
> > rest of the ranges are left intact. The expected behavior should be all
> > the covered ranges to be handled.
> >
> > All the patches are tested by modified user space code and the code
> > logic works as expected. The modified user space testing code is
> > provided in last patch.
> > The testing code detects 2 defects in helper
> > front_overwrite(), which are fixed in this version.
> >
> > The whole change is divided into 6 patches to make the code review
> > clearer and easier. If people prefer, I'd like to post a single large
> > patch finally after the code review is accomplished.
> >
> > This version is seriously tested, and so far no more defects are observed.
> >
> >
> > Coly Li
> >
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: Hannes Reinecke <hare@xxxxxxx>
> > Cc: Jens Axboe <axboe@xxxxxxxxx>
> > Cc: NeilBrown <neilb@xxxxxxx>
> > Cc: Richard Fan <richard.fan@xxxxxxxx>
> > Cc: Vishal L Verma <vishal.l.verma@xxxxxxxxx>
> > ---
> > Changelog:
> > v3: add tester Richard Fan <richard.fan@xxxxxxxx>
> > v2: the improved version, and with testing code.
> > v1: the first completed version.
> >
> >
> > Coly Li (6):
> >   badblocks: add more helper structure and routines in badblocks.h
> >   badblocks: add helper routines for badblock ranges handling
> >   badblocks: improvement badblocks_set() for multiple ranges handling
> >   badblocks: improve badblocks_clear() for multiple ranges handling
> >   badblocks: improve badblocks_check() for multiple ranges handling
> >   badblocks: switch to the improved badblock handling code
> > Coly Li (1):
> >   test: user space code to test badblocks APIs
> >
> >  block/badblocks.c         | 1599 ++++++++++++++++++++++++++++++-------
> >  include/linux/badblocks.h |   32 +
> >  2 files changed, 1340 insertions(+), 291 deletions(-)
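
For anyone who only wants the visibility that the sysfs file is meant to
give, here is a minimal user-space sketch of reading the "start length"
format quoted above (one range per line, both values in sectors). The
function name parse_bad_blocks() is hypothetical and is not taken from
mdadm or the kernel. Because the file is limited to one page, the last
line may be cut short, so the sketch silently drops any line that does
not hold exactly two integers; the result should be treated as possibly
incomplete, not as the authoritative list.

```python
from typing import List, Tuple


def parse_bad_blocks(text: str) -> List[Tuple[int, int]]:
    """Parse sysfs-style bad block text into (start_sector, length) pairs.

    Hypothetical helper for illustration only. Lines that do not contain
    exactly two integers (for example a final line truncated at the page
    boundary) are skipped instead of raising an error.
    """
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # blank or truncated line
        try:
            start, length = int(fields[0]), int(fields[1])
        except ValueError:
            continue  # not numeric, ignore
        ranges.append((start, length))
    return ranges


# Sample content copied from the quoted message; on a real system the
# text would come from a file such as /sys/block/md0/md/rd0/bad_blocks.
sample = "64 32\n128 8\n"
print(parse_bad_blocks(sample))  # [(64, 32), (128, 8)]
```

Skipping malformed lines is a deliberate choice here: a reader of a
page-limited file cannot tell a truncated entry from a complete one that
happens to be shorter, so it is safer to report fewer ranges than a
wrong one.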