Re: [PATCH v2 1/8] lib/printbuf: New data structure for heap-allocated strings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 22, 2022 at 03:39:16PM -0400, Steven Rostedt wrote:
> I do not consider Facebook an open source company. One reason I turned them
> down.

Surely you can see how NIH syndrome isn't something that just happens at
closed-source companies? How a default cultural assumption of "we do things the
way we've always done" leads to things getting insular?

> > The reason I bring that up is that in this case, printbuf is the more evolved,
> > more widely used implementation, and you're asking me to discard it so the
> > kernel can stick with its more primitive, less widely used implementation.
> > 
> > $ git grep -w seq_buf|wc -l
> > 86
> > 
> > $ git grep -w printbuf|wc -l
> > 366
> 
> $ git grep printbuf
> drivers/media/i2c/ccs/ccs-reg-access.c:                 char printbuf[(MAX_WRITE_LEN << 1) +
> drivers/media/i2c/ccs/ccs-reg-access.c:                 bin2hex(printbuf, regdata, msg.len);
> drivers/media/i2c/ccs/ccs-reg-access.c:                         regs->addr + j, printbuf);
> 
> I don't see it.

Here: https://evilpiepirate.org/git/bcachefs.git/

It may not be merged yet, but it is actively developed open source code with
active users that's intended to be merged!

> I'd like to know more to why seq_buf is not good for you. And just telling
> me that you never seriously tried to make it work because you were afraid
> of causing tracing regressions without ever asking the tracing maintainer
> is not going to cut it.

I didn't know about seq_buf until a day or two ago, that's literally all it was.

And Steve, apologies if I've come across as being a dick about this, that wasn't
my intent.  I've got nothing against you or your code - I'd love it if we could
just have a discussion about them on their merits, and if it feels like I'm
making an issue about this unnecessarily that's because I think there's
something about kernel process and culture worth improving that I want to raise,
so I'm sticking my neck out a bit here.

So here's the story of how I got from where seq_buf is now to where printbuf is
now:

 - Printbuf started out as almost an exact duplicate of seq_buf (like I said,
   not intentionally), with an external buffer typically allocated on the stack.

 - As error/log messages got to be bigger and more structured, stack usage
   eventually became an issue, so eventually I added the heap allocations. 

 - This made them a lot more convenient to use, and made possible entirely new
   ways of using them - so I started using them more, and converting everything
   that outputted to strings to them.

 - This lead to the realization that when pretty-printers are easy and
   convenient to write, that leads to writing pretty-printers for _more_ stuff,
   which makes it easy to stay in the habit of adding anything relevant to
   sysfs/debugfs - and log/error messages got a _whole_ lot better when I
   realized instead of writing format strings for every basic C type I can just
   use the .to_text() methods of the high level objects I'm working with.

Basically, my debugging life has gotten _drastically_ easier because of this
change in process and approach - deadlocks that I used to have to attach a
debugger for are now trivial because all the relevant state is in debugfs and
greppable, and filesystem inconsistencies that used to suck to debug I now just
take what's in the error message and grep through the journal for.

I can't understate how invaluable all this stuff has been, and I'm excited to
take the lessons I've learned and apply them to the wider kernel and make other
people's lives easier too.

The shrinkers-OOM-reporting patch was an obvious starting point because
 - shrinkers have internal state that's definitely worth reporting on
 - we shouldn't just be logging this on OOM, we should also make this available
   in sysfs or debugfs.

Feature wise, printbufs have:
 - heap allocation
 - extra state for formatting: indent level, tabstops, and a way of specifying
   units.

That's basically it. Heap allocation adds very little code and eliminates a
_lot_ of headaches in playing the "how much do I need to/can I put on the stack"
game, and you'll want the formatting options as soon as you start formatting
multi line output and writing pretty printers that call other pretty printers.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux