On Wed, Jan 20, 2021 at 05:13:15PM -0300, Gabriel Krisman Bertazi wrote: > > I submitted a first set of patches, based on David Howells' original > superblock notifications patchset, that I expanded into error > reporting and had an example implementation for ext4. Upstream review > has revealed a few design problems: > > - Including the "function:line" tuple in the notification allows the > uniquely identification of the error origin but it also ties the > decodification of the error to the source code, i.e. <function:line> > is expected to change between releases. That's true, but it's better than printing the EIP, which is even harder to decode. Having the line number is better than the alternative, which is either (a) nothing, or (b) the EIP. > - Useful debug data (inode number, block group) have formats specific > to the filesystems, and my design wouldn't be expansible to > filesystems other than ext4. Inode numbers are fairly universal; block groups are certainly ext2/3/4 specific. I'll note that in the original ext4_error messages to syslog and the internal-to-Google netlink notification system, they were sent as an text string. We don't necessarily need to send them to userspace. > - The implementation allowed an error string description (equivalent > to what would be thrown in dmesg) that is too short, as it needs to > fit in a single notification. ... or we need to have some kind of contnuation system where the text string is broken across multiple notification, with some kind of notification id umber. > - Visibility of objects. A bind mount of a subtree shouldn't receive > notifications of objects outside of that bind mount. So this is scope creep beyond the original goals of the project. I understand that there is a desire by folks in the community to support various containerization use cases where only only a portion of file system is visible in a container due to a bind mount. However, we need to recall that ext4_error messages can originate in fairly deep inside the ext4 file system. They indicate major problems with the file system --- the kind that might require drastic system administration reaction. As such, at the point where we discover a problem with an inode, that part of the ext4 code might not have access to the pathname that was used to originally access the inode. We might be inside a workqueue handler, for example, so we might not even running in the same process that had been containerized. We might be holding various file system mutexes or even in some cases a spinlock. What follows from that is that it's not really going to be possible to filter notifications to a subtree. Furthermore, if fanotify requires memory allocation, that's going to be problematic, we may not be in a context where memory allocation is possible. So for that reason, it's not clear to me that fanotify is going to be a good match for this use case. Cheers, - Ted