On Tue, Dec 08, 2020 at 04:29:32PM -0300, Gabriel Krisman Bertazi wrote: > "Darrick J. Wong" <darrick.wong@xxxxxxxxxx> writes: > > > On Tue, Dec 08, 2020 at 09:58:25AM -0300, Gabriel Krisman Bertazi wrote: > >> David Howells <dhowells@xxxxxxxxxx> writes: > >> > >> > Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> wrote: > >> > > >> >> @@ -130,6 +131,8 @@ struct superblock_error_notification { > > > > FWIW I wonder if this really should be inode_error_notification? > > > > If (for example) ext4 discovered an error in the blockgroup descriptor > > and wanted to report it, the inode and block numbers would be > > irrelevant, but the blockgroup number would be nice to have. > > A previous RFC had superblock_error_notification and > superblock_inode_error_notification split, I think we can recover that. > > > > >> >> __u32 error_cookie; > >> >> __u64 inode; > >> >> __u64 block; > >> >> + char function[SB_NOTIFICATION_FNAME_LEN]; > >> >> + __u16 line; > >> >> char desc[0]; > >> >> }; > >> > > >> > As Darrick said, this is a UAPI breaker, so you shouldn't do this (you can, > >> > however, merge this ahead a patch). Also, I would put the __u16 before the > >> > char[]. > >> > > >> > That said, I'm not sure whether it's useful to include the function name and > >> > line. Both fields are liable to change over kernel commits, so it's not > >> > something userspace can actually interpret. I think you're better off dumping > >> > those into dmesg. > >> > > >> > Further, this reduces the capacity of desc[] significantly - I don't know if > >> > that's a problem. > >> > >> Yes, that is a big problem as desc is already quite limited. I don't > > > > How limited? > > The largest notification is 128 bytes, the one with the biggest header > is superblock_error_notification which leaves 56 bytes for description. > > > > >> think it is a problem for them to change between kernel versions, as the > >> monitoring userspace can easily associate it with the running kernel. > > > > How do you make that association? $majordistro's 4.18 kernel is not the > > same as the upstream 4.18. Wouldn't you rather the notification message > > be entirely self-describing rather than depending on some external > > information about the sender? > > True. I was thinking on my use case where the customer controls their > infrastructure and would specialize their userspace tools, but that is > poor design on my part. A self describing mechanism would be better. > > > > >> The alternative would be generating something like unique IDs for each > >> error notification in the filesystem, no? > >> > >> > And yet further, there's no room for addition of new fields with the desc[] > >> > buffer on the end. Now maybe you're planning on making use of desc[] for > >> > text-encoding? > >> > >> Yes. I would like to be able to provide more details on the error, > >> without having a unique id. For instance, desc would have the formatted > >> string below, describing the warning: > >> > >> ext4_warning(inode->i_sb, "couldn't mark inode dirty (err %d)", err); > > > > Depending on the upper limit on the length of messages, I wonder if you > > could split the superblock notification and the description string into > > separate messages (with maybe the error cookie to tie them together) so > > that the struct isn't limited by having a VLA on the end, and the > > description can be more or less an arbitrary string? > > > > (That said I'm not familiar with the watch queue system so I have no > > idea if chained messages even make sense here, or are already > > implemented in some other way, or...) > > I don't see any support for chaining messages in the current watch_queue > implementation, I'd need to extend the interface to support it. I > considered this idea before, given the small description size, but I > thought it would be over-complicated, even though much more future > proof. I will look into that. > > What about the kernel exporting a per-filesystem table, as a build > target or in /sys/fs/<fs>/errors, that has descriptions strings for each > error? Then the notification can have only the FS type, index to the > table and params. This won't exactly be self-describing as you wanted > but, differently from function:line, it removes the need for the source > code, and allows localization. The per-filesystem table would be > stable ABI, of course. Yikes. I don't think people are going to be ok with a message table where we can never remove the strings. I bet GregKH won't like that either (one value per sysfs file). (Maybe I misread that and all you meant by stable ABI is the fact that the table exists at a given path and the notification message gives you a index into ... wherever we put it.) --D > > -- > Gabriel Krisman Bertazi