On 1/26/20 9:49 PM, Leon Romanovsky wrote:
On Sun, Jan 26, 2020 at 01:33:53PM -0800, Jakub Kicinski wrote:
On Sun, 26 Jan 2020 23:08:50 +0200, Leon Romanovsky wrote:
On Sun, Jan 26, 2020 at 12:49:57PM -0800, Jakub Kicinski wrote:
On Sun, 26 Jan 2020 21:41:10 +0200, Leon Romanovsky wrote:
This will end up affecting out-of-tree drivers as well, where it is useful
to know what the version number is, most especially since it is different
from what the kernel provided driver is. How else are we to get this
information out to the user? If this feature gets squashed, we'll end up
having to abuse some other mechanism so we can get the live information from
the driver, and probably each vendor will find a different way to sneak it
out, giving us more chaos than where we started. At least the ethtool
version field is a known and consistent place for the version info.
Shannon does have a point that out of tree drivers still make use of
this field. Perhaps it would be a more suitable first step to print the
kernel version as default and add a comment saying upstream modules
shouldn't overwrite it (perhaps one day CI can catch new violators).
Shannon proposed to remove this field and it was me who said no :)
Obviously, we can't remove fields from UAPI structs.
My plan is to overwrite ->version, delete all users and add
WARN_ONEC(strcpy(..->version_)...) inside net/ethtool/ to catch
abusers.
What I was thinking just now was: initialize ->version to utsname
before drivers are called, delete all upstream users, add a coccicheck
for upstream drivers which try to report the version.
The NFP reports the git hash of the driver source plus the string
"(oot)" for out-of-tree:
https://github.com/Netronome/nfp-drv-kmods/blob/master/src/Kbuild#L297
https://github.com/Netronome/nfp-drv-kmods/blob/master/src/Kbuild#L315
I was inspired by upstream code.
https://elixir.bootlin.com/linux/v5.5-rc7/source/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c#L184
Right, upstream nfp reports kernel version (both in modinfo and ethtool)
GitHub/compat/backport/out-of-tree reports kernel version in which the
code was expected to appear in modinfo:
https://github.com/Netronome/nfp-drv-kmods/commit/7ec15c47caf5dbdf1f9806410535ad5b7373ec34#diff-492d7fa4004d885a38cfa889ed1adbe7L1284
And git hash of the driver source plus out of tree marker in ethtool.
That means it's out-of-tree driver which has to carry the extra code
and require extra feeding. As backport should IMHO.
Leaving to deal with driver version to vendors is not an option too,
because they prove for more than once that they are not capable to
define user visible interfaces. It comes due to their natural believe
that their company is alone in the world and user visible interface
should be suitable for them only.
It is already impossible for users to distinguish properly versions
of different vendors, because they use arbitrary strings with some
numbers.
That is true. But reporting the kernel version without even as much as
in-tree/out-of-tree indication makes the field entirely meaningless.
The long-standing policy in kernel that we don't really care about
out-of-tree code.
Yeah... we all know it's not that simple :)
It is simple, unfortunately netdev people like to complicate things
by declaring ABI in very vague way which sometimes goes so far that
it ends more strict than anyone would imagine.
We, RDMA and many other subsystems mentioned in that ksummit thread,
removed MODULE_VERSION() a long time ago and got zero complains from
the real users.
The in-tree driver versions are meaningless and cause annoying churn
when people arbitrarily bump them. If we can get people to stop doing
that we'll be happy, that's all there is to it.
Out of tree the field is useful, so we don't have to take it away just
as a matter of principle. If we can't convince people to stop bringing
the versions into the tree that'll be another story...
As Shannon pointed, even experienced people will try to sneak those
changes. I assume that it is mainly because they are pushed to do it
by the people who doesn't understand Linux kernel process.
I don't think that the Intel Networking folks were trying to "sneak"
something through, I think they have simply been continuing the process
that they've been following for years. When we have groups such as
them, with a long history of contributions to drivers and stack, not
following the new rules, perhaps we need to take a look at how we're
publicizing these changes.
sln