On Wed, 02/18 19:49, Ingo Molnar wrote:
>
> * Fam Zheng <famz@xxxxxxxxxx> wrote:
>
> > On Sun, 02/15 15:00, Jonathan Corbet wrote:
> > > On Fri, 13 Feb 2015 17:03:56 +0800
> > > Fam Zheng <famz@xxxxxxxxxx> wrote:
> > >
> > > > SYNOPSIS
> > > >
> > > >        #include <sys/epoll.h>
> > > >
> > > >        int epoll_pwait1(int epfd, int flags,
> > > >                         struct epoll_event *events,
> > > >                         int maxevents,
> > > >                         struct epoll_wait_params *params);
> > >
> > > Quick, possibly dumb question: might it make sense to also pass in
> > > sizeof(struct epoll_wait_params)?  That way, when somebody wants to add
> > > another parameter in the future, the kernel can tell which version is in
> > > use and they won't have to do an epoll_pwait2()?
> > >
> >
> > Flags can be used for that, if the change is not
> > radically different.
>
> Passing in size is generally better than flags, because
> that way an extension of the ABI (new field[s])
> automatically signals towards the kernel what to do with
> old binaries - while extending the functionality of new
> binaries, without sacrificing functionality.
>
> With flags you are either limited to the same structure
> size - or have to decode a 'size' value from the flags
> value - which is fragile (and in which case a real 'size'
> parameter is better).
>
> In the perf ABI we use something like that: there's a
> perf_attr.size parameter that iterates the ABI forward,
> while still being binary compatible with older software.
>
> If old binaries pass in a smaller structure to a newer
> kernel then the kernel pads the new fields with zero by
> default - that way the kernel internals are never burdened
> with compatibility details and data format versions.
>
> If new user-space passes in a larger structure than the
> kernel can handle then the kernel returns an error - this
> way user-space can transparently support conditional
> features and fallback logic.
>
> It works really well, we've done literally a hundred perf
> ABI extensions this way in the last 4+ years, in a pretty
> natural fashion, without littering the kernel (or
> user-space) with version legacies and without breaking
> existing perf tooling.
>
> Other syscall ABIs already get painful when trying to
> handle 2-3 data structure versions, so people either give
> up, or add flags kludges or go to new syscall entries:
> which is painful in its own fashion and adds unnecessary
> latency to feature introduction as well.
>

Excellent. This now makes a lot of sense to me, thanks to your explanations,
Ingo. I'll add the "size" field in the next revision.

Thanks,
Fam
--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
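
For illustration, the size-based scheme Ingo describes could be sketched
roughly as below. This is a user-space stand-in only, not the actual perf or
epoll code: the struct layouts (params_v1, params_v2), the clock_id field, the
helper name copy_sized_params() and the choice of -EINVAL/-E2BIG as error
codes are all assumptions made up for the example. A smaller (older) struct is
zero-padded up to the kernel's size, and a larger (newer) struct is rejected
so that the caller can fall back.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts for illustration; not the real epoll_wait_params. */
struct params_v1 {
	uint32_t size;       /* sizeof() of the struct the caller was built with */
	int32_t  timeout_ms;
};

struct params_v2 {
	uint32_t size;
	int32_t  timeout_ms;
	uint64_t clock_id;   /* added in v2; 0 selects the old behaviour */
};

/*
 * Stand-in for the kernel-side copy. kbuf is the kernel's own struct of
 * ksize bytes; ubuf is the caller's struct, whose first member is its size.
 * Returns 0 on success or a negative errno value (assumed codes).
 */
static int copy_sized_params(void *kbuf, size_t ksize, const void *ubuf)
{
	uint32_t usize;

	assert(kbuf != NULL && ubuf != NULL);
	assert(ksize >= sizeof(struct params_v1));

	/* Read the size field first; everything else depends on it. */
	memcpy(&usize, ubuf, sizeof(usize));

	if (usize < sizeof(struct params_v1))
		return -EINVAL;  /* smaller than the first published layout */
	if (usize > ksize)
		return -E2BIG;   /* caller is newer than this "kernel" */

	memset(kbuf, 0, ksize);      /* fields the caller lacks default to 0 */
	memcpy(kbuf, ubuf, usize);   /* copy only what the caller provided */
	return 0;
}
```

With this shape, an old binary passing params_v1 to a "kernel" built with
params_v2 gets clock_id == 0, while a new binary passing params_v2 to a
"kernel" that only knows params_v1 sees -E2BIG and can retry with the smaller
struct.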