On Fri, Feb 08, 2019 at 09:53:03AM +0000, David Laight wrote: > From: 'Marcelo Ricardo Leitner' > > Sent: 07 February 2019 17:47 > ... > > > > Maybe what we want(ed) here then is explicit versioning, to have the 3 > > > > definitions available. Then the application is able to use, say struct > > > > sctp_event_subscribe, and be happy with it, while there is struct > > > > sctp_event_subscribe_v2 and struct sctp_event_subscribe_v3 there too. > > > > > > > > But it's too late for that now because that would break applications > > > > already using the new fields in sctp_event_subscribe. > > > > > > It is probably better to break the recompilation of the few programs > > > that use the new fields (and have them not work on old kernels) > > > than to stop recompilations of old programs stop working on old > > > kernels or have requested new options silently ignored. > > > > I got confused here, not sure what you mean. Seems there is one "stop" > > word too many. > > More confusing than I intended... > > With the current kernel and headers a 'new program' (one that > needs the new options) will fail to run on an old kernel - which is good. > However a recompilation of an 'old program' (that doesn't use > the new options) will also fail to run on an old kernel - which is bad. > I disagree with this, at least as a unilateral statement. I would assert that an old program, within the constraints of the issue being discussed here, will run perfectly well, when built and run against a new kernel. At issue is the size of the structure sctp_event_subscribe, and the fact that in several instances over the last few years, its been extended to be larger and encompass more events to subscribe to. Nominally an application will use this structure (roughly) as follows: ... struct sctp_event_subscribe events; size_t evsize = sizeof(events); memset(&events, 0, sizeof(events)); events.sctp_send_failure_event = 1; /*example event subscription*/ if (sctp_setsocktpt(sctp_fd, SOL_SCTP, SCTP_EVENTS, &events, &evsize) < 0) { /* do error recovery */ } .... Assume this code will be built and run against kernel versions A and B, in which: A) has a struct sctp_event_subscribe with a size of 9 bytes B) has a struct sctp_event_subscribe with a size of 10 bytes (due to the added field sctp_sender_dry_event) That gives us 4 cases to handle 1) Application build against kernel A and run on kernel A. This works fine, the sizes of the struct in question will always match 2) Application is built against kernel A and run on kernel B. In this case, everything will work because the application passes a buffer of size 9, and the kernel accepts it, because it allows for buffers to be shorter than the current struct sctp_event_subscribe size. The kernel simply operates on the options available in the buffer. The application is none the wiser, because it has no knoweldge of the new option, nor should it because it was built against kernel A, that never offered that option 3) Application is built against kernel B and run on kernel B. This works fine for the same reason as (1). 4) Application is built against kernel B and run on kernel A. This will break because the application is passing a buffer that is larger than what the kernel expects, and rightly so. The application is passing in a buffer that is incompatible with what the running kernel expects. We could look into ways in which to detect the cases in which this might be 'ok', but I don't see why we should bother, because at some point its still an error to pass in an incompatible buffer. In my mind this is no different than trying to run a program that allocates hugepages on a kernel that doesn't support hugepages (just to make up an example). Applications built against newer kernel can't expect all the features/semantics/etc to be identical to older kernels. > Changing the kernel to ignore extra events flags breaks the 'new' > program. > It shouldn't. Assuming you have a program built against headers from kernel B (above), if you set a field in the structure that only exists in kernel B, and try to run it on kernel A, you will get an EINVAL return, which is correct behavior because you are attempting to deliver information to the kernel that kernel A (the running kernel) doesn't know about. Thats correct behavior. > Versioning the structure now (even though it should have been done > earlier) won't change the behaviour of existing binaries. > I won't disagree about the niceness of versioning, but that ship has sailed. > However a recompilation of an 'old' program would use the 'old' > structure and work on old kernels. To be clear, this is situation (1) above, and yeah, running on the kernel you built your application against should always work from a compatibility standpoint. > Attempts to recompile a 'new' program will fail - until the structure > name (or some #define to enable the extra fields) is changed. > Yes, but this is alawys the case for structures that change. If you have an application built against kernel (B), and uses structure fields that only exist in that version of the kernel (and not earlier) will fail to compile when built against kernel (A) headers, and thats expected. This happens with any kernel api that exists in a newer kernel but not an older kernel. > Breaking compilations is much better than unexpected run-time > behaviour. > Any time you make a system call to the kernel, you have to be prepared to handle the resulting error condition, thats not unexpected. To assume that a system call will always work is bad programming practice. Neil > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales) > >