Hi Shuah,

On 23 June 2017 at 01:53, Shuah Khan <shuah@xxxxxxxxxx> wrote:
> Hi Tom,
>
> On 06/22/2017 01:48 PM, Tom Gall wrote:
>> Hi
>>
>> On Thu, Jun 22, 2017 at 2:06 PM, Shuah Khan <shuah@xxxxxxxxxx> wrote:
>>> On 06/22/2017 11:50 AM, Kees Cook wrote:
>>>> On Thu, Jun 22, 2017 at 10:49 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>>>> On Thu, Jun 22, 2017 at 10:09 AM, Shuah Khan <shuah@xxxxxxxxxx> wrote:
>>>>>> On 06/22/2017 10:53 AM, Kees Cook wrote:
>>>>>>> On Thu, Jun 22, 2017 at 9:18 AM, Sumit Semwal <sumit.semwal@xxxxxxxxxx> wrote:
>>>>>>>> Hi Kees, Andy,
>>>>>>>>
>>>>>>>> On 15 June 2017 at 23:26, Sumit Semwal <sumit.semwal@xxxxxxxxxx> wrote:
>>>>>>>>> 3. 'seccomp ptrace hole closure' patches got added in 4.7 [3] -
>>>>>>>>> feature and test together.
>>>>>>>>> - This one also seems like a security hole being closed, and the
>>>>>>>>> 'feature' could be a candidate for stable backports, but Arnd tried
>>>>>>>>> that, and it was quite non-trivial. So perhaps we'll need some help
>>>>>>>>> from the subsystem developers here.
>>>>>>>>
>>>>>>>> Could you please help us sort this out? Our goal is to help Greg
>>>>>>>> with testing stable kernels, and currently the seccomp tests fail
>>>>>>>> because the latest kselftest exercises a feature (the seccomp
>>>>>>>> ptrace hole closure) that is missing from those kernels.
>>>>>>>>
>>>>>>>> If you feel the feature isn't a stable candidate, then could you
>>>>>>>> please help make the test degrade gracefully in its absence?
>>>
>>> In some cases, it is not easy to degrade and/or check for a feature.
>>> Probably several security features could fall into this bucket.
>>>
>>>>>>>
>>>>>>> I don't really want to have that change be a backport -- it's quite
>>>>>>> invasive across multiple architectures.
>>>
>>> Agreed. The same rule that applies to the kernel applies to tests as
>>> well. If a kernel feature can't be backported, the test for that
>>> feature falls into the same bucket. It shouldn't be backported.
>>>
>>>>>>>
>>>>>>> I would say just add a kernel version check to the test. This is
>>>>>>> probably not the only selftest that will need such things. :)
>>>>>>
>>>>>> Adding release checks to selftests is going to be problematic for
>>>>>> maintenance. Tests should fail gracefully if a feature isn't
>>>>>> supported in older kernels.
>>>>>>
>>>>>> Several tests do that now; please find a way to check for
>>>>>> dependencies and feature availability and fail the test gracefully.
>>>>>> If there is a test that can't do that for some reason, we can
>>>>>> discuss it, but as a general rule, I don't want to see kselftest
>>>>>> patches that check the release.
>>>>>
>>>>> If a future kernel inadvertently loses the new feature and degrades
>>>>> to the behavior of old kernels, that would be a serious bug and
>>>>> should be caught.
>>>
>>> Agreed. If I understand you correctly, by not testing stable kernels
>>> with their own selftests, some serious bugs could go undetected.
>>
>> Personally I'm a bit skeptical. I think the reasoning is more that the
>> latest selftests provide more coverage, and therefore should be better
>> tests, even on older kernels.
>
> The assumption that "the latest selftests provide more coverage, and
> therefore should be better tests, even on older kernels" is incorrect.
>
> Selftests in general track kernel features. In some cases, new tests
> could be added that provide better coverage on older kernels; however,
> it is more likely that new tests are added to test new kernel features
> and enhancements to existing features. Based on the second point,
> "enhancements to existing features", it is more important to test
> newer kernels with older selftests. This does happen in kernel
> integration cycles during development.
>
> As a general rule, testing stable kernels with their own selftests
> will yield the best results.

I would have agreed totally if the selftests and the kernel had been in
sync since forever. But since the kselftests are a comparatively recent
addition, the number of tests available for features existing in LTS
kernels is really quite small. Just as a comparison, 4.4-LTS is missing
tests for bpf, cpufreq, gpio, media_tests, networking, and prctl, to
name a few.

Also, while trying to run kselftests from later kernels on 4.4, we only
had a few failures for existing features, while most other tests ran
ok. Just another data point.
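For what it's worth, the kind of graceful degradation we're hoping for
is often just a small runtime probe at the top of a test. A minimal
sketch, not taken from any in-tree test (PR_HYPOTHETICAL_NEW_OPTION is
a made-up stand-in for whatever new flag a test depends on, and 4 is
the kselftest "skip" exit code):

#include <errno.h>
#include <stdio.h>
#include <sys/prctl.h>

#define KSFT_SKIP 4	/* kselftest "skip" exit code */

#ifndef PR_HYPOTHETICAL_NEW_OPTION
#define PR_HYPOTHETICAL_NEW_OPTION 999	/* placeholder, not a real prctl */
#endif

int main(void)
{
	/*
	 * Probe for the new flag; kernels that predate it reject the
	 * option with EINVAL, so skip instead of failing.
	 */
	if (prctl(PR_HYPOTHETICAL_NEW_OPTION, 0, 0, 0, 0) == -1 &&
	    errno == EINVAL) {
		printf("feature not supported on this kernel, skipping\n");
		return KSFT_SKIP;
	}

	/* ... the actual test of the new behavior would go here ... */
	return 0;
}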
>>
>>>>
>>>> Right. I really think stable kernels should be tested with their own
>>>> selftests. If some test is needed in a stable kernel, it should be
>>>> backported to that stable kernel.
>>>
>>> Correct. This is always a safe option. There might be cases that even
>>> prevent tests from being built, especially if a new feature adds new
>>> fields to an existing structure.
>>>
>>> It appears that in some cases users want to run newer tests on older
>>> kernels. Some tests can clearly detect feature support using module
>>> presence and/or a Kconfig option being enabled or disabled. These
>>> conditions can occur even on a kernel that supports a new module or
>>> new config option: the kernel the test is running on might not have
>>> the feature enabled, or the module might not be present. In these
>>> cases, it would be easier to detect the situation and skip the test.
>>>
>>> However, some features aren't so easy. For example:
>>>
>>> - A new flag is added to a syscall, and a new test is added. It might
>>>   not be easy to detect that.
>>> - We might have some tests that can't detect and skip.
>>>
>>> Based on this discussion, it is probably accurate to say:
>>>
>>> 1. It is recommended that selftests from the same release be run on
>>>    the kernel.
>>> 2. Selftests from newer kernels will run on older kernels; users
>>>    should understand the risks, such as some tests failing or not
>>>    detecting feature-degradation bugs.
>>> 3. Selftests will fail gracefully on older releases if at all
>>>    possible.
>>
>> How about being gracefully skipped instead of failing?
>
> Yes. That is the goal and that is what tests do. Tests detect
> dependencies on features, modules, and config options and decide to
> skip the test. If a test doesn't do that, it gets fixed.
>
>>
>> The latter suggests that in some situations the test case can detect
>> that it's pointless to run something and say as much, instead of
>> emitting a failure that would be a waste of time to look into.
>
> Right. Please see above. However, correctly detecting dependencies
> isn't possible in all cases. In some cases, failing is all a test can
> do.
>
>>
>> As another example, take tools/testing/selftests/net/psock_fanout.c.
>> On 4.9 it'll fail to compile (using master's selftests) because
>> PACKET_FANOUT_FLAG_UNIQUEID isn't defined. Add a simple #ifdef for
>> that symbol and the psock_fanout test will compile and run just fine.
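For reference, the guard Tom describes would look roughly like the
following. This is only a sketch with an illustrative subtest function,
not the actual psock_fanout.c patch:

#include <stdio.h>
#include <linux/if_packet.h>

static void test_uniqueid(void)
{
#ifdef PACKET_FANOUT_FLAG_UNIQUEID
	/*
	 * Headers are new enough; the real test would exercise
	 * PACKET_FANOUT_FLAG_UNIQUEID here.
	 */
	printf("uniqueid subtest runs\n");
#else
	/*
	 * Headers predate the flag; skip this subtest but keep the
	 * rest of the test building and running.
	 */
	fprintf(stderr, "uniqueid subtest skipped: old uapi headers\n");
#endif
}

int main(void)
{
	test_uniqueid();
	return 0;
}

That way the new subtest degrades to a skip on older trees instead of
breaking the whole build.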
>>> Sumit!
>>>
>>> 1. What are the reasons for testing an older kernel with selftests
>>>    from newer kernels? What are the benefits you see in doing so?
>>
>> I think the presumption is that the latest, greatest collection of
>> selftests is the best and most complete.
>
> Not necessarily the case.
>
>>
>>> I am looking to understand the need/reasons for this use-case. In
>>> our previous discussion on this subject, I did say you should be
>>> able to do so, with some exceptions.
>>>
>>> 2. Do you test kernels with the selftests from the same release?
>>
>> We have the ability to do either. The new shiny .... it calls.
>
> If the only reason is "shiny", I would say you might not be getting
> the best results possible.
>
>>
>>> 3. Do you find testing with newer selftests to be useful?
>>
>> I think it comes down to coverage, and again the current perception
>> that latest and greatest is better. Quantitatively we haven't
>> collected data to support that position, though it would be
>> interesting to compare, say, 4.4-lts and its selftests directory to
>> mainline, see how much is new, and then find out how many of those
>> new selftests actually work on the older 4.4-lts.
>>
>
> As I explained above, the assumption/perception that "the latest
> selftests provide more coverage, and therefore should be better tests,
> even on older kernels" is incorrect.
>
> Collecting data to see whether testing with newer selftests provides
> better coverage might or might not be a worthwhile exercise. Some
> releases might include tests for existing features and some might not;
> the mix might be different. As a general rule, "selftests are intended
> to track, and do track, features in their release" is a good
> assumption.
>
> Fixing tests from newer releases so they "never fail" on older
> releases might not give us the best ROI as a whole. These need to be
> evaluated on a case-by-case basis.
>
> Based on this discussion, and now that we understand that an incorrect
> assumption and/or misperception was the basis for choosing to test
> stable kernels with selftests from newer releases, I would recommend
> the following approach:
>
> 1. Testing stable kernels with their own selftests will yield the best
>    results.
> 2. Testing stable kernels with newer selftests could be done if users
>    find that it provides better coverage, knowing that there is no
>    guarantee that it will.
>
> thanks,
> -- Shuah

Best,
Sumit.
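P.S.: For the cases Shuah mentions where a test depends on a module
being present, a probe along these lines is what we have in mind. Again
only a minimal sketch; "test_module" is a stand-in name, and the
/sys/module check is best-effort (built-in code without module
parameters won't show up there):

#include <stdio.h>
#include <unistd.h>

#define KSFT_SKIP 4	/* kselftest "skip" exit code */

int main(void)
{
	/* Loaded modules (and most built-ins) appear under /sys/module. */
	if (access("/sys/module/test_module", F_OK) != 0) {
		printf("test_module not available, skipping\n");
		return KSFT_SKIP;
	}

	/* ... the actual module-dependent test would go here ... */
	return 0;
}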