On Thu, Nov 9, 2017 at 5:37 AM, Michael Kerrisk (man-pages)
<mtk.manpages@xxxxxxxxx> wrote:
> Hi Florian,
>
> On 9 November 2017 at 13:02, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>> On 11/09/2017 12:58 PM, Michael Kerrisk (man-pages) wrote:
>>>
>>> [CC += Kees, in case he has some comments]
>>>
>>> On 9 November 2017 at 08:17, Michael Kerrisk (man-pages)
>>> <mtk.manpages@xxxxxxxxx> wrote:
>>>>
>>>> Hi Florian,
>>>>
>>>> On 8 November 2017 at 07:24, Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>>>>>
>>>>> On 11/07/2017 09:35 PM, Michael Kerrisk (man-pages) wrote:
>>>>>>
>>>>>> This change broke my code that was doing seccomp filtering for the
>>>>>> open() system call number (__NR_open). The breakage in question is not
>>>>>> serious, since this was really just demonstration code. However, I
>>>>>> want to raise awareness that these sorts of changes have the potential
>>>>>> to possibly cause breakages for some code using seccomp, and note that
>>>>>> I think such changes should not be made lightly or gratuitously.
>>>>>
>>>>> I have the opposite view: We should make such changes as often as
>>>>> possible, to remind people that seccomp filters (and certain SELinux
>>>>> and AppArmor policies) are incompatible with the GNU/Linux model,
>>>>> where everything is developed separately and not maintained within a
>>>>> single source tree (unlike say OpenBSD). This means that you really
>>>>> can't deviate from the upstream Linux userspace ABI (in the broadest
>>>>> possible sense) and still expect things to work.
>>>>>
>>>>> I know that people like to slap seccomp filters on everything today,
>>>>> but without careful examination, that is likely to introduce bugs
>>>>> (particularly on rarely used code paths). It can also cause the
>>>>> process to switch to legacy interfaces with known issues (e.g.,
>>>>> reading from /dev/urandom instead of getrandom, without waiting for
>>>>> the kernel to signal initialization of the pool).
>>>>
>>>> Thanks. The above is a good summary of the counterpoints to my initial
>>>> argument.
>>>
>>> Florian, taking your and Adhemerval's useful comments into account, I
>>> added the following text to the seccomp(2) manual page:
>>>
>>> [[
>>> Caveats
>>>     There are various subtleties to consider when applying seccomp
>>>     filters to a program, including the following:
>>>
>>>     * Some traditional system calls have user-space implementations
>>>       in the vdso(7) on many architectures. Notable examples include
>>>       clock_gettime(2), gettimeofday(2), and time(2). On such
>>>       architectures, seccomp filtering for these system calls will
>>>       have no effect.
>>
>> I think the situation is more complicated for many of those because they
>> can still perform system calls on their fallback paths. So it's one more
>> case where seccomp can give you unpredictable failures.
>
> Good point. I added the following text:
>
>     (However, there are cases where the vdso(7) implementations
>     may fall back to invoking the true system call, in which
>     case seccomp filters would see the system call.)
>
>> Rest looks good to me. Thanks for the writeup.
>
> Thanks for the review!

Agreed, this looks good. Thanks for clarifying it. :)

-Kees

-- 
Kees Cook
Pixel Security