On June 1, 2020 6:59:26 AM PDT, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>
>> On Jun 1, 2020, at 2:23 AM, Billy Laws <blaws05@xxxxxxxxx> wrote:
>>
>>
>>>
>>> On May 30, 2020, at 5:26 PM, Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> wrote:
>>>
>>> Andy Lutomirski <luto@xxxxxxxxxxxxxx> writes:
>>>
>>>>>>> On May 29, 2020, at 11:00 PM, Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Modern Windows applications are executing system call instructions
>>>>>> directly from the application's code without going through the WinAPI.
>>>>>> This breaks Wine emulation, because it doesn't have a chance to
>>>>>> intercept and emulate these syscalls before they are submitted to Linux.
>>>>>>
>>>>>> In addition, we cannot simply trap every system call of the application
>>>>>> to userspace using PTRACE_SYSEMU, because performance would suffer,
>>>>>> since our main use case is to run Windows games over Linux. Therefore,
>>>>>> we need some in-kernel filtering to decide whether the syscall was
>>>>>> issued by the Wine code or by the Windows application.
>>>>
>>>> Do you really need in-kernel filtering? What if you could have
>>>> efficient userspace filtering instead? That is, set something up so
>>>> that all syscalls, except those from a special address, are translated
>>>> to a CALL thunk where the thunk is configured per task. Then the thunk
>>>> can do whatever emulation is needed.
>>>
>>> Hi,
>>>
>>> I suggested something similar to my customer, by using
>>> libsyscall-intercept. The idea would be overwriting the syscall
>>> instruction with a call to the entry point. I'm not a specialist on the
>>> specifics of Windows games (cc'ed Paul Gofman, who can provide more
>>> details on that side), but as far as I understand, the reason why that
>>> is not feasible is that the anti-cheat protection in games will abort
>>> execution if the binary region was modified either on-disk or in-memory.
>>>
>>> Is there some mechanism to do that without modifying the application?
>>
>> Hi,
>>
>> I work on an emulator for the Nintendo Switch that uses a similar technique;
>> in our testing it works very well and is much more performant than even
>> PTRACE_SYSEMU.
>>
>> To work around DRM reading the memory contents I think mprotect could
>> be used: after patching the syscall, a copy of the original code could be
>> kept somewhere in memory and the patched region mapped --X.
>> With this, any time the DRM attempts to read the patched region and
>> perform integrity checks it will cause a segfault and a branch to the
>> signal handler. This handler can then return the contents of the original,
>> unpatched region to satisfy those checks.
>>
>> Are memory contents checked by DRM solutions too often for this to be
>> performant?
>
> A bigger issue is that hardware support for --X is quite spotty. There
> is no x86 CPU that can do it cleanly in a bare metal setup, and client
> CPUs that can do it at all without hypervisor help may be nonexistent.
> I don’t know if the ARM situation is much better.
>
>> --
>> Billy Laws
>>>
>>>> Getting the details and especially the interaction with any seccomp
>>>> filters that may be installed right could be tricky, but the performance
>>>> should be decent, at least on non-PTI systems.
>>>>
>>>> (If we go this route, I suspect that the correct interaction with
>>>> seccomp is that this type of redirection takes precedence over seccomp
>>>> and seccomp filters are not invoked for redirected syscalls. After all,
>>>> a redirected syscall is, functionally, not a syscall at all.)
>>>>
>>>
>>>
>>> --
>>> Gabriel Krisman Bertazi

Running these things in a minimal VM container would allow this kind of
filtering/trapping to be done in the VMM, too.

I don't know how many layers deep you invoke native Linux libraries, and
so whether the option would exist to use out-of-range system call numbers
for the native Linux system calls?
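
To make the out-of-range numbering idea a bit more concrete, here is a rough
sketch of how a filter might tell the two kinds of callers apart. Everything
in it is made up for illustration: NATIVE_SYSCALL_BASE, its value, and the
helper names are assumptions, not an existing kernel or Wine interface.

```c
#include <assert.h>

/* Hypothetical offset that native (Wine-side) code would add to every
 * Linux syscall number it issues. The value is an assumption, chosen only
 * to sit well above any real Linux syscall number. */
#define NATIVE_SYSCALL_BASE 0x10000000L

enum syscall_origin {
    ORIGIN_NATIVE,   /* issued by Wine or native libraries: hand to Linux */
    ORIGIN_FOREIGN   /* issued by the Windows application: emulate */
};

/* Classify a syscall purely by its number: anything at or above the
 * base is treated as a deliberately offset native Linux call. */
static enum syscall_origin classify_syscall(long nr)
{
    return nr >= NATIVE_SYSCALL_BASE ? ORIGIN_NATIVE : ORIGIN_FOREIGN;
}

/* Recover the real Linux syscall number from an offset one. */
static long to_linux_nr(long nr)
{
    assert(classify_syscall(nr) == ORIGIN_NATIVE);
    return nr - NATIVE_SYSCALL_BASE;
}
```

With a scheme like this, any in-range number would be assumed to come from
the Windows application and redirected, which only works if every native
library in the process can be made to use the offset numbers.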
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.