Hi!

Thanks for looking into this.

On 5/31/20 08:56, Gabriel Krisman Bertazi wrote:
>
>> Is it possible to disassemble and instrument the Windows code to insert
>> breakpoints (or emulation calls) at all the Windows syscall points?
> Hi Kees,
>
> I considered instrumenting the syscall instructions with calls to some
> wrapper, but I was told that modifying the game in memory or on disk
> will trigger all sorts of anti-cheating mechanisms (my main use case are
> windows games).

Yes, this is the case. Besides, before instrumenting, we would need some
way to find those syscalls in the highly obfuscated, dynamically
generated code, the whole purpose of which is to prevent disassembling,
debugging and finding things like that in it. Ultimately, even in the
cases where it would be technically feasible, I don't think Wine could
ever go for modifying the program's code (unless, of course, that is
part of what the program itself does through winapi calls). Wine tries
to implement the API as closely to Windows as possible so that the DRM
can work with Wine; modifying the program's code is something different.

>
>>> [...]
>>> * Why not SECCOMP_MODE_FILTER?
>>>
>>> We experimented with dynamically generating BPF filters for whitelisted
>>> memory regions and using SECCOMP_MODE_FILTER, but there are a few
>>> reasons why it isn't enough nor a good idea for our use case:
>>>
>>> 1. We cannot set the filters at program initialization time and forget
>>> about it, since there is no way of knowing which modules will be loaded,
>>> whether native and windows. Filter would need a way to be updated
>>> frequently during game execution.
>>>
>>> 2. We cannot predict which Linux libraries will issue syscalls directly.
>>> Most of the time, whitelisting libc and a few other libraries is enough,
>>> but there are no guarantees other Linux libraries won't issue syscalls
>>> directly and break the execution.
>>> Adding every linux library that is
>>> loaded also has a large performance cost due to the large resulting
>>> filter.
>> Just so I can understand the expected use: given the dynamic nature of
>> the library loading, how would Wine be marking the VMAs?
> Paul (cc'ed) is the wine expert, but my understanding is that memory
> allocation and initial program load of the emulated binary will go
> through wine. It does the allocation and mark the vma accordingly
> before returning the allocated range to the windows application.

Yes, exactly. For pretty much any memory that Wine allocates, syscalls
issued by code executing from those areas (if any are ever encountered)
need to be trapped by Wine and passed to Wine's implementation of the
corresponding Windows API function. Loading of native Linux libraries,
and the memory allocations they perform, happen outside of Wine's
control.

>
>>> Indeed, points 1 and 2 could be worked around with some userspace work
>>> and improved SECCOMP_MODE_FILTER support, but at a high performance and
>>> some stability cost, to obtain the semantics we want. Still, the
>>> performance would suffer, and SECCOMP_MODE_MEMMAP is non intrusive
>>> enough that I believe it should be considered as an upstream solution.
>> It looks like you're using SECCOMP_RET_TRAP for this? Signal handling
>> can be pretty slow. Did you try SECCOMP_RET_USER_NOTIF?

We are not much concerned with the overhead of the trapped syscalls in
our use case; those are very rare. What we are concerned with is the
performance impact on normal Linux syscalls when syscall trapping is
enabled. When I measured that impact, I got the same 10% overhead for
untrapped syscalls (that is, those always hitting the SECCOMP_RET_ALLOW
case) with the filters Gabriel mentioned.
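For reference, the kind of measurement meant here (the cost that an
installed filter adds to syscalls it always allows) could be sketched
roughly as below. This is only an illustration under assumptions: the
helper names are made up, getppid() is picked as an arbitrary cheap
syscall, and the filter is a trivial one-instruction "allow everything"
program rather than the address-range filters Gabriel mentioned, so it
will not reproduce the 10% figure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average cost of one getppid() syscall, in ns. */
static double ns_per_syscall(long iterations)
{
    struct timespec start, end;

    assert(iterations > 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getppid);   /* raw syscall, bypasses any libc caching */
    clock_gettime(CLOCK_MONOTONIC, &end);

    return ((end.tv_sec - start.tv_sec) * 1e9 +
            (end.tv_nsec - start.tv_nsec)) / (double)iterations;
}

/*
 * Hypothetical helper: install a filter whose only instruction returns
 * SECCOMP_RET_ALLOW. Every later syscall in this process runs the filter
 * but is never trapped. Returns 0 on success, -1 on failure.
 * Note: a seccomp filter cannot be removed once installed.
 */
static int install_allow_all_filter(void)
{
    struct sock_filter insns[] = {
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof(insns) / sizeof(insns[0]),
        .filter = insns,
    };

    /* Required for unprivileged processes before installing a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

Calling ns_per_syscall() once before and once after
install_allow_all_filter() and comparing the two numbers gives the
per-syscall overhead of the allow path; a realistic filter with many
range comparisons would be expected to cost more than this minimal one.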
>>
>>> +
>>> +    if (!(vma->vm_flags & VM_NOSYSCALL))
>>> +        return 0;
>>> +
>>> +    syscall_rollback(current, task_pt_regs(current));
>>> +    seccomp_send_sigsys(this_syscall, 0);
>>> +
>>> +    seccomp_log(this_syscall, SIGSYS, SECCOMP_RET_TRAP, true);
>>> +
>>> +    return -1;
>>> +}
>> This really just looks like an ip_address filter, but I get what you
>> mean about stacking filters, etc. This may finally be the day we turn to
>> eBPF in seccomp, since that would give you access to using a map lookup
>> on ip_address, and the map could be updated externally (removing the
>> need for the madvise() changes).
> And 64-bit comparisons :)
>
> that would be a good solution, we'd still need to look at some details,
> like disabling/updating filters at runtime when some new library is
> loaded, but since we can update externally, I think it covers it.

I am afraid that, in the general case, the filter would need to be
added, updated or removed not just when a library is loaded or unloaded,
but pretty much any time an executable memory region is allocated or
deallocated, or when PROT_EXEC is changed on already allocated pages.
But I am not sure I have enough details on the suggested approach yet,
and I might be missing something important. Maybe we could discuss the
details separately with Gabriel first.
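To make the bookkeeping concern concrete, here is a minimal userspace
sketch of the state an ip_address-based lookup would have to mirror. All
names (exec_table, on_map_or_protect, on_unmap) are hypothetical, the
fixed-size array stands in for whatever externally updatable map would
actually be used, and nothing here talks to seccomp or eBPF. It only
shows that every mapping, unmapping and protection change touching
PROT_EXEC has to update the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define MAX_RANGES 64

/* Half-open address range [start, end) that is currently executable. */
struct exec_range { uintptr_t start, end; };

struct exec_table {
    struct exec_range r[MAX_RANGES];
    size_t count;
};

/* Drop [start, end) from the table, trimming or splitting entries. */
static int table_remove(struct exec_table *t, uintptr_t start, uintptr_t end)
{
    assert(start < end);
    for (size_t i = 0; i < t->count; ) {
        struct exec_range *r = &t->r[i];

        if (r->end <= start || r->start >= end) {
            i++;                              /* no overlap */
        } else if (r->start < start && r->end > end) {
            if (t->count == MAX_RANGES)
                return -1;                    /* no room for the split */
            t->r[t->count++] = (struct exec_range){ end, r->end };
            r->end = start;                   /* keep the lower part */
            i++;
        } else if (r->start < start) {
            r->end = start;                   /* trim the tail */
            i++;
        } else if (r->end > end) {
            r->start = end;                   /* trim the head */
            i++;
        } else {
            t->r[i] = t->r[--t->count];       /* fully covered: delete */
        }
    }
    return 0;
}

static int table_add(struct exec_table *t, uintptr_t start, uintptr_t end)
{
    /* Clear any overlap first so entries never overlap each other. */
    if (table_remove(t, start, end) < 0 || t->count == MAX_RANGES)
        return -1;
    t->r[t->count++] = (struct exec_range){ start, end };
    return 0;
}

/* The lookup a filter would perform on the instruction pointer. */
static int table_contains(const struct exec_table *t, uintptr_t ip)
{
    for (size_t i = 0; i < t->count; i++)
        if (ip >= t->r[i].start && ip < t->r[i].end)
            return 1;
    return 0;
}

/* To be called after every mmap() or mprotect() on a tracked region. */
static int on_map_or_protect(struct exec_table *t, uintptr_t addr,
                             size_t len, int prot)
{
    return (prot & PROT_EXEC) ? table_add(t, addr, addr + len)
                              : table_remove(t, addr, addr + len);
}

/* To be called after every munmap(). */
static int on_unmap(struct exec_table *t, uintptr_t addr, size_t len)
{
    return table_remove(t, addr, addr + len);
}
```

For example, mapping 0x1000..0x4000 as executable and then removing
PROT_EXEC from 0x2000..0x3000 leaves two entries in the table, so a
single mprotect() call can both shrink and split what the filter has to
match.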