[PATCHv5 0/6] x86: 32-bit compatible C/R on x86_64

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Changes from v4:
- check both vm_ops and vm_private_data to avoid (unlikely) confusion
  with some other vma in map_vdso_once (as Andy noticed) - which would
  lead to unable to use this API in that unlikely-case
  (vm_private_data may be uninitialized and be the same as vvar_mapping
  or vdso_mapping pointer) - so I introduced one-liner helper
  vma_is_special_mapping().

Changes from v3:
- proper ifdefs around vdso_image_32
- missed Reviewed-by tag

Changes from v2:
- reworked map_vdso() part with Andy suggestions
- int arch_prctl(ARCH_MAP_VDSO_*, addr) now returns size of mapped
  vdso blob on success, which is handy for the following blob parsing
  in userspace
- disallowed two vDSO blobs mappings: as Andy noted,
  __insert_special_mapping may not get all accounting right, which
  may lead to abuse this API from userspace. Return -EEXIST if process
  has mapped vdso blob - this will ensure that caller knows what it does.

The following changes are available since v1:
- killed PR_REG_SIZE macro as Oleg suggested
- cleared SA_IA32_ABI|SA_X32_ABI from oact->sa.sa_flags in do_sigaction()
  as noticed by Oleg
- moved SA_IA32_ABI|SA_X32_ABI from uapi header as those flags shouldn't
  be exposed to user-space

I also reworked CRIU's patches to work with this patches set, rather than
on first RFC that swapped TIF_IA32 with arch_prctl. By now it yet fails
~10% of 32-bit tests of CRIU's test suite called ZDTM.
The CRIU branch for this can be viewed on [6] and v3 patches to add
this functionality have been sent to maillist [7].

The patches set is based on [3] and while it's not yet applied -- it
may make kbuild test robot unhappy.

Description from v1 [5]:

This patches set is an attempt to add checkpoint/restore
for 32-bit tasks in compatibility mode on x86_64 hosts.

Restore in CRIU starts from one root restoring process, which
reads info for all threads being restored from images files.
This information is used further to find out which processes
share some resources. Later shared resources are restored only
by one process and all other inherit them.
After that it calls clone() and new threads restore their
properties in parallel. Those threads inherit all parent's
mappings and fetch properties from those mappings
(and do clone themself, if they have children/subthreads). [1]
Then starts restorer blob's play, it's PIE binary, which
unmaps all unneeded for restoring VMAs, maps new VMAs and
finalize restoring with sigreturn syscall. [2]

To restore of 32-bit task we need three things to do in running
x86_64 restorer blob:
a) set code selector to __USER32_CS (to run 32-bit code);
b) remap vdso blob from 64-bit to 32-bit
   This is primary needed because restore may happen on a different
   kernel, which has different vDSO image than we had on dump.
c) if 32-bit vDSO differ to dumped image, move it on free place
   and add jump trampolines to that place.
d) switch TIF_IA32 flag, so kernel would know that it deals with
   compatible 32-bit application.

>From all this:
a) setting CS may be done from userspace, no patches needed;
b) patches 1-3 add ability to map different vDSO blobs on x86 kernel;
c) for remapping/moving 32-bit vDSO blob patches have been send earlier
   and seems to be accepted [3]
d) and for swapping TIF_IA32 flag discussion with Andy ended in conclusion
   that it's better to remove this flag completely.
   Patches 4-6 deletes usage of TIF_IA32 from ptrace, signal and coredump
   code. This is rework/resend of RFC [4]

[1] https://criu.org/Checkpoint/Restore#Restore
[2] https://criu.org/Restorer_context
[3] https://lkml.org/lkml/2016/6/28/489
[4] https://lkml.org/lkml/2016/4/25/650
[5] https://lkml.org/lkml/2016/6/1/425
[6] https://github.com/0x7f454c46/criu/tree/compat-4
[7] https://lists.openvz.org/pipermail/criu/2016-June/029788.html

Dmitry Safonov (6):
  x86/vdso: unmap vdso blob on vvar mapping failure
  x86/vdso: replace calculate_addr in map_vdso() with addr
  x86/arch_prctl/vdso: add ARCH_MAP_VDSO_*
  x86/coredump: use pr_reg size, rather that TIF_IA32 flag
  x86/ptrace: down with test_thread_flag(TIF_IA32)
  x86/signal: add SA_{X32,IA32}_ABI sa_flags

 arch/x86/entry/vdso/vma.c         | 81 +++++++++++++++++++++++++++------------
 arch/x86/ia32/ia32_signal.c       |  2 +-
 arch/x86/include/asm/compat.h     |  8 ++--
 arch/x86/include/asm/fpu/signal.h |  6 +++
 arch/x86/include/asm/signal.h     |  4 ++
 arch/x86/include/asm/vdso.h       |  2 +
 arch/x86/include/uapi/asm/prctl.h |  6 +++
 arch/x86/kernel/process_64.c      | 25 ++++++++++++
 arch/x86/kernel/ptrace.c          |  2 +-
 arch/x86/kernel/signal.c          | 20 +++++-----
 arch/x86/kernel/signal_compat.c   | 34 ++++++++++++++--
 fs/binfmt_elf.c                   | 23 ++++-------
 kernel/signal.c                   |  7 ++++
 13 files changed, 162 insertions(+), 58 deletions(-)

-- 
2.9.0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]