On Tue, May 20, 2014 at 10:47 AM, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> On Tue, May 20, 2014 at 10:24:49AM -0700, Andy Lutomirski wrote:
>> On Tue, May 20, 2014 at 10:21 AM, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
>> > On Mon, May 19, 2014 at 03:58:33PM -0700, Andy Lutomirski wrote:
>> >> Using arch_vma_name to give special mappings a name is awkward. x86
>> >> currently implements it by comparing the start address of the vma to
>> >> the expected address of the vdso. This requires tracking the start
>> >> address of special mappings and is probably buggy if a special vma
>> >> is split or moved.
>> >>
>> >> Improve _install_special_mapping to just name the vma directly. Use
>> >> it to give the x86 vvar area a name, which should make CRIU's life
>> >> easier.
>> >>
>> >> As a side effect, the vvar area will show up in core dumps. This
>> >> could be considered weird and is fixable. Thoughts?
>> >>
>> >> Cc: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
>> >> Cc: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
>> >> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>> >
>> > Hi Andy, thanks a lot for this! I must confess I don't yet know how
>> > would we deal with compat tasks but this is 'must have' mark which
>> > allow us to detect vvar area!
>>
>> Out of curiosity, how does CRIU currently handle checkpointing a
>> restored task? In current kernels, the "[vdso]" name in maps goes
>> away after mremapping the vdso.
>
> We use not only [vdso] mark to detect vdso area but also page frame
> number of the living vdso. If mark is not present in procfs output
> we examine executable areas and check if pfn == vdso_pfn, it's
> a slow path because there might be a bunch of executable areas and
> touching every of it is not that fast thing, but we simply have no
> choice.

This patch should fix this issue, at least.
If there's still a way to get a native vdso that doesn't say "[vdso]",
please let me know.

>
> The situation gets worse when task was dumped on one kernel and
> then restored on another kernel where vdso content is different
> from one saved in image -- in such case as I mentioned we need
> that named vdso proxy which redirect calls to vdso of the system
> where task is restoring. And when such "restored" task get checkpointed
> second time we don't dump new living vdso but save only old vdso
> proxy on disk (detecting it is a different story, in short we
> inject a unique mark into elf header).

Yuck. But I don't know whether the kernel can help much here.

>
>>
>> I suspect that you'll need kernel changes for compat tasks, since I
>> think that mremapping the vdso on any reasonably modern hardware in a
>> 32-bit task will cause sigreturn to blow up. This could be fixed by
>> making mremap magical, although adding a new prctl or arch_prctl to
>> reliably move the vdso might be a better bet.
>
> Well, as far as I understand compat code uses abs addressing for
> vvar data and if vvar data position doesn't change we're safe,
> but same time because vvar addresses are not abi I fear one day
> we indeed hit the problems and the only solution would be
> to use kernel's help. But again, Andy, I didn't think much
> about implementing compat mode in criu yet so I might be
> missing some details.

Prior to 3.15, the compat code didn't have vvar data at all. In 3.15
and up, the vvar data is accessed using PC-relative addressing, even
in compat mode (using the usual call; mov trick to read EIP).

--Andy
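
For illustration only, the name-based detection discussed in this thread
could be sketched in userspace roughly as below. This is not CRIU's code.
It assumes the usual /proc/<pid>/maps line layout (address range, perms,
offset, dev, inode, optional name) and assumes the vvar area shows up
under the name "[vvar]". The helper names (parse_maps, special_mappings)
and the sample addresses are hypothetical.

```python
import re
from typing import Dict, Iterator, NamedTuple, Sequence


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    name: str


# One /proc/<pid>/maps line: "start-end perms offset dev inode [name]".
_MAPS_LINE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S+)\s+[0-9a-f]+\s+\S+\s+\d+\s*(.*)$"
)


def parse_maps(text: str) -> Iterator[Mapping]:
    """Yield one Mapping per well-formed maps line; skip anything else."""
    for line in text.splitlines():
        m = _MAPS_LINE.match(line)
        if m:
            start, end, perms, name = m.groups()
            yield Mapping(int(start, 16), int(end, 16), perms, name.strip())


def special_mappings(
    text: str, names: Sequence[str] = ("[vdso]", "[vvar]")
) -> Dict[str, Mapping]:
    """Return the mappings whose name is one of the special names.

    "[vvar]" is an assumed name for the vvar area, not taken from the
    thread above.
    """
    return {m.name: m for m in parse_maps(text) if m.name in names}


# Hypothetical sample; on a live system the text would come from
# open("/proc/self/maps").read() instead.
SAMPLE_MAPS = """\
00400000-0040c000 r-xp 00000000 08:01 1234 /bin/cat
7fff1a5fe000-7fff1a600000 r--p 00000000 00:00 0 [vvar]
7fff1a600000-7fff1a602000 r-xp 00000000 00:00 0 [vdso]
"""
```

When the name is missing (for example after mremapping the vdso on
kernels without this patch), a lookup like this finds nothing, which is
the case where the slower pfn comparison described above is needed.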