Hi Gerd,
Gerd Hoffmann wrote:
Want to reproduce? Here we go:
* grab xenner 0.8 from http://dl.bytesex.org/releases/xenner/
* grab a xenified dom0 kernel without blktap driver (either not
compiled or module not loaded).
* start xend
* start blkbackd from xenner package (you probably want the -d switch
for debug output, twice for more).
* run "xm block-attach 0 tap:aio:/path/to/some/file xvda r"
* watch it blow up ;)
Thanks for the repro details. I'll have a go at this later. One thing we
haven't tested AFAIK is mapping grants in the same domain: could you
check to see if the bug is the same if you attach a block device to a
domain other than Dom0? Also, could you send any Xen console output, if
it contains errors or warnings?
I can't help wondering if this is a hint that now is the time to find a
better API, which doesn't have the requirement (a) that seems to be
causing such trouble? Are other PV guests --- *BSD, Solaris --- going
to have the same problems with their VM layers if they try to implement
this API? Upstream Linux pv_ops certainly will, and it would be good if
we could avoid tying unprivileged guests to ABIs which cannot hope to be
merged into pv_ops.
And I fear the problems I've run into so far are only the tip of
the iceberg. What happens if an application with active grant table
mappings calls fork()?
Ultimately, fork calls dup_mm, which calls dup_mmap, which calls
copy_{page,pud,pmd,pte}_range, which calls copy_one_pte, which calls
set_pte_at, which makes the HYPERVISOR_update_va_mapping hypercall.
The hypercall will not succeed and will return an error code indicating
the reason for this. Therefore the PTE will not be set. There appears to
be no way to propagate this error through the Linux VM code, because
there is no concept of a PTE update failing. I could add return codes to
all those functions, but I don't fancy their chances upstream....
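To make the problem concrete, here is a toy user-space model, not kernel code. Every name in it (toy_update_va_mapping, toy_set_pte_at, toy_copy_pte_range, the PTE_* values) is made up for illustration, and it assumes only what is described above: the hypercall refuses to copy a grant mapping, and the PTE setter returns void.

```c
#include <assert.h>

/* Toy PTE states; these values are hypothetical, not Xen or Linux ones. */
enum { PTE_EMPTY = 0, PTE_NORMAL = 1, PTE_GRANT = 2 };

/* Stand-in for the hypercall: returns 0 on success, -1 when asked to
 * install a grant mapping (the failing case discussed above). */
static int toy_update_va_mapping(int *slot, int pte)
{
    if (pte == PTE_GRANT)
        return -1;
    *slot = pte;
    return 0;
}

/* Like set_pte_at, this returns void, so the hypercall result is dropped. */
static void toy_set_pte_at(int *slot, int pte)
{
    (void)toy_update_va_mapping(slot, pte);
}

/* Copies n entries and returns how many the caller believes were copied.
 * It has no way to learn that some of them were never set. */
static int toy_copy_pte_range(int *dst, const int *src, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        dst[i] = PTE_EMPTY;
        toy_set_pte_at(&dst[i], src[i]);
    }
    return n;
}
```

Copying the entries {PTE_NORMAL, PTE_GRANT, PTE_NORMAL} reports three entries copied, yet the middle destination slot stays PTE_EMPTY and the caller never finds out.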
A possibility for solving that might be to carry out the mappings upon a
page fault: I believe this would be compatible with copy_page_range.
(In fact, it's possible that a forked process would attempt to
demand-page in the granted page, bypassing the copy_page_range code.
Since there is no nopage handler for a gntdev VMA, that would lead to an
anonymous page being mapped into memory instead.)
So, as far as I can tell, there would be no kernel BUG() or
domain_crash() in the event of a fork(). It looks like implementing
nopage in gntdev would allow grants to be remapped after a fork(),
which would give the correct behaviour.
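A rough sketch of the fault-time idea, again as a toy user-space model rather than a real nopage implementation. The structures and functions (toy_vma, toy_mm, toy_gntdev_fault, toy_access) are hypothetical, and it assumes the VMA keeps a record of which grant backs each page so the mapping can be redone on demand.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGES    4
#define TOY_NO_GRANT (-1)

/* Hypothetical per-VMA record of the grant backing each page. */
struct toy_vma {
    int grant_ref[TOY_PAGES];   /* grant reference, or TOY_NO_GRANT */
};

/* Hypothetical per-process state: which pages currently have a PTE. */
struct toy_mm {
    int mapped[TOY_PAGES];      /* 1 if the page has a valid mapping */
    int maps_done;              /* grant-map operations performed so far */
};

/* Stand-in for a nopage handler: map the grant for the faulting page.
 * Returns 0 on success, -1 if no grant backs that page. */
static int toy_gntdev_fault(struct toy_mm *mm, const struct toy_vma *vma,
                            size_t page)
{
    assert(page < TOY_PAGES);
    if (vma->grant_ref[page] == TOY_NO_GRANT)
        return -1;
    mm->mapped[page] = 1;       /* a real handler would map the grant here */
    mm->maps_done++;
    return 0;
}

/* Touch a page: the fault path runs only when the PTE is missing. */
static int toy_access(struct toy_mm *mm, const struct toy_vma *vma,
                      size_t page)
{
    assert(page < TOY_PAGES);
    if (mm->mapped[page])
        return 0;
    return toy_gntdev_fault(mm, vma, page);
}
```

In this model a freshly forked child starts with no mappings; its first access to a granted page goes through the fault path and maps the grant, later accesses do not fault, and a page with no grant behind it fails instead of silently getting an anonymous page.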
Regards,
Derek.
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/virtualization