I found the final reason! The initial use of init_ram is also used by test_rip_relative(), which will cause conflict. I changed it and everything runs well. On Wed, Jun 19, 2013 at 8:32 PM, Gleb Natapov <gleb@xxxxxxxxxx> wrote: > On Wed, Jun 19, 2013 at 08:30:33PM +0800, 李春奇 <Arthur Chunqi Li> wrote: >> On Wed, Jun 19, 2013 at 8:26 PM, Gleb Natapov <gleb@xxxxxxxxxx> wrote: >> > On Wed, Jun 19, 2013 at 08:18:29PM +0800, 李春奇 <Arthur Chunqi Li> wrote: >> >> On Wed, Jun 19, 2013 at 5:31 PM, Gleb Natapov <gleb@xxxxxxxxxx> wrote: >> >> > On Wed, Jun 19, 2013 at 09:26:59AM +0800, 李春奇 <Arthur Chunqi Li> wrote: >> >> >> On Wed, Jun 19, 2013 at 12:44 AM, Gleb Natapov <gleb@xxxxxxxxxx> wrote: >> >> >> > Send code in a form of a patch. >> >> >> > >> >> >> > On Wed, Jun 19, 2013 at 12:14:13AM +0800, 李春奇 <Arthur Chunqi Li> wrote: >> >> >> >> extern u8 insn_page[], insn_page_end[]; >> >> >> >> extern u8 test_insn[], test_insn_end[]; >> >> >> >> extern u8 alt_insn_page[]; >> >> >> >> >> >> >> >> asm( >> >> >> >> ".align 4096\n\t" >> >> >> >> ".global insn_page\n\t" >> >> >> >> ".global insn_page_end\n\t" >> >> >> >> ".global test_insn\n\t" >> >> >> >> ".global test_insn_end\n\t" >> >> >> >> "insn_page:\n\t" >> >> >> >> >> >> >> >> "ret \n\t" >> >> >> >> >> >> >> >> "push %rax; push %rbx\n\t" >> >> >> >> "push %rcx; push %rdx\n\t" >> >> >> >> "push %rsi; push %rdi\n\t" >> >> >> >> "push %rbp\n\t" >> >> >> >> "push %r8; push %r9\n\t" >> >> >> >> "push %r10; push %r11\n\t" >> >> >> >> "push %r12; push %r13\n\t" >> >> >> >> "push %r14; push %r15\n\t" >> >> >> >> "pushf\n\t" >> >> >> >> >> >> >> >> "push 136+save \n\t" >> >> >> >> "popf \n\t" >> >> >> >> "mov 0+save, %rax \n\t" >> >> >> >> "mov 8+save, %rbx \n\t" >> >> >> >> "mov 16+save, %rcx \n\t" >> >> >> >> "mov 24+save, %rdx \n\t" >> >> >> >> "mov 32+save, %rsi \n\t" >> >> >> >> "mov 40+save, %rdi \n\t" >> >> >> >> "mov 56+save, %rbp \n\t" >> >> >> >> "mov 64+save, %r8 \n\t" >> >> >> >> "mov 72+save, %r9 \n\t" >> >> >> >> "mov 80+save, %r10 \n\t" >> >> >> >> "mov 88+save, %r11 \n\t" >> >> >> >> "mov 96+save, %r12 \n\t" >> >> >> >> "mov 104+save, %r13 \n\t" >> >> >> >> "mov 112+save, %r14 \n\t" >> >> >> >> "mov 120+save, %r15 \n\t" >> >> >> >> >> >> >> >> "test_insn:\n\t" >> >> >> >> "in (%dx),%al\n\t" >> >> >> >> ". = . + 31\n\t" >> >> >> >> "test_insn_end:\n\t" >> >> >> >> >> >> >> >> "pushf \n\t" >> >> >> >> "pop 136+save \n\t" >> >> >> >> "mov %rax, 0+save \n\t" >> >> >> >> "mov %rbx, 8+save \n\t" >> >> >> >> "mov %rcx, 16+save \n\t" >> >> >> >> "mov %rdx, 24+save \n\t" >> >> >> >> "mov %rsi, 32+save \n\t" >> >> >> >> "mov %rdi, 40+save \n\t" >> >> >> >> "mov %rbp, 56+save \n\t" >> >> >> >> "mov %r8, 64+save \n\t" >> >> >> >> "mov %r9, 72+save \n\t" >> >> >> >> "mov %r10, 80+save \n\t" >> >> >> >> "mov %r11, 88+save \n\t" >> >> >> >> "mov %r12, 96+save \n\t" >> >> >> >> "mov %r13, 104+save \n\t" >> >> >> >> "mov %r14, 112+save \n\t" >> >> >> >> "mov %r15, 120+save \n\t" >> >> >> >> "popf \n\t" >> >> >> >> "pop %r15; pop %r14 \n\t" >> >> >> >> "pop %r13; pop %r12 \n\t" >> >> >> >> "pop %r11; pop %r10 \n\t" >> >> >> >> "pop %r9; pop %r8 \n\t" >> >> >> >> "pop %rbp \n\t" >> >> >> >> "pop %rdi; pop %rsi \n\t" >> >> >> >> "pop %rdx; pop %rcx \n\t" >> >> >> >> "pop %rbx; pop %rax \n\t" >> >> >> >> >> >> >> >> "ret\n\t" >> >> >> >> "save:\n\t" >> >> >> >> ". = . + 256\n\t" >> >> >> >> ".align 4096\n\t" >> >> >> >> "alt_insn_page:\n\t" >> >> >> >> ". = . + 4096\n\t" >> >> >> >> ); >> >> >> >> >> >> >> >> >> >> >> >> static void mk_insn_page(uint8_t *alt_insn_page, >> >> >> >> uint8_t *alt_insn, int alt_insn_length) >> >> >> >> { >> >> >> >> int i, emul_offset; >> >> >> >> for (i=1; i<test_insn_end - test_insn; i++) >> >> >> >> test_insn[i] = 0x90; // nop >> >> >> > Why? Gcc should pad it with nops. >> >> >> > >> >> >> >> emul_offset = test_insn - insn_page; >> >> >> >> for (i=0; i<alt_insn_length; i++) >> >> >> >> alt_insn_page[i+emul_offset] = alt_insn[i]; >> >> >> >> } >> >> >> >> >> >> >> >> static void trap_emulator(uint64_t *mem, uint8_t* alt_insn, int alt_insn_length) >> >> >> >> { >> >> >> >> ulong *cr3 = (ulong *)read_cr3(); >> >> >> >> int save_offset = (u8 *)(&save) - insn_page; >> >> >> >> >> >> >> >> memset(alt_insn_page, 0x90, 4096); >> >> >> > alt_insn_page should contains the same instruction as insn_page except >> >> >> > between test_insn and test_insn_end. I do not know how you expect it to >> >> >> > work otherwise. >> >> >> In my oponion, only codes between test_insn and test_insn_end in >> >> >> alt_insn_page need to be set, insn_page will be executed in the guest, >> >> >> and when trapping into emulator OS will load alt_insn_page (because of >> >> >> invlpg(insn_page)), then return to guest with executing insn_page >> >> >> (from TLB). >> >> > While before trap the code will likely be executed from insn_page, >> >> > but after the trap it is very optimistic to assume that tlb cache >> >> > will still contain this virtual address since host will execute quite a >> >> > lot of code and can even schedule in the middle, so the TLB will not >> >> > contain the address and your test will crash. Even the code before test >> >> > instruction can be executed from alt_insn_page if guest is scheduled out >> >> > after invlpg() and before it executes every instruction until trapping >> >> > one. In your case the test will crash too instead of yielding false positive. >> >> > >> >> >> I don't know if this is right, but I use this trick in my >> >> >> previous patch and it runs well. >> >> > Your previous patches always had c3 (ret) after tested instruction on >> >> > alt_insn_page. >> >> > >> >> >> I use "trace-cmd record -e kvm" to >> >> >> trace it and found instructions in alt_insn_page are not executed, so >> >> >> I suppose that alt_insn_page is not loaded to the right place. >> >> > Do you see "in" instruction emulated? Anyway current code is incorrect >> >> > since current install_page() implementation cannot handle large pages >> >> > and the code is backed up by large pages. You can fix install_page() to >> >> > check for that and break large page into small one before installing a >> >> > page. >> >> Here I have two questions. >> >> 1. There's another function called "install_large_page", can it be >> >> used to our occasion? I found that this function is not used at all. >> > It is used when initial page tables are created. >> > See lib/x86/vm.c:setup_mmu_range() >> > >> >> 2. Why will current version runs well? Do pages allocated dynamically >> >> are automatically aligned to 2MB (large page size)? >> >> >> > No, they are 4K pages. >> Thus why dynamically creating insn_page and alt_insn_page with >> alloc_page() can get the right result? > Probably. > > -- > Gleb. -- Arthur Chunqi Li Department of Computer Science School of EECS Peking University Beijing, China -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html