Hi All,

Recently I hit an Oops in FPU save/restore on Linux 2.6.38.8 while using aesni_intel_asm.S and aesni_intel_glue.c for native IPsec (netkey) on a 32-bit system. The same Oops occurs in versions 2.6.39.4, 3.0.x and 3.1.x, but I did not hit this problem on a 64-bit system with any of these versions.

My platform information:

"Linux dnsubuntu 2.6.38.8 #7 SMP Sat Nov 12 03:11:12 CST 2011 i686 i686 i386 GNU/Linux"

IPsec uses these two crypto drivers through the aead interface:

driver : cryptd(__driver-cbc-aes-aesni)
  --- my understanding: in the irq path, encryption/decryption is sent to the crypto daemon to be done asynchronously
driver : authenc(hmac(sha1-generic),cbc-aes-aesni)
  --- my understanding: IPsec calls it in softirq via the aead interface

All the function calls (such as cbc_encrypt/cbc_decrypt) in aesni_intel_glue.c are wrapped in kernel_fpu_begin()/kernel_fpu_end().

I have done some research on how FPU save/restore works in Linux, but I still cannot figure out where the problem is in this case. How can an fxsave/fxrstor Oops happen? How can tsk->thread.fpu.state be NULL when PF_USED_MATH or TS_USEDFPU is set?

The problem is easy to reproduce with the following steps:

1. Build two 32-bit systems with AESNI enabled in crypto, install openswan, and use the netkey kernel IPsec stack. Create an ESP tunnel between the left and right IPsec gateways.
2. Run iperf from a host in the left subnet to a host in the right subnet; the iperf traffic can be bidirectional.
3. Run top or tcpdump on the left and right IPsec gateways.
4. From another client or desktop, log in to both VPN gateways over SSH many times.
5. You will find that the SSH connections are not stable, and the top and tcpdump applications are not stable either. Within 5 to 10 minutes there will be an Oops, and then the system hangs.

I have some questions:

1. Can the functions in aesni_intel_glue.c safely be called in softirq context (such as from the IPsec stack)?
2. I think these functions should not be called in interrupt context, is that correct?
3. Have these functions been used/tested for the native Linux IPsec stack via the aead interface on a 32-bit platform?

This could be a bug in 32-bit AESNI usage by the native Linux IPsec stack.

I have included the Oops output, the backtrace and the decoded code below. Please advise: what do you think of this issue, and how can it be fixed?

OOPS info
<snip>
IP: [<c1009880>] __switch_to+0x150/0x190
*pdpt = 0000000030580001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/module/serpent/initstate
<snip>
<snip>
Code: 00 80 7d e7 00 74 05 e8 ff 23 00 00 64 89 35 2c 82 85 c1 89 d8 83 c4 14 5b 5e 5f 5d c3 8d b6 00 00 00 00 89 f6 8b 83 4c 03 00 00 <0f> ae 00 8b 83 4c 03 00 00 e9 15 ff ff ff 66 90 8b 83 4c 03 00

root@dnsubuntu:/linux-source-2.6.38# find -name decodecode
./scripts/decodecode
root@dnsubuntu:/linux-source-2.6.38# echo "Code: 00 80 7d e7 00 74 05 e8 ff 23 00 00 64 89 35 2c 82 85 c1 89 d8 83 c4 14 5b 5e 5f 5d c3 8d b6 00 00 00 00 89 f6 8b 83 4c 03 00 00 <0f> ae 00 8b 83 4c 03 00 00 e9 15 ff ff ff 66 90 8b 83 4c 03 00" | ./scripts/decodecode
Code: 00 80 7d e7 00 74 05 e8 ff 23 00 00 64 89 35 2c 82 85 c1 89 d8 83 c4 14 5b 5e 5f 5d c3 8d b6 00 00 00 00 89 f6 8b 83 4c 03 00 00 <0f> ae 00 8b 83 4c 03 00 00 e9 15 ff ff ff 66 90 8b 83 4c 03 00
All code
========
   0:   00 80 7d e7 00 74       add    %al,0x7400e77d(%eax)
   6:   05 e8 ff 23 00          add    $0x23ffe8,%eax
   b:   00 64 89 35             add    %ah,0x35(%ecx,%ecx,4)
   f:   2c 82                   sub    $0x82,%al
  11:   85 c1                   test   %eax,%ecx
  13:   89 d8                   mov    %ebx,%eax
  15:   83 c4 14                add    $0x14,%esp
  18:   5b                      pop    %ebx
  19:   5e                      pop    %esi
  1a:   5f                      pop    %edi
  1b:   5d                      pop    %ebp
  1c:   c3                      ret
  1d:   8d b6 00 00 00 00       lea    0x0(%esi),%esi
  23:   89 f6                   mov    %esi,%esi
  25:   8b 83 4c 03 00 00       mov    0x34c(%ebx),%eax
  2b:*  0f ae 00                fxsave (%eax)     <-- trapping instruction
  2e:   8b 83 4c 03 00 00       mov    0x34c(%ebx),%eax
  34:   e9 15 ff ff ff          jmp    0xffffff4e
  39:   66 90                   xchg   %ax,%ax
  3b:   8b                      .byte 0x8b
  3c:   83                      .byte 0x83
  3d:   4c                      dec    %esp
  3e:   03 00                   add    (%eax),%eax

Code starting with the faulting instruction
===========================================
   0:   0f ae 00                fxsave (%eax)
   3:   8b 83 4c 03 00 00       mov    0x34c(%ebx),%eax
   9:   e9 15 ff ff ff          jmp    0xffffff23
   e:   66 90                   xchg   %ax,%ax
  10:   8b                      .byte 0x8b
  11:   83                      .byte 0x83
  12:   4c                      dec    %esp
  13:   03 00                   add    (%eax),%eax
root@dnsubuntu:/linux-source-2.6.38#

^C^CInterrupted while waiting for the program.
Give up (and stop debugging it)? (y or n) y
(gdb) target remote /dev/ttyS1
Remote debugging using /dev/ttyS1
fpu_fxsave (prev_p=0xf17c71a0, next_p=0xf5891940) at /linux-source-2.6.38/arch/x86/include/asm/i387.h:209
209 asm volatile("fxsave %[fx]"
(gdb) bt
#0 fpu_fxsave (prev_p=0xf17c71a0, next_p=0xf5891940) at /linux-source-2.6.38/arch/x86/include/asm/i387.h:209
#1 fpu_save_init (prev_p=0xf17c71a0, next_p=0xf5891940) at /linux-source-2.6.38/arch/x86/include/asm/i387.h:238
#2 __save_init_fpu (prev_p=0xf17c71a0, next_p=0xf5891940) at /linux-source-2.6.38/arch/x86/include/asm/i387.h:261
#3 __unlazy_fpu (prev_p=0xf17c71a0, next_p=0xf5891940) at /linux-source-2.6.38/arch/x86/include/asm/i387.h:292
#4 __switch_to (prev_p=0xf17c71a0, next_p=0xf5891940) at arch/x86/kernel/process_32.c:316
#5 0xc151fb3b in context_switch () at kernel/sched.c:2946
#6 schedule () at kernel/sched.c:3999
#7 0xc105073b in __cond_resched () at kernel/sched.c:5258
#8 0xc1520318 in _cond_resched () at kernel/sched.c:5265
#9 0xc1120419 in slab_pre_alloc_hook (s=<value optimized out>, gfpflags=208) at mm/slub.c:795
#10 slab_alloc (s=<value optimized out>, gfpflags=208) at mm/slub.c:1744
#11 kmem_cache_alloc (s=<value optimized out>, gfpflags=208) at mm/slub.c:1770
#12 0xc113ef91 in d_alloc (parent=0x0, name=0xf09d3f24) at fs/dcache.c:1286
#13 0xc113f1ab in d_alloc_pseudo (sb=0xf58b5800, name=<value optimized out>) at fs/dcache.c:1343
#14 0xc1435269 in sock_alloc_file (sock=0xf5667c40, f=0xf09d3f4c, flags=526336) at net/socket.c:365
---Type <return> to continue, or q <return> to quit---
#15 0xc1435326 in sock_map_fd (sock=<value optimized out>, flags=<value optimized out>) at net/socket.c:397
#16 0xc14364ac in sys_socket (family=1, type=1, protocol=0) at net/socket.c:1313
#17 0xc1437768 in sys_socketcall (call=1, args=0xbfdb6398) at net/socket.c:2256
#18 <signal handler called>
#19 0xb7786424 in ?? ()
#20 0xb7721e11 in ?? ()
#21 0xb77222b9 in ?? ()
#22 0xb771f424 in ?? ()
#23 0xb771f7e2 in ?? ()
#24 0xb76b50c9 in ?? ()
#25 0xb76b4a0f in ?? ()
#26 0x08048627 in ?? ()
#27 0xb7633e37 in ?? ()
#28 0x08048501 in ?? ()
(gdb)
</snip>

Thanks & Regards
TimLee
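
P.S. For anyone reading the "Oops: 0002" line in the report: the number is the x86 page-fault error code, and a small helper like the one below can decode it. This is only a sketch. The function name decode_pf_error is made up for illustration, and the bit constants are named after the PF_* flags as I understand them from arch/x86/mm/fault.c and the Intel SDM. For 0x0002 it reports a kernel-mode write to a not-present page, which is consistent with fxsave storing through a NULL state pointer.

```python
# Bit layout of the x86 page-fault error code (assumed from the Intel SDM;
# constant names mirror the kernel's PF_* flags, the helper is hypothetical).
PF_PROT = 1 << 0   # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1  # 0: read access,      1: write access
PF_USER = 1 << 2   # 0: kernel mode,      1: user mode
PF_RSVD = 1 << 3   # reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4  # fault was caused by an instruction fetch


def decode_pf_error(code):
    """Return a readable description of an x86 page-fault error code."""
    parts = [
        "protection violation" if code & PF_PROT else "page not present",
        "write" if code & PF_WRITE else "read",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return ", ".join(parts)


if __name__ == "__main__":
    # Error code taken from the "Oops: 0002 [#1] SMP" line in the report.
    print(decode_pf_error(0x0002))
```

Running it against the value from the report prints "page not present, write, kernel mode".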