report bug in kernel sound driver

Dear all,

I want to report a bug in the DaVinci sound driver (sound/soc/davinci/davinci-pcm.c).

The bug is a DMA copy overflow. It causes kernel oopses with a variety of unusual messages.

The problem still seems to be present in the latest stable kernel (version 2.6.35.7).

The symptoms are listed at the end of this mail.


Here is my analysis of this bug:

When the device is created, the function davinci_pcm_new is called. It allocates a large contiguous buffer (typically 128K) for each of playback and capture. These two buffers are used for the DMA copies.

When sound data is recorded, the driver uses DMA to copy data from the data register into the capture buffer that was allocated in davinci_pcm_new.

Each time a DMA copy finishes, the callback davinci_pcm_dma_irq runs and davinci_pcm_enqueue_dma is called. That function sets up the DMA copy parameters again, and this is where the problem is.

It sets the DMA parameters as follows:

    src      = address of the 32-bit sound capture register
    dst      = start of capture buffer + prtd->period * period_size
    src_bidx = 0          // src does not change after each element is copied
    dst_bidx = data_type  // data_type = 2, because only the high 16 bits are sound data
    acnt     = 4
    bcnt     = 2048
    cnt      = 1

With these parameters, the DMA engine internally works like this:

    for (c = 0; c < cnt; c++) {
        for (b = 0; b < bcnt; b++) {
            memcpy(dst, src, 4);   // acnt = 4 bytes per element
            src += src_bidx;       // src_bidx = 0
            dst += dst_bidx;       // dst_bidx = data_type = 2 (16-bit sound data)
        }
    }

This copy fills the whole dst buffer with the high 16 bits of the source data. But every element is written as 4 bytes while dst advances by only 2, so the last element of each transfer writes 2 bytes past the end of its period.

Each time a DMA copy finishes, it has therefore changed 4K bytes + 2 bytes. The extra 2 bytes are the DMA copy overflow.

This causes no error until the last period is copied. At that point a total of 128K + 2 bytes has been written, but only 128K bytes were allocated.

The other 2 bytes are kernel memory that the kernel may be using for anything. Because those 2 bytes are written by DMA, the kernel

knows nothing about the corruption.
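
The arithmetic above can be checked on any host with a small model of the transfer. This is only a sketch, not driver code: dma_write_end and dma_overflow are hypothetical helpers that follow the copy loop described in this mail, and the layout of 32 periods of 4096 bytes in a 128K buffer is an assumption derived from the typical values quoted above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of one DMA transfer as described above: bcnt elements
 * of acnt bytes each, with the destination advancing by dst_bidx bytes
 * after every element. Returns the offset one past the last byte written,
 * relative to the start of the buffer.
 */
static size_t dma_write_end(size_t dst_off, size_t acnt, size_t bcnt,
                            size_t dst_bidx)
{
    assert(acnt > 0 && bcnt > 0);
    /* The last element starts at dst_off + (bcnt - 1) * dst_bidx. */
    return dst_off + (bcnt - 1) * dst_bidx + acnt;
}

/* Number of bytes written past a buffer of buf_size bytes (0 if none). */
static size_t dma_overflow(size_t write_end, size_t buf_size)
{
    return write_end > buf_size ? write_end - buf_size : 0;
}
```

With acnt = 4, bcnt = 2048 and dst_bidx = 2, dma_write_end(0, 4, 2048, 2) is 4098 for the first period; the 2 extra bytes land in the next period and are overwritten later. For the last of 32 periods, dma_write_end(31 * 4096, 4, 2048, 2) is 131074, so dma_overflow(131074, 128 * 1024) is 2.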

An easy way to fix the problem is to change this check in function davinci_pcm_enqueue_dma:

    if (unlikely(prtd->period >= runtime->periods))
        prtd->period = 0;

to:

    if (unlikely(prtd->period >= (runtime->periods - 1)))
        prtd->period = 0;
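
The effect of changing the wrap check can be illustrated with a small host-side model. This is a sketch under stated assumptions, not the driver source: max_write_end is a hypothetical helper, and it assumes the period counter is advanced after each transfer is queued and then compared against the wrap limit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not the driver source: walk the period counter for
 * `steps` transfers, wrapping it to 0 once it reaches `wrap_at`, and return
 * the highest end offset (one past the last byte written) that was seen.
 * Each transfer writes period_bytes plus `overshoot` extra bytes.
 */
static size_t max_write_end(unsigned wrap_at, size_t period_bytes,
                            size_t overshoot, unsigned steps)
{
    size_t max_end = 0;
    unsigned period = 0;

    assert(wrap_at > 0);
    for (unsigned i = 0; i < steps; i++) {
        size_t end = (size_t)period * period_bytes + period_bytes + overshoot;

        if (end > max_end)
            max_end = end;
        /* Assumed order: advance the counter, then apply the wrap check. */
        if (++period >= wrap_at)
            period = 0;
    }
    return max_end;
}
```

Assuming 32 periods of 4096 bytes and a 2-byte overshoot, max_write_end(32, 4096, 2, 100) is 131074, which is 2 bytes past a 128K buffer, while max_write_end(31, 4096, 2, 100) is 126978, which stays inside it. Under this model the changed check avoids the overflow by never filling the last period of the buffer.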


The symptoms are below:

Symptom 1:
Bad pte = 04040202, process = sleep, vm_flags = 1875, vaddr = 1b000
VM: killing process sleep
Bad pte = 04040601, process = ???, vm_flags = 1875, vaddr = 17000
Bad pte = ffffffff, process = ???, vm_flags = 1875, vaddr = 43000
Bad pte = 00000001, process = ???, vm_flags = 1875, vaddr = 44000
......
Bad pte = 00000001, process = ???, vm_flags = 1875, vaddr = 88000

Symptom 2:

Unhandled fault: page domain fault (0x8fb) at 0x00011008
Internal error: : 8fb [#1]
Modules linked in: tlv320aic24 dm365_gpio dm365_pwm davinci_vpbe davinci_capture dm365_imp dm365mmap edmak irqk cmemk
CPU: 0
PC is at __copy_to_user+0x54/0x3a8
LR is at 0x5eff968
pc : [<c0117568>]    lr : [<05eff968>]    Not tainted
sp : c436befc ip : e4640f80 fp : c436bf4c
r10: 00000000  r9 : c436a000  r8 : dcfd0362
r7 : 0ee2fab7  r6 : f7a60e69  r5 : fe9cf7d3  r4 : 026603c7
r3 : 0b7de3b1  r2 : 00000760  r1 : c5056020  r0 : 00011008
Flags: nzCv IRQs on FIQs on Mode SVC_32 Segment user
Control: 5317F
Table: 843C0000  DAC: 00000015
,,,,,,,,,,,,,,,,,,,,,,,,
page:c0363be0 flags:0x00000068 mapping:c4273d18 mapcount:0 count:0
Trying to fix it up, but a reboot is needed


Symptom 3:

159.99.249.249 login: VM: killing process video_test
Bad pte = 00000003, process = ???, vm_flags = 1875, vaddr = 9000
Bad pte = 00000005, process = ???, vm_flags = 1875, vaddr = b000
,,,,,,,,,,,,,,,,,,,,,
Bad pte = 00000001, process = ???, vm_flags = 100077, vaddr = 31000
Bad page state in process 'desched/0'
page:c035e3e0 flags:0x0000006c mapping:c06ecec8 mapcount:0 count:0
Trying to fix it up, but a reboot is needed


Symptom 4:

159.99.249.249 login: Bad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000
Stopping interneBad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000
t superserver: iBad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000
netdBad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000
Bad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000
Bad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Bad pte = ffb7ffb6, process = inetd, vm_flags = 100177, vaddr = bea82000

Symptom 5:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
done.
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 817 [#1]
Modules linked in: tlv320aic24 dm365_gpio dm365_pwm davinci_vpbe davinci_capture dm365_imp dm365mmap edmak irqk cmemk
CPU: 0
PC is at __free_pages+0x18/0x58
LR is at __init_begin+0x3fff8000/0x30
pc : [<c007626c>]    lr : [<00000000>]    Not tainted
sp : c03cdf50  ip : c03cdf60  fp : c03cdf5c
r10: c02de000  r9 : 00000002  r8 : c02ca460
r7 : 00000000  r6 : 843cffd0  r5 : c43c0000  r4 : c03cc000
r3 : 00000000  r2 : c02ca444  r1 : 00000000  r0 : c03659e0
Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment kernel
Control: 5317F
Table: 805BC000  DAC: 00000017
Process desched/0 (pid: 11, stack limit = 0xc03cc258)
Stack: (0xc03cdf50 to 0xc03ce000)
df40:                                     c03cdf84 c03cdf60 c003ad7c c0076264
df60: c002b6c0 c002b6c0 00000000 c02c2990 00000001 c02c2998 c03cdf9c c03cdf88 
df80: c0045d54 c003ac64 c03b3f18 c03cc000 c03cdfcc c03cdfa0 c0047b2c c0045d38 
dfa0: 00000000 00000000 c03cc000 c0047a7c c03b3f18 00000000 00000000 00000000 
dfc0: c03cdff4 c03cdfd0 c005eca8 c0047a8c ffffffff ffffffff 00000000 00000000 
dfe0: 00000000 00000000 00000000 c03cdff8 c004ba28 c005ebd0 00000000 00000000 
Backtrace: 
[<c0076254>] (__free_pages+0x0/0x58) from [<c003ad7c>] (free_pgd_slow+0x128/0x148)
[<c003ac54>] (free_pgd_slow+0x0/0x148) from [<c0045d54>] (__mmdrop+0x2c/0x48)
[<c0045d28>] (__mmdrop+0x0/0x48) from [<c0047b2c>] (desched_thread+0xb0/0x130)
 r4 = C03CC000
[<c0047a7c>] (desched_thread+0x0/0x130) from [<c005eca8>] (kthread+0xe8/0x128)
[<c005ebc0>] (kthread+0x0/0x128) from [<c004ba28>] (do_exit+0x0/0x9cc)
 r7 = 00000000  r6 = 00000000  r5 = 00000000  r4 = 00000000
Code: e24cb004 e5903004 e1a0e001 e3530000 (05833000)
 prev->state: 2 != TASK_RUNNING??
desched/0/11[CPU#0]: BUG in __schedule at kernel/sched.c:3826

Symptom 6:
VM: killing process sys_monitor
Bad pte = e1a0c00d, process = ???, vm_flags = 100077, vaddr = 12000
Bad pte = e1a0c00d, process = ???, vm_flags = 100077, vaddr = 17000
Bad pte = e1a00001, process = ???, vm_flags = 100077, vaddr = 1a000
Bad pte = e3a00001, process = ???, vm_flags = 100077, vaddr = 22000
Bad pte = e1a0c00d, process = ???, vm_flags = 100077, vaddr = 24000
Bad pte = e1a04003, process = ???, vm_flags = 100077, vaddr = 29000
Bad pte = e0821001, process = ???, vm_flags = 100077, vaddr = 2a000
Bad pte = 979ff101, process = ???, vm_flags = 100077, vaddr = 2c000
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: Oops: 817 [#1]
Modules linked in: tlv320aic24 dm365_gpio dm365_pwm davinci_vpbe davinci_capture dm365_imp dm365mmap edmak irqk cmemk
CPU: 0
PC is at __free_pages+0x18/0x58
LR is at __init_begin+0x3fff8000/0x30
pc : [<c007626c>]    lr : [<00000000>]    Not tainted
sp : c434de98  ip : c434dea8  fp : c434dea4
r10: c02de000  r9 : c40b26e0  r8 : c02ca460
r7 : 00000000  r6 : 8434ffd1  r5 : c43c0000  r4 : c434c000
r3 : 00000000  r2 : c02ca444  r1 : 00000000  r0 : c03649e0
Flags: nZCv IRQs on FIQs on Mode SVC_32 Segment user
Control: 5317F
Table: 8437C000  DAC: 00000015
Process sys_monitor (pid: 581, stack limit = 0xc434c258)
Stack: (0xc434de98 to 0xc434e000)
de80:                                                       c434decc c434dea8
dea0: c003ad7c c0076264 c40b26e0 c40b26e0 c40b2714 c0495ac0 00000009 00008fa0 
dec0: c434dee4 c434ded0 c0045d54 c003ac64 c0495ac0 c40b26e0 c434defc c434dee8 
dee0: c0045e40 c0045d38 c0063250 c434c000 c434df1c c434df00 c004a28c c0045d80 
df00: c434c000 c0495ac0 c0495ac0 00000001 c434df3c c434df20 c004bbd8 c004a180 
df20: c434df84 c434df40 c00398ec c0049190 c434df84 c434df40 c00398f4 c004ba38 
df40: 00000001 00000000 be90fb28 00000000 c434dfb0 00000000 c434de58 ffffffff 
df60: 00000000 be90fb28 00000000 be90fba8 00000003 be90fc84 c434df9c c434df88 
df80: c00399fc c0039744 0000008e ffffffff c434dfac c434dfa0 c0039aac c00399f0 
dfa0: 00000000 c434dfb0 c0032d88 c0039aa4 00000000 be90fb28 00000000 00000000 
dfc0: be90fc90 00000000 be90fb28 00000000 be90fba8 00000003 be90fc84 00000004 
dfe0: 00000000 be90fb08 00008fa0 00008fa0 00000010 ffffffff 00000000 00000000 

Backtrace: 
[<c0076254>] (__free_pages+0x0/0x58) from [<c003ad7c>] (free_pgd_slow+0x128/0x148)
[<c003ac54>] (free_pgd_slow+0x0/0x148) from [<c0045d54>] (__mmdrop+0x2c/0x48)
[<c0045d28>] (__mmdrop+0x0/0x48) from [<c0045e40>] (mmput+0xd0/0xdc)
 r4 = C40B26E0
[<c0045d70>] (mmput+0x0/0xdc) from [<c004a28c>] (exit_mm+0x11c/0x120)
 r4 = C434C000
[<c004a170>] (exit_mm+0x0/0x120) from [<c004bbd8>] (do_exit+0x1b0/0x9cc)
 r7 = 00000001  r6 = C0495AC0  r5 = C0495AC0  r4 = C434C000
[<c004ba28>] (do_exit+0x0/0x9cc) from [<c00398f4>] (do_page_fault+0x1c0/0x228)
[<c0039734>] (do_page_fault+0x0/0x228) from [<c00399fc>] (do_translation_fault+0x1c/0xb4)
[<c00399e0>] (do_translation_fault+0x0/0xb4) from [<c0039aac>] (do_PrefetchAbort+0x18/0x1c)
 r4 = FFFFFFFF
[<c0039a94>] (do_PrefetchAbort+0x0/0x1c) from [<c0032d88>] (ret_from_exception+0x0/0x10)
Code: e24cb004 e5903004 e1a0e001 e3530000 (05833000)
 <1>Fixing recursive fault but reboot is needed!




Thanks and Best Regards

Honeywell
Ivan Zhang(wenjie.zhang@xxxxxxxxxxxxx)
Firmware Engineer - Honeywell Security R&D - Asia Pacific
No.430 Li Bing Road, Zhang Jiang Hi-Tech Park,
Pudong New Area, Shanghai, China (201203)
Tel: (8621)-28942292

_______________________________________________
Alsa-devel mailing list
Alsa-devel@xxxxxxxxxxxxxxxx
http://mailman.alsa-project.org/mailman/listinfo/alsa-devel


