Re: [BUG] KCSAN: data-race in drm_sched_entity_is_ready [gpu_sched] / drm_sched_entity_push_job [gpu_sched]

Christian König <christian.koenig@xxxxxxx> · Fri, 25 Aug 2023 09:05:38 +0200

Hi Mirsad,

the name SPSC stands for SingleProducerSingleConsumer, so yes even by 
the name of the component we make it clear that this can only be used by 
one producer and one consumer thread at the same time.

Here disabling preemption is just done so that the consumer thread 
doesn't busy loop for the producer thread to be scheduled in again.

Regards,
Christian.

Am 24.08.23 um 19:46 schrieb Mirsad Goran Todorovac:
Thank you, Christian.

Glad to hear about that.

However, I guess this assumes that this piece of code between

-----<>-----
     preempt_disable();

     tail = (struct spsc_node **)atomic_long_xchg(&queue->tail, 
(long)&node->next);
     WRITE_ONCE(*tail, node);
     atomic_inc(&queue->job_count);

     /*
      * In case of first element verify new node will be visible to 
the consumer
      * thread when we ping the kernel thread that there is new work 
to do.
      */
     smp_wmb();

     preempt_enable();
-----<>-----

... executes only on one CPU/core/thread?

I understood that preempt_disable() disables only interrupts on one 
core/CPU:

https://kernelnewbies.kernelnewbies.narkive.com/6LTlgsAe/preempt-disable-disables-preemption-on-all-processors 

So, we might have a race in theory between WRITE_ONCE() and atomic_inc().

Kind regards,
Mirsad

On 8/21/2023 8:22 PM, Christian König wrote:
I'm not sure about that.

On the one hand it might generate some noise. I know tons of cases 
where logic is: Ok if we see the updated value immediately it will 
optimize things, but if not it's unproblematic because there is 
another check after the next memory barrier.

On the other hand we probably have cases where this is not correctly 
implemented. So double checking those would most like be good idea.

Regards,
Christian.

Am 21.08.23 um 16:28 schrieb Mirsad Todorovac:
Hi Christian,

Thank you for the update.

Should I continue reporting what KCSAN gives? I will try to filter 
these to save your time for
evaluation ...

Kind regards,
Mirsad

On 8/21/23 15:20, Christian König wrote:
Hi Mirsad,

well this is a false positive.

That drm_sched_entity_is_ready() doesn't see the data written by 
drm_sched_entity_push_job() is part of the logic here.

Regards,
Christian.

Am 18.08.23 um 15:44 schrieb Mirsad Todorovac:
On 8/17/23 21:54, Mirsad Todorovac wrote:
Hi,

This is your friendly bug reporter.

The environment is vanilla torvalds tree kernel on Ubuntu 22.04 
LTS and a Ryzen 7950X box.

Please find attached the complete dmesg output from the ring 
buffer and lshw output.

NOTE: The kernel reports tainted kernel, but to my knowledge 
there are no proprietary (G) modules,
       but this taint is turned on by the previous bugs.

dmesg excerpt:

[ 8791.864576] 
==================================================================
[ 8791.864648] BUG: KCSAN: data-race in drm_sched_entity_is_ready 
[gpu_sched] / drm_sched_entity_push_job [gpu_sched]

[ 8791.864776] write (marked) to 0xffff9b74491b7c40 of 8 bytes by 
task 3807 on cpu 18:
[ 8791.864788]  drm_sched_entity_push_job+0xf4/0x2a0 [gpu_sched]
[ 8791.864852]  amdgpu_cs_ioctl+0x3888/0x3de0 [amdgpu]
[ 8791.868731]  drm_ioctl_kernel+0x127/0x210 [drm]
[ 8791.869222]  drm_ioctl+0x38f/0x6f0 [drm]
[ 8791.869711]  amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu]
[ 8791.873660]  __x64_sys_ioctl+0xd2/0x120
[ 8791.873676]  do_syscall_64+0x58/0x90
[ 8791.873688]  entry_SYSCALL_64_after_hwframe+0x73/0xdd

[ 8791.873710] read to 0xffff9b74491b7c40 of 8 bytes by task 1119 
on cpu 27:
[ 8791.873722]  drm_sched_entity_is_ready+0x16/0x50 [gpu_sched]
[ 8791.873786]  drm_sched_select_entity+0x1c7/0x220 [gpu_sched]
[ 8791.873849]  drm_sched_main+0xd2/0x500 [gpu_sched]
[ 8791.873912]  kthread+0x18b/0x1d0
[ 8791.873924]  ret_from_fork+0x43/0x70
[ 8791.873939]  ret_from_fork_asm+0x1b/0x30

[ 8791.873955] value changed: 0x0000000000000000 -> 
0xffff9b750ebcfc00

[ 8791.873971] Reported by Kernel Concurrency Sanitizer on:
[ 8791.873980] CPU: 27 PID: 1119 Comm: gfx_0.0.0 Tainted: 
G             L 6.5.0-rc6-net-cfg-kcsan-00038-g16931859a650 #35
[ 8791.873994] Hardware name: ASRock X670E PG Lightning/X670E PG 
Lightning, BIOS 1.21 04/26/2023
[ 8791.874002] 
==================================================================

P.S.

According to Mr. Heo's instructions, I am adding the unwound trace 
here:

[ 1879.706518] 
==================================================================
[ 1879.706616] BUG: KCSAN: data-race in drm_sched_entity_is_ready 
[gpu_sched] / drm_sched_entity_push_job [gpu_sched]

[ 1879.706737] write (marked) to 0xffff8f3672748c40 of 8 bytes by 
task 4087 on cpu 10:
[ 1879.706748] drm_sched_entity_push_job 
(./include/drm/spsc_queue.h:74 
drivers/gpu/drm/scheduler/sched_entity.c:574) gpu_sched
[ 1879.706808] amdgpu_cs_ioctl 
(drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1375 
drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1469) amdgpu
[ 1879.710589] drm_ioctl_kernel (drivers/gpu/drm/drm_ioctl.c:788) drm
[ 1879.711068] drm_ioctl (drivers/gpu/drm/drm_ioctl.c:892) drm
[ 1879.711551] amdgpu_drm_ioctl 
(drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2748) amdgpu
[ 1879.715319] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 
fs/ioctl.c:856 fs/ioctl.c:856)
[ 1879.715334] do_syscall_64 (arch/x86/entry/common.c:50 
arch/x86/entry/common.c:80)
[ 1879.715345] entry_SYSCALL_64_after_hwframe 
(arch/x86/entry/entry_64.S:120)

[ 1879.715365] read to 0xffff8f3672748c40 of 8 bytes by task 1098 
on cpu 11:
[ 1879.715376] drm_sched_entity_is_ready 
(drivers/gpu/drm/scheduler/sched_entity.c:134) gpu_sched
[ 1879.715435] drm_sched_select_entity 
(drivers/gpu/drm/scheduler/sched_main.c:248 
drivers/gpu/drm/scheduler/sched_main.c:893) gpu_sched
[ 1879.715495] drm_sched_main 
(drivers/gpu/drm/scheduler/sched_main.c:1019) gpu_sched
[ 1879.715554] kthread (kernel/kthread.c:389)
[ 1879.715563] ret_from_fork (arch/x86/kernel/process.c:145)
[ 1879.715575] ret_from_fork_asm (arch/x86/entry/entry_64.S:312)

[ 1879.715590] value changed: 0x0000000000000000 -> 
0xffff8f360663dc00

[ 1879.715604] Reported by Kernel Concurrency Sanitizer on:
[ 1879.715612] CPU: 11 PID: 1098 Comm: gfx_0.0.0 Tainted: 
G             L     6.5.0-rc6+ #47
[ 1879.715624] Hardware name: ASRock X670E PG Lightning/X670E PG 
Lightning, BIOS 1.21 04/26/2023
[ 1879.715631] 
==================================================================

It seems that the line in question might be:

    first = spsc_queue_push(&entity->job_queue, 
&sched_job->queue_node);

which expands to:

static inline bool spsc_queue_push(struct spsc_queue *queue, 
struct spsc_node *node)
{
    struct spsc_node **tail;

    node->next = NULL;

    preempt_disable();

    tail = (struct spsc_node **)atomic_long_xchg(&queue->tail, 
(long)&node->next);
    WRITE_ONCE(*tail, node);
    atomic_inc(&queue->job_count);

    /*
     * In case of first element verify new node will be visible to 
the consumer
     * thread when we ping the kernel thread that there is new 
work to do.
     */
    smp_wmb();

    preempt_enable();

    return tail == &queue->head;
}

According to the manual, preempt_disable() only guaranteed 
exclusion on a single CPU/core/thread, so
we might be plagued with the slow, old fashioned locking unless 
anyone had a better idea.

Best regards,
Mirsad Todorovac