https://bugzilla.kernel.org/show_bug.cgi?id=204181

--- Comment #56 from Sergey Kondakov (virtuousfox@xxxxxxxxx) ---
(In reply to Alex Deucher from comment #54)
> (In reply to Sergey Kondakov from comment #53)
> > Or any of these ?
> > options amdgpu cik_support=1 si_support=1 msi=1 disp_priority=2 dpm=1
> > runpm=1 sched_policy=1 compute_multipipe=1 vm_fragment_size=9 gartsize=1024
> > max_num_of_queues_per_device=65536 sched_hw_submission=32 sched_jobs=1024
> > job_hang_limit=8000 halt_if_hws_hang=1 vm_fault_stop=0 vm_update_mode=0
> > deep_color=1 gpu_recovery=1 lockup_timeout=2500,5000,8000,1000 ras_enable=1
> > mcbp=1 queue_preemption_timeout_ms=48 mes=1 hws_gws_support=1 discovery=1
>
> remove all of those. You should use the defaults unless you are
> specifically debugging something.

Then you may consider that I am "specifically debugging" THIS. When I ask these questions here or on freedesktop.org, I am hoping for a factual response from people with actual understanding and experience of how it works and of what a proper way to debug without guesswork would be: knowledge that would compensate for the lack of meaningful documentation and for one of the highest entry barriers in software (even a corporate monstrosity like Intel still can't figure out GPUs, a market dominated by 2 oligopolists that run it with impunity however they feel like, after all).

This third dereference would be really hard to debug, though, because there are no clear reproduction steps, UNLESS you KNOW where and how to look as a developer. Or are you all just going to ignore the presence of kernel-crashing code because it "may" (or may not) be left untriggered by your defaults? So, can you actually tell which code path may result in this or, better yet, test it yourself so that things like that simply do not go into releases?

The original dereference is triggered by the mere presence of PageFlip, which is on by default, so blindly running developer defaults (you can see exactly what I think about them here: https://bugzilla.kernel.org/show_bug.cgi?id=203703#c9 and c11) didn't help anyone much now, did it?

Or can you at least explain what exactly each of these options does, what the desired and undesired consequences may be, and how your consensus about the defaults came to be? A short summary (but not as short as modinfo) or links to mailing list discussions, maybe?

Because my goals (as they are for any desktop user) are: minimal guaranteed latency (meaning full aggressive preemption, the lowest scheduling granularity and strict RT priorities) of the audio/video/input/network pipelines under stress load, in that specific order of priority, with working fast fail-over or recovery instead of hangs and reboots.

If I were using defaults, I would still be sitting on a 3.3 GHz (instead of 4 GHz + 2.4 GHz for MMU & cache) FX CPU, non-ECC RAM run by the notoriously slow AMD FX MMU (you KNOW the one, the laughing stock of 2011-2017 x86 CPUs!) at slow default JEDEC timings, a ~200 W (instead of down-clocked and/or under-volted 90-120 W) RX580 GPU (that would, no doubt, fry itself at some point like my previous 6870 did) with slow memory timings, sluggish non-patched kwin, 64 ms of audio latency (instead of 8-12 ms) and a whole bunch of random hangs/drops in audio, video stuttering and input delays/skips due to scheduling priorities that are all over the place by default. So, no, thank you very much, on that. And YOU should NOT be testing exclusively on defaults either.
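
One way to take some of the guesswork out of a non-default setup like the options line quoted above is to bisect it: load the module with only half of the non-default parameters, see whether the dereference still shows up, and repeat on the half that reproduces it. Below is a minimal sketch of the bookkeeping for that in Python. The helper names (`parse_options`, `split_halves`, `format_options`) are hypothetical and not part of any amdgpu or modprobe tooling; the sketch only manipulates the text of an `options` line and assumes the parameters can be tested independently of each other, which may not hold for all of them.

```python
def parse_options(line):
    """Parse a modprobe.d-style 'options <module> key=value ...' line.

    Returns (module_name, params) where params preserves the original order.
    """
    tokens = line.split()
    if len(tokens) < 2 or tokens[0] != "options":
        raise ValueError("not an 'options' line")
    module = tokens[1]
    params = {}
    for tok in tokens[2:]:
        key, sep, value = tok.partition("=")
        if not sep:
            raise ValueError(f"malformed parameter: {tok!r}")
        params[key] = value  # values such as '2500,5000,8000,1000' stay intact
    return module, params


def split_halves(params):
    """Split the parameters into two ordered halves for one bisection step."""
    keys = list(params)
    mid = len(keys) // 2
    first = {k: params[k] for k in keys[:mid]}
    second = {k: params[k] for k in keys[mid:]}
    return first, second


def format_options(module, params):
    """Render the parameters back into a single 'options' line."""
    return " ".join(["options", module] + [f"{k}={v}" for k, v in params.items()])


if __name__ == "__main__":
    line = "options amdgpu dpm=1 vm_update_mode=0 gpu_recovery=1 mcbp=1"
    module, params = parse_options(line)
    first, second = split_halves(params)
    print(format_options(module, first))
    print(format_options(module, second))
```

Each printed line can be put into a modprobe.d file in turn; whichever half still reproduces the crash gets split again until a single parameter (or an interacting pair) is left.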

(In reply to Tom Seewald from comment #55)
> (In reply to Sergey Kondakov from comment #53)
> > Created attachment 285209 [details]
> > dmesg_2019-09-26-amdgpu-old_dereference_on_patched_5.3.1
> >
> > After about a day of uptime my patched 5.3.1 hanged during hours-long
> > Youtube video with dereference that is almost identical to the original
> one:
>
> I don't believe the patches[1] have landed in a stable kernel release yet,
> at least going by the 5.3.1 change log[2] I don't see any reference to them.
>
> [1] https://patchwork.freedesktop.org/series/64505/
> [2] https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.3.1

They seem to be in the queue for 5.3.2:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/?id=7f2f9d496c3b8809143f1fc14e8cb093cc981d78

BUT those only address dereference #1 (PageFlip), NOT #2 (when vm_update_mode is not 0) and #3!