Comment # 12
on bug 108272
from Jan Vesely
Hi,

sorry for the delay. Somehow I missed the notifications.

(In reply to jamespharvey20 from comment #11)
> When I originally filed this, I assumed it was 1 bug since I tried 2 things
> with OpenCL, and both failed with opencl-mesa but worked with opencl-amd.
>
> Jan Vesely was correct that there were two separate problems.
>
> I'm hoping Jan Vesely can give guidance on whether to leave this bug open
> for any of the reasons below, or if I should close it and potentially open
> up 1-2 new bugs.
>
> The original luxmark bug (segfault) is solved, but that exposes 2 new
> opencl-mesa bugs when running luxmark.
>
> The original IndigoBenchmark bug (segfault) isn't solved, but as explained
> below, I understand if we have to consider that unsolvable for now.
>
> I don't think this affects any of these bugs, but I'll mention a few weeks
> ago, I switched back to my Asus Radeon R9 390. The same behaviors discussed
> in this entire bug report occur. (i.e. 18.2.3 and before crash luxmark.)
> If someone really wants me to do so, I can switch back to the RX 580 to test
> 18.2.4, but I'm betting since it works properly with the R9 390 that the
> problem is fixed.
>
> ORIGINAL LUXMARK BUG #1
> -----------------------------------------
>
> Using mesa 18.2.4, the luxmark segfault is solved.

As this was the first bug, I'd close this one and open new bugs for both
indigo and incorrect rendering in luxmark.

> NEW - LUXMARK BUG #2
> ------------------------------------
>
> Jan Vesely's comment on 2018-10-09 mentions: "bumping MAX_GLOBAL_BUFFERS to
> 32 allows luxmark to run, albeit still with many incorrect pixels -- libclc
> rounding conversions are incorrect."
>
> That's what I'm seeing out of 18.2.4. Using LuxBall HDR (Simple Benchmark):
>
> MESA 18.2.4: 40626 (Image validation OK (65739 different pixels, 10.27%)
>
> AMDGPU-PRO: 15739 (Image validation OK (5736 different pixels, 0.90%)
>
> There's no typos there.
> opencl-mesa scores almost unbelievably higher than
> opencl-amd, but the different pixels percentage increases by a factor of
> 11.4.
>
> As Jan's other comment on 2018-10-09 mentions, the image looks garbled and
> the results are incorrect.
>
> Not sure if this bug should be left open for this issue, or if I should
> create a new bug. (Or, if there is a bug already open for it.) Or, if mesa
> will say it's purely libclc's problem, and to go to them about it.

I'd say this is probably a purely libclc problem, but feel free to open the
bug against clover on freedesktop. 10% is rather good; I usually saw ~30%
wrong pixels on my machines.

> NEW - LUXMARK BUG #3
> ------------------------------------
>
> Although luxmark can now benchmark, when doing so, all input becomes
> unusably awful. It reminds me of when Windows has too many things open,
> suddenly decided it can't cope, and you're waiting to see if it's going to
> recover or crash. Keystrokes take too long to be printed, and the mouse
> becomes slow and jumpy. Top shows cpu and memory usage are fine, which was
> my first thought. BTW, running xf86-video-amdgpu 18.1.0, and when I
> upgraded mesa, it was both mesa and opencl-mesa.
>
> In comparison, if I use opencl-amd, input is not affected. I wouldn't even
> know the GPU is being slammed.
>
> Using the program radeontop, I can see when using mesa, "Graphics pipe",
> "Texture Addresser", and "Shader Interpolator" are between 95-100%, usually
> 98-100%.
>
> When using opencl-amd, radeontop shows the same. (Granted, Vertex Grouper +
> Tesselator / Shader Export/Scan Converter/Depth Block/Color Block bounce
> between 5-20% vs on opencl-mesa, they bounce between 1-5%.)

This sounds like a GPU priority/scheduling problem. I haven't looked into
whether it can be solved by opening a lower-priority pipe for compute, or
whether we need to enable advanced features like CWSR. Please open a separate
bug. Hogging a large portion of the GPU might explain some of that high score.
> INDIGO BUG
> ------------------
>
> I edited 18.2.4's si_get.c to be very short:
>
> snprintf(sscreen->renderer_string, sizeof(sscreen->renderer_string),
> "%s",
> chip_name);
>
> And compiled/installed it, but it didn't affect the crash.
>
> IndigoBenchmark said they're statically linking with LLVM 3.4, which is
> quite old. But, it runs fine with opencl-amd, and only crashes on
> opencl-mesa. I just posted a followup "where do we go from here"-ish
> comment there which has to be moderator approved so isn't showing yet.
> https://www.indigorenderer.com/forum/viewtopic.php?f=37&t=14986
>
> Part of me thinks it needs to be given up on, being a closed-source
> precompiled binary statically linked against LLVM 3.4.
>
> Part of me thinks since it only crashes with opencl-mesa, and runs perfectly
> fine with opencl-amd, there's probably (but not definitely) a bug in
> opencl-mesa.
>
> But, I understand since they don't seem to be paying this any attention, we
> may have to give up on the Indigo Bug as being unable to be realistically
> investigated further.

Can you check if indigo exports any LLVM symbols? It might be that we end up
using those instead of the new ones from libLLVM.*; if that's the case, one
solution would be to link mesa/clover with static LLVM. Enabling symbol
versioning for LLVM should work as well.
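
One possible way to do the symbol check is sketched below. It is only an
illustration under stated assumptions: GNU binutils `nm` is installed, the
function names are hypothetical, and the name filter is a heuristic (C++ LLVM
symbols mangle with `4llvm` for `namespace llvm`, and the LLVM C API uses an
`LLVM` prefix), so it may report unrelated symbols or miss some.

```python
import subprocess


def llvm_symbols(nm_output):
    """Return symbol names from `nm` output that look like LLVM symbols.

    Heuristic (an assumption, not an exact test): mangled C++ names in
    namespace llvm contain "4llvm"; LLVM C API names start with "LLVM".
    """
    found = []
    for line in nm_output.splitlines():
        parts = line.split()
        if not parts:
            continue
        name = parts[-1]  # nm prints: [address] type name
        if "4llvm" in name or name.startswith("LLVM"):
            found.append(name)
    return found


def exported_llvm_symbols(path):
    """Run nm on `path` and return its exported LLVM-looking symbols."""
    # -D reads the dynamic symbol table; --defined-only skips symbols the
    # binary merely imports, leaving the ones it exports.
    result = subprocess.run(
        ["nm", "-D", "--defined-only", path],
        capture_output=True, text=True, check=True,
    )
    return llvm_symbols(result.stdout)
```

Calling `exported_llvm_symbols()` on the indigo executable (and on any shared
objects it ships) returns a list; a non-empty result would suggest the binary
exports its bundled LLVM 3.4 symbols, which could then be picked up in place
of the ones from libLLVM.*.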