On Tue, Jun 06, 2017 at 10:45:42PM +0800, Alex Deucher wrote: > On Tue, Jun 6, 2017 at 10:03 AM, Alex Deucher <alexdeucher at gmail.com> wrote: > > On Tue, Jun 6, 2017 at 7:22 AM, Christian König > > <christian.koenig at amd.com> wrote: > >>> Yes, I agree with you. That's also my orignal opinion. > >>> But we encountered a random buggy when we were calling > >>> device_cache_fw_images. > >> > >> That looks like an upstream bug in device_cache_fw_images. > >> > >> We should probably open a bug report and ping the maintainer. Most likely we > >> are not correctly using the FW interface or trigger a rare bug or something > >> like this. > >> > >>> So then I check these functions and find gpu_info errors. The random buggy > >>> cannot be reproduced constantly.But we expected it can pass more than 30 > >>> cycles > >>> of S3 suspend and resume. Any ideas? > >> > >> I think the real solution is to just stop calling > >> amdgpu_device_parse_gpu_info_fw() during resume. > > > > Right. we only need to parse the firmware once during startup. > > How are hitting this on resume? amdgpu_device_parse_gpu_info_fw() is > called indirectly from amdgpu_device_init() which is only called once > at driver load time. > Yes, I also noted it. So I am confused with why firmware_class will still cache it during suspend. Thanks, Ray