Re: GuC issue

Hello,

Thanks again for the reply.

I have implemented the __uc_sanitize() function in my code to reset the GuC just before uploading the HuC firmware blob.
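
For reference, the reset step itself can be sketched roughly as below. This is only an illustration under stated assumptions, not the i915 implementation: the register offset 0x941c and the use of bit 3 as the GuC reset domain are taken from i915's GEN6_GDRST and GEN11_GRDOM_GUC definitions, while mmio_ops, guc_domain_reset() and the fake_* helpers are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, mirroring i915's GEN6_GDRST and GEN11_GRDOM_GUC. */
#define GDRST_OFFSET 0x941cu
#define GRDOM_GUC    (1u << 3) /* bit 3 of GDRST */

/* Hypothetical MMIO accessors supplied by the host driver. */
struct mmio_ops {
    uint32_t (*read32)(void *ctx, uint32_t offset);
    void (*write32)(void *ctx, uint32_t offset, uint32_t value);
};

/*
 * Request a GuC domain reset, then poll until the hardware clears the bit.
 * Returns true on success, false if the bit is still set after max_polls reads.
 */
static bool guc_domain_reset(const struct mmio_ops *ops, void *ctx,
                             unsigned int max_polls)
{
    assert(ops && ops->read32 && ops->write32);

    ops->write32(ctx, GDRST_OFFSET, GRDOM_GUC);
    for (unsigned int i = 0; i < max_polls; i++) {
        if ((ops->read32(ctx, GDRST_OFFSET) & GRDOM_GUC) == 0)
            return true;
    }
    return false;
}

/*
 * Simulated register for trying the sketch without hardware: the reset bit
 * stays set for polls_until_done reads and is cleared on the next read.
 */
struct fake_gdrst {
    uint32_t value;
    unsigned int polls_until_done;
};

static uint32_t fake_read32(void *ctx, uint32_t offset)
{
    struct fake_gdrst *reg = ctx;

    (void)offset;
    if (reg->polls_until_done > 0)
        reg->polls_until_done--;
    else
        reg->value &= ~GRDOM_GUC;
    return reg->value;
}

static void fake_write32(void *ctx, uint32_t offset, uint32_t value)
{
    struct fake_gdrst *reg = ctx;

    (void)offset;
    reg->value = value;
}

static const struct mmio_ops fake_ops = { fake_read32, fake_write32 };
```

On real hardware the poll loop would need a time-based timeout rather than a read count, and the accessors would map to the platform's own register access functions.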

My current setup is as follows: I have a TGL platform and a Safety Critical OS. I load the GuC and authenticate the HuC successfully. Then I kill my Safety Critical KMD driver and load it again, which triggers GuC reset, HuC loading, GuC loading, enabling communication, etc.

Unfortunately, when I try to enable communication for the second time (without a cold reset), guc_action_control_ctb(), specifically intel_guc_send_mmio(), returns GUC_HXG_TYPE_RESPONSE_FAILURE with error 774 (0x306).

I'm not sure what it means.
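
For context, 774 and 0x306 are the same value in decimal and hex. A minimal sketch of how the failure header could be split into its fields is shown below. The bit layout (type in bits 30:28, hint in bits 27:16, error in bits 15:0, failure type value 6) is an assumption based on i915's abi/guc_messages_abi.h. The struct and function names are hypothetical, and the sketch does not say what the GuC firmware means by this particular error code.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout assumed from i915's abi/guc_messages_abi.h. */
#define HXG_MSG_0_TYPE_SHIFT      28
#define HXG_MSG_0_TYPE_MASK       0x7u
#define HXG_TYPE_RESPONSE_FAILURE 6u
#define HXG_FAILURE_HINT_SHIFT    16
#define HXG_FAILURE_HINT_MASK     0xfffu
#define HXG_FAILURE_ERROR_MASK    0xffffu

/* Hypothetical container for the two fields of a failure response. */
struct hxg_failure {
    uint32_t hint;
    uint32_t error;
};

/*
 * Decode the first dword of an HXG response.
 * Returns 1 and fills *out if the message is a RESPONSE_FAILURE,
 * returns 0 (leaving *out untouched) for any other message type.
 */
static int hxg_decode_failure(uint32_t header, struct hxg_failure *out)
{
    uint32_t type;

    assert(out);

    type = (header >> HXG_MSG_0_TYPE_SHIFT) & HXG_MSG_0_TYPE_MASK;
    if (type != HXG_TYPE_RESPONSE_FAILURE)
        return 0;

    out->hint = (header >> HXG_FAILURE_HINT_SHIFT) & HXG_FAILURE_HINT_MASK;
    out->error = header & HXG_FAILURE_ERROR_MASK;
    return 1;
}
```

Printing both the hint and the error field alongside the raw dword makes it easier to match the response against the firmware log.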

The log buffer just after guc_action_control_ctb() is attached to this email.
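
Because an earlier dump in this thread was corrupted by Windows newline translation (a 0x0D was inserted before each 0x0A), here is a minimal sketch of writing the buffer out in binary mode. It assumes a hosted C environment with stdio available. dump_log_binary() and file_size() are hypothetical helper names, not part of any driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write len bytes of buf to path without any newline translation.
 * The "b" in the mode string is what prevents 0x0A from being
 * expanded to 0x0D 0x0A on platforms that distinguish text mode.
 * Returns 0 on success, -1 on any open, write or close failure.
 */
static int dump_log_binary(const char *path, const void *buf, size_t len)
{
    FILE *f;
    size_t written;
    int close_err;

    assert(path && buf);

    f = fopen(path, "wb");
    if (!f)
        return -1;

    written = fwrite(buf, 1, len, f);
    close_err = fclose(f);
    return (written == len && close_err == 0) ? 0 : -1;
}

/* Return the size of the file in bytes, or -1 if it cannot be read. */
static long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size = -1;

    if (!f)
        return -1;
    if (fseek(f, 0, SEEK_END) == 0)
        size = ftell(f);
    fclose(f);
    return size;
}
```

Comparing the resulting file size with the size of the log allocation is a quick sanity check: any extra bytes suggest that translation still happened somewhere along the way.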

Could anybody look into it?

I really appreciate your help,
Maksym




On Tuesday, 27 February 2024 at 9:03 PM, John Harrison <john.c.harrison@xxxxxxxxx> wrote:

> 
> 
> On 2/26/2024 08:30, maksym@xxxxxxxxxxx wrote:
> 
> > Hello,
> > 
> > Thank you for your help.
> > 
> > Is there a possibility to load GuC, then "unload" it and load it again without cold reset?
> 
> You need to reset the GuC at least - bit 3 of GDRST. The GuC cannot be
> reloaded 'live'. It must be put into reset first. The last line of
> __uc_sanitize() is a call to reset the GuC, so yes that would be an
> option. Note that fini is more about cleaning up to unload the driver.
> Whereas the sanitise functions are about resets with the potential to
> restart again.
> 
> John.
> 
> > By loading I mean HuC firmware upload, GuC ADS/log init, GuC firmware upload, CT init, HuC authentication by GuC.
> > 
> > I'm asking because I need to perform severe testing on the target for safety purposes without GPU cold reset.
> > What should be done in order to "unload" the GuC? Is it __uc_sanitize() and __uc_fini()?
> > 
> > Maksym
> > 
> > On Thursday, 22 February 2024 at 20:31, Harrison, John C john.c.harrison@xxxxxxxxx wrote:
> > 
> > > Hello,
> > > 
> > > That worked better. The complaint is that the engine mapping table is invalid. See the i915 code in guc_mapping_table_init() in gt/uc/intel_guc_ads.c for an example of how to initialise the table.
> > > 
> > > John.
> > > 
> > > -----Original Message-----
> > > From: maksym@xxxxxxxxxxx maksym@xxxxxxxxxxx
> > > 
> > > Sent: Wednesday, February 21, 2024 07:15
> > > To: Harrison, John C john.c.harrison@xxxxxxxxx
> > > 
> > > Cc: maksym@xxxxxxxxxxx; Wajdeczko, Michal Michal.Wajdeczko@xxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > > 
> > > Subject: Re: GuC issue
> > > 
> > > Ah, I dumped them with Windows new line characters.
> > > 
> > > Here is a new log binary dump.
> > > 
> > > I moved to the newest TGL GuC firmware from linux-firmware repo.
> > > 
> > > On Wednesday, 21 February 2024 at 12:16 AM, John Harrison john.c.harrison@xxxxxxxxx wrote:
> > > 
> > > > Hello,
> > > > 
> > > > Something is very corrupted with that GuC log. The log consists of a
> > > > header page and then a stream of log entry structures. The structure
> > > > is supposed to be 20 bytes long and starts with a four byte time
> > > > stamp. But I am seeing what is conceivably a 32bit timestamp appearing
> > > > at 21 byte increments through the log. Even more curiously, the time
> > > > stamp seems to have an 0x0D, 0x0A after it. Are you doing any printf
> > > > type operation in order to write the log out from memory to disk?
> > > > 
> > > > INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID means that the GuC did not
> > > > like the initialisation data passed in. Most likely, something in the
> > > > ADS structure is not valid. If you try with the latest GuC version,
> > > > that might give you more information as to what is incorrect. More
> > > > status codes have been added since 70.1.1.
> > > > 
> > > > John.
> > > > 
> > > > On 2/20/2024 05:03, maksym@xxxxxxxxxxx wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > Please see GuC log attached to this email.
> > > > > 
> > > > > Log size is "PAGE_SIZE + Debug Log (64KB) + Crash Log (8KB) + Capture Log (1MB)"
> > > > > 
> > > > > Can anybody from Intel decode this log buffer? Thanks.
> > > > > 
> > > > > What am I doing wrong?
> > > > > 
> > > > > Maksym
> > > > > 
> > > > > On Monday, 19 February 2024 at 09:44, maksym@xxxxxxxxxxx maksym@xxxxxxxxxxx wrote:
> > > > > 
> > > > > > Hi,
> > > > > > 
> > > > > > I fixed one issue in my driver: the log address was set incorrectly.
> > > > > > 
> > > > > > Now, after the GuC upload, GUC_STATUS has changed.
> > > > > > The load status (intel_guc_load_status) is now INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID = 0x71.
> > > > > > 
> > > > > > What does it mean?
> > > > > > Could you please help me with this?
> > > > > > 
> > > > > > Thanks,
> > > > > > Maksym
> > > > > > 
> > > > > > On Friday, 9 February 2024 at 08:42, natur.produkt@xxxxx natur.produkt@xxxxx wrote:
> > > > > > 
> > > > > > > Hello,
> > > > > > > 
> > > > > > > Please see my comments below.
> > > > > > > 
> > > > > > > On Friday, 9 February 2024 at 2:45 AM, John Harrison john.c.harrison@xxxxxxxxx wrote:
> > > > > > > 
> > > > > > > > Hello,
> > > > > > > > 
> > > > > > > > What platform is this on? And which GuC firmware version are you using?
> > > > > > > It's TGL. I'm using the tgl_guc_70.1.1.bin firmware blob.
> > > > > > > > One thing you may need to do is force maximum GT frequency
> > > > > > > > during GuC load. That is something the i915 driver does. If
> > > > > > > > the system decides the GPU is idle and drops the frequency to
> > > > > > > > minimum then it can take multiple seconds for the GuC initialisation to complete.
> > > > > > > Thanks for the hint. I'm not doing that at all in my code. How am I supposed to do this? Is there a specific register for that?
> > > > > > > > Did the status change at all during that second of waiting? Or
> > > > > > > > was it still reading LAPIC_DONE?
> > > > > > > It's always LAPIC_DONE.
> > > > > > > > For ADS documentation, I'm afraid that the best we currently
> > > > > > > > have publicly available is the i915 driver code. If you are
> > > > > > > > not intending to use GuC submission then most of the ADS can be ignored.
> > > > > > > OK, that's great. Which parts of the ADS are must-haves then?
> > > > > > > > If you can share the GuC log, that might provide some clues as
> > > > > > > > to what is happening. For just logging the boot process, you
> > > > > > > > shouldn't need to allocate a large log. The default size of
> > > > > > > > i915 for release builds is 64KB. That should be plenty.
> > > > > > > I'll collect the GuC log as soon as possible. Is it something that can be understood without knowledge of GuC internals? Or is it simply hex dumps?
> > > > > > > > John.
> > > > > > > > 
> > > > > > > > On 2/6/2024 23:59, natur.produkt@xxxxx wrote:
> > > > > > > > 
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > I'm currently implementing GuC/HuC firmware support in one Safety Critical OS.
> > > > > > > > > I'm following i915 code and I implemented all paths (I don't want GuC submission or SLPC features). I need GuC to authenticate HuC firmware blob.
> > > > > > > > > 
> > > > > > > > > I mirrored GuC implementation in my code.
> > > > > > > > > 
> > > > > > > > > After GuC DMA transfer succeeds, I'm reading GUC_STATUS register.
> > > > > > > > > HW returns INTEL_BOOTROM_STATUS_JUMP_PASSED as bootrom status and INTEL_GUC_LOAD_STATUS_LAPIC_DONE as GuC load status.
> > > > > > > > > 
> > > > > > > > > Unfortunately, after one second of waiting, the status still had not changed to INTEL_GUC_LOAD_STATUS_READY.
> > > > > > > > > 
> > > > > > > > > What is a potential issue here?
> > > > > > > > > Could you please help me?
> > > > > > > > > 
> > > > > > > > > In addition to this, could you please point out some documentation about GuC's ADS struct?
> > > > > > > > > 
> > > > > > > > > Thanks,
> > > > > > > > > Maksym

Attachment: fullLog_ct_write.bin
Description: application/macbinary

