Hello,

Thanks again for your reply.

I have implemented the __uc_sanitize() function in my code to reset the GuC just before uploading the HuC firmware blob.

My current setup is as follows: I have TGL and a Safety Critical OS. I load GuC and authenticate HuC successfully. Then I kill my Safety Critical KMD driver and load it again, which triggers GuC reset, HuC loading, GuC loading, enabling communication, etc.

Unfortunately, when I try to enable communication for the second time (without a cold reset), guc_action_control_ctb(), specifically intel_guc_send_mmio(), returns GUC_HXG_TYPE_RESPONSE_FAILURE with error 774 (0x306). I'm not sure what it means.

The log buffer captured just after guc_action_control_ctb() is attached to this email. Could anybody look into it?

I really appreciate your help,
Maksym

Tuesday, 27 February 2024 9:03 PM, John Harrison <john.c.harrison@xxxxxxxxx> wrote:

> On 2/26/2024 08:30, maksym@xxxxxxxxxxx wrote:
>
> > Hello,
> >
> > Thank you for your help.
> >
> > Is there a possibility to load GuC, then "unload" it and load it again without cold reset?
>
> You need to reset the GuC at least - bit 3 of GDRST. The GuC cannot be
> reloaded 'live'. It must be put into reset first. The last line of
> __uc_sanitize() is a call to reset the GuC, so yes that would be an
> option. Note that fini is more about cleaning up to unload the driver.
> Whereas the sanitise functions are about resets with the potential to
> restart again.
>
> John.
>
> > By loading I mean HuC firmware upload, GuC ADS/log init, GuC firmware upload, CT init, HuC authentication by GuC.
> >
> > I'm asking because I need to perform severe testing on the target for safety purposes without GPU cold reset.
> > What should be done in order to "unload" the GuC? Is it __uc_sanitize() and __uc_fini()?
> >
> > Maksym
> >
> > Thursday, 22 February 2024 20:31, Harrison, John C john.c.harrison@xxxxxxxxx wrote:
> >
> > > Hello,
> > >
> > > That worked better.
> > > The complaint is that the engine mapping table is invalid. See the
> > > i915 code in guc_mapping_table_init() in gt/uc/intel_guc_ads.c for
> > > an example of how to initialise the table.
> > >
> > > John.
> > >
> > > -----Original Message-----
> > > From: maksym@xxxxxxxxxxx maksym@xxxxxxxxxxx
> > > Sent: Wednesday, February 21, 2024 07:15
> > > To: Harrison, John C john.c.harrison@xxxxxxxxx
> > > Cc: maksym@xxxxxxxxxxx; Wajdeczko, Michal Michal.Wajdeczko@xxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx
> > > Subject: Re: GuC issue
> > >
> > > Ah, I dumped them with Windows new line characters.
> > >
> > > Here is a new log binary dump.
> > >
> > > I moved to the newest TGL GuC firmware from linux-firmware repo.
> > >
> > > Wednesday, 21 February 2024 12:16 AM, John Harrison john.c.harrison@xxxxxxxxx wrote:
> > >
> > > > Hello,
> > > >
> > > > Something is very corrupted with that GuC log. The log consists of a
> > > > header page and then a stream of log entry structures. The structure
> > > > is supposed to be 20 bytes long and starts with a four byte time
> > > > stamp. But I am seeing what is conceivably a 32bit timestamp appearing
> > > > at 21 byte increments through the log. Even more curiously, the time
> > > > stamp seems to have an 0x0D, 0x0A after it. Are you doing any printf
> > > > type operation in order to write the log out from memory to disk?
> > > >
> > > > INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID means that the GuC did not
> > > > like the initialisation data passed in. Most likely, something in the
> > > > ADS structure is not valid. If you try with the latest GuC version,
> > > > that might give you more information as to what is incorrect. More
> > > > status codes have been added since 70.1.1.
> > > >
> > > > John.
> > > >
> > > > On 2/20/2024 05:03, maksym@xxxxxxxxxxx wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Please see GuC log attached to this email.
> > > > >
> > > > > Log size is "PAGE_SIZE + Debug Log (64KB) + Crash Log (8KB) + Capture Log (1M)"
> > > > >
> > > > > Can anybody from Intel decode this log buffer? Thanks.
> > > > >
> > > > > What am I doing wrong?
> > > > >
> > > > > Maksym
> > > > >
> > > > > Monday, 19 February 2024 09:44, maksym@xxxxxxxxxxx maksym@xxxxxxxxxxx wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I fixed one issue in my driver. Log address was set incorrectly.
> > > > > >
> > > > > > Now, after the GuC upload, GUC_STATUS has changed.
> > > > > > intel_guc_load_status is now INTEL_GUC_LOAD_STATUS_INIT_DATA_INVALID = 0x71.
> > > > > >
> > > > > > What does it mean?
> > > > > > Could you please help me with this?
> > > > > >
> > > > > > Thanks,
> > > > > > Maksym
> > > > > >
> > > > > > Friday, 9 February 2024 08:42, natur.produkt@xxxxx natur.produkt@xxxxx wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > Please see my comments below.
> > > > > > >
> > > > > > > Friday, 9 February 2024 2:45 AM, John Harrison john.c.harrison@xxxxxxxxx wrote:
> > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > What platform is this on? And which GuC firmware version are you using?
> > > > > > >
> > > > > > > It's TGL. I'm using tgl_guc_70.1.1.bin firmware blob.
> > > > > > >
> > > > > > > > One thing you may need to do is force maximum GT frequency
> > > > > > > > during GuC load. That is something the i915 driver does. If
> > > > > > > > the system decides the GPU is idle and drops the frequency to
> > > > > > > > minimum then it can take multiple seconds for the GuC initialisation to complete.
> > > > > > >
> > > > > > > Thanks for the hint. I'm not doing that at all in my code. How am I supposed to do this? Is there a specific register for that?
> > > > > > >
> > > > > > > > Did the status change at all during that second of waiting? Or
> > > > > > > > was it still reading LAPIC_DONE?
> > > > > > >
> > > > > > > It's always LAPIC_DONE.
> > > > > > >
> > > > > > > > For ADS documentation, I'm afraid that the best we currently
> > > > > > > > have publicly available is the i915 driver code. If you are
> > > > > > > > not intending to use GuC submission then most of the ADS can be ignored.
> > > > > > >
> > > > > > > OK, that's great. Which parts of the ADS are must-have then?
> > > > > > >
> > > > > > > > If you can share the GuC log, that might provide some clues as
> > > > > > > > to what is happening. For just logging the boot process, you
> > > > > > > > shouldn't need to allocate a large log. The default size of
> > > > > > > > i915 for release builds is 64KB. That should be plenty.
> > > > > > >
> > > > > > > I'll collect the GuC log as soon as possible. Is it something that can be understood without knowledge of GuC internals? Or is it simply hex dumps?
> > > > > > >
> > > > > > > > John.
> > > > > > > >
> > > > > > > > On 2/6/2024 23:59, natur.produkt@xxxxx wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I'm currently implementing GuC/HuC firmware support in one Safety Critical OS.
> > > > > > > > > I'm following i915 code and I implemented all paths (I don't want GuC submission or SLPC features). I need GuC to authenticate HuC firmware blob.
> > > > > > > > >
> > > > > > > > > I mirrored GuC implementation in my code.
> > > > > > > > >
> > > > > > > > > After GuC DMA transfer succeeds, I'm reading GUC_STATUS register.
> > > > > > > > > HW returns INTEL_BOOTROM_STATUS_JUMP_PASSED as bootrom status and INTEL_GUC_LOAD_STATUS_LAPIC_DONE as GuC load status.
> > > > > > > > >
> > > > > > > > > Unfortunately, after one second of waiting, the status had still not changed to INTEL_GUC_LOAD_STATUS_READY.
> > > > > > > > >
> > > > > > > > > What is a potential issue here?
> > > > > > > > > Could you please help me?
> > > > > > > > >
> > > > > > > > > In addition to this, could you please point out some documentation about GuC's ADS struct?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Maksym
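
On the log corruption discussed in the quoted thread (20-byte entries showing up at a 21-byte stride, with 0x0D 0x0A after the timestamp): that pattern is what a text-mode write produces on C runtimes that translate newlines, because every 0x0A byte in the buffer gets a 0x0D put in front of it. Below is a minimal sketch of a binary-safe dump, assuming a hosted C library with stdio; the function names dump_log_binary() and file_size() are hypothetical, not taken from i915 or from the driver discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write `len` raw bytes to `path` with no newline translation.
 * Returns 0 on success, -1 on any I/O error. */
static int dump_log_binary(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    /* "b" requests binary mode: 0x0A is written as 0x0A, never 0x0D 0x0A. */
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    size_t written = fwrite(buf, 1, len, f);
    int close_err = fclose(f);

    return (written == len && close_err == 0) ? 0 : -1;
}

/* Size of a file in bytes, or -1 on error. Useful as a sanity check:
 * the dump on disk should be exactly as large as the buffer in memory. */
static long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0)
        size = ftell(f);
    fclose(f);
    return size;
}
```

Comparing file_size() of the dump against the expected buffer size is a quick way to notice that extra bytes crept in before sending the file to anyone.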
Attachment:
fullLog_ct_write.bin
Description: application/macbinary