Thanks Umesh for reviewing the patch. Am fixing all the rest but a couple of comments. Responses to the latter and other questions below: ...alan > > +enum intel_guc_state_capture_event_status { > > + INTEL_GUC_STATE_CAPTURE_EVENT_STATUS_SUCCESS = 0x0, > > + INTEL_GUC_STATE_CAPTURE_EVENT_STATUS_NOSPACE = 0x1, > > +}; > > + > > +#define INTEL_GUC_STATE_CAPTURE_EVENT_STATUS_MASK 0x1 > > MASK is not needed. See below Alan: Oh wait, actually the mask for the capture status is 0x000000FF (above is a typo). I'll fix above mask and shall not change the code below because the upper 24 bits of the first dword of this msg is not defined. ... > > +static int guc_capture_buf_cnt(struct __guc_capture_bufstate *buf) > > +{ > > + if (buf->rd == buf->wr) > > + return 0; > > + if (buf->wr > buf->rd) > > + return (buf->wr - buf->rd); > > + return (buf->size - buf->rd) + buf->wr; > > +} > > Is this a circular buffer shared between GuC and kmd? Since the size is > a power of 2, the above function is simply: > Alan: not this is not a circular buffer, so I'll keep the above version. > static u32 guc_capture_buf_count(struct __guc_capture_bufstate *buf) > { > return (buf->wr - buf->rd) & (buf->size - 1); > } > ... > > +static int > > +guc_capture_log_remove_dw(struct intel_guc *guc, struct __guc_capture_bufstate *buf, > > + u32 *dw) > > +{ > > + struct drm_i915_private *i915 = guc_to_gt(guc)->i915; > > + int tries = 2; > > + int avail = 0; > > + u32 *src_data; > > + > > + if (!guc_capture_buf_cnt(buf)) > > + return 0; > > + > > + while (tries--) { > > + avail = guc_capture_buf_cnt_to_end(buf); > > Shouldn't this be avail = guc_capture_buf_cnt(buf)? > Alan : The "guc_capture_log_get_[foo]" functions only call above guc_capture_log_remove_dw when there isnt sufficient space to copy out an entire structure from the space between the read pointer and the end of the subregion (before the wrap-around). Those function would populate the structure dword by dword by calling above func. (NOTE the buffer and all error capture output structs are dword aligned). Thats why above function tries twice and resets buf->rd = 0 if we find no space left at the end of the subregion (i.e. need to wrap around) - which can only be done by calling "guc_capture_buf_cnt_to_end". ... > > + > > + /* Bookkeeping stuff */ > > + guc->log_state[GUC_CAPTURE_LOG_BUFFER].flush += log_buf_state_local.flush_to_file; > > + new_overflow = intel_guc_check_log_buf_overflow(guc, > > + &guc->log_state[GUC_CAPTURE_LOG_BUFFER], > > + full_count); > > I am not sure how the overflow logic works here and whether it is > applicable to the error capture buffer. Is the guc log buffer one big > buffer where the error capture is just a portion of that buffer? If so, > is the wrap around applicable to just the errorcapture buffer or to the > whole buffer? > Alan: Yes, the guc log buffer is one big log buffer but there are 3 independent subregions within that are populated with different content and are used in different ways and timings. Each guc-log subregion (general-logs, crash-dump and error-capture) has it's own read and write pointers. > Also what is the wrap_offset field in struct guc_log_buffer_state? Alan: This is the byte offset of a location in the subregion that is the 1st byte after the last valid guc entry written by Guc firmware before a wraparound was done. This would generate a tiny hole at the end of the subregion for better cacheline alignment when flushing entries into the subregion. However, the error-capture subregion is dword aligned and all of the output structures used for error-capture are also dword aligned so this can never happen for the error-capture subregion. >