This patch is now pushed, thanks. -- Petri Latvala On Wed, Oct 12, 2016 at 02:32:17PM +0100, Tvrtko Ursulin wrote: > > On 10/10/2016 15:31, Goel, Akash wrote: > > > > > > On 10/10/2016 7:22 PM, Tvrtko Ursulin wrote: > > > > > > On 10/10/2016 11:59, akash.goel@xxxxxxxxx wrote: > > > > From: Akash Goel <akash.goel@xxxxxxxxx> > > > > > > > > This patch provides a test utility which helps capture GuC firmware > > > > logs and > > > > then dump them to file. > > > > The logs are pulled from a debugfs file > > > > '/sys/kernel/debug/dri/guc_log' and > > > > by default stored into a file 'guc_log_dump.dat'. The name, > > > > including the > > > > location, of the output file can be changed through a command line > > > > argument. > > > > > > > > The utility goes into an infinite loop where it waits for the arrival > > > > of new > > > > logs and as soon as new set of logs are produced it captures them in > > > > its local > > > > buffer which is then flushed out to the file on disk. > > > > Any time when logging needs to be ended, User can stop this utility > > > > (CTRL+C). > > > > > > > > Before entering into a loop, it first discards whatever logs are > > > > present in > > > > the debugfs file. > > > > This way User can first launch this utility and then start a > > > > workload/activity > > > > for which GuC firmware logs are to be actually captured and keep > > > > running the > > > > utility for as long as its needed, like once the workload is over this > > > > utility > > > > can be forcefully stopped. > > > > > > > > If the logging wasn't enabled on GuC side by the Driver at boot time, > > > > utility > > > > will first enable the logging and later on when it is stopped (CTRL+C) > > > > it will > > > > also pause the logging on GuC side. > > > > > > > > v2: > > > > - Use combination of alarm system call & SIGALRM signal to run the > > > > utility > > > > for required duration. (Tvrtko) > > > > - Fix inconsistencies, do minor cleanup and refactoring. (Tvrtko) > > > > > > > > v3: > > > > - Fix discrepancy for the output file command line option and > > > > update the > > > > Usage/help string. > > > > > > > > v4: > > > > - Update the exit condition for flusher thread, now will exit > > > > only after > > > > the capture loop is over and not when the flag to stop > > > > logging is set. > > > > This handles a corner case, due to which the dump of last captured > > > > buffer > > > > was getting missed. > > > > - Add a newline character at the end of assert messages. > > > > - Avoid the assert for the case, which occurs very rarely, when there > > > > are no > > > > bytes read from the relay file. > > > > > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> > > > > Signed-off-by: Akash Goel <akash.goel@xxxxxxxxx> > > > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> (v3) > > > > --- > > > > tools/Makefile.sources | 1 + > > > > tools/intel_guc_logger.c | 438 > > > > +++++++++++++++++++++++++++++++++++++++++++++++ > > > > 2 files changed, 439 insertions(+) > > > > create mode 100644 tools/intel_guc_logger.c > > > > > > > > diff --git a/tools/Makefile.sources b/tools/Makefile.sources > > > > index 2bb6c8e..be58871 100644 > > > > --- a/tools/Makefile.sources > > > > +++ b/tools/Makefile.sources > > > > @@ -19,6 +19,7 @@ tools_prog_lists = \ > > > > intel_gpu_time \ > > > > intel_gpu_top \ > > > > intel_gtt \ > > > > + intel_guc_logger \ > > > > intel_infoframes \ > > > > intel_l3_parity \ > > > > intel_lid \ > > > > diff --git a/tools/intel_guc_logger.c b/tools/intel_guc_logger.c > > > > new file mode 100644 > > > > index 0000000..159a54e > > > > --- /dev/null > > > > +++ b/tools/intel_guc_logger.c > > > > @@ -0,0 +1,438 @@ > > > > + > > > > +#define _GNU_SOURCE /* For using O_DIRECT */ > > > > +#include <inttypes.h> > > > > +#include <stdio.h> > > > > +#include <fcntl.h> > > > > +#include <sched.h> > > > > +#include <sys/ioctl.h> > > > > +#include <string.h> > > > > +#include <errno.h> > > > > +#include <signal.h> > > > > +#include <stdlib.h> > > > > +#include <unistd.h> > > > > +#include <sys/poll.h> > > > > +#include <sys/wait.h> > > > > +#include <sys/types.h> > > > > +#include <sys/syscall.h> > > > > +#include <sys/mman.h> > > > > +#include <assert.h> > > > > +#include <pthread.h> > > > > + > > > > +#include "igt.h" > > > > + > > > > +#define MB(x) ((uint64_t)(x) * 1024 * 1024) > > > > +#ifndef PAGE_SIZE > > > > + #define PAGE_SIZE 4096 > > > > +#endif > > > > +/* Currently the size of GuC log buffer is 19 pages & so is the size > > > > of relay > > > > + * subbuffer. If the size changes in future, then this define also > > > > needs to be > > > > + * updated accordingly. > > > > + */ > > > > +#define SUBBUF_SIZE (19*PAGE_SIZE) > > > > +/* Need large buffering from logger side to hide the DISK IO latency, > > > > Driver > > > > + * can only store 8 snapshots of GuC log buffer in relay. > > > > + */ > > > > +#define NUM_SUBBUFS 100 > > > > + > > > > +#define RELAY_FILE_NAME "guc_log" > > > > +#define DEFAULT_OUTPUT_FILE_NAME "guc_log_dump.dat" > > > > +#define CONTROL_FILE_NAME "i915_guc_log_control" > > > > + > > > > +char *read_buffer; > > > > +char *out_filename; > > > > +int poll_timeout = 2; /* by default 2ms timeout */ > > > > +pthread_mutex_t mutex; > > > > +pthread_t flush_thread; > > > > +int verbosity_level = 3; /* by default capture logs at max > > > > verbosity */ > > > > +uint32_t produced, consumed; > > > > +uint64_t total_bytes_written; > > > > +int num_buffers = NUM_SUBBUFS; > > > > +int relay_fd, outfile_fd = -1; > > > > +uint32_t test_duration, max_filesize; > > > > +pthread_cond_t underflow_cond, overflow_cond; > > > > +bool stop_logging, discard_oldlogs, capturing_stopped; > > > > + > > > > +static void guc_log_control(bool enable_logging) > > > > +{ > > > > + int control_fd; > > > > + char data[19]; > > > > + uint64_t val; > > > > + int ret; > > > > + > > > > + control_fd = igt_debugfs_open(CONTROL_FILE_NAME, O_WRONLY); > > > > + igt_assert_f(control_fd >= 0, "couldn't open the guc log control > > > > file\n"); > > > > + > > > > + val = enable_logging ? ((verbosity_level << 4) | 0x1) : 0; > > > > + > > > > + ret = snprintf(data, sizeof(data), "0x%" PRIx64, val); > > > > + igt_assert(ret > 2 && ret < sizeof(data)); > > > > + > > > > + ret = write(control_fd, data, ret); > > > > + igt_assert_f(ret > 0, "couldn't write to the log control file\n"); > > > > + > > > > + close(control_fd); > > > > +} > > > > + > > > > +static void int_sig_handler(int sig) > > > > +{ > > > > + igt_info("received signal %d\n", sig); > > > > + > > > > + stop_logging = true; > > > > +} > > > > + > > > > +static void pull_leftover_data(void) > > > > +{ > > > > + unsigned int bytes_read = 0; > > > > + int ret; > > > > + > > > > + do { > > > > + /* Read the logs from relay buffer */ > > > > + ret = read(relay_fd, read_buffer, SUBBUF_SIZE); > > > > + if (!ret) > > > > + break; > > > > + > > > > + igt_assert_f(ret > 0, "failed to read from the guc log > > > > file\n"); > > > > + igt_assert_f(ret == SUBBUF_SIZE, "invalid read from relay > > > > file\n"); > > > > + > > > > + bytes_read += ret; > > > > + > > > > + if (outfile_fd >= 0) { > > > > + ret = write(outfile_fd, read_buffer, SUBBUF_SIZE); > > > > + igt_assert_f(ret == SUBBUF_SIZE, "couldn't dump the logs > > > > in a file\n"); > > > > + total_bytes_written += ret; > > > > + } > > > > + } while(1); > > > > + > > > > + igt_debug("%u bytes flushed\n", bytes_read); > > > > +} > > > > + > > > > +static int num_filled_bufs(void) > > > > +{ > > > > + return (produced - consumed); > > > > +} > > > > + > > > > +static void pull_data(void) > > > > +{ > > > > + char *ptr; > > > > + int ret; > > > > + > > > > + pthread_mutex_lock(&mutex); > > > > + while (num_filled_bufs() >= num_buffers) { > > > > + igt_debug("overflow, will wait, produced %u, consumed %u\n", > > > > produced, consumed); > > > > + /* Stall the main thread in case of overflow, as there are no > > > > + * buffers available to store the new logs, otherwise there > > > > + * could be corruption if both threads work on the same > > > > buffer. > > > > + */ > > > > + pthread_cond_wait(&overflow_cond, &mutex); > > > > + }; > > > > + pthread_mutex_unlock(&mutex); > > > > + > > > > + ptr = read_buffer + (produced % num_buffers) * SUBBUF_SIZE; > > > > + > > > > + /* Read the logs from relay buffer */ > > > > + ret = read(relay_fd, ptr, SUBBUF_SIZE); > > > > + igt_assert_f(ret >= 0, "failed to read from the guc log file\n"); > > > > + igt_assert_f(!ret || ret == SUBBUF_SIZE, "invalid read from relay > > > > file\n"); > > > > + > > > > + if (ret) { > > > > + pthread_mutex_lock(&mutex); > > > > + produced++; > > > > + pthread_cond_signal(&underflow_cond); > > > > + pthread_mutex_unlock(&mutex); > > > > + } else { > > > > + /* Occasionally (very rare) read from the relay file > > > > returns no > > > > + * data, albeit the polling done prior to read call indicated > > > > + * availability of data. > > > > + */ > > > > + igt_debug("no data read from the relay file\n"); > > > > > > This worries me. Relayfs kernel side is quite simple which would suggest > > > there is maybe a bug in the logger? And it is very important that we > > > trust logger correctly capturing data. Because people will spend a lot > > > of time debugging issues from the logs and it would be bad to have to > > > chase irregularities caused by a problem in the logger. > > > > > Actually I see this issue rarely and only after logging some GBs of > > data. And most likely it won't have any bearing on the correctness of > > captured data, the missed snapshot shall be pulled in the next iteration > > of capture loop. Also no-overwrite mode of relay is being used. > > > > As per my hunch, it can't be ruled out that this is not a relayfs issue. > > > > > Could you please try and identify why exactly is this happening? > > Will try to investigate both from relayfs side as well as logger side, > > but it would take some time. > > This may not be considered a blocker. > > > > As you reported off list, after some deep investigation there seems to be no > potential for lost packets. "Empty wakeup" is just a consequence of > unsynchronized nature of relayfs. > > In the light of that I am happy with the logger, so: > > Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx> > > Regards, > > Tvrtko > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@xxxxxxxxxxxxxxxxxxxxx > https://lists.freedesktop.org/mailman/listinfo/intel-gfx _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx