Quoting Lis, Tomasz (2018-03-20 19:23:03) > > > On 2018-03-19 15:26, Chris Wilson wrote: > > Quoting Lis, Tomasz (2018-03-19 14:14:19) > > > On 2018-03-19 13:43, Chris Wilson wrote: > > Quoting Tomasz Lis (2018-03-19 12:37:35) > > The patch adds a parameter to control the data port coherency functionality > on a per-exec call basis. When data port coherency flag value is different > than what it was in previous call for the context, a command to switch data > port coherency state is added before the buffer to be executed. > > So this is part of the context? Why do it at exec level? > > It is part of the context, stored within HDC chicken bit register. > The exec level was requested by the OCL team, due to concerns about > performance cost of context setparam calls. > > What? Oh dear, oh dear, thrice oh dear. The context setparam would look > like: > > if (arg != context->value) { > rq = request_alloc(context, RCS); > cs = ring_begin(rq, 4); > cs++ = MI_LRI; > cs++ = reg; > cs++ = magic; > cs++ = MI_NOOP; > request_add(rq); > context->value = arg > } > > The argument is whether stuffing it into a crowded, v.frequently > executed execbuf is better than an irregular setparam. If they want to > flip it on every batch, use execbuf. If it's going to be very > infrequent, setparam. > > Implementing the data port coherency switch as context setparam would not be a > problem, I agree. > But this is not a solution OCL is willing to accept. Any additional IOCTL call > is a concern for the OCL developers. Being part of hardware context is a good indication that GEM context is the right place for the bit. Stuffing more into execbuf for one-usecase-one-platform scenario doesn't sound very future looking in terms of overall driver development. I would truly imagine that the IOCTL execution time should not be meaningful compared to compute kernel execution times. If they are already having a large amount of IOCTL calls between each patch, I guess that is something to be discussed separately. Regards, Joonas > > For more explanation on switch frequency - please look at the cover letter I > provided; here's the related part of it: > (note: the data port coherency is called fine grain coherency within UMD) > > 3. Will coherency switch be used frequently? > > There are scenarios that will require frequent toggling of the coherency > switch. > E.g. an application has two OCL compute kernels: kern_master and kern_worker. > kern_master uses, concurrently with CPU, some fine grain SVM resources > (CL_MEM_SVM_FINE_GRAIN_BUFFER). These resources contain descriptors of > computational work that needs to be executed. kern_master analyzes incoming > work descriptors and populates a plain OCL buffer (non-fine-grain) with payload > for kern_worker. Once kern_master is done, kern_worker kicks-in and processes > the payload that kern_master produced. These two kernels work in a loop, one > after another. Since only kern_master requires coherency, kern_worker should > not be forced to pay for it. This means that we need to have the ability to > toggle coherency switch on or off per each GPU submission: > (ENABLE COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> (ENABLE > COHERENCY) kern_master -> (DISABLE COHERENCY)kern_worker -> ... > > That discussion must be part of the rationale in the commitlog. > > Will add. > Should I place the whole text from cover letter within the commit comment? > > Otoh, execbuf3 would accept it as a command packet. Hmm. > > I know we have execbuf2, but execbuf3? Are you proposing to add something like > that? > > If exec level > is desired, why not whitelist it? > > If we have no issue in whitelisting the register, I'm sure OCL will > agree to that. > I assumed the whitelisting will be unacceptable because of security > concerns with some options. > The register also changes its position and content between gens, which > makes whitelisting hard to manage. > > Main purpose of chicken bit registers, in general, is to allow work > around for hardware features which could be buggy or could have > unintended influence on the platform. > The data port coherency functionality landed there for the same reasons; > then it twisted itself in a way that we now need user space to switch it. > Is it really ok to whitelist chicken bit registers? > > It all depends on whether it breaks segregation. If the only users > affected are themselves, fine. Otherwise, no. > -Chris > > Chicken Bit registers are definitely not planned as safe for use. While meaning > of bits within HDC_CHICKEN0 change between gens, I doubt any of the registers > *can't* be used to cause GPU hung. > -Tomasz > _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx