> -----Original Message-----
> From: Greg KH [mailto:gregkh@xxxxxxxxxxxxxxxxxxx]
> Sent: Monday, September 21, 2015 9:44 AM
> To: KY Srinivasan <kys@xxxxxxxxxxxxx>
> Cc: Olaf Hering <olaf@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx;
> devel@xxxxxxxxxxxxxxxxxxxxxx; apw@xxxxxxxxxxxxx; vkuznets@xxxxxxxxxx;
> jasowang@xxxxxxxxxx
> Subject: Re: [PATCH 2/5] hv: add helpers to handle hv_util device state
>
> On Mon, Sep 21, 2015 at 04:34:56PM +0000, KY Srinivasan wrote:
> >
> > > -----Original Message-----
> > > From: Olaf Hering [mailto:olaf@xxxxxxxxx]
> > > Sent: Monday, September 21, 2015 3:26 AM
> > > To: KY Srinivasan <kys@xxxxxxxxxxxxx>; Greg KH
> > > <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > Cc: linux-kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx;
> > > apw@xxxxxxxxxxxxx; vkuznets@xxxxxxxxxx; jasowang@xxxxxxxxxx
> > > Subject: Re: [PATCH 2/5] hv: add helpers to handle hv_util device state
> > >
> > > On Sun, Sep 20, Greg KH wrote:
> > >
> > > > Just use a lock, that's what it is there for.
> > >
> > > How would that help? It might help because it enforces ordering. But
> > > that requires that all three utils get refactored to deal with the
> > > introduced locking. I will let KY comment on this.
> > >
> > > The issue I see with fcopy is that after or while fcopy_respond_to_host
> > > runs, an interrupt triggers which also calls into
> > > hv_fcopy_onchannelcallback. It was most likely caused by a logic change
> > > in "recent" vmbus updates, because this did not happen before. At least,
> > > the fcopy hang was not seen earlier. Maybe the bug just did not trigger
> > > until now for other reasons...
> >
> > All util channels are bound to CPU 0. Just forcing all activity onto CPU 0
> > may be the simplest solution here. Besides, these are not
> > performance-critical services anyway.
> >
> > The problem you may have run into could be related to the fact that we
> > could potentially run the polling function on a CPU other than CPU 0.
>
> Again, this sounds like a locking issue: you have multiple
> threads/processes accessing the same data. Even if you bind it all to
> one CPU, this shows a real design problem.
>
> Use a lock to fix this properly. That way, when you stop using only one
> CPU, the code will "just work", and if you are really only on one CPU
> today, there will not be any lock contention.
>
> thanks,

Thanks Greg; will do.

Regards,

K. Y

>
> greg k-h
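
For reference, below is a minimal sketch of the locking pattern Greg describes, under the assumption of an hv_util-style transaction state machine. All names here (util_xfer_state, util_state, util_respond_to_host, util_onchannelcallback) are invented for illustration and do not correspond to the actual fcopy/hv_util code; the point is only that the channel callback and the respond path take the same spinlock around their state transitions instead of relying on everything running on CPU 0.

/*
 * Illustrative sketch only -- not the actual hv_fcopy/hv_util code.
 * Guard the shared transaction state with a spinlock rather than
 * depending on CPU-0 affinity to serialize the two paths.
 */
#include <linux/spinlock.h>
#include <linux/types.h>

enum util_xfer_state {
	HVUTIL_IDLE,
	HVUTIL_HOSTMSG_RECEIVED,
	HVUTIL_USERSPACE_ACTIVE,
};

static enum util_xfer_state util_state = HVUTIL_IDLE;
static DEFINE_SPINLOCK(util_state_lock);

/* Called once the user-space daemon has finished with the request. */
static void util_respond_to_host(int error)
{
	unsigned long flags;

	spin_lock_irqsave(&util_state_lock, flags);
	/* Only complete a transaction that is actually outstanding. */
	if (util_state == HVUTIL_USERSPACE_ACTIVE)
		util_state = HVUTIL_IDLE;
	spin_unlock_irqrestore(&util_state_lock, flags);

	/* ... post the reply on the VMBus channel outside the lock ... */
}

/* VMBus channel callback; may race with the respond path above. */
static void util_onchannelcallback(void *context)
{
	unsigned long flags;
	bool accept;

	spin_lock_irqsave(&util_state_lock, flags);
	/* Ignore host messages that arrive while a transaction is in flight. */
	accept = (util_state == HVUTIL_IDLE);
	if (accept)
		util_state = HVUTIL_HOSTMSG_RECEIVED;
	spin_unlock_irqrestore(&util_state_lock, flags);

	if (!accept)
		return;

	/*
	 * ... read the host message, hand it to the daemon, then move to
	 * HVUTIL_USERSPACE_ACTIVE under the same lock ...
	 */
}

As long as the util channels really are serviced only on CPU 0, the spinlock is uncontended and costs next to nothing; if that affinity assumption ever changes, the state transitions stay serialized without further rework, which is exactly the "it will just work" property Greg points out.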