On 04/06/2017 05:56 AM, Michael S. Tsirkin wrote: > On Wed, Apr 05, 2017 at 04:54:33PM +0800, Cao jin wrote: >> Apparently, I don't have experience to induce non-fatal error, device >> error is more of a chance related with the environment(temperature, >> humidity, etc) as I understand. > > I'm not sure how to interpret this statement. I think what Alex is > saying is simply that patches should include some justification. They > make changes but what are they improving? > For example: > > I tested device ABC in conditions DEF. Without a patch VM > stops. With the patches applied VM recovers and proceeds to > use the device normally. > > is one reasonable justification imho. > Got it. But unfortunately, until now, I haven't seen a VM stop caused by a real device non-fatal error during device assignment(Only saw real fatal errors after start VM). On one side, AER error could occur theoretically; on the other side, seldom people have seen a VM stop caused by AER. Now I am asked that do I have a real evidence or scenario to prove that this patchset is really useful? I don't, and we all know it is hard to trigger a real hardware error, so, seems I am pushed into the corner. I guess these questions also apply for AER driver's author, if the scenario is easy to reproduce, there is no need to write aer_inject to fake errors. -- Sincerely, Cao jin