On 02/25/2019 04:11 AM, Cornelia Huck wrote:
On Fri, 22 Feb 2019 19:39:39 +0100
Eric Farman <farman@xxxxxxxxxxxxx> wrote:
Per the discussion [1] about a problem with how vfio-ccw calculates
the length of a channel program (specifically when using the
forthcoming QEMU BIOS code for DASD IPL), I present this fix.
Patch 1 fixes the problem, and is over-engineered
for readability's sake.
:)
Patch 2 takes the functions from Patch 1, and refactors the
existing code to make other areas a little easier to understand.
(I hope.)
I've been running fio for over 24 hours now, and have seen
zero errors. Previously, I would have probably seen "a few"
errors by now, where prior to the original fix I would've seen
"many" errors. Further tests are still ongoing.
Awesome, thanks!
I left fio running over the weekend, with newly-randomized parameters
every hour or two... Had one error yesterday morning, in the
NOP+TIC-to-redrive-I/O case. I didn't leave any tracing on because I
didn't expect I'd be able to capture anything before the traces wrapped, and
didn't have time to figure out a way to cleanly filter errors.
Though I did leave a counter in place for the number of times we
processed a TIC that goes back into the current chain, and it hit about
1900 times since Friday. More than three quarters of them occurred
during the error yesterday morning, so something was being dramatic at
the time. I guess there's one obscure corner to track down, but it
otherwise seems to run quite a bit better than before.
- Eric
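For readers following along, the check being counted above (a TIC whose
target lands back inside the chain currently being processed) can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the patches: struct ccw, cpa_within_range(),
tic_within_range() and tic_into_chain_count are hypothetical names, the CCW
layout is a simplified 8-byte format-1 stand-in, and alignment checks are
left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an 8-byte format-1 CCW. */
struct ccw {
	uint8_t  cmd_code;
	uint8_t  flags;
	uint16_t count;
	uint32_t cda;	/* data address, or the branch target for a TIC */
};

#define CCW_CMD_TIC 0x08	/* Transfer In Channel command code */

/* Illustrative debug counter: TICs that branch back into the chain. */
static unsigned long tic_into_chain_count;

/*
 * Return true if cpa addresses one of the len CCWs that start at head.
 * The range is inclusive: [head, head + (len - 1) * sizeof(struct ccw)].
 */
static bool cpa_within_range(uint32_t cpa, uint32_t head, size_t len)
{
	uint64_t tail;

	if (len == 0)
		return false;

	/* Widen before adding so a chain near 4 GiB cannot wrap around. */
	tail = (uint64_t)head + (len - 1) * sizeof(struct ccw);
	return head <= cpa && cpa <= tail;
}

/*
 * Return true if ccw is a TIC whose target lies within the current chain,
 * bumping the counter each time that happens.
 */
static bool tic_within_range(const struct ccw *ccw, uint32_t head, size_t len)
{
	if (ccw->cmd_code != CCW_CMD_TIC)
		return false;
	if (!cpa_within_range(ccw->cda, head, len))
		return false;

	tic_into_chain_count++;
	return true;
}
```

With a four-CCW chain at 0x1000, a TIC to 0x1010 is counted as branching
back into the chain, while a TIC to 0x2000 is not.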
[1] https://marc.info/?l=linux-s390&m=155063096321940&w=2
Eric Farman (2):
  s390/cio: Fix vfio-ccw handling of recursive TICs
  s390/cio: Use cpa range elsewhere within vfio-ccw
drivers/s390/cio/vfio_ccw_cp.c | 55 ++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 13 deletions(-)
I hope we can queue the patches soon, reviewing now.