On 02/21/2019 10:38 AM, Eric Farman wrote:
On 02/21/2019 07:55 AM, Halil Pasic wrote:
On Thu, 21 Feb 2019 10:30:01 +0100
Cornelia Huck <cohuck@xxxxxxxxxx> wrote:
On Wed, 20 Feb 2019 22:02:10 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:
On 02/20/2019 05:35 PM, Eric Farman wrote:
On 02/20/2019 11:44 AM, Cornelia Huck wrote:
On Wed, 20 Feb 2019 11:25:46 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:
On 02/20/2019 07:44 AM, Cornelia Huck wrote:
On Wed, 20 Feb 2019 06:29:38 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:
On 02/20/2019 04:48 AM, Cornelia Huck wrote:
On Tue, 19 Feb 2019 21:49:07 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:
Hi Connie, Farhan,
On 02/04/2019 12:06 PM, Cornelia Huck wrote:
From: Farhan Ali <alifm@xxxxxxxxxxxxx>
When trying to calculate the length of a ccw chain, we assume
there are ccws after a TIC. This can lead to overcounting and
copying garbage data from guest memory.
Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
Message-Id:
<d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@xxxxxxxxxxxxx>
Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx>
---
drivers/s390/cio/vfio_ccw_cp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/s390/cio/vfio_ccw_cp.c
b/drivers/s390/cio/vfio_ccw_cp.c
index 70a006ba4d05..ba08fe137c2e 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -392,7 +392,7 @@ static int ccwchain_calc_length(u64 iova,
struct channel_program *cp)
return -EOPNOTSUPP;
}
- if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw)))
+ if (!ccw_is_chain(ccw))
OK, this function now returns the length of the chain excluding the
last tic.
Hm, not so sure about that anymore. I'm a bit tired, please apply
salt
where needed to what I'm seeing.
I didn't get as far as I had hoped today, so I'm going to need to
reset
to tomorrow, have coffee, and try again. But it does seem that with
this patch, we calculate the length of the chain up to and
including the
TIC, and nothing beyond it.
That is, during the boot process we calculate a chain length of "3"
for
a SEEK + SIDE + TIC that QEMU built.
Hm... so it looks like that code does what it says on the tin. But why
are we missing the second round, that should give us a chain with the
forth ccw? Are we missing a loop somewhere?
I think this is close... More below.
Have you seen this email of mine:
https://www.spinics.net/lists/kvm/msg182535.html
(MID: Message-Id: <20190220143729.6dda5665@oc2783563651>)?
Eric told me the TIC points to the SIDE.
I thought I put that in one of my responses, but regardless it's what
the bios code builds so it's not new news.
I.e the channel program says
loop until SIDE presents a status modifier the jump over the TIC and
continue with the READ.
I needed to get past the incorrect length error before I could give
Halil's email any consideration. Now that that is cleared up, we can
look into his idea.
The point is the fourth ccw is only reachable if we jump over the TIC
because of status modifier (from device). Currently we do not seem to
care about this 'jump over a ccw' possibility. IMHO that is where this
problem comes from.
Well, sort of... We probably care about the "jump over a CCW"
possibility if the TIC introduces any recursion. If it goes "somewhere
new", then the likelihood of the Status Modifier coming into play is low.
And besides, in this scenario we don't know about a potential Status
Modifier until the SIDE runs, which means while we're building the
channel program we don't know whether the memory after the TIC is
interesting or not. Reverting this patch makes the "recursion" scenario
(which the bios introduces, and thus brings about the importance of the
Status Modifier) work again, but then we have difficulties when the TIC
branches somewhere else. (See the use of NOP+TIC chains in error
recovery.)
If you're suggesting adding code to handle a Status Modifier, I would
ask why the interrupt handler should redrive things after the TIC, when
it would be simpler to detect the TIC recursion and planning for the
possibility there. That would seem simpler than leaving a window open
between (for example) orientation and data transfer.
The difficulty, as Connie pointed out, is that we're still building the
current chain here, which implies we can't rely on
tic_target_chain_exists() to determine whether we are dealing with a TIC
loop or not. I had some ideas this morning during the commute that
might apply here, and warrant some experimentation. Let me cobble this
together and see what comes from it.
Well, taking the TIC-boundary logic from tic_target_chain_exists() and
stuffing it into ccwchain_calc_length() seems to do the trick for seeing
if we're sending a TIC back into our current chain, and makes the dasd
boot work again (the following is built on top of this patch, and is
surely whitespace damaged):
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index ba08fe137c2e..a4abe7519b7e 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -362,6 +362,7 @@ static int ccwchain_calc_length(u64 iova, struct
channel_program *cp)
{
struct ccw1 *ccw, *p;
int cnt;
+ u64 tail;
/*
* Copy current chain from guest to host kernel.
@@ -392,7 +393,9 @@ static int ccwchain_calc_length(u64 iova, struct
channel_program *cp)
return -EOPNOTSUPP;
}
- if (!ccw_is_chain(ccw))
+ tail = iova + (cnt - 1) * sizeof(struct ccw1);
+
+ if (!(ccw_is_chain(ccw) || (ccw_is_tic(ccw) && iova <=
ccw->cda && ccw->cda <= tail)))
break;
ccw++;
I need to spend some more time looking over whether we should look at
other chains by also calling tic_target_chain_exists(), and of course
see what happens to this when we run the original I/O exercisers that
led Farhan to the original patch. But it looks promising. Thoughts?
- Eric