On Wed, 20 Feb 2019 11:25:46 -0500 Eric Farman <farman@xxxxxxxxxxxxx> wrote: > On 02/20/2019 07:44 AM, Cornelia Huck wrote: > > On Wed, 20 Feb 2019 06:29:38 -0500 > > Eric Farman <farman@xxxxxxxxxxxxx> wrote: > > > >> On 02/20/2019 04:48 AM, Cornelia Huck wrote: > >>> On Tue, 19 Feb 2019 21:49:07 -0500 > >>> Eric Farman <farman@xxxxxxxxxxxxx> wrote: > >>> > >>>> Hi Connie, Farhan, > >>>> > >>>> On 02/04/2019 12:06 PM, Cornelia Huck wrote: > >>>>> From: Farhan Ali <alifm@xxxxxxxxxxxxx> > >>>>> > >>>>> When trying to calculate the length of a ccw chain, we assume > >>>>> there are ccws after a TIC. This can lead to overcounting and > >>>>> copying garbage data from guest memory. > >>>>> > >>>>> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > >>>>> Message-Id: <d63748c1f1b03147bcbf401596638627a5e35ef7.1548082107.git.alifm@xxxxxxxxxxxxx> > >>>>> Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx> > >>>>> Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx> > >>>>> --- > >>>>> drivers/s390/cio/vfio_ccw_cp.c | 2 +- > >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c > >>>>> index 70a006ba4d05..ba08fe137c2e 100644 > >>>>> --- a/drivers/s390/cio/vfio_ccw_cp.c > >>>>> +++ b/drivers/s390/cio/vfio_ccw_cp.c > >>>>> @@ -392,7 +392,7 @@ static int ccwchain_calc_length(u64 iova, struct channel_program *cp) > >>>>> return -EOPNOTSUPP; > >>>>> } > >>>>> > >>>>> - if ((!ccw_is_chain(ccw)) && (!ccw_is_tic(ccw))) > >>>>> + if (!ccw_is_chain(ccw)) > > > > OK, this function now returns the length of the chain excluding the > > last tic. Hm, not so sure about that anymore. I'm a bit tired, please apply salt where needed to what I'm seeing. > > > >>>>> break; > >>>>> > >>>>> ccw++; > >>>>> > > > > Now, cp_init will not copy the last tic to the chain. When it then > > looks for tics in that new chain, it won't find any, and stop copying. (...) > >>>> I'm not going to claim to know exactly how this is failing, > >>>> because it's too late for coffee and reading IPL records is > >>>> difficult enough. But in the working case (without this patch), > >>>> cp_init() calls ccwchain_calc_length() and gets a count of four > >>>> CCWs (SEEK + SIDE + TIC > >>>> + READ), which is broken up into two chains. One for the > >>>> SEEK/SIDE/TIC, and another for the READ. In the failing case > >>>> (with this patch), cp_init() only counts three CCWs, we never read > >>>> the IPL2 address (the second chain in the working case), and our > >>>> boot fails. > > > > I think "we don't copy the tic to the chain where we search for tics" > > fits those symptoms. > > Indeed. Hrm... (...) > > See above. We may need two chains: One without the trailing tic, and > > one to process to see where that tic points to... or rework the way to > > follow the tic? Not sure. > > I hate the idea of two chains, because how much storage do we end up > needing to consume? But maybe it's inevitable, or maybe I can think up > a way to rework it without. Yeah, that sucks. If we really need to track tics separately (and I'm not so sure, see above), there should be a better way for that... (...) > > I'm wondering whether we should keep this patch and fix on top of it, > > or revert it for now... I'm not sure tic processing is working at all > > right now. > > I've been wondering this too. Yeah, the bios code isn't merged yet, but > this patch means there's no point in merging it until we get TIC fixed > properly. Which I'd hope is 5.1, but I'm not sure how deep this bowl of > spaghetti is. We also need to make sure that this is not a bug in the bios instead... > > > > > Maybe we need a tool for testing that throws random channel programs at > > the device :) > > > > Are you peeking at my todo/wish list? :) :) Might be something for kvm unit tests? Or do I misunderstand what they can do?