On 07/09/2019 10:21 AM, Halil Pasic wrote:
On Tue, 9 Jul 2019 09:46:51 -0400
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
On 07/09/2019 06:16 AM, Cornelia Huck wrote:
On Mon, 8 Jul 2019 16:10:37 -0400
Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
There is a small window where it's possible that we could be working
on an interrupt (queued in the workqueue) and setting up a channel
program (i.e. allocating memory, pinning pages, translating addresses).
This can lead to allocating and freeing the channel program at the
same time and can cause memory corruption.
Let's not call cp_free if we are currently processing a channel program.
The only way we know for sure that we don't have a thread setting
up a channel program is when the state is set to VFIO_CCW_STATE_CP_PENDING.
Can we pinpoint a commit that introduced this bug, or has it been there
since the beginning?
I think the problem was always there.
I think it became relevant with the async stuff. Because after the async
stuff was added we started getting solicited interrupts that are not about
the channel program being done. At least this is how I remember the discussion.
Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
---
drivers/s390/cio/vfio_ccw_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 4e3a903..0357165 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -92,7 +92,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
 	if (scsw_is_solicited(&irb->scsw)) {
 		cp_update_scsw(&private->cp, &irb->scsw);
-		if (is_final)
+		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
Ain't private->state potentially used by multiple threads of execution?
yes
One of the paths I can think of is a machine check from the host, which
will ultimately call the vfio_ccw_sch_event callback, which could set the
state to NOT_OPER or IDLE.
Do we need to use atomic operations or external synchronization to avoid
this being another gamble? Or am I missing something?
I think we probably should think about atomic operations for
synchronizing the state (and it could be a separate add-on patch?).
But for preventing two threads from stomping on the cp the check should be
enough, unless I am missing something?
 			cp_free(&private->cp);
 	}
 	mutex_lock(&private->io_mutex);
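
For illustration only, the guarded free in the hunk above can be modelled in userspace. This is a rough sketch, not the driver code: `model_state`, `model_private`, `model_io_todo` and `model_sch_event` are hypothetical names, and the use of C11 `<stdatomic.h>` for the state field is an assumption made for the model (it is not what the patch does).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the vfio-ccw FSM states. */
enum model_state {
	MODEL_STATE_NOT_OPER,
	MODEL_STATE_IDLE,
	MODEL_STATE_CP_PROCESSING,	/* a thread is still setting up the cp */
	MODEL_STATE_CP_PENDING,		/* cp is set up and has been submitted */
};

/* Hypothetical model of the private data: a state and a "cp allocated" flag. */
struct model_private {
	_Atomic int state;
	bool cp_allocated;
};

/* Models an asynchronous event (e.g. a machine check) changing the state. */
static void model_sch_event(struct model_private *priv, enum model_state next)
{
	assert(priv != NULL);
	atomic_store(&priv->state, next);
}

/*
 * Models the guarded free from the hunk: the cp is released only for a
 * final interrupt that arrives while the state is CP_PENDING.
 * Returns true if the cp was released.
 */
static bool model_io_todo(struct model_private *priv, bool is_final)
{
	assert(priv != NULL);

	if (is_final && atomic_load(&priv->state) == MODEL_STATE_CP_PENDING) {
		priv->cp_allocated = false;	/* stands in for cp_free() */
		return true;
	}
	return false;
}
```

In this model, a final interrupt seen while the state is `MODEL_STATE_CP_PROCESSING` leaves the cp alone, which is the case the check is meant to cover. Note that an atomic load only makes each individual read of the state well defined; it does not by itself stop the state from changing between the check and the free, so closing that window would still need some form of external synchronization.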
Reviewed-by: Cornelia Huck <cohuck@xxxxxxxxxx>
Thanks for reviewing.
Thanks
Farhan