Re: WARNING in __mmdrop

On 2019/7/30 11:08 PM, Michael S. Tsirkin wrote:
On Tue, Jul 30, 2019 at 03:44:47PM +0800, Jason Wang wrote:
On 2019/7/29 10:44 PM, Michael S. Tsirkin wrote:
On Mon, Jul 29, 2019 at 10:24:43PM +0800, Jason Wang wrote:
On 2019/7/29 4:59 PM, Michael S. Tsirkin wrote:
On Mon, Jul 29, 2019 at 01:54:49PM +0800, Jason Wang wrote:
On 2019/7/26 9:49 PM, Michael S. Tsirkin wrote:
Ok, let me retry if necessary (but I do remember I ended up with
deadlocks on my last try).
Ok, I played with this a little, and it works so far. I will do more
testing tomorrow.

One reason could be that I switched from get_user_pages_fast() to
__get_user_pages_fast(), which doesn't need mmap_sem.
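
(For reference, a minimal sketch of that swap; the helper name and the
fallback policy here are my assumptions, not the actual patch:)

#include <linux/mm.h>
#include <linux/pagemap.h>

static int vhost_map_pin_pages(unsigned long uaddr, int npages,
			       int write, struct page **pages)
{
	int npinned;

	/*
	 * __get_user_pages_fast() walks the page tables without taking
	 * mmap_sem and never faults pages in; it returns how many pages
	 * it managed to pin, which may be fewer than requested.
	 */
	npinned = __get_user_pages_fast(uaddr, npages, write, pages);
	if (npinned != npages) {
		/* Unpin the partial result; the caller then falls back
		 * to the regular copy_{to,from}_user() path. */
		release_pages(pages, npinned);
		return -EFAULT;
	}

	return 0;
}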

Thanks
OK, that sounds good. If we also set a flag to make
vhost_exceeds_weight() exit, then I think it will be all good.
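
(Guessing at the shape: the flag check could slot into the existing
weight test; dev->mm_invalidating is a made-up name for the flag the
notifier would set:)

bool vhost_exceeds_weight(struct vhost_virtqueue *vq,
			  int pkts, int total_len)
{
	struct vhost_dev *dev = vq->dev;

	/* Bail out early when the MMU notifier asked us to yield. */
	if (READ_ONCE(dev->mm_invalidating) ||
	    total_len >= dev->byte_weight ||
	    pkts >= dev->weight) {
		vhost_poll_queue(&vq->poll);
		return true;
	}

	return false;
}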
After some experiments, I came up with two methods:

1) switch to using vq->mutex; then we must take the vq lock during
range checking (but I don't see an obvious slowdown for 16 vcpus + 16
queues; see the sketch after 2) below). Setting a flag during the
weight check should work, but it still can't address the worst case:
waiting for a page to be swapped in. Is this acceptable?

2) keep the current RCU but replace synchronize_rcu() with
vhost_work_flush(). The worst case is the same as in 1), but we can
check the range without holding any locks.
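
(A minimal sketch of 1), assuming the notifier is embedded in struct
vhost_dev; vhost_invalidate_vq_meta() is a hypothetical helper standing
in for the range check and uncache:)

static int vhost_invalidate_range_start(struct mmu_notifier *mn,
					const struct mmu_notifier_range *range)
{
	struct vhost_dev *dev = container_of(mn, struct vhost_dev,
					     mmu_notifier);
	int i;

	for (i = 0; i < dev->nvqs; i++) {
		struct vhost_virtqueue *vq = dev->vqs[i];

		/* Take the vq lock so the worker cannot be in the middle
		 * of a direct map access while the range is checked and
		 * the cached map is dropped. */
		mutex_lock(&vq->mutex);
		vhost_invalidate_vq_meta(vq, range->start, range->end);
		mutex_unlock(&vq->mutex);
	}

	return 0;
}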

Which one do you prefer?

Thanks
I would rather we start with 1) and switch to 2) once we can show some
gain.

But the worst case needs to be addressed.
Yes.


How about sending a signal to
the vhost thread?  We would need to fix up the error handling (I think
that at the moment it will error out in that case, handling it as
EFAULT), and we don't want to drop packets if we can help it, and
surely not enter any error states.  In particular, it might be
especially tricky if we wrote into userspace memory and are now trying
to log the write.  I guess we can disable the optimization if logging
is enabled?
This may work but requires a lot of changes.
I agree.

And actually it's the price of
using vq mutex.
Not sure what's meant here.

I mean that if we use the vq mutex, the critical section gets bigger,
and we then need to deal with swapping.


Actually, the critical section should be rather small, e.g. just
inside the memory accessors.
Also true.

I wonder whether or not we could just do the synchronization ourselves,
like:
static inline void vhost_inc_vq_ref(struct vhost_virtqueue *vq)
{
	int ref = READ_ONCE(vq->ref);

	/* Publish the reference before entering the critical section. */
	WRITE_ONCE(vq->ref, ref + 1);
	smp_rmb();
}

static inline void vhost_dec_vq_ref(struct vhost_virtqueue *vq)
{
	int ref = READ_ONCE(vq->ref);

	/* Order the accesses in the critical section before the drop. */
	smp_wmb();
	WRITE_ONCE(vq->ref, ref - 1);
}

static inline void vhost_wait_for_ref(struct vhost_virtqueue *vq)
{
	/* Spin until the worker has left its critical section. */
	while (READ_ONCE(vq->ref))
		;
	smp_mb();
}
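
(For context, roughly how the two sides would pair up; vq->avail_map
and its layout below are illustrative assumptions:)

/* Worker side: bracket each direct access to the pinned pages. */
static u16 vhost_get_avail_idx(struct vhost_virtqueue *vq)
{
	struct vhost_map *map;
	u16 idx = 0;

	vhost_inc_vq_ref(vq);
	map = READ_ONCE(vq->avail_map);
	if (likely(map))
		idx = map->avail->idx;	/* direct access, no fault */
	vhost_dec_vq_ref(vq);

	return idx;
}

/* Notifier side: unpublish the map, then wait out any accessor. */
static void vhost_invalidate_avail(struct vhost_virtqueue *vq)
{
	WRITE_ONCE(vq->avail_map, NULL);
	vhost_wait_for_ref(vq);
}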
Looks good, but I'd like to think of a strategy/existing lock that
lets us block properly as opposed to spinning; that would be friendlier
to e.g. the realtime patch.

Does it make sense to disable preemption in the critical section? Then
we don't need to block, and the time spent in the memory accessors is
deterministic?
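
(Something like the following, assuming the accessors never sleep
since the pages are pinned; sleeping with preemption off is not
allowed:)

static inline void vhost_vq_access_begin(struct vhost_virtqueue *vq)
{
	/* With preemption off, the window between begin and end is
	 * bounded by the accessor itself, so a waiter spinning in
	 * vhost_wait_for_ref() cannot be starved by the scheduler. */
	preempt_disable();
	vhost_inc_vq_ref(vq);
}

static inline void vhost_vq_access_end(struct vhost_virtqueue *vq)
{
	vhost_dec_vq_ref(vq);
	preempt_enable();
}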
Hmm, maybe. I'm getting really nervous at this point - we seem to be
using every trick in the book.


Yes. Looking at the synchronization implemented by other MMU notifier
users, vhost's is even the simplest.


Or using smp_load_acquire()/smp_store_release() instead?

Thanks
These are cheaper on x86, yes.

Will use this.

Thanks
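
(A sketch of that substitution; note that the increment side still
needs a full barrier, since a release store does not order the store
before the later loads inside the critical section:)

static inline void vhost_inc_vq_ref(struct vhost_virtqueue *vq)
{
	int ref = READ_ONCE(vq->ref);

	WRITE_ONCE(vq->ref, ref + 1);
	/* Store->load ordering against the waiter still needs smp_mb();
	 * acquire/release cannot provide it. */
	smp_mb();
}

static inline void vhost_dec_vq_ref(struct vhost_virtqueue *vq)
{
	int ref = READ_ONCE(vq->ref);

	/* Everything inside the section completes before the waiter
	 * can observe the decrement. */
	smp_store_release(&vq->ref, ref - 1);
}

static inline void vhost_wait_for_ref(struct vhost_virtqueue *vq)
{
	/* Pairs with the release in vhost_dec_vq_ref(). */
	while (smp_load_acquire(&vq->ref))
		cpu_relax();
}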


This looks suspiciously like a seqlock though.
Can that be used somehow?


A seqlock does not provide a way for the writer to synchronize with
readers. But I did borrow some ideas from seqlock and posted a new
version.
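
(For comparison, the stock seqlock reader pattern from
<linux/seqlock.h>: the reader retries if a writer intervened, but the
writer only bumps the sequence and has no way to wait for readers to
drain, which is what the invalidate path needs:)

static DEFINE_SEQLOCK(map_lock);

static u64 read_map(const u64 *map)
{
	unsigned int seq;
	u64 val;

	do {
		seq = read_seqbegin(&map_lock);
		val = *map;	/* speculative read, may be retried */
	} while (read_seqretry(&map_lock, seq));

	return val;
}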

Please review.

Thanks



