On Wed, May 24, 2017 at 05:12:06PM +0200, Michal Privoznik wrote:
> On 05/24/2017 04:58 PM, Richard W.M. Jones wrote:
>> On Wed, May 24, 2017 at 04:16:45PM +0200, Michal Privoznik wrote:
>>> On 05/24/2017 02:47 PM, Richard W.M. Jones wrote:
>>>> On Wed, May 24, 2017 at 12:49:58PM +0200, Michal Privoznik wrote:
>>>>> That's quite exact. I mean the word 'guessing'. We can't really
>>>>> provide a reliable way of dealing with what you're suggesting
>>>>> (unless we cut the limit really small), nor can we guarantee
>>>>> atomicity. Therefore I think it would be a waste of time to work
>>>>> on this. Yes, it can be done, but the benefits are pretty small
>>>>> IMO.
>>>>
>>>> Why is atomicity a problem?
>>>
>>> Whether atomicity is a problem depends on the level at which we do
>>> the serialization. If we do it the way Martin suggested (= at the
>>> public API level), then the list of domains may change between two
>>> iterations of the suggested loop, so the result obtained would not
>>> reflect reality. However, if we are talking about the RPC level,
>>> then there's no atomicity problem, as the RPC layer is 'blind' to
>>> the data (it doesn't care what data is sent).
>>
>> Just structure the libvirtd messages so that you have:
>>
>>   COLLECT_THE_STATS - saves the stats into an internal buffer in
>>                       libvirtd and returns a handle and a number of
>>                       stat items
>>   RETURN_THE_STATS  - returns a partial subset of the previously
>>                       collected stats; called multiple times to
>>                       transfer the data back to libvirt
>>   FREE_THE_STATS    - frees the internal buffer
>
> This is exactly what I was proposing in my e-mail that I linked in
> the other thread. You just phrased it more nicely. Yet again, what's
> the gain/advantage of this over one big message?
>
>> In the libguestfs case there is an actual security concern. The
>> daemon runs in the untrusted context of the guest, where a malicious
>> guest filesystem or program could (in some cases quite easily) send
>> back arbitrarily large messages to the library, tying up infinite
>> resources on the host. In libvirt the danger is possibly more on the
>> other side (a modified library sends an arbitrarily large message to
>> libvirtd). There's also the issue of libvirt connecting to a
>> compromised remote host. Whether having a maximum message size
>> prevents all such attacks I'm less clear about, but it probably
>> helps against simple ones.
>
> I'm not a security expert, but I view both approaches the same from a
> security POV. I mean, I'm all for limits. Don't get me wrong there.
> It's just that we have two options:
>
>   a) one API call = one RPC call
>   b) one API call split across multiple RPC messages
>
> So, obviously, in case a) we need a bigger limit for the RPC message
> so that all the data fits. In case b) the limit for the RPC message
> can be smaller, but in turn we need a limit on the maximum number of
> messages per API call. Both of these approaches consume about the
> same memory on src & dst (I think option b) consumes slightly more
> because of all the extra headers sent with each message, but that's
> not the point). Now, if I were an attacker, I could just as well send
> one big message or N small messages to fill up the buffers. Therefore
> I think the attack surface is the same for both approaches. a) is
> easier to implement, IMO.
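To make the three-call pattern quoted above more concrete, here is a
minimal, self-contained sketch of a client pulling a large stats reply
back in bounded chunks. All of the names, the 64 KiB chunk size and the
in-memory stand-in for the daemon are made up for illustration; none of
this is actual libvirt RPC code.

/*
 * Hypothetical sketch of the three-call pattern quoted above
 * (COLLECT_THE_STATS / RETURN_THE_STATS / FREE_THE_STATS): the
 * "daemon" keeps the serialized result in a buffer and the "client"
 * fetches it in messages that each stay under a fixed cap.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_MAX (64 * 1024)           /* assumed per-message payload cap */

/* Fake "daemon" side: one serialized result held in a buffer. */
static char *srv_buf;
static size_t srv_len;

static int collect_the_stats(size_t *total)
{
    srv_len = 1000000;                  /* pretend the stats serialize to ~1 MB */
    if (!(srv_buf = malloc(srv_len)))
        return -1;
    memset(srv_buf, 'x', srv_len);
    *total = srv_len;
    return 0;
}

static int return_the_stats(size_t offset, char *out, size_t want, size_t *got)
{
    if (offset >= srv_len)
        return -1;
    *got = srv_len - offset < want ? srv_len - offset : want;
    memcpy(out, srv_buf + offset, *got);
    return 0;
}

static void free_the_stats(void)
{
    free(srv_buf);
    srv_buf = NULL;
    srv_len = 0;
}

/* "Client" side: pull the whole result back in bounded chunks. */
int main(void)
{
    size_t total, off = 0, got;
    char *data;

    if (collect_the_stats(&total) < 0)
        return 1;

    if (!(data = malloc(total))) {
        free_the_stats();
        return 1;
    }

    while (off < total) {
        size_t want = total - off > CHUNK_MAX ? CHUNK_MAX : total - off;
        if (return_the_stats(off, data + off, want, &got) < 0)
            break;
        off += got;                     /* every message stays under CHUNK_MAX */
    }

    free_the_stats();                   /* drop the daemon-side buffer */
    printf("transferred %zu of %zu bytes in chunks of at most %zu bytes\n",
           off, total, (size_t)CHUNK_MAX);
    free(data);
    return off == total ? 0 : 1;
}

The point of the pattern is only that the daemon holds the full result
under a handle while each message on the wire stays under a fixed cap.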
You are forgetting one thing. If b) is only used on the way from the
daemon to the client, then you can keep the number of messages per RPC
call unlimited without any security concern. The only thing that would
have to get compromised is the daemon, and if that happens, there are
bigger issues than DoS'ing the client. It would actually solve all of
our current issues (not talking about a client starting a domain with
10,000 disks, of course).

Martin
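To spell out that asymmetry in code: a client that caps each individual
incoming message, but not the number of messages per reply, could look
roughly like the sketch below. The 256 KiB cap and the simulated
message-length streams are invented for illustration; this is not
libvirt's actual RPC layer.

/*
 * Hypothetical sketch: the client enforces a size limit on every
 * single incoming message, but accepts any number of messages per
 * reply, since only a compromised daemon could abuse the count.
 */
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

#define MSG_MAX (256 * 1024)            /* assumed cap on one RPC message */

/* Accept a stream of reply messages: any count, but each one bounded. */
static int consume_reply(const uint32_t *lengths, size_t n)
{
    size_t i;
    uint64_t total = 0;

    for (i = 0; i < n; i++) {           /* no limit on the message count... */
        if (lengths[i] > MSG_MAX) {     /* ...but every message is bounded */
            fprintf(stderr, "message %zu too large: %u bytes\n",
                    i, (unsigned)lengths[i]);
            return -1;
        }
        total += lengths[i];            /* chunk consumed, then released */
    }
    printf("accepted %zu messages, %llu bytes total\n",
           n, (unsigned long long)total);
    return 0;
}

int main(void)
{
    /* Many small messages: fine, however many there are. */
    uint32_t ok[1000];
    /* One oversized message: rejected immediately. */
    uint32_t bad[] = { 4096, MSG_MAX + 1, 1024 };
    size_t i;

    for (i = 0; i < sizeof(ok) / sizeof(ok[0]); i++)
        ok[i] = 65536;

    consume_reply(ok, sizeof(ok) / sizeof(ok[0]));
    consume_reply(bad, sizeof(bad) / sizeof(bad[0]));
    return 0;
}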