Re: rgw: coroutine stack protection

On 11/10/19 4:34 AM, Yuval Lifshitz wrote:
On Tue, Nov 5, 2019 at 10:33 PM Casey Bodley <cbodley@xxxxxxxxxx> wrote:
Gal was able to track down some memory bugs with address sanitizer and
identify them as overflows of the coroutine stack used by the beast
frontend. Unlike pthread stacks, these heap-allocated coroutine stacks
don't have any memory protection to catch these overflows.

The boost::asio::spawn() [1] function that creates these coroutines does
allow you to pass in coroutine attributes to control the stack size
(which defaults to 128K), and we were able to increase that in testing
and see the errors go away. But since these coroutines are how we expect
the beast frontend to scale to thousands of connections, we really don't
want to raise the stack size permanently and limit how many we can fit
in memory.
what is the required stack size that does not cause an overflow?

As far as I know, 1MB is all we tested (via https://github.com/cbodley/ceph/commit/d23507bd1295a29ccae3ae36187194d44c9a6438) just to confirm that stack overflow was the root cause.

Unfortunately, the boost::coroutine library used by boost::asio doesn't
give us any flexibility outside of the stack size, though the underlying
boost::context library provides several different options for the stack
allocation [2]. And despite boost::coroutine having long been deprecated in
favor of boost::coroutine2, boost::asio has never removed its dependency on
the former. The maintainer of these coroutine libraries even submitted a
pull request against boost::asio in 2017 to use boost::context directly
[3], allowing boost::asio::spawn() to use any of its stack allocators.
My sense is that the asio maintainer is more interested in integration
with the C++ Coroutines TS, rather than continuing to support these
stackful coroutines that aren't part of the C++ Networking TS.

I rebased this stale pull request and added some test coverage in
https://github.com/cbodley/asio/commits/wip-spawn-context. Then, since
spawn() is relatively independent from the rest of boost::asio, I forked
that part into a new repo at https://github.com/cbodley/spawn. Using
this fork, radosgw could continue using boost::asio as it is (and
eventually the Networking TS), but call spawn::spawn() instead for
direct control over the coroutine stack allocation.

Of the available stack allocators, the segmented_stack looked promising;
that would allow us to start with a small stack, and grow as more space
was needed. However, this would require us to build our own boost as
segmented stacks aren't enabled by default. It also doesn't help us
control the total stack size used per connection.
if this is rare (hard to reproduce), why don't we use pooled_fixedsize_stack?
this would limit the overall size, but would allow the specific code
path to use more memory?

The pooled_fixedsize_stack appears to have the same two limitations as the current fixedsize_stack: each stack has a fixed size (128K by default), and there's no memory protection to detect stack overflows. The only difference seems to be that freed stacks are kept in a boost::pool and reused for later stack allocations.

So the consensus is to use the protected_fixedsize allocator, based on
mmap/mprotect, with the existing 128K limit. That way we can catch these
stack overflows in testing and treat the excessive stack usage itself as a bug.
why would this be considered a bug (assuming it does not exceed the
2MB you usually get per regular stack)?

The motivation for enforcing a limit on the per-connection stack size is to help scale to lots of concurrent connections. However, if the requests are mostly object gets/puts, we'll be buffering up to 16M of object data per request and that's going to dwarf these stacks.

Another dimension to consider here is the zipper project, which will allow you to insert extra layers into the request processing path to add things like caching, logging, policies, etc. So even if we do set a strict limit on these stacks and verify that we don't hit them in testing, all bets are off when you start adding in layers at runtime.

I think the protected_fixedsize_stack gives us the most flexibility here. In addition to the overflow detection from mmap+mprotect, it lets us map a comfortable amount of address space without committing physical pages up front, so it essentially behaves like the pthread stacks we're used to.

Casey


[1]
https://www.boost.org/doc/libs/1_71_0/doc/html/boost_asio/reference/spawn.html

[2]
https://www.boost.org/doc/libs/1_71_0/libs/context/doc/html/context/stack.html

[3] https://github.com/boostorg/asio/pull/55
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


