Gal was able to track down some memory bugs with address sanitizer and
identify them as overflows of the coroutine stack used by the beast
frontend. Unlike pthread stacks, these heap-allocated coroutine stacks
don't have any memory protection to catch these overflows.
The boost::asio::spawn() [1] function that creates these coroutines does
allow you to pass in coroutine attributes to control the stack size
(which defaults to 128K), and we were able to increase that in testing
and see the errors go away. But since these coroutines are how we expect
the beast frontend to scale to thousands of connections, we really don't
want to raise the stack size permanently and limit how many we can fit
in memory.
Unfortunately, the boost::coroutine library used by boost::asio doesn't
give us any flexibility outside of the stack size, though the underlying
boost::context library provides several different options for the stack
allocation [2]. And despite boost::coroutine being long deprecated by
boost::coroutine2, boost::asio has never removed its dependency on the
former. The maintainer of these coroutine libraries even submitted a
pull request against boost::asio in 2017 to use boost::context directly
[3], allowing boost::asio::spawn() to use any of its stack allocators.
My sense is that the asio maintainer is more interested in integration
with the C++ Coroutines TS, rather than continuing to support these
stackful coroutines that aren't part of the C++ Networking TS.
I rebased this stale pull request and added some test coverage in
https://github.com/cbodley/asio/commits/wip-spawn-context. Then, since
spawn() is relatively independent from the rest of boost::asio, I forked
that part into a new repo at https://github.com/cbodley/spawn. Using
this fork, radosgw could continue using boost::asio as it is (and
eventually the Networking TS), but call spawn::spawn() instead for
direct control over the coroutine stack allocation.
Of the available stack allocators, the segmented_stack looked promising;
that would allow us to start with a small stack, and grow as more space
was needed. However, this would require us to build our own boost as
segmented stacks aren't enabled by default. It also doesn't help us
control the total stack size used per connection.
So the consensus is to use the protected_fixedsize allocator based on
mmap/mprotect with the existing 128K limit. That way we can catch these
stack overflows in testing, and treat the stack usage itself as bugs.
Casey
[1]
https://www.boost.org/doc/libs/1_71_0/doc/html/boost_asio/reference/spawn.html
[2]
https://www.boost.org/doc/libs/1_71_0/libs/context/doc/html/context/stack.html
[3] https://github.com/boostorg/asio/pull/55
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx