Here's a simple program that demonstrates that libselinux's attempts to cache data ends up being a memory leak instead. The problem is that __thread variables have no destructor, so when a long-running program that uses multiple pthreads, where each thread uses selinux functions which stash malloc'd memory into a __thread cache, then the cache memory is leaked when the threads exit. __thread is fine for stuff that doesn't need destruction (for example, errno is a perfect candidate for __thread), but a recipe for disaster when used with malloc'd memory. The only thread-safe manner to cache malloc'd memory is to use a thread-local mechanism that includes a destructor, such as pthread_key_t. While looking through the history of selinux.git, I see that there was a past attempt made to use pthread_key_t instead of __thread for at least setrans_client.c (Tomas Mraz' commit a842c9dae863, Jun 2009), but it was done to solve a secondary bug related to crashing due to freeing uninitialized data in a destructor; the pthread_key_t changes were reverted by Stephen Smalley in commit 1ac1ff638250 in order to install a lighter weight fix that avoided destroying uninitialized data in commit 8c372f665db4 (Jul 2009). I'm not sure if anyone was aware of the implications of __thread memory leaks during that exchange. Meanwhile, __thread is abused in more than just setrans_client.c. See also https://bugzilla.redhat.com/show_bug.cgi?id=658571 $ cat foo.c #include <selinux/selinux.h> #include <pthread.h> #include <unistd.h> #include <stdlib.h> static void * leaker(void *arg) { security_context_t s; if (getfilecon (".", &s) < 0) exit (1); freecon (s); return arg; } int main (int argc, char **argv) { int limit = 1000; if (argc > 1) limit = atoi (argv[1]); while (limit--) { pthread_t t; if (pthread_create (&t, NULL, leaker, NULL)) exit (1); if (pthread_join (t, NULL)) exit (1); } return 0; } $ gcc -o foo -Wall foo.c -pthread -lselinux $ valgrind --quiet --leak-check=full ./foo 2 ==5658== 54 bytes in 2 blocks are definitely lost in loss record 1 of 2 ==5658== at 0x4A05E46: malloc (vg_replace_malloc.c:195) ==5658== by 0x37E3082A91: strdup (strdup.c:43) ==5658== by 0x3864E111D3: selinux_raw_to_trans_context (setrans_client.c:314) ==5658== by 0x3864E102B9: getfilecon (getfilecon.c:64) ==5658== by 0x400780: leaker (foo.c:10) ==5658== by 0x37E3806D5A: start_thread (pthread_create.c:301) ==5658== by 0x37E30E4AAC: clone (clone.S:115) ==5658== ==5658== 54 bytes in 2 blocks are definitely lost in loss record 2 of 2 ==5658== at 0x4A05E46: malloc (vg_replace_malloc.c:195) ==5658== by 0x37E3082A91: strdup (strdup.c:43) ==5658== by 0x3864E111FA: selinux_raw_to_trans_context (setrans_client.c:317) ==5658== by 0x3864E102B9: getfilecon (getfilecon.c:64) ==5658== by 0x400780: leaker (foo.c:10) ==5658== by 0x37E3806D5A: start_thread (pthread_create.c:301) ==5658== by 0x37E30E4AAC: clone (clone.S:115) ==5658== If you're brave, run ./foo 1000000000, and hope that the OOM killer doesn't randomly start killing off unrelated processes (or else you're lucky to have that much RAM). I'm wondering whether it is better to just restore the ideas in commit a842c9dae, or whether we need to come up with some better way to avoid leaking cache data on pthread_exit. -- Eric Blake eblake@xxxxxxxxxx +1-801-349-2682 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature