I am using a MIPS CPU running Linux 4.4.6, and have a problem when forking from a process that is also using pthreads. I believe this is OK to do, although there are limits on what the forked process is allowed to do.

The CPU is a BCM53003, which uses a MIPS32 74K core. Although we had to bring in some patches from Broadcom for this CPU, there are no changes in the memory management that I can see. The kernel is compiled without CONFIG_PREEMPT or CONFIG_SMP.

I have traced the problem down to the duplication of the memory map for the newly forked process. As the parent has created pthreads, this memory map is currently shared, since pthreads all share the same memory space. In copy_pte_range() there is a cond_resched(), which allows other tasks to run while this copying is taking place. If I remove the cond_resched() (and associated logic), then the problem goes away.

So I guess my question is why this copy is not working successfully with the reschedule in the middle of it (and yet works on other platforms). The memory corruption is occurring in the original memory map, not the copy, as it is the existing pthreads that segfault. I do know that some pages will get set to read-only to allow COW, so am wondering whether that is somehow related. Is there some extra cache flush or MMU invalidation that needs to occur on this CPU?

Here is the test code which will segfault when run on this CPU. The same code runs fine on other architectures (x86, MIPS64, PowerPC, ARM).

/**
 * Simple test app to reproduce a problem we were seeing with stack corruption
 * within a pthread, while another thread is doing a fork() operation.
 */
#include <stdlib.h>
#include <stdint.h>
#include <stdbool.h>
#include <string.h>
#include <assert.h>
#include <syslog.h>
#include <stdio.h>
#include <time.h>
#include <errno.h>
#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <pthread.h>

/* If the corruption doesn't occur, this should be more like 1000000 to avoid
 * screeds of debug output */
#define PERIODIC_DEBUG 1000

#define TEST_PKT_ID 3434
#define TEST_ETHERTYPE htonl(0xbeef)

struct test_packet
{
    uint8_t eth_dst[ETHER_ADDR_LEN];
    uint8_t eth_src[ETHER_ADDR_LEN];
    uint32_t eth_type;
    uint32_t sequence;
    uint8_t mac[ETHER_ADDR_LEN];
    uint32_t id;
    uint32_t time;
};

uint8_t test_dst_addr[ETHER_ADDR_LEN] = { 0x1, 0x00, 0x11, 0x22, 0x33, 0x44 };
uint8_t test_src_addr[ETHER_ADDR_LEN] = { 0x0b, 0xad, 0xde, 0xad, 0xbe, 0xef };
int test_sequence = 0;
uint32_t num_forks = 0;
uint32_t num_corruptions = 0;
uint32_t num_loops = 0;

/**
 * This is based off code that was originally sending a packet. It turns out that
 * all we need to do to see the corruption problem is set a few fields in the
 * packet, and then sanity-check that they're all still correct afterwards.
 * @note this achieves much the same as compiling with -fstack-protector-all,
 * however it just casts a slightly wider net to check for stack corruption.
 */
void
test_packet_send (void)
{
    struct test_packet pktPt;
    uint8_t buf[32];

    /* construct a test pkt */
    memcpy (pktPt.eth_dst, test_dst_addr, ETHER_ADDR_LEN);
    memcpy (pktPt.eth_src, test_src_addr, ETHER_ADDR_LEN);
    pktPt.eth_type = TEST_ETHERTYPE;
    pktPt.sequence = test_sequence++;
    memcpy (pktPt.mac, test_src_addr, ETHER_ADDR_LEN);
    pktPt.id = TEST_PKT_ID;
    pktPt.time = 0;

    /* this line/variable isn't strictly needed.
     * It just seems to make the
     * corruption more likely to occur in the pktPt variable, where we can
     * detect it more easily */
    memset (buf, 0, sizeof (buf));

    /* Sanity-check the pkt values set at the start of the function are still
     * intact, i.e. the stack hasn't been stomped on */
    if (memcmp (pktPt.eth_dst, test_dst_addr, ETHER_ADDR_LEN) != 0 ||
        memcmp (pktPt.eth_src, test_src_addr, ETHER_ADDR_LEN) != 0 ||
        pktPt.eth_type != TEST_ETHERTYPE ||
        memcmp (pktPt.mac, test_src_addr, ETHER_ADDR_LEN) != 0 ||
        pktPt.id != TEST_PKT_ID ||
        pktPt.time != 0)
    {
        fprintf (stdout, "%u corruptions out of %u loops (%u forks)\n",
                 ++num_corruptions, num_loops, num_forks);
        fflush (stdout);
    }
}

static void *
test_send_thread (void *unused)
{
    while (true)
    {
        /* without some sort of yield here, the problem doesn't seem to occur.
         * The original code did a select() here, but a 1us sleep seems to
         * reproduce the problem a lot better */
        usleep (1);

        test_packet_send ();

        /* Report on the progress of the test every so often. One problem that
         * sometimes occurs is that a corruption causes a system call to lock up.
         * So it looks like no corruptions are detected, when really the test
         * isn't running properly */
        if ((++num_loops % PERIODIC_DEBUG) == 0)
        {
            fprintf (stdout, "%u corruptions out of %u loops (%u forks)\n",
                     num_corruptions, num_loops, num_forks);
            fflush (stdout);
        }
    }

    return NULL;
}

/**
 * Polls to see if the last child process forked has been cleaned up. If so,
 * then it fork()s a new child again (the child process does nothing - it just
 * exits).
 * @note the problem also occurs if the parent process does a blocking call to
 * waitpid() - it just seems more reproducible using a non-blocking waitpid().
 */
void
try_fork_again (void)
{
    static pid_t last_pid = -1;
    int status;

    if (last_pid >= 0)
    {
        /* wait for the child process to exit before proceeding (to make sure
         * the zombie process gets cleaned up properly) */
        if (waitpid (last_pid, &status, WNOHANG) > 0)
        {
            last_pid = -1;
        }
    }

    /* if any previous children are now cleaned up, then fork() again */
    if (last_pid < 0)
    {
        last_pid = fork ();
        num_forks++;

        if (last_pid < 0)
        {
            fprintf (stderr, "fork() failed - %s\n", strerror (errno));
        }
        else if (last_pid == 0)
        {
            _exit (0);
        }
    }
}

int
main (int argc, char *argv[])
{
    pthread_t tid;

    /* create a separate thread that pretends to send packets */
    if (pthread_create (&tid, NULL, test_send_thread, NULL) != 0)
    {
        fprintf (stderr, "Could not create pthread - %s\n", strerror (errno));
        return EXIT_FAILURE;
    }

    /* meanwhile in the main thread do lots and lots of forks */
    while (true)
    {
        try_fork_again ();
    }

    return EXIT_FAILURE;
}
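For reference, here is a minimal sketch of the copy-on-write behaviour I am relying on. It is a separate, single-threaded check and not part of the test app; the helper name cow_parent_unaffected() is made up for illustration. The child overwrites a buffer it inherited across fork(), and the parent then verifies that its own copy still holds the original pattern. It does not exercise the cond_resched() window in copy_pte_range(), so it only shows what the parent should see, not why this CPU misbehaves.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the test app).
 * Returns 1 if the parent's buffer is intact after the child wrote to its
 * own copy, 0 if the parent's buffer changed, -1 on allocation/fork/wait
 * failure. Intended to be called from a single-threaded process. */
int
cow_parent_unaffected (size_t len)
{
    unsigned char *buf;
    size_t i;
    int status;
    int result = 1;
    pid_t pid;

    assert (len > 0);

    buf = malloc (len);
    if (buf == NULL)
        return -1;

    /* touch every page in the parent so it is mapped before the fork */
    memset (buf, 0xA5, len);

    pid = fork ();
    if (pid < 0)
    {
        free (buf);
        return -1;
    }

    if (pid == 0)
    {
        /* child: each write should fault and get a private copy of the page */
        memset (buf, 0x5A, len);
        _exit (buf[0] == 0x5A ? 0 : 1);
    }

    /* parent: wait for the child, then check our copy is unchanged */
    if (waitpid (pid, &status, 0) != pid ||
        !WIFEXITED (status) || WEXITSTATUS (status) != 0)
    {
        free (buf);
        return -1;
    }

    for (i = 0; i < len; i++)
    {
        if (buf[i] != 0xA5)
        {
            result = 0;
            break;
        }
    }

    free (buf);
    return result;
}
```

Calling cow_parent_unaffected (1 << 20) from a small main() should return 1 on a system where COW works as expected; a return of 0 would mean the child's writes leaked into the parent.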