Hello,

Greetings from me!

I am reading "4.3.4.2 A Volatile Solution" of perfbook, and came across the following summary:

    To summarize, the volatile keyword can prevent load tearing and store
    tearing in cases where the loads and stores are machine-sized and
    properly aligned. It can also prevent load fusing, store fusing,
    invented loads, and invented stores.
    ...

At first I thought this meant that accessing volatile, aligned, machine-sized data is an atomic operation, so I wrote a small program to test it on a 64-bit Linux server:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdatomic.h>
    #include <stdint.h>

    volatile uint64_t sum;     /* plain volatile counter */
    atomic_ullong atomic_sum;  /* C11 atomic counter */

    /* Each thread increments both counters 100000 times. */
    void *thread(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            sum++;
            atomic_fetch_add(&atomic_sum, 1);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[4];

        for (size_t i = 0; i < sizeof(tid) / sizeof(tid[0]); i++) {
            pthread_create(&tid[i], NULL, thread, NULL);
        }
        for (size_t i = 0; i < sizeof(tid) / sizeof(tid[0]); i++) {
            pthread_join(tid[i], NULL);
        }

        /* Cast so the argument matches %llu; read the atomic explicitly. */
        printf("sum=%llu,atomic_sum=%llu\n",
               (unsigned long long)sum, atomic_load(&atomic_sum));
        return 0;
    }

But the result suggests otherwise:

    $ gcc -pthread -O3 parallel.c -o parallel
    $ ./parallel
    sum=221785,atomic_sum=400000
    $ gcc --version
    gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22)
    Copyright (C) 2018 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

So my understanding is: volatile does not guarantee atomic operations on aligned, machine-sized data, and we can only use the atomic_xxx data types and related functions to guarantee atomic operations.

Is my understanding correct, or have I misunderstood volatile?

Thanks very much in advance!

Best Regards
Nan Xiao
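
P.S. To make my current understanding concrete, here is a small single-threaded sketch. It is only my assumption of what goes wrong, not something taken from perfbook: as far as I can tell, sum++ on a volatile is still a read-modify-write made of a separate load and a separate store, so two threads can both load the old value and one increment gets lost, even if neither access is torn. The function names lost_update_demo and atomic_demo are hypothetical, and the "threads" are simulated by interleaving the steps by hand.

```c
#include <stdatomic.h>
#include <stdint.h>

volatile uint64_t sum;
atomic_ullong atomic_sum;

/* Hand-interleaved version of two threads each doing `sum++`. */
uint64_t lost_update_demo(void)
{
    sum = 0;
    uint64_t a = sum;  /* "thread A" loads 0 */
    uint64_t b = sum;  /* "thread B" loads 0 before A has stored */
    sum = a + 1;       /* A stores 1 */
    sum = b + 1;       /* B also stores 1, so A's increment is lost */
    return sum;
}

/* Each atomic_fetch_add is one indivisible read-modify-write,
 * so there is no gap between load and store to interleave into. */
unsigned long long atomic_demo(void)
{
    atomic_store(&atomic_sum, 0);
    atomic_fetch_add(&atomic_sum, 1);  /* "thread A" */
    atomic_fetch_add(&atomic_sum, 1);  /* "thread B" */
    return atomic_load(&atomic_sum);
}
```

If I read this right, lost_update_demo() ends with sum equal to 1 while atomic_demo() ends with 2, which would match the kind of shortfall I see in the volatile counter above.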