On Fri, Apr 3, 2020 at 4:42 AM, Roman Gushchin <guro@xxxxxx> wrote:
> > In fact, I've tested this patch and your fixes for migration problem
> > and found that there is
> > still migration problem and failure rate is increased by this patch.
>
> Do you mind sharing any details? What kind of pages are those?

I didn't investigate further since I didn't have enough time. If I
remember correctly, it's a page used by journaling. I attach my test
script below to help you reproduce it. My test setup is:

- virtual machine, 8 cpus and 1024 MB mem (256 MB cma mem)
- ubuntu 16.04 with custom kernel
- filesystem is ext4

> I'm using the following patch to dump failed pages:
>
> @@ -1455,6 +1455,9 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
> 						private, page, pass > 2, mode,
> 						reason);
>
> +		if (rc && reason == MR_CONTIG_RANGE)
> +			dump_page(page, "unmap_and_move");
> +
> 		switch(rc) {
> 		case -ENOMEM:
> 			/*
>
> > However, given that
> > there is no progress on this area for a long time, I think that
> > applying the change aggressively
> > is required to break the current situation.
>
> I totally agree!
>
> Btw, I've found that cma_release() grabs the cma->lock mutex,
> so it can't be called from the atomic context (I've got a lockdep warning).
>
> Of course, I can change the calling side, but I think it's better to change
> the cma code to make cma_release() more accepting. What do you think
> about the following patch?

For a 2 GB CMA area, we would need to scan 8192(?) bytes in the worst
case, and I don't think that's small enough for a spinlock. Moreover,
there is no limit on the size of a CMA area; the bigger the area, the
longer it takes. So I think a spinlock isn't a good fit here.

Anyway, below is the test script that I used.

Thanks.
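As a back-of-the-envelope check on the worst-case figure above (a sketch only; the 2 GB size, 4 KB page size and the order_per_bit values are illustrative assumptions, not read from my setup):

```shell
# Worst-case CMA bitmap size: one bit covers (PAGE_SIZE << order_per_bit)
# bytes of the area. All figures below are illustrative.
AREA_BYTES=$((2 * 1024 * 1024 * 1024))   # 2 GB CMA area
PAGE_SIZE=4096
ORDER_PER_BIT=3                          # hypothetical; 0 is also common

PAGES=$(($AREA_BYTES / $PAGE_SIZE))              # 524288 pages
BITMAP_BYTES=$((($PAGES >> $ORDER_PER_BIT) / 8))
echo $BITMAP_BYTES                               # 8192
```

With order_per_bit = 0 the same area needs a 64 KB bitmap, so the scan under the lock only grows from there.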
-------------------------->8------------------------------------
RUNS=1
MAKE_CPUS=10
KERNEL_DIR=~~~~~~~~~~~~~~~
WORKING_DIR=`pwd`
RESULT_OUTPUT=$WORKING_DIR/log-cma-alloc.txt

BUILD_KERNEL=1
BUILD_KERNEL_PID=0
SHOW_CONSOLE=1
SHOW_LATENCY=1

CMA_AREA_NAME=cma_reserve
CMA_DEBUGFS_ROOT_DIR=/sys/kernel/debug/cma
CMA_DEBUGFS_AREA_DIR=$CMA_DEBUGFS_ROOT_DIR/cma-$CMA_AREA_NAME
CMA_AREA_COUNT=`sudo cat $CMA_DEBUGFS_AREA_DIR/count`
CMA_AREA_ORDERPERBIT=`sudo cat $CMA_DEBUGFS_AREA_DIR/order_per_bit`
CMA_AREA_PAGES=$(($CMA_AREA_COUNT * 2 ** $CMA_AREA_ORDERPERBIT))
CMA_ALLOC_DELAY=5
CMA_ALLOC_SPLIT=32
CMA_ALLOC_PAGES=$(($CMA_AREA_PAGES / $CMA_ALLOC_SPLIT))

function show_cma_info()
{
	cat /proc/meminfo | grep -i cma
	sudo cat $CMA_DEBUGFS_AREA_DIR/{count,used}
}

function time_begin()
{
	echo $(date +%s.%N)
}

function time_elapsed()
{
	tmp=$(date +%s.%N)
	echo $tmp - $1 | bc -l
}

function time_sum()
{
	echo $1 + $2 | bc -l
}

function time_avg()
{
	echo $1 / $2 | bc -l
}

if [ "$1" == "show" ]; then
	show_cma_info
	exit 0
fi

if [ "$SHOW_CONSOLE" != "1" ]; then
	exec 3>&1 4>&2 >$RESULT_OUTPUT 2>&1
fi

if [ "$BUILD_KERNEL" == "1" ]; then
	pushd -
	cd $KERNEL_DIR
	make clean &> /dev/null; make -j$MAKE_CPUS &> /dev/null &
	BUILD_KERNEL_PID=$!
	popd
	echo "waiting until build kernel runs actively"
	sleep 10
fi

echo "BUILD_KERNEL: $BUILD_KERNEL"
echo "BUILD_KERNEL_PID: $BUILD_KERNEL_PID"
echo "CMA_AREA_NAME: $CMA_AREA_NAME"
echo "CMA_AREA_PAGES: $CMA_AREA_PAGES"
echo "CMA_ALLOC_SPLIT: $CMA_ALLOC_SPLIT"
echo "CMA_ALLOC_PAGES: $CMA_ALLOC_PAGES"

for i in `seq $RUNS`; do
	echo "begin: $i"
	show_cma_info

	CMA_ALLOC_SUCC=0
	T_BEGIN=`time_begin`
	for j in `seq $CMA_ALLOC_SPLIT`; do
		sudo bash -c "echo $CMA_ALLOC_PAGES > $CMA_DEBUGFS_AREA_DIR/alloc" &> /dev/null
		if [ "$?" == "0" ]; then
			CMA_ALLOC_SUCC=$(($CMA_ALLOC_SUCC+1))
		fi
	done
	T_ELAPSED=`time_elapsed $T_BEGIN`

	sleep 5
	echo "alloced: $CMA_ALLOC_SUCC"
	show_cma_info

	for j in `seq $CMA_ALLOC_SUCC`; do
		sudo bash -c "echo $CMA_ALLOC_PAGES > $CMA_DEBUGFS_AREA_DIR/free"
	done

	if [ "$SHOW_LATENCY" == "1" ]; then
		T_AVG=`time_avg $T_ELAPSED $CMA_ALLOC_SPLIT`
		echo "T_AVG: $T_AVG"
	fi
	sleep $CMA_ALLOC_DELAY
done

if [ "$BUILD_KERNEL_PID" != "0" ]; then
	kill $BUILD_KERNEL_PID
fi

if [ "$SHOW_CONSOLE" != "1" ]; then
	exec 1>&3 2>&4
fi
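For reference, the size arithmetic the script derives from the debugfs values can be sketched with illustrative numbers matching a 256 MB area like the one in my setup (the count and order_per_bit values below are assumptions for 4 KB pages, not read from debugfs):

```shell
# Same arithmetic as the script above, with hard-coded sample values:
# a 256 MB area with 4 KB pages reports 65536 bitmap bits (illustrative).
CMA_AREA_COUNT=65536        # debugfs 'count' (assumed)
CMA_AREA_ORDERPERBIT=0      # debugfs 'order_per_bit' (assumed)
CMA_ALLOC_SPLIT=32

CMA_AREA_PAGES=$(($CMA_AREA_COUNT * 2 ** $CMA_AREA_ORDERPERBIT))
CMA_ALLOC_PAGES=$(($CMA_AREA_PAGES / $CMA_ALLOC_SPLIT))
echo "$CMA_AREA_PAGES $CMA_ALLOC_PAGES"   # 65536 2048
```

So each of the 32 writes to the debugfs `alloc` file requests 2048 pages (8 MB), and together they try to fill the whole area.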