On 2019-01-18 7:10 a.m., Marc Gonzalez wrote:
> Hello,
>
> I'm running into an issue which I don't know how to debug, so I'm open
> to ideas and suggestions :-)
>
> On my arm64 board, I have enabled Universal Flash Storage support.
> I wanted to benchmark read performance, and noticed that the system
> locks up when I read partitions larger than 3.5 GB, unless I tell dd
> to use direct IO:
Marc,

If you want to benchmark (or torture) UFS read performance, I have many
dd variants. The sg3_utils package contains sgp_dd (which uses POSIX
threads for a multi-threaded copy or read) and sgm_dd (which uses
mmap-ed IO). There is also the multi-platform ddpt, in a package of the
same name.

One major difference between my dd variants and the "standard" dd is
that I split the bs=BS option in two: bs=BS, where BS is the logical
block size of the given device, and bpt=BPT (blocks per transfer),
the number of logical blocks in each copy (or read) segment. So with
your example below, bs=1M would become, for a logical block size of
4096 bytes, 'bs=4096 bpt=256'.

Also, sgp_dd and sgm_dd don't support status=progress (ddpt does), but
you can always send 'kill -s USR1 <pid_of_dd>' from another (virtual)
console that has root permissions. All my dd variants, and dd itself,
accept that signal gracefully, print a progress report and continue.

Doug Gilbert
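For instance, assuming the device's logical block size really is 4096
bytes, the 'bs=1M' run quoted below would translate to roughly:

  # sgp_dd if=/dev/sda of=/dev/null bs=4096 bpt=256
  # ddpt if=/dev/sda of=/dev/null bs=4096 bpt=256 status=progress

and a progress report can be requested from another root console with
something like:

  # kill -s USR1 $(pidof sgp_dd)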
> *** WITH O_DIRECT ***
>
> # dd if=/dev/sda of=/dev/null bs=1M iflag=direct status=progress
> 57892929536 bytes (58 GB, 54 GiB) copied, 697.006 s, 83.1 MB/s
> 55256+0 records in
> 55256+0 records out
> 57940115456 bytes (58 GB, 54 GiB) copied, 697.575 s, 83.1 MB/s
>
> *** WITHOUT O_DIRECT ***
>
> # dd if=/dev/sda of=/dev/null bs=1M status=progress
> 3853516800 bytes (3.9 GB, 3.6 GiB) copied, 49.0002 s, 78.6 MB/s
>
> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> rcu:     1-...0: (8242 ticks this GP) idle=106/1/0x4000000000000000 softirq=168/171 fqs=2626
> rcu:     6-...0: (99 GPs behind) idle=ec2/1/0x4000000000000000 softirq=71/71 fqs=2626
> rcu:     (detected by 7, t=5254 jiffies, g=-275, q=2)
> Task dump for CPU 1:
> kworker/1:1H    R  running task        0   675      2 0x0000002a
> Workqueue: kblockd blk_mq_run_work_fn
> Call trace:
>  __switch_to+0x168/0x1d0
>  0xffffffc0f6efbbc8
>  blk_mq_run_work_fn+0x28/0x40
>  process_one_work+0x208/0x470
>  worker_thread+0x48/0x460
>  kthread+0x128/0x130
>  ret_from_fork+0x10/0x1c
> Task dump for CPU 6:
> kthreadd        R  running task        0     2      0 0x0000002a
> Call trace:
>  __switch_to+0x168/0x1d0
>  0x5b36396f4e7d4000
> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
> rcu:     1-...0: (8242 ticks this GP) idle=106/1/0x4000000000000000 softirq=168/171 fqs=10500
> rcu:     6-...0: (99 GPs behind) idle=ec2/1/0x4000000000000000 softirq=71/71 fqs=10500
> rcu:     (detected by 7, t=21009 jiffies, g=-275, q=2)
> Task dump for CPU 1:
> kworker/1:1H    R  running task        0   675      2 0x0000002a
> Workqueue: kblockd blk_mq_run_work_fn
> Call trace:
>  __switch_to+0x168/0x1d0
>  0xffffffc0f6efbbc8
>  blk_mq_run_work_fn+0x28/0x40
>  process_one_work+0x208/0x470
>  worker_thread+0x48/0x460
>  kthread+0x128/0x130
>  ret_from_fork+0x10/0x1c
> Task dump for CPU 6:
> kthreadd        R  running task        0     2      0 0x0000002a
> Call trace:
>  __switch_to+0x168/0x1d0
>  0x5b36396f4e7d4000
>
> The system always hangs around the 3.6 GiB mark, wherever I start from.
>
> How can I debug this issue?
>
> Regards.