Re: cpus_allowed per thread behavior

On 2014-02-26 17:12, Elliott, Robert (Server Storage) wrote:
> -----Original Message-----
> From: Jens Axboe [mailto:axboe@xxxxxxxxx]
> Sent: Wednesday, 26 February, 2014 6:08 PM
> To: Elliott, Robert (Server Storage); fio@xxxxxxxxxxxxxxx
> Subject: Re: cpus_allowed per thread behavior
>
> On 2014-02-26 15:54, Elliott, Robert (Server Storage) wrote:
>> fio seems to assign the same cpus_allowed/cpumask value to all threads.
>> I think this allows the OS to move the threads around those CPUs.
>
> Correct. As long as the number of cpus in the mask is equal to (or
> larger than) the number of jobs within that group, the OS is free to
> place them wherever it wants. In practice, unless the CPU scheduling is
> horribly broken, they tend to "stick" for most intents and purposes.

>> In comparison, iometer assigns its worker threads to specific CPUs
>> within the cpumask in a round-robin manner.  Would that be worth adding
>> to fio, perhaps with an option like cpus_allowed_policy=roundrobin?

> Sure, we could add that feature. You can get the same setup now, if you
> "unroll" the job section, but that might not always be practical. How
> about cpus_allowed_policy, with 'shared' being the existing (and
> default) behavior and 'split' being each thread grabbing one of the CPUs?
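
For illustration, a sketch of how that might look in a job file, assuming
the proposed option name and values stick (the filename and size are just
placeholders):

	; minimal sketch of the proposed option
	[pinned-workers]
	rw=randread
	; placeholder test file, not from the original mail
	filename=/tmp/fio.testfile
	size=64m
	numjobs=4
	cpus_allowed=0-3
	; 'split': each of the 4 jobs grabs one CPU out of 0-3
	; 'shared' (the default) keeps the existing behavior
	cpus_allowed_policy=split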

> Perhaps NUMA and hyperthreading aware allocation policies would
> also be useful?
>
> I don't know how consistent hyperthread CPU numbering is across
> systems.  On some servers I've tried, Linux assigns 0-5 to the main
> cores and 6-11 to the hyperthreaded siblings, while Windows assigns
> 0,2,4,6,8,10 to the main cores and 1,3,5,7,9,11 to their
> hyperthreaded siblings.

Linux follows the firmware on that, at least as far as I know. I've seen machines renumber after a firmware update, going from the second scheme you list to the first. But for the options below, we can't assume either numbering, and some machines also have more than 2 threads per core. So the topology would have to be queried.
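
Linux does expose that in sysfs, so the query itself is straightforward. A minimal standalone C sketch (not fio code; Linux-only, most error handling omitted) that dumps each CPU's hyperthread siblings:

	#include <stdio.h>

	int main(void)
	{
		char path[128], buf[64];
		int cpu;

		/* walk cpu0, cpu1, ... until a topology file is missing;
		   a real implementation would also handle offline CPUs */
		for (cpu = 0; ; cpu++) {
			FILE *f;

			snprintf(path, sizeof(path),
				 "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
				 cpu);
			f = fopen(path, "r");
			if (!f)
				break;
			/* prints e.g. "cpu0 siblings: 0,6" on the 0-5/6-11 scheme */
			if (fgets(buf, sizeof(buf), f))
				printf("cpu%d siblings: %s", cpu, buf);
			fclose(f);
		}
		return 0;
	}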

> Intel's OpenMP library offers two thread affinity types that might
> be worth simulating:
>
> COMPACT: pack them tightly
> 	foreach (node)
> 		foreach (core in the node)
> 			foreach (hyperthreaded sibling)
>
> SCATTER: spread across all the cores
> 	foreach (hyperthreaded sibling)
> 		foreach (core sharing a node)
> 			foreach (node)

> We could try:
> 	cpus_allowed_policy=shared
> 	cpus_allowed_policy=split   (round-robin; don't care how the CPU IDs were assigned)
> 	cpus_allowed_policy=compact (NUMA/HT aware)
> 	cpus_allowed_policy=scatter (NUMA/HT aware)

That would definitely be useful, but also requires writing the code to understand the topology of the machine.
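
As a rough sketch of what that ordering code could look like once the sysfs values are in hand (standalone C, not fio code; the topology table is faked up for a 2-node, 2-cores-per-node, 2-way-HT box):

	#include <stdio.h>
	#include <stdlib.h>

	/* one entry per online CPU; in real code, package and core would
	   be read from /sys/devices/system/cpu/cpuN/topology/ */
	struct cpu_info {
		int cpu;     /* logical CPU number */
		int package; /* physical_package_id */
		int core;    /* core_id within the package */
		int rank;    /* which sibling of its core (0 = first thread) */
	};

	/* compact: node, then core, then sibling -- packs threads tightly */
	static int by_compact(const void *a, const void *b)
	{
		const struct cpu_info *x = a, *y = b;

		if (x->package != y->package)
			return x->package - y->package;
		if (x->core != y->core)
			return x->core - y->core;
		return x->cpu - y->cpu;
	}

	/* scatter: sibling rank, then core, then node -- consecutive jobs
	   hit different nodes first, hyperthread siblings are used last */
	static int by_scatter(const void *a, const void *b)
	{
		const struct cpu_info *x = a, *y = b;

		if (x->rank != y->rank)
			return x->rank - y->rank;
		if (x->core != y->core)
			return x->core - y->core;
		return x->package - y->package;
	}

	int main(void)
	{
		/* fake box using the "siblings after the main cores" scheme:
		   cpus 0-3 are main cores, 4-7 their HT siblings */
		struct cpu_info c[] = {
			{ 0, 0, 0 }, { 1, 0, 1 }, { 2, 1, 0 }, { 3, 1, 1 },
			{ 4, 0, 0 }, { 5, 0, 1 }, { 6, 1, 0 }, { 7, 1, 1 },
		};
		int i, n = sizeof(c) / sizeof(c[0]);

		qsort(c, n, sizeof(c[0]), by_compact);
		for (i = 0; i < n; i++)	/* rank siblings within each core */
			c[i].rank = (i && c[i].package == c[i - 1].package &&
				     c[i].core == c[i - 1].core) ?
					c[i - 1].rank + 1 : 0;

		printf("compact:");	/* 0 4 1 5 2 6 3 7 */
		for (i = 0; i < n; i++)
			printf(" %d", c[i].cpu);

		qsort(c, n, sizeof(c[0]), by_scatter);
		printf("\nscatter:");	/* 0 2 1 3 4 6 5 7 */
		for (i = 0; i < n; i++)
			printf(" %d", c[i].cpu);
		printf("\n");
		return 0;
	}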

--
Jens Axboe
