Documentation
Constants
This section is empty.
Variables
This section is empty.
Functions
func TakeByTopologyNUMADistributed
func TakeByTopologyNUMADistributed(topo *machine.CPUTopology, availableCPUs machine.CPUSet, numCPUs int, cpuGroupSize int) (machine.CPUSet, error)
TakeByTopologyNUMADistributed returns a CPUSet of size 'numCPUs'.
It generates this CPUSet by allocating CPUs from 'availableCPUs' according to the algorithm outlined in KEP-2902.
This algorithm evenly distributes CPUs across NUMA nodes in cases where more than one NUMA node is required to satisfy the allocation. This is in contrast to the takeByTopologyNUMAPacked algorithm, which attempts to 'pack' CPUs onto NUMA nodes and fill them up before moving on to the next one.
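The contrast between the two strategies can be illustrated with per-node CPU counts alone. The sketch below is not the package's implementation: packedCounts and evenCounts are hypothetical helpers that operate on a slice of free-CPU counts (one entry per NUMA node) instead of a machine.CPUSet, and evenCounts assumes the request divides evenly and every node has enough free CPUs.

```go
package main

import "fmt"

// packedCounts (hypothetical) drains each NUMA node in order before
// moving on to the next one, mirroring the "packed" strategy.
func packedCounts(free []int, numCPUs int) []int {
	counts := make([]int, len(free))
	for i, f := range free {
		take := f
		if numCPUs < take {
			take = numCPUs
		}
		counts[i] = take
		numCPUs -= take
	}
	return counts
}

// evenCounts (hypothetical) takes the same number of CPUs from every node.
// Assumption: numCPUs is divisible by len(free) and each node has enough
// free CPUs to supply its share.
func evenCounts(free []int, numCPUs int) []int {
	counts := make([]int, len(free))
	for i := range free {
		counts[i] = numCPUs / len(free)
	}
	return counts
}

func main() {
	free := []int{8, 8} // two NUMA nodes with 8 free CPUs each
	fmt.Println(packedCounts(free, 10)) // [8 2]
	fmt.Println(evenCounts(free, 10))   // [5 5]
}
```

For a request of 10 CPUs on two nodes with 8 free CPUs each, packing leaves one node empty and the other nearly untouched, while the even split leaves 3 free CPUs on both.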
At a high level, this algorithm can be summarized as:
For each single NUMA node:
- If all requested CPUs can be allocated from this NUMA node;
  --> Do the allocation by running takeByTopologyNUMAPacked() over the available CPUs in that NUMA node and return
Otherwise, for each pair of NUMA nodes:
- If the set of requested CPUs (modulo 2) can be evenly split across the 2 NUMA nodes; AND
- Any remaining CPUs (after the modulo operation) can be striped across some subset of the NUMA nodes;
  --> Do the allocation by running takeByTopologyNUMAPacked() over the available CPUs in both NUMA nodes and return
Otherwise, for each 3-tuple of NUMA nodes:
- If the set of requested CPUs (modulo 3) can be evenly distributed across the 3 NUMA nodes; AND
- Any remaining CPUs (after the modulo operation) can be striped across some subset of the NUMA nodes;
  --> Do the allocation by running takeByTopologyNUMAPacked() over the available CPUs in all three NUMA nodes and return
...
Otherwise, for the set of all NUMA nodes:
- If the set of requested CPUs (modulo NUM_NUMA_NODES) can be evenly distributed across all NUMA nodes; AND
- Any remaining CPUs (after the modulo operation) can be striped across some subset of the NUMA nodes;
  --> Do the allocation by running takeByTopologyNUMAPacked() over the available CPUs in all NUMA nodes and return
If none of the above conditions can be met, then fall back to a best-effort fit of packing CPUs into NUMA nodes by calling takeByTopologyNUMAPacked() over all available CPUs.
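The search over combinations of 1, 2, ..., N NUMA nodes can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's code: distributeCounts is a hypothetical helper that works on free-CPU counts per node, treats cpuGroupSize as 1, returns the first feasible combination instead of using the balance score, and returns nil where the real function would fall back to takeByTopologyNUMAPacked().

```go
package main

import (
	"fmt"
	"math/bits"
)

// distributeCounts (hypothetical) returns how many CPUs to take from each
// NUMA node, trying combinations of increasing size. It returns nil when no
// combination allows an even distribution.
func distributeCounts(free []int, numCPUs int) []int {
	n := len(free)
	for k := 1; k <= n; k++ {
		share := numCPUs / k     // CPUs every node in the combination supplies
		remainder := numCPUs % k // CPUs left over after the even split
		// Enumerate every subset of nodes (as a bitmask) that has exactly k members.
		for mask := 1; mask < 1<<n; mask++ {
			if bits.OnesCount(uint(mask)) != k {
				continue
			}
			// Every selected node must cover its share; count the nodes
			// that could also absorb one remainder CPU.
			ok, spare := true, 0
			for i := 0; i < n; i++ {
				if mask&(1<<i) == 0 {
					continue
				}
				if free[i] < share {
					ok = false
					break
				}
				if free[i] > share {
					spare++
				}
			}
			if !ok || spare < remainder {
				continue
			}
			// Assign the even share, then stripe the remainder one CPU at a time.
			counts := make([]int, n)
			left := remainder
			for i := 0; i < n; i++ {
				if mask&(1<<i) == 0 {
					continue
				}
				counts[i] = share
				if left > 0 && free[i] > share {
					counts[i]++
					left--
				}
			}
			return counts
		}
	}
	return nil
}

func main() {
	fmt.Println(distributeCounts([]int{8, 8}, 6))     // fits on one node
	fmt.Println(distributeCounts([]int{8, 8}, 10))    // split across the pair
	fmt.Println(distributeCounts([]int{4, 4, 4}, 11)) // even share plus remainder
}
```

Because smaller combinations are tried first, a request that fits on a single node never spans two, which matches the ordering of the steps above.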
NOTE: A "balance score" will be calculated to help find the best subset of NUMA nodes to allocate any 'remainder' CPUs from (in cases where the total number of CPUs to allocate cannot be evenly distributed across the chosen set of NUMA nodes). This "balance score" is calculated as the standard deviation of how many CPUs will be available on each NUMA node after all evenly distributed and remainder CPUs are allocated. The subset with the lowest "balance score" will receive the CPUs in order to keep the overall allocation of CPUs as "balanced" as possible.
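As a rough illustration of the "balance score", the sketch below computes the standard deviation of the CPUs left free on each NUMA node after a candidate allocation. The helper name balanceScore and the use of the population (rather than sample) standard deviation are assumptions made for this example; the source only states that a standard deviation is used.

```go
package main

import (
	"fmt"
	"math"
)

// balanceScore (hypothetical) returns the population standard deviation of
// the CPUs that remain free on each node after taking counts[i] from free[i].
// Assumption: free and counts have the same, non-zero length.
func balanceScore(free, counts []int) float64 {
	var sum float64
	for i := range free {
		sum += float64(free[i] - counts[i])
	}
	mean := sum / float64(len(free))

	var squares float64
	for i := range free {
		d := float64(free[i]-counts[i]) - mean
		squares += d * d
	}
	return math.Sqrt(squares / float64(len(free)))
}

func main() {
	free := []int{6, 4, 4}
	// 7 CPUs over 3 nodes: 2 from each node, plus 1 remainder CPU to place.
	onFirst := balanceScore(free, []int{3, 2, 2})  // leaves [3 2 2] free
	onSecond := balanceScore(free, []int{2, 3, 2}) // leaves [4 1 2] free
	fmt.Println(onFirst < onSecond)                // true
}
```

Placing the remainder CPU on the node with the most free CPUs yields the lower score, so that candidate would be preferred.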
NOTE: This algorithm has been generalized to take an additional 'cpuGroupSize' parameter to ensure that CPUs are always allocated in groups of size 'cpuGroupSize' according to the algorithm described above. This is important, for example, to ensure that all CPUs (i.e. all hyperthreads) from a single core are allocated together.
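One way to picture the effect of 'cpuGroupSize' is to round each node's even share down to a multiple of the group size and treat whatever is left as the remainder to be striped in whole groups. The sketch below does exactly that; groupedShare is a hypothetical helper, and rejecting requests that are not a multiple of 'cpuGroupSize' is an assumption of this example, not something the text above states.

```go
package main

import (
	"errors"
	"fmt"
)

// groupedShare (hypothetical) splits numCPUs across numNodes so that every
// node's share is a multiple of cpuGroupSize. It returns the per-node share
// and the remainder still to be placed in whole groups.
func groupedShare(numCPUs, numNodes, cpuGroupSize int) (share, remainder int, err error) {
	if numNodes <= 0 || cpuGroupSize <= 0 {
		return 0, 0, errors.New("numNodes and cpuGroupSize must be positive")
	}
	// Assumption for this sketch: a request that cannot be built from
	// whole groups is rejected up front.
	if numCPUs%cpuGroupSize != 0 {
		return 0, 0, errors.New("numCPUs must be a multiple of cpuGroupSize")
	}
	share = (numCPUs / numNodes / cpuGroupSize) * cpuGroupSize
	remainder = numCPUs - share*numNodes
	return share, remainder, nil
}

func main() {
	// 10 CPUs over 2 nodes with 2 hyperthreads per core:
	// 4 CPUs (2 full cores) per node, and 1 full core left to place.
	share, remainder, err := groupedShare(10, 2, 2)
	fmt.Println(share, remainder, err) // 4 2 <nil>
}
```

With a group size of 2, the remainder is always a whole number of cores, so no core's hyperthreads are split between allocations.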