NUMA (non-uniform memory access) is a technology that allows system memory to be divided into zones, also called nodes. Each NUMA node is assigned to a particular CPU or socket. With the traditional monolithic memory approach, every CPU or core accesses all memory regardless of where it is located, which usually results in higher latency. In contrast, NUMA-bound processes can access memory that is local to the CPU they are running on. In most cases, this local access is much faster than access to memory attached to remote CPUs in the system.
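To make the node-to-CPU assignment concrete, the following minimal sketch lists which CPUs belong to which NUMA node on a Linux system. It assumes that sysfs is mounted and exposes the usual /sys/devices/system/node/nodeN/cpulist files. The helper names parse_cpulist and node_to_cpus are illustrative only and are not part of any existing tool.

```python
from pathlib import Path

# Assumed location of the per-node sysfs directories on Linux.
NODE_ROOT = Path("/sys/devices/system/node")


def parse_cpulist(text):
    """Expand a kernel CPU list such as "0-3,8-11" into a list of CPU numbers."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list means the node has no CPUs (memory only)
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def node_to_cpus(root=NODE_ROOT):
    """Return {node_id: [cpu, ...]} for every nodeN directory under root."""
    mapping = {}
    if not root.is_dir():
        return mapping  # not Linux, or sysfs is not available
    for node_dir in root.glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            node_id = int(node_dir.name[4:])  # strip the "node" prefix
            mapping[node_id] = parse_cpulist(cpulist.read_text())
    return mapping


if __name__ == "__main__":
    for node, cpus in sorted(node_to_cpus().items()):
        print(f"node {node}: CPUs {cpus}")
```

On a machine with a single node, the script prints one line that covers all CPUs. On a system where the node directory is absent, it prints nothing.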
Automatic NUMA balancing
The main aim of automatic NUMA balancing is to improve the performance of applications running on a NUMA-aware system. The strategy behind its design is simple: an application generally performs best when the threads of its processes access memory on the same NUMA node on which the kernel has scheduled them. Automatic NUMA balancing moves tasks (threads or processes) closer to the memory they are accessing. It also moves application data to memory closer to the tasks that reference it. The kernel does all of this automatically when automatic NUMA balancing is active. Automatic NUMA balancing is enabled when the system boots on hardware with NUMA properties. The main conditions or criteria are: