NUMA Linux Memory Allocation

Scope of Memory Policies

System Default Policy

Hard coded into the kernel. On system boot it uses interleaved. After bootup it uses local allocation.

Task/Process Policy

Per task policy. Task policies are inherited to child processes. Thus applications like numactl uses this property to propogate the task policy to the child process.

In multi-threaded situation where other threads exist only the thread that calls the MEMORY_POLICY_APIS will set its memory policy. All other exisitng threads will retain the prior policy.

The policy only affects memory allocation after the time the policy is set. All allocations before the change are not affected.

VMA Policy

Only applies to anonymous pages. File mapped VMAs will ignore the VMA policy if it is set to MAP_SHARED. If it is MAP_PRIVATE VMA policy will only be enforced on a write to the mapping (CoW).

VMA policies are shared by threads of the same address space. VMA policies do not persist accross exec() calls (as the address pace is wiped)

Shared Policy

Similar to VMA policy but it is shared among processes. Some more details, but skipped due to irrelevance to my work.

Memory Policies

Default Mode

Specifies the current scope does not follow a policy, fall back to larger scope’s policy. At the root it’ll follow the system policy.

Bind Mode

Memory allocated from the nodes specified by the policy. Proximity is considered first, and if enough free space exists for the closest memory node (to the allocation requestor) it’ll be granted

Preferred Mode

Allocation will be attempted from the preferred (single) node. If it fails, the nodes will be searched for free space in nearest first fashion.
Local allocation is a preferred mode where the node that initiates a page fault is the preferred node.

Interleaved Mode

For Anonymous/shared pages: Node set is indexed using the page offset to the VMA. (Address % node_nums). The indexed node is requested for a page as in Preferered mode. And if it fails follows preferred mode style.

Page cache pages: A node counter is used to wrap around and try to spread out pages among the nodes (that are specified).


Reference: What is Linux Memory Policy



Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s