Just looked through the pseudo files, and
/proc/zoneinfo has some interesting information.
There are two zones in my current system:
normal. We also have 2 nodes, and thus the file is shown as:
- Node0 DMA
- Node0 DMA32
- Node0 normal
- Node1 normal
Each section reports how many
present pages there are. At the end of the section there is the
start_pfn, the page frame number of the first page in the zone; multiplying it by PAGE_SIZE gives the physical address of the beginning of the zone.
Thus we can approximate the physical address space of our system by using the
PAGE_SIZE and also using the number of pages reported for each zone.
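As a rough sketch of that calculation, the snippet below parses zoneinfo-style text and turns start_pfn and a page count into a byte range. The SAMPLE text is made up for illustration (it is not output from my machine), PAGE_SIZE is assumed to be 4 KiB, and the helper names parse_zoneinfo and zone_range are hypothetical. I use the spanned count for the end of the range because it includes holes; the present count would give the amount of memory actually backed by RAM instead.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages; os.sysconf("SC_PAGE_SIZE") gives the real value

# Illustrative sample only; real /proc/zoneinfo sections carry many more fields.
SAMPLE = """\
Node 0, zone      DMA
  pages free     3840
        spanned  4095
        present  3998
  start_pfn:           1
Node 0, zone    DMA32
  pages free     1000
        spanned  1044480
        present  520000
  start_pfn:           4096
"""

ZONE_HEADER = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")


def parse_zoneinfo(text):
    """Return one dict per zone with node, zone, spanned, present, start_pfn."""
    zones = []
    current = None
    for line in text.splitlines():
        header = ZONE_HEADER.match(line)
        if header:
            current = {"node": int(header.group(1)), "zone": header.group(2)}
            zones.append(current)
            continue
        if current is None:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # skip lines such as "pages free 3840"
        if fields[0] in ("spanned", "present"):
            current[fields[0]] = int(fields[1])
        elif fields[0] == "start_pfn:":
            current["start_pfn"] = int(fields[1])
    return zones


def zone_range(zone, page_size=PAGE_SIZE):
    """Approximate [start, end) physical byte range covered by a zone."""
    start = zone["start_pfn"] * page_size
    end = start + zone["spanned"] * page_size
    return start, end


if __name__ == "__main__":
    for z in parse_zoneinfo(SAMPLE):
        start, end = zone_range(z)
        print(f"Node{z['node']} {z['zone']}: {start:#x} - {end:#x}")
```

On a real system the same functions could be fed the contents of /proc/zoneinfo instead of SAMPLE.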
Scope of Memory Policies
System Default Policy
Hard coded into the kernel. On system boot it uses
interleaved. After bootup it uses local allocation.
Per task policy. Task policies are inherited by child processes, so applications like
numactl use this property to propagate the task policy to the child process.
In a multi-threaded process, only the thread that calls the
MEMORY_POLICY_APIS sets its memory policy. All other existing threads retain their prior policy.
The policy only affects memory allocated after the policy is set. Allocations made before the change are not affected.
VMA policy only applies to
anonymous pages. File-mapped VMAs ignore the VMA policy if the mapping is
MAP_SHARED. If it is
MAP_PRIVATE, the VMA policy is only enforced on a write to the mapping (CoW).
VMA policies are shared by threads of the same address space. VMA policies do not persist across
exec() calls (as the address space is wiped).
Shared policy is similar to VMA policy, but it is shared among processes. There are more details, but I skipped them as they are irrelevant to my work.
Specifies that the current scope does not follow a policy of its own and falls back to the next larger scope's policy. At the root it follows the system policy.
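The fallback between scopes can be pictured with a tiny model. This is only a sketch of the lookup order as I understand it from these notes, not how the kernel implements it: the function name effective_policy, the policy strings, and the use of None to mean "no policy set at this scope" are all assumptions made for illustration, and shared policy is left out.

```python
SYSTEM_DEFAULT = "local"  # assumption: the running system default is local allocation


def effective_policy(vma_policy=None, task_policy=None, system_policy=SYSTEM_DEFAULT):
    """Return the policy of the narrowest scope that sets one.

    None stands for "default": the scope has no policy of its own,
    so the lookup falls through to the next larger scope.
    """
    for policy in (vma_policy, task_policy):
        if policy is not None:
            return policy
    return system_policy


if __name__ == "__main__":
    print(effective_policy())                                    # nothing set anywhere
    print(effective_policy(task_policy="interleave"))            # task policy wins
    print(effective_policy(vma_policy="bind", task_policy="interleave"))  # VMA policy wins
```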
Memory is allocated only from the nodes specified by the policy. Proximity is considered first: if the node closest to the allocation requestor has enough free space, the allocation is granted from it.
Allocation will be attempted from the preferred (single) node. If it fails, the other nodes will be searched for free space in nearest-first fashion.
Local allocation is a preferred mode where the node that initiates a page fault is the preferred node.
For anonymous/shared pages: the node set is indexed using the page offset into the VMA (offset % node_nums). A page is requested from the indexed node as in preferred mode, and if that fails, the search continues in the same way as preferred mode.
Page cache pages: a node counter that wraps around is used to spread pages out among the specified nodes.
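The two interleave variants above can be sketched in a few lines. This is a simplified model, not kernel code: the names interleave_node_for_offset and PageCacheInterleaver are hypothetical, node IDs are plain integers, and the fallback to other nodes when the chosen node is full is not modelled.

```python
def interleave_node_for_offset(page_offset, nodes):
    """Anonymous/shared pages: index the node set by the page offset into the VMA."""
    return nodes[page_offset % len(nodes)]


class PageCacheInterleaver:
    """Page cache pages: a wrapping counter spreads allocations across the nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.counter = 0

    def next_node(self):
        node = self.nodes[self.counter % len(self.nodes)]
        self.counter += 1
        return node


if __name__ == "__main__":
    nodes = [0, 1]
    # Consecutive page offsets alternate between the two nodes.
    print([interleave_node_for_offset(off, nodes) for off in range(5)])
    rr = PageCacheInterleaver(nodes)
    print([rr.next_node() for _ in range(5)])
```

The offset-based variant is deterministic for a given page, so the same offset always maps to the same node, while the counter-based variant depends only on allocation order.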
Reference: What is Linux Memory Policy