.. _pagemap:

=============================
Examining Process Page Tables
=============================

pagemap is a set of interfaces in the kernel (available since 2.6.25) that
allows userspace programs to examine the page tables and related information
by reading files in ``/proc``.

There are four components to pagemap:

 * ``/proc/pid/pagemap``.  This file lets a userspace process find out which
   physical frame each virtual page is mapped to.  It contains one 64-bit
   value for each virtual page, containing the following data (from
   ``fs/proc/task_mmu.c``, above pagemap_read):

    * Bits 0-54  page frame number (PFN) if present
    * Bits 0-4   swap type if swapped
    * Bits 5-54  swap offset if swapped
    * Bit  55    pte is soft-dirty (see
      :ref:`Documentation/admin-guide/mm/soft-dirty.rst <soft_dirty>`)
    * Bit  56    page exclusively mapped (since 4.2)
    * Bit  57    pte is uffd-wp write-protected (since 5.13) (see
      :ref:`Documentation/admin-guide/mm/userfaultfd.rst <userfaultfd>`)
    * Bits 58-60 zero
    * Bit  61    page is file-page or shared-anon (since 3.5)
    * Bit  62    page swapped
    * Bit  63    page present

   Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
   In 4.0 and 4.1 opens by unprivileged users fail with -EPERM.  Starting
   from 4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
   Reason: information about PFNs helps in exploiting the Rowhammer
   vulnerability.

   If the page is not present but in swap, then the PFN contains an
   encoding of the swap file number and the page's offset into the
   swap.  Unmapped pages return a null PFN.  This allows determining
   precisely which pages are mapped (or in swap) and comparing mapped
   pages between processes.

   Efficient users of this interface will use ``/proc/pid/maps`` to
   determine which areas of memory are actually mapped and llseek to
   skip over unmapped regions.

 * ``/proc/kpagecount``.  This file contains a 64-bit count of the number of
   times each page is mapped, indexed by PFN.

The page-types tool in the tools/vm directory can be used to query the
number of times a page is mapped.

 * ``/proc/kpageflags``.  This file contains a 64-bit set of flags for each
   page, indexed by PFN.

   The flags are (from ``fs/proc/page.c``, above kpageflags_read):

    0. LOCKED
    1. ERROR
    2. REFERENCED
    3. UPTODATE
    4. DIRTY
    5. LRU
    6. ACTIVE
    7. SLAB
    8. WRITEBACK
    9. RECLAIM
    10. BUDDY
    11. MMAP
    12. ANON
    13. SWAPCACHE
    14. SWAPBACKED
    15. COMPOUND_HEAD
    16. COMPOUND_TAIL
    17. HUGE
    18. UNEVICTABLE
    19. HWPOISON
    20. NOPAGE
    21. KSM
    22. THP
    23. OFFLINE
    24. ZERO_PAGE
    25. IDLE
    26. PGTABLE

 * ``/proc/kpagecgroup``.  This file contains a 64-bit inode number of the
   memory cgroup each page is charged to, indexed by PFN.  Only available when
   CONFIG_MEMCG is set.

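The bit layout of a ``/proc/pid/pagemap`` entry can be illustrated with a
small decoder.  This is only a sketch in Python: the function name
``decode_pagemap_entry`` and the dictionary keys are made up for
illustration, and the field positions are taken directly from the list of
bits given for ``/proc/pid/pagemap``.

```python
def decode_pagemap_entry(entry):
    """Split one 64-bit pagemap value into its documented fields.

    Hypothetical helper; the key names are illustrative only.
    """
    present = bool((entry >> 63) & 1)      # bit 63: page present
    swapped = bool((entry >> 62) & 1)      # bit 62: page swapped
    low55 = entry & ((1 << 55) - 1)        # bits 0-54: PFN or swap info

    fields = {
        "present": present,
        "swapped": swapped,
        "file_or_shared_anon": bool((entry >> 61) & 1),  # bit 61
        "uffd_wp": bool((entry >> 57) & 1),              # bit 57
        "exclusive": bool((entry >> 56) & 1),            # bit 56
        "soft_dirty": bool((entry >> 55) & 1),           # bit 55
        "pfn": None,
        "swap_type": None,
        "swap_offset": None,
    }
    if present:
        # Reads as 0 when the reader lacks CAP_SYS_ADMIN (since 4.2).
        fields["pfn"] = low55
    elif swapped:
        fields["swap_type"] = low55 & 0x1F   # bits 0-4
        fields["swap_offset"] = low55 >> 5   # bits 5-54
    return fields
```

A present entry yields a ``pfn``; a swapped entry yields ``swap_type`` and
``swap_offset`` instead, since both interpretations share bits 0-54.
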
Short descriptions of the page flags
====================================

0 - LOCKED
   page is being locked for exclusive access, e.g. by undergoing read/write IO
7 - SLAB
   page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator.
   When a compound page is used, SLUB/SLQB will only set this flag on the head
   page; SLOB will not flag it at all.
10 - BUDDY
   a free memory block managed by the buddy system allocator.
   The buddy system organizes free memory in blocks of various orders.
   An order N block has 2^N physically contiguous pages, with the BUDDY flag
   set for, and only for, the first page.
15 - COMPOUND_HEAD
   A compound page with order N consists of 2^N physically contiguous pages.
   A compound page with order 2 takes the form of "HTTT", where H denotes its
   head page and T denotes its tail page(s).  The major consumers of compound
   pages are hugeTLB pages
   (:ref:`Documentation/admin-guide/mm/hugetlbpage.rst <hugetlbpage>`),
   the SLUB etc. memory allocators and various device drivers.
   However in this interface, only huge/giga pages are made visible
   to end users.
16 - COMPOUND_TAIL
   A compound page tail (see description above).
17 - HUGE
   this is an integral part of a HugeTLB page
19 - HWPOISON
   hardware detected memory corruption on this page: don't touch the data!
20 - NOPAGE
   no page frame exists at the requested address
21 - KSM
   identical memory pages dynamically shared between one or more processes
22 - THP
   contiguous pages which make up transparent hugepages
23 - OFFLINE
   page is logically offline
24 - ZERO_PAGE
   zero page for pfn_zero or huge_zero page
25 - IDLE
   page has not been accessed since it was marked idle (see
   :ref:`Documentation/admin-guide/mm/idle_page_tracking.rst <idle_page_tracking>`).
   Note that this flag may be stale in case the page was accessed via
   a PTE.  To make sure the flag is up-to-date one has to read
   ``/sys/kernel/mm/page_idle/bitmap`` first.
26 - PGTABLE
   page is in use as a page table

IO related page flags
---------------------

1 - ERROR
   IO error occurred
3 - UPTODATE
   page has up-to-date data,
   i.e. for file backed page: (in-memory data revision >= on-disk one)
4 - DIRTY
   page has been written to, hence contains new data,
   i.e. for file backed page: (in-memory data revision > on-disk one)
8 - WRITEBACK
   page is being synced to disk

LRU related page flags
----------------------

5 - LRU
   page is in one of the LRU lists
6 - ACTIVE
   page is in the active LRU list
18 - UNEVICTABLE
   page is in the unevictable (non-)LRU list.  It is somehow pinned and
   not a candidate for LRU page reclaims, e.g. ramfs pages,
   shmctl(SHM_LOCK) and mlock() memory segments
2 - REFERENCED
   page has been referenced since last LRU list enqueue/requeue
9 - RECLAIM
   page will be reclaimed soon after its pageout IO completed
11 - MMAP
   a memory mapped page
12 - ANON
   a memory mapped page that is not part of a file
13 - SWAPCACHE
   page is mapped to swap space, i.e. has an associated swap entry
14 - SWAPBACKED
   page is backed by swap/RAM

The page-types tool in the tools/vm directory can be used to query the
above flags.

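To make the bit numbering concrete, a ``/proc/kpageflags`` value can be
turned into flag names with a table indexed by bit number.  This is a
minimal Python sketch: ``KPF_NAMES`` and ``decode_kpageflags`` are
hypothetical names chosen for illustration, and the table simply repeats the
numbered flag list from this document.

```python
# Index in this list == bit number in the 64-bit kpageflags value.
KPF_NAMES = [
    "LOCKED", "ERROR", "REFERENCED", "UPTODATE", "DIRTY", "LRU", "ACTIVE",
    "SLAB", "WRITEBACK", "RECLAIM", "BUDDY", "MMAP", "ANON", "SWAPCACHE",
    "SWAPBACKED", "COMPOUND_HEAD", "COMPOUND_TAIL", "HUGE", "UNEVICTABLE",
    "HWPOISON", "NOPAGE", "KSM", "THP", "OFFLINE", "ZERO_PAGE", "IDLE",
    "PGTABLE",
]


def decode_kpageflags(flags):
    """Return the names of all documented flags set in a kpageflags value."""
    return [name for bit, name in enumerate(KPF_NAMES) if (flags >> bit) & 1]
```

For instance, a value with bits 5, 12 and 14 set decodes to ``LRU``,
``ANON`` and ``SWAPBACKED``.  Bits above 26 are ignored by this sketch.
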
Using pagemap to do something useful
====================================

The general procedure for using pagemap to find out about a process' memory
usage goes like this:

 1. Read ``/proc/pid/maps`` to determine which parts of the memory space are
    mapped to what.
 2. Select the maps you are interested in -- all of them, or a particular
    library, or the stack or the heap, etc.
 3. Open ``/proc/pid/pagemap`` and seek to the pages you would like to examine.
 4. Read a u64 for each page from pagemap.
 5. Open ``/proc/kpagecount`` and/or ``/proc/kpageflags``.  For each PFN you
    just read, seek to that entry in the file, and read the data you want.

For example, to find the "unique set size" (USS), which is the amount of
memory that a process is using that is not shared with any other process,
you can go through every map in the process, find the PFNs, look those up
in kpagecount, and tally up the number of pages that are only referenced
once.

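The USS tally for a single mapping might be sketched as follows.  This is an
illustration under several assumptions, not a reference implementation: the
helper names ``read_u64s`` and ``count_unique_pages`` are invented here, the
page size is assumed to be 4096 bytes unless passed in, entries are assumed
to be in native byte order, and the two file arguments are any seekable
binary file objects (for a real run, ``/proc/pid/pagemap`` and
``/proc/kpagecount`` opened in binary mode by a process with CAP_SYS_ADMIN).

```python
import struct

ENTRY_SIZE = 8                 # every entry in these files is a u64
PFN_MASK = (1 << 55) - 1       # bits 0-54 of a pagemap entry
PRESENT = 1 << 63              # bit 63 of a pagemap entry


def read_u64s(f, index, count):
    """Seek to entry `index` and read up to `count` 64-bit values."""
    f.seek(index * ENTRY_SIZE)
    data = f.read(count * ENTRY_SIZE)
    n = len(data) // ENTRY_SIZE            # tolerate a short read
    return struct.unpack("=%dQ" % n, data[: n * ENTRY_SIZE])


def count_unique_pages(pagemap, kpagecount, start, end, page_size=4096):
    """Count present pages in [start, end) that are mapped exactly once."""
    first_page = start // page_size
    npages = (end - start) // page_size
    unique = 0
    for entry in read_u64s(pagemap, first_page, npages):
        if not entry & PRESENT:
            continue                       # swapped out or not mapped
        pfn = entry & PFN_MASK
        if pfn == 0:
            continue                       # PFN hidden without CAP_SYS_ADMIN
        counts = read_u64s(kpagecount, pfn, 1)
        if counts and counts[0] == 1:
            unique += 1
    return unique
```

Calling ``count_unique_pages`` once per address range taken from
``/proc/pid/maps`` and multiplying the sum by the page size gives an
estimate of the USS in bytes.
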
Other notes
===========

Reading from any of the files will return -EINVAL if you are not starting
the read on an 8-byte boundary (e.g., if you sought an odd number of bytes
into the file), or if the size of the read is not a multiple of 8 bytes.

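One way to stay within the 8-byte rule is to widen any requested byte range
to whole entries before reading.  The helper below is a hypothetical Python
sketch (the name ``aligned_span`` is not part of any kernel or library
interface); it only does the arithmetic and performs no I/O.

```python
def aligned_span(offset, length, entry_size=8):
    """Widen (offset, length) so both are multiples of entry_size."""
    start = offset - offset % entry_size   # round start down
    end = offset + length
    end += -end % entry_size               # round end up
    return start, end - start
```

A request for 5 bytes at offset 13 becomes a 16-byte read at offset 8, which
covers the two entries that the original range touches.
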
Before Linux 3.11 pagemap bits 55-60 were used for "page-shift" (which is
always 12 on most architectures).  Since Linux 3.11 their meaning changes
after the first clear of soft-dirty bits.  Since Linux 4.2 they are used for
flags unconditionally.