# To add a new book the only step required is to add the book to the
# list of DOCBOOKS.
- ifeq ($(IGNORE_DOCBOOKS),)
-
DOCBOOKS := z8530book.xml device-drivers.xml \
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
writing_usb_driver.xml networking.xml \
genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
80211.xml debugobjects.xml sh.xml regulator.xml \
alsa-driver-api.xml writing-an-alsa-driver.xml \
- tracepoint.xml gpu.xml media_api.xml w1.xml \
+ tracepoint.xml w1.xml \
writing_musb_glue_layer.xml crypto-API.xml iio.xml
-include Documentation/DocBook/media/Makefile
-
+ ifeq ($(DOCBOOKS),)
+
+ # Skip DocBook build if the user explicitly requested no DOCBOOKS.
+ .DEFAULT:
+ @echo " SKIP DocBook $@ target (DOCBOOKS=\"\" specified)."
+
+ else
+
###
# The build process is as follows (targets):
# (xmldocs) [by docproc]
HTML := $(sort $(patsubst %.xml, %.html, $(BOOKS)))
htmldocs: $(HTML)
$(call cmd,build_main_index)
- $(call install_media_images)
MAN := $(patsubst %.xml, %.9, $(BOOKS))
mandocs: $(MAN)
-e "s/>/\\>/g"; \
echo "</programlisting>") > $@
- else
-
- htmldocs:
- pdfdocs:
- psdocs:
- xmldocs:
- installmandocs:
-
- endif # IGNORE_DOCBOOKS
-
+ endif # DOCBOOKS=""
###
# Help targets as used by the top-level makefile
@echo ' make DOCBOOKS="s1.xml s2.xml" [target] Generate only docs s1.xml s2.xml'
@echo ' valid values for DOCBOOKS are: $(DOCBOOKS)'
@echo
- @echo " make IGNORE_DOCBOOKS=1 [target] Don't generate docs from Docbook"
+ @echo " make DOCBOOKS=\"\" [target] Don't generate docs from Docbook"
@echo ' This is useful to generate only the ReST docs (Sphinx)'
clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
-cleandocs: cleanmediadocs
+cleandocs:
$(Q)rm -f $(call objectify, $(clean-files))
$(Q)rm -rf $(call objectify, $(clean-dirs))
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
quiet_cmd_sphinx = SPHINX $@
- cmd_sphinx = $(SPHINXBUILD) -b $2 $(ALLSPHINXOPTS) $(BUILDDIR)/$2
+ cmd_sphinx = BUILDDIR=$(BUILDDIR) $(SPHINXBUILD) -b $2 $(ALLSPHINXOPTS) $(BUILDDIR)/$2
htmldocs:
+ $(MAKE) BUILDDIR=$(BUILDDIR) -f $(srctree)/Documentation/media/Makefile $@
$(call cmd,sphinx,html)
pdfdocs:
psdocs:
mandocs:
installmandocs:
-cleanmediadocs:
cleandocs:
$(Q)rm -rf $(BUILDDIR)
+ endif # HAVE_SPHINX
+
dochelp:
@echo ' Linux kernel internal documentation in different formats (Sphinx):'
@echo ' htmldocs - HTML'
@echo ' epubdocs - EPUB'
@echo ' xmldocs - XML'
@echo ' cleandocs - clean all generated files'
-
- endif # HAVE_SPHINX
is complex. This is a document for memcg's internal behavior.
Please note that implementation details can be changed.
- (*) Topics on API should be in Documentation/cgroups/memory.txt)
+ (*) Topics on API should be in Documentation/cgroup-v1/memory.txt)
0. How to record usage ?
2 objects are used.
8. LRU
Each memcg has its own private LRU. Now, its handling is under global
- VM's control (means that it's handled under global zone->lru_lock).
+ VM's control (means that it's handled under global zone_lru_lock).
Almost all routines around memcg's LRU is called by global LRU's
- list management functions under zone->lru_lock().
+ list management functions under zone_lru_lock().
A special function is mem_cgroup_isolate_pages(). This scans
memcg's private LRU and call __isolate_lru_page() to extract a page
You can see charges have been moved by reading *.usage_in_bytes or
memory.stat of both A and B.
- See 8.2 of Documentation/cgroups/memory.txt to see what value should be
+ See 8.2 of Documentation/cgroup-v1/memory.txt to see what value should be
written to move_charge_at_immigrate.
9.10 Memory thresholds
bootmem_debug [KNL] Enable bootmem allocator debug messages.
+ bert_disable [ACPI]
+ Disable BERT OS support on buggy BIOSes.
+
bttv.card= [HW,V4L] bttv (bt848 + bt878 based grabber cards)
bttv.radio= Most important insmod options are available as
kernel args too.
[SPARC64] tick
[X86-64] hpet,tsc
+ clocksource.arm_arch_timer.evtstrm=
+ [ARM,ARM64]
+ Format: <bool>
+ Enable/disable the eventstream feature of the ARM
+ architected timer so that code using WFE-based polling
+ loops can be debugged more effectively on production
+ systems.
+
clearcpuid=BITNUM [X86]
Disable CPUID feature X for the kernel. See
arch/x86/include/asm/cpufeatures.h for the valid bit
dhash_entries= [KNL]
Set number of hash buckets for dentry cache.
+ disable_1tb_segments [PPC]
+ Disables the use of 1TB hash page table segments. This
+ causes the kernel to fall back to 256MB segments which
+ can be useful when debugging issues that require an SLB
+ miss to occur.
+
disable= [IPV6]
See Documentation/networking/ipv6.txt.
+ disable_radix [PPC]
+ Disable RADIX MMU mode on POWER9
+
disable_cpu_apicid= [X86,APIC,SMP]
Format: <int>
The number of initial APIC ID for the
Address Range Mirroring feature even if your box
doesn't support it.
+ efivar_ssdt= [EFI; X86] Name of an EFI variable that contains an SSDT
+ that is to be dynamically loaded by Linux. If there are
+ multiple variables with the same name but with different
+ vendor GUIDs, all of them will be loaded. See
+ Documentation/acpi/ssdt-overlays.txt for details.
+
+
eisa_irq_edge= [PARISC,HW]
See header of drivers/parisc/eisa.c.
js= [HW,JOY] Analog joystick
See Documentation/input/joystick.txt.
- kaslr/nokaslr [X86]
- Enable/disable kernel and module base offset ASLR
- (Address Space Layout Randomization) if built into
- the kernel. When CONFIG_HIBERNATION is selected,
- kASLR is disabled by default. When kASLR is enabled,
- hibernation will be disabled.
+ nokaslr [KNL]
+ When CONFIG_RANDOMIZE_BASE is set, this disables
+ kernel and module base offset ASLR (Address Space
+ Layout Randomization).
keepinitrd [HW,ARM]
Note that if CONFIG_MODULE_SIG_FORCE is set, that
is always true, so this option does nothing.
+ module_blacklist= [KNL] Do not load a comma-separated list of
+ modules. Useful for debugging problem modules.
+
mousedev.tap_time=
[MOUSE] Maximum time between finger touching and
leaving touchpad surface for touch to be considered
timer: [X86] Force use of architectural NMI
timer mode (see also oprofile.timer
for generic hr timer mode)
- [s390] Force legacy basic mode sampling
- (report cpu_type "timer")
oops=panic Always panic on oopses. Default is to just kill the
process, but there is a small probability of
resource_alignment=
Format:
[<order of align>@][<domain>:]<bus>:<slot>.<func>[; ...]
+ [<order of align>@]pci:<vendor>:<device>\
+ [:<subvendor>:<subdevice>][; ...]
Specifies alignment and device to reassign
aligned memory resources.
If <order of align> is not specified,
hpmemsize=nn[KMG] The fixed amount of bus space which is
reserved for hotplug bridge's memory window.
Default size is 2 megabytes.
+ hpbussize=nn The minimum amount of additional bus numbers
+ reserved for buses below a hotplug bridge.
+ Default is 1.
realloc= Enable/disable reallocating PCI bridge resources
if allocations done by BIOS are too small to
accommodate resources required by all child
compat Treat PCIe ports as PCI-to-PCI bridges, disable the PCIe
ports driver.
+ pcie_port_pm= [PCIE] PCIe port power management handling:
+ off Disable power management of all PCIe ports
+ force Forcibly enable power management of all PCIe ports
+
pcie_pme= [PCIE,PM] Native PCIe PME signaling options:
nomsi Do not use MSI for native PCIe PME signaling (this makes
all PCIe root ports use INTx for all services).
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
default: disabled
+ printk.devkmsg={on,off,ratelimit}
+ Control writing to /dev/kmsg.
+ on - unlimited logging to /dev/kmsg from userspace
+ off - logging to /dev/kmsg disabled
+ ratelimit - ratelimit the logging
+ Default: ratelimit
+
printk.time= Show timing data prefixed to each printk message line
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
relax_domain_level=
[KNL, SMP] Set scheduler's default relax_domain_level.
- See Documentation/cgroups/cpusets.txt.
+ See Documentation/cgroup-v1/cpusets.txt.
relative_sleep_states=
[SUSPEND] Use sleep state labeling where the deepest
present during boot.
nocompress Don't compress/decompress hibernation images.
no Disable hibernation and resume.
+ protect_image Turn on image protection during restoration
+ (that will set all pages holding image data
+ during restoration read-only).
retain_initrd [RAM] Keep initrd memory after extraction
using these two parameters to set the minimum and
maximum port values.
+ sunrpc.svc_rpc_per_connection_limit=
+ [NFS,SUNRPC]
+ Limit the number of requests that the server will
+ process in parallel from a single connection.
+ The default value is 0 (no limit).
+
sunrpc.pool_mode=
[NFS]
Control how the NFS server code allocates CPUs to
swapaccount=[0|1]
[KNL] Enable accounting of swap in memory resource
controller if no parameter or 1 is given or disable
- it if 0 is given (See Documentation/cgroups/memory.txt)
+ it if 0 is given (See Documentation/cgroup-v1/memory.txt)
swiotlb= [ARM,IA-64,PPC,MIPS,X86]
Format: { <int> | force }
Larger installations usually partition the system using cpusets into
sections of nodes. Paul Jackson has equipped cpusets with the ability to
move pages when a task is moved to another cpuset (See
- Documentation/cgroups/cpusets.txt).
+ Documentation/cgroup-v1/cpusets.txt).
Cpusets allows the automation of process locality. If a task is moved to
a new cpuset then also all its pages are moved with it so that the
performance of the process does not sink dramatically. Also the pages
20. The new page is moved to the LRU and can be scanned by the swapper
etc again.
-Christoph Lameter, May 8, 2006.
+C. Non-LRU page migration
+-------------------------
+
+Although original migration aimed for reducing the latency of memory access
+for NUMA, compaction who want to create high-order page is also main customer.
+
+Current problem of the implementation is that it is designed to migrate only
+*LRU* pages. However, there are potential non-lru pages which can be migrated
+in drivers, for example, zsmalloc, virtio-balloon pages.
+
+For virtio-balloon pages, some parts of migration code path have been hooked
+up and added virtio-balloon specific functions to intercept migration logics.
+It's too specific to a driver so other drivers who want to make their pages
+movable would have to add own specific hooks in migration path.
+
+To overclome the problem, VM supports non-LRU page migration which provides
+generic functions for non-LRU movable pages without driver specific hooks
+migration path.
+
+If a driver want to make own pages movable, it should define three functions
+which are function pointers of struct address_space_operations.
+
+1. bool (*isolate_page) (struct page *page, isolate_mode_t mode);
+
+What VM expects on isolate_page function of driver is to return *true*
+if driver isolates page successfully. On returing true, VM marks the page
+as PG_isolated so concurrent isolation in several CPUs skip the page
+for isolation. If a driver cannot isolate the page, it should return *false*.
+
+Once page is successfully isolated, VM uses page.lru fields so driver
+shouldn't expect to preserve values in that fields.
+
+2. int (*migratepage) (struct address_space *mapping,
+ struct page *newpage, struct page *oldpage, enum migrate_mode);
+
+After isolation, VM calls migratepage of driver with isolated page.
+The function of migratepage is to move content of the old page to new page
+and set up fields of struct page newpage. Keep in mind that you should
+indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
+under page_lock if you migrated the oldpage successfully and returns
+MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver
+can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time
+because VM interprets -EAGAIN as "temporal migration failure". On returning
+any error except -EAGAIN, VM will give up the page migration without retrying
+in this time.
+
+Driver shouldn't touch page.lru field VM using in the functions.
+
+3. void (*putback_page)(struct page *);
+
+If migration fails on isolated page, VM should return the isolated page
+to the driver so VM calls driver's putback_page with migration failed page.
+In this function, driver should put the isolated page back to the own data
+structure.
+4. non-lru movable page flags
+
+There are two page flags for supporting non-lru movable page.
+
+* PG_movable
+
+Driver should use the below function to make page movable under page_lock.
+
+ void __SetPageMovable(struct page *page, struct address_space *mapping)
+
+It needs argument of address_space for registering migration family functions
+which will be called by VM. Exactly speaking, PG_movable is not a real flag of
+struct page. Rather than, VM reuses page->mapping's lower bits to represent it.
+
+ #define PAGE_MAPPING_MOVABLE 0x2
+ page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;
+
+so driver shouldn't access page->mapping directly. Instead, driver should
+use page_mapping which mask off the low two bits of page->mapping under
+page lock so it can get right struct address_space.
+
+For testing of non-lru movable page, VM supports __PageMovable function.
+However, it doesn't guarantee to identify non-lru movable page because
+page->mapping field is unified with other variables in struct page.
+As well, if driver releases the page after isolation by VM, page->mapping
+doesn't have stable value although it has PAGE_MAPPING_MOVABLE
+(Look at __ClearPageMovable). But __PageMovable is cheap to catch whether
+page is LRU or non-lru movable once the page has been isolated. Because
+LRU pages never can have PAGE_MAPPING_MOVABLE in page->mapping. It is also
+good for just peeking to test non-lru movable pages before more expensive
+checking with lock_page in pfn scanning to select victim.
+
+For guaranteeing non-lru movable page, VM provides PageMovable function.
+Unlike __PageMovable, PageMovable functions validates page->mapping and
+mapping->a_ops->isolate_page under lock_page. The lock_page prevents sudden
+destroying of page->mapping.
+
+Driver using __SetPageMovable should clear the flag via __ClearMovablePage
+under page_lock before the releasing the page.
+
+* PG_isolated
+
+To prevent concurrent isolation among several CPUs, VM marks isolated page
+as PG_isolated under lock_page. So if a CPU encounters PG_isolated non-lru
+movable page, it can skip it. Driver doesn't need to manipulate the flag
+because VM will set/clear it automatically. Keep in mind that if driver
+sees PG_isolated page, it means the page have been isolated by VM so it
+shouldn't touch page.lru field.
+PG_isolated is alias with PG_reclaim flag so driver shouldn't use the flag
+for own purpose.
+
+Christoph Lameter, May 8, 2006.
+Minchan Kim, Mar 28, 2016.
--------------------------------
The unevictable LRU facility interacts with the memory control group [aka
- memory controller; see Documentation/cgroups/memory.txt] by extending the
+ memory controller; see Documentation/cgroup-v1/memory.txt] by extending the
lru_list enum.
The memory controller data structure automatically gets a per-zone unevictable
the page migration code and the same work flow as described in MIGRATING
MLOCKED PAGES will apply.
+MLOCKING TRANSPARENT HUGE PAGES
+-------------------------------
+
+A transparent huge page is represented by a single entry on an LRU list.
+Therefore, we can only make unevictable an entire compound page, not
+individual subpages.
+
+If a user tries to mlock() part of a huge page, we want the rest of the
+page to be reclaimable.
+
+We cannot just split the page on partial mlock() as split_huge_page() can
+fail and new intermittent failure mode for the syscall is undesirable.
+
+We handle this by keeping PTE-mapped huge pages on normal LRU lists: the
+PMD on border of VM_LOCKED VMA will be split into PTE table.
+
+This way the huge page is accessible for vmscan. Under memory pressure the
+page will be split, subpages which belong to VM_LOCKED VMAs will be moved
+to unevictable LRU and the rest can be reclaimed.
+
+See also comment in follow_trans_huge_pmd().
mmap(MAP_LOCKED) SYSTEM CALL HANDLING
-------------------------------------