S: D-69126 Heidelberg
S: Germany
+N: Simon Horman
+M: horms@verge.net.au
+D: Renesas ARM/ARM64 SoC maintainer
+
N: Christopher Horn
E: chorn@warwick.net
D: Miscellaneous sysctl hacks
Protections
-----------
-A cgroup is protected to be allocated upto the configured amount of
-the resource if the usages of all its ancestors are under their
+A cgroup is protected upto the configured amount of the resource
+as long as the usages of all its ancestors are under their
protected levels. Protections can be hard guarantees or best effort
soft boundaries. Protections can also be over-committed in which case
only upto the amount available to the parent is protected among
is within its effective min boundary, the cgroup's memory
won't be reclaimed under any conditions. If there is no
unprotected reclaimable memory available, OOM killer
- is invoked.
+ is invoked. Above the effective min boundary (or
+ effective low boundary if it is higher), pages are reclaimed
+ proportionally to the overage, reducing reclaim pressure for
+ smaller overages.
Effective min boundary is limited by memory.min values of
all ancestor cgroups. If there is memory.min overcommitment
Best-effort memory protection. If the memory usage of a
cgroup is within its effective low boundary, the cgroup's
memory won't be reclaimed unless memory can be reclaimed
- from unprotected cgroups.
+ from unprotected cgroups. Above the effective low boundary (or
+ effective min boundary if it is higher), pages are reclaimed
+ proportionally to the overage, reducing reclaim pressure for
+ smaller overages.
Effective low boundary is limited by memory.low values of
all ancestor cgroups. If there is memory.low overcommitment
becomes self-defeating.
The memory.low boundary on the other hand is a top-down allocated
-reserve. A cgroup enjoys reclaim protection when it's within its low,
-which makes delegation of subtrees possible.
+reserve. A cgroup enjoys reclaim protection when it's within its
+effective low, which makes delegation of subtrees possible. It also
+enjoys having reclaim pressure proportional to its overage when
+above its effective low.
The original high boundary, the hard limit, is defined as a strict
limit that can not budge, even if the OOM killer has to be called.
the unplug protocol
never -- do not unplug even if version check succeeds
+ xen_legacy_crash [X86,XEN]
+ Crash from Xen panic notifier, without executing late
+ panic() code such as dumping handler.
+
xen_nopvspin [X86,XEN]
Disables the ticketlock slowpath using Xen PV
optimizations.
Software can "opt-in" to receiving VAs from a 52-bit space by
specifying an mmap hint parameter that is larger than 48-bit.
+
For example:
- maybe_high_address = mmap(~0UL, size, prot, flags,...);
+
+.. code-block:: c
+
+ maybe_high_address = mmap(~0UL, size, prot, flags,...);
It is also possible to build a debug kernel that returns addresses
from a 52-bit space by enabling the following kernel config options:
+
+.. code-block:: sh
+
CONFIG_EXPERT=y && CONFIG_ARM64_FORCE_52BIT=y
Note that this option is only intended for debugging applications
protection-keys
../RCU/index
gcc-plugins
+ symbol-namespaces
Interfaces for kernel debugging
configuration, but it is a good practice to use `kmalloc` for objects
smaller than page size.
+The address of a chunk allocated with `kmalloc` is aligned to at least
+ARCH_KMALLOC_MINALIGN bytes. For sizes which are a power of two, the
+alignment is also guaranteed to be at least the respective size.
+
For large allocations you can use :c:func:`vmalloc` and
:c:func:`vzalloc`, or directly request pages from the page
allocator. The memory allocated by `vmalloc` and related functions is
--- /dev/null
+=================
+Symbol Namespaces
+=================
+
+The following document describes how to use Symbol Namespaces to structure the
+export surface of in-kernel symbols exported through the family of
+EXPORT_SYMBOL() macros.
+
+.. Table of Contents
+
+ === 1 Introduction
+ === 2 How to define Symbol Namespaces
+ --- 2.1 Using the EXPORT_SYMBOL macros
+ --- 2.2 Using the DEFAULT_SYMBOL_NAMESPACE define
+ === 3 How to use Symbols exported in Namespaces
+ === 4 Loading Modules that use namespaced Symbols
+ === 5 Automatically creating MODULE_IMPORT_NS statements
+
+1. Introduction
+===============
+
+Symbol Namespaces have been introduced as a means to structure the export
+surface of the in-kernel API. It allows subsystem maintainers to partition
+their exported symbols into separate namespaces. That is useful for
+documentation purposes (think of the SUBSYSTEM_DEBUG namespace) as well as for
+limiting the availability of a set of symbols for use in other parts of the
+kernel. As of today, modules that make use of symbols exported into namespaces,
+are required to import the namespace. Otherwise the kernel will, depending on
+its configuration, reject loading the module or warn about a missing import.
+
+2. How to define Symbol Namespaces
+==================================
+
+Symbols can be exported into namespace using different methods. All of them are
+changing the way EXPORT_SYMBOL and friends are instrumented to create ksymtab
+entries.
+
+2.1 Using the EXPORT_SYMBOL macros
+==================================
+
+In addition to the macros EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL(), that allow
+exporting of kernel symbols to the kernel symbol table, variants of these are
+available to export symbols into a certain namespace: EXPORT_SYMBOL_NS() and
+EXPORT_SYMBOL_NS_GPL(). They take one additional argument: the namespace.
+Please note that due to macro expansion that argument needs to be a
+preprocessor symbol. E.g. to export the symbol `usb_stor_suspend` into the
+namespace `USB_STORAGE`, use::
+
+ EXPORT_SYMBOL_NS(usb_stor_suspend, USB_STORAGE);
+
+The corresponding ksymtab entry struct `kernel_symbol` will have the member
+`namespace` set accordingly. A symbol that is exported without a namespace will
+refer to `NULL`. There is no default namespace if none is defined. `modpost`
+and kernel/module.c make use the namespace at build time or module load time,
+respectively.
+
+2.2 Using the DEFAULT_SYMBOL_NAMESPACE define
+=============================================
+
+Defining namespaces for all symbols of a subsystem can be very verbose and may
+become hard to maintain. Therefore a default define (DEFAULT_SYMBOL_NAMESPACE)
+is been provided, that, if set, will become the default for all EXPORT_SYMBOL()
+and EXPORT_SYMBOL_GPL() macro expansions that do not specify a namespace.
+
+There are multiple ways of specifying this define and it depends on the
+subsystem and the maintainer's preference, which one to use. The first option
+is to define the default namespace in the `Makefile` of the subsystem. E.g. to
+export all symbols defined in usb-common into the namespace USB_COMMON, add a
+line like this to drivers/usb/common/Makefile::
+
+ ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=USB_COMMON
+
+That will affect all EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL() statements. A
+symbol exported with EXPORT_SYMBOL_NS() while this definition is present, will
+still be exported into the namespace that is passed as the namespace argument
+as this argument has preference over a default symbol namespace.
+
+A second option to define the default namespace is directly in the compilation
+unit as preprocessor statement. The above example would then read::
+
+ #undef DEFAULT_SYMBOL_NAMESPACE
+ #define DEFAULT_SYMBOL_NAMESPACE USB_COMMON
+
+within the corresponding compilation unit before any EXPORT_SYMBOL macro is
+used.
+
+3. How to use Symbols exported in Namespaces
+============================================
+
+In order to use symbols that are exported into namespaces, kernel modules need
+to explicitly import these namespaces. Otherwise the kernel might reject to
+load the module. The module code is required to use the macro MODULE_IMPORT_NS
+for the namespaces it uses symbols from. E.g. a module using the
+usb_stor_suspend symbol from above, needs to import the namespace USB_STORAGE
+using a statement like::
+
+ MODULE_IMPORT_NS(USB_STORAGE);
+
+This will create a `modinfo` tag in the module for each imported namespace.
+This has the side effect, that the imported namespaces of a module can be
+inspected with modinfo::
+
+ $ modinfo drivers/usb/storage/ums-karma.ko
+ [...]
+ import_ns: USB_STORAGE
+ [...]
+
+
+It is advisable to add the MODULE_IMPORT_NS() statement close to other module
+metadata definitions like MODULE_AUTHOR() or MODULE_LICENSE(). Refer to section
+5. for a way to create missing import statements automatically.
+
+4. Loading Modules that use namespaced Symbols
+==============================================
+
+At module loading time (e.g. `insmod`), the kernel will check each symbol
+referenced from the module for its availability and whether the namespace it
+might be exported to has been imported by the module. The default behaviour of
+the kernel is to reject loading modules that don't specify sufficient imports.
+An error will be logged and loading will be failed with EINVAL. In order to
+allow loading of modules that don't satisfy this precondition, a configuration
+option is available: Setting MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS=y will
+enable loading regardless, but will emit a warning.
+
+5. Automatically creating MODULE_IMPORT_NS statements
+=====================================================
+
+Missing namespaces imports can easily be detected at build time. In fact,
+modpost will emit a warning if a module uses a symbol from a namespace
+without importing it.
+MODULE_IMPORT_NS() statements will usually be added at a definite location
+(along with other module meta data). To make the life of module authors (and
+subsystem maintainers) easier, a script and make target is available to fixup
+missing imports. Fixing missing imports can be done with::
+
+ $ make nsdeps
+
+A typical scenario for module authors would be::
+
+ - write code that depends on a symbol from a not imported namespace
+ - `make`
+ - notice the warning of modpost telling about a missing import
+ - run `make nsdeps` to add the import to the correct code location
+
+For subsystem maintainers introducing a namespace, the steps are very similar.
+Again, `make nsdeps` will eventually add the missing namespace imports for
+in-tree modules::
+
+ - move or add symbols to a namespace (e.g. with EXPORT_SYMBOL_NS())
+ - `make` (preferably with an allmodconfig to cover all in-kernel
+ modules)
+ - notice the warning of modpost telling about a missing import
+ - run `make nsdeps` to add the import to the correct code location
+
$ export KBUILD_OUTPUT=/tmp/kselftest; make TARGETS="size timers" kselftest
+Additionally you can use the "SKIP_TARGETS" variable on the make command
+line to specify one or more targets to exclude from the TARGETS list.
+
+To run all tests but a single subsystem::
+
+ $ make -C tools/testing/selftests SKIP_TARGETS=ptrace run_tests
+
+You can specify multiple tests to skip::
+
+ $ make SKIP_TARGETS="size timers" kselftest
+
+You can also specify a restricted list of tests to run together with a
+dedicated skiplist::
+
+ $ make TARGETS="bpf breakpoints size timers" SKIP_TARGETS=bpf kselftest
+
See the top-level tools/testing/selftests/Makefile for the list of all
possible targets.
+++ /dev/null
-=================
-Symbol Namespaces
-=================
-
-The following document describes how to use Symbol Namespaces to structure the
-export surface of in-kernel symbols exported through the family of
-EXPORT_SYMBOL() macros.
-
-.. Table of Contents
-
- === 1 Introduction
- === 2 How to define Symbol Namespaces
- --- 2.1 Using the EXPORT_SYMBOL macros
- --- 2.2 Using the DEFAULT_SYMBOL_NAMESPACE define
- === 3 How to use Symbols exported in Namespaces
- === 4 Loading Modules that use namespaced Symbols
- === 5 Automatically creating MODULE_IMPORT_NS statements
-
-1. Introduction
-===============
-
-Symbol Namespaces have been introduced as a means to structure the export
-surface of the in-kernel API. It allows subsystem maintainers to partition
-their exported symbols into separate namespaces. That is useful for
-documentation purposes (think of the SUBSYSTEM_DEBUG namespace) as well as for
-limiting the availability of a set of symbols for use in other parts of the
-kernel. As of today, modules that make use of symbols exported into namespaces,
-are required to import the namespace. Otherwise the kernel will, depending on
-its configuration, reject loading the module or warn about a missing import.
-
-2. How to define Symbol Namespaces
-==================================
-
-Symbols can be exported into namespace using different methods. All of them are
-changing the way EXPORT_SYMBOL and friends are instrumented to create ksymtab
-entries.
-
-2.1 Using the EXPORT_SYMBOL macros
-==================================
-
-In addition to the macros EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL(), that allow
-exporting of kernel symbols to the kernel symbol table, variants of these are
-available to export symbols into a certain namespace: EXPORT_SYMBOL_NS() and
-EXPORT_SYMBOL_NS_GPL(). They take one additional argument: the namespace.
-Please note that due to macro expansion that argument needs to be a
-preprocessor symbol. E.g. to export the symbol `usb_stor_suspend` into the
-namespace `USB_STORAGE`, use::
-
- EXPORT_SYMBOL_NS(usb_stor_suspend, USB_STORAGE);
-
-The corresponding ksymtab entry struct `kernel_symbol` will have the member
-`namespace` set accordingly. A symbol that is exported without a namespace will
-refer to `NULL`. There is no default namespace if none is defined. `modpost`
-and kernel/module.c make use the namespace at build time or module load time,
-respectively.
-
-2.2 Using the DEFAULT_SYMBOL_NAMESPACE define
-=============================================
-
-Defining namespaces for all symbols of a subsystem can be very verbose and may
-become hard to maintain. Therefore a default define (DEFAULT_SYMBOL_NAMESPACE)
-is been provided, that, if set, will become the default for all EXPORT_SYMBOL()
-and EXPORT_SYMBOL_GPL() macro expansions that do not specify a namespace.
-
-There are multiple ways of specifying this define and it depends on the
-subsystem and the maintainer's preference, which one to use. The first option
-is to define the default namespace in the `Makefile` of the subsystem. E.g. to
-export all symbols defined in usb-common into the namespace USB_COMMON, add a
-line like this to drivers/usb/common/Makefile::
-
- ccflags-y += -DDEFAULT_SYMBOL_NAMESPACE=USB_COMMON
-
-That will affect all EXPORT_SYMBOL() and EXPORT_SYMBOL_GPL() statements. A
-symbol exported with EXPORT_SYMBOL_NS() while this definition is present, will
-still be exported into the namespace that is passed as the namespace argument
-as this argument has preference over a default symbol namespace.
-
-A second option to define the default namespace is directly in the compilation
-unit as preprocessor statement. The above example would then read::
-
- #undef DEFAULT_SYMBOL_NAMESPACE
- #define DEFAULT_SYMBOL_NAMESPACE USB_COMMON
-
-within the corresponding compilation unit before any EXPORT_SYMBOL macro is
-used.
-
-3. How to use Symbols exported in Namespaces
-============================================
-
-In order to use symbols that are exported into namespaces, kernel modules need
-to explicitly import these namespaces. Otherwise the kernel might reject to
-load the module. The module code is required to use the macro MODULE_IMPORT_NS
-for the namespaces it uses symbols from. E.g. a module using the
-usb_stor_suspend symbol from above, needs to import the namespace USB_STORAGE
-using a statement like::
-
- MODULE_IMPORT_NS(USB_STORAGE);
-
-This will create a `modinfo` tag in the module for each imported namespace.
-This has the side effect, that the imported namespaces of a module can be
-inspected with modinfo::
-
- $ modinfo drivers/usb/storage/ums-karma.ko
- [...]
- import_ns: USB_STORAGE
- [...]
-
-
-It is advisable to add the MODULE_IMPORT_NS() statement close to other module
-metadata definitions like MODULE_AUTHOR() or MODULE_LICENSE(). Refer to section
-5. for a way to create missing import statements automatically.
-
-4. Loading Modules that use namespaced Symbols
-==============================================
-
-At module loading time (e.g. `insmod`), the kernel will check each symbol
-referenced from the module for its availability and whether the namespace it
-might be exported to has been imported by the module. The default behaviour of
-the kernel is to reject loading modules that don't specify sufficient imports.
-An error will be logged and loading will be failed with EINVAL. In order to
-allow loading of modules that don't satisfy this precondition, a configuration
-option is available: Setting MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS=y will
-enable loading regardless, but will emit a warning.
-
-5. Automatically creating MODULE_IMPORT_NS statements
-=====================================================
-
-Missing namespaces imports can easily be detected at build time. In fact,
-modpost will emit a warning if a module uses a symbol from a namespace
-without importing it.
-MODULE_IMPORT_NS() statements will usually be added at a definite location
-(along with other module meta data). To make the life of module authors (and
-subsystem maintainers) easier, a script and make target is available to fixup
-missing imports. Fixing missing imports can be done with::
-
- $ make nsdeps
-
-A typical scenario for module authors would be::
-
- - write code that depends on a symbol from a not imported namespace
- - `make`
- - notice the warning of modpost telling about a missing import
- - run `make nsdeps` to add the import to the correct code location
-
-For subsystem maintainers introducing a namespace, the steps are very similar.
-Again, `make nsdeps` will eventually add the missing namespace imports for
-in-tree modules::
-
- - move or add symbols to a namespace (e.g. with EXPORT_SYMBOL_NS())
- - `make` (preferably with an allmodconfig to cover all in-kernel
- modules)
- - notice the warning of modpost telling about a missing import
- - run `make nsdeps` to add the import to the correct code location
-
case 'K':
case 'k':
mem <<= 10;
- /* fall through */
+ fallthrough;
default:
break;
}
Implicit switch case fall-through
---------------------------------
-The C language allows switch cases to "fall through" when
-a "break" statement is missing at the end of a case. This,
-however, introduces ambiguity in the code, as it's not always
-clear if the missing break is intentional or a bug. As there
-have been a long list of flaws `due to missing "break" statements
+The C language allows switch cases to "fall-through" when a "break" statement
+is missing at the end of a case. This, however, introduces ambiguity in the
+code, as it's not always clear if the missing break is intentional or a bug.
+
+As there have been a long list of flaws `due to missing "break" statements
<https://cwe.mitre.org/data/definitions/484.html>`_, we no longer allow
-"implicit fall-through". In order to identify an intentional fall-through
-case, we have adopted the marking used by static analyzers: a comment
-saying `/* Fall through */`. Once the C++17 `__attribute__((fallthrough))`
-is more widely handled by C compilers, static analyzers, and IDEs, we can
-switch to using that instead.
+"implicit fall-through".
+
+In order to identify intentional fall-through cases, we have adopted a
+pseudo-keyword macro 'fallthrough' which expands to gcc's extension
+__attribute__((__fallthrough__)). `Statement Attributes
+<https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html>`_
+
+When the C17/C18 [[fallthrough]] syntax is more commonly supported by
+C compilers, static analyzers, and IDEs, we can switch to using that syntax
+for the macro pseudo-keyword.
+
+All switch/case blocks must end in one of:
+
+ break;
+ fallthrough;
+ continue;
+ goto <label>;
+ return [expression];
F: Documentation/devicetree/bindings/arm/realtek.yaml
ARM/RENESAS ARM64 ARCHITECTURE
-M: Simon Horman <horms@verge.net.au>
M: Geert Uytterhoeven <geert+renesas@glider.be>
M: Magnus Damm <magnus.damm@gmail.com>
L: linux-renesas-soc@vger.kernel.org
Q: http://patchwork.kernel.org/project/linux-renesas-soc/list/
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas.git next
T: git git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git next
S: Supported
F: arch/arm64/boot/dts/renesas/
F: drivers/media/platform/s5p-mfc/
ARM/SHMOBILE ARM ARCHITECTURE
-M: Simon Horman <horms@verge.net.au>
M: Geert Uytterhoeven <geert+renesas@glider.be>
M: Magnus Damm <magnus.damm@gmail.com>
L: linux-renesas-soc@vger.kernel.org
Q: http://patchwork.kernel.org/project/linux-renesas-soc/list/
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas.git next
T: git git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git next
S: Supported
F: arch/arm/boot/dts/emev2*
R: Dan Murphy <dmurphy@ti.com>
L: linux-leds@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git
S: Maintained
F: Documentation/devicetree/bindings/leds/
F: drivers/leds/
M: Matthias Maennich <maennich@google.com>
S: Maintained
F: scripts/nsdeps
+F: Documentation/core-api/symbol-namespaces.rst
NTB AMD DRIVER
M: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
# in addition to whatever we do anyway.
# Just "make" or "make all" shall build modules as well
-ifneq ($(filter all _all modules,$(MAKECMDGOALS)),)
+ifneq ($(filter all _all modules nsdeps,$(MAKECMDGOALS)),)
KBUILD_MODULES := 1
endif
kselftest:
$(Q)$(MAKE) -C $(srctree)/tools/testing/selftests run_tests
-PHONY += kselftest-clean
-kselftest-clean:
- $(Q)$(MAKE) -C $(srctree)/tools/testing/selftests clean
+kselftest-%: FORCE
+ $(Q)$(MAKE) -C $(srctree)/tools/testing/selftests $*
PHONY += kselftest-merge
kselftest-merge:
tristate "Accelerated AES using ARMv8 Crypto Extensions"
depends on KERNEL_MODE_NEON
select CRYPTO_BLKCIPHER
+ select CRYPTO_LIB_AES
select CRYPTO_SIMD
help
Use an implementation of AES in CBC, CTR and XTS modes that uses
#include <asm/assembler.h>
.text
+ .arch armv8-a
.fpu crypto-neon-fp-armv8
.align 3
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
select GENERIC_GETTIMEOFDAY
- select GENERIC_COMPAT_VDSO if (!CPU_BIG_ENDIAN && COMPAT)
select HANDLE_DOMAIN_IRQ
select HARDIRQS_SW_RESEND
select HAVE_PCI
if COMPAT
config KUSER_HELPERS
- bool "Enable kuser helpers page for 32 bit applications"
+ bool "Enable kuser helpers page for 32-bit applications"
default y
help
Warning: disabling this option may break 32-bit user programs.
Say N here only if you are absolutely certain that you do not
need these helpers; otherwise, the safe option is to say Y.
+config COMPAT_VDSO
+ bool "Enable vDSO for 32-bit applications"
+ depends on !CPU_BIG_ENDIAN && "$(CROSS_COMPILE_COMPAT)" != ""
+ select GENERIC_COMPAT_VDSO
+ default y
+ help
+ Place in the process address space of 32-bit applications an
+ ELF shared object providing fast implementations of gettimeofday
+ and clock_gettime.
+
+ You must have a 32-bit build of glibc 2.22 or later for programs
+ to seamlessly take advantage of this.
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
endif
endif
-ifeq ($(CONFIG_GENERIC_COMPAT_VDSO), y)
- CROSS_COMPILE_COMPAT ?= $(CONFIG_CROSS_COMPILE_COMPAT_VDSO:"%"=%)
-
- ifeq ($(CONFIG_CC_IS_CLANG), y)
- $(warning CROSS_COMPILE_COMPAT is clang, the compat vDSO will not be built)
- else ifeq ($(strip $(CROSS_COMPILE_COMPAT)),)
- $(warning CROSS_COMPILE_COMPAT not defined or empty, the compat vDSO will not be built)
- else ifeq ($(shell which $(CROSS_COMPILE_COMPAT)gcc 2> /dev/null),)
- $(error $(CROSS_COMPILE_COMPAT)gcc not found, check CROSS_COMPILE_COMPAT)
- else
- export CROSS_COMPILE_COMPAT
- export CONFIG_COMPAT_VDSO := y
- compat_vdso := -DCONFIG_COMPAT_VDSO=1
- endif
-endif
-
KBUILD_CFLAGS += -mgeneral-regs-only $(lseinstr) $(brokengasinst) \
$(compat_vdso) $(cc_has_k_constraint)
KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
}
#define __CMPXCHG_CASE(w, sfx, name, sz, mb, cl...) \
-static inline u##sz __lse__cmpxchg_case_##name##sz(volatile void *ptr, \
+static __always_inline u##sz \
+__lse__cmpxchg_case_##name##sz(volatile void *ptr, \
u##sz old, \
u##sz new) \
{ \
#undef __CMPXCHG_CASE
#define __CMPXCHG_DBL(name, mb, cl...) \
-static inline long __lse__cmpxchg_double##name(unsigned long old1, \
+static __always_inline long \
+__lse__cmpxchg_double##name(unsigned long old1, \
unsigned long old2, \
unsigned long new1, \
unsigned long new2, \
#define dmb(option) __asm__ __volatile__ ("dmb " #option : : : "memory")
-#if __LINUX_ARM_ARCH__ >= 8
+#if __LINUX_ARM_ARCH__ >= 8 && defined(CONFIG_AS_DMB_ISHLD)
#define aarch32_smp_mb() dmb(ish)
#define aarch32_smp_rmb() dmb(ishld)
#define aarch32_smp_wmb() dmb(ishst)
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2012 ARM Limited
- */
-#ifndef __ASM_VDSO_DATAPAGE_H
-#define __ASM_VDSO_DATAPAGE_H
-
-#ifndef __ASSEMBLY__
-
-struct vdso_data {
- __u64 cs_cycle_last; /* Timebase at clocksource init */
- __u64 raw_time_sec; /* Raw time */
- __u64 raw_time_nsec;
- __u64 xtime_clock_sec; /* Kernel time */
- __u64 xtime_clock_nsec;
- __u64 xtime_coarse_sec; /* Coarse time */
- __u64 xtime_coarse_nsec;
- __u64 wtm_clock_sec; /* Wall to monotonic time */
- __u64 wtm_clock_nsec;
- __u32 tb_seq_count; /* Timebase sequence counter */
- /* cs_* members must be adjacent and in this order (ldp accesses) */
- __u32 cs_mono_mult; /* NTP-adjusted clocksource multiplier */
- __u32 cs_shift; /* Clocksource shift (mono = raw) */
- __u32 cs_raw_mult; /* Raw clocksource multiplier */
- __u32 tz_minuteswest; /* Whacky timezone stuff */
- __u32 tz_dsttime;
- __u32 use_syscall;
- __u32 hrtimer_res;
-};
-
-#endif /* !__ASSEMBLY__ */
-
-#endif /* __ASM_VDSO_DATAPAGE_H */
struct insn_emulation *insn;
insn = kzalloc(sizeof(*insn), GFP_KERNEL);
+ if (!insn)
+ return;
+
insn->ops = ops;
insn->min = INSN_UNDEF;
insns_sysctl = kcalloc(nr_insn_emulated + 1, sizeof(*sysctl),
GFP_KERNEL);
+ if (!insns_sysctl)
+ return;
raw_spin_lock_irqsave(&insn_emulation_lock, flags);
list_for_each_entry(insn, &insn_emulation, node) {
int cpu, slot = -1;
/*
- * enable_smccc_arch_workaround_1() passes NULL for the hyp_vecs
- * start/end if we're a guest. Skip the hyp-vectors work.
+ * detect_harden_bp_fw() passes NULL for the hyp_vecs start/end if
+ * we're a guest. Skip the hyp-vectors work.
*/
if (!hyp_vecs_start) {
__this_cpu_write(bp_hardening_data.fn, fn);
static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_SB_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FRINTTS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_GPI_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
b.ge el0_dbg
b el0_inv
el0_svc_compat:
+ gic_prio_kentry_setup tmp=x1
mov x0, sp
bl el0_svc_compat_handler
b ret_to_user
/*
* Ensure updated trampoline is visible to instruction
- * fetch before we patch in the branch.
+ * fetch before we patch in the branch. Although the
+ * architecture doesn't require an IPI in this case,
+ * Neoverse-N1 erratum #1542419 does require one
+ * if the TLB maintenance in module_enable_ro() is
+ * skipped due to rodata_enabled. It doesn't seem worth
+ * it to make it conditional given that this is
+ * certainly not a fast-path.
*/
- __flush_icache_range((unsigned long)&dst[0],
- (unsigned long)&dst[1]);
+ flush_icache_range((unsigned long)&dst[0],
+ (unsigned long)&dst[1]);
}
addr = (unsigned long)dst;
#else /* CONFIG_ARM64_MODULE_PLTS */
fpsimd_release_task(tsk);
}
-/*
- * src and dst may temporarily have aliased sve_state after task_struct
- * is copied. We cannot fix this properly here, because src may have
- * live SVE state and dst's thread_info may not exist yet, so tweaking
- * either src's or dst's TIF_SVE is not safe.
- *
- * The unaliasing is done in copy_thread() instead. This works because
- * dst is not schedulable or traceable until both of these functions
- * have been called.
- */
int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
{
if (current->mm)
fpsimd_preserve_current_state();
*dst = *src;
+ /* We rely on the above assignment to initialize dst's thread_flags: */
+ BUILD_BUG_ON(!IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK));
+
+ /*
+ * Detach src's sve_state (if any) from dst so that it does not
+ * get erroneously used or freed prematurely. dst's sve_state
+ * will be allocated on demand later on if dst uses SVE.
+ * For consistency, also clear TIF_SVE here: this could be done
+ * later in copy_process(), but to avoid tripping up future
+ * maintainers it is best not to leave TIF_SVE and sve_state in
+ * an inconsistent state, even temporarily.
+ */
+ dst->thread.sve_state = NULL;
+ clear_tsk_thread_flag(dst, TIF_SVE);
+
return 0;
}
memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context));
- /*
- * Unalias p->thread.sve_state (if any) from the parent task
- * and disable discard SVE state for p:
- */
- clear_tsk_thread_flag(p, TIF_SVE);
- p->thread.sve_state = NULL;
-
/*
* In case p was allocated the same task_struct pointer as some
* other recently-exited task, make sure p is disassociated from
ARCH_REL_TYPE_ABS := R_ARM_JUMP_SLOT|R_ARM_GLOB_DAT|R_ARM_ABS32
include $(srctree)/lib/vdso/Makefile
-COMPATCC := $(CROSS_COMPILE_COMPAT)gcc
+# Same as cc-*option, but using CC_COMPAT instead of CC
+ifeq ($(CONFIG_CC_IS_CLANG), y)
+CC_COMPAT ?= $(CC)
+else
+CC_COMPAT ?= $(CROSS_COMPILE_COMPAT)gcc
+endif
-# Same as cc-*option, but using COMPATCC instead of CC
cc32-option = $(call try-run,\
- $(COMPATCC) $(1) -c -x c /dev/null -o "$$TMP",$(1),$(2))
+ $(CC_COMPAT) $(1) -c -x c /dev/null -o "$$TMP",$(1),$(2))
cc32-disable-warning = $(call try-run,\
- $(COMPATCC) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1)))
+ $(CC_COMPAT) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1)))
cc32-ldoption = $(call try-run,\
- $(COMPATCC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
+ $(CC_COMPAT) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
+cc32-as-instr = $(call try-run,\
+ printf "%b\n" "$(1)" | $(CC_COMPAT) $(VDSO_AFLAGS) -c -x assembler -o "$$TMP" -,$(2),$(3))
# We cannot use the global flags to compile the vDSO files, the main reason
# being that the 32-bit compiler may be older than the main (64-bit) compiler
# arm64 one.
# As a result we set our own flags here.
-# From top-level Makefile
-# NOSTDINC_FLAGS
-VDSO_CPPFLAGS := -nostdinc -isystem $(shell $(COMPATCC) -print-file-name=include)
+# KBUILD_CPPFLAGS and NOSTDINC_FLAGS from top-level Makefile
+VDSO_CPPFLAGS := -D__KERNEL__ -nostdinc -isystem $(shell $(CC_COMPAT) -print-file-name=include)
VDSO_CPPFLAGS += $(LINUXINCLUDE)
-VDSO_CPPFLAGS += $(KBUILD_CPPFLAGS)
# Common C and assembly flags
# From top-level Makefile
VDSO_CAFLAGS := $(VDSO_CPPFLAGS)
+ifneq ($(shell $(CC_COMPAT) --version 2>&1 | head -n 1 | grep clang),)
+VDSO_CAFLAGS += --target=$(notdir $(CROSS_COMPILE_COMPAT:%-=%))
+endif
+
VDSO_CAFLAGS += $(call cc32-option,-fno-PIE)
ifdef CONFIG_DEBUG_INFO
VDSO_CAFLAGS += -g
endif
-ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(COMPATCC)), y)
-VDSO_CAFLAGS += -DCC_HAVE_ASM_GOTO
-endif
# From arm Makefile
VDSO_CAFLAGS += $(call cc32-option,-fno-dwarf2-cfi-asm)
VDSO_CAFLAGS += -fPIC -fno-builtin -fno-stack-protector
VDSO_CAFLAGS += -DDISABLE_BRANCH_PROFILING
+
# Try to compile for ARMv8. If the compiler is too old and doesn't support it,
# fall back to v7. There is no easy way to check for what architecture the code
# is being compiled, so define a macro specifying that (see arch/arm/Makefile).
VDSO_AFLAGS := $(VDSO_CAFLAGS)
VDSO_AFLAGS += -D__ASSEMBLY__
+# Check for binutils support for dmb ishld
+dmbinstr := $(call cc32-as-instr,dmb ishld,-DCONFIG_AS_DMB_ISHLD=1)
+
+VDSO_CFLAGS += $(dmbinstr)
+VDSO_AFLAGS += $(dmbinstr)
+
VDSO_LDFLAGS := $(VDSO_CPPFLAGS)
# From arm vDSO Makefile
VDSO_LDFLAGS += -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1
cmd_vdsold_and_vdso_check = $(cmd_vdsold); $(cmd_vdso_check)
quiet_cmd_vdsold = LD32 $@
- cmd_vdsold = $(COMPATCC) -Wp,-MD,$(depfile) $(VDSO_LDFLAGS) \
+ cmd_vdsold = $(CC_COMPAT) -Wp,-MD,$(depfile) $(VDSO_LDFLAGS) \
-Wl,-T $(filter %.lds,$^) $(filter %.o,$^) -o $@
quiet_cmd_vdsocc = CC32 $@
- cmd_vdsocc = $(COMPATCC) -Wp,-MD,$(depfile) $(VDSO_CFLAGS) -c -o $@ $<
+ cmd_vdsocc = $(CC_COMPAT) -Wp,-MD,$(depfile) $(VDSO_CFLAGS) -c -o $@ $<
quiet_cmd_vdsocc_gettimeofday = CC32 $@
- cmd_vdsocc_gettimeofday = $(COMPATCC) -Wp,-MD,$(depfile) $(VDSO_CFLAGS) $(VDSO_CFLAGS_gettimeofday_o) -c -o $@ $<
+ cmd_vdsocc_gettimeofday = $(CC_COMPAT) -Wp,-MD,$(depfile) $(VDSO_CFLAGS) $(VDSO_CFLAGS_gettimeofday_o) -c -o $@ $<
quiet_cmd_vdsoas = AS32 $@
- cmd_vdsoas = $(COMPATCC) -Wp,-MD,$(depfile) $(VDSO_AFLAGS) -c -o $@ $<
+ cmd_vdsoas = $(CC_COMPAT) -Wp,-MD,$(depfile) $(VDSO_AFLAGS) -c -o $@ $<
quiet_cmd_vdsomunge = MUNGE $@
cmd_vdsomunge = $(obj)/$(munge) $< $@
return arch_kasan_reset_tag(addr) >= PAGE_OFFSET;
}
+static inline unsigned long mm_to_pgd_phys(struct mm_struct *mm)
+{
+ /* Either init_pg_dir or swapper_pg_dir */
+ if (mm == &init_mm)
+ return __pa_symbol(mm->pgd);
+
+ return (unsigned long)virt_to_phys(mm->pgd);
+}
+
/*
* Dump out the page tables associated with 'addr' in the currently active mm.
*/
pr_alert("%s pgtable: %luk pages, %llu-bit VAs, pgdp=%016lx\n",
mm == &init_mm ? "swapper" : "user", PAGE_SIZE / SZ_1K,
- vabits_actual, (unsigned long)virt_to_phys(mm->pgd));
+ vabits_actual, mm_to_pgd_phys(mm));
pgdp = pgd_offset(mm, addr);
pgd = READ_ONCE(*pgdp);
pr_alert("[%016lx] pgd=%016llx", addr, pgd_val(pgd));
* If we got a different type of fault from the AT instruction,
* treat the translation fault as spurious.
*/
- dfsc = FIELD_PREP(SYS_PAR_EL1_FST, par);
+ dfsc = FIELD_GET(SYS_PAR_EL1_FST, par);
return (dfsc & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_FAULT;
}
/* O32 stack has to be 8-byte aligned. */
static u64 o32_stk[4096];
-#define O32_STK &o32_stk[sizeof(o32_stk)]
+#define O32_STK (&o32_stk[ARRAY_SIZE(o32_stk)])
#define __PROM_O32(fun, arg) fun arg __asm__(#fun); \
__asm__(#fun " = call_o32")
extern unsigned long __xchg_small(volatile void *ptr, unsigned long val,
unsigned int size);
-static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
- int size)
+static __always_inline
+unsigned long __xchg(volatile void *ptr, unsigned long x, int size)
{
switch (size) {
case 1:
extern unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
unsigned long new, unsigned int size);
-static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
- unsigned long new, unsigned int size)
+static __always_inline
+unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
+ unsigned long new, unsigned int size)
{
switch (size) {
case 1:
#define HWCAP_MIPS_R6 (1 << 0)
#define HWCAP_MIPS_MSA (1 << 1)
#define HWCAP_MIPS_CRC32 (1 << 2)
+#define HWCAP_MIPS_MIPS16 (1 << 3)
+#define HWCAP_MIPS_MDMX (1 << 4)
+#define HWCAP_MIPS_MIPS3D (1 << 5)
+#define HWCAP_MIPS_SMARTMIPS (1 << 6)
+#define HWCAP_MIPS_DSP (1 << 7)
+#define HWCAP_MIPS_DSP2 (1 << 8)
+#define HWCAP_MIPS_DSP3 (1 << 9)
+#define HWCAP_MIPS_MIPS16E2 (1 << 10)
+#define HWCAP_LOONGSON_MMI (1 << 11)
+#define HWCAP_LOONGSON_EXT (1 << 12)
+#define HWCAP_LOONGSON_EXT2 (1 << 13)
#endif /* _UAPI_ASM_HWCAP_H */
elf_hwcap |= HWCAP_MIPS_MSA;
}
+ if (cpu_has_mips16)
+ elf_hwcap |= HWCAP_MIPS_MIPS16;
+
+ if (cpu_has_mdmx)
+ elf_hwcap |= HWCAP_MIPS_MDMX;
+
+ if (cpu_has_mips3d)
+ elf_hwcap |= HWCAP_MIPS_MIPS3D;
+
+ if (cpu_has_smartmips)
+ elf_hwcap |= HWCAP_MIPS_SMARTMIPS;
+
+ if (cpu_has_dsp)
+ elf_hwcap |= HWCAP_MIPS_DSP;
+
+ if (cpu_has_dsp2)
+ elf_hwcap |= HWCAP_MIPS_DSP2;
+
+ if (cpu_has_dsp3)
+ elf_hwcap |= HWCAP_MIPS_DSP3;
+
+ if (cpu_has_mips16e2)
+ elf_hwcap |= HWCAP_MIPS_MIPS16E2;
+
+ if (cpu_has_loongson_mmi)
+ elf_hwcap |= HWCAP_LOONGSON_MMI;
+
+ if (cpu_has_loongson_ext)
+ elf_hwcap |= HWCAP_LOONGSON_EXT;
+
+ if (cpu_has_loongson_ext2)
+ elf_hwcap |= HWCAP_LOONGSON_EXT2;
+
if (cpu_has_vz)
cpu_probe_vz(c);
$(call cc-option,-march=mips64r2,-mips64r2 -U_MIPS_ISA -D_MIPS_ISA=_MIPS_ISA_MIPS64)
endif
+# Some -march= flags enable MMI instructions, and GCC complains about that
+# support being enabled alongside -msoft-float. Thus explicitly disable MMI.
+cflags-y += $(call cc-option,-mno-loongson-mmi)
+
#
# Loongson Machines' Support
#
$(filter -mmicromips,$(KBUILD_CFLAGS)) \
$(filter -march=%,$(KBUILD_CFLAGS)) \
$(filter -m%-float,$(KBUILD_CFLAGS)) \
+ $(filter -mno-loongson-%,$(KBUILD_CFLAGS)) \
-D__VDSO__
ifdef CONFIG_CC_IS_CLANG
{
WARN_ON(1);
}
+static inline void radix__flush_all_lpid_guest(unsigned int lpid)
+{
+ WARN_ON(1);
+}
#endif
extern void radix__flush_hugetlb_tlb_range(struct vm_area_struct *vma,
mtspr SPRN_PCR, r6
18:
/* Signal secondary CPUs to continue */
+ li r0, 0
stb r0,VCORE_IN_GUEST(r5)
19: lis r8,0x7fff /* MAX_INT@h */
mtspr SPRN_HDEC,r8
ctx->gid = current_gid();
ctx->mode = 0755;
+ fc->fs_private = ctx;
fc->s_fs_info = sbi;
fc->ops = &spufs_context_ops;
return 0;
unsigned char local_buffer[SPLPAR_TLB_BIC_MAXLENGTH];
int call_status, len, idx, bpsize;
+ if (!firmware_has_feature(FW_FEATURE_BLOCK_REMOVE))
+ return;
+
spin_lock(&rtas_data_buf_lock);
memset(rtas_data_buf, 0, RTAS_DATA_BUF_SIZE);
call_status = rtas_call(rtas_token("ibm,get-system-parameter"), 3, 1,
resume_kernel:
REG_L s0, TASK_TI_PREEMPT_COUNT(tp)
bnez s0, restore_all
-need_resched:
REG_L s0, TASK_TI_FLAGS(tp)
andi s0, s0, _TIF_NEED_RESCHED
beqz s0, restore_all
call preempt_schedule_irq
- j need_resched
+ j restore_all
#endif
work_pending:
asmlinkage void do_trap_break(struct pt_regs *regs)
{
-#ifdef CONFIG_GENERIC_BUG
if (!user_mode(regs)) {
enum bug_trap_type type;
type = report_bug(regs->sepc, regs);
switch (type) {
- case BUG_TRAP_TYPE_NONE:
- break;
+#ifdef CONFIG_GENERIC_BUG
case BUG_TRAP_TYPE_WARN:
regs->sepc += get_break_insn_length(regs->sepc);
- break;
+ return;
case BUG_TRAP_TYPE_BUG:
+#endif /* CONFIG_GENERIC_BUG */
+ default:
die(regs, "Kernel BUG");
}
+ } else {
+ force_sig_fault(SIGTRAP, TRAP_BRKPT,
+ (void __user *)(regs->sepc));
}
-#endif /* CONFIG_GENERIC_BUG */
-
- force_sig_fault(SIGTRAP, TRAP_BRKPT, (void __user *)(regs->sepc));
}
#ifdef CONFIG_GENERIC_BUG
__rc; \
})
-static inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)
+static __always_inline int __put_user_fn(void *x, void __user *ptr, unsigned long size)
{
unsigned long spec = 0x010000UL;
int rc;
return rc;
}
-static inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)
+static __always_inline int __get_user_fn(void *x, const void __user *ptr, unsigned long size)
{
unsigned long spec = 0x01UL;
int rc;
-// SPDX-License-Identifier: GPL-2.0
+/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_X86_CPU_ENTRY_AREA_H
#define _ASM_X86_CPU_ENTRY_AREA_H
#define INTEL_FAM6_TIGERLAKE_L 0x8C
#define INTEL_FAM6_TIGERLAKE 0x8D
+#define INTEL_FAM6_COMETLAKE 0xA5
+#define INTEL_FAM6_COMETLAKE_L 0xA6
+
/* "Small Core" Processors (Atom) */
#define INTEL_FAM6_ATOM_BONNELL 0x1C /* Diamondville, Pineview */
#define MWAIT_ECX_INTERRUPT_BREAK 0x1
#define MWAITX_ECX_TIMER_ENABLE BIT(1)
#define MWAITX_MAX_LOOPS ((u32)-1)
-#define MWAITX_DISABLE_CSTATES 0xf
+#define MWAITX_DISABLE_CSTATES 0xf0
static inline void __monitor(const void *eax, unsigned long ecx,
unsigned long edx)
-// SPDX-License-Identifier: GPL-2.0
+/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_X86_PTI_H
#define _ASM_X86_PTI_H
#ifndef __ASSEMBLY__
if (unlikely(__gu_err)) goto err_label; \
} while (0)
+/*
+ * We want the unsafe accessors to always be inlined and use
+ * the error labels - thus the macro games.
+ */
+#define unsafe_copy_loop(dst, src, len, type, label) \
+ while (len >= sizeof(type)) { \
+ unsafe_put_user(*(type *)src,(type __user *)dst,label); \
+ dst += sizeof(type); \
+ src += sizeof(type); \
+ len -= sizeof(type); \
+ }
+
+#define unsafe_copy_to_user(_dst,_src,_len,label) \
+do { \
+ char __user *__ucu_dst = (_dst); \
+ const char *__ucu_src = (_src); \
+ size_t __ucu_len = (_len); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u64, label); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u32, label); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u16, label); \
+ unsafe_copy_loop(__ucu_dst, __ucu_src, __ucu_len, u8, label); \
+} while (0)
+
#endif /* _ASM_X86_UACCESS_H */
#define VMWARE_CMD_VCPU_RESERVED 31
#define VMWARE_PORT(cmd, eax, ebx, ecx, edx) \
- __asm__("inl (%%dx)" : \
+ __asm__("inl (%%dx), %%eax" : \
"=a"(eax), "=c"(ecx), "=d"(edx), "=b"(ebx) : \
"a"(VMWARE_HYPERVISOR_MAGIC), \
"c"(VMWARE_CMD_##cmd), \
-// SPDX-License-Identifier: GPL-2.0
+/* SPDX-License-Identifier: GPL-2.0 */
//
// Code shared between 32 and 64 bit
__monitorx(raw_cpu_ptr(&cpu_tss_rw), 0, 0);
/*
- * AMD, like Intel, supports the EAX hint and EAX=0xf
- * means, do not enter any deep C-state and we use it
+ * AMD, like Intel's MWAIT version, supports the EAX hint and
+ * EAX=0xf0 means, do not enter any deep C-state and we use it
* here in delay() to minimize wakeup latency.
*/
__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
BUG();
}
+static int reboot_reason = SHUTDOWN_reboot;
+static bool xen_legacy_crash;
void xen_emergency_restart(void)
{
- xen_reboot(SHUTDOWN_reboot);
+ xen_reboot(reboot_reason);
}
static int
xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
{
- if (!kexec_crash_loaded())
- xen_reboot(SHUTDOWN_crash);
+ if (!kexec_crash_loaded()) {
+ if (xen_legacy_crash)
+ xen_reboot(SHUTDOWN_crash);
+
+ reboot_reason = SHUTDOWN_crash;
+
+ /*
+ * If panic_timeout==0 then we are supposed to wait forever.
+ * However, to preserve original dom0 behavior we have to drop
+ * into hypervisor. (domU behavior is controlled by its
+ * config file)
+ */
+ if (panic_timeout == 0)
+ panic_timeout = -1;
+ }
return NOTIFY_DONE;
}
+static int __init parse_xen_legacy_crash(char *arg)
+{
+ xen_legacy_crash = true;
+ return 0;
+}
+early_param("xen_legacy_crash", parse_xen_legacy_crash);
+
static struct notifier_block xen_panic_block = {
.notifier_call = xen_panic_event,
.priority = INT_MIN
return ret;
}
-void rq_depth_scale_up(struct rq_depth *rqd)
+/* Returns true on success and false if scaling up wasn't possible */
+bool rq_depth_scale_up(struct rq_depth *rqd)
{
/*
* Hit max in previous round, stop here
*/
if (rqd->scaled_max)
- return;
+ return false;
rqd->scale_step--;
rqd->scaled_max = rq_depth_calc_max_depth(rqd);
+ return true;
}
/*
* Scale rwb down. If 'hard_throttle' is set, do it quicker, since we
- * had a latency violation.
+ * had a latency violation. Returns true on success and returns false if
+ * scaling down wasn't possible.
*/
-void rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle)
+bool rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle)
{
/*
* Stop scaling down when we've hit the limit. This also prevents
* keep up.
*/
if (rqd->max_depth == 1)
- return;
+ return false;
if (rqd->scale_step < 0 && hard_throttle)
rqd->scale_step = 0;
rqd->scaled_max = false;
rq_depth_calc_max_depth(rqd);
+ return true;
}
struct rq_qos_wait_data {
acquire_inflight_cb_t *acquire_inflight_cb,
cleanup_cb_t *cleanup_cb);
bool rq_wait_inc_below(struct rq_wait *rq_wait, unsigned int limit);
-void rq_depth_scale_up(struct rq_depth *rqd);
-void rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle);
+bool rq_depth_scale_up(struct rq_depth *rqd);
+bool rq_depth_scale_down(struct rq_depth *rqd, bool hard_throttle);
bool rq_depth_calc_max_depth(struct rq_depth *rqd);
void __rq_qos_cleanup(struct rq_qos *rqos, struct bio *bio);
static void scale_up(struct rq_wb *rwb)
{
- rq_depth_scale_up(&rwb->rq_depth);
+ if (!rq_depth_scale_up(&rwb->rq_depth))
+ return;
calc_wb_limits(rwb);
rwb->unknown_cnt = 0;
rwb_wake_all(rwb);
static void scale_down(struct rq_wb *rwb, bool hard_throttle)
{
- rq_depth_scale_down(&rwb->rq_depth, hard_throttle);
+ if (!rq_depth_scale_down(&rwb->rq_depth, hard_throttle))
+ return;
calc_wb_limits(rwb);
rwb->unknown_cnt = 0;
rwb_trace_step(rwb, "scale down");
return;
}
- /*
- * XXX - UGLY HACK
- *
- * The block layer suspend/resume path is fundamentally broken due
- * to freezable kthreads and workqueue and may deadlock if a block
- * device gets removed while resume is in progress. I don't know
- * what the solution is short of removing freezable kthreads and
- * workqueues altogether.
- *
- * The following is an ugly hack to avoid kicking off device
- * removal while freezer is active. This is a joke but does avoid
- * this particular deadlock scenario.
- *
- * https://bugzilla.kernel.org/show_bug.cgi?id=62801
- * http://marc.info/?l=linux-kernel&m=138695698516487
- */
-#ifdef CONFIG_FREEZER
- while (pm_freezing)
- msleep(10);
-#endif
-
DPRINTK("ENTER\n");
mutex_lock(&ap->scsi_scan_mutex);
if (refcount_dec_and_mutex_lock(&nbd->refs,
&nbd_index_mutex)) {
idr_remove(&nbd_index_idr, nbd->index);
- mutex_unlock(&nbd_index_mutex);
nbd_dev_remove(nbd);
+ mutex_unlock(&nbd_index_mutex);
}
}
zone->wp = zone->start;
break;
default:
- cmd->error = BLK_STS_NOTSUPP;
- break;
+ return BLK_STS_NOTSUPP;
}
return BLK_STS_OK;
}
}
for_each_set_bit(n, ®, SPRD_EIC_PER_BANK_NR) {
- girq = irq_find_mapping(chip->irq.domain,
- bank * SPRD_EIC_PER_BANK_NR + n);
+ u32 offset = bank * SPRD_EIC_PER_BANK_NR + n;
+
+ girq = irq_find_mapping(chip->irq.domain, offset);
generic_handle_irq(girq);
- sprd_eic_toggle_trigger(chip, girq, n);
+ sprd_eic_toggle_trigger(chip, girq, offset);
}
}
}
case 0:
val = MAX77620_CNFG_GPIO_DBNC_None;
break;
- case 1 ... 8:
+ case 1000 ... 8000:
val = MAX77620_CNFG_GPIO_DBNC_8ms;
break;
- case 9 ... 16:
+ case 9000 ... 16000:
val = MAX77620_CNFG_GPIO_DBNC_16ms;
break;
- case 17 ... 32:
+ case 17000 ... 32000:
val = MAX77620_CNFG_GPIO_DBNC_32ms;
break;
default:
transitory = flags & OF_GPIO_TRANSITORY;
ret = gpiod_request(desc, label);
- if (ret == -EBUSY && (flags & GPIOD_FLAGS_BIT_NONEXCLUSIVE))
+ if (ret == -EBUSY && (dflags & GPIOD_FLAGS_BIT_NONEXCLUSIVE))
return desc;
if (ret)
return ERR_PTR(ret);
if (!ret)
goto set_output_value;
/* Emulate open drain by not actively driving the line high */
- if (value)
- return gpiod_direction_input(desc);
+ if (value) {
+ ret = gpiod_direction_input(desc);
+ goto set_output_flag;
+ }
}
else if (test_bit(FLAG_OPEN_SOURCE, &desc->flags)) {
ret = gpio_set_config(gc, gpio_chip_hwgpio(desc),
if (!ret)
goto set_output_value;
/* Emulate open source by not actively driving the line low */
- if (!value)
- return gpiod_direction_input(desc);
+ if (!value) {
+ ret = gpiod_direction_input(desc);
+ goto set_output_flag;
+ }
} else {
gpio_set_config(gc, gpio_chip_hwgpio(desc),
PIN_CONFIG_DRIVE_PUSH_PULL);
set_output_value:
return gpiod_direction_output_raw_commit(desc, value);
+
+set_output_flag:
+ /*
+ * When emulating open-source or open-drain functionalities by not
+ * actively driving the line (setting mode to input) we still need to
+ * set the IS_OUT flag or otherwise we won't be able to set the line
+ * value anymore.
+ */
+ if (ret == 0)
+ set_bit(FLAG_IS_OUT, &desc->flags);
+ return ret;
}
EXPORT_SYMBOL_GPL(gpiod_direction_output);
if (value) {
ret = chip->direction_input(chip, offset);
- if (!ret)
- clear_bit(FLAG_IS_OUT, &desc->flags);
} else {
ret = chip->direction_output(chip, offset, 0);
if (!ret)
set_bit(FLAG_IS_OUT, &desc->flags);
} else {
ret = chip->direction_input(chip, offset);
- if (!ret)
- clear_bit(FLAG_IS_OUT, &desc->flags);
}
trace_gpio_direction(desc_to_gpio(desc), !value, ret);
if (ret < 0)
r = amdgpu_bo_create_list_entry_array(&args->in, &info);
if (r)
- goto error_free;
+ return r;
switch (args->in.operation) {
case AMDGPU_BO_LIST_OP_CREATE:
r = idr_alloc(&fpriv->bo_list_handles, list, 1, 0, GFP_KERNEL);
mutex_unlock(&fpriv->bo_list_lock);
if (r < 0) {
- amdgpu_bo_list_put(list);
- return r;
+ goto error_put_list;
}
handle = r;
mutex_unlock(&fpriv->bo_list_lock);
if (IS_ERR(old)) {
- amdgpu_bo_list_put(list);
r = PTR_ERR(old);
- goto error_free;
+ goto error_put_list;
}
amdgpu_bo_list_put(old);
return 0;
+error_put_list:
+ amdgpu_bo_list_put(list);
+
error_free:
- if (info)
- kvfree(info);
+ kvfree(info);
return r;
}
int lower_margin = mode->vsync_start - mode->vdisplay;
int vsync_len = mode->vsync_end - mode->vsync_start;
u32 dp0_syncval;
+ u32 bits_per_pixel = 24;
+ u32 in_bw, out_bw;
/*
* Recommended maximum number of symbols transferred in a transfer unit:
* (output active video bandwidth in bytes))
* Must be less than tu_size.
*/
- max_tu_symbol = TU_SIZE_RECOMMENDED - 1;
+
+ in_bw = mode->clock * bits_per_pixel / 8;
+ out_bw = tc->link.base.num_lanes * tc->link.base.rate;
+ max_tu_symbol = DIV_ROUND_UP(in_bw * TU_SIZE_RECOMMENDED, out_bw);
dev_dbg(tc->dev, "set mode %dx%d\n",
mode->hdisplay, mode->vdisplay);
switch (fb->modifier) {
case DRM_FORMAT_MOD_LINEAR:
case I915_FORMAT_MOD_X_TILED:
- return 4096;
+ /*
+ * Validated limit is 4k, but has 5k should
+ * work apart from the following features:
+ * - Ytile (already limited to 4k)
+ * - FP16 (already limited to 4k)
+ * - render compression (already limited to 4k)
+ * - KVMR sprite and cursor (don't care)
+ * - horizontal panning (TODO verify this)
+ * - pipe and plane scaling (TODO verify this)
+ */
+ if (cpp == 8)
+ return 4096;
+ else
+ return 5120;
case I915_FORMAT_MOD_Y_TILED_CCS:
case I915_FORMAT_MOD_Yf_TILED_CCS:
/* FIXME AUX plane? */
wakeref = intel_runtime_pm_get(rpm);
- srcu = intel_gt_reset_trylock(ggtt->vm.gt);
- if (srcu < 0) {
- ret = srcu;
+ ret = intel_gt_reset_trylock(ggtt->vm.gt, &srcu);
+ if (ret)
goto err_rpm;
- }
ret = i915_mutex_lock_interruptible(dev);
if (ret)
intel_wakeref_auto(&i915->ggtt.userfault_wakeref,
msecs_to_jiffies_timeout(CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND));
- i915_vma_set_ggtt_write(vma);
+ if (write) {
+ GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
+ i915_vma_set_ggtt_write(vma);
+ obj->mm.dirty = true;
+ }
err_fence:
i915_vma_unpin_fence(vma);
mutex_lock(&i915->drm.struct_mutex);
intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
- i915_gem_restore_gtt_mappings(i915);
- i915_gem_restore_fences(i915);
-
if (i915_gem_init_hw(i915))
goto err_wedged;
return READ_ONCE(*execlists->active);
}
+static inline void
+execlists_active_lock_bh(struct intel_engine_execlists *execlists)
+{
+ local_bh_disable(); /* prevent local softirq and lock recursion */
+ tasklet_lock(&execlists->tasklet);
+}
+
+static inline void
+execlists_active_unlock_bh(struct intel_engine_execlists *execlists)
+{
+ tasklet_unlock(&execlists->tasklet);
+ local_bh_enable(); /* restore softirq, and kick ksoftirqd! */
+}
+
struct i915_request *
execlists_unwind_incomplete_requests(struct intel_engine_execlists *execlists);
struct drm_printer *m)
{
struct drm_i915_private *dev_priv = engine->i915;
- const struct intel_engine_execlists * const execlists =
- &engine->execlists;
- unsigned long flags;
+ struct intel_engine_execlists * const execlists = &engine->execlists;
u64 addr;
if (engine->id == RENDER_CLASS && IS_GEN_RANGE(dev_priv, 4, 7))
idx, hws[idx * 2], hws[idx * 2 + 1]);
}
- spin_lock_irqsave(&engine->active.lock, flags);
+ execlists_active_lock_bh(execlists);
for (port = execlists->active; (rq = *port); port++) {
char hdr[80];
int len;
hwsp_seqno(rq));
print_request(m, rq, hdr);
}
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ execlists_active_unlock_bh(execlists);
} else if (INTEL_GEN(dev_priv) > 6) {
drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n",
ENGINE_READ(engine, RING_PP_DIR_BASE));
if (!intel_engine_supports_stats(engine))
return -ENODEV;
- spin_lock_irqsave(&engine->active.lock, flags);
- write_seqlock(&engine->stats.lock);
+ execlists_active_lock_bh(execlists);
+ write_seqlock_irqsave(&engine->stats.lock, flags);
if (unlikely(engine->stats.enabled == ~0)) {
err = -EBUSY;
}
unlock:
- write_sequnlock(&engine->stats.lock);
- spin_unlock_irqrestore(&engine->active.lock, flags);
+ write_sequnlock_irqrestore(&engine->stats.lock, flags);
+ execlists_active_unlock_bh(execlists);
return err;
}
struct intel_engine_cs *cur, *old;
trace_i915_request_out(rq);
- GEM_BUG_ON(intel_context_inflight(ce) != rq->engine);
old = READ_ONCE(ce->inflight);
do
GEM_BUG_ON(prev == next);
GEM_BUG_ON(!assert_priority_queue(prev, next));
+ /*
+ * We do not submit known completed requests. Therefore if the next
+ * request is already completed, we can pretend to merge it in
+ * with the previous context (and we will skip updating the ELSP
+ * and tracking). Thus hopefully keeping the ELSP full with active
+ * contexts, despite the best efforts of preempt-to-busy to confuse
+ * us.
+ */
+ if (i915_request_completed(next))
+ return true;
+
if (!can_merge_ctx(prev->hw_context, next->hw_context))
return false;
static struct i915_request *
last_active(const struct intel_engine_execlists *execlists)
{
- struct i915_request * const *last = execlists->active;
+ struct i915_request * const *last = READ_ONCE(execlists->active);
while (*last && i915_request_completed(*last))
last++;
continue;
}
- if (i915_request_completed(rq)) {
- ve->request = NULL;
- ve->base.execlists.queue_priority_hint = INT_MIN;
- rb_erase_cached(rb, &execlists->virtual);
- RB_CLEAR_NODE(rb);
-
- rq->engine = engine;
- __i915_request_submit(rq);
-
- spin_unlock(&ve->base.active.lock);
-
- rb = rb_first_cached(&execlists->virtual);
- continue;
- }
-
if (last && !can_merge_rq(last, rq)) {
spin_unlock(&ve->base.active.lock);
return; /* leave this for another */
GEM_BUG_ON(ve->siblings[0] != engine);
}
- __i915_request_submit(rq);
- if (!i915_request_completed(rq)) {
+ if (__i915_request_submit(rq)) {
submit = true;
last = rq;
}
+
+ /*
+ * Hmm, we have a bunch of virtual engine requests,
+ * but the first one was already completed (thanks
+ * preempt-to-busy!). Keep looking at the veng queue
+ * until we have no more relevant requests (i.e.
+ * the normal submit queue has higher priority).
+ */
+ if (!submit) {
+ spin_unlock(&ve->base.active.lock);
+ rb = rb_first_cached(&execlists->virtual);
+ continue;
+ }
}
spin_unlock(&ve->base.active.lock);
int i;
priolist_for_each_request_consume(rq, rn, p, i) {
- if (i915_request_completed(rq))
- goto skip;
+ bool merge = true;
/*
* Can we combine this request with the current port?
ctx_single_port_submission(rq->hw_context))
goto done;
- *port = execlists_schedule_in(last, port - execlists->pending);
- port++;
+ merge = false;
}
- last = rq;
- submit = true;
-skip:
- __i915_request_submit(rq);
+ if (__i915_request_submit(rq)) {
+ if (!merge) {
+ *port = execlists_schedule_in(last, port - execlists->pending);
+ port++;
+ last = NULL;
+ }
+
+ GEM_BUG_ON(last &&
+ !can_merge_ctx(last->hw_context,
+ rq->hw_context));
+
+ submit = true;
+ last = rq;
+ }
}
rb_erase_cached(&p->node, &execlists->queue);
static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
{
lockdep_assert_held(&engine->active.lock);
- if (!engine->execlists.pending[0])
+ if (!engine->execlists.pending[0]) {
+ rcu_read_lock(); /* protect peeking at execlists->active */
execlists_dequeue(engine);
+ rcu_read_unlock();
+ }
}
/*
static struct i915_request *active_request(struct i915_request *rq)
{
- const struct list_head * const list = &rq->timeline->requests;
const struct intel_context * const ce = rq->hw_context;
struct i915_request *active = NULL;
+ struct list_head *list;
+
+ if (!i915_request_is_active(rq)) /* unwound, but incomplete! */
+ return rq;
+ list = &rq->timeline->requests;
list_for_each_entry_from_reverse(rq, list, link) {
if (i915_request_completed(rq))
break;
int i;
priolist_for_each_request_consume(rq, rn, p, i) {
- list_del_init(&rq->sched.link);
__i915_request_submit(rq);
dma_fence_set_error(&rq->fence, -EIO);
i915_request_mark_complete(rq);
virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
{
struct virtual_engine *ve = to_virtual_engine(rq->engine);
+ intel_engine_mask_t allowed, exec;
struct ve_bond *bond;
+ allowed = ~to_request(signal)->engine->mask;
+
bond = virtual_find_bond(ve, to_request(signal)->engine);
- if (bond) {
- intel_engine_mask_t old, new, cmp;
+ if (bond)
+ allowed &= bond->sibling_mask;
- cmp = READ_ONCE(rq->execution_mask);
- do {
- old = cmp;
- new = cmp & bond->sibling_mask;
- } while ((cmp = cmpxchg(&rq->execution_mask, old, new)) != old);
- }
+ /* Restrict the bonded request to run on only the available engines */
+ exec = READ_ONCE(rq->execution_mask);
+ while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
+ ;
+
+ /* Prevent the master from being re-run on the bonded engines */
+ to_request(signal)->execution_mask &= ~allowed;
}
struct intel_context *
struct intel_engine_cs *engine = rq->engine;
struct i915_gem_context *hung_ctx = rq->gem_context;
- lockdep_assert_held(&engine->active.lock);
-
if (!i915_request_is_active(rq))
return;
+ lockdep_assert_held(&engine->active.lock);
list_for_each_entry_continue(rq, &engine->active.requests, sched.link)
if (rq->gem_context == hung_ctx)
i915_request_skip(rq, -EIO);
rq->fence.seqno,
yesno(guilty));
- lockdep_assert_held(&rq->engine->active.lock);
GEM_BUG_ON(i915_request_completed(rq));
if (guilty) {
intel_runtime_pm_put(>->i915->runtime_pm, wakeref);
}
-int intel_gt_reset_trylock(struct intel_gt *gt)
+int intel_gt_reset_trylock(struct intel_gt *gt, int *srcu)
{
- int srcu;
-
might_lock(>->reset.backoff_srcu);
might_sleep();
rcu_read_lock();
}
- srcu = srcu_read_lock(>->reset.backoff_srcu);
+ *srcu = srcu_read_lock(>->reset.backoff_srcu);
rcu_read_unlock();
- return srcu;
+ return 0;
}
void intel_gt_reset_unlock(struct intel_gt *gt, int tag)
void __i915_request_reset(struct i915_request *rq, bool guilty);
-int __must_check intel_gt_reset_trylock(struct intel_gt *gt);
+int __must_check intel_gt_reset_trylock(struct intel_gt *gt, int *srcu);
void intel_gt_reset_unlock(struct intel_gt *gt, int tag);
void intel_gt_set_wedged(struct intel_gt *gt);
struct intel_engine_cs *engine = rq->engine;
enum intel_engine_id id;
const int num_engines =
- IS_HSW_GT1(i915) ? RUNTIME_INFO(i915)->num_engines - 1 : 0;
+ IS_HASWELL(i915) ? RUNTIME_INFO(i915)->num_engines - 1 : 0;
bool force_restore = false;
int len;
u32 *cs;
/* WaAllowUMDToModifyHDCChicken1:skl,bxt,kbl,glk,cfl */
whitelist_reg(w, GEN8_HDC_CHICKEN1);
+
+ /* WaSendPushConstantsFromMMIO:skl,bxt */
+ whitelist_reg(w, COMMON_SLICE_CHICKEN2);
}
static void skl_whitelist_build(struct intel_engine_cs *engine)
if (ret)
DRM_ERROR("failed to re-enable GGTT\n");
+ mutex_lock(&dev_priv->drm.struct_mutex);
+ i915_gem_restore_gtt_mappings(dev_priv);
+ i915_gem_restore_fences(dev_priv);
+ mutex_unlock(&dev_priv->drm.struct_mutex);
+
intel_csr_ucode_resume(dev_priv);
i915_restore_state(dev_priv);
#define I915_GEM_IDLE_TIMEOUT (HZ / 5)
+static inline void tasklet_lock(struct tasklet_struct *t)
+{
+ while (!tasklet_trylock(t))
+ cpu_relax();
+}
+
static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
{
if (!atomic_fetch_inc(&t->count))
}
}
+static void remove_from_engine(struct i915_request *rq)
+{
+ struct intel_engine_cs *engine, *locked;
+
+ /*
+ * Virtual engines complicate acquiring the engine timeline lock,
+ * as their rq->engine pointer is not stable until under that
+ * engine lock. The simple ploy we use is to take the lock then
+ * check that the rq still belongs to the newly locked engine.
+ */
+ locked = READ_ONCE(rq->engine);
+ spin_lock(&locked->active.lock);
+ while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
+ spin_unlock(&locked->active.lock);
+ spin_lock(&engine->active.lock);
+ locked = engine;
+ }
+ list_del(&rq->sched.link);
+ spin_unlock(&locked->active.lock);
+}
+
static bool i915_request_retire(struct i915_request *rq)
{
struct i915_active_request *active, *next;
* request that we have removed from the HW and put back on a run
* queue.
*/
- spin_lock(&rq->engine->active.lock);
- list_del(&rq->sched.link);
- spin_unlock(&rq->engine->active.lock);
+ remove_from_engine(rq);
spin_lock(&rq->lock);
i915_request_mark_complete(rq);
return 0;
}
-void __i915_request_submit(struct i915_request *request)
+bool __i915_request_submit(struct i915_request *request)
{
struct intel_engine_cs *engine = request->engine;
+ bool result = false;
GEM_TRACE("%s fence %llx:%lld, current %d\n",
engine->name,
GEM_BUG_ON(!irqs_disabled());
lockdep_assert_held(&engine->active.lock);
+ /*
+ * With the advent of preempt-to-busy, we frequently encounter
+ * requests that we have unsubmitted from HW, but left running
+ * until the next ack and so have completed in the meantime. On
+ * resubmission of that completed request, we can skip
+ * updating the payload, and execlists can even skip submitting
+ * the request.
+ *
+ * We must remove the request from the caller's priority queue,
+ * and the caller must only call us when the request is in their
+ * priority queue, under the active.lock. This ensures that the
+ * request has *not* yet been retired and we can safely move
+ * the request into the engine->active.list where it will be
+ * dropped upon retiring. (Otherwise if resubmit a *retired*
+ * request, this would be a horrible use-after-free.)
+ */
+ if (i915_request_completed(request))
+ goto xfer;
+
if (i915_gem_context_is_banned(request->gem_context))
i915_request_skip(request, -EIO);
i915_sw_fence_signaled(&request->semaphore))
engine->saturated |= request->sched.semaphores;
- /* We may be recursing from the signal callback of another i915 fence */
- spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
+ engine->emit_fini_breadcrumb(request,
+ request->ring->vaddr + request->postfix);
- list_move_tail(&request->sched.link, &engine->active.requests);
+ trace_i915_request_execute(request);
+ engine->serial++;
+ result = true;
+
+xfer: /* We may be recursing from the signal callback of another i915 fence */
+ spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
- GEM_BUG_ON(test_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags));
- set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags);
+ if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags))
+ list_move_tail(&request->sched.link, &engine->active.requests);
if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
spin_unlock(&request->lock);
- engine->emit_fini_breadcrumb(request,
- request->ring->vaddr + request->postfix);
-
- engine->serial++;
-
- trace_i915_request_execute(request);
+ return result;
}
void i915_request_submit(struct i915_request *request)
void i915_request_add(struct i915_request *rq);
-void __i915_request_submit(struct i915_request *request);
+bool __i915_request_submit(struct i915_request *request);
void i915_request_submit(struct i915_request *request);
void i915_request_skip(struct i915_request *request, int error);
WARN_ON(!IS_CANNONLAKE(dev_priv) && !IS_COFFEELAKE(dev_priv));
return PCH_CNP;
case INTEL_PCH_CMP_DEVICE_ID_TYPE:
+ case INTEL_PCH_CMP2_DEVICE_ID_TYPE:
DRM_DEBUG_KMS("Found Comet Lake PCH (CMP)\n");
WARN_ON(!IS_COFFEELAKE(dev_priv));
/* CometPoint is CNP Compatible */
#define INTEL_PCH_CNP_DEVICE_ID_TYPE 0xA300
#define INTEL_PCH_CNP_LP_DEVICE_ID_TYPE 0x9D80
#define INTEL_PCH_CMP_DEVICE_ID_TYPE 0x0280
+#define INTEL_PCH_CMP2_DEVICE_ID_TYPE 0x0680
#define INTEL_PCH_ICP_DEVICE_ID_TYPE 0x3480
#define INTEL_PCH_MCC_DEVICE_ID_TYPE 0x4B00
#define INTEL_PCH_MCC2_DEVICE_ID_TYPE 0x3880
with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
intel_gt_sanitize(&i915->gt, false);
i915_gem_sanitize(i915);
+
+ mutex_lock(&i915->drm.struct_mutex);
+ i915_gem_restore_gtt_mappings(i915);
+ i915_gem_restore_fences(i915);
+ mutex_unlock(&i915->drm.struct_mutex);
+
i915_gem_resume(i915);
}
}
MODULE_DEVICE_TABLE(of, lb035q02_of_match);
+static const struct spi_device_id lb035q02_ids[] = {
+ { "lb035q02", 0 },
+ { /* sentinel */ }
+};
+
+MODULE_DEVICE_TABLE(spi, lb035q02_ids);
+
static struct spi_driver lb035q02_driver = {
.probe = lb035q02_probe,
.remove = lb035q02_remove,
+ .id_table = lb035q02_ids,
.driver = {
.name = "panel-lg-lb035q02",
.of_match_table = lb035q02_of_match,
module_spi_driver(lb035q02_driver);
-MODULE_ALIAS("spi:lgphilips,lb035q02");
MODULE_AUTHOR("Tomi Valkeinen <tomi.valkeinen@ti.com>");
MODULE_DESCRIPTION("LG.Philips LB035Q02 LCD Panel driver");
MODULE_LICENSE("GPL");
MODULE_DEVICE_TABLE(of, nl8048_of_match);
+static const struct spi_device_id nl8048_ids[] = {
+ { "nl8048hl11", 0 },
+ { /* sentinel */ }
+};
+
+MODULE_DEVICE_TABLE(spi, nl8048_ids);
+
static struct spi_driver nl8048_driver = {
.probe = nl8048_probe,
.remove = nl8048_remove,
+ .id_table = nl8048_ids,
.driver = {
.name = "panel-nec-nl8048hl11",
.pm = &nl8048_pm_ops,
module_spi_driver(nl8048_driver);
-MODULE_ALIAS("spi:nec,nl8048hl11");
MODULE_AUTHOR("Erik Gilling <konkers@android.com>");
MODULE_DESCRIPTION("NEC-NL8048HL11 Driver");
MODULE_LICENSE("GPL");
MODULE_DEVICE_TABLE(of, acx565akm_of_match);
+static const struct spi_device_id acx565akm_ids[] = {
+ { "acx565akm", 0 },
+ { /* sentinel */ }
+};
+
+MODULE_DEVICE_TABLE(spi, acx565akm_ids);
+
static struct spi_driver acx565akm_driver = {
.probe = acx565akm_probe,
.remove = acx565akm_remove,
+ .id_table = acx565akm_ids,
.driver = {
.name = "panel-sony-acx565akm",
.of_match_table = acx565akm_of_match,
module_spi_driver(acx565akm_driver);
-MODULE_ALIAS("spi:sony,acx565akm");
MODULE_AUTHOR("Nokia Corporation");
MODULE_DESCRIPTION("Sony ACX565AKM LCD Panel Driver");
MODULE_LICENSE("GPL");
MODULE_DEVICE_TABLE(of, td028ttec1_of_match);
static const struct spi_device_id td028ttec1_ids[] = {
- { "tpo,td028ttec1", 0},
- { "toppoly,td028ttec1", 0 },
+ { "td028ttec1", 0 },
{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, td043mtea1_of_match);
+static const struct spi_device_id td043mtea1_ids[] = {
+ { "td043mtea1", 0 },
+ { /* sentinel */ }
+};
+
+MODULE_DEVICE_TABLE(spi, td043mtea1_ids);
+
static struct spi_driver td043mtea1_driver = {
.probe = td043mtea1_probe,
.remove = td043mtea1_remove,
+ .id_table = td043mtea1_ids,
.driver = {
.name = "panel-tpo-td043mtea1",
.pm = &td043mtea1_pm_ops,
module_spi_driver(td043mtea1_driver);
-MODULE_ALIAS("spi:tpo,td043mtea1");
MODULE_AUTHOR("Gražvydas Ignotas <notasas@gmail.com>");
MODULE_DESCRIPTION("TPO TD043MTEA1 Panel Driver");
MODULE_LICENSE("GPL");
struct device *dev = &xb_dev->dev;
int ret;
- /*
- * The device is not spawn from a device tree, so arch_setup_dma_ops
- * is not called, thus leaving the device with dummy DMA ops.
- * This makes the device return error on PRIME buffer import, which
- * is not correct: to fix this call of_dma_configure() with a NULL
- * node to set default DMA ops.
- */
- dev->coherent_dma_mask = DMA_BIT_MASK(32);
- ret = of_dma_configure(dev, NULL, true);
+ ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(64));
if (ret < 0) {
- DRM_ERROR("Cannot setup DMA ops, ret %d", ret);
+ DRM_ERROR("Cannot setup DMA mask, ret %d", ret);
return ret;
}
static void mousevsc_on_channel_callback(void *context)
{
- const int packet_size = 0x100;
- int ret;
struct hv_device *device = context;
- u32 bytes_recvd;
- u64 req_id;
struct vmpacket_descriptor *desc;
- unsigned char *buffer;
- int bufferlen = packet_size;
-
- buffer = kmalloc(bufferlen, GFP_ATOMIC);
- if (!buffer)
- return;
-
- do {
- ret = vmbus_recvpacket_raw(device->channel, buffer,
- bufferlen, &bytes_recvd, &req_id);
-
- switch (ret) {
- case 0:
- if (bytes_recvd <= 0) {
- kfree(buffer);
- return;
- }
- desc = (struct vmpacket_descriptor *)buffer;
-
- switch (desc->type) {
- case VM_PKT_COMP:
- break;
-
- case VM_PKT_DATA_INBAND:
- mousevsc_on_receive(device, desc);
- break;
-
- default:
- pr_err("unhandled packet type %d, tid %llx len %d\n",
- desc->type, req_id, bytes_recvd);
- break;
- }
+ foreach_vmbus_pkt(desc, device->channel) {
+ switch (desc->type) {
+ case VM_PKT_COMP:
break;
- case -ENOBUFS:
- kfree(buffer);
- /* Handle large packet */
- bufferlen = bytes_recvd;
- buffer = kmalloc(bytes_recvd, GFP_ATOMIC);
-
- if (!buffer)
- return;
+ case VM_PKT_DATA_INBAND:
+ mousevsc_on_receive(device, desc);
+ break;
+ default:
+ pr_err("Unhandled packet type %d, tid %llx len %d\n",
+ desc->type, desc->trans_id, desc->len8 * 8);
break;
}
- } while (1);
-
+ }
}
static int mousevsc_connect_to_vsp(struct hv_device *device)
drv->shutdown(dev);
}
+#ifdef CONFIG_PM_SLEEP
/*
* vmbus_suspend - Suspend a vmbus device
*/
return drv->resume(dev);
}
+#endif /* CONFIG_PM_SLEEP */
/*
* vmbus_device_release - Final callback release of the vmbus child device
vmbus_signal_eom(msg, message_type);
}
+#ifdef CONFIG_PM_SLEEP
/*
* Fake RESCIND_CHANNEL messages to clean up hv_sock channels by force for
* hibernation, because hv_sock connections can not persist across hibernation.
vmbus_connection.work_queue,
&ctx->work);
}
+#endif /* CONFIG_PM_SLEEP */
/*
* Direct callback for channels using other deferred processing
return ret_val;
}
+#ifdef CONFIG_PM_SLEEP
static int vmbus_bus_suspend(struct device *dev)
{
struct vmbus_channel *channel, *sc;
return 0;
}
+#endif /* CONFIG_PM_SLEEP */
static const struct acpi_device_id vmbus_acpi_device_ids[] = {
{"VMBUS", 0},
error1:
port_modify.set_port_cap_mask = 0;
port_modify.clr_port_cap_mask = IB_PORT_CM_SUP;
+ kfree(port);
while (--i) {
if (!rdma_cap_ib_cm(ib_device, i))
continue;
ib_modify_port(ib_device, port->port_num, 0, &port_modify);
ib_unregister_mad_agent(port->mad_agent);
cm_remove_port_fs(port);
+ kfree(port);
}
free:
kfree(cm_dev);
spin_unlock_irq(&cm.state_lock);
ib_unregister_mad_agent(cur_mad_agent);
cm_remove_port_fs(port);
+ kfree(port);
}
kfree(cm_dev);
conn_id->cm_id.iw = NULL;
cma_exch(conn_id, RDMA_CM_DESTROYING);
mutex_unlock(&conn_id->handler_mutex);
+ mutex_unlock(&listen_id->handler_mutex);
cma_deref_id(conn_id);
rdma_destroy_id(&conn_id->id);
- goto out;
+ return ret;
}
mutex_unlock(&conn_id->handler_mutex);
if (!netdev)
return -ENODEV;
- dev_put(netdev);
-
port_attr->max_mtu = IB_MTU_4096;
port_attr->active_mtu = ib_mtu_int_to_enum(netdev->mtu);
port_attr->state = IB_PORT_DOWN;
port_attr->phys_state = IB_PORT_PHYS_STATE_DISABLED;
} else {
- inetdev = in_dev_get(netdev);
+ rcu_read_lock();
+ inetdev = __in_dev_get_rcu(netdev);
if (inetdev && inetdev->ifa_list) {
port_attr->state = IB_PORT_ACTIVE;
port_attr->phys_state = IB_PORT_PHYS_STATE_LINK_UP;
- in_dev_put(inetdev);
} else {
port_attr->state = IB_PORT_INIT;
port_attr->phys_state =
IB_PORT_PHYS_STATE_PORT_CONFIGURATION_TRAINING;
}
+
+ rcu_read_unlock();
}
+ dev_put(netdev);
err = device->ops.query_port(device, port_num, port_attr);
if (err)
return err;
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
if (!msg) {
ret = -ENOMEM;
- goto err;
+ goto err_get;
}
nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
cntn = nla_get_u32(tb[RDMA_NLDEV_ATTR_STAT_COUNTER_ID]);
qpn = nla_get_u32(tb[RDMA_NLDEV_ATTR_RES_LQPN]);
- ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
- if (ret)
- goto err_unbind;
-
if (fill_nldev_handle(msg, device) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port) ||
nla_put_u32(msg, RDMA_NLDEV_ATTR_STAT_COUNTER_ID, cntn) ||
goto err_fill;
}
+ ret = rdma_counter_unbind_qpn(device, port, qpn, cntn);
+ if (ret)
+ goto err_fill;
+
nlmsg_end(msg, nlh);
ib_device_put(device);
return rdma_nl_unicast(sock_net(skb->sk), msg, NETLINK_CB(skb).portid);
err_fill:
- rdma_counter_bind_qpn(device, port, qpn, cntn);
-err_unbind:
nlmsg_free(msg);
err:
ib_device_put(device);
int ret;
rdma_for_each_port (dev, i) {
- is_ib = rdma_protocol_ib(dev, i++);
+ is_ib = rdma_protocol_ib(dev, i);
if (is_ib)
break;
}
* that the hardware will not attempt to access the MR any more.
*/
if (!umem_odp->is_implicit_odp) {
+ mutex_lock(&umem_odp->umem_mutex);
ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
ib_umem_end(umem_odp));
+ mutex_unlock(&umem_odp->umem_mutex);
kvfree(umem_odp->dma_list);
kvfree(umem_odp->page_list);
}
u64 addr;
struct ib_device *dev = umem_odp->umem.ibdev;
+ lockdep_assert_held(&umem_odp->umem_mutex);
+
virt = max_t(u64, virt, ib_umem_start(umem_odp));
bound = min_t(u64, bound, ib_umem_end(umem_odp));
/* Note that during the run of this function, the
* faults from completion. We might be racing with other
* invalidations, so we must make sure we free each page only
* once. */
- mutex_lock(&umem_odp->umem_mutex);
for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) {
idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift;
if (umem_odp->page_list[idx]) {
umem_odp->npages--;
}
}
- mutex_unlock(&umem_odp->umem_mutex);
}
EXPORT_SYMBOL(ib_umem_odp_unmap_dma_pages);
}
}
-static int dump_qp(struct c4iw_qp *qp, struct c4iw_debugfs_data *qpd)
+static int dump_qp(unsigned long id, struct c4iw_qp *qp,
+ struct c4iw_debugfs_data *qpd)
{
int space;
int cc;
+ if (id != qp->wq.sq.qid)
+ return 0;
space = qpd->bufsize - qpd->pos - 1;
if (space == 0)
xa_lock_irq(&qpd->devp->qps);
xa_for_each(&qpd->devp->qps, index, qp)
- dump_qp(qp, qpd);
+ dump_qp(index, qp, qpd);
xa_unlock_irq(&qpd->devp->qps);
qpd->buf[qpd->pos++] = 0;
struct sk_buff *skb, struct c4iw_wr_wait *wr_waitp)
{
int err;
- struct fw_ri_tpte tpt;
+ struct fw_ri_tpte *tpt;
u32 stag_idx;
static atomic_t key;
if (c4iw_fatal_error(rdev))
return -EIO;
+ tpt = kmalloc(sizeof(*tpt), GFP_KERNEL);
+ if (!tpt)
+ return -ENOMEM;
+
stag_state = stag_state > 0;
stag_idx = (*stag) >> 8;
mutex_lock(&rdev->stats.lock);
rdev->stats.stag.fail++;
mutex_unlock(&rdev->stats.lock);
+ kfree(tpt);
return -ENOMEM;
}
mutex_lock(&rdev->stats.lock);
/* write TPT entry */
if (reset_tpt_entry)
- memset(&tpt, 0, sizeof(tpt));
+ memset(tpt, 0, sizeof(*tpt));
else {
- tpt.valid_to_pdid = cpu_to_be32(FW_RI_TPTE_VALID_F |
+ tpt->valid_to_pdid = cpu_to_be32(FW_RI_TPTE_VALID_F |
FW_RI_TPTE_STAGKEY_V((*stag & FW_RI_TPTE_STAGKEY_M)) |
FW_RI_TPTE_STAGSTATE_V(stag_state) |
FW_RI_TPTE_STAGTYPE_V(type) | FW_RI_TPTE_PDID_V(pdid));
- tpt.locread_to_qpid = cpu_to_be32(FW_RI_TPTE_PERM_V(perm) |
+ tpt->locread_to_qpid = cpu_to_be32(FW_RI_TPTE_PERM_V(perm) |
(bind_enabled ? FW_RI_TPTE_MWBINDEN_F : 0) |
FW_RI_TPTE_ADDRTYPE_V((zbva ? FW_RI_ZERO_BASED_TO :
FW_RI_VA_BASED_TO))|
FW_RI_TPTE_PS_V(page_size));
- tpt.nosnoop_pbladdr = !pbl_size ? 0 : cpu_to_be32(
+ tpt->nosnoop_pbladdr = !pbl_size ? 0 : cpu_to_be32(
FW_RI_TPTE_PBLADDR_V(PBL_OFF(rdev, pbl_addr)>>3));
- tpt.len_lo = cpu_to_be32((u32)(len & 0xffffffffUL));
- tpt.va_hi = cpu_to_be32((u32)(to >> 32));
- tpt.va_lo_fbo = cpu_to_be32((u32)(to & 0xffffffffUL));
- tpt.dca_mwbcnt_pstag = cpu_to_be32(0);
- tpt.len_hi = cpu_to_be32((u32)(len >> 32));
+ tpt->len_lo = cpu_to_be32((u32)(len & 0xffffffffUL));
+ tpt->va_hi = cpu_to_be32((u32)(to >> 32));
+ tpt->va_lo_fbo = cpu_to_be32((u32)(to & 0xffffffffUL));
+ tpt->dca_mwbcnt_pstag = cpu_to_be32(0);
+ tpt->len_hi = cpu_to_be32((u32)(len >> 32));
}
err = write_adapter_mem(rdev, stag_idx +
(rdev->lldi.vr->stag.start >> 5),
- sizeof(tpt), &tpt, skb, wr_waitp);
+ sizeof(*tpt), tpt, skb, wr_waitp);
if (reset_tpt_entry) {
c4iw_put_resource(&rdev->resource.tpt_table, stag_idx);
rdev->stats.stag.cur -= 32;
mutex_unlock(&rdev->stats.lock);
}
+ kfree(tpt);
return err;
}
if (CHELSIO_CHIP_VERSION(rhp->rdev.lldi.adapter_type) > CHELSIO_T6)
srq->flags = T4_SRQ_LIMIT_SUPPORT;
- ret = xa_insert_irq(&rhp->qps, srq->wq.qid, srq, GFP_KERNEL);
- if (ret)
- goto err_free_queue;
-
if (udata) {
srq_key_mm = kmalloc(sizeof(*srq_key_mm), GFP_KERNEL);
if (!srq_key_mm) {
ret = -ENOMEM;
- goto err_remove_handle;
+ goto err_free_queue;
}
srq_db_key_mm = kmalloc(sizeof(*srq_db_key_mm), GFP_KERNEL);
if (!srq_db_key_mm) {
kfree(srq_db_key_mm);
err_free_srq_key_mm:
kfree(srq_key_mm);
-err_remove_handle:
- xa_erase_irq(&rhp->qps, srq->wq.qid);
err_free_queue:
free_srq_queue(srq, ucontext ? &ucontext->uctx : &rhp->rdev.uctx,
srq->wr_waitp);
rhp = srq->rhp;
pr_debug("%s id %d\n", __func__, srq->wq.qid);
-
- xa_erase_irq(&rhp->qps, srq->wq.qid);
ucontext = rdma_udata_to_drv_context(udata, struct c4iw_ucontext,
ibucontext);
free_srq_queue(srq, ucontext ? &ucontext->uctx : &rhp->rdev.uctx,
}
ret = rhashtable_init(tmp_sdma_rht, &sdma_rht_params);
- if (ret < 0)
+ if (ret < 0) {
+ kfree(tmp_sdma_rht);
goto bail;
+ }
+
dd->sdma_rht = tmp_sdma_rht;
dd_dev_info(dd, "SDMA num_sdma: %u\n", dd->num_sdma);
return -ENOMEM;
iwibdev = iwdev->iwibdev;
rdma_set_device_sysfs_group(&iwibdev->ibdev, &i40iw_attr_group);
+ ret = ib_device_set_netdev(&iwibdev->ibdev, iwdev->netdev, 1);
+ if (ret)
+ goto error;
+
ret = ib_register_device(&iwibdev->ibdev, "i40iw%d");
if (ret)
goto error;
return 0;
}
-static void devx_free_indirect_mkey(struct rcu_head *rcu)
-{
- kfree(container_of(rcu, struct devx_obj, devx_mr.rcu));
-}
-
-/* This function to delete from the radix tree needs to be called before
- * destroying the underlying mkey. Otherwise a race might occur in case that
- * other thread will get the same mkey before this one will be deleted,
- * in that case it will fail via inserting to the tree its own data.
- *
- * Note:
- * An error in the destroy is not expected unless there is some other indirect
- * mkey which points to this one. In a kernel cleanup flow it will be just
- * destroyed in the iterative destruction call. In a user flow, in case
- * the application didn't close in the expected order it's its own problem,
- * the mkey won't be part of the tree, in both cases the kernel is safe.
- */
-static void devx_cleanup_mkey(struct devx_obj *obj)
-{
- xa_erase(&obj->ib_dev->mdev->priv.mkey_table,
- mlx5_base_mkey(obj->devx_mr.mmkey.key));
-}
-
static void devx_cleanup_subscription(struct mlx5_ib_dev *dev,
struct devx_event_subscription *sub)
{
int ret;
dev = mlx5_udata_to_mdev(&attrs->driver_udata);
- if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY)
- devx_cleanup_mkey(obj);
+ if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
+ /*
+ * The pagefault_single_data_segment() does commands against
+ * the mmkey, we must wait for that to stop before freeing the
+ * mkey, as another allocation could get the same mkey #.
+ */
+ xa_erase(&obj->ib_dev->mdev->priv.mkey_table,
+ mlx5_base_mkey(obj->devx_mr.mmkey.key));
+ synchronize_srcu(&dev->mr_srcu);
+ }
if (obj->flags & DEVX_OBJ_FLAGS_DCT)
ret = mlx5_core_destroy_dct(obj->ib_dev->mdev, &obj->core_dct);
devx_cleanup_subscription(dev, sub_entry);
mutex_unlock(&devx_event_table->event_xa_lock);
- if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
- call_srcu(&dev->mr_srcu, &obj->devx_mr.rcu,
- devx_free_indirect_mkey);
- return ret;
- }
-
kfree(obj);
return ret;
}
&obj_id);
WARN_ON(obj->dinlen > MLX5_MAX_DESTROY_INBOX_SIZE_DW * sizeof(u32));
- if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
- err = devx_handle_mkey_indirect(obj, dev, cmd_in, cmd_out);
- if (err)
- goto obj_destroy;
- }
-
err = uverbs_copy_to(attrs, MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_OUT, cmd_out, cmd_out_len);
if (err)
- goto err_copy;
+ goto obj_destroy;
if (opcode == MLX5_CMD_OP_CREATE_GENERAL_OBJECT)
obj_type = MLX5_GET(general_obj_in_cmd_hdr, cmd_in, obj_type);
-
obj->obj_id = get_enc_obj_id(opcode | obj_type << 16, obj_id);
+ if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
+ err = devx_handle_mkey_indirect(obj, dev, cmd_in, cmd_out);
+ if (err)
+ goto obj_destroy;
+ }
return 0;
-err_copy:
- if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY)
- devx_cleanup_mkey(obj);
obj_destroy:
if (obj->flags & DEVX_OBJ_FLAGS_DCT)
mlx5_core_destroy_dct(obj->ib_dev->mdev, &obj->core_dct);
struct mlx5_ib_dev *dev;
u32 out[MLX5_ST_SZ_DW(create_mkey_out)];
struct mlx5_core_sig_ctx *sig;
- int live;
+ unsigned int live;
void *descs_alloc;
int access_flags; /* Needed for rereg MR */
struct mlx5_ib_devx_mr {
struct mlx5_core_mkey mmkey;
int ndescs;
- struct rcu_head rcu;
};
struct mlx5_ib_umr_context {
length + (start & (MLX5_ADAPTER_PAGE_SIZE - 1));
}
-static void update_odp_mr(struct mlx5_ib_mr *mr)
-{
- if (is_odp_mr(mr)) {
- /*
- * This barrier prevents the compiler from moving the
- * setting of umem->odp_data->private to point to our
- * MR, before reg_umr finished, to ensure that the MR
- * initialization have finished before starting to
- * handle invalidations.
- */
- smp_wmb();
- to_ib_umem_odp(mr->umem)->private = mr;
- /*
- * Make sure we will see the new
- * umem->odp_data->private value in the invalidation
- * routines, before we can get page faults on the
- * MR. Page faults can happen once we put the MR in
- * the tree, below this line. Without the barrier,
- * there can be a fault handling and an invalidation
- * before umem->odp_data->private == mr is visible to
- * the invalidation handler.
- */
- smp_wmb();
- }
-}
-
static void reg_mr_callback(int status, struct mlx5_async_work *context)
{
struct mlx5_ib_mr *mr =
mr->umem = umem;
set_mr_fields(dev, mr, npages, length, access_flags);
- update_odp_mr(mr);
-
if (use_umr) {
int update_xlt_flags = MLX5_IB_UPD_XLT_ENABLE;
}
}
- if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
- mr->live = 1;
+ if (is_odp_mr(mr)) {
+ to_ib_umem_odp(mr->umem)->private = mr;
atomic_set(&mr->num_pending_prefetch, 0);
}
+ if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
+ smp_store_release(&mr->live, 1);
return &mr->ibmr;
error:
if (!mr->umem)
return -EINVAL;
+ if (is_odp_mr(mr))
+ return -EOPNOTSUPP;
+
if (flags & IB_MR_REREG_TRANS) {
addr = virt_addr;
len = length;
}
mr->allocated_from_cache = 0;
- if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING))
- mr->live = 1;
} else {
/*
* Send a UMR WQE
set_mr_fields(dev, mr, npages, len, access_flags);
- update_odp_mr(mr);
return 0;
err:
/* Prevent new page faults and
* prefetch requests from succeeding
*/
- mr->live = 0;
+ WRITE_ONCE(mr->live, 0);
+
+ /* Wait for all running page-fault handlers to finish. */
+ synchronize_srcu(&dev->mr_srcu);
/* dequeue pending prefetch requests for the mr */
if (atomic_read(&mr->num_pending_prefetch))
flush_workqueue(system_unbound_wq);
WARN_ON(atomic_read(&mr->num_pending_prefetch));
- /* Wait for all running page-fault handlers to finish. */
- synchronize_srcu(&dev->mr_srcu);
/* Destroy all page mappings */
if (!umem_odp->is_implicit_odp)
mlx5_ib_invalidate_range(umem_odp,
int mlx5_ib_dealloc_mw(struct ib_mw *mw)
{
+ struct mlx5_ib_dev *dev = to_mdev(mw->device);
struct mlx5_ib_mw *mmw = to_mmw(mw);
int err;
- err = mlx5_core_destroy_mkey((to_mdev(mw->device))->mdev,
- &mmw->mmkey);
- if (!err)
- kfree(mmw);
- return err;
+ if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
+ xa_erase(&dev->mdev->priv.mkey_table,
+ mlx5_base_mkey(mmw->mmkey.key));
+ /*
+ * pagefault_single_data_segment() may be accessing mmw under
+ * SRCU if the user bound an ODP MR to this MW.
+ */
+ synchronize_srcu(&dev->mr_srcu);
+ }
+
+ err = mlx5_core_destroy_mkey(dev->mdev, &mmw->mmkey);
+ if (err)
+ return err;
+ kfree(mmw);
+ return 0;
}
int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask,
return;
}
+ /*
+ * The locking here is pretty subtle. Ideally the implicit children
+ * list would be protected by the umem_mutex, however that is not
+ * possible. Instead this uses a weaker update-then-lock pattern:
+ *
+ * srcu_read_lock()
+ * <change children list>
+ * mutex_lock(umem_mutex)
+ * mlx5_ib_update_xlt()
+ * mutex_unlock(umem_mutex)
+ * destroy lkey
+ *
+ * ie any change the children list must be followed by the locked
+ * update_xlt before destroying.
+ *
+ * The umem_mutex provides the acquire/release semantic needed to make
+ * the children list visible to a racing thread. While SRCU is not
+ * technically required, using it gives consistent use of the SRCU
+ * locking around the children list.
+ */
+ lockdep_assert_held(&to_ib_umem_odp(mr->umem)->umem_mutex);
+ lockdep_assert_held(&mr->dev->mr_srcu);
+
odp = odp_lookup(offset * MLX5_IMR_MTT_SIZE,
nentries * MLX5_IMR_MTT_SIZE, mr);
struct ib_umem_odp *odp = container_of(work, struct ib_umem_odp, work);
int idx = ib_umem_start(odp) >> MLX5_IMR_MTT_SHIFT;
struct mlx5_ib_mr *mr = odp->private, *imr = mr->parent;
+ struct ib_umem_odp *odp_imr = to_ib_umem_odp(imr->umem);
+ int srcu_key;
mr->parent = NULL;
synchronize_srcu(&mr->dev->mr_srcu);
- ib_umem_odp_release(odp);
- if (imr->live)
+ if (smp_load_acquire(&imr->live)) {
+ srcu_key = srcu_read_lock(&mr->dev->mr_srcu);
+ mutex_lock(&odp_imr->umem_mutex);
mlx5_ib_update_xlt(imr, idx, 1, 0,
MLX5_IB_UPD_XLT_INDIRECT |
MLX5_IB_UPD_XLT_ATOMIC);
+ mutex_unlock(&odp_imr->umem_mutex);
+ srcu_read_unlock(&mr->dev->mr_srcu, srcu_key);
+ }
+ ib_umem_odp_release(odp);
mlx5_mr_cache_free(mr->dev, mr);
if (atomic_dec_and_test(&imr->num_leaf_free))
idx - blk_start_idx + 1, 0,
MLX5_IB_UPD_XLT_ZAP |
MLX5_IB_UPD_XLT_ATOMIC);
- mutex_unlock(&umem_odp->umem_mutex);
/*
* We are now sure that the device will not access the
* memory. We can safely unmap it, and mark it as dirty if
if (unlikely(!umem_odp->npages && mr->parent &&
!umem_odp->dying)) {
- WRITE_ONCE(umem_odp->dying, 1);
+ WRITE_ONCE(mr->live, 0);
+ umem_odp->dying = 1;
atomic_inc(&mr->parent->num_leaf_free);
schedule_work(&umem_odp->work);
}
+ mutex_unlock(&umem_odp->umem_mutex);
}
void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev)
mr->ibmr.lkey = mr->mmkey.key;
mr->ibmr.rkey = mr->mmkey.key;
- mr->live = 1;
-
mlx5_ib_dbg(dev, "key %x dev %p mr %p\n",
mr->mmkey.key, dev->mdev, mr);
mtt->parent = mr;
INIT_WORK(&odp->work, mr_leaf_free_action);
+ smp_store_release(&mtt->live, 1);
+
if (!nentries)
start_idx = addr >> MLX5_IMR_MTT_SHIFT;
nentries++;
init_waitqueue_head(&imr->q_leaf_free);
atomic_set(&imr->num_leaf_free, 0);
atomic_set(&imr->num_pending_prefetch, 0);
+ smp_store_release(&imr->live, 1);
return imr;
}
if (mr->parent != imr)
continue;
+ mutex_lock(&umem_odp->umem_mutex);
ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
ib_umem_end(umem_odp));
- if (umem_odp->dying)
+ if (umem_odp->dying) {
+ mutex_unlock(&umem_odp->umem_mutex);
continue;
+ }
- WRITE_ONCE(umem_odp->dying, 1);
+ umem_odp->dying = 1;
atomic_inc(&imr->num_leaf_free);
schedule_work(&umem_odp->work);
+ mutex_unlock(&umem_odp->umem_mutex);
}
up_read(&per_mm->umem_rwsem);
switch (mmkey->type) {
case MLX5_MKEY_MR:
mr = container_of(mmkey, struct mlx5_ib_mr, mmkey);
- if (!mr->live || !mr->ibmr.pd) {
+ if (!smp_load_acquire(&mr->live) || !mr->ibmr.pd) {
mlx5_ib_dbg(dev, "got dead MR\n");
ret = -EFAULT;
goto srcu_unlock;
mr = container_of(mmkey, struct mlx5_ib_mr, mmkey);
- if (mr->ibmr.pd != pd) {
+ if (!smp_load_acquire(&mr->live)) {
ret = false;
break;
}
- if (!mr->live) {
+ if (mr->ibmr.pd != pd) {
ret = false;
break;
}
pvrdma_page_dir_cleanup(dev, &srq->pdir);
- kfree(srq);
-
atomic_dec(&dev->num_srqs);
}
*/
void siw_qp_llp_write_space(struct sock *sk)
{
- struct siw_cep *cep = sk_to_cep(sk);
+ struct siw_cep *cep;
- cep->sk_write_space(sk);
+ read_lock(&sk->sk_callback_lock);
+
+ cep = sk_to_cep(sk);
+ if (cep) {
+ cep->sk_write_space(sk);
- if (!test_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
- (void)siw_sq_start(cep->qp);
+ if (!test_bit(SOCK_NOSPACE, &sk->sk_socket->flags))
+ (void)siw_sq_start(cep->qp);
+ }
+
+ read_unlock(&sk->sk_callback_lock);
}
static int siw_qp_readq_init(struct siw_qp *qp, int irq_size, int orq_size)
u32 out[MLX5_ST_SZ_DW(destroy_mkey_out)] = {0};
u32 in[MLX5_ST_SZ_DW(destroy_mkey_in)] = {0};
struct xarray *mkeys = &dev->priv.mkey_table;
- struct mlx5_core_mkey *deleted_mkey;
unsigned long flags;
xa_lock_irqsave(mkeys, flags);
- deleted_mkey = __xa_erase(mkeys, mlx5_base_mkey(mkey->key));
+ __xa_erase(mkeys, mlx5_base_mkey(mkey->key));
xa_unlock_irqrestore(mkeys, flags);
- if (!deleted_mkey) {
- mlx5_core_dbg(dev, "failed xarray delete of mkey 0x%x\n",
- mlx5_base_mkey(mkey->key));
- return -ENOENT;
- }
MLX5_SET(destroy_mkey_in, in, opcode, MLX5_CMD_OP_DESTROY_MKEY);
MLX5_SET(destroy_mkey_in, in, mkey_index, mlx5_mkey_to_idx(mkey->key));
enum sch_todo todo;
struct work_struct todo_work;
struct schib_config config;
+ u64 dma_mask;
char *driver_override; /* Driver name to force a match */
} __attribute__ ((aligned(8)));
* belong to a subchannel need to fit 31 bit width (e.g. ccw).
*/
sch->dev.coherent_dma_mask = DMA_BIT_MASK(31);
- sch->dev.dma_mask = &sch->dev.coherent_dma_mask;
+ /*
+ * But we don't have such restrictions imposed on the stuff that
+ * is handled by the streaming API.
+ */
+ sch->dma_mask = DMA_BIT_MASK(64);
+ sch->dev.dma_mask = &sch->dma_mask;
return sch;
err:
if (!cdev->private)
goto err_priv;
cdev->dev.coherent_dma_mask = sch->dev.coherent_dma_mask;
- cdev->dev.dma_mask = &cdev->dev.coherent_dma_mask;
+ cdev->dev.dma_mask = sch->dev.dma_mask;
dma_pool = cio_gp_dma_create(&cdev->dev, 1);
if (!dma_pool)
goto err_dma_pool;
#define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt
+#include <linux/dma-mapping.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/slab.h>
#include <linux/highmem.h>
#include <linux/refcount.h>
-#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
-#include <linux/of_device.h>
-#endif
#include <xen/xen.h>
#include <xen/grant_table.h>
flip->private_data = priv;
#ifdef CONFIG_XEN_GRANT_DMA_ALLOC
priv->dma_dev = gntdev_miscdev.this_device;
-
- /*
- * The device is not spawn from a device tree, so arch_setup_dma_ops
- * is not called, thus leaving the device with dummy DMA ops.
- * Fix this by calling of_dma_configure() with a NULL node to set
- * default DMA ops.
- */
- of_dma_configure(priv->dma_dev, NULL, true);
+ dma_coerce_mask_and_coherent(priv->dma_dev, DMA_BIT_MASK(64));
#endif
pr_debug("priv %p\n", priv);
if (xen_feature(XENFEAT_auto_translated_physmap) && gnttab_shared.addr == NULL) {
gnttab_shared.addr = xen_auto_xlat_grant_frames.vaddr;
if (gnttab_shared.addr == NULL) {
- pr_warn("gnttab share frames (addr=0x%08lx) is not mapped!\n",
- (unsigned long)xen_auto_xlat_grant_frames.vaddr);
+ pr_warn("gnttab share frames is not mapped!\n");
return -ENOMEM;
}
}
struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
struct btrfs_root *root = BTRFS_I(inode)->root;
struct page **pages = NULL;
- struct extent_state *cached_state = NULL;
struct extent_changeset *data_reserved = NULL;
u64 release_bytes = 0;
u64 lockstart;
return -ENOMEM;
while (iov_iter_count(i) > 0) {
+ struct extent_state *cached_state = NULL;
size_t offset = offset_in_page(pos);
size_t sector_offset;
size_t write_bytes = min(iov_iter_count(i),
if (copied > 0)
ret = btrfs_dirty_pages(inode, pages, dirty_pages,
pos, copied, &cached_state);
+
+ /*
+ * If we have not locked the extent range, because the range's
+ * start offset is >= i_size, we might still have a non-NULL
+ * cached extent state, acquired while marking the extent range
+ * as delalloc through btrfs_dirty_pages(). Therefore free any
+ * possible cached extent state to avoid a memory leak.
+ */
if (extents_locked)
unlock_extent_cached(&BTRFS_I(inode)->io_tree,
lockstart, lockend, &cached_state);
+ else
+ free_extent_state(cached_state);
+
btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes,
true);
if (ret) {
u32 sizes[2];
int nitems = name ? 2 : 1;
unsigned long ptr;
+ unsigned int nofs_flag;
int ret;
path = btrfs_alloc_path();
if (!path)
return ERR_PTR(-ENOMEM);
+ nofs_flag = memalloc_nofs_save();
inode = new_inode(fs_info->sb);
+ memalloc_nofs_restore(nofs_flag);
if (!inode) {
btrfs_free_path(path);
return ERR_PTR(-ENOMEM);
struct btrfs_extent_data_ref *dref;
struct btrfs_shared_data_ref *sref;
u32 count;
- int i = 0, tree_block_level = 0, ret;
+ int i = 0, tree_block_level = 0, ret = 0;
struct btrfs_key key;
int nritems = btrfs_header_nritems(leaf);
struct btrfs_path *path;
struct btrfs_key key;
int ret;
- u64 clone_src_i_size;
+ u64 clone_src_i_size = 0;
/*
* Prevent cloning from a zero offset with a length matching the sector
* in the tree of log roots
*/
static int update_log_root(struct btrfs_trans_handle *trans,
- struct btrfs_root *log)
+ struct btrfs_root *log,
+ struct btrfs_root_item *root_item)
{
struct btrfs_fs_info *fs_info = log->fs_info;
int ret;
if (log->log_transid == 1) {
/* insert root item on the first sync */
ret = btrfs_insert_root(trans, fs_info->log_root_tree,
- &log->root_key, &log->root_item);
+ &log->root_key, root_item);
} else {
ret = btrfs_update_root(trans, fs_info->log_root_tree,
- &log->root_key, &log->root_item);
+ &log->root_key, root_item);
}
return ret;
}
struct btrfs_fs_info *fs_info = root->fs_info;
struct btrfs_root *log = root->log_root;
struct btrfs_root *log_root_tree = fs_info->log_root_tree;
+ struct btrfs_root_item new_root_item;
int log_transid = 0;
struct btrfs_log_ctx root_log_ctx;
struct blk_plug plug;
goto out;
}
+ /*
+ * We _must_ update under the root->log_mutex in order to make sure we
+ * have a consistent view of the log root we are trying to commit at
+ * this moment.
+ *
+ * We _must_ copy this into a local copy, because we are not holding the
+ * log_root_tree->log_mutex yet. This is important because when we
+ * commit the log_root_tree we must have a consistent view of the
+ * log_root_tree when we update the super block to point at the
+ * log_root_tree bytenr. If we update the log_root_tree here we'll race
+ * with the commit and possibly point at the new block which we may not
+ * have written out.
+ */
btrfs_set_root_node(&log->root_item, log->node);
+ memcpy(&new_root_item, &log->root_item, sizeof(new_root_item));
root->log_transid++;
log->log_transid = root->log_transid;
root->log_start_pid = 0;
- /*
- * Update or create log root item under the root's log_mutex to prevent
- * races with concurrent log syncs that can lead to failure to update
- * log root item because it was not created yet.
- */
- ret = update_log_root(trans, log);
/*
* IO has been started, blocks of the log tree have WRITTEN flag set
* in their headers. new modifications of the log will be written to
mutex_unlock(&log_root_tree->log_mutex);
mutex_lock(&log_root_tree->log_mutex);
+
+ /*
+ * Now we are safe to update the log_root_tree because we're under the
+ * log_mutex, and we're a current writer so we're holding the commit
+ * open until we drop the log_mutex.
+ */
+ ret = update_log_root(trans, log, &new_root_item);
+
if (atomic_dec_and_test(&log_root_tree->log_writers)) {
/* atomic_dec_and_test implies a barrier */
cond_wake_up_nomb(&log_root_tree->log_writer_wait);
return !extended; /* "0" is valid for usual profiles */
/* true if exactly one bit set */
- return is_power_of_2(flags);
+ /*
+ * Don't use is_power_of_2(unsigned long) because it won't work
+ * for the single profile (1ULL << 48) on 32-bit CPUs.
+ */
+ return flags != 0 && (flags & (flags - 1)) == 0;
}
static inline int balance_need_close(struct btrfs_fs_info *fs_info)
else
sb->s_maxbytes = MAX_NON_LFS;
- /* BB FIXME fix time_gran to be larger for LANMAN sessions */
- sb->s_time_gran = 100;
-
- if (tcon->unix_ext) {
- ts = cifs_NTtimeToUnix(0);
+ /* Some very old servers like DOS and OS/2 used 2 second granularity */
+ if ((tcon->ses->server->vals->protocol_id == SMB10_PROT_ID) &&
+ ((tcon->ses->capabilities &
+ tcon->ses->server->vals->cap_nt_find) == 0) &&
+ !tcon->unix_ext) {
+ sb->s_time_gran = 1000000000; /* 1 second is max allowed gran */
+ ts = cnvrtDosUnixTm(cpu_to_le16(SMB_DATE_MIN), 0, 0);
sb->s_time_min = ts.tv_sec;
- ts = cifs_NTtimeToUnix(cpu_to_le64(S64_MAX));
+ ts = cnvrtDosUnixTm(cpu_to_le16(SMB_DATE_MAX),
+ cpu_to_le16(SMB_TIME_MAX), 0);
sb->s_time_max = ts.tv_sec;
} else {
- ts = cnvrtDosUnixTm(cpu_to_le16(SMB_DATE_MIN), 0, 0);
+ /*
+ * Almost every server, including all SMB2+, uses DCE TIME
+ * ie 100 nanosecond units, since 1601. See MS-DTYP and MS-FSCC
+ */
+ sb->s_time_gran = 100;
+ ts = cifs_NTtimeToUnix(0);
sb->s_time_min = ts.tv_sec;
- ts = cnvrtDosUnixTm(cpu_to_le16(SMB_DATE_MAX), cpu_to_le16(SMB_TIME_MAX), 0);
+ ts = cifs_NTtimeToUnix(cpu_to_le64(S64_MAX));
sb->s_time_max = ts.tv_sec;
}
bool smallBuf:1; /* so we know which buf_release function to call */
};
-#define ACL_NO_MODE -1
+#define ACL_NO_MODE ((umode_t)(-1))
struct cifs_open_parms {
struct cifs_tcon *tcon;
struct cifs_sb_info *cifs_sb;
server->ops->qfs_tcon(*xid, tcon);
if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RO_CACHE) {
if (tcon->fsDevInfo.DeviceCharacteristics &
- FILE_READ_ONLY_DEVICE)
+ cpu_to_le32(FILE_READ_ONLY_DEVICE))
cifs_dbg(VFS, "mounted to read only share\n");
else if ((cifs_sb->mnt_cifs_flags &
CIFS_MOUNT_RW_CACHE) == 0)
int rc;
struct dfs_info3_param ref = {0};
char *mdata = NULL, *fake_devname = NULL;
- struct smb_vol fake_vol = {0};
+ struct smb_vol fake_vol = {NULL};
cifs_dbg(FYI, "%s: dfs path: %s\n", __func__, path);
static int
cifs_d_revalidate(struct dentry *direntry, unsigned int flags)
{
+ struct inode *inode;
+
if (flags & LOOKUP_RCU)
return -ECHILD;
if (d_really_is_positive(direntry)) {
+ inode = d_inode(direntry);
+ if ((flags & LOOKUP_REVAL) && !CIFS_CACHE_READ(CIFS_I(inode)))
+ CIFS_I(inode)->time = 0; /* force reval */
+
if (cifs_revalidate_dentry(direntry))
return 0;
else {
* attributes will have been updated by
* cifs_revalidate_dentry().
*/
- if (IS_AUTOMOUNT(d_inode(direntry)) &&
+ if (IS_AUTOMOUNT(inode) &&
!(direntry->d_flags & DCACHE_NEED_AUTOMOUNT)) {
spin_lock(&direntry->d_lock);
direntry->d_flags |= DCACHE_NEED_AUTOMOUNT;
rc = cifs_get_inode_info(&inode, full_path, buf, inode->i_sb,
xid, fid);
+ if (rc) {
+ server->ops->close(xid, tcon, fid);
+ if (rc == -ESTALE)
+ rc = -EOPENSTALE;
+ }
+
out:
kfree(buf);
return rc;
{
struct cifsFileInfo *open_file = NULL;
struct cifs_sb_info *cifs_sb = CIFS_SB(cifs_inode->vfs_inode.i_sb);
- struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
/* only filter by fsuid on multiuser mounts */
if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MULTIUSER))
fsuid_only = false;
- spin_lock(&tcon->open_file_lock);
+ spin_lock(&cifs_inode->open_file_lock);
/* we could simply get the first_list_entry since write-only entries
are always at the end of the list but since the first entry might
have a close pending, we go through the whole list */
/* found a good file */
/* lock it so it will not be closed on us */
cifsFileInfo_get(open_file);
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
return open_file;
} /* else might as well continue, and look for
another, or simply have the caller reopen it
} else /* write only file */
break; /* write only files are last so must be done */
}
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
return NULL;
}
{
struct cifsFileInfo *open_file, *inv_file = NULL;
struct cifs_sb_info *cifs_sb;
- struct cifs_tcon *tcon;
bool any_available = false;
int rc = -EBADF;
unsigned int refind = 0;
}
cifs_sb = CIFS_SB(cifs_inode->vfs_inode.i_sb);
- tcon = cifs_sb_master_tcon(cifs_sb);
/* only filter by fsuid on multiuser mounts */
if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MULTIUSER))
fsuid_only = false;
- spin_lock(&tcon->open_file_lock);
+ spin_lock(&cifs_inode->open_file_lock);
refind_writable:
if (refind > MAX_REOPEN_ATT) {
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
return rc;
}
list_for_each_entry(open_file, &cifs_inode->openFileList, flist) {
if (!open_file->invalidHandle) {
/* found a good writable file */
cifsFileInfo_get(open_file);
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
*ret_file = open_file;
return 0;
} else {
cifsFileInfo_get(inv_file);
}
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
if (inv_file) {
rc = cifs_reopen_file(inv_file, false);
cifsFileInfo_put(inv_file);
++refind;
inv_file = NULL;
- spin_lock(&tcon->open_file_lock);
+ spin_lock(&cifs_inode->open_file_lock);
goto refind_writable;
}
static int is_inode_writable(struct cifsInodeInfo *cifs_inode)
{
struct cifsFileInfo *open_file;
- struct cifs_tcon *tcon =
- cifs_sb_master_tcon(CIFS_SB(cifs_inode->vfs_inode.i_sb));
- spin_lock(&tcon->open_file_lock);
+ spin_lock(&cifs_inode->open_file_lock);
list_for_each_entry(open_file, &cifs_inode->openFileList, flist) {
if (OPEN_FMODE(open_file->f_flags) & FMODE_WRITE) {
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
return 1;
}
}
- spin_unlock(&tcon->open_file_lock);
+ spin_unlock(&cifs_inode->open_file_lock);
return 0;
}
/* if uniqueid is different, return error */
if (unlikely(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM &&
CIFS_I(*pinode)->uniqueid != fattr.cf_uniqueid)) {
+ CIFS_I(*pinode)->time = 0; /* force reval */
rc = -ESTALE;
goto cgiiu_exit;
}
/* if filetype is different, return error */
if (unlikely(((*pinode)->i_mode & S_IFMT) !=
(fattr.cf_mode & S_IFMT))) {
+ CIFS_I(*pinode)->time = 0; /* force reval */
rc = -ESTALE;
goto cgiiu_exit;
}
/* if uniqueid is different, return error */
if (unlikely(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SERVER_INUM &&
CIFS_I(*inode)->uniqueid != fattr.cf_uniqueid)) {
+ CIFS_I(*inode)->time = 0; /* force reval */
rc = -ESTALE;
goto cgii_exit;
}
/* if filetype is different, return error */
if (unlikely(((*inode)->i_mode & S_IFMT) !=
(fattr.cf_mode & S_IFMT))) {
+ CIFS_I(*inode)->time = 0; /* force reval */
rc = -ESTALE;
goto cgii_exit;
}
{0, 0}
};
-static const struct smb_to_posix_error mapping_table_ERRHRD[] = {
- {0, 0}
-};
-
/*
* Convert a string containing text IPv4 or IPv6 address to binary form.
*
unsigned int num = *num_iovec;
iov[num].iov_base = create_posix_buf(mode);
- if (mode == -1)
- cifs_dbg(VFS, "illegal mode\n"); /* BB REMOVEME */
+ if (mode == ACL_NO_MODE)
+ cifs_dbg(FYI, "illegal mode\n");
if (iov[num].iov_base == NULL)
return -ENOMEM;
iov[num].iov_len = sizeof(struct create_posix);
return rc;
}
- /* TODO: add handling for the mode on create */
- if (oparms->disposition == FILE_CREATE)
- cifs_dbg(VFS, "mode is 0x%x\n", oparms->mode); /* BB REMOVEME */
-
- if ((oparms->disposition == FILE_CREATE) && (oparms->mode != -1)) {
+ if ((oparms->disposition == FILE_CREATE) &&
+ (oparms->mode != ACL_NO_MODE)) {
if (n_iov > 2) {
struct create_context *ccontext =
(struct create_context *)iov[n_iov-1].iov_base;
req->PersistentFileId = persistent_fid;
req->VolatileFileId = volatile_fid;
- req->OutputBufferLength = SMB2_MAX_BUFFER_SIZE - MAX_SMB2_HDR_SIZE;
+ req->OutputBufferLength =
+ cpu_to_le32(SMB2_MAX_BUFFER_SIZE - MAX_SMB2_HDR_SIZE);
req->CompletionFilter = cpu_to_le32(completion_filter);
if (watch_tree)
req->Flags = cpu_to_le16(SMB2_WATCH_TREE);
bool is_fsctl, char *in_data, u32 indatalen,
__u32 max_response_size);
extern void SMB2_ioctl_free(struct smb_rqst *rqst);
+extern int SMB2_change_notify(const unsigned int xid, struct cifs_tcon *tcon,
+ u64 persistent_fid, u64 volatile_fid, bool watch_tree,
+ u32 completion_filter);
+
extern int SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
u64 persistent_file_id, u64 volatile_file_id);
extern int SMB2_close_flags(const unsigned int xid, struct cifs_tcon *tcon,
if (work->auto_free)
kfree(work);
- if (done && atomic_dec_and_test(&done->cnt))
- wake_up_all(done->waitq);
+ if (done) {
+ wait_queue_head_t *waitq = done->waitq;
+
+ /* @done can't be accessed after the following dec */
+ if (atomic_dec_and_test(&done->cnt))
+ wake_up_all(waitq);
+ }
}
static void wb_queue_work(struct bdi_writeback *wb,
io_cqring_ev_posted(ctx);
}
-static void io_ring_drop_ctx_refs(struct io_ring_ctx *ctx, unsigned refs)
-{
- percpu_ref_put_many(&ctx->refs, refs);
-
- if (waitqueue_active(&ctx->wait))
- wake_up(&ctx->wait);
-}
-
static struct io_kiocb *io_get_req(struct io_ring_ctx *ctx,
struct io_submit_state *state)
{
req->result = 0;
return req;
out:
- io_ring_drop_ctx_refs(ctx, 1);
+ percpu_ref_put(&ctx->refs);
return NULL;
}
{
if (*nr) {
kmem_cache_free_bulk(req_cachep, *nr, reqs);
- io_ring_drop_ctx_refs(ctx, *nr);
+ percpu_ref_put_many(&ctx->refs, *nr);
*nr = 0;
}
}
{
if (req->file && !(req->flags & REQ_F_FIXED_FILE))
fput(req->file);
- io_ring_drop_ctx_refs(req->ctx, 1);
+ percpu_ref_put(&req->ctx->refs);
kmem_cache_free(req_cachep, req);
}
if (link)
io_queue_link_head(ctx, link, &link->submit, shadow_req,
- block_for_last);
+ !block_for_last);
if (statep)
io_submit_state_end(statep);
static void io_destruct_skb(struct sk_buff *skb)
{
struct io_ring_ctx *ctx = skb->sk->sk_user_data;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(ctx->sqo_wq); i++)
+ if (ctx->sqo_wq[i])
+ flush_workqueue(ctx->sqo_wq[i]);
- io_finish_async(ctx);
unix_destruct_scm(skb);
}
}
}
- io_ring_drop_ctx_refs(ctx, 1);
+ percpu_ref_put(&ctx->refs);
out_fput:
fdput(f);
return submitted ? submitted : ret;
EXPORT_SYMBOL(dcache_dir_close);
/* parent is locked at least shared */
-static struct dentry *next_positive(struct dentry *parent,
- struct list_head *from,
- int count)
+/*
+ * Returns an element of siblings' list.
+ * We are looking for <count>th positive after <p>; if
+ * found, dentry is grabbed and returned to caller.
+ * If no such element exists, NULL is returned.
+ */
+static struct dentry *scan_positives(struct dentry *cursor,
+ struct list_head *p,
+ loff_t count,
+ struct dentry *last)
{
- unsigned *seq = &parent->d_inode->i_dir_seq, n;
- struct dentry *res;
- struct list_head *p;
- bool skipped;
- int i;
+ struct dentry *dentry = cursor->d_parent, *found = NULL;
-retry:
- i = count;
- skipped = false;
- n = smp_load_acquire(seq) & ~1;
- res = NULL;
- rcu_read_lock();
- for (p = from->next; p != &parent->d_subdirs; p = p->next) {
+ spin_lock(&dentry->d_lock);
+ while ((p = p->next) != &dentry->d_subdirs) {
struct dentry *d = list_entry(p, struct dentry, d_child);
- if (!simple_positive(d)) {
- skipped = true;
- } else if (!--i) {
- res = d;
- break;
+ // we must at least skip cursors, to avoid livelocks
+ if (d->d_flags & DCACHE_DENTRY_CURSOR)
+ continue;
+ if (simple_positive(d) && !--count) {
+ spin_lock_nested(&d->d_lock, DENTRY_D_LOCK_NESTED);
+ if (simple_positive(d))
+ found = dget_dlock(d);
+ spin_unlock(&d->d_lock);
+ if (likely(found))
+ break;
+ count = 1;
+ }
+ if (need_resched()) {
+ list_move(&cursor->d_child, p);
+ p = &cursor->d_child;
+ spin_unlock(&dentry->d_lock);
+ cond_resched();
+ spin_lock(&dentry->d_lock);
}
}
- rcu_read_unlock();
- if (skipped) {
- smp_rmb();
- if (unlikely(*seq != n))
- goto retry;
- }
- return res;
-}
-
-static void move_cursor(struct dentry *cursor, struct list_head *after)
-{
- struct dentry *parent = cursor->d_parent;
- unsigned n, *seq = &parent->d_inode->i_dir_seq;
- spin_lock(&parent->d_lock);
- for (;;) {
- n = *seq;
- if (!(n & 1) && cmpxchg(seq, n, n + 1) == n)
- break;
- cpu_relax();
- }
- __list_del(cursor->d_child.prev, cursor->d_child.next);
- if (after)
- list_add(&cursor->d_child, after);
- else
- list_add_tail(&cursor->d_child, &parent->d_subdirs);
- smp_store_release(seq, n + 2);
- spin_unlock(&parent->d_lock);
+ spin_unlock(&dentry->d_lock);
+ dput(last);
+ return found;
}
loff_t dcache_dir_lseek(struct file *file, loff_t offset, int whence)
return -EINVAL;
}
if (offset != file->f_pos) {
+ struct dentry *cursor = file->private_data;
+ struct dentry *to = NULL;
+
+ inode_lock_shared(dentry->d_inode);
+
+ if (offset > 2)
+ to = scan_positives(cursor, &dentry->d_subdirs,
+ offset - 2, NULL);
+ spin_lock(&dentry->d_lock);
+ if (to)
+ list_move(&cursor->d_child, &to->d_child);
+ else
+ list_del_init(&cursor->d_child);
+ spin_unlock(&dentry->d_lock);
+ dput(to);
+
file->f_pos = offset;
- if (file->f_pos >= 2) {
- struct dentry *cursor = file->private_data;
- struct dentry *to;
- loff_t n = file->f_pos - 2;
-
- inode_lock_shared(dentry->d_inode);
- to = next_positive(dentry, &dentry->d_subdirs, n);
- move_cursor(cursor, to ? &to->d_child : NULL);
- inode_unlock_shared(dentry->d_inode);
- }
+
+ inode_unlock_shared(dentry->d_inode);
}
return offset;
}
{
struct dentry *dentry = file->f_path.dentry;
struct dentry *cursor = file->private_data;
- struct list_head *p = &cursor->d_child;
- struct dentry *next;
- bool moved = false;
+ struct list_head *anchor = &dentry->d_subdirs;
+ struct dentry *next = NULL;
+ struct list_head *p;
if (!dir_emit_dots(file, ctx))
return 0;
if (ctx->pos == 2)
- p = &dentry->d_subdirs;
- while ((next = next_positive(dentry, p, 1)) != NULL) {
+ p = anchor;
+ else if (!list_empty(&cursor->d_child))
+ p = &cursor->d_child;
+ else
+ return 0;
+
+ while ((next = scan_positives(cursor, p, 1, next)) != NULL) {
if (!dir_emit(ctx, next->d_name.name, next->d_name.len,
d_inode(next)->i_ino, dt_type(d_inode(next))))
break;
- moved = true;
- p = &next->d_child;
ctx->pos++;
+ p = &next->d_child;
}
- if (moved)
- move_cursor(cursor, p);
+ spin_lock(&dentry->d_lock);
+ if (next)
+ list_move_tail(&cursor->d_child, &next->d_child);
+ else
+ list_del_init(&cursor->d_child);
+ spin_unlock(&dentry->d_lock);
+ dput(next);
+
return 0;
}
EXPORT_SYMBOL(dcache_readdir);
static struct kmem_cache *nfs_direct_cachep;
-/*
- * This represents a set of asynchronous requests that we're waiting on
- */
-struct nfs_direct_mirror {
- ssize_t count;
-};
-
struct nfs_direct_req {
struct kref kref; /* release manager */
atomic_t io_count; /* i/os we're waiting for */
spinlock_t lock; /* protect completion state */
- struct nfs_direct_mirror mirrors[NFS_PAGEIO_DESCRIPTOR_MIRROR_MAX];
- int mirror_count;
-
loff_t io_start; /* Start offset for I/O */
ssize_t count, /* bytes actually processed */
max_count, /* max expected count */
}
static void
-nfs_direct_good_bytes(struct nfs_direct_req *dreq, struct nfs_pgio_header *hdr)
+nfs_direct_handle_truncated(struct nfs_direct_req *dreq,
+ const struct nfs_pgio_header *hdr,
+ ssize_t dreq_len)
{
- int i;
- ssize_t count;
+ if (!(test_bit(NFS_IOHDR_ERROR, &hdr->flags) ||
+ test_bit(NFS_IOHDR_EOF, &hdr->flags)))
+ return;
+ if (dreq->max_count >= dreq_len) {
+ dreq->max_count = dreq_len;
+ if (dreq->count > dreq_len)
+ dreq->count = dreq_len;
+
+ if (test_bit(NFS_IOHDR_ERROR, &hdr->flags))
+ dreq->error = hdr->error;
+ else /* Clear outstanding error if this is EOF */
+ dreq->error = 0;
+ }
+}
- WARN_ON_ONCE(dreq->count >= dreq->max_count);
+static void
+nfs_direct_count_bytes(struct nfs_direct_req *dreq,
+ const struct nfs_pgio_header *hdr)
+{
+ loff_t hdr_end = hdr->io_start + hdr->good_bytes;
+ ssize_t dreq_len = 0;
- if (dreq->mirror_count == 1) {
- dreq->mirrors[hdr->pgio_mirror_idx].count += hdr->good_bytes;
- dreq->count += hdr->good_bytes;
- } else {
- /* mirrored writes */
- count = dreq->mirrors[hdr->pgio_mirror_idx].count;
- if (count + dreq->io_start < hdr->io_start + hdr->good_bytes) {
- count = hdr->io_start + hdr->good_bytes - dreq->io_start;
- dreq->mirrors[hdr->pgio_mirror_idx].count = count;
- }
- /* update the dreq->count by finding the minimum agreed count from all
- * mirrors */
- count = dreq->mirrors[0].count;
+ if (hdr_end > dreq->io_start)
+ dreq_len = hdr_end - dreq->io_start;
- for (i = 1; i < dreq->mirror_count; i++)
- count = min(count, dreq->mirrors[i].count);
+ nfs_direct_handle_truncated(dreq, hdr, dreq_len);
- dreq->count = count;
- }
+ if (dreq_len > dreq->max_count)
+ dreq_len = dreq->max_count;
+
+ if (dreq->count < dreq_len)
+ dreq->count = dreq_len;
}
/*
cinfo->completion_ops = &nfs_direct_commit_completion_ops;
}
-static inline void nfs_direct_setup_mirroring(struct nfs_direct_req *dreq,
- struct nfs_pageio_descriptor *pgio,
- struct nfs_page *req)
-{
- int mirror_count = 1;
-
- if (pgio->pg_ops->pg_get_mirror_count)
- mirror_count = pgio->pg_ops->pg_get_mirror_count(pgio, req);
-
- dreq->mirror_count = mirror_count;
-}
-
static inline struct nfs_direct_req *nfs_direct_req_alloc(void)
{
struct nfs_direct_req *dreq;
INIT_LIST_HEAD(&dreq->mds_cinfo.list);
dreq->verf.committed = NFS_INVALID_STABLE_HOW; /* not set yet */
INIT_WORK(&dreq->work, nfs_direct_write_schedule_work);
- dreq->mirror_count = 1;
spin_lock_init(&dreq->lock);
return dreq;
struct nfs_direct_req *dreq = hdr->dreq;
spin_lock(&dreq->lock);
- if (test_bit(NFS_IOHDR_ERROR, &hdr->flags))
- dreq->error = hdr->error;
-
if (test_bit(NFS_IOHDR_REDO, &hdr->flags)) {
spin_unlock(&dreq->lock);
goto out_put;
}
- if (hdr->good_bytes != 0)
- nfs_direct_good_bytes(dreq, hdr);
-
- if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
- dreq->error = 0;
-
+ nfs_direct_count_bytes(dreq, hdr);
spin_unlock(&dreq->lock);
while (!list_empty(&hdr->pages)) {
LIST_HEAD(reqs);
struct nfs_commit_info cinfo;
LIST_HEAD(failed);
- int i;
nfs_init_cinfo_from_dreq(&cinfo, dreq);
nfs_direct_write_scan_commit_list(dreq->inode, &reqs, &cinfo);
dreq->count = 0;
+ dreq->max_count = 0;
+ list_for_each_entry(req, &reqs, wb_list)
+ dreq->max_count += req->wb_bytes;
dreq->verf.committed = NFS_INVALID_STABLE_HOW;
nfs_clear_pnfs_ds_commit_verifiers(&dreq->ds_cinfo);
- for (i = 0; i < dreq->mirror_count; i++)
- dreq->mirrors[i].count = 0;
get_dreq(dreq);
nfs_pageio_init_write(&desc, dreq->inode, FLUSH_STABLE, false,
&nfs_direct_write_completion_ops);
desc.pg_dreq = dreq;
- req = nfs_list_entry(reqs.next);
- nfs_direct_setup_mirroring(dreq, &desc, req);
- if (desc.pg_error < 0) {
- list_splice_init(&reqs, &failed);
- goto out_failed;
- }
-
list_for_each_entry_safe(req, tmp, &reqs, wb_list) {
/* Bump the transmission count */
req->wb_nio++;
}
nfs_pageio_complete(&desc);
-out_failed:
while (!list_empty(&failed)) {
req = nfs_list_entry(failed.next);
nfs_list_remove_request(req);
nfs_init_cinfo_from_dreq(&cinfo, dreq);
spin_lock(&dreq->lock);
-
- if (test_bit(NFS_IOHDR_ERROR, &hdr->flags))
- dreq->error = hdr->error;
-
if (test_bit(NFS_IOHDR_REDO, &hdr->flags)) {
spin_unlock(&dreq->lock);
goto out_put;
}
+ nfs_direct_count_bytes(dreq, hdr);
if (hdr->good_bytes != 0) {
- nfs_direct_good_bytes(dreq, hdr);
if (nfs_write_need_commit(hdr)) {
if (dreq->flags == NFS_ODIRECT_RESCHED_WRITES)
request_commit = true;
break;
}
- nfs_direct_setup_mirroring(dreq, &desc, req);
if (desc.pg_error < 0) {
nfs_free_request(req);
result = desc.pg_error;
status = nfs4_call_sync_custom(&task_setup_data);
if (setclientid.sc_cred) {
+ kfree(clp->cl_acceptor);
clp->cl_acceptor = rpcauth_stringify_acceptor(setclientid.sc_cred);
put_rpccred(setclientid.sc_cred);
}
struct nfs_inode *nfsi = NFS_I(inode);
struct nfs_page *head;
- atomic_long_dec(&nfsi->nrequests);
if (nfs_page_group_sync_on_bit(req, PG_REMOVE)) {
head = req->wb_head;
spin_unlock(&mapping->private_lock);
}
- if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags))
+ if (test_and_clear_bit(PG_INODE_REF, &req->wb_flags)) {
nfs_release_request(req);
+ atomic_long_dec(&nfsi->nrequests);
+ }
}
static void
inode->i_mtime = inode->i_ctime = current_time(inode);
di->i_mtime = di->i_ctime = cpu_to_le64(inode->i_mtime.tv_sec);
di->i_mtime_nsec = di->i_ctime_nsec = cpu_to_le32(inode->i_mtime.tv_nsec);
- ocfs2_update_inode_fsync_trans(handle, inode, 1);
+ if (handle)
+ ocfs2_update_inode_fsync_trans(handle, inode, 1);
}
if (handle)
ocfs2_journal_dirty(handle, wc->w_di_bh);
struct ocfs2_dio_write_ctxt *dwc = NULL;
struct buffer_head *di_bh = NULL;
u64 p_blkno;
- loff_t pos = iblock << inode->i_sb->s_blocksize_bits;
+ unsigned int i_blkbits = inode->i_sb->s_blocksize_bits;
+ loff_t pos = iblock << i_blkbits;
+ sector_t endblk = (i_size_read(inode) - 1) >> i_blkbits;
unsigned len, total_len = bh_result->b_size;
int ret = 0, first_get_block = 0;
len = osb->s_clustersize - (pos & (osb->s_clustersize - 1));
len = min(total_len, len);
+ /*
+ * bh_result->b_size is count in get_more_blocks according to write
+ * "pos" and "end", we need map twice to return different buffer state:
+ * 1. area in file size, not set NEW;
+ * 2. area out file size, set NEW.
+ *
+ * iblock endblk
+ * |--------|---------|---------|---------
+ * |<-------area in file------->|
+ */
+
+ if ((iblock <= endblk) &&
+ ((iblock + ((len - 1) >> i_blkbits)) > endblk))
+ len = (endblk - iblock + 1) << i_blkbits;
+
mlog(0, "get block of %lu at %llu:%u req %u\n",
inode->i_ino, pos, len, total_len);
if (desc->c_needs_zero)
set_buffer_new(bh_result);
+ if (iblock > endblk)
+ set_buffer_new(bh_result);
+
/* May sleep in end_io. It should not happen in a irq context. So defer
* it to dio work queue. */
set_buffer_defer_completion(bh_result);
if (inode_alloc)
inode_lock(inode_alloc);
- if (o2info_coherent(&fi->ifi_req)) {
+ if (inode_alloc && o2info_coherent(&fi->ifi_req)) {
status = ocfs2_inode_lock(inode_alloc, &bh, 0);
if (status < 0) {
mlog_errno(status);
return loc->xl_ops->xlo_check_space(loc, xi);
}
-static void ocfs2_xa_add_entry(struct ocfs2_xa_loc *loc, u32 name_hash)
-{
- loc->xl_ops->xlo_add_entry(loc, name_hash);
- loc->xl_entry->xe_name_hash = cpu_to_le32(name_hash);
- /*
- * We can't leave the new entry's xe_name_offset at zero or
- * add_namevalue() will go nuts. We set it to the size of our
- * storage so that it can never be less than any other entry.
- */
- loc->xl_entry->xe_name_offset = cpu_to_le16(loc->xl_size);
-}
-
static void ocfs2_xa_add_namevalue(struct ocfs2_xa_loc *loc,
struct ocfs2_xattr_info *xi)
{
if (rc)
goto out;
- if (loc->xl_entry) {
- if (ocfs2_xa_can_reuse_entry(loc, xi)) {
- orig_value_size = loc->xl_entry->xe_value_size;
- rc = ocfs2_xa_reuse_entry(loc, xi, ctxt);
- if (rc)
- goto out;
- goto alloc_value;
- }
+ if (!loc->xl_entry) {
+ rc = -EINVAL;
+ goto out;
+ }
- if (!ocfs2_xattr_is_local(loc->xl_entry)) {
- orig_clusters = ocfs2_xa_value_clusters(loc);
- rc = ocfs2_xa_value_truncate(loc, 0, ctxt);
- if (rc) {
- mlog_errno(rc);
- ocfs2_xa_cleanup_value_truncate(loc,
- "overwriting",
- orig_clusters);
- goto out;
- }
+ if (ocfs2_xa_can_reuse_entry(loc, xi)) {
+ orig_value_size = loc->xl_entry->xe_value_size;
+ rc = ocfs2_xa_reuse_entry(loc, xi, ctxt);
+ if (rc)
+ goto out;
+ goto alloc_value;
+ }
+
+ if (!ocfs2_xattr_is_local(loc->xl_entry)) {
+ orig_clusters = ocfs2_xa_value_clusters(loc);
+ rc = ocfs2_xa_value_truncate(loc, 0, ctxt);
+ if (rc) {
+ mlog_errno(rc);
+ ocfs2_xa_cleanup_value_truncate(loc,
+ "overwriting",
+ orig_clusters);
+ goto out;
}
- ocfs2_xa_wipe_namevalue(loc);
- } else
- ocfs2_xa_add_entry(loc, name_hash);
+ }
+ ocfs2_xa_wipe_namevalue(loc);
/*
* If we get here, we have a blank entry. Fill it. We grow our
/*
* Note the "unsafe_put_user() semantics: we goto a
* label for errors.
- *
- * Also note how we use a "while()" loop here, even though
- * only the biggest size needs to loop. The compiler (well,
- * at least gcc) is smart enough to turn the smaller sizes
- * into just if-statements, and this way we don't need to
- * care whether 'u64' or 'u32' is the biggest size.
- */
-#define unsafe_copy_loop(dst, src, len, type, label) \
- while (len >= sizeof(type)) { \
- unsafe_put_user(get_unaligned((type *)src), \
- (type __user *)dst, label); \
- dst += sizeof(type); \
- src += sizeof(type); \
- len -= sizeof(type); \
- }
-
-/*
- * We avoid doing 64-bit copies on 32-bit architectures. They
- * might be better, but the component names are mostly small,
- * and the 64-bit cases can end up being much more complex and
- * put much more register pressure on the code, so it's likely
- * not worth the pain of unaligned accesses etc.
- *
- * So limit the copies to "unsigned long" size. I did verify
- * that at least the x86-32 case is ok without this limiting,
- * but I worry about random other legacy 32-bit cases that
- * might not do as well.
- */
-#define unsafe_copy_type(dst, src, len, type, label) do { \
- if (sizeof(type) <= sizeof(unsigned long)) \
- unsafe_copy_loop(dst, src, len, type, label); \
-} while (0)
-
-/*
- * Copy the dirent name to user space, and NUL-terminate
- * it. This should not be a function call, since we're doing
- * the copy inside a "user_access_begin/end()" section.
*/
#define unsafe_copy_dirent_name(_dst, _src, _len, label) do { \
char __user *dst = (_dst); \
const char *src = (_src); \
size_t len = (_len); \
- unsafe_copy_type(dst, src, len, u64, label); \
- unsafe_copy_type(dst, src, len, u32, label); \
- unsafe_copy_type(dst, src, len, u16, label); \
- unsafe_copy_type(dst, src, len, u8, label); \
- unsafe_put_user(0, dst, label); \
+ unsafe_put_user(0, dst+len, label); \
+ unsafe_copy_to_user(dst, src, len, label); \
} while (0)
mutex_lock(&bdev->bd_fsfreeze_mutex);
if (bdev->bd_fsfreeze_count > 0) {
mutex_unlock(&bdev->bd_fsfreeze_mutex);
+ blkdev_put(bdev, mode);
warnf(fc, "%pg: Can't mount, blockdev is frozen", bdev);
return -EBUSY;
}
fc->sget_key = bdev;
s = sget_fc(fc, test_bdev_super_fc, set_bdev_super_fc);
mutex_unlock(&bdev->bd_fsfreeze_mutex);
- if (IS_ERR(s))
+ if (IS_ERR(s)) {
+ blkdev_put(bdev, mode);
return PTR_ERR(s);
+ }
if (s->s_root) {
/* Don't summarily change the RO/RW state. */
struct xfs_mount *mp,
xfs_daddr_t blkno,
size_t numblks,
- int flags,
const struct xfs_buf_ops *ops)
{
struct xfs_buf *bp;
- bp = xfs_buf_get_uncached(mp->m_ddev_targp, numblks, flags);
+ bp = xfs_buf_get_uncached(mp->m_ddev_targp, numblks, 0);
if (!bp)
return NULL;
{
struct xfs_buf *bp;
- bp = xfs_get_aghdr_buf(mp, id->daddr, id->numblks, 0, ops);
+ bp = xfs_get_aghdr_buf(mp, id->daddr, id->numblks, ops);
if (!bp)
return -ENOMEM;
sf = (xfs_attr_shortform_t *)tmpbuffer;
xfs_idata_realloc(dp, -size, XFS_ATTR_FORK);
- xfs_bmap_local_to_extents_empty(dp, XFS_ATTR_FORK);
+ xfs_bmap_local_to_extents_empty(args->trans, dp, XFS_ATTR_FORK);
bp = NULL;
error = xfs_da_grow_inode(args, &blkno);
- if (error) {
- /*
- * If we hit an IO error middle of the transaction inside
- * grow_inode(), we may have inconsistent data. Bail out.
- */
- if (error == -EIO)
- goto out;
- xfs_idata_realloc(dp, size, XFS_ATTR_FORK); /* try to put */
- memcpy(ifp->if_u1.if_data, tmpbuffer, size); /* it back */
+ if (error)
goto out;
- }
ASSERT(blkno == 0);
error = xfs_attr3_leaf_create(args, blkno, &bp);
- if (error) {
- /* xfs_attr3_leaf_create may not have instantiated a block */
- if (bp && (xfs_da_shrink_inode(args, 0, bp) != 0))
- goto out;
- xfs_idata_realloc(dp, size, XFS_ATTR_FORK); /* try to put */
- memcpy(ifp->if_u1.if_data, tmpbuffer, size); /* it back */
+ if (error)
goto out;
- }
memset((char *)&nargs, 0, sizeof(nargs));
nargs.dp = dp;
*/
void
xfs_bmap_local_to_extents_empty(
+ struct xfs_trans *tp,
struct xfs_inode *ip,
int whichfork)
{
ifp->if_u1.if_root = NULL;
ifp->if_height = 0;
XFS_IFORK_FMT_SET(ip, whichfork, XFS_DINODE_FMT_EXTENTS);
+ xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
}
ASSERT(XFS_IFORK_FORMAT(ip, whichfork) == XFS_DINODE_FMT_LOCAL);
if (!ifp->if_bytes) {
- xfs_bmap_local_to_extents_empty(ip, whichfork);
+ xfs_bmap_local_to_extents_empty(tp, ip, whichfork);
flags = XFS_ILOG_CORE;
goto done;
}
/* account for the change in fork size */
xfs_idata_realloc(ip, -ifp->if_bytes, whichfork);
- xfs_bmap_local_to_extents_empty(ip, whichfork);
+ xfs_bmap_local_to_extents_empty(tp, ip, whichfork);
flags |= XFS_ILOG_CORE;
ifp->if_u1.if_root = NULL;
xfs_filblks_t len);
int xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
int xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version);
-void xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
+void xfs_bmap_local_to_extents_empty(struct xfs_trans *tp,
+ struct xfs_inode *ip, int whichfork);
void __xfs_bmap_add_free(struct xfs_trans *tp, xfs_fsblock_t bno,
xfs_filblks_t len, const struct xfs_owner_info *oinfo,
bool skip_discard);
memcpy(sfp, oldsfp, ifp->if_bytes);
xfs_idata_realloc(dp, -ifp->if_bytes, XFS_DATA_FORK);
- xfs_bmap_local_to_extents_empty(dp, XFS_DATA_FORK);
+ xfs_bmap_local_to_extents_empty(tp, dp, XFS_DATA_FORK);
dp->i_d.di_size = 0;
/*
xfs_extlen_t len;
xfs_nlink_t refcount;
bool has_cowflag;
- int error = 0;
bno = be32_to_cpu(rec->refc.rc_startblock);
len = be32_to_cpu(rec->refc.rc_blockcount);
xchk_refcountbt_xref(bs->sc, bno, len, refcount);
- return error;
+ return 0;
}
/* Make sure we have as many refc blocks as the rmap says. */
xfs_filblks_t allocatesize_fsb;
xfs_extlen_t extsz, temp;
xfs_fileoff_t startoffset_fsb;
+ xfs_fileoff_t endoffset_fsb;
int nimaps;
int quota_flag;
int rt;
imapp = &imaps[0];
nimaps = 1;
startoffset_fsb = XFS_B_TO_FSBT(mp, offset);
- allocatesize_fsb = XFS_B_TO_FSB(mp, count);
+ endoffset_fsb = XFS_B_TO_FSB(mp, offset + count);
+ allocatesize_fsb = endoffset_fsb - startoffset_fsb;
/*
* Allocate file space until done or until there is an error
unsigned short page_count, i;
xfs_off_t start, end;
int error;
+ xfs_km_flags_t kmflag_mask = 0;
+
+ /*
+ * assure zeroed buffer for non-read cases.
+ */
+ if (!(flags & XBF_READ)) {
+ kmflag_mask |= KM_ZERO;
+ gfp_mask |= __GFP_ZERO;
+ }
/*
* for buffers that are contained within a single page, just allocate
size = BBTOB(bp->b_length);
if (size < PAGE_SIZE) {
int align_mask = xfs_buftarg_dma_alignment(bp->b_target);
- bp->b_addr = kmem_alloc_io(size, align_mask, KM_NOFS);
+ bp->b_addr = kmem_alloc_io(size, align_mask,
+ KM_NOFS | kmflag_mask);
if (!bp->b_addr) {
/* low memory - use alloc_page loop instead */
goto use_alloc_page;
prev_iclog = iclog;
iclog->ic_data = kmem_alloc_io(log->l_iclog_size, align_mask,
- KM_MAYFAIL);
+ KM_MAYFAIL | KM_ZERO);
if (!iclog->ic_data)
goto out_free_iclog;
#ifdef DEBUG
if (nbblks > 1 && log->l_sectBBsize > 1)
nbblks += log->l_sectBBsize;
nbblks = round_up(nbblks, log->l_sectBBsize);
- return kmem_alloc_io(BBTOB(nbblks), align_mask, KM_MAYFAIL);
+ return kmem_alloc_io(BBTOB(nbblks), align_mask, KM_MAYFAIL | KM_ZERO);
}
/*
# define __GCC4_has_attribute___noclone__ 1
# define __GCC4_has_attribute___nonstring__ 0
# define __GCC4_has_attribute___no_sanitize_address__ (__GNUC_MINOR__ >= 8)
+# define __GCC4_has_attribute___fallthrough__ 0
#endif
/*
# define __noclone
#endif
+/*
+ * Add the pseudo keyword 'fallthrough' so case statement blocks
+ * must end with any of these keywords:
+ * break;
+ * fallthrough;
+ * goto <label>;
+ * return [expression];
+ *
+ * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#Statement-Attributes
+ */
+#if __has_attribute(__fallthrough__)
+# define fallthrough __attribute__((__fallthrough__))
+#else
+# define fallthrough do {} while (0) /* fallthrough */
+#endif
+
/*
* Note the missing underscores.
*
__ADDRESSABLE(sym) \
asm(" .section \"___ksymtab" sec "+" #sym "\", \"a\" \n" \
" .balign 4 \n" \
- "__ksymtab_" #sym NS_SEPARATOR #ns ": \n" \
+ "__ksymtab_" #ns NS_SEPARATOR #sym ": \n" \
" .long " #sym "- . \n" \
" .long __kstrtab_" #sym "- . \n" \
- " .long __kstrtab_ns_" #sym "- . \n" \
+ " .long __kstrtabns_" #sym "- . \n" \
" .previous \n")
#define __KSYMTAB_ENTRY(sym, sec) \
#else
#define __KSYMTAB_ENTRY_NS(sym, sec, ns) \
static const struct kernel_symbol __ksymtab_##sym##__##ns \
- asm("__ksymtab_" #sym NS_SEPARATOR #ns) \
+ asm("__ksymtab_" #ns NS_SEPARATOR #sym) \
__attribute__((section("___ksymtab" sec "+" #sym), used)) \
__aligned(sizeof(void *)) \
- = { (unsigned long)&sym, __kstrtab_##sym, __kstrtab_ns_##sym }
+ = { (unsigned long)&sym, __kstrtab_##sym, __kstrtabns_##sym }
#define __KSYMTAB_ENTRY(sym, sec) \
static const struct kernel_symbol __ksymtab_##sym \
/* For every exported symbol, place a struct in the __ksymtab section */
#define ___EXPORT_SYMBOL_NS(sym, sec, ns) \
___export_symbol_common(sym, sec); \
- static const char __kstrtab_ns_##sym[] \
+ static const char __kstrtabns_##sym[] \
__attribute__((section("__ksymtab_strings"), used, aligned(1))) \
= #ns; \
__KSYMTAB_ENTRY_NS(sym, sec, ns)
/**
* led_set_brightness_sync - set LED brightness synchronously
* @led_cdev: the LED to set
- * @brightness: the brightness to set it to
+ * @value: the brightness to set it to
*
* Set an LED's brightness immediately. This function will block
* the caller for the time required for accessing device registers,
/**
* led_compose_name - compose LED class device name
* @dev: LED controller device object
- * @child: child fwnode_handle describing a LED or a group of synchronized LEDs;
- * it must be provided only for fwnode based LEDs
+ * @init_data: the LED class device initialization data
* @led_classdev_name: composed LED class device name
*
* Create LED class device name basing on the provided init_data argument.
return !cgroup_subsys_enabled(memory_cgrp_subsys);
}
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+ bool in_low_reclaim)
+{
+ if (mem_cgroup_disabled())
+ return 0;
+
+ if (in_low_reclaim)
+ return READ_ONCE(memcg->memory.emin);
+
+ return max(READ_ONCE(memcg->memory.emin),
+ READ_ONCE(memcg->memory.elow));
+}
+
enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
struct mem_cgroup *memcg);
unsigned long mem_cgroup_get_max(struct mem_cgroup *memcg);
+unsigned long mem_cgroup_size(struct mem_cgroup *memcg);
+
void mem_cgroup_print_oom_context(struct mem_cgroup *memcg,
struct task_struct *p);
{
}
+static inline unsigned long mem_cgroup_protection(struct mem_cgroup *memcg,
+ bool in_low_reclaim)
+{
+ return 0;
+}
+
static inline enum mem_cgroup_protection mem_cgroup_protected(
struct mem_cgroup *root, struct mem_cgroup *memcg)
{
return 0;
}
+static inline unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
+{
+ return 0;
+}
+
static inline void
mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p)
{
static inline void mem_cgroup_track_foreign_dirty(struct page *page,
struct bdi_writeback *wb)
{
+ if (mem_cgroup_disabled())
+ return;
+
if (unlikely(&page->mem_cgroup->css != wb->memcg_css))
mem_cgroup_track_foreign_dirty_slowpath(page, wb);
}
* kmalloc is the normal method of allocating memory
* for objects smaller than page size in the kernel.
*
+ * The allocated object address is aligned to at least ARCH_KMALLOC_MINALIGN
+ * bytes. For @size of power of two bytes, the alignment is also guaranteed
+ * to be at least to the size.
+ *
* The @flags argument may be one of the GFP flags defined at
* include/linux/gfp.h and described at
* :ref:`Documentation/core-api/mm-api.rst <mm-api-gfp-flags>`
}
size_t memweight(const void *ptr, size_t bytes);
-void memzero_explicit(void *s, size_t count);
+
+/**
+ * memzero_explicit - Fill a region of memory (e.g. sensitive
+ * keying data) with 0s.
+ * @s: Pointer to the start of the area.
+ * @count: The size of the area.
+ *
+ * Note: usually using memset() is just fine (!), but in cases
+ * where clearing out _local_ data at the end of a scope is
+ * necessary, memzero_explicit() should be used instead in
+ * order to prevent the compiler from optimising away zeroing.
+ *
+ * memzero_explicit() doesn't need an arch-specific version as
+ * it just invokes the one of memset() implicitly.
+ */
+static inline void memzero_explicit(void *s, size_t count)
+{
+ memset(s, 0, count);
+ barrier_data(s);
+}
/**
* kbasename - return the last part of a pathname.
struct mutex recv_mutex;
struct sockaddr_storage srcaddr;
unsigned short srcport;
+ int xprt_err;
/*
* UDP socket buffer size parameters
#ifndef user_access_begin
#define user_access_begin(ptr,len) access_ok(ptr, len)
#define user_access_end() do { } while (0)
-#define unsafe_get_user(x, ptr, err) do { if (unlikely(__get_user(x, ptr))) goto err; } while (0)
-#define unsafe_put_user(x, ptr, err) do { if (unlikely(__put_user(x, ptr))) goto err; } while (0)
+#define unsafe_op_wrap(op, err) do { if (unlikely(op)) goto err; } while (0)
+#define unsafe_get_user(x,p,e) unsafe_op_wrap(__get_user(x,p),e)
+#define unsafe_put_user(x,p,e) unsafe_op_wrap(__put_user(x,p),e)
+#define unsafe_copy_to_user(d,s,l,e) unsafe_op_wrap(__copy_to_user(d,s,l),e)
static inline unsigned long user_access_save(void) { return 0UL; }
static inline void user_access_restore(unsigned long flags) { }
#endif
struct ctl_table t;
int ret;
int threads = max_threads;
- int min = MIN_THREADS;
+ int min = 1;
int max = MAX_THREADS;
t = *table;
if (ret || !write)
return ret;
- set_max_threads(threads);
+ max_threads = threads;
return 0;
}
bool pm_freezing;
bool pm_nosig_freezing;
-/*
- * Temporary export for the deadlock workaround in ata_scsi_hotplug().
- * Remove once the hack becomes unnecessary.
- */
-EXPORT_SYMBOL_GPL(pm_freezing);
-
/* protects freezing and frozen transitions */
static DEFINE_SPINLOCK(freezer_lock);
* after setting panic_cpu) from invoking panic() again.
*/
local_irq_disable();
+ preempt_disable_notrace();
/*
* It's possible to come here directly from a panic-assertion and
EXPORT_SYMBOL(memset);
#endif
-/**
- * memzero_explicit - Fill a region of memory (e.g. sensitive
- * keying data) with 0s.
- * @s: Pointer to the start of the area.
- * @count: The size of the area.
- *
- * Note: usually using memset() is just fine (!), but in cases
- * where clearing out _local_ data at the end of a scope is
- * necessary, memzero_explicit() should be used instead in
- * order to prevent the compiler from optimising away zeroing.
- *
- * memzero_explicit() doesn't need an arch-specific version as
- * it just invokes the one of memset() implicitly.
- */
-void memzero_explicit(void *s, size_t count)
-{
- memset(s, 0, count);
- barrier_data(s);
-}
-EXPORT_SYMBOL(memzero_explicit);
-
#ifndef __HAVE_ARCH_MEMSET16
/**
* memset16() - Fill a memory area with a uint16_t
help
This config option enables the compat VDSO layer.
-config CROSS_COMPILE_COMPAT_VDSO
- string "32 bit Toolchain prefix for compat vDSO"
- default ""
- depends on GENERIC_COMPAT_VDSO
- help
- Defines the cross-compiler prefix for compiling compat vDSO.
- If a 64 bit compiler (i.e. x86_64) can compile the VDSO for
- 32 bit, it does not need to define this parameter.
-
endif
{
int err;
- bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_FREEZABLE |
- WQ_UNBOUND | WQ_SYSFS, 0);
+ bdi_wq = alloc_workqueue("writeback", WQ_MEM_RECLAIM | WQ_UNBOUND |
+ WQ_SYSFS, 0);
if (!bdi_wq)
return -ENOMEM;
return max;
}
+unsigned long mem_cgroup_size(struct mem_cgroup *memcg)
+{
+ return page_counter_read(&memcg->memory);
+}
+
static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
int order)
{
#include <linux/xarray.h>
static DEFINE_XARRAY(pgmap_array);
-#define SECTION_MASK ~((1UL << PA_SECTION_SHIFT) - 1)
-#define SECTION_SIZE (1UL << PA_SECTION_SHIFT)
#ifdef CONFIG_DEV_PAGEMAP_OPS
DEFINE_STATIC_KEY_FALSE(devmap_managed_key);
debug_check_no_obj_freed(page_address(page),
PAGE_SIZE << order);
}
- arch_free_page(page, order);
if (want_init_on_free())
kernel_init_free_pages(page, 1 << order);
kernel_poison_pages(page, 1 << order, 0);
+ /*
+ * arch_free_page() can make the page's contents inaccessible. s390
+ * does this. So nothing which can access the page's contents should
+ * happen after this.
+ */
+ arch_free_page(page, order);
+
if (debug_pagealloc_enabled())
kernel_map_pages(page, 1 << order, 0);
{
char *options = data;
+ if (options) {
+ int err = security_sb_eat_lsm_opts(options, &fc->security);
+ if (err)
+ return err;
+ }
+
while (options != NULL) {
char *this_char = options;
for (;;) {
}
static bool shuffle_param;
-extern int shuffle_show(char *buffer, const struct kernel_param *kp)
+static int shuffle_show(char *buffer, const struct kernel_param *kp)
{
return sprintf(buffer, "%c\n", test_bit(SHUFFLE_ENABLE, &shuffle_state)
? 'Y' : 'N');
unsigned int useroffset, unsigned int usersize)
{
int err;
+ unsigned int align = ARCH_KMALLOC_MINALIGN;
s->name = name;
s->size = s->object_size = size;
- s->align = calculate_alignment(flags, ARCH_KMALLOC_MINALIGN, size);
+
+ /*
+ * For power of two sizes, guarantee natural alignment for kmalloc
+ * caches, regardless of SL*B debugging options.
+ */
+ if (is_power_of_2(size))
+ align = max(align, size);
+ s->align = calculate_alignment(flags, align, size);
+
s->useroffset = useroffset;
s->usersize = usersize;
*/
void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
{
- void *ret;
+ void *ret = NULL;
struct page *page;
flags |= __GFP_COMP;
page = alloc_pages(flags, order);
- ret = page ? page_address(page) : NULL;
+ if (likely(page)) {
+ ret = page_address(page);
+ mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
+ 1 << order);
+ }
ret = kasan_kmalloc_large(ret, size, flags);
/* As ret might get tagged, call kmemleak hook after KASAN. */
kmemleak_alloc(ret, size, 1, flags);
static void *slob_new_pages(gfp_t gfp, int order, int node)
{
- void *page;
+ struct page *page;
#ifdef CONFIG_NUMA
if (node != NUMA_NO_NODE)
if (!page)
return NULL;
+ mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
+ 1 << order);
return page_address(page);
}
static void slob_free_pages(void *b, int order)
{
+ struct page *sp = virt_to_page(b);
+
if (current->reclaim_state)
current->reclaim_state->reclaimed_slab += 1 << order;
- free_pages((unsigned long)b, order);
+
+ mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE,
+ -(1 << order));
+ __free_pages(sp, order);
}
/*
* @sp: Page to look in.
* @size: Size of the allocation.
* @align: Allocation alignment.
+ * @align_offset: Offset in the allocated block that will be aligned.
* @page_removed_from_list: Return parameter.
*
* Tries to find a chunk of memory at least @size bytes big within @page.
* true (set to false otherwise).
*/
static void *slob_page_alloc(struct page *sp, size_t size, int align,
- bool *page_removed_from_list)
+ int align_offset, bool *page_removed_from_list)
{
slob_t *prev, *cur, *aligned = NULL;
int delta = 0, units = SLOB_UNITS(size);
for (prev = NULL, cur = sp->freelist; ; prev = cur, cur = slob_next(cur)) {
slobidx_t avail = slob_units(cur);
+ /*
+ * 'aligned' will hold the address of the slob block so that the
+ * address 'aligned'+'align_offset' is aligned according to the
+ * 'align' parameter. This is for kmalloc() which prepends the
+ * allocated block with its size, so that the block itself is
+ * aligned when needed.
+ */
if (align) {
- aligned = (slob_t *)ALIGN((unsigned long)cur, align);
+ aligned = (slob_t *)
+ (ALIGN((unsigned long)cur + align_offset, align)
+ - align_offset);
delta = aligned - cur;
}
if (avail >= units + delta) { /* room enough? */
/*
* slob_alloc: entry point into the slob allocator.
*/
-static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
+static void *slob_alloc(size_t size, gfp_t gfp, int align, int node,
+ int align_offset)
{
struct page *sp;
struct list_head *slob_list;
if (sp->units < SLOB_UNITS(size))
continue;
- b = slob_page_alloc(sp, size, align, &page_removed_from_list);
+ b = slob_page_alloc(sp, size, align, align_offset, &page_removed_from_list);
if (!b)
continue;
INIT_LIST_HEAD(&sp->slab_list);
set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
set_slob_page_free(sp, slob_list);
- b = slob_page_alloc(sp, size, align, &_unused);
+ b = slob_page_alloc(sp, size, align, align_offset, &_unused);
BUG_ON(!b);
spin_unlock_irqrestore(&slob_lock, flags);
}
__do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
{
unsigned int *m;
- int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+ int minalign = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
void *ret;
gfp &= gfp_allowed_mask;
fs_reclaim_acquire(gfp);
fs_reclaim_release(gfp);
- if (size < PAGE_SIZE - align) {
+ if (size < PAGE_SIZE - minalign) {
+ int align = minalign;
+
+ /*
+ * For power of two sizes, guarantee natural alignment for
+ * kmalloc()'d objects.
+ */
+ if (is_power_of_2(size))
+ align = max(minalign, (int) size);
+
if (!size)
return ZERO_SIZE_PTR;
- m = slob_alloc(size + align, gfp, align, node);
+ m = slob_alloc(size + minalign, gfp, align, node, minalign);
if (!m)
return NULL;
*m = size;
- ret = (void *)m + align;
+ ret = (void *)m + minalign;
trace_kmalloc_node(caller, ret,
- size, size + align, gfp, node);
+ size, size + minalign, gfp, node);
} else {
unsigned int order = get_order(size);
int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
unsigned int *m = (unsigned int *)(block - align);
slob_free(m, *m + align);
- } else
- __free_pages(sp, compound_order(sp));
+ } else {
+ unsigned int order = compound_order(sp);
+ mod_node_page_state(page_pgdat(sp), NR_SLAB_UNRECLAIMABLE,
+ -(1 << order));
+ __free_pages(sp, order);
+
+ }
}
EXPORT_SYMBOL(kfree);
fs_reclaim_release(flags);
if (c->size < PAGE_SIZE) {
- b = slob_alloc(c->size, flags, c->align, node);
+ b = slob_alloc(c->size, flags, c->align, node, 0);
trace_kmem_cache_alloc_node(_RET_IP_, b, c->object_size,
SLOB_UNITS(c->size) * SLOB_UNIT,
flags, node);
{
struct page *page;
void *ptr = NULL;
+ unsigned int order = get_order(size);
flags |= __GFP_COMP;
- page = alloc_pages_node(node, flags, get_order(size));
- if (page)
+ page = alloc_pages_node(node, flags, order);
+ if (page) {
ptr = page_address(page);
+ mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
+ 1 << order);
+ }
return kmalloc_large_node_hook(ptr, size, flags);
}
page = virt_to_head_page(x);
if (unlikely(!PageSlab(page))) {
+ unsigned int order = compound_order(page);
+
BUG_ON(!PageCompound(page));
kfree_hook(object);
- __free_pages(page, compound_order(page));
+ mod_node_page_state(page_pgdat(page), NR_SLAB_UNRECLAIMABLE,
+ -(1 << order));
+ __free_pages(page, order);
return;
}
slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
return next_present_section_nr(-1);
}
-void subsection_mask_set(unsigned long *map, unsigned long pfn,
+static void subsection_mask_set(unsigned long *map, unsigned long pfn,
unsigned long nr_pages)
{
int idx = subsection_map_index(pfn);
* "hierarchy" or "local").
*
* To be used as memcg event method.
+ *
+ * Return: 0 on success, -ENOMEM on memory failure or -EINVAL if @args could
+ * not be parsed.
*/
int vmpressure_register_event(struct mem_cgroup *memcg,
struct eventfd_ctx *eventfd, const char *args)
struct vmpressure *vmpr = memcg_to_vmpressure(memcg);
struct vmpressure_event *ev;
enum vmpressure_modes mode = VMPRESSURE_NO_PASSTHROUGH;
- enum vmpressure_levels level = -1;
+ enum vmpressure_levels level;
char *spec, *spec_orig;
char *token;
int ret = 0;
/* Find required level */
token = strsep(&spec, ",");
- level = match_string(vmpressure_str_levels, VMPRESSURE_NUM_LEVELS, token);
- if (level < 0) {
- ret = level;
+ ret = match_string(vmpressure_str_levels, VMPRESSURE_NUM_LEVELS, token);
+ if (ret < 0)
goto out;
- }
+ level = ret;
/* Find optional mode */
token = strsep(&spec, ",");
if (token) {
- mode = match_string(vmpressure_str_modes, VMPRESSURE_NUM_MODES, token);
- if (mode < 0) {
- ret = mode;
+ ret = match_string(vmpressure_str_modes, VMPRESSURE_NUM_MODES, token);
+ if (ret < 0)
goto out;
- }
+ mode = ret;
}
ev = kzalloc(sizeof(*ev), GFP_KERNEL);
mutex_lock(&vmpr->events_lock);
list_add(&ev->node, &vmpr->events);
mutex_unlock(&vmpr->events_lock);
+ ret = 0;
out:
kfree(spec_orig);
return ret;
*lru_pages = 0;
for_each_evictable_lru(lru) {
int file = is_file_lru(lru);
- unsigned long size;
+ unsigned long lruvec_size;
unsigned long scan;
+ unsigned long protection;
+
+ lruvec_size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
+ protection = mem_cgroup_protection(memcg,
+ sc->memcg_low_reclaim);
+
+ if (protection) {
+ /*
+ * Scale a cgroup's reclaim pressure by proportioning
+ * its current usage to its memory.low or memory.min
+ * setting.
+ *
+ * This is important, as otherwise scanning aggression
+ * becomes extremely binary -- from nothing as we
+ * approach the memory protection threshold, to totally
+ * nominal as we exceed it. This results in requiring
+ * setting extremely liberal protection thresholds. It
+ * also means we simply get no protection at all if we
+ * set it too low, which is not ideal.
+ *
+ * If there is any protection in place, we reduce scan
+ * pressure by how much of the total memory used is
+ * within protection thresholds.
+ *
+ * There is one special case: in the first reclaim pass,
+ * we skip over all groups that are within their low
+ * protection. If that fails to reclaim enough pages to
+ * satisfy the reclaim goal, we come back and override
+ * the best-effort low protection. However, we still
+ * ideally want to honor how well-behaved groups are in
+ * that case instead of simply punishing them all
+ * equally. As such, we reclaim them based on how much
+ * memory they are using, reducing the scan pressure
+ * again by how much of the total memory used is under
+ * hard protection.
+ */
+ unsigned long cgroup_size = mem_cgroup_size(memcg);
+
+ /* Avoid TOCTOU with earlier protection check */
+ cgroup_size = max(cgroup_size, protection);
+
+ scan = lruvec_size - lruvec_size * protection /
+ cgroup_size;
+
+ /*
+ * Minimally target SWAP_CLUSTER_MAX pages to keep
+ * reclaim moving forwards, avoiding decremeting
+ * sc->priority further than desirable.
+ */
+ scan = max(scan, SWAP_CLUSTER_MAX);
+ } else {
+ scan = lruvec_size;
+ }
+
+ scan >>= sc->priority;
- size = lruvec_lru_size(lruvec, lru, sc->reclaim_idx);
- scan = size >> sc->priority;
/*
* If the cgroup's already been deleted, make sure to
* scrape out the remaining cache.
*/
if (!scan && !mem_cgroup_online(memcg))
- scan = min(size, SWAP_CLUSTER_MAX);
+ scan = min(lruvec_size, SWAP_CLUSTER_MAX);
switch (scan_balance) {
case SCAN_EQUAL:
case SCAN_ANON:
/* Scan one type exclusively */
if ((scan_balance == SCAN_FILE) != file) {
- size = 0;
+ lruvec_size = 0;
scan = 0;
}
break;
BUG();
}
- *lru_pages += size;
+ *lru_pages += lruvec_size;
nr[lru] = scan;
}
}
memcg_memory_event(memcg, MEMCG_LOW);
break;
case MEMCG_PROT_NONE:
+ /*
+ * All protection thresholds breached. We may
+ * still choose to vary the scan pressure
+ * applied based on by how much the cgroup in
+ * question has exceeded its protection
+ * thresholds (see get_scan_count).
+ */
break;
}
struct z3fold_header *zhdr;
struct page *page;
enum buddy bud;
+ bool page_claimed;
zhdr = handle_to_z3fold_header(handle);
page = virt_to_page(zhdr);
+ page_claimed = test_and_set_bit(PAGE_CLAIMED, &page->private);
if (test_bit(PAGE_HEADLESS, &page->private)) {
/* if a headless page is under reclaim, just leave.
* has not been set before, we release this page
* immediately so we don't care about its value any more.
*/
- if (!test_and_set_bit(PAGE_CLAIMED, &page->private)) {
+ if (!page_claimed) {
spin_lock(&pool->lock);
list_del(&page->lru);
spin_unlock(&pool->lock);
atomic64_dec(&pool->pages_nr);
return;
}
- if (test_bit(PAGE_CLAIMED, &page->private)) {
+ if (page_claimed) {
+ /* the page has not been claimed by us */
z3fold_page_unlock(zhdr);
return;
}
if (unlikely(PageIsolated(page)) ||
test_and_set_bit(NEEDS_COMPACTING, &page->private)) {
z3fold_page_unlock(zhdr);
+ clear_bit(PAGE_CLAIMED, &page->private);
return;
}
if (zhdr->cpu < 0 || !cpu_online(zhdr->cpu)) {
zhdr->cpu = -1;
kref_get(&zhdr->refcount);
do_compact_page(zhdr, true);
+ clear_bit(PAGE_CLAIMED, &page->private);
return;
}
kref_get(&zhdr->refcount);
queue_work_on(zhdr->cpu, pool->compact_wq, &zhdr->work);
+ clear_bit(PAGE_CLAIMED, &page->private);
z3fold_page_unlock(zhdr);
}
case SCTP_PARAM_SET_PRIMARY:
if (ep->asconf_enable)
break;
- goto fallthrough;
+ goto unhandled;
case SCTP_PARAM_HOST_NAME_ADDRESS:
/* Tell the peer, we won't support this param. */
case SCTP_PARAM_FWD_TSN_SUPPORT:
if (ep->prsctp_enable)
break;
- goto fallthrough;
+ goto unhandled;
case SCTP_PARAM_RANDOM:
if (!ep->auth_enable)
- goto fallthrough;
+ goto unhandled;
/* SCTP-AUTH: Secion 6.1
* If the random number is not 32 byte long the association
case SCTP_PARAM_CHUNKS:
if (!ep->auth_enable)
- goto fallthrough;
+ goto unhandled;
/* SCTP-AUTH: Section 3.2
* The CHUNKS parameter MUST be included once in the INIT or
case SCTP_PARAM_HMAC_ALGO:
if (!ep->auth_enable)
- goto fallthrough;
+ goto unhandled;
hmacs = (struct sctp_hmac_algo_param *)param.p;
n_elt = (ntohs(param.p->length) -
retval = SCTP_IERROR_ABORT;
}
break;
-fallthrough:
+unhandled:
default:
pr_debug("%s: unrecognized param:%d for chunk:%d\n",
__func__, ntohs(param.p->type), cid);
{
struct sock_xprt *transport;
struct rpc_xprt *xprt;
- int err;
read_lock_bh(&sk->sk_callback_lock);
if (!(xprt = xprt_from_sock(sk)))
goto out;
transport = container_of(xprt, struct sock_xprt, xprt);
- err = -sk->sk_err;
- if (err == 0)
+ transport->xprt_err = -sk->sk_err;
+ if (transport->xprt_err == 0)
goto out;
dprintk("RPC: xs_error_report client %p, error=%d...\n",
- xprt, -err);
- trace_rpc_socket_error(xprt, sk->sk_socket, err);
+ xprt, -transport->xprt_err);
+ trace_rpc_socket_error(xprt, sk->sk_socket, transport->xprt_err);
+
+ /* barrier ensures xprt_err is set before XPRT_SOCK_WAKE_ERROR */
+ smp_mb__before_atomic();
xs_run_error_worker(transport, XPRT_SOCK_WAKE_ERROR);
out:
read_unlock_bh(&sk->sk_callback_lock);
static void xs_wake_error(struct sock_xprt *transport)
{
int sockerr;
- int sockerr_len = sizeof(sockerr);
if (!test_bit(XPRT_SOCK_WAKE_ERROR, &transport->sock_state))
return;
goto out;
if (!test_and_clear_bit(XPRT_SOCK_WAKE_ERROR, &transport->sock_state))
goto out;
- if (kernel_getsockopt(transport->sock, SOL_SOCKET, SO_ERROR,
- (char *)&sockerr, &sockerr_len) != 0)
- goto out;
+ sockerr = xchg(&transport->xprt_err, 0);
if (sockerr < 0)
xprt_wake_pending_tasks(&transport->xprt, sockerr);
out:
/// add a missing namespace tag to a module source file.
///
+virtual report
+
@has_ns_import@
declarer name MODULE_IMPORT_NS;
identifier virtual.ns;
struct module *module;
unsigned int crc;
int crc_valid;
- const char *namespace;
+ char *namespace;
unsigned int weak:1;
unsigned int vmlinux:1; /* 1 if symbol is defined in vmlinux */
unsigned int kernel:1; /* 1 if symbol is from kernel
return export_unknown;
}
-static const char *sym_extract_namespace(const char **symname)
+static char *sym_extract_namespace(const char **symname)
{
- size_t n;
- char *dupsymname;
+ char *namespace = NULL;
+ char *ns_separator;
- n = strcspn(*symname, ".");
- if (n < strlen(*symname) - 1) {
- dupsymname = NOFAIL(strdup(*symname));
- dupsymname[n] = '\0';
- *symname = dupsymname;
- return dupsymname + n + 1;
+ ns_separator = strchr(*symname, '.');
+ if (ns_separator) {
+ namespace = NOFAIL(strndup(*symname, ns_separator - *symname));
+ *symname = ns_separator + 1;
}
- return NULL;
+ return namespace;
}
/**
if (!s) {
s = new_symbol(name, mod, export);
- s->namespace = namespace;
} else {
if (!s->preloaded) {
warn("%s: '%s' exported twice. Previous export was in %s%s\n",
s->module = mod;
}
}
+ free(s->namespace);
+ s->namespace = namespace ? strdup(namespace) : NULL;
s->preloaded = 0;
s->vmlinux = is_vmlinux(mod->name);
s->kernel = 0;
unsigned int crc;
enum export export;
bool is_crc = false;
- const char *name, *namespace;
+ const char *name;
+ char *namespace;
if ((!is_vmlinux(mod->name) || mod->is_dot_o) &&
strstarts(symname, "__ksymtab"))
name = symname + strlen("__ksymtab_");
namespace = sym_extract_namespace(&name);
sym_add_exported(name, namespace, mod, export);
+ free(namespace);
}
if (strcmp(symname, "init_module") == 0)
mod->has_init = 1;
else
basename = mod->name;
- if (exp->namespace) {
+ if (exp->namespace && exp->namespace[0]) {
add_namespace(&mod->required_namespaces,
exp->namespace);
-#!/bin/bash
+#!/bin/sh
# SPDX-License-Identifier: GPL-2.0
# Linux kernel symbol namespace import generator
#
for source_file in $mod_source_files; do
sed '/MODULE_IMPORT_NS/Q' $source_file > ${source_file}.tmp
offset=$(wc -l ${source_file}.tmp | awk '{print $1;}')
- cat $source_file | grep MODULE_IMPORT_NS | sort -u >> ${source_file}.tmp
+ cat $source_file | grep MODULE_IMPORT_NS | LANG=C sort -u >> ${source_file}.tmp
tail -n +$((offset +1)) ${source_file} | grep -v MODULE_IMPORT_NS >> ${source_file}.tmp
if ! diff -q ${source_file} ${source_file}.tmp; then
mv ${source_file}.tmp ${source_file}
rc = string_to_context_struct(args->newp, NULL, s,
newc, SECSID_NULL);
if (rc == -EINVAL) {
- /* Retain string representation for later mapping. */
+ /*
+ * Retain string representation for later mapping.
+ *
+ * IMPORTANT: We need to copy the contents of oldc->str
+ * back into s again because string_to_context_struct()
+ * may have garbled it.
+ */
+ memcpy(s, oldc->str, oldc->len);
context_init(newc);
newc->str = s;
newc->len = oldc->len;
TARGETS_HOTPLUG = cpu-hotplug
TARGETS_HOTPLUG += memory-hotplug
+# User can optionally provide a TARGETS skiplist.
+SKIP_TARGETS ?=
+ifneq ($(SKIP_TARGETS),)
+ TMP := $(filter-out $(SKIP_TARGETS), $(TARGETS))
+ override TARGETS := $(TMP)
+endif
+
# Clear LDFLAGS and MAKEFLAGS if called from main
# Makefile to avoid test build failures when test
# Makefile doesn't have explicit build rules.
# 1. output_dir=kernel_src
# 2. a separate output directory is specified using O= KBUILD_OUTPUT
# 3. a separate output directory is specified using KBUILD_OUTPUT
+# Avoid conflict with INSTALL_PATH set by the main Makefile
#
-INSTALL_PATH ?= $(BUILD)/install
-INSTALL_PATH := $(abspath $(INSTALL_PATH))
+KSFT_INSTALL_PATH ?= $(BUILD)/kselftest_install
+KSFT_INSTALL_PATH := $(abspath $(KSFT_INSTALL_PATH))
+# Avoid changing the rest of the logic here and lib.mk.
+INSTALL_PATH := $(KSFT_INSTALL_PATH)
ALL_SCRIPT := $(INSTALL_PATH)/run_kselftest.sh
install: all
echo " cat /dev/null > \$$logfile" >> $(ALL_SCRIPT)
echo "fi" >> $(ALL_SCRIPT)
+ @# While building run_kselftest.sh skip also non-existent TARGET dirs:
+ @# they could be the result of a build failure and should NOT be
+ @# included in the generated runlist.
for TARGET in $(TARGETS); do \
BUILD_TARGET=$$BUILD/$$TARGET; \
+ [ ! -d $$INSTALL_PATH/$$TARGET ] && echo "Skipping non-existent dir: $$TARGET" && continue; \
echo "[ -w /dev/kmsg ] && echo \"kselftest: Running tests in $$TARGET\" >> /dev/kmsg" >> $(ALL_SCRIPT); \
echo "cd $$TARGET" >> $(ALL_SCRIPT); \
echo -n "run_many" >> $(ALL_SCRIPT); \
+ echo -n "Emit Tests for $$TARGET\n"; \
$(MAKE) -s --no-print-directory OUTPUT=$$BUILD_TARGET -C $$TARGET emit_tests >> $(ALL_SCRIPT); \
echo "" >> $(ALL_SCRIPT); \
echo "cd \$$ROOT" >> $(ALL_SCRIPT); \
#
# Runs a set of tests in a given subdirectory.
export skip_rc=4
+export timeout_rc=124
export logfile=/dev/stdout
export per_test_logging=
+# Defaults for "settings" file fields:
+# "timeout" how many seconds to let each test run before failing.
+export kselftest_default_timeout=45
+
# There isn't a shell-agnostic way to find the path of a sourced file,
# so we must rely on BASE_DIR being set to find other tools.
if [ -z "$BASE_DIR" ]; then
fi
}
+tap_timeout()
+{
+ # Make sure tests will time out if utility is available.
+ if [ -x /usr/bin/timeout ] ; then
+ /usr/bin/timeout "$kselftest_timeout" "$1"
+ else
+ "$1"
+ fi
+}
+
run_one()
{
DIR="$1"
BASENAME_TEST=$(basename $TEST)
+ # Reset any "settings"-file variables.
+ export kselftest_timeout="$kselftest_default_timeout"
+ # Load per-test-directory kselftest "settings" file.
+ settings="$BASE_DIR/$DIR/settings"
+ if [ -r "$settings" ] ; then
+ while read line ; do
+ field=$(echo "$line" | cut -d= -f1)
+ value=$(echo "$line" | cut -d= -f2-)
+ eval "kselftest_$field"="$value"
+ done < "$settings"
+ fi
+
TEST_HDR_MSG="selftests: $DIR: $BASENAME_TEST"
echo "# $TEST_HDR_MSG"
if [ ! -x "$TEST" ]; then
echo "not ok $test_num $TEST_HDR_MSG"
else
cd `dirname $TEST` > /dev/null
- (((((./$BASENAME_TEST 2>&1; echo $? >&3) |
+ ((((( tap_timeout ./$BASENAME_TEST 2>&1; echo $? >&3) |
tap_prefix >&4) 3>&1) |
(read xs; exit $xs)) 4>>"$logfile" &&
echo "ok $test_num $TEST_HDR_MSG") ||
- (if [ $? -eq $skip_rc ]; then \
+ (rc=$?; \
+ if [ $rc -eq $skip_rc ]; then \
echo "not ok $test_num $TEST_HDR_MSG # SKIP"
+ elif [ $rc -eq $timeout_rc ]; then \
+ echo "not ok $test_num $TEST_HDR_MSG # TIMEOUT"
else
- echo "not ok $test_num $TEST_HDR_MSG"
+ echo "not ok $test_num $TEST_HDR_MSG # exit=$rc"
fi)
cd - >/dev/null
fi
echo "$0: Installing in specified location - $install_loc ..."
fi
- install_dir=$install_loc/kselftest
+ install_dir=$install_loc/kselftest_install
# Create install directory
mkdir -p $install_dir
# Build tests
- INSTALL_PATH=$install_dir make install
+ KSFT_INSTALL_PATH=$install_dir make install
}
main "$@"
nrthreads = strtoul(optarg, NULL, 10);
break;
case 'l':
- strncpy(logdir, optarg, LOGDIR_NAME_SIZE);
+ strncpy(logdir, optarg, LOGDIR_NAME_SIZE - 1);
break;
case 't':
run_time = strtoul(optarg, NULL, 10);
--- /dev/null
+timeout=90
int fd;
const char v = 'V';
-static const char sopts[] = "bdehp:t:Tn:NLf:";
+static const char sopts[] = "bdehp:t:Tn:NLf:i";
static const struct option lopts[] = {
{"bootstatus", no_argument, NULL, 'b'},
{"disable", no_argument, NULL, 'd'},
{"getpretimeout", no_argument, NULL, 'N'},
{"gettimeleft", no_argument, NULL, 'L'},
{"file", required_argument, NULL, 'f'},
+ {"info", no_argument, NULL, 'i'},
{NULL, no_argument, NULL, 0x0}
};
printf("Usage: %s [options]\n", progname);
printf(" -f, --file\t\tOpen watchdog device file\n");
printf("\t\t\tDefault is /dev/watchdog\n");
+ printf(" -i, --info\t\tShow watchdog_info\n");
printf(" -b, --bootstatus\tGet last boot status (Watchdog/POR)\n");
printf(" -d, --disable\t\tTurn off the watchdog timer\n");
printf(" -e, --enable\t\tTurn on the watchdog timer\n");
int c;
int oneshot = 0;
char *file = "/dev/watchdog";
+ struct watchdog_info info;
setbuf(stdout, NULL);
exit(-1);
}
+ /*
+ * Validate that `file` is a watchdog device
+ */
+ ret = ioctl(fd, WDIOC_GETSUPPORT, &info);
+ if (ret) {
+ printf("WDIOC_GETSUPPORT error '%s'\n", strerror(errno));
+ close(fd);
+ exit(ret);
+ }
+
optind = 0;
while ((c = getopt_long(argc, argv, sopts, lopts, NULL)) != -1) {
case 'f':
/* Handled above */
break;
+ case 'i':
+ /*
+ * watchdog_info was obtained as part of file open
+ * validation. So we just show it here.
+ */
+ oneshot = 1;
+ printf("watchdog_info:\n");
+ printf(" identity:\t\t%s\n", info.identity);
+ printf(" firmware_version:\t%u\n",
+ info.firmware_version);
+ printf(" options:\t\t%08x\n", info.options);
+ break;
default:
usage(argv[0]);