tools/libbpf: improve the pr_debug statements to contain section numbers
While debugging a bpf ELF loading issue, I needed to correlate the
ELF section number with the failed relocation section reference.
Thus, add section numbers/index to the pr_debug.
In debug mode, also print section that were skipped. This helped
me identify that a section (.eh_frame) was skipped, and this was
the reason the relocation section (.rel.eh_frame) could not find
that section number.
The section numbers corresponds to the readelf tools Section Headers [Nr].
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
bpf: Sync kernel ABI header with tooling header for bpf_common.h
I recently fixed up a lot of commits that forgot to keep the tooling
headers in sync. And then I forgot to do the same thing in commit cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes"). Let correct
that before people notice ;-).
Lawrence did partly fix/sync this for bpf.h in commit d6d4f60c3a09
("bpf: add selftest for tcpbpf").
Fixes: cb5f7334d479 ("bpf: add comments to BPF ld/ldx sizes") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann [Thu, 8 Feb 2018 10:59:51 +0000 (11:59 +0100)]
Merge branch 'bpf-misc-nfp-bpftool-doc-fixes'
Jakub Kicinski says:
====================
First patch in this series fixes applying the relocation to immediate
load instructions in the NFP JIT.
The remaining patches come from Quentin. Small addition to libbpf
makes sure it recognizes all standard section names. Makefile in
bpftool/Documentation is improved to explicitly check for rst2man
being installed on the system, otherwise we risk installing empty
files. Man page for bpftool-map is corrected to include program
as a potential value for map of programs.
Last two patches are slightly longer, those update bash completions to
include this release cycle's additions from Roman. Maybe the use of
Fixes tags is slightly frivolous there, but having bash completions
which don't cover all commands and options could be disruptive to work
flow for users.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Quentin Monnet [Thu, 8 Feb 2018 04:27:15 +0000 (20:27 -0800)]
tools: bpftool: make syntax for program map update explicit in man page
Specify in the documentation that when using bpftool to update a map of
type BPF_MAP_TYPE_PROG_ARRAY, the syntax for the program used as a value
should use the "id|tag|pinned" keywords convention, as used with
"bpftool prog" commands.
Fixes: ff69c21a85a4 ("tools: bpftool: add documentation") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Quentin Monnet [Thu, 8 Feb 2018 04:27:14 +0000 (20:27 -0800)]
tools: bpftool: exit doc Makefile early if rst2man is not available
If rst2man is not available on the system, running `make doc` from the
bpftool directory fails with an error message. However, it creates empty
manual pages (.8 files in this case). A subsequent call to `make
doc-install` would then succeed and install those empty man pages on the
system.
To prevent this, raise a Makefile error and exit immediately if rst2man
is not available before generating the pages from the rst documentation.
Fixes: ff69c21a85a4 ("tools: bpftool: add documentation") Reported-by: Jason van Aaardt <jason.vanaardt@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Quentin Monnet [Thu, 8 Feb 2018 04:27:13 +0000 (20:27 -0800)]
libbpf: complete list of strings for guessing program type
It seems that the type guessing feature for libbpf, based on the name of
the ELF section the program is located in, was inspired from
samples/bpf/prog_load.c, which was not used by any sample for loading
programs of certain types such as TC actions and classifiers, or
LWT-related types. As a consequence, libbpf is not able to guess the
type of such programs and to load them automatically if type is not
provided to the `bpf_load_prog()` function.
Add ELF section names associated to those eBPF program types so that
they can be loaded with e.g. bpftool as well.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Naresh Kamboju [Wed, 7 Feb 2018 18:15:34 +0000 (23:45 +0530)]
selftests: bpf: test_kmod.sh: check the module path before insmod
test_kmod.sh reported false failure when module not present.
Check test_bpf.ko is present in the path before loading it.
Two cases to be addressed here,
In the development process of test_bpf.c unit testing will be done by
developers by using "insmod $SRC_TREE/lib/test_bpf.ko"
On the other hand testers run full tests by installing modules on device
under test (DUT) and followed by modprobe to insert the modules accordingly.
Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Daniel Borkmann [Tue, 6 Feb 2018 10:39:32 +0000 (11:39 +0100)]
Merge branch 'bpf-sockmap-fixes'
John Fastabend says:
====================
A set of fixes for sockmap to resolve programs referencing sockmaps
and closing without deleting all entries in the map and/or not detaching
BPF programs attached to the map. Both leaving entries in the map and
not detaching programs may result in the map failing to be removed by
BPF infrastructure due to reference counts never reaching zero.
For this we pull in the ULP infrastructure to hook into the close()
hook of the sock layer. This seemed natural because we have additional
sockmap features (to add support for TX hooks) that will also use the
ULP infrastructure. This allows us to cleanup entries in the map when
socks are closed() and avoid trying to get the sk_state_change() hook
to fire in all cases.
The second issue resolved here occurs when users don't detach
programs. The gist is a refcnt issue resolved by implementing the
release callback. See patch for details.
For testing I ran both sample/sockmap and selftests bpf/test_maps.c.
Dave Watson ran TLS test suite on v1 version of the patches without
the put_module error path change.
v4 fix missing rcu_unlock()
v3 wrap psock reference in RCU
v2 changes rebased onto bpf-next with small update adding module_put
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 5 Feb 2018 18:17:54 +0000 (10:17 -0800)]
bpf: sockmap, fix leaking maps with attached but not detached progs
When a program is attached to a map we increment the program refcnt
to ensure that the program is not removed while it is potentially
being referenced from sockmap side. However, if this same program
also references the map (this is a reasonably common pattern in
my programs) then the verifier will also increment the maps refcnt
from the verifier. This is to ensure the map doesn't get garbage
collected while the program has a reference to it.
So we are left in a state where the map holds the refcnt on the
program stopping it from being removed and releasing the map refcnt.
And vice versa the program holds a refcnt on the map stopping it
from releasing the refcnt on the prog.
All this is fine as long as users detach the program while the
map fd is still around. But, if the user omits this detach command
we are left with a dangling map we can no longer release.
To resolve this when the map fd is released decrement the program
references and remove any reference from the map to the program.
This fixes the issue with possibly dangling map and creates a
user side API constraint. That is, the map fd must be held open
for programs to be attached to a map.
Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 5 Feb 2018 18:17:49 +0000 (10:17 -0800)]
bpf: sockmap, add sock close() hook to remove socks
The selftests test_maps program was leaving dangling BPF sockmap
programs around because not all psock elements were removed from
the map. The elements in turn hold a reference on the BPF program
they are attached to causing BPF programs to stay open even after
test_maps has completed.
The original intent was that sk_state_change() would be called
when TCP socks went through TCP_CLOSE state. However, because
socks may be in SOCK_DEAD state or the sock may be a listening
socket the event is not always triggered.
To resolve this use the ULP infrastructure and register our own
proto close() handler. This fixes the above case.
Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Reported-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 5 Feb 2018 18:17:43 +0000 (10:17 -0800)]
net: add a UID to use for ULP socket assignment
Create a UID field and enum that can be used to assign ULPs to
sockets. This saves a set of string comparisons if the ULP id
is known.
For sockmap, which is added in the next patches, a ULP is used to
hook into TCP sockets close state. In this case the ULP being added
is done at map insert time and the ULP is known and done on the kernel
side. In this case the named lookup is not needed. Because we don't
want to expose psock internals to user space socket options a user
visible flag is also added. For TLS this is set for BPF it will be
cleared.
Alos remove pr_notice, user gets an error code back and should check
that rather than rely on logs.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Yonghong Song [Tue, 6 Feb 2018 00:25:24 +0000 (16:25 -0800)]
tools/bpf: fix batch-mode test failure of test_xdp_redirect.sh
The tests at tools/testing/selftests/bpf can run in patch mode, e.g.,
make -C tools/testing/selftests/bpf run_tests
With the batch mode, I experimented intermittent test failure of
test_xdp_redirect.sh.
....
selftests: test_xdp_redirect [PASS]
selftests: test_xdp_redirect.sh [PASS]
RTNETLINK answers: File exists
selftests: test_xdp_meta [FAILED]
selftests: test_xdp_meta.sh [FAIL]
....
The following illustrates what caused the failure:
(1). test_xdp_redirect creates veth pairs (veth1,veth11) and
(veth2,veth22), and assign veth11 and veth22 to namespace
ns1 and ns2 respectively.
(2). at the end of test_xdp_redirect test, ns1 and ns2 are
deleted. During this process, the deletion of actual
namespace resources, including deletion of veth1{1} and veth2{2},
is put into a workqueue to be processed asynchronously.
(3). test_xdp_meta tries to create veth pair (veth1, veth2).
The previous veth deletions in step (2) have not finished yet,
and veth1 or veth2 may be still valid in the kernel, thus
causing the failure.
The fix is to explicitly delete the veth pair before test_xdp_redirect
exits. Only one end of veth needs deletion as the kernel will delete
the other end automatically. Also test_xdp_meta is also fixed in
similar manner to avoid future potential issues.
Fixes: 996139e801fd ("selftests: bpf: add a test for XDP redirect") Fixes: 22c8852624fc ("bpf: improve selftests and add tests for meta pointer") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
The test_kmod.sh load/remove test_bpf.ko multiple times with different
settings for sysctl net.core.bpf_jit_{enable,harden}. The failed test #297
of test_bpf.ko is designed such that JIT always fails.
Commit 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
introduced the following tightening logic:
...
if (!bpf_prog_is_dev_bound(fp->aux)) {
fp = bpf_int_jit_compile(fp);
#ifdef CONFIG_BPF_JIT_ALWAYS_ON
if (!fp->jited) {
*err = -ENOTSUPP;
return fp;
}
#endif
...
With this logic, Test #297 always gets return value -ENOTSUPP
when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure.
This patch fixed the failure by marking Test #297 as expected failure
when CONFIG_BPF_JIT_ALWAYS_ON is defined.
Fixes: 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config) Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Linus Torvalds [Sun, 4 Feb 2018 19:45:55 +0000 (11:45 -0800)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull spectre/meltdown updates from Thomas Gleixner:
"The next round of updates related to melted spectrum:
- The initial set of spectre V1 mitigations:
- Array index speculation blocker and its usage for syscall,
fdtable and the n180211 driver.
- Speculation barrier and its usage in user access functions
- Make indirect calls in KVM speculation safe
- Blacklisting of known to be broken microcodes so IPBP/IBSR are not
touched.
- The initial IBPB support and its usage in context switch
- The exposure of the new speculation MSRs to KVM guests.
- A fix for a regression in x86/32 related to the cpu entry area
- Proper whitelisting for known to be safe CPUs from the mitigations.
- objtool fixes to deal proper with retpolines and alternatives
- Exclude __init functions from retpolines which speeds up the boot
process.
- Removal of the syscall64 fast path and related cleanups and
simplifications
- Removal of the unpatched paravirt mode which is yet another source
of indirect unproteced calls.
- A new and undisputed version of the module mismatch warning
- A couple of cleanup and correctness fixes all over the place
Yet another step towards full mitigation. There are a few things still
missing like the RBS underflow mitigation for Skylake and other small
details, but that's being worked on.
That said, I'm taking a belated christmas vacation for a week and hope
that everything is magically solved when I'm back on Feb 12th"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
KVM/x86: Add IBPB support
KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
x86/pti: Mark constant arrays as __initconst
x86/spectre: Simplify spectre_v2 command line parsing
x86/retpoline: Avoid retpolines for built-in __init functions
x86/kvm: Update spectre-v1 mitigation
KVM: VMX: make MSR bitmaps per-VCPU
x86/paravirt: Remove 'noreplace-paravirt' cmdline option
x86/speculation: Use Indirect Branch Prediction Barrier in context switch
x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"
x86/spectre: Report get_user mitigation for spectre_v1
nl80211: Sanitize array index in parse_txq_params
vfs, fdtable: Prevent bounds-check bypass via speculative execution
x86/syscall: Sanitize syscall table de-references under speculation
x86/get_user: Use pointer masking to limit speculation
...
Linus Torvalds [Sun, 4 Feb 2018 19:43:30 +0000 (11:43 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A small set of changes:
- a fixup for kexec related to 5-level paging mode. That covers most
of the cases except kexec from a 5-level kernel to a 4-level
kernel. The latter needs more work and is going to come in 4.17
- two trivial fixes for build warnings triggered by LTO and gcc-8"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/power: Fix swsusp_arch_resume prototype
x86/dumpstack: Avoid uninitlized variable
x86/kexec: Make kexec (mostly) work in 5-level paging mode
Linus Torvalds [Sun, 4 Feb 2018 19:41:31 +0000 (11:41 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"Two small changes:
- a fix for a interrupt regression caused by the vector management
changes in 4.15 affecting museum pieces which rely on interrupt
probing for legacy (e.g. parallel port) devices.
One of the startup calls in the autoprobe code was not changed to
the new activate_and_startup() function resulting in a warning and
as a consequence failing to discover the device interrupt.
- a trivial update to the copyright/license header of the STM32 irq
chip driver"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Make legacy autoprobing work again
irqchip/stm32: Fix copyright
Linus Torvalds [Sun, 4 Feb 2018 19:16:35 +0000 (11:16 -0800)]
Merge tag 'for-linus-20180204' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"Most of this is fixes and not new code/features:
- skd fix from Arnd, fixing a build error dependent on sla allocator
type.
- blk-mq scheduler discard merging fixes, one from me and one from
Keith. This fixes a segment miscalculation for blk-mq-sched, where
we mistakenly think two segments are physically contigious even
though the request isn't carrying real data. Also fixes a bio-to-rq
merge case.
- Don't re-set a bit on the buffer_head flags, if it's already set.
This can cause scalability concerns on bigger machines and
workloads. From Kemi Wang.
- Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to
distuingish between a local (device related) resource starvation
and a global one. The latter might happen without IO being in
flight, so it has to be handled a bit differently. From Ming"
* tag 'for-linus-20180204' of git://git.kernel.dk/linux-block:
block: skd: fix incorrect linux/slab_def.h inclusion
buffer: Avoid setting buffer bits that are already set
blk-mq-sched: Enable merging discard bio into request
blk-mq: fix discard merge with scheduler attached
blk-mq: introduce BLK_STS_DEV_RESOURCE
Linus Torvalds [Sun, 4 Feb 2018 19:13:49 +0000 (11:13 -0800)]
Merge tag 'ntb-4.16' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"Bug fixes galore, removal of the ntb atom driver, and updates to the
ntb tools and tests to support the multi-port interface"
* tag 'ntb-4.16' of git://github.com/jonmason/ntb: (37 commits)
NTB: ntb_perf: fix cast to restricted __le32
ntb_perf: Fix an error code in perf_copy_chunk()
ntb_hw_switchtec: Make function switchtec_ntb_remove() static
NTB: ntb_tool: fix memory leak on 'buf' on error exit path
NTB: ntb_perf: fix printing of resource_size_t
NTB: ntb_hw_idt: Set NTB_TOPO_SWITCH topology
NTB: ntb_test: Update ntb_perf tests
NTB: ntb_test: Update ntb_tool MW tests
NTB: ntb_test: Add ntb_tool Message tests
NTB: ntb_test: Update ntb_tool Scratchpad tests
NTB: ntb_test: Update ntb_tool DB tests
NTB: ntb_test: Update ntb_tool link tests
NTB: ntb_test: Add ntb_tool port tests
NTB: ntb_test: Safely use paths with whitespace
NTB: ntb_perf: Add full multi-port NTB API support
NTB: ntb_tool: Add full multi-port NTB API support
NTB: ntb_pp: Add full multi-port NTB API support
NTB: Fix UB/bug in ntb_mw_get_align()
NTB: Set dma mask and dma coherent mask to NTB devices
NTB: Rename NTB messaging API methods
...
Linus Torvalds [Sun, 4 Feb 2018 19:11:23 +0000 (11:11 -0800)]
Merge tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
"Misc driver changes only:
- TI-MsgMgr: Fix print format for a printk
- TI-MSgMgr: SPDX license switch for the driver
- QCOM-IPC: Convert driver to use regmap
- QCOM-IPC: Spawn sibling clock device from mailbox driver"
* tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
dt-bindings: mailbox: qcom: Document the APCS clock binding
mailbox: qcom: Create APCS child device for clock controller
mailbox: qcom: Convert APCS IPC driver to use regmap
mailbox: ti-msgmgr: Use %zu for size_t print format
mailbox: ti-msgmgr: Switch to SPDX Licensing
Linus Torvalds [Sun, 4 Feb 2018 18:57:43 +0000 (10:57 -0800)]
Merge branch 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"I2C has the following changes for you:
- new flag to mark DMA safe buffers in i2c_msg. Also, some
infrastructure around it. And docs.
- huge refactoring of the at24 driver led by the new maintainer
Bartosz
- update I2C bus recovery to send STOP after recovery
- conversion from gpio to gpiod for I2C bus recovery
- adding a fault-injector to the i2c-gpio driver
- lots of small driver improvements, and bigger ones to
i2c-sh_mobile"
* 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (99 commits)
i2c: mv64xxx: Add myself as maintainer for this driver
i2c: mv64xxx: Fix clock resource by adding an optional bus clock
i2c: mv64xxx: Remove useless test before clk_disable_unprepare
i2c: mxs: use true and false for boolean values
i2c: meson: update doc description to fix build warnings
i2c: meson: add configurable divider factors
dt-bindings: i2c: update documentation for the Meson-AXG
i2c: imx-lpi2c: add runtime pm support
i2c: rcar: fix some trivial typos in comments
i2c: davinci: fix the cpufreq transition
i2c: rk3x: add proper kerneldoc header
i2c: rk3x: account for const type of of_device_id.data
i2c: acorn: remove outdated path from file header
i2c: acorn: add MODULE_LICENSE tag
i2c: rcar: implement bus recovery
i2c: send STOP after successful bus recovery
i2c: ensure SDA is released in recovery if SDA is controllable
i2c: add 'set_sda' to bus_recovery_info
i2c: add identifier in declarations for i2c_bus_recovery
i2c: make kerneldoc about bus recovery more precise
...
Linus Torvalds [Sun, 4 Feb 2018 18:43:12 +0000 (10:43 -0800)]
Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt
Pull fscrypt updates from Ted Ts'o:
"Refactor support for encrypted symlinks to move common code to fscrypt"
Ted also points out about the merge:
"This makes the f2fs symlink code use the fscrypt_encrypt_symlink()
from the fscrypt tree. This will end up dropping the kzalloc() ->
f2fs_kzalloc() change, which means the fscrypt-specific allocation
won't get tested by f2fs's kmalloc error injection system; which is
fine"
* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: (26 commits)
fscrypt: fix build with pre-4.6 gcc versions
fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()
fscrypt: document symlink length restriction
fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
fscrypt: calculate NUL-padding length in one place only
fscrypt: move fscrypt_symlink_data to fscrypt_private.h
fscrypt: remove fscrypt_fname_usr_to_disk()
ubifs: switch to fscrypt_get_symlink()
ubifs: switch to fscrypt ->symlink() helper functions
ubifs: free the encrypted symlink target
f2fs: switch to fscrypt_get_symlink()
f2fs: switch to fscrypt ->symlink() helper functions
ext4: switch to fscrypt_get_symlink()
ext4: switch to fscrypt ->symlink() helper functions
fscrypt: new helper function - fscrypt_get_symlink()
fscrypt: new helper functions for ->symlink()
fscrypt: trim down fscrypt.h includes
fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
...
Georgi Djakov [Tue, 5 Dec 2017 15:46:57 +0000 (17:46 +0200)]
mailbox: qcom: Create APCS child device for clock controller
There is a clock controller functionality provided by the APCS hardware
block of msm8916 devices. The device-tree would represent an APCS node
with both mailbox and clock provider properties.
Create a platform child device for the clock controller functionality so
the driver can probe and use APCS as parent.
Linus Torvalds [Sun, 4 Feb 2018 00:25:42 +0000 (16:25 -0800)]
Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardened usercopy whitelisting from Kees Cook:
"Currently, hardened usercopy performs dynamic bounds checking on slab
cache objects. This is good, but still leaves a lot of kernel memory
available to be copied to/from userspace in the face of bugs.
To further restrict what memory is available for copying, this creates
a way to whitelist specific areas of a given slab cache object for
copying to/from userspace, allowing much finer granularity of access
control.
Slab caches that are never exposed to userspace can declare no
whitelist for their objects, thereby keeping them unavailable to
userspace via dynamic copy operations. (Note, an implicit form of
whitelisting is the use of constant sizes in usercopy operations and
get_user()/put_user(); these bypass all hardened usercopy checks since
these sizes cannot change at runtime.)
This new check is WARN-by-default, so any mistakes can be found over
the next several releases without breaking anyone's system.
The series has roughly the following sections:
- remove %p and improve reporting with offset
- prepare infrastructure and whitelist kmalloc
- update VFS subsystem with whitelists
- update SCSI subsystem with whitelists
- update network subsystem with whitelists
- update process memory with whitelists
- update per-architecture thread_struct with whitelists
- update KVM with whitelists and fix ioctl bug
- mark all other allocations as not whitelisted
- update lkdtm for more sensible test overage"
* tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
lkdtm: Update usercopy tests for whitelisting
usercopy: Restrict non-usercopy caches to size 0
kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
kvm: whitelist struct kvm_vcpu_arch
arm: Implement thread_struct whitelist for hardened usercopy
arm64: Implement thread_struct whitelist for hardened usercopy
x86: Implement thread_struct whitelist for hardened usercopy
fork: Provide usercopy whitelisting for task_struct
fork: Define usercopy region in thread_stack slab caches
fork: Define usercopy region in mm_struct slab caches
net: Restrict unwhitelisted proto caches to size 0
sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
sctp: Define usercopy region in SCTP proto slab cache
caif: Define usercopy region in caif proto slab cache
ip: Define usercopy region in IP proto slab cache
net: Define usercopy region in struct proto slab cache
scsi: Define usercopy region in scsi_sense_cache slab cache
cifs: Define usercopy region in cifs_request slab cache
vxfs: Define usercopy region in vxfs_inode slab cache
ufs: Define usercopy region in ufs_inode_cache slab cache
...
KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
[ Based on a patch from Paolo Bonzini <pbonzini@redhat.com> ]
... basically doing exactly what we do for VMX:
- Passthrough SPEC_CTRL to guests (if enabled in guest CPUID)
- Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest
actually used it.
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
[ Based on a patch from Ashok Raj <ashok.raj@intel.com> ]
Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB based approach.
To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for
guests that do not actually use the MSR, only start saving and restoring
when a non-zero is written to it.
No attempt is made to handle STIBP here, intentionally. Filtering STIBP
may be added in a future patch, which may require trapping all writes
if we don't want to pass it through directly to the guest.
[dwmw2: Clean up CPUID bits, save/restore manually, handle reset]
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de
Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO
(bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the
contents will come directly from the hardware, but user-space can still
override it.
[dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional]
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: kvm@vger.kernel.org Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de
Ashok Raj [Thu, 1 Feb 2018 21:59:43 +0000 (22:59 +0100)]
KVM/x86: Add IBPB support
The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
control mechanism. It keeps earlier branches from influencing
later ones.
Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
It's a command that ensures predicted branch targets aren't used after
the barrier. Although IBRS and IBPB are enumerated by the same CPUID
enumeration, IBPB is very different.
IBPB helps mitigate against three potential attacks:
* Mitigate guests from being attacked by other guests.
- This is addressed by issing IBPB when we do a guest switch.
* Mitigate attacks from guest/ring3->host/ring3.
These would require a IBPB during context switch in host, or after
VMEXIT. The host process has two ways to mitigate
- Either it can be compiled with retpoline
- If its going through context switch, and has set !dumpable then
there is a IBPB in that path.
(Tim's patch: https://patchwork.kernel.org/patch/10192871)
- The case where after a VMEXIT you return back to Qemu might make
Qemu attackable from guest when Qemu isn't compiled with retpoline.
There are issues reported when doing IBPB on every VMEXIT that resulted
in some tsc calibration woes in guest.
* Mitigate guest/ring0->host/ring0 attacks.
When host kernel is using retpoline it is safe against these attacks.
If host kernel isn't using retpoline we might need to do a IBPB flush on
every VMEXIT.
Even when using retpoline for indirect calls, in certain conditions 'ret'
can use the BTB on Skylake-era CPUs. There are other mitigations
available like RSB stuffing/clearing.
* IBPB is issued only for SVM during svm_free_vcpu().
VMX has a vmclear and SVM doesn't. Follow discussion here:
https://lkml.org/lkml/2018/1/15/146
Please refer to the following spec for more details on the enumeration
and control.
Refer here to get documentation about mitigations.
[peterz: rebase and changelog rewrite]
[karahmed: - rebase
- vmx: expose PRED_CMD if guest has it in CPUID
- svm: only pass through IBPB if guest has it in CPUID
- vmx: support !cpu_has_vmx_msr_bitmap()]
- vmx: support nested]
[dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
PRED_CMD is a write-only MSR]
Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: kvm@vger.kernel.org Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de
KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
[dwmw2: Stop using KF() for bits in it, too] Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jim Mattson <jmattson@google.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Link: https://lkml.kernel.org/r/1517522386-18410-2-git-send-email-karahmed@amazon.de
Linus Torvalds [Sat, 3 Feb 2018 21:49:22 +0000 (13:49 -0800)]
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Only miscellaneous cleanups and bug fixes for ext4 this cycle"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: create ext4_kset dynamically
ext4: create ext4_feat kobject dynamically
ext4: release kobject/kset even when init/register fail
ext4: fix incorrect indentation of if statement
ext4: correct documentation for grpid mount option
ext4: use 'sbi' instead of 'EXT4_SB(sb)'
ext4: save error to disk in __ext4_grp_locked_error()
jbd2: fix sphinx kernel-doc build warnings
ext4: fix a race in the ext4 shutdown path
mbcache: make sure c_entry_count is not decremented past zero
ext4: no need flush workqueue before destroying it
ext4: fixed alignment and minor code cleanup in ext4.h
ext4: fix ENOSPC handling in DAX page fault handler
dax: pass detailed error code from dax_iomap_fault()
mbcache: revert "fs/mbcache.c: make count_objects() more robust"
mbcache: initialize entry->e_referenced in mb_cache_entry_create()
ext4: fix up remaining files with SPDX cleanups
1) The bnx2x can hang if you give it a GSO packet with a segment size
which is too big for the hardware, detect and drop in this case.
From Daniel Axtens.
2) Fix some overflows and pointer leaks in xtables, from Dmitry Vyukov.
3) Missing RCU locking in igmp, from Eric Dumazet.
4) Fix RX checksum handling on r8152, it can only checksum UDP and TCP
packets. From Hayes Wang.
5) Minor pacing tweak to TCP BBR congestion control, from Neal
Cardwell.
6) Missing RCU annotations in cls_u32, from Paolo Abeni.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
Revert "defer call to mem_cgroup_sk_alloc()"
soreuseport: fix mem leak in reuseport_add_sock()
net: qlge: use memmove instead of skb_copy_to_linear_data
net: qed: use correct strncpy() size
net: cxgb4: avoid memcpy beyond end of source buffer
cls_u32: add missing RCU annotation.
r8152: set rx mode early when linking on
r8152: fix wrong checksum status for received IPv4 packets
nfp: fix TLV offset calculation
net: pxa168_eth: add netconsole support
net: igmp: add a missing rcu locking section
ibmvnic: fix firmware version when no firmware level has been provided by the VIOS server
vmxnet3: remove redundant initialization of pointer 'rq'
lan78xx: remove redundant initialization of pointer 'phydev'
net: jme: remove unused initialization of 'rxdesc'
rtnetlink: remove check for IFLA_IF_NETNSID
rocker: fix possible null pointer dereference in rocker_router_fib_event_work
inet: Avoid unitialized variable warning in inet_unhash()
net: bridge: Fix uninitialized error in br_fdb_sync_static()
openvswitch: Remove padding from packet before L3+ conntrack processing
...
Linus Torvalds [Sat, 3 Feb 2018 21:07:56 +0000 (13:07 -0800)]
Merge tag 'scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull second set of SCSI updates from James Bottomley:
"This is a set of three patches that depended on mq and zone changes in
the block tree (now upstream)"
* tag 'scsi-postmerge' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Remove zone write locking
scsi: sd_zbc: Initialize device request queue zoned data
scsi: scsi-mq-debugfs: Show more information
Linus Torvalds [Sat, 3 Feb 2018 00:44:14 +0000 (16:44 -0800)]
pinctrl: remove include file from <linux/device.h>
When pulling the recent pinctrl merge, I was surprised by how a
pinctrl-only pull request ended up rebuilding basically the whole
kernel.
The reason for that ended up being that <linux/device.h> included
<linux/pinctrl/devinfo.h>, so any change to that file ended up causing
pretty much every driver out there to be rebuilt.
The reason for that was because 'struct device' has this in it:
but we already avoid header includes for these kinds of things in that
header file, preferring to just use a forward-declaration of the
structure instead. Exactly to avoid this kind of header dependency.
Since some drivers seem to expect that <linux/pinctrl/devinfo.h> header
to come in automatically, move the include to <linux/pinctrl/pinctrl.h>
instead. It might be better to just make the includes more targeted,
but I'm not going to review every driver.
It would definitely be good to have a tool for finding and minimizing
header dependencies automatically - or at least help with them. Right
now we almost certainly end up having way too many of these things, and
it's hard to test every single configuration.
FWIW, you can get a sense of the "hotness" of a header file with something
like this after doing a full build:
which isn't exact (there are other things in those '*.o.cmd' than just
the dependencies, and the "--lines=+2" only removes the header), but
might a useful approximation.
With this patch, <linux/pinctrl/devinfo.h> drops to "only" having 833
users in the current x86-64 allmodconfig. In contrast, <linux/device.h>
has 14857 build files including it directly or indirectly.
Of course, the headers that absolutely _everybody_ includes (things like
<linux/types.h> etc) get a score of 23000+.
Jean Delvare [Sat, 3 Feb 2018 10:25:20 +0000 (11:25 +0100)]
firmware: dmi_scan: Fix handling of empty DMI strings
The handling of empty DMI strings looks quite broken to me:
* Strings from 1 to 7 spaces are not considered empty.
* True empty DMI strings (string index set to 0) are not considered
empty, and result in allocating a 0-char string.
* Strings with invalid index also result in allocating a 0-char
string.
* Strings starting with 8 spaces are all considered empty, even if
non-space characters follow (sounds like a weird thing to do, but
I have actually seen occurrences of this in DMI tables before.)
* Strings which are considered empty are reported as 8 spaces,
instead of being actually empty.
Some of these issues are the result of an off-by-one error in memcmp,
the rest is incorrect by design.
So let's get it square: missing strings and strings made of only
spaces, regardless of their length, should be treated as empty and
no memory should be allocated for them. All other strings are
non-empty and should be allocated.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: 79da4721117f ("x86: fix DMI out of memory problems") Cc: Parag Warudkar <parag.warudkar@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de>
Jean Delvare [Sat, 3 Feb 2018 10:25:20 +0000 (11:25 +0100)]
firmware: dmi_scan: Drop dmi_initialized
I don't think it makes sense to check for a possible bad
initialization order at run time on every system when it is all
decided at build time.
A more efficient way to make sure developers do not introduce new
calls to dmi_check_system() too early in the initialization sequence
is to simply document the expected call order. That way, developers
have a chance to get it right immediately, without having to
test-boot their kernel, wonder why it does not work, and parse the
kernel logs for a warning message. And we get rid of the run-time
performance penalty as a nice side effect.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Ingo Molnar <mingo@kernel.org>
Jean Delvare [Sat, 3 Feb 2018 10:25:20 +0000 (11:25 +0100)]
firmware: dmi: Optimize dmi_matches
Function dmi_matches can me made a bit faster:
* The documented purpose of dmi_initialized is to catch too early
calls to dmi_check_system(). I'm not fully convinced it justifies
slowing down the initialization of all systems out there, but at
least the check should not have been moved from dmi_check_system()
to dmi_matches(). dmi_matches() is being called for every entry of
the table passed to dmi_check_system(), causing the same redundant
check to be performed again and again. So move it back to
dmi_check_system(), reverting this specific portion of commit d7b1956fed33 ("DMI: Introduce dmi_first_match to make the interface
more flexible").
* Don't check for the exact_match flag again when we already know its
value.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: d7b1956fed33 ("DMI: Introduce dmi_first_match to make the interface more flexible") Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Jeff Garzik <jgarzik@redhat.com>
====================
Here is an updated v8 version:
- add if_link.h in uapi and remove the definition
- fix a commit message
- remove uapi from a include
====================
Eric Leblond [Tue, 30 Jan 2018 20:55:02 +0000 (21:55 +0100)]
libbpf: add error reporting in XDP
Parse netlink ext attribute to get the error message returned by
the card. Code is partially take from libnl.
We add netlink.h to the uapi include of tools. And we need to
avoid include of userspace netlink header to have a successful
build of sample so nlattr.h has a define to avoid
the inclusion. Using a direct define could have been an issue
as NLMSGERR_ATTR_MAX can change in the future.
We also define SOL_NETLINK if not defined to avoid to have to
copy socket.h for a fixed value.
Roman Gushchin [Fri, 2 Feb 2018 15:26:57 +0000 (15:26 +0000)]
Revert "defer call to mem_cgroup_sk_alloc()"
This patch effectively reverts commit 9f1c2674b328 ("net: memcontrol:
defer call to mem_cgroup_sk_alloc()").
Moving mem_cgroup_sk_alloc() to the inet_csk_accept() completely breaks
memcg socket memory accounting, as packets received before memcg
pointer initialization are not accounted and are causing refcounting
underflow on socket release.
Actually the free-after-use problem was fixed by
commit c0576e397508 ("net: call cgroup_sk_alloc() earlier in
sk_clone_lock()") for the cgroup pointer.
So, let's revert it and call mem_cgroup_sk_alloc() just before
cgroup_sk_alloc(). This is safe, as we hold a reference to the socket
we're cloning, and it holds a reference to the memcg.
Also, let's drop BUG_ON(mem_cgroup_is_root()) check from
mem_cgroup_sk_alloc(). I see no reasons why bumping the root
memcg counter is a good reason to panic, and there are no realistic
ways to hit it.
Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2. move *prog under rcu, since it's not ok to dereference it afterwards
3. in a rare case of prog array being swapped between bpf_prog_array_length()
and bpf_prog_array_copy_to_user() calls make sure to copy zeros to user space,
so the user doesn't walk over uninited prog_ids while kernel reported
uattr->query.prog_cnt > 0
Eric Dumazet [Fri, 2 Feb 2018 18:27:27 +0000 (10:27 -0800)]
soreuseport: fix mem leak in reuseport_add_sock()
reuseport_add_sock() needs to deal with attaching a socket having
its own sk_reuseport_cb, after a prior
setsockopt(SO_ATTACH_REUSEPORT_?BPF)
Without this fix, not only a WARN_ONCE() was issued, but we were also
leaking memory.
Thanks to sysbot and Eric Biggers for providing us nice C repros.
------------[ cut here ]------------
socket already in reuseport group
WARNING: CPU: 0 PID: 3496 at net/core/sock_reuseport.c:119
reuseport_add_sock+0x742/0x9b0 net/core/sock_reuseport.c:117
Kernel panic - not syncing: panic_on_warn set ...
Arnd Bergmann [Fri, 2 Feb 2018 15:45:44 +0000 (16:45 +0100)]
net: qlge: use memmove instead of skb_copy_to_linear_data
gcc-8 points out that the skb_copy_to_linear_data() argument points to
the skb itself, which makes it run into a problem with overlapping
memcpy arguments:
In file included from include/linux/ip.h:20,
from drivers/net/ethernet/qlogic/qlge/qlge_main.c:26:
drivers/net/ethernet/qlogic/qlge/qlge_main.c: In function 'ql_realign_skb':
include/linux/skbuff.h:3378:2: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
memcpy(skb->data, from, len);
It's unclear to me what the best solution is, maybe it ought to use a
different helper that adjusts the skb data in a safe way. Simply using
memmove() here seems like the easiest workaround.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 2 Feb 2018 15:44:47 +0000 (16:44 +0100)]
net: qed: use correct strncpy() size
passing the strlen() of the source string as the destination
length is pointless, and gcc-8 now warns about it:
drivers/net/ethernet/qlogic/qed/qed_debug.c: In function 'qed_grc_dump':
include/linux/string.h:253: error: 'strncpy' specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
This changes qed_grc_dump_big_ram() to instead uses the length of
the destination buffer, and use strscpy() to guarantee nul-termination.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 2 Feb 2018 15:18:37 +0000 (16:18 +0100)]
net: cxgb4: avoid memcpy beyond end of source buffer
Building with link-time-optimizations revealed that the cxgb4 driver does
a fixed-size memcpy() from a variable-length constant string into the
network interface name:
In function 'memcpy',
inlined from 'cfg_queues_uld.constprop' at drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:335:2,
inlined from 'cxgb4_register_uld.constprop' at drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c:719:9:
include/linux/string.h:350:3: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
__read_overflow2();
^
I can see two equally workable solutions: either we use a strncpy() instead
of the memcpy() to stop at the end of the input, or we make the source buffer
fixed length as well. This implements the latter.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 2 Feb 2018 15:02:22 +0000 (16:02 +0100)]
cls_u32: add missing RCU annotation.
In a couple of points of the control path, n->ht_down is currently
accessed without the required RCU annotation. The accesses are
safe, but sparse complaints. Since we already held the
rtnl lock, let use rtnl_dereference().
Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs") Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang [Fri, 2 Feb 2018 08:43:36 +0000 (16:43 +0800)]
r8152: set rx mode early when linking on
Set rx mode before calling netif_wake_queue() when linking on to avoid
the device missing the receiving packets.
The transmission may start after calling netif_wake_queue(), and the
packets of resopnse may reach before calling rtl8152_set_rx_mode()
which let the device could receive packets. Then, the packets of
response would be missed.
Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang [Fri, 2 Feb 2018 08:43:35 +0000 (16:43 +0800)]
r8152: fix wrong checksum status for received IPv4 packets
The device could only check the checksum of TCP and UDP packets. Therefore,
for the IPv4 packets excluding TCP and UDP, the check of checksum is necessary,
even though the IP checksum is correct.
Take ICMP for example, The IP checksum may be correct, but the ICMP checksum
may be wrong.
Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edwin Peer [Fri, 2 Feb 2018 03:41:43 +0000 (19:41 -0800)]
nfp: fix TLV offset calculation
The data pointer in the config space TLV parser already includes
NFP_NET_CFG_TLV_BASE, it should not be added again. Incorrect
offset values were only used in printed user output, rendering
the bug merely cosmetic.
Fixes: 73a0329b057e ("nfp: add TLV capabilities to the BAR") Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 2 Feb 2018 22:57:44 +0000 (14:57 -0800)]
Merge tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire updates from Stefan Richter
- make JMicron JMB38x controllers work with IOMMU-equipped systems
- IP-over-1394: allow user-configured MTU of up to 4096 bytes
* tag 'firewire-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire-ohci: work around oversized DMA reads on JMicron controllers
firewire: net: max MTU off by one
Arnd Bergmann [Fri, 2 Feb 2018 14:56:18 +0000 (15:56 +0100)]
x86/power: Fix swsusp_arch_resume prototype
The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the
definition in x86-32 does not, and it fails to include the header with the
declaration. This leads to a warning when building with
link-time-optimizations:
kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match original declaration [-Werror=lto-type-mismatch]
extern asmlinkage int swsusp_arch_resume(void);
^
arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously declared here
int swsusp_arch_resume(void)
This moves the declaration into a globally visible header file and fixes up
both x86 definitions to match it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Len Brown <len.brown@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: linux-pm@vger.kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Bart Van Assche <bart.vanassche@wdc.com> Link: https://lkml.kernel.org/r/20180202145634.200291-2-arnd@arndb.de
Arnd Bergmann [Fri, 2 Feb 2018 14:56:17 +0000 (15:56 +0100)]
x86/dumpstack: Avoid uninitlized variable
In some configurations, 'partial' does not get initialized, as shown by
this gcc-8 warning:
arch/x86/kernel/dumpstack.c: In function 'show_trace_log_lvl':
arch/x86/kernel/dumpstack.c:156:4: error: 'partial' may be used uninitialized in this function [-Werror=maybe-uninitialized]
show_regs_if_on_stack(&stack_info, regs, partial);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This initializes it to false, to get the previous behavior in this case.
Fixes: a9cdbe72c4e8 ("x86/dumpstack: Fix partial register dumps") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lkml.kernel.org/r/20180202145634.200291-1-arnd@arndb.de
Linus Torvalds [Fri, 2 Feb 2018 22:22:53 +0000 (14:22 -0800)]
Merge tag 'pinctrl-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"This is the bulk of pin control changes for the v4.16 kernel cycle.
Like with GPIO it is actually a bit calm this time.
Core changes:
- After lengthy discussions and partly due to my ignorance, we have
merged a patch making pinctrl_force_default() and
pinctrl_force_sleep() reprogram the states into the hardware of any
hogged pins, even if they are already in the desired state.
This only apply to hogged pins since groups of pins owned by
drivers need to be managed by each driver, lest they could not do
things like runtime PM and put pins to sleeping state even if the
system as a whole is not in sleep.
New drivers:
- New driver for the Microsemi Ocelot SoC. This is used in ethernet
switches.
- The X-Powers AXP209 GPIO driver was extended to also deal with pin
control and moved over from the GPIO subsystem. This circuit is a
mixed-mode integrated circuit which is part of AllWinner designs.
- New subdriver for the Qualcomm MSM8998 SoC, core of a high end
mobile devices (phones) chipset.
- New subdriver for the ST Microelectronics STM32MP157 MPU and
STM32F769 MCU from the STM32 family.
- New subdriver for the MediaTek MT7622 SoC. This is used for
routers, repeater, gateways and such network infrastructure.
- New subdriver for the NXP (former Freescale) i.MX 6ULL. This SoC
has multimedia features and target "smart devices", I guess in-car
entertainment, in-flight entertainment, industrial control panels
etc.
General improvements:
- Incremental improvements on the SH-PFC subdrivers for things like
the CAN bus.
- Enable the glitch filter on Baytrail GPIOs used for interrupts.
- Proper handling of pins to GPIO ranges on the Semtec SX150X
- An IRQ setup ordering fix on MCP23S08.
- A good set of janitorial coding style fixes"
* tag 'pinctrl-v4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (102 commits)
pinctrl: mcp23s08: fix irq setup order
pinctrl: Forward declare struct device
pinctrl: sunxi: Use of_clk_get_parent_count() instead of open coding
pinctrl: stm32: add STM32F769 MCU support
pinctrl: sx150x: Add a static gpio/pinctrl pin range mapping
pinctrl: sx150x: Register pinctrl before adding the gpiochip
pinctrl: sx150x: Unregister the pinctrl on release
pinctrl: ingenic: Remove redundant dev_err call in ingenic_pinctrl_probe()
pinctrl: sprd: Use seq_putc() in sprd_pinconf_group_dbg_show()
pinctrl: pinmux: Use seq_putc() in pinmux_pins_show()
pinctrl: abx500: Use seq_putc() in abx500_gpio_dbg_show()
pinctrl: mediatek: mt7622: align error handling of mtk_hw_get_value call
pinctrl: mediatek: mt7622: fix potential uninitialized value being returned
pinctrl: uniphier: refactor drive strength get/set functions
pinctrl: imx7ulp: constify struct imx_cfg_params_decode
pinctrl: imx: constify struct imx_pinctrl_soc_info
pinctrl: imx7d: simplify imx7d_pinctrl_probe
pinctrl: imx: use struct imx_pinctrl_soc_info as a const
pinctrl: sunxi-pinctrl: fix pin funtion can not be match correctly.
pinctrl: qcom: Add msm8998 pinctrl driver
...
Darren Kenny [Fri, 2 Feb 2018 19:12:20 +0000 (19:12 +0000)]
x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit") Signed-off-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202191220.blvgkgutojecxr3b@starbug-vm.ie.oracle.com
Arnd Bergmann [Fri, 2 Feb 2018 21:39:23 +0000 (22:39 +0100)]
x86/pti: Mark constant arrays as __initconst
I'm seeing build failures from the two newly introduced arrays that
are marked 'const' and '__initdata', which are mutually exclusive:
arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init'
arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init'
The correct annotation is __initconst.
Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Thomas Garnier <thgarnie@google.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180202213959.611210-1-arnd@arndb.de
Linus Torvalds [Fri, 2 Feb 2018 21:46:21 +0000 (13:46 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha updates from Matt Turner:
"A few small fixes and clean ups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
alpha: fix crash if pthread_create races with signal delivery
alpha: fix formating of stack content
alpha: fix reboot on Avanti platform
alpha: deprecate pci_get_bus_and_slot()
alpha: Fix mixed up args in EXC macro in futex operations
alpha: osf_sys.c: use timespec64 where appropriate
alpha: osf_sys.c: fix put_tv32 regression
alpha: make thread_saved_pc static
alpha: make XTABS equivalent to TAB3
Linus Torvalds [Fri, 2 Feb 2018 18:01:04 +0000 (10:01 -0800)]
Merge tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Enable support for memory protection keys aka "pkeys" on Power7/8/9
when using the hash table MMU.
- Extend our interrupt soft masking to support masking PMU interrupts
as well as "normal" interrupts, and then use that to implement
local_t for a ~4x speedup vs the current atomics-based
implementation.
- A new driver "ocxl" for "Open Coherent Accelerator Processor
Interface (OpenCAPI)" devices.
- Support for new device tree properties on PowerVM to describe
hotpluggable memory and devices.
- Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE to the 64-bit
VDSO.
- Freescale updates from Scott: fixes for CPM GPIO and an FSL PCI
erratum workaround, plus a minor cleanup patch.
As well as quite a lot of other changes all over the place, and small
fixes and cleanups as always.
Thanks to: Alan Modra, Alastair D'Silva, Alexey Kardashevskiy,
Alistair Popple, Andreas Schwab, Andrew Donnellan, Aneesh Kumar K.V,
Anju T Sudhakar, Anshuman Khandual, Anton Blanchard, Arnd Bergmann,
Balbir Singh, Benjamin Herrenschmidt, Bhaktipriya Shridhar, Bryant G.
Ly, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Cyril Bur,
David Gibson, Desnes A. Nunes do Rosario, Dmitry Torokhov, Frederic
Barrat, Geert Uytterhoeven, Guilherme G. Piccoli, Gustavo A. R. Silva,
Gustavo Romero, Ivan Mikhaylov, Joakim Tjernlund, Joe Perches, Josh
Poimboeuf, Juan J. Alvarez, Julia Cartwright, Kamalesh Babulal,
Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Malaterre, Michael
Bringmann, Michael Hanselmann, Michael Neuling, Nathan Fontenot,
Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Philippe Bergheaud,
Ram Pai, Russell Currey, Santosh Sivaraj, Scott Wood, Seth Forshee,
Simon Guo, Stewart Smith, Sukadev Bhattiprolu, Thiago Jung Bauermann,
Vaibhav Jain, Vasyl Gomonovych"
* tag 'powerpc-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (199 commits)
powerpc/mm/radix: Fix build error when RADIX_MMU=n
macintosh/ams-input: Use true and false for boolean values
macintosh: change some data types from int to bool
powerpc/watchdog: Print the NIP in soft_nmi_interrupt()
powerpc/watchdog: regs can't be null in soft_nmi_interrupt()
powerpc/watchdog: Tweak watchdog printks
powerpc/cell: Remove axonram driver
rtc-opal: Fix handling of firmware error codes, prevent busy loops
powerpc/mpc52xx_gpt: make use of raw_spinlock variants
macintosh/adb: Properly mark continued kernel messages
powerpc/pseries: Fix cpu hotplug crash with memoryless nodes
powerpc/numa: Ensure nodes initialized for hotplug
powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
powerpc/kernel: Block interrupts when updating TIDR
powerpc/powernv/idoa: Remove unnecessary pcidev from pci_dn
powerpc/mm/nohash: do not flush the entire mm when range is a single page
powerpc/pseries: Add Initialization of VF Bars
powerpc/pseries/pci: Associate PEs to VFs in configure SR-IOV
powerpc/eeh: Add EEH notify resume sysfs
powerpc/eeh: Add EEH operations to notify resume
...
skd includes slab_def.h to get access to the slab cache object size.
However, including this header breaks when we use SLUB or SLOB instead of
the SLAB allocator, since the structure layout is completely different,
as shown by this warning when we build this driver in one of the invalid
configurations with link-time optimizations enabled:
include/linux/slab.h:715:0: error: type of 'kmem_cache_size' does not match original declaration [-Werror=lto-type-mismatch]
unsigned int kmem_cache_size(struct kmem_cache *s);
mm/slab_common.c:77:14: note: 'kmem_cache_size' was previously declared here
unsigned int kmem_cache_size(struct kmem_cache *s)
^
mm/slab_common.c:77:14: note: code may be misoptimized unless -fno-strict-aliasing is used
include/linux/slab.h:147:0: error: type of 'kmem_cache_destroy' does not match original declaration [-Werror=lto-type-mismatch]
void kmem_cache_destroy(struct kmem_cache *);
mm/slab_common.c:858:6: note: 'kmem_cache_destroy' was previously declared here
void kmem_cache_destroy(struct kmem_cache *s)
^
mm/slab_common.c:858:6: note: code may be misoptimized unless -fno-strict-aliasing is used
include/linux/slab.h:140:0: error: type of 'kmem_cache_create' does not match original declaration [-Werror=lto-type-mismatch]
struct kmem_cache *kmem_cache_create(const char *name, size_t size,
mm/slab_common.c:534:1: note: 'kmem_cache_create' was previously declared here
kmem_cache_create(const char *name, size_t size, size_t align,
^
This removes the header inclusion and instead uses the kmem_cache_size()
interface to get the size in a reliable way.
Kemi Wang [Tue, 24 Oct 2017 01:16:42 +0000 (09:16 +0800)]
buffer: Avoid setting buffer bits that are already set
It's expensive to set buffer flags that are already set, because that
causes a costly cache line transition.
A common case is setting the "verified" flag during ext4 writes.
This patch checks for the flag being set first.
With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4
file system, we see 3.3%(15431->15936) improvement of aim7.jobs-per-min on
a 2-sockets broadwell platform.
What the benchmark does is: it forks 3000 processes, and each process do
the following:
a) open a new file
b) close the file
c) delete the file
until loop=100*1000 times.
The original patch is contributed by Andi Kleen.
Signed-off-by: Andi Kleen <ak@linux.intel.com> Tested-by: Kemi Wang <kemi.wang@intel.com> Signed-off-by: Kemi Wang <kemi.wang@intel.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
David Woodhouse [Thu, 1 Feb 2018 11:27:20 +0000 (11:27 +0000)]
x86/retpoline: Avoid retpolines for built-in __init functions
There's no point in building init code with retpolines, since it runs before
any potentially hostile userspace does. And before the retpoline is actually
ALTERNATIVEd into place, for much of it.
Linus Torvalds [Fri, 2 Feb 2018 01:48:47 +0000 (17:48 -0800)]
Merge tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"This seems to have been a comparatively quieter merge window, I assume
due to holidays etc. The "biggest" change is AMD header cleanups, which
merge/remove a bunch of them. The AMD gpu scheduler is now being made generic
with the etnaviv driver wanting to reuse the code, hopefully other drivers
can go in the same direction.
Otherwise it's the usual lots of stuff in i915/amdgpu, not so much stuff
elsewhere.
Core:
- Add .last_close and .output_poll_changed helpers to reduce driver footprints
- Fix plane clipping
- Improved debug printing support
- Add panel orientation property
- Update edid derived properties at edid setting
- Reduction in fbdev driver footprint
- Move amdgpu scheduler into core for other drivers to use.
i915:
- Selftest and IGT improvements
- Fast boot prep work on IPS, pipe config
- HW workarounds for Cannonlake, Geminilake
- Cannonlake clock and HDMI2.0 fixes
- GPU cache invalidation and context switch improvements
- Display planes cleanup
- New PMU interface for perf queries
- New firmware support for KBL/SKL
- Geminilake HW workaround for perforamce
- Coffeelake stolen memory improvements
- GPU reset robustness work
- Cannonlake horizontal plane flipping
- GVT work
amdgpu/radeon:
- RV and Vega header file cleanups (lots of lines gone!)
- TTM operation context support
- 48-bit GPUVM support for Vega/RV
- ECC support for Vega
- Resizeable BAR support
- Multi-display sync support
- Enable swapout for reserved BOs during allocation
- S3 fixes on Raven
- GPU reset cleanup and fixes
- 2+1 level GPU page table
amdkfd:
- GFX7/8 SDMA user queues support
- Hardware scheduling for multiple processes
- dGPU prep work
rcar:
- Added R8A7743/5 support
- System suspend/resume support
sun4i:
- Multi-plane support for YUV formats
- A83T and LVDS support
msm:
- Devfreq support for GPU
tegra:
- Prep work for adding Tegra186 support
- Tegra186 HDMI support
- HDMI2.0 and zpos support by using generic helpers
etnaviv:
- Occlusion query fixes
- Job handling fixes
- Prep work for hooking in gpu scheduler
armada:
- Move closer to atomic modesetting
- Allow disabling primary plane if overlay is full screen
imx:
- Format modifier support
- Add tile prefetch to PRE
- Runtime PM support for PRG
ast:
- fix LUT loading"
* tag 'drm-for-v4.16' of git://people.freedesktop.org/~airlied/linux: (1471 commits)
drm/ast: Load lut in crtc_commit
drm: Check for lessee in DROP_MASTER ioctl
drm: fix gpu scheduler link order
drm/amd/display: Demote error print to debug print when ATOM impl missing
dma-buf: fix reservation_object_wait_timeout_rcu once more v2
drm/amdgpu: Avoid leaking PM domain on driver unbind (v2)
drm/amd/amdgpu: Add Polaris version check
drm/amdgpu: Reenable manual GPU reset from sysfs
drm/amdgpu: disable MMHUB power gating on raven
drm/ttm: Don't unreserve swapped BOs that were previously reserved
drm/ttm: Don't add swapped BOs to swap-LRU list
drm/amdgpu: only check for ECC on Vega10
drm/amd/powerplay: Fix smu_table_entry.handle type
drm/ttm: add VADDR_FLAG_UPDATED_COUNT to correctly update dma_page global count
drm: Fix PANEL_ORIENTATION_QUIRKS breaking the Kconfig DRM menuconfig
drm/radeon: fill in rb backend map on evergreen/ni.
drm/amdgpu/gfx9: fix ngg enablement to clear gds reserved memory (v2)
drm/ttm: only free pages rather than update global memory count together
drm/amdgpu: fix CPU based VM updates
drm/amdgpu: fix typo in amdgpu_vce_validate_bo
...
Linus Torvalds [Fri, 2 Feb 2018 00:56:07 +0000 (16:56 -0800)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"The core framework has a handful of patches this time around, mostly
due to the clk rate protection support added by Jerome Brunet.
This feature will allow consumers to lock in a certain rate on the
output of a clk so that things like audio playback don't hear pops
when the clk frequency changes due to shared parent clks changing
rates. Currently the clk API doesn't guarantee the rate of a clk stays
at the rate you request after clk_set_rate() is called, so this new
API will allow drivers to express that requirement.
Beyond this, the core got some debugfs pretty printing patches and a
couple minor non-critical fixes.
Looking outside of the core framework diff we have some new driver
additions and the removal of a legacy TI clk driver. Both of these hit
high in the dirstat. Also, the removal of the asm-generic/clkdev.h
file causes small one-liners in all the architecture Kbuild files.
Overall, the driver diff seems to be the normal stuff that comes all
the time to fix little problems here and there and to support new
hardware.
Removed Drivers:
- TI OMAP 3xxx legacy clk (non-DT) support
- asm*/clkdev.h got removed (not really a driver)
Updates:
- Renesas FDP1-0 module clock on R-Car M3-W
- Renesas LVDS module clock on R-Car V3M
- Misc fixes to pr_err() prints
- Qualcomm MSM8916 audio fixes
- Qualcomm IPQ8074 rounded out support for more peripherals
- Qualcomm Alpha PLL variants
- Divider code was using container_of() on bad pointers
- Allwinner DE2 clks on H3
- Amlogic minor data fixes and dropping of CLK_IGNORE_UNUSED
- Mediatek clk driver compile test support
- AT91 PMC clk suspend/resume restoration support
- PLL issues fixed on si5351
- Broadcom IProc PLL calculation updates
- DVFS support for Armada mvebu CPU clks
- Allwinner fixed post-divider support
- TI clkctrl fixes and support for newer SoCs"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (125 commits)
clk: aspeed: Handle inverse polarity of USB port 1 clock gate
clk: aspeed: Fix return value check in aspeed_cc_init()
clk: aspeed: Add reset controller
clk: aspeed: Register gated clocks
clk: aspeed: Add platform driver and register PLLs
clk: aspeed: Register core clocks
clk: Add clock driver for ASPEED BMC SoCs
clk: mediatek: adjust dependency of reset.c to avoid unexpectedly being built
clk: fix reentrancy of clk_enable() on UP systems
clk: meson-axg: fix potential NULL dereference in axg_clkc_probe()
clk: Simplify debugfs registration
clk: Fix debugfs_create_*() usage
clk: Show symbolic clock flags in debugfs
clk: renesas: r8a7796: Add FDP clock
clk: Move __clk_{get,put}() into private clk.h API
clk: sunxi: Use CLK_IS_CRITICAL flag for critical clks
clk: Improve flags doc for of_clk_detect_critical()
arch: Remove clkdev.h asm-generic from Kbuild
clk: sunxi-ng: a83t: Add M divider to TCON1 clock
clk: Prepare to remove asm-generic/clkdev.h
...
Linus Torvalds [Fri, 2 Feb 2018 00:35:31 +0000 (16:35 -0800)]
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Arnd Bergmann:
"A number of new drivers get added this time, along with many
low-priority bugfixes. The most interesting changes by subsystem are:
bus drivers:
- Updates to the Broadcom bus interface driver to support newer SoC
types
- The TI OMAP sysc driver now supports updated DT bindings
memory controllers:
- A new driver for Tegra186 gets added
- A new driver for the ti-emif sram, to allow relocating
suspend/resume handlers there
SoC specific:
- A new driver for Qualcomm QMI, the interface to the modem on MSM
SoCs
- A new driver for power domains on the actions S700 SoC
- A driver for the Xilinx Zynq VCU logicoreIP
reset controllers:
- A new driver for Amlogic Meson-AGX
- various bug fixes
tee subsystem:
- A new user interface got added to enable asynchronous communication
with the TEE supplicant.
- A new method of using user space memory for communication with the
TEE is added"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (84 commits)
of: platform: fix OF node refcount leak
soc: fsl: guts: Add a NULL check for devm_kasprintf()
bus: ti-sysc: Fix smartreflex sysc mask
psci: add CPU_IDLE dependency
soc: xilinx: Fix Kconfig alignment
soc: xilinx: xlnx_vcu: Use bitwise & rather than logical && on clkoutdiv
soc: xilinx: xlnx_vcu: Depends on HAS_IOMEM for xlnx_vcu
soc: bcm: brcmstb: Be multi-platform compatible
soc: brcmstb: biuctrl: exit without warning on non brcmstb platforms
Revert "soc: brcmstb: Only register SoC device on STB platforms"
bus: omap: add MODULE_LICENSE tags
soc: brcmstb: Only register SoC device on STB platforms
tee: shm: Potential NULL dereference calling tee_shm_register()
soc: xilinx: xlnx_vcu: Add Xilinx ZYNQMP VCU logicoreIP init driver
dt-bindings: soc: xilinx: Add DT bindings to xlnx_vcu driver
soc: xilinx: Create folder structure for soc specific drivers
of: platform: populate /firmware/ node from of_platform_default_populate_init()
soc: samsung: Add SPDX license identifiers
soc: qcom: smp2p: Use common error handling code in qcom_smp2p_probe()
tee: shm: don't put_page on null shm->pages
...
Linus Torvalds [Fri, 2 Feb 2018 00:17:40 +0000 (16:17 -0800)]
Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform updates from Arnd Bergmann:
"These are mostly minor bugfixes, cleanup and many defconfig updates to
support added drivers. In particular OMAP and PXA keep cleaning up the
legacy code base, as usual.
Nvidia adds some more SoC support code for Tegra 186.
For the first time on years, we are actually adding a non-DT platform
for the EP93xx based Liebherr controller BK3.1. It's a minor variation
of the EP93xx reference design and in active use, while EP93xx
apparently doesn't have enough new development to have any device tree
support"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (73 commits)
ARM: omap: hwmod: fix section mismatch warnings
ARM: pxa/tosa-bt: add MODULE_LICENSE tag
arm64: defconfig: enable CONFIG_ACPI_APEI_EINJ
arm64: defconfig: enable EDAC GHES option
arm64: defconfig: enable CONFIG_ACPI_APEI_MEMORY_FAILURE
ARM: imx_v6_v7_defconfig: enable CONFIG_CPU_FREQ_STAT
Wind down ARM/TANGO port
ARM: davinci: constify gpio_led
ARM: davinci: drop unneeded newline
soc: Add SoC driver for Gemini
ARM: SAMSUNG: Add SPDX license identifiers
ARM: S5PV210: Add SPDX license identifiers
ARM: S3C64XX: Add SPDX license identifiers
ARM: S3C24XX: Add SPDX license identifiers
ARM: EXYNOS: Add SPDX license identifiers
ARM: imx: remove unused imx3 pm definitions
ARM: imx: don't abort MMDC probe if power saving status doesn't match
ARM: imx_v6_v7_defconfig: enable RTC_DRV_MXC_V2
ARM: imx_v6_v7_defconfig: Add missing config for DART-MX6 SoM
ARM: davinci: Use PTR_ERR_OR_ZERO()
...
Linus Torvalds [Fri, 2 Feb 2018 00:07:54 +0000 (16:07 -0800)]
Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC device tree updates from Arnd Bergmann:
"We get a moderate number of new machines this time, and only one new
SoC variant (Actions S700):
Actions:
- S700 Soc and CubieBoard7 development board
- Allo.com Sparky Single-board-computer
Allwinner:
- Orange Pi R1 development board
- Libre Computer Board ALL-H3-CC H3 single-board computer
ASpeed ast2x00:
- Witherspoon: OpenPower Power9 server manufactured by IBM that uses the ASPEED ast2500
- Zaius: OpenPower Power9 server manufactured by Invatech that uses the ASPEED ast2500
- Q71L: Intel Xeon server manufactured by Qanta that uses the ASPEED ast2400
Freescale/NXP i.MX:
- SolidRun Humminboard2 development board
- Variscite DART-MX6 SoM and Carrier-board
- Technologic TS-4600 and TS-7970 development board
- Toradex Colibri iMX7D SoM board
- v1.5 variant of Solidrun Cubox-i and Hummingboard
Freescale/NXP Layerscape:
- Moxa UC-8410A Series industrial computer
We finally managed to get the dtc warnings under control, with no more
build-time warnings for bad device tree files. This includes fixes for
the majority of platforms, including nomadik, samsung, lpc32xx, STi,
spear, mediatek, freescale, qcom, realview, keystone, omap, kirkwood,
renesas, hisilicon, and broadcom.
Files get rearranged on a few platforms, in particular the Marvell
Armada 7K/8K device tree files are changed in preparation for future
SoC support, based on more than two of the same chips in one package,
and some boards get renamed for oxnas for consistency.
Finally, many existing SoCs gain descriptions for additional on-chip
devices that we can now support with kernel drivers:
Linus Torvalds [Thu, 1 Feb 2018 21:36:15 +0000 (13:36 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- Add a console_msg_format command line option:
The value "default" keeps the old "[time stamp] text\n" format. The
value "syslog" allows to see the syslog-like "<log
level>[timestamp] text" format.
This feature was requested by people doing regression tests, for
example, 0day robot. They want to have both filtered and full logs
at hands.
- Reduce the risk of softlockup:
Pass the console owner in a busy loop.
This is a new approach to the old problem. It was first proposed by
Steven Rostedt on Kernel Summit 2017. It marks a context in which
the console_lock owner calls console drivers and could not sleep.
On the other side, printk() callers could detect this state and use
a busy wait instead of a simple console_trylock(). Finally, the
console_lock owner checks if there is a busy waiter at the end of
the special context and eventually passes the console_lock to the
waiter.
The hand-off works surprisingly well and helps in many situations.
Well, there is still a possibility of the softlockup, for example,
when the flood of messages stops and the last owner still has too
much to flush.
There is increasing number of people having problems with
printk-related softlockups. We might eventually need to get better
solution. Anyway, this looks like a good start and promising
direction.
- Do not allow to schedule in console_unlock() called from printk():
This reverts an older controversial commit. The reschedule helped
to avoid softlockups. But it also slowed down the console output.
This patch is obsoleted by the new console waiter logic described
above. In fact, the reschedule made the hand-off less effective.
- Deprecate "%pf" and "%pF" format specifier:
It was needed on ia64, ppc64 and parisc64 to dereference function
descriptors and show the real function address. It is done
transparently by "%ps" and "pS" format specifier now.
Sergey Senozhatsky found that all the function descriptors were in
a special elf section and could be easily detected.
- Remove printk_symbol() API:
It has been obsoleted by "%pS" format specifier, and this change
helped to remove few continuous lines and a less intuitive old API.
- Remove redundant memsets:
Sergey removed unnecessary memset when processing printk.devkmsg
command line option.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: (27 commits)
printk: drop redundant devkmsg_log_str memsets
printk: Never set console_may_schedule in console_trylock()
printk: Hide console waiter logic into helpers
printk: Add console owner and waiter logic to load balance console writes
kallsyms: remove print_symbol() function
checkpatch: add pF/pf deprecation warning
symbol lookup: introduce dereference_symbol_descriptor()
parisc64: Add .opd based function descriptor dereference
powerpc64: Add .opd based function descriptor dereference
ia64: Add .opd based function descriptor dereference
sections: split dereference_function_descriptor()
openrisc: Fix conflicting types for _exext and _stext
lib: do not use print_symbol()
irq debug: do not use print_symbol()
sysfs: do not use print_symbol()
drivers: do not use print_symbol()
x86: do not use print_symbol()
unicore32: do not use print_symbol()
sh: do not use print_symbol()
mn10300: do not use print_symbol()
...
Linus Torvalds [Thu, 1 Feb 2018 21:18:25 +0000 (13:18 -0800)]
Merge tag 'vfio-v4.16-rc1' of git://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:
- Mask INTx from user if pdev->irq is zero (Alexey Kardashevskiy)
- Capability helper cleanup (Alex Williamson)
- Allow mmaps overlapping MSI-X vector table with region capability
exposing this feature (Alexey Kardashevskiy)
- mdev static cleanups (Xiongwei Song)
* tag 'vfio-v4.16-rc1' of git://github.com/awilliam/linux-vfio:
vfio: mdev: make a couple of functions and structure vfio_mdev_driver static
vfio-pci: Allow mapping MSIX BAR
vfio: Simplify capability helper
vfio-pci: Mask INTx if a device is not capabable of enabling it
Linus Torvalds [Thu, 1 Feb 2018 21:15:23 +0000 (13:15 -0800)]
Merge tag 'trace-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"There's not much changes for the tracing system this release. Mostly
small clean ups and fixes.
The biggest change is to how bprintf works. bprintf is used by
trace_printk() to just save the format and args of a printf call, and
the formatting is done when the trace buffer is read. This is done to
keep the formatting out of the fast path (this was recommended by
you). The issue is when arguments are de-referenced.
If a pointer is saved, and the format has something like "%*pbl", when
the buffer is read, it will de-reference the argument then. The
problem is if the data no longer exists. This can cause the kernel to
oops.
The fix for this was to make these de-reference pointes do the
formatting at the time it is called (the fast path), as this
guarantees that the data exists (and doesn't change later)"
* tag 'trace-v4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
vsprintf: Do not have bprintf dereference pointers
ftrace: Mark function tracer test functions noinline/noclone
trace_uprobe: Display correct offset in uprobe_events
tracing: Make sure the parsed string always terminates with '\0'
tracing: Clear parser->idx if only spaces are read
tracing: Detect the string nul character when parsing user input string
Decoding the assembly, the request claims to have 0xffff segments,
while nvme counts two. This turns out to be because we don't check
for a data carrying request on the mq scheduler path, and since
blk_phys_contig_segment() returns true for a non-data request,
we decrement the initial segment count of 0 and end up with
0xffff in the unsigned short.
There are a few issues here:
1) We should initialize the segment count for a discard to 1.
2) The discard merging is currently using the data limits for
segments and sectors.
Fix this up by having attempt_merge() correctly identify the
request, and by initializing the segment count correctly
for discards.
This can only be triggered with mq-deadline on discard capable
devices right now, which isn't a common configuration.
Linus Torvalds [Thu, 1 Feb 2018 20:20:53 +0000 (12:20 -0800)]
Merge branch 'KASAN-read_word_at_a_time'
Merge KASAN word-at-a-time fixups from Andrey Ryabinin.
The word-at-a-time optimizations have caused headaches for KASAN, since
the whole point is that we access byte streams in bigger chunks, and
KASAN can be unhappy about the potential extra access at the end of the
string.
We used to have a horrible hack in dcache, and then people got
complaints from the strscpy() case. This fixes it all up properly, by
adding an explicit helper for the "access byte stream one word at a
time" case.
* emailed patches from Andrey Ryabinin <aryabinin@virtuozzo.com>:
fs: dcache: Revert "manually unpoison dname after allocation to shut up kasan's reports"
fs/dcache: Use read_word_at_a_time() in dentry_string_cmp()
lib/strscpy: Shut up KASAN false-positives in strscpy()
compiler.h: Add read_word_at_a_time() function.
compiler.h, kasan: Avoid duplicating __read_once_size_nocheck()
Andrey Ryabinin [Thu, 1 Feb 2018 18:00:51 +0000 (21:00 +0300)]
fs/dcache: Use read_word_at_a_time() in dentry_string_cmp()
dentry_string_cmp() performs the word-at-a-time reads from 'cs' and may
read slightly more than it was requested in kmallac(). Normally this
would make KASAN to report out-of-bounds access, but this was
workarounded by commit df4c0e36f1b1 ("fs: dcache: manually unpoison
dname after allocation to shut up kasan's reports").
This workaround is not perfect, since it allows out-of-bounds access to
dentry's name for all the code, not just in dentry_string_cmp().
So it would be better to use read_word_at_a_time() instead and revert
commit df4c0e36f1b1.
Andrey Ryabinin [Thu, 1 Feb 2018 18:00:50 +0000 (21:00 +0300)]
lib/strscpy: Shut up KASAN false-positives in strscpy()
strscpy() performs the word-at-a-time optimistic reads. So it may may
access the memory past the end of the object, which is perfectly fine
since strscpy() doesn't use that (past-the-end) data and makes sure the
optimistic read won't cross a page boundary.
Use new read_word_at_a_time() to shut up the KASAN.
Note that this potentially could hide some bugs. In example bellow,
stscpy() will copy more than we should (1-3 extra uninitialized bytes):
Andrey Ryabinin [Thu, 1 Feb 2018 18:00:49 +0000 (21:00 +0300)]
compiler.h: Add read_word_at_a_time() function.
Sometimes we know that it's safe to do potentially out-of-bounds access
because we know it won't cross a page boundary. Still, KASAN will
report this as a bug.
Add read_word_at_a_time() function which is supposed to be used in such
cases. In read_word_at_a_time() KASAN performs relaxed check - only the
first byte of access is validated.
Instead of having two identical __read_once_size_nocheck() functions
with different attributes, consolidate all the difference in new macro
__no_kasan_or_inline and use it. No functional changes.
This implements ndo_poll_controller callback which is necessary to
enable netconsole.
Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>