David Miller [Sun, 24 Jun 2018 05:13:49 +0000 (14:13 +0900)]
net: Convert GRO SKB handling to list_head.
Manage pending per-NAPI GRO packets via list_head.
Return an SKB pointer from the GRO receive handlers. When GRO receive
handlers return non-NULL, it means that this SKB needs to be completed
at this time and removed from the NAPI queue.
Several operations are greatly simplified by this transformation,
especially timing out the oldest SKB in the list when gro_count
exceeds MAX_GRO_SKBS, and napi_gro_flush() which walks the queue
in reverse order.
Signed-off-by: David S. Miller <davem@davemloft.net>
2) Fix bpf instruction alignment on powerpc et al., from Eric Dumazet.
3) Don't ignore IFLA_MTU attribute when creating new ipvlan links. From
Xin Long.
4) Fix use after free in AF_PACKET, from Eric Dumazet.
5) Mis-matched RTNL unlock in xen-netfront, from Ross Lagerwall.
6) Fix VSOCK loopback on big-endian, from Claudio Imbrenda.
7) Missing RX buffer offset correction when computing DMA addresses in
mvneta driver, from Antoine Tenart.
8) Fix crashes in DCCP's ccid3_hc_rx_send_feedback, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
sfc: make function efx_rps_hash_bucket static
strparser: Corrected typo in documentation.
qmi_wwan: add support for the Dell Wireless 5821e module
cxgb4: when disabling dcb set txq dcb priority to 0
net_sched: remove a bogus warning in hfsc
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
net: dccp: avoid crash in ccid3_hc_rx_send_feedback()
net: Remove depends on HAS_DMA in case of platform dependency
MAINTAINERS: Add file patterns for dsa device tree bindings
net: mscc: make sparse happy
net: mvneta: fix the Rx desc DMA address in the Rx path
Documentation: e1000: Fix docs build error
Documentation: e100: Fix docs build error
Documentation: e1000: Use correct heading adornment
Documentation: e100: Use correct heading adornment
ipv6: mcast: fix unsolicited report interval after receiving querys
vhost_net: validate sock before trying to put its fd
VSOCK: fix loopback on big-endian systems
net: ethernet: ti: davinci_cpdma: make function cpdma_desc_pool_create static
xen-netfront: Update features after registering netdev
...
Heiner Kallweit [Sun, 24 Jun 2018 16:40:23 +0000 (18:40 +0200)]
r8169: don't check WoL when powering down PHY and interface is down
We can power down the PHY irregardless of WOL settings if interface
is down. So far we would have left the PHY enabled if WOL options
are set and the interface is brought down.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Jun 2018 16:39:06 +0000 (18:39 +0200)]
r8169: improve saved_wolopts handling
Let's make saved_wolopts a shadow copy of the WoL options. This allows
to simplify the code and get rid of calls to now unneeded function
__rtl8169_get_wol(). However don't remove __rtl8169_get_wol()
completely to be prepared for the case that we can respect BIOS WOL
settings again.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 24 Jun 2018 16:37:36 +0000 (18:37 +0200)]
r8169: improve phy initialization when resuming
Let's move calling rtl8169_init_phy() to __rtl8169_resume().
It simplifies the code and avoids rtl8169_init_phy() being called
when resuming whilst interface is down. rtl_open() will initialize
the PHY when the interface is brought up.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: sched: couple of ndo_setup_tc fixes and adjustments
This patchset includes couple of patches that fix or adjust default
cases and return values in ndo_setup_tc implementations in drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 24 Jun 2018 08:38:38 +0000 (10:38 +0200)]
nfp: handle cls_flower command default case
Currently the default case is not handled, which with future command
introductions would introduce a warning. So handle it.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Sun, 24 Jun 2018 08:38:37 +0000 (10:38 +0200)]
bnxt: simplify cls_flower command switch and handle default case
Currently the default case is not handled, which with future command
introductions would introduce a warning. So handle it and make the
switch a bit simplier removing unneeded "rc" variable.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 24 Jun 2018 12:29:15 +0000 (20:29 +0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A pile of perf updates:
Kernel side:
- Remove an incorrect warning in uprobe_init_insn() when
insn_get_length() fails. The error return code is handled at the
call site.
- Move the inline keyword to the right place in the perf ringbuffer
code to address a W=1 build warning.
Tooling:
perf stat:
- Fix metric column header display alignment
- Improve error messages for default attributes, providing better
output for error in command line.
- Add --interval-clear option, to provide a 'watch' like printing
perf script:
- Show hw-cache events too
perf c2c:
- Fix data dependency problem in layout of 'struct c2c_hist_entry'
Core:
- Do not blindly assume that 'struct perf_evsel' can be obtained via
a straight forward container_of() as there are call sites which
hand in a plain 'struct hist' which is not part of a container.
- Fix error index in the PMU event parser, so that error messages can
point to the problematic token"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Move the inline keyword at the beginning of the function declaration
uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn()
perf script: Show hw-cache events
perf c2c: Keep struct hist_entry at the end of struct c2c_hist_entry
perf stat: Add event parsing error handling to add_default_attributes
perf stat: Allow to specify specific metric column len
perf stat: Fix metric column header display alignment
perf stat: Use only color_fprintf call in print_metric_only
perf stat: Add --interval-clear option
perf tools: Fix error index for pmu event parser
perf hists: Reimplement hists__has_callchains()
perf hists browser gtk: Use hist_entry__has_callchains()
perf hists: Make hist_entry__has_callchains() work with 'perf c2c'
perf hists: Save the callchain_size in struct hist_entry
Linus Torvalds [Sun, 24 Jun 2018 12:18:19 +0000 (20:18 +0800)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fixes from Thomas Gleixer:
"A pile of rseq related fixups:
- Prevent infinite recursion when delivering SIGSEGV
- Remove the abort of rseq critical section on fork() as syscalls
inside rseq critical sections are explicitely forbidden. So no
point in doing the abort on the child.
- Align the rseq structure on 32 bytes in the ARM selftest code.
- Fix file permissions of the test script"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Avoid infinite recursion when delivering SIGSEGV
rseq/cleanup: Do not abort rseq c.s. in child on fork()
rseq/selftests/arm: Align 'struct rseq_cs' on 32 bytes
rseq/selftests: Make run_param_test.sh executable
Linus Torvalds [Sun, 24 Jun 2018 12:16:17 +0000 (20:16 +0800)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Thomas Gleixner:
"Two fixlets for the EFI maze:
- Properly zero variables to prevent an early boot hang on EFI mixed
mode systems
- Fix the fallout of merging the 32bit and 64bit variants of EFI PCI
related code which ended up chosing the 32bit variant of the actual
EFi call invocation which leads to failures on 64bit"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Fix incorrect invocation of PciIo->Attributes()
efi/libstub/tpm: Initialize efi_physical_addr_t vars to zero for mixed mode
Linus Torvalds [Sun, 24 Jun 2018 12:06:42 +0000 (20:06 +0800)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"Two tiny fixes:
- Add the missing machine_real_restart() to objtools noreturn list so
it stops complaining
- Fix a trivial comment typo"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kernel.h: Fix a typo in comment
objtool: Add machine_real_restart() to the noreturn list
Linus Torvalds [Sun, 24 Jun 2018 11:59:52 +0000 (19:59 +0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Make Xen PV guest deal with speculative store bypass correctly
- Address more fallout from the 5-Level pagetable handling. Undo an
__initdata annotation to avoid section mismatch and malfunction
when post init code would touch the freed variable.
- Handle exception fixup in math_error() before calling notify_die().
The reverse call order incorrectly triggers notify_die() listeners
for soemthing which is handled correctly at the site which issues
the floating point instruction.
- Fix an off by one in the LLC topology calculation on AMD
- Handle non standard memory block sizes gracefully un UV platforms
- Plug a memory leak in the microcode loader
- Sanitize the purgatory build magic
- Add the x86 specific device tree bindings directory to the x86
MAINTAINER file patterns"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix 'no5lvl' handling
Revert "x86/mm: Mark __pgtable_l5_enabled __initdata"
x86/CPU/AMD: Fix LLC ID bit-shift calculation
MAINTAINERS: Add file patterns for x86 device tree bindings
x86/microcode/intel: Fix memleak in save_microcode_patch()
x86/platform/UV: Add kernel parameter to set memory block size
x86/platform/UV: Use new set memory block size function
x86/platform/UV: Add adjustable set memory block size function
x86/build: Remove unnecessary preparation for purgatory
Revert "kexec/purgatory: Add clean-up for purgatory directory"
x86/xen: Add call of speculative_store_bypass_ht_init() to PV paths
x86: Call fixup_exception() before notify_die() in math_error()
Linus Torvalds [Sun, 24 Jun 2018 11:48:30 +0000 (19:48 +0800)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti fixes from Thomas Gleixner:
"Two small updates for the speculative distractions:
- Make it more clear to the compiler that array_index_mask_nospec()
is not subject for optimizations. It's not perfect, but ...
- Don't report XEN PV guests as vulnerable because their mitigation
state depends on the hypervisor. Report unknown and refer to the
hypervisor requirement"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/spectre_v1: Disable compiler optimizations over array_index_mask_nospec()
x86/pti: Don't report XenPV as vulnerable
Linus Torvalds [Sun, 24 Jun 2018 11:36:16 +0000 (19:36 +0800)]
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A set of fixes and updates for the locking code:
- Prevent lockdep from updating irq state within its own code and
thereby confusing itself.
- Buid fix for older GCCs which mistreat anonymous unions
- Add a missing lockdep annotation in down_read_non_onwer() which
causes up_read_non_owner() to emit a lockdep splat
- Remove the custom alpha dec_and_lock() implementation which is
incorrect in terms of ordering and use the generic one.
The remaining two commits are not strictly fixes. They provide irqsave
variants of atomic_dec_and_lock() and refcount_dec_and_lock(). These
are required to merge the relevant updates and cleanups into different
maintainer trees for 4.19, so routing them into mainline without
actual users is the sanest approach.
They should have been in -rc1, but last weekend I took the liberty to
just avoid computers in order to regain some mental sanity"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/qspinlock: Fix build for anonymous union in older GCC compilers
locking/lockdep: Do not record IRQ state within lockdep code
locking/rwsem: Fix up_read_non_owner() warning with DEBUG_RWSEMS
locking/refcounts: Implement refcount_dec_and_lock_irqsave()
atomic: Add irqsave variant of atomic_dec_and_lock()
alpha: Remove custom dec_and_lock() implementation
Linus Torvalds [Sun, 24 Jun 2018 11:22:19 +0000 (19:22 +0800)]
Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull ras fixes from Thomas Gleixner:
"A set of fixes for RAS/MCE:
- Improve the error message when the kernel cannot recover from a MCE
so the maximum amount of information gets provided.
- Individually check MCE recovery features on SkyLake CPUs instead of
assuming none when the CAPID0 register does not advertise the
general ability for recovery.
- Prevent MCE to output inconsistent messages which first show an
error location and then claim that the source is unknown.
- Prevent overwriting MCi_STATUS in the attempt to gather more
information when a fatal MCE has alreay been detected. This leads
to empty status values in the printout and failing to react
promptly on the fatal event"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Fix incorrect "Machine check from unknown source" message
x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
x86/mce: Check for alternate indication of machine check recovery on Skylake
x86/mce: Improve error message when kernel cannot recover
Linus Torvalds [Sun, 24 Jun 2018 11:16:42 +0000 (19:16 +0800)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"A small set of fixes for time(r) related issues:
- Fix a long standing conversion issue in jiffies_to_msecs() for odd
HZ values like 1024 or 1200 which resulted in returning 0 for small
jiffies values due to rounding down.
- Use the proper CONFIG symbol in the new Y2038 safe compat code for
posix-timers. Not yet a visible breakage, but this will immediately
trigger when the architecture support for the new interfaces is
merged.
- Return an error code in the STM32 clocksource driver on failure
instead of success.
- Remove the redundant and stale irq disabled check in the posix cpu
timer code. The check is at the wrong place anyway and lockdep
already covers it via the sighand lock locking coverage"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Make sure jiffies_to_msecs() preserves non-zero time periods
posix-timers: Fix nanosleep_copyout() for CONFIG_COMPAT_32BIT_TIME
clocksource/drivers/stm32: Fix error return code
posix-cpu-timers: Remove lockdep_assert_irqs_disabled()
Linus Torvalds [Sun, 24 Jun 2018 11:01:18 +0000 (19:01 +0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of fixes mostly for the ARM/GIC world:
- Fix the MSI affinity handling in the ls-scfg irq chip driver so it
updates and uses the effective affinity mask correctly
- Prevent binding LPIs to offline CPUs and respect the Cavium erratum
which requires that LPIs which belong to an offline NUMA node are
not bound to a CPU on a different NUMA node.
- Free only the amount of allocated interrupts in the GIC-V2M driver
instead of trying to free log2(nrirqs).
- Prevent emitting SYNC and VSYNC targetting non existing interrupt
collections in the GIC-V3 ITS driver
- Ensure that the GIV-V3 interrupt redistributor is correctly
reprogrammed on CPU hotplug
- Remove a stale unused helper function"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqdesc: Delete irq_desc_get_msi_desc()
irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug
irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection
irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection
irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
irqchip/gic-v2m: Fix SPI release on error path
irqchip/ls-scfg-msi: Fix MSI affinity handling
genirq/debugfs: Add missing IRQCHIP_SUPPORTS_LEVEL_MSI debug
Linus Torvalds [Sun, 24 Jun 2018 09:19:42 +0000 (17:19 +0800)]
Merge tag 'mips_fixes_4.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
"A few MIPS fixes for 4.18:
- a GPIO device name fix for a regression in v4.15-rc1.
- an errata workaround for the BCM5300X platform.
- a fix to ftrace function graph tracing, broken for a long time with
the fix applying cleanly back as far as v3.17.
- addition of read barriers to in{b,w,l,q}() functions, matching
behavior of other architectures & mirroring the equivalent addition
to read{b,w,l,q} in v4.17-rc2.
Plus changes to wire up new syscalls introduced in the 4.18 cycle:
- Restartable sequences support is added, including MIPS support in
the selftests.
- io_pgetevents is wired up"
* tag 'mips_fixes_4.18_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Wire up io_pgetevents syscall
rseq/selftests: Implement MIPS support
MIPS: Wire up the restartable sequences (rseq) syscall
MIPS: Add syscall detection for restartable sequences
MIPS: Add support for restartable sequences
MIPS: io: Add barrier after register read in inX()
mips: ftrace: fix static function graph tracing
MIPS: BCM47XX: Enable 74K Core ExternalSync for PCIe erratum
MIPS: pb44: Fix i2c-gpio GPIO descriptor table
Ard Biesheuvel [Sat, 23 Jun 2018 21:19:03 +0000 (23:19 +0200)]
efi/x86: Fix incorrect invocation of PciIo->Attributes()
The following commit:
2c3625cb9fa2 ("efi/x86: Fold __setup_efi_pci32() and __setup_efi_pci64() into one function")
... merged the two versions of __setup_efi_pciXX(), without taking into
account that the 32-bit version used a rather dodgy trick to pass an
immediate 0 constant as argument for a uint64_t parameter.
The issue is caused by the fact that on x86, UEFI protocol method calls
are redirected via struct efi_config::call(), which is a variadic function,
and so the compiler has to infer the types of the parameters from the
arguments rather than from the prototype.
As the 32-bit x86 calling convention passes arguments via the stack,
passing the unqualified constant 0 twice is the same as passing 0ULL,
which is why the 32-bit code in __setup_efi_pci32() contained the
following call:
status = efi_early->call(pci->attributes, pci,
EfiPciIoAttributeOperationGet, 0, 0,
&attributes);
to invoke this UEFI protocol method:
typedef
EFI_STATUS
(EFIAPI *EFI_PCI_IO_PROTOCOL_ATTRIBUTES) (
IN EFI_PCI_IO_PROTOCOL *This,
IN EFI_PCI_IO_PROTOCOL_ATTRIBUTE_OPERATION Operation,
IN UINT64 Attributes,
OUT UINT64 *Result OPTIONAL
);
After the merge, we inadvertently ended up with this version for both
32-bit and 64-bit builds, breaking the latter.
So replace the two zeroes with the explicitly typed constant 0ULL,
which works as expected on both 32-bit and 64-bit builds.
Wilfried tested the 64-bit build, and I checked the generated assembly
of a 32-bit build with and without this patch, and they are identical.
qmi_wwan: add support for the Dell Wireless 5821e module
This module exposes two USB configurations: a QMI+AT capable setup on
USB config #1 and a MBIM capable setup on USB config #2.
By default the kernel will choose the MBIM capable configuration as
long as the cdc_mbim driver is available. This patch adds support for
the QMI port in the secondary configuration.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Sat, 23 Jun 2018 14:58:26 +0000 (20:28 +0530)]
cxgb4: when disabling dcb set txq dcb priority to 0
When we are disabling DCB, store "0" in txq->dcb_prio
since that's used for future TX Work Request "OVLAN_IDX"
values. Setting non zero priority upon disabling DCB
would halt the traffic.
Reported-by: AMG Zollner Robert <robert@cloudmedia.eu> CC: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 23 Jun 2018 22:33:54 +0000 (06:33 +0800)]
Merge tag 'for-linus-20180623' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Further timeout fixes. We aren't quite there yet, so expect another
round of fixes for that to completely close some of the IRQ vs
completion races. (Christoph/Bart)
- Set of NVMe fixes from the usual suspects, mostly error handling
- Two off-by-one fixes (Dan)
- Another bdi race fix (Jan)
- Fix nbd reconfigure with NBD_DISCONNECT_ON_CLOSE (Doron)
* tag 'for-linus-20180623' of git://git.kernel.dk/linux-block:
blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE
bdi: Fix another oops in wb_workfn()
lightnvm: Remove depends on HAS_DMA in case of platform dependency
nvme-pci: limit max IO size and segments to avoid high order allocations
nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
nvme-fc: release io queues to allow fast fail
nbd: Add the nbd NBD_DISCONNECT_ON_CLOSE config flag.
block: sed-opal: Fix a couple off by one bugs
blk-mq-debugfs: Off by one in blk_mq_rq_state_name()
nvmet: reset keep alive timer in controller enable
nvme-rdma: don't override opts->queue_size
nvme-rdma: Fix command completion race at error recovery
nvme-rdma: fix possible free of a non-allocated async event buffer
nvme-rdma: fix possible double free condition when failing to create a controller
Revert "block: Add warning for bi_next not NULL in bio_endio()"
block: fix timeout changes for legacy request drivers
Linus Torvalds [Sat, 23 Jun 2018 22:31:54 +0000 (06:31 +0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Fix use after free in chtls
- Fix RBP breakage in sha3
- Fix use after free in hwrng_unregister
- Fix overread in morus640
- Move sleep out of kernel_neon in arm64/aes-blk
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: core - Always drop the RNG in hwrng_unregister()
crypto: morus640 - Fix out-of-bounds access
crypto: don't optimize keccakf()
crypto: arm64/aes-blk - fix and move skcipher_walk_done out of kernel_neon_begin, _end
crypto: chtls - use after free in chtls_pt_recvmsg()
Linus Torvalds [Sat, 23 Jun 2018 22:23:28 +0000 (06:23 +0800)]
Merge tag 'trace-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This contains a few fixes and a clean up.
- a bad merge caused an "endif" to go in the wrong place in
scripts/Makefile.build
- softirq tracing fix for tracing that corrupts lockdep and causes a
false splat
- histogram documentation typo fixes
- fix a bad memory reference when passing in no filter to the filter
code
- simplify code by using the swap macro instead of open coding the
swap"
* tag 'trace-v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix SKIP_STACK_VALIDATION=1 build due to bad merge with -mrecord-mcount
tracing: Fix some errors in histogram documentation
tracing: Use swap macro in update_max_tr
softirq: Reorder trace_softirqs_on to prevent lockdep splat
tracing: Check for no filter when processing event filters
Bart Van Assche [Fri, 22 Jun 2018 20:18:09 +0000 (13:18 -0700)]
blk-mq: Fix timeout handling in case the timeout handler returns BLK_EH_DONE
Make sure that RQF_TIMED_OUT is cleared when a request is reused
after a block driver timeout handler has returned BLK_EH_DONE.
Fixes: da6612673988 ("blk-mq: don't time out requests again that are in the timeout handler") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jianchao Wang <jianchao.w.wang@oracle.com> Cc: Andrew Randrianasulu <randrianasulu@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Sat, 23 Jun 2018 13:13:05 +0000 (21:13 +0800)]
Merge tag 'powerpc-4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- a fix for hugetlb with 4K pages, broken by our recent changes for
split PMD PTL.
- set the correct assembler machine type on e500mc, needed since
binutils 2.26 introduced two forms for the "wait" instruction.
- a fix for potential missed TLB flushes with MADV_[FREE|DONTNEED] etc.
and THP on Power9 Radix.
- three fixes to try and make our panic handling more robust by hard
disabling interrupts, and not marking stopped CPUs as offline because
they haven't been properly offlined.
- three other minor fixes.
Thanks to: Aneesh Kumar K.V, Michael Jeanson, Nicholas Piggin.
* tag 'powerpc-4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/hash/4k: Free hugetlb page table caches correctly.
powerpc/64s/radix: Fix radix_kvm_prefetch_workaround paca access of not possible CPU
powerpc/64s: Fix build failures with CONFIG_NMI_IPI=n
powerpc/64: hard disable irqs on the panic()ing CPU
powerpc: smp_send_stop do not offline stopped CPUs
powerpc/64: hard disable irqs in panic_smp_self_stop
powerpc/64s: Fix DT CPU features Power9 DD2.1 logic
powerpc/64s/radix: Fix MADV_[FREE|DONTNEED] TLB flush miss problem with THP
powerpc/e500mc: Set assembler machine type to e500mc
Linus Torvalds [Sat, 23 Jun 2018 13:07:43 +0000 (21:07 +0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- clear buffers allocated with FORCE_CONTIGUOUS explicitly until the
CMA code honours __GFP_ZERO
- notrace annotation for secondary_start_kernel()
- use early_param() instead of __setup() for "kpti=" as it is needed
for the cpufeature callback remapping swapper to non-global mappings
- ensure writes to swapper are ordered wrt subsequent cache maintenance
in the kpti non-global remapping code
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
arm64: kpti: Use early_param for kpti= command-line option
arm64: make secondary_start_kernel() notrace
arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag
Linus Torvalds [Sat, 23 Jun 2018 12:59:00 +0000 (20:59 +0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"ARM:
- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
x86:
- Add nested VM entry checks to avoid broken error recovery path
- Minor documentation fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: fix KVM_CAP_HYPERV_TLBFLUSH paragraph number
kvm: vmx: Nested VM-entry prereqs for event inj.
KVM: arm64: Prevent KVM_COMPAT from being selected
KVM: Enforce error in ioctl for compat tasks when !KVM_COMPAT
KVM: arm/arm64: add WARN_ON if size is not PAGE_SIZE aligned in unmap_stage2_range
KVM: arm64: Avoid mistaken attempts to save SVE state for vcpus
KVM: arm64/sve: Fix SVE trap restoration for non-current tasks
KVM: arm64: Don't mask softirq with IRQs disabled in vcpu_put()
arm64: Introduce sysreg_clear_set()
KVM: arm/arm64: Drop resource size check for GICV window
Linus Torvalds [Sat, 23 Jun 2018 12:44:11 +0000 (20:44 +0800)]
Merge tag 'for-linus-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"This contains the following fixes/cleanups:
- the removal of a BUG_ON() which wasn't necessary and which could
trigger now due to a recent change
- a correction of a long standing bug happening very rarely in Xen
dom0 when a hypercall buffer from user land was not accessible by
the hypervisor for very short periods of time due to e.g. page
migration or compaction
- usage of EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() in a
Xen-related driver (no breakage possible as using those symbols
without others already exported via EXPORT-SYMBOL_GPL() wouldn't
make any sense)
- a simplification for Xen PVH or Xen ARM guests
- some additional error handling for callers of xenbus_printf()"
* tag 'for-linus-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Remove unnecessary BUG_ON from __unbind_from_irq()
xen: add new hypercall buffer mapping device
xen/scsiback: add error handling for xenbus_printf
scsi: xen-scsifront: add error handling for xenbus_printf
xen/grant-table: Export gnttab_{alloc|free}_pages as GPL
xen: add error handling for xenbus_printf
xen: share start flags between PV and PVH
__pgtable_l5_enabled marked as __initdata, but cpu_init() is not __init.
It's fixable: early code can be isolated into a separate translation unit,
but such change collides with other work in the area. That's too much
hassle to save 4 bytes of memory.
Return __pgtable_l5_enabled back to be __ro_after_init.
Heiner Kallweit [Sat, 23 Jun 2018 07:53:00 +0000 (09:53 +0200)]
r8169: enable ASPM on RTL8168E-VL
Let's enable ASPM also on the RTL8168E-VL (chip version 34).
Works fine on my Zotac Mini PC with this chip. Temperature when
being idle is significantly lower than before due to reaching
deeper PC states.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sat, 23 Jun 2018 07:51:28 +0000 (09:51 +0200)]
r8169: align ASPM entry latency setting with vendor driver
The r8168 vendor driver always uses value 0x27. In r8169 we have few
chips where 0x17 is used. So far this didn't matter because ASPM was
disabled anyway. Now that ASPM was re-enabled let's also use 0x27 only.
One of the chips affected by this change is RTL8168E-VL, on my system
with this chip value 0x27 works fine.
In addition rename rtl_csi_access_enable_2() to
rtl_set_def_aspm_entry_latency() to make clear that we set the default
ASPM entry latency.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 23 Jun 2018 11:52:09 +0000 (20:52 +0900)]
Merge branch 'pch_gbe-Cleanups'
Paul Burton says:
====================
net: pch_gbe: Cleanups
This series begins the process of cleaning up the pch_gbe network
driver. Whilst my ultimate goal is to add support for using this driver
on the MIPS Boston development board, this series sets that aside in
favor of making some more general cleanups. My hope is that this will
both make the driver a little more maleable & reduce the probability of
me gouging out my eyes.
Applies cleanly atop net-next as of 5424ea27390f ("netns: get more
entropy from net_hash_mix()").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:53 +0000 (20:17 -0700)]
net: pch_gbe: Clean up pch_gbe_set_multi
Refactor pch_gbe_set_multi in order to avoid unnecessary indentation &
make it clearer what the code is doing.
The one behavioral change from this patch is that we'll no longer
configure the MAC address registers for multicast addresses when the
IFF_PROMISC or IFF_ALLMULTI flags are set. In these cases, just as when
we want to monitor more multicast addresses than we have MAC address
registers, we disable multicast filtering so the MAC address registers
are unused.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The pch_gbe driver sets up multicast address filters using a convoluted
mechanism by which pch_gbe_set_multi allocates an array to hold
multicast addresses, copies desired addresses into that array, calls a
pch_gbe_mac_mc_addr_list_update function which copies addresses out of
that array into MAC registers, then frees the array.
This patch simplifies this somewhat by inlining
pch_gbe_mac_mc_addr_list_update into pch_gbe_set_multi, and removing the
requirement for the MAC addresses to stored consecutively in a single
array.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:51 +0000 (20:17 -0700)]
net: pch_gbe: Use module_pci_driver()
Make use of the module_pci_driver() macro to remove some needless
boilerplate code from the pch_gbe driver. This does have the side effect
of removing the print of the driver's version during probe, but this is
pretty useless information anyway - the version has changed only once
whilst the driver has been in mainline, despite many changes being made
to it before and since.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:50 +0000 (20:17 -0700)]
net: pch_gbe: Remove dead RINGFREE code
The pch_gbe driver includes some code which appears to be an attempt to
work around a problem with the pch_gbe_free_rx_resources &
pch_gbe_free_tx_resources functions that no longer exists. Remove the
code guarded by the never-defined RINGFREE preprocessor macro.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The pch_gbe driver currently presumes that the PHY is connected using
RGMII, and would need further work to support other buses. It includes a
define which is always set that conditionalises some of the
RGMII-specific code regardless. Remove it. If we do ever support
different MII buses then preprocessor defines won't be the best way to
select between them anyway.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:48 +0000 (20:17 -0700)]
net: pch_gbe: Remove pch_gbe_hal_setup_init_funcs
The pch_gbe driver calls a pch_gbe_hal_setup_init_funcs function which
ultimately sets the value of one field in struct pch_gbe_phy_info in a
convoluted way.
This patch removes pch_gbe_hal_setup_init_funcs in favor of inlining it,
and in turn its callee pch_gbe_plat_init_function_pointers, into the
single caller pch_gbe_sw_init.
With this pch_gbe_api.c & pch_gbe_api.h are essentially empty, so they
are removed & inclusions of the latter replaced with pch_gbe_phy.h which
was previously being included via pch_gbe_api.h.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:47 +0000 (20:17 -0700)]
net: pch_gbe: Remove get_bus_info HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the get_bus_info abstraction. Its single
implementation (pch_gbe_plat_get_bus_info) only sets values within a
struct pch_gbe_bus_info which is never used, so we simply remove the
call to it in pch_gbe_probe & remove struct pch_gbe_bus_info entirely.
Now that struct pch_gbe_functions is empty we remove it entirely too.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:46 +0000 (20:17 -0700)]
net: pch_gbe: Remove init_hw HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the init_hw abstraction in favor of inlining its
single implementation (pch_gbe_plat_init_hw) into its single caller
(pch_gbe_reset).
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:45 +0000 (20:17 -0700)]
net: pch_gbe: Remove {read,write}_phy_reg HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the read_phy_reg & write_phy_reg abstractions in
favor of calling pch_gbe_phy_read_reg_miic & pch_gbe_phy_write_reg_miic
directly.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:44 +0000 (20:17 -0700)]
net: pch_gbe: Remove reset_phy HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the reset_phy abstraction in favor of calling
pch_gbe_phy_hw_reset directly.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:43 +0000 (20:17 -0700)]
net: pch_gbe: Remove sw_reset_phy HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the sw_reset_phy abstraction, which it turns out is
never even used. Its one implementation, which is already called
directly within the same translation unit, can therefore be made static
and removed from the pch_gbe_phy.h header.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:42 +0000 (20:17 -0700)]
net: pch_gbe: Remove read_mac_addr HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the read_mac_addr abstraction in favor of calling
pch_gbe_mac_read_mac_addr directly. Since this is defined in the same
translation unit as all of its callers, we can make it static & remove
it from the pch_gbe.h header.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:41 +0000 (20:17 -0700)]
net: pch_gbe: Remove power_{up,down}_phy HAL abstraction
For some reason the pch_gbe driver contains a struct pch_gbe_functions
with pointers used by a HAL abstraction layer, even though there is only
one implementation of each function.
This patch removes the power_up_phy & power_down_phy abstractions in
favor of calling pch_phy_power_up & pch_phy_power_down directly.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Burton [Sat, 23 Jun 2018 03:17:40 +0000 (20:17 -0700)]
net: pch_gbe: Remove unused copybreak parameter
The pch_gbe driver includes a 'copybreak' parameter which appears to
have been copied from the e1000e driver but is entirely unused. Remove
the dead code.
Signed-off-by: Paul Burton <paul.burton@mips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 22 Jun 2018 23:27:47 +0000 (16:27 -0700)]
netns: get more entropy from net_hash_mix()
struct net are effectively allocated from order-1 pages on x86,
with one object per slab, meaning that the 13 low order bits
of their addresses are zero.
Once shifted by L1_CACHE_SHIFT, this leaves 7 zero-bits,
meaning that net_hash_mix() does not help spreading
objects on various hash tables.
For example, TCP listen table has 32 buckets, meaning that
all netns use the same bucket for port 80 or port 443.
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Fri, 22 Jun 2018 21:33:16 +0000 (14:33 -0700)]
net_sched: remove a bogus warning in hfsc
In update_vf():
cftree_remove(cl);
update_cfmin(cl->cl_parent);
the cl_cfmin of cl->cl_parent is intentionally updated to 0
when that parent only has one child. And if this parent is
root qdisc, we could end up, in hfsc_schedule_watchdog(),
that we can't decide the next schedule time for qdisc watchdog.
But it seems safe that we can just skip it, as this watchdog is
not always scheduled anyway.
Thanks to Marco for testing all the cases, nothing is broken.
Reported-by: Marco Berizzi <pupilla@libero.it> Tested-by: Marco Berizzi <pupilla@libero.it> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 22 Jun 2018 13:44:15 +0000 (06:44 -0700)]
net: dccp: switch rx_tstamp_last_feedback to monotonic clock
To compute delays, better not use time of the day which can
be changed by admins or malicious programs.
Also change ccid3_first_li() to use s64 type for delta variable
to avoid potential overflows.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
net: Remove depends on HAS_DMA in case of platform dependency
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 22 Jun 2018 09:50:52 +0000 (11:50 +0200)]
net: mscc: make sparse happy
This patch fixes a sparse warning about using an incorrect type in
argument 2 of ocelot_write_rix(), as an u32 was expected but a __be32
was given. The conversion to u32 is forced, which is safe as the value
will be written as-is in the hardware without any modification.
Fixes: 08d02364b12f ("net: mscc: fix the injection header") Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 22 Jun 2018 08:15:39 +0000 (10:15 +0200)]
net: mvneta: fix the Rx desc DMA address in the Rx path
When using s/w buffer management, buffers are allocated and DMA mapped.
When doing so on an arm64 platform, an offset correction is applied on
the DMA address, before storing it in an Rx descriptor. The issue is
this DMA address is then used later in the Rx path without removing the
offset correction. Thus the DMA address is wrong, which can led to
various issues.
This patch fixes this by removing the offset correction from the DMA
address retrieved from the Rx descriptor before using it in the Rx path.
Fixes: 8d5047cf9ca2 ("net: mvneta: Convert to be 64 bits compatible") Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tobin C. Harding [Fri, 22 Jun 2018 00:37:08 +0000 (10:37 +1000)]
Documentation: e1000: Fix docs build error
Recent patch updated e1000 docs to rst format. Docs build (`make
htmldocs`) is currently failing due to this file with error:
(SEVERE/4) Unexpected section title.
This is because a section of the file is indented 2 spaces. Build error
can be cleared by aligning the text with column 0. While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.
Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Tobin C. Harding <me@tobin.cc> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tobin C. Harding [Fri, 22 Jun 2018 00:37:07 +0000 (10:37 +1000)]
Documentation: e100: Fix docs build error
Recent patch updated e100 docs to rst format. Docs build (`make
htmldocs`) is currently failing due to this file with error:
(SEVERE/4) Unexpected section title.
This is because a section of the file is indented 2 spaces. Build error
can be cleared by aligning the text with column 0. While we are changing
these lines we can make sure line length does not exceed 72, that
newlines following headings are uniform, and that full stops are
followed by two spaces.
Align text with column 0, limit line length to 72, ensure two spaces
follow all full stops, ensure uniform use of newlines after heading.
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Tobin C. Harding <me@tobin.cc> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Tobin C. Harding <me@tobin.cc> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Tobin C. Harding <me@tobin.cc> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: phy: Allow compile test of GPIO consumers if !GPIOLIB
The GPIO subsystem provides dummy GPIO consumer functions if GPIOLIB is
not enabled. Hence drivers that depend on GPIOLIB, but use GPIO consumer
functionality only, can still be compiled if GPIOLIB is not enabled.
Relax the dependency on GPIOLIB if COMPILE_TEST is enabled, where
appropriate.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Thu, 21 Jun 2018 11:49:36 +0000 (19:49 +0800)]
ipv6: mcast: fix unsolicited report interval after receiving querys
After recieving MLD querys, we update idev->mc_maxdelay with max_delay
from query header. This make the later unsolicited reports have the same
interval with mc_maxdelay, which means we may send unsolicited reports with
long interval time instead of default configured interval time.
Also as we will not call ipv6_mc_reset() after device up. This issue will
be there even after leave the group and join other groups.
Fixes: fc4eba58b4c14 ("ipv6: make unsolicited report intervals configurable for mld") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Thu, 21 Jun 2018 05:11:31 +0000 (13:11 +0800)]
vhost_net: validate sock before trying to put its fd
Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
we meet errors during ubuf allocation, the code does not check for
NULL before calling sockfd_put(), this will lead NULL
dereferencing. Fixing by checking sock pointer before.
Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The current logic incorrectly calculates the LLC ID from the APIC ID.
Unless specified otherwise, the LLC ID should be calculated by removing
the Core and Thread ID bits from the least significant end of the APIC
ID. For more info, see "ApicId Enumeration Requirements" in any Fam17h
PPR document.
Jan Kara [Mon, 18 Jun 2018 13:46:58 +0000 (15:46 +0200)]
bdi: Fix another oops in wb_workfn()
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.
The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:
We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().
Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>
lightnvm: Remove depends on HAS_DMA in case of platform dependency
Remove dependencies on HAS_DMA where a Kconfig symbol depends on another
symbol that implies HAS_DMA, and, optionally, on "|| COMPILE_TEST".
In most cases this other symbol is an architecture or platform specific
symbol, or PCI.
Generic symbols and drivers without platform dependencies keep their
dependencies on HAS_DMA, to prevent compiling subsystems or drivers that
cannot work anyway.
This simplifies the dependencies, and allows to improve compile-testing.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Matias Bjørling <mb@lightnvm.io> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Will Deacon [Fri, 22 Jun 2018 10:45:07 +0000 (11:45 +0100)]
rseq: Avoid infinite recursion when delivering SIGSEGV
When delivering a signal to a task that is using rseq, we call into
__rseq_handle_notify_resume() so that the registers pushed in the
sigframe are updated to reflect the state of the restartable sequence
(for example, ensuring that the signal returns to the abort handler if
necessary).
However, if the rseq management fails due to an unrecoverable fault when
accessing userspace or certain combinations of RSEQ_CS_* flags, then we
will attempt to deliver a SIGSEGV. This has the potential for infinite
recursion if the rseq code continuously fails on signal delivery.
Avoid this problem by using force_sigsegv() instead of force_sig(), which
is explicitly designed to reset the SEGV handler to SIG_DFL in the case
of a recursive fault. In doing so, remove rseq_signal_deliver() from the
internal rseq API and have an optional struct ksignal * parameter to
rseq_handle_notify_resume() instead.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: peterz@infradead.org Cc: paulmck@linux.vnet.ibm.com Cc: boqun.feng@gmail.com Link: https://lkml.kernel.org/r/1529664307-983-1-git-send-email-will.deacon@arm.com
Will Deacon [Fri, 22 Jun 2018 15:23:45 +0000 (16:23 +0100)]
arm64: mm: Ensure writes to swapper are ordered wrt subsequent cache maintenance
When rewriting swapper using nG mappings, we must performance cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.
This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.
Fixes: f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings") Cc: <stable@vger.kernel.org> # 4.16.x- Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
time: Make sure jiffies_to_msecs() preserves non-zero time periods
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.
However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods. This may break code that relies on
receiving back a non-zero value.
jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:
- include/linux/jiffies.h does #error if HZ >= 12288,
- kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).
Broken since forever.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Stephen Boyd <sboyd@kernel.org> Cc: linux-alpha@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org
Marc Orr [Thu, 21 Jun 2018 00:21:29 +0000 (17:21 -0700)]
kvm: vmx: Nested VM-entry prereqs for event inj.
This patch extends the checks done prior to a nested VM entry.
Specifically, it extends the check_vmentry_prereqs function with checks
for fields relevant to the VM-entry event injection information, as
described in the Intel SDM, volume 3.
This patch is motivated by a syzkaller bug, where a bad VM-entry
interruption information field is generated in the VMCS02, which causes
the nested VM launch to fail. Then, KVM fails to resume L1.
While KVM should be improved to correctly resume L1 execution after a
failed nested launch, this change is justified because the existing code
to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
sparse.
Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Marc Orr <marcorr@google.com>
[Removed comment whose parts were describing previous revisions and the
rest was obvious from function/variable naming. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Jens Axboe [Fri, 22 Jun 2018 14:45:29 +0000 (08:45 -0600)]
Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus
Pull NVMe fixes from Christoph:
"Various relatively small fixes, mostly to fix error handling of various
sorts."
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvme-pci: limit max IO size and segments to avoid high order allocations
nvme-pci: move nvme_kill_queues to nvme_remove_dead_ctrl
nvme-fc: release io queues to allow fast fail
nvmet: reset keep alive timer in controller enable
nvme-rdma: don't override opts->queue_size
nvme-rdma: Fix command completion race at error recovery
nvme-rdma: fix possible free of a non-allocated async event buffer
nvme-rdma: fix possible double free condition when failing to create a controller
Radim Krčmář [Fri, 22 Jun 2018 12:56:19 +0000 (14:56 +0200)]
Merge tag 'kvmarm-fixes-for-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm
KVM/arm fixes for 4.18, take #1
- Lazy FPSIMD switching fixes
- Really disable compat ioctls on architectures that don't want it
- Disable compat on arm64 (it was never implemented...)
- Rely on architectural requirements for GICV on GICv3
- Detect bad alignments in unmap_stage2_range
Tony Luck [Fri, 22 Jun 2018 09:54:23 +0000 (11:54 +0200)]
x86/mce: Fix incorrect "Machine check from unknown source" message
Some injection testing resulted in the following console log:
mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
Kernel panic - not syncing: Machine check from unknown source
This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".
The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.
It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.
We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually every happens).
[ bp: Remove unneeded severity assignment. ]
Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: stable@vger.kernel.org # 4.2 Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
Borislav Petkov [Fri, 22 Jun 2018 09:54:28 +0000 (11:54 +0200)]
x86/mce: Do not overwrite MCi_STATUS in mce_no_way_out()
mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.
However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable.
Which leads to MCE records with an empty status value:
mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}
In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.
Tony: read the rest of the MCA bank to populate the struct mce fully.
Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
John Garry [Fri, 22 Jun 2018 11:35:33 +0000 (19:35 +0800)]
irqdesc: Delete irq_desc_get_msi_desc()
Function irq_desc_get_msi_desc() is not referenced in the kernel (and does
not seem to have been referenced since e39758e0ea76, 3 years ago), so
delete it.
Marc Zyngier [Fri, 22 Jun 2018 09:52:54 +0000 (10:52 +0100)]
irqchip/gic-v3-its: Fix reprogramming of redistributors on CPU hotplug
Enabling LPIs was made a lot stricter recently, by checking that they are
disabled before enabling them. By doing so, the CPU hotplug case was missed
altogether, which leaves LPIs enabled on hotplug off (expecting the CPU to
eventually come back), and won't write a different value anyway on hotplug
on.
So skip that check if that particular case is detected
Fixes: 6eb486b66a30 ("irqchip/gic-v3: Ensure GICR_CTLR.EnableLPI=0 is observed before enabling") Reported-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Sumit Garg <sumit.garg@linaro.org> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Link: https://lkml.kernel.org/r/20180622095254.5906-8-marc.zyngier@arm.com
Marc Zyngier [Fri, 22 Jun 2018 09:52:53 +0000 (10:52 +0100)]
irqchip/gic-v3-its: Only emit VSYNC if targetting a valid collection
Similarily to the SYNC operation, it must be verified that the VPE
targetted by a VLPI is backed by a valid collection in the GIC driver data
structures.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-7-marc.zyngier@arm.com
Marc Zyngier [Fri, 22 Jun 2018 09:52:52 +0000 (10:52 +0100)]
irqchip/gic-v3-its: Only emit SYNC if targetting a valid collection
It is possible, under obscure circumstances, to convince the ITS driver to
emit a SYNC operation that targets a collection that is not bound to any
redistributor (and the target_address field is zero) because the
corresponding CPU has not been seen yet (the system has been booted with
max_cpus="something small").
If the ITS is using the linear CPU number as the target, this is not a big
deal, as we just end-up issuing a SYNC to CPU0. But if the ITS requires the
physical address of the redistributor (with GITS_TYPER.PTA==1), we end-up
asking the ITS to write to the physical address zero, which is not exactly
a good idea (there has been report of the ITS locking up). This should of
course never happen, but hey, this is SW...
In order to avoid the above disaster, let's track which collections have
been actually initialized, and let's not generate a SYNC if the collection
hasn't been properly bound to a redistributor. Take this opportunity to
spit our a warning, in the hope that someone may report the issue if it
arrises again.
Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Sumit Garg <sumit.garg@linaro.org> Link: https://lkml.kernel.org/r/20180622095254.5906-6-marc.zyngier@arm.com
Yang Yingliang [Fri, 22 Jun 2018 09:52:51 +0000 (10:52 +0100)]
irqchip/gic-v3-its: Don't bind LPI to unavailable NUMA node
On a NUMA system, if an ITS is local to an offline node, the ITS driver may
pick an offline CPU to bind the LPI. In this case, pick an online CPU (and
the first one will do).
But on some systems, binding an LPI to non-local node CPU may cause
deadlock (see Cavium erratum 23144). In this case, just fail the activate
and return an error code.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jason Cooper <jason@lakedaemon.net> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Sumit Garg <sumit.garg@linaro.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180622095254.5906-5-marc.zyngier@arm.com