s390/appldata: do not use stack buffers for hardware data
With CONFIG_VMAP_STACK=y the stack is allocated from the vmalloc space.
Data structures passed to a hardware or a hypervisor interface that
requires V=R can not be allocated on the stack anymore.
Use kmalloc to get memory for the appldata_product_id and the
appldata_parameter_list structures.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provide function to find a ccwgroup device by its busid.
Acked-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch is an extension to the zcrypt device driver to provide,
support and maintain multiple zcrypt device nodes. The individual
zcrypt device nodes can be restricted in terms of crypto cards,
domains and available ioctls. Such a device node can be used as a
base for container solutions like docker to control and restrict
the access to crypto resources.
The handling is done with a new sysfs subdir /sys/class/zcrypt.
Echoing a name (or an empty sting) into the attribute "create" creates
a new zcrypt device node. In /sys/class/zcrypt a new link will appear
which points to the sysfs device tree of this new device. The
attribute files "ioctlmask", "apmask" and "aqmask" in this directory
are used to customize this new zcrypt device node instance. Finally
the zcrypt device node can be destroyed by echoing the name into
/sys/class/zcrypt/destroy. The internal structs holding the device
info are reference counted - so a destroy will not hard remove a
device but only marks it as removable when the reference counter drops
to zero.
The mask values are bitmaps in big endian order starting with bit 0.
So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs
attributes accept 2 different formats:
* Absolute hex string starting with 0x like "0x12345678" does set
the mask starting from left to right. If the given string is shorter
than the mask it is padded with 0s on the right. If the string is
longer than the mask an error comes back (EINVAL).
* Relative format - a concatenation (done with ',') of the
terms +<bitnr>[-<bitnr>] or -<bitnr>[-<bitnr>]. <bitnr> may be any
valid number (hex, decimal or octal) in the range 0...255. Here are
some examples:
"+0-15,+32,-128,-0xFF"
"-0-255,+1-16,+0x128"
"+1,+2,+3,+4,-5,-7-10"
A simple usage examples:
# create new zcrypt device 'my_zcrypt':
echo "my_zcrypt" >/sys/class/zcrypt/create
# go into the device dir of this new device
echo "my_zcrypt" >create
cd my_zcrypt/
ls -l
total 0
-rw-r--r-- 1 root root 4096 Jul 20 15:23 apmask
-rw-r--r-- 1 root root 4096 Jul 20 15:23 aqmask
-r--r--r-- 1 root root 4096 Jul 20 15:23 dev
-rw-r--r-- 1 root root 4096 Jul 20 15:23 ioctlmask
lrwxrwxrwx 1 root root 0 Jul 20 15:23 subsystem -> ../../../../class/zcrypt
...
# customize this zcrypt node clone
# enable only adapter 0 and 2
echo "0xa0" >apmask
# enable only domain 6
echo "+6" >aqmask
# enable all 256 ioctls
echo "+0-255" >ioctls
# now the /dev/my_zcrypt may be used
# finally destroy it
echo "my_zcrypt" >/sys/class/zcrypt/destroy
Please note that a very similar 'filtering behavior' also applies to
the parent z90crypt device. The two mask attributes apmask and aqmask
in /sys/bus/ap act the very same for the z90crypt device node. However
the implementation here is totally different as the ap bus acts on
bind/unbind of queue devices and associated drivers but the effect is
still the same. So there are two filters active for each additional
zcrypt device node: The adapter/domain needs to be enabled on the ap
bus level and it needs to be active on the zcrypt device node level.
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Julian Wiedmann [Tue, 15 May 2018 19:17:38 +0000 (21:17 +0200)]
s390/qdio: clean up AOB handling
I've stumbled over this too many times now... AOBs are only ever used on
Output Queues. So in qdio_kick_handler(), move the call to their handler
into the Output-only path, and get rid of the convoluted contains_aobs()
helper. No functional change.
While at it, also remove
1. the unused sbal_state->aob field. For processing an async completion,
upper-layer drivers get their AOB pointer from the CQ buffer.
2. an unused EXPORT for qdio_allocate_aob(). External users would have
no way of passing an allocated AOB back into qdio.ko anyways...
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390/vdso: correct CFI annotations of vDSO functions
Correct stack frame overhead for 31-bit vdso, which should be 96 rather
then 160. This is done by reusing STACK_FRAME_OVERHEAD definition which
contains correct value based on build flags. This fixes stack unwinding
within vdso code for 31-bit processes. While at it replace all hard coded
stack frame overhead values with the same definition in vdso64 as well.
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390/vdso: avoid 64-bit vdso mapping for compat tasks
vdso_fault used is_compat_task function (on s390 it tests "current"
thread_info flags) to distinguish compat tasks and map 31-bit vdso
pages. But "current" task might not correspond to mm context.
When 31-bit compat inferior is executed under gdb, gdb does
PTRACE_PEEKTEXT on vdso page, causing vdso_fault with "current" being
64-bit gdb process. So, 31-bit inferior ends up with 64-bit vdso mapped.
To avoid this problem a new compat_mm flag has been introduced into
mm context. This flag is used in vdso_fault and vdso_mremap instead
of is_compat_task.
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Halil Pasic [Mon, 17 Sep 2018 13:23:03 +0000 (15:23 +0200)]
s390/zcrypt: enable AP bus scan without a valid default domain
The AP bus scan is aborted before doing anything worth mentioning if
ap_select_domain() fails, e.g. if the ap_rights.aqm mask is all zeros.
As the result of this the ap bus fails to manage (e.g. create and
register) devices like it is supposed to.
Let us make ap_scan_bus() work even if ap_select_domain() can't select a
default domain. Let's also make ap_select_domain() return void, as there
are no more callers interested in its return value.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reported-by: Michael Mueller <mimu@linux.ibm.com> Fixes: 7e0bdbe5c21c "s390/zcrypt: AP bus support for alternate driver(s)"
[freude@linux.ibm.com: title and patch header slightly modified] Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
To be able to start a kernel image loaded into memory with a
PSW restart, place a 64-bit restart PSW at 0x1a0 in absolute
lowcore.
Suggested-by: Dominik Klein <dominik.klein@linux.ibm.com> Tested-by: Dominik Klein <dominik.klein@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390/zcrypt: Use kmemdup to replace kmalloc + memcpy
kmemdup has implemented the function that kmalloc() + memcpy() will
do. We prefer to use the kmemdup function rather than an open coded
implementation.
Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Jan Höppner [Thu, 6 Sep 2018 11:16:40 +0000 (13:16 +0200)]
s390/sclp: Allow to request adapter reset
The SCLP event 24 "Adapter Error Notification" supports three different
action qualifier of which 'adapter reset' is currently not enabled in
the sysfs interface. However, userspace tools might want to be able
to use the reset functionality as well. Enable the 'adapter reset'
qualifier.
Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
s390/hibernate: fix error handling when suspend cpu != resume cpu
The resume code checks if the resume cpu is the same as the suspend cpu.
If not, and if it is also not possible to switch to the suspend cpu, an
error message should be printed and the resume process should be stopped
by loading a disabled wait psw.
The current logic is broken in multiple ways, the message is never printed,
and the disabled wait psw never loaded because the kernel panics before that:
- sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break
on the first 64bit instruction in sclp_early_printk().
- The init stack should be used, but the stack pointer is not set up correctly
(missing aghi %r15,-STACK_FRAME_OVERHEAD).
- __sclp_early_printk() checks the sclp_init_state. If it is not
sclp_init_state_uninitialized, it simply returns w/o printing anything.
In the resumed kernel however, sclp_init_state will never be uninitialized.
This patch fixes those issues by removing the sam31/ESA logic, adding a
correct init stack pointer, and also introducing sclp_early_printk_force()
to allow using sclp_early_printk() even when sclp_init_state is not
uninitialized.
Merge tag 'mtd/fixes-for-4.19-rc5' of git://git.infradead.org/linux-mtd
Boris writes:
"- Fixes a bug in the ->read/write_reg() implementation of the m25p80
driver
- Make sure of_node_get/put() calls are balanced in the partition
parsing code
- Fix a race in the denali NAND controller driver
- Fix false positive WARN_ON() in the marvell NAND controller driver"
* tag 'mtd/fixes-for-4.19-rc5' of git://git.infradead.org/linux-mtd:
mtd: devices: m25p80: Make sure the buffer passed in op is DMA-able
mtd: partitions: fix unbalanced of_node_get/put()
mtd: rawnand: denali: fix a race condition when DMA is kicked
mtd: rawnand: marvell: prevent harmless warnings
Merge tag 'sound-4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Takashi writes:
"sound fixes for 4.19-rc5
here comes a collection of various fixes, mostly for stable-tree
or regression fixes.
Two relatively high LOCs are about the (rather simple) conversion of
uapi integer types in topology API, and a regression fix about HDMI
hotplug notification on AMD HD-audio. The rest are all small
individual fixes like ASoC Intel Skylake race condition, minor
uninitialized page leak in emu10k1 ioctl, Firewire audio error paths,
and so on."
* tag 'sound-4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits)
ALSA: fireworks: fix memory leak of response buffer at error path
ALSA: oxfw: fix memory leak of discovered stream formats at error path
ALSA: oxfw: fix memory leak for model-dependent data at error path
ALSA: bebob: fix memory leak for M-Audio FW1814 and ProjectMix I/O at error path
ALSA: hda - Enable runtime PM only for discrete GPU
ALSA: oxfw: fix memory leak of private data
ALSA: firewire-tascam: fix memory leak of private data
ALSA: firewire-digi00x: fix memory leak of private data
sound: don't call skl_init_chip() to reset intel skl soc
sound: enable interrupt after dma buffer initialization
Revert "ASoC: Intel: Skylake: Acquire irq after RIRB allocation"
ALSA: emu10k1: fix possible info leak to userspace on SNDRV_EMU10K1_IOCTL_INFO
ASoC: cs4265: fix MMTLR Data switch control
ASoC: AMD: Ensure reset bit is cleared before configuring
ALSA: fireface: fix memory leak in ff400_switch_fetching_mode()
ALSA: bebob: use address returned by kmalloc() instead of kernel stack for streaming DMA mapping
ASoC: rsnd: don't fallback to PIO mode when -EPROBE_DEFER
ASoC: rsnd: adg: care clock-frequency size
ASoC: uniphier: change status to orphan
ASoC: rsnd: fixup not to call clk_get/set under non-atomic
...
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Crypto stuff from Herbert:
"This push fixes a potential boot hang in ccp and an incorrect
CPU capability check in aegis/morus on x86."
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/aegis,morus - Do not require OSXSAVE for SSE2
crypto: ccp - add timeout support in the SEV command
Merge tag 'trace-v4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Steven writes:
"Vaibhav Nagarnaik found that modifying the ring buffer size could cause
a huge latency in the system because it does a while loop to free pages
without releasing the CPU (on non preempt kernels). In a case where there
are hundreds of thousands of pages to free it could actually cause a system
stall. A properly place cond_resched() solves this issue."
Merge tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86
Darren writes:
"platform-drivers-x86 for v4.19-2
Free allocated ACPI buffers in two drivers.
The following is an automated git shortlog grouped by driver:
alienware-wmi:
- Correct a memory leak
dell-smbios-wmi:
- Correct a memory leak"
* tag 'platform-drivers-x86-v4.19-2' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: alienware-wmi: Correct a memory leak
platform/x86: dell-smbios-wmi: Correct a memory leak
Boris Brezillon [Mon, 17 Sep 2018 14:31:30 +0000 (16:31 +0200)]
mtd: devices: m25p80: Make sure the buffer passed in op is DMA-able
As documented in spi-mem.h, spi_mem_op->data.buf.{in,out} must be
DMA-able, and commit 4120f8d158ef ("mtd: spi-nor: Use the spi_mem_xx()
API") failed to follow this rule as buffers passed to
->{read,write}_reg() are usually placed on the stack.
Fix that by allocating a scratch buffer and copying the data around.
Fixes: 4120f8d158ef ("mtd: spi-nor: Use the spi_mem_xx() API") Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
1) OOB data generation fix in bluetooth, from Matias Karhumaa.
2) BPF BTF boundary calculation fix, from Martin KaFai Lau.
3) Don't bug on excessive frags, to be compatible in situations mixing
older and newer kernels on each end. From Juergen Gross.
4) Scheduling in RCU fix in hv_netvsc, from Stephen Hemminger.
5) Zero keying information in TLS layer before freeing copies
of them, from Sabrina Dubroca.
6) Fix NULL deref in act_sample, from Davide Caratti.
7) Orphan SKB before GRO in veth to prevent crashes with XDP,
from Toshiaki Makita.
8) Fix use after free in ip6_xmit, from Eric Dumazet.
9) Fix VF mac address regression in bnxt_en, from Micahel Chan.
10) Fix MSG_PEEK behavior in TLS layer, from Daniel Borkmann.
11) Programming adjustments to r8169 which fix not being to enter deep
sleep states on some machines, from Kai-Heng Feng and Hans de
Goede.
12) Fix DST_NOCOUNT flag handling for ipv6 routes, from Peter
Oskolkov."
* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (45 commits)
net/ipv6: do not copy dst flags on rt init
qmi_wwan: set DTR for modems in forced USB2 mode
clk: x86: Stop marking clocks as CLK_IS_CRITICAL
r8169: Get and enable optional ether_clk clock
clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail
r8169: enable ASPM on RTL8106E
r8169: Align ASPM/CLKREQ setting function with vendor driver
Revert "kcm: remove any offset before parsing messages"
kcm: remove any offset before parsing messages
net: ethernet: Fix a unused function warning.
net: dsa: mv88e6xxx: Fix ATU Miss Violation
tls: fix currently broken MSG_PEEK behavior
hv_netvsc: pair VF based on serial number
PCI: hv: support reporting serial number as slot information
bnxt_en: Fix VF mac address regression.
ipv6: fix possible use-after-free in ip6_xmit()
net: hp100: fix always-true check for link up state
ARM: dts: at91: add new compatibility string for macb on sama5d3
net: macb: disable scatter-gather for macb on sama5d3
net: mvpp2: let phylink manage the carrier state
...
Peter Oskolkov [Mon, 17 Sep 2018 17:20:53 +0000 (10:20 -0700)]
net/ipv6: do not copy dst flags on rt init
DST_NOCOUNT in dst_entry::flags tracks whether the entry counts
toward route cache size (net->ipv6.sysctl.ip6_rt_max_size).
If the flag is NOT set, dst_ops::pcpuc_entries counter is incremented
in dist_init() and decremented in dst_destroy().
This flag is tied to allocation/deallocation of dst_entry and
should not be copied from another dst/route. Otherwise it can happen
that dst_ops::pcpuc_entries counter grows until no new routes can
be allocated because the counter reached ip6_rt_max_size due to
DST_NOCOUNT not set and thus no counter decrements on gc-ed routes.
Fixes: 3b6761d18bc1 ("net/ipv6: Move dst flags to booleans in fib entries") Cc: David Ahern <dsahern@gmail.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Recent firmware revisions have added the ability to force
these modems to USB2 mode, hiding their SuperSpeed
capabilities from the host. The driver has been using the
SuperSpeed capability, as shown by the bcdUSB field of the
device descriptor, to detect the need to enable the DTR
quirk. This method fails when the modems are forced to
USB2 mode by the modem firmware.
Fix by unconditionally enabling the DTR quirk for the
affected device IDs.
Reported-by: Fred Veldini <fred.veldini@gmail.com> Reported-by: Deshu Wen <dwen@sierrawireless.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Reported-by: Fred Veldini <fred.veldini@gmail.com> Reported-by: Deshu Wen <dwen@sierrawireless.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Sep 2018 01:47:58 +0000 (18:47 -0700)]
Merge branch 'r8169-clk-fixes'
Hans de Goede says:
====================
r8169 (x86) clk fixes to fix S0ix not being reached
This series adds code to the r8169 ethernet driver to get and enable an
external clock if present, avoiding the need for a hack in the
clk-pmc-atom driver where that clock was left on continuesly causing x86
some devices to not reach deep power saving states (S0ix) when suspended
causing to them to quickly drain their battery while suspended.
The 3 commits in this series need to be merged in order to avoid
regressions while bisecting. The clk-pmc-atom driver does not see much
changes (it was last touched over a year ago). So the clk maintainers
have agreed with merging all 3 patches through the net tree.
All 3 patches have Stephen Boyd's Acked-by for this purpose.
This v2 of the series only had some minor tweaks done to the commit
messages and is ready for merging through the net tree now.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hans de Goede [Wed, 12 Sep 2018 09:34:56 +0000 (11:34 +0200)]
clk: x86: Stop marking clocks as CLK_IS_CRITICAL
Commit d31fd43c0f9a ("clk: x86: Do not gate clocks enabled by the
firmware"), which added the code to mark clocks as CLK_IS_CRITICAL, causes
all unclaimed PMC clocks on Cherry Trail devices to be on all the time,
resulting on the device not being able to reach S0i3 when suspended.
The reason for this commit is that on some Bay Trail / Cherry Trail devices
the r8169 ethernet controller uses pmc_plt_clk_4. Now that the clk-pmc-atom
driver exports an "ether_clk" alias for pmc_plt_clk_4 and the r8169 driver
has been modified to get and enable this clock (if present) the marking of
the clocks as CLK_IS_CRITICAL is no longer necessary.
This commit removes the CLK_IS_CRITICAL marking, fixing Cherry Trail
devices not being able to reach S0i3 greatly decreasing their battery
drain when suspended.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach <js@sig21.net> Cc: Carlo Caione <carlo@endlessm.com> Reported-by: Johannes Stezenbach <js@sig21.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hans de Goede [Wed, 12 Sep 2018 09:34:55 +0000 (11:34 +0200)]
r8169: Get and enable optional ether_clk clock
On some boards a platform clock is used as clock for the r8169 chip,
this commit adds support for getting and enabling this clock (assuming
it has an "ether_clk" alias set on it).
This is related to commit d31fd43c0f9a ("clk: x86: Do not gate clocks
enabled by the firmware") which is a previous attempt to fix this for some
x86 boards, but this causes all Cherry Trail SoC using boards to not reach
there lowest power states when suspending.
This commit (together with an atom-pmc-clk driver commit adding the alias)
fixes things properly by making the r8169 get the clock and enable it when
it needs it.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach <js@sig21.net> Cc: Carlo Caione <carlo@endlessm.com> Reported-by: Johannes Stezenbach <js@sig21.net> Acked-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hans de Goede [Wed, 12 Sep 2018 09:34:54 +0000 (11:34 +0200)]
clk: x86: add "ether_clk" alias for Bay Trail / Cherry Trail
Commit d31fd43c0f9a ("clk: x86: Do not gate clocks enabled by the
firmware") causes all unclaimed PMC clocks on Cherry Trail devices to be on
all the time, resulting on the device not being able to reach S0i2 or S0i3
when suspended.
The reason for this commit is that on some Bay Trail / Cherry Trail devices
the ethernet controller uses pmc_plt_clk_4. This commit adds an "ether_clk"
alias, so that the relevant ethernet drivers can try to (optionally) use
this, without needing X86 specific code / hacks, thus fixing ethernet on
these devices without breaking S0i3 support.
This commit uses clkdev_hw_create() to create the alias, mirroring the code
for the already existing "mclk" alias for pmc_plt_clk_3.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=193891#c102 Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=196861 Cc: Johannes Stezenbach <js@sig21.net> Cc: Carlo Caione <carlo@endlessm.com> Reported-by: Johannes Stezenbach <js@sig21.net> Acked-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
r8169: Align ASPM/CLKREQ setting function with vendor driver
There's a small delay after setting ASPM in vendor drivers, r8101 and
r8168.
In addition, those drivers enable ASPM before ClkReq, also change that
to align with vendor driver.
I haven't seen anything bad becasue of this, but I think it's better to
keep in sync with vendor driver.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The current code assumes kcm users know they need to look for the
strparser offset within their bpf program, which is not documented
anywhere and examples laying around do not do.
The actual recv function does handle the offset well, so we can create a
temporary clone of the skb and pull that one up as required for parsing.
The pull itself has a cost if we are pulling beyond the head data,
measured to 2-3% latency in a noisy VM with a local client stressing
that path. The clone's impact seemed too small to measure.
This bug can be exhibited easily by implementing a "trivial" kcm parser
taking the first bytes as size, and on the client sending at least two
such packets in a single write().
Note that bpf sockmap has the same problem, both for parse and for recv,
so it would pulling twice or a real pull within the strparser logic if
anyone cares about that.
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ring-buffer: Allow for rescheduling when removing pages
When reducing ring buffer size, pages are removed by scheduling a work
item on each CPU for the corresponding CPU ring buffer. After the pages
are removed from ring buffer linked list, the pages are free()d in a
tight loop. The loop does not give up CPU until all pages are removed.
In a worst case behavior, when lot of pages are to be freed, it can
cause system stall.
After the pages are removed from the list, the free() can happen while
the work is rescheduled. Call cond_resched() in the loop to prevent the
system hangup.
Link: http://lkml.kernel.org/r/20180907223129.71994-1-vnagarnaik@google.com Cc: stable@vger.kernel.org Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Reported-by: Jason Behmer <jbehmer@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Merge tag 'spi-fix-v4.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Mark writes:
"spi: Fixes for v4.19
As well as one driver fix there's a couple of fixes here which address
issues with the use of IDRs for allocation of dynamic bus numbers,
ensuring that dynamic bus numbers interact well with static bus numbers
assigned via DT and otherwise."
* tag 'spi-fix-v4.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-fsl-dspi: fix broken DSPI_EOQ_MODE
spi: Fix double IDR allocation with DT aliases
spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers
Merge tag 'asoc-v4.19-rc4' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v4.19
This is the usual set of small fixes scatterd around various drivers,
plus one fix for DAPM and a UAPI build fix. There's not a huge amount
that stands out here relative to anything else.
drivers/net/ethernet/microchip/lan743x_main.c:2964:12: warning: \91lan743x_pm_suspend\92 defined but not used [-Wunused-function]
static int lan743x_pm_suspend(struct device *dev)
drivers/net/ethernet/microchip/lan743x_main.c:2987:12: warning: \91lan743x_pm_resume\92 defined but not used [-Wunused-function]
static int lan743x_pm_resume(struct device *dev)
Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Fri, 14 Sep 2018 21:46:12 +0000 (23:46 +0200)]
net: dsa: mv88e6xxx: Fix ATU Miss Violation
Fix a cut/paste error and a typo which results in ATU miss violations
not being reported.
Fixes: 0977644c5005 ("net: dsa: mv88e6xxx: Decode ATU problem interrupt") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
As can be seen from strace, there are two TLS records sent,
i) 'test_read_peek' and ii) '_mult_recs\0' where we end up
peeking 'test_read_peektest_read_peektest'. This is clearly
wrong, and what happens is that given peek cannot call into
tls_sw_advance_skb() to unpause strparser and proceed with
the next skb, we end up looping over the current one, copying
the 'test_read_peek' over and over into the user provided
buffer.
Here, we can only peek into the currently held skb (current,
full TLS record) as otherwise we would end up having to hold
all the original skb(s) (depending on the peek depth) in a
separate queue when unpausing strparser to process next
records, minimally intrusive is to return only up to the
current record's size (which likely was what c46234ebb4d1
("tls: RX path for ktls") originally intended as well). Thus,
after patch we properly peek the first record:
Fixes: c46234ebb4d1 ("tls: RX path for ktls") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
hv_netvsc: associate VF and PV device by serial number
The Hyper-V implementation of PCI controller has concept of 32 bit serial number
(not to be confused with PCI-E serial number). This value is sent in the protocol
from the host to indicate SR-IOV VF device is attached to a synthetic NIC.
Using the serial number (instead of MAC address) to associate the two devices
avoids lots of potential problems when there are duplicate MAC addresses from
tunnels or layered devices.
The patch set is broken into two parts, one is for the PCI controller
and the other is for the netvsc device. Normally, these go through different
trees but sending them together here for better review. The PCI changes
were submitted previously, but the main review comment was "why do you
need this?". This is why.
v2 - slot name can be shorter.
remove locking when creating pci_slots; see comment for explaination
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matching network device based on MAC address is problematic
since a non VF network device can be creted with a duplicate MAC
address causing confusion and problems. The VMBus API does provide
a serial number that is a better matching method.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
PCI: hv: support reporting serial number as slot information
The Hyper-V host API for PCI provides a unique "serial number" which
can be used as basis for sysfs PCI slot table. This can be useful
for cases where userspace wants to find the PCI device based on
serial number.
When an SR-IOV NIC is added, the host sends an attach message
with serial number. The kernel doesn't use the serial number, but
it is useful when doing the same thing in a userspace driver such
as the DPDK. By having /sys/bus/pci/slots/N it provides a direct
way to find the matching PCI device.
There maybe some cases where serial number is not unique such
as when using GPU's. But the PCI slot infrastructure will handle
that.
This has a side effect which may also be useful. The common udev
network device naming policy uses the slot information (rather
than PCI address).
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 14 Sep 2018 19:41:29 +0000 (15:41 -0400)]
bnxt_en: Fix VF mac address regression.
The recent commit to always forward the VF MAC address to the PF for
approval may not work if the PF driver or the firmware is older. This
will cause the VF driver to fail during probe:
bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): hwrm req_type 0xf seq id 0x5 error 0xffff
bnxt_en 0000:00:03.0 (unnamed net_device) (uninitialized): VF MAC address 00:00:17:02:05:d0 not approved by the PF
bnxt_en 0000:00:03.0: Unable to initialize mac address.
bnxt_en: probe of 0000:00:03.0 failed with error -99
We fix it by treating the error as fatal only if the VF MAC address is
locally generated by the VF.
Fixes: 707e7e966026 ("bnxt_en: Always forward VF MAC address to the PF.") Reported-by: Seth Forshee <seth.forshee@canonical.com> Reported-by: Siwei Liu <loseweigh@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 14 Sep 2018 19:02:31 +0000 (12:02 -0700)]
ipv6: fix possible use-after-free in ip6_xmit()
In the unlikely case ip6_xmit() has to call skb_realloc_headroom(),
we need to call skb_set_owner_w() before consuming original skb,
otherwise we risk a use-after-free.
Bring IPv6 in line with what we do in IPv4 to fix this.
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 14 Sep 2018 16:39:53 +0000 (17:39 +0100)]
net: hp100: fix always-true check for link up state
The operation ~(p100_inb(VG_LAN_CFG_1) & HP100_LINK_UP) returns a value
that is always non-zero and hence the wait for the link to drop always
terminates prematurely. Fix this by using a logical not operator instead
of a bitwise complement. This issue has been in the driver since
pre-2.6.12-rc2.
Detected by CoverityScan, CID#114157 ("Logical vs. bitwise operator")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 14 Sep 2018 15:48:11 +0000 (17:48 +0200)]
ARM: dts: at91: add new compatibility string for macb on sama5d3
We need this new compatibility string as we experienced different behavior
for this 10/100Mbits/s macb interface on this particular SoC.
Backward compatibility is preserved as we keep the alternative strings.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Ferre [Fri, 14 Sep 2018 15:48:10 +0000 (17:48 +0200)]
net: macb: disable scatter-gather for macb on sama5d3
Create a new configuration for the sama5d3-macb new compatibility string.
This configuration disables scatter-gather because we experienced lock down
of the macb interface of this particular SoC under very high load.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 14 Sep 2018 14:56:35 +0000 (16:56 +0200)]
net: mvpp2: let phylink manage the carrier state
Net drivers using phylink shouldn't mess with the link carrier
themselves and should let phylink manage it. The mvpp2 driver wasn't
following this best practice as the mac_config() function made calls to
change the link carrier state. This led to wrongly reported carrier link
state which then triggered other issues. This patch fixes this
behaviour.
But the PPv2 driver relied on this misbehaviour in two cases: for fixed
links and when not using phylink (ACPI mode). The later was fixed by
adding an explicit call to link_up(), which when the ACPI mode will use
phylink should be removed.
The fixed link case was relying on the mac_config() function to set the
link up, as we found an issue in phylink_start() which assumes the
carrier is off. If not, the link_up() function is never called. To fix
this, a call to netif_carrier_off() is added just before phylink_start()
so that we do not introduce a regression in the driver.
Fixes: 4bb043262878 ("net: mvpp2: phylink support") Reported-by: Russell King <linux@armlinux.org.uk> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
pppoe_rcv() needs to look back at the Ethernet header in order to
lookup the PPPoE session. Therefore we need to ensure that the mac
header is big enough to contain an Ethernet header. Otherwise
eth_hdr(skb)->h_source might access invalid data.
Fixes: 224cf5ad14c0 ("ppp: Move the PPP drivers") Reported-by: syzbot+f5f6080811c849739212@syzkaller.appspotmail.com Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch mades TI_DAVINCI_CPDMA select GENERIC_ALLOCATOR.
without that, the following sparc64 build failure happen
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_check_free_tx_desc':
(.text+0x278): undefined reference to `gen_pool_avail'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_chan_submit':
(.text+0x340): undefined reference to `gen_pool_alloc'
(.text+0x5c4): undefined reference to `gen_pool_free'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `__cpdma_chan_free':
davinci_cpdma.c:(.text+0x64c): undefined reference to `gen_pool_free'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_desc_pool_destroy.isra.6':
davinci_cpdma.c:(.text+0x17ac): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x17b8): undefined reference to `gen_pool_avail'
davinci_cpdma.c:(.text+0x1824): undefined reference to `gen_pool_size'
davinci_cpdma.c:(.text+0x1830): undefined reference to `gen_pool_avail'
drivers/net/ethernet/ti/davinci_cpdma.o: In function `cpdma_ctlr_create':
(.text+0x19f8): undefined reference to `devm_gen_pool_create'
(.text+0x1a90): undefined reference to `gen_pool_add_virt'
Makefile:1011: recipe for target 'vmlinux' failed
Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
While at first mtd_part_of_parse() would just call
of_get_chil_by_name(), it has been patched to deal with sub-partitions
and will now directly manipulate the node returned by mtd_get_of_node()
if the MTD device is a partition.
A of_node_put() was a bit below in the code, to balance the
of_get_child_by_name(). However, despite its name, mtd_get_of_node()
does not take a reference on the OF node. It is a simple helper hiding
some pointer logic to retrieve the OF node related to an MTD
device.
The direct effect of such unbalanced reference counting is visible by
rmmod'ing any module that would have added MTD partitions:
OF: ERROR: Bad of_node_put() on <of_path_to_partition>
As it seems normal to get a reference on the OF node during the
of_property_for_each_string() that follows, add a call to
of_node_get() when relevant.
Fixes: 76a832254ab0 ("mtd: partitions: use DT info for parsing partitions with "compatible" prop") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
hwmon: (nct6775) Use different register to get fan RPM for fan7
The documented register to retrieve the fan RPM for fan7 is found
to be unreliable at least with NCT6796D revision 3. Let's use
register 0x4ce instead. This is undocumented for NCT6796D, but
documented for NCT6797D and NCT6798D and known to be working.
Reported-by: Robert Kern <ulteq@web.de> Cc: Robert Kern <ulteq@web.de> Fixes: 81820059a428 ("hwmon: (nct6775) Add support for NCT6796D") Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Ted writes:
Various ext4 bug fixes; primarily making ext4 more robust against
maliciously crafted file systems, and some DAX fixes.
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4, dax: set ext4_dax_aops for dax files
ext4, dax: add ext4_bmap to ext4_dax_aops
ext4: don't mark mmp buffer head dirty
ext4: show test_dummy_encryption mount option in /proc/mounts
ext4: close race between direct IO and ext4_break_layouts()
ext4: fix online resizing for bigalloc file systems with a 1k block size
ext4: fix online resize's handling of a too-small final block group
ext4: recalucate superblock checksum after updating free blocks/inodes
ext4: avoid arithemetic overflow that can trigger a BUG
ext4: avoid divide by zero fault when deleting corrupted inline directories
ext4: check to make sure the rename(2)'s destination is not freed
ext4: add nonstring annotations to ext4.h
Merge tag 'linux-kselftest-4.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pulled kselftest fixes from Shuah:
"This Kselftest fixes update for 4.9-rc5 consists of:
-- fixes to build failures
-- fixes to add missing config files to increase test coverage
-- fixes to cgroup test and a new cgroup test for memory.oom.group"
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix end boundary calculation in BTF for the type section, from Martin.
2) Fix and revert subtraction of pointers that was accidentally allowed
for unprivileged programs, from Alexei.
3) Fix bpf_msg_pull_data() helper by using __GFP_COMP in order to avoid
a warning in linearizing sg pages into a single one for large allocs,
from Tushar.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to avoid this, orphan the skb before entering GRO.
Fixes: 948d4f214fde ("veth: Add driver XDP") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Tested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
udp: add missing check on edumx rx path
The early demux RX path for the UDP protocol is currently missing
some checks. Both ipv4 and ipv6 implementations lack checksum conversion
and the ipv6 implementation additionally lack the zero checksum
validation.
The first patch takes care of UDPv4 and the second one of UDPv6
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 13 Sep 2018 14:27:21 +0000 (16:27 +0200)]
udp6: add missing checks on edumux packet processing
Currently the UDPv6 early demux rx code path lacks some mandatory
checks, already implemented into the normal RX code path - namely
the checksum conversion and no_check6_rx check.
Similar to the previous commit, we move the common processing to
an UDPv6 specific helper and call it from both edemux code path
and normal code path. In respect to the UDPv4, we need to add an
explicit check for non zero csum according to no_check6_rx value.
Reported-by: Jianlin Shi <jishi@redhat.com> Suggested-by: Xin Long <lucien.xin@gmail.com> Fixes: c9f2c1ae123a ("udp6: fix socket leak on early demux") Fixes: 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 13 Sep 2018 14:27:20 +0000 (16:27 +0200)]
udp4: fix IP_CMSG_CHECKSUM for connected sockets
commit 2abb7cdc0dc8 ("udp: Add support for doing checksum
unnecessary conversion") left out the early demux path for
connected sockets. As a result IP_CMSG_CHECKSUM gives wrong
values for such socket when GRO is not enabled/available.
This change addresses the issue by moving the csum conversion to a
common helper and using such helper in both the default and the
early demux rx path.
Fixes: 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jongsung Kim [Thu, 13 Sep 2018 09:32:21 +0000 (18:32 +0900)]
stmmac: fix valid numbers of unicast filter entries
Synopsys DWC Ethernet MAC can be configured to have 1..32, 64, or
128 unicast filter entries. (Table 7-8 MAC Address Registers from
databook) Fix dwmac1000_validate_ucast_entries() to accept values
between 1 and 32 in addition.
Signed-off-by: Jongsung Kim <neidhard.kim@lge.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The Code of Conflict is not achieving its implicit goal of fostering
civility and the spirit of 'be excellent to each other'. Explicit
guidelines have demonstrated success in other projects and other areas
of the kernel.
Here is a Code of Conduct statement for the wider kernel. It is based
on the Contributor Covenant as described at www.contributor-covenant.org
From this point forward, we should abide by these rules in order to help
make the kernel community a welcoming environment to participate in.
Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Olof Johansson <olof@lxom.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sync syscall to DAX file needs to flush processor cache, but it
currently does not flush to existing DAX files. This is because
'ext4_da_aops' is set to address_space_operations of existing DAX
files, instead of 'ext4_dax_aops', since S_DAX flag is set after
ext4_set_aops() in the open path.
New file
--------
lookup_open
ext4_create
__ext4_new_inode
ext4_set_inode_flags // Set S_DAX flag
ext4_set_aops // Set aops to ext4_dax_aops
Existing file
-------------
lookup_open
ext4_lookup
ext4_iget
ext4_set_aops // Set aops to ext4_da_aops
ext4_set_inode_flags // Set S_DAX flag
Change ext4_iget() to initialize i_flags before ext4_set_aops().
Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
Ext4 mount path calls .bmap to the journal inode. This currently
works for the DAX mount case because ext4_iget() always set
'ext4_da_aops' to any regular files.
In preparation to fix ext4_iget() to set 'ext4_dax_aops' for ext4
DAX files, add ext4_bmap() to 'ext4_dax_aops', since bmap works for
DAX inodes.
Fixes: 5f0663bb4a64 ("ext4, dax: introduce ext4_dax_aops") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Suggested-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org
hwmon: (nct6775) Fix RPM output for fan7 on NCT6796D
fan7 on NCT6796D does not have a fan count register; it only has an RPM
register. Switch to using RPM registers to read the fan speed for all
chips supporting it to solve the problem for good.
Reported-by: Robert Kern <ulteq@web.de> Cc: Robert Kern <ulteq@web.de> Fixes: 81820059a428 ("hwmon: (nct6775) Add support for NCT6796D") Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Li Dongyang [Sat, 15 Sep 2018 21:11:25 +0000 (17:11 -0400)]
ext4: don't mark mmp buffer head dirty
Marking mmp bh dirty before writing it will make writeback
pick up mmp block later and submit a write, we don't want the
duplicate write as kmmpd thread should have full control of
reading and writing the mmp block.
Another reason is we will also have random I/O error on
the writeback request when blk integrity is enabled, because
kmmpd could modify the content of the mmp block(e.g. setting
new seq and time) while the mmp block is under I/O requested
by writeback.
Signed-off-by: Li Dongyang <dongyangli@ddn.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: stable@vger.kernel.org
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingol Molnar:
"Misc fixes:
- EFI crash fix
- Xen PV fixes
- do not allow PTI on 2-level 32-bit kernels for now
- documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/APM: Fix build warning when PROC_FS is not enabled
Revert "x86/mm/legacy: Populate the user page-table with user pgd's"
x86/efi: Load fixmap GDT in efi_call_phys_epilog() before setting %cr3
x86/xen: Disable CPU0 hotplug for Xen PV
x86/EISA: Don't probe EISA bus for Xen PV guests
x86/doc: Fix Documentation/x86/earlyprintk.txt
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Misc fixes: various scheduler metrics corner case fixes, a
sched_features deadlock fix, and a topology fix for certain NUMA
systems"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix kernel-doc notation warning
sched/fair: Fix load_balance redo for !imbalance
sched/fair: Fix scale_rt_capacity() for SMT
sched/fair: Fix vruntime_normalized() for remote non-migration wakeup
sched/pelt: Fix update_blocked_averages() for RT and DL classes
sched/topology: Set correct NUMA topology type
sched/debug: Fix potential deadlock when writing to sched_features
hwmon: (nct6775) Fix virtual temperature sources for NCT6796D
The following kernel log message is reported for the nct6775 driver
on ASUS WS X299 SAGE.
nct6775: Found NCT6796D or compatible chip at 0x2e:0x290
nct6775 nct6775.656: Invalid temperature source 11 at index 0,
source register 0x100, temp register 0x73
nct6775 nct6775.656: Invalid temperature source 11 at index 2,
source register 0x300, temp register 0x77
nct6775 nct6775.656: Invalid temperature source 11 at index 3,
source register 0x800, temp register 0x79
nct6775 nct6775.656: Invalid temperature source 11 at index 4,
source register 0x900, temp register 0x7b
A recent version of the datasheet lists temperature source 11 as reserved.
However, an older version of the datasheet lists temperature sources 10
and 11 as supported virtual temperature sources. Apparently the older
version of the datasheet is correct, so list those temperature sources
as supported.
Virtual temperature sources are different than other temperature sources:
Values are not read from a temperature sensor, but written either from
BIOS or an embedded controller. As such, each virtual temperature has to
be reported. Since there is now more than one temperature source, we have
to keep virtual temperature sources in a chip-specific mask and can no
longer rely on the assumption that there is only one virtual temperature
source with a fixed index. This accounts for most of the complexity of this
patch.
Reported-by: Robert Kern <ulteq@web.de> Cc: Robert Kern <ulteq@web.de> Fixes: 81820059a428 ("hwmon: (nct6775) Add support for NCT6796D") Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Randy Dunlap [Fri, 14 Sep 2018 22:10:29 +0000 (15:10 -0700)]
x86/APM: Fix build warning when PROC_FS is not enabled
Fix build warning in apm_32.c when CONFIG_PROC_FS is not enabled:
../arch/x86/kernel/apm_32.c:1643:12: warning: 'proc_apm_show' defined but not used [-Wunused-function]
static int proc_apm_show(struct seq_file *m, void *v)
Fixes: 3f3942aca6da ("proc: introduce proc_create_single{,_data}") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jiri Kosina <jikos@kernel.org> Link: https://lkml.kernel.org/r/be39ac12-44c2-4715-247f-4dcc3c525b8b@infradead.org
mtd: rawnand: denali: fix a race condition when DMA is kicked
I thought the read-back of the DMA_ENABLE register was unnecessary
(at least it is working on my boards), then deleted it in commit 586a2c52909d ("mtd: nand: denali: squash denali_enable_dma() helper
into caller"). Sorry, I was wrong - it caused a timing issue on
Cyclone5 SoCFPGAs.
Revive the register read-back, commenting why this is necessary.
Merge tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Fixes for four CIFS/SMB3 potential pointer overflow issues, one minor
build fix, and a build warning cleanup"
* tag '4.19-rc3-smb3-cifs' of git://git.samba.org/sfrench/cifs-2.6:
cifs: read overflow in is_valid_oplock_break()
cifs: integer overflow in in SMB2_ioctl()
CIFS: fix wrapping bugs in num_entries()
cifs: prevent integer overflow in nxt_dir_entry()
fs/cifs: require sha512
fs/cifs: suppress a string overflow warning
Merge tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client bugfixes from Anna Schumaker:
"These are a handful of fixes for problems that Trond found. Patch #1
and #3 have the same name, a second issue was found after applying the
first patch.
Stable bugfixes:
- v4.17+: Fix tracepoint Oops in initiate_file_draining()
- v4.11+: Fix an infinite loop on I/O
Other fixes:
- Return errors if a waiting layoutget is killed
- Don't open code clearing of delegation state"
* tag 'nfs-for-4.19-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
NFS: Don't open code clearing of delegation state
NFSv4.1 fix infinite loop on I/O.
NFSv4: Fix a tracepoint Oops in initiate_file_draining()
pNFS: Ensure we return the error if someone kills a waiting layoutget
NFSv4: Fix a tracepoint Oops in initiate_file_draining()
Merge tag 'trace-v4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"This fixes an issue with the build system caused by a change that
modifies CC_FLAGS_FTRACE. The issue is that it breaks the dependencies
and causes "make targz-pkg" to rebuild the entire world"
* tag 'trace-v4.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/Makefile: Fix handling redefinition of CC_FLAGS_FTRACE
Merge tag 'devicetree-fixes-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull DeviceTree fix from Rob Herring:
"One regression for a 20 year old PowerMac:
- Fix a regression on systems having a DT without any phandles which
happens on a PowerMac G3"
* tag 'devicetree-fixes-for-4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: fix phandle cache creation for DTs with no phandles
Merge tag 'for-linus-4.19c-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"This contains some minor cleanups and fixes:
- a new knob for controlling scrubbing of pages returned by the Xen
balloon driver to the Xen hypervisor to address a boot performance
issue seen in large guests booted pre-ballooned
- a fix of a regression in the gntdev driver which made it impossible
to use fully virtualized guests (HVM guests) with a 4.19 based dom0
- a fix in Xen cpu hotplug functionality which could be triggered by
wrong admin commands (setting number of active vcpus to 0)
One further note: the patches have all been under test for several
days in another branch. This branch has been rebased in order to avoid
merge conflicts"
* tag 'for-linus-4.19c-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/gntdev: fix up blockable calls to mn_invl_range_start
xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage
xen: avoid crash in disable_hotplug_cpu
xen/balloon: add runtime control for scrubbing ballooned out pages
xen/manage: don't complain about an empty value in control/sysrq node
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The trickle of arm64 fixes continues to come in.
Nothing that's the end of the world, but we've got a fix for PCI IO
port accesses, an accidental naked "asm goto" and a fix to the
vmcoreinfo PT_NOTE merged this time around which we'd like to get
sorted before it becomes ABI.
- Fix ioport_map() mapping the wrong physical address for some I/O
BARs
- Remove direct use of "asm goto", since some compilers don't like
that
- Ensure kimage_voffset is always present in vmcoreinfo PT_NOTE"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
asm-generic: io: Fix ioport_map() for !CONFIG_GENERIC_IOMAP && CONFIG_INDIRECT_PIO
arm64: kernel: arch_crash_save_vmcoreinfo() should depend on CONFIG_CRASH_CORE
arm64: jump_label.h: use asm_volatile_goto macro instead of "asm goto"
The previous fix broke recovery of delegated stateids because it assumes
that if we did not mark the delegation as suspect, then the delegation has
effectively been revoked, and so it removes that delegation irrespectively
of whether or not it is valid and still in use. While this is "mostly
harmless" for ordinary I/O, we've seen pNFS fail with LAYOUTGET spinning
in an infinite loop while complaining that we're using an invalid stateid
(in this case the all-zero stateid).
What we rather want to do here is ensure that the delegation is always
correctly marked as needing testing when that is the case. So we want
to close the loophole offered by nfs4_schedule_stateid_recovery(),
which marks the state as needing to be reclaimed, but not the
delegation that may be backing it.
Merge tag 'dmaengine-fix-4.19-rc4' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fix from Vinod Koul:
"Fix the mic_x100_dma driver to use devm_kzalloc for driver memory, so
that it is freed properly when it unregisters from dmaengine using
managed API"
* tag 'dmaengine-fix-4.19-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: mic_x100_dma: use devm_kzalloc to fix an issue
Merge tag 'usb-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of small USB driver fixes for -rc4.
The usual suspects of gadget, xhci, and dwc2/3 are in here, along with
some reverts of reported problem changes, and a number of build
documentation warning fixes. Full details are in the shortlog.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits)
Revert "cdc-acm: implement put_char() and flush_chars()"
usb: Change usb_of_get_companion_dev() place to usb/common
usb: xhci: fix interrupt transfer error happened on MTK platforms
usb: cdc-wdm: Fix a sleep-in-atomic-context bug in service_outstanding_interrupt()
usb: misc: uss720: Fix two sleep-in-atomic-context bugs
usb: host: u132-hcd: Fix a sleep-in-atomic-context bug in u132_get_frame()
usb: Avoid use-after-free by flushing endpoints early in usb_set_interface()
linux/mod_devicetable.h: fix kernel-doc missing notation for typec_device_id
usb/typec: fix kernel-doc notation warning for typec_match_altmode
usb: Don't die twice if PCI xhci host is not responding in resume
usb: mtu3: fix error of xhci port id when enable U3 dual role
usb: uas: add support for more quirk flags
USB: Add quirk to support DJI CineSSD
usb: typec: fix kernel-doc parameter warning
usb/dwc3/gadget: fix kernel-doc parameter warning
USB: yurex: Check for truncation in yurex_read()
USB: yurex: Fix buffer over-read in yurex_write()
usb: host: xhci-plat: Iterate over parent nodes for finding quirks
xhci: Fix use after free for URB cancellation on a reallocated endpoint
USB: add quirk for WORLDE Controller KS49 or Prodipe MIDI 49C USB controller
...
tcf_sample_act() tried to update its per-cpu stats, but tcf_sample_init()
forgot to allocate them, because tcf_idr_create() was called with a wrong
value of 'cpustats'. Setting it to true proved to fix the reported crash.
Reported-by: Matteo Croce <mcroce@redhat.com> Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR") Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action") Tested-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'staging-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO driver fixes from Greg KH:
"Here are a few small staging and iio driver fixes for -rc4.
Nothing major, just a few small bugfixes for some reported issues, and
a MAINTAINERS file update for the fbtft drivers.
We also re-enable the building of the erofs filesystem as the XArray
patches that were causing it to break never got merged in the -rc1
cycle, so there's no reason it can't be turned back on for now. The
problem that was previously there is now being handled in the Xarray
tree at the moment, so it will not hit us again in the future.
All of these patches have been in linux-next with no reported issues"
* tag 'staging-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: vboxvideo: Change address of scanout buffer on page-flip
staging: vboxvideo: Fix IRQs no longer working
staging: gasket: TODO: re-implement using UIO
staging/fbtft: Update TODO and mailing lists
staging: erofs: rename superblock flags (MS_xyz -> SB_xyz)
iio: imu: st_lsm6dsx: take into account ts samples in wm configuration
Revert "iio: temperature: maxim_thermocouple: add MAX31856 part"
Revert "staging: erofs: disable compiling temporarile"
MAINTAINERS: Switch a maintainer for drivers/staging/gasket
staging: wilc1000: revert "fix TODO to compile spi and sdio components in single module"
Merge tag 'char-misc-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a small handful of char/misc driver fixes for 4.19-rc4.
All of them are simple, resolving reported problems in a few drivers.
Full details are in the shortlog.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-4.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
firmware: Fix security issue with request_firmware_into_buf()
vmbus: don't return values for uninitalized channels
fpga: dfl: fme: fix return value check in in pr_mgmt_init()
misc: hmc6352: fix potential Spectre v1
Tools: hv: Fix a bug in the key delete code
misc: ibmvsm: Fix wrong assignment of return code
android: binder: fix the race mmap and alloc_new_buf_locked
mei: bus: need to unlink client before freeing
mei: bus: fix hw module get/put balance
mei: fix use-after-free in mei_cl_write
mei: ignore not found client in the enumeration
It turned out that this patch is not sufficient to enable PTI on 32 bit
systems with legacy 2-level page-tables. In this paging mode the huge-page
PTEs are in the top-level page-table directory, where also the mirroring to
the user-space page-table happens. So every huge PTE exits twice, in the
kernel and in the user page-table.
That means that accessed/dirty bits need to be fetched from two PTEs in
this mode to be safe, but this is not trivial to implement because it needs
changes to generic code just for the sake of enabling PTI with 32-bit
legacy paging. As all systems that need PTI should support PAE anyway,
remove support for PTI when 32-bit legacy paging is used.
Fixes: 7757d607c6b3 ('x86/pti: Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32') Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: hpa@zytor.com Cc: linux-mm@kvack.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Link: https://lkml.kernel.org/r/1536922754-31379-1-git-send-email-joro@8bytes.org
Michal Hocko [Tue, 4 Sep 2018 23:21:39 +0000 (09:21 +1000)]
xen/gntdev: fix up blockable calls to mn_invl_range_start
Patch series "mmu_notifiers follow ups".
Tetsuo has noticed some fallouts from 93065ac753e4 ("mm, oom: distinguish
blockable mode for mmu notifiers"). One of them has been fixed and picked
up by AMD/DRM maintainer [1]. XEN issue is fixed by patch 1. I have also
clarified expectations about blockable semantic of invalidate_range_end.
Finally the last patch removes MMU_INVALIDATE_DOES_NOT_BLOCK which is no
longer used nor needed.
93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") has
introduced blockable parameter to all mmu_notifiers and the notifier has
to back off when called in !blockable case and it could block down the
road.
The above commit implemented that for mn_invl_range_start but both
in_range checks are done unconditionally regardless of the blockable mode
and as such they would fail all the time for regular calls. Fix this by
checking blockable parameter as well.
Once we are there we can remove the stale TODO. The lock has to be
sleepable because we wait for completion down in gnttab_unmap_refs_sync.
Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>