It's not safe to call io_cqring_overflow_flush() for IOPOLL mode without
hodling uring_lock, because it does synchronisation differently. Make
sure we have it.
As for io_ring_exit_work(), we don't even need it there because
io_ring_ctx_wait_and_kill() already set force flag making all overflowed
requests to be dropped.
Cc: <stable@vger.kernel.org> # 5.5+ Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The argv_split() function must be paired with argv_free(), else we must
keep a reference to the argv array received or do the freeing ourselves,
in synthesize_sdt_probe_command() we were simply leaking that argv[]
array.
Fixes: 3b1f8311f6963cd1 ("perf probe: Add sdt probes arguments into the uprobe cmd string") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: He Zhe <zhe.he@windriver.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201224135139.GF477817@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit f77ac2e378be9dd6 ("ARM: 9030/1: entry: omit FP emulation for UND
exceptions taken in kernel mode") failed to take into account that there
is in fact a case where we relied on this code path: during boot, the
VFP detection code issues a read of FPSID, which will trigger an undef
exception on cores that lack VFP support.
So let's reinstate this logic using an undef hook which is registered
only for the duration of the initcall to vpf_init(), and which sets
VFP_arch to a non-zero value - as before - if no VFP support is present.
Fixes: f77ac2e378be9dd6 ("ARM: 9030/1: entry: omit FP emulation for UND ...") Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
WARNING: modpost: vmlinux.o(.text.unlikely+0x2d98): Section mismatch in reference from the function init_big_cores.isra.0() to the function .init.text:init_thread_group_cache_map()
The function init_big_cores.isra.0() references
the function __init init_thread_group_cache_map().
This is often because init_big_cores.isra.0 lacks a __init
annotation or the annotation of init_thread_group_cache_map is wrong.
Fixes: 425752c63b6f ("powerpc: Detect the presence of big-cores via "ibm, thread-groups"") Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201221074154.403779-1-clg@kaod.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The lkp robot reported that some configs fail to build, for example
mpc85xx_smp_defconfig, with:
cc1: fatal error: opening output file arch/powerpc/boot/dts/fsl/.mpc8540ads.dtb.dts.tmp: No such file or directory
This bisects to: cc8a51ca6f05 ("kbuild: always create directories of targets")
Although that commit claims to be about in-tree builds, it somehow
breaks out-of-tree builds. But presumably it's just exposing a latent
bug in our Makefiles.
We can fix it by adding to targets for dts/fsl in the same way that we
do for dts.
Fixes: cc8a51ca6f05 ("kbuild: always create directories of targets") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201215032906.473460-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
$(error-if,...) is expanded to an empty string. Currently, it relies on
eval_clause() returning xstrdup("") when all attempts for expansion fail,
but the correct implementation is to make do_error_if() return xstrdup("").
Fixes: 1d6272e6fe43 ("kconfig: add 'info', 'warning-if', and 'error-if' built-in functions") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 45c940184b501fc6 ("dt-bindings: clk: versaclock5: convert to
yaml") accidentally changed "idt,voltage-microvolts" to
"idt,voltage-microvolt" in the DT bindings, while the driver still used
the former.
Update the driver to match the bindings, as
Documentation/devicetree/bindings/property-units.txt actually recommends
using "microvolt".
Fixes: 260249f929e81d3d ("clk: vc5: Enable addition output configurations of the Versaclock") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20201218125253.3815567-1-geert+renesas@glider.be Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Two clock divider tables are missing sentinel at the end. Effect of that
is that clock framework reads past the last entry. Fix that with adding
sentinel at the end.
Issue was discovered with KASan.
Fixes: 0577e4853bfb ("clk: sunxi-ng: Add H3 clocks") Fixes: c6a0637460c2 ("clk: sunxi-ng: Add A64 clocks") Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Link: https://lore.kernel.org/r/20201202203817.438713-1-jernej.skrabec@siol.net Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Some resource should be released in the error handling path of the probe
function, as already done in the remove function.
The remove function was fixed in commit bf416bd45738 ("clk: s2mps11: Add
missing of_node_put and of_clk_del_provider")
Fixes: 7cc560dea415 ("clk: s2mps11: Add support for s2mps11") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201212122818.86195-1-christophe.jaillet@wanadoo.fr Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
pmc_data_allocate() has been changed. pmc_data_free() was removed.
Adapt the code taking this into consideration. With this the programmable
clocks were also saved in sama7g5_pmc so that they could be later
referenced.
Fixes: cb783bbbcf54 ("clk: at91: sama7g5: add clock support for sama7g5") Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Tested-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/r/1605800597-16720-2-git-send-email-claudiu.beznea@microchip.com Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This patch series is a followup based on the suggestions and feedback by
Linus:
https://lkml.kernel.org/r/CAHk-=wizk=OxUyQPbO8MS41w2Pag1kniUV5WdD5qWL-gq1kjDA@mail.gmail.com
The first patch in the series is a fix for the epoll race in presence of
timeouts, so that it can be cleanly backported to all affected stable
kernels.
The rest of the patch series simplify the ep_poll() implementation. Some
of these simplifications result in minor performance enhancements as well.
We have kept these changes under self tests and internal benchmarks for a
few days, and there are minor (1-2%) performance enhancements as a result.
This patch (of 8):
After abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2)
timeout"), we break out of the ep_poll loop upon timeout, without checking
whether there is any new events available. Prior to that patch-series we
always called ep_events_available() after exiting the loop.
This can cause races and missed wakeups. For example, consider the
following scenario reported by Guantao Liu:
Suppose we have an eventfd added using EPOLLET to an epollfd.
Thread 1: Sleeps for just below 5ms and then writes to an eventfd.
Thread 2: Calls epoll_wait with a timeout of 5 ms. If it sees an
event of the eventfd, it will write back on that fd.
Thread 3: Calls epoll_wait with a negative timeout.
Prior to abc610e01c66, it is guaranteed that Thread 3 will wake up either
by Thread 1 or Thread 2. After abc610e01c66, Thread 3 can be blocked
indefinitely if Thread 2 sees a timeout right before the write to the
eventfd by Thread 1. Thread 2 will be woken up from
schedule_hrtimeout_range and, with evail 0, it will not call
ep_send_events().
To fix this issue:
1) Simplify the timed_out case as suggested by Linus.
2) while holding the lock, recheck whether the thread was woken up
after its time out has reached.
Note that (2) is different from Linus' original suggestion: It do not set
"eavail = ep_events_available(ep)" to avoid unnecessary contention (when
there are too many timed-out threads and a small number of events), as
well as races mentioned in the discussion thread.
This is the first patch in the series so that the backport to stable
releases is straightforward.
Link: https://lkml.kernel.org/r/20201106231635.3528496-1-soheil.kdev@gmail.com Link: https://lkml.kernel.org/r/CAHk-=wizk=OxUyQPbO8MS41w2Pag1kniUV5WdD5qWL-gq1kjDA@mail.gmail.com Link: https://lkml.kernel.org/r/20201106231635.3528496-2-soheil.kdev@gmail.com Fixes: abc610e01c66 ("fs/epoll: avoid barrier after an epoll_wait(2) timeout") Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Guantao Liu <guantaol@google.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Guantao Liu <guantaol@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Khazhismel Kumykov <khazhy@google.com> Reviewed-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 25b98b64e284 ("vhost scsi: alloc cmds per vq instead of session") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Link: https://lore.kernel.org/r/1607071411-33484-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The "vq" struct is added to the "vdev->vqs" list prematurely. If we
encounter an error later in the function then the "vq" is freed, but
since it is still on the list that could lead to a use after free bug.
Fixes: cbeedb72b97a ("virtio_ring: allocate desc state for split ring separately") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGaG/zkI3jk8mk@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Set a negative error code intead of returning success if the MTU has
been changed to something invalid.
Fixes: fe36cbe0671e ("virtio_net: clear MTU when out of range") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGVJSeeCdII1Ys@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is a copy and paste bug in the error handling of this code and
it uses "ring_dma_addr" three times instead of "device_event_dma_addr"
and "driver_event_dma_addr".
Fixes: 1ce9e6055fa0 (" virtio_ring: introduce packed ring support") Reported-by: Robert Buhren <robert.buhren@sect.tu-berlin.de> Reported-by: Felicitas Hetzelt <file@sect.tu-berlin.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X8pGRJlEzyn+04u2@mwanda Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Make sure to put dma write memory barrier after updating CQ consumer
index so the hardware knows that there are available CQE slots in the
queue.
Failure to do this can cause the update of the RX doorbell record to get
updated before the CQ consumer index resulting in CQ overrun.
Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Eli Cohen <elic@nvidia.com> Link: https://lore.kernel.org/r/20201209140004.15892-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The indirect block cleanup may cause control messages to be sent
if offloaded flows are present. However, by the time the flower app
cleanup callback is called txbufs are no longer available and attempts
to send control messages result in a NULL-pointer dereference in
nfp_ctrl_tx_one().
This problem may be resolved by moving the indirect block cleanup
to the stop callback, where txbufs are still available.
As suggested by Jakub Kicinski and Louis Peens.
Fixes: a1db217861f3 ("net: flow_offload: fix flow_indr_dev_unregister path") Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Louis Peens <louis.peens@netronome.com> Link: https://lore.kernel.org/r/20201216145701.30005-1-simon.horman@netronome.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Return -EINVAL if we can't find the correct device. Currently it
returns success.
Fixes: 13159183ec7a ("qlcnic: 83xx base driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/X9nHbMqEyI/xPfGd@mwanda Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When using 'perf record's option '-I' or '--user-regs=' along with
argument '?' to list available register names, memory of variable 'os'
allocated by strdup() needs to be released before __parse_regs()
returns, otherwise memory leak will occur.
Fixes: bcc84ec65ad1 ("perf record: Add ability to name registers to record") Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20200703093344.189450-1-zhengzengkai@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We're missing -lcap in test-all.bin target, so in case it's the only
library missing (if more are missing test-all.bin fails anyway), we will
falsely claim that we detected it and fail build, like:
$ make
...
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... libbfd: [ on ]
... libbfd-buildid: [ on ]
... libcap: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... libzstd: [ on ]
... disassembler-four-args: [ on ]
...
CC builtin-ftrace.o
In file included from builtin-ftrace.c:29:
util/cap.h:11:10: fatal error: sys/capability.h: No such file or directory
11 | #include <sys/capability.h>
| ^~~~~~~~~~~~~~~~~~
compilation terminated.
Fixes: 74d5f3d06f707eb5 ("tools build: Add capability-related feature detection") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201203230836.3751981-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
io_uring_cancel_files() cancels all request that match files regardless
of task. There is no real need in that, cancel only requests of the
specified task. That also handles SQPOLL case as it already changes task
to it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit d3817a647059 ("pwm: sun4i: Remove redundant needs_delay") changed
the logic of an else branch so that the PWM_EN and PWM_CLK_GATING bits
are now cleared if the PWM is to be disabled, whereas previously the
condition was always false, and hence the branch never got executed.
This code is reported causing backlight issues on boards based on the
Allwinner A20 SoC. Fix this by removing the else branch, which restores
the behaviour prior to the offending commit.
Note that the PWM_EN and PWM_CLK_GATING bits still get cleared later in
sun4i_pwm_apply() if the PWM is to be disabled.
The second parameter of do_div is an u32 and NSEC_PER_SEC * prescale
overflows this for bigger periods. Assuming the usual pwm input clk rate
of 66 MHz this happens starting at requested period > 606060 ns.
Splitting the division into two operations doesn't loose any precision.
It doesn't need to be feared that c / NSEC_PER_SEC doesn't fit into the
unsigned long variable "duty_cycles" because in this case the assignment
above to period_cycles would already have been overflowing as
period >= duty_cycle and then the calculation is moot anyhow.
Fixes: aef1a3799b5c ("pwm: imx27: Fix rounding behavior") Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Tested-by: Johannes Pointner <johannes.pointner@br-automation.com> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When there are other PWM controllers enabled along with pwm-lp3943,
pwm-lp3942 is failing to probe with -EEXIST error. This is because
other PWM controllers are probed first and assigned PWM base 0 and
pwm-lp3943 is requesting for 0 again.
In order to avoid this, assign the chip base with -1, so that it is
dynamically allocated.
If clk_register fails, we should goto free branch
before function returns to prevent memleak.
Fixes: 163152cbbe321 ("clk: ti: Add support for FAPLL on dm816x") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Link: https://lore.kernel.org/r/20201113131623.2098222-1-zhangqilong3@huawei.com Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
clang produces a build failure in configurations without COMMON_CLK
when a timeout calculation goes wrong:
arm-linux-gnueabi-ld: drivers/watchdog/coh901327_wdt.o: in function `coh901327_enable':
coh901327_wdt.c:(.text+0x50): undefined reference to `__bad_udelay'
Add a Kconfig dependency to only do build testing when COMMON_CLK
is enabled.
Fixes: da2a68b3eb47 ("watchdog: Enable COMPILE_TEST where possible") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201203223358.1269372-1-arnd@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The use of msleep() in the restart handler will cause scheduler to
induce a context switch which is not desirable. This generates below
warning on SDX55 when WDT is the only available restart source:
[ 39.800188] reboot: Restarting system
[ 39.804115] ------------[ cut here ]------------
[ 39.807855] WARNING: CPU: 0 PID: 678 at kernel/rcu/tree_plugin.h:297 rcu_note_context_switch+0x190/0x764
[ 39.812538] Modules linked in:
[ 39.821954] CPU: 0 PID: 678 Comm: reboot Not tainted 5.10.0-rc1-00063-g33a9990d1d66-dirty #47
[ 39.824854] Hardware name: Generic DT based system
[ 39.833470] [<c0310fbc>] (unwind_backtrace) from [<c030c544>] (show_stack+0x10/0x14)
[ 39.838154] [<c030c544>] (show_stack) from [<c0c218f0>] (dump_stack+0x8c/0xa0)
[ 39.846049] [<c0c218f0>] (dump_stack) from [<c0322f80>] (__warn+0xd8/0xf0)
[ 39.853058] [<c0322f80>] (__warn) from [<c0c1dc08>] (warn_slowpath_fmt+0x64/0xc8)
[ 39.859925] [<c0c1dc08>] (warn_slowpath_fmt) from [<c038b6f4>] (rcu_note_context_switch+0x190/0x764)
[ 39.867503] [<c038b6f4>] (rcu_note_context_switch) from [<c0c2aa3c>] (__schedule+0x84/0x640)
[ 39.876685] [<c0c2aa3c>] (__schedule) from [<c0c2b050>] (schedule+0x58/0x10c)
[ 39.885095] [<c0c2b050>] (schedule) from [<c0c2eed0>] (schedule_timeout+0x1e8/0x3d4)
[ 39.892135] [<c0c2eed0>] (schedule_timeout) from [<c039ad40>] (msleep+0x2c/0x38)
[ 39.899947] [<c039ad40>] (msleep) from [<c0a59d0c>] (qcom_wdt_restart+0xc4/0xcc)
[ 39.907319] [<c0a59d0c>] (qcom_wdt_restart) from [<c0a58290>] (watchdog_restart_notifier+0x18/0x28)
[ 39.914715] [<c0a58290>] (watchdog_restart_notifier) from [<c03468e0>] (atomic_notifier_call_chain+0x60/0x84)
[ 39.923487] [<c03468e0>] (atomic_notifier_call_chain) from [<c030ae64>] (machine_restart+0x78/0x7c)
[ 39.933551] [<c030ae64>] (machine_restart) from [<c0348048>] (__do_sys_reboot+0xdc/0x1e0)
[ 39.942397] [<c0348048>] (__do_sys_reboot) from [<c0300060>] (ret_fast_syscall+0x0/0x54)
[ 39.950721] Exception stack(0xc3e0bfa8 to 0xc3e0bff0)
[ 39.958855] bfa0: 0001221cbed2fe24fee1dead281219690123456700000000
[ 39.963832] bfc0: 0001221cbed2fe240000000300000058000225e0000000000000000000000000
[ 39.971985] bfe0: b6e62560bed2fc8400010fd8b6e62580
[ 39.980124] ---[ end trace 3f578288bad866e4 ]---
Hence, replace msleep() with mdelay() to fix this issue.
Fixes: 05e487d905ab ("watchdog: qcom: register a restart notifier") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201207060005.21293-1-manivannan.sadhasivam@linaro.org Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently pmac32_defconfig with SMP=y doesn't build:
arch/powerpc/platforms/powermac/smp.c:
error: implicit declaration of function 'cleanup_cpu_mmu_context'
It would be nice for consistency if all platforms clear mm_cpumask and
flush TLBs on unplug, but the TLB invalidation bug described in commit 01b0f0eae081 ("powerpc/64s: Trim offlined CPUs from mm_cpumasks") only
applies to 64s and for now we only have the TLB flush code for that
platform.
So just add an empty version for 32-bit Book3S.
Fixes: 01b0f0eae081 ("powerpc/64s: Trim offlined CPUs from mm_cpumasks") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Change log based on comments from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The BIT() macro is not available for the UAPI headers. Moreover, it can
be defined differently in user space headers. Thus, replace its usage
with the _BITUL() macro which is already used in other macro definitions
in <linux/devlink.h>.
Fixes: dc64cc7c6310 ("devlink: Add devlink reload limit option") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/r/20201215102531.16958-1-tklauser@distanz.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The ndo_start_xmit() method must not attempt to free the skb to transmit
when returning NETDEV_TX_BUSY. Therefore, make sure the
korina_send_packet() function returns NETDEV_TX_OK when it frees a packet.
Fixes: ef11291bcd5f ("Add support the Korina (IDT RC32434) Ethernet MAC") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20201214220952.19935-1-vincent.stehle@laposte.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kernel test robot triggerred the following warning,
>> drivers/block/rnbd/rnbd-clt.c:1397:42: warning: size argument in
'strlcpy' call appears to be size of the source; expected the size of the
destination [-Wstrlcpy-strlcat-size]
strlcpy(dev->pathname, pathname, strlen(pathname) + 1);
~~~~~~~^~~~~~~~~~~~~
To get rid of the above warning, use a kstrdup as Bart suggested.
Fixes: 64e8a6ece1a5 ("block/rnbd-clt: Dynamically alloc buffer for pathname & blk_symlink_name") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This patch fixes an error condition triggered when the code path which
transmits a S/G frame descriptor when the skb's headroom is not enough
for DPAA2's needs.
We are greated with a splat like the one below when a SGT structure is
recycled and that is because even though a dma_unmap is performed on the
Tx confirmation path, the unmap is not done with the proper size.
The dpaa2-eth driver uses an area of software annotations to transmit
necessary information from the Tx path to the Tx confirmation one. This
SWA structure has a different layout for each kind of frame that we are
dealing with: linear, S/G or XDP.
The commit referenced was incorrectly setting up the 'sgt_size' field
for the S/G type of SWA even though we are dealing with a linear skb
here.
Fixes: d70446ee1f40 ("dpaa2-eth: send a scatter-gather FD instead of realloc-ing") Reported-by: Daniel Thompson <daniel.thompson@linaro.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20201211171607.108034-1-ciorneiioana@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On the Rx side, the next_to_use index points to the next item in the
HW ring to be refilled/allocated, and next_to_clean points to the next
item to potentially be processed.
When the HW Rx ring is fully refilled, i.e. no packets has been
processed, the next_to_use will be next_to_clean - 1. When the ring is
fully processed next_to_clean will be equal to next_to_use. The latter
case is where a bug is triggered.
If the next_to_use bits are not cleared, and the "fully processed"
state is entered, a stale descriptor can be processed.
The skb-path correctly clear the status bit for the next_to_use
descriptor, but the AF_XDP zero-copy path did not do that.
This change adds the status bits clearing of the next_to_use
descriptor.
Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On the Rx side, the next_to_use index points to the next item in the
HW ring to be refilled/allocated, and next_to_clean points to the next
item to potentially be processed.
When the HW Rx ring is fully refilled, i.e. no packets has been
processed, the next_to_use will be next_to_clean - 1. When the ring is
fully processed next_to_clean will be equal to next_to_use. The latter
case is where a bug is triggered.
If the next_to_use bits are not cleared, and the "fully processed"
state is entered, a stale descriptor can be processed.
The skb-path correctly clear the status bit for the next_to_use
descriptor, but the AF_XDP zero-copy path did not do that.
This change adds the status bits clearing of the next_to_use
descriptor.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Even if there is more rx data waiting on the chip, the rx napi poll fn
will never run more than once - it will always read a few buffers, then
bail out and re-arm interrupts. Which results in ping-pong between napi
and interrupt.
This defeats the purpose of napi, and is bad for performance.
Fix by making the rx napi poll behave identically to other ethernet
drivers:
1. initialize rx napi polling with an arbitrary budget (64).
2. in the polling fn, return full weight if rx queue is not depleted,
this tells the napi core to "keep polling".
3. update the rx tail ("ring the doorbell") once for every 8 processed
rx ring buffers.
Thanks to Jakub Kicinski, Eric Dumazet and Andrew Lunn for their expert
opinions and suggestions.
Tested with 20 seconds of full bandwidth receive (iperf3):
rx irqs softirqs(NET_RX)
-----------------------------
before 23827 33620
after 129 4081
Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430 Fixes: 23f0703c125be ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Link: https://lore.kernel.org/r/20201215161954.5950-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The CALL_ON_STACK tests use the no_dat stack to switch to a different
stack for unwinding tests. If an interrupt or machine check happens
while using that stack, and previously being on the async stack, the
interrupt / machine check entry code (SWITCH_ASYNC) will assume that
the previous context did not use the async stack and happily use the
async stack again.
This will lead to stack corruption of the previous context.
To solve this disable both interrupts and machine checks before
switching to the no_dat stack.
Fixes: 7868249fbbc8 ("s390/test_unwind: add CALL_ON_STACK tests") Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is an unescaped left brace in a regex in OPEN_BRACE check. This
throws a runtime error when checkpatch is run with --fix flag and the
OPEN_BRACE check is executed.
Fix it by escaping the left brace.
Link: https://lkml.kernel.org/r/20201115202928.81955-1-dwaipayanray1@gmail.com Fixes: 8d1824780f2f ("checkpatch: add --fix option for a couple OPEN_BRACE misuses") Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On 2-node NUMA hosts we see bursts of kswapd reclaim and subsequent
pressure spikes and stalls from cache refaults while there is plenty of
free memory in the system.
Usually, kswapd is woken up when all eligible nodes in an allocation are
full. But the code related to watermark boosting can wake kswapd on one
full node while the other one is mostly empty. This may be justified to
fight fragmentation, but is currently unconditionally done whether
watermark boosting is occurring or not.
In our case, many of our workloads' throughput scales with available
memory, and pure utilization is a more tangible concern than trends
around longer-term fragmentation. As a result we generally disable
watermark boosting.
Wake kswapd only woken when watermark boosting is requested.
Link: https://lkml.kernel.org/r/20201020175833.397286-1-hannes@cmpxchg.org Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
madvise_inject_error() uses get_user_pages_fast to translate the address
we specified to a page. After [1], we drop the extra reference count for
memory_failure() path. That commit says that memory_failure wanted to
keep the pin in order to take the page out of circulation.
The truth is that we need to keep the page pinned, otherwise the page
might be re-used after the put_page() and we can end up messing with
someone else's memory.
E.g:
CPU0
process X CPU1
madvise_inject_error
get_user_pages
put_page
page gets reclaimed
process Y allocates the page
memory_failure
// We mess with process Y memory
madvise() is meant to operate on a self address space, so messing with
pages that do not belong to us seems the wrong thing to do.
To avoid that, let us keep the page pinned for memory_failure as well.
Pages for DAX mappings will release this extra refcount in
memory_failure_dev_pagemap.
[1] ("23e7b5c2e271: mm, madvise_inject_error:
Let memory_failure() optionally take a page reference")
Link: https://lkml.kernel.org/r/20201207094818.8518-1-osalvador@suse.de Fixes: 23e7b5c2e271 ("mm, madvise_inject_error: Let memory_failure() optionally take a page reference") Signed-off-by: Oscar Salvador <osalvador@suse.de> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The size of vm area can be affected by the presence or not of the guard
page. In particular when VM_NO_GUARD is present, the actual accessible
size has to be considered like the real size minus the guard page.
Currently kasan does not keep into account this information during the
poison operation and in particular tries to poison the guard page as well.
This approach, even if incorrect, does not cause an issue because the tags
for the guard page are written in the shadow memory. With the future
introduction of the Tag-Based KASAN, being the guard page inaccessible by
nature, the write tag operation on this page triggers a fault.
Fix kasan shadow poisoning size invoking get_vm_area_size() instead of
accessing directly the field in the data structure to detect the correct
value.
Link: https://lkml.kernel.org/r/20201027160213.32904-1-vincenzo.frascino@arm.com Fixes: d98c9e83b5e7c ("kasan: fix crashes on access to memory mapped by vm_map_ram()") Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This unlock sequence, though allowed, is not optimal. If a waiter is
present, mutex_unlock() will need to go through the slowpath of waking
up the waiter with preemption disabled. Fix that by releasing the
spinlock first before the mutex.
Link: https://lkml.kernel.org/r/20201213180843.16938-1-longman@redhat.com Fixes: e36176be1c39 ("mm/vmalloc: rework vmap_area_lock") Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The page has just been allocated, so its refcount is 1. free_unref_page()
is for use on pages which have a zero refcount. Use __free_page() like
the other implementations of pte_alloc_one().
Link: https://lkml.kernel.org/r/20201125034655.27687-1-willy@infradead.org Fixes: 1ae9ae5f7df7 ("sparc: handle pgtable_page_ctor() fail") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Since commit 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic
v2"), the code to check the secondary MMU's page table access bit is
broken for !(TTU_IGNORE_ACCESS) because the page is unmapped from the
secondary MMU's page table before the check. More specifically for those
secondary MMUs which unmap the memory in
mmu_notifier_invalidate_range_start() like kvm.
However memory reclaim is the only user of !(TTU_IGNORE_ACCESS) or the
absence of TTU_IGNORE_ACCESS and it explicitly performs the page table
access check before trying to unmap the page. So, at worst the reclaim
will miss accesses in a very short window if we remove page table access
check in unmapping code.
There is an unintented consequence of !(TTU_IGNORE_ACCESS) for the memcg
reclaim. From memcg reclaim the page_referenced() only account the
accesses from the processes which are in the same memcg of the target page
but the unmapping code is considering accesses from all the processes, so,
decreasing the effectiveness of memcg reclaim.
The simplest solution is to always assume TTU_IGNORE_ACCESS in unmapping
code.
Link: https://lkml.kernel.org/r/20201104231928.1494083-1-shakeelb@google.com Fixes: 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic v2") Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The rcu_read_lock/unlock only can guarantee that the memcg will not be
freed, but it cannot guarantee the success of css_get to memcg.
If the whole process of a cgroup offlining is completed between reading a
objcg->memcg pointer and bumping the css reference on another CPU, and
there are exactly 0 external references to this memory cgroup (how we get
to the obj_cgroup_charge() then?), css_get() can change the ref counter
from 0 back to 1.
Link: https://lkml.kernel.org/r/20201028035013.99711-2-songmuchun@bytedance.com Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Roman Gushchin <guro@fb.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Yafang Shao <laoar.shao@gmail.com> Cc: Chris Down <chris@chrisdown.name> Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
These functions accomplish the same thing but have different
implementations.
unpin_user_page() has a bug where it calls mod_node_page_state() after
calling put_page() which creates a risk that the page could have been
hot-uplugged from the system.
Fix this by using put_compound_head() as the only implementation.
__unpin_devmap_managed_user_page() and related can be deleted as well in
favour of the simpler, but slower, version in put_compound_head() that has
an extra atomic page_ref_sub, but always calls put_page() which internally
contains the special devmap code.
Move put_compound_head() to be directly after try_grab_compound_head() so
people can find it in future.
Link: https://lkml.kernel.org/r/0-v1-6730d4ee0d32+40e6-gup_combine_put_jgg@nvidia.com Fixes: 1970dc6f5226 ("mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> CC: Joao Martins <joao.m.martins@oracle.com> CC: Jonathan Corbet <corbet@lwn.net> CC: Dan Williams <dan.j.williams@intel.com> CC: Dave Chinner <david@fromorbit.com> CC: Christoph Hellwig <hch@infradead.org> CC: Jane Chu <jane.chu@oracle.com> CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> CC: Michal Hocko <mhocko@suse.com> CC: Mike Kravetz <mike.kravetz@oracle.com> CC: Shuah Khan <shuah@kernel.org> CC: Muchun Song <songmuchun@bytedance.com> CC: Vlastimil Babka <vbabka@suse.cz> CC: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
fork() for ptes") pages under a FOLL_PIN will not be write protected
during COW for fork. This means that pages returned from
pin_user_pages(FOLL_WRITE) should not become write protected while the pin
is active.
However, there is a small race where get_user_pages_fast(FOLL_PIN) can
establish a FOLL_PIN at the same time copy_present_page() is write
protecting it:
CPU 0 CPU 1
get_user_pages_fast()
internal_get_user_pages_fast()
copy_page_range()
pte_alloc_map_lock()
copy_present_page()
atomic_read(has_pinned) == 0
page_maybe_dma_pinned() == false
atomic_set(has_pinned, 1);
gup_pgd_range()
gup_pte_range()
pte_t pte = gup_get_pte(ptep)
pte_access_permitted(pte)
try_grab_compound_head()
pte = pte_wrprotect(pte)
set_pte_at();
pte_unmap_unlock()
// GUP now returns with a write protected page
The first attempt to resolve this by using the write protect caused
problems (and was missing a barrrier), see commit f3c64eda3e50 ("mm: avoid
early COW write protect games during fork()")
Instead wrap copy_p4d_range() with the write side of a seqcount and check
the read side around gup_pgd_range(). If there is a collision then
get_user_pages_fast() fails and falls back to slow GUP.
Slow GUP is safe against this race because copy_page_range() is only
called while holding the exclusive side of the mmap_lock on the src
mm_struct.
[akpm@linux-foundation.org: coding style fixes] Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Peter Xu <peterx@redhat.com> Acked-by: "Ahmed S. Darwish" <a.darwish@linutronix.de> [seqcount_t parts] Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kirill Shutemov <kirill@shutemov.name> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Patch series "Add a seqcount between gup_fast and copy_page_range()", v4.
As discussed and suggested by Linus use a seqcount to close the small race
between gup_fast and copy_page_range().
Ahmed confirms that raw_write_seqcount_begin() is the correct API to use
in this case and it doesn't trigger any lockdeps.
I was able to test it using two threads, one forking and the other using
ibv_reg_mr() to trigger GUP fast. Modifying copy_page_range() to sleep
made the window large enough to reliably hit to test the logic.
This patch (of 2):
The next patch in this series makes the lockless flow a little more
complex, so move the entire block into a new function and remove a level
of indention. Tidy a bit of cruft:
- addr is always the same as start, so use start
- Use the modern check_add_overflow() for computing end = start + len
- nr_pinned/pages << PAGE_SHIFT needs the LHS to be unsigned long to
avoid shift overflow, make the variables unsigned long to avoid coding
casts in both places. nr_pinned was missing its cast
- The handling of ret and nr_pinned can be streamlined a bit
Commit e1c92a7fbbc5 ("perf tests: Add another metric parsing test") add
another test for metric parsing. The test goes through all metrics
compiled for arch within pmu events and try to parse them.
Right now this test is failing in powerpc machine.
Result in power9 platform:
[command]# ./perf test 10
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
10.4: Parsing of PMU event table metrics with fake PMUs : FAILED!
Issue is we are passing different runtime parameter value in
"expr__find_other" and "expr__parse" function which is called from
function `metric_parse_fake`. And because of this parsing of hv-24x7
metrics is failing.
[command]# ./perf test 10 -vv
.....
hv_24x7/pm_mcs01_128b_rd_disp_port01,chip=1/ not found
expr__parse failed
test child finished with -1
---- end ----
PMU events subtest 4: FAILED!
This patch fix this issue and change runtime parameter value to '0' in
expr__parse function.
Result in power9 platform after this patch:
[command]# ./perf test 10
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
Fixes: e1c92a7fbbc5 ("perf tests: Add another metric parsing test") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/20201119152411.46041-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Threshold Event Counter Multiplier (TECM) is part of Monitor Mode
Control Register A (MMCRA). This field along with Threshold Event
Counter Exponent (TECE) is used to get threshould counter value.
In Power10, this is a 8bit field, so patch fixes the
current code to modify the MMCRA[TECM] extraction macro to
handle this change. ISA v3.1 says this is a 7 bit field but
POWER10 it's actually 8 bits which will hopefully be fixed
in ISA v3.1 update.
Fixes: 170a315f41c6 ("powerpc/perf: Support to export MMCRA[TEC*] field to userspace") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1608022578-1532-1-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Silence this by checking for -EPROBE_DEFER and using dev_err_probe() so
we set a deferred reason in case a dependency fails to probe (which
quickly happens on small config/DT changes due to the rather long probe
chain which can include bridges, phys, panels, backights, leds, etc.).
This also removes the only DRM_DEV_ERROR() usage, the rest of the driver
uses dev_err().
As part of the cma_dev release, that pointer will be set to NULL. In case
it happens in rdma_bind_addr() (part of an error flow), the next call to
addr_handler() will have a call to cma_acquire_dev_by_src_ip() which will
overwrite sgid_attr without releasing it.
Fixes: ff11c6cd521f ("RDMA/cma: Introduce and use cma_acquire_dev_by_src_ip()") Link: https://lore.kernel.org/r/20201213132940.345554-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fixes: 1769c4c57548 ("RDMA/mlx5: Always remove MRs from the cache before destroying them") Link: https://lore.kernel.org/r/20201213132940.345554-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When receiving pages data, return value 'ret' when positive includes
`buf->page_base`, so we should subtract that before it is used for
changing `offset` and comparing against `want`.
This was discovered on the very rare cases where the server returned a
chunk of bytes that when added to the already received amount of bytes
for the pages happened to match the current `recv.len`, for example
on this case:
buf->page_base : 258356
actually received from socket: 1740
ret : 260096
want : 260096
In this case neither of the two 'if ... goto out' trigger, and we
continue to tail parsing.
Worth to mention that the ensuing EMSGSIZE from the continued execution of
`xs_read_xdr_buf` may be observed by an application due to 4 superfluous
bytes being added to the pages data.
Fixes: 277e4ab7d530 ("SUNRPC: Simplify TCP receive code by switching to using iterators") Signed-off-by: Dan Aloni <dan@kernelim.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
xterm serial channel was leaking a fd used in setting up the
port helper
This bug is prehistoric - it predates switching to git. The "fixes"
header here is really just to mark all the versions we would like this to
apply to which is "Anything from the Cretaceous period onwards".
No dinosaurs were harmed in fixing this bug.
Fixes: b40997b872cd ("um: drivers/xterm.c: fix a file descriptor leak") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix a logical error in tty reading. We get 0 and errno == EAGAIN
on the first attempt to read from a closed file descriptor.
Compared to that a true EAGAIN is EAGAIN and -1.
If we check errno for EAGAIN first, before checking the return
value we miss the fact that the descriptor is closed.
This bug is as old as the driver. It was not showing up with
the original POLL based IRQ controller, because it was
producing multiple events. Switching to EPOLL unmasked it.
Fixes: ff6a17989c08 ("Epoll based IRQ controller") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Ensure that file closes, connection closes, etc are propagated
as interrupts in the interrupt controller.
Fixes: ff6a17989c08 ("Epoll based IRQ controller") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix to return PTR_ERR() error code from the error handling case where
ubifs_hash_get_desc() failed instead of 0 in ubifs_init_authentication(),
as done elsewhere in this function.
Fixes: 49525e5eecca5 ("ubifs: Add helper functions for authentication support") Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
A reboot notifier, which stops the WDT by calling the stop hook without
any check, would be registered when we set WDOG_STOP_ON_REBOOT flag.
Howerer we allow the WDT driver to omit the stop hook since commit
"d0684c8a93549" ("watchdog: Make stop function optional") and provide
a module parameter for user that controls the WDOG_STOP_ON_REBOOT flag
in commit 9232c80659e94 ("watchdog: Add stop_on_reboot parameter to
control reboot policy"). Together that commits make user potential to
insert a watchdog driver that don't provide a stop hook but with the
stop_on_reboot parameter set, then dereferencing of null pointer occurs
on system reboot.
Check the stop hook before registering the reboot notifier to fix the
issue.
Fixes: d0684c8a9354 ("watchdog: Make stop function optional") Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20201109130512.28121-1-wangwensheng4@huawei.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
As the specification described, users must check busy bit before start
a new loading operation to make sure that the previous loading is done
and the device is ready to accept a new one.
sprd_wdt_start() would return fail if the loading operation is not completed
in a certain time, disabling watchdog for that case would probably cause
the kernel crash when kick watchdog later, that's too bad, so remove the
watchdog disable operation for the fail case to make sure other parts in
the kernel can run normally.
The following kbuild warning is seen on a system without HAS_IOMEM.
WARNING: unmet direct dependencies detected for MFD_SYSCON
Depends on [n]: HAS_IOMEM [=n]
Selected by [y]:
- ARMADA_37XX_WATCHDOG [=y] && WATCHDOG [=y] && (ARCH_MVEBU || COMPILE_TEST
This results in a subsequent compile error.
drivers/watchdog/armada_37xx_wdt.o: in function `armada_37xx_wdt_probe':
armada_37xx_wdt.c:(.text+0xdc): undefined reference to `devm_ioremap'
Add the missing dependency.
Reported-by: Necip Fazil Yildiran <fazilyildiran@gmail.com> Fixes: 54e3d9b518c8 ("watchdog: Add support for Armada 37xx CPU watchdog") Link: https://lore.kernel.org/r/20201108162550.27660-1-linux@roeck-us.net Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We have a problem if we use gpio-keys and configure wakeups such that
we only want one edge to wake us up. AKA:
wakeup-event-action = <EV_ACT_DEASSERTED>;
wakeup-source;
Specifically we end up with a phantom interrupt that blocks suspend if
the line was already high and we want wakeups on rising edges (AKA we
want the GPIO to go low and then high again before we wake up). The
opposite is also problematic.
Specifically, here's what's happening today:
1. Normally, gpio-keys configures to look for both edges. Due to the
current workaround introduced in commit c3c0c2e18d94 ("pinctrl:
qcom: Handle broken/missing PDC dual edge IRQs on sc7180"), if the
line was high we'd configure for falling edges.
2. At suspend time, we change to look for rising edges.
3. After qcom_pdc_gic_set_type() runs, we get a phantom interrupt.
We can solve this by just clearing the phantom interrupt.
NOTE: it is possible that this could cause problems for a client with
very specific needs, but there's not much we can do with this
hardware. As an example, let's say the interrupt signal is currently
high and the client is looking for falling edges. The client now
changes to look for rising edges. The client could possibly expect
that if the line has a short pulse low (and back high) that it would
always be detected. Specifically no matter when the pulse happened,
it should either have tripped the (old) falling edge trigger or the
(new) rising edge trigger. We will simply not trip it. We could
narrow down the race a bit by polling our parent before changing
types, but no matter what we do there will still be a period of time
where we can't tell the difference between a real transition (or more
than one transition) and the phantom.
Fixes: f55c73aef890 ("irqchip/pdc: Add PDC interrupt controller for QCOM SoCs") Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Maulik Shah <mkshah@codeaurora.org> Reviewed-by: Maulik Shah <mkshah@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20201211141514.v4.1.I2702919afc253e2a451bebc3b701b462b2d22344@changeid Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently 6G specific tlvs have duplicate entries which is causing
scan failures. Fix this by removing the duplicate entries of the same
tlv. This also fixes out-of-bound memory writes caused due to
adding tlvs when num_hint_bssid and num_hint_s_ssid are ZEROs.
Fixes: 74601ecfef6e ("ath11k: Add support for 6g scan hint") Reported-by: Carl Huang <cjhuang@codeaurora.org> Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1607609124-17250-7-git-send-email-kvalo@codeaurora.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
irqchip shared with multiple gpiochips, leads to recursive call of
gpiochip_irq_mask/gpiochip_irq_unmask which was assigned to
rqchip->irq_mask/irqchip->irq_unmask, these happens becouse of
only irqchip->irq_enable == gpiochip_irq_enable is checked.
Let's add an additional check to make sure shared irqchip is detected
even if irqchip->irq_enable wasn't defined.
Fixes: a8173820f441 ("gpio: gpiolib: Allow GPIO IRQs to lazy disable") Signed-off-by: Nikita Shubin <nikita.shubin@maquefel.me> Link: https://lore.kernel.org/r/20201210070514.13238-1-nikita.shubin@maquefel.me Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The high 6 bits of traffic class in GRH is DSCP (Differentiated Services
Codepoint), the driver should shift it before the hardware gets it when
using RoCEv2.
Fixes: 606bf89e98ef ("RDMA/hns: Refactor for hns_roce_v2_modify_qp function") Fixes: fba429fcf9a5 ("RDMA/hns: Fix missing fields in address vector") Link: https://lore.kernel.org/r/1607650657-35992-4-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Whether to enable the these features should better depend on the enable
flags, not the value of related fields.
Fixes: 5c1f167af112 ("RDMA/hns: Init SRQ table for hip08") Fixes: 3cb2c996c9dc ("RDMA/hns: Add support for SCCC in size of 64 Bytes") Link: https://lore.kernel.org/r/1607650657-35992-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
For ib_copy_from_user(), the length of udata may not be the same as that
of cmd. For ib_copy_to_user(), the length of udata may not be the same as
that of resp. So limit the length to prevent out-of-bounds read and write
operations from ib_copy_from_user() and ib_copy_to_user().
Fixes: de77503a5940 ("RDMA/hns: RDMA/hns: Assign rq head pointer when enable rq record db") Fixes: 633fb4d9fdaa ("RDMA/hns: Use structs to describe the uABI instead of opencoding") Fixes: ae85bf92effc ("RDMA/hns: Optimize qp param setup flow") Fixes: 6fd610c5733d ("RDMA/hns: Support 0 hop addressing for SRQ buffer") Fixes: 9d9d4ff78884 ("RDMA/hns: Update the kernel header file of hns") Link: https://lore.kernel.org/r/1607650657-35992-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
According to different sections of the TRM, the hchan_cnt of CAP3 includes
the number of uchan in UDMA, thus the start offset of the normal channels
are hchan_cnt.
Fixes: daf4ad0499aa4 ("dmaengine: ti: k3-udma: Query throughput level information from hardware") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Link: https://lore.kernel.org/r/20201208090440.31792-2-peter.ujfalusi@ti.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
ti_sci_intr_irq_domain_free() assumes that out_irq of intr is stored in
data->chip_data and uses it for calling ti_sci irq_free() and then
mark the out_irq as available resource. But ti_sci_intr_irq_domain_alloc()
is storing p_hwirq(parent's hardware irq) which is translated from out_irq.
This is causing resource leakage and eventually out_irq resources might
be exhausted. Fix ti_sci_intr_irq_domain_alloc() by storing the out_irq
in data->chip_data.
Fixes: a5b659bd4bc7 ("irqchip/ti-sci-intr: Add support for INTR being a parent to INTR") Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201102120631.11165-1-lokeshvutla@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On a successful probe, the driver tries to print a success message with
INTA device id. It uses pdev->id for printing the id but id is stored in
inta->ti_sci_id. Fix it by correcting the dev_info parameter.
Fixes: 5c4b585d2910 ("irqchip/ti-sci-inta: Add support for INTA directly connecting to GIC") Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201102120614.11109-1-lokeshvutla@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The error handling frees "ctl" but it's still on the "dsp->ctl_list"
list so that could result in a use after free. Remove it from the list
before returning.
Fixes: 2323736dca72 ("ASoC: wm_adsp: Add basic support for rev 1 firmware file format") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/X9B0keV/02wrx9Xs@mwanda Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
It should be !is_multicast_ether_addr() in ieee80211_rx_h_sta_process()
for the rx_stats update, below commit remove the !, this patch is to
change it back.
It lead the rx rate "iw wlan0 station dump" become invalid for some
scenario when IEEE80211_HW_USES_RSS is set.
Fixes: 09a740ce352e ("mac80211: receive and process S1G beacons") Signed-off-by: Wen Gong <wgong@codeaurora.org> Link: https://lore.kernel.org/r/1607483189-3891-1-git-send-email-wgong@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When we set up a TDLS station, we set sta->sta.bandwidth solely based
on the capabilities, because the "what's the current bandwidth" check
is bypassed and only applied for other types of stations.
This leads to the unfortunate scenario that the sta->sta.bandwidth is
160 MHz if both stations support it, but we never actually configure
this bandwidth unless the AP is already using 160 MHz; even for wider
bandwidth support we only go up to 80 MHz (at least right now.)
For iwlwifi, this can also lead to firmware asserts, telling us that
we've configured the TX rates for a higher bandwidth than is actually
available due to the PHY configuration.
For non-TDLS, we check against the interface's requested bandwidth,
but we explicitly skip this check for TDLS to cope with the wider BW
case. Change this to
(a) still limit to the TDLS peer's own chandef, which gets factored
into the overall PHY configuration we request from the driver,
and
(b) limit it to when the TDLS peer is authorized, because it's only
factored into the channel context in this case.
Fixes: 504871e602d9 ("mac80211: fix bandwidth computation for TDLS peers") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20201206145305.fcc7d29c4590.I11f77e9e25ddf871a3c8d5604650c763e2c5887a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The bitreverse helper is almost always built into the kernel,
but in a rare randconfig build it is possible to hit a case
in which it is a loadable module while the atmel-i2c driver
is built-in:
arm-linux-gnueabi-ld: drivers/crypto/atmel-i2c.o: in function `atmel_i2c_checksum':
atmel-i2c.c:(.text+0xa0): undefined reference to `byte_rev_table'
Add one more 'select' statement to prevent this.
Fixes: 11105693fa05 ("crypto: atmel-ecc - introduce Microchip / Atmel ECC driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The platform device driver name is "max77693-muic", so advertise it
properly in the modalias string. This fixes automated module loading when
this driver is compiled as a module.
Fixes: db1b9037424b ("extcon: MAX77693: Add extcon-max77693 driver to support Maxim MAX77693 MUIC device") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If generic_drop_inode() returns true, it means iput_final() can evict
this inode regardless of whether it is dirty or not. If we check
I_DONTCACHE in generic_drop_inode(), any inode with this bit set will be
evicted unconditionally. This is not the desired behavior because
I_DONTCACHE only means the inode shouldn't be cached on the LRU list.
As for whether we need to evict this inode, this is what
generic_drop_inode() should do. This patch corrects the usage of
I_DONTCACHE.
Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer") Signed-off-by: Hao Li <lihao2018.fnst@cn.fujitsu.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix a possible hang in xdpsock that can occur when using multiple
threads. In this case, one or more of the threads might get stuck in
the while-loop in tx_only after the user has signaled the main thread
to stop execution. In this case, no more Tx packets will be sent, so a
thread might get stuck in the aforementioned while-loop. Fix this by
introducing a test inside the while-loop to check if the benchmark has
been terminated. If so, return from the function.
Fixes: cd9e72b6f210 ("samples/bpf: xdpsock: Add option to specify batch size") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20201210163407.22066-1-magnus.karlsson@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
To get better performance, current gpmi driver collected and chained all
small DMA transfers in gpmi_nfc_exec_op, the whole chain triggered and
wait for complete at the end.
But some random DMA timeout found in this new driver, with the help of
ftrace, we found the root cause is as follows:
Take gpmi_ecc_read_page() as an example, gpmi_nfc_exec_op collected 6
DMA transfers and the DMA chain triggered at the end. It waits for bch
completion and check jiffies if it's timeout. The typical function graph
shown below,
but it's not gurantee that bch irq handled always after dma irq handled,
sometimes bch_irq comes first and gpmi_nfc_exec_op won't wait anymore,
another gpmi_nfc_exec_op may get invoked before last DMA chain IRQ
handled, this messed up the next DMA chain and causes DMA timeout. Check
the trace log when issue happened.
In the first gpmi_nfc_exec_op, bch_irq comes first and gpmi_nfc_exec_op
exits, but DMA IRQ still not happened yet until the middle of following
gpmi_nfc_exec_op, the first DMA transfer index get messed and DMA get
timeout.
To fix the issue, when there is bch ops in DMA chain, the
gpmi_nfc_exec_op should wait for both completions rather than bch
completion only.
Fixes: ef347c0cfd61 ("mtd: rawnand: gpmi: Implement exec_op") Signed-off-by: Han Xu <han.xu@nxp.com> Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201209035104.22679-3-han.xu@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
pm_runtime_get_sync() will increment pm usage at first and it
will resume the device later. If runtime of the device has
error or device is in inaccessible state(or other error state),
resume operation will fail. If we do not call put operation to
decrease the reference, it will result in reference leak in
the two functions(gpmi_init and gpmi_nfc_exec_op). Moreover,
this device cannot enter the idle state and always stay busy or
other non-idle state later. So we fixed it through adding
pm_runtime_put_noidle.
Fixes: 5bc6bb603b4d0 ("mtd: rawnand: gpmi: Fix suspend/resume problem") Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Acked-by: Han Xu <han.xu@nxp.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20201107110552.1568742-1-zhangqilong3@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The periph_clks[] array contains duplicated entry for Security Engine
clock which was meant to be defined for T210, but it wasn't added
properly. This patch corrects the T210 SE entry and fixes the following
error message on T114/T124: "Tegra clk 127: register failed with -17".
Fixes: dc37fec48314 ("clk: tegra: periph: Add new periph clks and muxes for Tegra210")
Tested-by Nicolas Chauvet <kwizart@gmail.com>
Reported-by Nicolas Chauvet <kwizart@gmail.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/20201025224212.7790-1-digetx@gmail.com Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>