a893ea15d764 ("tpm: move tpm_chip definition to include/linux/tpm.h")
introduced a build error when both IMA and EFI are enabled:
In file included from ../security/integrity/ima/ima_fs.c:30:
../security/integrity/ima/ima.h:176:7: error: redeclaration of enumerator "NONE"
What happens is that both headers (ima.h and efi.h) defines the same
'NONE' constant, and it broke when they started getting included from
the same file:
Rework to prefix the EFI enum with 'EFI_*'.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20190215165551.12220-2-ard.biesheuvel@linaro.org
[ Cleaned up the changelog a bit. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
cgroup_rstat_cpu_pop_updated() is used to traverse the updated cgroups
on flush. While it was only visiting updated ones in the subtree, it
was visiting @root unconditionally. We can easily check whether @root
is updated or not by looking at its ->updated_next just as with the
cgroups in the subtree.
* Remove the unnecessary cgroup_parent() test. The system root cgroup
is never updated and thus its ->updated_next is always NULL. No
need to test whether cgroup_parent() exists in addition to
->updated_next.
* Terminate traverse if ->updated_next is NULL. This can only happen
for subtree @root and there's no reason to visit it if it's not
marked updated.
This reduces cpu consumption when reading a lot of rstat backed files.
In a micro benchmark reading stat from ~1600 cgroups, the sys time was
lowered by >40%.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
When performing a warm reset in ishtp bus driver, the ishtp_cl_device
will not be removed, its fw_client still points to the already freed
ishtp_device.fw_clients array.
Later after driver finishing ishtp client enumeration, this dangling
pointer may cause driver to bind the wrong ishtp_cl_device to the new
client, causing wrong callback to be called for messages intended for
the new client.
This helps in development of firmware where frequent switching of
firmwares is required without Linux reboot.
Signed-off-by: Hong Liu <hong.liu@intel.com> Tested-by: Hongyan Song <hongyan.song@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The preadv2 and pwritev2 syscalls are supposed to emulate the readv and
writev syscalls when offset == -1. Therefore the compat code should
check for offset before calling do_compat_preadv64 and
do_compat_pwritev64. This is the case for the preadv2 and pwritev2
syscalls, but handling of offset == -1 is missing in their 64-bit
equivalent.
This patch fixes that, calling do_compat_readv and do_compat_writev when
offset == -1. This fixes the following glibc tests on x32:
- misc/tst-preadvwritev2
- misc/tst-preadvwritev64v2
Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: H.J. Lu <hjl.tools@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
If there are exported DMA buffers which are still in use and
grant device is closed by either normal user-space close or by
a signal this leads to the grant device context to be destroyed,
thus making it not possible to correctly destroy those exported
buffers when they are returned back to gntdev and makes the module
crash:
Since commit d6cd33ad7102 ("regulator: gpio: Convert to use descriptors")
the GPIO regulator had inverted the polarity of the control GPIO. This
problem manifested itself on systems with DT containing the following
description (snippet from salvator-common.dtsi):
Prior to the aforementioned commit, the gpio-regulator code used
gpio_request_array() to claim the GPIO(s) specified in the "gpios"
DT node, while the commit changed that to devm_gpiod_get_index().
The legacy gpio_request_array() calls gpio_request_one() and then
gpiod_request(), which parses the DT flags of the "gpios" node and
populates the GPIO descriptor flags field accordingly.
The new devm_gpiod_get_index() calls gpiod_get_index(), then
of_find_gpio(), of_get_named_gpiod_flags() with flags != NULL,
and then of_gpio_flags_quirks(). Since commit a603a2b8d86e
("gpio: of: Add special quirk to parse regulator flags"),
of_gpio_flags_quirks() contains a quirk for regulator-gpio
which was never triggered by the legacy gpio_request_array()
code path, but is triggered by devm_gpiod_get_index() code
path.
This quirk checks whether a GPIO is associated with a fixed
or gpio-regulator and if so, checks two additional conditions.
First, whether such GPIO is active-low, and if so, ignores the
active-low flag. Second, whether the regulator DT node does
have an "enable-active-high" property and if the property is
NOT present, sets the GPIO flags as active-low.
The second check triggers a problem, since it is applied to all
GPIOs associated with a gpio-regulator, rather than only on the
"enable" GPIOs, as the old code did. This changes the way the
gpio-regulator interprets the DT description of the control
GPIOs.
The old code using gpio_request_array() explicitly parsed the
"enable-active-high" DT property and only applied it to the
GPIOs described in the "enable-gpios" DT node, and only if
those were present.
This patch fixes the quirk code by only applying the quirk
to "enable-gpios", thus restoring the old behavior.
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Jan Kotas <jank@cadence.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Mark Brown <broonie@kernel.org> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: linux-renesas-soc@vger.kernel.org
To: linux-gpio@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Do not call mt76u_queues_deinit routine in mt76u_alloc_queues error path
since it will be run in mt76x0u_register_device or
mt76x2u_register_device error path. Current implementation triggers the
following kernel warning:
Fixes: b40b15e1521f ("mt76: add usb support to mt76 layer") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The runtime PM of this device is enabled after v4l2_ctrl_handler_setup(),
and this makes this device's runtime PM usage count a negative value.
The ov7740_set_ctrl() tries to do something only if the device's runtime
PM usage counter is nonzero.
ov7740_set_ctrl()
{
if (!pm_runtime_get_if_in_use(&client->dev))
return 0;
<do something>;
pm_runtime_put(&client->dev);
return ret;
}
However, the ov7740_set_ctrl() is called by v4l2_ctrl_handler_setup()
while the runtime PM of this device is not yet enabled. In this case,
the pm_runtime_get_if_in_use() returns -EINVAL (!= 0).
Therefore we can't bail out of this function and the usage count is
decreased by pm_runtime_put() without increment.
This fixes this problem by enabling the runtime PM of this device before
v4l2_ctrl_handler_setup() so that the ov7740_set_ctrl() is always called
when the runtime PM is enabled.
Cc: Wenyou Yang <wenyou.yang@microchip.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Eugen Hristev <eugen.hristev@microchip.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The of_find_device_by_node() takes a reference to the underlying device
structure, we should release that reference.
Detected by coccinelle with the following warnings:
./sound/soc/fsl/imx-sgtl5000.c:169:1-7: ERROR: missing put_device;
call of_find_device_by_node on line 105, but without a corresponding
object release within this function.
./sound/soc/fsl/imx-sgtl5000.c:177:1-7: ERROR: missing put_device;
call of_find_device_by_node on line 105, but without a corresponding
object release within this function.
Signed-off-by: Wen Yang <yellowriver2010@hotmail.com> Cc: Timur Tabi <timur@kernel.org> Cc: Nicolin Chen <nicoleotsuka@gmail.com> Cc: Xiubo Li <Xiubo.Lee@gmail.com> Cc: Fabio Estevam <festevam@gmail.com> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Mark Brown <broonie@kernel.org> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Pengutronix Kernel Team <kernel@pengutronix.de> Cc: NXP Linux Team <linux-imx@nxp.com> Cc: alsa-devel@alsa-project.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
We can't assume inlined symbols with the same name are equal, because
their address range may be different. This will cause the symbols with
different addresses be shadowed when adding to the hist entry, and lead
to ERANGE error when checking the symbol address during sample parse,
the addr should be within the range of [sym.start, sym.end].
The error message is like: "0x36aea60 [0x8]: failed to process type: 68".
The second parameter of symbol__new() is the length of the fake symbol
for the inline frame, which is the subtraction of the end and start
address of base_sym.
Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: aa441895f7b4 ("perf report: Compare symbol name for inlined frames when sorting") Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
on a firmware that doesn't have ad-hoc support just yields failures of
HostCmd_CMD_SET_BSS_MODE, which happened to return a '-1' error code
(-EPERM? not really right...) and sometimes may even crash the firmware
along the way.
Let's parse the firmware capability flag while registering the wiphy, so
we don't allow attempting IBSS at all, and we get a proper -EOPNOTSUPP
from nl80211 instead.
Fixes: e267e71e68ae ("mwifiex: Disable adhoc feature based on firmware capability") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
reveals the character arrays prev_comm and next_comm are per
default unsigned char and have values in the range of 0..255.
On x86 both fields are signed as this output shows:
[root@f29]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:1;
...
field:char next_comm[16]; offset:40; size:16; signed:1;
and the character arrays prev_comm and next_comm are per default signed
char and have values in the range of -1..127. The implementation of
type char is architecture specific.
Since the character arrays in both tracepoints sched_switch and
sched_wakeup should contain ascii characters, simply omit the check for
signedness in the test case.
Output before:
[root@m35lp76 perf]# ./perf test -F 14
14: Parse sched tracepoints fields :
--- start ---
sched:sched_switch: "prev_comm" signedness(0) is wrong, should be 1
sched:sched_switch: "next_comm" signedness(0) is wrong, should be 1
sched:sched_wakeup: "comm" signedness(0) is wrong, should be 1
---- end ----
14: Parse sched tracepoints fields : FAILED!
[root@m35lp76 perf]#
Output after:
[root@m35lp76 perf]# ./perf test -Fv 14
14: Parse sched tracepoints fields :
--- start ---
---- end ----
Parse sched tracepoints fields: Ok
[root@m35lp76 perf]#
Fixes: 489338a717a0 ("perf tests evsel-tp-sched: Fix bitwise operator") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190219153639.31267-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
In this sequence we want to keep the exiting CRTC0 but it's not in the
atomic state for the commit since it hasn't been modified. In this case
the stream->mode_changed flag persists as true and we don't re-program
the planes for the existing stream.
[How]
The flag needs to be cleared and it makes the most sense to do it within
DC after the state has been committed. Nothing following dc_commit_state
should think that the stream's mode has changed.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Acked-by: Tony Cheng <Tony.Cheng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
commit 1917d42d14b7 ("fcoe: use enum for fip_mode") introduces a separate
enum for the fip_mode that shall be used during initialisation handling
until it is passed to fcoe_ctrl_link_up to set the initial fip_state. That
change was incomplete and gcc quietly converted in various places between
the fip_mode and the fip_state enum values with implicit enum conversions,
which fortunately cannot cause any issues in the actual code's execution.
clang however warns about these implicit enum conversions in the scsi
drivers. This commit consolidates the use of the two enums, guided by
clang's enum-conversion warnings.
This commit now completes the use of the fip_mode: It expects and uses
fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it calls
fcoe_ctrl_set_set() with the correct values in fcoe_ctlr_link_up(). It
also breaks the association between FIP_MODE_AUTO and FIP_ST_AUTO to
indicate these two enums are distinct.
Link: https://github.com/ClangBuiltLinux/linux/issues/151 Fixes: 1917d42d14b7 ("fcoe: use enum for fip_mode") Reported-by: Dmitry Golovin <dima@golovin.in> Original-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> CC: Lukas Bulwahn <lukas.bulwahn@gmail.com> CC: Nick Desaulniers <ndesaulniers@google.com> CC: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Suggested-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Prior to dma unmap/free operations the ism driver tries to ensure
that the memory is no longer accessed by the HW. When errors
during deregistration of memory regions from the HW occur the ism
driver will not unmap/free this memory.
When we receive notification from the hypervisor that a PCI function
has been detached we can no longer access the device and would never
unmap/free these memory regions which led to complaints by the DMA
debug API.
Treat this kind of errors during the deregistration of memory regions
from the HW as success since it is already ensured that the memory
is no longer accessed by HW.
Reported-by: Karsten Graul <kgraul@linux.ibm.com> Reported-by: Hans Wippel <hwippel@linux.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
When checking a generic status block, we iterate over all the generic
data blocks. The loop condition only checks that the start of the
generic data block is valid (within estatus->data_length) but not the
whole block. Because the size of data blocks (excluding error data) may
vary depending on the revision and the revision is contained within the
data block, ensure that enough of the current data block is valid before
dereferencing any members otherwise an out-of-bounds access may occur if
estatus->data_length is invalid.
This relies on the fact that struct acpi_hest_generic_data_v300 is a
superset of the earlier version. Also rework the other checks to avoid
potential underflow.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Borislav Petkov <bp@suse.de> Tested-by: Tyler Baicar <baicar.tyler@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Although qcom_snd_parse_of() tries to manage the of-node refcount,
there are still a few places that lead to the unblanced refcount in
the error code path. Namely,
- for_each_child_of_node() needs to unreference the iterator node if
aborting the loop in the middle,
- cpu, codec and platform node objects have to be unreferenced at each
iteration,
- platform and codec node objects have to be referred before jumping
to the error handling code that unreference them unconditionally.
This patch tries to address these by moving the assignment of platform
and codec node objects to the beginning of the loop and adding the
of_node_put() calls adequately.
Fixes: c25e295cd77b ("ASoC: qcom: Add support to parse common audio device nodes") Cc: Patrick Lai <plai@codeaurora.org> Cc: Banajit Goswami <bgoswami@codeaurora.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The recent rework of PCI kconfig symbols exposed an existing bug in
the CURRITUCK kconfig logic.
It selects PPC4xx_PCI_EXPRESS which depends on PCI, but PCI is user
selectable and might be disabled, leading to a warning:
WARNING: unmet direct dependencies detected for PPC4xx_PCI_EXPRESS
Depends on [n]: PCI [=n] && 4xx [=y]
Selected by [y]:
- CURRITUCK [=y] && PPC_47x [=y]
Prior to commit eb01d42a7778 ("PCI: consolidate PCI config entry in
drivers/pci") PCI was enabled by default for currituck_defconfig so we
didn't see the warning. The bad logic was still there, it just
required someone disabling PCI in their .config to hit it.
Fix it by forcing PCI on for CURRITUCK, which seems was always the
expectation anyway.
Fixes: eb01d42a7778 ("PCI: consolidate PCI config entry in drivers/pci") Reported-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33
("perf annotate: No need to calculate notes->start twice") removed notes->start
assignment in symbol__calc_lines(). It will get failed in
find_address_in_section() from symbol__tty_annotate() subroutine as the
a2l->addr is wrong. So the annotate summary doesn't report the line number of
source code correctly.
Before fix:
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
void hotspot_1(void)
{
volatile int i;
for (i = 0; i < 0x10000000; i++);
for (i = 0; i < 0x10000000; i++);
for (i = 0; i < 0x10000000; i++);
}
Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 425859ff0de33 ("perf annotate: No need to calculate notes->start twice") Link: http://lkml.kernel.org/r/20190221095716.39529-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Custom approximation of fractional-divider may not need parent clock
rate checking. For example Rockchip SoCs work fine using grand parent
clock rate even if target rate is greater than parent.
This patch checks parent clock rate only if CLK_SET_RATE_PARENT flag
is set.
For detailed example, clock tree of Rockchip I2S audio hardware.
- Clock rate of CPLL is 1.2GHz, GPLL is 491.52MHz.
- i2s1_div is integer divider can divide N (N is 1~128).
Input clock is CPLL or GPLL. Initial divider value is N = 1.
Ex) PLL = CPLL, N = 10, i2s1_div output rate is
CPLL / 10 = 1.2GHz / 10 = 120MHz
- i2s1_frac is fractional divider can divide input to x/y, x and
y are 16bit integer.
Clock mux system try to choose suitable one from i2s1_div and
i2s1_frac for master clock (MCLK) of I2S1.
Bad scenario as follows:
- Try to set MCLK to 8.192MHz (32kHz audio replay)
Candidate setting is
- i2s1_div: GPLL / 60 = 8.192MHz
i2s1_div candidate is exactly same as target clock rate, so mux
choose this clock source. i2s1_div output rate is changed
491.52MHz -> 8.192MHz
- After that try to set to 11.2896MHz (44.1kHz audio replay)
Candidate settings are
- i2s1_div : CPLL / 107 = 11.214945MHz
- i2s1_frac: i2s1_div = 8.192MHz
This is because clk_fd_round_rate() thinks target rate
(11.2896MHz) is higher than parent rate (i2s1_div = 8.192MHz)
and returns parent clock rate.
Above is current upstreamed behavior. Clock mux system choose
i2s1_div, but this clock rate is not acceptable for I2S driver, so
users cannot replay audio.
Expected behavior is:
- Try to set master clock to 11.2896MHz (44.1kHz audio replay)
Candidate settings are
- i2s1_div : CPLL / 107 = 11.214945MHz
- i2s1_frac: i2s1_div * 147/6400 = 11.2896MHz
Change i2s1_div to GPLL / 1 = 491.52MHz at same
time.
If apply this commit, clk_fd_round_rate() calls custom approximate
function of Rockchip even if target rate is higher than parent.
Custom function changes both grand parent (i2s1_div) and parent
(i2s_frac) settings at same time. Clock mux system can choose
i2s1_frac and audio works fine.
Signed-off-by: Katsuhiro Suzuki <katsuhiro@katsuster.net> Reviewed-by: Heiko Stuebner <heiko@sntech.de>
[sboyd@kernel.org: Make function into a macro instead] Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Using CX-3 virtual functions, either from a bare-metal machine or
pass-through from a VM, MAD packets are proxied through the PF driver.
Since the VF drivers have separate name spaces for MAD Transaction Ids
(TIDs), the PF driver has to re-map the TIDs and keep the book keeping
in a cache.
Following the RDMA Connection Manager (CM) protocol, it is clear when
an entry has to evicted form the cache. But life is not perfect,
remote peers may die or be rebooted. Hence, it's a timeout to wipe out
a cache entry, when the PF driver assumes the remote peer has gone.
During workloads where a high number of QPs are destroyed concurrently,
excessive amount of CM DREQ retries has been observed
The problem can be demonstrated in a bare-metal environment, where two
nodes have instantiated 8 VFs each. This using dual ported HCAs, so we
have 16 vPorts per physical server.
64 processes are associated with each vPort and creates and destroys
one QP for each of the remote 64 processes. That is, 1024 QPs per
vPort, all in all 16K QPs. The QPs are created/destroyed using the
CM.
When tearing down these 16K QPs, excessive CM DREQ retries (and
duplicates) are observed. With some cat/paste/awk wizardry on the
infiniband_cm sysfs, we observe as sum of the 16 vPorts on one of the
nodes:
Note that the active/passive side is equally distributed between the
two nodes.
Enabling pr_debug in cm.c gives tons of:
[171778.814239] <mlx4_ib> mlx4_ib_multiplex_cm_handler: id{slave:
1,sl_cm_id: 0xd393089f} is NULL!
By increasing the CM_CLEANUP_CACHE_TIMEOUT from 5 to 30 seconds, the
tear-down phase of the application is reduced from approximately 90 to
50 seconds. Retries/duplicates are also significantly reduced:
Increasing the timeout further didn't help, as these duplicates and
retries stems from a too short CMA timeout, which was 20 (~4 seconds)
on the systems. By increasing the CMA timeout to 22 (~17 seconds), the
numbers fell down to about 10 for both of them.
Adjustment of the CMA timeout is not part of this commit.
Signed-off-by: HÃ¥kon Bugge <haakon.bugge@oracle.com> Acked-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
On most Intel Bay- and Cherry-Trail systems the PMIC is connected over I2C
and the PMIC is accessed through various means by the _PS0 and _PS3 ACPI
methods (power on / off methods) of various devices.
This leads to suspend/resume ordering problems where a device may be
resumed and get its _PS0 method executed before the I2C controller is
resumed. On Cherry Trail this leads to errors like these:
i2c_designware 808622C1:06: controller timed out
ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion]
ACPI Error: Method parse/execution failed \_SB.P18W._ON, AE_ERROR
video LNXVIDEO:00: Failed to change power state to D0
But on Bay Trail this caused I2C reads to seem to succeed, but they end
up returning wrong data, which ends up getting written back by the typical
read-modify-write cycle done to turn on various power-resources.
Debugging the problems caused by this silent data corruption is quite
nasty. This commit adds a check which disallows i2c_dw_xfer() calls to
happen until the controller's resume method has completed.
Which turns the silent data corruption into getting these errors in
dmesg instead:
i2c_designware 80860F41:04: Error i2c_dw_xfer call while suspended
ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion]
ACPI Error: Method parse/execution failed \_SB.PCI0.GFX0._PS0, AE_ERROR
Which is much better.
Note the above errors are an example of issues which this patch will
help to debug, the actual fix requires fixing the suspend order and
this has been fixed by a different commit.
Note the setting / clearing of the suspended flag in the suspend / resume
methods is NOT protected by i2c_lock_bus(). This is intentional as these
methods get called from i2c_dw_xfer() (through pm_runtime_get/put) a nd
i2c_dw_xfer() is called with the i2c_bus_lock held, so otherwise we would
deadlock. This means that there is a theoretical race between a non runtime
suspend and the suspended check in i2c_dw_xfer(), this is not a problem
since normally we should not hit the race and this check is primarily a
debugging tool so hitting the check if there are suspend/resume ordering
problems does not need to be 100% reliable.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Commit 0da03cab87e6
("loop: Fix deadlock when calling blkdev_reread_part()") moves
blkdev_reread_part() out of the loop_ctl_mutex. However,
GENHD_FL_NO_PART_SCAN is set before __blkdev_reread_part(). As a result,
__blkdev_reread_part() will fail the check of GENHD_FL_NO_PART_SCAN and
will not rescan the loop device to delete all partitions.
The warning is caused by the below loop:
for_each_set_bit(bit, (unsigned long *)&asserted, 8) {
while "asserted" is declared as 'unsigned'.
The casting of 32-bit unsigned integer pointer to a 64-bit unsigned long
pointer. There are two problems here.
It causes the access of four extra byte, which can corrupt memory
The 32-bit pointer address may not be 64-bit aligned.
The fix changes variable "asserted" to "unsigned long".
Fixes: 1f976f6978bf ("platform/x86: Move Mellanox platform hotplug driver to platform/mellanox") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
So on Lenovo RESCUER R720-15IKBN:
DMI_SYS_VENDOR should match "LENOVO",
DMI_BOARD_NAME should match "Provence-5R3",
DMI_PRODUCT_NAME should match "80WW",
DMI_PRODUCT_VERSION should match "Lenovo R720-15IKBN".
Fix it, and in according with other entries in no_hw_rfkill_list,
use DMI_PRODUCT_VERSION instead of DMI_BOARD_NAME.
Fixes: ae7c8cba3221 ("platform/x86: ideapad-laptop: add lenovo RESCUER R720-15IKBN to no_hw_rfkill_list") Signed-off-by: Yang Fan <nullptr.cpp@gmail.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
A previous change allowed I2C client devices to discover new IRQs upon
reprobe by clearing the IRQ in i2c_device_remove. However, if an IRQ was
assigned in i2c_new_device, that information is lost.
For example, the touchscreen and trackpad devices on a Dell Inspiron laptop
are I2C devices whose IRQs are defined by ACPI extended IRQ types. The
client device structures are initialized during an ACPI walk. After
removing the i2c_hid device, modprobe fails.
This change caches the initial IRQ value in i2c_new_device and then resets
the client device IRQ to the initial value in i2c_device_remove.
Fixes: 6f108dd70d30 ("i2c: Clear client->irq in i2c_device_remove") Signed-off-by: Jim Broadus <jbroadus@gmail.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
[wsa: this is an easy to backport fix for the regression. We will
refactor the code to handle irq assignments better in general.] Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Give precision identifiers to the two snprintf() formatting the priority
and TC strings to avoid producing these two warnings:
drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function
'mlxsw_sp_port_get_prio_strings':
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2132:37: warning: '%d'
directive output may be truncated writing between 1 and 3 bytes into a
region of size between 0 and 31 [-Wformat-truncation=]
snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
^~
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2132:3: note: 'snprintf'
output between 3 and 36 bytes into a destination of size 32
snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mlxsw_sp_port_hw_prio_stats[i].str, prio);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlxsw/spectrum.c: In function
'mlxsw_sp_port_get_tc_strings':
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2143:37: warning: '%d'
directive output may be truncated writing between 1 and 11 bytes into a
region of size between 0 and 31 [-Wformat-truncation=]
snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
^~
drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2143:3: note: 'snprintf'
output between 3 and 44 bytes into a destination of size 32
snprintf(*p, ETH_GSTRING_LEN, "%s_%d",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mlxsw_sp_port_hw_tc_stats[i].str, tc);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Provide precision hints to snprintf() since we know the destination
buffer size of the RX/TX ring names are IFNAMSIZ + 5 - 1. This fixes the
following warnings:
drivers/net/ethernet/intel/e1000e/netdev.c: In function
'e1000_request_msix':
drivers/net/ethernet/intel/e1000e/netdev.c:2109:13: warning: 'snprintf'
output may be truncated before the last format character
[-Wformat-truncation=]
"%s-rx-0", netdev->name);
^
drivers/net/ethernet/intel/e1000e/netdev.c:2107:3: note: 'snprintf'
output between 6 and 21 bytes into a destination of size 20
snprintf(adapter->rx_ring->name,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sizeof(adapter->rx_ring->name) - 1,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"%s-rx-0", netdev->name);
~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/intel/e1000e/netdev.c:2125:13: warning: 'snprintf'
output may be truncated before the last format character
[-Wformat-truncation=]
"%s-tx-0", netdev->name);
^
drivers/net/ethernet/intel/e1000e/netdev.c:2123:3: note: 'snprintf'
output between 6 and 21 bytes into a destination of size 20
snprintf(adapter->tx_ring->name,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sizeof(adapter->tx_ring->name) - 1,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"%s-tx-0", netdev->name);
~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Provide a precision hint to snprintf() in order to eliminate a
-Wformat-truncation warning provided below. A maximum of 11 characters
is allowed to reach a maximum of 32 - 1 characters given a possible
maximum value of queues using up to UINT_MAX which occupies 10
characters. Incidentally 11 is the number of characters for
"xdp_packets" which is the largest string we append.
drivers/net/veth.c: In function 'veth_get_strings':
drivers/net/veth.c:118:47: warning: '%s' directive output may be
truncated writing up to 31 bytes into a region of size between 12 and 21
[-Wformat-truncation=]
snprintf(p, ETH_GSTRING_LEN, "rx_queue_%u_%s",
^~
drivers/net/veth.c:118:5: note: 'snprintf' output between 12 and 52
bytes into a destination of size 32
snprintf(p, ETH_GSTRING_LEN, "rx_queue_%u_%s",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
i, veth_rq_stats_desc[j].desc);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
&desc->request_mutex refer to two different mutex. #1 is the GPIO for
the chip interrupt. #2 is the chained interrupt between global 1 and
global 2.
Add lockdep classes to the GPIO interrupt to avoid this.
Reported-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
When running OMAP1 kernel on QEMU, MMC access is annoyingly noisy:
MMC: CTO of 0xff and 0xfe cannot be used!
MMC: CTO of 0xff and 0xfe cannot be used!
MMC: CTO of 0xff and 0xfe cannot be used!
[ad inf.]
Emulator warnings appear to be valid. The TI document SPRU680 [1]
("OMAP5910 Dual-Core Processor MultiMedia Card/Secure Data Memory Card
(MMC/SD) Reference Guide") page 36 states that the maximum timeout is 253
cycles and "0xff and 0xfe cannot be used".
Fix by using 0xfd as the maximum timeout.
Tested using QEMU 2.5 (Siemens SX1 machine, OMAP310), and also checked on
real hardware using Palm TE (OMAP310), Nokia 770 (OMAP1710) and Nokia N810
(OMAP2420) that MMC works as before.
[1] http://www.ti.com/lit/ug/spru680/spru680.pdf
Fixes: 730c9b7e6630f ("[MMC] Add OMAP MMC host driver") Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
With the introduction of the per-inode block_rsv it became possible to
have really really large reservation requests made because of data
fragmentation. Since the ticket stuff assumed that we'd always have
relatively small reservation requests it just killed all tickets if we
were unable to satisfy the current request.
However, this is generally not the case anymore. So fix this logic to
instead see if we had a ticket that we were able to give some
reservation to, and if we were continue the flushing loop again.
Likewise we make the tickets use the space_info_add_old_bytes() method
of returning what reservation they did receive in hopes that it could
satisfy reservations down the line.
Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
[CAUSE]
Since commit a514d63882c3 ("btrfs: qgroup: Commit transaction in advance
to reduce early EDQUOT"), btrfs is not forced to commit transaction to
reclaim more quota space.
Instead, we just check pertrans metadata reservation against some
threshold and try to do asynchronously transaction commit.
However in above case, the pertrans metadata reservation is pretty small
thus it will never trigger asynchronous transaction commit.
[FIX]
Instead of only accounting pertrans metadata reservation, we calculate
how much free space we have, and if there isn't much free space left,
commit transaction asynchronously to try to free some space.
This may slow down the fs when we have less than 32M free qgroup space,
but should reduce a lot of false EDQUOT, so the cost should be
acceptable.
Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
When using -F + syntax to add a field the existing defaults are
currently all marked user_set. This can cause errors when some field is
missing in the perf.data
This patch tracks the actually user set fields separately, so that we don't
error out in this case.
Before:
% perf record true
% perf script -F +metric
Samples for 'cycles:ppp' event do not have CPU attribute set. Cannot print 'cpu' field.
%
When adding multiple VLANs to the same VSI, the ice_add_vlan code will
share the VSI list, so as not to create multiple unnecessary VSI lists.
Consider the following flow
ice_add_vlan(hw, <VSI 0 VID 7, VSI 0 VID 8, VSI 0 VID 9>)
Where we add three VLAN filters for VIDs 7, 8, and 9, all for VSI 0.
The ice_add_vlan will create a single vsi_list and share it among all
the filters.
Later, if we try to remove a VLAN,
ice_remove_vlan(hw, <VSI 0 VID 7>)
Then the removal code will update the vsi_list and remove VSI 0 from it.
But, since the vsi_list is shared, this breaks the list for the other
users who reference it. We actually even free the VSI list memory, and
may result in segmentation faults.
This is due to the way that VLAN rule share VSI lists with reference
counts, and is caused because we call ice_rem_update_vsi_list even when
the ref_cnt is greater than one.
To fix this, handle the case where ref_cnt is greater than one
separately. In this case, we need to remove the associated rule without
modifying the vsi_list, since it is currently being referenced by
another rule. Instead, we just need to decrement the VSI list ref_cnt.
The case for handling sharing of VSI lists with multiple VSIs is not
currently supported by this code. No such rules will be created today,
and this code will require changes if/when such code is added.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Commit 787799a9d555 sets the SERDES interfaces of 6390 and 6390X to
1000BaseX, but this is only needed on 6390X, since there are SERDES
interfaces which can be used on lower ports on 6390.
This commit fixes this by returning to previous behaviour on 6390.
(Previous behaviour means that CMODE is not set at all if requested mode
is NA).
This is needed on Turris MOX, where the 88e6190 is connected to CPU in
2500BaseX mode.
Fixes: 787799a9d555 ("net: dsa: mv88e6xxx: Default ports 9/10 6390X CMODE to 1000BaseX") Signed-off-by: Marek Behún <marek.behun@nic.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
After we ALIGN up the address we need to make sure we didn't overflow
and resulted in zero address. In that case, we need to make sure that
the returned address is greater than mmap_min_addr.
This fixes selftest va_128TBswitch --run-hugetlb reporting failures when
run as non root user for
mmap(-1, MAP_HUGETLB)
The bug is that a non-root user requesting address -1 will be given address 0
which will then fail, whereas they should have been given something else that
would have succeeded.
We also avoid the first mmap(-1, MAP_HUGETLB) returning NULL address as mmap address
with this change. So we think this is not a security issue, because it only affects
whether we choose an address below mmap_min_addr, not whether we
actually allow that address to be mapped. ie. there are existing capability
checks to prevent a user mapping below mmap_min_addr and those will still be
honoured even without this fix.
Fixes: 484837601d4d ("powerpc/mm: Add radix support for hugetlb") Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Use unified assembler syntax (UAL) in inline assembler. Divided
syntax is considered deprecated. This will also allow to build
the kernel using LLVM's integrated assembler.
When compiling non-Thumb2 GCC always emits a ".syntax divided"
at the beginning of the inline assembly which makes the
assembler fail. Since GCC 5 there is the -masm-syntax-unified
GCC option which make GCC assume unified syntax asm and hence
emits ".syntax unified" even in ARM mode. However, the option
is broken since GCC version 6 (see GCC PR88648 [1]). Work
around by adding ".syntax unified" as part of the inline
assembly.
Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
SDM845 has ETMv4.2 and can use the existing etm4x driver.
But the current etm driver checks only for ETMv4.0 and
errors out for other etm4x versions. This patch adds this
missing support to enable SoC's with ETMv4x to use same
driver by checking only the ETM architecture major version
number.
Without this change, we get below error during etm probe:
/ # dmesg | grep etm
[ 6.660093] coresight-etm4x: probe of 7040000.etm failed with error -22
[ 6.666902] coresight-etm4x: probe of 7140000.etm failed with error -22
[ 6.673708] coresight-etm4x: probe of 7240000.etm failed with error -22
[ 6.680511] coresight-etm4x: probe of 7340000.etm failed with error -22
[ 6.687313] coresight-etm4x: probe of 7440000.etm failed with error -22
[ 6.694113] coresight-etm4x: probe of 7540000.etm failed with error -22
[ 6.700914] coresight-etm4x: probe of 7640000.etm failed with error -22
[ 6.707717] coresight-etm4x: probe of 7740000.etm failed with error -22
When building with -Wsometimes-uninitialized, Clang warns:
arch/powerpc/xmon/ppc-dis.c:157:7: warning: variable 'opcode' is used
uninitialized whenever 'if' condition is false
[-Wsometimes-uninitialized]
if (cpu_has_feature(CPU_FTRS_POWER9))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/xmon/ppc-dis.c:167:7: note: uninitialized use occurs here
if (opcode == NULL)
^~~~~~
arch/powerpc/xmon/ppc-dis.c:157:3: note: remove the 'if' if its
condition is always true
if (cpu_has_feature(CPU_FTRS_POWER9))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/xmon/ppc-dis.c:132:38: note: initialize the variable
'opcode' to silence this warning
const struct powerpc_opcode *opcode;
^
= NULL
1 warning generated.
This warning seems to make no sense on the surface because opcode is set
to NULL right below this statement. However, there is a comma instead of
semicolon to end the dialect assignment, meaning that the opcode
assignment only happens in the if statement. Properly terminate that
line so that Clang no longer warns.
Fixes: 5b102782c7f4 ("powerpc/xmon: Enable disassembly files (compilation changes)") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The SDIO firmware does not provide RSSI value to the host, it's only set to
zero. In that case don't report the value to mac80211. One risk here is that
value zero might be a valid value with other firmware, currently there's no way
to detect that.
Without the fix, the rssi value indicated by iw changes between the actual
value and -95.
Tested with QCA6174 SDIO with firmware WLAN.RMH.4.4.1-00005-QCARMSWP-1.
Co-developed-by: Wen Gong <wgong@codeaurora.org> Signed-off-by: Alagu Sankar <alagusankar@silex-india.com> Signed-off-by: Wen Gong <wgong@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Reference counting in amdgpu_dm_connector for amdgpu_dm_connector::dc_sink
and amdgpu_dm_connector::dc_em_sink as well as in dc_link::local_sink seems
to be out of shape. Thus make reference counting consistent for these
members and just plain increment the reference count when the variable
gets assigned and decrement when the pointer is set to zero or replaced.
Also simplify reference counting in selected function sopes to be sure the
reference is released in any case. In some cases add NULL pointer check
before dereferencing.
At a hand full of places a comment is placed to stat that the reference
increment happened already somewhere else.
This actually fixes the following kernel bug on my system when enabling
display core in amdgpu. There are some more similar bug reports around,
so it probably helps at more places.
This patch is based on agd5f/drm-next-5.1-wip. This patch does not require
all of that, but agd5f/drm-next-5.1-wip contains at least one more dc_sink
counting fix that I could spot.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Previously we only updated the drop_progress key if we were in the
DROP_REFERENCE stage of snapshot deletion. This is because the
UPDATE_BACKREF stage checks the flags of the blocks it's converting to
FULL_BACKREF, so if we go over a block we processed before it doesn't
matter, we just don't do anything.
The problem is in do_walk_down() we will go ahead and drop the roots
reference to any blocks that we know we won't need to walk into.
Given subvolume A and snapshot B. The root of B points to all of the
nodes that belong to A, so all of those nodes have a refcnt > 1. If B
did not modify those blocks it'll hit this condition in do_walk_down
if (!wc->update_ref ||
generation <= root->root_key.offset)
goto skip;
and in "goto skip" we simply do a btrfs_free_extent() for that bytenr
that we point at.
Now assume we modified some data in B, and then took a snapshot of B and
call it C. C points to all the nodes in B, making every node the root
of B points to have a refcnt > 1. This assumes the root level is 2 or
higher.
We delete snapshot B, which does the above work in do_walk_down,
free'ing our ref for nodes we share with A that we didn't modify. Now
we hit a node we _did_ modify, thus we own. We need to walk down into
this node and we set wc->stage == UPDATE_BACKREF. We walk down to level
0 which we also own because we modified data. We can't walk any further
down and thus now need to walk up and start the next part of the
deletion. Now walk_up_proc is supposed to put us back into
DROP_REFERENCE, but there's an exception to this
if (level < wc->shared_level)
goto out;
we are at level == 0, and our shared_level == 1. We skip out of this
one and go up to level 1. Since path->slots[1] < nritems we
path->slots[1]++ and break out of walk_up_tree to stop our transaction
and loop back around. Now in btrfs_drop_snapshot we have this snippet
our stage == UPDATE_BACKREF still, so we don't update the drop_progress
key. This is a problem because we would have done btrfs_free_extent()
for the nodes leading up to our current position. If we crash or
unmount here and go to remount we'll start over where we were before and
try to free our ref for blocks we've already freed, and thus abort()
out.
Fix this by keeping track of the last place we dropped a reference for
our block in do_walk_down. Then if wc->stage == UPDATE_BACKREF we know
we'll start over from a place we meant to, and otherwise things continue
to work as they did before.
I have a complicated reproducer for this problem, without this patch
we'll fail to fsck the fs when replaying the log writes log. With this
patch we can replay the whole log without any fsck or mount failures.
The steps to reproduce this easily are sort of tricky, I had to add a
couple of debug patches to the kernel in order to make it easy,
basically I just needed to make sure we did actually commit the
transaction every time we finished a walk_down_tree/walk_up_tree combo.
The reproducer:
1) Creates a base subvolume.
2) Creates 100k files in the subvolume.
3) Snapshots the base subvolume (snap1).
4) Touches files 5000-6000 in snap1.
5) Snapshots snap1 (snap2).
6) Deletes snap1.
I do this with dm-log-writes, and then replay to every FUA in the log
and fsck the fs.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
[ copy reproducer steps ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Adding -rR to MAKEFLAGS is important because we do not want to
be bothered by built-in implicit rules or variables.
One problem that used to exist in older GNU Make versions is
MAKEFLAGS += -rR
... does not become effective in the current Makefile. When you are
building with O= option, it becomes effective in the top Makefile
since it recurses via 'sub-make' target. Otherwise, the top Makefile
tries implicit rules. That is why we explicitly add empty rules for
Makefiles, but we often miss to do that.
In fact, adding -d option to older GNU Make versions shows it is
trying a bunch of implicit pattern rules.
Considering target file `scripts/Makefile.kcov'.
Looking for an implicit rule for `scripts/Makefile.kcov'.
Trying pattern rule with stem `Makefile.kcov'.
Trying implicit prerequisite `scripts/Makefile.kcov.o'.
Trying pattern rule with stem `Makefile.kcov'.
Trying implicit prerequisite `scripts/Makefile.kcov.c'.
Trying pattern rule with stem `Makefile.kcov'.
Trying implicit prerequisite `scripts/Makefile.kcov.cc'.
Trying pattern rule with stem `Makefile.kcov'.
Trying implicit prerequisite `scripts/Makefile.kcov.C'.
...
This issue was fixed by GNU Make commit 58dae243526b ("[Savannah #20501]
Handle adding -r/-R to MAKEFLAGS in the makefile"). So, it is no longer
a problem if you use GNU Make 4.0 or later. However, older versions are
still widely used.
So, I decided to patch the kernel Makefile to invoke sub-make regardless
of O= option. This will allow further cleanups.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
If include/config/auto.conf.cmd is lost for some reasons, it is not
self-healing, so the top Makefile misses to run syncconfig.
Move include/config/auto.conf.cmd to the target side.
I used a pattern rule instead of a normal rule here although it is
a bit gross.
If the rule were written with a normal rule like this,
Using a pattern rule makes sure that syncconfig is executed just once
because Make assumes the recipe will create all of the targets.
Here is a quote from the GNU Make manual [1]:
"Pattern rules may have more than one target. Unlike normal rules,
this does not act as many different rules with the same prerequisites
and recipe. If a pattern rule has multiple targets, make knows that
the rule's recipe is responsible for making all of the targets. The
recipe is executed only once to make all the targets. When searching
for a pattern rule to match a target, the target patterns of a rule
other than the one that matches the target in need of a rule are
incidental: make worries only about giving a recipe and prerequisites
to the file presently in question. However, when this file's recipe is
run, the other targets are marked as having been updated themselves."
We had a test-report where, under memory pressure, adding LUNs to the
systems would fail (the tests add LUNs strictly in sequence):
[ 5525.853432] scsi 0:0:1:1088045124: Direct-Access IBM 2107900 .148 PQ: 0 ANSI: 5
[ 5525.853826] scsi 0:0:1:1088045124: alua: supports implicit TPGS
[ 5525.853830] scsi 0:0:1:1088045124: alua: device naa.6005076303ffd32700000000000044da port group 0 rel port 43
[ 5525.853931] sd 0:0:1:1088045124: Attached scsi generic sg10 type 0
[ 5525.854075] sd 0:0:1:1088045124: [sdk] Disabling DIF Type 1 protection
[ 5525.855495] sd 0:0:1:1088045124: [sdk] 2097152 512-byte logical blocks: (1.07 GB/1.00 GiB)
[ 5525.855606] sd 0:0:1:1088045124: [sdk] Write Protect is off
[ 5525.855609] sd 0:0:1:1088045124: [sdk] Mode Sense: ed 00 00 08
[ 5525.855795] sd 0:0:1:1088045124: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 5525.857838] sdk: sdk1
[ 5525.859468] sd 0:0:1:1088045124: [sdk] Attached SCSI disk
[ 5525.865073] sd 0:0:1:1088045124: alua: transition timeout set to 60 seconds
[ 5525.865078] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.015070] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.015213] sd 0:0:1:1088045124: alua: port group 00 state A preferred supports tolusnA
[ 5526.587439] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
[ 5526.588562] scsi_alloc_sdev: Allocation failure during SCSI scanning, some SCSI devices might not be configured
Looking at the code of scsi_alloc_sdev(), and all the calling contexts,
there seems to be no reason to use GFP_ATMOIC here. All the different
call-contexts use a mutex at some point, and nothing in between that
requires no sleeping, as far as I could see. Additionally, the code that
later allocates the block queue for the device (scsi_mq_alloc_queue())
already uses GFP_KERNEL.
There are similar allocations in two other functions:
scsi_probe_and_add_lun(), and scsi_add_lun(),; that can also be done with
GFP_KERNEL.
Here is the contexts for the three functions so far:
So replace all these, and give them a bit of a better chance to succeed,
with more chances of reclaim.
Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
We store 2 multilevel tables in iommu_table - one for the hardware and
one with the corresponding userspace addresses. Before allocating
the tables, the iommu_table_group_ops::get_table_size() hook returns
the combined size of the two and VFIO SPAPR TCE IOMMU driver adjusts
the locked_vm counter correctly. When the table is actually allocated,
the amount of allocated memory is stored in iommu_table::it_allocated_size
and used to decrement the locked_vm counter when we release the memory
used by the table; .get_table_size() and .create_table() calculate it
independently but the result is expected to be the same.
However the allocator does not add the userspace table size to
.it_allocated_size so when we destroy the table because of VFIO PCI
unplug (i.e. VFIO container is gone but the userspace keeps running),
we decrement locked_vm by just a half of size of memory we are
releasing.
To make things worse, since we enabled on-demand allocation of
indirect levels, it_allocated_size contains only the amount of memory
actually allocated at the table creation time which can just be a
fraction. It is not a problem with incrementing locked_vm (as
get_table_size() value is used) but it is with decrementing.
As the result, we leak locked_vm and may not be able to allocate more
IOMMU tables after few iterations of hotplug/unplug.
This sets it_allocated_size in the pnv_pci_ioda2_ops::create_table()
hook to what pnv_pci_ioda2_get_table_size() returns so from now on we
have a single place which calculates the maximum memory a table can
occupy. The original meaning of it_allocated_size is somewhat lost now
though.
We do not ditch it_allocated_size whatsoever here and we do not call
get_table_size() from vfio_iommu_spapr_tce.c when decrementing
locked_vm as we may have multiple IOMMU groups per container and even
though they all are supposed to have the same get_table_size()
implementation, there is a small chance for failure or confusion.
Fixes: 090bad39b237 ("powerpc/powernv: Add indirect levels to it_userspace") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
According to the chipidea driver bindings, the USB PHY is specified via
the "phys" phandle node. However, this only takes effect for USB PHYs
that use the common PHY framework. For legacy USB PHYs, a simple lookup
based on the USB PHY type is done instead.
This does not play out well when more than one USB PHY is registered,
since the first registered PHY matching the type will always be
returned regardless of what the driver was bound to.
Fix this by looking up the PHY based on the "phys" phandle node.
Although generic PHYs are rather matched by their "phys-name" and not
the "phys" phandle directly, there is no helper for similar lookup on
legacy PHYs and it's probably not worth the effort to add it.
When no legacy USB PHY is found by phandle, fallback to grabbing any
registered USB2 PHY. This ensures backward compatibility if some users
were actually relying on this mechanism.
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
The cavium/zip implementation of the deflate compression algorithm is
incorrectly being registered under the generic driver name, which
prevents the generic implementation from being registered with the
crypto API when CONFIG_CRYPTO_DEV_CAVIUM_ZIP=y. Similarly the lzs
algorithm (which does not currently have a generic implementation...)
is incorrectly being registered as lzs-generic.
Fix the naming collision by adding a suffix "-cavium" to the
cra_driver_name of the cavium/zip algorithms.
Fixes: 640035a2dc55 ("crypto: zip - Add ThunderX ZIP driver core") Cc: Mahipal Challa <mahipalreddy2006@gmail.com> Cc: Jan Glauber <jglauber@cavium.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
e = f(...);
... when != of_node_put(e)
when != x = e
when != e = x
when any
if (<+...of_device_is_available(e)...+>) {
... when != of_node_put(e)
(
return e;
|
+ of_node_put(e);
return ...;
)
}
// </smpl>
Fixes: 5343e674f32fb ("crypto4xx: integrate ppc4xx-rng into crypto4xx") Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Apparently the execute bits were set for the tests/*.sh scripts on my
test setup but these are not set in the kernel tree. Fix this by adding
the interpreter path in front of the script paths.
Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: johannes.berg@intel.com Cc: tj@kernel.org Fixes: 5ecb8e94b494 ("tools/lib/lockdep/tests: Improve testing accuracy") # v5.0-rc1 Link: https://lkml.kernel.org/r/20190214230058.196511-23-bvanassche@acm.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Currently, the bandwidth is updated wrongly in BW table in tx_stats
debugfs per sta as there is difference in number of bandwidth type
in mac80211 and driver stats table. This leads to bandwidth getting
updated at wrong index in bandwidth table in tx_stats.
Fix this index mismatch between mac80211 and driver stats table (BW table)
by making the number of bandwidth type in driver compatible with mac80211.
The call to of_find_node_by_phandle returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
./drivers/net/wireless/mediatek/mt76/eeprom.c:58:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 48, but without a corresponding object release within this function.
./drivers/net/wireless/mediatek/mt76/eeprom.c:61:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 48, but without a corresponding object release within this function.
./drivers/net/wireless/mediatek/mt76/eeprom.c:67:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 48, but without a corresponding object release within this function.
./drivers/net/wireless/mediatek/mt76/eeprom.c:70:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 48, but without a corresponding object release within this function.
./drivers/net/wireless/mediatek/mt76/eeprom.c:72:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 48, but without a corresponding object release within this function.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: Felix Fietkau <nbd@nbd.name> Cc: Lorenzo Bianconi <lorenzo.bianconi83@gmail.com> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-mediatek@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
ies1 or ies2 might be null when code inside
_wil_cfg80211_merge_extra_ies access them.
Add explicit check for null and make sure ies1/ies2 are not
accessed in such a case.
spos might be null and be accessed inside
_wil_cfg80211_merge_extra_ies.
Add explicit check for null in the while condition statement
and make sure spos is not accessed in such a case.
Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
It is incorrect to call pcie_pme_suspend() from pcie_pme_remove() for two
reasons.
First, pcie_pme_suspend() calls synchronize_irq(), which will wait for the
native hotplug interrupt handler as well as for the PME one, because they
share one IRQ (as per the spec). That may deadlock if hotplug is signaled
while pcie_pme_remove() is running and the latter calls
pci_lock_rescan_remove() before the former.
Second, if pcie_pme_suspend() figures out that wakeup needs to be enabled
for the port, it will return without disabling the interrupt as expected by
pcie_pme_remove() which was overlooked by commit c7b5a4e6e8fb ("PCI / PM:
Fix native PME handling during system suspend/resume").
To fix that, rework pcie_pme_remove() to disable the PME interrupt, clear
its status and prevent the PME worker function from re-enabling it before
calling free_irq() on it, which should be sufficient.
Fixes: c7b5a4e6e8fb ("PCI / PM: Fix native PME handling during system suspend/resume") Link: https://lore.kernel.org/linux-pci/c7697e7c-e1af-13e4-8491-0a3996e6ab5d@huawei.com Reported-by: Dongdong Liu <liudongdong3@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[bhelgaas: add URL and deadlock details from Dongdong] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
walk_system_ram_range() can return an error code either becuase
*it* failed, or because the 'func' that it calls returned an
error. The memory hotplug does the following:
ret = walk_system_ram_range(..., func);
if (ret)
return ret;
and 'ret' makes it out to userspace, eventually. The problem
s, walk_system_ram_range() failues that result from *it* failing
(as opposed to 'func') return -1. That leads to a very odd
-EPERM (-1) return code out to userspace.
Make walk_system_ram_range() return -EINVAL for internal
failures to keep userspace less confused.
This return code is compatible with all the callers that I
audited.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ross Zwisler <zwisler@kernel.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: linux-nvdimm@lists.01.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Huang Ying <ying.huang@intel.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Keith Busch <keith.busch@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
guard_bio_eod() can truncate a segment in bio to allow it to do IO on
odd last sectors of a device.
It already checks if the IO starts past EOD, but it does not consider
the possibility of an IO request starting within device boundaries can
contain more than one segment past EOD.
In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
underflow bvec->bv_len.
Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
This situation has been found on filesystems such as isofs and vfat,
which doesn't check the device size before mount, if the device is
smaller than the filesystem itself, a readahead on such filesystem,
which spans EOD, can trigger this situation, leading a call to
zero_user() with a wrong size possibly corrupting memory.
I didn't see any crash, or didn't let the system run long enough to
check if memory corruption will be hit somewhere, but adding
instrumentation to guard_bio_end() to check truncated_bytes size, was
enough to see the error.
Ext4 may not free clusters correctly when punching holes in bigalloc
file systems under high load conditions. If it's not possible to
extend and restart the journal in ext4_ext_rm_leaf() when preparing to
remove blocks from a punched region, a retry of the entire punch
operation is triggered in ext4_ext_remove_space(). This causes a
partial cluster to be set to the first cluster in the extent found to
the right of the punched region. However, if the punch operation
prior to the retry had made enough progress to delete one or more
extents and a partial cluster candidate for freeing had already been
recorded, the retry would overwrite the partial cluster. The loss of
this information makes it impossible to correctly free the original
partial cluster in all cases.
This bug can cause generic/476 to fail when run as part of
xfstests-bld's bigalloc and bigalloc_1k test cases. The failure is
reported when e2fsck detects bad iblocks counts greater than expected
in units of whole clusters and also detects a number of negative block
bitmap differences equal to the iblocks discrepancy in cluster units.
Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
In jbd2_journal_commit_transaction(), if we are in abort mode,
we may flush the buffer without setting descriptor block checksum
by goto start_journal_io. Then fs is mounted,
jbd2_descriptor_block_csum_verify() failed.
Commit fb58fdcd295b9 ("iommu/vt-d: Do not enable ATS for untrusted
devices") disables ATS support on the devices which have been marked
as untrusted. Unfortunately this is not enough to fix the DMA attack
vulnerabiltiies because IOMMU driver allows translated requests as
long as a device advertises the ATS capability. Hence a malicious
peripheral device could use this to bypass IOMMU.
This disables the ATS support on untrusted devices by clearing the
internal per-device ATS mark. As the result, IOMMU driver will block
any translated requests from any device marked as untrusted.
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Suggested-by: Kevin Tian <kevin.tian@intel.com> Suggested-by: Ashok Raj <ashok.raj@intel.com> Fixes: fb58fdcd295b9 ("iommu/vt-d: Do not enable ATS for untrusted devices") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
TCP resets cause instant transition from established to closed state
provided the reset is in-window. Endpoints that implement RFC 5961
require resets to match the next expected sequence number.
RST segments that are in-window (but that do not match RCV.NXT) are
ignored, and a "challenge ACK" is sent back.
Main problem for conntrack is that its a middlebox, i.e. whereas an end
host might have ACK'd SEQ (and would thus accept an RST with this
sequence number), conntrack might not have seen this ACK (yet).
Therefore we can't simply flag RSTs with non-exact match as invalid.
This updates RST processing as follows:
1. If the connection is in a state other than ESTABLISHED, nothing is
changed, RST is subject to normal in-window check.
2. If the RSTs sequence number either matches exactly RCV.NXT,
connection state moves to CLOSE.
3. The same applies if the RST sequence number aligns with a previous
packet in the same direction.
In all other cases, the connection remains in ESTABLISHED state.
If the normal-in-window check passes, the timeout will be lowered
to that of CLOSE.
If the peer sends a challenge ack, connection timeout will be reset.
If the challenge ACK triggers another RST (RST was valid after all),
this 2nd RST will match expected sequence and conntrack state changes to
CLOSE.
If no challenge ACK is received, the connection will time out after
CLOSE seconds (10 seconds by default), just like without this patch.
// First reset, old sequence. Conntrack (correctly) considers this
// invalid due to failed window validation (regardless of this patch).
0.260 < R 2:2(0) ack 1001 win 260
// 2nd reset, but too far ahead sequence. Same: correctly handled
// as invalid.
0.270 < R 99990001:99990001(0) ack 1001 win 260
// in-window, but not exact sequence.
// Current Linux kernels might reply with a challenge ack, and do not
// remove connection.
// Without this patch, conntrack state moves to CLOSE.
// With patch, timeout is lowered like CLOSE, but connection stays
// in ESTABLISHED state.
0.280 < R 1010:1010(0) ack 1001 win 260
// With or without this patch, RST will cause connection
// to move to CLOSE (sequence number matches)
// 0.282 < R 1001:1001(0) ack 1001 win 260
// ACK
0.300 < . 1001:1001(0) ack 1001 win 257
// more data could be exchanged here, connection
// is still established
// Client closes the connection.
0.610 < F. 1001:1001(0) ack 1001 win 260
0.650 > . 1001:1001(0) ack 1002
// Close the connection without reading outstanding data
0.700 close(4) = 0
// so one more reset. Will be deemed acceptable with patch as well:
// connection is already closing.
0.701 > R. 1001:1001(0) ack 1002 win 501
// End packetdrill test case.
Without patch, first RST moves connection to close, whereas socket state
does not change until FIN is received.
[NEW] 120 SYN_SENT src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [UNREPLIED]
[UPDATE] 60 SYN_RECV src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80
[UPDATE] 432000 ESTABLISHED src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [ASSURED]
[UPDATE] 10 CLOSE src=10.0.2.1 dst=10.0.0.1 sport=5141 dport=80 [ASSURED]
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Mediatek's HW assigns a MMIO address range (typically starts from
0x20000000 to 0x2fffffff for both mt2712 and mt7622) for PCI usage.
This MMIO address space represents the address space that can
be allocated to PCI devices through Base Address Registers.
Even though the full MMIO address range is available to be allocated, it
should be enabled by the PCIE_AHB_TRANS_BASE register in the host
controller and the size that is enabled is determined by AHB2PCIE_SIZE
bits in this register.
Owing to a bug in the MMIO window size computation, current code does
not enable the full size of the available MMIO address range in the
PCI host controller; if the PCI devices BARs requested size exceeds the
size enabled through the PCIE_AHB_TRANS_BASE register the requests
targeting the disabled address address space will be blocked by the root
complex causing a system error.
Existing code has never run into a system error in production because
even half of the enabled MMIO range (128MB) is big enough for typical
devices BAR requests (4MB) but the full MMIO address range should
be enabled regardless.
Fix the MMIO window size computation by using resource_size(mem) instead
of mem->end - mem->start.
Since the MMIO window size for both MT2712 and MT7622 is 0x10000000,
this change will update the parameter passed to fls() from 0xfffffff to
0x10000000 and calculate the whole memory mapped IO range size
correctly.
Detected through coccinelle semantic patch (and related warning):
scripts/coccinelle/api/resource_size.cocci:
pcie-mediatek.c:720:13-16: WARNING: Suspicious code. resource_size is maybe missing with mem
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: rewrote the commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Check the result of dereferencing base_chain->stats, instead of result
of this_cpu_ptr with NULL.
base_chain->stats maybe be changed to NULL when a chain is updated and a
new NULL counter can be attached.
And we do not need to check returning of this_cpu_ptr since
base_chain->stats is from percpu allocator if it is non-NULL,
this_cpu_ptr returns a valid value.
And fix two sparse error by replacing rcu_access_pointer and
rcu_dereference with READ_ONCE under rcu_read_lock.
Thanks for Eric's help to finish this patch.
Fixes: 009240940e84c1 ("netfilter: nf_tables: don't assume chain stats are set when jumplabel is set") Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: c65c83ffe904 ("perf trace: Allow asking for not suppressing common string prefixes") Link: https://lkml.kernel.org/n/tip-t2eu1rqx710k6jr4814mlzg7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Fix this by adding a NULL check on devname in cifs_parse_devname()
Signed-off-by: Yao Liu <yotta.liu@ucloud.cn> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Old windows version or Netapp SMB server will return
NT_STATUS_NOT_SUPPORTED since they do not allow or implement
FSCTL_VALIDATE_NEGOTIATE_INFO. The client should accept the response
provided it's properly signed.
See
https://blogs.msdn.microsoft.com/openspecification/2012/06/28/smb3-secure-dialect-negotiation/
There is there problems in that check:
- we should allow inline_xattr_size equaling to min size of inline
{data,dentry} area.
- F2FS_TOTAL_EXTRA_ATTR_SIZE and inline_xattr_size are based on
different size unit, previous one is 4 bytes, latter one is 1 bytes.
- DEF_MIN_INLINE_SIZE only indicate min size of inline data area,
however, we need to consider min size of inline dentry area as well,
minimal inline dentry should at least contain two entries: '.' and
'..', so that min inline_dentry size is 40 bytes.
Invoking dm_get_device() twice on the same device path with different
modes is dangerous. Because in that case, upgrade_mode() will alloc a
new 'dm_dev' and free the old one, which may be referenced by a previous
caller. Dereferencing the dangling pointer will trigger kernel NULL
pointer dereference.
The following two cases can reproduce this issue. Actually, they are
invalid setups that must be disallowed, e.g.:
1. Creating a thin-pool with read_only mode, and the same device as
both metadata and data.
Signed-off-by: Jason Cai (Xiang Feng) <jason.cai@linux.alibaba.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
When compiling with -Wformat, clang emits the following warnings:
fs/cifs/smb1ops.c:312:20: warning: format specifies type 'unsigned
short' but the argument has type 'unsigned int' [-Wformat]
tgt_total_cnt, total_in_tgt);
^~~~~~~~~~~~
fs/cifs/cifs_dfs_ref.c:289:4: warning: format specifies type 'short'
but the argument has type 'int' [-Wformat]
ref->flags, ref->server_type);
^~~~~~~~~~
fs/cifs/cifs_dfs_ref.c:289:16: warning: format specifies type 'short'
but the argument has type 'int' [-Wformat]
ref->flags, ref->server_type);
^~~~~~~~~~~~~~~~
fs/cifs/cifs_dfs_ref.c:291:4: warning: format specifies type 'short'
but the argument has type 'int' [-Wformat]
ref->ref_flag, ref->path_consumed);
^~~~~~~~~~~~~
fs/cifs/cifs_dfs_ref.c:291:19: warning: format specifies type 'short'
but the argument has type 'int' [-Wformat]
ref->ref_flag, ref->path_consumed);
^~~~~~~~~~~~~~~~~~
The types of these arguments are unconditionally defined, so this patch
updates the format character to the correct ones for ints and unsigned
ints.
Link: https://github.com/ClangBuiltLinux/linux/issues/378 Signed-off-by: Louis Taylor <louis@kragniz.eu> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Building little-endian allmodconfig kernels on arm64 started failing
with the generated atomic.h implementation, since we now try to call
kasan helpers from the EFI stub:
aarch64-linux-gnu-ld: drivers/firmware/efi/libstub/arm-stub.stub.o: in function `atomic_set':
include/generated/atomic-instrumented.h:44: undefined reference to `__efistub_kasan_check_write'
I suspect that we get similar problems in other files that explicitly
disable KASAN for some reason but call atomic_t based helper functions.
We can fix this by checking the predefined __SANITIZE_ADDRESS__ macro
that the compiler sets instead of checking CONFIG_KASAN, but this in
turn requires a small hack in mm/kasan/common.c so we do see the extern
declaration there instead of the inline function.
Link: http://lkml.kernel.org/r/20181211133453.2835077-1-arnd@arndb.de Fixes: b1864b828644 ("locking/atomics: build atomic headers as required") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Anders Roxell <anders.roxell@linaro.org> Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
(Taken from https://bugzilla.kernel.org/show_bug.cgi?id=200647)
'get_unused_fd_flags' in kthread cause kernel crash. It works fine on
4.1, but causes crash after get 64 fds. It also cause crash on
ubuntu1404/1604/1804, centos7.5, and the crash messages are almost the
same.
This issue exists since CentOS 7.5 3.10.0-862 and CentOS 7.4
(3.10.0-693.21.1 ) is ok. Root cause: the item 'resize_wait' is not
initialized before being used.
Reported-by: Richard Zhang <zhang.zijian@h3c.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
In the process of creating a node, it will cause NULL pointer
dereference in kernel if o2cb_ctl failed in the interval (mkdir,
o2cb_set_node_attribute(node_num)] in function o2cb_add_node.
The node num is initialized to 0 in function o2nm_node_group_make_item,
o2nm_node_group_drop_item will mistake the node number 0 for a valid
node number when we delete the node before the node number is set
correctly. If the local node number of the current host happens to be
0, cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while
o2hb_thread still running. The panic stack is generated as follows:
Kmemleak does not track the array cache (alc->ac) but the alien cache
(alc) instead, so let it track the latter by lifting kmemleak_no_scan()
out of init_arraycache().
There is another place that calls init_arraycache(), but
alloc_kmem_cache_cpus() uses the percpu allocation where will never be
considered as a leak.
As it's difficult to report where exactly the uninit value resides in
the mempolicy object, we have to guess a bit. mm/mempolicy.c:353
contains this part of mpol_rebind_policy():
if (!mpol_store_user_nodemask(pol) &&
nodes_equal(pol->w.cpuset_mems_allowed, *newmask))
"mpol_store_user_nodemask(pol)" is testing pol->flags, which I couldn't
ever see being uninitialized after leaving mpol_new(). So I'll guess
it's actually about accessing pol->w.cpuset_mems_allowed on line 354,
but still part of statement starting on line 353.
For w.cpuset_mems_allowed to be not initialized, and the nodes_equal()
reachable for a mempolicy where mpol_set_nodemask() is called in
do_mbind(), it seems the only possibility is a MPOL_PREFERRED policy
with empty set of nodes, i.e. MPOL_LOCAL equivalent, with MPOL_F_LOCAL
flag. Let's exclude such policies from the nodes_equal() check. Note
the uninit access should be benign anyway, as rebinding this kind of
policy is always a no-op. Therefore no actual need for stable
inclusion.
Link: http://lkml.kernel.org/r/a71997c3-e8ae-a787-d5ce-3db05768b27c@suse.cz Link: http://lkml.kernel.org/r/73da3e9c-cc84-509e-17d9-0c434bb9967d@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: syzbot+b19c2dc2c990ea657a71@syzkaller.appspotmail.com Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Yisheng Xie <xieyisheng1@huawei.com> Cc: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
If a memory cgroup contains a single process with many threads
(including different process group sharing the mm) then it is possible
to trigger a race when the oom killer complains that there are no oom
elible tasks and complain into the log which is both annoying and
confusing because there is no actual problem. The race looks as
follows:
Fix this by checking for fatal_signal_pending from
mem_cgroup_out_of_memory when the oom_lock is already held.
The oom bypass is safe because we do the same early in the try_charge
path already. The situation migh have changed in the mean time. It
should be safe to check for fatal_signal_pending and tsk_is_oom_victim
but for a better code readability abstract the current charge bypass
condition into should_force_charge and reuse it from that path. "
Link: http://lkml.kernel.org/r/01370f70-e1f6-ebe4-b95e-0df21a0bc15e@i-love.sakura.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Since setting global init process to some memory cgroup is technically
possible, oom_kill_memcg_member() must check it.
Tasks in /test1 are going to be killed due to memory.oom.group set
Memory cgroup out of memory: Killed process 1 (systemd) total-vm:43400kB, anon-rss:1228kB, file-rss:3992kB, shmem-rss:0kB
oom_reaper: reaped process 1 (systemd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000008b
The descriptions of userspace memory access functions had minor issues
with formatting that made kernel-doc unable to properly detect the
function/macro names and the return value sections:
./arch/x86/include/asm/uaccess.h:80: info: Scanning doc for
./arch/x86/include/asm/uaccess.h:139: info: Scanning doc for
./arch/x86/include/asm/uaccess.h:231: info: Scanning doc for
./arch/x86/include/asm/uaccess.h:505: info: Scanning doc for
./arch/x86/include/asm/uaccess.h:530: info: Scanning doc for
./arch/x86/lib/usercopy_32.c:58: info: Scanning doc for
./arch/x86/lib/usercopy_32.c:69: warning: No description found for return
value of 'clear_user'
./arch/x86/lib/usercopy_32.c:78: info: Scanning doc for
./arch/x86/lib/usercopy_32.c:90: warning: No description found for return
value of '__clear_user'
Fix the formatting.
Link: http://lkml.kernel.org/r/1549549644-4903-3-git-send-email-rppt@linux.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Dan Carpenter reports a potential NULL dereference in
get_swap_page_of_type:
Smatch complains that the NULL checks on "si" aren't consistent. This
seems like a real bug because we have not ensured that the type is
valid and so "si" can be NULL.
Add the missing check for NULL, taking care to use a read barrier to
ensure CPU1 observes CPU0's updates in the correct order:
CPU0 CPU1
alloc_swap_info() if (type >= nr_swapfiles)
swap_info[type] = p /* handle invalid entry */
smp_wmb() smp_rmb()
++nr_swapfiles p = swap_info[type]
Without smp_rmb, CPU1 might observe CPU0's write to nr_swapfiles before
CPU0's write to swap_info[type] and read NULL from swap_info[type].
Ying Huang noticed other places in swapfile.c don't order these reads
properly. Introduce swap_type_to_swap_info to encourage correct usage.
Use READ_ONCE and WRITE_ONCE to follow the Linux Kernel Memory Model
(see tools/memory-model/Documentation/explanation.txt).
This ordering need not be enforced in places where swap_lock is held
(e.g. si_swapinfo) because swap_lock serializes updates to nr_swapfiles
and the swap_info array.
Link: http://lkml.kernel.org/r/20190131024410.29859-1-daniel.m.jordan@oracle.com Fixes: ec8acf20afb8 ("swap: add per-partition lock for swapfile") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Suggested-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Shaohua Li <shli@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Tejun Heo <tj@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
After offlining a memory block, kmemleak scan will trigger a crash, as
it encounters a page ext address that has already been freed during
memory offlining. At the beginning in alloc_page_ext(), it calls
kmemleak_alloc(), but it does not call kmemleak_free() in
free_page_ext().
Kmemleak is supposed to work with the memblock_{alloc,free} pair and it
ignores the memblock_reserve() as a memblock_alloc() implementation
detail. It is, however, tolerant to memblock_free() being called on
a sub-range or just a different range from a previous memblock_alloc().
So the original patch looks fine to me. FWIW:
Link: http://lkml.kernel.org/r/20190227144631.16708-1-peng.fan@nxp.com Signed-off-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
next_present_section_nr() could only return an unsigned number -1, so
just check it specifically where compilers will convert -1 to unsigned
if needed.
mm/sparse.c: In function 'sparse_init_nid':
mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
((section_nr >= 0) && \
^~
mm/sparse.c:478:2: note: in expansion of macro
'for_each_present_section_nr'
for_each_present_section_nr(pnum_begin, pnum) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
((section_nr >= 0) && \
^~
mm/sparse.c:497:2: note: in expansion of macro
'for_each_present_section_nr'
for_each_present_section_nr(pnum_begin, pnum) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/sparse.c: In function 'sparse_init':
mm/sparse.c:200:20: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
((section_nr >= 0) && \
^~
mm/sparse.c:520:2: note: in expansion of macro
'for_each_present_section_nr'
for_each_present_section_nr(pnum_begin + 1, pnum_end) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~
Link: http://lkml.kernel.org/r/20190228181839.86504-1-cai@lca.pw Fixes: c4e1be9ec113 ("mm, sparsemem: break out of loops early") Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>