Due to some renaming, we ended up with the "indirect iomem"
naming in Kconfig, following INDIRECT_PIO. However, clearly
I missed following through on that in the ifdefs, but so far
INDIRECT_IOMEM_FALLBACK isn't used by any architecture.
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Fixes: ca2e334232b6 ("lib: add iomem emulation (logic_iomem)") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 2be1a7f09635031aab33b7400d3c991876b74b5d) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup")
killed PCIe on my XGene-1 box (a Mustang board). The machine itself
is still alive, but half of its storage (over NVMe) is gone, and the
NVMe driver just times out.
Note that this machine boots with a device tree provided by the
UEFI firmware (2016 vintage), which could well be non conformant
with the spec, hence the breakage.
With the patch reverted, the box boots 5.17-rc8 with flying colors.
Link: https://lore.kernel.org/all/Yf2wTLjmcRj+AbDv@xps13.dannf Link: https://lore.kernel.org/r/20220321104843.949645-2-maz@kernel.org Fixes: 6dce5aa59e0b ("PCI: xgene: Use inbound resources for setup") Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: stable@vger.kernel.org Cc: Rob Herring <robh@kernel.org> Cc: Toan Le <toan@os.amperecomputing.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Krzysztof Wilczyński <kw@linux.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stéphane Graber <stgraber@ubuntu.com> Cc: dann frazier <dann.frazier@canonical.com>
[dannf: minor context adjustment] Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 541b7456fc4d370d22f9213636e03bf536c3d616) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Matthew Wilcox reported that there is a missing mmap_lock in
file_files_note that could possibly lead to a user after free.
Solve this by using the existing vma snapshot for consistency
and to avoid the need to take the mmap_lock anywhere in the
coredump code except for dump_vma_snapshot.
Update the dump_vma_snapshot to capture vm_pgoff and vm_file
that are neeeded by fill_files_note.
Add free_vma_snapshot to free the captured values of vm_file.
Reported-by: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20220131153740.2396974-1-willy@infradead.org Cc: stable@vger.kernel.org Fixes: a07279c9a8cd ("binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot") Fixes: 2aa362c49c31 ("coredump: extend core dump note section to contain file names of mapped files") Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 39fd0cc079c98dafcf355997ada7b5e67f0bb10a) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Move the call of dump_vma_snapshot and kvfree(vma_meta) out of the
individual coredump routines into do_coredump itself. This makes
the code less error prone and easier to maintain.
Make the vma snapshot available to the coredump routines
in struct coredump_params. This makes it easier to
change and update what is captures in the vma snapshot
and will be needed for fixing fill_file_notes.
Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f6ca862806df3762170cd3251852330304e781c9) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Pass the non-aligned size to __iommu_dma_map when using swiotlb bounce
buffers in iommu_dma_map_page, to account for min_align_mask.
To deal with granule alignment, __iommu_dma_map maps iova_align(size +
iova_off) bytes starting at phys - iova_off. If iommu_dma_map_page
passes aligned size when using swiotlb, then this becomes
iova_align(iova_align(orig_size) + iova_off). Normally iova_off will be
zero when using swiotlb. However, this is not the case for devices that
set min_align_mask. When iova_off is non-zero, __iommu_dma_map ends up
mapping an extra page at the end of the buffer. Beyond just being a
security issue, the extra page is not cleaned up by __iommu_dma_unmap.
This causes problems when the IOVA is reused, due to collisions in the
iommu driver. Just passing the original size is sufficient, since
__iommu_dma_map will take care of granule alignment.
Add an argument to swiotlb_tbl_map_single that specifies the desired
alignment of the allocated buffer. This is used by dma-iommu to ensure
the buffer is aligned to the iova granule size when using swiotlb with
untrusted sub-granule mappings. This addresses an issue where adjacent
slots could be exposed to the untrusted device if IO_TLB_SIZE < iova
granule < PAGE_SIZE.
Signed-off-by: David Stevens <stevensd@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210929023300.335969-7-stevensd@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Mario Limonciello <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3e44e136560cb66dd5156889f811b870789b9499) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Introduce a new dev_use_swiotlb function to guard swiotlb code, instead
of overloading dev_is_untrusted. This allows CONFIG_SWIOTLB to be
checked more broadly, so the swiotlb related code can be removed more
aggressively.
Signed-off-by: David Stevens <stevensd@chromium.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20210929023300.335969-6-stevensd@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Mario Limonciello <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 52d23f5f091509acda49b6b38edad96f175c06ff) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fold the _swiotlb helper functions into the respective _page functions,
since recent fixes have moved all logic from the _page functions to the
_swiotlb functions.
Signed-off-by: David Stevens <stevensd@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210929023300.335969-5-stevensd@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Mario Limonciello <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bc05d84824c0d1deba7a36bc63785d611833c513) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Calling the iommu_dma_sync_*_for_cpu functions during unmap can cause
two copies out of the swiotlb buffer. Do the arch sync directly in
__iommu_dma_unmap_swiotlb instead to avoid this. This makes the call to
iommu_dma_sync_sg_for_cpu for untrusted devices in iommu_dma_unmap_sg no
longer necessary, so move that invocation later in the function.
Signed-off-by: David Stevens <stevensd@chromium.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20210929023300.335969-4-stevensd@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Mario Limonciello <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c3841d020b82f6d92f8ac84ef2bb2f398c6eae7e) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
FNAME(cmpxchg_gpte) is an inefficient mess. It is at least decent if it
can go through get_user_pages_fast(), but if it cannot then it tries to
use memremap(); that is not just terribly slow, it is also wrong because
it assumes that the VM_PFNMAP VMA is contiguous.
The right way to do it would be to do the same thing as
hva_to_pfn_remapped() does since commit add6a0cd1c5b ("KVM: MMU: try to
fix up page faults before giving up", 2016-07-05), using follow_pte()
and fixup_user_fault() to determine the correct address to use for
memremap(). To do this, one could for example extract hva_to_pfn()
for use outside virt/kvm/kvm_main.c. But really there is no reason to
do that either, because there is already a perfectly valid address to
do the cmpxchg() on, only it is a userspace address. That means doing
user_access_begin()/user_access_end() and writing the code in assembly
to handle exceptions correctly. Worse, the guest PTE can be 8-byte
even on i686 so there is the extra complication of using cmpxchg8b to
account for. But at least it is an efficient mess.
(Thanks to Linus for suggesting improvement on the inline assembly).
Reported-by: Qiuhao Li <qiuhao@sysec.org> Reported-by: Gaoning Pan <pgn@zju.edu.cn> Reported-by: Yongkang Jia <kangel@zju.edu.cn> Reported-by: syzbot+6cde2282daa792c49ab8@syzkaller.appspotmail.com Debugged-by: Tadeusz Struk <tadeusz.struk@linaro.org> Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Fixes: bd53cb35a3e9 ("X86/KVM: Handle PFNs outside of kernel reach when touching GPTEs") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8771d9673e0bdb7148299f3c074667124bde6dff) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Since MMC core handles runtime PM reference counting, we can avoid doing
redundant runtime PM work in the driver. That means the only thing
commit 5b4258f6721f ("misc: rtsx: rts5249 support runtime PM") misses is
to always enable runtime PM, to let its parent driver enable ASPM in the
runtime idle routine.
drivers/block/n64cart.c: In function ‘n64cart_submit_bio’:
drivers/block/n64cart.c:91:26: error: ‘struct bio’ has no member named ‘bi_disk’
91 | struct device *dev = bio->bi_disk->private_data;
| ^~
CC drivers/slimbus/qcom-ctrl.o
CC drivers/auxdisplay/hd44780.o
CC drivers/watchdog/watchdog_core.o
CC drivers/nvme/host/fault_inject.o
AR drivers/accessibility/braille/built-in.a
make[2]: *** [scripts/Makefile.build:288: drivers/block/n64cart.o] Error 1
Fixes: 309dca309fc3 ("block: store a block_device pointer in struct bio"); Reported-by: k2ci <kernel-bot@kylinos.cn> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220321071216.1549596-1-liu.yun@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a9bbdeef768faac1a00a17f98d40af218e29be71) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This commit fixes a couple of typos: s/--doall/--do-all/ and
s/--doallmodconfig/--do-allmodconfig/.
[ paulmck: Add Fixes: supplied by Paul Menzel. ]
Fixes: a115a775a8d5 ("torture: Add "make allmodconfig" to torture.sh") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2a710a5c59e9ff2214def7199a4f5e7fd48f0247) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This is a mix of a documentation fix with some additions to the
"panic_print" syscall / parameter. The goal here is being able to collect
all CPUs backtraces during a panic event and also to enable "panic_print"
in a kdump event - details of the reasoning and design choices in the
patches.
This patch (of 3):
Commit de6da1e8bcf0 ("panic: add an option to replay all the printk
message in buffer") added a new bit to the sysctl/kernel parameter
"panic_print", but the documentation was added only in
kernel-parameters.txt, not in the sysctl guide.
Fix it here by adding bit 5 to sysctl admin-guide documentation.
Moving to an EPOLL based IRQ controller broke uml_mconsole stop/go
commands. This fixes it and restores stop/go functionality.
Fixes: ff6a17989c08 ("Epoll based IRQ controller") Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 166abd13eab0e816a73c7d936be95d819fcfc6eb) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
this patch support tick_delay bit[31:30] without enhance_timing feature.
Fixes: f84d866ab43f("spi: mediatek: add tick_delay support") Signed-off-by: Leilk Liu <leilk.liu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220315032411.2826-2-leilk.liu@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit dd8772224c19ef0851baa8bea4aafd50a2276fa5) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
According to subdevice interface specification found in V4L2 API
documentation, set format pad operations should not affect image
geometry set in preceding image processing steps. Unfortunately, that
requirement is not respected by the driver implementation of set format
as it was not the case when that code was still implementing a pair of
now obsolete .s_mbus_fmt() / .try_mbus_fmt() video operations before
they have been merged and reused as an implementation of .set_fmt() pad
operation by commit 717fd5b4907a ("[media] v4l2: replace try_mbus_fmt
by set_fmt").
Exclude non-compliant crop rectangle adjustments from set format try,
as well as a call to .set_selection() from set format active processing
path, so only frame scaling is applied as needed and crop rectangle is
no longer modified.
[Sakari Ailus: Rebase on subdev state patches]
Fixes: 717fd5b4907a ("[media] v4l2: replace try_mbus_fmt by set_fmt") Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2a6e0695ddd51d90de298951bd53d04f5aef03c4) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Try requests are now only supported by format processing pad operations
implemented by the driver. The driver selection API operations
currently respond to them with -EINVAL. While that is correct, it
constraints video device drivers to not use subdevice cropping at all
while processing user requested active frame size, otherwise their set
try format results might differ from active. As a consequence, we
can't fix set format pad operation as not to touch crop rectangle since
that would affect users not being able to set arbitrary frame sizes.
Moreover, without a working set try selection support we are not able
to use pad config crop rectangle as a reference while processing set
try format requests.
Implement missing try selection support. Moreover, as it will be now
possible to maintain the pad config crop rectangle via selection API,
start using it instead of the active one as a reference while
processing set try format requests.
is_unscaled_ok() helper, now also called from set selection operation,
has been just moved up in the source file to avoid a prototype, with no
functional changes.
[Sakari Ailus: Rebase on subdev state patches]
Fixes: 717fd5b4907a ("[media] v4l2: replace try_mbus_fmt by set_fmt") Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3995d4cf529c3369ad2901d8b9d0a8bcfc6cc840) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.26:
https://download.01.org/perfmon/SKX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Fixes were made that allow the skx-metrics.json to successfully
generate, bringing back TopdownL1 metrics.
Tested:
$ perf test
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Skip
91: perf all PMU test : Ok
...
90 skips due to a lack of floating point samples, which is
understandable.
Fixes: c4ad8fabd03f76ed ("perf vendor events: Update metrics for SkyLake Server") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220201015858.1226914-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8f5e6110e108ba69e59aa04c49697b3477710d27) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is no reason to force readwrite access on TLV controls. It can be
either read, write or both. This is further evidenced in code where it
performs following checks:
if ((k->access & SNDRV_CTL_ELEM_ACCESS_TLV_READ) && !sbe->get)
return -EINVAL;
if ((k->access & SNDRV_CTL_ELEM_ACCESS_TLV_WRITE) && !sbe->put)
return -EINVAL;
Fixes: 1a3232d2f61d ("ASoC: topology: Add support for TLV bytes controls") Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20220112170030.569712-3-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b0e5c18317f7bafe0ab93c6369381e696375faf2) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
sound/soc/sof/intel/pci-tng.o:(.data+0x1c0): undefined reference to `sof_pci_probe'
sound/soc/sof/intel/pci-tng.o:(.data+0x1c8): undefined reference to `sof_pci_remove'
sound/soc/sof/intel/pci-tng.o:(.data+0x1e0): undefined reference to `sof_pci_shutdown'
sound/soc/sof/intel/pci-tng.o:(.data+0x290): undefined reference to `sof_pci_pm'
Make SND_SOC_SOF_MERRIFIELD select SND_SOC_SOF_PCI_DEV to fix this.
Fixes: 8d4ba1be3d22 ("ASoC: SOF: pci: split PCI into different drivers") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20220323092501.145879-1-zhengbin13@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 678b6901d00bcc870b7e1ccf33e992371da470ed) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Abort fastmap scanning and return error code if memory allocation fails
in add_aeb(). Otherwise ubi will get wrong peb statistics information
after scanning.
Fixes: dbb7d2a88d2a7b ("UBI: Add fastmap core") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ba8260872dd50fc058f7d489794c9c8cf84668fa) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The blamed commit adds support for irq, but the reqisters for irq are
outside of the memory size. They are at address 0x108. Therefore update
the memory size to cover all the registers used by the device.
larb@14016000: 'mediatek,larb-id' is a required property
arch/arm64/boot/dts/mediatek/mt8167-pumpkin.dt.yaml
larb@15001000: 'mediatek,larb-id' is a required property
arch/arm64/boot/dts/mediatek/mt8167-pumpkin.dt.yaml
larb@16010000: 'mediatek,larb-id' is a required property
arch/arm64/boot/dts/mediatek/mt8167-pumpkin.dt.yaml
As the description of mediatek,larb-id, the property is only
required when the larbid is not consecutive from its IOMMU point of view.
Also, from the description of mediatek,larbs in
Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml, all the larbs
must sort by the larb index.
In mt8167, there is only one IOMMU HW and three larbs. The drivers already
know its larb index from the mediatek,larbs property of IOMMU, thus no
need this property.
Fixes: 27bb0e42855a ("dt-bindings: memory: mediatek: Convert SMI to DT schema") Signed-off-by: Yong Wu <yong.wu@mediatek.com> Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220113111057.29918-3-yong.wu@mediatek.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e918b36600d6f51fe7ac241a11f8d586f32588bb) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The interrupt property is not mandatory at all, this property should not
be part of the required properties list, so move it into the optional
properties list.
Fixes: 326e5c8d4a87 ("dt-binding: spi: Document Macronix controller bindings") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/linux-mtd/20211216111654.238086-8-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b2b85196a31a245093e770eecb788f1be81f5b9b) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The controller properties should be in the controller 'parent' node,
while properties in the children nodes are specific to the NAND
*chip*. This error was already present during the yaml conversion.
The reg property of a NAND device always references the chip-selects.
The ready/busy lines are described in the nand-rb property. I believe
this was a harmless copy/paste error during the conversion to yaml.
Commit 5b4258f6721f ("misc: rtsx: rts5249 support runtime PM") doesn't
use pm_runtime_{get,put}() helpers when it should, so the RPM refcount
keeps at zero, hence its parent driver, rtsx_pci, has to do lots of
weird tricks to keep it from runtime suspending.
So use those helpers at right places to properly manage runtime PM.
Fixes: 5b4258f6721f ("misc: rtsx: rts5249 support runtime PM") Cc: Ricky WU <ricky_wu@realtek.com> Tested-by: Ricky WU <ricky_wu@realtek.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20220125055010.1866563-1-kai.heng.feng@canonical.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8ec990990be3b59ce4a7ac5a53432f4c2f79a96d) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Let's say that the caller has storage for num_elem stack frames. Then,
the BPF stack helper functions walk the stack for only num_elem frames.
This means that if skip > 0, one keeps only 'num_elem - skip' frames.
This is because it sets init_nr in the perf_callchain_entry to the end
of the buffer to save num_elem entries only. I believe it was because
the perf callchain code unwound the stack frames until it reached the
global max size (sysctl_perf_event_max_stack).
However it now has perf_callchain_entry_ctx.max_stack to limit the
iteration locally. This simplifies the code to handle init_nr in the
BPF callstack entries and removes the confusion with the perf_event's
__PERF_SAMPLE_CALLCHAIN_EARLY which sets init_nr to 0.
Also change the comment on bpf_get_stack() in the header file to be
more explicit what the return value means.
The commit 314001f0bf92 ("af_unix: Add OOB support") introduced OOB for
AF_UNIX, but it lacks some changes for POLLPRI. Let's add the missing
piece.
In the selftest, normal datagrams are sent followed by OOB data, so this
commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
case.
Fixes: 314001f0bf92 ("af_unix: Add OOB support") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 21b6b8d43d87708c88ae34be67bb6cbba7bc60a0) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
__setup() handlers should return 1 if the command line option is handled
and 0 if not (or maybe never return 0; it just pollutes init's
environment). This prevents:
Unknown kernel command line parameters \
"BOOT_IMAGE=/boot/bzImage-517rc5 hardened_usercopy=off", will be \
passed to user space.
Run /sbin/init as init process
with arguments:
/sbin/init
with environment:
HOME=/
TERM=linux
BOOT_IMAGE=/boot/bzImage-517rc5
hardened_usercopy=off
or
hardened_usercopy=on
but when "hardened_usercopy=foo" is used, there is no Unknown kernel
command line parameter.
Return 1 to indicate that the boot option has been handled.
Print a warning if strtobool() returns an error on the option string,
but do not mark this as in unknown command line option and do not cause
init's environment to be polluted with this string.
Link: https://lkml.kernel.org/r/20220222034249.14795-1-rdunlap@infradead.org
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Fixes: b5cb15d9372ab ("usercopy: Allow boot cmdline disabling of hardening") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru> Acked-by: Chris von Recklinghausen <crecklin@redhat.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 260daa256d3056fd14d96c7a7519adc2ce4a7862) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
__setup() handlers should return 1 if the command line option is handled
and 0 if not (or maybe never return 0; it just pollutes init's
environment).
The only reason that this particular __setup handler does not pollute
init's environment is that the setup string contains a '.', as in
"cgroup.memory". This causes init/main.c::unknown_boottoption() to
consider it to be an "Unused module parameter" and ignore it. (This is
for parsing of loadable module parameters any time after kernel init.)
Otherwise the string "cgroup.memory=whatever" would be added to init's
environment strings.
Instead of relying on this '.' quirk, just return 1 to indicate that the
boot option has been handled.
Note that there is no warning message if someone enters:
cgroup.memory=anything_invalid
Link: https://lkml.kernel.org/r/20220222005811.10672-1-rdunlap@infradead.org Fixes: f7e1cb6ec51b0 ("mm: memcontrol: account socket memory in unified hierarchy memory controller") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Reviewed-by: Michal Koutný <mkoutny@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c9acbcd636ab8110e6adfaa0eede1ddb71903c88) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
__setup() handlers should return 1 to obsolete_checksetup() in
init/main.c to indicate that the boot option has been handled.
A return of 0 causes the boot option/value to be listed as an Unknown
kernel parameter and added to init's (limited) argument or environment
strings. Also, error return codes don't mean anything to
obsolete_checksetup() -- only non-zero (usually 1) or zero.
So return 1 from jive_mtdset().
Fixes: 9db829f485c5 ("[ARM] JIVE: Initial machine support for Logitech Jive") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-samsung-soc@vger.kernel.org Cc: patches@armlinux.org.uk Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 48ddbd8b4e42e870fe1abcc0a111e53a40a66f9e) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
__setup() handlers should return 1 if the command line option is handled
and 0 if not (or maybe never return 0; it just pollutes init's
environment). This prevents:
Unknown kernel command line parameters \
"BOOT_IMAGE=/boot/bzImage-517rc5 stack_guard_gap=100", will be \
passed to user space.
Run /sbin/init as init process
with arguments:
/sbin/init
with environment:
HOME=/
TERM=linux
BOOT_IMAGE=/boot/bzImage-517rc5
stack_guard_gap=100
Return 1 to indicate that the boot option has been handled.
Note that there is no warning message if someone enters:
stack_guard_gap=anything_invalid
and 'val' and stack_guard_gap are both set to 0 due to the use of
simple_strtoul(). This could be improved by using kstrtoxxx() and
checking for an error.
It appears that having stack_guard_gap == 0 is valid (if unexpected) since
using "stack_guard_gap=0" on the kernel command line does that.
Link: https://lkml.kernel.org/r/20220222005817.11087-1-rdunlap@infradead.org
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru Fixes: 1be7107fbe18e ("mm: larger stack guard gap, between vmas") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6795b20d4b2cbc2f1d39a6b0aa7628f786bacc34) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
if (ptr1 && ptr2)
ASSERT(ksize(ptr1) == ksize(ptr2));
We attempted to fix these issues in the blamed commits, but forgot
that TCP was possibly shifting data after skb_unclone_keeptruesize()
has been used, notably from tcp_retrans_try_collapse().
So we not only need to keep same skb->truesize value,
we also need to make sure TCP wont fill new tailroom
that pskb_expand_head() was able to get from a
addr = kmalloc(...) followed by ksize(addr)
Split skb_unclone_keeptruesize() into two parts:
1) Inline skb_unclone_keeptruesize() for the common case,
when skb is not cloned.
2) Out of line __skb_unclone_keeptruesize() for the 'slow path'.
Fixes: c4777efa751d ("net: add and use skb_unclone_keeptruesize() helper") Fixes: 097b9146c0e2 ("net: fix up truesize of cloned skb in skb_prepare_for_shift()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Marco Elver <elver@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 23629b673b780d967b88a850b1518cf0f0ffc6aa) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We have multiple places where this helper is convenient,
and plan using it in the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 51e458fc0ca63a0a8e8946dfad19535ae7bb2772) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When an enum is used in the visible parts of a trace event that is
exported to user space, the user space applications like perf and
trace-cmd do not have a way to know what the value of the enum is. To
solve this, at boot up (or module load) the printk formats are modified to
replace the enum with their numeric value in the string output.
Array fields of the event are defined by [<nr-elements>] in the type
portion of the format file so that the user space parsers can correctly
parse the array into the appropriate size chunks. But in some trace
events, an enum is used in defining the size of the array, which once
again breaks the parsing of user space tooling.
This was solved the same way as the print formats were, but it modified
the type strings of the trace event. This caused crashes in some
architectures because, as supposed to the print string, is a const string
value. This was not detected on x86, as it appears that const strings are
still writable (at least in boot up), but other architectures this is not
the case, and writing to a const string will cause a kernel fault.
To fix this, use kstrdup() to copy the type before modifying it. If the
trace event is for the core kernel there's no need to free it because the
string will be in use for the life of the machine being on line. For
modules, create a link list to store all the strings being allocated for
modules and when the module is removed, free them.
Halil Pasic points out [1] that the full revert of that commit (revert
in bddac7c1e02b), and that a partial revert that only reverts the
problematic case, but still keeps some of the cleanups is probably
better. 
And that partial revert [2] had already been verified by Oleksandr
Natalenko to also fix the issue, I had just missed that in the long
discussion.
So let's reinstate the cleanups from commit aa6f8dcbab47 ("swiotlb:
rework "fix info leak with DMA_FROM_DEVICE""), and effectively only
revert the part that caused problems.
It should be better to reverse the check on codec_dai
and returned early in order to be easier to understand.
Fixes: de2c6f98817f ("ASoC: soc-compress: prevent the potentially use of null pointer") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20220310030041.1556323-1-jiasheng@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 63351e2e13625435843f901c8aa14751e09b6f45) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 031495635b46 ("arm64: Do not defer reserve_crashkernel() for
platforms with no DMA memory zones") introduced different definitions
for 'arm64_dma_phys_limit' depending on CONFIG_ZONE_DMA{,32} based on
a late suggestion from Pasha. Sadly, this results in a build error when
passing W=1:
| arch/arm64/mm/init.c:90:19: error: conflicting type qualifiers for 'arm64_dma_phys_limit'
Drop the 'const' for now and use '__ro_after_init' consistently.
This done routine will delete the timer and check for its return value and
decrease the reference count accordingly. This prevents boot hangs reported
after commit 31e6cdbe0eae ("scsi: qla2xxx: Implement ref count for SRB")
was merged.
Link: https://lore.kernel.org/r/20220208093946.4471-1-njavali@marvell.com Fixes: 31e6cdbe0eae ("scsi: qla2xxx: Implement ref count for SRB") Reported-by: Ewan Milne <emilne@redhat.com> Tested-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0e39097efcb5ae054e119005168aca181aa4bc29) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 4adc33f36d80 ("drm/edid: Split deep color modes between RGB and
YUV444") introduced two new variables in struct drm_display_info and
their documentation, but the documentation part had a typo resulting in
a doc build warning.
Fixes: 4adc33f36d80 ("drm/edid: Split deep color modes between RGB and YUV444") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Simon Ser <contact@emersion.fr> Link: https://patchwork.freedesktop.org/patch/msgid/20220202094340.875190-1-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b9cf1208af36f6552e6f3b8fdf364983e3cc8e60) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In commit 42bf50a1795a ("can: isotp: support MSG_TRUNC flag when
reading from socket") a new check for recvmsg flags has been
introduced that only checked for the flags that are handled in
isotp_recvmsg() itself.
This accidentally removed the MSG_PEEK feature flag which is processed
later in the call chain in __skb_try_recv_from_queue().
Add MSG_PEEK to the set of valid flags to restore the feature.
The Type C ACPI device on older Chromebooks is not generated correctly
(since their EC firmware doesn't support the new commands required). In
such cases, the crafted ACPI device doesn't have an EC parent, and it is
therefore not useful (it shouldn't be generated in the first place since
the EC firmware doesn't support any of the Type C commands).
To handle devices which use these older firmware revisions, check for
the parent EC device handle, and fail the probe if it's not found.
Fixes: fdc6b21e2444 ("platform/chrome: Add Type C connector class driver") Reported-by: Alyssa Ross <hi@alyssa.is> Reviewed-by: Tzung-Bi Shih <tzungbi@google.com> Signed-off-by: Prashant Malani <pmalani@chromium.org> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Alyssa Ross <hi@alyssa.is> Tested-by: Alyssa Ross <hi@alyssa.is> Link: https://lore.kernel.org/r/20220126190219.3095419-1-pmalani@chromium.org Signed-off-by: Benson Leung <bleung@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3fc81968625a66ae54db37b7188f32edd0c6752c) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When running dt_binding_check on the nvidia,tegra210-quad.yaml binding
document the following error is reported ...
nvidia,tegra210-quad.example.dt.yaml:0:0: /example-0/spi@70410000/flash@0:
failed to match any schema with compatible: ['spi-nor']
Update the example in the binding document to fix the above error.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Fixes: 9684752e5fe3 ("dt-bindings: spi: Add Tegra Quad SPI device tree binding") Link: https://lore.kernel.org/r/20220307113529.315685-1-jonathanh@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8c1c3c00dceb9d78bba6658658d109cc0c1d5b32) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
vhost_iotlb_add_range_ctx() handles the range [0, ULONG_MAX] by
splitting it into two ranges and adding them separately. The return
value of adding the first range to the iotlb is currently ignored.
Check the return value and bail out in case of an error.
Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com> Link: https://lore.kernel.org/r/20220312141121.4981-1-mail@anirudhrb.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: e2ae38cf3d91 ("vhost: fix hung thread due to erroneous iotlb entries") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7480cc0240ebcf11a5943940ec84bd96c2e22b5f) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If the NumEntries field in the _CPC return package is less than 2, do
not attempt to access the "Revision" element of that package, because
it may not be present then.
Fixes: 337aadff8e45 ("ACPI: Introduce CPU performance controls using CPPC") BugLink: https://lore.kernel.org/lkml/20220322143534.GC32582@xsang-OptiPlex-9020/ Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 97b5593fd1b182b3fdb180b6bbe64ec09669988b) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually
inappropriate for .plt, .got, and .got.plt sections which are always
SHT_PROGBITS.
In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway
and (NOLOAD) will be essentially ignored. In ld.lld, since
https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=<value>) to
customize the output section type"), ld.lld will report a `section type
mismatch` error (later changed to a warning). Just remove (NOLOAD) to
fix the warning.
[1] https://lld.llvm.org/ELF/linker_script.html As of today, "The
section should be marked as not loadable" on
https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is
outdated for ELF.
Link: https://github.com/ClangBuiltLinux/linux/issues/1597 Fixes: ab1ef68e5401 ("RISC-V: Add sections of PLT and GOT for kernel module") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit dae2529013786873d985fabfbe34c43abec329b2) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
ida_alloc_range(..., min, max, ...) returns values from min to max,
inclusive.
So, NR_EXT_DEVT is a valid idx returned by blk_alloc_ext_minor().
This is an issue because in device_add_disk(), this value is used in:
ddev->devt = MKDEV(disk->major, disk->first_minor);
and NR_EXT_DEVT is '(1 << MINORBITS)'.
So, should 'disk->first_minor' be NR_EXT_DEVT, it would overflow.
iop32x is one of the last platforms to use IRQ 0, and this has apparently
stopped working in a 2014 cleanup without anyone noticing. This interrupt
is used for the DMA engine, so most likely this has not actually worked
in the past 7 years, but it's also not essential for using this board.
I'm splitting out this change from my GENERIC_IRQ_MULTI_HANDLER
conversion so it can be backported if anyone cares.
Fixes: a71b092a9c68 ("ARM: Convert handle_IRQ to use __handle_domain_irq") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[ardb: take +1 offset into account in mask/unmask and init as well] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 21cfddd5e0f6ef438fbae84924669ca6d6701b27) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Hulk Robot reported a KASAN report about use-after-free:
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0x13d/0x160
Read of size 8 at addr ffff888035e37d98 by task ubiattach/1385
[...]
Call Trace:
klist_dec_and_del+0xa7/0x4a0
klist_put+0xc7/0x1a0
device_del+0x4d4/0xed0
cdev_device_del+0x1a/0x80
ubi_attach_mtd_dev+0x2951/0x34b0 [ubi]
ctrl_cdev_ioctl+0x286/0x2f0 [ubi]
Allocated by task 1414:
device_add+0x60a/0x18b0
cdev_device_add+0x103/0x170
ubi_create_volume+0x1118/0x1a10 [ubi]
ubi_cdev_ioctl+0xb7f/0x1ba0 [ubi]
The lock held by ctrl_cdev_ioctl is ubi_devices_mutex, but the lock held
by ubi_cdev_ioctl is ubi->device_mutex. Therefore, the two locks can be
concurrent.
ctrl_cdev_ioctl contains two operations: ubi_attach and ubi_detach.
ubi_detach is bug-free because it uses reference counting to prevent
concurrency. However, uif_init and uif_close in ubi_attach may race with
ubi_cdev_ioctl.
uif_init will race with ubi_cdev_ioctl as in the following stack.
cpu1 cpu2 cpu3
_______________________|________________________|______________________
ctrl_cdev_ioctl
ubi_attach_mtd_dev
uif_init
ubi_cdev_ioctl
ubi_create_volume
cdev_device_add
ubi_add_volume
// sysfs exist
kill_volumes
ubi_cdev_ioctl
ubi_remove_volume
cdev_device_del
// first free
ubi_free_volume
cdev_del
// double free
cdev_device_del
And uif_close will race with ubi_cdev_ioctl as in the following stack.
cpu1 cpu2 cpu3
_______________________|________________________|______________________
ctrl_cdev_ioctl
ubi_attach_mtd_dev
uif_init
ubi_cdev_ioctl
ubi_create_volume
cdev_device_add
ubi_debugfs_init_dev
//error goto out_uif;
uif_close
kill_volumes
ubi_cdev_ioctl
ubi_remove_volume
cdev_device_del
// first free
ubi_free_volume
// double free
The cause of this problem is that commit 714fb87e8bc0 make device
"available" before it becomes accessible via sysfs. Therefore, we
roll back the modification. We will fix the race condition between
ubi device creation and udev by removing ubi_get_device in
vol_attribute_show and dev_attribute_show.This avoids accessing
uninitialized ubi_devices[ubi_num].
ubi_get_device is used to prevent devices from being deleted during
sysfs execution. However, now kernfs ensures that devices will not
be deleted before all reference counting are released.
The key process is shown in the following stack.
Fixes: 714fb87e8bc0 ("ubi: Fix race condition between ubi device creation and udev") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1a3f1cf87054833242fcd0218de0481cf855f888) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When compile-testing on 64-bit architectures, GCC complains about the
mismatch of types between the %d format specifier and value returned by
ARRAY_LENGTH(). Use %zu, which is correct everywhere.
Reported-by: kernel test robot <lkp@intel.com> Fixes: 3b588e43ee5c7 ("pinctrl: nuvoton: add NPCM7xx pinctrl and GPIO driver") Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220205155332.1308899-2-j.neuschaefer@gmx.net Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d509d41d89c58fa7b986914de2541c205a8e4824) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The name "DS" is defined in arch/x86/um/shared/sysdep/ptrace_64.h,
which results in a compiler warning when build-testing on ARCH=um.
Rename this driver's "DS" macro to DSTR so avoid this collision.
Reported-by: kernel test robot <lkp@intel.com> Fixes: 3b588e43ee5c7 ("pinctrl: nuvoton: add NPCM7xx pinctrl and GPIO driver") Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Link: https://lore.kernel.org/r/20220205155332.1308899-3-j.neuschaefer@gmx.net Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3e51c30232289b13001d24e9f127dbeb0be50461) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix build errors when BRIDGE=m and SPARX5_SWITCH=y:
riscv64-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.o: in function `.L305':
sparx5_switchdev.c:(.text+0xdb0): undefined reference to `br_vlan_enabled'
riscv64-linux-ld: drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.o: in function `.L283':
sparx5_switchdev.c:(.text+0xee0): undefined reference to `br_vlan_enabled'
Fixes: 3cfa11bac9bb ("net: sparx5: add the basic sparx5 driver") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: UNGLinuxDriver@microchip.com Cc: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220330012025.29560-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b603cbe08b0b0df82169c70c8be4fd1edcf7a446) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The bias-pull-* properties, or PIN_CONFIG_BIAS_PULL_* pin config
parameters, accept optional arguments in ohms denoting the strength of
the pin bias.
Print these values out in debugfs as well.
Fixes: eec450713e5c ("pinctrl: pinconf-generic: Add flag to print arguments") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220308100956.2750295-2-wenst@chromium.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d02ca80ec7359b014bde218e0619d83ca4d872f5) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The algorithm __cbc-aes-neonbs requires a fallback so we need
to select the config options for them or otherwise it will fail
to register on boot-up.
Fixes: 00b99ad2bac2 ("crypto: arm/aes-neonbs - Use generic cbc...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 210891d81b9ced00645ba5320bd28cf3dd4c7a9f) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Since IRQF_NO_SUSPEND used for imx mailbox driver, that means this irq
can't be used for wakeup source so that can't wakeup from freeze mode.
Add pm_system_wakeup() to wakeup from freeze mode.
Fixes: b7b2796b9b31e("mailbox: imx: ONLY IPC MU needs IRQF_NO_SUSPEND flag") Reviewed-by: Jacky Bai <ping.bai@nxp.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Robin Gong <yibin.gong@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c78d23ea7506dbf7b02c04977307331a62653380) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The rxrpc_call struct has a timer used to handle various timed events
relating to a call. This timer can get started from the packet input
routines that are run in softirq mode with just the RCU read lock held.
Unfortunately, because only the RCU read lock is held - and neither ref or
other lock is taken - the call can start getting destroyed at the same time
a packet comes in addressed to that call. This causes the timer - which
was already stopped - to get restarted. Later, the timer dispatch code may
then oops if the timer got deallocated first.
Fix this by trying to take a ref on the rxrpc_call struct and, if
successful, passing that ref along to the timer. If the timer was already
running, the ref is discarded.
The timer completion routine can then pass the ref along to the call's work
item when it queues it. If the timer or work item where already
queued/running, the extra ref is discarded.
Some function calls are not implemented in rxrpc_no_security, there are
preparse_server_key, free_preparse_server_key and destroy_server_key.
When rxrpc security type is rxrpc_no_security, user can easily trigger a
null-ptr-deref bug via ioctl. So judgment should be added to prevent it
When user delete vlan 0, as driver will not delete vlan 0 for hardware in
function hclge_set_vlan_filter_hw(), so vlan 0 in software vlan talbe should
not be deleted.
Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 88570bda6e48ab601739431c21f31e7561d1e79b) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently, the debugfs mechanism is that all functions share a
global variable to save the pointer for obtaining data. When
different functions concurrently access the same file node,
repeated release exceptions occur. Therefore, the granularity
of the pointer for storing the obtained data is adjusted to be
private for each function.
Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a0817ad3f283f40c41d0ed4ffffa99d09adf2d77) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Per fstrim(8) we must round up the minlen argument to the fs block size.
The current calculation doesn't take into account devices that have a
discard granularity and requested minlen less than 1 fs block, so the
value can get shifted away to zero in the translation to fs blocks.
The zero minlen passed to gfs2_rgrp_send_discards() then allows
sb_issue_discard() to be called with nr_sects == 0 which returns -EINVAL
and results in gfs2_rgrp_send_discards() returning -EIO.
Make sure minlen is never < 1 fs block by taking the max of the
requested minlen and the fs block size before comparing to the device's
discard granularity and shifting to fs blocks.
Fixes: 076f0faa764ab ("GFS2: Fix FITRIM argument handling") Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5c3c9bce1c99bf58efdf1c7af870240777ca64b5) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When gfs2_setattr_size() fails, it calls gfs2_rs_delete(ip, NULL) to get
rid of any reservations the inode may have. Instead, it should pass in
the inode's write count as the second parameter to allow
gfs2_rs_delete() to figure out if the inode has any writers left.
In a next step, there are two instances of gfs2_rs_delete(ip, NULL) left
where we know that there can be no other users of the inode. Replace
those with gfs2_rs_deltree(&ip->i_res) to avoid the unnecessary write
count check.
With that, gfs2_rs_delete() is only called with the inode's actual write
count, so get rid of the second parameter.
Fixes: a097dc7e24cb ("GFS2: Make rgrp reservations part of the gfs2_inode structure") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1d8195349742e62ca45b01a0030d9c3b4f3dc685) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Clang static analysis reports this issue
interface.c:810:8: warning: Passed-by-value struct
argument contains uninitialized data
now = rtc_tm_to_ktime(tm);
^~~~~~~~~~~~~~~~~~~
tm is set by a successful call to __rtc_read_time()
but its return status is not checked. Check if
it was successful before setting the enabled flag.
Move the decl of err to function scope.
Fixes: 2b2f5ff00f63 ("rtc: interface: ignore expired timers when enqueuing new timers") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20220326194236.2916310-1-trix@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 50ed32e67c5c9cd87169db605f9d207b24429cb3) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fixes: 4a6795933a89 ("kbuild: modpost: Explicitly warn about unprototyped symbols") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9f20ce95db3d57fa0cb8e6179bf570ca6af2d998) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When splitting a value entry, we may need to add the new nodes to the LRU
list and remove the parent node from the LRU list. The WARN_ON checks
in shadow_lru_isolate() catch this oversight. This bug was latent
until we stopped splitting folios in shrink_page_list() with commit 820c4e2e6f51 ("mm/vmscan: Free non-shmem folios without splitting them").
That allows the creation of large shadow entries, and subsequently when
trying to page in a small page, we will split the large shadow entry
in __filemap_add_folio().
Fixes: 8fc75643c5e1 ("XArray: add xas_split") Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7aae60df6782971606567ed4f23239390cfdf0a7) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
| mcp251xfd-core.c:1813:7: warning: The left operand
| of '&' is a garbage value
| FIELD_GET(MCP251XFD_REG_DEVID_ID_MASK, dev_id),
| ^ ~~~~~~
dev_id is set in a successful call to mcp251xfd_register_get_dev_id().
Though the status of calls made by mcp251xfd_register_get_dev_id() are
checked and handled, their status' are not returned. So return err.
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Link: https://lore.kernel.org/all/20220319153128.2164120-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit be8ebbabac944f01f6fabb026df060ca84c69d3c) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Syzbot reported warning in usb_submit_urb() which is caused by wrong
endpoint type. We should check that in endpoint is actually present to
prevent this warning.
Found pipes are now saved to struct mcba_priv and code uses them
directly instead of making pipes in place.
Fixes: 51f3baad7de9 ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer") Link: https://lore.kernel.org/all/20220313100903.10868-1-paskripkin@gmail.com Reported-and-tested-by: syzbot+3bc1dce0cc0052d60fde@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fa9c1f14002dc0d5293e16a2007bd89b6e79207b) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is no need to call dev_kfree_skb() when usb_submit_urb() fails
because can_put_echo_skb() deletes original skb and
can_free_echo_skb() deletes the cloned skb.
Fixes: 51f3baad7de9 ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer") Link: https://lore.kernel.org/all/20220311080208.45047-1-hbh25y@gmail.com Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 37f07ad24866c6c1423b37b131c9a42414bcf8a1) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If there is already an entry present that is of order >= XA_CHUNK_SHIFT
when we call xas_create_range(), xas_create_range() will misinterpret
that entry as a node and dereference xa_node->parent, generally leading
to a crash that looks something like this:
general protection fault, probably for non-canonical address 0xdffffc0000000001:
0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 32 Comm: khugepaged Not tainted 5.17.0-rc8-syzkaller-00003-g56e337f2cf13 #0
RIP: 0010:xa_parent_locked include/linux/xarray.h:1207 [inline]
RIP: 0010:xas_create_range+0x2d9/0x6e0 lib/xarray.c:725
It's deterministically reproducable once you know what the problem is,
but producing it in a live kernel requires khugepaged to hit a race.
While the problem has been present since xas_create_range() was
introduced, I'm not aware of a way to hit it before the page cache was
converted to use multi-index entries.
Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Reported-by: syzbot+0d2b0bf32ca5cfd09f2e@syzkaller.appspotmail.com Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7521a97b1929042604bef6859f62fa8b4bbc077b) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The previous commit fixed a memory leak on the send path in the event
that IPv6 is disabled at compile time, but how did a packet even arrive
there to begin with? It turns out we have previously allowed IPv6
endpoints even when IPv6 support is disabled at compile time. This is
awkward and inconsistent. Instead, let's just ignore all things IPv6,
the same way we do other malformed endpoints, in the case where IPv6 is
disabled.
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9af42a4f6d81b96b123f3ec22a4dcb906c6d00e7) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In function wg_socket_send_buffer_as_reply_to_skb() or wg_socket_send_
buffer_to_peer(), the semantics of send6() is required to free skb. But
when CONFIG_IPV6 is disable, kfree_skb() is missing. This patch adds it
to fix this bug.
Signed-off-by: Wang Hai <wanghai38@huawei.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 402991a9771587acc2947cf6c4d689c5397f2258) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We make too nuanced use of ptr_ring to entirely move to the skb_array
wrappers, but we at least should avoid the naughty function pointer cast
when cleaning up skbs. Otherwise RAP/CFI will honk at us. This patch
uses the __skb_array_destroy_skb wrapper for the cleanup, rather than
directly providing kfree_skb, which is what other drivers in the same
situation do too.
Reported-by: PaX Team <pageexec@freemail.hu> Fixes: 886fcee939ad ("wireguard: receive: use ring buffer for incoming handshakes") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6a7245252fdc63f8f651f26d6ee80aed37f0869b) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
commit 2f4c9ba23b88 ("nvme: export zoned namespaces without Zone Append
support read-only") marks zoned namespaces without append support
read-only. It does iso by setting NVME_NS_FORCE_RO in ns->flags in
nvme_update_zone_info and checking for that flag later in
nvme_update_disk_info to mark the disk as read-only.
But commit 73d90386b559 ("nvme: cleanup zone information initialization")
rearranged nvme_update_disk_info to be called before
nvme_update_zone_info and thus not marking the disk as read-only.
The call order cannot be just reverted because nvme_update_zone_info sets
certain queue parameters such as zone_write_granularity that depend on the
prior call to nvme_update_disk_info.
Remove the call to set_disk_ro in nvme_update_disk_info. and call
set_disk_ro after nvme_update_zone_info and nvme_update_disk_info to set
the permission for ZNS drives correctly. The same applies to the
multipath disk path.
Fixes: 73d90386b559 ("nvme: cleanup zone information initialization") Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c98f792a1468159d103db7640dddf335eee713ed) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
A NVMe subsystem with multiple controller can have private namespaces
that use the same NSID under some conditions:
"If Namespace Management, ANA Reporting, or NVM Sets are supported, the
NSIDs shall be unique within the NVM subsystem. If the Namespace
Management, ANA Reporting, and NVM Sets are not supported, then NSIDs:
a) for shared namespace shall be unique; and
b) for private namespace are not required to be unique."
Make sure this specific setup is supported in Linux.
Fixes: 9ad1927a3bc2 ("nvme: always search for namespace head") Signed-off-by: Sungup Moon <sungup.moon@samsung.com>
[hch: refactored and fixed the controller vs subsystem based naming
conflict] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7d18d6c71372bde4d62eceb819de631cebd44367) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When renaming the whiteout file, the old whiteout file is not deleted.
Therefore, we add the old dentry size to the old dir like XFS.
Otherwise, an error may be reported due to `fscki->calc_sz != fscki->size`
in check_indes.
Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Reported-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 578bf41d9443bc662bc0d15a479025271d20983c) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
MM defined the rule [1] very clearly that once page was set with PG_private
flag, we should increment the refcount in that page, also main flows like
pageout(), migrate_page() will assume there is one additional page
reference count if page_has_private() returns true. Otherwise, we may
get a BUG in page migration:
Before the time, we should make clean a concept, what does refcount means
in page gotten from grab_cache_page_write_begin(). There are 2 situations:
Situation 1: refcount is 3, page is created by __page_cache_alloc.
TYPE_A - the write process is using this page
TYPE_B - page is assigned to one certain mapping by calling
__add_to_page_cache_locked()
TYPE_C - page is added into pagevec list corresponding current cpu by
calling lru_cache_add()
Situation 2: refcount is 2, page is gotten from the mapping's tree
TYPE_B - page has been assigned to one certain mapping
TYPE_A - the write process is using this page (by calling
page_cache_get_speculative())
Filesystem releases one refcount by calling put_page() in xxx_write_end(),
the released refcount corresponds to TYPE_A (write task is using it). If
there are any processes using a page, page migration process will skip the
page by judging whether expected_page_refs() equals to page refcount.
The BUG is caused by following process:
PA(cpu 0) kcompactd(cpu 1)
compact_zone
ubifs_write_begin
page_a = grab_cache_page_write_begin
add_to_page_cache_lru
lru_cache_add
pagevec_add // put page into cpu 0's pagevec
(refcnf = 3, for page creation process)
ubifs_write_end
SetPagePrivate(page_a) // doesn't increase page count !
unlock_page(page_a)
put_page(page_a) // refcnt = 2
[...]
UBIFS doesn't increase the page refcount after setting private flag, which
leads to page migration task believes the page is not used by any other
processes, so the page is migrated. This causes concurrent accessing on
page refcount between put_page() called by other process(eg. read process
calls lru_cache_add) and page_ref_unfreeze() called by migration task.
Actually zhangjun has tried to fix this problem [2] by recalculating page
refcnt in ubifs_migrate_page(). It's better to follow MM rules [1], because
just like Kirill suggested in [2], we need to check all users of
page_has_private() helper. Like f2fs does in [3], fix it by adding/deleting
refcount when setting/clearing private for a page. BTW, according to [4],
we set 'page->private' as 1 because ubifs just simply SetPagePrivate().
And, [5] provided a common helper to set/clear page private, ubifs can
use this helper following the example of iomap, afs, btrfs, etc.
Function ubifs_wbuf_write_nolock() may access buf out of bounds in
following process:
ubifs_wbuf_write_nolock():
aligned_len = ALIGN(len, 8); // Assume len = 4089, aligned_len = 4096
if (aligned_len <= wbuf->avail) ... // Not satisfy
if (wbuf->used) {
ubifs_leb_write() // Fill some data in avail wbuf
len -= wbuf->avail; // len is still not 8-bytes aligned
aligned_len -= wbuf->avail;
}
n = aligned_len >> c->max_write_shift;
if (n) {
n <<= c->max_write_shift;
err = ubifs_leb_write(c, wbuf->lnum, buf + written,
wbuf->offs, n);
// n > len, read out of bounds less than 8(n-len) bytes
}
, which can be catched by KASAN:
=========================================================
BUG: KASAN: slab-out-of-bounds in ecc_sw_hamming_calculate+0x1dc/0x7d0
Read of size 4 at addr ffff888105594ff8 by task kworker/u8:4/128
Workqueue: writeback wb_workfn (flush-ubifs_0_0)
Call Trace:
kasan_report.cold+0x81/0x165
nand_write_page_swecc+0xa9/0x160
ubifs_leb_write+0xf2/0x1b0 [ubifs]
ubifs_wbuf_write_nolock+0x421/0x12c0 [ubifs]
write_head+0xdc/0x1c0 [ubifs]
ubifs_jnl_write_inode+0x627/0x960 [ubifs]
wb_workfn+0x8af/0xb80
Function ubifs_wbuf_write_nolock() accepts that parameter 'len' is not 8
bytes aligned, the 'len' represents the true length of buf (which is
allocated in 'ubifs_jnl_xxx', eg. ubifs_jnl_write_inode), so
ubifs_wbuf_write_nolock() must handle the length read from 'buf' carefully
to write leb safely.
Fetch a reproducer in [Link].
Fixes: 1e51764a3c2ac0 ("UBIFS: add new flash file system") Link: https://bugzilla.kernel.org/show_bug.cgi?id=214785 Reported-by: Chengsong Ke <kechengsong@huawei.com> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a7054aaf1909cf40489c0ec1b728fdcf79c751a6) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Make 'ui->data_len' aligned with 8 bytes before it is assigned to
dirtied_ino_d. Since 8871d84c8f8b0c6b("ubifs: convert to fileattr")
applied, 'setflags()' only affects regular files and directories, only
xattr inode, symlink inode and special inode(pipe/char_dev/block_dev)
have none- zero 'ui->data_len' field, so assertion
'!(req->dirtied_ino_d & 7)' cannot fail in ubifs_budget_space().
To avoid assertion fails in future evolution(eg. setflags can operate
special inodes), it's better to make dirtied_ino_d 8 bytes aligned,
after all aligned size is still zero for regular files.
Fixes: 1e51764a3c2ac05a ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 56cf8b26b18ecc7768a9918a98e4d7b279253675) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
UBIFS should make sure the flash has enough space to store dirty (Data
that is newer than disk) data (in memory), space budget is exactly
designed to do that. If space budget calculates less data than we need,
'make_reservation()' will do more work(return -ENOSPC if no free space
lelf, sometimes we can see "cannot reserve xxx bytes in jhead xxx, error
-28" in ubifs error messages) with ubifs inodes locked, which may effect
other syscalls.
A simple way to decide how much space do we need when make a budget:
See how much space is needed by 'make_reservation()' in ubifs_jnl_xxx()
function according to corresponding operation.
It's better to report ENOSPC in ubifs_budget_space(), as early as we can.
Fixes: 474b93704f32163 ("ubifs: Implement O_TMPFILE") Fixes: 1e51764a3c2ac05 ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 489c3a2577b3f1bf7e41279d12a44c71e8527427) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
'ui->dirty' is not protected by 'ui_mutex' in function do_tmpfile() which
may race with ubifs_write_inode[wb_workfn] to access/update 'ui->dirty',
finally dirty space is released twice.
open(O_TMPFILE) wb_workfn
do_tmpfile
ubifs_budget_space(ino_req = { .dirtied_ino = 1})
d_tmpfile // mark inode(tmpfile) dirty
ubifs_jnl_update // without holding tmpfile's ui_mutex
mark_inode_clean(ui)
if (ui->dirty)
ubifs_release_dirty_inode_budget(ui) // release first time
ubifs_write_inode
mutex_lock(&ui->ui_mutex)
ubifs_release_dirty_inode_budget(ui)
// release second time
mutex_unlock(&ui->ui_mutex)
ui->dirty = 0
Run generic/476 can reproduce following message easily
(See reproducer in [Link]):
Fix it by holding tmpfile ubifs inode lock during ubifs_jnl_update().
Similar problem exists in whiteout renaming, but previous fix("ubifs:
Rename whiteout atomically") has solved the problem.
Fixes: 474b93704f32163 ("ubifs: Implement O_TMPFILE") Link: https://bugzilla.kernel.org/show_bug.cgi?id=214765 Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a9662bec5a4dfc87b1503e26aec1aadfdf521658) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently, rename whiteout has 3 steps:
1. create tmpfile(which associates old dentry to tmpfile inode) for
whiteout, and store tmpfile to disk
2. link whiteout, associate whiteout inode to old dentry agagin and
store old dentry, old inode, new dentry on disk
3. writeback dirty whiteout inode to disk
Suddenly power-cut or error occurring(eg. ENOSPC returned by budget,
memory allocation failure) during above steps may cause kinds of problems:
Problem 1: ENOSPC returned by whiteout space budget (before step 2),
old dentry will disappear after rename syscall, whiteout file
cannot be found either.
ls dir // we get file, whiteout
rename(dir/file, dir/whiteout, REANME_WHITEOUT)
ENOSPC = ubifs_budget_space(&wht_req) // return
ls dir // empty (no file, no whiteout)
Problem 2: Power-cut happens before step 3, whiteout inode with 'nlink=1'
is not stored on disk, whiteout dentry(old dentry) is written
on disk, whiteout file is lost on next mount (We get "dead
directory entry" after executing 'ls -l' on whiteout file).
Now, we use following 3 steps to finish rename whiteout:
1. create an in-mem inode with 'nlink = 1' as whiteout
2. ubifs_jnl_rename (Write on disk to finish associating old dentry to
whiteout inode, associating new dentry with old inode)
3. iput(whiteout)
Rely writing in-mem inode on disk by ubifs_jnl_rename() to finish rename
whiteout, which avoids middle disk state caused by suddenly power-cut
and error occurring.
Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c67bc98d1f0853bb196e9c48eab38b6f2ddab795) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
whiteout inode should be put when do_tmpfile() failed if inode has been
initialized. Otherwise we will get following warning during umount:
UBIFS error (ubi0:0 pid 1494): ubifs_assert_failed [ubifs]: UBIFS
assert failed: c->bi.dd_growth == 0, in fs/ubifs/super.c:1930
VFS: Busy inodes after unmount of ubifs. Self-destruct in 5 seconds.
Fixes: 9e0a1fff8db56ea ("ubifs: Implement RENAME_WHITEOUT") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ff846f2c5d1de96f0af036eb9e3a7ecfeb1f1cd5) Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>