A kernel exception was hit when trying to dump /proc/lockdep_chains after
lockdep report "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!":
Unable to handle kernel paging request at virtual address 00054005450e05c3
... 00054005450e05c3] address between user and kernel address ranges
...
pc : [0xffffffece769b3a8] string+0x50/0x10c
lr : [0xffffffece769ac88] vsnprintf+0x468/0x69c
...
Call trace:
string+0x50/0x10c
vsnprintf+0x468/0x69c
seq_printf+0x8c/0xd8
print_name+0x64/0xf4
lc_show+0xb8/0x128
seq_read_iter+0x3cc/0x5fc
proc_reg_read_iter+0xdc/0x1d4
The cause of the problem is the function lock_chain_get_class() will
shift lock_classes index by 1, but the index don't need to be shifted
anymore since commit 01bb6f0af992 ("locking/lockdep: Change the range
of class_idx in held_lock struct") already change the index to start
from 0.
The lock_classes[-1] located at chain_hlocks array. When printing
lock_classes[-1] after the chain_hlocks entries are modified, the
exception happened.
The output of lockdep_chains are incorrect due to this problem too.
Fixes: f611e8cf98ec ("lockdep: Take read/write status in consideration when generate chainkey") Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20220210105011.21712-1-cheng-jui.wang@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
DSL and CM (Cable Modem) support 8 B max transfer size and have a custom
DT binding for that reason. This driver was checking for a wrong
"compatible" however which resulted in an incorrect setup.
Fixes: e2e5a2c61837 ("i2c: brcmstb: Adding support for CM and DSL SoCs") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The COMMS package can enable the hardware parser to recognize IPSEC
frames with ESP header and SPI identifier. If this package is available
and configured for loading in /lib/firmware, then the driver will
succeed in enabling this protocol type for RSS.
This in turn allows the hardware to hash over the SPI and use it to pick
a consistent receive queue for the same secure flow. Without this all
traffic is steered to the same queue for multiple traffic threads from
the same IP address. For that reason this is marked as a fix, as the
driver supports the model, but it wasn't enabled.
If the package is not available, adding this type will fail, but the
failure is ignored on purpose as it has no negative affect.
Fixes: c90ed40cefe1 ("ice: Enable writing hardware filtering tables") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This fixes a deadlock added with commit b40f3894e39e ("scsi: qedi: Complete
TMF works before disconnect")
Bug description from Jia-Ju Bai:
qedi_process_tmf_resp()
spin_lock(&session->back_lock); --> Line 201 (Lock A)
spin_lock(&qedi_conn->tmf_work_lock); --> Line 230 (Lock B)
qedi_process_cmd_cleanup_resp()
spin_lock_bh(&qedi_conn->tmf_work_lock); --> Line 752 (Lock B)
spin_lock_bh(&conn->session->back_lock); --> Line 784 (Lock A)
When qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp() are
concurrently executed, the deadlock can occur.
This patch fixes the deadlock by not holding the tmf_work_lock in
qedi_process_cmd_cleanup_resp while holding the back_lock. The
tmf_work_lock is only needed while we remove the tmf_work from the
work_list.
Link: https://lore.kernel.org/r/20220208185448.6206-1-michael.christie@oracle.com Fixes: b40f3894e39e ("scsi: qedi: Complete TMF works before disconnect") Cc: Manish Rangankar <mrangankar@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
One way to solve this problem is to move the fd_install() call out of
the sighand->siglock critical section.
Before commit 6fd2fe494b17 ("copy_process(): don't use ksys_close()
on cleanups"), the pidfd installation was done without holding both
the task_list lock and the sighand->siglock. Obviously, holding these
two locks are not really needed to protect the fd_install() call.
So move the fd_install() call down to after the releases of both locks.
Link: https://lore.kernel.org/r/20220208163912.1084752-1-longman@redhat.com Fixes: 6fd2fe494b17 ("copy_process(): don't use ksys_close() on cleanups") Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In order to free resources correctly in the error handling path of
pt_core_init(), 2 goto's have to be switched. Otherwise, some resources
will leak and we will try to release things that have not been allocated
yet.
Also move a dev_err() to a place where it is more meaningful.
Fixes: fa5d823b16a9 ("dmaengine: ptdma: Initial driver for the AMD PTDMA") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Sanjay R Mehta <sanju.mehta@amd.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/41a963a35173f89c874f5c44df5530dc09fea8da.1644044244.git.christophe.jaillet@wanadoo.fr Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is a minor chance for a race, if a pointer to an i2c-bus subnode
is stored and then reused after releasing its reference, and it would
be sufficient to get one more reference under a loop over children
subnodes.
Fixes: e517526195de ("i2c: Add Qualcomm CCI I2C driver") Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Reviewed-by: Robert Foss <robert.foss@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The test treated zero as a successful run when it really should treat
non-zero as a successful run. A mount's idmapping can't change once it
has been attached to the filesystem.
Link: https://lore.kernel.org/r/20220203131411.3093040-2-brauner@kernel.org Fixes: 01eadc8dd96d ("tests: add mount_setattr() selftests") Cc: Seth Forshee <seth.forshee@digitalocean.com> Cc: Christoph Hellwig <hch@lst.de> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Because of the possible failure of the dma_supported(), the
dma_set_mask_and_coherent() may return error num.
Therefore, it should be better to check it and return the error if
fails.
During set*id() which cred->ucounts to charge the the current process
to is not known until after set_cred_ucounts. So move the
RLIMIT_NPROC checking into a new helper flag_nproc_exceeded and call
flag_nproc_exceeded after set_cred_ucounts.
This is very much an arbitrary subset of the places where we currently
change the RLIMIT_NPROC accounting, designed to preserve the existing
logic.
Fixing the existing logic will be the subject of another series of
changes.
Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20220216155832.680775-4-ebiederm@xmission.com Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Solar Designer <solar@openwall.com> wrote:
> I'm not aware of anyone actually running into this issue and reporting
> it. The systems that I personally know use suexec along with rlimits
> still run older/distro kernels, so would not yet be affected.
>
> So my mention was based on my understanding of how suexec works, and
> code review. Specifically, Apache httpd has the setting RLimitNPROC,
> which makes it set RLIMIT_NPROC:
>
> https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc
>
> The above documentation for it includes:
>
> "This applies to processes forked from Apache httpd children servicing
> requests, not the Apache httpd children themselves. This includes CGI
> scripts and SSI exec commands, but not any processes forked from the
> Apache httpd parent, such as piped logs."
>
> In code, there are:
>
> ./modules/generators/mod_cgid.c: ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
> ./modules/generators/mod_cgi.c: ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
> ./modules/filters/mod_ext_filter.c: rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc);
>
> For example, in mod_cgi.c this is in run_cgi_child().
>
> I think this means an httpd child sets RLIMIT_NPROC shortly before it
> execs suexec, which is a SUID root program. suexec then switches to the
> target user and execs the CGI script.
>
> Before 2863643fb8b9, the setuid() in suexec would set the flag, and the
> target user's process count would be checked against RLIMIT_NPROC on
> execve(). After 2863643fb8b9, the setuid() in suexec wouldn't set the
> flag because setuid() is (naturally) called when the process is still
> running as root (thus, has those limits bypass capabilities), and
> accordingly execve() would not check the target user's process count
> against RLIMIT_NPROC.
In commit 2863643fb8b9 ("set_user: add capability check when
rlimit(RLIMIT_NPROC) exceeds") capable calls were added to set_user to
make it more consistent with fork. Unfortunately because of call site
differences those capable calls were checking the credentials of the
user before set*id() instead of after set*id().
This breaks enforcement of RLIMIT_NPROC for applications that set the
rlimit and then call set*id() while holding a full set of
capabilities. The capabilities are only changed in the new credential
in security_task_fix_setuid().
The code in apache suexec appears to follow this pattern.
Commit 909cc4ae86f3 ("[PATCH] Fix two bugs with process limits
(RLIMIT_NPROC)") where this check was added describes the targes of this
capability check as:
2/ When a root-owned process (e.g. cgiwrap) sets up process limits and then
calls setuid, the setuid should fail if the user would then be running
more than rlim_cur[RLIMIT_NPROC] processes, but it doesn't. This patch
adds an appropriate test. With this patch, and per-user process limit
imposed in cgiwrap really works.
So the original use case of this check also appears to match the broken
pattern.
Restore the enforcement of RLIMIT_NPROC by removing the bad capable
checks added in set_user. This unfortunately restores the
inconsistent state the code has been in for the last 11 years, but
dealing with the inconsistencies looks like a larger problem.
The functions copy_page_to_iter_pipe() and push_pipe() can both
allocate a new pipe_buffer, but the "flags" member initializer is
missing.
Fixes: 241699cd72a8 ("new iov_iter flavour: pipe-backed")
To: Alexander Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
To: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
> It was reported that v5.14 behaves differently when enforcing
> RLIMIT_NPROC limit, namely, it allows one more task than previously.
> This is consequence of the commit 21d1c5e386bc ("Reimplement
> RLIMIT_NPROC on top of ucounts") that missed the sharpness of
> equality in the forking path.
This can be fixed either by fixing the test or by moving the increment
to be before the test. Fix it my moving copy_creds which contains
the increment before is_ucounts_overlimit.
In the case of CLONE_NEWUSER the ucounts in the task_cred changes.
The function is_ucounts_overlimit needs to use the final version of
the ucounts for the new process. Which means moving the
is_ucounts_overlimit test after copy_creds is necessary.
Both the test in fork and the test in set_user were semantically
changed when the code moved to ucounts. The change of the test in
fork was bad because it was before the increment. The test in
set_user was wrong and the change to ucounts fixed it. So this
fix only restores the old behavior in one lcation not two.
Michal Koutný <mkoutny@suse.com> wrote:
> Tasks are associated to multiple users at once. Historically and as per
> setrlimit(2) RLIMIT_NPROC is enforce based on real user ID.
>
> The commit 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts")
> made the accounting structure "indexed" by euid and hence potentially
> account tasks differently.
>
> The effective user ID may be different e.g. for setuid programs but
> those are exec'd into already existing task (i.e. below limit), so
> different accounting is moot.
>
> Some special setresuid(2) users may notice the difference, justifying
> this fix.
I looked at cred->ucount and it is only used for rlimit operations
that were previously stored in cred->user. Making the fact
cred->ucount can refer to a different user from cred->user a bug,
affecting all uses of cred->ulimit not just RLIMIT_NPROC.
Fix set_cred_ucounts to always use the real uid not the effective uid.
Further simplify set_cred_ucounts by noticing that set_cred_ucounts
somehow retained a draft version of the check to see if alloc_ucounts
was needed that checks the new->user and new->user_ns against the
current_real_cred(). Remove that draft version of the check.
All that matters for setting the cred->ucounts are the user_ns and uid
fields in the cred.
Any cred that is destined for use by commit_creds must have a non-NULL
cred->ucounts field. Only curing credential construction is a NULL
cred->ucounts valid. Only abort_creds, put_cred, and put_cred_rcu
needs to deal with a cred with a NULL ucount. As set_cred_ucounts is
non of those case don't confuse people by handling something that can
not happen.
Link: https://lkml.kernel.org/r/871r4irzds.fsf_-_@disp2133 Tested-by: Yu Zhao <yuzhao@google.com> Reviewed-by: Alexey Gladkov <legion@kernel.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
While examining is_ucounts_overlimit and reading the various messages
I realized that is_ucounts_overlimit fails to deal with counts that
may have wrapped.
Being wrapped should be a transitory state for counts and they should
never be wrapped for long, but it can happen so handle it.
Cc: stable@vger.kernel.org Fixes: 21d1c5e386bc ("Reimplement RLIMIT_NPROC on top of ucounts") Link: https://lkml.kernel.org/r/20220216155832.680775-5-ebiederm@xmission.com Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Do alignment logic properly and use the "ptr" local variable for
calculating the remainder of the alignment.
This became an issue because struct edac_mc_layer has a size that is not
zero modulo eight, and the next offset that was prepared for the private
data was unaligned, causing an alignment exception.
The patch in Fixes: which broke this actually wanted to "what we
actually care about is the alignment of the actual pointer that's about
to be returned." But it didn't check that alignment.
Use the correct variable "ptr" for that.
[ bp: Massage commit message. ]
Fixes: 8447c4d15e35 ("edac: Do alignment logic properly in edac_align_ptr()") Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220113100622.12783-2-farbere@amazon.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When connected point to point, the driver does not know the FC4's supported
by the other end. In Fabrics, it can query the nameserver. Thus the driver
must send PRLIs for the FC4s it supports and enable support based on the
acc(ept) or rej(ect) of the respective FC4 PRLI. Currently the driver
supports SCSI and NVMe PRLIs.
Unfortunately, although the behavior is per standard, many devices have
come to expect only SCSI PRLIs. In this particular example, the NVMe PRLI
is properly RJT'd but the target decided that it must LOGO after seeing the
unexpected NVMe PRLI. The LOGO causes the sequence to restart and login is
now in an infinite failure loop.
Fix the problem by having the driver, on a pt2pt link, remember NVMe PRLI
accept or reject status across logout as long as the link stays "up". When
retrying login, if the prior NVMe PRLI was rejected, it will not be sent on
the next login.
Link: https://lore.kernel.org/r/20220212163120.15385-1-jsmart2021@gmail.com Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When the KCONFIG_AUTOCONFIG is specified (e.g. export \
KCONFIG_AUTOCONFIG=output/config/auto.conf), the directory of
include/config/ will not be created, so kconfig can't create deps
files in it and auto.conf can't be generated.
Signed-off-by: Jing Leng <jleng@ambarella.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Single page and coherent memory blocks can use different DMA masks
when the macb accesses physical memory directly. The kernel is clever
enough to allocate pages that fit into the requested address width.
When using the ARM SMMU, the DMA mask must be the same for single
pages and big coherent memory blocks. Otherwise the translation
tables turn into one big mess.
[ 74.959909] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
[ 74.959989] arm-smmu fd800000.smmu: Unhandled context fault: fsr=0x402, iova=0x3165687460, fsynr=0x20001, cbfrsynra=0x877, cb=1
[ 75.173939] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
[ 75.173955] arm-smmu fd800000.smmu: Unhandled context fault: fsr=0x402, iova=0x3165687460, fsynr=0x20001, cbfrsynra=0x877, cb=1
Since using the same DMA mask does not hurt direct 1:1 physical
memory mappings, this commit always aligns DMA and coherent masks.
Signed-off-by: Marc St-Amand <mstamand@ciena.com> Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Dell DW5829e same as DW5821e except the CAT level.
DW5821e supports CAT16 but DW5829e supports CAT9.
Also, DW5829e includes normal and eSIM type.
Please see below test evidence:
[Why]
pflip interrupt order are mapped 1 to 1 to otg id.
e.g. if irq_src=26 corresponds to otg0 then 27->otg1, 28->otg2...
Linux DM registers pflip interrupts per number of crtcs.
In fused pipe case crtc numbers can be less than otg id.
e.g. if one pipe out of 3(otg#0-2) is fused adev->mode_info.num_crtc=2
so DM only registers irq_src 26,27.
This is a bug since if pipe#2 remains unfused DM never gets
otg2 pflip interrupt (irq_src=28)
That may results in gfx failure due to pflip timeout.
[How]
Register pflip interrupts per max num of otg instead of num_crtc
Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
A number of BIOS versions have a problem with the watermarks table not
being configured properly. This manifests as a very scary looking warning
during resume from s0i3. This should be harmless in most cases and is well
understood, so decrease the assertion to a clearer warning about the problem.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kernel parameter "tp_printk_stop_on_boot" starts with "tp_printk" which is
the same as another kernel parameter "tp_printk". If "tp_printk" setup is
called before the "tp_printk_stop_on_boot", it will override the latter
and keep it from being set.
This is similar to other kernel parameter issues, such as:
Commit 745a600cf1a6 ("um: console: Ignore console= option")
or init/do_mounts.c:45 (setup function of "ro" kernel param)
Fix it by checking for a "_" right after the "tp_printk" and if that
exists do not process the parameter.
Link: https://lkml.kernel.org/r/20220208195421.969326-1-jsyoo5b@gmail.com Signed-off-by: JaeSang Yoo <jsyoo5b@gmail.com>
[ Fixed up change log and added space after if condition ] Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The driver returns an error when devm_phy_optional_get() fails leaving
the previously enabled clock turned on. Change order and enable the
clock only after the phy has been acquired.
If there are failures then we must not leave the non-NULL pointers with
the error value, otherwise `rpcrdma_ep_destroy` gets confused and tries
free them, resulting in an Oops.
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add a checking code when it gets -EPROBE_DEFER while getting a clock
resource. In this case, it doesn't need to print out an error message
because the probing will be re-visited.
This device is a CF card, or possibly an SSD in CF form factor.
It supports NCQ and high speed DMA.
While it also advertises TRIM support, I/O errors are reported
when the discard mount option fstrim is used. TRIM also fails
when disabling NCQ and not just as an NCQ command.
TRIM must be disabled for this device.
Signed-off-by: Zoltán Böszörményi <zboszor@gmail.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The 'shell' built-in only returns the first 256 bytes of the command's
output. In some cases, 'shell' is used to return a path; by bumping up
the buffer size to 4096 this lets us capture up to PATH_MAX.
The specific case where I ran into this was due to commit 1e860048c53e
("gcc-plugins: simplify GCC plugin-dev capability test"). After this
change, we now use `$(shell,$(CC) -print-file-name=plugin)` to return
a path; if the gcc path is particularly long, then the path ends up
truncated at the 256 byte mark, which makes the HAVE_GCC_PLUGINS
depends test always fail.
Signed-off-by: Brenda Streiff <brenda.streiff@ni.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On an overcommitted system which is running multiple workloads of
varying priorities, it is preferred to trigger an oom-killer to kill a
low priority workload than to let the high priority workload receiving
ENOMEMs. On our memory overcommitted systems, we are seeing a lot of
ENOMEMs instead of oom-kills because io_uring_setup callchain is using
__GFP_NORETRY gfp flag which avoids the oom-killer. Let's remove it and
allow the oom-killer to kill a lower priority job.
These are some trivial fixups, which were needed to build the tests with
clang and -Werror. The following issues are fixed:
- Remove various unused variables.
- In child_poll_leader_exit_test, clang isn't smart enough to realize
syscall(SYS_exit, 0) won't return, so it complains we never return
from a non-void function. Add an extra exit(0) to appease it.
- In test_pidfd_poll_leader_exit, ret may be branched on despite being
uninitialized, if we have !use_waitpid. Initialize it to zero to get
the right behavior in that case.
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When running the pidfd_fdinfo_test on arm64, it fails for me. After some
digging, the reason is that the child exits due to SIGBUS, because it
overflows the 1024 byte stack we've reserved for it.
To fix the issue, increase the stack size to 8192 bytes (this number is
somewhat arbitrary, and was arrived at through experimentation -- I kept
doubling until the failure no longer occurred).
Also, let's make the issue easier to debug. wait_for_pid() returns an
ambiguous value: it may return -1 in all of these cases:
1. waitpid() itself returned -1
2. waitpid() returned success, but we found !WIFEXITED(status).
3. The child process exited, but it did so with a -1 exit code.
There's no way for the caller to tell the difference. So, at least log
which occurred, so the test runner can debug things.
While debugging this, I found that we had !WIFEXITED(), because the
child exited due to a signal. This seems like a reasonably common case,
so also print out whether or not we have WIFSIGNALED(), and the
associated WTERMSIG() (if any). This lets us see the SIGBUS I'm fixing
clearly when it occurs.
Finally, I'm suspicious of allocating the child's stack on our stack.
man clone(2) suggests that the correct way to do this is with mmap(),
and in particular by setting MAP_STACK. So, switch to doing it that way
instead.
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The BL32/TEE reserved-memory region is now inherited from the common
family dtsi (meson-g12-common) so we can drop it from board files.
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20220126044954.19069-4-christianshewitt@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add an additional reserved memory region for the BL32 trusted firmware
present in many devices that boot from Amlogic vendor u-boot.
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20220126044954.19069-3-christianshewitt@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add an additional reserved memory region for the BL32 trusted firmware
present in many devices that boot from Amlogic vendor u-boot.
Suggested-by: Mateusz Krzak <kszaquitto@gmail.com> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://lore.kernel.org/r/20220126044954.19069-2-christianshewitt@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When checking smb2 query directory packets from other servers,
OutputBufferLength is different with ksmbd. Other servers add an unaligned
next offset to OutputBufferLength for the last entry.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
ksmbd sets the inode number to UniqueId. However, the same UniqueId for
dot and dotdot entry is set to the inode number of the parent inode.
This patch set them using the current inode and parent inode.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Vivek Thrivikraman reported:
An SCTP server application which is accessed continuously by client
application.
When the session disconnects the client retries to establish a connection.
After restart of SCTP server application the session is not established
because of stale conntrack entry with connection state CLOSED as below.
(removing this entry manually established new connection):
sctp 9 CLOSED src=10.141.189.233 [..] [ASSURED]
Just skip timeout update of closed entries, we don't want them to
stay around forever.
Reported-and-tested-by: Vivek Thrivikraman <vivek.thrivikraman@est.tech> Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1579 Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
These pair of statements are used to trigger an exception, but then help
objtool understand that for warnings, control flow will be restored
immediately afterwards.
The problem is that volatile is not a compiler barrier. GCC explicitly
documents this:
> Note that the compiler can move even volatile asm instructions
> relative to other code, including across jump instructions.
Also, no clobbers are specified to prevent instructions from subsequent
statements from being scheduled by compiler before the second asm
statement. This can lead to instructions from subsequent statements
being emitted by the compiler before the second asm statement.
Providing a scheduling model such as via -march= options enables the
compiler to better schedule instructions with known latencies to hide
latencies from data hazards compared to inline asm statements in which
latencies are not estimated.
If an instruction gets scheduled by the compiler between the two asm
statements, then objtool will think that it is not reachable, producing
a warning.
To prevent instructions from being scheduled in between the two asm
statements, merge them.
Also remove an unnecessary unreachable() asm annotation from BUG() in
favor of __builtin_unreachable(). objtool is able to track that the ud2
from BUG() terminates control flow within the function.
The thead,c900-plic has been used in opensbi to distinguish
PLIC [1]. Although PLICs have the same behaviors in Linux,
they are different hardware with some custom initializing in
firmware(opensbi).
Qute opensbi patch commit-msg by Samuel:
The T-HEAD PLIC implementation requires setting a delegation bit
to allow access from S-mode. Now that the T-HEAD PLIC has its own
compatible string, set this bit automatically from the PLIC driver,
instead of reaching into the PLIC's MMIO space from another driver.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Anup Patel <anup@brainfault.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Samuel Holland <samuel@sholland.org> Cc: Thomas Gleixner <tglx@linutronix.de> Tested-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220130135634.1213301-3-guoren@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In service_callback path RCU dereferenced pointer struct vchiq_service
need to be accessed inside rcu read-critical section.
Also userdata/user_service part of vchiq_service is accessed around
different synchronization mechanism, getting an extra reference to a
pointer keeps sematics simpler and avoids prolonged graceperiod.
Accessing vchiq_service with rcu_read_[lock/unlock] fixes below issue.
[ 32.201659] =============================
[ 32.201664] WARNING: suspicious RCU usage
[ 32.201670] 5.15.11-rt24-v8+ #3 Not tainted
[ 32.201680] -----------------------------
[ 32.201685] drivers/staging/vc04_services/interface/vchiq_arm/vchiq_core.h:529 suspicious rcu_dereference_check() usage!
[ 32.201695]
[ 32.201695] other info that might help us debug this:
[ 32.201695]
[ 32.201700]
[ 32.201700] rcu_scheduler_active = 2, debug_locks = 1
[ 32.201708] no locks held by vchiq-slot/0/98.
[ 32.201715]
[ 32.201715] stack backtrace:
[ 32.201723] CPU: 1 PID: 98 Comm: vchiq-slot/0 Not tainted 5.15.11-rt24-v8+ #3
[ 32.201733] Hardware name: Raspberry Pi 4 Model B Rev 1.4 (DT)
[ 32.201739] Call trace:
[ 32.201742] dump_backtrace+0x0/0x1b8
[ 32.201772] show_stack+0x20/0x30
[ 32.201784] dump_stack_lvl+0x8c/0xb8
[ 32.201799] dump_stack+0x18/0x34
[ 32.201808] lockdep_rcu_suspicious+0xe4/0xf8
[ 32.201817] service_callback+0x124/0x400
[ 32.201830] slot_handler_func+0xf60/0x1e20
[ 32.201839] kthread+0x19c/0x1a8
[ 32.201849] ret_from_fork+0x10/0x20
Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Padmanabha Srinivasaiah <treasure4paddy@gmail.com> Link: https://lore.kernel.org/r/20211231195406.5479-1-treasure4paddy@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The PHY client driver does a phy_exit() call on suspend or rmmod and
the PHY driver needs to know the difference because some clocks need
to be kept running for suspend but can be shutdown on unbind/rmmod
(or if there are no PHY clients at all).
The fix is to use a PM notifier so the driver can tell if a PHY
client is calling exit() because of a system suspend or a driver
unbind/rmmod.
Signed-off-by: Al Cooper <alcooperx@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20211201180653.35097-2-alcooperx@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This was found by coccicheck:
./arch/arm/mach-omap2/display.c, 272, 1-7, ERROR missing put_device;
call of_find_device_by_node on line 258, but without a corresponding
object release within this function.
Move the put_device() call before the if judgment.
Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix following coccicheck warning:
./arch/arm/mach-omap2/omap_hwmod.c:753:1-23: WARNING: Function
for_each_matching_node should have of_node_put() before break
Early exits from for_each_matching_node should decrement the
node reference counter.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
AMD's event select is 3 nybbles, with the high nybble in bits 35:32 of
a PerfEvtSeln MSR. Don't mask off the high nybble when configuring a
RAW perf event.
Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220203014813.2130559-2-jmattson@google.com> Reviewed-by: David Dunn <daviddunn@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
AMD's event select is 3 nybbles, with the high nybble in bits 35:32 of
a PerfEvtSeln MSR. Don't drop the high nybble when setting up the
config field of a perf_event_attr structure for a call to
perf_event_create_kernel_counter().
Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220203014813.2130559-1-jmattson@google.com> Reviewed-by: David Dunn <daviddunn@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The find_arch_event() returns a "unsigned int" value,
which is used by the pmc_reprogram_counter() to
program a PERF_TYPE_HARDWARE type perf_event.
The returned value is actually the kernel defined generic
perf_hw_id, let's rename it to pmc_perf_hw_id() with simpler
incoming parameters for better self-explanation.
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211130074221.93635-3-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If of_find_device_by_node() succeeds, ingenic_ecc_get() doesn't have
a corresponding put_device(). Thus add put_device() to fix the exception
handling.
Fixes: 15de8c6efd0e ("mtd: rawnand: ingenic: Separate top-level and SoC specific code") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20211230072751.21622-1-linmq006@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When hid_parse() in elo_probe() fails, it forgets to call usb_put_dev to
decrease the refcount.
Fix this by adding usb_put_dev() in the error handling code of elo_probe().
Fixes: fbf42729d0e9 ("HID: elo: update the reference count of the usb device structure") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The brcmnand driver contains a bug in which if a page (example 2k byte)
is read from the parallel/ONFI NAND and within that page a subpage (512
byte) has correctable errors which is followed by a subpage with
uncorrectable errors, the page read will return the wrong status of
correctable (as opposed to the actual status of uncorrectable.)
The bug is in function brcmnand_read_by_pio where there is a check for
uncorrectable bits which will be preempted if a previous status for
correctable bits is detected.
The fix is to stop checking for bad bits only if we already have a bad
bits status.
Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Signed-off-by: david regan <dregan@mail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/trinity-478e0c09-9134-40e8-8f8c-31c371225eda-1643237024774@3c-app-mailcom-lxa02 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The problem is that "erasesize" is a uint64_t type so it might be
non-zero but the lower 32 bits are zero so when it's truncated,
"(uint32_t)erasesize", then that value is zero. This leads to a
divide by zero bug.
Avoid the bug by delaying the divide until after we have validated
that "erasesize" is non-zero and within the uint32_t range.
Fixes: dc2b3e5cbc80 ("mtd: phram: use div_u64_rem to stop overwrite len in phram_setup") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220121115505.GI1978@kadam Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In the event of a skipped partition (case when the entry name is empty)
the kernel panics in the cleanup function as the name entry is NULL.
Rework the parser logic by first checking the real partition number and
then allocate the space and set the data for the valid partitions.
The logic was also fundamentally wrong as with a skipped partition, the
parts number returned was incorrect by not decreasing it for the skipped
partitions.
Fixes: 803eb124e1a6 ("mtd: parsers: Add Qcom SMEM parser") Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220116032211.9728-1-ansuelsmth@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Interacting with a NAND chip on an IPQ6018 I found that the qcomsmem NAND
partition parser was returning -EPROBE_DEFER waiting for the main smem
driver to load.
This caused the board to reset. Playing about with the probe() function
shows that the problem lies in the core clock being switched off before the
nandc_unalloc() routine has completed.
If we look at how qcom_nandc_remove() tears down allocated resources we see
the expected order is
Various block drivers call blk_set_queue_dying to mark a disk as dead due
to surprise removal events, but since commit 8e141f9eb803 that doesn't
work given that the GD_DEAD flag needs to be set to stop I/O.
Replace the driver calls to blk_set_queue_dying with a new (and properly
documented) blk_mark_disk_dead API, and fold blk_set_queue_dying into the
only remaining caller.
Fixes: 8e141f9eb803 ("block: drain file system I/O on del_gendisk") Reported-by: Markus Blöchl <markus.bloechl@ipetronik.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Link: https://lore.kernel.org/r/20220217075231.1140-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Daniel Gibson reports that the n_tty code gets line termination wrong in
very specific cases:
"If you feed a line with exactly 64 chars + terminating newline, and
directly afterwards (without reading) another line into a pseudo
terminal, the the first read() on the other side will return the 64
char line *without* terminating newline, and the next read() will
return the missing terminating newline AND the complete next line (if
it fits in the buffer)"
and bisected the behavior to commit 3b830a9c34d5 ("tty: convert
tty_ldisc_ops 'read()' function to take a kernel pointer").
Now, digging deeper, it turns out that the behavior isn't exactly new:
what changed in commit 3b830a9c34d5 was that the tty line discipline
.read() function is now passed an intermediate kernel buffer rather than
the final user space buffer.
And that intermediate kernel buffer is 64 bytes in size - thus that
special case with exactly 64 bytes plus terminating newline.
The same problem did exist before, but historically the boundary was not
the 64-byte chunk, but the user-supplied buffer size, which is obviously
generally bigger (and potentially bigger than N_TTY_BUF_SIZE, which
would hide the issue entirely).
The reason is that the n_tty canon_copy_from_read_buf() code would look
ahead for the EOL character one byte further than it would actually
copy. It would then decide that it had found the terminator, and unmark
it as an EOL character - which in turn explains why the next read
wouldn't then be terminated by it.
Now, the reason it did all this in the first place is related to some
historical and pretty obscure EOF behavior, see commit ac8f3bf8832a
("n_tty: Fix poll() after buffer-limited eof push read") and commit 40d5e0905a03 ("n_tty: Fix EOF push handling").
And the reason for the EOL confusion is that we treat EOF as a special
EOL condition, with the EOL character being NUL (aka "__DISABLED_CHAR"
in the kernel sources).
So that EOF look-ahead also affects the normal EOL handling.
This patch just removes the look-ahead that causes problems, because EOL
is much more critical than the historical "EOF in the middle of a line
that coincides with the end of the buffer" handling ever was.
Now, it is possible that we should indeed re-introduce the "look at next
character to see if it's a EOF" behavior, but if so, that should be done
not at the kernel buffer chunk boundary in canon_copy_from_read_buf(),
but at a higher level, when we run out of the user buffer.
In particular, the place to do that would be at the top of
'n_tty_read()', where we check if it's a continuation of a previously
started read, and there is no more buffer space left, we could decide to
just eat the __DISABLED_CHAR at that point.
But that would be a separate patch, because I suspect nobody actually
cares, and I'd like to get a report about it before bothering.
Fixes: 3b830a9c34d5 ("tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer") Fixes: ac8f3bf8832a ("n_tty: Fix poll() after buffer-limited eof push read") Fixes: 40d5e0905a03 ("n_tty: Fix EOF push handling") Link: https://bugzilla.kernel.org/show_bug.cgi?id=215611 Reported-and-tested-by: Daniel Gibson <metalcaedes@gmail.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The result of the writeback, whether it is an ENOSPC or an EIO, or
anything else, does not inhibit the NFS client from reporting the
correct file timestamps.
Fixes: 79566ef018f5 ("NFS: Getattr doesn't require data sync semantics") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit ac795161c936 (NFSv4: Handle case where the lookup of a directory
fails) [1], part of Linux since 5.17-rc2, introduced a regression, where
a symbolic link on an NFS mount to a directory on another NFS does not
resolve(?) the first time it is accessed:
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Fixes: ac795161c936 ("NFSv4: Handle case where the lookup of a directory fails") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Tested-by: Donald Buczek <buczek@molgen.mpg.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In nfs4_update_changeattr_locked(), we don't need to set the
NFS_INO_REVAL_PAGECACHE flag, because we already know the value of the
change attribute, and we're already flagging the size. In fact, this
forces us to revalidate the change attribute a second time for no good
reason.
This extra flag appears to have been introduced as part of the xattr
feature, when update_changeattr_locked() was converted for use by the
xattr code.
Fixes: 1b523ca972ed ("nfs: modify update_changeattr to deal with regular files") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Now that we disable wbt by set WBT_STATE_OFF_DEFAULT in
wbt_disable_default() when switch elevator to bfq. And when
we remove scsi device, wbt will be enabled by wbt_enable_default.
If it become false positive between wbt_wait() and wbt_track()
when submit write request.
The following is the scenario that triggered the problem.
T1 T2 T3
elevator_switch_mq
bfq_init_queue
wbt_disable_default <= Set
rwb->enable_state (OFF)
Submit_bio
blk_mq_make_request
rq_qos_throttle
<= rwb->enable_state (OFF)
scsi_remove_device
sd_remove
del_gendisk
blk_unregister_queue
elv_unregister_queue
wbt_enable_default
<= Set rwb->enable_state (ON)
q_qos_track
<= rwb->enable_state (ON)
^^^^^^ this request will mark WBT_TRACKED without inflight add and will
lead to drop rqw->inflight to -1 in wbt_done() which will trigger IO hung.
Fix this by move wbt_enable_default() from elv_unregister to
bfq_exit_queue(). Only re-enable wbt when bfq exit.
Fixes: 76a8040817b4b ("blk-wbt: make sure throttle is enabled properly")
Remove oneline stale comment, and kill one oneshot local variable.
Signed-off-by: Ming Lei <ming.lei@rehdat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/linux-block/20211214133103.551813-1-qiulaibin@huawei.com/ Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In commit da0363f7bfd3 ("ASoC: qcom: Fix for DMA interrupt clear reg
overwriting") we changed regmap_write() to regmap_update_bits() so that
we can avoid overwriting bits that we didn't intend to modify.
Unfortunately this change breaks the case where a register is writable
but not readable, which is exactly how the HDMI irq clear register is
designed (grep around LPASS_HDMITX_APP_IRQCLEAR_REG to see how it's
write only). That's because regmap_update_bits() tries to read the
register from the hardware and if it isn't readable it looks in the
regmap cache to see what was written there last time to compare against
what we want to write there. Eventually, we're unable to modify this
register at all because the bits that we're trying to set are already
set in the cache.
This is doubly bad for the irq clear register because you have to write
the bit to clear an interrupt. Given the irq is level triggered, we see
an interrupt storm upon plugging in an HDMI cable and starting audio
playback. The irq storm is so great that performance degrades
significantly, leading to CPU soft lockups.
Fix it by using regmap_write_bits() so that we really do write the bits
in the clear register that we want to. This brings the number of irqs
handled by lpass_dma_interrupt_handler() down from ~150k/sec to ~10/sec.
Fixes: da0363f7bfd3 ("ASoC: qcom: Fix for DMA interrupt clear reg overwriting") Cc: Srinivasa Rao Mandadapu <srivasam@codeaurora.org> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20220209232520.4017634-1-swboyd@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Per TAS2770 datasheet there must be a 1 ms delay from reset to first
command. So insert delays into the driver where appropriate.
Fixes: 1a476abc723e ("tas2770: add tas2770 smart PA kernel driver") Signed-off-by: Martin Povišer <povik+lin@cutebit.org> Link: https://lore.kernel.org/r/20220204095301.5554-1-povik+lin@cutebit.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix this lockup by making ufshcd_exec_dev_cmd() allocate a reserved
request.
Link: https://lore.kernel.org/r/20211203231950.193369-10-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 7252a3603015 ("scsi: ufs: Avoid busy-waiting by eliminating tag
conflicts") guarantees that 'tag' is not in use by any SCSI command.
Remove the check that returns early if a conflict occurs.
Link: https://lore.kernel.org/r/20211203231950.193369-6-bvanassche@acm.org Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The previous bug fix had an unfortunate side effect that broke
distribution of binding table entries between nodes. The updated
tipc_sock_addr struct is also used further down in the same
function, and there the old value is still the correct one.
Fixes: 032062f363b4 ("tipc: fix wrong publisher node address in link publications") Signed-off-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220216020009.3404578-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The conversion to the new API broke the snapshot mount option
due to 32 vs. 64 bit type mismatch
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Cc: stable@vger.kernel.org # 5.11+ Reported-by: <ruckajan10@gmail.com> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Add the 'ifdef CONFIG_PPC64' around the 'ptesync' in function
'emulate_update_regs()' to like it is in 'analyse_instr()'. Since it looks like
it got dropped inadvertently by commit 3cdfcbfd32b9 ("powerpc: Change
analyse_instr so it doesn't modify *regs").
A key detail is that analyse_instr() will never recognise lwsync or
ptesync on 32-bit (because of the existing ifdef), and as a result
emulate_update_regs() should never be called with an op specifying
either of those on 32-bit. So removing them from emulate_update_regs()
should be a nop in terms of runtime behaviour.
Fixes: 3cdfcbfd32b9 ("powerpc: Change analyse_instr so it doesn't modify *regs") Cc: stable@vger.kernel.org # v4.14+ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
[mpe: Add last paragraph of change log mentioning analyse_instr() details] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220211005113.1361436-1-anders.roxell@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Allthough kernel text is always mapped with BATs, we still have
inittext mapped with pages, so TLB miss handling is required
when CONFIG_DEBUG_PAGEALLOC or CONFIG_KFENCE is set.
The final solution should be to set a BAT that also maps inittext
but that BAT then needs to be cleared at end of init, and it will
require more changes to be able to do it properly.
As DEBUG_PAGEALLOC or KFENCE are debugging, performance is not a big
deal so let's fix it simply for now to enable easy stable application.
Fixes: 035b19a15a98 ("powerpc/32s: Always map kernel text and rodata with BATs") Cc: stable@vger.kernel.org # v5.11+ Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aea33b4813a26bdb9378b5f273f00bd5d4abe240.1638857364.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
'setcifsacl -g <SID>' silently fails to set the group SID on server.
Actually, the bug existed since commit 438471b67963 ("CIFS: Add support
for setting owner info, dos attributes, and create time"), but this fix
will not apply cleanly to kernel versions <= v5.10.
Fixes: 3970acf7ddb9 ("SMB3: Add support for getting and setting SACLs") Cc: stable@vger.kernel.org # 5.11+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When writing out a stereo control we discard the change notification from
the first channel, meaning that events are only generated based on changes
to the second channel. Ensure that we report a change if either channel
has changed.
Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-5-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When writing out a stereo control we discard the change notification from
the first channel, meaning that events are only generated based on changes
to the second channel. Ensure that we report a change if either channel
has changed.
Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-3-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When writing out a stereo control we discard the change notification from
the first channel, meaning that events are only generated based on changes
to the second channel. Ensure that we report a change if either channel
has changed.
Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-4-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When writing out a stereo control we discard the change notification from
the first channel, meaning that events are only generated based on changes
to the second channel. Ensure that we report a change if either channel
has changed.
Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220201155629.120510-2-broonie@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
By some unknown reason, BIOS on Shenker Dock 15 doesn't set up the
codec mask properly for the onboard audio. Let's set the forced codec
mask to enable the codec discovery.
The forced probe mask via probe_mask 0x100 bit doesn't work any longer
as expected since the bus init code was moved and it's clearing the
codec_mask value that was set beforehand. This patch fixes the
long-time regression by moving the check_probe_mask() call.
The recently introduced coef_mutex for Realtek codec seems causing a
deadlock when the relevant code is invoked from the power-off state;
then the HD-audio core tries to power-up internally, and this kicks
off the codec runtime PM code that tries to take the same coef_mutex.
In order to avoid the deadlock, do the temporary power up/down around
the coef_mutex acquisition and release. This assures that the
power-up sequence runs before the mutex, hence no re-entrance will
happen.
Legion Y9000X 2019 has the same speaker with Y9000X 2020,
but with a different quirk address. Add one quirk entry
to make the speaker work on Y9000X 2019 too.
Signed-off-by: Yu Huang <diwang90@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20220212160835.165065-1-diwang90@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 83b7dcbc51c930fc2079ab6c6fc9d719768321f1 introduced a generic
implicit feedback parser, which fails to execute for M-Audio FastTrack
Ultra sound cards. The issue is with the ENDPOINT_SYNCTYPE check in
add_generic_implicit_fb() where the SYNCTYPE is ADAPTIVE instead of ASYNC.
The reason is that the sync type of the FastTrack output endpoints are
set to adaptive in the quirks table since commit 65f04443c96dbda11b8fff21d6390e082846aa3c.
non-regular file needs to be compiled and then copied to the output
directory. Remove it from TEST_PROGS and add it to TEST_GEN_PROGS. This
removes error thrown by rsync when non-regular object isn't found:
rsync: [sender] link_stat "/linux/tools/testing/selftests/exec/non-regular" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=3.2.3]
Fixes: 0f71241a8e32 ("selftests/exec: add file type errno tests") Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This was detected by the gcc in Fedora Rawhide's gcc:
50 11.01 fedora:rawhide : FAIL gcc version 12.0.1 20220205 (Red Hat 12.0.1-0) (GCC)
inlined from 'bpf__config_obj' at util/bpf-loader.c:1242:9:
util/bpf-loader.c:1225:34: error: pointer 'map_opt' may be used after 'free' [-Werror=use-after-free]
1225 | *key_scan_pos += strlen(map_opt);
| ^~~~~~~~~~~~~~~
util/bpf-loader.c:1223:9: note: call to 'free' here
1223 | free(map_name);
| ^~~~~~~~~~~~~~
cc1: all warnings being treated as errors
So do the calculations on the pointer before freeing it.
Fixes: 04f9bf2bac72480c ("perf bpf-loader: Add missing '*' for key_scan_pos") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lore.kernel.org/lkml/Yg1VtQxKrPpS3uNA@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Whenever bridge driver hits the max capacity of MDBs, it disables
the MC processing (by setting corresponding bridge option), but never
notifies switchdev about such change (the notifiers are called only upon
explicit setting of this option, through the registered netlink interface).
This could lead to situation when Software MDB processing gets disabled,
but this event never gets offloaded to the underlying Hardware.
Fix this by adding a notify message in such case.
Fixes: 147c1e9b902c ("switchdev: bridge: Offload multicast disabled") Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20220215165303.31908-1-oleksandr.mazur@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
1588 Single Step Timestamping code path uses a mutex to
enforce atomicity for two events:
- update of ptp single step register
- transmit ptp event packet
Before this patch the mutex was not initialized. This
caused unexpected crashes in the Tx function.
Fixes: c55211892f463 ("dpaa2-eth: support PTP Sync packet one-step timestamping") Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Clang static analysis reports this representative problem
dpaa2-switch-flower.c:616:24: warning: The right operand of '=='
is a garbage value
tmp->cfg.vlan_id == vlan) {
^ ~~~~
vlan is set in dpaa2_switch_flower_parse_mirror_key(). However
this function can return success without setting vlan. So
change the default return to -EOPNOTSUPP.
Fixes: 0f3faece5808 ("dpaa2-switch: add VLAN based mirroring") Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When a link comes up we add its presence to the name table to make it
possible for users to subscribe for link up/down events. However, after
a previous call signature change the binding is wrongly published with
the peer node as publishing node, instead of the own node as it should
be. This has the effect that the command 'tipc name table show' will
list the link binding (service type 2) with node scope and a peer node
as originator, something that obviously is impossible.
We correct this bug here.
Fixes: 50a3499ab853 ("tipc: simplify signature of tipc_namtbl_publish()") Signed-off-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220214013852.2803940-1-jmaloy@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The function mt7531_phy_mode_supported in the DSA driver set supported
mode to PHY_INTERFACE_MODE_GMII instead of PHY_INTERFACE_MODE_INTERNAL
for the internal PHY, so this check breaks the PHY initialization:
mt7530 mdio-bus:00 wan (uninitialized): failed to connect to PHY: -EINVAL
Remove the check to make it work again.
Reported-by: Hauke Mehrtens <hauke@hauke-m.de> Fixes: e40d2cca0189 ("net: phy: add MediaTek Gigabit Ethernet PHY driver") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com> Tested-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The callback functions of clcsock will be saved and replaced during
the fallback. But if the fallback happens more than once, then the
copies of these callback functions will be overwritten incorrectly,
resulting in a loop call issue:
So this patch fixes the issue by saving these function pointers only
once in the fallback and avoiding overwriting.
Reported-by: syzbot+4de3c0e8a263e1e499bc@syzkaller.appspotmail.com Fixes: 341adeec9ada ("net/smc: Forward wakeup to smc socket waitqueue after fallback") Link: https://lore.kernel.org/r/0000000000006d045e05d78776f6@google.com Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
GCC 12 correctly reports a potential use-after-free condition in the
xrealloc helper. Fix the warning by avoiding an implicit "free(ptr)"
when size == 0:
In file included from help.c:12:
In function 'xrealloc',
inlined from 'add_cmdname' at help.c:24:2: subcmd-util.h:56:23: error: pointer may be used after 'realloc' [-Werror=use-after-free]
56 | ret = realloc(ptr, size);
| ^~~~~~~~~~~~~~~~~~
subcmd-util.h:52:21: note: call to 'realloc' here
52 | void *ret = realloc(ptr, size);
| ^~~~~~~~~~~~~~~~~~
subcmd-util.h:58:31: error: pointer may be used after 'realloc' [-Werror=use-after-free]
58 | ret = realloc(ptr, 1);
| ^~~~~~~~~~~~~~~
subcmd-util.h:52:21: note: call to 'realloc' here
52 | void *ret = realloc(ptr, size);
| ^~~~~~~~~~~~~~~~~~
syzbot reported that two threads might write over agg_select_timer
at the same time. Make agg_select_timer atomic to fix the races.
BUG: KCSAN: data-race in bond_3ad_initiate_agg_selection / bond_3ad_state_machine_handler
read to 0xffff8881242aea90 of 4 bytes by task 1846 on cpu 1:
bond_3ad_state_machine_handler+0x99/0x2810 drivers/net/bonding/bond_3ad.c:2317
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 25910 Comm: syz-executor.1 Tainted: G W 5.17.0-rc4-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
value changed: 0xffffffff85dee080 -> 0xffff88815d96ec00
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 13560 Comm: syz-executor.2 Not tainted 5.17.0-rc3-syzkaller-00116-gf1baf68e1383-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 470502de5bdb ("net: sched: unlock rules update API") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Buslov <vladbu@mellanox.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
trace_napi_poll_hit() is reading stat->dev while another thread can write
on it from dropmon_net_event()
Use READ_ONCE()/WRITE_ONCE() here, RCU rules are properly enforced already,
we only have to take care of load/store tearing.
BUG: KCSAN: data-race in dropmon_net_event / trace_napi_poll_hit
write to 0xffff88816f3ab9c0 of 8 bytes by task 20260 on cpu 1:
dropmon_net_event+0xb8/0x2b0 net/core/drop_monitor.c:1579
notifier_call_chain kernel/notifier.c:84 [inline]
raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:392
call_netdevice_notifiers_info net/core/dev.c:1919 [inline]
call_netdevice_notifiers_extack net/core/dev.c:1931 [inline]
call_netdevice_notifiers net/core/dev.c:1945 [inline]
unregister_netdevice_many+0x867/0xfb0 net/core/dev.c:10415
ip_tunnel_delete_nets+0x24a/0x280 net/ipv4/ip_tunnel.c:1123
vti_exit_batch_net+0x2a/0x30 net/ipv4/ip_vti.c:515
ops_exit_list net/core/net_namespace.c:173 [inline]
cleanup_net+0x4dc/0x8d0 net/core/net_namespace.c:597
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30
read to 0xffff88816f3ab9c0 of 8 bytes by interrupt on cpu 0:
trace_napi_poll_hit+0x89/0x1c0 net/core/drop_monitor.c:292
trace_napi_poll include/trace/events/napi.h:14 [inline]
__napi_poll+0x36b/0x3f0 net/core/dev.c:6366
napi_poll net/core/dev.c:6432 [inline]
net_rx_action+0x29e/0x650 net/core/dev.c:6519
__do_softirq+0x158/0x2de kernel/softirq.c:558
do_softirq+0xb1/0xf0 kernel/softirq.c:459
__local_bh_enable_ip+0x68/0x70 kernel/softirq.c:383
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
_raw_spin_unlock_bh+0x33/0x40 kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:394 [inline]
ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline]
wg_packet_decrypt_worker+0x73c/0x780 drivers/net/wireguard/receive.c:506
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30
value changed: 0xffff88815883e000 -> 0x0000000000000000
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 26435 Comm: kworker/0:1 Not tainted 5.17.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker
Fixes: 4ea7e38696c7 ("dropmon: add ability to detect when hardware dropsrxpackets") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neil Horman <nhorman@tuxdriver.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>