iavf_free_queues() clears adapter->num_active_queues, which
iavf_free_q_vectors() relies on, so swap the order of these two function
calls in iavf_disable_vf(). This resolves a panic encountered when the
interface is disabled and then later brought up again after PF
communication is restored.
Fixes: 65c7006f234c ("i40evf: assign num_active_queues inside i40evf_alloc_queues") Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
If the driver has lost contact with the PF then it enters a disabled state
and frees adapter->vf_res. However, ndo_fix_features can still be called on
the interface, so we need to check for this condition first. Since we have
no information on the features at this time simply leave them unmodified
and return.
Fixes: c4445aedfe09 ("i40evf: Fix VLAN features") Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Fixed return correct code from set the new channel count.
Implemented by check if reset is done in appropriate time.
This solution give a extra time to pf for reset vf in case
when user want set new channel count for all vfs.
Without this patch it is possible to return misleading output
code to user and vf reset not to be correctly performed by pf.
Fixes: 5520deb15326 ("iavf: Enable support for up to 16 queues") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
rtm@csail.mit.edu reports:
> nfsd4_decode_bitmap4() will write beyond bmval[bmlen-1] if the RPC
> directs it to do so. This can cause nfsd4_decode_state_protect4_a()
> to write client-supplied data beyond the end of
> nfsd4_exchange_id.spo_must_allow[] when called by
> nfsd4_decode_exchange_id().
Rewrite the loops so nfsd4_decode_bitmap() cannot iterate beyond
@bmlen.
Reported by: rtm@csail.mit.edu Fixes: d1c263a031e8 ("NFSD: Replace READ* macros in nfsd4_decode_fattr()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The link_id is supposed to be unique, but smcr_next_link_id() doesn't
skip the used link_id as expected. So the patch fixes this.
Fixes: 026c381fb477 ("net/smc: introduce link_idx for link group array") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
sk_clone_lock() needs to call sock_inuse_add(1) before entering the
sk_free_unlock_clone() error path, for __sk_free() from sk_free() from
sk_free_unlock_clone() calls sock_inuse_add(-1).
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: 648845ab7e200993 ("sock: Move the socket inuse to namespace.") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The MSG_CRYPTO msgs are always encrypted and sent to other nodes
for keys' deployment. But when receiving in peers, if those nodes
do not validate it and make sure it's encrypted, one could craft
a malicious MSG_CRYPTO msg to deploy its key with no need to know
other nodes' keys.
This patch is to do that by checking TIPC_SKB_CB(skb)->decrypted
and discard it if this packet never got decrypted.
Note that this is also a supplementary fix to CVE-2021-43267 that
can be triggered by an unencrypted malicious MSG_CRYPTO msg.
Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange") Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The driver does not check if hw-tc-offload is enabled for the device
before offloading a flow in the context of indirect block callback.
Fix this by checking NETIF_F_HW_TC in the features flag and rejecting
the offload request. This will avoid unnecessary dmesg error logs when
hw-tc-offload is disabled, such as these:
bnxt_en 0000:19:00.1 eno2np1: dev(ifindex=294) not on same switch
bnxt_en 0000:19:00.1 eno2np1: Error: bnxt_tc_add_flow: cookie=0xffff8dace1c88000 error=-22
bnxt_en 0000:19:00.0 eno1np0: dev(ifindex=294) not on same switch
bnxt_en 0000:19:00.0 eno1np0: Error: bnxt_tc_add_flow: cookie=0xffff8dace1c88000 error=-22
Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Fixes: 627c89d00fb9 ("bnxt_en: flow_offload: offload tunnel decap rules via indirect callbacks") Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Smatch says:
bnx2x_init_ops.h:640 bnx2x_ilt_client_mem_op()
warn: variable dereferenced before check 'ilt' (see line 638)
Move ilt_cli variable initialization _after_ ilt validation, because
it's unsafe to deref the pointer before validation check.
Fixes: 523224a3b3cd ("bnx2x, cnic, bnx2i: use new FW/HSI") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The gpio selftests build against the system includes rather than the
headers from the linux tree. This results in the compile failing if
the system includes are outdated.
Prefer the headers from the linux tree, as per other selftests.
Fixes: 8bc395a6a2e2 ("selftests: gpio: rework and simplify test implementation") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
[Kent: reworded commit comment and added Fixes:] Signed-off-by: Kent Gibson <warthog618@gmail.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The head-of-line blocking timer should only be modified when
head-of-line drop is disabled.
One of the steps in recovering from a modem crash is to enable
dropping of packets with timeout of 0 (immediate). We don't know
how the modem configured its endpoints, so before we program the
timer, we need to ensure HOL_BLOCK is disabled.
Fixes: 84f9bd12d46db ("soc: qcom: ipa: IPA endpoints") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Starting with IPA v4.5, the HOL_BLOCK_EN register must be written
twice when enabling head-of-line blocking avoidance.
Fixes: 84f9bd12d46db ("soc: qcom: ipa: IPA endpoints") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Since commit a05829a7222e ("cfg80211: avoid holding the RTNL when
calling the driver") we've not only been protecting the pointer
to monitor_sdata with the RTNL, but also with the wiphy->mtx. This
is relevant in a number of lockdep assertions, e.g. the one we hit
in ieee80211_set_monitor_channel(). However, we're now protecting
all the assignments/dereferences, even the one in interface iter,
with the wiphy->mtx, so switch over the lockdep assertions to that
lock.
Even if userspace specifies the NL80211_ATTR_SURVEY_RADIO_STATS
attribute, we cannot get the statistics because we're not really
parsing the incoming attributes properly any more.
Fix this by passing the attrbuf to nl80211_prepare_wdev_dump()
and filling it there, if given, and using a local version only
if no output is desired.
Since I'm touching it anyway, make nl80211_prepare_wdev_dump()
static.
Fixes: 50508d941c18 ("cfg80211: use parallel_ops for genl") Reported-by: Jan Fuchs <jf@simonwunderlich.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Tested-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20211029092539.2851b4799386.If9736d4575ee79420cbec1bd930181e1d53c7317@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The string copies to the histogram storage has a max size of 256 bytes
(defined by MAX_FILTER_STR_VAL). Only the string size of the event field
needs to be copied to the event storage, but no more than what is in the
event storage. Although nothing should be bigger than 256 bytes, there's
no protection against overwriting of the storage if one day there is.
Copy no more than the destination size, and enforce it.
Also had to turn MAX_FILTER_STR_VAL into an unsigned int, to keep the
min() comparison of the string sizes of comparable types.
Link: https://lore.kernel.org/all/CAHk-=wjREUihCGrtRBwfX47y_KrLCGjiq3t6QtoNJpmVrAEb1w@mail.gmail.com/ Link: https://lkml.kernel.org/r/20211114132834.183429a4@rorschach.local.home Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Tom Zanussi <zanussi@kernel.org> Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: 63f84ae6b82b ("tracing/histogram: Do not copy the fixed-size char array field over the field size") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
TCP Receive zerocopy iterates through the SKB queue via
tcp_recv_skb(), acquiring a pointer to an SKB and an offset within
that SKB to read from. From there, it iterates the SKB frags array to
determine which offset to start remapping pages from.
However, this is built on the assumption that the offset read so far
within the SKB is smaller than the SKB length. If this assumption is
violated, we can attempt to read an invalid frags array element, which
would cause a fault.
tcp_recv_skb() can cause such an SKB to be returned when the TCP FIN
flag is set. Therefore, we must guard against this occurrence inside
skb_advance_frag().
One way that we can reproduce this error follows:
1) In a receiver program, call getsockopt(TCP_ZEROCOPY_RECEIVE) with:
char some_array[32 * 1024];
struct tcp_zerocopy_receive zc = {
.copybuf_address = (__u64) &some_array[0],
.copybuf_len = 32 * 1024,
};
2) In a sender program, after a TCP handshake, send the following
sequence of packets:
i) Seq = [X, X+4000]
ii) Seq = [X+4000, X+5000]
iii) Seq = [X+4000, X+5000], Flags = FIN | URG, urgptr=1000
(This can happen without URG, if we have a signal pending, but URG is
a convenient way to reproduce the behaviour).
In this case, the following event sequence will occur on the receiver:
tcp_zerocopy_receive():
-> receive_fallback_to_copy() // copybuf_len >= inq
-> tcp_recvmsg_locked() // reads 5000 bytes, then breaks due to URG
-> tcp_recv_skb() // yields skb with skb->len == offset
-> tcp_zerocopy_set_hint_for_skb()
-> skb_advance_to_frag() // will returns a frags ptr. >= nr_frags
-> find_next_mappable_frag() // will dereference this bad frags ptr.
With this patch, skb_advance_to_frag() will no longer return an
invalid frags pointer, and will return NULL instead, fixing the issue.
Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") Link: https://lore.kernel.org/r/20211111235215.2605384-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The source group count was mistakenly assigned to both dst and src loops.
Fix it to make IPA probe and work again.
Fixes: 4fd704b3608a ("net: ipa: record number of groups in data") Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20211111183724.593478-1-konrad.dybcio@somainline.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Commit b599015f044d ("samples/bpf: Fix application of sizeof to pointer")
tried to fix a bug where sizeof was incorrectly applied to a pointer instead
of the array string was being copied to, to find the destination buffer size,
but ended up using strlen, which is still incorrect. However, on closer look
ifname_buf has no other use, hence directly use optarg.
Fixes: b599015f044d ("samples/bpf: Fix application of sizeof to pointer") Fixes: e531a220cc59 ("samples: bpf: Convert xdp_redirect_cpu to XDP samples helper") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/bpf/20211112020301.528357-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
sample_summary_print() uses accumulated period to calculate and display
per-sec averages. This period gets incremented by sampling interval each
time a new sample is formed, and thus equals to the number of samples
collected multiplied by this interval.
However, the totals are being calculated differently, they receive current
sample statistics already divided by the interval gotten as a difference
between sample timestamps for better precision -- in other words, they are
being incremented by the per-sec values each sample.
This leads to the excessive division of summary per-secs when interval != 1
sec. It is obvious pps couldn't become two times lower just from picking a
different sampling interval value:
$ samples/bpf/xdp_redirect_cpu -p xdp_prognum_n1_inverse_qnum -c all
-s -d 6 -i 1
< snip >
Packets received : 2,197,230,321
Average packets/s : 22,887,816
Packets redirected : 2,197,230,472
Average redir/s : 22,887,817
$ samples/bpf/xdp_redirect_cpu -p xdp_prognum_n1_inverse_qnum -c all
-s -d 6 -i 2
< snip >
Packets received : 159,566,498
Average packets/s : 11,397,607
Packets redirected : 159,566,995
Average redir/s : 11,397,642
This can be easily fixed by treating the divisor not as a period, but rather
as a total number of samples, and thus incrementing it by 1 instead of
interval. As a nice side effect, we can now remove so-named argument from a
couple of functions. Let us also create an "alias" for sample_output::rx_cnt::pps
named 'num' using a union since this field is used to store this number (period
previously) as well, and the resulting counter-intuitive code might've been a
reason for this bug.
Introduction of map_uid made two lookups from outer map to be distinct.
That distinction is only necessary when inner map has an embedded timer.
Otherwise it will make the verifier state pruning to be conservative
which will cause complex programs to hit 1M insn_processed limit.
Tighten map_uid logic to apply to inner maps with timers only.
gv100_hdmi_ctrl() writes vendor_infoframe.subpack0_high to 0x6f0110, and
then overwrites it with 0. Just drop the overwrite with 0, that's clearly
a mistake.
Because of this issue the HDMI VIC is 0 instead of 1 in the HDMI Vendor
InfoFrame when transmitting 4kp30.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 290ffeafcc1a ("drm/nouveau/disp/gv100: initial support") Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/3d3bd0f7-c150-2479-9350-35d394ee772d@xs4all.nl Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Commit 463538a383a2 ("perf tests: Fix test 68 zstd compression for
s390") inadvertently removed the -g flag from all platforms rather than
just s390, because the [[ ]] construct fails in sh. Changing to single
brackets restores testing of call graphs and removes the following error
from the output:
$ ./perf test -v 85
85: Zstd perf.data compression/decompression :
--- start ---
test child forked, pid 50643
Collecting compressed record file:
./tests/shell/record+zstd_comp_decomp.sh: 15: [[: not found
Fixes: 463538a383a2 ("perf tests: Fix test 68 zstd compression for s390") Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20211028134828.65774-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
perf_env__insert_btf() doesn't insert if a duplicate BTF id is
encountered and this causes a memory leak. Modify the function to return
a success/error value and then free the memory if insertion didn't
happen.
v2. Adds a return -1 when the insertion error occurs in
perf_env__fetch_btf. This doesn't affect anything as the result is
never checked.
Fixes: 3792cb2ff43b1b19 ("perf bpf: Save BTF in a rbtree in perf_env") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: http://lore.kernel.org/lkml/20211112074525.121633-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Do not copy the fixed-size char array field of the events over
the field size. The histogram treats char array as a string and
there are 2 types of char array in the event, fixed-size and
dynamic string. The dynamic string (__data_loc) field must be
null terminated, but the fixed-size char array field may not
be null terminated (not a string, but just a data).
In that case, histogram can copy the data after the field.
This uses the original field size for fixed-size char array
field to restrict the histogram not to access over the original
field size.
Link: https://lkml.kernel.org/r/163673292822.195747.3696966210526410250.stgit@devnote2 Fixes: 02205a6752f2 (tracing: Add support for 'field variables') Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Bio will be checked at beginning of submit_bio_noacct(). If bio needs
to be throttled, it will start the timer and stop submit bio directly.
Bio will submit in blk_throtl_dispatch_work_fn() when the timer expires.
But in the current process, if bio is throttled, it will still set bio
issue->value by blkcg_bio_issue_init(). This is redundant and may cause
the above use-after-free.
CPU0 CPU1
submit_bio
submit_bio_noacct
submit_bio_checks
blk_throtl_bio()
<=mod_timer(&sq->pending_timer
blk_throtl_dispatch_work_fn
submit_bio_noacct() <= bio have
throttle tag, will throw directly
and bio issue->value will be set
here
bio_endio()
bio_put()
bio_free() <= free this bio
blkcg_bio_issue_init(bio)
<= bio has been freed and
will lead to UAF
return BLK_QC_T_NONE
Fix this by remove extra blkcg_bio_issue_init.
Fixes: e439bedf6b24 (blkcg: consolidate bio_issue_init() to be a part of core) Signed-off-by: Laibin Qiu <qiulaibin@huawei.com> Link: https://lore.kernel.org/r/20211112093354.3581504-1-qiulaibin@huawei.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Just like what we do in the x86_get_event_constraints(), the
PERF_X86_EVENT_LBR_SELECT flag should also be propagated
to event->hw.flags so that the host lbr driver can save/restore
MSR_LBR_SELECT for the special vlbr event created by KVM or BPF.
Fixes: 097e4311cda9 ("perf/x86: Add constraint to create guest LBR event without hw counter") Reported-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Like Xu <likexu@tencent.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Wanpeng Li <wanpengli@tencent.com> Link: https://lore.kernel.org/r/20211103091716.59906-1-likexu@tencent.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Kevin is reporting crashes which point to a use-after-free of a cfs_rq
in update_blocked_averages(). Initial debugging revealed that we've
live cfs_rq's (on_list=1) in an about to be kfree()'d task group in
free_fair_sched_group(). However, it was unclear how that can happen.
His kernel config happened to lead to a layout of struct sched_entity
that put the 'my_q' member directly into the middle of the object
which makes it incidentally overlap with SLUB's freelist pointer.
That, in combination with SLAB_FREELIST_HARDENED's freelist pointer
mangling, leads to a reliable access violation in form of a #GP which
made the UAF fail fast.
Michal seems to have run into the same issue[1]. He already correctly
diagnosed that commit a7b359fc6a37 ("sched/fair: Correctly insert
cfs_rq's to list on unthrottle") is causing the preconditions for the
UAF to happen by re-adding cfs_rq's also to task groups that have no
more running tasks, i.e. also to dead ones. His analysis, however,
misses the real root cause and it cannot be seen from the crash
backtrace only, as the real offender is tg_unthrottle_up() getting
called via sched_cfs_period_timer() via the timer interrupt at an
inconvenient time.
When unregister_fair_sched_group() unlinks all cfs_rq's from the dying
task group, it doesn't protect itself from getting interrupted. If the
timer interrupt triggers while we iterate over all CPUs or after
unregister_fair_sched_group() has finished but prior to unlinking the
task group, sched_cfs_period_timer() will execute and walk the list of
task groups, trying to unthrottle cfs_rq's, i.e. re-add them to the
dying task group. These will later -- in free_fair_sched_group() -- be
kfree()'ed while still being linked, leading to the fireworks Kevin
and Michal are seeing.
To fix this race, ensure the dying task group gets unlinked first.
However, simply switching the order of unregistering and unlinking the
task group isn't sufficient, as concurrent RCU walkers might still see
it, as can be seen below:
CPU 2 walks the task group list in parallel to sched_offline_group(),
specifically, it'll read the soon to be unlinked task group entry at
(1). Unlinking it on CPU 1 at (2) therefore won't prevent CPU 2 from
still passing it on to tg_unthrottle_up(). CPU 1 now tries to unlink
all cfs_rq's via list_del_leaf_cfs_rq() in
unregister_fair_sched_group(). Meanwhile CPU 2 will re-add some of
these at (3), which is the cause of the UAF later on.
To prevent this additional race from happening, we need to wait until
walk_tg_tree_from() has finished traversing the task groups, i.e.
after the RCU read critical section ends in (4). Afterwards we're safe
to call unregister_fair_sched_group(), as each new walk won't see the
dying task group any more.
On top of that, we need to wait yet another RCU grace period after
unregister_fair_sched_group() to ensure print_cfs_stats(), which might
run concurrently, always sees valid objects, i.e. not already free'd
ones.
This patch survives Michal's reproducer[2] for 8h+ now, which used to
trigger within minutes before.
Fixes: a7b359fc6a37 ("sched/fair: Correctly insert cfs_rq's to list on unthrottle")
[peterz: shuffle code around a bit] Reported-by: Kevin Tanguy <kevin.tanguy@corp.ovh.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:
For MIPS pre-boot, when CONFIG_KERNEL_ZSTD=y, the decompressor
function uses __bswapdi2(), so this object file should be added to
the target object file.
Fixes these build errors:
mips-linux-ld: arch/mips/boot/compressed/decompress.o: in function `xxh64':
decompress.c:(.text+0x8be0): undefined reference to `__bswapdi2'
mips-linux-ld: decompress.c:(.text+0x8c78): undefined reference to `__bswapdi2'
mips-linux-ld: decompress.c:(.text+0x8d04): undefined reference to `__bswapdi2'
mips-linux-ld: arch/mips/boot/compressed/decompress.o:decompress.c:(.text+0xa010): more undefined references to `__bswapdi2' follow
Fixes: 0652035a5794 ("asm-generic: unaligned: remove byteshift helpers") Fixes: cddc40f5617e ("mips: always link byteswap helpers into decompressor") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Several header files need info on CONFIG_32BIT or CONFIG_64BIT,
but kconfig symbol BCM63XX does not provide that info. This leads
to many build errors, e.g.:
arch/mips/include/asm/page.h:196:13: error: use of undeclared identifier 'CAC_BASE'
return x - PAGE_OFFSET + PHYS_OFFSET;
arch/mips/include/asm/mach-generic/spaces.h:91:23: note: expanded from macro 'PAGE_OFFSET'
#define PAGE_OFFSET (CAC_BASE + PHYS_OFFSET)
arch/mips/include/asm/io.h:134:28: error: use of undeclared identifier 'CAC_BASE'
return (void *)(address + PAGE_OFFSET - PHYS_OFFSET);
arch/mips/include/asm/mach-generic/spaces.h:91:23: note: expanded from macro 'PAGE_OFFSET'
#define PAGE_OFFSET (CAC_BASE + PHYS_OFFSET)
arch/mips/include/asm/uaccess.h:82:10: error: use of undeclared identifier '__UA_LIMIT'
return (__UA_LIMIT & (addr | (addr + size) | __ua_size(size))) == 0;
Selecting the SYS_HAS_CPU_BMIPS* symbols causes SYS_HAS_CPU_BMIPS to be
set, which then selects CPU_SUPPORT_32BIT_KERNEL, which causes
CONFIG_32BIT to be set. (a bit more indirect than v1 [RFC].)
Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: bcm-kernel-feedback-list@broadcom.com Cc: linux-mips@vger.kernel.org Cc: Paul Burton <paulburton@kernel.org> Cc: Maxime Bizon <mbizon@freebox.fr> Cc: Ralf Baechle <ralf@linux-mips.org> Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
We currently walk the hypervisor stage-1 page-table towards the end of
hyp init in nVHE protected mode and adjust the host page ownership
attributes in its stage-2 in order to get a consistent state from both
point of views. The walk is done on the entire hyp VA space, and expects
to only ever find page-level mappings. While this expectation is
reasonable in the half of hyp VA space that maps memory with a fixed
offset (see the loop in pkvm_create_mappings_locked()), it can be
incorrect in the other half where nothing prevents the usage of block
mappings. For instance, on systems where memory is physically aligned at
an address that happens to maps to a PMD aligned VA in the hyp_vmemmap,
kvm_pgtable_hyp_map() will install block mappings when backing the
hyp_vmemmap, which will later cause finalize_host_mappings() to fail.
Furthermore, it should be noted that all pages backing the hyp_vmemmap
are also mapped in the 'fixed offset range' of the hypervisor, which
implies that finalize_host_mappings() will walk both aliases and update
the host stage-2 attributes twice. The order in which this happens is
unpredictable, though, since the hyp VA layout is highly dependent on
the position of the idmap page, hence resulting in a fragile mess at
best.
In order to fix all of this, let's restrict the finalization walk to
only cover memory regions in the 'fixed-offset range' of the hyp VA
space and nothing else. This not only fixes a correctness issue, but
will also result in a slighlty faster hyp initialization overall.
Fixes: 2c50166c62ba ("KVM: arm64: Mark host bss and rodata section as shared") Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211108154636.393384-1-qperret@google.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The gcc_aggre1_pnoc_ahb_clk is crucial for the proper MSM8996/APQ8096
functioning. If it gets disabled, several subsytems will stop working
(including eMMC/SDCC and USB). There are no in-kernel users of this
clock, so it is much simpler to remove from the kernel.
The clock was first removed in the commit 9e60de1cf270 ("clk: qcom:
Remove gcc_aggre1_pnoc_ahb_clk from msm8996") by Stephen Boyd, but got
added back in the commit b567752144e3 ("clk: qcom: Add some missing gcc
clks for msm8996") by Rajendra Nayak.
Let's remove it again in hope that nobody adds it back.
Reported-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Cc: Rajendra Nayak <rnayak@codeaurora.org> Cc: Konrad Dybcio <konrad.dybcio@somainline.org> Fixes: b567752144e3 ("clk: qcom: Add some missing gcc clks for msm8996") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20211104011155.2209654-1-dmitry.baryshkov@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
- In the "impose hardware constraints" block, the "logical" divider
value (aka. not translated to the hardware) was clamped to fit in the
register area, but this totally ignored the fact that the divider
value can itself have a fixed divider.
- The code that made sure that the divider value returned by the
function was a multiple of its own fixed divider could result in a
wrong value being calculated, because it was rounded down instead of
rounded up.
Fixes: 4afe2d1a6ed5 ("clk: ingenic: Allow divider value to be divided") Co-developed-by: Artur Rojek <contact@artur-rojek.eu> Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20211001172033.122329-1-paul@crapouillou.net Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
This code looks quite confused: part of function returns 1 on
corruption, part returns -errno. The problem is not stable-specific.
[1] https://lkml.org/lkml/2021/9/19/207
Let's fix to make 'insane cp_payload case' to return 1 rater than
EFSCORRUPTED, so that return value can be kept consistent for all
error cases, it can avoid confusion of code logic.
Fixes: 65ddf6564843 ("f2fs: fix to do sanity check for sb/cp fields correctly") Reported-by: Pavel Machek <pavel@denx.de> Reviewed-by: Pavel Machek <pavel@denx.de> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Compresse file and normal file has differ in i_addr addressing,
specifically addrs per inode/block. So, we will face data loss, if we
disable the compression flag on non-empty files. Therefore we should
disallow not only enabling but disabling the compression flag on
non-empty files.
Fixes: 4c8ff7095bef ("f2fs: support data compression") Signed-off-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Hyeong-Jun Kim <hj514.kim@samsung.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Fix this by defining both ENDIAN macros in
<asm/sfp-machine.h> so that they can be utilized in
<math-emu/soft-fp.h> according to the latter's comment:
/* Allow sfp-machine to have its own byte order definitions. */
(This is what is done in arch/nds32/include/asm/sfp-machine.h.)
This placates these build warnings:
In file included from ../arch/sh/math-emu/math.c:23:
.../include/math-emu/single.h:50:21: warning: "__BIG_ENDIAN" is not defined, evaluates to 0 [-Wundef]
50 | #if __BYTE_ORDER == __BIG_ENDIAN
In file included from ../arch/sh/math-emu/math.c:24:
.../include/math-emu/double.h:59:21: warning: "__BIG_ENDIAN" is not defined, evaluates to 0 [-Wundef]
59 | #if __BYTE_ORDER == __BIG_ENDIAN
Fixes: 4b565680d163 ("sh: math-emu support") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Rich Felker <dalias@libc.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Delete ieee_fpe_handler() since it is not used. After that is done,
delete denormal_to_double() since it is not used:
.../arch/sh/math-emu/math.c:505:12: error: 'ieee_fpe_handler' defined but not used [-Werror=unused-function]
505 | static int ieee_fpe_handler(struct pt_regs *regs)
.../arch/sh/math-emu/math.c:477:13: error: 'denormal_to_double' defined but not used [-Werror=unused-function]
477 | static void denormal_to_double(struct sh_fpu_soft_struct *fpu, int n)
Fixes: 7caf62de25554da3 ("sh: remove unused do_fpu_error") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Takashi YOSHII <takasi-y@ops.dti.ne.jp> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rich Felker <dalias@libc.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
FRAME_POINTER depends on DEBUG_KERNEL so DWARF_UNWINDER should
depend on DEBUG_KERNEL before selecting FRAME_POINTER.
WARNING: unmet direct dependencies detected for FRAME_POINTER
Depends on [n]: DEBUG_KERNEL [=n] && (M68K || UML || SUPERH [=y]) || ARCH_WANT_FRAME_POINTERS [=n]
Selected by [y]:
- DWARF_UNWINDER [=y]
Fixes: bd353861c735 ("sh: dwarf unwinder support.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Matt Fleming <matt@console-pimps.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Rich Felker <dalias@libc.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
In f2fs_balance_fs_bg(), it needs to check both NAT_ENTRIES and INO_ENTRIES
memory usage to decide whether we should skip background checkpoint, otherwise
we may always skip checking INO_ENTRIES memory usage, so that INO_ENTRIES may
potentially cause high memory footprint.
Fixes: 493720a48543 ("f2fs: fix to avoid REQ_TIME and CP_TIME collision") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
If KMEM_CACHE or maple_alloc_dev failed, the maple_bus_init() will return 0
rather than error, because the retval is not changed after KMEM_CACHE or
maple_alloc_dev failed.
Fixes: 17be2d2b1c33 ("sh: Add maple bus support for the SEGA Dreamcast.") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Rich Felker <dalias@libc.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
request_irq is marked __must_check, but the call in shx3_prepare_cpus
has a void return type, so it can't propagate failure to the caller.
Follow cues from hexagon and just print an error.
Fixes: c7936b9abcf5 ("sh: smp: Hook in to the generic IPI handler for SH-X3 SMP.") Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Rich Felker <dalias@libc.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Until now, all tests involving CONFIG_STRICT_KERNEL_RWX were done with
DEBUG_RODATA_TEST to check the result. But now that
CONFIG_STRICT_KERNEL_RWX is selected by default, it came without
CONFIG_DEBUG_RODATA_TEST and led to the following Oops
0xc1285200 corresponds to 'system_state' global var that the kernel is trying to set to
SYSTEM_RUNNING. This var is above the RO/RW limit so it shouldn't Oops.
In dcr-low.S we use cmpli with three arguments, instead of four
arguments as defined in the ISA:
cmpli cr0,r3,1024
This appears to be a PPC440-ism, looking at the "PPC440x5 CPU Core
User’s Manual" it shows cmpli having no L field, but implied to be 0 due
to the core being 32-bit. It mentions that the ISA defines four
arguments and recommends using cmplwi.
It also corresponds to the old POWER instruction set, which had no L
field there, a reserved bit instead.
dcr-low.S is only built 32-bit, because it is only built when
DCR_NATIVE=y, which is only selected by 40x and 44x. Looking at the
generated code (with gcc/gas) we see cmplwi as expected.
Although gas is happy with the 3-argument version when building for
32-bit, the LLVM assembler is not and errors out with:
arch/powerpc/sysdev/dcr-low.S:27:10: error: invalid operand for instruction
cmpli 0,%r3,1024; ...
^
Switch to the cmplwi extended opcode, which avoids any confusion when
reading the ISA, fixes the issue with the LLVM assembler, and also means
the code could be built 64-bit in future (though that's very unlikely).
DART has an additional global register to control which streams are
isolated. This register is a bit redundant since DART_TCR can already
be used to control isolation and is usually initialized to DART_STREAM_ALL
by the time we get control. Some DARTs (namely the one used for the audio
controller) however have some streams disabled initially. Make sure those
work by initializing DART_STREAMS_ENABLE during reset.
Reported-by: Martin Povišer <povik@protonmail.com> Signed-off-by: Sven Peter <sven@svenpeter.dev> Reviewed-by: Hector Martin <marcan@marcan.st> Link: https://lore.kernel.org/r/20211019162253.45919-1-sven@svenpeter.dev Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
On SAMA7G5 the prescaler part of master clock has been implemented as a
changeable one. Everytime the prescaler is changed the PMC_SR.MCKRDY bit
must be polled. Value 1 for PMC_SR.MCKRDY means the prescaler update is
done. Driver polls for this bit until it becomes 1. On SAMA7G5 it has
been discovered that in some conditions the PMC_SR.MCKRDY is not rising
but the rate it provides it's stable. The workaround is to add a timeout
when polling for PMC_SR.MCKRDY. At the moment, for SAMA7G5, the prescaler
will be removed from Linux clock tree as all the frequencies for CPU could
be obtained from PLL and also there will be less overhead when changing
frequency via DVFS.
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20211011112719.3951784-14-claudiu.beznea@microchip.com Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The pointer cs_desc return from snd_usb_find_clock_source could
be null, so there is a potential null pointer dereference issue.
Fix this by adding a null check before dereference.
The pointer block return from snd_gf1_dma_next_block could be
null, so there is a potential null pointer dereference issue.
Fix this by adding a null check before dereference.
Signed-off-by: Chengfeng Ye <cyeaa@connect.ust.hk> Link: https://lore.kernel.org/r/20211024104611.9919-1-cyeaa@connect.ust.hk Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
According to the new DT schema for qcom,rpm-msg-ram the node name
should be sram@. memory@ is reserved for definition of physical RAM
(usable by Linux).
This fixes the following dtbs_check error on various device trees:
memory@60000: 'device_type' is a required property
From schema: dtschema/schemas/memory.yaml
Signed-off-by: Stephan Gerhold <stephan@gerhold.net> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20211018110009.30837-1-stephan@gerhold.net Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Tegra20 EMC driver uses simple devfreq governor. Add simple devfreq
governor to the list of the Tegra20 EMC driver module softdeps to allow
userspace initramfs tools like dracut to automatically pull the devfreq
module into ramfs image together with the EMC module.
Reported-by: Nicolas Chauvet <kwizart@gmail.com> Suggested-by: Nicolas Chauvet <kwizart@gmail.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Link: https://lore.kernel.org/r/20211019231524.888-1-digetx@gmail.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
A link bounce to a slow fabric may observe FDISC response delays lasting
longer than devloss tmo. Current logic decrements the final fabric node
kref during a devloss tmo event. This results in a NULL ptr dereference
crash if the FDISC completes for that fabric node after devloss tmo.
Fix by adding the NLP_IN_RECOV_POST_DEV_LOSS flag, which is set when
devloss tmo triggers and we've noticed that fabric node recovery has
already started or finished in between the time lpfc_dev_loss_tmo_callbk
queues lpfc_dev_loss_tmo_handler. If fabric node recovery succeeds, then
the driver reverses the devloss tmo marked kref put with a kref get. If
fabric node recovery fails, then the final kref put relies on the ELS
timing out or the REG_LOGIN cmpl routine.
Link: https://lore.kernel.org/r/20211020211417.88754-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
If an FC link down transition while PLOGIs are outstanding to fabric well
known addresses, outstanding ABTS requests may result in a NULL pointer
dereference. Driver unload requests may hang with repeated "2878" log
messages.
The Link down processing results in ABTS requests for outstanding ELS
requests. The Abort WQEs are sent for the ELSs before the driver had set
the link state to down. Thus the driver is sending the Abort with the
expectation that an ABTS will be sent on the wire. The Abort request is
stalled waiting for the link to come up. In some conditions the driver may
auto-complete the ELSs thus if the link does come up, the Abort completions
may reference an invalid structure.
Fix by ensuring that Abort set the flag to avoid link traffic if issued due
to conditions where the link failed.
Link: https://lore.kernel.org/r/20211020211417.88754-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
An error is detected with the following report when unloading the driver:
"KASAN: use-after-free in lpfc_unreg_rpi+0x1b1b"
The NLP_REG_LOGIN_SEND nlp_flag is set in lpfc_reg_fab_ctrl_node(), but the
flag is not cleared upon completion of the login.
This allows a second call to lpfc_unreg_rpi() to proceed with nlp_rpi set
to LPFC_RPI_ALLOW_ERROR. This results in a use after free access when used
as an rpi_ids array index.
Fix by clearing the NLP_REG_LOGIN_SEND nlp_flag in
lpfc_mbx_cmpl_fc_reg_login().
Link: https://lore.kernel.org/r/20211020211417.88754-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The 'struct attribute' flex array contains some struct lock_class_key's
which become big when lockdep is turned on. Big enough that some drivers
will not load when CONFIG_PROVE_LOCKING=y because they cannot allocate
enough memory:
The length of hw->settings->odr_table is 2 and ref_sensor->id is an enum
variable whose value is between 0 and 5.
However, the value ST_LSM6DSX_ID_MAX (i.e. 5) is not caught properly in
switch (sensor->id) {
If ref_sensor->id is ST_LSM6DSX_ID_MAX, an array overflow will ocurrs in
function st_lsm6dsx_check_odr():
odr_table = &sensor->hw->settings->odr_table[sensor->id];
and in function st_lsm6dsx_set_odr():
reg = &hw->settings->odr_table[ref_sensor->id].reg;
To avoid this array overflow, handle ST_LSM6DSX_ID_GYRO explicitly and
return -EINVAL for the default case.
The enum value ST_LSM6DSX_ID_MAX is only present as an easy way to check
the limit and as such is never used, however this is not locally obvious.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Teng Qi <starmiku1207184332@gmail.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/20211011114003.976221-1-starmiku1207184332@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
We can't free the tg_pt_gp in core_alua_set_tg_pt_gp_id() because it's
still accessed via configfs. Its release must go through the normal
configfs/refcount process.
The max alua_tg_pt_gps_count check should probably have been done in
core_alua_allocate_tg_pt_gp(), but with the current code userspace could
have created 0x0000ffff + 1 groups, but only set the id for 0x0000ffff.
Then it could have deleted a group with an ID set, and then set the ID for
that extra group and it would work ok.
It's unlikely, but just in case this patch continues to allow that type of
behavior, and just fixes the kfree() while in use bug.
Link: https://lore.kernel.org/r/20210930020422.92578-4-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
1. If there are multiple ordered cmds queued and multiple simple cmds
completing, target_restart_delayed_cmds() could be called on different
CPUs and each instance could start a ordered cmd. They could then run in
different orders than they were queued.
2. target_restart_delayed_cmds() and target_handle_task_attr() can race
where:
1. target_handle_task_attr() has passed the simple_cmds == 0 check.
2. transport_complete_task_attr() then decrements simple_cmds to 0.
3. transport_complete_task_attr() runs target_restart_delayed_cmds() and
it does not see any cmds on the delayed_cmd_list.
4. target_handle_task_attr() adds the cmd to the delayed_cmd_list.
The cmd will then end up timing out.
3. If we are sent > 1 ordered cmds and simple_cmds == 0, we can execute
them out of order, because target_handle_task_attr() will hit that
simple_cmds check first and return false for all ordered cmds sent.
4. We run target_restart_delayed_cmds() after every cmd completion, so if
there is more than 1 simple cmd running, we start executing ordered cmds
after that first cmd instead of waiting for all of them to complete.
5. Ordered cmds are not supposed to start until HEAD OF QUEUE and all older
cmds have completed, and not just simple.
6. It's not a bug but it doesn't make sense to take the delayed_cmd_lock
for every cmd completion when ordered cmds are almost never used. Just
replacing that lock with an atomic increases IOPs by up to 10% when
completions are spread over multiple CPUs and there are multiple
sessions/ mqs/thread accessing the same device.
This patch moves the queued delayed handling to a per device work to
serialze the cmd executions for each device and adds a new counter to track
HEAD_OF_QUEUE and SIMPLE cmds. We can then check the new counter to
determine when to run the work on the completion path.
Link: https://lore.kernel.org/r/20210930020422.92578-3-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The following issue was observed running syzkaller:
BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:377 [inline]
BUG: KASAN: slab-out-of-bounds in sg_copy_buffer+0x150/0x1c0 lib/scatterlist.c:831
Read of size 2132 at addr ffff8880aea95dc8 by task syz-executor.0/9815
We get 'alen' from command its type is int. If userspace passes a large
length we will get a negative 'alen'.
Switch n, alen, and rlen to u32.
Link: https://lore.kernel.org/r/20211013033913.2551004-3-yebin10@huawei.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
In resp_readcap16() we get "int alloc_len" value -1104926854, and then pass
the huge arr_len to fill_from_dev_buffer(), but arr is only 32 bytes. This
leads to OOB in sg_copy_buffer().
To solve this issue, define alloc_len as u32.
Link: https://lore.kernel.org/r/20211013033913.2551004-2-yebin10@huawei.com Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
In the testcase pty04, The first process call the write syscall to send
data to the pty master. At the same time, the workqueue will do the
flush_to_ldisc to pop data in a loop until there is no more data left.
When the sender and workqueue running in different core, the sender sends
data fastly in full time which will result in workqueue doing work in loop
for a long time and occuring softlockup in flush_to_ldisc with kernel
configured without preempt. So I add need_resched check and cond_resched
in the flush_to_ldisc loop to avoid it.
Handling of intel_iommu kernel command line option should return "true" to
indicate option is valid and so avoid logging it as unknown by the core
parsing code.
Also log unknown sub-options at the notice level to let user know of
potential typos or similar.
On m68k, compiling drivers under SND_ISA causes build errors:
../sound/core/isadma.c: In function 'snd_dma_program':
../sound/core/isadma.c:33:17: error: implicit declaration of function 'claim_dma_lock' [-Werror=implicit-function-declaration]
33 | flags = claim_dma_lock();
| ^~~~~~~~~~~~~~
../sound/core/isadma.c:41:9: error: implicit declaration of function 'release_dma_lock' [-Werror=implicit-function-declaration]
41 | release_dma_lock(flags);
| ^~~~~~~~~~~~~~~~
../sound/isa/sb/sb16_main.c: In function 'snd_sb16_playback_prepare':
../sound/isa/sb/sb16_main.c:253:72: error: 'DMA_AUTOINIT' undeclared (first use in this function)
253 | snd_dma_program(dma, runtime->dma_addr, size, DMA_MODE_WRITE | DMA_AUTOINIT);
| ^~~~~~~~~~~~
../sound/isa/sb/sb16_main.c:253:72: note: each undeclared identifier is reported only once for each function it appears in
../sound/isa/sb/sb16_main.c: In function 'snd_sb16_capture_prepare':
../sound/isa/sb/sb16_main.c:322:71: error: 'DMA_AUTOINIT' undeclared (first use in this function)
322 | snd_dma_program(dma, runtime->dma_addr, size, DMA_MODE_READ | DMA_AUTOINIT);
| ^~~~~~~~~~~~
We cannot list all the possible chips used in different board revisions,
just use the generic "jedec,spi-nor" compatible instead. This also
fixes dtbs_check error:
['jedec,spi-nor', 's25fl256s1', 's25fl512s'] is too long
Signed-off-by: Li Yang <leoyang.li@nxp.com> Reviewed-by: Kuldeep Singh <kuldeep.singh@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
This fixes dtbs-check error from simple-bus schema:
soc: thermal-zones: {'type': 'object'} is not allowed for {'cpu-thermal': ..... }
From schema: /home/leo/.local/lib/python3.8/site-packages/dtschema/schemas/simple-bus.yaml
Signed-off-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
A little pop can be heard obviously from HP while playing a silent.
This patch fixes it by using two functions:
1. Enable HP 1bit output mode.
2. Change the charge pump switch size during playback on and off.
Signed-off-by: Derek Fang <derek.fang@realtek.com> Link: https://lore.kernel.org/r/20211014094054.811-1-derek.fang@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
When a finger is on the screen, the UPERFECT Y portable touchscreen
monitor reports a contact in the first place. However, after this
initial report, contacts are not reported at the refresh rate of the
screen as required by the Windows 8 specs.
This behaviour triggers the release_timer, removing the fingers even
though they are still present.
To avoid it, add a new class, similar to MT_CLS_WIN_8 but without the
MT_QUIRK_STICKY_FINGERS quirk for this device.
Suggested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The firmware_loader can be used with a pre-allocated buffer
through the use of the API calls:
o request_firmware_into_buf()
o request_partial_firmware_into_buf()
If the firmware was built-in and present, our current check
for if the built-in firmware fits into the pre-allocated buffer
does not return any errors, and we proceed to tell the caller
that everything worked fine. It's a lie and no firmware would
end up being copied into the pre-allocated buffer. So if the
caller trust the result it may end up writing a bunch of 0's
to a device!
Fix this by making the function that checks for the pre-allocated
buffer return non-void. Since the typical use case is when no
pre-allocated buffer is provided make this return successfully
for that case. If the built-in firmware does *not* fit into the
pre-allocated buffer size return a failure as we should have
been doing before.
I'm not aware of users of the built-in firmware using the API
calls with a pre-allocated buffer, as such I doubt this fixes
any real life issue. But you never know... perhaps some oddball
private tree might use it.
In so far as upstream is concerned this just fixes our code for
correctness.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20210917182226.3532898-2-mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
These devices are based on an I2C/I2S device, we need to force the use
of the SOF driver otherwise the legacy HDaudio driver will be loaded -
only HDMI will be supported.
Co-developed-by: Huajun Li <huajun.li@intel.com> Signed-off-by: Huajun Li <huajun.li@intel.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20211004213512.220836-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
According to the datasheet the VSC8531 PHY expects a reset pulse of 100 ns
and a delay of 15 ms after the reset has been deasserted. Set the matching
values in the devicetree.
There are occasions when a controller reboot (controller soft reset) is
issued when a controller firmware crash dump is in progress.
This leads to incomplete controller firmware crash dump:
- When the controller crash dump is in progress, and a kdump is initiated,
the driver issues inbound doorbell reset to bring back the controller in
SIS mode.
- If the controller is in locked up state, the inbound doorbell reset does
not work causing controller initialization failures. This results in the
driver hanging waiting for SIS mode.
To avoid an incomplete controller crash dump, add in a controller crash
dump handshake:
- Controller will indicate start and end of the controller crash dump by
setting some register bits.
- Driver will look these bits when a kdump is initiated. If a controller
crash dump is in progress, the driver will wait for the controller crash
dump to complete before issuing the controller soft reset then complete
driver initialization.
Link: https://lore.kernel.org/r/20210928235442.201875-3-don.brace@microchip.com Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Acked-by: John Donnelly <john.p.donnelly@oracle.com> Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Add a quirk mechanism to allow specifying that active-high jack-detection
should be used on platforms where this info is not available in devicetree.
And add an entry for the Cyberbook T116 tablet to the DMI table, so that
jack-detection will work properly on this tablet.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20211002211459.110124-2-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Use the new IRQF_NO_AUTOEN flag when requesting the IRQ, rather then
disabling it immediately after requesting it.
This fixes a possible race where the IRQ might trigger between requesting
and disabling it; and this also leads to a small code cleanup.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20211003132255.31743-2-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Use the new IRQF_NO_AUTOEN flag when requesting the IRQ, rather then
disabling it immediately after requesting it.
This fixes a possible race where the IRQ might trigger between requesting
and disabling it; and this also leads to a small code cleanup.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20211003132255.31743-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The csi_sel mux register is located in the CCM register base and not the
CCM_ANALOG register base. So move it to the correct position in code.
Otherwise changing the parent of the csi clock can lead to a complete
system failure due to the CCM_ANALOG_PLL_SYS_TOG register being falsely
modified.
Also remove the SET_RATE_PARENT flag since one possible supply for the
csi_sel mux is the system PLL which we don't want to modify.
Signed-off-by: Stefan Riedmueller <s.riedmueller@phytec.de> Reviewed-by: Abel Vesa <abel.vesa@nxp.com> Link: https://lore.kernel.org/r/20210927072857.3940880-1-s.riedmueller@phytec.de Signed-off-by: Abel Vesa <abel.vesa@nxp.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Behringer UFX1204 and UFX1604 have Synchronous endpoints to which
current ALSA code applies implicit feedback sync as if they were
Asynchronous endpoints. This breaks UAC compliance and is unneeded.
Several problems exist with scsi_mode_sense() buffer length handling:
1) The allocation length field of the MODE SENSE(10) command is 16-bits,
occupying bytes 7 and 8 of the CDB. With this command, access to mode
pages larger than 255 bytes is thus possible. However, the CDB
allocation length field is set by assigning len to byte 8 only, thus
truncating buffer length larger than 255.
2) If scsi_mode_sense() is called with len smaller than 8 with
sdev->use_10_for_ms set, or smaller than 4 otherwise, the buffer length
is increased to 8 and 4 respectively, and the buffer is zero filled
with these increased values, thus corrupting the memory following the
buffer.
Fix these 2 problems by using put_unaligned_be16() to set the allocation
length field of MODE SENSE(10) CDB and by returning an error when len is
too small.
Furthermore, if len is larger than 255B, always try MODE SENSE(10) first,
even if the device driver did not set sdev->use_10_for_ms. In case of
invalid opcode error for MODE SENSE(10), access to mode pages larger than
255 bytes are not retried using MODE SENSE(6). To avoid buffer length
overflows for the MODE_SENSE(10) case, check that len is smaller than 65535
bytes.
While at it, also fix the folowing:
* Use get_unaligned_be16() to retrieve the mode data length and block
descriptor length fields of the mode sense reply header instead of using
an open coded calculation.
* Fix the kdoc dbd argument explanation: the DBD bit stands for Disable
Block Descriptor, which is the opposite of what the dbd argument
description was.
Link: https://lore.kernel.org/r/20210820070255.682775-2-damien.lemoal@wdc.com Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
The initial hdac_stream code was adapted a third time with the same
locking issues. Move the spin_lock outside the loops and make sure the
fields are protected on read/write.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210924192417.169243-5-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Separate software and simulated hardware lkeys and rkeys for MRs and MWs.
This makes struct ib_mr and struct ib_mw isolated from hardware changes
triggered by executing work requests.
U-boot atempts to read serial alias value for ls1012a-rdb but couldn't
do so as it is not initialised and thus, FDT_ERR_NOTFOUND error is
reported while booting linux.
Loading fdt from FIT Image at a0000000 ...
Description: ls1012ardb-dtb
Type: Flat Device Tree
Data Start: 0xab111474
Data Size: 11285 Bytes = 11 KiB
Architecture: AArch64
Load Address: 0x90000000
Loading fdt from 0xab111474 to 0x90000000
Booting using the fdt blob at 0x90000000
Uncompressing Kernel Image
Loading Device Tree to 000000008fffa000, end 000000008ffffc14 ... OK
WARNING: fdt_fixup_stdout: could not read serial0 alias: FDT_ERR_NOTFOUND
NOTICE: RNG: INSTANTIATED
Starting kernel ...
Fix the above error by specifying serial value to duart.
The entry/exit latency and minimum residency in state for the idle
states of MSM8998 were ..bad: first of all, for all of them the
timings were written for CPU sleep but the min-residency-us param
was miscalculated (supposedly, while porting this from downstream);
Then, the power collapse states are setting PC on both the CPU
cluster *and* the L2 cache, which have different timings: in the
specific case of L2 the times are higher so these ones should be
taken into account instead of the CPU ones.
This parameter misconfiguration was not giving particular issues
because on MSM8998 there was no CPU scaling at all, so cluster/L2
power collapse was rarely (if ever) hit.
When CPU scaling is enabled, though, the wrong timings will produce
SoC unstability shown to the user as random, apparently error-less,
sudden reboots and/or lockups.
This set of parameters are stabilizing the SoC when CPU scaling is
ON and when power collapse is frequently hit.
the switch identifies itself as a BCM53012 (rev 5)...
This patch has been tested & verified on OpenWrt's
snapshot with Linux 5.10 (didn't test any older kernels).
The MR32 is able to "talk to the network" as before with
OpenWrt's SWITCHDEV b53 driver.
The assoc_timer takes the pmlmepriv->lock and various functions which
take the pmlmepriv->scanned_queue.lock first take the pmlmepriv->lock,
this means that we cannot have code which waits for the timer
(timer_del_sync) while holding the pmlmepriv->scanned_queue.lock
to avoid a triangle deadlock:
[ 363.139361] ======================================================
[ 363.139377] WARNING: possible circular locking dependency detected
[ 363.139396] 5.15.0-rc1+ #470 Tainted: G C E
[ 363.139413] ------------------------------------------------------
[ 363.139424] RTW_CMD_THREAD/2466 is trying to acquire lock:
[ 363.139441] ffffbacd00699038 (&pmlmepriv->lock){+.-.}-{2:2}, at: _rtw_join_timeout_handler+0x3c/0x160 [r8723bs]
[ 363.139598]
but task is already holding lock:
[ 363.139610] ffffbacd00128ea0 ((&pmlmepriv->assoc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x260
[ 363.139673]
which lock already depends on the new lock.
Make rtw_joinbss_event_prehandle() release the scanned_queue.lock before
it deletes the timer to avoid this (it is still holding pmlmepriv->lock
protecting against racing the timer).
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210920145502.155454-3-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Lockdep complains about rtw_free_assoc_resources() taking the sta_hash_lock
followed by it calling rtw_free_stainfo() which takes xmitpriv->lock.
While the rtl8723bs_xmit_thread takes the sta_hash_lock while already
holding the xmitpriv->lock:
[ 103.849756] ======================================================
[ 103.849761] WARNING: possible circular locking dependency detected
[ 103.849767] 5.15.0-rc1+ #470 Tainted: G C E
[ 103.849773] ------------------------------------------------------
[ 103.849776] wpa_supplicant/695 is trying to acquire lock:
[ 103.849781] ffffa5d0c0562b00 (&pxmitpriv->lock){+.-.}-{2:2}, at: rtw_free_stainfo+0x8a/0x510 [r8723bs]
[ 103.849840]
but task is already holding lock:
[ 103.849843] ffffa5d0c05636a8 (&pstapriv->sta_hash_lock){+.-.}-{2:2}, at: rtw_free_assoc_resources+0x48/0x110 [r8723bs]
[ 103.849881]
which lock already depends on the new lock.
Push the taking of sta_hash_lock down into rtw_free_stainfo(),
where the critical section is, this allows taking the lock after
rtw_free_stainfo() has released pxmitpriv->lock.
This requires changing rtw_free_all_stainfo() so that it does its freeing
in 2 steps, first moving all stainfo-s to free to a local list while
holding the sta_hash_lock and then walking that list to call
rtw_free_stainfo() on them without holding the sta_hash_lock.
Pushing the taking of sta_hash_lock down into rtw_free_stainfo(),
also fixes a whole bunch of callers of rtw_free_stainfo() which
were not holding that lock even though they should.
Note that this also fixes the deadlock from the "remove possible
deadlock when disconnect" patch in a different way. But the
changes from that patch offer a nice locking cleanup regardless.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20210920145502.155454-2-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>