Yang Yingliang [Fri, 26 Mar 2021 10:13:50 +0000 (18:13 +0800)]
net: llc: Correct function name llc_pdu_set_pf_bit() in header
Fix the following make W=1 kernel build warning:
net/llc/llc_pdu.c:36: warning: expecting prototype for pdu_set_pf_bit(). Prototype was for llc_pdu_set_pf_bit() instead
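The warning arises because the name on the first line of a kernel-doc comment has to match the function that follows it. A minimal userspace sketch, using a hypothetical helper sample_set_pf_bit() and a made-up flag position (this is not the LLC code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_PF_MASK 0x10	/* hypothetical flag position */

/**
 * sample_set_pf_bit - set or clear a poll/final flag in a control byte
 * @ctrl: control byte to modify
 * @bit_value: 0 clears the flag, any other value sets it
 *
 * The name before the dash matches the function defined below. Writing
 * "set_pf_bit" there instead is the kind of mismatch W=1 reports.
 */
static inline void sample_set_pf_bit(uint8_t *ctrl, int bit_value)
{
	assert(ctrl != NULL);

	if (bit_value)
		*ctrl |= SAMPLE_PF_MASK;
	else
		*ctrl &= (uint8_t)~SAMPLE_PF_MASK;
}
```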
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 26 Mar 2021 10:13:49 +0000 (18:13 +0800)]
net: llc: Correct function name llc_sap_action_unitdata_ind() in header
Fix the following make W=1 kernel build warning:
net/llc/llc_s_ac.c:38: warning: expecting prototype for llc_sap_action_unit_data_ind(). Prototype was for llc_sap_action_unitdata_ind() instead
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 26 Mar 2021 10:13:48 +0000 (18:13 +0800)]
net: llc: Correct some function names in header
Fix the following make W=1 kernel build warning:
net/llc/llc_c_ev.c:622: warning: expecting prototype for conn_ev_qlfy_last_frame_eq_1(). Prototype was for llc_conn_ev_qlfy_last_frame_eq_1() instead
net/llc/llc_c_ev.c:636: warning: expecting prototype for conn_ev_qlfy_last_frame_eq_0(). Prototype was for llc_conn_ev_qlfy_last_frame_eq_0() instead
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hoang Le [Fri, 26 Mar 2021 09:14:14 +0000 (16:14 +0700)]
tipc: fix kernel-doc warnings
Fix kernel-doc warning introduced in
commit b83e214b2e04 ("tipc: add extack messages for bearer/media failure"):
net/tipc/bearer.c:248: warning: Function parameter or member 'extack' not described in 'tipc_enable_bearer'
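This class of warning goes away once every parameter has its own "@name:" line in the kernel-doc comment. A sketch with a hypothetical function and a hypothetical stand-in for the extended ACK structure (neither is taken from the TIPC sources):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an extended ACK object carrying an error text. */
struct sample_ext_ack {
	const char *msg;
};

/**
 * sample_enable_bearer - enable a named bearer
 * @name: bearer name to enable
 * @extack: extended ACK used to report a failure reason
 *
 * Every parameter is described; dropping the @extack line would trigger
 * a "not described" warning for it.
 *
 * Return: 0 on success, -1 if @name is empty.
 */
static int sample_enable_bearer(const char *name, struct sample_ext_ack *extack)
{
	assert(name != NULL && extack != NULL);

	if (name[0] == '\0') {
		extack->msg = "Bearer name missing";
		return -1;
	}
	extack->msg = NULL;
	return 0;
}
```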
Fixes: b83e214b2e04 ("tipc: add extack messages for bearer/media failure") Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
net: stmmac: Fix kernel panic due to NULL pointer dereference of fpe_cfg
In the patch "net: stmmac: support FPE link partner hand-shaking
procedure", priv->plat->fpe_cfg is not allocated with devm_kzalloc() if
dma_cap->frpsel is 0 (Flexible Rx Parser is not supported by the SoC) in
tc_init(). As a result, fpe_cfg remains NULL and accessing it causes a
kernel panic.
To fix this, move the devm_kzalloc() of priv->plat->fpe_cfg before the
dma_cap->frpsel check in tc_init(). Additionally, check
priv->dma_cap.fpesel before calling stmmac_fpe_link_state_handle(), as
only SoCs that support FPE are allowed to call that function.
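The shape of the fix can be sketched in userspace C. Everything here is a simplified assumption: the sample_* structures are hypothetical stand-ins for the driver's, calloc() stands in for devm_kzalloc(), and the return values are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sample_fpe_cfg { bool enable; };
struct sample_caps { bool frpsel; bool fpesel; };
struct sample_priv {
	struct sample_caps dma_cap;
	struct sample_fpe_cfg *fpe_cfg;
};

/* Allocate fpe_cfg first, then bail out if the Rx parser is absent. */
static int sample_tc_init(struct sample_priv *priv)
{
	if (!priv->fpe_cfg) {
		priv->fpe_cfg = calloc(1, sizeof(*priv->fpe_cfg));
		if (!priv->fpe_cfg)
			return -1;
	}

	if (!priv->dma_cap.frpsel)
		return -1;	/* parser unsupported, but fpe_cfg is valid */

	return 0;
}

/* Touch fpe_cfg only on hardware that reports FPE support. */
static bool sample_link_up(struct sample_priv *priv)
{
	if (!priv->dma_cap.fpesel)
		return false;

	assert(priv->fpe_cfg != NULL);
	priv->fpe_cfg->enable = true;
	return true;
}
```

With the allocation placed after the frpsel check instead, sample_link_up() would dereference a NULL fpe_cfg on hardware without the Rx parser.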
Below is the kernel panic dump reported by Marek Szyprowski
<m.szyprowski@samsung.com>:
meson8b-dwmac ff3f0000.ethernet eth0: PHY [0.0:00] driver [RTL8211F Gigabit Ethernet] (irq=35)
meson8b-dwmac ff3f0000.ethernet eth0: No Safety Features support found
meson8b-dwmac ff3f0000.ethernet eth0: PTP not supported by HW
meson8b-dwmac ff3f0000.ethernet eth0: configuring for phy/rgmii link mode
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000001
Mem abort info:
...
user pgtable: 4k pages, 48-bit VAs, pgdp=00000000044eb000
[0000000000000001] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in: dw_hdmi_i2s_audio dw_hdmi_cec meson_gxl realtek meson_gxbb_wdt snd_soc_meson_axg_sound_card dwmac_generic axg_audio meson_dw_hdmi crct10dif_ce snd_soc_meson_card_utils snd_soc_meson_axg_tdmout panfrost rc_odroid gpu_sched reset_meson_audio_arb meson_ir snd_soc_meson_g12a_tohdmitx snd_soc_meson_axg_frddr sclk_div clk_phase snd_soc_meson_codec_glue dwmac_meson8b snd_soc_meson_axg_fifo stmmac_platform meson_rng meson_drm stmmac rtc_meson_vrtc rng_core meson_canvas pwm_meson dw_hdmi mdio_mux_meson_g12a pcs_xpcs snd_soc_meson_axg_tdm_interface snd_soc_meson_axg_tdm_formatter nvmem_meson_efuse display_connector
CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.12.0-rc4-next-20210325+
Hardware name: Hardkernel ODROID-C4 (DT)
Workqueue: events_power_efficient phylink_resolve
pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--)
pc : stmmac_mac_link_up+0x14c/0x348 [stmmac]
lr : stmmac_mac_link_up+0x284/0x348 [stmmac] ...
Call trace:
stmmac_mac_link_up+0x14c/0x348 [stmmac]
phylink_resolve+0x104/0x420
process_one_work+0x2a8/0x718
worker_thread+0x48/0x460
kthread+0x134/0x160
ret_from_fork+0x10/0x18
Code: b971ba60 350007c0 f958c260 f9402000 (39400401)
---[ end trace 0c9deb6c510228aa ]---
Fixes: 5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Hancock [Fri, 26 Mar 2021 00:04:38 +0000 (18:04 -0600)]
net: axienet: Enable more clocks
This driver was only enabling the first clock on the device, regardless
of its name. However, this controller logic can have multiple clocks
which should all be enabled. Add support for enabling additional clocks.
The clock names used match those used in the Xilinx version of this
driver as well as the Xilinx device tree generator, except for mgt_clk,
which is not present there.
For backward compatibility, if no named clocks are present, the first
clock present is used for determining the MDIO bus clock divider.
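The backward-compatibility rule can be illustrated with a small lookup. This is a sketch under assumptions: the clock list is modelled as a plain array, and "bus_clk" is a made-up name rather than one of the names the driver uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a device's clock list; names are illustrative. */
struct sample_clk {
	const char *name;	/* NULL for an unnamed clock */
	unsigned long rate_hz;
};

#define SAMPLE_BUS_CLK_NAME "bus_clk"	/* assumed name, not from the driver */

/*
 * Pick the clock used for the MDIO divider: prefer the named bus clock,
 * and fall back to the first clock when no clock carries that name.
 */
static const struct sample_clk *
sample_pick_mdio_clk(const struct sample_clk *clks, size_t n)
{
	assert(clks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (clks[i].name && strcmp(clks[i].name, SAMPLE_BUS_CLK_NAME) == 0)
			return &clks[i];
	}
	return n ? &clks[0] : NULL;
}
```

An older device tree with a single unnamed clock therefore keeps working, because that clock is still the one selected.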
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Update DT bindings to describe all of the clocks that the axienet
driver will now be able to make use of.
Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Mar 2021 22:14:57 +0000 (15:14 -0700)]
Merge branch 'mld-sleepable'
Taehee Yoo says:
====================
mld: change context from atomic to sleepable
This patchset changes the context of the MLD module.
Before this patchset, MLD functions ran in atomic context, so they could
not use sleepable functions and flags.
There are several reasons why MLD functions run in atomic context.
1. It uses the timer API.
Timer expiration functions are executed in atomic context.
2. Atomic locks
MLD functions use rwlock and spinlock to protect their own resources.
So, in order to switch context, this patchset converts resources to use
RCU and removes the atomic locks and the timer API.
1. The first patch converts from the timer API to delayed work.
The timer API is used for delaying some work.
The MLD protocol has a delay mechanism, which is used for replying to a query.
If a listener receives a query from a router, it should send a response
after some delay. But because the timer expiry function is executed in
atomic context, this patch converts from the timer API to delayed work.
2. The fourth patch deletes inet6_dev->mc_lock.
The mc_lock has protected the inet6_dev->mc_tomb pointer.
But this pointer is already protected by RTNL and it isn't used by the
datapath. So, the lock isn't needed, and removing it deletes many
atomic-context critical sections.
3. The fifth patch converts ip6_sf_socklist to RCU.
ip6_sf_socklist has been protected by ipv6_mc_socklist->sflock (rwlock).
But it is already protected by RTNL, so if it is converted to use RCU
in order to be used in the datapath, the sflock is no longer needed.
So, its control path context can be switched to sleepable.
4. The sixth patch converts ip6_sf_list to RCU.
The reason for this patch is the same as the previous patch.
5. The seventh patch converts ifmcaddr6 to RCU.
The reason for this patch is the same as the previous patch.
6. Add new workqueues for processing query/report events.
With this patch, query and report events are processed by a workqueue,
so the context is sleepable, not atomic.
While running, this logic holds RTNL.
7. Add new mc_lock.
The purpose of this lock is to protect per-interface MLD data.
Per-interface MLD data is usually used by the query/report event handlers.
So, query/report event workers need only this lock instead of RTNL.
Therefore, it can reduce the bottleneck.
Changelog:
v2 -> v3:
1. Do not use msecs_to_jiffies().
(by Cong Wang)
2. Do not add unnecessary rtnl_lock() and rtnl_unlock().
(by Cong Wang)
3. Fix sparse warnings because of rcu annotation.
(by kernel test robot)
- Remove some rcu_assign_pointer(), which was used for non-rcu pointer.
- Add union for rcu pointer.
- Use rcu API in mld_clear_zeros().
- Remove remained rcu_read_unlock().
- Use rcu API for tomb resources.
4. Withdraw previous 2nd and 3rd patch.
- "separate two flags from ifmcaddr6->mca_flags"
- "add a new delayed_work, mc_delrec_work"
5. Add 6th and 7th patch.
v1 -> v2:
1. Withdraw unnecessary refactoring patches.
(by Cong Wang, Eric Dumazet, David Ahern)
a) convert from array to list.
b) function rename.
2. Split the one big patch into several small patches.
3. Do not rename 'ifmcaddr6->mca_lock'.
In the v1 patch, this variable was changed to 'ifmcaddr6->mca_work_lock'.
But this is actually not needed.
4. Do not use atomic_t for 'ifmcaddr6->mca_sfcount' and
'ipv6_mc_socklist->sf_count'.
5. Do not add mld_check_leave_group() function.
6. Do not add ip6_mc_del_src_bulk() function.
7. Do not add ip6_mc_add_src_bulk() function.
8. Do not use rcu_read_lock() in the qeth_l3_add_mcast_rtnl().
(by Julian Wiedmann)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:57 +0000 (16:16 +0000)]
mld: add mc_lock for protecting per-interface mld data
The purpose of this lock is to avoid a bottleneck in the query/report
event handler logic.
After the previous patches, almost all MLD data is protected by RTNL.
So, the query and report event handlers, which are data path logic,
acquire RTNL too. Therefore, if a lot of query and report events
are received, RTNL is held for a long time,
which makes RTNL a control-plane bottleneck.
In order to avoid this bottleneck, mc_lock is added.
mc_lock protects only per-interface MLD data, and per-interface MLD
data is used in the query/report event handler logic.
So, rtnl_lock is no longer needed in the query/report event handler logic.
Therefore, mc_lock removes the bottleneck.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:56 +0000 (16:16 +0000)]
mld: add new workqueues for process mld events
When query/report packets are received, the MLD module processes them.
But they are processed in BH context, so sleepable functions cannot be
used. So, in order to switch context, two workqueues are
added which process query and report events.
In struct inet6_dev, mc_{query | report}_queue are added, so there
is a per-interface queue.
And mc_{query | report}_work are the workqueue structures.
When a query or report event is received, the skb is queued to the proper
queue and the worker function is scheduled immediately.
Workqueues and queues are protected by a spinlock,
mc_{query | report}_lock, and worker functions are protected by RTNL.
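The queue-then-defer pattern described above can be modelled in a few lines of userspace C. This is only a single-threaded sketch with hypothetical names: integers stand in for skbs, a flag stands in for scheduling the work item, and the spinlock and RTNL protection are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SAMPLE_QUEUE_LEN 8	/* hypothetical per-interface queue depth */

struct sample_idev {
	int queue[SAMPLE_QUEUE_LEN];	/* stands in for queued skbs */
	size_t head, tail;		/* head == tail means empty */
	bool work_pending;		/* worker has been scheduled */
	int processed;			/* events handled by the worker */
};

/* Receive path (atomic context in the kernel): queue only, no processing. */
static bool sample_event_rcv(struct sample_idev *idev, int pkt)
{
	if (idev->tail - idev->head == SAMPLE_QUEUE_LEN)
		return false;		/* queue full: drop */
	idev->queue[idev->tail++ % SAMPLE_QUEUE_LEN] = pkt;
	idev->work_pending = true;	/* schedule the worker */
	return true;
}

/* Worker (sleepable context in the kernel): drain and process the queue. */
static void sample_event_work(struct sample_idev *idev)
{
	assert(idev->work_pending);

	while (idev->head != idev->tail) {
		(void)idev->queue[idev->head++ % SAMPLE_QUEUE_LEN];
		idev->processed++;
	}
	idev->work_pending = false;
}
```

The receive path never does more than enqueue, so anything that may sleep is confined to the worker.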
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:55 +0000 (16:16 +0000)]
mld: convert ifmcaddr6 to RCU
The ifmcaddr6 has been protected by inet6_dev->lock (rwlock), so
the critical section is atomic context. In order to switch this context,
the locking needs to change. The ifmcaddr6 is actually already protected by
RTNL, so if it's converted to use RCU, its control path context can be
switched to sleepable.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:54 +0000 (16:16 +0000)]
mld: convert ip6_sf_list to RCU
The ip6_sf_list has been protected by mca_lock (spin_lock), so the
critical section is atomic context. In order to switch this context,
the locking needs to change. The ip6_sf_list is actually already protected
by RTNL, so if it's converted to use RCU, its control path context can
be switched to sleepable.
But this patch doesn't remove mca_lock yet because ifmcaddr6 isn't
converted to RCU yet. So, it's not fully converted to the sleepable context.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:53 +0000 (16:16 +0000)]
mld: convert ipv6_mc_socklist->sflist to RCU
The sflist has been protected by an rwlock, so the critical section
is atomic context.
In order to switch this context, the locking needs to change.
The sflist is actually already protected by RTNL, so if it's converted
to use RCU, its control path context can be switched to sleepable.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:52 +0000 (16:16 +0000)]
mld: get rid of inet6_dev->mc_lock
The purpose of mc_lock is to protect inet6_dev->mc_tomb.
But mc_tomb is already protected by RTNL, and all functions
that manipulate mc_tomb are called under RTNL.
So, mc_lock is not needed.
Furthermore, it is a spinlock, so the critical section is atomic.
In order to reduce atomic context, it should be removed.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 25 Mar 2021 16:16:51 +0000 (16:16 +0000)]
mld: convert from timer to delayed work
mcast.c has several timers for delaying work.
A timer's expiry handler runs in atomic context, so it can't use
sleepable things such as GFP_KERNEL, mutex, etc.
In order to use sleepable APIs, this patch converts the timers to delayed work.
But there are some critical sections that are used by both process
and BH context, so it still uses spin_lock_bh() and rwlock.
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 26 Mar 2021 20:22:22 +0000 (13:22 -0700)]
ethtool: fec: fix FEC_NONE check
Dan points out we need to use the mask not the bit (which is 0).
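The bit-versus-mask distinction is easy to show in isolation. The sketch below uses hypothetical SAMPLE_* names modelled on a "bit index plus derived mask" pair and assumes the check tests a flag with a bitwise AND.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names modelled on a "bit index" / "mask" pair. */
enum sample_fec_bits {
	SAMPLE_FEC_NONE_BIT,	/* bit index 0 */
	SAMPLE_FEC_AUTO_BIT,	/* bit index 1 */
};

#define SAMPLE_FEC_NONE	(1u << SAMPLE_FEC_NONE_BIT)	/* mask 0x1 */
#define SAMPLE_FEC_AUTO	(1u << SAMPLE_FEC_AUTO_BIT)	/* mask 0x2 */

static_assert(SAMPLE_FEC_NONE_BIT == 0,
	      "the bit index is zero, so it is useless as a mask");

/* Buggy: ANDs with the bit index (0), so the result is always false. */
static bool sample_has_none_buggy(uint32_t fec)
{
	return (fec & SAMPLE_FEC_NONE_BIT) != 0;
}

/* Fixed: ANDs with the mask built from the bit index. */
static bool sample_has_none(uint32_t fec)
{
	return (fec & SAMPLE_FEC_NONE) != 0;
}
```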
Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 42ce127d9864 ("ethtool: fec: sanitize ethtool_fecparam->fec") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Mar 2021 22:05:15 +0000 (15:05 -0700)]
Merge branch 'mptcp-cleanups'
Mat Martineau says:
====================
MPTCP: Cleanup and address advertisement fixes
This patch series contains cleanup and fixes we have been testing in the
MPTCP tree. MPTCP uses TCP option headers to advertise additional
address information after an initial connection is established. The main
fixes here deal with making those advertisements more reliable and
improving the way subflows are created after an advertisement is
received.
Patches 1, 2, 4, 10, and 12 are for various cleanup or refactoring.
Patch 3 skips an extra connection attempt if there's already a subflow
connection for the newly received advertisement.
Patches 5, 6, and 7 make sure that the next address is advertised when
there are multiple addresses to share, the advertisement has been
retried, and the peer has not echoed the advertisement. Self tests are
updated.
Patches 8 and 9 fix a problem similar to 5/6/7, but covers a case where
the failure was due to a subflow connection not completing.
Patches 11 and 13 send a bare ack to revoke an advertisement rather than
waiting for other activity to trigger a packet send. This mirrors the
way acks are sent for new advertisements. Self test is included.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:42 +0000 (11:26 -0700)]
selftests: mptcp: signal addresses testcases
This patch adds testcases for signalling multiple valid and invalid
addresses for both signal_address_tests and remove_tests.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:41 +0000 (11:26 -0700)]
mptcp: rename mptcp_pm_nl_add_addr_send_ack
Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and
RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:40 +0000 (11:26 -0700)]
mptcp: send ack for rm_addr
This patch extends the ACK sending conditions used for ADD_ADDR so that
an ACK packet is sent for RM_ADDR too.
In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send
the ACK packet.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:39 +0000 (11:26 -0700)]
mptcp: drop useless addr_signal clear
msk->pm.addr_signal is cleared in mptcp_pm_add_addr_signal, no need to
clear it in mptcp_pm_nl_add_addr_send_ack again. Drop it.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:38 +0000 (11:26 -0700)]
mptcp: move to next addr when subflow creation fail
When an invalid address was announced, the subflow couldn't be created
for this address. Therefore mptcp_pm_nl_subflow_established couldn't be
invoked. Then the next addresses in the local address list didn't have a
chance to be announced.
This patch invokes the new function mptcp_pm_add_addr_echoed when the
address is echoed. In it, use mptcp_lookup_anno_list_by_saddr to check
whether this address is in the anno_list. If it is, PM schedules the
status MPTCP_PM_SUBFLOW_ESTABLISHED to invoke
mptcp_pm_create_subflow_or_signal_addr to deal with the next address in
the local address list.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:37 +0000 (11:26 -0700)]
mptcp: export lookup_anno_list_by_saddr
This patch exported the static function lookup_anno_list_by_saddr, and
renamed it to mptcp_lookup_anno_list_by_saddr.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:36 +0000 (11:26 -0700)]
selftests: mptcp: timeout testcases for multi addresses
This patch adds timeout testcases for multiple addresses, valid and
invalid.
These testcases need to transmit 8 ADD_ADDRs, so add a new speed level
'least' that sets 10 for mptcp_connect to slow down the transmitting
process. The original speed level 'slow' still uses 50.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:35 +0000 (11:26 -0700)]
selftests: mptcp: add cfg_do_w for cfg_remove
In some testcases, we need to slow down the transmitting process. This
patch added a new argument named cfg_do_w for cfg_remove to allow the
caller to pass an argument to cfg_remove.
In do_rnd_write, use this cfg_do_w to control the transmitting speed.
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:34 +0000 (11:26 -0700)]
mptcp: move to next addr when timeout
This patch called mptcp_pm_subflow_established to move to the next address
when an ADD_ADDR has been retransmitted the maximum number of times.
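A rough model of that decision, with a hypothetical retry limit and hypothetical structure names (the real path manager state is more involved):

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLE_ADD_ADDR_RETRANS_MAX 3	/* hypothetical retry limit */

struct sample_pm {
	size_t next_addr;	/* index of the address being announced */
	size_t addr_count;	/* number of local addresses to announce */
	unsigned int retrans;	/* retransmissions of the current ADD_ADDR */
};

/*
 * Called when the ADD_ADDR timer fires without an echo. Returns 1 if the
 * current address is retransmitted, 0 if the PM moved on to the next one.
 */
static int sample_add_addr_timeout(struct sample_pm *pm)
{
	assert(pm->next_addr < pm->addr_count);

	if (pm->retrans < SAMPLE_ADD_ADDR_RETRANS_MAX) {
		pm->retrans++;
		return 1;
	}

	/* Retry budget exhausted: give the next address its chance. */
	pm->next_addr++;
	pm->retrans = 0;
	return 0;
}
```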
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:32 +0000 (11:26 -0700)]
mptcp: skip connecting the connected address
This patch added a new helper named lookup_subflow_by_daddr to find
whether the destination address is in the msk's conn_list.
In mptcp_pm_nl_add_addr_received, use lookup_subflow_by_daddr to check
whether the announced address is already connected. If it is, skip
connecting this address and send out the echo.
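The check can be pictured as a lookup over the existing connections before deciding whether to connect. The sketch below is a simplified assumption: IPv4 only, an array instead of the msk's conn_list, and sample_* names that are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model: IPv4 only, array instead of a list. */
struct sample_addr { uint32_t ip; uint16_t port; };
struct sample_subflow { struct sample_addr daddr; };

/* Return true if some existing subflow already targets @daddr. */
static bool sample_lookup_subflow_by_daddr(const struct sample_subflow *conns,
					   size_t n,
					   const struct sample_addr *daddr)
{
	for (size_t i = 0; i < n; i++) {
		if (conns[i].daddr.ip == daddr->ip &&
		    conns[i].daddr.port == daddr->port)
			return true;
	}
	return false;
}

enum sample_action { SAMPLE_ECHO_ONLY, SAMPLE_CONNECT_AND_ECHO };

/* Decide what to do when the peer announces @daddr. */
static enum sample_action
sample_add_addr_received(const struct sample_subflow *conns, size_t n,
			 const struct sample_addr *daddr)
{
	assert(daddr != NULL);

	if (sample_lookup_subflow_by_daddr(conns, n, daddr))
		return SAMPLE_ECHO_ONLY;	/* already connected: skip connect */
	return SAMPLE_CONNECT_AND_ECHO;
}
```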
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 26 Mar 2021 18:26:31 +0000 (11:26 -0700)]
mptcp: drop argument port from mptcp_pm_announce_addr
Drop the redundant argument 'port' from mptcp_pm_announce_addr, use the
port field of another argument 'addr' instead.
Fixes: 0f5c9e3f079f ("mptcp: add port parameter for mptcp_pm_announce_addr") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 26 Mar 2021 18:26:30 +0000 (11:26 -0700)]
mptcp: clean-up the rtx path
After the previous patch we can easily avoid invoking
the workqueue to perform the retransmission, if the
msk socket lock is held at rtx timer expiration.
This also simplifies the relevant code.
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This series reworks the way IPA resources are defined and
programmed. It is a little long--and I apologize for that--but
I think the patches are best taken together as a single unit.
The IPA hardware operates with a set of distinct "resources." Each
hardware instance has a fixed number of each resource type available.
Available resources are divided into smaller pools, with each pool
shared by endpoints in a "resource group." Each endpoint is thus
assigned to a resource group that determines which pools supply
resources the IPA hardware uses to handle the endpoint's processing.
The exact set of resources used can differ for each version of IPA.
Except for IPA v3.0 and v3.1, there are 5 source and 2 destination
resource types, but there's no reason to assume this won't change.
The number of resource groups used *does* typically change based on
the hardware version. For example, some versions target reduced
functionality and support fewer resource groups.
With that as background...
The net result of this series is to improve the flexibility with
which IPA resources and resource groups are defined, permitting each
version of IPA to define its own set of resources and groups. Along
the way it isolates the resource-related code, and fixes a few bugs
related to resource handling.
The first patch moves resource-related code to a new C file (and
header). It generates a checkpatch warning about updating
MAINTAINERS, which can be ignored. The second patch fixes a bug,
but the bug does not affect SDM845 or SC7180.
The third patch defines an enumerated type whose members provide
symbolic names for resource groups.
The fourth defines some resource limits for SDM845 that were not
previously being programmed. That platform "works" without this,
but to be correct, these limits should really be programmed.
The fifth patch uses a single enumerated type to define both source
and destination resource type IDs, and the sixth uses those IDs to
index the resource limit arrays. The seventh moves the definition
of that enumerated type into the platform data files, allowing each
platform to define its own set of resource types.
The eighth and ninth are fairly trivial changes. One replaces two
"max" symbols having the same value with a single symbol. And the
other replaces two distinct but otherwise identical structure types
with a single common one.
The 10th is a small preparatory patch for the 11th, passing a
different argument to a function that programs resource values.
The 11th allows the actual number of source and destination resource
groups for a platform to be specified in its configuration data.
That way the number is based on the actual number of groups defined.
This removes the need for a sort of clunky pair of functions that
defined that information previously.
Finally, the last patch just increases the number of resource groups
that can be defined to 8.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:22 +0000 (10:11 -0500)]
net: ipa: support more than 6 resource groups
IPA versions 3.0 and 3.1 support up to 8 resource groups. There is
some interest in supporting these older versions of the hardware, so
update the resource configuration code to program resource limits
for these groups if specified.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:21 +0000 (10:11 -0500)]
net: ipa: record number of groups in data
The arrays of source and destination resource limits defined in
configuration data are of a fixed size--which is the maximum number
of resource groups supported for any platform. Most platforms will
use fewer than that many groups.
Add new members to the ipa_rsrc_group_id enumerated type that define
the number of source and destination resource groups defined for
the platform. (This type is defined for each platform in its data
file.)
Add a new field to the resource configuration data that indicates
how many of the source and destination resource groups are actually
used for the platform, and initialize it with the count value. This
allows us to determine the number of groups defined for the platform
without exposing the ipa_rsrc_group_id enumerated type.
As a result, we no longer need ipa_resource_group_src_count()
and ipa_resource_group_dst_count(), because each platform now
defines its supported number of resource groups. So get rid of
those two functions.
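One way to picture this arrangement: the platform enum ends each list with a count member, and only that plain number is copied into the configuration data. All identifiers below are hypothetical and only loosely modelled on the description above.

```c
#include <assert.h>

#define SAMPLE_GROUP_MAX 8	/* hypothetical fixed array size */

/* Hypothetical per-platform group IDs, each list ending with a count. */
enum sample_rsrc_group_id {
	SAMPLE_GROUP_SRC_LWA_DL = 0,
	SAMPLE_GROUP_SRC_UL_DL,
	SAMPLE_GROUP_SRC_COUNT,		/* number of source groups */

	SAMPLE_GROUP_DST_LWA_DL = 0,
	SAMPLE_GROUP_DST_UL_DL,
	SAMPLE_GROUP_DST_UNUSED,
	SAMPLE_GROUP_DST_COUNT,		/* number of destination groups */
};

static_assert(SAMPLE_GROUP_SRC_COUNT <= SAMPLE_GROUP_MAX &&
	      SAMPLE_GROUP_DST_COUNT <= SAMPLE_GROUP_MAX,
	      "group counts must fit the fixed-size limit arrays");

/* Generic code sees only plain counts, never the platform enum. */
struct sample_resource_data {
	unsigned int rsrc_group_src_count;
	unsigned int rsrc_group_dst_count;
};

static const struct sample_resource_data sample_platform_data = {
	.rsrc_group_src_count = SAMPLE_GROUP_SRC_COUNT,
	.rsrc_group_dst_count = SAMPLE_GROUP_DST_COUNT,
};
```

Because the count members follow the last real group, adding a group to the enum updates the recorded count automatically.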
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:17 +0000 (10:11 -0500)]
net: ipa: move ipa_resource_type definition
Most platforms have the same set of source and destination resource
types. But some older platforms have some additional ones, and it's
possible different resources will be used in the future.
Move the definition of the ipa_resource_type enumerated type so it
is defined for each platform in its configuration data file. This
permits each to have a distinct set of resources.
Shorten the data files slightly, by putting the min and max limit
values on the same line.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:16 +0000 (10:11 -0500)]
net: ipa: index resource limits with type
Remove the type field from the ipa_resource_src and ipa_resource_dst
structures, and instead use that value as the index into the arrays
of source and destination resources.
Change ipa_resource_config_src() and ipa_resource_config_dst() so
the resource type is passed in as an argument.
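Indexing by type might look like the following sketch, which uses C designated initializers. The type names, limit values, and array size are invented for illustration.

```c
#include <assert.h>

#define SAMPLE_GROUP_MAX 4	/* hypothetical maximum group count */

enum sample_resource_type {
	SAMPLE_SRC_PKT_CONTEXTS,
	SAMPLE_SRC_DESCRIPTOR_LISTS,
	SAMPLE_SRC_TYPE_COUNT,	/* number of source types, not a type */
};

struct sample_limits { unsigned int min; unsigned int max; };

/* No "type" member: the array index carries that information. */
struct sample_resource {
	struct sample_limits limits[SAMPLE_GROUP_MAX];
};

/* Made-up limit values, keyed by resource type. */
static const struct sample_resource sample_src[SAMPLE_SRC_TYPE_COUNT] = {
	[SAMPLE_SRC_PKT_CONTEXTS] = {
		.limits = { { .min = 1, .max = 255 }, { .min = 1, .max = 255 } },
	},
	[SAMPLE_SRC_DESCRIPTOR_LISTS] = {
		.limits = { { .min = 10, .max = 10 }, { .min = 10, .max = 10 } },
	},
};

/* The type is passed explicitly and used to select the entry. */
static unsigned int sample_src_max(enum sample_resource_type type,
				   unsigned int group)
{
	assert(type < SAMPLE_SRC_TYPE_COUNT && group < SAMPLE_GROUP_MAX);
	return sample_src[type].limits[group].max;
}
```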
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:15 +0000 (10:11 -0500)]
net: ipa: combine resource type definitions
Combine the ipa_resource_type_src and ipa_resource_type_dst
enumerated types into a single enumerated type, ipa_resource_type.
Assign value 0 to the first element for the source and destination
types, so their numeric values are preserved. Add some additional
commentary where these are defined, stating explicitly that code
assumes the first source and first destination member must have
numeric value 0.
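A combined enumerated type with two members pinned to 0 is legal C, since enumerators may share a value. A sketch with hypothetical member names:

```c
#include <assert.h>

/* Hypothetical member names; one type holds source and destination IDs. */
enum sample_rsrc_type {
	/* Source resource types; the first must have value 0 */
	SAMPLE_RSRC_SRC_PKT_CONTEXTS = 0,
	SAMPLE_RSRC_SRC_DESCRIPTOR_LISTS,

	/* Destination resource types; the first must also have value 0 */
	SAMPLE_RSRC_DST_DATA_SECTORS = 0,
	SAMPLE_RSRC_DST_DPS_DMARS,
};

/* Make the assumption the comments state checkable at build time. */
static_assert(SAMPLE_RSRC_SRC_PKT_CONTEXTS == 0 &&
	      SAMPLE_RSRC_DST_DATA_SECTORS == 0,
	      "first source and first destination type must be 0");
```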
Fix the kerneldoc comments for the ipa_gsi_endpoint_data structure.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:14 +0000 (10:11 -0500)]
net: ipa: add some missing resource limits
Currently, the SDM845 configuration data defines resource limits for
the first two resource groups (for both source and destination
resource types). The hardware supports additional resource groups,
and we should program the resource limits for those groups as well.
Even the "unused" destination resource group (number 2) should have
non-zero limits programmed in some cases, to ensure correct operation.
Add these missing resource group limit definitions to the SDM845
configuration data.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:13 +0000 (10:11 -0500)]
net: ipa: identify resource groups
Define a new ipa_resource_group_id enumerated type, whose members
have numeric values that match the resource group number used when
programming the hardware. Each platform supports a different number
of source and destination resource groups, so define the type
separately for each platform in its configuration data file.
Use these new symbolic values when specifying the resource group an
endpoint is associated with. And use them to index the limits
arrays for source and destination resources, making it clearer how
these values are used.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 26 Mar 2021 15:11:12 +0000 (10:11 -0500)]
net: ipa: fix bug in resource group limit programming
If the number of resource groups supported by the hardware is less
than a certain number, we return early in ipa_resource_config_src()
and ipa_resource_config_dst() (to avoid programming resource limits
for non-existent groups).
Unfortunately, these checks are off by one. Fix this problem in the
four places it occurs.
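An off-by-one of this kind can be reproduced with a toy model. The sketch assumes, hypothetically, that each limit register covers a pair of groups (register k covers groups 2k and 2k+1) and counts how many registers get written; it is not the driver's code.

```c
#include <assert.h>

#define SAMPLE_REG_COUNT 4	/* hypothetical number of limit registers */

/* Buggy bound: register k is written even when group 2k does not exist. */
static unsigned int sample_regs_written_buggy(unsigned int group_count)
{
	unsigned int k;

	assert(group_count >= 1);
	for (k = 0; k < SAMPLE_REG_COUNT; k++) {
		if (group_count < 2 * k)
			break;		/* off by one */
	}
	return k;
}

/* Fixed bound: register k is written only if group 2k exists. */
static unsigned int sample_regs_written(unsigned int group_count)
{
	unsigned int k;

	assert(group_count >= 1);
	for (k = 0; k < SAMPLE_REG_COUNT; k++) {
		if (group_count < 2 * k + 1)
			break;
	}
	return k;
}
```

With two groups, the buggy bound writes a second register for groups 2 and 3, which do not exist; the fixed bound stops after the first.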
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Guojia Liao [Fri, 26 Mar 2021 01:36:28 +0000 (09:36 +0800)]
net: hns3: split out hclge_tm_vport_tc_info_update()
hclge_tm_vport_tc_info_update() is bloated, so split it into
separate functions for readability and maintainability.
Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Fri, 26 Mar 2021 01:36:27 +0000 (09:36 +0800)]
net: hns3: split function hclge_reset_rebuild()
hclge_reset_rebuild() is a bit too long. So add a new function
hclge_update_reset_level() to improve readability.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Fri, 26 Mar 2021 01:36:23 +0000 (09:36 +0800)]
net: hns3: remove unused parameter from hclge_set_vf_vlan_common()
Parameter vf in hclge_set_vf_vlan_common() is unused now,
so remove it.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiaran Zhang [Fri, 26 Mar 2021 01:36:22 +0000 (09:36 +0800)]
net: hns3: remove redundant query in hclge_config_tm_hw_err_int()
According to the HW manual, the query operation is unnecessary
when the TM QCN error event is enabled, so remove it.
Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Fri, 26 Mar 2021 01:36:21 +0000 (09:36 +0800)]
net: hns3: remove redundant blank lines
Remove some redundant blank lines.
Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Fri, 26 Mar 2021 01:36:20 +0000 (09:36 +0800)]
net: hns3: remove unused code of vmdq
VMDq is not supported yet, so num_vmdq_vport is always 0. This makes
uses of num_vport a bit confusing, so remove the unused VMDq
code.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Mar 2021 21:50:34 +0000 (14:50 -0700)]
Merge tag 'mlx5-updates-2021-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-03-24
mlx5e netdev driver updates:
1) Some cleanups from Colin, Tariq and Saeed.
2) Aya made some trivial refactoring to cleanup and generalize
PTP and RQ (Receive Queue) creation and management.
Mostly code decoupling and reducing dependencies between the different
RX objects in the netdev driver.
This is a preparation series for upcoming PTP special RQ creation which
will allow coexistence of CQE compression (important performance feature,
especially in Multihost systems) and HW TS PTP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Aya Levin [Sun, 17 Jan 2021 13:25:27 +0000 (15:25 +0200)]
net/mlx5e: Cleanup PTP
Reduce scope of mlx5e_ptp_params, move to its c file. Remove unneeded
variables from mlx5e_ptp_open and state bitmap from PTP channel. In
addition, remove channel index from PTP channel since it is set to a
hard coded value, use define instead.
Aya Levin [Sun, 7 Mar 2021 13:41:27 +0000 (15:41 +0200)]
net/mlx5e: Generalize PTP implementation
Following patches in the set add support for RX PTP. Rename PTP prefix
from %s/port_ptp/ptp/g to include RX PTP too.
In addition rename indication (used in statistics context) that PTP-SQ
was opened: %s/port_ptp_opened/tx_ptp_opened/g. This will simplify adding
indication that PTP-RQ was opened.
Aya Levin [Thu, 25 Feb 2021 15:46:25 +0000 (17:46 +0200)]
net/mlx5e: Generalize direct-TIRs and direct-RQTs API
Add input parameter indicating the size of direct-TIRs/direct-RQTs array
to be created/destroyed. This allows next patches in the patch-set to
handle a single direct-TIR pointing to a direct-RQT with a single entry.
Aya Levin [Mon, 8 Feb 2021 18:56:02 +0000 (20:56 +0200)]
net/mlx5e: Generalize close RQ
Allow different flavours of RQ to use the same close flow. Add validity
checks to support different RQ types, which do not necessarily initialize
all of the RQ's functionality.
Aya Levin [Mon, 8 Feb 2021 16:25:56 +0000 (18:25 +0200)]
net/mlx5e: Generalize RQ activation
Support RQ activation for RQs without an ICOSQ in the main flow, like
existing trap-RQ and like PTP-RQ that will be introduced in the coming
patches in the patchset.
With this patch, remove the wrapper in traps to deactivate the trap-RQ.
Aya Levin [Sun, 7 Mar 2021 13:29:53 +0000 (15:29 +0200)]
net/mlx5e: Generalize open RQ
Unify RQ creation for different RQ types. For each RQ type add a
separate open helper which initializes the RQ-specific values and
triggers a call to the generic open RQ function. Avoid passing the
mlx5e_channel pointer to the generic open RQ as a container, since the
RQ may reside under a different type of channel.
Aya Levin [Mon, 8 Feb 2021 14:00:36 +0000 (16:00 +0200)]
net/mlx5e: Allow creating mpwqe info without channel
Change the signature of mlx5e_rq_alloc_mpwqe_info to take the NUMA node
instead of a channel pointer. This allows creating mpwqe_info in the
context of different channel types.
Tariq Toukan [Wed, 10 Mar 2021 12:46:59 +0000 (14:46 +0200)]
net/mlx5e: Restrict usage of mlx5e_priv in params logic functions
Do not use generic struct mlx5e_priv as a parameter to param
functions, as it is too generic. All calculations of the channel's
param should be mainly based on struct mlx5_core_dev and
struct mlx5e_params. Additional info can be explicitly passed.
Tariq Toukan [Sun, 7 Mar 2021 13:13:23 +0000 (15:13 +0200)]
net/mlx5e: Pass q_counter indentifier as parameter to rq_param builders
Pass the q_counter identifier instead of reading it from the mlx5e_priv
parameter.
This is a step towards removing the mlx5e_priv parameter from all
params functions and logic in the next patches of the series.
... cannot be used in block quote, it breaks compilation, remove it.
Fix warnings due to missing blank line such as:
net-next/Documentation/networking/nf_flowtable.rst:142: WARNING: Block quote ends without a blank line; unexpected unindent.
Fixes: 143490cde566 ("docs: nf_flowtable: update documentation with enhancements") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 25 Mar 2021 18:08:17 +0000 (11:08 -0700)]
tcp: convert elligible sysctls to u8
Many tcp sysctls are either bools or small ints that can fit into u8.
Reducing the space taken by sysctls can save a few cache line misses
when sending/receiving data while CPU caches are empty,
for example after a CPU idle period.
This is hard to measure with typical network performance tests,
but after this patch, struct netns_ipv4 has shrunk
by three cache lines.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
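The space saving can be sketched outside the kernel. The two structs below are hypothetical stand-ins, not struct netns_ipv4, and the 64-byte cache line is an assumption. They hold the same number of small knobs, once as int and once as uint8_t, and a helper counts how many cache lines each layout spans.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64	/* assumed cache line size */
#define NUM_KNOBS 32		/* hypothetical number of small sysctls */

struct knobs_int { int knob[NUM_KNOBS]; };	/* one int per knob */
struct knobs_u8 { uint8_t knob[NUM_KNOBS]; };	/* one byte per knob */

/* Number of cache lines needed to hold an object of the given size. */
static size_t cache_lines_spanned(size_t size)
{
	return (size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

The byte-sized layout fits in one assumed cache line, while the int layout needs more whenever int is wider than two bytes.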
This patchset adds support for multi-MSI interrupts in addition to the
current single common interrupt implementation. Each MSI interrupt is tied
to a newly introduced interrupt service routine (ISR). Hence, each interrupt
will only go through the corresponding ISR.
To increase efficiency, enabling multi-MSI interrupts will
automatically select the interrupt mode configuration INTM=1. When INTM=1,
the TX/RX transfer complete signal is asserted only on the corresponding
sbd_perch_tx_intr_o[] or sbd_perch_rx_intr_o[], without asserting the signal
on the common sbd_intr_o. Hence, for each TX/RX interrupt, only the
corresponding ISR will be triggered.
Every vendor might have different MSI vector assignment. So, this patchset
only includes multi-vector MSI assignment for Intel platform.
Changes:
v1 -> v2
patch 2/5
-Remove defensive check for invalid dev pointer
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong, Vee Khee [Thu, 25 Mar 2021 17:39:16 +0000 (01:39 +0800)]
net: stmmac: use interrupt mode INTM=1 for multi-MSI
For interrupt mode INTM=0, TX/RX transfer complete will trigger signal
not only on sbd_perch_[tx|rx]_intr_o (Transmit/Receive Per Channel) but
also on the sbd_intr_o (Common).
For the multi-MSI implementation, setting interrupt mode INTM=1 is more
efficient, as each TX and RX interrupt (TI/RI) is handled by the TX/RX ISR
without the need to call the common MAC ISR.
Update the TX/RX NORMAL interrupt status checking process, as the
NIS status bit is not asserted for any RI/TI events when INTM=1.
Signed-off-by: Wong, Vee Khee <vee.khee.wong@intel.com> Co-developed-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
During probe(), the driver first requests allocation of
multi-vector interrupts. If that fails, it automatically falls back
to requesting allocation of a single interrupt.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Co-developed-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
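The fallback order described above can be sketched as follows. This is an illustrative outline only: the allocator callbacks, the stub allocators, and the enum are hypothetical names, not the driver's probe code.

```c
/* Hypothetical result of interrupt setup. */
enum irq_mode { IRQ_MODE_NONE, IRQ_MODE_SINGLE, IRQ_MODE_MULTI };

/* Hypothetical allocator: returns 0 on success, negative on failure. */
typedef int (*irq_alloc_fn)(void);

/* Try multi-vector allocation first, then fall back to a single vector. */
static enum irq_mode probe_irqs(irq_alloc_fn alloc_multi,
				irq_alloc_fn alloc_single)
{
	if (alloc_multi() == 0)
		return IRQ_MODE_MULTI;
	if (alloc_single() == 0)
		return IRQ_MODE_SINGLE;
	return IRQ_MODE_NONE;
}

/* Stub allocators, only there to exercise the sketch. */
static int stub_alloc_ok(void) { return 0; }
static int stub_alloc_fail(void) { return -1; }
```

Calling probe_irqs(stub_alloc_fail, stub_alloc_ok) models a platform where multi-vector allocation is unavailable and the single-interrupt path is taken instead.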
Introduce MSI interrupt service routines and hook them up when
stmmac_open() sees that a valid IRQ line has been requested:
stmmac_mac_interrupt() : MAC (dev->irq), WOL (wol_irq), LPI (lpi_irq)
stmmac_safety_interrupt() : Safety Feature Correctable Error (sfty_ce_irq)
& Uncorrectable Error (sfty_ue_irq)
stmmac_msi_intr_rx() : For all RX MSI IRQs (rx_irq)
stmmac_msi_intr_tx() : For all TX MSI IRQs (tx_irq)
Each IRQ has a unique name so that they can be told apart
easily in /proc/interrupts.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ong Boon Leong [Thu, 25 Mar 2021 17:39:13 +0000 (01:39 +0800)]
net: stmmac: make stmmac_interrupt() function more friendly to MSI
Refactor stmmac_interrupt() by introducing stmmac_common_interrupt()
so that we prepare the ISR operation to be friendly to MSI later.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ong Boon Leong [Thu, 25 Mar 2021 17:39:12 +0000 (01:39 +0800)]
net: stmmac: introduce DMA interrupt status masking per traffic direction
In preparation to make stmmac support multi-vector MSI, we introduce the
interrupt status masking according to RX, TX or RXTX. Default to use RXTX
inside stmmac_dma_interrupt(), so there is no run-time logic difference
now.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
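A minimal sketch of per-direction status masking follows. The bit positions, enum values, and function name are hypothetical and chosen for illustration; they are not the stmmac register layout.

```c
#include <stdint.h>

/* Hypothetical status bits, not the real register layout. */
#define DMA_STATUS_TX_DONE	(1u << 0)
#define DMA_STATUS_RX_DONE	(1u << 6)

/* Traffic direction selector; RXTX is the union of RX and TX. */
enum dma_dir { DMA_DIR_RX = 0x1, DMA_DIR_TX = 0x2, DMA_DIR_RXTX = 0x3 };

/* Keep only the status bits that belong to the requested direction. */
static uint32_t dma_status_for_dir(uint32_t status, enum dma_dir dir)
{
	uint32_t keep = 0;

	if (dir & DMA_DIR_RX)
		keep |= DMA_STATUS_RX_DONE;
	if (dir & DMA_DIR_TX)
		keep |= DMA_STATUS_TX_DONE;

	return status & keep;
}
```

Passing DMA_DIR_RXTX keeps both sets of bits, which is why defaulting to it leaves run-time behaviour unchanged.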
Dmitry Vyukov [Thu, 25 Mar 2021 14:52:45 +0000 (15:52 +0100)]
net: change netdev_unregister_timeout_secs min value to 1
netdev_unregister_timeout_secs=0 can lead to printing the
"waiting for dev to become free" message every jiffy.
This is too frequent and unnecessary.
Set the min value to 1 second.
Also fix the merge issue introduced by
"net: make unregister netdev warning timeout configurable":
it changed "refcnt != 1" to "refcnt".
Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: 5aa3afe107d9 ("net: make unregister netdev warning timeout configurable") Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
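The effect of a minimum of 1 can be sketched with a small range-checked setter. This is a userspace illustration under stated assumptions: the variable, its default of 10, and the setter are hypothetical and do not reproduce the kernel's sysctl handling.

```c
#include <errno.h>

#define TIMEOUT_SECS_MIN 1	/* a value of 0 would warn every jiffy */

static int timeout_secs = 10;	/* hypothetical default */

/* Reject values below the minimum and leave the old value in place. */
static int set_timeout_secs(int val)
{
	if (val < TIMEOUT_SECS_MIN)
		return -EINVAL;

	timeout_secs = val;
	return 0;
}
```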
David S. Miller [Fri, 26 Mar 2021 00:22:30 +0000 (17:22 -0700)]
Merge branch 'ipa-reg-versions'
Alex Elder says:
====================
net: ipa: update registers for other versions
This series updates IPA and GSI register definitions to permit more
versions of IPA hardware to be supported. Most of the updates are
informational, updating comments to indicate which IPA versions
support each register and field. But some registers are new and
others are deprecated. In a few cases register fields are laid
out differently, and in these cases the changes are a little more
substantive.
I won't claim the result is 100% correct, but it's close, and should
allow all IPA versions 3.x through 4.x to be supported by the driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 25 Mar 2021 14:44:37 +0000 (09:44 -0500)]
net: ipa: expand GSI channel types
IPA v4.5 (GSI v2.5) supports a larger set of channel protocols, and
adds an additional field to hold the most-significant bits of the
protocol identifier on a channel.
Add an inline function that encodes the protocol (including the
extra bits for newer versions of IPA), and define some additional
protocols. At this point we still use only GPI protocol.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
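An encoder of the kind described might look like the sketch below. The field width, the position of the most-significant-bit field, and the version enum are assumptions made for illustration and are not taken from the GSI register definitions.

```c
#include <stdint.h>

/* Hypothetical versions, ordered oldest first. */
enum hw_version { HW_V4_2, HW_V4_5 };

#define PROTO_LOW_BITS	3				/* assumed width */
#define PROTO_LOW_MASK	((1u << PROTO_LOW_BITS) - 1)
#define PROTO_MSB_SHIFT	13				/* assumed position */

/* Encode a protocol id; newer hardware has an extra field for the MSB. */
static uint32_t proto_encode(enum hw_version version, uint32_t proto)
{
	uint32_t val = proto & PROTO_LOW_MASK;

	if (version < HW_V4_5)
		return val;		/* older hardware: low field only */

	return val | (((proto >> PROTO_LOW_BITS) & 1u) << PROTO_MSB_SHIFT);
}
```

Keeping the version test inside one inline helper means callers never need to know that the identifier is split across two fields.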
Alex Elder [Thu, 25 Mar 2021 14:44:36 +0000 (09:44 -0500)]
net: ipa: update GSI ring size registers
Each GSI channel has a CNTXT_1 register that encodes the size of its
ring buffer. The size of the field that records that is increased
starting at IPA v4.9. Replace the use of a fixed-size field mask
with a new inline function that encodes that size value.
Similarly, the size of GSI event rings can be larger starting with
IPA v4.9, so create a function to encode that as well.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
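Replacing a fixed mask with a version-aware encoder can be sketched as below. The two field widths and the version enum are assumptions for illustration; the sketch silently truncates values that do not fit, which a real driver would more likely validate first.

```c
#include <stdint.h>

/* Hypothetical versions, ordered oldest first. */
enum hw_version { HW_V4_7, HW_V4_9 };

/* Encode a ring size into a field whose width depends on the version. */
static uint32_t ring_size_encode(enum hw_version version, uint32_t size)
{
	/* Assumed widths: 16 bits before v4.9, 20 bits from v4.9 on. */
	uint32_t mask = version < HW_V4_9 ? 0xffffu : 0xfffffu;

	return size & mask;
}
```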
Alex Elder [Thu, 25 Mar 2021 14:44:35 +0000 (09:44 -0500)]
net: ipa: GSI register cleanup
The main purpose of this is to extend these GSI register definitions
to support additional IPA versions.
This patch makes some minor updates to "gsi_reg.h":
- Define a DB_IN_BYTES field in the channel QOS register
- Add some comments clarifying when certain fields are valid
- Add the definition of GSI_CH_DB_STOP channel command
- Add a couple of blank lines
- Move one comment and indent another
- Delete two unused register definitions at the end.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 25 Mar 2021 14:44:34 +0000 (09:44 -0500)]
net: ipa: support IPA interrupt addresses for IPA v4.7
Starting with IPA v4.7, registers related to IPA interrupts are
located at a fixed offset 0x1000 above the addresses used for
earlier versions. Define and use functions to provide the offset to
use for these registers based on IPA version.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
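A version-dependent offset function of this kind could be sketched as follows. The 0x1000 shift and the v4.7 threshold come from the description above; the base offset, the enum, and the function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical versions, ordered oldest first. */
enum hw_version { HW_V4_5, HW_V4_7, HW_V4_9 };

#define IRQ_REG_BASE	0x3008u		/* hypothetical pre-v4.7 offset */

/* Extra offset applied to interrupt registers from v4.7 onward. */
static uint32_t irq_common_offset(enum hw_version version)
{
	return version < HW_V4_7 ? 0 : 0x1000;
}

/* Offset of the example interrupt register for a given version. */
static uint32_t irq_reg_offset(enum hw_version version)
{
	return IRQ_REG_BASE + irq_common_offset(version);
}
```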