Dmitry Bogdanov [Mon, 12 Nov 2018 15:45:58 +0000 (15:45 +0000)]
net: aquantia: add rx-flow filter definitions
Add missing register definitions and the functions accessing them
related to rx-flow filters.
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Each slave has it's own receive timestamp filter. But cpts rx/tx
timestamp enable flags are used to allow ts retrieve only for one
user. This limitation causes data path redundancy and setting overlap
if cpsw module is in dual-mac mode for instance.
If rx ts is enabled only for one port - the second interface must expect
every incoming packet to be PTP packet w/o absolutely any reason, and if
it's PTP - do unneeded stuff, as rx filter for second port is not set
and cpts fifo is not supposed to contain appropriate ts event.
That's not correct.
So, to fix control overlap and avoid redundant CPU cycles, the patch
splits rx/tx ts enable flags between network devices. After the patch,
PTP timestamping still should be used for only one port (or PTP id
counter has to be different for both ports as cpts IP is common).
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Mon, 12 Nov 2018 14:00:21 +0000 (16:00 +0200)]
net: ethernet: ti: cpts: purge staled skbs from txq
The overflow event is running with 1 jiffy in case if txq is not
empty, but it can be emptied completely only if next tx event
consumes skb or deletes staled skb from the txq. In case of staled
skb, that can happen for some unpredictable reason (the ts event was
lost or timed out), the overflow event can be generated quite long
time consuming CPU w/o reason before next tx event happens. To avoid
it, purge txq before increasing overflow event rate.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Mon, 12 Nov 2018 23:45:56 +0000 (23:45 +0000)]
net: phy: check if advertising is zero using linkmode_empty
A recent change modified variable advertising from a u32 to a link mode
array and left the u32 zero comparison, so essential we now have an array
being compared to null which is not the intention. Fix this by using the
call to linkmode_empty to check if advertising is all zero.
Detected by CoverityScan, CID#1475424 ("Array compared against 0")
Fixes: 3c1bcc8614db ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 12 Nov 2018 17:09:51 +0000 (09:09 -0800)]
Merge branch 'sctp-add-support-for-sk_reuseport'
Xin Long says:
====================
sctp: add support for sk_reuseport
sctp sk_reuseport allows multiple socks to listen on the same port and
addresses, as long as these socks have the same uid. This works pretty
much as TCP/UDP does, the only difference is that sctp is multi-homing
and all the bind_addrs in these socks will have to completely matched,
otherwise listen() will return err.
The below is when 5 sockets are listening on 172.16.254.254:6400 on a
server, 26 sockets on a client connect to 172.16.254.254:6400 and each
may be processed by a different socket on the server which is selected
by hash(lport, pport, paddr) in reuseport_select_sock():
Xin Long [Mon, 12 Nov 2018 10:27:16 +0000 (18:27 +0800)]
sctp: add sock_reuseport for the sock in __sctp_hash_endpoint
This is a part of sk_reuseport support for sctp. It defines a helper
sctp_bind_addrs_check() to check if the bind_addrs in two socks are
matched. It will add sock_reuseport if they are completely matched,
and return err if they are partly matched, and alloc sock_reuseport
if all socks are not matched at all.
It will work until sk_reuseport support is added in
sctp_get_port_local() in the next patch.
v1->v2:
- use 'laddr->valid && laddr2->valid' check instead as Marcelo
pointed in sctp_bind_addrs_check().
Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 12 Nov 2018 10:27:15 +0000 (18:27 +0800)]
sctp: do reuseport_select_sock in __sctp_rcv_lookup_endpoint
This is a part of sk_reuseport support for sctp, and it selects a
sock by the hashkey of lport, paddr and dport by default. It will
work until sk_reuseport support is added in sctp_get_port_local()
in the next patch.
v1->v2:
- define lport as __be16 instead of __be32 as Marcelo pointed in
__sctp_rcv_lookup_endpoint().
Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Mon, 12 Nov 2018 01:47:30 +0000 (01:47 +0000)]
net: phy: marvell: remove set but not used variable 'pause'
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/phy/marvell.c: In function 'm88e1510_config_init':
drivers/net/phy/marvell.c:850:7: warning:
variable 'pause' set but not used [-Wunused-but-set-variable]
It not used any more after commit 3c1bcc8614db ("net: ethernet: Convert phydev
advertize and supported from u32 to link mode")
Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"One last pull request before heading to Vancouver for LPC, here we have:
1) Don't forget to free VSI contexts during ice driver unload, from
Victor Raj.
2) Don't forget napi delete calls during device remove in ice driver,
from Dave Ertman.
3) Don't request VLAN tag insertion of ibmvnic device when SKB
doesn't have VLAN tags at all.
4) IPV4 frag handling code has to accomodate the situation where two
threads try to insert the same fragment into the hash table at the
same time. From Eric Dumazet.
5) Relatedly, don't flow separate on protocol ports for fragmented
frames, also from Eric Dumazet.
6) Memory leaks in qed driver, from Denis Bolotin.
7) Correct valid MTU range in smsc95xx driver, from Stefan Wahren.
8) Validate cls_flower nested policies properly, from Jakub Kicinski.
9) Clearing of stats counters in mc88e6xxx driver doesn't retain
important bits in the G1_STATS_OP register causing the chip to
hang. Fix from Andrew Lunn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
act_mirred: clear skb->tstamp on redirect
net: dsa: mv88e6xxx: Fix clearing of stats counters
tipc: fix link re-establish failure
net: sched: cls_flower: validate nested enc_opts_policy to avoid warning
net: mvneta: correct typo
flow_dissector: do not dissect l4 ports for fragments
net: qualcomm: rmnet: Fix incorrect assignment of real_dev
net: aquantia: allow rx checksum offload configuration
net: aquantia: invalid checksumm offload implementation
net: aquantia: fixed enable unicast on 32 macvlan
net: aquantia: fix potential IOMMU fault after driver unbind
net: aquantia: synchronized flow control between mac/phy
net: smsc95xx: Fix MTU range
net: stmmac: Fix RX packet size > 8191
qed: Fix potential memory corruption
qed: Fix SPQ entries not returned to pool in error flows
qed: Fix blocking/unlimited SPQ entries leak
qed: Fix memory/entry leak in qed_init_sp_request()
inet: frags: better deal with smp races
net: hns3: bugfix for not checking return value
...
Linus Torvalds [Sun, 11 Nov 2018 22:57:55 +0000 (16:57 -0600)]
Merge tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix build errors in binrpm-pkg and bindeb-pkg targets
- fix false positive matches in merge_config.sh
- fix build version mismatch in deb-pkg target
- fix dtbs_install handling in (bin)deb-pkg target
- revert a commit that allows setlocalversion to write to source tree
* tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
builddeb: Fix inclusion of dtbs in debian package
Revert "scripts/setlocalversion: git: Make -dirty check more robust"
kbuild: deb-pkg: fix too low build version number
kconfig: merge_config: avoid false positive matches from comment lines
kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
kbuild: rpm-pkg: fix binrpm-pkg breakage when O= is used
Linus Torvalds [Sun, 11 Nov 2018 22:54:38 +0000 (16:54 -0600)]
Merge tag 'for-4.20-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Several fixes to recent release (4.19, fixes tagged for stable) and
other fixes"
* tag 'for-4.20-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix missing delayed iputs on unmount
Btrfs: fix data corruption due to cloning of eof block
Btrfs: fix infinite loop on inode eviction after deduplication of eof block
Btrfs: fix deadlock on tree root leaf when finding free extent
btrfs: avoid link error with CONFIG_NO_AUTO_INLINE
btrfs: tree-checker: Fix misleading group system information
Btrfs: fix missing data checksums after a ranged fsync (msync)
btrfs: fix pinned underflow after transaction aborted
Btrfs: fix cur_offset in the error case for nocow
Linus Torvalds [Sun, 11 Nov 2018 22:53:02 +0000 (16:53 -0600)]
Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"A large number of ext4 bug fixes, mostly buffer and memory leaks on
error return cleanup paths"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: missing !bh check in ext4_xattr_inode_write()
ext4: fix buffer leak in __ext4_read_dirblock() on error path
ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error path
ext4: fix buffer leak in ext4_xattr_move_to_block() on error path
ext4: release bs.bh before re-using in ext4_xattr_block_find()
ext4: fix buffer leak in ext4_xattr_get_block() on error path
ext4: fix possible leak of s_journal_flag_rwsem in error path
ext4: fix possible leak of sbi->s_group_desc_leak in error path
ext4: remove unneeded brelse call in ext4_xattr_inode_update_ref()
ext4: avoid possible double brelse() in add_new_gdb() on error path
ext4: avoid buffer leak in ext4_orphan_add() after prior errors
ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty()
ext4: fix possible inode leak in the retry loop of ext4_resize_fs()
ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing
ext4: add missing brelse() update_backups()'s error path
ext4: add missing brelse() add_new_gdb_meta_bg()'s error path
ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path
ext4: avoid potential extra brelse in setup_new_flex_group_blocks()
Linus Torvalds [Sun, 11 Nov 2018 22:41:50 +0000 (16:41 -0600)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of x86 fixes:
- Cure the LDT remapping to user space on 5 level paging which ended
up in the KASLR space
- Remove LDT mapping before freeing the LDT pages
- Make NFIT MCE handling more robust
- Unbreak the VSMP build by removing the dependency on paravirt ops
- Support broken PIT emulation on Microsoft hyperV
- Don't trace vmware_sched_clock() to avoid tracer recursion
- Remove -pipe from KBUILD CFLAGS which breaks clang and is also
slower on GCC
- Trivial coding style and typo fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu/vmware: Do not trace vmware_sched_clock()
x86/vsmp: Remove dependency on pv_irq_ops
x86/ldt: Remove unused variable in map_ldt_struct()
x86/ldt: Unmap PTEs for the slot before freeing LDT pages
x86/mm: Move LDT remap out of KASLR region on 5-level paging
acpi/nfit, x86/mce: Validate a MCE's address before using it
acpi/nfit, x86/mce: Handle only uncorrectable machine checks
x86/build: Remove -pipe from KBUILD_CFLAGS
x86/hyper-v: Fix indentation in hv_do_fast_hypercall16()
Documentation/x86: Fix typo in zero-page.txt
x86/hyper-v: Enable PIT shutdown quirk
clockevents/drivers/i8253: Add support for PIT shutdown quirk
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Do not zero sample_id_all for group members
perf tools: Fix undefined symbol scnprintf in libperf-jvmti.so
perf beauty: Use SRCARCH, ARCH=x86_64 must map to "x86" to find the headers
perf intel-pt: Add MTC and CYC timestamps to debug log
perf intel-pt: Add more event information to debug log
perf scripts python: exported-sql-viewer.py: Fix table find when table re-ordered
perf scripts python: exported-sql-viewer.py: Add help window
perf scripts python: exported-sql-viewer.py: Add Selected branches report
perf scripts python: exported-sql-viewer.py: Fall back to /usr/local/lib/libxed.so
perf top: Display the LBR stats in callchain entry
perf stat: Handle different PMU names with common prefix
perf record: Support weak groups
perf evlist: Move perf_evsel__reset_weak_group into evlist
perf augmented_syscalls: Start collecting pathnames in the BPF program
perf trace: Fix setting of augmented payload when using eBPF + raw_syscalls
perf trace: When augmenting raw_syscalls plug raw_syscalls:sys_exit too
perf examples bpf: Start augmenting raw_syscalls:sys_{start,exit}
tools headers barrier: Fix arm64 tools build failure wrt smp_load_{acquire,release}
Linus Torvalds [Sun, 11 Nov 2018 22:33:00 +0000 (16:33 -0600)]
Merge branch 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"Two small scheduler fixes:
- Take hotplug lock in sched_init_smp(). Technically not really
required, but lockdep will complain other.
- Trivial comment fix in sched/fair"
* 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix a comment in task_numa_fault()
sched/core: Take the hotplug lock in sched_init_smp()
Heiner Kallweit [Sun, 11 Nov 2018 19:31:21 +0000 (20:31 +0100)]
PCI: add USR vendor id and use it in r8169 and w6692 driver
The PCI vendor id of U.S. Robotics isn't defined in pci_ids.h so far,
only ISDN driver w6692 has a private definition. Move the definition
to pci_ids.h and use it in the r8169 driver too.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Nov 2018 17:11:31 +0000 (09:11 -0800)]
net_sched: sch_fq: add dctcp-like marking
Similar to 80ba92fa1a92 ("codel: add ce_threshold attribute")
After EDT adoption, it became easier to implement DCTCP-like CE marking.
In many cases, queues are not building in the network fabric but on
the hosts themselves.
If packets leaving fq missed their Earliest Departure Time by XXX usec,
we mark them with ECN CE. This gives a feedback (after one RTT) to
the sender to slow down and find better operating mode.
Example :
tc qd replace dev eth0 root fq ce_threshold 2.5ms
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Nov 2018 15:34:28 +0000 (07:34 -0800)]
tcp: tsq: no longer use limit_output_bytes for paced flows
FQ pacing guarantees that paced packets queued by one flow do not
add head-of-line blocking for other flows.
After TCP GSO conversion, increasing limit_output_bytes to 1 MB is safe,
since this maps to 16 skbs at most in qdisc or device queues.
(or slightly more if some drivers lower {gso_max_segs|size})
We still can queue at most 1 ms worth of traffic (this can be scaled
by wifi drivers if they need to)
Tested:
# ethtool -c eth0 | egrep "tx-usecs:|tx-frames:" # 40 Gbit mlx4 NIC
tx-usecs: 16
tx-frames: 16
# tc qdisc replace dev eth0 root fq
# for f in {1..10};do netperf -P0 -H lpaa24,6 -o THROUGHPUT;done
Eric Dumazet [Sun, 11 Nov 2018 14:41:31 +0000 (06:41 -0800)]
tcp: get rid of tcp_tso_should_defer() dependency on HZ/jiffies
tcp_tso_should_defer() first heuristic is to not defer
if last send is "old enough".
Its current implementation uses jiffies and its low granularity.
TSO autodefer performance should not rely on kernel HZ :/
After EDT conversion, we have state variables in nanoseconds that
can allow us to properly implement the heuristic.
This patch increases TSO chunk sizes on medium rate flows,
especially when receivers do not use GRO or similar aggregation.
It also reduces bursts for HZ=100 or HZ=250 kernels, making TCP
behavior more uniform.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Nov 2018 14:41:30 +0000 (06:41 -0800)]
tcp: refine tcp_tso_should_defer() after EDT adoption
tcp_tso_should_defer() last step tries to check if the probable
next ACK packet is coming in less than half rtt.
Problem is that the head->tstamp might be in the future,
so we need to use signed arithmetics to avoid overflows.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Nov 2018 14:41:29 +0000 (06:41 -0800)]
tcp: do not try to defer skbs with eor mark (MSG_EOR)
Applications using MSG_EOR are giving a strong hint to TCP stack :
Subsequent sendmsg() can not append more bytes to skbs having
the EOR mark.
Do not try to TSO defer suchs skbs, there is really no hope.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yafang Shao [Sun, 11 Nov 2018 12:10:10 +0000 (20:10 +0800)]
tcp: minor optimization in tcp ack fast path processing
Bitwise operation is a little faster.
So I replace after() with using the flag FLAG_SND_UNA_ADVANCED as it is
already set before.
In addtion, there's another similar improvement in tcp_cwnd_reduction().
Cc: Joe Perches <joe@perches.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Nov 2018 00:22:29 +0000 (16:22 -0800)]
act_mirred: clear skb->tstamp on redirect
If sch_fq is used at ingress, skbs that might have been
timestamped by net_timestamp_set() if a packet capture
is requesting timestamps could be delayed by arbitrary
amount of time, since sch_fq time base is MONOTONIC.
Fix this problem by moving code from sch_netem.c to act_mirred.c.
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 23:41:10 +0000 (00:41 +0100)]
net: dsa: mv88e6xxx: Fix clearing of stats counters
The mv88e6161 would sometime fail to probe with a timeout waiting for
the switch to complete an operation. This operation is supposed to
clear the statistics counters. However, due to a read/modify/write,
without the needed mask, the operation actually carried out was more
random, with invalid parameters, resulting in the switch not
responding. We need to preserve the histogram mode bits, so apply a
mask to keep them.
Reported-by: Chris Healy <Chris.Healy@zii.aero> Fixes: 40cff8fca9e3 ("net: dsa: mv88e6xxx: Fix stats histogram mode") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: dsa: mv88e6xxx: Support more SERDES interfacxes
Currently the SERDES interfaces for ports 9 and 10 on the mv88e6390x
are supported, allowing upto 10G. However, when unused, these SERDES
interfaces can be used by some of the lower ports for 1000Base-X.
The tricky bit here is ordering. The SERDES have to become free from
ports 9 or 10 before they can be used with lower ports. Normally, this
would happen only when these ports would be configured up, which is
too late. So at probe time, defaulting ports 9 and 10 to 1000BaseX
frees them for use with lower ports. If they are actually needed, they
will be taken back when port 9 and 10 goes up.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 23:32:17 +0000 (00:32 +0100)]
net: dsa: mv88e6xxx: Add support for SERDES on ports 2-8 for 6390X
The 6390X family has 8 SERDES interfaces. When ports 9 and 10 are not
using all their SERDES interfaces, the unused ones can be assigned to
ports 2-8. Add support for interrupts from SERDES interfaces connected
to these lower ports.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 23:32:16 +0000 (00:32 +0100)]
net: dsa: mv88e6xxx: Default ports 9/10 6390X CMODE to 1000BaseX
The 6390X family has 8 SERDES interfaces. This allows ports 9 and 10
to support up to 10Gbps using 4 SERDES interfaces. However, when lower
speeds are used, which need fewer SERDES interfaces, the unused SERDES
interfaces can be used by ports 2-8.
The hardware defaults to ports 9 and 10 having all 4 SERDES interfaces
assigned to them. This only gets changed when the interface is
configured after what the SFP supports has been determined, or the 10G
PHY completes auto-neg.
For hardware designs which limit ports 9 and 10 to one or two SERDES
interfaces, and place SFPs on the lower interfaces, this is too
late. Those ports with SFP should not wait until ports 9/10 are up in
order to get access to the SERDES interface. So change the default
configuration when the driver is initialised. Configure ports 9 and 10
to 1000BaseX, so they use a single SERDES interface, freeing up the
others. They can steal them back if they need them.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 23:32:15 +0000 (00:32 +0100)]
net: dsa: mv88e6xxx: Differentiate between 6390 and 6390X cmodes
The X family variants support additional ports modes, for 10G
operation, which the non-X variants don't have. Add a port_set_cmode()
for non-X variants to enforce this.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: phy: convert advertise and supported to linkmode
This is the last part in converting phylib to make use of a linux
bitmap, not a u32, to represent links modes. This will allow support
for PHYs > 1Gbps, which need to use link modes represented by a bit >
32.
A number of MAC and PHY drivers need changes to support this. However
the previous two patchesets reduced the number somewhat, the helpers
which were introduced have been modified instead of the actual
drivers.
The follow on patches then make use of the extra bits, adding support
for more link modes.
Given how invasive this change is, i expect the build is broken for
some architectures i did not test. I will fixup the breakage as fast
as i can.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 22:43:37 +0000 (23:43 +0100)]
net: phy: Add support for resolving 5G and 2.5G autoneg
Now that 2.5G and 5G can be represented in phydev->advertising and
phydev->lp_advertising, add these two links modes as possible
resolutions to auto negotiation.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 22:43:34 +0000 (23:43 +0100)]
net: phy: Convert u32 phydev->lp_advertising to linkmode
Convert phy drivers to report the link partner advertised modes using
a linkmode bitmap. This allows them to report the higher speeds which
don't fit in a u32.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 10 Nov 2018 22:43:33 +0000 (23:43 +0100)]
net: ethernet: Convert phydev advertize and supported from u32 to link mode
There are a few MAC/PHYs combinations which now support > 1Gbps. These
may need to make use of link modes with bits > 31. Thus their
supported PHY features or advertised features cannot be implemented
using the current bitmap in a u32. Convert to using a linkmode bitmap,
which can support all the currently devices link modes, and is future
proof as more modes are added.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Sat, 10 Nov 2018 22:30:24 +0000 (17:30 -0500)]
tipc: fix link re-establish failure
When a link failure is detected locally, the link is reset, the flag
link->in_session is set to false, and a RESET_MSG with the 'stopping'
bit set is sent to the peer.
The purpose of this bit is to inform the peer that this endpoint just
is going down, and that the peer should handle the reception of this
particular RESET message as a local failure. This forces the peer to
accept another RESET or ACTIVATE message from this endpoint before it
can re-establish the link. This again is necessary to ensure that
link session numbers are properly exchanged before the link comes up
again.
If a failure is detected locally at the same time at the peer endpoint
this will do the same, which is also a correct behavior.
However, when receiving such messages, the endpoints will not
distinguish between 'stopping' RESETs and ordinary ones when it comes
to updating session numbers. Both endpoints will copy the received
session number and set their 'in_session' flags to true at the
reception, while they are still expecting another RESET from the
peer before they can go ahead and re-establish. This is contradictory,
since, after applying the validation check referred to below, the
'in_session' flag will cause rejection of all such messages, and the
link will never come up again.
We now fix this by not only handling received RESET/STOPPING messages
as a local failure, but also by omitting to set a new session number
and the 'in_session' flag in such cases.
Fixes: 7ea817f4e832 ("tipc: check session number before accepting link protocol messages") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
yupeng [Sat, 10 Nov 2018 21:38:12 +0000 (13:38 -0800)]
documentation of some IP/ICMP snmp counters
The snmp_counter.rst explains the meanings of snmp counters. It also
provides a set of experiments (only 1 for this initial patch),
combines the experiments' resutls and the snmp counters'
meanings. This is an initial path, only explains a part of IP/ICMP
counters and provide a simple ping test.
Signed-off-by: yupeng <yupeng0921@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the broadcast retransmission algorithm is using the
'prev_retr' field in struct tipc_link to time stamp the latest broadcast
retransmission occasion. This helps to restrict retransmission of
individual broadcast packets to max once per 10 milliseconds, even
though all other criteria for retransmission are met.
We now move this time stamp to the control block of each individual
packet, and remove other limiting criteria. This simplifies the
retransmission algorithm, and eliminates any risk of logical errors
in selecting which packets can be retransmitted.
Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: LUU Duc Canh <canh.d.luu@dektech.com.au> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patchset introduces an alternative to egdev offload by allowing a
driver to register for block updates when an external device (e.g. tunnel
netdev) is bound to a TC block. Drivers can track new netdevs or register
to existing ones to receive information on such events. Based on this,
they may register for block offload rules using already existing
functions.
The patchset also implements this new indirect block registration in the
NFP driver to allow the offloading of tunnel rules. The use of egdev
offload (which is currently only used for tunnel offload) is subsequently
removed.
RFC v2 -> PATCH
- removed embedded tracking function from indir block register (now up to
driver to clean up after itself)
- refactored NFP code due to recent submissions
- removed priv list clean function in NFP (list should be cleared by
indirect block unregisters)
RFC v1->v2:
- free allocated owner struct in block_owner_clean function
- add geneve type helper function
- move test stub in NFP (v1 patch 2) to full tunnel offload
implementation via indirect blocks (v2 patches 3-8)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Sat, 10 Nov 2018 05:21:31 +0000 (21:21 -0800)]
nfp: flower: remove unnecessary code in flow lookup
Recent changes to NFP mean that stats updates from fw to driver no longer
require a flow lookup and (because egdev offload has been removed) the
ingress netdev for a lookup is now always known.
Remove obsolete code in a flow lookup that matches on host context and
that allows for a netdev to be NULL.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Sat, 10 Nov 2018 05:21:30 +0000 (21:21 -0800)]
nfp: flower: remove TC egdev offloads
Previously, only tunnel decap rules required egdev registration for
offload in NFP. These are now supported via indirect TC block callbacks.
Remove the egdev code from NFP.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Sat, 10 Nov 2018 05:21:29 +0000 (21:21 -0800)]
nfp: flower: offload tunnel decap rules via indirect TC blocks
Previously, TC block tunnel decap rules were only offloaded when a
callback was triggered through registration of the rules egress device.
This meant that the driver had no access to the ingress netdev and so
could not verify it was the same tunnel type that the rule implied.
Register tunnel devices for indirect TC block offloads in NFP, giving
access to new rules based on the ingress device rather than egress. Use
this to verify the netdev type of VXLAN and Geneve based rules and offload
the rules to HW if applicable.
Tunnel registration is done via a netdev notifier. On notifier
registration, this is triggered for already existing netdevs. This means
that NFP can register for offloads from devices that exist before it is
loaded (filter rules will be replayed from the TC core). Similarly, on
notifier unregister, a call is triggered for each currently active netdev.
This allows the driver to unregister any indirect block callbacks that may
still be active.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Sat, 10 Nov 2018 05:21:28 +0000 (21:21 -0800)]
nfp: flower: increase scope of netdev checking functions
Both the actions and tunnel_conf files contain local functions that check
the type of an input netdev. In preparation for re-use with tunnel offload
via indirect blocks, move these to static inline functions in a header
file.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Sat, 10 Nov 2018 05:21:27 +0000 (21:21 -0800)]
nfp: flower: allow non repr netdev offload
Previously the offload functions in NFP assumed that the ingress (or
egress) netdev passed to them was an nfp repr.
Modify the driver to permit the passing of non repr netdevs as the ingress
device for an offload rule candidate. This may include devices such as
tunnels. The driver should then base its offload decision on a combination
of ingress device and egress port for a rule.
Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Sat, 10 Nov 2018 05:21:26 +0000 (21:21 -0800)]
net: sched: register callbacks for indirect tc block binds
Currently drivers can register to receive TC block bind/unbind callbacks
by implementing the setup_tc ndo in any of their given netdevs. However,
drivers may also be interested in binds to higher level devices (e.g.
tunnel drivers) to potentially offload filters applied to them.
Introduce indirect block devs which allows drivers to register callbacks
for block binds on other devices. The callback is triggered when the
device is bound to a block, allowing the driver to register for rules
applied to that block using already available functions.
Freeing an indirect block callback will trigger an unbind event (if
necessary) to direct the driver to remove any offloaded rules and unreg
any block rule callbacks. It is the responsibility of the implementing
driver to clean any registered indirect block callbacks before exiting,
if the block it still active at such a time.
Allow registering an indirect block dev callback for a device that is
already bound to a block. In this case (if it is an ingress block),
register and also trigger the callback meaning that any already installed
rules can be replayed to the calling driver.
Signed-off-by: John Hurley <john.hurley@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 9 Nov 2018 17:58:01 +0000 (18:58 +0100)]
net: phy: improve and inline phy_change
Now that phy_mac_interrupt() doesn't call phy_change() any longer it's
called from phy_interrupt() only. Therefore phy_interrupt_is_valid()
returns true always and the check can be removed.
In case of PHY_HALTED phy_interrupt() bails out immediately,
therefore the second check for PHY_HALTED including the call to
phy_disable_interrupts() can be removed.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 9 Nov 2018 17:56:52 +0000 (18:56 +0100)]
net: phy: simplify phy_mac_interrupt and related functions
When using phy_mac_interrupt() the irq number is set to
PHY_IGNORE_INTERRUPT, therefore phy_interrupt_is_valid() returns false.
As a result phy_change() effectively just calls phy_trigger_machine()
when called from phy_mac_interrupt() via phy_change_work(). So we can
call phy_trigger_machine() from phy_mac_interrupt() directly and
remove some now unneeded code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 9 Nov 2018 17:55:50 +0000 (18:55 +0100)]
net: phy: don't set state PHY_CHANGELINK in phy_change
State PHY_CHANGELINK isn't needed here, we can call the state machine
directly. We just have to remove the check for phy_polling_mode() to
make this work also in interrupt mode. Removing this check doesn't
cause any overhead because when not polling the state machine is
called only if required by some event.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 11 Nov 2018 17:37:04 +0000 (09:37 -0800)]
Merge branch 'remove-PHY_HAS_INTERRUPT'
Heiner Kallweit says:
====================
net: phy: replace PHY_HAS_INTERRUPT with a check for config_intr and ack_interrupt
Flag PHY_HAS_INTERRUPT is used only here for this small check. I think
using interrupts isn't possible if a driver defines neither
config_intr nor ack_interrupts callback. So we can replace checking
flag PHY_HAS_INTERRUPT with checking for these callbacks.
This allows to remove this flag from all driver configs.
v2:
- add helper for check in patch 1
- remove PHY_HAS_INTERRUPT from all drivers, not only Realtek
- remove flag PHY_HAS_INTERRUPT completely
v3:
- rebase patch 2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 9 Nov 2018 17:17:22 +0000 (18:17 +0100)]
net: phy: remove flag PHY_HAS_INTERRUPT from driver configs
Now that flag PHY_HAS_INTERRUPT has been replaced with a check for
callbacks config_intr and ack_interrupt, we can remove setting this
flag from all driver configs.
Last but not least remove flag PHY_HAS_INTERRUPT completely.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 9 Nov 2018 17:16:28 +0000 (18:16 +0100)]
net: phy: replace PHY_HAS_INTERRUPT with a check for config_intr and ack_interrupt
Flag PHY_HAS_INTERRUPT is used only here for this small check. I think
using interrupts isn't possible if a driver defines neither
config_intr nor ack_interrupts callback. So we can replace checking
flag PHY_HAS_INTERRUPT with checking for these callbacks.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Herring [Wed, 7 Nov 2018 14:36:46 +0000 (08:36 -0600)]
builddeb: Fix inclusion of dtbs in debian package
Commit 37c8a5fafa3b ("kbuild: consolidate Devicetree dtb build rules")
moved the location of 'dtbs_install' target which caused dtbs to not be
installed when building debian package with 'bindeb-pkg' target. Update
the builddeb script to use the same logic that determines if there's a
'dtbs_install' target which is presence of the arch dts directory. Also,
use CONFIG_OF_EARLY_FLATTREE instead of CONFIG_OF as that's a better
indication of whether we are building dtbs.
This commit will also have the side effect of installing dtbs on any
arch that has dts files. Previously, it was dependent on whether the
arch defined 'dtbs_install'.
While git appears to be able to handle this situation, a monitored
build environment (such as the one used for Chrome OS kernel builds)
may detect it and bail out with an access violation error. On top of
that, the attempted write access suggests that git _will_ write to the
file even if a build output directory is specified. Users may have the
reasonable expectation that the source repository remains untouched in
that situation.
Fixes: 6147b1cf19651 ("scripts/setlocalversion: git: Make -dirty check more robust" Cc: Genki Sky <sky@genki.is> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Tue, 6 Nov 2018 04:18:05 +0000 (13:18 +0900)]
kbuild: deb-pkg: fix too low build version number
Since commit b41d920acff8 ("kbuild: deb-pkg: split generating packaging
and build"), the build version of the kernel contained in a deb package
is too low by 1.
Prior to the bad commit, the kernel was built first, then the number
in .version file was read out, and written into the debian control file.
Now, the debian control file is created before the kernel is actually
compiled, which is causing the version number mismatch.
Let the mkdebian script pass KBUILD_BUILD_VERSION=${revision} to require
the build system to use the specified version number.
Masahiro Yamada [Mon, 5 Nov 2018 08:19:36 +0000 (17:19 +0900)]
kconfig: merge_config: avoid false positive matches from comment lines
The current SED_CONFIG_EXP could match to comment lines in config
fragment files, especially when CONFIG_PREFIX_ is empty. For example,
Buildroot uses empty prefixing; starting symbols with BR2_ is just
convention.
Make the sed expression more robust against false positives from
comment lines. The new sed expression matches to only valid patterns.
David S. Miller [Sun, 12 Aug 2018 04:14:34 +0000 (21:14 -0700)]
brcmfmac: Use standard SKB list accessors in brcmf_sdiod_sglist_rw.
Instead of direct SKB list pointer accesses.
The loops in this function had to be rewritten to accommodate this
more easily.
The first loop iterates now over the target list in the outer loop,
and triggers an mmc data operation when the per-operation limits are
hit.
Then after the loops, if we have any residue, we trigger the last
and final operation.
For the page aligned workaround, where we have to copy the read data
back into the original list of SKBs, we use a two-tiered loop. The
outer loop stays the same and iterates over pktlist, and then we have
an inner loop which uses skb_peek_next(). The break logic has been
simplified because we know that the aggregate length of the SKBs in
the source and destination lists are the same.
This change also ends up fixing a bug, having to do with the
maintainance of the seg_sz variable and how it drove the outermost
loop. It begins as:
seg_sz = target_list->qlen;
ie. the number of packets in the target_list queue. The loop
structure was then:
while (seq_sz) {
...
while (not at end of target_list) {
...
sg_cnt++
...
}
...
seg_sz -= sg_cnt;
The assumption built into that last statement is that sg_cnt counts
how many packets from target_list have been fully processed by the
inner loop. But this not true.
If we hit one of the limits, such as the max segment size or the max
request size, we will break and copy a partial packet then contine
back up to the top of the outermost loop.
With the new loops we don't have this problem as we don't guard the
loop exit with a packet count, but instead use the progression of the
pkt_next SKB through the list to the end. The general structure is:
sg_cnt = 0;
skb_queue_walk(target_list, pkt_next) {
pkt_offset = 0;
...
sg_cnt++;
...
while (pkt_offset < pkt_next->len) {
pkt_offset += sg_data_size;
if (queued up max per request)
mmc_submit_one();
}
}
if (sg_cnt)
mmc_submit_one();
The variables that maintain where we are in the MMC command state such
as req_sz, sg_cnt, and sgl are reset when we emit one of these full
sized requests.
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Sat, 10 Nov 2018 18:55:34 +0000 (19:55 +0100)]
OVS: remove VLAN_TAG_PRESENT - fixup
It turns out I missed one VLAN_TAG_PRESENT in OVS code while rebasing.
This fixes it.
Fixes: 9df46aefafa6 ("OVS: remove use of VLAN_TAG_PRESENT") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 10 Nov 2018 19:32:14 +0000 (13:32 -0600)]
Merge tag 'tty-4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small tty fixes for 4.20-rc2
One of these missed the original 4.19-final release, I missed that I
hadn't done a pull request for it as it was in linux-next and my
branch for a long time, that's my fault.
The others are small, fixing some reported issues and finally fixing
the termios mess for alpha so that glibc has a chance to implement
some missing functionality that has been pending for many years now.
All of these have been in linux-next with no reported issues"
* tag 'tty-4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout
arch/alpha, termios: implement BOTHER, IBSHIFT and termios2
termios, tty/tty_baudrate.c: fix buffer overrun
vt: fix broken display when running aptitude
serial: sh-sci: Fix receive on SCIFA/SCIFB variants with DMA
Linus Torvalds [Sat, 10 Nov 2018 19:29:47 +0000 (13:29 -0600)]
Merge tag 'drm-fixes-2018-11-11' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"drm: i915, amdgpu, sun4i, exynos and etnaviv fixes:
- amdgpu has some display fixes, KFD ioctl fixes and a Vega20 bios
interaction fix.
- sun4i has some NULL checks added
- i915 has a 32-bit system fix, LPE audio oops, and HDMI2.0 clock
fixes.
- Exynos has a 3 regression fixes (one frame counter, fbdev missing,
dsi->panel check)
- Etnaviv has a single fencing fix for GPU recovery"
* tag 'drm-fixes-2018-11-11' of git://anongit.freedesktop.org/drm/drm: (39 commits)
drm/amd/amdgpu/dm: Fix dm_dp_create_fake_mst_encoder()
drm/amd/display: Drop reusing drm connector for MST
drm/amd/display: Cleanup MST non-atomic code workaround
drm/amd/powerplay: always use fast UCLK switching when UCLK DPM enabled
drm/amd/powerplay: set a default fclk/gfxclk ratio
drm/amdgpu/display/dce11: only enable FBC when selected
drm/amdgpu/display/dm: handle FBC dc feature parameter
drm/amdgpu/display/dc: add FBC to dc_config
drm/amdgpu: add DC feature mask module parameter
drm/amdgpu/display: check if fbc is available in set_static_screen_control (v2)
drm/amdgpu/vega20: add CLK base offset
drm/amd/display: Stop leaking planes
drm/amd/display: Fix misleading buffer information
Revert "drm/amd/display: set backlight level limit to 1"
drm/amd: Update atom_smu_info_v3_3 structure
drm/i915: Fix ilk+ watermarks when disabling pipes
drm/sun4i: tcon: prevent tcon->panel dereference if NULL
drm/sun4i: tcon: fix check of tcon->panel null pointer
drm/i915: Don't oops during modeset shutdown after lpe audio deinit
drm/i915: Mark pin flags as u64
...
Linus Torvalds [Sat, 10 Nov 2018 19:27:58 +0000 (13:27 -0600)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace fixes from Eric Biederman:
"I believe all of these are simple obviously correct bug fixes. These
fall into two groups:
- Fixing the implementation of MNT_LOCKED which prevents lesser
privileged users from seeing unders mounts created by more
privileged users.
- Fixing the extended uid and group mapping in user namespaces.
As well as ensuring the code looks correct I have spot tested these
changes as well and in my testing the fixes are working"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
mount: Prevent MNT_DETACH from disconnecting locked mounts
mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts
mount: Retest MNT_LOCKED in do_umount
userns: also map extents in the reverse map to kernel IDs
Linus Torvalds [Sat, 10 Nov 2018 19:25:55 +0000 (13:25 -0600)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A small set of fixes for clk drivers.
One to fix a DT refcount imbalance, two to mark some Amlogic clks as
critical, and one final one that fixes a clk name for the Qualcomm
driver merged this cycle"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: gcc: Fix board clock node name
clk: meson: axg: mark fdiv2 and fdiv3 as critical
clk: meson-gxbb: set fclk_div3 as CLK_IS_CRITICAL
clk: fixed-factor: fix of_node_get-put imbalance
Dave Airlie [Sat, 10 Nov 2018 18:14:22 +0000 (04:14 +1000)]
Merge tag 'drm-intel-fixes-2018-11-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Bugzilla #108282 fixed: Avoid graphics corruption on 32-bit systems for Mesa 18.2.x
Avoid OOPS on LPE audio deinit. Remove two unused W/As.
Fix to correct HDMI 2.0 audio clock modes to spec.
Jakub Kicinski [Sat, 10 Nov 2018 05:06:26 +0000 (21:06 -0800)]
net: sched: cls_flower: validate nested enc_opts_policy to avoid warning
TCA_FLOWER_KEY_ENC_OPTS and TCA_FLOWER_KEY_ENC_OPTS_MASK can only
currently contain further nested attributes, which are parsed by
hand, so the policy is never actually used resulting in a W=1
build warning:
net/sched/cls_flower.c:492:1: warning: ‘enc_opts_policy’ defined but not used [-Wunused-const-variable=]
enc_opts_policy[TCA_FLOWER_KEY_ENC_OPTS_MAX + 1] = {
Add the validation anyway to avoid potential bugs when other
attributes are added and to make the attribute structure slightly
more clear. Validation will also set extact to point to bad
attribute on error.
Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 10 Nov 2018 13:07:21 +0000 (07:07 -0600)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix occasional page fault during boot due to memblock resizing before
the linear map is up.
- Define NET_IP_ALIGN to 0 to improve the DMA performance on some
platforms.
- lib/raid6 test build fix.
- .mailmap update for Punit Agrawal
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: memblock: don't permit memblock resizing until linear mapping is up
arm64: mm: define NET_IP_ALIGN to 0
lib/raid6: Fix arm64 test build
mailmap: Update email for Punit Agrawal
Linus Torvalds [Sat, 10 Nov 2018 12:57:34 +0000 (06:57 -0600)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"I2C has one bugfix (qcom-geni driver), one arch enablement (i2c-omap
driver, no code change), and a new driver (nvidia-gpu) this time"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
usb: typec: ucsi: add support for Cypress CCGx
i2c: nvidia-gpu: make pm_ops static
i2c: add i2c bus driver for NVIDIA GPU
i2c: qcom-geni: Fix runtime PM mismatch with child devices
MAINTAINERS: Add entry for i2c-omap driver
i2c: omap: Enable for ARCH_K3
dt-bindings: i2c: omap: Add new compatible for AM654 SoCs
Kyle Roeschley [Fri, 9 Nov 2018 18:48:03 +0000 (12:48 -0600)]
net: phy: leds: Don't make our own link speed names
The phy core provides a handy phy_speed_to_str() helper, so use that
instead of doing our own formatting of the different known link speeds.
To do this, increase PHY_LED_TRIGGER_SPEED_SUFFIX_SIZE to 11 so we can fit
'Unsupported' if necessary.
Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 9 Nov 2018 17:35:52 +0000 (18:35 +0100)]
net: phy: improve struct phy_device member interrupts handling
As a heritage from the very early days of phylib member interrupts is
defined as u32 even though it's just a flag whether interrupts are
enabled. So we can change it to a bitfield member. In addition change
the code dealing with this member in a way that it's clear we're
dealing with a bool value.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
dpaa2-eth: defer probe on object allocate
Allocatable objects on the fsl-mc bus may be probed by the fsl_mc_allocator
after the first attempts of other drivers to use them. Defer the probe when
this situation happens.
Changes in v2:
- proper handling of IS_ERR_OR_NULL
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Fri, 9 Nov 2018 15:26:46 +0000 (15:26 +0000)]
dpaa2-ptp: defer probe when portal allocation failed
The fsl_mc_portal_allocate can fail when the requested MC portals are
not yet probed by the fsl_mc_allocator. In this situation, the driver
should defer the probe.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Fri, 9 Nov 2018 15:26:45 +0000 (15:26 +0000)]
dpaa2-eth: defer probe on object allocate
The fsl_mc_object_allocate function can fail because not all allocatable
objects are probed by the fsl_mc_allocator at the call time. Defer the
dpaa2-eth probe when this happens.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 9 Nov 2018 14:52:45 +0000 (15:52 +0100)]
udp6: cleanup stats accounting in recvmsg()
In the udp6 code path, we needed multiple tests to select the correct
mib to be updated. Since we touch at least a counter at each iteration,
it's convenient to use the recently introduced __UDPX_MIB() helper once
and remove some code duplication.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
배석진 [Sat, 10 Nov 2018 00:53:06 +0000 (16:53 -0800)]
flow_dissector: do not dissect l4 ports for fragments
Only first fragment has the sport/dport information,
not the following ones.
If we want consistent hash for all fragments, we need to
ignore ports even for first fragment.
This bug is visible for IPv6 traffic, if incoming fragments
do not have a flow label, since skb_get_hash() will give
different results for first fragment and following ones.
It is also visible if any routing rule wants dissection
and sport or dport.
See commit 5e5d6fed3741 ("ipv6: route: dissect flow
in input path if fib rules need it") for details.
[edumazet] rewrote the changelog completely.
Fixes: 06635a35d13d ("flow_dissect: use programable dissector in skb_flow_dissect and friends") Signed-off-by: 배석진 <soukjin.bae@samsung.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 10 Nov 2018 02:50:00 +0000 (18:50 -0800)]
nfp: use the new __netdev_tx_sent_queue() BQL optimisation
__netdev_tx_sent_queue() was added in commit e59020abf0f
("net: bql: add __netdev_tx_sent_queue()") and allows for
better GSO performance.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: qualcomm: rmnet: Fix incorrect assignment of real_dev
A null dereference was observed when a sysctl was being set
from userspace and rmnet was stuck trying to complete some actions
in the NETDEV_REGISTER callback. This is because the real_dev is set
only after the device registration handler completes.
sysctl call stack -
<6> Unable to handle kernel NULL pointer dereference at
virtual address 00000108
<2> pc : rmnet_vnd_get_iflink+0x1c/0x28
<2> lr : dev_get_iflink+0x2c/0x40
<2> rmnet_vnd_get_iflink+0x1c/0x28
<2> inet6_fill_ifinfo+0x15c/0x234
<2> inet6_ifinfo_notify+0x68/0xd4
<2> ndisc_ifinfo_sysctl_change+0x1b8/0x234
<2> proc_sys_call_handler+0xac/0x100
<2> proc_sys_write+0x3c/0x4c
<2> __vfs_write+0x54/0x14c
<2> vfs_write+0xcc/0x188
<2> SyS_write+0x60/0xc0
<2> el0_svc_naked+0x34/0x38
====================
More accurate PHC<->system clock synchronization
RFC->v1:
- added new patches
- separated PHC timestamp from ptp_system_timestamp
- fixed memory leak in PTP_SYS_OFFSET_EXTENDED
- changed PTP_SYS_OFFSET_EXTENDED to work with array of arrays
- fixed PTP_SYS_OFFSET_EXTENDED to break correctly from loop
- fixed timecounter updates in drivers
- split gettimex in igb driver
- fixed ptp_read_* functions to be available without
CONFIG_PTP_1588_CLOCK
This series enables a more accurate synchronization between PTP hardware
clocks and the system clock.
The first two patches are minor cleanup/bug fixes.
The third patch adds an extended version of the PTP_SYS_OFFSET ioctl,
which returns three timestamps for each measurement. The idea is to
shorten the interval between the system timestamps to contain just the
reading of the lowest register of the PHC in order to reduce the error
in the measured offset and get a smaller upper bound on the maximum
error.
The fourth patch deprecates the original gettime function.
The remaining patches update the gettime function in order to support
the new ioctl in the e1000e, igb, ixgbe, and tg3 drivers.
Tests with few different NICs in different machines show that:
- with an I219 (e1000e) the measured delay was reduced from 2500 to 1300
ns and the error in the measured offset, when compared to the cross
timestamping supported by the driver, was reduced by a factor of 5
- with an I210 (igb) the delay was reduced from 5100 to 1700 ns
- with an I350 (igb) the delay was reduced from 2300 to 750 ns
- with an X550 (ixgbe) the delay was reduced from 1950 to 650 ns
- with a BCM5720 (tg3) the delay was reduced from 2400 to 1200 ns
====================
Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tg3: extend PTP gettime function to read system clock
This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl.
Cc: Richard Cochran <richardcochran@gmail.com> Cc: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe: extend PTP gettime function to read system clock
This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl.
Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>