====================
BPF range marking improvements for meta data
The set contains improvements for direct packet access range
markings related to data_meta pointer and test cases for all
such access patterns that the verifier matches on.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 1 Nov 2017 22:58:11 +0000 (23:58 +0100)]
bpf: add test cases to bpf selftests to cover all meta tests
Let's also add test cases to cover all possible data_meta access tests
for good/bad access cases so we keep tracking them.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 1 Nov 2017 22:58:10 +0000 (23:58 +0100)]
bpf: also improve pattern matches for meta access
Follow-up to 0fd4759c5515 ("bpf: fix pattern matches for direct
packet access") to cover also the remaining data_meta/data matches
in the verifier. The matches are also refactored a bit to simplify
handling of all the cases.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 1 Nov 2017 22:58:09 +0000 (23:58 +0100)]
bpf: minor cleanups after merge
Two minor cleanups after Dave's recent merge in f8ddadc4db6c
("Merge git://git.kernel.org...") of net into net-next in
order to get the code in line with what was done originally
in the net tree: i) use max() instead of max_t() since both
ranges are u16, ii) don't split the direct access test cases
in the middle with bpf_exit test cases from 390ee7e29fc
("bpf: enforce return code for cgroup-bpf programs").
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 1 Nov 2017 18:48:00 +0000 (11:48 -0700)]
security: bpf: replace include of linux/bpf.h with forward declarations
Touching linux/bpf.h makes us rebuild a surprisingly large
portion of the kernel. Remove the unnecessary dependency
from security.h, it only needs forward declarations.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
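A minimal sketch of the forward-declaration technique described above. The names (struct widget, widget_check) are hypothetical and are not the actual contents of security.h; the point is only that code passing pointers around needs a declaration, not the full definition.

```c
/* Hypothetical header-style section: only pointers to struct widget are
 * used, so a forward declaration suffices and the defining header does
 * not have to be included here. */
struct widget;

int widget_check(const struct widget *w);

/* Hypothetical implementation section: the full definition is needed
 * only where members are actually accessed. */
struct widget {
	int refs;
};

int widget_check(const struct widget *w)
{
	/* Return 0 for a usable object, -1 otherwise. */
	return (w && w->refs > 0) ? 0 : -1;
}
```

With this split, a change to the defining header only forces a rebuild of the files that really look inside the structure.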
net: systemport: Only inspect valid switch port & queues
Esoteric board configurations where port 0 is not available would still
make SYSTEMPORT inspect the switch port 0, queue 0, which, not being
enabled, would cause transmit timeouts over time. Just ignore those
unconfigured rings instead.
Fixes: 84ff33eeb23d ("net: systemport: Establish DSA network device queue mapping") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
nfp: bpf: rename ALU_OP_NEG and support BPF_NEG
Jiong says:
Compilers are starting to use BPF_NEG, for example LLVM. However, NFP
does not support JITing it. This patch set adds this. Unit test is added
as well.
Meanwhile, the current NFP_ALU_NEG is actually doing bitwise NOT (one's
complement) operation, so the name is misleading. This patch set corrects
this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiong Wang [Wed, 1 Nov 2017 17:38:25 +0000 (10:38 -0700)]
nfp: bpf: support [BPF_ALU | BPF_ALU64] | BPF_NEG
This patch supports BPF_NEG under both BPF_ALU64 and BPF_ALU. LLVM has recently
started to generate it.
NOTE: BPF_NEG takes a single operand, which is a register that serves as both
input and output.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiong Wang [Wed, 1 Nov 2017 17:38:24 +0000 (10:38 -0700)]
nfp: bpf: rename ALU_OP_NEG to ALU_OP_NOT
The current ALU_OP_NEG is Op encoding 0x4 for the NFP ALU instruction. It
actually performs the "~B" operation, which is bitwise NOT.
The current name ALU_OP_NEG is misleading, as NEG is -B, which is not the
same as ~B.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
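The difference between ~B and -B can be checked with a small userspace sketch. The helper names alu_not and alu_neg are hypothetical and only illustrate the arithmetic; they are not NFP JIT code.

```c
#include <stdint.h>

/* Bitwise NOT (one's complement): the "~B" operation. */
static uint32_t alu_not(uint32_t b)
{
	return ~b;
}

/* Arithmetic negation (two's complement): the "-B" operation. */
static uint32_t alu_neg(uint32_t b)
{
	return 0u - b;
}
```

In two's complement arithmetic -B equals ~B + 1, so the two results always differ by one.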
David S. Miller [Thu, 2 Nov 2017 07:15:26 +0000 (16:15 +0900)]
Merge branch 'dpaa-cleanups'
yuan linyu says:
====================
net: dpaa: two minor cleanup
Originally I tried to remove duplicate code which cleans the allocated per-cpu area.
Thanks to David S. Miller, who pointed out two build warnings treated as errors.
patch 1: fix the maybe-uninitialized warning in the old code.
patch 2: remove duplicate code and fix the unused variable warning.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Discovered that the compiler laid out the asm code in a suboptimal way
when studying the perf report during benchmarking of cpumap. Help
the compiler by marking the unlikely code paths.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
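A minimal userspace sketch of marking an unlikely path. The unlikely() macro below is a stand-in built on the GCC/Clang builtin __builtin_expect, and sum_bytes is a hypothetical hot-path helper, not code from the cpumap patch.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's unlikely() annotation. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical hot-path helper: bad input is hinted as the rare case so
 * the compiler can keep the common path as straight-line code. */
static long sum_bytes(const unsigned char *buf, size_t len)
{
	long total = 0;
	size_t i;

	if (unlikely(!buf))	/* error path, hinted as cold */
		return -1;

	for (i = 0; i < len; i++)
		total += buf[i];
	return total;
}
```

The hint changes code layout only; the result of the function is the same with or without it.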
This patchset does a bit of cleanup of leftovers after block callbacks
patchset. The main part is patch 2, which restores the original handling
of tc offload feature flag.
---
v1->v2:
- rebased on top of current net-next (bnxt changes)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 1 Nov 2017 10:47:39 +0000 (11:47 +0100)]
net: sched: move the can_offload check from binding phase to rule insertion phase
This restores the original behaviour before the block callbacks were
introduced. Allow the drivers to do binding of block always, no matter
if the NETIF_F_HW_TC feature is on or off. Move the check to the block
callback which is called for rule insertion.
Reported-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: add notifications for the bridge dev on vlan change
Currently the bridge device doesn't generate any notifications upon vlan
modifications on itself because it doesn't use the generic bridge
notifications.
With the recent changes we know if anything was modified in the vlan config
thus we can generate a notification when necessary for the bridge device
so add support to br_ifinfo_notify() similar to how other combined
functions are done - if port is present it takes precedence, otherwise
notify about the bridge. I've explicitly marked the locations where the
notification should be always for the port by setting bridge to NULL.
I've also taken the liberty to rearrange each modified function's local
variables in reverse xmas tree as well.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 10:17:15 +0000 (10:17 +0000)]
net: hns3: remove a couple of redundant assignments
The assignment to kinfo is redundant as this is a duplicate of
the initialization of kinfo a few lines earlier, so it can be
removed. The assignment to v_tc_info is never read, so this
variable is redundant and can be removed completely. Cleans
up two clang warnings:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c:433:34:
warning: Value stored to 'kinfo' during its initialization is never read
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c:775:3:
warning: Value stored to 'v_tc_info' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 09:09:13 +0000 (09:09 +0000)]
liquidio: remove redundant setting of inst_processed to zero
The zero value assigned to inst_processed at the end of each
iteration of the do-while loop is overwritten on the next iteration
and hence it is a redundant assignment and can be removed. Cleans
up clang warning:
drivers/net/ethernet/cavium/liquidio/request_manager.c:480:3:
warning: Value stored to 'inst_processed' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 08:57:37 +0000 (08:57 +0000)]
net: dl2k: remove redundant re-assignment to np
The pointer np is initialized and then re-assigned the same value
a few lines later. Remove the redundant duplicated assignment. Cleans
up clang warning:
drivers/net/ethernet/dlink/dl2k.c:314:25: warning: Value stored to
'np' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 08:49:45 +0000 (08:49 +0000)]
wan: wanxl: remove redundant assignment to stat
stat is set to zero and the value is never read, instead stat is
set again in the do-loop. Hence the setting to zero is redundant
and can be removed. Cleans up clang warning:
drivers/net/wan/wanxl.c:737:2: warning: Value stored to 'stat'
is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
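The dead-store pattern behind these clang warnings can be sketched as follows. poll_status is a hypothetical function, not the wanxl code, and it assumes n is at least 1.

```c
/* Hypothetical poll loop: return the first non-zero sample, or the last
 * sample if all are zero. Assumes n >= 1. */
static int poll_status(const int *samples, int n)
{
	int stat;	/* no "= 0" here: the loop assigns before any read */
	int i = 0;

	do {
		stat = samples[i++];
	} while (stat == 0 && i < n);

	return stat;
}
```

Because a do-while body always runs at least once, an initial "stat = 0" would be overwritten before it could ever be read.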
Smooth Cong Wang's bug fix into 'net-next'. Basically put
the bulk of the tcf_block_put() logic from 'net' into
tcf_block_put_ext(), but after the offload unbind.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 05:19:53 +0000 (14:19 +0900)]
Merge branch 'samples-pktgen-updates'
Jesper Dangaard Brouer says:
====================
Updates for samples/pktgen
This patchset updates samples/pktgen and synchronizes it with changes
maintained in https://github.com/netoptimizer/network-testing/
Feature-wise, Robert Hoo <robert.hu@intel.com> added support for
detecting and determining dev NUMA node IRQs, and added a new script
named pktgen_sample06_numa_awared_queue_irq_affinity.sh that uses these
features.
Cleanup: remove the last of the old sample files, as IPv6 is covered by
existing sample code.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
samples/pktgen: remove remaining old pktgen sample scripts
Since commit 0f06a6787e05 ("samples: Add an IPv6 '-6' option to the
pktgen scripts") the newer pktgen_sampleXX scripts show how to use
IPv6 with pktgen.
Thus, there is no longer a reason to keep the older sample scripts around.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Bind each thread (processor of NUMA locality) with each $DEV queue's
irq affinity, 1:1 mapping.
* How many '-t' threads input determines how many queues will be utilized.
If '-f' designates first cpu id, then offset in the NUMA node's cpu list.
(Changes by Jesper: allow changing count from cmdline via '-n')
Signed-off-by: Robert Hoo <robert.hu@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Hoo [Wed, 1 Nov 2017 10:41:09 +0000 (11:41 +0100)]
samples/pktgen: Add some helper functions
1. given a device, get the NUMA node it belongs to.
2. given a device, get its queues' irq numbers.
3. given a NUMA node, get its cpu id list.
Signed-off-by: Robert Hoo <robert.hu@intel.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Parvi Kaustubhi [Wed, 1 Nov 2017 15:44:46 +0000 (08:44 -0700)]
enic: reset fetch index
Since we are allowing rx ring size modification, reset fetch index
every time. Otherwise it could have a stale value that can lead to a null
pointer dereference.
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Nov 2017 23:04:27 +0000 (16:04 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull signal bugfix from Eric Biederman:
"When making the generic support for SIGEMT conditional on the presence
of SIGEMT I made a typo that causes it to fail to activate. It was
noticed comparatively quickly but the bug report just made it to me
today"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Fix name of SIGEMT in #if defined() check
Andrew Clayton [Wed, 1 Nov 2017 15:49:59 +0000 (15:49 +0000)]
signal: Fix name of SIGEMT in #if defined() check
Commit cc731525f26a ("signal: Remove kernel interal si_code magic")
added a check for SIGMET and NSIGEMT being defined. That SIGMET should
in fact be SIGEMT, with SIGEMT being defined in
arch/{alpha,mips,sparc}/include/uapi/asm/signal.h
This was actually pointed out by Ben Hutchings in a lwn.net comment
here: https://lwn.net/Comments/734608/
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic") Signed-off-by: Andrew Clayton <andrew@digital-domain.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
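Why such a typo goes unnoticed can be shown with a small preprocessor sketch. The stand-in SIGEMT value and the *_ACTIVE macros are assumptions made for illustration; this is not the kernel's signal code.

```c
/* Arbitrary stand-in definition so the sketch builds on platforms that
 * do not provide SIGEMT. */
#ifndef SIGEMT
#define SIGEMT 7
#endif

/* Misspelled name: the macro is never defined, so the guarded branch is
 * silently skipped without any compiler diagnostic. */
#if defined(SIGMET)
#define TYPO_CHECK_ACTIVE 1
#else
#define TYPO_CHECK_ACTIVE 0
#endif

/* Correct name: the guarded branch is compiled in. */
#if defined(SIGEMT)
#define FIXED_CHECK_ACTIVE 1
#else
#define FIXED_CHECK_ACTIVE 0
#endif

static int typo_check_active(void)
{
	return TYPO_CHECK_ACTIVE;
}

static int fixed_check_active(void)
{
	return FIXED_CHECK_ACTIVE;
}
```

An undefined name inside defined() is not an error, which is why the misspelling only showed up as missing behaviour.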
Linus Torvalds [Wed, 1 Nov 2017 21:46:38 +0000 (14:46 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few fixes that should go into this series:
- Regression fix for ide-cd, ensuring that a request is fully
initialized. From Hongxu.
- Ditto fix for virtio_blk, from Bart.
- NVMe fix from Keith, ensuring that we set the right block size on
revalidation. If the block size changed, we'd be in trouble without
it.
- NVMe rdma fix from Sagi, fixing a potential hang while the
controller is being removed"
* 'for-linus' of git://git.kernel.dk/linux-block:
ide:ide-cd: fix kernel panic resulting from missing scsi_req_init
nvme: Fix setting logical block format when revalidating
virtio_blk: Fix an SG_IO regression
nvme-rdma: fix possible hang when issuing commands during ctrl removal
The vma should be pinned by mmap_sem, but handle_userfault() might (in a
return to userspace scenario) release it and then acquire again, so when
we return to __do_page_fault() (with other result than VM_FAULT_RETRY),
the vma might be gone.
Specifically, per Andrea the scenario is
"A return to userland to repeat the page fault later with a
VM_FAULT_NOPAGE retval (potentially after handling any pending signal
during the return to userland). The return to userland is identified
whenever FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in
vmf->flags"
However, since commit a3c4fb7c9c2e ("x86/mm: Fix fault error path using
unsafe vma pointer") there is a vma_pkey() read of vma->vm_flags after
that point, which can thus become use-after-free. Fix this by moving
the read before calling handle_mm_fault().
Hongxu Jia [Tue, 31 Oct 2017 07:39:40 +0000 (15:39 +0800)]
ide:ide-cd: fix kernel panic resulting from missing scsi_req_init
Since we split the scsi_request out of struct request, while the
standard prep_rq_fn builds 10 byte cmds, it missed to invoke
scsi_req_init() to initialize certain fields of a scsi_request
structure (.__cmd[], .cmd, .cmd_len and .sense_len but no other
members of struct scsi_request).
An example panic on virtual machines (qemu/virtualbox) to boot
from IDE cdrom:
...
[ 8.754381] Call Trace:
[ 8.755419] blk_peek_request+0x182/0x2e0
[ 8.755863] blk_fetch_request+0x1c/0x40
[ 8.756148] ? ktime_get+0x40/0xa0
[ 8.756385] do_ide_request+0x37d/0x660
[ 8.756704] ? cfq_group_service_tree_add+0x98/0xc0
[ 8.757011] ? cfq_service_tree_add+0x1e5/0x2c0
[ 8.757313] ? ktime_get+0x40/0xa0
[ 8.757544] __blk_run_queue+0x3d/0x60
[ 8.757837] queue_unplugged+0x2f/0xc0
[ 8.758088] blk_flush_plug_list+0x1f4/0x240
[ 8.758362] blk_finish_plug+0x2c/0x40
...
[ 8.770906] RIP: ide_cdrom_prep_fn+0x63/0x180 RSP: ffff92aec018bae8
[ 8.772329] ---[ end trace 6408481e551a85c9 ]---
...
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request") Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Here's one more bluetooth-next pull request for the 4.15 kernel.
- New NFA344A device entry for btusb driver
- Fix race conditions in hci_ldisc
- Fix for isochronous interface assignments in btusb driver
- A few other smaller fixes & improvements
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
====================
cxgb4: add hash-filter support to tc-flower offload
This series of patches add support to create hash-filters; a.k.a
exact-match filters, to tc-flower offload. T6 supports creating
~500K hash-filters in hw and can theoretically be expanded up to
~1 million.
Patch 1 fetches and saves the configured hw filter tuple field shifts
and filter mask.
Patch 2 initializes the driver to use hash-filter configuration.
Patch 3 adds support to create hash filters in hw.
Patch 4 adds support to delete hash filters in hw.
Patch 5 adds support to retrieve filter stats for hash filters.
Patch 6 converts the flower table to use rhashtable instead of
static hlist.
Patch 7 finally adds support to create hash filters via tc-flower
offload.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:05 +0000 (08:53 +0530)]
cxgb4: add support to create hash-filters via tc-flower offload
Determine whether the flow classifies as exact-match with respect to
4-tuple and configured tuple mask in hw. If successfully classified
as exact-match, offload the flow as hash-filter in hw.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:04 +0000 (08:53 +0530)]
cxgb4: convert flower table to use rhashtable
T6 supports ~500K hash filters and can theoretically climb up to
~1 million hash filters. Preallocated hash table is not efficient
in terms of memory usage. So, use rhashtable instead which gives
the flexibility to grow based on usage.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:03 +0000 (08:53 +0530)]
cxgb4: add support to retrieve stats for hash filters
Add support to retrieve packet-count and byte-count for hash-filters
by retrieving filter-entry appropriately based on whether the
request is for hash-filter or not.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:01 +0000 (08:53 +0530)]
cxgb4: add support to create hash filters
Add support to create hash (exact-match) filters based on the value
of 'hash' field in ch_filter_specification.
Allocate SMT/L2T entries if DMAC-rewrite/SMAC-rewrite is requested.
Allocate CLIP entry in case of IPv6 filter.
Use cpl_act_open_req[6] to send hash filter create request to hw.
Also, the filter tuple is calculated as part of sending this request.
Hash-filter reply is processed on getting cpl_act_open_rpl.
In case of success, various bits/fields in filter-tcb are set per
filter requirement, such as enabling filter hitcnts, and/or various
header rewrite operations, such as VLAN-rewrite, NAT or
(L3/L4)-rewrite, and SMAC/DMAC-rewrite. In case of failure, clear the
filter entry and release any hw resources occupied by it.
The patch also moves the functions set_tcb_field, set_tcb_tflag and
configure_filter_smac towards beginning of file.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: dsa: lan9303: Fix STP and flooding issues
This patch set finishes the STP support, and fixes flooding issues.
Patch 1 fixes a flooding issue in the previous patch set.
Patch 2 finishes STP support by adding an ALR entry.
Patch 3 prevents duplicate flooding in HW and SW bridge.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 31 Oct 2017 14:48:02 +0000 (15:48 +0100)]
net: dsa: lan9303: lan9303_rcv set skb->offload_fwd_mark
The chip floods broadcast and unknown multicast frames.
On receive, set skb->offload_fwd_mark to prevent the SW from flooding to the
same ports.
One exception: Because the ALR is set up to forward STP BPDUs only to CPU,
the SW bridge should flood STP BPDUs if local STP is not enabled.
This is achieved by not setting skb->offload_fwd_mark on STP BPDUs.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 31 Oct 2017 14:48:01 +0000 (15:48 +0100)]
net: dsa: lan9303: Add STP ALR entry on port 0
STP BPDUs arriving on user ports must be sent to the CPU port only,
for processing by the SW bridge.
Add an ALR entry with STP state override to fix that.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 31 Oct 2017 14:48:00 +0000 (15:48 +0100)]
net: dsa: lan9303: Transmit using ALR when unicast
lan9303_xmit_use_arl() introduced in the previous patch set is wrong.
The chip floods broadcast and unknown multicast frames. The effect is that
broadcasts and multicasts are duplicated on egress. It is not possible to
configure the chip to direct unknown multicasts to CPU port only.
This means that only unicast frames can be transmitted using ALR lookup.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 14:37:55 +0000 (14:37 +0000)]
net: thunderx: remove a couple of redundant assignments
The assignment to pointer msg is redundant as it is never read, so
remove msg. Also remove the first assignment to qset as this is not
read before the next re-assignment of a new value to qset in the
for-loop. Cleans up two clang warnings:
drivers/net/ethernet/cavium/thunder/nic_main.c:589:2: warning: Value
stored to 'msg' is never read
drivers/net/ethernet/cavium/thunder/nic_main.c:611:2: warning: Value
stored to 'qset' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 31 Oct 2017 14:29:47 +0000 (14:29 +0000)]
sfc: support rx-fcs and rx-all
Ethernet FCS inclusion (rx-fcs) is supported on EF10 NICs, conditional on
a firmware capability bit (MC_CMD_GET_CAPABILITIES_OUT_RX_INCLUDE_FCS).
To receive frames with bad FCS (rx-all) we just don't return the discard
flag EFX_RX_PKT_DISCARD from efx_ef10_handle_rx_event_errors() or
efx_farch_handle_rx_not_ok().
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 14:23:24 +0000 (14:23 +0000)]
net: macb: remove redundant assignment to variable work_done
Variable work_done is set to zero and this value is never read, instead
it is set to another value a few statements later. Remove the redundant
assignment. Cleans up clang warning:
drivers/net/ethernet/cadence/macb_main.c:1221:2: warning: Value stored
to 'work_done' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Tested-by: Alexander Dahl <ada@thorsis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Tue, 31 Oct 2017 13:32:38 +0000 (14:32 +0100)]
ipv4: fix validate_source for VRF setup
David reported breakages of VRF scenarios due to the
commit 6e617de84e87 ("net: avoid a full fib lookup when rp_filter is
disabled."): the local addresses based test is too strict when VRFs
are in place.
With this change we fall-back to a full lookup when custom fib rules
are in place; so that we address the VRF use case and possibly other
similar issues in non trivial setups.
v1 -> v2:
- fix build breakage when CONFIG_IP_MULTIPLE_TABLES is not defined,
reported by the kbuild test robot
Reported-by: David Ahern <dsahern@gmail.com> Fixes: 6e617de84e87 ("net: avoid a full fib lookup when rp_filter is disabled.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Tue, 31 Oct 2017 13:28:16 +0000 (13:28 +0000)]
sctp: fix error return code in sctp_send_add_streams()
Fix to return error code -ENOMEM from the sctp_make_strreset_addstrm()
error handling case instead of 0. 'retval' can be overwritten to 0 after
calling sctp_stream_alloc_out().
Fixes: e090abd0d81c ("sctp: factor out stream->out allocation") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
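The bug class described above, an error code overwritten by an earlier successful call, can be sketched like this. alloc_streams, make_chunk and add_streams are hypothetical stand-ins and do not reproduce the sctp code.

```c
#include <errno.h>
#include <stddef.h>

static char chunk_storage[16];

/* Hypothetical first step: succeeds and returns 0. */
static int alloc_streams(void)
{
	return 0;
}

/* Hypothetical second step: returns NULL to simulate allocation failure. */
static void *make_chunk(int fail)
{
	return fail ? NULL : chunk_storage;
}

static int add_streams(int fail_chunk)
{
	int retval = -ENOMEM;	/* initial default */
	void *chunk;

	retval = alloc_streams();	/* overwrites the default with 0 */
	if (retval)
		goto out;

	chunk = make_chunk(fail_chunk);
	if (!chunk) {
		retval = -ENOMEM;	/* without this, 0 would be returned */
		goto out;
	}
out:
	return retval;
}
```

Setting the error code again right at the failure site keeps the return value correct regardless of what earlier calls stored in retval.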
Colin Ian King [Tue, 31 Oct 2017 12:01:47 +0000 (12:01 +0000)]
net: hso: remove redundant unused variable dev
The pointer dev is being assigned but is never used, hence it is
redundant and can be removed. Cleans up clang warning:
drivers/net/usb/hso.c:2280:2: warning: Value stored to 'dev' is
never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Gao Feng [Tue, 31 Oct 2017 10:25:37 +0000 (18:25 +0800)]
ppp: Destroy the mutex when cleanup
mutex_destroy() only makes sense when DEBUG_MUTEX is enabled. For
better readability, it is better to invoke it in the exit function when
the init function invokes mutex_init().
Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 10:08:23 +0000 (10:08 +0000)]
net: ethernet: slicoss: remove redundant initialization of idx
Variable idx is being initialized and later on over-written by
a new value in a do-loop without the initial value ever being
read. Hence the initialization is redundant and can be removed.
Cleans up clang warning:
drivers/net/ethernet/alacritech/slicoss.c:358:15: warning: Value
stored to 'idx' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 Oct 2017 06:08:20 +0000 (23:08 -0700)]
tcp: fix tcp_mtu_probe() vs highest_sack
Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.
If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.
Bad things would then happen, including infinite loops.
This patch renames tcp_highest_sack_combine() and uses it
from tcp_mtu_probe() to fix the bug.
Note that I also removed one test against tp->sacked_out,
since we want to replace tp->highest_sack regardless of whatever
condition, since keeping a stale pointer to freed skb is a recipe
for disaster.
Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Reported-by: Roman Gushchin <guro@fb.com> Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 Oct 2017 05:47:09 +0000 (22:47 -0700)]
ipv6: addrconf: increment ifp refcount before ipv6_del_addr()
In the (unlikely) event fixup_permanent_addr() returns a failure,
addrconf_permanent_addr() calls ipv6_del_addr() without the
mandatory call to in6_ifa_hold(), leading to a refcount error,
spotted by syzkaller :
WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
lib/refcount.c:227
Kernel panic - not syncing: panic_on_warn set ...
Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
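A simplified sketch of why the hold is needed before the delete. The structure and helpers are hypothetical, use a plain integer instead of the kernel's refcount_t, and assume (as the commit message implies) that the delete helper drops one reference the caller must own.

```c
/* Hypothetical address object with a plain integer reference count. */
struct addr_entry {
	int refcnt;
	int deleted;
};

static void addr_hold(struct addr_entry *a)
{
	a->refcnt++;
}

static void addr_put(struct addr_entry *a)
{
	a->refcnt--;
}

/* Hypothetical delete helper: marks the entry deleted and drops the
 * reference that the caller passed in. */
static void addr_del(struct addr_entry *a)
{
	a->deleted = 1;
	addr_put(a);
}

/* Failure path: take a reference first, so addr_del() drops that one
 * and not a reference owned by someone else. */
static void handle_fixup_failure(struct addr_entry *a)
{
	addr_hold(a);
	addr_del(a);
}
```

Without the addr_hold() call the count would end up one too low, which is the kind of imbalance the refcount warning reports.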
David S. Miller [Wed, 1 Nov 2017 12:15:09 +0000 (21:15 +0900)]
Merge branch 'PHYLINK-cosmetic-and-build-fixes'
Florian Fainelli says:
====================
PHYLINK cosmetic and build fixes
Please find two small "fixes" one that corrects some stylistic changes and
another one that fixes an actual build failure in sfp.c. Since PHYLINK is
not directly visible to user, and there are no in-tree users yet (coming)
this is not targeted at "net" but "net-next" instead.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
bpf: document answers to common questions about BPF
To address common misconceptions about what BPF is and what it's not,
add a short BPF Q&A that clarifies core BPF design principles and
answers some common questions.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vishwanath Pai [Mon, 30 Oct 2017 23:38:52 +0000 (19:38 -0400)]
net: display hw address of source machine during ipv6 DAD failure
This patch updates the error messages displayed in kernel log to include
hwaddress of the source machine that caused ipv6 duplicate address
detection failures.
Examples:
a) When we receive a NA packet from another machine advertising our
address:
ICMPv6: NA: 34:ab:cd:56:11:e8 advertised our address 2001:db8:: on eth0!
b) When we detect DAD failure during address assignment to an interface:
IPv6: eth0: IPv6 duplicate address 2001:db8:: used by 34:ab:cd:56:11:e8
detected!
v2:
Changed %pI6 to %pI6c in ndisc_recv_na()
Changed the v6 address in the commit message to 2001:db8::
Suggested-by: Igor Lubashev <ilubashe@akamai.com> Signed-off-by: Vishwanath Pai <vpai@akamai.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Craig Gallek [Mon, 30 Oct 2017 22:50:11 +0000 (18:50 -0400)]
tun/tap: sanitize TUNSETSNDBUF input
Syzkaller found several variants of the lockup below by setting negative
values with the TUNSETSNDBUF ioctl. This patch adds a sanity check
to both the tun and tap versions of this ioctl.
Fixes: 33dccbb050bb ("tun: Limit amount of queued packets per device") Fixes: 20d29d7a916a ("net: macvtap driver") Signed-off-by: Craig Gallek <kraig@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
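A minimal sketch of this kind of input check. set_sndbuf is a hypothetical helper, and the exact bound it enforces (rejecting zero and negative values) is an assumption, not a quote of the tun/tap ioctl code.

```c
#include <errno.h>

/* Hypothetical setter: reject non-positive send-buffer sizes and leave
 * the current value untouched on error. */
static int set_sndbuf(int *cur, int requested)
{
	if (requested <= 0)
		return -EINVAL;

	*cur = requested;
	return 0;
}
```

Validating before storing means a bad value from userspace never reaches the code that later does arithmetic with it.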
David S. Miller [Wed, 1 Nov 2017 11:46:41 +0000 (20:46 +0900)]
Merge branch 'netrom-cleanups'
Gustavo A. R. Silva says:
====================
netrom: refactor code and mark expected switch fall-throughs
The aim of this patchset is firstly to refactor code in nr_route.c in order to make it
easier to read and maintain and, secondly, to mark some expected switch fall-throughs
in preparation to enabling -Wimplicit-fallthrough.
I have to mention that I did not implement any unit test.
If someone has any suggestions on how I could test this piece of code
it'd be greatly appreciated.
Changes in v2:
- Make use of the swap macro and remove inline keyword as suggested by
Walter Harms and Kevin Dawson.
Changes in v3:
- Update subject for both patches.
- Add this cover letter as suggested by David Miller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 1 Nov 2017 11:10:42 +0000 (12:10 +0100)]
mlxsw: i2c: Fix buffer increment counter for write transaction
Fix a problem with the last chunk: when 'chunk_size' is smaller than
MLXSW_I2C_BLK_MAX, data is copied to the wrong offset, overwriting
previous data.
Fixes: 6882b0aee180 ("mlxsw: Introduce support for I2C bus") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
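The offset problem can be illustrated with a small userspace model of a chunked copy. This is a sketch under assumptions, not the driver code: BLK_MAX is a hypothetical stand-in for MLXSW_I2C_BLK_MAX, and copy_in_chunks() is an invented helper. The point is that the offset must advance by the bytes already copied, not by anything derived from the size of the current (possibly shorter) chunk.

```c
#include <stddef.h>
#include <string.h>

#define BLK_MAX 4 /* hypothetical stand-in for MLXSW_I2C_BLK_MAX */

/*
 * Copy len bytes from src to dst in chunks of at most BLK_MAX bytes.
 * The running offset grows by the number of bytes already copied, so
 * a short final chunk lands right after the previous one.
 * Returns the number of chunks used.
 */
static size_t copy_in_chunks(unsigned char *dst, const unsigned char *src,
                             size_t len)
{
    size_t off = 0;
    size_t chunks = 0;

    while (off < len) {
        size_t chunk_size = len - off;

        if (chunk_size > BLK_MAX)
            chunk_size = BLK_MAX;

        memcpy(dst + off, src + off, chunk_size);
        off += chunk_size;
        chunks++;
    }

    return chunks;
}
```

With 10 bytes and a 4-byte block, the last chunk is 2 bytes long and belongs at offset 8. An offset computed as chunk index times the current chunk size would instead give 2 * 2 = 4 and clobber earlier data, which is one way a bug of this shape can arise.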
1) Fix a memleak when a packet matches a policy
without a matching state.
2) Reset the socket cached dst_entry when inserting
a socket policy, otherwise the policy might be
ignored. From Jonathan Basseri.
3) Fix GSO for an IPsec, GRE tunnel combination.
We reset the encapsulation field of the skb
too early, as a result GRE does not segment
GSO packets. Fix this by resetting the
encapsulation field right before the
transformation where the inner headers get
invalid.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 30 Oct 2017 21:06:45 +0000 (14:06 -0700)]
net: tipc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: Jon Maloy <jon.maloy@ericsson.com> Cc: Ying Xue <ying.xue@windriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: tipc-discussion@lists.sourceforge.net Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
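The pattern behind these timer conversions can be modelled in userspace: the callback receives a pointer to the timer itself and recovers the structure that embeds it. Everything below is a simplified stand-in with hypothetical fake_* names; the kernel's struct timer_list, timer_setup() and from_timer() carry more state and arguments than this sketch.

```c
#include <stddef.h>

/* Simplified stand-in for a timer that hands itself to its callback. */
struct fake_timer {
    void (*function)(struct fake_timer *t);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define fake_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical object that embeds its timer, as drivers commonly do. */
struct fake_link {
    int timeouts;
    struct fake_timer timer;
};

/* Register the callback; no separate opaque data argument is stored. */
static void fake_timer_setup(struct fake_timer *t,
                             void (*fn)(struct fake_timer *))
{
    t->function = fn;
}

/* The callback gets the timer pointer and derives its container. */
static void fake_link_timeout(struct fake_timer *t)
{
    struct fake_link *l = fake_container_of(t, struct fake_link, timer);

    l->timeouts++;
}
```

Because the container is derived from the timer pointer, the callback no longer needs a separately stored, untyped data value.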
Kees Cook [Mon, 30 Oct 2017 21:05:41 +0000 (14:05 -0700)]
drivers/net: tundra: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: "David S. Miller" <davem@davemloft.net> Cc: Philippe Reynes <tremyfr@gmail.com> Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com> Cc: Eric Dumazet <edumazet@google.com> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 30 Oct 2017 21:05:12 +0000 (14:05 -0700)]
drivers/net: ntb_netdev: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: Jon Mason <jdmason@kudzu.us> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Allen Hubbe <Allen.Hubbe@emc.com> Cc: linux-ntb@googlegroups.com Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Mon, 30 Oct 2017 20:50:22 +0000 (13:50 -0700)]
bpf: avoid rcu_dereference inside bpf_event_mutex lock region
During perf event attaching/detaching bpf programs,
the tp_event->prog_array change is protected by the
bpf_event_mutex lock in both attaching and detaching
functions. Although tp_event->prog_array is an RCU
pointer, rcu_dereference is not needed to access it
since the mutex lock will guarantee ordering.
Verified through "make C=2" that the sparse
locking check is still happy with the new change.
Also change the label name in perf_event_{attach,detach}_bpf_prog
from "out" to "unlock" to reflect the code action after the label.
Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: add neigh_suppress to bridge port policies
Add an entry for IFLA_BRPORT_NEIGH_SUPPRESS to bridge port policies.
Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 03:28:33 +0000 (12:28 +0900)]
Merge branch 'mvpp2-various-improvements'
Antoine Tenart says:
====================
net: mvpp2: various improvements
This series includes various patches improving the Marvell PPv2 driver.
I send them as a series to avoid any possible merge conflict.
- Patches 1 and 2 improve the initializing of the Tx and Rx FIFO.
- Patch 3 initializes the RSS table to evenly distribute the ingress
packets across multiple Rx queues based on their hashes.
- Patch 4 limits the number of TSO segments sent to the driver, to avoid
having more segments to handle than the corresponding number of
available descriptors.
- Patches 5 and 6 are cosmetic improvements.
This applies on today's net-next branch. The patches were tested
extensively (I ran iperf and http downloads in parallel, transferring
TBs of data).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:33 +0000 (11:23 +0100)]
net: mvpp2: simplify the Tx desc set DMA logic
Two functions were always used to set the DMA addresses in Tx
descriptors, because this address is split into a base+offset in the
descriptors. A mask was used to come up with the base and offset
addresses and two functions were called, mvpp2_txdesc_dma_addr_set() and
mvpp2_txdesc_offset_set().
This patch moves the base+offset calculation logic to
mvpp2_txdesc_dma_addr_set(), and removes mvpp2_txdesc_offset_set() to
simplify things.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
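A minimal sketch of folding the base+offset split into a single setter, as the commit describes. The mask value, the field names and the fake_* identifiers are assumptions made for illustration and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical mask: the low 5 bits of the address form the offset. */
#define FAKE_TX_DESC_ALIGN 0x1fULL

/* Hypothetical descriptor layout with separate base and offset fields. */
struct fake_tx_desc {
    uint64_t buf_dma_addr;  /* aligned base address */
    uint8_t packet_offset;  /* remainder below the alignment */
};

/* One setter computes both parts, so callers pass the full address. */
static void fake_txdesc_dma_addr_set(struct fake_tx_desc *desc,
                                     uint64_t dma_addr)
{
    desc->buf_dma_addr = dma_addr & ~FAKE_TX_DESC_ALIGN;
    desc->packet_offset = (uint8_t)(dma_addr & FAKE_TX_DESC_ALIGN);
}
```

Keeping the masking in one place means no caller can set the base without also setting the matching offset.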
Antoine Tenart [Mon, 30 Oct 2017 10:23:32 +0000 (11:23 +0100)]
net: mvpp2: use the aggr txq size define everywhere
Cosmetic patch using the MVPP2_AGGR_TXQ_SIZE define everywhere instead of
the size field of aggr_txq, as the size never changes and is always equal
to the MVPP2_AGGR_TXQ_SIZE define.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:31 +0000 (11:23 +0100)]
net: mvpp2: limit TSO segments and use stop/wake thresholds
Too many TSO descriptors can be required for the default queue size,
when using small MSS values for example. Prevent this by adding a
maximum number of allowed TSO segments (300). In addition, set stop and
wake thresholds: stop the queue when there is no room for one "worst
case scenario skb", and wake it up when the number of descriptors in use
is low enough.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
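The stop/wake scheme can be sketched as a small state machine. This is a model under stated assumptions: the fake_* names are hypothetical, the worst-case descriptor count is passed in by the caller, and placing the wake threshold at half the stop threshold is an arbitrary choice for the sketch, not a value from the driver.

```c
#include <stdbool.h>

/* Hypothetical Tx queue bookkeeping. */
struct fake_txq {
    int size;            /* total descriptors in the ring */
    int count;           /* descriptors currently in use */
    int stop_threshold;  /* above this, a worst-case skb no longer fits */
    int wake_threshold;  /* at or below this, restart the queue */
    bool stopped;
};

static void fake_txq_init(struct fake_txq *q, int size, int worst_case_descs)
{
    q->size = size;
    q->count = 0;
    q->stop_threshold = size - worst_case_descs;
    q->wake_threshold = q->stop_threshold / 2; /* assumption of this sketch */
    q->stopped = false;
}

/* Account for a transmitted skb; the caller must not call this while stopped. */
static void fake_txq_xmit(struct fake_txq *q, int descs)
{
    q->count += descs;
    if (q->count > q->stop_threshold)
        q->stopped = true;
}

/* Account for completed descriptors and wake the queue once usage is low. */
static void fake_txq_complete(struct fake_txq *q, int descs)
{
    q->count -= descs;
    if (q->stopped && q->count <= q->wake_threshold)
        q->stopped = false;
}
```

Using two different thresholds adds hysteresis, so the queue does not flap between stopped and running on every completion.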
Antoine Tenart [Mon, 30 Oct 2017 10:23:30 +0000 (11:23 +0100)]
net: mvpp2: initialize the RSS tables
This patch initializes the RSS tables to evenly (depending on the packets'
RSS hashes) distribute the packets across port Rx queues. This helps to
handle packets on different CPUs to improve performance, as more queues
will be used in parallel.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
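One common way to get an even spread is to fill the indirection table round-robin, so consecutive entries point at consecutive Rx queues. The sketch below assumes that approach; the table size and the fake_* names are hypothetical and not taken from the driver.

```c
#include <stdint.h>

#define FAKE_RSS_ENTRIES 32 /* hypothetical indirection table size */

/* Fill the table round-robin over nrxqs queues (nrxqs must be non-zero). */
static void fake_rss_fill(uint32_t table[FAKE_RSS_ENTRIES], uint32_t nrxqs)
{
    for (uint32_t i = 0; i < FAKE_RSS_ENTRIES; i++)
        table[i] = i % nrxqs;
}

/* Map a packet hash to an Rx queue through the indirection table. */
static uint32_t fake_rss_pick_queue(const uint32_t table[FAKE_RSS_ENTRIES],
                                    uint32_t hash)
{
    return table[hash % FAKE_RSS_ENTRIES];
}
```

With 4 queues and 32 entries, each queue owns 8 entries, so uniformly distributed hashes land on each queue equally often.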
Antoine Tenart [Mon, 30 Oct 2017 10:23:29 +0000 (11:23 +0100)]
net: mvpp2: initialize the Tx FIFO size
So far only the Rx FIFO size was initialized. For PPv2.2 the Tx FIFO
size can be set as well. This patch initializes the Tx FIFO size for
PPv2.2 controllers to 3K.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:28 +0000 (11:23 +0100)]
net: mvpp2: set the Rx FIFO size depending on the port speeds for PPv2.2
The Rx FIFO size was set to the same value for all ports. This patch
sets it depending on the maximum speed a given port can handle. This
only works for PPv2.2.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 30 Oct 2017 09:51:18 +0000 (10:51 +0100)]
mlxsw: reg: Add high and low temperature thresholds
The ASIC has the ability to generate events whenever a sensor indicates
the temperature goes above or below its high or low thresholds,
respectively.
In new firmware versions the firmware enforces a minimum of 5
degrees Celsius difference between both thresholds. Make the driver
conform to this requirement.
Note that this is required even when the events are disabled, as in
certain systems interrupts are generated via GPIO based on these
thresholds.
Fixes: 85926f877040 ("mlxsw: reg: Add definition of temperature management registers") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yotam Gigi [Mon, 30 Oct 2017 09:41:36 +0000 (11:41 +0200)]
MAINTAINERS: Update Yotam's E-mail
For the time being I will be reachable at my private e-mail address.
Update both the MAINTAINERS file and the individual modules'
MODULE_AUTHOR directives with the new address.
Signed-off-by: Yotam Gigi <yotam.gi@gmail.com> Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pan Bian [Mon, 30 Oct 2017 08:50:01 +0000 (16:50 +0800)]
net: hns: set correct return value
The function of_parse_phandle() returns a NULL pointer if it cannot
resolve a phandle property to a device_node pointer. In function
hns_nic_dev_probe(), its return value is passed to PTR_ERR to extract
the error code. However, in this case, the extracted error code will
always be zero, which is unexpected.
Signed-off-by: Pan Bian <bianpan2016@163.com> Reviewed-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
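Why the extracted code is zero can be shown with a userspace model of the error-pointer convention, in which error codes are encoded as pointer values near the top of the address space. The fake_* helpers below are simplified stand-ins for the kernel macros, and the -ENODEV chosen in fake_probe() is an assumption of this sketch, not necessarily the code the patch returns.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_MAX_ERRNO 4095

/* Simplified models of the kernel's ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *fake_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static long fake_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static int fake_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-FAKE_MAX_ERRNO;
}

/*
 * Hypothetical probe step for a lookup that signals failure with NULL.
 * Converting NULL with fake_ptr_err() would yield 0, i.e. "success",
 * so an explicit error code is returned instead.
 */
static int fake_probe(const void *node)
{
    if (!node)
        return -ENODEV;

    return 0;
}
```

NULL is not an encoded error pointer, so a NULL-returning lookup needs its own explicit error code on the failure path.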
Pan Bian [Sun, 29 Oct 2017 13:57:22 +0000 (21:57 +0800)]
net: lapbether: fix double free
The function netdev_priv() returns the private data of the device. The
memory to store the private data is allocated in alloc_netdev() and is
released in free_netdev(). Calling kfree() on the return value of
netdev_priv() after free_netdev() results in a double free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
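The ownership rule can be modelled in userspace by placing the private area in the same allocation as the device structure. This is a simplified sketch with hypothetical fake_* names; the kernel's real layout and alignment handling differ.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical device header; the private area follows it directly. */
struct fake_netdev {
    size_t priv_size;
};

/* Allocate the device and its private data as one block. */
static struct fake_netdev *fake_alloc_netdev(size_t priv_size)
{
    struct fake_netdev *dev = calloc(1, sizeof(*dev) + priv_size);

    if (dev)
        dev->priv_size = priv_size;
    return dev;
}

/* The private pointer is an interior pointer, not its own allocation. */
static void *fake_netdev_priv(struct fake_netdev *dev)
{
    return dev + 1;
}

/* Releasing the device releases the private area with it. */
static void fake_free_netdev(struct fake_netdev *dev)
{
    free(dev);
}
```

Since the private pointer lies inside the device's block, freeing the device is the only release needed; freeing the private pointer separately would release memory that is not an independent allocation.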