Commit 813b3b5db83 (ipv4: Use caller's on-stack flowi as-is
in output route lookups.) introduces another regression, very
similar to the one that commit e6b45241c (ipv4: reset flowi
parameters on route connect) fixes.
Before we call ip_route_output_key() in sctp_v4_get_dst() to
get a dst that matches a bind address as the source address,
we have already called this function once, and the flowi
parameters, including flowi4_oif, have been initialized. When
we call the function again, __ip_route_output_key() takes a
different path because flowi4_oif is set, and we get the network
device that corresponds to the given flowi4_oif as the output
device. This is wrong, because we never reach this code if the
source address of the previously returned dst matches one of the
bound addresses.
To reproduce this problem, a vlan setting is enough:
# ifconfig eth0 up
# route del default
# vconfig add eth0 2
# vconfig add eth0 3
# ifconfig eth0.2 10.0.1.14 netmask 255.255.255.0
# route add default gw 10.0.1.254 dev eth0.2
# ifconfig eth0.3 10.0.0.14 netmask 255.255.255.0
# ip rule add from 10.0.0.14 table 4
# ip route add table 4 default via 10.0.0.254 src 10.0.0.14 dev eth0.3
# sctp_darn -H 10.0.0.14 -P 36422 -h 10.1.4.134 -p 36422 -s -I
You will see that all flows are routed to eth0.2 (10.0.1.254).
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the ability for arc_emac to actually manage its supply clock.
To obtain the required clock frequency, either a real clock or the previous
clock-frequency property must be provided.
Signed-off-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Max Schwarz <max.schwarz@online.de> Signed-off-by: David S. Miller <davem@davemloft.net>
bridge: Handle IFLA_ADDRESS correctly when creating bridge device
When bridge device is created with IFLA_ADDRESS, we are not calling
br_stp_change_bridge_id(), which leads to incorrect local fdb
management and bridge id calculation, and prevents us from receiving
frames on the bridge device.
Reported-by: Tom Gundersen <teg@jklm.no> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Fri, 25 Apr 2014 01:05:03 +0000 (18:05 -0700)]
i40e: fix Timesync Tx interrupt handler code
This patch fixes the PTP Tx timestamp interrupt handler. The original
code misinterpreted the interrupt handler design. We were clearing the
ena_mask bit for the Timesync interrupts. This is done to indicate that
the interrupt will be handled in a scheduled work item (instead of
immediately) and that work item is responsible for re-enabling the
interrupts. However, the Tx timestamp was being handled immediately and
nothing was ever re-enabling it. This resulted in a single interrupt
working for the life of the driver.
This patch fixes the issue by instead clearing the bit from icr0 which
is used to indicate that the interrupt was immediately handled and can
be re-enabled right away. This patch also clears up a related issue due
to writing the PRTTSYN_STAT_0 register, which was unintentionally
clearing the cause bits for Timesync interrupts.
Change-ID: I057bd70d53c302f60fab78246989cbdfa469d83b Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Anjali Singhai Jain <anjali.singhai@intel.com> Acked-by: Shannon Nelson <shannon.nelson@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 26 Apr 2014 16:29:19 +0000 (12:29 -0400)]
Merge tag 'linux-can-fixes-for-3.15-20140424' of git://gitorious.org/linux-can/linux-can
Marc Kleine-Budde says:
====================
this is a pull request for net/master, for the v3.15 release cycle, consisting
of 26 patches.
Thomas Gleixner contributes 21 patches for the c_can driver, which address
several shortcomings in the driver like hardware initialisation, concurrency,
message ordering and poor performance. Two patches are by Oliver Hartkopp: one
adds a missing lock to the sja1000_isa driver, the other fixes the return value
in the generic bit time configuration function. And finally a patch by Alexander
Stein, that fixes the slcan driver to use the correct spinlock variant.
To make it 26 patches, there is a patch by Wolfgang Grandegger for the
c_can_pci driver, which enables the bus master only for MSI, and a patch by
Wolfram Sang, which converts the 'instance' in the c_can driver to the
proper type.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 26 Apr 2014 16:27:14 +0000 (12:27 -0400)]
Merge branch 'altera_tse'
Vince Bridgers says:
====================
This series of patches addresses a handful of issues found in testing
and reported by users of the Altera Triple Speed Ethernet soft IP.
The patches address the following issues (in summary)
1) The SGDMA soft IP was found to incorrectly process receive packets
when the target physical address of the receive buffer was on
a boundary that's not 32-bit aligned. One of the patches addresses
this issue.
2) The pause quanta was not being set by the driver. One patch of this
series sets the pause quanta to the IEEE defined default value
since the hardware reset value is 0.
3) An issue in an error recovery path of the probe routine caused a
kernel panic in the event a phy was probed and could not be found.
A patch addresses this issue.
4) A change was made to the driver name for Ethtool support, and
comments added to support an addition to Ethtool to support
the Altera Triple Speed Ethernet controller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the name used by Ethtool to something more
conventional in preparation for TSE Ethtool register dump
support to be added in the near future.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Altera TSE: Fix Panic in probe routine when phy probe fails
This patch addresses a fault in the error recovery path of the probe
routine where the netdev structure was not being unregistered properly
leading to a panic only when the phy probe failed.
Abbreviated panic stack seen is as follows:
(free_netdev+0xXX) from (altera_tse_probe+0xXX)
(altera_tse_probe+0xXX) from (platform_drv_probe+0xXX)
(platform_drv_probe+0xXX) from (driver_probe_device+0xXX)
(driver_probe_device+0xXX) from (__driver_attach+0xXX)
(__driver_attach+0xXX) from (bus_for_each_dev+0xXX)
(bus_for_each_dev+0xXX) from (driver_attach+0xXX)
(driver_attach+0xXX) from (bus_add_driver+0xXX)
(bus_add_driver+0xXX) from (driver_register+0xXX)
(driver_register+0xXX) from (__platform_driver_register+0xXX)
(__platform_driver_register+0xXX) from (altera_tse_driver_init+0xXX)
(altera_tse_driver_init+0xXX) from (do_one_initcall+0xXX)
(do_one_initcall+0xXX) from (kernel_init_freeable+0xXX)
(kernel_init_freeable+0xXX) from (kernel_init+0xXX)
(kernel_init+0xXX) from (ret_from_fork+0xXX)
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Altera TSE: Work around unaligned DMA receive packet issue with Altera SGDMA
This patch works around a recently discovered unaligned receive DMA problem
with the Altera SGDMA. The Altera SGDMA component cannot be configured to
DMA data to unaligned addresses for receive packet operations from the
Triple Speed Ethernet component because of a potential data transfer
corruption that can occur. This patch addresses this issue by
utilizing the shift 16 bits feature of the Altera Triple Speed Ethernet
component and modifying the receive buffer physical addresses accordingly
such that the target receive DMA address is always aligned on a 32-bit
boundary.
Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com> Tested-by: Matthew Gerlach <mgerlach@altera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
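The alignment arithmetic behind the workaround above can be sketched as follows. This is only an illustration, assuming the 16-bit shift places two padding bytes ahead of the received frame (as the feature name suggests) and that the receive buffer itself starts on a 32-bit boundary. The function and macro names are hypothetical and are not taken from the driver.

```c
/* Hypothetical names; a model of the arithmetic, not driver code. */
#define TOY_ETH_HLEN    14u  /* length of an Ethernet header in bytes */
#define TOY_SHIFT16_PAD  2u  /* assumed padding added by the 16-bit shift */

/*
 * Offset of the data that follows the Ethernet header, measured from a
 * receive buffer that starts on a 32-bit boundary.
 */
unsigned int rx_payload_offset(int shift16_enabled)
{
    unsigned int pad = shift16_enabled ? TOY_SHIFT16_PAD : 0u;

    return pad + TOY_ETH_HLEN;
}
```

Without the shift the offset is 14, which is not a multiple of 4. With the shift it is 16, so the DMA target can stay 32-bit aligned while the data after the Ethernet header is aligned as well.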
David S. Miller [Sat, 26 Apr 2014 16:16:18 +0000 (12:16 -0400)]
Merge branch 'bnx2x-net'
Yuval Mintz says:
====================
bnx2x: SRIOV bug fixes
This series contains 3 SRIOV bug fixes, 2 of which are regressions starting
with commit 2dc33bbc "bnx2x: Remove the sriov VFOP mechanism".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Starting with commit 2dc33bbc "bnx2x: Remove the sriov VFOP mechanism",
the bnx2x started enforcing vlan credits for all vlan configurations.
This exposed 2 issues:
- Vlan credits are not returned once a VF is removed; this causes a leak
of credits, and eventually will lead to VFs with no vlan credits.
- A vlan credit must be set aside for the Hypervisor to use, and should
not be visible to the VF.
Although linux VFs at the moment do not support vlan configuration [from the
VF side] which causes them to be resilient to this sort of issue, Windows VF
over linux hypervisors might fail to load as the vlan credits become depleted.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When removing a VF interface, the driver fails to release that VF's mailbox
and bulletin board allocated memory.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When the ipv6 fib changes during a table dump, the walk is
restarted and the nodes already dumped are skipped. But the existing
code doesn't advance to the next node after a node is skipped. This can
cause the dump to loop or produce lots of duplicates when the fib
is modified during the dump.
This change advances the walk to the next node if the current node is
skipped after a restart.
Signed-off-by: Kumar Sundararajan <kumar@fb.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
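The skip-and-advance logic described above can be illustrated with a much simplified model. This sketch walks a flat array instead of the fib6 tree, and the function name and parameters are hypothetical. It only shows why the walker has to move on after consuming a skip.

```c
#include <stddef.h>

/*
 * Resume a dump over 'nodes' after a restart: the first 'skip' nodes were
 * already dumped and must not be emitted again. Returns the number of
 * nodes written to 'out'.
 */
size_t walk_resume(const int *nodes, size_t n, size_t skip, int *out)
{
    size_t i = 0;
    size_t emitted = 0;

    while (i < n) {
        if (skip) {
            skip--;
            i++;    /* the fix: advance past the skipped node */
            continue;
        }
        out[emitted++] = nodes[i++];
    }
    return emitted;
}
```

If the `i++` in the skip branch were missing, the skip counter would run down while the walker stayed on the same node, so already-dumped nodes would be emitted again.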
Alexander Stein [Tue, 15 Apr 2014 14:51:04 +0000 (16:51 +0200)]
can: slcan: Fix spinlock variant
slc_xmit is called within softirq context and locks sl->lock, but
slcan_write_wakeup does not run in softirq context, so we need to use
spin_[un]lock_bh!
Detected using kernel lock debugging mechanism.
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Oliver Hartkopp [Tue, 15 Apr 2014 17:30:00 +0000 (19:30 +0200)]
can: sja1000_isa: add locking for indirect register access mode
When accessing the SJA1000 controller registers in the indirect access mode,
writing the register number and reading/writing the data has to be an atomic
operation.
As the sja1000_isa driver is an old style driver with a fixed number of
instances, the locking variable depends on the same index as all the other
configuration elements given on the module command line.
As a positive side effect dev->dev_id is populated by the instance index,
which was missing in 3e66d0138c05d9 ("can: populate netdev::dev_id for udev
discrimination").
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
can: c_can_pci: enable PCI bus master only for MSI
Coverity complains that c_can_pci_probe() calls pci_enable_msi() without
checking the result:
CID 712278 (#1 of 1): Unchecked return value (CHECKED_RETURN) 3. check_return:
Calling pci_enable_msi_block without checking return value (as is done
elsewhere 88 out of 105 times).
88 pci_enable_msi(pdev);
This is CID 712278.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Reported-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Wolfram Sang [Thu, 17 Apr 2014 08:57:18 +0000 (10:57 +0200)]
can: c_can: use proper type for 'instance'
Commit 6439fbce1075 (can: c_can: fix error checking of priv->instance in
probe()) found the warning but applied a suboptimal solution. Since both
pdev->id and of_alias_get_id() return integers, it makes sense to convert the
variable to an integer and avoid the cast.
Signed-off-by: Wolfram Sang <wsa@sang-engineering.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:22 +0000 (08:13 +0000)]
can: c_can: Speed up tx buffer invalidation
It's sufficient to kill the TXIE bit in the message control register
even if the documentation of C and D CAN says that it's not allowed to
do that while MSGVAL is set. Reality tells a different story and this
change gives us another 2% of CPU back for not waiting on I/O.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:22 +0000 (08:13 +0000)]
can: c_can: Remove tx locking
Mark suggested to use one IF for the softirq and the other for the
xmit function to avoid the xmit lock.
That requires writing the frame into the interface first, then handling
the echo skb and storing the dlc before committing the TX request to the
message ram.
We use an atomic to handle the active buffers instead of reading the
MSGVAL register, as that's much faster, especially on PCH/x86.
Suggested-by: Mark <mark5@del-llc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
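Tracking active buffers in an atomic rather than reading them back from hardware can be sketched in userspace with C11 atomics. This is a minimal model under stated assumptions: the kernel uses its own atomic_t API, and the variable and function names below are hypothetical rather than the driver's.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One bit per TX buffer; a set bit means the buffer is in flight. */
static atomic_uint toy_tx_active;

/* Called on the transmit path when buffer 'idx' is handed to the hardware. */
void toy_tx_mark_busy(unsigned int idx)
{
    atomic_fetch_or(&toy_tx_active, 1u << idx);
}

/* Called on the completion path when buffer 'idx' has been sent. */
void toy_tx_mark_done(unsigned int idx)
{
    atomic_fetch_and(&toy_tx_active, ~(1u << idx));
}

/* Cheap memory read instead of a slow register access. */
bool toy_tx_is_busy(unsigned int idx)
{
    return (atomic_load(&toy_tx_active) >> idx) & 1u;
}
```

Both paths update the same word without a lock, which is what allows the transmit side and the completion side to run concurrently.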
Thomas Gleixner [Fri, 11 Apr 2014 08:13:21 +0000 (08:13 +0000)]
can: c_can: Use proper u32 variables in c_can_write_msg_object()
Instead of obfuscating the code by artificial 16 bit splits use the
proper 32 bit assignments and split the result when writing to the
interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
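The approach of computing a full 32-bit value and splitting it only at the point of the register write can be sketched as below. The struct is a hypothetical stand-in for a pair of 16-bit interface registers and does not reflect the driver's actual register layout.

```c
#include <stdint.h>

/* Hypothetical stand-in for two adjacent 16-bit interface registers. */
struct toy_arb_regs {
    uint16_t arb1;  /* low half */
    uint16_t arb2;  /* high half */
};

/* Build the value as one u32 elsewhere; split it only when writing. */
void toy_write_arb(struct toy_arb_regs *regs, uint32_t arb)
{
    regs->arb1 = (uint16_t)(arb & 0xffffu);
    regs->arb2 = (uint16_t)(arb >> 16);
}
```

The caller can then assemble identifiers and flags with ordinary 32-bit operations instead of juggling two 16-bit halves throughout the function.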
Thomas Gleixner [Fri, 11 Apr 2014 08:13:21 +0000 (08:13 +0000)]
can: c_can: Cleanup c_can_write_msg_object()
Remove the MASK from the TX transfer side.
Make the code readable and get rid of the annoying IFX_WRITE_XXX_16BIT
macros which are just obfuscating the code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:20 +0000 (08:13 +0000)]
can: c_can: Cleanup c_can_msg_obj_put/get()
Sigh!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:19 +0000 (08:13 +0000)]
can: c_can: Cleanup c_can_inval_msg_object()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:18 +0000 (08:13 +0000)]
can: c_can: Cleanup setup of receive buffers
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:18 +0000 (08:13 +0000)]
can: c_can: Cleanup c_can_read_msg_object()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:17 +0000 (08:13 +0000)]
can: c_can: Cleanup irq enable/disable
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:17 +0000 (08:13 +0000)]
can: c_can: Work around C_CAN RX wreckage
Alexander reported that the new optimized handling of the RX fifo
causes random packet loss on Intel PCH C_CAN hardware.
After a few fruitless debugging sessions I got hold of a PCH (eg20t)
afflicted system. That machine does not have the CAN interface wired
up, but it was possible to reproduce the issue with the HW loopback
mode.
As Alexander observed correctly, clearing the NewDat flag along with
reading out the message buffer causes that issue on C_CAN, while D_CAN
handles that correctly.
Instead of restoring the original message buffer handling horror the
following workaround solves the issue:
transfer buffer to IF without clearing the NewDat
handle the message
clear NewDat bit
That's similar to the original code but conditional for C_CAN.
I really wonder why all user manuals (C_CAN, Intel PCH and some more)
recommend to clear the NewDat bit right away. The knows it all Oracle
operated by Gurgle does not unearth any useful information either. I
simply cannot believe that we are the first to uncover that HW issue.
Reported-and-tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:16 +0000 (08:13 +0000)]
can: c_can: Disable rx split as workaround
The RX buffer split causes packet loss in the hardware:
What happens is:
RX Packet 1 --> message buffer 1 (newdat bit is not cleared)
RX Packet 2 --> message buffer 2 (newdat bit is not cleared)
RX Packet 3 --> message buffer 3 (newdat bit is not cleared)
RX Packet 4 --> message buffer 4 (newdat bit is not cleared)
RX Packet 5 --> message buffer 5 (newdat bit is not cleared)
RX Packet 6 --> message buffer 6 (newdat bit is not cleared)
RX Packet 7 --> message buffer 7 (newdat bit is not cleared)
RX Packet 8 --> message buffer 8 (newdat bit is not cleared)
Clear newdat bit in message buffer 1
Clear newdat bit in message buffer 2
Clear newdat bit in message buffer 3
Clear newdat bit in message buffer 4
Clear newdat bit in message buffer 5
Clear newdat bit in message buffer 6
Clear newdat bit in message buffer 7
Clear newdat bit in message buffer 8
Now if during that clearing of newdat bits, a new message comes in,
the HW gets confused and drops it.
It does not matter how many of them you clear. I put a delay between
clear of buffer 1 and buffer 2 which was long enough that the message
should have been queued either in buffer 1 or buffer 9. But it did not
show up anywhere. The next message ended up in buffer 1. So the
hardware lost a packet of course without telling it via one of the
error handlers.
That does not happen on all clear newdat bit events. I see one of 10k
packets dropped in the scenario which allows us to reproduce. But the
trace looks always the same.
Not splitting the RX Buffer avoids the packet loss but can cause
reordering. It's hard to trigger, but it CAN happen.
With that mode we use the HW as it was probably designed for. We read
from the buffer 1 upwards and clear the buffer as we get the
message. That's how all microcontrollers use it. So I assume that the
way we handle the buffers was never really tested. According to the
public documentation it should just work :)
Let the user decide which evil is the lesser one.
[ Oliver Hartkopp: Provided a sane config option and help text and
made me switch to favour potential and unlikely reordering over
packet loss ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:15 +0000 (08:13 +0000)]
can: c_can: Get rid of pointless interrupts
The driver handles pointlessly TWO interrupts per packet. The reason
is that it enables the status interrupt which fires for each rx and tx
packet and it enables the per message object interrupts as well.
The status interrupt merely acks or in case of D_CAN ignores the TX/RX
state and then the message object interrupt fires.
The message objects interrupts are only useful if all message objects
have hardware filters activated.
But we don't have that and it's not simple to implement in that driver
without rewriting it completely.
So we can ditch the message object interrupts and handle the RX/TX
right away from the status interrupt. Instead of TWO we handle ONE.
Note: We must keep the TXIE/RXIE bits in the message buffers because
the status interrupt alone is not reliable enough in corner cases.
If we ever have the need for HW filtering, then this code needs a
complete overhaul and we can think about it then. For now we prefer a
lower interrupt load.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:15 +0000 (08:13 +0000)]
can: c_can: Avoid status register update for D_CAN
On D_CAN the RXOK, TXOK and LEC bits are cleared/set on read of the
status register. No need to update them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:14 +0000 (08:13 +0000)]
can: c_can: Simplify buffer reenabling
Instead of writing to the message object we can simply clear the
NewDat bit with the get method.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:13 +0000 (08:13 +0000)]
can: c_can: Always update error stats
If the allocation of the error skb fails, we still want to see the
error statistics.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:13 +0000 (08:13 +0000)]
can: c_can: Fix berr reporting
Reading the LEC type with
return (mode & ENABLED) && (status & LEC_MASK);
is not guaranteed to return (status & LEC_MASK) if the enabled bit in
mode is set. It's guaranteed to return 0 or !=0.
Remove the inline function and call unconditionally into the
berr_handling code and return early when the reporting is disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
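The difference between the two forms can be shown in a small standalone sketch. The constant values below are hypothetical and chosen only for illustration; they are not copied from c_can.c.

```c
#include <stdint.h>

/* Hypothetical values for illustration only. */
#define TOY_BERR_ENABLED 0x10u
#define TOY_LEC_MASK     0x07u

/* Original form: the logical AND collapses the result to 0 or 1. */
unsigned int toy_lec_buggy(uint32_t mode, uint32_t status)
{
    return (mode & TOY_BERR_ENABLED) && (status & TOY_LEC_MASK);
}

/* Return early when reporting is disabled, else the raw LEC code. */
unsigned int toy_lec_fixed(uint32_t mode, uint32_t status)
{
    if (!(mode & TOY_BERR_ENABLED))
        return 0;
    return status & TOY_LEC_MASK;
}
```

With reporting enabled and a status whose LEC field is 5, the first function returns 1 while the second returns 5, so only the second lets the caller distinguish the error types.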
Thomas Gleixner [Fri, 11 Apr 2014 08:13:12 +0000 (08:13 +0000)]
can: c_can: Handle state change correctly
If the allocation of an error skb fails, the state change handling
returns w/o doing any work. That leaves the interface in a wrecked
state as the internal status is wrong.
Split the interface handling and the skb handling.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:12 +0000 (08:13 +0000)]
can: c_can: Do not access skb after net_receive_skb()
There is no guarantee that the skb is in the same state after calling
net_receive_skb(). It might be freed or reused. Not really harmful as
it's a read access, unless you turn on the proper debugging options
which catch a use after free.
The whole can subsystem is full of this. Copy and paste ....
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
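The general pattern is to read everything needed from the buffer before ownership is handed over. The following userspace sketch models this with malloc/free. All names are hypothetical, and toy_deliver() merely stands in for a consumer that may free the buffer.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_buf {
    size_t len;
};

struct toy_stats {
    unsigned long rx_packets;
    unsigned long rx_bytes;
};

/* The consumer takes ownership and is free to release the buffer. */
void toy_deliver(struct toy_buf *buf)
{
    free(buf);
}

int toy_rx(struct toy_stats *stats, size_t len)
{
    struct toy_buf *buf = malloc(sizeof(*buf));
    size_t saved_len;

    if (!buf)
        return -1;
    buf->len = len;

    saved_len = buf->len;   /* read before handing the buffer off */
    toy_deliver(buf);       /* buf must not be touched after this */

    stats->rx_packets++;
    stats->rx_bytes += saved_len;
    return 0;
}
```

Updating the statistics from saved_len rather than buf->len keeps the accounting correct even when the consumer frees the buffer immediately.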
Thomas Gleixner [Fri, 11 Apr 2014 08:13:11 +0000 (08:13 +0000)]
can: c_can: Make bus off interrupt disable logic work
The state change handler is called with device interrupts disabled
already. So no point in disabling them again when we enter bus off
state.
But what's worse is that we reenable the interrupts at the end of NAPI
poll unconditionally. So c_can_start() which is called from the
restart timer can trigger interrupts which confuse the hell out of the
half reinitialized driver/hw.
Remove the pointless device interrupt disable in the BUS_OFF handler
and prevent reenabling the device interrupts at the end of the poll
routine when the current state is BUS_OFF.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:10 +0000 (08:13 +0000)]
can: c_can: Fix startup logic
c_can_start() enables interrupts way too early. The first enabling
happens when setting the control mode in c_can_chip_config() and then
again at the end of the function.
But that happens before napi_enable() and that means that an interrupt
which comes in will disable interrupts again and call napi_schedule,
which ignores the request, and the later napi_enable() does not make
things work either. So the interface is up with all device interrupts
disabled.
Move the device interrupt after napi_enable() and add it to the other
callsites of c_can_start() in c_can_set_mode() and c_can_power_up()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thomas Gleixner [Fri, 11 Apr 2014 08:13:10 +0000 (08:13 +0000)]
can: c_can_pci: Set the type of the IP core
All type checks in c_can.c are != BOSCH_D_CAN so nobody noticed so far
that the pci code does not update the type information.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
David S. Miller [Thu, 24 Apr 2014 17:53:01 +0000 (13:53 -0400)]
Merge branch 'rtnetlink_vf_ports'
David Gibson says:
====================
Fix problems with IFLA_VF_PORTS (v2)
I've had a customer encounter a problem with getifaddrs(3) freezing up
on a system with a Cisco enic device.
I've discovered that the problem is caused by an enic device with a
large number of SR-IOV virtual functions overflowing the normal sized
packet buffer for netlink, leading to interfaces not being reported
from an RTM_GETLINK request.
The first patch here just makes the problem easier to locate if it
occurs again in a different way, by adding a WARN_ON() when we run out
of room in a netlink packet in this manner.
The second patch actually fixes the problem, by only reporting
IFLA_VF_PORTS information when the RTEXT_FILTER_VF flag is specified.
v2: Corrected some CodingStyle problems
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Gibson [Thu, 24 Apr 2014 00:22:36 +0000 (10:22 +1000)]
rtnetlink: Only supply IFLA_VF_PORTS information when RTEXT_FILTER_VF is set
Since 115c9b81928360d769a76c632bae62d15206a94a (rtnetlink: Fix problem with
buffer allocation), RTM_NEWLINK messages only contain the IFLA_VFINFO_LIST
attribute if they were solicited by a GETLINK message containing an
IFLA_EXT_MASK attribute with the RTEXT_FILTER_VF flag.
That was done because some user programs broke when they received more data
than expected - because IFLA_VFINFO_LIST contains information for each VF
it can become large if there are many VFs.
However, the IFLA_VF_PORTS attribute, supplied for devices which implement
ndo_get_vf_port (currently the 'enic' driver only), has the same problem.
It supplies per-VF information and can therefore become large, but it is
not currently conditional on the IFLA_EXT_MASK value.
Worse, it interacts badly with the existing EXT_MASK handling. When
IFLA_EXT_MASK is not supplied, the buffer for netlink replies is fixed at
NLMSG_GOODSIZE. If the information for IFLA_VF_PORTS exceeds this, then
rtnl_fill_ifinfo() returns -EMSGSIZE on the first message in a packet.
netlink_dump() will misinterpret this as having finished the listing and
omit data for this interface and all subsequent ones. That can cause
getifaddrs(3) to enter an infinite loop.
This patch addresses the problem by only supplying IFLA_VF_PORTS when
IFLA_EXT_MASK is supplied with the RTEXT_FILTER_VF flag set.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
David Gibson [Thu, 24 Apr 2014 00:22:35 +0000 (10:22 +1000)]
rtnetlink: Warn when interface's information won't fit in our packet
Without IFLA_EXT_MASK specified, the information reported for a single
interface in response to RTM_GETLINK is expected to fit within a netlink
packet of NLMSG_GOODSIZE.
If it doesn't, however, things will go badly wrong. When listing all
interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
message in a packet as the end of the listing and omit information for
that interface and all subsequent ones. This can cause getifaddrs(3) to
enter an infinite loop.
This patch won't fix the problem, but it will WARN_ON() making it easier to
track down what's going wrong.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 24 Apr 2014 17:45:04 +0000 (13:45 -0400)]
Merge branch 'netlink-caps'
Eric W. Biederman says:
====================
netlink: Preventing abuse when passing file descriptors.
Andy Lutomirski, when looking at the networking stack, noticed that it is
possible to trick privileged processes into calling write on a netlink
socket and sending netlink messages they did not intend.
In particular from time to time there are suid applications that will
write to stdout or stderr without checking exactly what kind of file
descriptors those are and can be tricked into acting as a limited form
of suid cat. In other conversations the magic string CVE-2014-0181 has
been used to talk about this issue.
This patchset cleans things up a bit, adds some clean abstractions that
when used prevent this kind of problem and then finally changes all of
the handlers of netlink messages that I could find that call capable to
use netlink_ns_capable or an appropriate wrapper.
The abstraction netlink_ns_capable verifies that the original creator of
the netlink socket a message is sent from had the necessary capabilities
as well as verifying that the current sender of a netlink packet has the
necessary capabilities.
The idea is to prevent file descriptor passing of any form from
resulting in a file descriptor that can do more than it can for the
creator of the file descriptor.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net: Use netlink_ns_capable to verify the permisions of netlink messages
By passing a netlink socket to a more privileged executable and then
fooling that executable into writing data to the socket that happens
to be a valid netlink message, it is possible to make the privileged
executable do something it did not intend to do.
To keep this from happening, replace bare capable and ns_capable calls
with netlink_capable, netlink_net_capable and netlink_ns_capable calls,
which act the same as the previous calls except that they also verify
that the opener of the socket had the desired permissions.
Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
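The double check described above (opener and current sender must both hold the capability) can be modelled in a few lines. This is a toy userspace sketch: capabilities are reduced to a bitmask, and the struct, macro and function names are hypothetical, not the kernel's.

```c
#include <stdbool.h>

#define TOY_CAP_NET_ADMIN 0x1u  /* hypothetical capability bit */

/* Toy view of a message: who opened the socket and who is writing now. */
struct toy_nl_msg {
    unsigned int opener_caps;
    unsigned int sender_caps;
};

/* Allow the operation only if both parties hold the capability. */
bool toy_netlink_capable(const struct toy_nl_msg *msg, unsigned int cap)
{
    return (msg->opener_caps & cap) && (msg->sender_caps & cap);
}
```

In this model, a socket opened by an unprivileged process and then written to by a privileged one fails the check, which is the file descriptor passing case the series is meant to close.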
net: Add variants of capable for use on netlink messages
netlink_net_capable - The common case use, for operations that are safe on a network namespace
netlink_capable - For operations that are only known to be safe for the global root
netlink_ns_capable - The general case of capable used to handle special cases
__netlink_ns_capable - Same as netlink_ns_capable except taking a netlink_skb_parms instead of
the skbuff of a netlink message.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Add variants of capable for use on on sockets
sk_net_capable - The common case, operations that are safe in a network namespace.
sk_capable - Operations that are not known to be safe in a network namespace
sk_ns_capable - The general case for special cases.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Move the permission check in sock_diag_put_filterinfo to packet_diag_dump
The permission check in sock_diag_put_filterinfo is wrong, and it is so far
removed from its sources that it is not clear why it is wrong. Move the computation
into packet_diag_dump and pass a bool of the result into sock_diag_put_filterinfo.
This does not yet correct the capability check but instead simply moves it to make
it clear what is going on.
Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a seg fault on 'ethtool -A' entry if the
interface is down. Obviously we need to have the
phy device initialized / "connected" (see of_phy_connect())
to be able to advertise pause frame capabilities.
Fixes: 23402bddf9e56eecb27bbd1e5467b3b79b3dbe58 Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 24 Apr 2014 17:31:23 +0000 (13:31 -0400)]
Merge branch 'qlcnic-net'
Shahed Shaikh says:
====================
qlcnic: Bug fixes
This patch series contains following fixes -
* Fix a memory leak caused by issuing a mailbox
command that cannot wait for its completion.
* Reset firmware API lock which might be in inconsistent state.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
o For the QLC_83XX_MBX_CMD_NO_WAIT command type, the calling
function does not free the memory because it does not wait for a
response. So free it when a response arrives from the adapter after
sending the command.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sony Chacko [Wed, 23 Apr 2014 13:59:55 +0000 (09:59 -0400)]
qlcnic: Reset firmware API lock at driver load time
Some firmware versions fail to reset the lock during
initialization. Force reset firmware API lock during driver
probe to ensure lock availability.
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jean Delvare [Wed, 23 Apr 2014 08:40:12 +0000 (10:40 +0200)]
net: cadence: Fix architecture dependencies
I was told that the Cadence macb driver is also useful on Microblaze.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Mark Brown <broonie@kernel.org> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The existing BPF verifier allows uninitialized access to registers;
'ret A' is considered to be a valid filter.
So initialize A and X to zero to prevent leaking kernel memory.
In the future the BPF verifier will reject such filters.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Daniel Borkmann <dborkman@redhat.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
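The effect of zeroing the registers can be shown with a toy interpreter. This is a minimal sketch, not the kernel's filter code: the opcode names, struct toy_insn and toy_run() are hypothetical and cover only enough instructions to demonstrate a filter that returns A without ever loading it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced instruction set. */
enum toy_op {
    TOY_LD_IMM,   /* A = k */
    TOY_TAX,      /* X = A */
    TOY_TXA,      /* A = X */
    TOY_RET_A     /* return A */
};

struct toy_insn {
    enum toy_op op;
    uint32_t k;
};

static uint32_t toy_run(const struct toy_insn *prog, size_t len)
{
    /* Both registers are zeroed up front, so a filter that reads them
     * before writing them sees 0 rather than stale stack contents. */
    uint32_t A = 0;
    uint32_t X = 0;

    for (size_t pc = 0; pc < len; pc++) {
        switch (prog[pc].op) {
        case TOY_LD_IMM:
            A = prog[pc].k;
            break;
        case TOY_TAX:
            X = A;
            break;
        case TOY_TXA:
            A = X;
            break;
        case TOY_RET_A:
            return A;
        }
    }
    return 0;   /* fell off the end: treat as "drop" in this model */
}
```

A one-instruction program consisting only of TOY_RET_A returns 0 here; without the two initializers its result would be whatever happened to be on the stack.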
Nicolas Dichtel [Tue, 22 Apr 2014 13:01:30 +0000 (15:01 +0200)]
vxlan: ensure to advertise the right fdb remote
The goal of this patch is to fix rtnetlink notification. The main problem was
about notification for fdb entry with more than one remote. Before the patch,
when a remote was added to an existing fdb entry, the kernel advertised the
first remote instead of the added one. Also when a remote was removed from an fdb
entry with several remotes, the deleted remote was not advertised.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/phy: micrel: fix bugged test on device tree loading for ksz9021
In ksz9021_load_values_from_of() val2 to val4 aren't tested against their
initialization value.
This causes the test to always succeed, and this value to be used as if it
was loaded from the devicetree instead of being ignored, in case of a
missing/invalid property in the ethernet OF device node.
As a result, the value "0" is written to the relevant registers.
Change the conditions to test against the right initialization value.
Signed-off-by: Hubert Chaumette <hchaumette@adeneo-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
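The sentinel mix-up described above can be reproduced in a small user-space model. Everything here is a sketch under assumptions: struct toy_prop, toy_load_values(), the sentinels -1 to -4, the 4-bit field layout and the scaling step of 200 are illustrative choices, not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define NFIELDS  4
#define TOY_STEP 200   /* assumed scaling from property units to register units */

/* Hypothetical result of looking up one device tree property. */
struct toy_prop {
    bool present;   /* whether the property exists in the device node */
    int  value;     /* its value when present */
};

/* Merge up to four 4-bit fields into reg. When
 * compare_against_own_sentinel is false, every field is compared
 * against -1, which models the bug. */
static uint16_t toy_load_values(uint16_t reg,
                                const struct toy_prop props[NFIELDS],
                                bool compare_against_own_sentinel)
{
    int val[NFIELDS];

    for (int i = 0; i < NFIELDS; i++) {
        val[i] = -(i + 1);               /* sentinels -1, -2, -3, -4 */
        if (props[i].present)
            val[i] = props[i].value;
    }

    for (int i = 0; i < NFIELDS; i++) {
        int sentinel = compare_against_own_sentinel ? -(i + 1) : -1;

        if (val[i] == sentinel)
            continue;                    /* property missing: keep the field */

        reg &= (uint16_t)~(0xFu << (4 * i));
        reg |= (uint16_t)(((val[i] / TOY_STEP) & 0xF) << (4 * i));
    }
    return reg;
}
```

With only the first property present, the buggy comparison treats the untouched sentinels -2, -3 and -4 as loaded values; each divides down to 0, so three fields are overwritten with zero, while the corrected comparison leaves them alone.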
When SMC_DEBUG >= 2, we hit the following compilation error:
drivers/net/ethernet/smsc/smc91x.c:85:0:
drivers/net/ethernet/smsc/smc91x.c: In function ‘smc_findirq’:
drivers/net/ethernet/smsc/smc91x.c:1784:9: error: ‘dev’ undeclared (first use in this function)
DBG(2, dev, "%s: %s\n", CARDNAME, __func__);
^
Fix it by passing in the appropriate netdev pointer.
Signed-off-by: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amos Kong [Fri, 18 Apr 2014 05:45:41 +0000 (13:45 +0800)]
virtio_net: zero is an invald queue_pairs number
When "ethtool -L eth0 combined 0" is executed in a guest with multiqueue
enabled, virtnet_send_command() returns an -EINVAL error because
QEMU validates the value.
But if multiqueue is disabled, virtnet_set_queues() just
returns zero (success). We should return an error in this situation.
Signed-off-by: Amos Kong <akong@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
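One way to picture the intended behaviour is to validate the request before the multiqueue flag is consulted. This is a hedged sketch: struct toy_vnet and toy_set_queue_pairs() are hypothetical and do not mirror the driver's actual structures or call chain.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced view of the device state. */
struct toy_vnet {
    bool multiqueue;                 /* whether multiqueue was negotiated */
    unsigned int max_queue_pairs;    /* upper bound accepted by the device */
    unsigned int curr_queue_pairs;   /* currently active queue pairs */
};

static int toy_set_queue_pairs(struct toy_vnet *vi, unsigned int queue_pairs)
{
    /* Reject zero and out-of-range requests first, so the answer does
     * not depend on whether multiqueue happens to be enabled. */
    if (queue_pairs == 0 || queue_pairs > vi->max_queue_pairs)
        return -EINVAL;

    /* Without multiqueue there is nothing to reconfigure. */
    if (!vi->multiqueue)
        return 0;

    vi->curr_queue_pairs = queue_pairs;
    return 0;
}
```

In this model a request for zero queue pairs fails with -EINVAL on both a multiqueue and a single-queue device.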
Max Schwarz [Fri, 18 Apr 2014 00:17:32 +0000 (02:17 +0200)]
arc_emac: write initial MAC address from devicetree to hw
The MAC address retrieved from dt was not actually written to the
hardware. This meant proper communication was only possible after
changing the MAC address.
Fix that by always writing the mac address during probing.
Signed-off-by: Max Schwarz <max.schwarz@online.de> Acked-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Fix ns_capable check in sock_diag_put_filterinfo
The caller needs capabilities on the namespace being queried, not on
their own namespace. This is a security bug, although it likely has
only a minor impact.
Cc: stable@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 Apr 2014 16:58:38 +0000 (12:58 -0400)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates
This series contains updates to e1000e, igb, ixgbe and i40e.
Most notable are Jakub's patches to clean up the Rx time stamping
code for ixgbe and the fix up of debug messages with proper termination.
Jesse's i40e patch fixes an issue reported by Eric Dumazet that the
i40e driver was allowing the hardware to replicate the PSH flag on
all segments of a TSO operation. With this fix, we are now configuring
the CWR bit to only be set in the first packet of a TSO and we
enable TSO_ECN in order to advertise to the stack that we do the right
thing on the wire.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Unfortunately this contains no easter eggs, its a bit larger than I'd
like, but I included a patch that just moves code from one file to
another and I'd like to avoid merge conflicts with that later, so it
makes it seem worse than it is,
Otherwise:
- radeon: fixes to use new microcode to stabilise some cards, use
some common displayport code, some runtime pm fixes, pll regression
fixes
- i915: fix for some context oopses, a warn in a used path, backlight
fixes
- nouveau: regression fix
- omap: a bunch of fixes"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (51 commits)
drm: bochs: drop unused struct fields
drm: bochs: add power management support
drm: cirrus: add power management support
drm: Split out drm_probe_helper.c from drm_crtc_helper.c
drm/plane-helper: Don't fake-implement primary plane disabling
drm/ast: fix value check in cbr_scan2
drm/nouveau/bios: fix a bit shift error introduced by 457e77b
drm/radeon/ci: make sure mc ucode is loaded before checking the size
drm/radeon/si: make sure mc ucode is loaded before checking the size
drm/radeon: improve PLL params if we don't match exactly v2
drm/radeon: memory leak on bo reservation failure. v2
drm/radeon: fix VCE fence command
drm/radeon: re-enable mclk dpm on R7 260X asics
drm/radeon: add support for newer mc ucode on CI (v2)
drm/radeon: add support for newer mc ucode on SI (v2)
drm/radeon: apply more strict limits for PLL params v2
drm/radeon: update CI DPM powertune settings
drm/radeon: fix runpm handling on APUs (v4)
drm/radeon: disable mclk dpm on R7 260X
drm/tegra: Remove gratuitous pad field
...
Jakub Kicinski [Wed, 2 Apr 2014 10:33:22 +0000 (10:33 +0000)]
e1000e/igb/ixgbe/i40e: fix message terminations
Add \n at the end of messages where missing, remove all \r.
Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Jakub Kicinski <kubakici@wp.pl> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jakub Kicinski [Wed, 2 Apr 2014 10:33:28 +0000 (10:33 +0000)]
ixgbe: clean up Rx time stamping code
Time stamping resources are per-interface so there is no need
to keep separate last_rx_timestamp for each Rx ring, move
last_rx_timestamp to the adapter structure.
With last_rx_timestamp inside adapter, ixgbe_ptp_rx_hwtstamp()
inline function is reduced to a single if statement so it is
no longer necessary. If statement is placed directly in
ixgbe_process_skb_fields() fixing likely/unlikely marking.
Checks for q_vector or adapter to be NULL are superfluous.
Comment about taking I/O hit is a leftover from previous design.
Signed-off-by: Jakub Kicinski <kubakici@wp.pl> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dave Airlie [Sat, 19 Apr 2014 01:16:02 +0000 (11:16 +1000)]
Merge branch 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux into drm-next
Some i2c fixes over DisplayPort.
* 'drm-next-3.15-wip' of git://people.freedesktop.org/~deathsimple/linux:
drm/radeon: Improve vramlimit module param documentation
drm/radeon: fix audio pin counts for DCE6+ (v2)
drm/radeon/dp: switch to the common i2c over aux code
drm/dp/i2c: Update comments about common i2c over dp assumptions (v3)
drm/dp/i2c: send bare addresses to properly reset i2c connections (v4)
drm/radeon/dp: handle zero sized i2c over aux transactions (v2)
drm/i915: support address only i2c-over-aux transactions
drm/tegra: dp: Support address-only I2C-over-AUX transactions
e1000e: Enclose e1000e_pm_thaw() with CONFIG_PM_SLEEP
Fix following compilation warning:
drivers/net/ethernet/intel/e1000e/netdev.c:6238:12: warning
‘e1000e_pm_thaw’ defined but not used [-Wunused-function]
static int e1000e_pm_thaw(struct device *dev)
^
Signed-off-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
e1000e: Correctly include VLAN_HLEN when changing interface MTU
When changing the interface mtu, the driver starts with a value
that doesn't include VLAN_HLEN. Later tests in the driver
set the rx_buffer_len based on the mtu. As a result, when
the user increases the mtu to 1504 (to support 802.1AD for example),
the driver rx_buffer_len does not change and frames longer
than 1522 bytes are rejected as too long.
Include VLAN_HLEN from the start so that a user mtu greater than
1500 bytes is correctly reflected in the driver rx_buffer_len.
CC: e1000-devel@lists.sourceforge.net Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
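The arithmetic behind the 1522-byte limit can be worked through with the standard Ethernet constants. The two helper functions below are hypothetical and only illustrate the starting values; the exact expressions used by the driver are an assumption.

```c
#define ETH_HLEN    14  /* Ethernet header: two MAC addresses plus EtherType */
#define VLAN_HLEN    4  /* one 802.1Q tag */
#define ETH_FCS_LEN  4  /* frame check sequence */

/* Starting value that leaves the VLAN tag out (the behaviour described
 * as the problem). */
static int max_frame_without_vlan(int mtu)
{
    return mtu + ETH_HLEN + ETH_FCS_LEN;
}

/* Starting value that accounts for the VLAN tag from the beginning. */
static int max_frame_with_vlan(int mtu)
{
    return mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN;
}
```

For an MTU of 1500 the two helpers give 1518 and 1522. For an MTU of 1504 they give 1522 and 1526: without the tag, the larger MTU produces the same 1522 bytes a standard tagged frame already needs, which would explain why the receive buffer length did not grow.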
1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI
context, not synchronize it. From Chris Mason.
2) Ipv4 flow input interface should never be zero, it should be
LOOPBACK_IFINDEX instead. From Cong Wang and Julian Anastasov.
3) Properly configure MAC to PHY connection in mvneta devices, from
Thomas Petazzoni.
4) sys_recv should use SYSCALL_DEFINE. From Jan Glauber.
5) Tunnel driver ioctls do not use the correct namespace, fix from
Nicolas Dichtel.
6) Fix memory leak on seccomp filter attach, from Kees Cook.
7) Fix lockdep warning for nested vlans, from Ding Tianhong.
8) Crashes can happen in SCTP due to how the auth_enable value is
managed, fix from Vlad Yasevich.
9) Wireless fixes from John W Linville and co.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
net: sctp: cache auth_enable per endpoint
tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled
vlan: Fix lockdep warning when vlan dev handle notification
seccomp: fix memory leak on filter attach
isdn: icn: buffer overflow in icn_command()
ip6_tunnel: use the right netns in ioctl handler
sit: use the right netns in ioctl handler
ip_tunnel: use the right netns in ioctl handler
net: use SYSCALL_DEFINEx for sys_recv
net: mdio-gpio: Add support for separate MDI and MDO gpio pins
net: mdio-gpio: Add support for active low gpio pins
net: mdio-gpio: Use devm_ functions where possible
ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source()
ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll
net: mvneta: properly configure the MAC <-> PHY connection in all situations
net: phy: add minimal support for QSGMII PHY
sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)
mwifiex: fix hung task on command timeout
mwifiex: process event before command response
...
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"A set of 5 small cifs fixes"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cif: fix dead code
cifs: fix error handling cifs_user_readv
fs: cifs: remove unused variable.
Return correct error on query of xattr on file with empty xattrs
cifs: Wait for writebacks to complete before attempting write.
i40e: fix TCP flag replication for hardware offload
As reported by Eric Dumazet, the i40e driver was allowing the hardware
to replicate the PSH flag on all segments of a TSO operation.
This patch fixes the first/middle/last TCP flags settings which
makes the TSO operations work correctly.
With this change we are now configuring the CWR bit to only be set
in the first packet of a TSO, so this patch also enables TSO_ECN,
in order to advertise to the stack that we do the right thing
on the wire.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Kavindya Deegala <kavindya.s.deegala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
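The first/middle/last flag handling can be sketched as a pure function over the segment index. This is an illustration, not the driver's descriptor programming: toy_segment_flags() is a hypothetical helper, and clearing FIN together with PSH on all but the last segment is the conventional segmentation behaviour assumed here rather than something stated in the commit.

```c
#include <stdint.h>

/* TCP header flag bits as they appear in the flags byte. */
#define TOY_TCP_FIN 0x01
#define TOY_TCP_PSH 0x08
#define TOY_TCP_ACK 0x10
#define TOY_TCP_CWR 0x80

/* Flags for segment idx (0-based) out of nsegs, derived from the flags
 * of the original large packet. */
static uint8_t toy_segment_flags(uint8_t orig, unsigned int idx,
                                 unsigned int nsegs)
{
    uint8_t flags = orig;

    if (idx != 0)
        flags &= (uint8_t)~TOY_TCP_CWR;                  /* CWR: first segment only */
    if (idx != nsegs - 1)
        flags &= (uint8_t)~(TOY_TCP_FIN | TOY_TCP_PSH);  /* FIN/PSH: last segment only */

    return flags;
}
```

Splitting a packet carrying ACK, PSH and CWR into three segments yields ACK|CWR on the first, ACK alone on the middle one and ACK|PSH on the last, so PSH is no longer replicated.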
Merge tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a few driver fixes for char/misc drivers that resolve
reported issues.
All have been in linux-next successfully for a few days"
* tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
Tools: hv: Handle the case when the target file exists correctly
vme_tsi148: Utilize to_pci_dev() macro
vme_tsi148: Fix PCI address mapping assumption
vme_tsi148: Fix typo in tsi148_slave_get()
w1: avoid recursive device_add
w1: fix netlink refcnt leak on error path
misc: Grammar s/addition/additional/
drivers: mcb: fix memory leak in chameleon_parse_cells() error path
mei: ignore client writing state during cb completion
mei: me: do not load the driver if the FW doesn't support MEI interface
GenWQE: Increase driver version number
GenWQE: Fix multithreading problems
GenWQE: Ensure rc is not returning an uninitialized value
GenWQE: Add wmb before DDCB is started
GenWQE: Enable access to VPD flash area
Merge tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some driver core fixes for 3.15-rc2. Also in here are some
documentation updates, as well as an API removal that had to wait for
after -rc1 due to the cleanups coming into you from multiple developer
trees (this one and the PPC tree.)
All have been in linux next successfully"
* tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers/base/dd.c incorrect pr_debug() parameters
Documentation: Update stable address in Chinese and Japanese translations
topology: Fix compilation warning when not in SMP
Chinese: add translation of io_ordering.txt
stable_kernel_rules: spelling/word usage
sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
kernfs: protect lazy kernfs_iattrs allocation with mutex
fs: Don't return 0 from get_anon_bdev
Merge tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are a few staging driver fixes for issues that have been reported
for 3.15-rc2.
Also dominating the diffstat for the pull request is the removal of
the rtl8187se driver. It's no longer needed in staging as a "real"
driver for this hardware is now merged in the tree in the "correct"
location in drivers/net/
All of these patches have been tested in linux-next"
* tag 'staging-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8188eu: Fix case where ethtype was never obtained and always be checked against 0
staging: r8712u: Fix case where ethtype was never obtained and always be checked against 0
staging: r8188eu: Calling rtw_get_stainfo() with a NULL sta_addr will return NULL
staging: comedi: fix circular locking dependency in comedi_mmap()
staging: r8723au: Add missing initialization of change_inx in sort algorithm
Staging: unisys: use after free in list_for_each()
staging: unisys: use after free in error messages
staging: speakup: fix misuse of kstrtol() in handle_goto()
staging: goldfish: Call free_irq in error path
staging: delete rtl8187se wireless driver
staging: rtl8723au: Fix buffer overflow in rtw_get_wfd_ie()
staging: gs_fpgaboot: remove __TIMESTAMP__ macro
staging: vme: fix memory leak in vme_user_probe()
staging: fpgaboot: clean up Makefile
staging/usbip: fix store_attach() sscanf return value check
staging/usbip: userspace - fix usbipd SIGSEGV from refresh_exported_devices()
staging: rtl8188eu: remove spaces, correct counts to unbreak P2P ioctls
staging/rtl8821ae: Fix OOM handling in _rtl_init_deferred_work()
Merge tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are a number of small tty/serial driver fixes for 3.15-rc2. Also
in here are some Documentation file removals for drivers that we
removed a long time ago, no need to keep it around any longer.
All of these have been in linux-next for a bit"
* tag 'tty-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: 8250, disable "too much work" messages"
serial: amba-pl011: fix regression, causing an Oops on rmmod
tty: Fix help text of SYNCLINK_CS
tty: fix memleak in alloc_pid
ttyprintk: Allow built as a module
ttyprintk: Fix wrong tty_unregister_driver() call in the error path
serial: 8250, disable "too much work" messages
Documentation/serial: Delete obsolete driver documentation
serial: omap: Fix missing pm_runtime_resume handling by simplifying code
serial_core: Fix pm imbalance on unbind
serial: pl011: change Rx burst size to half of trigger level
serial: timberdale: Depend on X86_32
serial: st-asc: Fix SysRq char handling
Revert "serial: clps711x: Give a chance to perform useful tasks during wait loop"
serial_core: Fix conditional start_tx on ring buffer not empty
serial: efm32: use $vendor,$device scheme for compatible string
serial: omap: free the wakeup settings in remove
Merge tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of tiny USB fixes and new device ids for 3.15-rc2.
Nothing major, just issues some people have reported.
All of these have been in linux-next"
* tag 'usb-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
uas: fix deadlocky memory allocations
uas: fix error handling during scsi_scan()
uas: fix GFP_NOIO under spinlock
uwb: adds missing error handling
USB: cdc-acm: Remove Motorola/Telit H24 serial interfaces from ACM driver
USB: ohci-jz4740: FEAT_POWER is a port feature, not a hub feature
USB: ohci-jz4740: Fix uninitialized variable warning
USB: EHCI: tegra: set txfill_tuning
usb: ehci-platform: Return immediately from suspend if ehci_suspend fails
usb: ehci-exynos: Return immediately from suspend if ehci_suspend fails
USB: fix crash during hotplug of PCI USB controller card
USB: cdc-acm: fix double usb_autopm_put_interface() in acm_port_activate()
usb: usb-common: fix typo for usb_state_string
USB: usb_wwan: fix handling of missing bulk endpoints
USB: pl2303: add ids for Hewlett-Packard HP POS pole displays
USB: cp210x: Add 8281 (Nanotec Plug & Drive)
usb: option driver, add support for Telit UE910v2
Revert "USB: serial: add usbid for dell wwan card to sierra.c"
USB: serial: ftdi_sio: add id for Brainboxes serial cards
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
thp: close race between split and zap huge pages
mm: fix new kernel-doc warning in filemap.c
mm: fix CONFIG_DEBUG_VM_RB description
mm: use paravirt friendly ops for NUMA hinting ptes
mips: export flush_icache_range
mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
wait: explain the shadowing and type inconsistencies
Shiraz has moved
Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
powerpc/mm: fix ".__node_distance" undefined
kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
init/Kconfig: move the trusted keyring config option to general setup
vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()
Sasha Levin has reported two THP BUGs[1][2]. I believe both of them
have the same root cause. Let's look to them one by one.
The first bug[1] is "kernel BUG at mm/huge_memory.c:1829!". It's
BUG_ON(mapcount != page_mapcount(page)) in __split_huge_page(). From my
testing I see that page_mapcount() is higher than mapcount here.
I think it happens due to race between zap_huge_pmd() and
page_check_address_pmd(). page_check_address_pmd() misses PMD which is
under zap:
        CPU0                                    CPU1
                                                zap_huge_pmd()
                                                  pmdp_get_and_clear()
__split_huge_page()
  anon_vma_interval_tree_foreach()
    __split_huge_page_splitting()
      page_check_address_pmd()
        mm_find_pmd()
          /*
           * We check if PMD present without taking ptl: no
           * serialization against zap_huge_pmd(). We miss this PMD,
           * it's not accounted to 'mapcount' in __split_huge_page().
           */
          pmd_present(pmd) == 0
The second bug[2] is "kernel BUG at mm/huge_memory.c:1371!".
It's VM_BUG_ON_PAGE(!PageHead(page), page) in zap_huge_pmd().
This happens in similar way:
        CPU0                                    CPU1
                                                zap_huge_pmd()
                                                  pmdp_get_and_clear()
                                                  page_remove_rmap(page)
                                                    atomic_add_negative(-1, &page->_mapcount)
__split_huge_page()
  anon_vma_interval_tree_foreach()
    __split_huge_page_splitting()
      page_check_address_pmd()
        mm_find_pmd()
          pmd_present(pmd) == 0 /* The same comment as above */
  /*
   * No crash this time since we already decremented page->_mapcount in
   * zap_huge_pmd().
   */
  BUG_ON(mapcount != page_mapcount(page))
  /*
   * We split the compound page here into small pages without
   * serialization against zap_huge_pmd()
   */
  __split_huge_page_refcount()
                                                VM_BUG_ON_PAGE(!PageHead(page), page); // CRASH!!!
So my understanding is that the problem is the pmd_present() check in
mm_find_pmd() without taking the page table lock.
The bug was introduced by me with commit 117b0791ac42. Sorry for
that. :(
Let's open code mm_find_pmd() in page_check_address_pmd() and do the
check under page table lock.
Note that __page_check_address() does the same for PTE entries
if sync != 0.
I've stress tested split and zap code paths for 36+ hours by now and
don't see crashes with the patch applied. Before it took <20 min to
trigger the first bug and a few hours for the second one (if we ignore
the first).
This appears to be a copy/paste error. Update the description to
reflect extra rbtree debug and checks for the config option instead of
duplicating CONFIG_DEBUG_VM.
mm: use paravirt friendly ops for NUMA hinting ptes
David Vrabel identified a regression when using automatic NUMA balancing
under Xen whereby page table entries were getting corrupted due to the
use of native PTE operations. Quoting him
Xen PV guest page tables require that their entries use machine
addresses if the present bit (_PAGE_PRESENT) is set, and (for
successful migration) non-present PTEs must use pseudo-physical
addresses. This is because on migration MFNs in present PTEs are
translated to PFNs (canonicalised) so they may be translated back
to the new MFN in the destination domain (uncanonicalised).
pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
set and clear the _PAGE_PRESENT bit using pte_set_flags(),
pte_clear_flags(), etc.
In a Xen PV guest, these functions must translate MFNs to PFNs
when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
_PAGE_PRESENT.
His suggested fix converted p[te|md]_[set|clear]_flags to using
paravirt-friendly ops but this is overkill. He suggested an alternative
of using p[te|md]_modify in the NUMA page table operations but this
does more work than necessary and would require looking up a VMA for
protections.
This patch modifies the NUMA page table operations to use paravirt
friendly operations to set/clear the flags of interest. Unfortunately
this will take a performance hit when updating the PTEs on
CONFIG_PARAVIRT but I do not see a way around it that does not break
Xen.
Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: David Vrabel <david.vrabel@citrix.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The lkdtm module performs tests against executable memory ranges, so it
needs to flush the icache for proper behaviors. Other architectures
already export this, so do the same for MIPS.
[akpm@linux-foundation.org: relocate export sites] Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Sanjay Lal <sanjayl@kymasys.com> Cc: John Crispin <blogic@openwrt.org> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
The soft lockup when freeing gigantic hugepages, fixed in commit
55f67141a892 "mm: hugetlb: fix softlockup when a large number of
hugepages are freed.", can also happen in
return_unused_surplus_pages(), so let's fix it there too.