Johannes Berg [Fri, 14 Nov 2014 15:43:50 +0000 (16:43 +0100)]
cfg80211: allow survey data to return global data
Not all devices are able to report survey data (particularly
time spent for various operations) per channel. As all these
statistics already exist in survey data, allow such devices
to report them (if userspace requested it)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 14 Nov 2014 15:35:34 +0000 (16:35 +0100)]
cfg80211: remove "channel" from survey names
All of the survey data is (currently) per channel anyway,
so having the word "channel" in the name does nothing. In
the next patch I'll introduce global data to the survey,
where the word "channel" is actually confusing.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ido Yariv [Tue, 6 Jan 2015 13:39:02 +0000 (08:39 -0500)]
mac80211: Re-fix accounting of the tailroom-needed counter
When hw acceleration is enabled, the GENERATE_IV or PUT_IV_SPACE flags
only require headroom space. Therefore, the tailroom-needed counter can
safely be decremented for most drivers.
The older incarnation of this patch (ca34e3b5) assumed that the above
holds true for all drivers. As reported by Christopher Chavez and
researched by Christian Lamparter and Larry Finger, this isn't a valid
assumption for p54 and cw1200.
Drivers that still require tailroom for ICV/MIC even when HW encryption
is enabled can use IEEE80211_KEY_FLAG_RESERVE_TAILROOM to indicate it.
Signed-off-by: Ido Yariv <idox.yariv@intel.com> Cc: Christopher Chavez <chrischavez@gmx.us> Cc: Christian Lamparter <chunkeey@googlemail.com> Cc: Larry Finger <Larry.Finger@lwfinger.net> Cc: Solomon Peachy <pizza@shaftnet.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Thu, 1 Jan 2015 11:43:17 +0000 (13:43 +0200)]
mac80211: skip disabled channels in VHT check
The patch "40a11ca mac80211: check if channels allow 80 MHz for VHT
probe requests" considered disabled channels as VHT enabled, and
mistakenly sent out probe-requests with the VHT IE.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Tue, 23 Dec 2014 16:17:38 +0000 (17:17 +0100)]
nl80211: define multicast group names in header
Put the group names into the userspace API header file so that
userspace clients can use symbolic names from there instead of
hardcoding the actual names. This doesn't really change much,
but seems somewhat cleaner.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
With the wiphy::features flag being used up this patch adds a
new field wiphy::ext_features. Considering extensibility this
new field is declared as a byte array. This extensible flag is
exposed to user-space by NL80211_ATTR_EXT_FEATURES.
Cc: Avinash Patil <patila@marvell.com> Signed-off-by: Gautam (Gautam Kumar) Shukla <gautams@broadcom.com> Signed-off-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Mon, 5 Jan 2015 10:16:42 +0000 (11:16 +0100)]
nl80211: document NL80211_BSS_STATUS_AUTHENTICATED isn't used
The flag is no longer used (and hasn't been for a long time)
since trying to track authentication (and make decisions based
on state) was just causing issues all over - see commit 95de817b9034d50860319f6033ec85d25024694c.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
It turns out that the p54 and cw2100 drivers assume that there's
tailroom even when they don't say they really need it. However,
there's currently no way for them to explicitly say they do need
it, so for now revert this.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331.
Cc: stable@vger.kernel.org Fixes: ca34e3b5c808 ("mac80211: Fix accounting of the tailroom-needed counter") Reported-by: Christopher Chavez <chrischavez@gmx.us> Bisected-by: Larry Finger <Larry.Finger@lwfinger.net> Debugged-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
dot11MulticastTransmittedFrameCount should be updated according
to the DA, which might be different from A1. Checking A1 results
in the counter being 0 in case of station, as to-DS data frames
use A1 for the BSSID.
This behaviour is defined in state machines, specifically in the
sta_tx_dcf_3.1d(10) description of 802.11-2012.
Johannes Berg [Mon, 22 Dec 2014 11:51:25 +0000 (12:51 +0100)]
mac80211_hwsim: fix check for custom world regdom array size
David Binderman reports that the conditions in the first loop
are the wrong way around - checking the array contents before
the size.
Instead of leaving the empty loop there and reordering the two
checks unify it into a single loop that skips over non-matches
and exits after the first match.
Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Felix Fietkau [Wed, 17 Dec 2014 12:38:34 +0000 (13:38 +0100)]
mac80211: minstrel: reduce size of struct minstrel_rate_stats
On minstrel_ht, the size of the per-sta struct is almost 18k, making it
an order-3 allocation.
A few fields inside the per-rate statistics are bigger than they need to
be. This patch reduces the size enough to cut down the per-sta struct to
about 13k (order-2 allocation).
Signed-off-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jukka Rissanen [Mon, 15 Dec 2014 11:25:39 +0000 (13:25 +0200)]
nl80211: Stop scheduled scan if netlink client disappears
An attribute NL80211_ATTR_SOCKET_OWNER can be set by the scan initiator.
If present, the attribute will cause the scan to be stopped if the client
dies.
Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Moshe Benji [Sun, 14 Dec 2014 09:05:50 +0000 (11:05 +0200)]
mac80211: handle power constraint and country IEs in RRM
In beacons, handle the Country IE even if no Power Constraint IE
is present, and, capability wise, also in case that the Radio
Measurements capability is enabled.
In cases where the Country IE should be handled and that the
Power Constraint IE is not present, the Country IE alone will
set the power limit (and not both Country and Power Constraint
IEs).
HT override configurations was ignored when choosing the channel
(until now, the override configuration affected only the
capabilities shown in the IEs).
The override configurations received only on association time,
so in this case we should determine the channel again.
Johannes Berg [Wed, 17 Dec 2014 12:55:49 +0000 (13:55 +0100)]
mac80211: free management frame keys when removing station
When writing the code to allow per-station GTKs, I neglected to
take into account the management frame keys (index 4 and 5) when
freeing the station and only added code to free the first four
data frame keys.
Fix this by iterating the array of keys over the right length.
Arik Nemtsov [Mon, 15 Dec 2014 17:26:02 +0000 (19:26 +0200)]
cfg80211: avoid intersection when applying self-managed reg
The custom-reg handling function can currently only add flags to a given
channel. This results in stale flags being left applied. In some cases
a channel was disabled and even the orig_flags were changed to reflect
this.
Previously the API was designed for a single invocation before wiphy
registration, so this didn't matter. The previous approach doesn't scale
well to self-managed regulatory devices, particularly when a more
permissive regdom is applied after a restrictive one.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Mon, 15 Dec 2014 17:26:01 +0000 (19:26 +0200)]
cfg80211: return private regdom for self-managed devices
If a device has self-managed regulatory, insist on returning the wiphy
specific regdomain if a wiphy-idx is specified. The global regdomain is
meaningless for such devices.
Also add an attribute for self-managed devices, so usermode can
distinguish them as such.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jonathan Doron [Mon, 15 Dec 2014 17:26:00 +0000 (19:26 +0200)]
cfg80211: allow wiphy specific regdomain management
Add a new regulatory flag that allows a driver to manage regdomain
changes/updates for its own wiphy.
A self-managed wiphys only employs regulatory information obtained from
the FW and driver and does not use other cfg80211 sources like
beacon-hints, country-code IEs and hints from other devices on the same
system. Conversely, a self-managed wiphy does not share its regulatory
hints with other devices in the system. If a system contains several
devices, one or more of which are self-managed, there might be
contradictory regulatory settings between them. Usage of flag is
generally discouraged. Only use it if the FW/driver is incompatible
with non-locally originated hints.
A new API lets the driver send a complete regdomain, to be applied on
its wiphy only.
After a wiphy-specific regdomain change takes place, usermode will get
a new type of change notification. The regulatory core also takes care
enforce regulatory restrictions, in case some interfaces are on
forbidden channels.
Signed-off-by: Jonathan Doron <jonathanx.doron@intel.com> Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Mon, 15 Dec 2014 17:25:59 +0000 (19:25 +0200)]
cfg80211: allow usermode to query wiphy specific regdom
If a wiphy-idx is specified, the kernel will return the wiphy specific
regdomain, if such exists. Otherwise return the global regdom.
When no wiphy-idx is specified, return the global regdomain as well as
all wiphy-specific regulatory domains in the system, via a new nested
list of attributes.
Add a new attribute for each wiphy-specific regdomain, for usermode to
identify it as such.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mac80211: keep sending peer candidate events while in listen state
Instead of sending peer candidate events just once, send them as long
as the peer remains in the LISTEN state in the peering state machine,
when userspace is implementing the peering manager.
Userspace may silence the events from a peer by progressing the state
machine or by setting the link state to BLOCKED.
Fixes the problem that a mesh peering process won't be fired again after
the previous first peering trial fails due to like air propagation error
if the peering is managed by user space such as wpa_supplicant.
This patch works with another patch for wpa_supplicant described here
which fires a peering process again triggered by the notice from kernel.
http://lists.shmoo.com/pipermail/hostap/2014-November/031235.html
Signed-off-by: Kenzoh Nishikawa <Kenzoh.Nishikawa@jp.sony.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Luciano Coelho [Tue, 16 Dec 2014 14:01:39 +0000 (16:01 +0200)]
mac80211: notify channel switch at the end of ieee80211_chswitch_post_beacon()
The call to cfg80211_ch_switch_notify() should be at the end of the
ieee80211_chswitch_post_beacon() function, because it should only be
sent if everything succeeded.
Fixes: d04b5ac9e70b ("cfg80211/mac80211: allow any interface to send channel switch notifications") Signed-off-by: Luciano Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Tue, 11 Nov 2014 11:48:42 +0000 (12:48 +0100)]
mac80211: move U-APSD enablement to vif flags
In order to let drivers have more dynamic U-APSD support,
move the enablement flag to the virtual interface driver
flags. This lets drivers not only set it up differently
for different interfaces, but also enable/disable on the
fly if needed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 13 Nov 2014 10:23:53 +0000 (11:23 +0100)]
mac80211: ask driver to look at power level when starting AP
The power level might have been set, but as the interface was idle
it might not have taken effect yet. Ask the driver to check the
power level when starting up an AP so that in this case the correct
power level is used in case the device/driver can only set it when
the interface is actually active.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Sujith Manoharan [Wed, 10 Dec 2014 15:56:11 +0000 (21:26 +0530)]
mac80211: Fix accounting of multicast frames
Since multicast frames are marked as no-ack, using
IEEE80211_TX_STAT_ACK to check if they have been
successfully transmitted by the driver is incorrect
since a driver can choose to ignore transmission status
for no-ack frames. This results in incorrect accounting
for such frames.
To fix this issue, this patch introduces a new flag
that can be used by drivers to indicate error-free
transmission of no-ack frames.
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com>
[add a note about not setting the flag for non-no-ack frames] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Sujith Manoharan [Wed, 10 Dec 2014 15:56:10 +0000 (21:26 +0530)]
mac80211: Move IEEE80211_TX_CTL_PS_RESPONSE
Move IEEE80211_TX_CTL_PS_RESPONSE to info->control.flags since
this is used only in the TX path (by ath9k). This frees up
a bit which can be used for other purposes.
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Wed, 3 Dec 2014 16:08:17 +0000 (18:08 +0200)]
cfg80211: correctly check ad-hoc channels
Ad-hoc requires beaconing for regulatory purposes. Validate that the
channel is valid for beaconing, and not only enabled.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
cfg80211: don't WARN about two consecutive Country IE hint
This can happen and there is no point in added more
detection code lower in the stack. Catching these in one
single point (cfg80211) is enough. Stop WARNING about this
case.
This fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=89001
Cc: stable@vger.kernel.org Fixes: 2f1c6c572d7b ("cfg80211: process non country IE conflicting first") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mac80211: update the channel context after channel switch
When the channel switch has been made, a vif is now using
the channel context which was reserved. When that happens,
we need to update the channel context since its parameters
may change.
I hit a case in which I switched to a 40Mhz channel but the
reserved channel context was still on 20Mhz. The rate control
would try to send 40Mhz packets on a 20Mhz channel context and
that made iwlwifi's firmware unhappy.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Luciano Coelho [Mon, 1 Dec 2014 09:32:09 +0000 (11:32 +0200)]
nl80211: check matches array length before acessing it
If the userspace passes a malformed sched scan request (or a net
detect wowlan configuration) by adding a NL80211_ATTR_SCHED_SCAN_MATCH
attribute without any nested matchsets, a NULL pointer dereference
will occur. Fix this by checking that we do have matchsets in our
array before trying to access it.
Johannes Berg [Fri, 12 Dec 2014 11:26:25 +0000 (12:26 +0100)]
cfg80211: use __force __rcu to suppress sparse warning
The code assigns a constant value (a pointer to a static variable)
to an RCU pointer, which results in a sparse warning:
reg.c:112:10: warning: cast adds address space to expression (<asn:4>)
Suppress this warning by using __force.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Thu, 4 Dec 2014 10:22:16 +0000 (12:22 +0200)]
cfg80211: avoid mem leak on driver hint set
In the already-set and intersect case of a driver-hint, the previous
wiphy regdomain was not freed before being reset with a copy of the
cfg80211 regdomain.
Cc: stable@vger.kernel.org Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com> Acked-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jouni Malinen [Thu, 11 Dec 2014 21:48:55 +0000 (23:48 +0200)]
cfg80211: Fix 160 MHz channels with 80+80 and 160 MHz drivers
The VHT supported channel width field is a two bit integer, not a
bitfield. cfg80211_chandef_usable() was interpreting it incorrectly and
ended up rejecting 160 MHz channel width if the driver indicated support
for both 160 and 80+80 MHz channels.
Cc: stable@vger.kernel.org (3.16+) Fixes: 3d9d1d6656a73 ("nl80211/cfg80211: support VHT channel configuration")
(however, no real drivers had 160 MHz support it until 3.16) Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Andreas Müller [Fri, 12 Dec 2014 11:11:11 +0000 (12:11 +0100)]
mac80211: fix multicast LED blinking and counter
As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount"
stopped being incremented after the use-after-free fix. Furthermore, the
RX-LED will be triggered by every multicast frame (which wouldn't happen
before) which wouldn't allow the LED to rest at all.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the
patch.
Cc: stable@vger.kernel.org Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation") Signed-off-by: Andreas Müller <goo@stapelspeicher.org>
[rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Jes Sorensen [Wed, 10 Dec 2014 19:14:07 +0000 (14:14 -0500)]
mac80211: avoid using uninitialized stack data
Avoid a case where we would access uninitialized stack data if the AP
advertises HT support without 40MHz channel support.
Cc: stable@vger.kernel.org Fixes: f3000e1b43f1 ("mac80211: fix broken use of VHT/20Mhz with some APs") Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Chun-Hao Lin [Wed, 10 Dec 2014 13:28:38 +0000 (21:28 +0800)]
r8169:update rtl8168g pcie ephy parameter
Add ephy parameter to rtl8168g.
Also change the common function of rtl8168g from "rtl_hw_start_8168g_1" to
"rtl_hw_start_8168g". And function "rtl_hw_start_8168g_1" is used for
setting rtl8168g hardware parameters.
Following is the explanation of what hardware parameter change for.
rtl8168g may erroneous judge the PCIe signal quality and show the error bit
on PCI configuration space when in PCIe low power mode.
The following ephy parameters are for above issue.
{ 0x00, 0x0000, 0x0008 }
{ 0x0c, 0x37d0, 0x0820 }
{ 0x1e, 0x0000, 0x0001 }
rtl8168g may return to PCIe L0 from PCIe L0s low power mode too slow.
The following ephy parameter is for above issue.
{ 0x19, 0x8000, 0x0000 }
Signed-off-by: Chunhao Lin <hau@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 12 Dec 2014 02:12:42 +0000 (18:12 -0800)]
net: dsa: bcm_sf2: force link for all fixed PHY devices
For ports of the switch that we define as "fixed PHYs" such as MoCA, we
would have our Port 7 special handling that would allow us to assert the
link status indication.
For other ports, such as e.g: RGMII_1 connected to a cable modem, we
would rely on whatever the bootloader has left configured, which is a
bad assumption to make, we really need to force the link status
indication here.
Fixes: 246d7f773c13 ("net: dsa: add Broadcom SF2 switch driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 12 Dec 2014 02:15:37 +0000 (21:15 -0500)]
Merge branch 'dma_mb'
Alexander Duyck says:
====================
arch: Add lightweight memory barriers for coherent memory access
These patches introduce two new primitives for synchronizing cache coherent
memory writes and reads. These two new primitives are:
dma_rmb()
dma_wmb()
The first patch cleans up some unnecessary overhead related to the
definition of read_barrier_depends, smp_read_barrier_depends, and comments
related to the barrier.
The second patch adds the primitives for the applicable architectures and
asm-generic.
The third patch adds the barriers to r8169 which turns out to be a good
example of where the new barriers might be useful as they have full
rmb()/wmb() barriers ordering accesses to the descriptors and the DescOwn
bit.
The fourth patch adds support for coherent_rmb() to the Intel fm10k, igb,
and ixgbe drivers. Testing with the ixgbe driver has shown a processing
time reduction of at least 7ns per 64B frame on a Core i7-4930K.
This patch series is essentially the v7 for:
v4-7: Add lightweight memory barriers for coherent memory access
v3: Add lightweight memory barriers fast_rmb() and fast_wmb()
v2: Introduce load_acquire() and store_release()
v1: Introduce read_acquire()
The key changes in this patch series versus the earlier patches are:
v7 resubmit:
- Added Acked-by: Ben Herrenschmidt from v5 to dma_rmb/wmb patch
- No code changes from previous set, still applies cleanly and builds.
v7:
- Dropped test/debug patch that was accidentally slipped in
v6:
- Replaced "memory based device I/O" with "consistent memory" in
docs
- Added reference to DMA-API.txt to explain consistent memory
v5:
- Renamed barriers dma_rmb and dma_wmb
- Undid smp_wmb changes in x86 and PowerPC
- Defined smp_rmb as __lwsync for SMP case on PowerPC
v4:
- Renamed barriers coherent_rmb and coherent_wmb
- Added smp_lwsync for use in smp_load_acquire/smp_store_release
v3:
- Moved away from acquire()/store() and instead focused on barriers
- Added cleanup of read_barrier_depends
- Added change in r8169 to fix cur_tx/DescOwn ordering
- Simplified changes to just replacing/moving barriers in r8169
- Added update to documentation with code example
v2:
- Renamed read_acquire() to be consistent with smp_load_acquire()
- Changed barrier used to be consistent with smp_load_acquire()
- Updated PowerPC code to use __lwsync based on IBM article
- Added store_release() as this is a viable use case for drivers
- Added r8169 patch which is able to fully use primitives
- Added fm10k/igb/ixgbe patch which is able to test performance
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 11 Dec 2014 23:02:28 +0000 (15:02 -0800)]
fm10k/igb/ixgbe: Use dma_rmb on Rx descriptor reads
This change makes it so that dma_rmb is used when reading the Rx
descriptor. The advantage of dma_rmb is that it allows for a much
lower cost barrier on x86, powerpc, arm, and arm64 architectures than a
traditional memory barrier when dealing with reads that only have to
synchronize to coherent memory.
In addition I have updated the code so that it just checks to see if any
bits have been set instead of just the DD bit since the DD bit will always
be set as a part of a descriptor write-back so we just need to check for a
non-zero value being present at that memory location rather than just
checking for any specific bit. This allows the code itself to appear much
cleaner and allows the compiler more room to optimize.
Cc: Matthew Vick <matthew.vick@intel.com> Cc: Don Skidmore <donald.c.skidmore@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 11 Dec 2014 23:02:17 +0000 (15:02 -0800)]
r8169: Use dma_rmb() and dma_wmb() for DescOwn checks
The r8169 use a pair of wmb() calls when setting up the descriptor rings.
The first is to synchronize the descriptor data with the descriptor status,
and the second is to synchronize the descriptor status with the use of the
MMIO doorbell to notify the device that descriptors are ready. This can
come at a heavy price on some systems, and is not really necessary on
systems such as x86 as a simple barrier() would suffice to order store/store
accesses. As such we can replace the first memory barrier with
dma_wmb() to reduce the cost for these accesses.
In addition the r8169 uses a rmb() to prevent compiler optimization in the
cleanup paths, however by moving the barrier down a few lines and replacing
it with a dma_rmb() we should be able to use it to guarantee
descriptor accesses do not occur until the device has updated the DescOwn
bit from its end.
One last change I made is to move the update of cur_tx in the xmit path to
after the wmb. This way we can guarantee the device and all CPUs should
see the DescOwn update before they see the cur_tx value update.
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com> Cc: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 11 Dec 2014 23:02:06 +0000 (15:02 -0800)]
arch: Add lightweight memory barriers dma_rmb() and dma_wmb()
There are a number of situations where the mandatory barriers rmb() and
wmb() are used to order memory/memory operations in the device drivers
and those barriers are much heavier than they actually need to be. For
example in the case of PowerPC wmb() calls the heavy-weight sync
instruction when for coherent memory operations all that is really needed
is an lsync or eieio instruction.
This commit adds a coherent only version of the mandatory memory barriers
rmb() and wmb(). In most cases this should result in the barrier being the
same as the SMP barriers for the SMP case, however in some cases we use a
barrier that is somewhere in between rmb() and smp_rmb(). For example on
ARM the rmb barriers break down as follows:
Barrier Call Explanation
--------- -------- ----------------------------------
rmb() dsb() Data synchronization barrier - system
dma_rmb() dmb(osh) data memory barrier - outer sharable
smp_rmb() dmb(ish) data memory barrier - inner sharable
These new barriers are not as safe as the standard rmb() and wmb().
Specifically they do not guarantee ordering between coherent and incoherent
memories. The primary use case for these would be to enforce ordering of
reads and writes when accessing coherent memory that is shared between the
CPU and a device.
It may also be noted that there is no dma_mb(). Most architectures don't
provide a good mechanism for performing a coherent only full barrier without
resorting to the same mechanism used in mb(). As such there isn't much to
be gained in trying to define such a function.
Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: David Miller <davem@davemloft.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 11 Dec 2014 23:01:55 +0000 (15:01 -0800)]
arch: Cleanup read_barrier_depends() and comments
This patch is meant to cleanup the handling of read_barrier_depends and
smp_read_barrier_depends. In multiple spots in the kernel headers
read_barrier_depends is defined as "do {} while (0)", however we then go
into the SMP vs non-SMP sections and have the SMP version reference
read_barrier_depends, and the non-SMP define it as yet another empty
do/while.
With this commit I went through and cleaned out the duplicate definitions
and reduced the number of definitions down to 2 per header. In addition I
moved the 50 line comments for the macro from the x86 and mips headers that
defined it as an empty do/while to those that were actually defining the
macro, alpha and blackfin.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 11 Dec 2014 20:49:16 +0000 (12:49 -0800)]
net: dsa: propagate error code from dsa_slave_phy_setup
In case we cannot attach to our slave netdevice PHY, error out and
propagate that error up to the caller: dsa_slave_create().
Fixes: 0d8bcdd383b8 ("net: dsa: allow for more complex PHY setups") Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This can be triggered when the platform configuration (platform_data or
Device Tree) indicates there should be a PHY device at this address, but
the HW is non-responsive, such that we cannot attach a PHY device at
this specific location.
Fix this by checking the return value prior to calling
phy_connect_direct().
CC: Andrew Lunn <andrew@lunn.ch> Fixes: b31f65fb4383 ("net: dsa: slave: Fix autoneg for phys on switch MDIO bus") Reported-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
1) New offloading infrastructure and example 'rocker' driver for
offloading of switching and routing to hardware.
This work was done by a large group of dedicated individuals, not
limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu
2) Start making the networking operate on IOV iterators instead of
modifying iov objects in-situ during transfers. Thanks to Al Viro
and Herbert Xu.
3) A set of new netlink interfaces for the TIPC stack, from Richard
Alpe.
4) Remove unnecessary looping during ipv6 routing lookups, from Martin
KaFai Lau.
5) Add PAUSE frame generation support to gianfar driver, from Matei
Pavaluca.
6) Allow for larger reordering levels in TCP, which are easily
achievable in the real world right now, from Eric Dumazet.
7) Add a variable of napi_schedule that doesn't need to disable cpu
interrupts, from Eric Dumazet.
8) Use a doubly linked list to optimize neigh_parms_release(), from
Nicolas Dichtel.
9) Various enhancements to the kernel BPF verifier, and allow eBPF
programs to actually be attached to sockets. From Alexei
Starovoitov.
10) Support TSO/LSO in sunvnet driver, from David L Stevens.
11) Allow controlling ECN usage via routing metrics, from Florian
Westphal.
12) Remote checksum offload, from Tom Herbert.
13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
driver, from Thomas Lendacky.
14) Add MPLS support to openvswitch, from Simon Horman.
15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
Klassert.
16) Do gro flushes on a per-device basis using a timer, from Eric
Dumazet. This tries to resolve the conflicting goals between the
desired handling of bulk vs. RPC-like traffic.
17) Allow userspace to ask for the CPU upon what a packet was
received/steered, via SO_INCOMING_CPU. From Eric Dumazet.
18) Limit GSO packets to half the current congestion window, from Eric
Dumazet.
19) Add a generic helper so that all drivers set their RSS keys in a
consistent way, from Eric Dumazet.
20) Add xmit_more support to enic driver, from Govindarajulu
Varadarajan.
21) Add VLAN packet scheduler action, from Jiri Pirko.
22) Support configurable RSS hash functions via ethtool, from Eyal
Perry.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
Fix race condition between vxlan_sock_add and vxlan_sock_release
net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
net/mlx4: Add support for A0 steering
net/mlx4: Refactor QUERY_PORT
net/mlx4_core: Add explicit error message when rule doesn't meet configuration
net/mlx4: Add A0 hybrid steering
net/mlx4: Add mlx4_bitmap zone allocator
net/mlx4: Add a check if there are too many reserved QPs
net/mlx4: Change QP allocation scheme
net/mlx4_core: Use tasklet for user-space CQ completion events
net/mlx4_core: Mask out host side virtualization features for guests
net/mlx4_en: Set csum level for encapsulated packets
be2net: Export tunnel offloads only when a VxLAN tunnel is created
gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
net: fec: only enable mdio interrupt before phy device link up
net: fec: clear all interrupt events to support i.MX6SX
net: fec: reset fep link status in suspend function
net: sock: fix access via invalid file descriptor
net: introduce helper macro for_each_cmsghdr
...
Linus Torvalds [Thu, 11 Dec 2014 21:20:50 +0000 (13:20 -0800)]
Merge tag 'sound-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"This became a fairly large pull request. In addition to the usual
driver updates / fixes, there have been a high amount of cleanups in
ASoC area, as well as control API helpers and kernel documentations
fixes touching through the whole tree.
In the driver side, the biggest changes are the support for new Intel
SoC found on new x86 machines, and the updates of FireWire dice and
oxfw drivers.
Some remarkable items are below:
ALSA core:
- PCM mmap code cleanup, removal of arch-dependent codes
- PCM xrun injection support
- PCM hwptr tracepoint support
- Refactoring of snd_pcm_action(), simplification of PCM locking
- Robustified sequecner auto-load functionality
- New control API helpers and lots of cleanups along with them
- Lots of kerneldoc fixes and cleanups
USB-audio:
- The mixer resume code was largely rewritten, and the devices with
quirks are resumed properly.
- New hardware support: Focusrite Scarlett, Digidesign Mbox1,
Denon/Marantz DACs, Zoom R16/24
FireWire:
- DICE driver updates with better duplex and sync support, including
MIDI support
- New OXFW driver for Oxford Semiconductor FW970/971 chipset,
including the previous LaCie Speakers device. Fullduplex and MIDI
support included as well as DICE driver.
HD-audio:
- Refactoring the driver-caps quirk handling in snd-hda-intel
- More consistent control names representing the topology better
- Fixups: HP mute LED with ALC268 codec, Ideapad S210 built-in mic
fix, ASUS Z99He laptop EAPD
ASoC:
- Conversion of AC'97 drivers to use regmap, bringing us closer to
the removal of the ASoC level I/O code
- Clean up a lot of old drivers that were open coding things that
have subsequently been implemented in the core
- Some DAPM performance improvements
- Removal of the now seldom used CODEC mutex
- Lots of updates for the newer Intel SoC support, including support
for the DSP and some Cherrytrail and Braswell machine drivers
- Support for Samsung boards using rt5631 as the CODEC
- Removal of the obsolete AFEB9260 machine driver
- Driver support for the TI TS3A227E headset driver used in some
Chrombeooks
Others:
- ASIHPI driver update and cleanups
- Lots of dev_*() printk conversions
- Lots of trivial cleanups for the codes spotted by Coccinelle"
* tag 'sound-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (594 commits)
ALSA: pcxhr: NULL dereference on probe failure
ALSA: lola: NULL dereference on probe failure
ALSA: hda - Add "eapd" model string for AD1986A codec
ALSA: hda - Add EAPD fixup for ASUS Z99He laptop
ALSA: oxfw: Add hwdep interface
ALSA: oxfw: Add support for capture/playback MIDI messages
ALSA: oxfw: add support for capturing PCM samples
ALSA: oxfw: Add support AMDTP in-stream
ALSA: oxfw: Add support for Behringer/Mackie devices
ALSA: oxfw: Change the way to start stream
ALSA: oxfw: Add proc interface for debugging purpose
ALSA: oxfw: Change the way to make PCM rules/constraints
ALSA: oxfw: Add support for AV/C stream format command to get/set supported stream formation
ALSA: oxfw: Change the way to name card
ALSA: dice: Add support for MIDI capture/playback
ALSA: dice: Add support for capturing PCM samples
ALSA: dice: Support for non SYT-Match sampling clock source mode
ALSA: dice: Add support for duplex streams with synchronization
ALSA: dice: Change the way to start stream
ALSA: jack: Add dummy snd_jack_set_key() definition
...
Linus Torvalds [Thu, 11 Dec 2014 21:06:58 +0000 (13:06 -0800)]
Merge tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux
Pull devicetree changes from Grant Likely:
"Lots of activity in the devicetree code for v3.18. Most of it is
related to getting all of the overlay support code in place, but there
are other important things in there.
Highlights:
- OF_RECONFIG notifiers for SPI, I2C and Platform devices. Those
subsystems can now respond to live changes to the device tree.
- CONFIG_OF_OVERLAY method for applying live changes to the device
tree
- Removal of the of_allnodes list. This used to be used to iterate
over all the nodes in the device tree, but it is unnecessary
because the same thing can be done by iterating over the list of
child pointers. Getting rid of of_allnodes saves some memory and
avoids the possibility of of_allnodes being sorted differently from
the child lists.
- Support for retrieving original DTB blob via sysfs. Needed by
kexec.
- More unittests
- Documentation and minor bug fixes"
* tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux: (42 commits)
of: Delete unnecessary check before calling "of_node_put()"
of: Drop ->next pointer from struct device_node
spi: Check for spi_of_notifier when CONFIG_OF_DYNAMIC=y
of: support passing console options with stdout-path
of: add optional options parameter to of_find_node_by_path()
of: Add bindings for chosen node, stdout-path
of: Remove unneeded and incorrect MODULE_DEVICE_TABLE
ARM: dt: fix up PL011 device tree bindings
of: base, fix of_property_read_string_helper kernel-doc
of: remove select of non-existant OF_DEVICE config symbol
spi/of: Add OF notifier handler
spi/of: Create new device registration method and accessors
i2c/of: Add OF_RECONFIG notifier handler
i2c/of: Factor out Devicetree registration code
of/overlay: Add overlay unittests
of/overlay: Introduce DT overlay support
of/reconfig: Add OF_DYNAMIC notifier for platform_bus_type
of/reconfig: Always use the same structure for notifiers
of/reconfig: Add debug output for OF_RECONFIG notifiers
of/reconfig: Add empty stubs for the of_reconfig methods
...
Linus Torvalds [Thu, 11 Dec 2014 20:46:32 +0000 (12:46 -0800)]
Merge tag 'fbdev-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev updates from Tomi Valkeinen:
- support for mx6sl and mx6sx
- OMAP HDMI audio rewrite to make it finally work
- OMAP video PLL work to prepare for new DRA7xx SoCs
- simplefb DT related improvements
* tag 'fbdev-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (81 commits)
video: uvesafb: Deletion of an unnecessary check before the function call "platform_device_put"
video: fbdev-VIA: Deletion of an unnecessary check before the function call "framebuffer_release"
video: fbdev-MMP: Deletion of an unnecessary check before the function call "mmp_unregister_path"
video: mx3fb: Deletion of an unnecessary check before the function call "backlight_device_unregister"
video: fbdev-OMAP2: Deletion of unnecessary checks before the function call "i2c_put_adapter"
video: fbdev-SIS: Deletion of unnecessary checks before the function call "pci_dev_put"
video: smscufx: Deletion of unnecessary checks before the function call "vfree"
video: udlfb: Deletion of unnecessary checks before the function call "vfree"
video: uvesafb: Deletion of an unnecessary check before the function call "uvesafb_free"
video: fbdev-LCDC: Deletion of an unnecessary check before the function call "vfree"
video: fbdev: arkfb: suppress build warning
video: fbdev: s3fb: suppress build warning
video: fbdev: vt8623fb: suppress build warning
OMAPDSS: hdmi5: Fix bit field for IEC958_AES2_CON_SOURCE
OMAPDSS: hdmi: Remove __exit qualifier from hdmi_uninit_output()
OMAPDSS: hdmi5: Change hdmi_wp idlemode to to no_idle for audio playback
OMAPDSS: Remove all references to obsolete HDMI audio callbacks
ASoC: omap: Remove obsolete HDMI audio code and Kconfig options
OMAPDSS: hdmi5: Register ASoC platform device for omap hdmi audio
OMAPDSS: hdmi5: Remove callbacks for the old ASoC DAI driver
...
Linus Torvalds [Thu, 11 Dec 2014 20:09:37 +0000 (12:09 -0800)]
Merge branch 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox framework updates from Jassi Brar.
* 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
Mailbox: Add support for Platform Communication Channel
mailbox/omap: adapt to the new mailbox framework
mailbox: add tx_prepare client callback
mailbox: Don't unnecessarily re-arm the polling timer
Linus Torvalds [Thu, 11 Dec 2014 20:03:34 +0000 (12:03 -0800)]
Merge tag 'spi-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"Not a huge amount going on this release, mainly new drivers (there's a
couple more waiting that didn't quite make the cut for this release
too):
- An interface for querying if the current transfer is the last in a
message, allowing controllers that need special handling for the
final transfer to use the core message parsing.
- Support for Amlogic Meson SPIFC, Imagination Technologies SFPI,
Intel Quark X1000 and Samsung Exynos 7 controllers"
* tag 'spi-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (38 commits)
spi/s3c64xx: Remove redundant runtime PM management
spi: fsl-spi: remove unused variable assignment
spi: spi-fsl-spi: Return an error code in fsl_spi_do_one_msg()
spi: core: Do not mangle error code from kthread_run()
spi: fsl-espi: add (un)prepare_transfer_hardware calls to save power if SPI is not in use
spi: fsl-(e)spi: migrate to generic master queueing
spi/txx9: Deletion of an unnecessary check before the function call "clk_disable"
spi: cadence: Fix 3-to-8 mux mode
spi: cadence: Init HW after reading devicetree attributes
spi: meson: Select REGMAP_MMIO
spi: s3c64xx: add support for exynos7 SPI controller
spi: spi-pxa2xx: SPI support for Intel Quark X1000
spi: meson: meson_spifc_setup_speed() can be static
spi: spi-pxa2xx: Add helpers for regiseters' accessing
spi: spi-mxs: Fix mapping from vmalloc-ed buffer to scatter list
spi: atmel: introduce probe deferring
spi: atmel: remove compat for non DT board when requesting dma chan
spi: meson: Add support for Amlogic Meson SPIFC
spi: meson: Add device tree bindings documentation for SPIFC
spi: core: Add spi_transfer_is_last() helper
...
Linus Torvalds [Thu, 11 Dec 2014 19:58:50 +0000 (11:58 -0800)]
Merge tag 'edac/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac
Pull edac updates from Mauro Carvalho Chehab:
- Broadwell-DE support on sb-edac driver
- Some fixes at sb-edac driver
* tag 'edac/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
sb_edac: Fix typo computing number of banks
sb_edac: Add support for Broadwell-DE processor
sb_edac: Fix discovery of top-of-low-memory for Haswell
sb_edac: Fix erroneous bytes->gigabytes conversion
sb_edac: Fix off-by-one error in number of channels
Marcelo Leitner [Thu, 11 Dec 2014 12:02:22 +0000 (10:02 -0200)]
Fix race condition between vxlan_sock_add and vxlan_sock_release
Currently, when trying to reuse a socket, vxlan_sock_add will grab
vn->sock_lock, locate a reusable socket, inc refcount and release
vn->sock_lock.
But vxlan_sock_release() will first decrement refcount, and then grab
that lock. refcnt operations are atomic but as currently we have
deferred works which hold vs->refcnt each, this might happen, leading to
a use after free (specially after vxlan_igmp_leave):
CPU 1 CPU 2
deferred work vxlan_sock_add
... ...
spin_lock(&vn->sock_lock)
vs = vxlan_find_sock();
vxlan_sock_release
dec vs->refcnt, reaches 0
spin_lock(&vn->sock_lock)
vxlan_sock_hold(vs), refcnt=1
spin_unlock(&vn->sock_lock)
hlist_del_rcu(&vs->hlist);
vxlan_notify_del_rx_port(vs)
spin_unlock(&vn->sock_lock)
So when we look for a reusable socket, we check if it wasn't freed
already before reusing it.
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Fixes: 7c47cedf43a8b3 ("vxlan: move IGMP join/leave to work queue") Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 11 Dec 2014 19:49:23 +0000 (11:49 -0800)]
Merge tag 'media/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- Two new dvb frontend drivers: mn88472 and mn88473
- A new driver for some PCIe DVBSky cards
- A new remote controller driver: meson-ir
- One LIRC staging driver got rewritten and promoted to mainstream:
igorplugusb
- A new tuner driver (m88rs6000t)
- The old omap2 media driver got removed from staging. This driver
uses an old DMA API and it is likely broken on recent kernels.
Nobody cared enough to fix it
- Media bus format moved to a separate header, as DRM will also use the
definitions there
- mem2mem_testdev were renamed to vim2m, in order to use the same
naming convention taken by the other virtual test driver (vivid)
- Added a new driver for coda SoC (coda-jpeg)
- The cx88 driver got converted to use videobuf2 core
- Make DMABUF export buffer to work with DMA Scatter/Gather and Vmalloc
cores
- Lots of other fixes, improvements and cleanups on the drivers.
* tag 'media/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (384 commits)
[media] mn88473: One function call less in mn88473_init() after error
[media] mn88473: Remove uneeded check before release_firmware()
[media] lirc_zilog: Deletion of unnecessary checks before vfree()
[media] MAINTAINERS: Add myself as img-ir maintainer
[media] img-ir: Don't set driver's module owner
[media] img-ir: Depend on METAG or MIPS or COMPILE_TEST
[media] img-ir/hw: Drop [un]register_decoder declarations
[media] img-ir/hw: Fix potential deadlock stopping timer
[media] img-ir/hw: Always read data to clear buffer
[media] redrat3: ensure dma is setup properly
[media] ddbridge: remove unneeded check before dvb_unregister_device()
[media] si2157: One function call less in si2157_init() after error
[media] tuners: remove uneeded checks before release_firmware()
[media] arm: omap2: rx51-peripherals: fix build warning
[media] stv090x: add an extra protetion against buffer overflow
[media] stv090x: Remove an unreachable code
[media] stv090x: Some whitespace cleanups
[media] em28xx: checkpatch cleanup: whitespaces/new lines cleanups
[media] si2168: add support for firmware files in new format
[media] si2168: debug printout for firmware version
...
David S. Miller [Thu, 11 Dec 2014 19:47:40 +0000 (14:47 -0500)]
Merge branch 'mlx4-next'
Or Gerlitz says:
====================
mlx4 driver update
This series from Matan, Jenny, Dotan and myself is mostly about adding
support to a new performance optimized flow steering mode (patches 4-10).
The 1st two patches are small fixes (one for VXLAN and one for SRIOV),
and the third patch is a fix to avoid hard-lockup situation when many
(hunderds) processes holding user-space QPs/CQs get events.
Matan and Or.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Thu, 11 Dec 2014 08:58:00 +0000 (10:58 +0200)]
net/mlx4: Add support for A0 steering
Add the required firmware commands for A0 steering and a way to enable
that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT,
QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used
to configure and query the device.
The different A0 DMFS (steering) modes are:
Static - optimized performance, but flow steering rules are
limited. This mode should be choosed explicitly by the user
in order to be used.
Dynamic - this mode should be explicitly choosed by the user.
In this mode, the FW works in optimized steering mode as long as
it can and afterwards automatically drops to classic (full) DMFS.
Disable - this mode should be explicitly choosed by the user.
The user instructs the system not to use optimized steering, even if
the FW supports Dynamic A0 DMFS (and thus will be able to use optimized
steering in Default A0 DMFS mode).
Default - this mode is implicitly choosed. In this mode, if the FW
supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll
work at Disable A0 DMFS mode.
Under SRIOV configuration, when the A0 steering mode is enabled,
older guest VF drivers who aren't using the RX QP allocation flag
(MLX4_RESERVE_A0_QP) will get a QP from the general range and
fail when attempting to register a steering rule. To avoid that,
the PF context behaviour is changed once on A0 static mode, to
require support for the allocation flag in VF drivers too.
In order to enable A0 steering, we use log_num_mgm_entry_size param.
If the value of the parameter is not positive, we treat the absolute
value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this
bit field enables static A0 steering.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Thu, 11 Dec 2014 08:57:59 +0000 (10:57 +0200)]
net/mlx4: Refactor QUERY_PORT
Currently QUERY_PORT is done as a part of QUERY_DEV_CAP firmware command.
Since we would like to use it without querying all device capabilities,
extract this part to be a function of its own.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Thu, 11 Dec 2014 08:57:58 +0000 (10:57 +0200)]
net/mlx4_core: Add explicit error message when rule doesn't meet configuration
When a given flow steering rule is invalid in respect to the current
steering configuration, print the correct error message to the system log.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Thu, 11 Dec 2014 08:57:57 +0000 (10:57 +0200)]
net/mlx4: Add A0 hybrid steering
A0 hybrid steering is a form of high performance flow steering.
By using this mode, mlx4 cards use a fast limited table based steering,
in order to enable fast steering of unicast packets to a QP.
In order to implement A0 hybrid steering we allocate resources
from different zones:
(1) General range
(2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region.
When we create a rss QP or a raw ethernet (A0 steerable and BF ready) QP,
we try hard to allocate the QP from range (2). Otherwise, we try hard not
to allocate from this range. However, when the system is pushed to its
limits and one needs every resource, the allocator uses every region it can.
Meaning, when we run out of raw-eth qps, the allocator allocates from the
general range (and the special-A0 area is no longer active). If we run out
of RSS qps, the mechanism tries to allocate from the raw-eth QP zone. If that
is also exhausted, the allocator will allocate from the general range
(and the A0 region is no longer active).
Note that if a raw-eth qp is allocated from the general range, it attempts
to allocate the range such that bits 6 and 7 (blueflame bits) in the
QP number are not set.
When the feature is used in SRIOV, the VF has to notify the PF what
kind of QP attributes it needs. In order to do that, along with the
"Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According
to the combination of these bits, the PF tries to allocate a suitable QP.
In order to maintain backward compatibility (with older PFs), the PF
notifies which QP attributes it supports via QUERY_FUNC_CAP command.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Thu, 11 Dec 2014 08:57:56 +0000 (10:57 +0200)]
net/mlx4: Add mlx4_bitmap zone allocator
The zone allocator is a mechanism which manages a few mlx4_bitmaps.
When allocating a resource, the user indicates the desired zone of
which this resource will be allocated from. If possible, the resource
will be allocated from this zone. Otherwise, the resource will be
allocated from a less-than, equal-to, higher-than priority zone,
according to the desired zone's properties with that respective
allocation order.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dotan Barak [Thu, 11 Dec 2014 08:57:55 +0000 (10:57 +0200)]
net/mlx4: Add a check if there are too many reserved QPs
The number of reserved QPs is affected both from the firmware and
from the driver's requirements. This patch adds a check that
validates that this number is indeed feasable.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields
in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset.
The current Ethernet driver code reserves a Tx QP range with 256b alignment.
This is wrong because if there are more than 64 Tx QPs in use,
QPNs >= base + 65 will have bits 6/7 set.
This problem is not specific for the Ethernet driver, any entity that
tries to reserve more than 64 BF-enabled QPs should fail. Also, using
ranges is not necessary here and is wasteful.
The new mechanism introduced here will support reservation for
"Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs
(when hypervisors support WC in VMs). The flow we use is:
1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation,
and request "BF enabled QPs" if BF is supported for the function
2. In the ALLOC_RES FW command, change param1 to:
a. param1[23:0] - number of QPs
b. param1[31-24] - flags controlling QPs reservation
Bit 31 refers to Eth blueflame supported QPs. Those QPs must have
bits 6 and 7 unset in order to be used in Ethernet.
Bits 24-30 of the flags are currently reserved.
When a function tries to allocate a QP, it states the required attributes
for this QP. Those attributes are considered "best-effort". If an attribute,
such as Ethernet BF enabled QP, is a must-have attribute, the function has
to check that attribute is supported before trying to do the allocation.
In a lower layer of the code, mlx4_qp_reserve_range masks out the bits
which are unsupported. If SRIOV is used, the PF validates those attributes
and masks out unsupported attributes as well. In order to notify VFs which
attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's
mailbox is filled by the PF, which notifies which QP allocation attributes
it supports.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Thu, 11 Dec 2014 08:57:53 +0000 (10:57 +0200)]
net/mlx4_core: Use tasklet for user-space CQ completion events
Previously, we've fired all our completion callbacks straight from our ISR.
Some of those callbacks were lightweight (for example, mlx4_en's and
IPoIB napi callbacks), but some of them did more work (for example,
the user-space RDMA stack uverbs' completion handler). Besides that,
doing more than the minimal work in ISR is generally considered wrong,
it could even lead to a hard lockup of the system. Since when a lot
of completion events are generated by the hardware, the loop over those
events could be so long, that we'll get into a hard lockup by the system
watchdog.
In order to avoid that, add a new way of invoking completion events
callbacks. In the interrupt itself, we add the CQs which receive completion
event to a per-EQ list and schedule a tasklet. In the tasklet context
we loop over all the CQs in the list and invoke the user callback.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 11 Dec 2014 08:57:52 +0000 (10:57 +0200)]
net/mlx4_core: Mask out host side virtualization features for guests
When VFs (guests in this context) issue the QUERY_DEV_CAP command, they
need not be told that host side virtualization features such as VST, FSM
(MAC anti-spoofing) and running > 80 VFs are supported by the device.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 11 Dec 2014 08:57:51 +0000 (10:57 +0200)]
net/mlx4_en: Set csum level for encapsulated packets
This was dropped by mistake for the napi_gro_frags flow, fix that.
Fixes: dd65beac48a5 ('net/mlx4_en: Extend usage of napi_gro_frags') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 11 Dec 2014 19:39:03 +0000 (11:39 -0800)]
Merge tag 'backlight-for-linus-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight
Pull backlight updates from Lee Jones:
- Clean-up leaky resources; pwm_bl
- Simplify Device Tree initialisation; lp855x_bl
- Add Regulator support; lp855x
- Remove Bryan from the Maintainer list -- new baby, no time :)
* tag 'backlight-for-linus-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
MAINTAINERS: Remove my name from Backlight subsystem
backlight: lp855x: Add supply regulator to lp855x
backlight: lp855x: Refactor DT parsing code
backlight: pwm: Clean-up pwm requested using legacy API
be2net: Export tunnel offloads only when a VxLAN tunnel is created
The encapsulated offload flags shouldn't be unconditionally exported
to the stack. The stack expects offloading to work across all tunnel
types when those flags are set. This would break other tunnels (like
GRE) since be2net currently supports tunnel offload for VxLAN only.
Also, with VxLANs Skyhawk-R can offload only 1 UDP dport. If more
than 1 UDP port is added, we should disable offloads in that case too.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kevin Hao [Thu, 11 Dec 2014 06:08:41 +0000 (14:08 +0800)]
gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
We need to use dma_mapping_error() to check the dma address returned
by dma_map_single/page(). Otherwise we would get warning like this:
WARNING: at lib/dma-debug.c:1140
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc2-next-20141029 #196
task: c0834300 ti: effe6000 task.ti: c0874000
NIP: c02b2c98 LR: c02b2c98 CTR: c030abc4
REGS: effe7d70 TRAP: 0700 Not tainted (3.18.0-rc2-next-20141029)
MSR: 00021000 <CE,ME> CR: 22044022 XER: 20000000
For TX, we need to unmap the pages which has already been mapped and
free the skb before return.
For RX, move the dma mapping and error check to gfar_new_skb(). We
would reuse the original skb in the rx ring when either allocating
skb failure or dma mapping error.
Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
Remove use of calls into t4_fw_hello() with MASTER_MUST, which results in
FW_HELLO_CMD_MASTERFORCE being set. The firmware doesn't support this and of
course any existing PF Drivers will totally go for a toss.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 11 Dec 2014 18:43:14 +0000 (10:43 -0800)]
Merge tag 'pinctrl-v3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control changes from Linus Walleij:
"Here is a stash of pin control changes I have collected for the v3.19
series. Mainly new hardware support, with Intels new embedded SoC as
the especially interesting thing standing out, fully using the
subsystem.
- Force conversion of the ux500 pin control device trees and parsers
to use the generic pin control bindings.
- New driver and device tree bindings for the Qualcomm PMIC MPP pin
controller and GPIO.
- Some ACPI infrastructure for pin controllers.
- New driver for the Intel CherryView/Braswell pin controller, the
first Intel pin controller to fully take advantage of the pin
control subsystem.
- Support the Freescale i.MX VF610 variant.
- Support the sunxi A80 variant.
- Support the Samsung Exynos 4415 and Exynos 7 variants.
- Split out Intel pin controllers to their own subdirectory.
- A large slew of rockchip pin control updates, including
suspend/resume support.
- A large slew of Samsung Exynos pin controller updates.
- Various minor updates and fixes"
* tag 'pinctrl-v3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (49 commits)
pinctrl: at91: enhance (debugfs) at91_gpio_dbg_show
pinctrl: meson: add device tree bindings documentation
gpio: tz1090: Fix error handling of irq_of_parse_and_map
pinctrl: tz1090-pinctrl.txt: Fix typo in binding
pinctrl: pinconf-generic: Declare dt_params/conf_items const
pinctrl: exynos: Add support for Exynos4415
pinctrl: exynos: Add initial driver data for Exynos7
pinctrl: exynos: Add irq_chip instance for Exynos7 wakeup interrupts
pinctrl: exynos: Consolidate irq domain callbacks
pinctrl: exynos: Generalize the eint16_31 demux code
pinctrl: samsung: Separate per-bank init and runtime data
pinctrl: samsung: Constify samsung_pin_ctrl struct
pinctrl: samsung: Constify samsung_pin_bank_type struct
pinctrl: samsung: Drop unused label field in samsung_pin_ctrl struct
pinctrl: samsung: Make samsung_pinctrl_get_soc_data use ERR_PTR()
pinctrl: Add Intel Cherryview/Braswell pin controller support
gpio / ACPI: Add knowledge about pin controllers to acpi_get_gpiod()
pinctrl: Fix path error in documentation
pinctrl: rockchip: save and restore gpio6_c6 pinmux in suspend/resume
pinctrl: rockchip: add suspend/resume functions
...
Linus Torvalds [Thu, 11 Dec 2014 05:17:00 +0000 (21:17 -0800)]
Merge tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki:
"This time we have some more new material than we used to have during
the last couple of development cycles.
The most important part of it to me is the introduction of a unified
interface for accessing device properties provided by platform
firmware. It works with Device Trees and ACPI in a uniform way and
drivers using it need not worry about where the properties come from
as long as the platform firmware (either DT or ACPI) makes them
available. It covers both devices and "bare" device node objects
without struct device representation as that turns out to be necessary
in some cases. This has been in the works for quite a few months (and
development cycles) and has been approved by all of the relevant
maintainers.
On top of that, some drivers are switched over to the new interface
(at25, leds-gpio, gpio_keys_polled) and some additional changes are
made to the core GPIO subsystem to allow device drivers to manipulate
GPIOs in the "canonical" way on platforms that provide GPIO
information in their ACPI tables, but don't assign names to GPIO lines
(in which case the driver needs to do that on the basis of what it
knows about the device in question). That also has been approved by
the GPIO core maintainers and the rfkill driver is now going to use
it.
Second is support for hardware P-states in the intel_pstate driver.
It uses CPUID to detect whether or not the feature is supported by the
processor in which case it will be enabled by default. However, it
can be disabled entirely from the kernel command line if necessary.
Next is support for a platform firmware interface based on ACPI
operation regions used by the PMIC (Power Management Integrated
Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms.
That interface is used for manipulating power resources and for
thermal management: sensor temperature reporting, trip point setting
and so on.
Also the ACPI core is now going to support the _DEP configuration
information in a limited way. Basically, _DEP it supposed to reflect
off-the-hierarchy dependencies between devices which may be very
indirect, like when AML for one device accesses locations in an
operation region handled by another device's driver (usually, the
device depended on this way is a serial bus or GPIO controller). The
support added this time is sufficient to make the ACPI battery driver
work on Asus T100A, but it is general enough to be able to cover some
other use cases in the future.
Finally, we have a new cpufreq driver for the Loongson1B processor.
In addition to the above, there are fixes and cleanups all over the
place as usual and a traditional ACPICA update to a recent upstream
release.
As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for
Intel platforms should be able to handle power management of the DMA
engine correctly, the cpufreq-dt driver should interact with the
thermal subsystem in a better way and the ACPI backlight driver should
handle some more corner cases, among other things.
On top of the ACPICA update there are fixes for race conditions in the
ACPICA's interrupt handling code which might lead to some random and
strange looking failures on some systems.
In the cleanups department the most visible part is the series of
commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration
option. That was triggered by a discussion regarding the generic
power domains code during which we realized that trying to support
certain combinations of PM config options was painful and not really
worth it, because nobody would use them in production anyway. For
this reason, we decided to make CONFIG_PM_SLEEP select
CONFIG_PM_RUNTIME and that lead to the conclusion that the latter
became redundant and CONFIG_PM could be used instead of it. The
material here makes that replacement in a major part of the tree, but
there will be at least one more batch of that in the second part of
the merge window.
Specifics:
- Support for retrieving device properties information from ACPI _DSD
device configuration objects and a unified device properties
interface for device drivers (and subsystems) on top of that. As
stated above, this works with Device Trees and ACPI and allows
device drivers to be written in a platform firmware (DT or ACPI)
agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are
now going to use this new interface and the GPIO subsystem is
additionally modified to allow device drivers to assign names to
GPIO resources returned by ACPI _CRS objects (in case _DSD is not
present or does not provide the expected data). The changes in
this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron
Lu, and Darren Hart with some fixes from others (Fabio Estevam,
Geert Uytterhoeven).
- Support for Hardware Managed Performance States (HWP) as described
in Volume 3, section 14.4, of the Intel SDM in the intel_pstate
driver. CPUID is used to detect whether or not the feature is
supported by the processor. If supported, it will be enabled
automatically unless the intel_pstate=no_hwp switch is present in
the kernel command line. From Dirk Brandewie.
- New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie).
- Support for firmware interface based on ACPI operation regions used
by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR
platforms for power resource control and thermal management (Aaron
Lu).
- Limited support for retrieving off-the-hierarchy dependencies
between devices from ACPI _DEP device configuration objects and
deferred probing support for the ACPI battery driver based on the
_DEP information to make that driver work on Asus T100A (Lan
Tianyu).
- New cpufreq driver for the Loongson1B processor (Kelvin Cheung).
- ACPICA update to upstream revision 20141107 which only affects
tools (Bob Moore).
- Fixes for race conditions in the ACPICA's interrupt handling code
and in the ACPI code related to system suspend and resume (Lv Zheng
and Rafael J Wysocki).
- ACPI core fix for an RCU-related issue in the ioremap() regions
management code that slowed down significantly after CPUs had been
allowed to enter idle states even if they'd had RCU callbakcs
queued and triggered some problems in certain proprietary graphics
driver (and elsewhere). The fix replaces synchronize_rcu() in that
code with synchronize_rcu_expedited() which makes the issue go
away. From Konstantin Khlebnikov.
- ACPI LPSS (Low-Power Subsystem) driver fix to handle power
management of the DMA engine included into the LPSS correctly. The
problem is that the DMA engine doesn't have ACPI PM support of its
own and it simply is turned off when the last LPSS device having
ACPI PM support goes into D3cold. To work around that, the PM
domain used by the ACPI LPSS driver is redesigned so at least one
device with ACPI PM support will be on as long as the DMA engine is
in use. From Andy Shevchenko.
- ACPI backlight driver fix to avoid using it on "Win8-compatible"
systems where it doesn't work and where it was used by default by
mistake (Aaron Lu).
- Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki,
Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin
Chaugule (mostly related to the upcoming ARM64 support).
- Intel RAPL (Running Average Power Limit) power capping driver fixes
and improvements including new processor IDs (Jacob Pan).
- Generic power domains modification to power up domains after
attaching devices to them to meet the expectations of device
drivers and bus types assuming devices to be accessible at probe
time (Ulf Hansson).
- Preliminary support for controlling device clocks from the generic
power domains core code and modifications of the ARM/shmobile
platform to use that feature (Ulf Hansson).
- Assorted minor fixes and cleanups of the generic power domains core
code (Ulf Hansson, Geert Uytterhoeven).
- Assorted minor fixes and cleanups of the device clocks control code
in the PM core (Geert Uytterhoeven, Grygorii Strashko).
- Consolidation of device power management Kconfig options by making
CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter
which is now redundant (Rafael J Wysocki and Kevin Hilman). That
is the first batch of the changes needed for this purpose.
- Core device runtime power management support code cleanup related
to the execution of callbacks (Andrzej Hajda).
- cpuidle ARM support improvements (Lorenzo Pieralisi).
- cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a
new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and
Bartlomiej Zolnierkiewicz).
- New cpufreq driver callback (->ready) to be executed when the
cpufreq core is ready to use a given policy object and cpufreq-dt
driver modification to use that callback for cooling device
registration (Viresh Kumar).
- cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James
Geboski, Tomeu Vizoso).
- Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate,
cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao,
Stefan Wahren, Petr Cvek).
- OPP (Operating Performance Points) framework modification to allow
OPPs to be removed too and update of a few cpufreq drivers
(cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added
during initialization) on driver removal (Viresh Kumar).
- Hibernation core fixes and cleanups (Tina Ruchandani and Markus
Elfring).
- PM Kconfig fix related to CPU power management (Pankaj Dubey).
- cpupower tool fix (Prarit Bhargava)"
* tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (120 commits)
i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c
dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
tools: cpupower: fix return checks for sysfs_get_idlestate_count()
drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME
MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
leds: leds-gpio: Fix multiple instances registration without 'label' property
iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
hwrandom / exynos / PM: Use CONFIG_PM in #ifdef
block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
USB / PM: Drop CONFIG_PM_RUNTIME from the USB core
PM: Merge the SET*_RUNTIME_PM_OPS() macros
...
Linus Torvalds [Thu, 11 Dec 2014 04:58:52 +0000 (20:58 -0800)]
Merge tag 'pci-v3.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas:
"Here are the PCI changes intended for v3.19. I don't think there's
anything very exciting here, but there was a lot of MSI-related stuff
coming via Thomas.
Details:
NUMA
- Allow numa_node override via sysfs (Prarit Bhargava)
Resource management
- Restore detection of read-only BARs (Myron Stowe)
- Shrink decoding-disabled window while sizing BARs (Myron Stowe)
- Add informational printk for invalid BARs (Myron Stowe)
- Remove fixed parameter in pci_iov_resource_bar() (Myron Stowe)
MSI
- Add pci_msi_ignore_mask to prevent writes to MSI/MSI-X Mask Bits (Yijing Wang)
- Revert "PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()" (Yijing Wang)
- s390/MSI: Use __msi_mask_irq() instead of default_msi_mask_irq() (Yijing Wang)
Virtualization
- xen: Process failure for pcifront_(re)scan_root() (Chen Gang)
- Make FLR and AF FLR reset warning messages different (Gavin Shan)
Generic host bridge driver
- Allocate config space windows after limiting bus number range (Lorenzo Pieralisi)
- Convert to DT resource parsing API (Lorenzo Pieralisi)
* tag 'pci-v3.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (28 commits)
PCI: Remove fixed parameter in pci_iov_resource_bar()
PCI: Add informational printk for invalid BARs
PCI: tegra: Add Kconfig help text
PCI: tegra: Do not build on 64-bit ARM
PCI: spear: Remove unnecessary OOM message
PCI: mvebu: Add a blank line after declarations
PCI: designware: Add a blank line after declarations
PCI: exynos: Remove unnecessary return statement
PCI: imx6: Use tabs for indentation
PCI: keystone: Remove unnecessary OOM message
PCI: Remove unused and broken to_hotplug_slot()
PCI: Make FLR and AF FLR reset warning messages different
PCI: dra7xx: Add __init annotation to dra7xx_add_pcie_port()
PCI: spear: Add __init annotation to spear13xx_add_pcie_port()
PCI: spear: Rename add_pcie_port(), pcie_init() to spear13xx_add_pcie_port(), etc.
PCI: dra7xx: Rename add_pcie_port() to dra7xx_add_pcie_port()
PCI: layerscape: Add Freescale Layerscape PCIe driver
PCI: Simplify if-return sequences
PCI: Delete unnecessary NULL pointer checks
PCI: Shrink decoding-disabled window while sizing BARs
...
Linus Torvalds [Thu, 11 Dec 2014 04:40:51 +0000 (20:40 -0800)]
Merge tag 'ktest-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest changes from Steven Rostedt:
"The following ktest updates were done:
- Fix handling the make kernelrelease change
- Fix make_min_config that was broken by new bisect_config changes
- Allow tests to undefine default options (not just being able to
override them)
- Print name of test (if defined) to start of test output"
* tag 'ktest-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Add back "tail -1" to kernelrelease make
ktest: Add name to running title
ktest: Allow tests to undefine default options
ktest: Fix make_min_config to handle new assign_configs call
ktest: Use make -s kernelrelease
David S. Miller [Thu, 11 Dec 2014 04:37:06 +0000 (23:37 -0500)]
Merge branch 'fec-next'
Fugang Duan says:
====================
net: fec: driver code clean and bug fix
The patch serial include code clean and bug fix:
Patch#1: avoid dummy operation during suspend/resume test.
Patch#2: bug fix for i.MX6SX SOC that clean all interrupt events during MAC initial process.
Patch#3: before phy device link status is up, only enable MDIO bus interrupt.
V2:
- Modify the comment form from David's suggestion.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nimrod Andy [Thu, 11 Dec 2014 01:20:32 +0000 (09:20 +0800)]
net: fec: clear all interrupt events to support i.MX6SX
For i.MX6SX FEC controller, there have interrupt mask and event
field extension. To support all SOCs FEC, we clear all interrupt
events during MAVC initial process.
Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nimrod Andy [Thu, 11 Dec 2014 01:20:31 +0000 (09:20 +0800)]
net: fec: reset fep link status in suspend function
On some i.MX6 serial boards, phy power and refrence clock are supplied
or controlled by SOC. When do suspend/resume test, the power and clock
are disabled, so phy device link down.
For current driver, fep->link is still up status, which cause extra operation
like below code. To avoid the dumy operation, we set fep->link to down when
phy device is real down.
...
if (fep->link) {
napi_disable(&fep->napi);
netif_tx_lock_bh(ndev);
fec_stop(ndev);
netif_tx_unlock_bh(ndev);
napi_enable(&fep->napi);
fep->link = phy_dev->link;
status_change = 1;
}
...
Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 11 Dec 2014 04:35:41 +0000 (20:35 -0800)]
Merge tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull nmi-safe seq_buf printk update from Steven Rostedt:
"This code is a fork from the trace-3.19 pull as it needed the
trace_seq clean ups from that branch.
This code solves the issue of performing stack dumps from NMI context.
The issue is that printk() is not safe from NMI context as if the NMI
were to trigger when a printk() was being performed, the NMI could
deadlock from the printk() internal locks. This has been seen in
practice.
With lots of review from Petr Mladek, this code went through several
iterations, and we feel that it is now at a point of quality to be
accepted into mainline.
Here's what is contained in this patch set:
- Creates a "seq_buf" generic buffer utility that allows a descriptor
to be passed around where functions can write their own "printk()"
formatted strings into it. The generic version was pulled out of
the trace_seq() code that was made specifically for tracing.
- The seq_buf code was change to model the seq_file code. I have a
patch (not included for 3.19) that converts the seq_file.c code
over to use seq_buf.c like the trace_seq.c code does. This was
done to make sure that seq_buf.c is compatible with seq_file.c. I
may try to get that patch in for 3.20.
- The seq_buf.c file was moved to lib/ to remove it from being
dependent on CONFIG_TRACING.
- The printk() was updated to allow for a per_cpu "override" of the
internal calls. That is, instead of writing to the console, a call
to printk() may do something else. This made it easier to allow
the NMI to change what printk() does in order to call dump_stack()
without needing to update that code as well.
- Finally, the dump_stack from all CPUs via NMI code was converted to
use the seq_buf code. The caller to trigger the NMI code would
wait till all the NMIs finished, and then it would print the
seq_buf data to the console safely from a non NMI context
One added bonus is that this code also makes the NMI dump stack work
on PREEMPT_RT kernels. As printk() includes sleeping locks on
PREEMPT_RT, printk() only writes to console if the console does not
use any rt_mutex converted spin locks. Which a lot do"
* tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
x86/nmi: Fix use of unallocated cpumask_var_t
printk/percpu: Define printk_func when printk is not defined
x86/nmi: Perform a safe NMI stack trace on all CPUs
printk: Add per_cpu printk func to allow printk to be diverted
seq_buf: Move the seq_buf code to lib/
seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
tracing: Have seq_buf use full buffer
seq_buf: Add seq_buf_can_fit() helper function
tracing: Add paranoid size check in trace_printk_seq()
tracing: Use trace_seq_used() and seq_buf_used() instead of len
tracing: Clean up tracing_fill_pipe_page()
seq_buf: Create seq_buf_used() to find out how much was written
tracing: Add a seq_buf_clear() helper and clear len and readpos in init
tracing: Convert seq_buf fields to be like seq_file fields
tracing: Convert seq_buf_path() to be like seq_path()
tracing: Create seq_buf layer in trace_seq
0day robot reported the following crash:
[ 21.233581] BUG: unable to handle kernel NULL pointer dereference at 0000000000000007
[ 21.234709] IP: [<ffffffff8156ebda>] sk_attach_bpf+0x39/0xc2
It's due to bpf_prog_get() returning ERR_PTR.
Check it properly.
Reported-by: Fengguang Wu <fengguang.wu@intel.com> Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 11 Dec 2014 04:03:45 +0000 (20:03 -0800)]
Merge tag 'ftracetest-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace self-test updates from Steven Rostedt:
"Updates for the ftrace self tests:
- Added kprobes on ftrace testcase
- Sort test cases
- Add file to hold helper functions
- Use logfile name supported by busybox's mktemp
- Clear trace buffer after running kprobe test
- Fix show descriptions when run on dash shell
- Add --verbose option for showing echo output"
* tag 'ftracetest-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftracetest: Add --verbose option for showing echo output
ftracetest: Fix to show descriptions on dash
ftracetest: Add basic event tracing test cases
ftracetest: Clear trace buffer after running kprobe testcases
ftracetest: Use logfile name supported by busybox's mktemp
ftracetest: Add a couple of ftrace test cases
ftracetest: Add functions file that holds helper functions
ftracetest: Sort testcases
ftracetest: Add kprobes on ftrace testcase
Linus Torvalds [Thu, 11 Dec 2014 03:58:13 +0000 (19:58 -0800)]
Merge tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"There was a lot of clean ups and minor fixes. One of those clean ups
was to the trace_seq code. It also removed the return values to the
trace_seq_*() functions and use trace_seq_has_overflowed() to see if
the buffer filled up or not. This is similar to work being done to
the seq_file code as well in another tree.
Some of the other goodies include:
- Added some "!" (NOT) logic to the tracing filter.
- Fixed the frame pointer logic to the x86_64 mcount trampolines
- Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems.
That is, the ftrace trampoline can be dynamically allocated and be
called directly by functions that only have a single hook to them"
* tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (55 commits)
tracing: Truncated output is better than nothing
tracing: Add additional marks to signal very large time deltas
Documentation: describe trace_buf_size parameter more accurately
tracing: Allow NOT to filter AND and OR clauses
tracing: Add NOT to filtering logic
ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter
ftrace/x86: Get rid of ftrace_caller_setup
ftrace/x86: Have save_mcount_regs macro also save stack frames if needed
ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs
ftrace/x86: Simplify save_mcount_regs on getting RIP
ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter
ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments
ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file
ftrace/x86: Have static tracing also use ftrace_caller_setup
ftrace/x86: Have static function tracing always test for function graph
kprobes: Add IPMODIFY flag to kprobe_ftrace_ops
ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict
kprobes/ftrace: Recover original IP if pre_handler doesn't change it
tracing/trivial: Fix typos and make an int into a bool
tracing: Deletion of an unnecessary check before iput()
...
Linus Torvalds [Thu, 11 Dec 2014 02:34:42 +0000 (18:34 -0800)]
Merge branch 'akpm' (patchbomb from Andrew)
Merge first patchbomb from Andrew Morton:
- a few minor cifs fixes
- dma-debug upadtes
- ocfs2
- slab
- about half of MM
- procfs
- kernel/exit.c
- panic.c tweaks
- printk upates
- lib/ updates
- checkpatch updates
- fs/binfmt updates
- the drivers/rtc tree
- nilfs
- kmod fixes
- more kernel/exit.c
- various other misc tweaks and fixes
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
exit: pidns: fix/update the comments in zap_pid_ns_processes()
exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
exit: exit_notify: re-use "dead" list to autoreap current
exit: reparent: call forget_original_parent() under tasklist_lock
exit: reparent: avoid find_new_reaper() if no children
exit: reparent: introduce find_alive_thread()
exit: reparent: introduce find_child_reaper()
exit: reparent: document the ->has_child_subreaper checks
exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper()
exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting
exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting
exit: proc: don't try to flush /proc/tgid/task/tgid
exit: release_task: fix the comment about group leader accounting
exit: wait: drop tasklist_lock before psig->c* accounting
exit: wait: don't use zombie->real_parent
exit: wait: cleanup the ptrace_reparented() checks
usermodehelper: kill the kmod_thread_locker logic
usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper()
fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp
nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races
...
Oleg Nesterov [Wed, 10 Dec 2014 23:55:28 +0000 (15:55 -0800)]
exit: pidns: fix/update the comments in zap_pid_ns_processes()
The comments in zap_pid_ns_processes() are not clear, we need to explain
how this code actually works.
1. "Ignore SIGCHLD" looks like optimization but it is not, we also
need this for correctness.
2. The comment above sys_wait4() could tell more.
EXIT_ZOMBIE child is only possible if it has exited before we
ignored SIGCHLD. Or if it is traced from the parent namespace,
but in this case it will be reaped by debugger after detach,
sys_wait4() acts as a synchronization point.
3. The comment about TASK_DEAD (EXIT_DEAD in fact) children is
outdated. Contrary to what it says we do not need to make sure
they all go away after 0a01f2cc390e "pidns: Make the pidns proc
mount/umount logic obvious".
At the same time, we do need to wait for nr_hashed==init_pids,
but the reasons are quite different and not obvious: setns().
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Sterling Alexander <stalexan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 10 Dec 2014 23:55:25 +0000 (15:55 -0800)]
exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
alloc_pid() does get_pid_ns() beforehand but forgets to put_pid_ns() if it
fails because disable_pid_allocation() was called by the exiting
child_reaper.
We could simply move get_pid_ns() down to successful return, but this fix
tries to be as trivial as possible.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Sterling Alexander <stalexan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 10 Dec 2014 23:55:20 +0000 (15:55 -0800)]
exit: reparent: call forget_original_parent() under tasklist_lock
Shift "release dead children" loop from forget_original_parent() to its
caller, exit_notify(). It is safe to reap them even if our parent reaps
us right after we drop tasklist_lock, those children no longer have any
connection to the exiting task.
And this allows us to avoid write_lock_irq(tasklist_lock) right after it
was released by forget_original_parent(), we can simply call it with
tasklist_lock held.
While at it, move the comment about forget_original_parent() up to
this function.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Sterling Alexander <stalexan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 10 Dec 2014 23:55:17 +0000 (15:55 -0800)]
exit: reparent: avoid find_new_reaper() if no children
Now that pid_ns logic was isolated we can change forget_original_parent()
to return right after find_child_reaper() when father->children is empty,
there is nothing to reparent in this case.
In particular this avoids find_alive_thread() and this can help if the
whole process exits and it has a lot of PF_EXITING threads at the start of
the thread list, this can easily lead to O(nr_threads ** 2) iterations.
Oleg Nesterov [Wed, 10 Dec 2014 23:55:14 +0000 (15:55 -0800)]
exit: reparent: introduce find_alive_thread()
Add the new simple helper to factor out the for_each_thread() code in
find_child_reaper() and find_new_reaper(). It can also simplify the
potential PF_EXITING -> exit_state change, plus perhaps we can change this
code to take SIGNAL_GROUP_EXIT into account.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Lennart Poettering <lennart@poettering.net> Cc: Sterling Alexander <stalexan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 10 Dec 2014 23:55:11 +0000 (15:55 -0800)]
exit: reparent: introduce find_child_reaper()
find_new_reaper() does 2 completely different things. Not only it finds a
reaper, it also updates pid_ns->child_reaper or kills the whole namespace
if the caller is ->child_reaper.
Now that has_child_subreaper logic doesn't depend on child_reaper check we
can move that pid_ns code into a separate helper. IMHO this makes the
code more clean, and this allows the next changes.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Lennart Poettering <lennart@poettering.net> Cc: Sterling Alexander <stalexan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 10 Dec 2014 23:55:08 +0000 (15:55 -0800)]
exit: reparent: document the ->has_child_subreaper checks
Swap the "init_task" and same_thread_group() checks. This way it is more
simple to document these checks and we can remove the link to the previous
discussion on lkml.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Lennart Poettering <lennart@poettering.net> Cc: Sterling Alexander <stalexan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>