Ram Amrani [Tue, 9 May 2017 12:07:50 +0000 (15:07 +0300)]
qed: Correct doorbell configuration for !4Kb pages
When configuring the doorbell DPI address, driver aligns the start
address to 4KB [HW-pages] instead of host PAGE_SIZE.
As a result, RoCE applications might receive addresses which are
unaligned to pages [when PAGE_SIZE > 4KB], which is a security risk.
Fixes: 51ff17251c9c ("qed: Add support for RoCE hw init") Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Tue, 9 May 2017 12:07:49 +0000 (15:07 +0300)]
qed: Tell QM the number of tasks
Driver doesn't pass the number of tasks to the QM init logic
which would cause back-pressure in scenarios requiring many tasks
[E.g., using max MRs] and thus reduced performance.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When (re|un)loading, Tx-queues belonging to XDP would not get freed.
Fixes: cb6aeb079294 ("qede: Add support for XDP_TX") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_core: Reduce harmless SRIOV error message to debug level
Under SRIOV resource management, extra counters are allocated to VFs
from a free pool. If that pool is empty, the ALLOC_RES command for
a counter resource fails -- and this generates a misleading error
message in the message log.
Under SRIOV, each VF is allocated (i.e., guaranteed) 2 counters --
one counter per port. For ETH ports, the RoCE driver requests an
additional counter (above the guaranteed counters). If that request
fails, the VF RoCE driver simply uses the default (i.e., guaranteed)
counter for that port.
Thus, failing to allocate an additional counter does not constitute
a problem, and the error message on the PF when this occurs should
be reduced to debug level.
Finally, to identify the situation that the reason for the failure is
that no resources are available to grant to the VF, we modified the
error returned by mlx4_grant_resource to -EDQUOT (Quota exceeded),
which more accurately describes the error.
Fixes: c3abb51bdb0e ("IB/mlx4: Add RoCE/IB dedicated counters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kamal Heib [Tue, 9 May 2017 11:45:22 +0000 (14:45 +0300)]
net/mlx4_en: Change the error print to debug print
The error print within mlx4_en_calc_rx_buf() should be a debug print.
Fixes: 51151a16a60f ('mlx4: allow order-0 memory allocations in RX path') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Mason [Mon, 8 May 2017 21:48:35 +0000 (17:48 -0400)]
net: mdio-mux: bcm-iproc: call mdiobus_free() in error path
If an error is encountered in mdio_mux_init(), the error path will call
mdiobus_free(). Since mdiobus_register() has been called prior to
mdio_mux_init(), the bus->state will not be MDIOBUS_UNREGISTERED. This
causes a BUG_ON() in mdiobus_free(). To correct this issue, add an
error path for mdio_mux_init() which calls mdiobus_unregister() prior to
mdiobus_free().
Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: 98bc865a1ec8 ("net: mdio-mux: Add MDIO mux driver for iProc SoCs") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: ti: cpsw: adjust cpsw fifos depth for fullduplex flow control
When users set flow control using ethtool the bits are set properly in the
CPGMAC_SL MACCONTROL register, but the FIFO depth in the respective Port n
Maximum FIFO Blocks (Pn_MAX_BLKS) registers remains set to the minimum size
reset value. When receive flow control is enabled on a port, the port's
associated FIFO block allocation must be adjusted. The port RX allocation
must increase to accommodate the flow control runout. The TRM recommends
numbers of 5 or 6.
Hence, apply required Port FIFO configuration to
Pn_MAX_BLKS.Pn_TX_MAX_BLKS=0xF and Pn_MAX_BLKS.Pn_RX_MAX_BLKS=0x5 during
interface initialization.
Cc: Schuyler Patton <spatton@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 8 May 2017 17:12:13 +0000 (10:12 -0700)]
ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
For each netns (except init_net), we initialize its null entry
in 3 places:
1) The template itself, as we use kmemdup()
2) Code around dst_init_metrics() in ip6_route_net_init()
3) ip6_route_dev_notify(), which is supposed to initialize it after
loopback registers
Unfortunately the last one still happens in a wrong order because
we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to
net->loopback_dev's idev, thus we have to do that after we add
idev to loopback. However, this notifier has priority == 0 same as
ipv6_dev_notf, and ipv6_dev_notf is registered after
ip6_route_dev_notifier so it is called actually after
ip6_route_dev_notifier. This is similar to commit 2f460933f58e
("ipv6: initialize route null entry in addrconf_init()") which
fixes init_net.
Fix it by picking a smaller priority for ip6_route_dev_notifier.
Also, we have to release the refcnt accordingly when unregistering
loopback_dev because device exit functions are called before subsys
exit functions.
Acked-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 8 May 2017 20:02:23 +0000 (16:02 -0400)]
Merge tag 'mac80211-for-davem-2017-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A couple more fixes:
* don't try to authenticate during reconfiguration, which causes
drivers to get confused
* fix a kernel-doc warning for a recently merged change
* fix MU-MIMO group configuration (relevant only for monitor mode)
* more rate flags fix: remove stray RX_ENC_FLAG_40MHZ
* fix IBSS probe response allocation size
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jim Baxter [Mon, 8 May 2017 12:49:57 +0000 (13:49 +0100)]
net: cdc_ncm: Fix TX zero padding
The zero padding that is added to NTB's does
not zero the memory correctly.
This is because the skb_put modifies the value
of skb_out->len which results in the memset
command not setting any memory to zero as
(ctx->tx_max - skb_out->len) == 0.
I have resolved this by storing the size of
the memory to be zeroed before the skb_put
and using this in the memset call.
Signed-off-by: Jim Baxter <jim_baxter@mentor.com> Reviewed-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
stmmaceth 0000:00:14.6 eth0: Link is Up - 100Mbps/Full - flow control off
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: stmmac_xmit+0xf1/0x1080
Fix this by adding necessary settings.
P.S. I split fix to three patches according to what each of them adds.
====================
Tested-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 8 May 2017 14:14:22 +0000 (17:14 +0300)]
stmmac: pci: split out common_default_data() helper
New helper is added in order to prevent misconfiguration happened
for one of the platforms when configuration data is expanded.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
("net: stmmac: set default number of rx and tx queues in stmmac_pci")
missed Intel Quark configuration. Append it here.
Fixes: 26d6851fd24e ("net: stmmac: set default number of rx and tx queues in stmmac_pci") Cc: Joao Pinto <Joao.Pinto@synopsys.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Joao Pinto <jpinto@synopsys.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sun, 7 May 2017 22:04:09 +0000 (00:04 +0200)]
bpf: don't let ldimm64 leak map addresses on unprivileged
The patch fixes two things at once:
1) It checks the env->allow_ptr_leaks and only prints the map address to
the log if we have the privileges to do so, otherwise it just dumps 0
as we would when kptr_restrict is enabled on %pK. Given the latter is
off by default and not every distro sets it, I don't want to rely on
this, hence the 0 by default for unprivileged.
2) Printing of ldimm64 in the verifier log is currently broken in that
we don't print the full immediate, but only the 32 bit part of the
first insn part for ldimm64. Thus, fix this up as well; it's okay to
access, since we verified all ldimm64 earlier already (including just
constants) through replace_map_fd_with_map_ptr().
Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Fixes: cbd357008604 ("bpf: verifier (add ability to receive verification log)") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Sat, 6 May 2017 08:55:06 +0000 (14:25 +0530)]
cxgb4: avoid disabling FEC by default
Recent Chelsio firmware started using few port capablity bits to
manage FEC and as driver was not aware of FEC changes those bits
were zeroed, consequently disabling FEC.
Avoid zeroing those bits and default to whatever the firmware
tells us the Link is currently advertising.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Sat, 6 May 2017 00:49:01 +0000 (03:49 +0300)]
bnxt_en: allocate enough space for ->ntp_fltr_bmap
We have the number of longs, but we need to calculate the number of
bytes required.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Fri, 5 May 2017 22:34:34 +0000 (15:34 -0700)]
qlge: Avoid reading past end of buffer
Using memcpy() from a string that is shorter than the length copied means
the destination buffer is being filled with arbitrary data from the kernel
rodata segment. Instead, use strncpy() which will fill the trailing bytes
with zeros.
This was found with the future CONFIG_FORTIFY_SOURCE feature.
Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Fri, 5 May 2017 22:30:23 +0000 (15:30 -0700)]
bna: ethtool: Avoid reading past end of buffer
Using memcpy() from a string that is shorter than the length copied means
the destination buffer is being filled with arbitrary data from the kernel
rodata segment. Instead, use strncpy() which will fill the trailing bytes
with zeros.
This was found with the future CONFIG_FORTIFY_SOURCE feature.
Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Fri, 5 May 2017 22:25:32 +0000 (15:25 -0700)]
bna: Avoid reading past end of buffer
Using memcpy() from a string that is shorter than the length copied means
the destination buffer is being filled with arbitrary data from the kernel
rodata segment. Instead, use strncpy() which will fill the trailing bytes
with zeros.
This was found with the future CONFIG_FORTIFY_SOURCE feature.
Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Fri, 5 May 2017 20:17:41 +0000 (16:17 -0400)]
vlan: Keep NETIF_F_HW_CSUM similar to other software devices
Vlan devices, like all other software devices, enable
NETIF_F_HW_CSUM feature. However, unlike all the othe other
software devices, vlans will switch to using IP|IPV6_CSUM
features, if the underlying devices uses them. In these situations,
checksum offload features on the vlan device can't be controlled
via ethtool.
This patch makes vlans keep HW_CSUM feature if the underlying
device supports checksum offloading. This makes vlan devices
behave like other software devices, and restores control to the
user.
A side-effect is that some offload settings (typically UFO)
may be enabled on the vlan device while being disabled on the HW.
However, the GSO code will correctly process the packets. This
actually results in slightly better raw throughput.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Fri, 5 May 2017 19:53:23 +0000 (12:53 -0700)]
tcp: make congestion control optionally skip slow start after idle
Congestion control modules that want full control over congestion
control behavior do not want the cwnd modifications controlled by
the sysctl_tcp_slow_start_after_idle code path.
So skip those code paths for CC modules that use the cong_control()
API.
As an example, those cwnd effects are not desired for the BBR congestion
control algorithm.
Fixes: c0402760f565 ("tcp: new CC hook to set sending rate with rate_sample in any CA state") Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Thu, 4 May 2017 21:54:17 +0000 (14:54 -0700)]
ipv4: restore rt->fi for reference counting
IPv4 dst could use fi->fib_metrics to store metrics but fib_info
itself is refcnt'ed, so without taking a refcnt fi and
fi->fib_metrics could be freed while dst metrics still points to
it. This triggers use-after-free as reported by Andrey twice.
This patch reverts commit 2860583fe840 ("ipv4: Kill rt->fi") to
restore this reference counting. It is a quick fix for -net and
-stable, for -net-next, as Eric suggested, we can consider doing
reference counting for metrics itself instead of relying on fib_info.
IPv6 is very different, it copies or steals the metrics from mx6_config
in fib6_commit_metrics() so probably doesn't need a refcnt.
Decnet has already done the refcnt'ing, see dn_fib_semantic_match().
Fixes: 2860583fe840 ("ipv4: Kill rt->fi") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Belous [Thu, 4 May 2017 20:10:56 +0000 (23:10 +0300)]
aquantia: Fix "ethtool -S" crash when adapter down.
This patch fixes the crash that happens when driver tries to collect statistics
from already released "aq_vec" object.
If adapter is in "down" state we still allow user to see statistics from HW.
V2: fixed braces around "aq_vec_free".
Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code") Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Tested-by: David Arcari <darcari@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Tue, 2 May 2017 07:33:40 +0000 (09:33 +0200)]
cfg80211: fix multi scheduled scan kernel-doc
Replace @results_wk with @report_results, which was missed
in an earlier patch between revisions thereof.
Fixes: b34939b98369 ("cfg80211: add request id to cfg80211_sched_scan_*() api") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 27 Apr 2017 11:19:04 +0000 (13:19 +0200)]
mac80211: fix IBSS presp allocation size
When VHT IBSS support was added, the size of the extra elements
wasn't considered in ieee80211_ibss_build_presp(), which makes
it possible that it would overrun the allocated buffer. Fix it
by allocating the necessary space.
Fixes: abcff6ef01f9 ("mac80211: add VHT support for IBSS") Reported-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 27 Apr 2017 07:13:38 +0000 (09:13 +0200)]
nl80211: correctly validate MU-MIMO groups
Since groups 0 and 63 are invalid, we should check for those bits.
Note that the 802.11 spec specifies the *bit* order, but the CPU
doesn't care about bit order since it can't address bits, so it's
always treating BIT(0) as the lowest bit within a byte.
Reported-by: Jan Fuchs <jan.fuchs@lancom.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Luca Coelho [Tue, 2 May 2017 14:56:21 +0000 (17:56 +0300)]
mac80211: bail out from prep_connection() if a reconfig is ongoing
If ieee80211_hw_restart() is called during authentication, the
authentication process will continue, causing the driver to be called
in a wrong state. This ultimately causes an oops in the iwlwifi
driver (at least).
This fixes bugzilla 195299 partly.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195299 Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 5 May 2017 09:53:19 +0000 (11:53 +0200)]
mac80211: properly remove RX_ENC_FLAG_40MHZ
Somehow I missed this in my RX rate cleanup series, causing some
drivers to not report correct bandwidth since this flag isn't
used by mac80211 anymore. Fix this, and make hwsim also report
higher bandwidths appropriately.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: f3297f68 ("net: alx: switch to pci_alloc_irq_vectors") Signed-off-by: Rakesh Pandit <rakesh@tuxera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 5 May 2017 13:56:54 +0000 (06:56 -0700)]
tcp: randomize timestamps on syncookies
Whole point of randomization was to hide server uptime, but an attacker
can simply start a syn flood and TCP generates 'old style' timestamps,
directly revealing server jiffies value.
Also, TSval sent by the server to a particular remote address vary
depending on syncookies being sent or not, potentially triggering PAWS
drops for innocent clients.
Lets implement proper randomization, including for SYNcookies.
Also we do not need to export sysctl_tcp_timestamps, since it is not
used from a module.
In v2, I added Florian feedback and contribution, adding tsoff to
tcp_get_cookie_sock().
v3 removed one unused variable in tcp_v4_connect() as Florian spotted.
Fixes: 95a22caee396c ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tobias Klauser [Fri, 5 May 2017 14:36:53 +0000 (16:36 +0200)]
bridge: netlink: account for IFLA_BRPORT_{B, M}CAST_FLOOD size and policy
The attribute sizes for IFLA_BRPORT_MCAST_FLOOD and
IFLA_BRPORT_BCAST_FLOOD weren't accounted for in br_port_info_size()
when they were added. Do so now and also add the corresponding policy
entries:
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: Mike Manning <mmanning@brocade.com> Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag") Fixes: 99f906e9ad7b ("bridge: add per-port broadcast flood flag") Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 5 May 2017 02:07:10 +0000 (19:07 -0700)]
Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of new char/misc driver drivers and features for
4.12-rc1.
There's lots of new drivers added this time around, new firmware
drivers from Google, more auxdisplay drivers, extcon drivers, fpga
drivers, and a bunch of other driver updates. Nothing major, except if
you happen to have the hardware for these drivers, and then you will
be happy :)
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits)
firmware: google memconsole: Fix return value check in platform_memconsole_init()
firmware: Google VPD: Fix return value check in vpd_platform_init()
goldfish_pipe: fix build warning about using too much stack.
goldfish_pipe: An implementation of more parallel pipe
fpga fr br: update supported version numbers
fpga: region: release FPGA region reference in error path
fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe()
mei: drop the TODO from samples
firmware: Google VPD sysfs driver
firmware: Google VPD: import lib_vpd source files
misc: lkdtm: Add volatile to intentional NULL pointer reference
eeprom: idt_89hpesx: Add OF device ID table
misc: ds1682: Add OF device ID table
misc: tsl2550: Add OF device ID table
w1: Remove unneeded use of assert() and remove w1_log.h
w1: Use kernel common min() implementation
uio_mf624: Align memory regions to page size and set correct offsets
uio_mf624: Refactor memory info initialization
uio: Allow handling of non page-aligned memory regions
hangcheck-timer: Fix typo in comment
...
Linus Torvalds [Fri, 5 May 2017 01:27:46 +0000 (18:27 -0700)]
Merge tag 'driver-core-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Very tiny pull request for 4.12-rc1 for the driver core this time
around.
There are some documentation fixes, an eventpoll.h fixup to make it
easier for the libc developers to take our header files directly, and
some very minor driver core fixes and changes.
All have been in linux-next for a very long time with no reported
issues"
* tag 'driver-core-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Revert "kref: double kref_put() in my_data_handler()"
driver core: don't initialize 'parent' in device_add()
drivers: base: dma-mapping: use nth_page helper
Documentation/ABI: add information about cpu_capacity
debugfs: set no_llseek in DEFINE_DEBUGFS_ATTRIBUTE
eventpoll.h: add missing epoll event masks
eventpoll.h: fix epoll event masks
Linus Torvalds [Fri, 5 May 2017 01:03:51 +0000 (18:03 -0700)]
Merge tag 'usb-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB updates from Greg KH:
"Here is the big USB patchset for 4.12-rc1.
Lots of good stuff here, after many many many attempts, the kernel
finally has a working typeC interface, many thanks to Heikki and
Guenter and others who have taken the time to get this merged. It
wasn't an easy path for them at all.
There's also a staging driver that uses this new api, which is why
it's coming in through this tree.
Along with that, there's the usual huge number of changes for gadget
drivers, xhci, and other stuff. Johan also finally refactored pretty
much every driver that was looking at USB endpoints to do it in a
common way, which will help prevent any "badly-formed" devices from
causing problems in drivers. That too wasn't a simple task.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (263 commits)
staging: typec: Fairchild FUSB302 Type-c chip driver
staging: typec: Type-C Port Controller Interface driver (tcpci)
staging: typec: USB Type-C Port Manager (tcpm)
usb: host: xhci: remove #ifdef around PM functions
usb: musb: don't mark of_dev_auxdata as initdata
usb: misc: legousbtower: Fix buffers on stack
USB: Revert "cdc-wdm: fix "out-of-sync" due to missing notifications"
usb: Make sure usb/phy/of gets built-in
USB: storage: e-mail update in drivers/usb/storage/unusual_devs.h
usb: host: xhci: print correct command ring address
usb: host: xhci: delete sp_dma_buffers for scratchpad
usb: host: xhci: using correct specification chapter reference for DCBAAP
xhci: switch to pci_alloc_irq_vectors
usb: host: xhci-plat: set resume_quirk() for R-Car controllers
usb: host: xhci-plat: add resume_quirk()
usb: host: xhci-plat: enable clk in resume timing
usb: host: plat: Enable xHCI plat runtime PM
USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit
USB: serial: constify static arrays
usb: fix some references for /proc/bus/usb
...
2) When a RAW socket is in hdrincl mode, we need to make sure that the
user provided at least a minimally sized ipv4/ipv6 header. Fix from
Alexander Potapenko.
3) We must emit IFLA_PHYS_PORT_NAME netlink attributes using
nla_put_string() so that it is NULL terminated.
4) Fix a bug in TCP fastopen handling, wherein child sockets
erroneously inherit the fastopen_req from the parent, and later can
end up derefencing freed memory or doing a double free. From Eric
Dumazet.
5) Don't clear out netdev stats at close time in tg3 driver, from
YueHaibing.
6) Fix refcount leak in xt_CT, from Gao Feng.
7) In nft_set_bitmap() don't leak dummy elements, from Liping Zhang.
8) Fix deadlock due to taking the expectation lock twice, also from
Liping Zhang.
9) Make xt_socket work again with ipv6, from Peter Tirsek.
10) Don't allow IPV6 to be used with IPVS if ipv6.disable=1, from Paolo
Abeni.
11) Make the BPF loader more flexible wrt. changes to the bpf MAP entry
layout. From Jesper Dangaard Brouer.
12) Fix ethtool reported device name in aquantia driver, from Pavel
Belous.
13) Fix build failures due to the compile time size test not working in
netfilter conntrack. From Geert Uytterhoeven.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
cfg80211: make RATE_INFO_BW_20 the default
ipv6: initialize route null entry in addrconf_init()
qede: Fix possible misconfiguration of advertised autoneg value.
qed: Fix overriding of supported autoneg value.
qed*: Fix possible overflow for status block id field.
rtnetlink: NUL-terminate IFLA_PHYS_PORT_NAME string
netvsc: make sure napi enabled before vmbus_open
aquantia: Fix driver name reported by ethtool
ipv4, ipv6: ensure raw socket message is big enough to hold an IP header
net/sched: remove redundant null check on head
tcp: do not inherit fastopen_req from parent
forcedeth: remove unnecessary carrier status check
ibmvnic: Move queue restarting in ibmvnic_tx_complete
ibmvnic: Record SKB RX queue during poll
ibmvnic: Continue skb processing after skb completion error
ibmvnic: Check for driver reset first in ibmvnic_xmit
ibmvnic: Wait for any pending scrqs entries at driver close
ibmvnic: Clean up tx pools when closing
ibmvnic: Whitespace correction in release_rx_pools
ibmvnic: Delete napi's when releasing driver resources
...
Linus Torvalds [Thu, 4 May 2017 19:19:44 +0000 (12:19 -0700)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This update includes the usual round of major driver updates
(hisi_sas, ufs, fnic, cxlflash, be2iscsi, ipr, stex). There's also the
usual amount of cosmetic and spelling stuff"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (155 commits)
scsi: qla4xxx: fix spelling mistake: "Tempalate" -> "Template"
scsi: stex: make S6flag static
scsi: mac_esp: fix to pass correct device identity to free_irq()
scsi: aacraid: pci_alloc_consistent() failures on ARM64
scsi: ufs: make ufshcd_get_lists_status() register operation obvious
scsi: ufs: use MASK_EE_STATUS
scsi: mac_esp: Replace bogus memory barrier with spinlock
scsi: fcoe: make fcoe_e_d_tov and fcoe_r_a_tov static
scsi: sd_zbc: Do not write lock zones for reset
scsi: sd_zbc: Remove superfluous assignments
scsi: sd: sd_zbc: Rename sd_zbc_setup_write_cmnd
scsi: Improve scsi_get_sense_info_fld
scsi: sd: Cleanup sd_done sense data handling
scsi: sd: Improve sd_completed_bytes
scsi: sd: Fix function descriptions
scsi: mpt3sas: remove redundant wmb
scsi: mpt: Move scsi_remove_host() out of mptscsih_remove_host()
scsi: sg: reset 'res_in_use' after unlinking reserved array
scsi: mvumi: remove code handling zero scsi_sg_count(scmd) case
scsi: fusion: fix spelling mistake: "Persistancy" -> "Persistency"
...
Linus Torvalds [Thu, 4 May 2017 19:05:32 +0000 (12:05 -0700)]
Merge tag 'gpio-v4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.12 kernel cycle.
Core changes:
- Return NULL from gpiod_get_optional() when GPIOLIB is disabled.
This was a much discussed change. It affects use cases where people
write drivers that might or might not be using GPIO resources. I
have decided that this is the lesser evil right now.
- Make gpiod_count() behave consistently across different hardware
descriptions.
- Fix the syntax around open drain/open source to not infer active
high/low semantics.
New drivers:
- A new single-register fixed-direction framework driver for hardware
that have lines controlled by a single register that just work in
one direction (out or in), including IRQ support.
- Support the Fintek F71889A GPIO SuperIO controller.
- Support the National NI 169445 MMIO GPIO.
- Support for the X-Gene derivative of the DWC GPIO controller
- Support for the Rohm BD9571MWV-M PMIC GPIO controller.
- Refactor the Gemini GPIO driver to a generic Faraday FTGPIO driver
and replace both the Gemini and the Moxa ART custom drivers with
this driver.
Driver improvements:
- A whole slew of drivers have their spinlocks chaned to raw
spinlocks as they provide irqchips, and thus we are progressing on
realtime compliance.
- Use devm_irq_alloc_descs() in a slew of drivers, getting managed
resources.
- Support for the embedded PWM controller inside the MVEBU driver.
- Debounce, open source and open drain support for the Aspeed driver.
- Misc smaller fixes like spelling and syntax and whatnot"
* tag 'gpio-v4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (77 commits)
gpio: f7188x: Add a missing break
gpio: omap: return error if requested debounce time is not possible
gpio: Add ROHM BD9571MWV-M PMIC GPIO driver
gpio: gpio-wcove: fix GPIO IRQ status mask
gpio: DT bindings, move tca9554 from pcf857x to pca953x
gpio: move tca9554 from pcf857x to pca953x
gpio: arizona: Correct check whether the pin is an input
gpio: Add XRA1403 DTS binding documentation
dt-bindings: add exar to vendor prefixes list
gpio: gpio-wcove: fix irq pending status bit width
gpio: dwapb: use dwapb_read instead of readl_relaxed
gpio: aspeed: Add open-source and open-drain support
gpio: aspeed: Add debounce support
gpio: aspeed: dt: Add optional clocks property
gpio: aspeed: dt: Fix description alignment in bindings document
gpio: mvebu: Add limited PWM support
gpio: Use unsigned int for interrupt numbers
gpio: f7188x: Add F71889A GPIO support.
gpio: core: Decouple open drain/source flag with active low/high
gpio: arizona: Correct handling for reading input GPIOs
...
Linus Torvalds [Thu, 4 May 2017 18:56:59 +0000 (11:56 -0700)]
Merge tag 'platform-drivers-x86-v4.12-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform-drivers update from Darren Hart:
"This represents a significantly larger and more complex set of changes
than those of prior merge windows.
In particular, we had several changes with dependencies on other
subsystems which we felt were best managed through merges of immutable
branches, including one each from input, i2c, and leds. Two patches
for the watchdog subsystem are included after discussion with Wim and
Guenter following a collision in linux-next (this should be resolved
and you should only see these two appear in this pull request). These
are called out in the "External" section below.
Summary of changes:
- significant further cleanup of fujitsu-laptop and hp-wmi
- new model support for ideapad, asus, silead, and xiaomi
- new hotkeys for thinkpad and models using intel-vbtn
- dell keyboard backlight improvements
- build and dependency improvements
- intel * ipc fixes, cleanups, and api updates
- single isolated fixes noted below
platform/x86:
- Add Intel Cherry Trail ACPI INT33FE device driver
- remove sparse_keymap_free() calls
- Make SILEAD_DMI depend on TOUCHSCREEN_SILEAD
asus-wmi:
- try to set als by default
- fix cpufv sysfs file permission
acer-wmi:
- setup accelerometer when ACPI device was found
ideapad-laptop:
- Add IdeaPad V310-15ISK to no_hw_rfkill
- Add IdeaPad 310-15IKB to no_hw_rfkill
intel_pmc_ipc:
- use gcr mem base for S0ix counter read
- Fix iTCO_wdt GCS memory mapping failure
- Add pmc gcr read/write/update api's
- fix gcr offset
dell-laptop:
- Add keyboard backlight timeout AC settings
- Handle return error form dell_get_intensity.
- Protect kbd_state against races
- Refactor kbd_led_triggers_store()
hp-wireless:
- reuse module_acpi_driver
- add Xiaomi's hardware id to the supported list
intel-vbtn:
- add volume up and down
INT33FE:
- add i2c dependency
hp-wmi:
- Cleanup exit paths
- Do not shadow errors in sysfs show functions
- Use DEVICE_ATTR_(RO|RW) helper macros
- Refactor dock and tablet state fetchers
- Cleanup wireless get_(hw|sw)state functions
- Refactor redundant HPWMI_READ functions
- Standardize enum usage for constants
- Cleanup local variable declarations
- Do not shadow error values
- Fix detection for dock and tablet mode
- Fix error value for hp_wmi_tablet_state
fujitsu-laptop:
- simplify error handling in acpi_fujitsu_laptop_add()
- do not log LED registration failures
- switch to managed LED class devices
- reorganize LED-related code
- refactor LED registration
- select LEDS_CLASS
- remove redundant fields from struct fujitsu_bl
- account for backlight power when determining brightness
- do not log set_lcd_level() failures in bl_update_status()
- ignore errors when setting backlight power
- make disable_brightness_adjust a boolean
- clean up use_alt_lcd_levels handling
- sync brightness in set_lcd_level()
- simplify set_lcd_level()
- merge set_lcd_level_alt() into set_lcd_level()
- switch to a managed backlight device
- only handle backlight when appropriate
- update debug message logged by call_fext_func()
- rename call_fext_func() arguments
- simplify call_fext_func()
- clean up local variables in call_fext_func()
- remove keycode fields from struct fujitsu_bl
- model-dependent sparse keymap overrides
- use a sparse keymap for hotkey event generation
- switch to a managed hotkey input device
- refactor hotkey input device setup
- use a sparse keymap for brightness key events
- switch to a managed backlight input device
- refactor backlight input device setup
- remove pf_device field from struct fujitsu_bl
- only register platform device if FUJ02E3 is present
- add and remove platform device in separate functions
- simplify platform device attribute definitions
- remove backlight-related attributes from the platform device
- cleanup error labels in fujitsu_init()
- only register backlight device if FUJ02B1 is present
- sync backlight power status in acpi_fujitsu_laptop_add()
- register backlight device in a separate function
- simplify brightness key event generation logic
- decrease indentation in acpi_fujitsu_bl_notify()
intel-hid:
- Add missing ->thaw callback
- do not set parents of input devices explicitly
- remove redundant set_bit() call
- use devm_input_allocate_device() for HID events input device
- make intel_hid_set_enable() take a boolean argument
- simplify enabling/disabling HID events
silead_dmi:
- Add touchscreen info for Surftab Wintron 7.0
- Abort early if DMI does not match
- Do not treat all devices as i2c_clients
- Add entry for Insyde 7W tablets
- Constify properties arrays
thinkpad_acpi:
- add mapping for new hotkeys
- guard generic hotkey case"
* tag 'platform-drivers-x86-v4.12-1' of git://git.infradead.org/linux-platform-drivers-x86: (108 commits)
platform/x86: Make SILEAD_DMI depend on TOUCHSCREEN_SILEAD
platform/x86: asus-wmi: try to set als by default
platform/x86: asus-wmi: fix cpufv sysfs file permission
platform/x86: acer-wmi: setup accelerometer when ACPI device was found
platform/x86: ideapad-laptop: Add IdeaPad V310-15ISK to no_hw_rfkill
platform/x86: intel_pmc_ipc: use gcr mem base for S0ix counter read
platform/x86: intel_pmc_ipc: Fix iTCO_wdt GCS memory mapping failure
watchdog: iTCO_wdt: Add PMC specific noreboot update api
watchdog: iTCO_wdt: cleanup set/unset no_reboot_bit functions
platform/x86: intel_pmc_ipc: Add pmc gcr read/write/update api's
platform/x86: intel_pmc_ipc: fix gcr offset
platform/x86: dell-laptop: Add keyboard backlight timeout AC settings
platform/x86: dell-laptop: Handle return error form dell_get_intensity.
platform/x86: hp-wireless: reuse module_acpi_driver
platform/x86: intel-vbtn: add volume up and down
platform/x86: INT33FE: add i2c dependency
platform/x86: hp-wmi: Cleanup exit paths
platform/x86: hp-wmi: Do not shadow errors in sysfs show functions
platform/x86: hp-wmi: Use DEVICE_ATTR_(RO|RW) helper macros
platform/x86: hp-wmi: Refactor dock and tablet state fetchers
...
Linus Torvalds [Thu, 4 May 2017 18:37:09 +0000 (11:37 -0700)]
Merge tag 'for-linus-4.12b-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
"Xen fixes and featrues for 4.12. The main changes are:
- enable building the kernel with Xen support but without enabling
paravirtualized mode (Vitaly Kuznetsov)
- add a new 9pfs xen frontend driver (Stefano Stabellini)
- simplify Xen's cpuid handling by making use of cpu capabilities
(Juergen Gross)
- add/modify some headers for new Xen paravirtualized devices
(Oleksandr Andrushchenko)
- EFI reset_system support under Xen (Julien Grall)
- and the usual cleanups and corrections"
* tag 'for-linus-4.12b-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (57 commits)
xen: Move xen_have_vector_callback definition to enlighten.c
xen: Implement EFI reset_system callback
arm/xen: Consolidate calls to shutdown hypercall in a single helper
xen: Export xen_reboot
xen/x86: Call xen_smp_intr_init_pv() on BSP
xen: Revert commits da72ff5bfcb0 and 72a9b186292d
xen/pvh: Do not fill kernel's e820 map in init_pvh_bootparams()
xen/scsifront: use offset_in_page() macro
xen/arm,arm64: rename __generic_dma_ops to xen_get_dma_ops
xen/arm,arm64: fix xen_dma_ops after 815dd18 "Consolidate get_dma_ops..."
xen/9pfs: select CONFIG_XEN_XENBUS_FRONTEND
x86/cpu: remove hypervisor specific set_cpu_features
vmware: set cpu capabilities during platform initialization
x86/xen: use capabilities instead of fake cpuid values for xsave
x86/xen: use capabilities instead of fake cpuid values for x2apic
x86/xen: use capabilities instead of fake cpuid values for mwait
x86/xen: use capabilities instead of fake cpuid values for acpi
x86/xen: use capabilities instead of fake cpuid values for acc
x86/xen: use capabilities instead of fake cpuid values for mtrr
x86/xen: use capabilities instead of fake cpuid values for aperf
...
Johannes Berg [Thu, 4 May 2017 06:42:30 +0000 (08:42 +0200)]
cfg80211: make RATE_INFO_BW_20 the default
Due to the way I did the RX bitrate conversions in mac80211 with
spatch, going setting flags to setting the value, many drivers now
don't set the bandwidth value for 20 MHz, since with the flags it
wasn't necessary to (there was no 20 MHz flag, only the others.)
Rather than go through and try to fix up all the drivers, instead
renumber the enum so that 20 MHz, which is the typical bandwidth,
actually has the value 0, making those drivers all work again.
If VHT was hit used with a driver not reporting it, e.g. iwlmvm,
this manifested in hitting the bandwidth warning in
cfg80211_calculate_bitrate_vht().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Thu, 4 May 2017 05:07:31 +0000 (22:07 -0700)]
ipv6: initialize route null entry in addrconf_init()
Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
since it is always NULL.
This is clearly wrong, we have code to initialize it to loopback_dev,
unfortunately the order is still not correct.
loopback_dev is registered very early during boot, we lose a chance
to re-initialize it in notifier. addrconf_init() is called after
ip6_route_init(), which means we have no chance to correct it.
Fix it by moving this initialization explicitly after
ipv6_add_dev(init_net.loopback_dev) in addrconf_init().
Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qede: Fix possible misconfiguration of advertised autoneg value.
Fail the configuration of advertised speed-autoneg value if the config
update is not supported.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Driver currently uses advertised-autoneg value to populate the
supported-autoneg field. When advertised field is updated, user gets
the same value for supported field. Supported-autoneg value need to be
populated from the link capabilities value returned by the MFW.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qed*: Fix possible overflow for status block id field.
Value for status block id could be more than 256 in 100G mode, need to
update its data type from u8 to u16.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0.
Otherwise libnl3 fails to validate netlink messages with this attribute.
"ip -detail a" assumes too that the attribute is NUL-terminated when
printing it. It often was, due to padding.
I noticed this as libvirtd failing to start on a system with sfc driver
after upgrading it to Linux 4.11, i.e. when sfc added support for
phys_port_name.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv4, ipv6: ensure raw socket message is big enough to hold an IP header
raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied
from the userspace contains the IPv4/IPv6 header, so if too few bytes are
copied, parts of the header may remain uninitialized.
Eric Dumazet [Wed, 3 May 2017 13:39:31 +0000 (06:39 -0700)]
tcp: do not inherit fastopen_req from parent
Under fuzzer stress, it is possible that a child gets a non NULL
fastopen_req pointer from its parent at accept() time, when/if parent
morphs from listener to active session.
We need to make sure this can not happen, by clearing the field after
socket cloning.
Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets") Fixes: 7db92362d2fe ("tcp: fix potential double free issue for fastopen_req") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 4 May 2017 02:12:27 +0000 (19:12 -0700)]
Merge tag 'modules-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull modules updates from Jessica Yu:
- Minor code cleanups
- Fix section alignment for .init_array
* tag 'modules-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
kallsyms: Use bounded strnchr() when parsing string
module: Unify the return value type of try_module_get
module: set .init_array alignment to 8
Linus Torvalds [Thu, 4 May 2017 01:41:21 +0000 (18:41 -0700)]
Merge tag 'trace-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"New features for this release:
- Pretty much a full rewrite of the processing of function plugins.
i.e. echo do_IRQ:stacktrace > set_ftrace_filter
- The rewrite was needed to add plugins to be unique to tracing
instances. i.e. mkdir instance/foo; cd instances/foo; echo
do_IRQ:stacktrace > set_ftrace_filter The old way was written very
hacky. This removes a lot of those hacks.
- New "function-fork" tracing option. When set, pids in the
set_ftrace_pid will have their children added when the processes
with their pids listed in the set_ftrace_pid file forks.
- Exposure of "maxactive" for kretprobe in kprobe_events
- Allow for builtin init functions to be traced by the function
tracer (via the kernel command line). Module init function tracing
will come in the next release.
- Added more selftests, and have selftests also test in an instance"
* tag 'trace-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (60 commits)
ring-buffer: Return reader page back into existing ring buffer
selftests: ftrace: Allow some event trigger tests to run in an instance
selftests: ftrace: Have some basic tests run in a tracing instance too
selftests: ftrace: Have event tests also run in an tracing instance
selftests: ftrace: Make func_event_triggers and func_traceonoff_triggers tests do instances
selftests: ftrace: Allow some tests to be run in a tracing instance
tracing/ftrace: Allow for instances to trigger their own stacktrace probes
tracing/ftrace: Allow for the traceonoff probe be unique to instances
tracing/ftrace: Enable snapshot function trigger to work with instances
tracing/ftrace: Allow instances to have their own function probes
tracing/ftrace: Add a better way to pass data via the probe functions
ftrace: Dynamically create the probe ftrace_ops for the trace_array
tracing: Pass the trace_array into ftrace_probe_ops functions
tracing: Have the trace_array hold the list of registered func probes
ftrace: If the hash for a probe fails to update then free what was initialized
ftrace: Have the function probes call their own function
ftrace: Have each function probe use its own ftrace_ops
ftrace: Have unregister_ftrace_function_probe_func() return a value
ftrace: Add helper function ftrace_hash_move_and_update_ops()
ftrace: Remove data field from ftrace_func_probe structure
...
Linus Torvalds [Thu, 4 May 2017 01:29:28 +0000 (18:29 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk
Pull printk updates from Petr Mladek:
- There is a situation when early console is not deregistered because
the preferred one matches a wrong entry. It caused messages to appear
twice.
This is the 2nd attempt to fix it. The first one was wrong, see the
commit c6c7d83b9c9e ('Revert "console: don't prefer first registered
if DT specifies stdout-path"').
The fix is coupled with some small code clean up. Well, the console
registration code would deserve a big one. We need to think about it.
- Do not lose information about the preemtive context when the console
semaphore is re-taken.
- Do not block CPU hotplug when someone else is already pushing
messages to the console.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
printk: fix double printing with earlycon
printk: rename selected_console -> preferred_console
printk: fix name/type/scope of preferred_console var
printk: Correctly handle preemption in console_unlock()
printk: use console_trylock() in console_cpu_notify()
Linus Torvalds [Thu, 4 May 2017 00:55:59 +0000 (17:55 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
- a few misc things
- most of MM
- KASAN updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (102 commits)
kasan: separate report parts by empty lines
kasan: improve double-free report format
kasan: print page description after stacks
kasan: improve slab object description
kasan: change report header
kasan: simplify address description logic
kasan: change allocation and freeing stack traces headers
kasan: unify report headers
kasan: introduce helper functions for determining bug type
mm: hwpoison: call shake_page() after try_to_unmap() for mlocked page
mm: hwpoison: call shake_page() unconditionally
mm/swapfile.c: fix swap space leak in error path of swap_free_entries()
mm/gup.c: fix access_ok() argument type
mm/truncate: avoid pointless cleancache_invalidate_inode() calls.
mm/truncate: bail out early from invalidate_inode_pages2_range() if mapping is empty
fs/block_dev: always invalidate cleancache in invalidate_bdev()
fs: fix data invalidation in the cleancache during direct IO
zram: reduce load operation in page_same_filled
zram: use zram_free_page instead of open-coded
zram: introduce zram data accessor
...
BUG: Double free or freeing an invalid pointer
Unexpected shadow byte: 0xFB
to
BUG: KASAN: double-free or invalid-free in kmalloc_oob_left+0xe5/0xef
This makes a bug uniquely identifiable by the first report line. To
account for removing of the unexpected shadow value, print shadow bytes
at the end of the report as in reports for other kinds of bugs.
The buggy address belongs to the object at ffff880068388540
which belongs to the cache kmalloc-128 of size 128
The buggy address is located 123 bytes inside of
128-byte region [ffff880068388540, ffff8800683885c0)
Makes it more explanatory and adds information about relative offset of
the accessed address to the start of the object.
kasan: introduce helper functions for determining bug type
Patch series "kasan: improve error reports", v2.
This patchset improves KASAN reports by making them easier to read and a
little more detailed. Also improves mm/kasan/report.c readability.
Effectively changes a use-after-free report to:
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xaa/0xb6 [test_kasan]
Write of size 1 at addr ffff88006aa59da8 by task insmod/3951
Memory state around the buggy address: ffff88006aa59c80: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc ffff88006aa59d00: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
>ffff88006aa59d80: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
^ ffff88006aa59e00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc ffff88006aa59e80: fb fb fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
==================================================================
from:
==================================================================
BUG: KASAN: use-after-free in kmalloc_uaf+0xaa/0xb6 [test_kasan] at addr ffff88006c4dcb28
Write of size 1 by task insmod/3984
CPU: 1 PID: 3984 Comm: insmod Tainted: G B 4.10.0+ #83
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x292/0x398
kasan_object_err+0x1c/0x70
kasan_report.part.1+0x20e/0x4e0
__asan_report_store1_noabort+0x2c/0x30
kmalloc_uaf+0xaa/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x7feca0f779da
RSP: 002b:00007ffdfeae5218 EFLAGS: 00000206 ORIG_RAX: 00000000000000af
RAX: ffffffffffffffda RBX: 000055a064c13090 RCX: 00007feca0f779da
RDX: 00007feca1236f88 RSI: 000000000004df7e RDI: 00007feca1605000
RBP: 00007feca1236f88 R08: 0000000000000003 R09: 0000000000000000
R10: 00007feca0f73d0a R11: 0000000000000206 R12: 000055a064c14190
R13: 000000000001fe81 R14: 0000000000000000 R15: 0000000000000004
Object at ffff88006c4dcb20, in cache kmalloc-16 size: 16
Allocated:
PID = 3984
save_stack_trace+0x16/0x20
save_stack+0x43/0xd0
kasan_kmalloc+0xad/0xe0
kmem_cache_alloc_trace+0x82/0x270
kmalloc_uaf+0x56/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 3984
save_stack_trace+0x16/0x20
save_stack+0x43/0xd0
kasan_slab_free+0x73/0xc0
kfree+0xe8/0x2b0
kmalloc_uaf+0x85/0xb6 [test_kasan]
kmalloc_tests_init+0x4f/0xa48 [test_kasan]
do_one_initcall+0xf3/0x390
do_init_module+0x215/0x5d0
load_module+0x54de/0x82b0
SYSC_init_module+0x3be/0x430
SyS_init_module+0x9/0x10
entry_SYSCALL_64_fastpath+0x1f/0xc2
Memory state around the buggy address: ffff88006c4dca00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc ffff88006c4dca80: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
>ffff88006c4dcb00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
^ ffff88006c4dcb80: fb fb fc fc 00 00 fc fc fb fb fc fc fb fb fc fc ffff88006c4dcc00: fb fb fc fc fb fb fc fc fb fb fc fc fb fb fc fc
==================================================================
This patch (of 9):
Introduce get_shadow_bug_type() function, which determines bug type
based on the shadow value for a particular kernel address. Introduce
get_wild_bug_type() function, which determines bug type for addresses
which don't have a corresponding shadow value.
Naoya Horiguchi [Wed, 3 May 2017 21:56:22 +0000 (14:56 -0700)]
mm: hwpoison: call shake_page() after try_to_unmap() for mlocked page
Memory error handler calls try_to_unmap() for error pages in various
states. If the error page is a mlocked page, error handling could fail
with "still referenced by 1 users" message. This is because the page is
linked to and stays in lru cache after the following call chain.
memory_failure() calls shake_page() to hanlde the similar issue, but
current code doesn't cover because shake_page() is called only before
try_to_unmap(). So this patches adds shake_page().
Fixes: 23a003bfd23ea9ea0b7756b920e51f64b284b468 ("mm/madvise: pass return code of memory_failure() to userspace") Link: http://lkml.kernel.org/r/20170417055948.GM31394@yexl-desktop Link: http://lkml.kernel.org/r/1493197841-23986-3-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: kernel test robot <lkp@intel.com> Cc: Xiaolong Ye <xiaolong.ye@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Wed, 3 May 2017 21:56:19 +0000 (14:56 -0700)]
mm: hwpoison: call shake_page() unconditionally
shake_page() is called before going into core error handling code in
order to ensure that the error page is flushed from lru_cache lists
where pages stay during transferring among LRU lists.
But currently it's not fully functional because when the page is linked
to lru_cache by calling activate_page(), its PageLRU flag is set and
shake_page() is skipped. The result is to fail error handling with
"still referenced by 1 users" message.
When the page is linked to lru_cache by isolate_lru_page(), its PageLRU
is clear, so that's fine.
This patch makes shake_page() unconditionally called to avoild the
failure.
Fixes: 23a003bfd23ea9ea0b7756b920e51f64b284b468 ("mm/madvise: pass return code of memory_failure() to userspace") Link: http://lkml.kernel.org/r/20170417055948.GM31394@yexl-desktop Link: http://lkml.kernel.org/r/1493197841-23986-2-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: kernel test robot <lkp@intel.com> Cc: Xiaolong Ye <xiaolong.ye@intel.com> Cc: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Huang Ying [Wed, 3 May 2017 21:56:16 +0000 (14:56 -0700)]
mm/swapfile.c: fix swap space leak in error path of swap_free_entries()
In swapcache_free_entries(), if swap_info_get_cont() returns NULL,
something wrong occurs for the swap entry. But we should still continue
to free the following swap entries in the array instead of skip them to
avoid swap space leak. This is just problem in error path, where system
may be in an inconsistent state, but it is still good to fix it.
Link: http://lkml.kernel.org/r/20170421124739.24534-1-ying.huang@intel.com Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Shaohua Li <shli@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Wed, 3 May 2017 21:56:12 +0000 (14:56 -0700)]
mm/gup.c: fix access_ok() argument type
MIPS just got changed to only accept a pointer argument for access_ok(),
causing one warning in drivers/scsi/pmcraid.c. I tried changing x86 the
same way and found the same warning in __get_user_pages_fast() and
nowhere else in the kernel during randconfig testing:
mm/gup.c: In function '__get_user_pages_fast':
mm/gup.c:1578:6: error: passing argument 1 of '__chk_range_not_ok' makes pointer from integer without a cast [-Werror=int-conversion]
It would probably be a good idea to enforce type-safety in general, so
let's change this file to not cause a warning if we do that.
I don't know why the warning did not appear on MIPS.
Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()") Link: http://lkml.kernel.org/r/20170421162659.3314521-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cleancache_invalidate_inode() called truncate_inode_pages_range() and
invalidate_inode_pages2_range() twice - on entry and on exit. It's
stupid and waste of time. It's enough to call it once at exit.
Link: http://lkml.kernel.org/r/20170424164135.22350-5-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Wed, 3 May 2017 21:56:06 +0000 (14:56 -0700)]
mm/truncate: bail out early from invalidate_inode_pages2_range() if mapping is empty
If mapping is empty (both ->nrpages and ->nrexceptional is zero) we can
avoid pointless lookups in empty radix tree and bail out immediately
after cleancache invalidation.
Link: http://lkml.kernel.org/r/20170424164135.22350-4-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Wed, 3 May 2017 21:55:59 +0000 (14:55 -0700)]
fs: fix data invalidation in the cleancache during direct IO
Patch series "Properly invalidate data in the cleancache", v2.
We've noticed that after direct IO write, buffered read sometimes gets
stale data which is coming from the cleancache. The reason for this is
that some direct write hooks call call invalidate_inode_pages2[_range]()
conditionally iff mapping->nrpages is not zero, so we may not invalidate
data in the cleancache.
Another odd thing is that we check only for ->nrpages and don't check
for ->nrexceptional, but invalidate_inode_pages2[_range] also
invalidates exceptional entries as well. So we invalidate exceptional
entries only if ->nrpages != 0? This doesn't feel right.
- Patch 1 fixes direct IO writes by removing ->nrpages check.
- Patch 2 fixes similar case in invalidate_bdev().
Note: I only fixed conditional cleancache_invalidate_inode() here.
Do we also need to add ->nrexceptional check in into invalidate_bdev()?
- Patches 3-4: some optimizations.
This patch (of 4):
Some direct IO write fs hooks call invalidate_inode_pages2[_range]()
conditionally iff mapping->nrpages is not zero. This can't be right,
because invalidate_inode_pages2[_range]() also invalidate data in the
cleancache via cleancache_invalidate_inode() call. So if page cache is
empty but there is some data in the cleancache, buffered read after
direct IO write would get stale data from the cleancache.
Also it doesn't feel right to check only for ->nrpages because
invalidate_inode_pages2[_range] invalidates exceptional entries as well.
Fix this by calling invalidate_inode_pages2[_range]() regardless of
nrpages state.
Note: nfs,cifs,9p doesn't need similar fix because the never call
cleancache_get_page() (nor directly, nor via mpage_readpage[s]()), so
they are not affected by this bug.
Fixes: c515e1fd361c ("mm/fs: add hooks to support cleancache") Link: http://lkml.kernel.org/r/20170424164135.22350-2-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sangwoo Park [Wed, 3 May 2017 21:55:56 +0000 (14:55 -0700)]
zram: reduce load operation in page_same_filled
In page_same_filled function, all elements in the page is compared with
next index value. The current comparison routine compares the (i)th and
(i+1)th values of the page.
In this case, two load operaions occur for each comparison. But if we
store first value of the page stores at 'val' variable and using it to
compare with others, the load opearation is reduced. It reduce load
operation per page by up to 64times.
Link: http://lkml.kernel.org/r/1488428104-7257-1-git-send-email-sangwoo2.park@lge.com Signed-off-by: Sangwoo Park <sangwoo2.park@lge.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 3 May 2017 21:55:53 +0000 (14:55 -0700)]
zram: use zram_free_page instead of open-coded
The zram_free_page already handles NULL handle case and same page so use
it to reduce error probability. (Acutaully, I made a mistake when I
handled same page feature)
Link: http://lkml.kernel.org/r/1492052365-16169-7-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 3 May 2017 21:55:50 +0000 (14:55 -0700)]
zram: introduce zram data accessor
With element, sometime I got confused handle and element access. It
might be my bad but I think it's time to introduce accessor to prevent
future idiot like me. This patch is just clean-up patch so it shouldn't
change any behavior.
Link: http://lkml.kernel.org/r/1492052365-16169-6-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 3 May 2017 21:55:41 +0000 (14:55 -0700)]
zram: partial IO refactoring
For architecture(PAGE_SIZE > 4K), zram have supported partial IO.
However, the mixed code for handling normal/partial IO is too mess,
error-prone to modify IO handler functions with upcoming feature so this
patch aims for cleaning up zram's IO handling functions.
Link: http://lkml.kernel.org/r/1492052365-16169-3-git-send-email-minchan@kernel.org Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Thumshirn reported system goes the panic when using NVMe over
Fabrics loopback target with zram.
The reason is zram expects each bvec in bio contains a single page
but nvme can attach a huge bulk of pages attached to the bio's bvec
so that zram's index arithmetic could be wrong so that out-of-bound
access makes system panic.
[1] in mainline solved solved the problem by limiting max_sectors with
SECTORS_PER_PAGE but it makes zram slow because bio should split with
each pages so this patch makes zram aware of multiple pages in a bvec
so it could solve without any regression(ie, bio split).
[1] 0bc315381fe9, zram: set physical queue limits to avoid array out of
bounds accesses
Link: http://lkml.kernel.org/r/20170413134057.GA27499@bbox Signed-off-by: Minchan Kim <minchan@kernel.org> Reported-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa [Wed, 3 May 2017 21:55:34 +0000 (14:55 -0700)]
mm, page_alloc: remove debug_guardpage_minorder() test in warn_alloc()
Commit c0a32fc5a2e4 ("mm: more intensive memory corruption debugging")
changed to check debug_guardpage_minorder() > 0 when reporting
allocation failures. The reasoning was
When we use guard page to debug memory corruption, it shrinks
available pages to 1/2, 1/4, 1/8 and so on, depending on parameter
value. In such case memory allocation failures can be common and
printing errors can flood dmesg. If somebody debug corruption,
allocation failures are not the things he/she is interested about.
but this is misguided.
Allocation requests with __GFP_NOWARN flag by definition do not cause
flooding of allocation failure messages. Allocation requests with
__GFP_NORETRY flag likely also have __GFP_NOWARN flag. Costly
allocation requests likely also have __GFP_NOWARN flag.
Allocation requests without __GFP_DIRECT_RECLAIM flag likely also have
__GFP_NOWARN flag or __GFP_HIGH flag. Non-costly allocation requests
with __GFP_DIRECT_RECLAIM flag basically retry forever due to the "too
small to fail" memory-allocation rule.
Therefore, as a whole, shrinking available pages by
debug_guardpage_minorder= kernel boot parameter might cause flooding of
OOM killer messages but unlikely causes flooding of allocation failure
messages. Let's remove debug_guardpage_minorder() > 0 check which would
likely be pointless.
Link: http://lkml.kernel.org/r/1491910035-4231-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Rafael J . Wysocki" <rafael.j.wysocki@intel.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/madvise: move up the behavior parameter validation
madvise_behavior_valid() should be called before acting upon the
behavior parameter. Hence move up the function. This also includes
MADV_SOFT_OFFLINE and MADV_HWPOISON options as valid behavior parameter
for the system call madvise().
mm/madvise.c: clean up MADV_SOFT_OFFLINE and MADV_HWPOISON
This cleans up handling MADV_SOFT_OFFLINE and MADV_HWPOISON called
through madvise() system call.
* madvise_memory_failure() was misleading to accommodate handling of
both memory_failure() as well as soft_offline_page() functions.
Basically it handles memory error injection from user space which
can go either way as memory failure or soft offline. Renamed as
madvise_inject_error() instead.
* Renamed struct page pointer 'p' to 'page'.
* pr_info() was essentially printing PFN value but it said 'page'
which was misleading. Made the process virtual address explicit.
Before the patch:
Soft offlining page 0x15e3e at 0x3fff8c230000
Soft offlining page 0x1f3 at 0x3fffa0da0000
Soft offlining page 0x744 at 0x3fff7d200000
Soft offlining page 0x1634d at 0x3fff95e20000
Soft offlining page 0x16349 at 0x3fff95e30000
Soft offlining page 0x1d6 at 0x3fff9e8b0000
Soft offlining page 0x5f3 at 0x3fff91bd0000
Injecting memory failure for page 0x15c8b at 0x3fff83280000
Injecting memory failure for page 0x16190 at 0x3fff83290000
Injecting memory failure for page 0x740 at 0x3fff9a2e0000
Injecting memory failure for page 0x741 at 0x3fff9a2f0000
After the patch:
Soft offlining pfn 0x1484e at process virtual address 0x3fff883c0000
Soft offlining pfn 0x1484f at process virtual address 0x3fff883d0000
Soft offlining pfn 0x14850 at process virtual address 0x3fff883e0000
Soft offlining pfn 0x14851 at process virtual address 0x3fff883f0000
Soft offlining pfn 0x14852 at process virtual address 0x3fff88400000
Soft offlining pfn 0x14853 at process virtual address 0x3fff88410000
Soft offlining pfn 0x14854 at process virtual address 0x3fff88420000
Soft offlining pfn 0x1521c at process virtual address 0x3fff6bc70000
Injecting memory failure for pfn 0x10fcf at process virtual address 0x3fff86310000
Injecting memory failure for pfn 0x10fd0 at process virtual address 0x3fff86320000
Injecting memory failure for pfn 0x10fd1 at process virtual address 0x3fff86330000
Injecting memory failure for pfn 0x10fd2 at process virtual address 0x3fff86340000
Injecting memory failure for pfn 0x10fd3 at process virtual address 0x3fff86350000
Injecting memory failure for pfn 0x10fd4 at process virtual address 0x3fff86360000
Injecting memory failure for pfn 0x10fd5 at process virtual address 0x3fff86370000