git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/log

sched/fair: Fix incorrect comment for capacity_margin

The comment for capacity_margin introduced in:

3273163c6775 ("sched/fair: Let asymmetric CPU configurations balance at wake-up")

... got its usage the wrong way round - fix it.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-7-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups

For asymmetric CPU capacity systems it is counter-productive for
throughput if low capacity CPUs are pulling tasks from non-overloaded
CPUs with higher capacity. The assumption is that higher CPU capacity is
preferred over running alone in a group with lower CPU capacity.

This patch rejects higher CPU capacity groups with one or less task per
CPU as potential busiest group which could otherwise lead to a series of
failing load-balancing attempts leading to a force-migration.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-5-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/fair: Add per-CPU min capacity to sched_group_capacity

struct sched_group_capacity currently represents the compute capacity
sum of all CPUs in the sched_group.

Unless it is divided by the group_weight to get the average capacity
per CPU, it hides differences in CPU capacity for mixed capacity systems
(e.g. high RT/IRQ utilization or ARM big.LITTLE).

But even the average may not be sufficient if the group covers CPUs of
different capacities.

Instead, by extending struct sched_group_capacity to indicate min per-CPU
capacity in the group a suitable group for a given task utilization can
more easily be found such that CPUs with reduced capacity can be avoided
for tasks with high utilization (not implemented by this patch).

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-4-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/fair: Consider spare capacity in find_idlest_group()

In low-utilization scenarios comparing relative loads in
find_idlest_group() doesn't always lead to the most optimum choice.
Systems with groups containing different numbers of cpus and/or cpus of
different compute capacity are significantly better off when considering
spare capacity rather than relative load in those scenarios.

In addition to existing load based search an alternative spare capacity
based candidate sched_group is found and selected instead if sufficient
spare capacity exists. If not, existing behaviour is preserved.

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-3-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/fair: Compute task/cpu utilization at wake-up correctly

At task wake-up load-tracking isn't updated until the task is enqueued.
The task's own view of its utilization contribution may therefore not be
aligned with its contribution to the cfs_rq load-tracking which may have
been updated in the meantime. Basically, the task's own utilization
hasn't yet accounted for the sleep decay, while the cfs_rq may have
(partially). Estimating the cfs_rq utilization in case the task is
migrated at wake-up as task_rq(p)->cfs.avg.util_avg - p->se.avg.util_avg
is therefore incorrect as the two load-tracking signals aren't time
synchronized (different last update).

To solve this problem, this patch synchronizes the task utilization with
its previous rq before the task utilization is used in the wake-up path.
Currently the update/synchronization is done _after_ the task has been
placed by select_task_rq_fair(). The synchronization is done without
having to take the rq lock using the existing mechanism used in
remove_entity_load_avg().

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: freedom.tan@mediatek.com
Cc: keita.kobayashi.ym@renesas.com
Cc: mgalbraith@suse.de
Cc: sgurrappadi@nvidia.com
Cc: vincent.guittot@linaro.org
Cc: yuyang.du@intel.com
Link: http://lkml.kernel.org/r/1476452472-24740-2-git-send-email-morten.rasmussen@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/x86: Do not clear PREEMPT_NEED_RESCHED on preempt count reset

The per-cpu preempt count of x86 contains two values, the actual preempt
count and the inverted PREEMPT_NEED_RESCHED bit. If a corrupted preempt
count is detected the preempt_count_set() function is used to reset the
preempt count.

In case the inverted PREEMPT_NEED_RESCHED bit is zero at the time of the
reset, the preemption indication is lost. Use raw_cpu_cmpxchg_4() to reset
only the count part and leave the PREEMPT_NEED_RESCHED bit as it is.

This improves the kernel's behavior when it runs into preempt count leaks
and tries to fix them up.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1478523660-733-1-git-send-email-schwidefsky@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/cpuacct: Avoid %lld seq_printf warning

For s390 kernel builds I keep getting this warning:

kernel/sched/cpuacct.c: In function 'cpuacct_stats_show':
kernel/sched/cpuacct.c:298:25: warning: format '%lld' expects argument of type 'long long int', but argument 4 has type 'clock_t {aka long int}' [-Wformat=]
seq_printf(sf, "%s %lld\n",

Silence the warning by adding an explicit cast.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161111142749.6545-1-schwidefsky@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'linus' into sched/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'trace-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"Alexei discovered a race condition in modules failing to load that can
  cause a ftrace check to trigger and disable ftrace.

  This is because of the way modules are registered to ftrace. Their
  functions are loaded in the ftrace function tables but set to
  "disabled" since they are still in the process of being loaded by the
  module. After the module is finished, it calls back into the ftrace
  infrastructure to enable it.

  Looking deeper into the locations that access all the functions in the
  table, I found more locations that should ignore the disabled ones"

* tag 'trace-v4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Add more checks for FTRACE_FL_DISABLED in processing ip records
  ftrace: Ignore FTRACE_FL_DISABLED while walking dyn_ftrace records

Merge tag 'fbdev-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux

Pull fbdev fix from Tomi Valkeinen:
"Fix CLCD regression on Vexpress"

* tag 'fbdev-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
video: ARM CLCD: fix Vexpress regression

sched/cputime: Simplify task_cputime()

Now since fetch_task_cputime() has no other users than task_cputime(),
its code could be used directly in task_cputime().

Moreover since only 2 task_cputime() calls of 17 use a NULL argument,
we can add dummy variables to those calls and remove NULL checks from
task_cputimes().

Also remove NULL checks from task_cputimes_scaled().

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479175612-14718-5-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/cputime, powerpc, s390: Make scaled cputime arch specific

Only s390 and powerpc have hardware facilities allowing to measure
cputimes scaled by frequency. On all other architectures
utimescaled/stimescaled are equal to utime/stime (however they are
accounted separately).

Remove {u,s}timescaled accounting on all architectures except
powerpc and s390, where those values are explicitly accounted
in the proper places.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161031162143.GB12646@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/cputime, powerpc: Remove cputime_to_scaled()

Currently cputime_to_scaled() just return it's argument on
all implementations, we don't need to call this function.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479175612-14718-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

sched/cputime, powerpc: Remove cputime_last_delta global variable

Since commit:

cf9efce0ce313 ("powerpc: Account time using timebase rather than PURR")

cputime_last_delta is not initialized to other value than 0, hence it's
not used except zero check and cputime_to_scaled() just returns
the argument.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1479175612-14718-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix off by one wrt. indexing when dumping /proc/net/route entries,
    from Alexander Duyck.

2) Fix lockdep splats in iwlwifi, from Johannes Berg.

3) Cure panic when inserting certain netfilter rules when NFT_SET_HASH
    is disabled, from Liping Zhang.

4) Memory leak when nft_expr_clone() fails, also from Liping Zhang.

5) Disable UFO when path will apply IPSEC tranformations, from Jakub
    Sitnicki.

6) Don't bogusly double cwnd in dctcp module, from Florian Westphal.

7) skb_checksum_help() should never actually use the value "0" for the
    resulting checksum, that has a special meaning, use CSUM_MANGLED_0
    instead. From Eric Dumazet.

8) Per-tx/rx queue statistic strings are wrong in qed driver, fix from
    Yuval MIntz.

9) Fix SCTP reference counting of associations and transports in
    sctp_diag. From Xin Long.

10) When we hit ip6tunnel_xmit() we could have come from an ipv4 path in
    a previous layer or similar, so explicitly clear the ipv6 control
    block in the skb. From Eli Cooper.

11) Fix bogus sleeping inside of inet_wait_for_connect(), from WANG
    Cong.

12) Correct deivce ID of T6 adapter in cxgb4 driver, from Hariprasad
    Shenai.

13) Fix potential access past the end of the skb page frag array in
    tcp_sendmsg(). From Eric Dumazet.

14) 'skb' can legitimately be NULL in inet{,6}_exact_dif_match(). Fix
    from David Ahern.

15) Don't return an error in tcp_sendmsg() if we wronte any bytes
    successfully, from Eric Dumazet.

16) Extraneous unlocks in netlink_diag_dump(), we removed the locking
    but forgot to purge these unlock calls. From Eric Dumazet.

17) Fix memory leak in error path of __genl_register_family(). We leak
    the attrbuf, from WANG Cong.

18) cgroupstats netlink policy table is mis-sized, from WANG Cong.

19) Several XDP bug fixes in mlx5, from Saeed Mahameed.

20) Fix several device refcount leaks in network drivers, from Johan
    Hovold.

21) icmp6_send() should use skb dst device not skb->dev to determine L3
    routing domain. From David Ahern.

22) ip_vs_genl_family sets maxattr incorrectly, from WANG Cong.

23) We leak new macvlan port in some cases of maclan_common_netlink()
    errors. Fix from Gao Feng.

24) Similar to the icmp6_send() fix, icmp_route_lookup() should
    determine L3 routing domain using skb_dst(skb)->dev not skb->dev.
    Also from David Ahern.

25) Several fixes for route offloading and FIB notification handling in
    mlxsw driver, from Jiri Pirko.

26) Properly cap __skb_flow_dissect()'s return value, from Eric Dumazet.

27) Fix long standing regression in ipv4 redirect handling, wrt.
    validating the new neighbour's reachability. From Stephen Suryaputra
    Lin.

28) If sk_filter() trims the packet excessively, handle it reasonably in
    tcp input instead of exploding. From Eric Dumazet.

29) Fix handling of napi hash state when copying channels in sfc driver,
    from Bert Kenward.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (121 commits)
  mlxsw: spectrum_router: Flush FIB tables during fini
  net: stmmac: Fix lack of link transition for fixed PHYs
  sctp: change sk state only when it has assocs in sctp_shutdown
  bnx2: Wait for in-flight DMA to complete at probe stage
  Revert "bnx2: Reset device during driver initialization"
  ps3_gelic: fix spelling mistake in debug message
  net: ethernet: ixp4xx_eth: fix spelling mistake in debug message
  ibmvnic: Fix size of debugfs name buffer
  ibmvnic: Unmap ibmvnic_statistics structure
  sfc: clear napi_hash state when copying channels
  mlxsw: spectrum_router: Correctly dump neighbour activity
  mlxsw: spectrum: Fix refcount bug on span entries
  bnxt_en: Fix VF virtual link state.
  bnxt_en: Fix ring arithmetic in bnxt_setup_tc().
  Revert "include/uapi/linux/atm_zatm.h: include linux/time.h"
  tcp: take care of truncations done by sk_filter()
  ipv4: use new_gw for redirect neigh lookup
  r8152: Fix error path in open function
  net: bpqether.h: remove if_ether.h guard
  net: __skb_flow_dissect() must cap its return value
  ...

Merge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile bugfix from Chris Metcalf:
"This just fixes an incompatibility with tile __ro_after_init"

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: handle __ro_after_init like parisc does

Merge tag 'rtc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC fixes from Alexandre Belloni:
"Here are a few driver fixes for 4.9. It has been calm for a while so I
  don't expect more for this cycle.

  Drivers:
   - asm9260: fix module autoload
   - cmos: fix crashes
   - omap: fix clock handling"

* tag 'rtc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: omap: prevent disabling of clock/module during suspend
  rtc: omap: Fix selecting external osc
  rtc: cmos: Don't enable interrupts in the middle of the interrupt handler
  rtc: cmos: remove all __exit_p annotations
  rtc: asm9260: fix module autoload

tile: handle __ro_after_init like parisc does

The tile architecture already marks RO_DATA as read-only in
the kernel, so grouping RO_AFTER_INIT_DATA with RO_DATA, as is
done by default, means the kernel faults in init when it tries
to write to RO_AFTER_INIT_DATA. For now, just arrange that
__ro_after_init is handled like __write_once, i.e. __read_mostly.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>

mlxsw: spectrum_router: Flush FIB tables during fini

Since commit b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications
instead of switchdev calls") we reflect to the device the entire FIB
table and not only FIBs that point to netdevs created by the driver.

During module removal, FIBs of the second type are removed following
NETDEV_UNREGISTER events sent. The other FIBs are still present in both
the driver's cache and the device's table.

Fix this by iterating over all the FIB tables in the device and flush
them. There's no need to take locks, as we're the only writer.

Fixes: b45f64d16d45 ("mlxsw: spectrum_router: Use FIB notifications instead of switchdev calls")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: stmmac: Fix lack of link transition for fixed PHYs

Commit 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch
is attached") added some logic to avoid polling the fixed PHY and
therefore invoking the adjust_link callback more than once, since this
is a fixed PHY and link events won't be generated.

This works fine the first time, because we start with phydev->irq =
PHY_POLL, so we call adjust_link, then we set phydev->irq =
PHY_IGNORE_INTERRUPT and we stop polling the PHY.

Now, if we called ndo_close(), which calls both phy_stop() and does an
explicit netif_carrier_off(), we end up with a link down. Upon calling
ndo_open() again, despite starting the PHY state machine, we have
PHY_IGNORE_INTERRUPT set, and we generate no link event at all, so the
link is permanently down.

Fixes: 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch is attached")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ftrace: Add more checks for FTRACE_FL_DISABLED in processing ip records

When a module is first loaded and its function ip records are added to the
ftrace list of functions to modify, they are set to DISABLED, as their text
is still in a read only state. When the module is fully loaded, and can be
updated, the flag is cleared, and if their's any functions that should be
tracing them, it is updated at that moment.

But there's several locations that do record accounting and should ignore
records that are marked as disabled, or they can cause issues.

Alexei already fixed one location, but others need to be addressed.

Cc: stable@vger.kernel.org
Fixes: b7ffffbb46f2 "ftrace: Add infrastructure for delayed enabling of module functions"
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

ftrace: Ignore FTRACE_FL_DISABLED while walking dyn_ftrace records

ftrace_shutdown() checks for sanity of ftrace records
and if dyn_ftrace->flags is not zero, it will warn.
It can happen that 'flags' are set to FTRACE_FL_DISABLED at this point,
since some module was loaded, but before ftrace_module_enable()
cleared the flags for this module.

In other words the module.c is doing:
ftrace_module_init(mod); // calls ftrace_update_code() that sets flags=FTRACE_FL_DISABLED
... // here ftrace_shutdown() is called that warns, since
err = prepare_coming_module(mod); // didn't have a chance to clear FTRACE_FL_DISABLED

Fix it by ignoring disabled records.
It's similar to what __ftrace_hash_rec_update() is already doing.

Link: http://lkml.kernel.org/r/1478560460-3818619-1-git-send-email-ast@fb.com
Cc: stable@vger.kernel.org
Fixes: b7ffffbb46f2 "ftrace: Add infrastructure for delayed enabling of module functions"
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

sctp: change sk state only when it has assocs in sctp_shutdown

Now when users shutdown a sock with SEND_SHUTDOWN in sctp, even if
this sock has no connection (assoc), sk state would be changed to
SCTP_SS_CLOSING, which is not as we expect.

Besides, after that if users try to listen on this sock, kernel
could even panic when it dereference sctp_sk(sk)->bind_hash in
sctp_inet_listen, as bind_hash is null when sock has no assoc.

This patch is to move sk state change after checking sk assocs
is not empty, and also merge these two if() conditions and reduce
indent level.

Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bnx2-kdump-fix'

Baoquan He says:

====================
bnx2: Wait for in-flight DMA to complete at probe stage

This is v2 post.

In commit 3e1be7a ("bnx2: Reset device during driver initialization"),
firmware requesting code was moved from open stage to probe stage.
The reason is in kdump kernel hardware iommu need device be reset in
driver probe stage, otherwise those in-flight DMA from 1st kernel
will continue going and look up into the newly created io-page tables.
However bnx2 chip resetting involves firmware requesting issue, that
need be done in open stage.

Michale Chan suggested we can just wait for the old in-flight DMA to
complete at probe stage, then though without device resetting, we
don't need to worry the old in-flight DMA could continue looking up
the newly created io-page tables.

v1->v2:
    Michael suggested to wait for the in-flight DMA to complete at probe
    stage. So give up the old method of trying to reset chip at probe
    stage, take the new way accordingly.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bnx2: Wait for in-flight DMA to complete at probe stage

In-flight DMA from 1st kernel could continue going in kdump kernel.
New io-page table has been created before bnx2 does reset at open stage.
We have to wait for the in-flight DMA to complete to avoid it look up
into the newly created io-page table at probe stage.

Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "bnx2: Reset device during driver initialization"

This reverts commit 3e1be7ad2d38c6bd6aeef96df9bd0a7822f4e51c.

When people build bnx2 driver into kernel, it will fail to detect
and load firmware because firmware is contained in initramfs and
initramfs has not been uncompressed yet during do_initcalls. So
revert commit 3e1be7a and work out a new way in the later patch.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ps3_gelic: fix spelling mistake in debug message

Trivial fix to spelling mistake "unmached" to "unmatched" in
debug message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ASoC: lpass-platform: fix uninitialized variable

In commit 022d00ee0b55 ("ASoC: lpass-platform: Fix broken pcm data
usage") the stream specific information initialization was broken, with
the dma channel information not being initialized if there was no
alloc_dma_channel() helper function.

Before that, the DMA channel number was implicitly initialized to zero
because the backing store was allocated with devm_kzalloc(). When the
init code was rewritten, that implicit initialization was lost, and gcc
rightfully complains about an uninitialized variable being used.

Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "printk: make reading the kernel log flush pending lines"

This reverts commit bfd8d3f23b51018388be0411ccbc2d56277fe294.

It turns out that this flushes things much too aggressiverly, and causes
lines to break up when the system logger races with new continuation
lines being printed.

There's a pending patch to make printk() flushing much more
straightforward, but it's too invasive for 4.9, so in the meantime let's
just not make the system message logging flush continuation lines.
They'll be flushed by the final newline anyway.

Suggested-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gp8psk-fe: add missing MODULE_foo() macros

This file was converted to a separate module at commit 7a0786c19d65
("gp8psk: Fix DVB frontend attach"), because the DVB attach routines
require it to work. However, I forgot to copy the MODULE_foo() macros
from the original module, causing this warning:

WARNING: modpost: missing MODULE_LICENSE() in drivers/media/dvb-frontends/gp8psk-fe.o

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 7a0786c19d65 ("gp8psk: Fix DVB frontend attach")
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"Misc fixes:

   - fix an Intel/MID boot crash/hang bug

   - fix a cache topology mis-parsing bug on certain AMD CPUs

   - fix a virtualization firmware bug by adding a check+quirk
     workaround on the kernel side"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Deal with broken firmware (VMWare/XEN)
  x86/cpu/AMD: Fix cpu_llc_id for AMD Fam17h systems
  x86/platform/intel-mid: Retrofit pci_platform_pm_ops ->get_state hook

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
"This fixes a genirq regression that resulted in the Intel/Broxton
pinctrl/GPIO driver (and possibly others) spewing warnings"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Use irq type from irqdata instead of irqdesc

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"An uncore PMU driver hardware enablement change for Intel SkyLake
  uncore PMUs (Skylake Y, U, H and S platforms), plus a number of
  tooling fixes for the histogram handling/displaying code"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Add more Intel uncore IMC PCI IDs for SkyLake
  perf hists: Fix column length on --hierarchy
  perf hists browser: Fix column indentation on --hierarchy
  perf hists browser: Show folded sign properly on --hierarchy
  perf hists browser: Fix indentation of folded sign on --hierarchy
  perf hist browser: Fix hierarchy column counts

Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull EFI fixes from Ingo Molnar:
"A boot crash fix and a build warning fix"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y
x86/efi: Fix EFI memmap pointer size warning

Merge tag 'ntb-4.9' of git://github.com/jonmason/ntb

Pull NTB fixes from Jon Mason:
"NTB bug fixes for ntb_hw_intel, ntb_perf, and ntb_pingpong.

  Also, a fixup to use jiffies in schedule_timeout_* call instead of a
  constant"

* tag 'ntb-4.9' of git://github.com/jonmason/ntb:
  ntb_perf: potential info leak in debugfs
  ntb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev
  ntb: make DMA_OUT_RESOURCE_TO HZ independent
  ntb_transport: make DMA_OUT_RESOURCE_TO HZ independent
  NTB: ntb_hw_intel: Fix typo in module parameter descriptions
  ntb_pingpong: Fix db_init parameter description

ntb_perf: potential info leak in debugfs

This is a static checker warning, not something I'm desperately
concerned about. But snprintf() returns the number of bytes that
would have been copied if there were space. We really care about the
number of bytes that actually were copied so we should use scnprintf()
instead.

It probably won't overrun, and in that case we may as well just use
sprintf() but these sorts of things make static checkers and code
reviewers happier.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

ntb: ntb_hw_intel: init peer_addr in struct intel_ntb_dev

The peer_addr member of intel_ntb_dev is not set, therefore when
acquiring ntb_peer_db and ntb_peer_spad we only get the offset rather
than the actual physical address. Adding fix to correct that.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

ntb: make DMA_OUT_RESOURCE_TO HZ independent

schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

ntb_transport: make DMA_OUT_RESOURCE_TO HZ independent

schedule_timeout_* takes a timeout in jiffies but the code currently is
passing in a constant which makes this timeout HZ dependent, so pass it
through msecs_to_jiffies() to fix this up.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

NTB: ntb_hw_intel: Fix typo in module parameter descriptions

Fix typo in module parameter descriptions.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

ntb_pingpong: Fix db_init parameter description

Fix 'db_init' parameter description.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Acked-by: Allen Hubbe <Allen.Hubbe@emc.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>

net: ethernet: ixp4xx_eth: fix spelling mistake in debug message

Trivial fix to spelling mistake "successed" to "succeeded"
in debug message. Also unwrap multi-line literal string.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Fix size of debugfs name buffer

This mistake was causing debugfs directory creation
failures when multiple ibmvnic devices were probed.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ibmvnic: Unmap ibmvnic_statistics structure

This structure was mapped but never subsequently unmapped.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sfc: clear napi_hash state when copying channels

efx_copy_channel() doesn't correctly clear the napi_hash related state.
This means that when napi_hash_add is called for that channel nothing is
done, and we are left with a copy of the napi_hash_node from the old
channel. When we later call napi_hash_del() on this channel we have a
stale napi_hash_node.

Corruption is only seen when there are multiple entries in one of the
napi_hash lists. This is made more likely by having a very large number
of channels. Testing was carried out with 512 channels - 32 channels on
each of 16 ports.

This failure typically appears as protection faults within napi_by_id()
or napi_hash_add(). efx_copy_channel() is only used when tx or rx ring
sizes are changed (ethtool -G).

Fixes: 36763266bbe8 ("sfc: Add support for busy polling")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 4.9-rc5

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
"ARM fixes.  There are a couple pending x86 patches but they'll have to
  wait for next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: arm/arm64: vgic: Kick VCPUs when queueing already pending IRQs
  KVM: arm/arm64: vgic: Prevent access to invalid SPIs
  arm/arm64: KVM: Perform local TLB invalidation when multiplexing vcpus on a single CPU

Merge branch 'media-fixes' (patches from Mauro)

Merge media fixes from Mauro Carvalho Chehab:
"This contains two patches fixing problems with my patch series meant
  to make USB drivers to work again after the DMA on stack changes.

  The last patch on this series is actually not related to DMA on stack.
  It solves a longstanding bug affecting module unload, causing
  module_put() to be called twice. It was reported by the user who
  reported and tested the issues with the gp8psk driver with the DMA
  fixup patches. As we're late at -rc cycle, maybe you prefer to not
  apply it right now. If this is the case, I'll add to the pile of
  patches for 4.10.

  Exceptionally this time, I'm sending the patches via e-mail, because
  I'm on another trip, and won't be able to use the usual procedure
  until Monday. Also, it is only three patches, and you followed already
  the discussions about the first one"

* emailed patches from Mauro Carvalho Chehab <mchehab@osg.samsung.com>:
  gp8psk: Fix DVB frontend attach
  gp8psk: fix gp8psk_usb_in_op() logic
  dvb-usb: move data_mutex to struct dvb_usb_device

Merge tag 'char-misc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
"Here are three small driver fixes for some reported issues for
  4.9-rc5.

  One for the hyper-v subsystem, fixing up a naming issue that showed up
  in 4.9-rc1, one mei driver fix, and one fix for parallel ports,
  resolving a reported regression.

  All have been in linux-next with no reported issues"

* tag 'char-misc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  ppdev: fix double-free of pp->pdev->name
  vmbus: make sysfs names consistent with PCI
  mei: bus: fix received data size check in NFC fixup

Merge tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
"Here are two driver core fixes for 4.9-rc5.

  The first resolves an issue with some drivers not liking to be unbound
  and bound again (if CONFIG_DEBUG_TEST_DRIVER_REMOVE is enabled), which
  solves some reported problems with graphics and storage drivers. The
  other resolves a smatch error with the 4.9-rc1 driver core changes
  around this feature.

  Both have been in linux-next with no reported issues"

* tag 'driver-core-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: fix smatch warning on dev->bus check
  driver core: skip removal test for non-removable drivers

Merge tag 'staging-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging/IIO fixes from Grek KH:
"Here are a few small staging and iio driver fixes for reported issues.

  The last one was cherry-picked from my -next branch to resolve a build
  warning that Arnd fixed, in his quest to be able to turn
  -Wmaybe-uninitialized back on again. That patch, and all of the
  others, have been in linux-next for a while with no reported issues"

* tag 'staging-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: maxim_thermocouple: detect invalid storage size in read()
  staging: nvec: remove managed resource from PS2 driver
  Revert "staging: nvec: ps2: change serio type to passthrough"
  drivers: staging: nvec: remove bogus reset command for PS/2 interface
  staging: greybus: arche-platform: fix device reference leak
  staging: comedi: ni_tio: fix buggy ni_tio_clock_period_ps() return value
  staging: sm750fb: Fix bugs introduced by early commits
  iio: hid-sensors: Increase the precision of scale to fix wrong reading interpretation.
  iio: orientation: hid-sensor-rotation: Add PM function (fix non working driver)
  iio: st_sensors: fix scale configuration for h3lis331dl
  staging: iio: ad5933: avoid uninitialized variable in error case

Merge tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB / PHY fixes from Greg KH:
"Here are a number of small USB and PHY driver fixes for 4.9-rc5

  Nothing major, just small fixes for reported issues, all of these have
  been in linux-next for a while with no reported issues"

* tag 'usb-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: cdc-acm: fix TIOCMIWAIT
  cdc-acm: fix uninitialized variable
  drivers/usb: Skip auto handoff for TI and RENESAS usb controllers
  usb: musb: remove duplicated actions
  usb: musb: da8xx: Don't print phy error on -EPROBE_DEFER
  phy: sun4i: check PMU presence when poking unknown bit of pmu
  phy-rockchip-pcie: remove deassert of phy_rst from exit callback
  phy: da8xx-usb: rename the ohci device to ohci-da8xx
  phy: Add reset callback for not generic phy
  uwb: fix device reference leaks
  usb: gadget: u_ether: remove interrupt throttling
  usb: dwc3: st: add missing <linux/pinctrl/consumer.h> include
  usb: dwc3: Fix error handling for core init

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull more block fixes from Jens Axboe:
"Since I mistakenly left out the lightnvm regression fix yesterday and
  the aoeblk seems adequately tested at this point, might as well send
  out another pull to make -rc5"

* 'for-linus' of git://git.kernel.dk/linux-block:
  aoe: fix crash in page count manipulation
  lightnvm: invalid offset calculation for lba_shift

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"The megaraid_sas patch in here fixes a major regression in the last
  fix set that made all megaraid_sas cards unusable. It turns out no-one
  had actually tested such an "obvious" fix, sigh. The fix for the fix
  has been tested ...

  The next most serious is the vmw_pvscsi abort problem which basically
  means that aborts don't work on the vmware paravirt devices and error
  handling always escalates to reset.

  The rest are an assortment of missed reference counting in certain
  paths and corner case bugs that show up on some architectures"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: megaraid_sas: fix macro MEGASAS_IS_LOGICAL to avoid regression
  scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove
  scsi: qla2xxx: do not queue commands when unloading
  scsi: libcxgbi: fix incorrect DDP resource cleanup
  scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
  scsi: scsi_dh_alua: Fix a reference counting bug
  scsi: vmw_pvscsi: return SUCCESS for successful command aborts
  scsi: mpt3sas: Fix for block device of raid exists even after deleting raid disk
  scsi: scsi_dh_alua: fix missing kref_put() in alua_rtpg_work()

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"The typical collection of minor bug fixes in clk drivers. We don't
  have anything in the core framework here, just driver fixes.

  There's a boot fix for Samsung devices and a safety measure for qoriq
  to prevent CPUs from running too fast. There's also a fix for i.MX6Q
  to properly handle audio clock rates. We also have some "that's
  obviously wrong" fixes like bad NULL pointer checks in the MPP driver
  and a poor usage of __pa in the xgene clk driver that are fixed here"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: mmp: pxa910: fix return value check in pxa910_clk_init()
  clk: mmp: pxa168: fix return value check in pxa168_clk_init()
  clk: mmp: mmp2: fix return value check in mmp2_clk_init()
  clk: qoriq: Don't allow CPU clocks higher than starting value
  clk: imx: fix integer overflow in AV PLL round rate
  clk: xgene: Don't call __pa on ioremaped address
  clk/samsung: Use CLK_OF_DECLARE_DRIVER initialization method for CLKOUT
  clk: rockchip: don't return NULL when failing to register ddrclk branch

gp8psk: Fix DVB frontend attach

The DVB binding schema at the DVB core assumes that the frontend is a
separate driver.  Faling to do that causes OOPS when the module is
removed, as it tries to do a symbol_put_addr on an internal symbol,
causing craches like:

    WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
    Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
    CPU: 1 PID: 28102 Comm: rmmod Tainted: P        WC O 4.8.4-build.1 #1
    Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
    Call Trace:
       dump_stack+0x44/0x64
       __warn+0xfa/0x120
       module_put+0x57/0x70
       module_put+0x57/0x70
       warn_slowpath_null+0x23/0x30
       module_put+0x57/0x70
       gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
       symbol_put_addr+0x27/0x50
       dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]

From Derek's tests:
    "Attach bug is fixed, tuning works, module unloads without
     crashing. Everything seems ok!"

Reported-by: Derek <user.vdr@gmail.com>
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

gp8psk: fix gp8psk_usb_in_op() logic

Commit bc29131ecb10 ("[media] gp8psk: don't do DMA on stack") fixed the
usage of DMA on stack, but the memcpy was wrong for gp8psk_usb_in_op().
Fix it.

From Derek's email:
"Fix confirmed using 2 different Skywalker models with
HD mpeg4, SD mpeg2."

Suggested-by: Johannes Stezenbach <js@linuxtv.org>
Fixes: bc29131ecb10 ("[media] gp8psk: don't do DMA on stack")
Tested-by: Derek <user.vdr@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

dvb-usb: move data_mutex to struct dvb_usb_device

The data_mutex is initialized too late, as it is needed for
each device driver's power control, causing an OOPS:

    dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 PGD 0
    Oops: 0002 [#1] SMP
    Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
    CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
    Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
    task: ffff88020e943840 task.stack: ffff8801f36ec000
    RIP: 0010:[<ffffffff846617af>]  [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100
    RSP: 0018:ffff8801f36efb10  EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff88021509bdc8 RCX: 00000000c0000100
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff88021509bdcc
    RBP: ffff8801f36efb58 R08: ffff88021f216320 R09: 0000000000100000
    R10: ffff88021f216320 R11: 00000023fee6c5a1 R12: ffff88020e943840
    R13: ffff88021509bdcc R14: 00000000ffffffff R15: ffff88021509bdd0
    FS:  00007f21adb86740(0000) GS:ffff88021f200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000215bce000 CR4: 00000000001406f0
    Call Trace:
       mutex_lock+0x16/0x25
       cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
       dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
       cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
       usb_probe_interface+0xf3/0x2a0
       driver_probe_device+0x208/0x2b0
       __driver_attach+0x87/0x90
       driver_probe_device+0x2b0/0x2b0
       bus_for_each_dev+0x52/0x80
       bus_add_driver+0x1a3/0x220
       driver_register+0x56/0xd0
       usb_register_driver+0x77/0x130
       do_one_initcall+0x46/0x180
       free_vmap_area_noflush+0x38/0x70
       kmem_cache_alloc+0x84/0xc0
       do_init_module+0x50/0x1be
       load_module+0x1d8b/0x2100
       find_symbol_in_section+0xa0/0xa0
       SyS_finit_module+0x89/0x90
       entry_SYSCALL_64_fastpath+0x13/0x94
    Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43 RIP  [<ffffffff846617af>] __mutex_lock_slowpath+0x6f/0x100 RSP <ffff8801f36efb10>
    CR2: 0000000000000000

So, move it to the struct dvb_usb_device and initialize it
before calling the driver's callbacks.

Reported-by: Jörg Otte <jrg.otte@gmail.com>
Tested-by: Jörg Otte <jrg.otte@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'mlxsw-fixes'

Jiri Pirko says:

====================
mlxsw: Couple of fixes

Please, queue-up both for stable. Thanks!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum_router: Correctly dump neighbour activity

The device's neighbour table is periodically dumped in order to update
the kernel about active neighbours. A single dump session may span
multiple queries, until the response carries less records than requested
or when a record (can contain up to four neighbour entries) is not full.
Current code stops the session when the number of returned records is
zero, which can result in infinite loop in case of high packet rate.

Fix this by stopping the session according to the above logic.

Fixes: c723c735fa6b ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlxsw: spectrum: Fix refcount bug on span entries

When binding port to a newly created span entry, its refcount is
initialized to zero even though it has a bound port. That leads
to unexpected behaviour when the user tries to delete that port
from the span entry.

Fix this by initializing the reference count to 1.

Also add a warning to put function.

Fixes: 763b4b70afcd ("mlxsw: spectrum: Add support in matchall mirror TC offloading")
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bnxt_en-fixes'

Michael Chan says:

====================
bnxt_en: 2 bug fixes.

Bug fixes in bnxt_setup_tc() and VF vitual link state.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Fix VF virtual link state.

If the physical link is down and the VF virtual link is set to "enable",
the current code does not always work. If the link is down but the
cable is attached, the firmware returns LINK_SIGNAL instead of
NO_LINK. The current code is treating LINK_SIGNAL as link up.
The fix is to treat link as down when the link_status != LINK.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bnxt_en: Fix ring arithmetic in bnxt_setup_tc().

The logic is missing the check on whether the tx and rx rings are sharing
completion rings or not.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "include/uapi/linux/atm_zatm.h: include linux/time.h"

This reverts commit cf00713a655d ("include/uapi/linux/atm_zatm.h: include
linux/time.h").

This attempted to fix userspace breakage that no longer existed when
the patch was merged.  Almost one year earlier, commit 70ba07b675b5
("atm: remove 'struct zatm_t_hist'") deleted the struct in question.

After this patch was merged, we now have to deal with people being
unable to include this header in conjunction with standard C library
headers like stdlib.h (which linux-atm does).  Example breakage:
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../.. -I./../q2931 -I./../saal \
-I.  -DCPPFLAGS_TEST  -I../../src/include -O2 -march=native -pipe -g \
-frecord-gcc-switches -freport-bug -Wimplicit-function-declaration \
-Wnonnull -Wstrict-aliasing -Wparentheses -Warray-bounds \
-Wfree-nonheap-object -Wreturn-local-addr -fno-strict-aliasing -Wall \
-Wshadow -Wpointer-arith -Wwrite-strings -Wstrict-prototypes -c zntune.c
In file included from /usr/include/linux/atm_zatm.h:17:0,
                 from zntune.c:17:
/usr/include/linux/time.h:9:8: error: redefinition of ‘struct timespec’
struct timespec {
        ^
In file included from /usr/include/sys/select.h:43:0,
                 from /usr/include/sys/types.h:219,
                 from /usr/include/stdlib.h:314,
                 from zntune.c:9:
/usr/include/time.h:120:8: note: originally defined here
struct timespec
        ^

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: take care of truncations done by sk_filter()

With syzkaller help, Marco Grassi found a bug in TCP stack,
crashing in tcp_collapse()

Root cause is that sk_filter() can truncate the incoming skb,
but TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of TCP header could be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Marco Grassi <marco.gra@gmail.com>
Reported-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: use new_gw for redirect neigh lookup

In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
and then the state of the neigh for the new_gw is checked. If the state
isn't valid then the redirected route is deleted. This behavior is
maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
is assigned to peer->redirect_learned.a4 before calling
ipv4_neigh_lookup().

After commit 5943634fc559 ("ipv4: Maintain redirect and PMTU info in
struct rtable again."), ipv4_neigh_lookup() is performed without the
rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
isn't zero, the function uses it as the key. The neigh is most likely
valid since the old_gw is the one that sends the ICMP redirect message.
Then the new_gw is assigned to fib_nh_exception. The problem is: the
new_gw ARP may never gets resolved and the traffic is blackholed.

So, use the new_gw for neigh lookup.

Changes from v1:
- use __ipv4_neigh_lookup instead (per Eric Dumazet).

Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
Signed-off-by: Stephen Suryaputra Lin <ssurya@ieee.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8152: Fix error path in open function

If usb_submit_urb() called from the open function fails, the following
crash may be observed.

r8152 8-1:1.0 eth0: intr_urb submit failed: -19
...
r8152 8-1:1.0 eth0: v1.08.3
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b7b
pgd = ffffffc0e7305000
[6b6b6b6b6b6b6b7b] *pgd=0000000000000000, *pud=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
...
PC is at notifier_chain_register+0x2c/0x58
LR is at blocking_notifier_chain_register+0x54/0x70
...
Call trace:
[<ffffffc0002407f8>] notifier_chain_register+0x2c/0x58
[<ffffffc000240bdc>] blocking_notifier_chain_register+0x54/0x70
[<ffffffc00026991c>] register_pm_notifier+0x24/0x2c
[<ffffffbffc183200>] rtl8152_open+0x3dc/0x3f8 [r8152]
[<ffffffc000808000>] __dev_open+0xac/0x104
[<ffffffc0008082f8>] __dev_change_flags+0xb0/0x148
[<ffffffc0008083c4>] dev_change_flags+0x34/0x70
[<ffffffc000818344>] do_setlink+0x2c8/0x888
[<ffffffc0008199d4>] rtnl_newlink+0x328/0x644
[<ffffffc000819e98>] rtnetlink_rcv_msg+0x1a8/0x1d4
[<ffffffc0008373c8>] netlink_rcv_skb+0x68/0xd0
[<ffffffc000817990>] rtnetlink_rcv+0x2c/0x3c
[<ffffffc000836d1c>] netlink_unicast+0x16c/0x234
[<ffffffc00083720c>] netlink_sendmsg+0x340/0x364
[<ffffffc0007e85d0>] sock_sendmsg+0x48/0x60
[<ffffffc0007e9c30>] SyS_sendto+0xe0/0x120
[<ffffffc0007e9cb0>] SyS_send+0x40/0x4c
[<ffffffc000203e34>] el0_svc_naked+0x24/0x28

Clean up error handling to avoid registering the notifier if the open
function is going to fail.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

iio: maxim_thermocouple: detect invalid storage size in read()

As found by gcc -Wmaybe-uninitialized, having a storage_bytes value other
than 2 or 4 will result in undefined behavior:

drivers/iio/temperature/maxim_thermocouple.c: In function 'maxim_thermocouple_read':
drivers/iio/temperature/maxim_thermocouple.c:141:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This probably cannot happen, but returning -EINVAL here is appropriate
and makes gcc happy and the code more robust.

Fixes: 231147ee77f3 ("iio: maxim_thermocouple: Align 16 bit big endian value of raw reads")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
(cherry picked from commit 32cb7d27e65df9daa7cee8f1fdf7b259f214bee2)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y

Booting an EFI mixed mode kernel has been crashing since commit:

e37e43a497d5 ("x86/mm/64: Enable vmapped stacks (CONFIG_HAVE_ARCH_VMAP_STACK=y)")

The user-visible effect in my test setup was the kernel being unable
to find the root file system ramdisk. This was likely caused by silent
memory or page table corruption.

Enabling CONFIG_DEBUG_VIRTUAL=y immediately flagged the thunking code as
abusing virt_to_phys() because it was passing addresses that were not
part of the kernel direct mapping.

Use the slow version instead, which correctly handles all memory
regions by performing a page table walk.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20161112210424.5157-3-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/efi: Fix EFI memmap pointer size warning

Fix this when building on 32-bit:

  arch/x86/platform/efi/efi.c: In function ‘__efi_enter_virtual_mode’:
  arch/x86/platform/efi/efi.c:911:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
       (efi_memory_desc_t *)pa);
       ^
  arch/x86/platform/efi/efi.c:918:5: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
       (efi_memory_desc_t *)pa);
       ^

The @pa local variable is declared as phys_addr_t and that is a u64 when
CONFIG_PHYS_ADDR_T_64BIT=y. (The last is enabled on 32-bit on a PAE
build.)

However, its value comes from __pa() which is basically doing pointer
arithmetic and checking, and returns unsigned long as it is the native
pointer width.

So let's use an unsigned long too. It should be fine to do so because
the later users cast it to a pointer too.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20161112210424.5157-2-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>

net: bpqether.h: remove if_ether.h guard

__LINUX_IF_ETHER_H is not defined anywhere, and if_ether.h can keep itself from
double inclusion, though it uses a single underscore prefix.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: __skb_flow_dissect() must cap its return value

After Tom patch, thoff field could point past the end of the buffer,
this could fool some callers.

If an skb was provided, skb->len should be the upper limit.
If not, hlen is supposed to be the upper limit.

Fixes: a6e544b0a88b ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yibin Yang <yibyang@cisco.com
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fix-bpf_redirect'

Martin KaFai Lau says:

====================
bpf: Fix bpf_redirect to an ipip/ip6tnl dev

This patch set fixes a bug in bpf_redirect(dev, flags) when dev is an
ipip/ip6tnl. The current problem is IP-EthHdr-IP is sent out instead of
IP-IP.

Patch 1 adds a dev->type test similar to dev_is_mac_header_xmit()
in act_mirred.c which is only available in net-next. We can consider to
refactor it once this patch is pulled into net-next from net.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: Add test for bpf_redirect to ipip/ip6tnl

The test creates two netns, ns1 and ns2.  The host (the default netns)
has an ipip or ip6tnl dev configured for tunneling traffic to the ns2.

    ping VIPS from ns1 <----> host <--tunnel--> ns2 (VIPs at loopback)

The test is to have ns1 pinging VIPs configured at the loopback
interface in ns2.

The VIPs are 10.10.1.102 and 2401:face::66 (which are configured
at lo@ns2). [Note: 0x66 => 102].

At ns1, the VIPs are routed _via_ the host.

At the host, bpf programs are installed at the veth to redirect packets
from a veth to the ipip/ip6tnl.  The test is configured in a way so
that both ingress and egress can be tested.

At ns2, the ipip/ip6tnl dev is configured with the local and remote address
specified.  The return path is routed to the dev ipip/ip6tnl.

During egress test, the host also locally tests pinging the VIPs to ensure
that bpf_redirect at egress also works for the direct egress (i.e. not
forwarding from dev ve1 to ve2).

Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: Fix bpf_redirect to an ipip/ip6tnl dev

If the bpf program calls bpf_redirect(dev, 0) and dev is
an ipip/ip6tnl, it currently includes the mac header.
e.g. If dev is ipip, the end result is IP-EthHdr-IP instead
of IP-IP.

The fix is to pull the mac header.  At ingress, skb_postpull_rcsum()
is not needed because the ethhdr should have been pulled once already
and then got pushed back just before calling the bpf_prog.
At egress, this patch calls skb_postpull_rcsum().

If bpf_redirect(dev, BPF_F_INGRESS) is called,
it also fails now because it calls dev_forward_skb() which
eventually calls eth_type_trans(skb, dev).  The eth_type_trans()
will set skb->type = PACKET_OTHERHOST because the mac address
does not match the redirecting dev->dev_addr.  The PACKET_OTHERHOST
will eventually cause the ip_rcv() errors out.  To fix this,
____dev_forward_skb() is added.

Joint work with Daniel Borkmann.

Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aoe: fix crash in page count manipulation

aoeblk contains some mysterious code, that wants to elevate the bio
vec page counts while it's under IO. That is not needed, it's
fragile, and it's causing kernel oopses for some.

Reported-by: Tested-by: Don Koch <kochd@us.ibm.com>
Tested-by: Tested-by: Don Koch <kochd@us.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>

Merge tag 'perf-hists-hierarchy-fixes-for-mingo-20161111' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes for perf {top,report} --hierarchy, from Arnaldo Carvalho de Melo:

- These are fixes for the --hierarchy view of perf top and report, fixing
output oddities, mostly related to scrolling. (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

lightnvm: invalid offset calculation for lba_shift

The ns->lba_shift assumes its value to be the logarithmic of the
LA size. A previous patch duplicated the lba_shift calculation into
lightnvm. It prematurely also subtracted a 512byte shift, which commonly
is applied per-command. The 512byte shift being subtracted twice led to
data loss when restoring the logical to physical mapping table from
device and when issuing I/O commands using rrpc.

Fix offset by removing the 512byte shift subtraction when calculating
lba_shift.

Fixes: b0b4e09c1ae7 "lightnvm: control life of nvm_dev in driver"
Reported-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>

Merge tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
"Fix a recent regression in the 8250_dw serial driver introduced by
  adding a quirk for the APM X-Gene SoC to it which uncovered an issue
  related to the handling of built-in device properties in the core ACPI
  device enumeration code (Heikki Krogerus)"

* tag 'acpi-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / platform: Add support for build-in properties

Merge tag 'pm-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"These fix two bugs in error code paths in the PM core (system-wide
  suspend of devices), a device reference leak in the boot-time suspend
  test code and a cpupower utility regression from the 4.7 cycle.

  Specifics:

   - Prevent the PM core from attempting to suspend parent devices if
     any of their children, whose suspend callbacks were invoked
     asynchronously, have failed to suspend during the "late" and
     "noirq" phases of system-wide suspend of devices (Brian Norris).

   - Prevent the boot-time system suspend test code from leaking a
     reference to the RTC device used by it (Johan Hovold).

   - Fix cpupower to use the return value of one of its library
     functions correctly and restore the correct behavior of it when
     used for setting cpufreq tunables broken during the 4.7 development
     cycle (Laura Abbott)"

* tag 'pm-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
  PM / sleep: fix device reference leak in test_suspend
  cpupower: Correct return type of cpu_power_is_cpu_online() in cpufreq-set

Merge tag 'arc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

- mmap handler for dma ops as generic handler no longer works for us
   [Alexey]

- Fixes for EZChip platform [Noam]

- Fix RTC clocksource driver build issue

- ARC IRQ handling fixes [Yuriy]

- Revert a recent makefile change which doesn't go well with oldish
   tools out in the wild

* tag 'arc-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: MCIP: Use IDU_M_DISTRI_DEST mode if there is only 1 destination core
  ARC: IRQ: Do not use hwirq as virq and vice versa
  ARC: [plat-eznps] set default baud for early console
  ARC: [plat-eznps] remove IPI clear from SMP operations
  Revert "ARC: build: retire old toggles"
  ARC: timer: rtc: implement read loop in "C" vs. inline asm
  ARC: change return value of userspace cmpxchg assist syscall
  arc: Implement arch-specific dma_map_ops.mmap
  ARC: [SMP] avoid overriding present cpumask
  ARC: Enable PERF_EVENTS in nSIM driven platforms

Merge tag 'platform-drivers-x86-v4.9-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull x86 platform driver fixes from Darren Hart:
"Minor doc fix, a DMI match for ideapad and a fix to toshiba-wmi to
  avoid loading on non-toshiba systems.

  Documentation/ABI:
   - ibm_rtl: The "What:" fields are incomplete

  toshiba-wmi:
   - Fix loading the driver on non Toshiba laptops

  ideapad-laptop:
   - Add another DMI entry for Yoga 900"

* tag 'platform-drivers-x86-v4.9-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  Documentation/ABI: ibm_rtl: The "What:" fields are incomplete
  toshiba-wmi: Fix loading the driver on non Toshiba laptops
  ideapad-laptop: Add another DMI entry for Yoga 900

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Two small (really, one liners both of them!) fixes that should go into
  this series:

   - Request allocation error handling fix for nbd, from Christophe,
     fixing a regression in this series.

   - An oops fix for drbd. Not a regression in this series, but stable
     material. From Richard"

* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: Fix kernel_sendmsg() usage - potential NULL deref
  nbd: Fix error handling

Merge tag 'pci-v4.9-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

- Update MAINTAINERS for Intel VMD driver filename

- Update Rockchip rk3399 host bridge driver DTS and resets

- Fix ROM shadow problem that made some video device initialization
   fail

* tag 'pci-v4.9-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: VMD: Update filename to reflect move
  arm64: dts: rockchip: add three new resets for rk3399 PCIe controller
  PCI: rockchip: Add three new resets as required properties
  PCI: Don't attempt to claim shadow copies of ROM

Merge tag 'drm-fixes-for-v4.9-rc5' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"AMD, radeon, i915, imx, msm and udl fixes:

   - amdgpu/radeon have a number of power management regressions and
     fixes along with some better error checking

   - imx has a single regression fix

   - udl has a single kmalloc instead of stack for usb control msg fix

   - msm has some fixes for modesetting bugs and regressions

   - i915 has a one fix for a Sandybridge regression along with some
     others for DP audio.

  They all seem pretty okay at this stage, we've got one MST fix I know
  going through process for i915, but I expect it'll be next week"

* tag 'drm-fixes-for-v4.9-rc5' of git://people.freedesktop.org/~airlied/linux: (30 commits)
  drm/udl: make control msg static const. (v2)
  drm/amd/powerplay: implement get_clock_by_type for iceland.
  drm/amd/powerplay/smu7: fix checks in smu7_get_evv_voltages (v2)
  drm/amd/powerplay: update phm_get_voltage_evv_on_sclk for iceland
  drm/amd/powerplay: propagate errors in phm_get_voltage_evv_on_sclk
  drm/imx: disable planes before DC
  drm/amd/powerplay: return false instead of -EINVAL
  drm/amdgpu/powerplay/smu7: fix unintialized data usage
  drm/amdgpu: fix crash in acp_hw_fini
  drm/i915: Limit Valleyview and earlier to only using mappable scanout
  drm/i915: Round tile chunks up for constructing partial VMAs
  drm/i915/dp: Extend BDW DP audio workaround to GEN9 platforms
  drm/i915/dp: BDW cdclk fix for DP audio
  drm/i915/vlv: Prevent enabling hpd polling in late suspend
  drm/i915: Respect alternate_ddc_pin for all DDI ports
  drm/msm: Fix error handling crashes seen when VRAM allocation fails
  drm/msm/mdp5: 8x16 actually has 8 mixer stages
  drm/msm/mdp5: no scaling support on RGBn pipes for 8x16
  drm/msm/mdp5: handle non-fullscreen base plane case
  drm/msm: Set CLK_IGNORE_UNUSED flag for PLL clocks
  ...

Merge tag 'mmc-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
"MMC core:
   - Fix mmc card initialization for hosts not supporting HW busy
     detection
   - Fix mmc_test for sending commands during non-blocking write

  MMC host:
   - mxs: Avoid using an uninitialized
   - sdhci: Restore enhanced strobe setting during runtime resume
   - sdhci: Fix a couple of reset related issues
   - dw_mmc: Fix a reset controller issue"

* tag 'mmc-v4.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mxs: Initialize the spinlock prior to using it
  mmc: mmc: Use 500ms as the default generic CMD6 timeout
  mmc: mmc_test: Fix "Commands during non-blocking write" tests
  mmc: sdhci: Fix missing enhanced strobe setting during runtime resume
  mmc: sdhci: Reset cmd and data circuits after tuning failure
  mmc: sdhci: Fix unexpected data interrupt handling
  mmc: sdhci: Fix CMD line reset interfering with ongoing data transfer
  mmc: dw_mmc: add the "reset" as name of reset controller
  Documentation: synopsys-dw-mshc: add binding for reset-names

Merge tag 'pinctrl-v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"All is about drivers, no core business going on.

   - Fix a host of runtime problems with the Intel Cherryview driver:
     suspend/resume needs to be marshalled properly, and strange effects
     from BIOS interaction during suspend/resume need to be dealt with.

   - A single bit was being set wrong in the Aspeed driver.

   - Fix an iProc probe ordering fallout resulting from v4.9
     refactorings for bus population.

   - Do not specify a default trigger in the ST Micro cascaded GPIO IRQ
     controller: the kernel will moan.

   - Make IRQs optional altogether on the STM32 driver, it turns out not
     all systems have them or want them.

   - Fix a re-probe bug in the i.MX driver, it will eventually crash if
     probed repeatedly, not good"

* tag 'pinctrl-v4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl-aspeed-g5: Never set SCU90[6]
  pinctrl: cherryview: Prevent possible interrupt storm on resume
  pinctrl: cherryview: Serialize register access in suspend/resume
  pinctrl: imx: reset group index on probe
  pinctrl: stm32: move gpio irqs binding to optional
  pinctrl: stm32: remove dependency with interrupt controller
  pinctrl: st: don't specify default interrupt trigger
  pinctrl: iproc: Fix iProc and NSP GPIO support

Merge branches 'pm-tools-fixes' and 'pm-sleep-fixes'

* pm-tools-fixes:
  cpupower: Correct return type of cpu_power_is_cpu_online() in cpufreq-set

* pm-sleep-fixes:
  PM / sleep: don't suspend parent when async child suspend_{noirq, late} fails
  PM / sleep: fix device reference leak in test_suspend

Merge branch 'device-properties'

* device-properties:
ACPI / platform: Add support for build-in properties

Merge branch 'maybe-uninitialized' (patches from Arnd)

Merge fixes for -Wmaybe-uninitialized from Arnd Bergmann:
"It took a while for some patches to make it into mainline through
  maintainer trees, but the 28-patch series is now reduced to 10, with
  one tiny patch added at the end.

  Aside from patches that are no longer required, I did these changes
  compared to version 1:

   - Dropped "iio: maxim_thermocouple: detect invalid storage size in
     read()", which is currently in linux-next as commit 32cb7d27e65d.
     This is the only remaining warning I see for a couple of corner
     cases (kbuild bot reports it on blackfin, kernelci bot and arm-soc
     bot both report it on arm64)

   - Dropped "brcmfmac: avoid maybe-uninitialized warning in
     brcmf_cfg80211_start_ap", which is currently in net/master merge
     pending.

   - Dropped two x86 patches, "x86: math-emu: possible uninitialized
     variable use" and "x86: mark target address as output in 'insb'
     asm" as they do not seem to trigger for a default build, and I got
     no feedback on them. Both of these are ancient issues and seem
     harmless, I will send them again to the x86 maintainers once the
     rest is merged.

   - Dropped "rbd: false-postive gcc-4.9 -Wmaybe-uninitialized" based on
     feedback from Ilya Dryomov, who already has a different fix queued
     up for v4.10. The kbuild bot reports this as a warning for xtensa.

   - Replaced "crypto: aesni: avoid -Wmaybe-uninitialized warning" with
     a simpler patch, this one always triggers but my first solution
     would not be safe for linux-4.9 any more at this point. I'll follow
     up with the larger patch as a cleanup for 4.10.

   - Replaced "dib0700: fix nec repeat handling" with a better one,
     contributed by Sean Young"

* -Wmaybe-uninitialized fixes:
  Kbuild: enable -Wmaybe-uninitialized warnings by default
  pcmcia: fix return value of soc_pcmcia_regulator_set
  infiniband: shut up a maybe-uninitialized warning
  crypto: aesni: shut up -Wmaybe-uninitialized warning
  rc: print correct variable for z8f0811
  dib0700: fix nec repeat handling
  s390: pci: don't print uninitialized data for debugging
  nios2: fix timer initcall return value
  x86: apm: avoid uninitialized data
  NFSv4.1: work around -Wmaybe-uninitialized warning
  Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"15 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  lib/stackdepot: export save/fetch stack for drivers
  mm: kmemleak: scan .data.ro_after_init
  memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB
  coredump: fix unfreezable coredumping task
  mm/filemap: don't allow partially uptodate page for pipes
  mm/hugetlb: fix huge page reservation leak in private mapping error paths
  ocfs2: fix not enough credit panic
  Revert "console: don't prefer first registered if DT specifies stdout-path"
  mm: hwpoison: fix thp split handling in memory_failure()
  swapfile: fix memory corruption via malformed swapfile
  mm/cma.c: check the max limit for cma allocation
  scripts/bloat-o-meter: fix SIGPIPE
  shmem: fix pageflags after swapping DMA32 object
  mm, frontswap: make sure allocated frontswap map is assigned
  mm: remove extra newline from allocation stall warning

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull VFS fixes from Al Viro:
"Christoph's and Jan's aio fixes, fixup for generic_file_splice_read
  (removal of pointless detritus that actually breaks it when used for
  gfs2 ->splice_read()) and fixup for generic_file_read_iter()
  interaction with ITER_PIPE destinations."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  splice: remove detritus from generic_file_splice_read()
  mm/filemap: don't allow partially uptodate page for pipes
  aio: fix freeze protection of aio writes
  fs: remove aio_run_iocb
  fs: remove the never implemented aio_fsync file operation
  aio: hold an extra file reference over AIO read/write operations

Merge tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client

Pull Ceph fixes from Ilya Dryomov:
"Ceph's ->read_iter() implementation is incompatible with the new
  generic_file_splice_read() code that went into -rc1.  Switch to the
  less efficient default_file_splice_read() for now; the proper fix is
  being held for 4.10.

  We also have a fix for a 4.8 regression and a trival libceph fixup"

* tag 'ceph-for-4.9-rc5' of git://github.com/ceph/ceph-client:
  libceph: initialize last_linger_id with a large integer
  libceph: fix legacy layout decode with pool 0
  ceph: use default file splice read callback

Merge tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client bugfixes from Anna Schumaker:
"Most of these fix regressions in 4.9, and none are going to stable
  this time around.

  Bugfixes:
   - Trim extra slashes in v4 nfs_paths to fix tools that use this
   - Fix a -Wmaybe-uninitialized warnings
   - Fix suspicious RCU usages
   - Fix Oops when mounting multiple servers at once
   - Suppress a false-positive pNFS error
   - Fix a DMAR failure in NFS over RDMA"

* tag 'nfs-for-4.9-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect
  fs/nfs: Fix used uninitialized warn in nfs4_slot_seqid_in_use()
  NFS: Don't print a pNFS error if we aren't using pNFS
  NFS: Ignore connections that have cl_rpcclient uninitialized
  SUNRPC: Fix suspicious RCU usage
  NFSv4.1: work around -Wmaybe-uninitialized warning
  NFS: Trim extra slash in v4 nfs_path

Merge tag 'xfs-fixes-for-linus-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs

Pull xfs fix from Dave Chinner:
"This is a fix for an unmount hang (regression) when the filesystem is
  shutdown.  It was supposed to go to you for -rc3, but I accidentally
  tagged the commit prior to it in that pullreq.

  Summary:

   - fix for aborting deferred transactions on filesystem shutdown"

* tag 'xfs-fixes-for-linus-4.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
  xfs: defer should abort intent items if the trans roll fails

Kbuild: enable -Wmaybe-uninitialized warnings by default

Previously the warnings were added back at the W=1 level and above, this
now turns them on again by default, assuming that we have addressed all
warnings and again have a clean build for v4.10.

I found a number of new warnings in linux-next already and submitted
bugfixes for those. Hopefully they are caught by the 0day builder in
the future as soon as this patch is merged.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pcmcia: fix return value of soc_pcmcia_regulator_set

The newly introduced soc_pcmcia_regulator_set() function sometimes
returns without setting its return code, as shown by this warning:

drivers/pcmcia/soc_common.c: In function 'soc_pcmcia_regulator_set':
drivers/pcmcia/soc_common.c:112:5: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This changes it to propagate the regulator_disable() result instead.

Fixes: ac61b6001a63 ("pcmcia: soc_common: add support for Vcc and Vpp regulators")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

infiniband: shut up a maybe-uninitialized warning

Some configurations produce this harmless warning when built with gcc
-Wmaybe-uninitialized:

  infiniband/core/cma.c: In function 'cma_get_net_dev':
  infiniband/core/cma.c:1242:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized]

I previously reported this for the powerpc64 defconfig, but have now
reproduced the same thing for x86 as well, using gcc-5 or higher.

The code looks correct to me, and this change just rearranges it by
making sure we alway initialize the entire address structure to make the
warning disappear.  My first approach added an initialization at the
time of the declaration, which Doug commented may be too costly, so I
hope this version doesn't add overhead.

Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.7-rc6/buildall.powerpc.ppc64_defconfig.log.passed
Link: https://patchwork.kernel.org/patch/9212825/
Acked-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

crypto: aesni: shut up -Wmaybe-uninitialized warning

The rfc4106 encrypy/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:

  arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
  include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]

The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.

This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn on
this warning again by default without producing useless output.  A
follow-up patch for v4.10 rearranges the code to make the warning go
away.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>