git.proxmox.com Git - mirror_ubuntu-kernels.git/log

bonding: correct the MAC address for "follow" fail_over_mac policy

The "follow" fail_over_mac policy is useful for multiport devices that
either become confused or incur a performance penalty when multiple
ports are programmed with the same MAC address, but the same MAC
address still may happened by this steps for this policy:

1) echo +eth0 > /sys/class/net/bond0/bonding/slaves
   bond0 has the same mac address with eth0, it is MAC1.

2) echo +eth1 > /sys/class/net/bond0/bonding/slaves
   eth1 is backup, eth1 has MAC2.

3) ifconfig eth0 down
   eth1 became active slave, bond will swap MAC for eth0 and eth1,
   so eth1 has MAC1, and eth0 has MAC2.

4) ifconfig eth1 down
   there is no active slave, and eth1 still has MAC1, eth2 has MAC2.

5) ifconfig eth0 up
   the eth0 became active slave again, the bond set eth0 to MAC1.

Something wrong here, then if you set eth1 up, the eth0 and eth1 will have the same
MAC address, it will break this policy for ACTIVE_BACKUP mode.

This patch will fix this problem by finding the old active slave and
swap them MAC address before change active slave.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-4.2-20150716' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-07-16

this is a pull request of 2 patches by Stefan Agner. He fixes the resume
operation in the mcp251x driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "sit: Add gro callbacks to sit_offload"

This patch reverts 19424e052fb44da2f00d1a868cbb51f3e9f4bbb5 ("sit:
Add gro callbacks to sit_offload") because it generates packets
that cannot be handled even by our own GSO.

Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: bcm_sf2: do not use indirect reads and writes for 7445E0

7445E0 contains an ECO which disconnected the internal SF2 pseudo-PHY which was
known to conflict with the external pseudo-PHY of BCM53125 switches. This
motivated the need to utilize the internal SF2 MDIO controller via indirect
register reads/writes to control external Broadcom switches due to this address
conflict (both responded at address 30d).

For 7445E0, the internal pseudo-PHY of the SF2 switch got disconnected, and as
a consequence this prevents the internal SF2 MDIO bus controller from reading
data (reads back everything as 0) since the MDI line is tied low.

Fix this by making the indirect register reads and writes conditional to
7445D0, on 7445E0 we can utilize the SWITCH_MDIO controller (backed by
mdio-unimac and not the DSA created slave MII bus).

We utilize of_machine_is_compatible() here since this is the only way for use
to differentiate between these two chips in a way that does not violate layers
or becomes (too) vendor-specific.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: correctly handle bonding type change on enslave failure

If the bond is enslaving a device with different type it will be setup
by it, but if after being setup the enslave fails the bond doesn't
switch back its type and also keeps pointers to foreign structures that can
be long gone. Thus revert back any type changes if the enslave failed and
the bond had to change its type.
Example:
Before patch:
$ echo lo > bond0/bonding/slaves
-bash: echo: write error: Cannot assign requested address
$ ip l sh bond0
20: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default
    link/loopback 16:54:78:34:bd:41 brd 00:00:00:00:00:00
$ echo +eth1 > bond0/bonding/slaves
$ ip l sh bond0
20: bond0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
    link/ether 52:54:00:3f:47:69 brd ff:ff:ff:ff:ff:ff
(notice the MASTER flag is gone)

After patch:
$ echo lo > bond0/bonding/slaves
-bash: echo: write error: Cannot assign requested address
$ ip l sh bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default qlen 1000
    link/ether 6e:66:94:f6:07:fc brd ff:ff:ff:ff:ff:ff
$ echo +eth1 > bond0/bonding/slaves
$ ip l sh bond0
21: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop state DOWN
mode DEFAULT group default qlen 1000
    link/ether 52:54:00:3f:47:69 brd ff:ff:ff:ff:ff:ff

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: e36b9d16c6a6 ("bonding: clean muticast addresses when device changes type")
Signed-off-by: David S. Miller <davem@davemloft.net>

bonding: fix destruction of bond with devices different from arphrd_ether

When the bonding is being unloaded and the netdevice notifier is
unregistered it executes NETDEV_UNREGISTER for each device which should
remove the bond's proc entry but if the device enslaved is not of
ARPHRD_ETHER type and is in front of the bonding, it may execute
bond_release_and_destroy() first which would release the last slave and
destroy the bond device leaving the proc entry and thus we will get the
following error (with dynamic debug on for bond_netdev_event to see the
events order):
[  908.963051] eql: event: 9
[  908.963052] eql: IFF_SLAVE
[  908.963054] eql: event: 2
[  908.963056] eql: IFF_SLAVE
[  908.963058] eql: event: 6
[  908.963059] eql: IFF_SLAVE
[  908.963110] bond0: Releasing active interface eql
[  908.976168] bond0: Destroying bond bond0
[  908.976266] bond0 (unregistering): Released all slaves
[  908.984097] ------------[ cut here ]------------
[  908.984107] WARNING: CPU: 0 PID: 1787 at fs/proc/generic.c:575
remove_proc_entry+0x112/0x160()
[  908.984110] remove_proc_entry: removing non-empty directory
'net/bonding', leaking at least 'bond0'
[  908.984111] Modules linked in: bonding(-) eql(O) 9p nfsd auth_rpcgss
oid_registry nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev qxl drm_kms_helper
snd_hda_codec_generic aesni_intel ttm aes_x86_64 glue_helper pcspkr lrw
gf128mul ablk_helper cryptd snd_hda_intel virtio_console snd_hda_codec
psmouse serio_raw snd_hwdep snd_hda_core 9pnet_virtio 9pnet evdev joydev
drm virtio_balloon snd_pcm snd_timer snd soundcore i2c_piix4 i2c_core
pvpanic acpi_cpufreq parport_pc parport processor thermal_sys button
autofs4 ext4 crc16 mbcache jbd2 hid_generic usbhid hid sg sr_mod cdrom
ata_generic virtio_blk virtio_net floppy ata_piix e1000 libata ehci_pci
virtio_pci scsi_mod uhci_hcd ehci_hcd virtio_ring virtio usbcore
usb_common [last unloaded: bonding]

[  908.984168] CPU: 0 PID: 1787 Comm: rmmod Tainted: G        W  O
4.2.0-rc2+ #8
[  908.984170] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  908.984172]  0000000000000000 ffffffff81732d41 ffffffff81525b34
ffff8800358dfda8
[  908.984175]  ffffffff8106c521 ffff88003595af78 ffff88003595af40
ffff88003e3a4280
[  908.984178]  ffffffffa058d040 0000000000000000 ffffffff8106c59a
ffffffff8172ebd0
[  908.984181] Call Trace:
[  908.984188]  [<ffffffff81525b34>] ? dump_stack+0x40/0x50
[  908.984193]  [<ffffffff8106c521>] ? warn_slowpath_common+0x81/0xb0
[  908.984196]  [<ffffffff8106c59a>] ? warn_slowpath_fmt+0x4a/0x50
[  908.984199]  [<ffffffff81218352>] ? remove_proc_entry+0x112/0x160
[  908.984205]  [<ffffffffa05850e6>] ? bond_destroy_proc_dir+0x26/0x30
[bonding]
[  908.984208]  [<ffffffffa057540e>] ? bond_net_exit+0x8e/0xa0 [bonding]
[  908.984217]  [<ffffffff8142f407>] ? ops_exit_list.isra.4+0x37/0x70
[  908.984225]  [<ffffffff8142f52d>] ?
unregister_pernet_operations+0x8d/0xd0
[  908.984228]  [<ffffffff8142f58d>] ?
unregister_pernet_subsys+0x1d/0x30
[  908.984232]  [<ffffffffa0585269>] ? bonding_exit+0x23/0xdba [bonding]
[  908.984236]  [<ffffffff810e28ba>] ? SyS_delete_module+0x18a/0x250
[  908.984241]  [<ffffffff81086f99>] ? task_work_run+0x89/0xc0
[  908.984244]  [<ffffffff8152b732>] ?
entry_SYSCALL_64_fastpath+0x16/0x75
[  908.984247] ---[ end trace 7c006ed4abbef24b ]---

Thus remove the proc entry manually if bond_release_and_destroy() is
used. Because of the checks in bond_remove_proc_entry() it's not a
problem for a bond device to change namespaces (the bug fixed by the
Fixes commit) but since commit
f9399814927ad ("bonding: Don't allow bond devices to change network
namespaces.") that can't happen anyway.

Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: a64d49c3dd50 ("bonding: Manage /proc/net/bonding/ entries from
                      the netdev events")
Tested-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: fix fid_mask when leaving bridge

The mv88e6xxx_priv_state structure contains an fid_mask, where 1 means
the FID is free to use, 0 means the FID is in use.

This patch fixes the bit clear in mv88e6xxx_leave_bridge() when
assigning a new FID to a port.

Example scenario: I have 7 ports, port 5 is CPU, port 6 is unused (no
PHY). After setting the ports 0, 1 and 2 in bridge br0, and ports 3 and
4 in bridge br1, I have the following fid_mask: 0b111110010110 (0xf96).

Indeed, br0 uses FID 0, and br1 uses FID 3.

After setting nomaster for port 0, I get the wrong fid_mask: 0b10 (0x2).

With this patch we correctly get 0b111110010100 (0xf94), meaning port 0
uses FID 1, br0 uses FID 0, and br1 uses FID 3.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: don't require ANY_LAYOUT with VERSION_1

ANY_LAYOUT is a compatibility feature. It's implied
for VERSION_1 devices, and non-transitional devices
might not offer it. Change code to behave accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ravb: do not invalidate cache for RX buffer twice

First, dma_sync_single_for_cpu() shouldn't have been called in the first place
(it's a streaming DMA API), dma_unmap_single() should have been called instead.
Second, dma_unmap_single() call after handing the buffer to napi_gro_receive()
makes little sense. Moreover desc->dptr might not be valid at this point.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: mcp251x: get regulators optionally

The regulators power and transceiver are optional. If those are not
present, the pointer (or error pointer) is correctly handled by the
driver, hence we can use devm_regulator_get_optional safely, which
avoids regulators getting created.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: mcp251x: fix resume when device is down

If a valid power regulator or a dummy regulator is used (which
happens to be the case when no regulator is specified), restart_work
is queued no matter whether the device was running or not at suspend
time. Since work queues get initialized in the ndo_open callback,
resuming leads to a NULL pointer exception.

Reverse exactly the steps executed at suspend time:
- Enable the power regulator in any case
- Enable the transceiver regulator if the device was running, even in
case we have a power regulator
- Queue restart_work only in case the device was running

Fixes: bf66f3736a94 ("can: mcp251x: Move to threaded interrupts instead of workqueues.")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Johan Hedberg says:

====================
pull request: bluetooth 2015-07-15

Here's a regression fix for Broadcom Bluetooth adapters found at least
in certain Apple laptops. The issue was introduced in 4.1 so there's the
appropriate "Cc: stable" entry for it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tc: act_bpf: fix memory leak

prog->bpf_ops is populated when act_bpf is used with classic BPF and
prog->bpf_name is optionally used with extended BPF.
Fix memory leak when act_bpf is released.

Fixes: d23b8ad8ab23 ("tc: add BPF based action")
Fixes: a8cb5f556b56 ("act_bpf: add initial eBPF support for actions")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fq_codel: fix return value of fq_codel_drop()

The ->drop() is supposed to return the number of bytes it dropped,
however fq_codel_drop() returns the index of the flow where it drops
a packet from.

Fix this by introducing a helper to wrap fq_codel_drop().

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net_sched: fix a use-after-free in sfq

Fixes: 25331d6ce42b ("net: sched: implement qstat helper routines")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipvlan'

Konstantin Khlebnikov says:

====================
ipvlan: cleanups and fixes

v1: http://comments.gmane.org/gmane.linux.network/363346
v2: http://comments.gmane.org/gmane.linux.network/369086

v3 has reduced set of patches from "ipvlan: fix ipv6 autoconfiguration".
Here just cleanups and patch which ignores ipv6 notifications from RA.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: ignore addresses from ipv6 autoconfiguration

Inet6addr notifier is atomic and runs in bh context without RTNL when
ipv6 receives router advertisement packet and performs autoconfiguration.

Proper fix still in discussion. Let's at least plug the bug.
v1: http://lkml.kernel.org/r/20150514135618.14062.1969.stgit@buzz
v2: http://lkml.kernel.org/r/20150703125840.24121.91556.stgit@buzz

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: use rcu_deference_bh() in ipvlan_queue_xmit()

In tx path rcu_read_lock_bh() is held, so we need rcu_deference_bh().
This fixes the following warning:

===============================
[ INFO: suspicious RCU usage. ]
4.1.0-rc1+ #1007 Not tainted
-------------------------------
drivers/net/ipvlan/ipvlan.h:106 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by dhclient/1076:
  #0:  (rcu_read_lock_bh){......}, at: [<ffffffff817e8d84>] rcu_lock_acquire+0x0/0x26

stack backtrace:
CPU: 2 PID: 1076 Comm: dhclient Not tainted 4.1.0-rc1+ #1007
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000001 ffff8800d381bac8 ffffffff81a4154f 000000003c1a3c19
  ffff8800d4d0a690 ffff8800d381baf8 ffffffff810b849f ffff880117d41148
  ffff880117d40000 ffff880117d40068 0000000000000156 ffff8800d381bb18
Call Trace:
  [<ffffffff81a4154f>] dump_stack+0x4c/0x65
  [<ffffffff810b849f>] lockdep_rcu_suspicious+0x107/0x110
  [<ffffffff8165a522>] ipvlan_port_get_rcu+0x47/0x4e
  [<ffffffff8165ad14>] ipvlan_queue_xmit+0x35/0x450
  [<ffffffff817ea45d>] ? rcu_read_unlock+0x3e/0x5f
  [<ffffffff810a20bf>] ? local_clock+0x19/0x22
  [<ffffffff810b4781>] ? __lock_is_held+0x39/0x52
  [<ffffffff8165b64c>] ipvlan_start_xmit+0x1b/0x44
  [<ffffffff817edf7f>] dev_hard_start_xmit+0x2ae/0x467
  [<ffffffff817ee642>] __dev_queue_xmit+0x50a/0x60c
  [<ffffffff817ee7a7>] dev_queue_xmit_sk+0x13/0x15
  [<ffffffff81997596>] dev_queue_xmit+0x10/0x12
  [<ffffffff8199b41c>] packet_sendmsg+0xb6b/0xbdf
  [<ffffffff810b5ea7>] ? mark_lock+0x2e/0x226
  [<ffffffff810a1fcc>] ? sched_clock_cpu+0x9e/0xb7
  [<ffffffff817d56f9>] sock_sendmsg_nosec+0x12/0x1d
  [<ffffffff817d7257>] sock_sendmsg+0x29/0x2e
  [<ffffffff817d72cc>] sock_write_iter+0x70/0x91
  [<ffffffff81199563>] __vfs_write+0x7e/0xa7
  [<ffffffff811996bc>] vfs_write+0x92/0xe8
  [<ffffffff811997d7>] SyS_write+0x47/0x7e
  [<ffffffff81a4d517>] system_call_fastpath+0x12/0x6f

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Cc: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: unhash addresses without synchronize_rcu

All structures used in traffic forwarding are rcu-protected:
ipvl_addr, ipvl_dev and ipvl_port. Thus we can unhash addresses
without synchronization. We'll anyway hash it back into the same
bucket: in worst case lockless lookup will scan hash once again.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: plug memory leak in ipvlan_link_delete

Add missing kfree_rcu(addr, rcu);

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: remove counters of ipv4 and ipv6 addresses

They are unused after commit f631c44bbe15 ("ipvlan: Always set broadcast bit in
multicast filter").

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-4.2-20150715' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-07-15

this is a pull request of 12 patches by me.

This series fixes the use of the skb after netif_receive_skb() / netif_rx()
which exists in several drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: lock socket in ip6_datagram_connect()

ip6_datagram_connect() is doing a lot of socket changes without
socket being locked.

This looks wrong, at least for udp_lib_rehash() which could corrupt
lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'isdn-gigaset-fixes'

Tilman Schmidt says:

====================
Fix long-standing regression in ser_gigaset ISDN driver

This series fixes a serious regression in the Gigaset M101 driver
introduced in kernel release 3.10 and removes some unneeded code.

Please also queue up patch 1 of the series for inclusion in the
stable/longterm releases 3.10 and later.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

isdn/gigaset: drop unused ldisc methods

The line discipline read and write methods are optional so the dummy
methods in ser_gigaset are unnecessary and can be removed.

Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

isdn/gigaset: reset tty->receive_room when attaching ser_gigaset

Commit 79901317ce80 ("n_tty: Don't flush buffer when closing ldisc"),
first merged in kernel release 3.10, caused the following regression
in the Gigaset M101 driver:

Before that commit, when closing the N_TTY line discipline in
preparation to switching to N_GIGASET_M101, receive_room would be
reset to a non-zero value by the call to n_tty_flush_buffer() in
n_tty's close method. With the removal of that call, receive_room
might be left at zero, blocking data reception on the serial line.

The present patch fixes that regression by setting receive_room
to an appropriate value in the ldisc open method.

Fixes: 79901317ce80 ("n_tty: Don't flush buffer when closing ldisc")
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>

fq_codel: fix a use-after-free

Fixes: 25331d6ce42b ("net: sched: implement qstat helper routines")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: don't use F-RTO on non-recurring timeouts

Currently F-RTO may repeatedly send new data packets on non-recurring
timeouts in CA_Loss mode. This is a bug because F-RTO (RFC5682)
should only be used on either new recovery or recurring timeouts.

This exacerbates the recovery progress during frequent timeout &
repair, because we prioritize sending new data packets instead of
repairing the holes when the bandwidth is already scarce.

Fix it by correcting the test of a new recovery episode.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: mdb: fix double add notification

Since the mdb add/del code was introduced there have been 2 br_mdb_notify
calls when doing br_mdb_add() resulting in 2 notifications on each add.

Example:
Command: bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
Before patch:
root@debian:~# bridge monitor all
[MDB]dev br0 port eth1 grp 239.0.0.1 permanent
[MDB]dev br0 port eth1 grp 239.0.0.1 permanent

After patch:
root@debian:~# bridge monitor all
[MDB]dev br0 port eth1 grp 239.0.0.1 permanent

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: cfd567543590 ("bridge: add support of adding and deleting mdb entries")
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave

A report with INCLUDE/Change_to_include and empty source list should be
treated as a leave, specified by RFC 3376, section 3.1:
"If the requested filter mode is INCLUDE *and* the requested source
list is empty, then the entry corresponding to the requested
interface and multicast address is deleted if present. If no such
entry is present, the request is ignored."

Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix skb csum races when peeking

When we calculate the checksum on the recv path, we store the
result in the skb as an optimisation in case we need the checksum
again down the line.

This is in fact bogus for the MSG_PEEK case as this is done without
any locking. So multiple threads can peek and then store the result
to the same skb, potentially resulting in bogus skb states.

This patch fixes this by only storing the result if the skb is not
shared. This preserves the optimisations for the few cases where
it can be done safely due to locking or other reasons, e.g., SIOCINQ.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "net: fec: Ensure clocks are enabled while using mdio bus"

This reverts commit 6c3e921b18edca290099adfddde8a50236bf2d80.

commit 6c3e921b18ed ("net: fec: Ensure clocks are enabled while using mdio
bus") prevents the kernel to boot on mx6 boards, so let's revert it.

Reported-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

NET: AX.25: Stop heartbeat timer on disconnect.

This may result in a kernel panic. The bug has always existed but
somehow we've run out of luck now and it bites.

Signed-off-by: Richard Stearn <richard@rns-stearn.demon.co.uk>
Cc: stable@vger.kernel.org # all branches
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Clone skb before setting peeked flag

Shared skbs must not be modified and this is crucial for broadcast
and/or multicast paths where we use it as an optimisation to avoid
unnecessary cloning.

The function skb_recv_datagram breaks this rule by setting peeked
without cloning the skb first. This causes funky races which leads
to double-free.

This patch fixes this by cloning the skb and replacing the skb
in the list when setting skb->peeked.

Fixes: a59322be07c9 ("[UDP]: Only increment counter on first peek/recv")
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: reject non-IFLA_VF_PORT attributes inside IFLA_VF_PORTS

Similarly as in commit 4f7d2cdfdde7 ("rtnetlink: verify IFLA_VF_INFO
attributes before passing them to driver"), we have a double nesting
of netlink attributes, i.e. IFLA_VF_PORTS only contains IFLA_VF_PORT
that is nested itself. While IFLA_VF_PORTS is a verified attribute
from ifla_policy[], we only check if the IFLA_VF_PORTS container has
IFLA_VF_PORT attributes and then pass the attribute's content itself
via nla_parse_nested(). It would be more correct to reject inner types
other than IFLA_VF_PORT instead of continuing parsing and also similarly
as in commit 4f7d2cdfdde7, to check for a minimum of NLA_HDRLEN.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Scott Feldman <sfeldma@gmail.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: pcan_usb: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: usb_8dev: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Bernd Krumboeck <b.krumboeck@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: ems_usb: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Gerhard Uttenthaler <uttenthaler@ems-wuensche.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: esd_usb2: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Thomas Körper <thomas.koerper@esd.eu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: sja1000: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: cc770: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: ti_heccn: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Anant Gole <anantgole@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: slcan: don't touch skb after netif_rx_ni()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: grcan: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: bfin_can: don't touch skb after netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Cc: Aaron Wu <Aaron.wu@analog.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: flexcan: don't touch skb after netif_receive_skb()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: at91_can: don't touch skb after netif_receive_skb()/netif_rx()

There is no guarantee that the skb is in the same state after calling
net_receive_skb() or netif_rx(). It might be freed or reused. Not really
harmful as its a read access, except you turn on the proper debugging options
which catch a use after free.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

net/xen-netback: off by one in BUG_ON() condition

The > should be >=. I also added spaces around the '-' operations so
the code is a little more consistent and matches the condition better.

Fixes: f53c3fe8dad7 ('xen-netback: Introduce TX grant mapping')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Bluetooth: btbcm: allow btbcm_read_verbose_config to fail on Apple

Commit 1c8ba6d013 moved around the setup code for broadcomm chips,
and also added btbcm_read_verbose_config() to read extra information
about the hardware. It's returning errors on some macbooks:

Bluetooth: hci0: BCM: Read verbose config info failed (-16)

Which makes us error out of the setup function. Since this
probe isn't critical to operate the chip, this patch just changes
things to carry on when it fails.

Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: stable@vger.kernel.org # v4.1

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Missing list head init in bluetooth hidp session creation, from Tedd
    Ho-Jeong An.

2) Don't leak SKB in bridge netfilter error paths, from Florian
    Westphal.

3) ipv6 netdevice private leak in netfilter bridging, fixed by Julien
    Grall.

4) Fix regression in IP over hamradio bpq encapsulation, from Ralf
    Baechle.

5) Fix race between rhashtable resize events and table walks, from Phil
    Sutter.

6) Missing validation of IFLA_VF_INFO netlink attributes, fix from
    Daniel Borkmann.

7) Missing security layer socket state initialization in tipc code,
    from Stephen Smalley.

8) Fix shared IRQ handling in boomerang 3c59x interrupt handler, from
    Denys Vlasenko.

9) Missing minor_idr destroy on module unload on macvtap driver, from
    Johannes Thumshirn.

10) Various pktgen kernel thread races, from Oleg Nesterov.

11) Fix races that can cause packets to be processed in the backlog even
    after a device attached to that SKB has been fully unregistered.
    From Julian Anastasov.

12) bcmgenet driver doesn't account packet drops vs.  errors properly,
    fix from Petri Gynther.

13) Array index validation and off by one fix in DSA layer from Florian
    Fainelli

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (66 commits)
  can: replace timestamp as unique skb attribute
  ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux
  can: c_can: Fix default pinmux glitch at init
  can: rcar_can: unify error messages
  can: rcar_can: print request_irq() error code
  can: rcar_can: fix typo in error message
  can: rcar_can: print signed IRQ #
  can: rcar_can: fix IRQ check
  net: dsa: Fix off-by-one in switch address parsing
  net: dsa: Test array index before use
  net: switchdev: don't abort unsupported operations
  net: bcmgenet: fix accounting of packet drops vs errors
  cdc_ncm: update specs URL
  Doc: z8530book: Fix typo in API-z8530-sync-txdma-open.html
  net: inet_diag: always export IPV6_V6ONLY sockopt for listening sockets
  bridge: mdb: allow the user to delete mdb entry if there's a querier
  net: call rcu_read_lock early in process_backlog
  net: do not process device backlog during unregistration
  bridge: fix potential crash in __netdev_pick_tx()
  net: axienet: Fix devm_ioremap_resource return value check
  ...

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This fixes a duplicate dma_unmap_sg call in omap-des and reentrancy
  bugs in the powerpc nx driver which may cause bogus output or worse
  memory corruption"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: nx - Fix reentrancy bugs
  crypto: omap-des - Fix unmapping of dma channels

Merge tag 'linux-can-fixes-for-4.2-20150712' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-07-12

this is a pull request of 8 patchs for net/master.

Sergei Shtylyov contributes 5 patches for the rcar_can driver, fixing the IRQ
check and several info and error messages. There are two patches by J.D.
Schroeder and Roger Quadros for the c_can driver and dra7x-evm device tree,
which precent a glitch in the DCAN1 pinmux. Oliver Hartkopp provides a better
approach to make the CAN skbs unique, the timestamp is replaced by a counter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 4.2-rc2

Revert "drm/i915: Use crtc_state->active in primary check_plane func"

This reverts commit dec4f799d0a4c9edae20512fa60b0a36f3299ca2.

Jörg Otte reports a NULL pointder dereference due to this commit, as
'crtc_state' very much can be NULL:

crtc_state = state->base.state ?
intel_atomic_get_crtc_state(state->base.state, intel_crtc) : NULL;

So the change to test 'crtc_state->base.active' cannot possibly be
correct as-is.

There may be some other minimal fix (like just checking crtc_state for
NULL), but I'm just reverting it now for the rc2 release, and people
like Daniel Vetter who actually know this code will figure out what the
right solution is in the longer term.

Reported-and-bisected-by: Jörg Otte <jrg.otte@gmail.com>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
CC: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull VFS fixes from Al Viro:
"Fixes for this cycle regression in overlayfs and a couple of
  long-standing (== all the way back to 2.6.12, at least) bugs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  freeing unlinked file indefinitely delayed
  fix a braino in ovl_d_select_inode()
  9p: don't leave a half-initialized inode sitting around

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"A fair number of 4.2 fixes also because Markos opened the flood gates.

   - Patch up the math used calculate the location for the page bitmap.

   - The FDC (Not what you think, FDC stands for Fast Debug Channel) IRQ
     around was causing issues on non-Malta platforms, so move the code
     to a Malta specific location.

   - A spelling fix replicated through several files.

   - Fix to the emulation of an R2 instruction for R6 cores.

   - Fix the JR emulation for R6.

   - Further patching of mindless 64 bit issues.

   - Ensure the kernel won't crash on CPUs with L2 caches with >= 8
     ways.

   - Use compat_sys_getsockopt for O32 ABI on 64 bit kernels.

   - Fix cache flushing for multithreaded cores.

   - A build fix"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: O32: Use compat_sys_getsockopt.
  MIPS: c-r4k: Extend way_string array
  MIPS: Pistachio: Support CDMM & Fast Debug Channel
  MIPS: Malta: Make GIC FDC IRQ workaround Malta specific
  MIPS: c-r4k: Fix cache flushing for MT cores
  Revert "MIPS: Kconfig: Disable SMP/CPS for 64-bit"
  MIPS: cps-vec: Use macros for various arithmetics and memory operations
  MIPS: kernel: cps-vec: Replace KSEG0 with CKSEG0
  MIPS: kernel: cps-vec: Use ta0-ta3 pseudo-registers for 64-bit
  MIPS: kernel: cps-vec: Replace mips32r2 ISA level with mips64r2
  MIPS: kernel: cps-vec: Replace 'la' macro with PTR_LA
  MIPS: kernel: smp-cps: Fix 64-bit compatibility errors due to pointer casting
  MIPS: Fix erroneous JR emulation for MIPS R6
  MIPS: Fix branch emulation for BLTC and BGEC instructions
  MIPS: kernel: traps: Fix broken indentation
  MIPS: bootmem: Don't use memory holes for page bitmap
  MIPS: O32: Do not handle require 32 bytes from the stack to be readable.
  MIPS, CPUFREQ: Fix spelling of Institute.
  MIPS: Lemote 2F: Fix build caused by recent mass rename.

can: replace timestamp as unique skb attribute

Commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for
overlapping CAN filters" requires the skb->tstamp to be set to check for
identical CAN skbs.

Without timestamping to be required by user space applications this timestamp
was not generated which lead to commit 36c01245eb8 "can: fix loss of CAN frames
in raw_rcv" - which forces the timestamp to be set in all CAN related skbuffs
by introducing several __net_timestamp() calls.

This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb()
to add __net_timestamp() after skbuff creation to prevent the frame loss fixed
in mainline Linux.

This patch removes the timestamp dependency and uses an atomic counter to
create an unique identifier together with the skbuff pointer.

Btw: the new skbcnt element introduced in struct can_skb_priv has to be
initialized with zero in out-of-tree drivers which are not using
alloc_can{,fd}_skb() too.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

ARM: dts: dra7x-evm: Prevent glitch on DCAN1 pinmux

Driver core sets "default" pinmux on on probe and CAN driver
sets "sleep" pinmux during register. This causes a small window
where the CAN pins are in "default" state with the DCAN module
being disabled.

Change the "default" state to be like sleep so this glitch is
avoided. Add a new "active" state that is used by the driver
when CAN is actually active.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: c_can: Fix default pinmux glitch at init

The previous change 3973c526ae9c (net: can: c_can: Disable pins when CAN
interface is down) causes a slight glitch on the pinctrl settings when used.
Since commit ab78029 (drivers/pinctrl: grab default handles from device core),
the device core will automatically set the default pins. This causes the pins
to be momentarily set to the default and then to the sleep state in
register_c_can_dev(). By adding an optional "enable" state, boards can set the
default pin state to be disabled and avoid the glitch when the switch from
default to sleep first occurs. If the "enable" state is not available
c_can_pinctrl_select_state() falls back to using the "default" pinctrl state.

[Roger Q] - Forward port to v4.2 and use pinctrl_get_select().

Signed-off-by: J.D. Schroeder <jay.schroeder@garmin.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: rcar_can: unify error messages

All the error messages in the driver but the ones from devm_clk_get() failures
use similar format. Make those two messages consitent with others.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: rcar_can: print request_irq() error code

Also print the error code when the request_irq() call fails in rcar_can_open(),
rewording the error message...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: rcar_can: fix typo in error message

Fix typo in the first error message printed by rcar_can_open().

Based on the original patch by Vladimir Barinov.

Fixes: 862e2b6af941 ("can: rcar_can: support all input clocks")
Reported-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: rcar_can: print signed IRQ #

Printing IRQ # using "%x" and "%u" unsigned formats isn't quite correct as
'ndev->irq' is of type *int*, so the "%d" format needs to be used instead.

While fixing this, beautify the dev_info() message in rcar_can_probe() a bit.

Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: rcar_can: fix IRQ check

rcar_can_probe() regards 0 as a wrong IRQ #, despite platform_get_irq() that it
calls returns negative error code in that case. This leads to the following
being printed to the console when attempting to open the device:

error requesting interrupt fffffffa

because rcar_can_open() calls request_irq() with a negative IRQ #, and that
function naturally fails with -EINVAL.

Check for the negative error codes instead and propagate them upstream instead
of just returning -ENODEV.

Fixes: fd1159318e55 ("can: add Renesas R-Car CAN driver")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

- the high latency PIT detection fix, which slipped through the cracks
   for rc1

- a regression fix for the early printk mechanism

- the x86 part to plug irq/vector related hotplug races

- move the allocation of the espfix pages on cpu hotplug to non atomic
   context.  The current code triggers a might_sleep() warning.

- a series of KASAN fixes addressing boot crashes and usability

- a trivial typo fix for Kconfig help text

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/kconfig: Fix typo in the CONFIG_CMDLINE_BOOL help text
  x86/irq: Retrieve irq data after locking irq_desc
  x86/irq: Use proper locking in check_irq_vectors_for_cpu_disable()
  x86/irq: Plug irq vector hotplug race
  x86/earlyprintk: Allow early_printk() to use console style parameters like '115200n8'
  x86/espfix: Init espfix on the boot CPU side
  x86/espfix: Add 'cpu' parameter to init_espfix_ap()
  x86/kasan: Move KASAN_SHADOW_OFFSET to the arch Kconfig
  x86/kasan: Add message about KASAN being initialized
  x86/kasan: Fix boot crash on AMD processors
  x86/kasan: Flush TLBs after switching CR3
  x86/kasan: Fix KASAN shadow region page tables
  x86/init: Clear 'init_level4_pgt' earlier
  x86/tsc: Let high latency PIT fail fast in quick_pit_calibrate()

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
"This update from the timer departement contains:

   - A series of patches which address a shortcoming in the tick
     broadcast code.

     If the broadcast device is not available or an hrtimer emulated
     broadcast device, some of the original assumptions lead to boot
     failures.  I rather plugged all of the corner cases instead of only
     addressing the issue reported, so the change got a little larger.

     Has been extensivly tested on x86 and arm.

   - Get rid of the last holdouts using do_posix_clock_monotonic_gettime()

   - A regression fix for the imx clocksource driver

   - An update to the new state callbacks mechanism for clockevents.
     This is required to simplify the conversion, which will take place
     in 4.3"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick/broadcast: Prevent NULL pointer dereference
  time: Get rid of do_posix_clock_monotonic_gettime
  cris: Replace do_posix_clock_monotonic_gettime()
  tick/broadcast: Unbreak CONFIG_GENERIC_CLOCKEVENTS=n build
  tick/broadcast: Handle spurious interrupts gracefully
  tick/broadcast: Check for hrtimer broadcast active early
  tick/broadcast: Return busy when IPI is pending
  tick/broadcast: Return busy if periodic mode and hrtimer broadcast
  tick/broadcast: Move the check for periodic mode inside state handling
  tick/broadcast: Prevent deep idle if no broadcast device available
  tick/broadcast: Make idle check independent from mode and config
  tick/broadcast: Sanity check the shutdown of the local clock_event
  tick/broadcast: Prevent hrtimer recursion
  clockevents: Allow set-state callbacks to be optional
  clocksource/imx: Define clocksource for mx27

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
"A single fix for a cpu hotplug race vs. interrupt descriptors:

  Prevent irq setup/teardown across the cpu starting/dying parts of cpu
  hotplug so that the starting/dying cpu has a stable view of the
  descriptor space.  This has been an issue for all architectures in the
  cpu dying phase, where interrupts are migrated away from the dying
  cpu.  In the starting phase its mostly a x86 issue vs the vector space
  update"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  hotplug: Prevent alloc/free of irq descriptors during cpu up/down

freeing unlinked file indefinitely delayed

Normally opening a file, unlinking it and then closing will have
the inode freed upon close() (provided that it's not otherwise busy and
has no remaining links, of course).  However, there's one case where that
does *not* happen.  Namely, if you open it by fhandle with cold dcache,
then unlink() and close().

In normal case you get d_delete() in unlink(2) notice that dentry
is busy and unhash it; on the final dput() it will be forcibly evicted from
dcache, triggering iput() and inode removal.  In this case, though, we end
up with *two* dentries - disconnected (created by open-by-fhandle) and
regular one (used by unlink()).  The latter will have its reference to inode
dropped just fine, but the former will not - it's considered hashed (it
is on the ->s_anon list), so it will stay around until the memory pressure
will finally do it in.  As the result, we have the final iput() delayed
indefinitely.  It's trivial to reproduce -

void flush_dcache(void)
{
        system("mount -o remount,rw /");
}

static char buf[20 * 1024 * 1024];

main()
{
        int fd;
        union {
                struct file_handle f;
                char buf[MAX_HANDLE_SZ];
        } x;
        int m;

        x.f.handle_bytes = sizeof(x);
        chdir("/root");
        mkdir("foo", 0700);
        fd = open("foo/bar", O_CREAT | O_RDWR, 0600);
        close(fd);
        name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0);
        flush_dcache();
        fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR);
        unlink("foo/bar");
        write(fd, buf, sizeof(buf));
        system("df ."); /* 20Mb eaten */
        close(fd);
        system("df ."); /* should've freed those 20Mb */
        flush_dcache();
        system("df ."); /* should be the same as #2 */
}

will spit out something like
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/root         322023 303843      1131 100% /
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/root         322023 303843      1131 100% /
Filesystem     1K-blocks   Used Available Use% Mounted on
/dev/root         322023 283282     21692  93% /
- inode gets freed only when dentry is finally evicted (here we trigger
than by remount; normally it would've happened in response to memory
pressure hell knows when).

Cc: stable@vger.kernel.org # v2.6.38+; earlier ones need s/kill_it/unhash_it/
Acked-by: J. Bruce Fields <bfields@fieldses.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fix a braino in ovl_d_select_inode()

when opening a directory we want the overlayfs inode, not one from
the topmost layer.

Reported-By: Andrey Jr. Melnikov <temnota.am@gmail.com>
Tested-By: Andrey Jr. Melnikov <temnota.am@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9p: don't leave a half-initialized inode sitting around

Cc: stable@vger.kernel.org # all branches
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'dsa-of-parsing-fixes'

Florian Fainelli says:

====================
net: dsa: OF parsing fixes

This patch series fixes two small parsing issues, the first one was
reported by Dan, the second came after looking more closely at the
code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: Fix off-by-one in switch address parsing

cd->sw_addr is used as a MDIO bus address, which cannot exceed
PHY_MAX_ADDR (32), our check was off-by-one.

Fixes: 5e95329b701c ("dsa: add device tree bindings to register DSA switches")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: Test array index before use

port_index is used an index into an array, and this information comes
from Device Tree, make sure that port_index is not equal to the array
size before using it. Move the check against port_index earlier in the
loop.

Fixes: 5e95329b701c: ("dsa: add device tree bindings to register DSA switches")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: switchdev: don't abort unsupported operations

There is no need to abort attribute setting or object addition, if the
prepare phase returned operation not supported.

Thus, abort these two transactions only if the error is not -EOPNOTSUPP.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: fix accounting of packet drops vs errors

bcmgenet driver needs to separate packet drops from packet errors.

When the driver has to drop a *good* packet, due to lack of buffers or
replacement skbs, increment only dev->stats.[rx|tx]_dropped.

When the driver encounters a bad Rx packet or Tx error, increment only
dev->stats.[rx|tx]_errors + relevant detailed error counter.

Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdc_ncm: update specs URL

Update referenced specs link to reflect actual file version and location.

Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm

Pull libnvdimm fixes from Dan Williams:
"1) Fixes for a handful of smatch reports (Thanks Dan C.!) and minor
     bug fixes (patches 1-6)

  2) Correctness fixes to the BLK-mode nvdimm driver (patches 7-10).

     Granted these are slightly large for a -rc update.  They have been
     out for review in one form or another since the end of May and were
     deferred from the merge window while we settled on the "PMEM API"
     for the PMEM-mode nvdimm driver (ie memremap_pmem, memcpy_to_pmem,
     and wmb_pmem).

     Now that those apis are merged we implement them in the BLK driver
     to guarantee that mmio aperture moves stay ordered with respect to
     incoming read/write requests, and that writes are flushed through
     those mmio-windows and platform-buffers to be persistent on media.

  These pass the sub-system unit tests with the updates to
  tools/testing/nvdimm, and have received a successful build-report from
  the kbuild robot (468 configs).

  With acks from Rafael for the touches to drivers/acpi/"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/nvdimm:
  nfit: add support for NVDIMM "latch" flag
  nfit: update block I/O path to use PMEM API
  tools/testing/nvdimm: add mock acpi_nfit_flush_address entries to nfit_test
  tools/testing/nvdimm: fix return code for unimplemented commands
  tools/testing/nvdimm: mock ioremap_wt
  pmem: add maintainer for include/linux/pmem.h
  nfit: fix smatch "use after null check" report
  nvdimm: Fix return value of nvdimm_bus_init() if class_create() fails
  libnvdimm: smatch cleanups in __nd_ioctl
  sparse: fix misplaced __pmem definition

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Mostly slight adjusments for new drivers, but also one core fix for
  which finally the dependencies are now available as well"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: Mark instantiated device nodes with OF_POPULATE
  i2c: jz4780: Fix return value if probe fails
  i2c: xgene-slimpro: Fix missing mbox_free_channel call in probe error path
  i2c: I2C_MT65XX should depend on HAS_DMA

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
"A fix (revert) for a recent regression in Synaptics driver and a fix
  for Elan i2c touchpad driver"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Revert "Input: synaptics - allocate 3 slots to keep stability in image sensors"
  Input: elan_i2c - change the hover event from MT to ST

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"A small set of fixes for problems found by smatch in new drivers that
  we added this rc and a handful of driver fixes that came in during the
  merge window"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  drivers: clk: st: Incorrect register offset used for lock_status
  clk: mediatek: mt8173: Fix enabling of critical clocks
  drivers: clk: st: Fix mux bit-setting for Cortex A9 clocks
  drivers: clk: st: Add CLK_GET_RATE_NOCACHE flag to clocks
  drivers: clk: st: Fix flexgen lock init
  drivers: clk: st: Fix FSYN channel values
  drivers: clk: st: Remove unused code
  clk: qcom: Use parent rate when set rate to pixel RCG clock
  clk: at91: do not leak resources
  clk: stm32: Fix out-by-one error path in the index lookup
  clk: iproc: fix bit manipulation arithmetic
  clk: iproc: fix memory leak from clock name

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"A bunch of fixes for radeon, intel, omap and one amdkfd fix.

  Radeon fixes are all over, but it does fix some cursor corruption
  across suspend/resume.  i915 should fix the second warn you were
  seeing, so let us know if not.  omap is a bunch of small fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (28 commits)
  drm/radeon: disable vce init on cayman (v2)
  drm/amdgpu: fix timeout calculation
  drm/radeon: check if BO_VA is set before adding it to the invalidation list
  drm/radeon: allways add the VM clear duplicate
  Revert "Revert "drm/radeon: dont switch vt on suspend""
  drm/radeon: Fold radeon_set_cursor() into radeon_show_cursor()
  drm/radeon: unpin cursor BOs on suspend and pin them again on resume (v2)
  drm/radeon: Clean up reference counting and pinning of the cursor BOs
  drm/amdkfd: validate pdd where it acquired first
  Revert "drm/i915: Allocate context objects from stolen"
  drm/i915: Declare the swizzling unknown for L-shaped configurations
  drm/radeon: fix underflow in r600_cp_dispatch_texture()
  drm/radeon: default to 2048 MB GART size on SI+
  drm/radeon: fix HDP flushing
  drm/radeon: use RCU query for GEM_BUSY syscall
  drm/amdgpu: Handle irqs only based on irq ring, not irq status regs.
  drm/radeon: Handle irqs only based on irq ring, not irq status regs.
  drm/i915: Use crtc_state->active in primary check_plane func
  drm/i915: Check crtc->active in intel_crtc_disable_planes
  drm/i915: Restore all GGTT VMAs on resume
  ...

Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull selinux fixes from James Morris.

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix mprotect PROT_EXEC regression caused by mm change
selinux: don't waste ebitmap space when importing NetLabel categories

Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"This is an assortment of fixes.  Most of the commits are from Filipe
  (fsync, the inode allocation cache and a few others).  Mark kicked in
  a series fixing corners in the extent sharing ioctls, and everyone
  else fixed up on assorted other problems"

* 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix wrong check for btrfs_force_chunk_alloc()
  Btrfs: fix warning of bytes_may_use
  Btrfs: fix hang when failing to submit bio of directIO
  Btrfs: fix a comment in inode.c:evict_inode_truncate_pages()
  Btrfs: fix memory corruption on failure to submit bio for direct IO
  btrfs: don't update mtime/ctime on deduped inodes
  btrfs: allow dedupe of same inode
  btrfs: fix deadlock with extent-same and readpage
  btrfs: pass unaligned length to btrfs_cmp_data()
  Btrfs: fix fsync after truncate when no_holes feature is enabled
  Btrfs: fix fsync xattr loss in the fast fsync path
  Btrfs: fix fsync data loss after append write
  Btrfs: fix crash on close_ctree() if cleaner starts new transaction
  Btrfs: fix race between caching kthread and returning inode to inode cache
  Btrfs: use kmem_cache_free when freeing entry in inode cache
  Btrfs: fix race between balance and unused block group deletion
  btrfs: add error handling for scrub_workers_get()
  btrfs: cleanup noused initialization of dev in btrfs_end_bio()
  btrfs: qgroup: allow user to clear the limitation on qgroup

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Kevin Hilman:
"A fairly random colletion of fixes based on -rc1 for OMAP, sunxi and
  prima2 as well as a few arm64-specific DT fixes.

  This series also includes a late to support a new Allwinner (sunxi)
  SoC, but since it's rather simple and isolated to the
  platform-specific code, it's included it for this -rc"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: dts: add device tree for ARM SMM-A53x2 on LogicTile Express 20MG
  arm: dts: vexpress: add missing CCI PMU device node to TC2
  arm: dts: vexpress: describe all PMUs in TC2 dts
  GICv3: Add ITS entry to THUNDER dts
  arm64: dts: Add poweroff button device node for APM X-Gene platform
  ARM: dts: am4372.dtsi: disable rfbi
  ARM: dts: am57xx-beagle-x15: Provide supply for usb2_phy2
  ARM: dts: am4372: Add emif node
  Revert "ARM: dts: am335x-boneblack: disable RTC-only sleep"
  ARM: sunxi: Enable simplefb in the defconfig
  ARM: Remove deprecated symbol from defconfig files
  ARM: sunxi: Add Machine support for A33
  ARM: sunxi: Introduce Allwinner H3 support
  Documentation: sunxi: Update Allwinner SoC documentation
  ARM: prima2: move to use REGMAP APIs for rtciobrg
  ARM: dts: atlas7: add pinctrl and gpio descriptions
  ARM: OMAP2+: Remove unnessary return statement from the void function, omap2_show_dma_caps
  memory: omap-gpmc: Fix parsing of devices

tick/broadcast: Prevent NULL pointer dereference

Dan reported that the recent changes to the broadcast code introduced
a potential NULL dereference.

Add the proper check.

Fixes: e0454311903d "tick/broadcast: Sanity check the shutdown of the local clock_event"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Doc: z8530book: Fix typo in API-z8530-sync-txdma-open.html

This patch fix a spelling typo found in API-z8530-sync-txdma-open.html.
It is because this file was generated from comment in source,
I have to fix comment in source.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: inet_diag: always export IPV6_V6ONLY sockopt for listening sockets

Reconsidering my commit 20462155 "net: inet_diag: export IPV6_V6ONLY
sockopt", I am not happy with the limitations it causes for socket
analysing code in userspace. Exporting the value only if it is set makes
it hard for userspace to decide whether the option is not set or the
kernel does not support exporting the option at all.

>From an auditor's perspective, the interesting question for listening
AF_INET6 sockets is: "Does it NOT have IPV6_V6ONLY set?" Because it is
the unexpected case. This patch allows to answer this question reliably.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: mdb: allow the user to delete mdb entry if there's a querier

Until now when a querier was present static entries couldn't be deleted.
Fix this and allow the user to manipulate the mdb with or without a
querier.

Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'netdev_unregister_races'

Julian Anastasov says:

====================
net: fixes for device unregistration

Test script from Eric W. Biederman can catch a problem
where packets from backlog are processed long after the last
synchronize_net call. This can be reproduced after few tests
if commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
is reverted for the test. Incoming packets do not hold
reference to device but even if they do, subsystems do not
expect packets to fly during and after the NETDEV_UNREGISTER
event.

The first fix has the cost of netif_running check in fast path.
The second fix calls rcu_read_lock while local IRQ is disabled,
I hope this is not against the rules.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: call rcu_read_lock early in process_backlog

Incoming packet should be either in backlog queue or
in RCU read-side section. Otherwise, the final sequence of
flush_backlog() and synchronize_net() may miss packets
that can run without device reference:

CPU 1                  CPU 2
                       skb->dev: no reference
                       process_backlog:__skb_dequeue
                       process_backlog:local_irq_enable

on_each_cpu for
flush_backlog =>       IPI(hardirq): flush_backlog
                       - packet not found in backlog

                       CPU delayed ...
synchronize_net
- no ongoing RCU
read-side sections

netdev_run_todo,
rcu_barrier: no
ongoing callbacks
                       __netif_receive_skb_core:rcu_read_lock
                       - too late
free dev
                       process packet for freed dev

Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: do not process device backlog during unregistration

commit 381c759d9916 ("ipv4: Avoid crashing in ip_error")
fixes a problem where processed packet comes from device
with destroyed inetdev (dev->ip_ptr). This is not expected
because inetdev_destroy is called in NETDEV_UNREGISTER
phase and packets should not be processed after
dev_close_many() and synchronize_net(). Above fix is still
required because inetdev_destroy can be called for other
reasons. But it shows the real problem: backlog can keep
packets for long time and they do not hold reference to
device. Such packets are then delivered to upper levels
at the same time when device is unregistered.
Calling flush_backlog after NETDEV_UNREGISTER_FINAL still
accounts all packets from backlog but before that some packets
continue to be delivered to upper levels long after the
synchronize_net call which is supposed to wait the last
ones. Also, as Eric pointed out, processed packets, mostly
from other devices, can continue to add new packets to backlog.

Fix the problem by moving flush_backlog early, after the
device driver is stopped and before the synchronize_net() call.
Then use netif_running check to make sure we do not add more
packets to backlog. We have to do it in enqueue_to_backlog
context when the local IRQ is disabled. As result, after the
flush_backlog and synchronize_net sequence all packets
should be accounted.

Thanks to Eric W. Biederman for the test script and his
valuable feedback!

Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
Fixes: 6e583ce5242f ("net: eliminate refcounting in backlog queue")
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'parisc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
"We have one important patch from Dave Anglin and myself which fixes
  PTE/TLB race conditions which caused random segmentation faults on our
  debian buildd servers, and one patch from Alex Ivanov which speeds up
  the graphical text console on the STI framebuffer driver"

* 'parisc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results
  stifb: Implement hardware accelerated copyarea

Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/selinux into for-linus2

selinux: fix mprotect PROT_EXEC regression caused by mm change

commit 66fc13039422ba7df2d01a8ee0873e4ef965b50b ("mm: shmem_zero_setup
skip security check and lockdep conflict with XFS") caused a regression
for SELinux by disabling any SELinux checking of mprotect PROT_EXEC on
shared anonymous mappings.  However, even before that regression, the
checking on such mprotect PROT_EXEC calls was inconsistent with the
checking on a mmap PROT_EXEC call for a shared anonymous mapping.  On a
mmap, the security hook is passed a NULL file and knows it is dealing
with an anonymous mapping and therefore applies an execmem check and no
file checks.  On a mprotect, the security hook is passed a vma with a
non-NULL vm_file (as this was set from the internally-created shmem
file during mmap) and therefore applies the file-based execute check
and no execmem check.  Since the aforementioned commit now marks the
shmem zero inode with the S_PRIVATE flag, the file checks are disabled
and we have no checking at all on mprotect PROT_EXEC.  Add a test to
the mprotect hook logic for such private inodes, and apply an execmem
check in that case.  This makes the mmap and mprotect checking
consistent for shared anonymous mappings, as well as for /dev/zero and
ashmem.

Cc: <stable@vger.kernel.org> # 4.1.x
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes and clean-up from Catalin Marinas:
- ACPI fix when checking the validity of the GICC MADT subtable
- handle debug exceptions in the el*_inv exception entries
- remove pointless register assignment in two compat syscall wrappers
- unnecessary include path
- defconfig update

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: entry32: remove pointless register assignment
  arm64: entry: handle debug exceptions in el*_inv
  arm64: Keep the ARM64 Kconfig selects sorted
  ACPI / ARM64 : use the new BAD_MADT_GICC_ENTRY macro
  ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro
  arm64: defconfig: Add Ceva ahci to the defconfig
  arm64: remove another unnecessary libfdt include path

parisc: Fix some PTE/TLB race conditions and optimize __flush_tlb_range based on timing results

The increased use of pdtlb/pitlb instructions seemed to increase the
frequency of random segmentation faults building packages. Further, we
had a number of cases where TLB inserts would repeatedly fail and all
forward progress would stop. The Haskell ghc package caused a lot of
trouble in this area. The final indication of a race in pte handling was
this syslog entry on sibaris (C8000):

swap_free: Unused swap offset entry 00000004
BUG: Bad page map in process mysqld  pte:00000100 pmd:019bbec5
addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464
CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1
Backtrace:
  [<0000000040173eb0>] show_stack+0x20/0x38
  [<0000000040444424>] dump_stack+0x9c/0x110
  [<00000000402a0d38>] print_bad_pte+0x1a8/0x278
  [<00000000402a28b8>] unmap_single_vma+0x3d8/0x770
  [<00000000402a4090>] zap_page_range+0xf0/0x198
  [<00000000402ba2a4>] SyS_madvise+0x404/0x8c0

Note that the pte value is 0 except for the accessed bit 0x100. This bit
shouldn't be set without the present bit.

It should be noted that the madvise system call is probably a trigger for many
of the random segmentation faults.

In looking at the kernel code, I found the following problems:

1) The pte_clear define didn't take TLB lock when clearing a pte.
2) We didn't test pte present bit inside lock in exception support.
3) The pte and tlb locks needed to merged in order to ensure consistency
between page table and TLB. This also has the effect of serializing TLB
broadcasts on SMP systems.

The attached change implements the above and a few other tweaks to try
to improve performance. Based on the timing code, TLB purges are very
slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it
beneficial to test the split_tlb variable to avoid duplicate purges.
Probably, all PA 2.0 machines have combined TLBs.

I dropped using __flush_tlb_range in flush_tlb_mm as I realized all
applications and most threads have a stack size that is too large to
make this useful. I added some comments to this effect.

Since implementing 1 through 3, I haven't had any random segmentation
faults on mx3210 (rp3440) in about one week of building code and running
as a Debian buildd.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Helge Deller <deller@gmx.de>

stifb: Implement hardware accelerated copyarea

This patch adds hardware assisted scrolling. The code is based upon the
following investigation: https://parisc.wiki.kernel.org/index.php/NGLE#Blitter

A simple 'time ls -la /usr/bin' test shows 1.6x speed increase over soft
copy and 2.3x increase over FBINFO_READS_FAST (prefer soft copy over
screen redraw) on Artist framebuffer.

Signed-off-by: Alex Ivanov <lausgans@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>

Merge tag 'powerpc-4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
- opal-prd mmap fix from Vaidy
- set kernel taint for MCEs from Daniel
- alignment exception description from Anton
- ppc4xx_hsta_msi build fix from Daniel
- opal-elog interrupt fix from Alistair
- core_idle_state race fix from Shreyas
- hv-24x7 lockdep fix from Sukadev
- multiple cxl fixes from Daniel, Ian, Mikey & Maninder
- update MAINTAINERS to point at shared tree

* tag 'powerpc-4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  cxl: Check if afu is not null in cxl_slbia
  powerpc: Update MAINTAINERS to point at shared tree
  powerpc/perf/24x7: Fix lockdep warning
  cxl: Fix off by one error allowing subsequent mmap page to be accessed
  cxl: Fail mmap if requested mapping is larger than assigned problem state area
  cxl: Fix refcounting in kernel API
  powerpc/powernv: Fix race in updating core_idle_state
  powerpc/powernv: Fix opal-elog interrupt handler
  powerpc/ppc4xx_hsta_msi: Include ppc-pci.h to fix reference to hose_list
  powerpc: Add plain English description for alignment exception oopses
  cxl: Test the correct mmio space before unmapping
  powerpc: Set the correct kernel taint on machine check errors
  cxl/vphb.c: Use phb pointer after NULL check
  powerpc/powernv: Fix vma page prot flags in opal-prd driver

nfit: add support for NVDIMM "latch" flag

Add support in the NFIT BLK I/O path for the "latch" flag
defined in the "Get Block NVDIMM Flags" _DSM function:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag requires the driver to read back the command register after it
is written in the block I/O path. This ensures that the hardware has
fully processed the new command and moved the aperture appropriately.

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>

nfit: update block I/O path to use PMEM API

Update the nfit block I/O path to use the new PMEM API and to adhere to
the read/write flows outlined in the "NVDIMM Block Window Driver
Writer's Guide":

http://pmem.io/documents/NVDIMM_Driver_Writers_Guide.pdf

This includes adding support for targeted NVDIMM flushes called "flush
hints" in the ACPI 6.0 specification:

http://www.uefi.org/sites/default/files/resources/ACPI_6.0.pdf

For performance and media durability the mapping for a BLK aperture is
moved to a write-combining mapping which is consistent with
memcpy_to_pmem() and wmb_blk().

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>