]> git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/log
mirror_ubuntu-zesty-kernel.git
7 years agonet: ena: add support for out of order rx buffers refill
Netanel Belgazal [Fri, 23 Jun 2017 08:21:54 +0000 (11:21 +0300)]
net: ena: add support for out of order rx buffers refill

BugLink: http://bugs.launchpad.net/bugs/1701575
ENA driver post Rx buffers through the Rx submission queue
for the ENA device to fill them with receive packets.
Each Rx buffer is marked with req_id in the Rx descriptor.

Newer ENA devices could consume the posted Rx buffer in out of order,
and as result the corresponding Rx completion queue will have Rx
completion descriptors with non contiguous req_id(s)

In this change the driver holds two rings.
The first ring (called free_rx_ids) is a mapping ring.
It holds all the unused request ids.
The values in this ring are from 0 to ring_size -1.

When the driver wants to allocate a new Rx buffer it uses the head of
free_rx_ids and uses it's value as the index for rx_buffer_info ring.
The req_id is also written to the Rx descriptor

Upon Rx completion,
The driver took the req_id from the completion descriptor and uses it
as index in rx_buffer_info.
The req_id is then return to the free_rx_ids ring.

This patch also adds statistics to inform when the driver receive out
of range or unused req_id.

Note:
free_rx_ids is only accessible from the napi handler, so no locking is
required

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ad974baef2a17a170fe837ad19f10dcab63e9470 net-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: add reset reason for each device FLR
Netanel Belgazal [Fri, 23 Jun 2017 08:21:53 +0000 (11:21 +0300)]
net: ena: add reset reason for each device FLR

BugLink: http://bugs.launchpad.net/bugs/1701575
For each device reset, log to the device what is the cause
the reset occur.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e2eed0e307f671e37f3829dfec5dbb1fba826a15 net-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: change sizeof() argument to be the type pointer
Netanel Belgazal [Fri, 23 Jun 2017 08:21:52 +0000 (11:21 +0300)]
net: ena: change sizeof() argument to be the type pointer

BugLink: http://bugs.launchpad.net/bugs/1701575
Instead of using:
memset(ptr, 0x0, sizeof(struct ...))
use:
memset(ptr, 0x0, sizeor(*ptr))

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 917501109cf4ea4c47991f593111338811369bf5 net-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: add hardware hints capability to the driver
Netanel Belgazal [Fri, 23 Jun 2017 08:21:51 +0000 (11:21 +0300)]
net: ena: add hardware hints capability to the driver

BugLink: http://bugs.launchpad.net/bugs/1701575
With this patch, ENA device can update the ena driver about
the desired timeout values:
These values are part of the "hardware hints" which are transmitted
to the driver as Asynchronous event through ENA async
event notification queue.

In case the ENA device does not support this capability,
the driver will use its own default values.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 82ef30f13be0b4fea70a9c6215f7cff40bb3be63 net-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: change return value for unsupported features unsupported return value
Netanel Belgazal [Fri, 23 Jun 2017 08:21:50 +0000 (11:21 +0300)]
net: ena: change return value for unsupported features unsupported return value

BugLink: http://bugs.launchpad.net/bugs/1701575
return -EOPNOTSUPP instead of -EPERM.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d1497638b6fedccaef875a5d9f09b3fe4ffbd22d net-next)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: update ena driver to version 1.1.7
Netanel Belgazal [Sun, 11 Jun 2017 12:42:51 +0000 (15:42 +0300)]
net: ena: update ena driver to version 1.1.7

BugLink: http://bugs.launchpad.net/bugs/1701575
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e7ff7efae5708513a795e329909ccbe2ac367b1a)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: bug fix in lost tx packets detection mechanism
Netanel Belgazal [Sun, 11 Jun 2017 12:42:50 +0000 (15:42 +0300)]
net: ena: bug fix in lost tx packets detection mechanism

BugLink: http://bugs.launchpad.net/bugs/1701575
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 800c55cb76be6617232ef50a2be29830f3aa8e5c)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: disable admin msix while working in polling mode
Netanel Belgazal [Sun, 11 Jun 2017 12:42:49 +0000 (15:42 +0300)]
net: ena: disable admin msix while working in polling mode

BugLink: http://bugs.launchpad.net/bugs/1701575
Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a2cc5198dac102775b21787752a2e0afe44ad311)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: fix theoretical Rx hang on low memory systems
Netanel Belgazal [Sun, 11 Jun 2017 12:42:48 +0000 (15:42 +0300)]
net: ena: fix theoretical Rx hang on low memory systems

BugLink: http://bugs.launchpad.net/bugs/1701575
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel  memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a3af7c18cfe545a711e5df7491b7d6df71eba2ff)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: add missing unmap bars on device removal
Netanel Belgazal [Sun, 11 Jun 2017 12:42:47 +0000 (15:42 +0300)]
net: ena: add missing unmap bars on device removal

BugLink: http://bugs.launchpad.net/bugs/1701575
This patch also change the mapping functions to devm_ functions

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0857d92f71b6cb75281fde913554b2d5436c394b)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: fix race condition between submit and completion admin command
Netanel Belgazal [Sun, 11 Jun 2017 12:42:46 +0000 (15:42 +0300)]
net: ena: fix race condition between submit and completion admin command

BugLink: http://bugs.launchpad.net/bugs/1701575
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).

Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.

This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.

Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 661d2b0ccef6a63f48b61105cf7be17403d1db01)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: add missing return when ena_com_get_io_handlers() fails
Netanel Belgazal [Sun, 11 Jun 2017 12:42:45 +0000 (15:42 +0300)]
net: ena: add missing return when ena_com_get_io_handlers() fails

BugLink: http://bugs.launchpad.net/bugs/1701575
Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 2d2c600a917127f16f179d5a88fc44ba3ed263ed)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: fix rare uncompleted admin command false alarm
Netanel Belgazal [Sun, 11 Jun 2017 12:42:43 +0000 (15:42 +0300)]
net: ena: fix rare uncompleted admin command false alarm

BugLink: http://bugs.launchpad.net/bugs/1701575
The current flow to detect admin completion is:
while (command_not_completed) {
if (timeout)
error

check_for_completion()
sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a77c1aafcc906f657d1a0890c1d898be9ee1d5c9)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agonet: ena: remove superfluous check in ena_remove()
Lino Sanfilippo [Sat, 18 Feb 2017 11:19:41 +0000 (12:19 +0100)]
net: ena: remove superfluous check in ena_remove()

BugLink: http://bugs.launchpad.net/bugs/1701575
The check in ena_remove() for the pci driver data not being NULL is not
needed, since it is always set in the probe() function. Remove the
superfluous check.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 90a6c997bd8bb836809eda5efb6406d8c58c0924)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
7 years agoUBUNTU: Start new release
Kleber Sacilotto de Souza [Wed, 19 Jul 2017 07:56:00 +0000 (09:56 +0200)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
7 years agoUBUNTU: Ubuntu-4.4.0-87.110
Kleber Sacilotto de Souza [Tue, 18 Jul 2017 12:07:48 +0000 (14:07 +0200)]
UBUNTU: Ubuntu-4.4.0-87.110

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
7 years agomm/mmap.c: expand_downwards: don't require the gap if !vm_prev
Oleg Nesterov [Mon, 17 Jul 2017 12:53:29 +0000 (14:53 +0200)]
mm/mmap.c: expand_downwards: don't require the gap if !vm_prev

expand_stack(vma) fails if address < stack_guard_gap even if there is no
vma->vm_prev.  I don't think this makes sense, and we didn't do this
before the recent commit 1be7107fbe18 ("mm: larger stack guard gap,
between vmas").

We do not need a gap in this case, any address is fine as long as
security_mmap_addr() doesn't object.

This also simplifies the code, we know that address >= prev->vm_end and
thus underflow is not possible.

Link: http://lkml.kernel.org/r/20170628175258.GA24881@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2017-1000364

(cherry picked from commit 32e4e6d5cbb0c0e427391635991fe65e17797af8)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
7 years agomm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
Michal Hocko [Mon, 17 Jul 2017 12:53:28 +0000 (14:53 +0200)]
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack

Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page.  They are punching a new
MAP_FIXED mapping inside the existing stack Vma.

This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety.  It is a real regression on
the other hand.

Let's work around the problem by considering PROT_NONE mapping as a part
of the stack.  This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.

Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2017-1000364

(cherry picked from commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
7 years agoCIFS: Fix some return values in case of error in 'crypt_message'
Christophe Jaillet [Mon, 17 Jul 2017 19:29:13 +0000 (16:29 -0300)]
CIFS: Fix some return values in case of error in 'crypt_message'

BugLink: https://bugs.launchpad.net/bugs/1704857
'rc' is known to be 0 at this point. So if 'init_sg' or 'kzalloc' fails, we
should return -ENOMEM instead.

Also remove a useless 'rc' in a debug message as it is meaningless here.

Fixes: 026e93dc0a3ee ("CIFS: Encrypt SMB3 requests before sending")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
(cherry picked from commit 517a6e43c4872c89794af5b377fa085e47345952)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
7 years agoCIFS: Fix null pointer deref during read resp processing
Pavel Shilovsky [Mon, 17 Jul 2017 19:29:12 +0000 (16:29 -0300)]
CIFS: Fix null pointer deref during read resp processing

BugLink: https://bugs.launchpad.net/bugs/1704857
Currently during receiving a read response mid->resp_buf can be
NULL when it is being passed to cifs_discard_remaining_data() from
cifs_readv_discard(). Fix it by always passing server->smallbuf
instead and initializing mid->resp_buf at the end of read response
processing.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
(cherry picked from commit 350be257ea83029daee974c72b1fe2e6f1f8e615)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
7 years agoUBUNTU: Start new release
Thadeu Lima de Souza Cascardo [Fri, 14 Jul 2017 12:50:05 +0000 (09:50 -0300)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: Ubuntu-4.4.0-86.109
Thadeu Lima de Souza Cascardo [Wed, 12 Jul 2017 21:25:50 +0000 (18:25 -0300)]
UBUNTU: Ubuntu-4.4.0-86.109

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: Packaging: Breaks unfixed iscsitarget versions
Thadeu Lima de Souza Cascardo [Wed, 12 Jul 2017 21:15:35 +0000 (18:15 -0300)]
UBUNTU: Packaging: Breaks unfixed iscsitarget versions

BugLink: https://bugs.launchpad.net/bugs/1701697
In linux-4.4.0-84, we backported commit
2da62906b1e298695e1bb725927041cd59942c98 ("[net] drop 'size' argument of
sock_recvmsg()"), which breaks building iscsitarget-dkms.

iscsitarget was fixed to build with this new kernel, but we need to add
a Breaks with older versions.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: Start new release
Kamal Mostafa [Mon, 10 Jul 2017 18:24:28 +0000 (11:24 -0700)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
7 years agoUBUNTU: Ubuntu-4.4.0-85.108
Thadeu Lima de Souza Cascardo [Mon, 3 Jul 2017 14:20:59 +0000 (11:20 -0300)]
UBUNTU: Ubuntu-4.4.0-85.108

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agohv_utils: fix TimeSync work on pre-TimeSync-v4 hosts
Vitaly Kuznetsov [Mon, 3 Jul 2017 12:57:05 +0000 (09:57 -0300)]
hv_utils: fix TimeSync work on pre-TimeSync-v4 hosts

BugLink: http://bugs.launchpad.net/bugs/1676635
It was found that ICTIMESYNCFLAG_SYNC packets are handled incorrectly
on WS2012R2, e.g. after the guest is paused and resumed its time is set
to something different from host's time. The problem is that we call
adj_guesttime() with reftime=0 for these old hosts and we don't account
for that in 'if (adj_flags & ICTIMESYNCFLAG_SYNC)' branch and
hv_set_host_time().

While we could've solved this by adding a check like
'if (ts_srv_version > TS_VERSION_3)' to hv_set_host_time() I prefer
to do some refactoring. We don't need to have two separate containers
for host samples, struct host_ts which we use for PTP is enough.

Throw away 'struct adj_time_work' and create hv_get_adj_host_time()
accessor to host_ts to avoid code duplication.

Fixes: 3716a49a81ba ("hv_utils: implement Hyper-V PTP source")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from linux-next commit 1d10602d306cb7f70545b5e1166efc9409e7d384)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agohv_utils: drop .getcrosststamp() support from PTP driver
Vitaly Kuznetsov [Mon, 3 Jul 2017 12:57:04 +0000 (09:57 -0300)]
hv_utils: drop .getcrosststamp() support from PTP driver

BugLink: http://bugs.launchpad.net/bugs/1676635
Turns out that our implementation of .getcrosststamp() never actually
worked. Hyper-V is sending time samples every 5 seconds and this is
too much for get_device_system_crosststamp() as it's interpolation
algorithm (which nobody is currently using in kernel, btw) accounts
for a 'slow' device but we're not slow in Hyper-V, our time reference
is too far away.

.getcrosststamp() is not currently used, get_device_system_crosststamp()
almost always returns -EINVAL and client falls back to using PTP_SYS_OFFSET
so this patch doesn't change much. I also tried doing interpolation
manually (e.g. the same way hv_ptp_gettime() works and it turns out that
we're getting even lower quality:

PTP_SYS_OFFSET_PRECISE with manual interpolation:
* PHC0                     0   3    37     4  -3974ns[-5338ns] +/-  977ns
* PHC0                     0   3    77     7  +2227ns[+3184ns] +/-  576ns
* PHC0                     0   3   177    10  +3060ns[+5220ns] +/-  548ns
* PHC0                     0   3   377    12  +3937ns[+4371ns] +/- 1414ns
* PHC0                     0   3   377     6   +764ns[+1240ns] +/- 1047ns
* PHC0                     0   3   377     7  -1210ns[-3731ns] +/-  479ns
* PHC0                     0   3   377     9   +153ns[-1019ns] +/-  406ns
* PHC0                     0   3   377    12   -872ns[-1793ns] +/-  443ns
* PHC0                     0   3   377     5   +701ns[+3599ns] +/-  426ns
* PHC0                     0   3   377     5   -923ns[ -375ns] +/- 1062ns

PTP_SYS_OFFSET:
* PHC0                     0   3     7     5    +72ns[+8020ns] +/-  251ns
* PHC0                     0   3    17     5   -885ns[-3661ns] +/-  254ns
* PHC0                     0   3    37     6   -454ns[-5732ns] +/-  258ns
* PHC0                     0   3    77    10  +1183ns[+3754ns] +/-  164ns
* PHC0                     0   3   377     5   +579ns[+1137ns] +/-  110ns
* PHC0                     0   3   377     7   +501ns[+1064ns] +/-   96ns
* PHC0                     0   3   377     9  +1641ns[+3342ns] +/-  106ns
* PHC0                     0   3   377     8    -47ns[  +77ns] +/-  160ns
* PHC0                     0   3   377     5    +54ns[ +107ns] +/-  102ns
* PHC0                     0   3   377     8   -354ns[ -617ns] +/-   89ns

This fact wasn't noticed during the initial testing of the PTP device
somehow but got revealed now. Let's just drop .getcrosststamp()
implementation for now as it doesn't seem to be suitable for us.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from linux-next commit 4f9bac039a64f6306b613a0d90e6b7e75d7ab0c4)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoDrivers: hv: util: don't forget to init host_ts.lock
Dexuan Cui [Mon, 3 Jul 2017 12:57:03 +0000 (09:57 -0300)]
Drivers: hv: util: don't forget to init host_ts.lock

BugLink: http://bugs.launchpad.net/bugs/1676635
Without the patch, I always get a "BUG: spinlock bad magic" warning.

Fixes: 3716a49a81ba ("hv_utils: implement Hyper-V PTP source")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5a16dfc855127906fcd2935fb039bc8989313915)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoDrivers: hv: util: Fix a typo
K. Y. Srinivasan [Mon, 3 Jul 2017 12:57:02 +0000 (09:57 -0300)]
Drivers: hv: util: Fix a typo

BugLink: http://bugs.launchpad.net/bugs/1676635
Fix a typo.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit bb6a4db92f8345a210b369b791e6920253b10437)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agohv_utils: implement Hyper-V PTP source
Vitaly Kuznetsov [Mon, 3 Jul 2017 12:57:01 +0000 (09:57 -0300)]
hv_utils: implement Hyper-V PTP source

BugLink: http://bugs.launchpad.net/bugs/1676635
With TimeSync version 4 protocol support we started updating system time
continuously through the whole lifetime of Hyper-V guests. Every 5 seconds
there is a time sample from the host which triggers do_settimeofday[64]().
While the time from the host is very accurate such adjustments may cause
issues:
- Time is jumping forward and backward, some applications may misbehave.
- In case an NTP server runs in parallel and uses something else for time
  sync (network, PTP,...) system time will never converge.
- Systemd starts annoying you by printing "Time has been changed" every 5
  seconds to the system log.

Instead of doing in-kernel time adjustments offload the work to an
NTP client by exposing TimeSync messages as a PTP device. Users may now
decide what they want to use as a source.

I tested the solution with chrony, the config was:

 refclock PHC /dev/ptp0 poll 3 dpoll -2 offset 0

The result I'm seeing is accurate enough, the time delta between the guest
and the host is almost always within [-10us, +10us], the in-kernel solution
was giving us comparable results.

I also tried implementing PPS device instead of PTP by using not currently
used Hyper-V synthetic timers (we use only one of four for clockevent) but
with PPS source only chrony wasn't able to give me the required accuracy,
the delta often more that 100us.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3716a49a81ba19dda7202633a68b28564ba95eb5)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agohv_util: switch to using timespec64
Vitaly Kuznetsov [Mon, 3 Jul 2017 12:57:00 +0000 (09:57 -0300)]
hv_util: switch to using timespec64

BugLink: http://bugs.launchpad.net/bugs/1676635
do_settimeofday() is deprecated, use do_settimeofday64() instead.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 17244623a4c0f68d3f02c9c74d9b6ae259425826)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoDrivers: hv: util: Use hv_get_current_tick() to get current tick
K. Y. Srinivasan [Mon, 3 Jul 2017 12:56:59 +0000 (09:56 -0300)]
Drivers: hv: util: Use hv_get_current_tick() to get current tick

BugLink: http://bugs.launchpad.net/bugs/1676635
As part of the effort to interact with Hyper-V in an instruction set
architecture independent way, use the new API to get the current
tick.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 305f7549c9298247723c255baddb7a54b4e63050)
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: SAUCE: hv: make clocksource available for PTP device supporting
Marcelo Henrique Cerri [Mon, 3 Jul 2017 12:56:58 +0000 (09:56 -0300)]
UBUNTU: SAUCE: hv: make clocksource available for PTP device supporting

BugLink: http://bugs.launchpad.net/bugs/1676635
Make the MSR and TSC clock sources available during the hv_init() and
export them via hyperv_cs in a similar way as the following upstream
commit does:

commit dee863b571b0 ("hv: export current Hyper-V clocksource")

Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: Start new release
Thadeu Lima de Souza Cascardo [Mon, 3 Jul 2017 10:44:23 +0000 (07:44 -0300)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: Ubuntu-4.4.0-84.107
Thadeu Lima de Souza Cascardo [Wed, 28 Jun 2017 17:52:26 +0000 (14:52 -0300)]
UBUNTU: Ubuntu-4.4.0-84.107

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoUBUNTU: SAUCE: xhci: AMD Promontory USB disable port support
Jiahau Chang [Mon, 26 Jun 2017 03:02:32 +0000 (11:02 +0800)]
UBUNTU: SAUCE: xhci: AMD Promontory USB disable port support

BugLink: http://bugs.launchpad.net/bugs/1695216
v4: Remove the patch code in case USB_PORT_FEAT_REMOTE_WAKE_MASK

For AMD Promontory xHCI host, although you can disable USB 2.0 ports in BIOS
settings, those ports will be enabled anyway after you remove a device on
that port and re-plug it in again. It's a known limitation of the chip.
As a workaround we can clear the PORT_WAKE_BITS.

Signed-off-by: Jiahau Chang <Lars_Chang@asmedia.com.tw>
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoBluetooth: btusb: Add support for 0489:e0a2 QCA_ROME device
Shih-Yuan Lee (FourDollars) [Thu, 22 Jun 2017 07:24:44 +0000 (15:24 +0800)]
Bluetooth: btusb: Add support for 0489:e0a2 QCA_ROME device

BugLink: http://bugs.launchpad.net/bugs/1699651
T:  Bus=01 Lev=01 Prnt=01 Port=06 Cnt=03 Dev#=  3 Spd=12  MxCh= 0
D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=0489 ProdID=e0a2 Rev=00.01
C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb

Signed-off-by: Shih-Yuan Lee (FourDollars) <sylee@canonical.com>
Suggested-by: Owen Lin <olin@rivetnetworks.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
(backported from commit 06e41d8a36689f465006f017bbcd8a73edb98109 linux-next)
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoHandle mismatched open calls
Sachin Prabhu [Fri, 3 Mar 2017 23:41:38 +0000 (15:41 -0800)]
Handle mismatched open calls

BugLink: http://bugs.launchpad.net/bugs/1670508
A signal can interrupt a SendReceive call which result in incoming
responses to the call being ignored. This is a problem for calls such as
open which results in the successful response being ignored. This
results in an open file resource on the server.

The patch looks into responses which were cancelled after being sent and
in case of successful open closes the open fids.

For this patch, the check is only done in SendReceive2()

RH-bz: 1403319

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Cc: Stable <stable@vger.kernel.org>
(cherry-picked from commit 38bd49064a1ecb67baad33598e3d824448ab11ec)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoCall echo service immediately after socket reconnect
Sachin Prabhu [Thu, 20 Oct 2016 23:52:24 +0000 (19:52 -0400)]
Call echo service immediately after socket reconnect

BugLink: http://bugs.launchpad.net/bugs/1670508
Commit 4fcd1813e640 ("Fix reconnect to not defer smb3 session reconnect
long after socket reconnect") changes the behaviour of the SMB2 echo
service and causes it to renegotiate after a socket reconnect. However
under default settings, the echo service could take up to 120 seconds to
be scheduled.

The patch forces the echo service to be called immediately resulting a
negotiate call being made immediately on reconnect.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
(cherry picked from commit b8c600120fc87d53642476f48c8055b38d6e14c7)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoCIFS: Fix possible use after free in demultiplex thread
Pavel Shilovsky [Wed, 1 Mar 2017 00:05:19 +0000 (16:05 -0800)]
CIFS: Fix possible use after free in demultiplex thread

BugLink: http://bugs.launchpad.net/bugs/1670508
The recent changes that added SMB3 encryption support introduced
a possible use after free in the demultiplex thread. When we
process an encrypted packed we obtain a pointer to SMB session
but do not obtain a reference. This can possibly lead to a situation
when this session was freed before we copy a decryption key from
there. Fix this by obtaining a copy of the key rather than a pointer
to the session under a spinlock.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
(cherry picked from commit 61cfac6f267dabcf2740a7ec8a0295833b28b5f5)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Allow to switch on encryption with seal mount option
Pavel Shilovsky [Thu, 17 Nov 2016 21:59:23 +0000 (13:59 -0800)]
CIFS: Allow to switch on encryption with seal mount option

BugLink: http://bugs.launchpad.net/bugs/1670508
This allows users to inforce encryption for SMB3 shares if a server
supports it.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(backport from commit ae6f8dd4d0c87bfb72da9d9b56342adf53e69c31)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
[cascardo: fixup of conflict with 06e7bc1427cfb2fe4d2ac84305afb79dbb0b42f3]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoCIFS: Add capability to decrypt big read responses
Pavel Shilovsky [Fri, 18 Nov 2016 00:20:23 +0000 (16:20 -0800)]
CIFS: Add capability to decrypt big read responses

BugLink: http://bugs.launchpad.net/bugs/1670508
Allow to decrypt transformed packets that are bigger than the big
buffer size. In particular it is used for read responses that can
only exceed the big buffer size.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit c42a6abe3012832a68a371dabe17c2ced97e62ad)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Decrypt and process small encrypted packets
Pavel Shilovsky [Thu, 17 Nov 2016 23:24:46 +0000 (15:24 -0800)]
CIFS: Decrypt and process small encrypted packets

BugLink: http://bugs.launchpad.net/bugs/1670508
Allow to decrypt transformed packets, find a corresponding mid
and process as usual further.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 4326ed2f6a16ae9d33e4209b540dc9a371aba840)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Add copy into pages callback for a read operation
Pavel Shilovsky [Fri, 18 Nov 2016 00:20:18 +0000 (16:20 -0800)]
CIFS: Add copy into pages callback for a read operation

BugLink: http://bugs.launchpad.net/bugs/1670508
Since we have two different types of reads (pagecache and direct)
we need to process such responses differently after decryption of
a packet. The change allows to specify a callback that copies a read
payload data into preallocated pages.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit d70b9104b1ca586f73aaf59426756cec3325a40e)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Add mid handle callback
Pavel Shilovsky [Wed, 16 Nov 2016 22:06:17 +0000 (14:06 -0800)]
CIFS: Add mid handle callback

BugLink: http://bugs.launchpad.net/bugs/1670508
We need to process read responses differently because the data
should go directly into preallocated pages. This can be done
by specifying a mid handle callback.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 9b7c18a2d4b798963ea80f6769701dcc4c24b55e)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Add transform header handling callbacks
Pavel Shilovsky [Thu, 17 Nov 2016 23:24:34 +0000 (15:24 -0800)]
CIFS: Add transform header handling callbacks

BugLink: http://bugs.launchpad.net/bugs/1670508
We need to recognize and parse transformed packets in demultiplex
thread to find a corresponsing mid and process it further.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 9bb17e0916a03ab901fb684e874d77a1e96b3d1e)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Encrypt SMB3 requests before sending
Pavel Shilovsky [Thu, 3 Nov 2016 23:47:37 +0000 (16:47 -0700)]
CIFS: Encrypt SMB3 requests before sending

BugLink: http://bugs.launchpad.net/bugs/1670508
This change allows to encrypt packets if it is required by a server
for SMB sessions or tree connections.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(backported from commit 026e93dc0a3eefb0be060bcb9ecd8d7a7fd5c398)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Enable encryption during session setup phase
Pavel Shilovsky [Tue, 8 Nov 2016 02:20:50 +0000 (18:20 -0800)]
CIFS: Enable encryption during session setup phase

BugLink: http://bugs.launchpad.net/bugs/1670508
In order to allow encryption on SMB connection we need to exchange
a session key and generate encryption and decryption keys.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit cabfb3680f78981d26c078a26e5c748531257ebb)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Add capability to transform requests before sending
Pavel Shilovsky [Mon, 31 Oct 2016 20:49:30 +0000 (13:49 -0700)]
CIFS: Add capability to transform requests before sending

BugLink: http://bugs.launchpad.net/bugs/1670508
This will allow us to do protocol specific tranformations of packets
before sending to the server. For SMB3 it can be used to support
encryption.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 7fb8986e7449d0a5cebd84d059927afa423fbf85)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Separate RFC1001 length processing for SMB2 read
Pavel Shilovsky [Wed, 23 Nov 2016 23:31:54 +0000 (15:31 -0800)]
CIFS: Separate RFC1001 length processing for SMB2 read

BugLink: http://bugs.launchpad.net/bugs/1670508
Allocate and initialize SMB2 read request without RFC1001 length
field to directly call cifs_send_recv() rather than SendReceive2()
in a read codepath.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit b8f57ee8aad414a3122bff72d7968a94baacb9b6)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Separate SMB2 sync header processing
Pavel Shilovsky [Mon, 24 Oct 2016 23:59:57 +0000 (16:59 -0700)]
CIFS: Separate SMB2 sync header processing

BugLink: http://bugs.launchpad.net/bugs/1670508
Do not process RFC1001 length in smb2_hdr_assemble() because
it is not a part of SMB2 header. This allows to cleanup the code
and adds a possibility combine several SMB2 packets into one
for compounding.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit cb200bd6264a80c04e09e8635fa4f3901cabdaef)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Send RFC1001 length in a separate iov
Pavel Shilovsky [Wed, 23 Nov 2016 23:14:57 +0000 (15:14 -0800)]
CIFS: Send RFC1001 length in a separate iov

BugLink: http://bugs.launchpad.net/bugs/1670508
In order to simplify further encryption support we need to separate
RFC1001 length and SMB2 header when sending a request. Put the length
field in iov[0] and the rest of the packet into following iovs.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 738f9de5cdb9175c19d24cfdf90b4543fc3b47bf)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Make send_cancel take rqst as argument
Pavel Shilovsky [Wed, 23 Nov 2016 23:08:14 +0000 (15:08 -0800)]
CIFS: Make send_cancel take rqst as argument

BugLink: http://bugs.launchpad.net/bugs/1670508
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit fb2036d817584df42504910fe104f68517e8990e)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Make SendReceive2() takes resp iov
Pavel Shilovsky [Tue, 25 Oct 2016 18:38:47 +0000 (11:38 -0700)]
CIFS: Make SendReceive2() takes resp iov

BugLink: http://bugs.launchpad.net/bugs/1670508
Now SendReceive2 frees the first iov and returns a response buffer
in it that increases a code complexity. Simplify this by making
a caller responsible for freeing request buffer itself and returning
a response buffer in a separate iov.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit da502f7df03d2d0b416775f92ae022f3f82bedd5)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoCIFS: Separate SMB2 header structure
Pavel Shilovsky [Mon, 24 Oct 2016 22:33:04 +0000 (15:33 -0700)]
CIFS: Separate SMB2 header structure

BugLink: http://bugs.launchpad.net/bugs/1670508
In order to support compounding and encryption we need to separate
RFC1001 length field and SMB2 header structure because the protocol
treats them differently. This change will allow to simplify parsing
of such complex SMB2 packets further.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 31473fc4f9653b73750d3792ffce6a6e1bdf0da7)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: Add soft dependencies
Jean Delvare [Wed, 25 Jan 2017 15:09:10 +0000 (16:09 +0100)]
cifs: Add soft dependencies

BugLink: http://bugs.launchpad.net/bugs/1670508
List soft dependencies of cifs so that mkinitrd and dracut can include
the required helper modules.

Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Steve French <sfrench@samba.org>
(cherry picked from commit b9be76d585d48cb25af8db0d35e1ef9030fbe13a)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: Only select the required crypto modules
Jean Delvare [Wed, 25 Jan 2017 15:08:17 +0000 (16:08 +0100)]
cifs: Only select the required crypto modules

BugLink: http://bugs.launchpad.net/bugs/1670508
The sha256 and cmac crypto modules are only needed for SMB2+, so move
the select statements to config CIFS_SMB2. Also select CRYPTO_AES
there as SMB2+ needs it.

Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Steve French <sfrench@samba.org>
(cherry picked from commit 3692304bba6164be3810afd41b84ecb0e1e41db1)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: Simplify SMB2 and SMB311 dependencies
Jean Delvare [Wed, 25 Jan 2017 15:07:29 +0000 (16:07 +0100)]
cifs: Simplify SMB2 and SMB311 dependencies

BugLink: http://bugs.launchpad.net/bugs/1670508
* CIFS_SMB2 depends on CIFS, which depends on INET and selects NLS. So
  these dependencies do not need to be repeated for CIFS_SMB2.
* CIFS_SMB311 depends on CIFS_SMB2, which depends on INET. So this
  dependency doesn't need to be repeated for CIFS_SMB311.

Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Steve French <sfrench@samba.org>
(cherry picked from commit c1ecea87471bbb614f8121e00e5787f363140365)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoSMB3: parsing for new snapshot timestamp mount parm
Steve French [Sat, 12 Nov 2016 04:36:20 +0000 (22:36 -0600)]
SMB3: parsing for new snapshot timestamp mount parm

BugLink: http://bugs.launchpad.net/bugs/1670508
New mount option "snapshot=<time>" to allow mounting an earlier
version of the remote volume (if such a snapshot exists on
the server).

Note that eventually specifying a snapshot time of 1 will allow
the user to mount the oldest snapshot. A subsequent patch
add the processing for that and another for actually specifying
the "time warp" create context on SMB2/SMB3 open.

Check to make sure SMB2 negotiated, and ensure that
we use a different tcon if mount same share twice
but with different snaphshot times

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 8b217fe7fcadd162944a88b14990b9723c27419f)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoSMB2: Separate RawNTLMSSP authentication from SMB2_sess_setup
Sachin Prabhu [Fri, 7 Oct 2016 18:11:22 +0000 (19:11 +0100)]
SMB2: Separate RawNTLMSSP authentication from SMB2_sess_setup

BugLink: http://bugs.launchpad.net/bugs/1670508
We split the rawntlmssp authentication into negotiate and
authencate parts. We also clean up the code and add helpers.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 166cea4dc3a4f66f020cfb9286225ecd228ab61d)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoSMB2: Separate Kerberos authentication from SMB2_sess_setup
Sachin Prabhu [Fri, 7 Oct 2016 18:11:21 +0000 (19:11 +0100)]
SMB2: Separate Kerberos authentication from SMB2_sess_setup

BugLink: http://bugs.launchpad.net/bugs/1670508
Add helper functions and split Kerberos authentication off
SMB2_sess_setup.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
(cherry picked from commit 3baf1a7b921500596b77487d5a34a27d656fc032)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoSMB3: Add mount parameter to allow user to override max credits
Steve French [Fri, 23 Sep 2016 05:44:16 +0000 (00:44 -0500)]
SMB3: Add mount parameter to allow user to override max credits

BugLink: http://bugs.launchpad.net/bugs/1670508
Add mount option "max_credits" to allow setting maximum SMB3
credits to any value from 10 to 64000 (default is 32000).
This can be useful to workaround servers with problems allocating
credits, or to throttle the client to use smaller amount of
simultaneous i/o or to workaround server performance issues.

Also adds a cap, so that even if the server granted us more than
65000 credits due to a server bug, we would not use that many.

Signed-off-by: Steve French <steve.french@primarydata.com>
(cherry picked from commit 141891f4727c08829755be6c785e125d2e96c899)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoFix memory leaks in cifs_do_mount()
Sachin Prabhu [Fri, 29 Jul 2016 21:38:19 +0000 (22:38 +0100)]
Fix memory leaks in cifs_do_mount()

BugLink: http://bugs.launchpad.net/bugs/1670508
Fix memory leaks introduced by the patch
fs/cifs: make share unaccessible at root level mountable

Also move allocation of cifs_sb->prepath to cifs_setup_cifs_sb().

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
(cherry picked from commit 4214ebf4654798309364d0c678b799e402f38288)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs_readv_receive: use cifs_read_from_socket()
Al Viro [Sun, 10 Jan 2016 00:37:16 +0000 (19:37 -0500)]
cifs_readv_receive: use cifs_read_from_socket()

BugLink: http://bugs.launchpad.net/bugs/1670508
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit a6137305a8c47fa92ab1a8efcfe76f0e9fa96ab7)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: don't bother with kmap on read_pages side
Al Viro [Sun, 10 Jan 2016 00:54:50 +0000 (19:54 -0500)]
cifs: don't bother with kmap on read_pages side

BugLink: http://bugs.launchpad.net/bugs/1670508
just do ITER_BVEC recvmsg

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(backported from commit 71335664c38f03de10d7cf1d82705fe55a130b33)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: no need to wank with copying and advancing iovec on recvmsg side either
Al Viro [Fri, 13 Nov 2015 08:00:17 +0000 (03:00 -0500)]
cifs: no need to wank with copying and advancing iovec on recvmsg side either

BugLink: http://bugs.launchpad.net/bugs/1670508
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 09aab880f7c5ac31e9db18a63353b980bcb52261)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: merge the hash calculation helpers
Al Viro [Fri, 13 Nov 2015 03:46:49 +0000 (22:46 -0500)]
cifs: merge the hash calculation helpers

BugLink: http://bugs.launchpad.net/bugs/1670508
three practically identical copies...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 16c568efff82e4a6a75d2bd86576e648fad8a7fe)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years ago[net] drop 'size' argument of sock_recvmsg()
Al Viro [Sun, 15 Mar 2015 01:13:46 +0000 (21:13 -0400)]
[net] drop 'size' argument of sock_recvmsg()

BugLink: http://bugs.launchpad.net/bugs/1670508
all callers have it equal to msg_data_left(msg).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 2da62906b1e298695e1bb725927041cd59942c98)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoPrepare for encryption support (first part). Add decryption and encryption key genera...
Steve French [Fri, 18 Dec 2015 19:05:30 +0000 (13:05 -0600)]
Prepare for encryption support (first part). Add decryption and encryption key generation. Thanks to Metze for helping with this.

BugLink: http://bugs.launchpad.net/bugs/1670508
Reviewed-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <steve.french@primarydata.com>
(backported from commit 373512ec5c105ed09e3738196dcb257dfab65cba)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agocifs: Make echo interval tunable
Steve French [Fri, 18 Dec 2015 18:31:36 +0000 (12:31 -0600)]
cifs: Make echo interval tunable

BugLink: http://bugs.launchpad.net/bugs/1670508
Currently the echo interval is set to 60 seconds using a macro. This
setting determines the interval at which echo requests are sent to the
server on an idling connection. This setting also affects the time
required for a connection to an unresponsive server to timeout.

Making this setting a tunable allows users to control the echo interval
times as well as control the time after which the connecting to an
unresponsive server times out.

To set echo interval, pass the echo_interval=n mount option.

Version four of the patch.
v2: Change MIN and MAX timeout values
v3: Remove incorrect comment in cifs_get_tcp_session
v4: Fix bug in setting echo_intervalw

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Acked-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
(backported from commit adfeb3e00e8e1b9fb4ad19eb7367e7c272d16003)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
7 years agoRevert "Call echo service immediately after socket reconnect"
Thadeu Lima de Souza Cascardo [Mon, 8 May 2017 17:37:42 +0000 (14:37 -0300)]
Revert "Call echo service immediately after socket reconnect"

BugLink: http://bugs.launchpad.net/bugs/1670508
This reverts commit f068ccac8a390dca36ee914ca3dfe7c8fb82bc12.

The revert is in favor of cherry-picking the upstream commit, so the
rest of the patchset can apply.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoRevert "Handle mismatched open calls"
Thadeu Lima de Souza Cascardo [Mon, 8 May 2017 17:37:15 +0000 (14:37 -0300)]
Revert "Handle mismatched open calls"

BugLink: http://bugs.launchpad.net/bugs/1670508
This reverts commit 35067b7fba326a76624769e03afeb4b5ff182041.

The revert is in favor of cherry-picking the upstream commit, so the
rest of the patchset can apply.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agobpf: don't let ldimm64 leak map addresses on unprivileged
Daniel Borkmann [Wed, 21 Jun 2017 14:13:32 +0000 (22:13 +0800)]
bpf: don't let ldimm64 leak map addresses on unprivileged

CVE-2017-9150

The patch fixes two things at once:

1) It checks the env->allow_ptr_leaks and only prints the map address to
   the log if we have the privileges to do so, otherwise it just dumps 0
   as we would when kptr_restrict is enabled on %pK. Given the latter is
   off by default and not every distro sets it, I don't want to rely on
   this, hence the 0 by default for unprivileged.

2) Printing of ldimm64 in the verifier log is currently broken in that
   we don't print the full immediate, but only the 32 bit part of the
   first insn part for ldimm64. Thus, fix this up as well; it's okay to
   access, since we verified all ldimm64 earlier already (including just
   constants) through replace_map_fd_with_map_ptr().

Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs")
Fixes: cbd357008604 ("bpf: verifier (add ability to receive verification log)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(backported from commit 0d0e57697f162da4aa218b5feafe614fb666db07)
Signed-off-by: Wen-chien Jesse Sung <jesse.sung@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Shrirang Bagul <shrirang.bagul@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years ago/proc/iomem: only expose physical resource addresses to privileged users
Linus Torvalds [Fri, 9 Jun 2017 12:56:00 +0000 (14:56 +0200)]
/proc/iomem: only expose physical resource addresses to privileged users

CVE-2015-8944

In commit c4004b02f8e5b ("x86: remove the kernel code/data/bss resources
from /proc/iomem") I was hoping to remove the phyiscal kernel address
data from /proc/iomem entirely, but that had to be reverted because some
system programs actually use it.

This limits all the detailed resource information to properly
credentialed users instead.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 51d7b120418e99d6b3bf8df9eb3cc31e8171dee4)
Signed-off-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
7 years agoMake file credentials available to the seqfile interfaces
Linus Torvalds [Thu, 14 Apr 2016 18:22:00 +0000 (11:22 -0700)]
Make file credentials available to the seqfile interfaces

A lot of seqfile users seem to be using things like %pK that uses the
credentials of the current process, but that is actually completely
wrong for filesystem interfaces.

The unix semantics for permission checking files is to check permissions
at _open_ time, not at read or write time, and that is not just a small
detail: passing off stdin/stdout/stderr to a suid application and making
the actual IO happen in privileged context is a classic exploit
technique.

So if we want to be able to look at permissions at read time, we need to
use the file open credentials, not the current ones.  Normal file
accesses can just use "f_cred" (or any of the helper functions that do
that, like file_ns_capable()), but the seqfile interfaces do not have
any such options.

It turns out that seq_file _does_ save away the user_ns information of
the file, though.  Since user_ns is just part of the full credential
information, replace that special case with saving off the cred pointer
instead, and suddenly seq_file has all the permission information it
needs.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2015-8944

(cherry-picked from commit 34dbbcdbf63360661ff7bda6c5f52f99ac515f92)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
7 years agoLinux 4.4.73
Greg Kroah-Hartman [Sat, 17 Jun 2017 04:40:54 +0000 (06:40 +0200)]
Linux 4.4.73

BugLink: http://bugs.launchpad.net/bugs/1698817
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agosparc64: make string buffers large enough
Dan Carpenter [Fri, 25 Nov 2016 11:03:55 +0000 (14:03 +0300)]
sparc64: make string buffers large enough

BugLink: http://bugs.launchpad.net/bugs/1698817
commit b5c3206190f1fddd100b3060eb15f0d775ffeab8 upstream.

My static checker complains that if "lvl" is ULONG_MAX (this is 64 bit)
then some of the strings will overflow.  I don't know if that's possible
but it seems simple enough to make the buffers slightly larger.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Waldemar Brodkorb <wbx@openadk.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agos390/kvm: do not rely on the ILC on kvm host protection fauls
Christian Borntraeger [Mon, 15 May 2017 12:11:03 +0000 (14:11 +0200)]
s390/kvm: do not rely on the ILC on kvm host protection fauls

BugLink: http://bugs.launchpad.net/bugs/1698817
commit c0e7bb38c07cbd8269549ee0a0566021a3c729de upstream.

For most cases a protection exception in the host (e.g. copy
on write or dirty tracking) on the sie instruction will indicate
an instruction length of 4. Turns out that there are some corner
cases (e.g. runtime instrumentation) where this is not necessarily
true and the ILC is unpredictable.

Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for
all possible ILCs.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoxtensa: don't use linux IRQ #0
Max Filippov [Mon, 5 Jun 2017 09:43:51 +0000 (02:43 -0700)]
xtensa: don't use linux IRQ #0

BugLink: http://bugs.launchpad.net/bugs/1698817
commit e5c86679d5e864947a52fb31e45a425dea3e7fa9 upstream.

Linux IRQ #0 is reserved for error reporting and may not be used.
Increase NR_IRQS for one additional slot and increase
irq_domain_add_legacy parameter first_irq value to 1, so that linux
IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains.
Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa
PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform
data definitions.

This fixes inability to use hardware IRQ #0 in configurations that don't
use device tree and allows for non-identity mapping between linux IRQ #
and hardware IRQ #.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agotipc: ignore requests when the connection state is not CONNECTED
Parthasarathy Bhuvaragan [Tue, 24 Jan 2017 12:00:47 +0000 (13:00 +0100)]
tipc: ignore requests when the connection state is not CONNECTED

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 4c887aa65d38633885010277f3482400681be719 ]

In tipc_conn_sendmsg(), we first queue the request to the outqueue
followed by the connection state check. If the connection is not
connected, we should not queue this message.

In this commit, we reject the messages if the connection state is
not CF_CONNECTED.

Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Tested-by: John Thompson <thompa.atl@gmail.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoproc: add a schedule point in proc_pid_readdir()
Eric Dumazet [Tue, 24 Jan 2017 23:18:07 +0000 (15:18 -0800)]
proc: add a schedule point in proc_pid_readdir()

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 3ba4bceef23206349d4130ddf140819b365de7c8 ]

We have seen proc_pid_readdir() invocations holding cpu for more than 50
ms.  Add a cond_resched() to be gentle with other tasks.

[akpm@linux-foundation.org: coding style fix]
Link: http://lkml.kernel.org/r/1484238380.15816.42.camel@edumazet-glaptop3.roam.corp.google.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoromfs: use different way to generate fsid for BLOCK or MTD
Coly Li [Tue, 24 Jan 2017 23:18:46 +0000 (15:18 -0800)]
romfs: use different way to generate fsid for BLOCK or MTD

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit f598f82e204ec0b17797caaf1b0311c52d43fb9a ]

Commit 8a59f5d25265 ("fs/romfs: return f_fsid for statfs(2)") generates
a 64bit id from sb->s_bdev->bd_dev.  This is only correct when romfs is
defined with CONFIG_ROMFS_ON_BLOCK.  If romfs is only defined with
CONFIG_ROMFS_ON_MTD, sb->s_bdev is NULL, referencing sb->s_bdev->bd_dev
will triger an oops.

Richard Weinberger points out that when CONFIG_ROMFS_BACKED_BY_BOTH=y,
both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD are defined.
Therefore when calling huge_encode_dev() to generate a 64bit id, I use
the follow order to choose parameter,

- CONFIG_ROMFS_ON_BLOCK defined
  use sb->s_bdev->bd_dev
- CONFIG_ROMFS_ON_BLOCK undefined and CONFIG_ROMFS_ON_MTD defined
  use sb->s_dev when,
- both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD undefined
  leave id as 0

When CONFIG_ROMFS_ON_MTD is defined and sb->s_mtd is not NULL, sb->s_dev
is set to a device ID generated by MTD_BLOCK_MAJOR and mtd index,
otherwise sb->s_dev is 0.

This is a try-best effort to generate a uniq file system ID, if all the
above conditions are not meet, f_fsid of this romfs instance will be 0.
Generally only one romfs can be built on single MTD block device, this
method is enough to identify multiple romfs instances in a computer.

Link: http://lkml.kernel.org/r/1482928596-115155-1-git-send-email-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Reported-by: Nong Li <nongli1031@gmail.com>
Tested-by: Nong Li <nongli1031@gmail.com>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agosctp: sctp_addr_id2transport should verify the addr before looking up assoc
Xin Long [Tue, 24 Jan 2017 06:01:53 +0000 (14:01 +0800)]
sctp: sctp_addr_id2transport should verify the addr before looking up assoc

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 6f29a130613191d3c6335169febe002cba00edf5 ]

sctp_addr_id2transport is a function for sockopt to look up assoc by
address. As the address is from userspace, it can be a v4-mapped v6
address. But in sctp protocol stack, it always handles a v4-mapped
v6 address as a v4 address. So it's necessary to convert it to a v4
address before looking up assoc by address.

This patch is to fix it by calling sctp_verify_addr in which it can do
this conversion before calling sctp_endpoint_lookup_assoc, just like
what sctp_sendmsg and __sctp_connect do for the address from users.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agor8152: avoid start_xmit to schedule napi when napi is disabled
hayeswang [Thu, 26 Jan 2017 01:38:32 +0000 (09:38 +0800)]
r8152: avoid start_xmit to schedule napi when napi is disabled

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit de9bf29dd6e4a8a874cb92f8901aed50a9d0b1d3 ]

Stop the tx when the napi is disabled to prevent napi_schedule() is
called.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agor8152: fix rtl8152_post_reset function
hayeswang [Fri, 20 Jan 2017 06:33:55 +0000 (14:33 +0800)]
r8152: fix rtl8152_post_reset function

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 2c561b2b728ca4013e76d6439bde2c137503745e ]

The rtl8152_post_reset() should sumbit rx urb and interrupt transfer,
otherwise the rx wouldn't work and the linking change couldn't be
detected.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agor8152: re-schedule napi for tx
hayeswang [Thu, 26 Jan 2017 01:38:33 +0000 (09:38 +0800)]
r8152: re-schedule napi for tx

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 248b213ad908b88db15941202ef7cb7eb137c1a0 ]

Re-schedule napi after napi_complete() for tx, if it is necessay.

In r8152_poll(), if the tx is completed after tx_bottom() and before
napi_complete(), the scheduling of napi would be lost. Then, no
one handles the next tx until the next napi_schedule() is called.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agonfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
Chuck Lever [Thu, 26 Jan 2017 20:14:52 +0000 (15:14 -0500)]
nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 406dab8450ec76eca88a1af2fc15d18a2b36ca49 ]

Lock sequence IDs are bumped in decode_lock by calling
nfs_increment_seqid(). nfs_increment_sequid() does not use the
seqid_mutating_err() function fixed in commit 059aa7348241 ("Don't
increment lock sequence ID after NFS4ERR_MOVED").

Fixes: 059aa7348241 ("Don't increment lock sequence ID after ...")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Xuan Qi <xuan.qi@oracle.com>
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoravb: unmap descriptors when freeing rings
Kazuya Mizuguchi [Thu, 26 Jan 2017 13:29:27 +0000 (14:29 +0100)]
ravb: unmap descriptors when freeing rings

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit a47b70ea86bdeb3091341f5ae3ef580f1a1ad822 ]

"swiotlb buffer is full" errors occur after repeated initialisation of a
device - f.e. suspend/resume or ip link set up/down. This is because memory
mapped using dma_map_single() in ravb_ring_format() and ravb_start_xmit()
is not released.  Resolve this problem by unmapping descriptors when
freeing rings.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[simon: reworked]
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agodrm/ast: Fixed system hanged if disable P2A
Y.C. Chen [Thu, 26 Jan 2017 01:45:40 +0000 (09:45 +0800)]
drm/ast: Fixed system hanged if disable P2A

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 6c971c09f38704513c426ba6515f22fb3d6c87d5 ]

The original ast driver will access some BMC configuration through P2A bridge
that can be disabled since AST2300 and after.
It will cause system hanged if P2A bridge is disabled.
Here is the update to fix it.

Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agodrm/nouveau: Don't enabling polling twice on runtime resume
Lyude Paul [Thu, 12 Jan 2017 02:25:23 +0000 (21:25 -0500)]
drm/nouveau: Don't enabling polling twice on runtime resume

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit cae9ff036eea577856d5b12860b4c79c5e71db4a ]

As it turns out, on cards that actually have CRTCs on them we're already
calling drm_kms_helper_poll_enable(drm_dev) from
nouveau_display_resume() before we call it in
nouveau_pmops_runtime_resume(). This leads us to accidentally trying to
enable polling twice, which results in a potential deadlock between the
RPM locks and drm_dev->mode_config.mutex if we end up trying to enable
polling the second time while output_poll_execute is running and holding
the mode_config lock. As such, make sure we only enable polling in
nouveau_pmops_runtime_resume() if we need to.

This fixes hangs observed on the ThinkPad W541

Signed-off-by: Lyude <lyude@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: David Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoparisc, parport_gsc: Fixes for printk continuation lines
Helge Deller [Tue, 3 Jan 2017 21:55:50 +0000 (22:55 +0100)]
parisc, parport_gsc: Fixes for printk continuation lines

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 83b5d1e3d3013dbf90645a5d07179d018c8243fa ]

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agonet: adaptec: starfire: add checks for dma mapping errors
Alexey Khoroshilov [Fri, 27 Jan 2017 22:07:30 +0000 (01:07 +0300)]
net: adaptec: starfire: add checks for dma mapping errors

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit d1156b489fa734d1af763d6a07b1637c01bb0aed ]

init_ring(), refill_rx_ring() and start_tx() don't check
if mapping dma memory succeed.
The patch adds the checks and failure handling.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agopinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
Jisheng Zhang [Mon, 23 Jan 2017 07:15:32 +0000 (15:15 +0800)]
pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit e82d02580af45663fad6d3596e4344c606e81e10 ]

This should be a typo.

Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agogianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
Arseny Solokha [Sun, 29 Jan 2017 12:52:20 +0000 (19:52 +0700)]
gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 4af0e5bb95ee3ba5ea4bd7dbb94e1648a5279cc9 ]

In spite of switching to paged allocation of Rx buffers, the driver still
called dma_unmap_single() in the Rx queues tear-down path.

The DMA region unmapping code in free_skb_rx_queue() basically predates
the introduction of paged allocation to the driver. While being refactored,
it apparently hasn't reflected the change in the DMA API usage by its
counterpart gfar_new_page().

As a result, setting an interface to the DOWN state now yields the following:

  # ip link set eth2 down
  fsl-gianfar ffe24000.ethernet: DMA-API: device driver frees DMA memory with wrong function [device address=0x000000001ecd0000] [size=40]
  ------------[ cut here ]------------
  WARNING: CPU: 1 PID: 189 at lib/dma-debug.c:1123 check_unmap+0x8e0/0xa28
  CPU: 1 PID: 189 Comm: ip Tainted: G           O    4.9.5 #1
  task: dee73400 task.stack: dede2000
  NIP: c02101e8 LR: c02101e8 CTR: c0260d74
  REGS: dede3bb0 TRAP: 0700   Tainted: G           O     (4.9.5)
  MSR: 00021000 <CE,ME>  CR: 28002222  XER: 00000000

  GPR00: c02101e8 dede3c60 dee73400 000000b6 dfbd033c dfbd36c4 1f622000 dede2000
  GPR08: 00000007 c05b1634 1f622000 00000000 22002484 100a9904 00000000 00000000
  GPR16: 00000000 db4c849c 00000002 db4c8480 00000001 df142240 db4c84bc 00000000
  GPR24: c0706148 c0700000 00029000 c07552e8 c07323b4 dede3cb8 c07605e0 db535540
  NIP [c02101e8] check_unmap+0x8e0/0xa28
  LR [c02101e8] check_unmap+0x8e0/0xa28
  Call Trace:
  [dede3c60] [c02101e8] check_unmap+0x8e0/0xa28 (unreliable)
  [dede3cb0] [c02103b8] debug_dma_unmap_page+0x88/0x9c
  [dede3d30] [c02dffbc] free_skb_resources+0x2c4/0x404
  [dede3d80] [c02e39b4] gfar_close+0x24/0xc8
  [dede3da0] [c0361550] __dev_close_many+0xa0/0xf8
  [dede3dd0] [c03616f0] __dev_close+0x2c/0x4c
  [dede3df0] [c036b1b8] __dev_change_flags+0xa0/0x174
  [dede3e10] [c036b2ac] dev_change_flags+0x20/0x60
  [dede3e30] [c03e130c] devinet_ioctl+0x540/0x824
  [dede3e90] [c0347dcc] sock_ioctl+0x134/0x298
  [dede3eb0] [c0111814] do_vfs_ioctl+0xac/0x854
  [dede3f20] [c0111ffc] SyS_ioctl+0x40/0x74
  [dede3f40] [c000f290] ret_from_syscall+0x0/0x3c
  --- interrupt: c01 at 0xff45da0
      LR = 0xff45cd0
  Instruction dump:
  811d001c 7c66482e 813d0020 9061000c 807f000c 5463103a 7cc6182e 3c60c052
  386309ac 90c10008 4cc63182 4826b845 <0fe000004bfffa60 3c80c052 388402c4
  ---[ end trace 695ae6d7ac1d0c47 ]---
  Mapped at:
   [<c02e22a8>] gfar_alloc_rx_buffs+0x178/0x248
   [<c02e3ef0>] startup_gfar+0x368/0x570
   [<c036aeb4>] __dev_open+0xdc/0x150
   [<c036b1b8>] __dev_change_flags+0xa0/0x174
   [<c036b2ac>] dev_change_flags+0x20/0x60

Even though the issue was discovered in 4.9 kernel, the code in question
is identical in the current net and net-next trees.

Fixes: 75354148ce69 ("gianfar: Add paged allocation and Rx S/G")
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agodrm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
Ben Skeggs [Wed, 24 May 2017 01:54:09 +0000 (21:54 -0400)]
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 96692b097ba76d0c637ae8af47b29c73da33c9d0 ]

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agodrm/nouveau: prevent userspace from deleting client object
Ben Skeggs [Wed, 24 May 2017 01:54:08 +0000 (21:54 -0400)]
drm/nouveau: prevent userspace from deleting client object

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit c966b6279f610a24ac1d42dcbe30e10fa61220b2 ]

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoipv6: fix flow labels when the traffic class is non-0
Dimitris Michailidis [Wed, 24 May 2017 01:54:07 +0000 (21:54 -0400)]
ipv6: fix flow labels when the traffic class is non-0

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 90427ef5d2a4b9a24079889bf16afdcdaebc4240 ]

ip6_make_flowlabel() determines the flow label for IPv6 packets. It's
supposed to be passed a flow label, which it returns as is if non-0 and
in some other cases, otherwise it calculates a new value.

The problem is callers often pass a flowi6.flowlabel, which may also
contain traffic class bits. If the traffic class is non-0
ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and
returns the whole thing. Thus it can return a 'flow label' longer than
20b and the low 20b of that is typically 0 resulting in packets with 0
label. Moreover, different packets of a flow may be labeled differently.
For a TCP flow with ECN non-payload and payload packets get different
labels as exemplified by this pair of consecutive packets:

(pure ACK)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49
    Payload Length: 32
    Next Header: TCP (6)

(payload)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
    .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000
    Payload Length: 688
    Next Header: TCP (6)

This patch allows ip6_make_flowlabel() to be passed more than just a
flow label and has it extract the part it really wants. This was simpler
than modifying the callers. With this patch packets like the above become

Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
    Payload Length: 32
    Next Header: TCP (6)

Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
    .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
    Payload Length: 688
    Next Header: TCP (6)

Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agoFS-Cache: Initialise stores_lock in netfs cookie
David Howells [Wed, 24 May 2017 01:54:06 +0000 (21:54 -0400)]
FS-Cache: Initialise stores_lock in netfs cookie

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 62deb8187d116581c88c69a2dd9b5c16588545d4 ]

Initialise the stores_lock in fscache netfs cookies.  Technically, it
shouldn't be necessary, since the netfs cookie is an index and stores no
data, but initialising it anyway adds insignificant overhead.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agofscache: Clear outstanding writes when disabling a cookie
David Howells [Wed, 24 May 2017 01:54:05 +0000 (21:54 -0400)]
fscache: Clear outstanding writes when disabling a cookie

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit 6bdded59c8933940ac7e5b416448276ac89d1144 ]

fscache_disable_cookie() needs to clear the outstanding writes on the
cookie it's disabling because they cannot be completed after.

Without this, fscache_nfs_open_file() gets stuck because it disables the
cookie when the file is opened for writing but can't uncache the pages till
afterwards - otherwise there's a race between the open routine and anyone
who already has it open R/O and is still reading from it.

Looking in /proc/pid/stack of the offending process shows:

[<ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache]
[<ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache]
[<ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs]
[<ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4]
[<ffffffff8117350e>] do_dentry_open+0x16d/0x2b7
[<ffffffff811743ac>] vfs_open+0x5c/0x65
[<ffffffff81184185>] path_openat+0x785/0x8fb
[<ffffffff81184343>] do_filp_open+0x48/0x9e
[<ffffffff81174710>] do_sys_open+0x13b/0x1cb
[<ffffffff811747b9>] SyS_open+0x19/0x1b
[<ffffffff81001c44>] do_syscall_64+0x80/0x17a
[<ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a
[<ffffffffffffffff>] 0xffffffffffffffff

Reported-by: Jianhong Yin <jiyin@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
7 years agofscache: Fix dead object requeue
David Howells [Wed, 24 May 2017 01:54:04 +0000 (21:54 -0400)]
fscache: Fix dead object requeue

BugLink: http://bugs.launchpad.net/bugs/1698817
[ Upstream commit e26bfebdfc0d212d366de9990a096665d5c0209a ]

Under some circumstances, an fscache object can become queued such that it
fscache_object_work_func() can be called once the object is in the
OBJECT_DEAD state.  This results in the kernel oopsing when it tries to
invoke the handler for the state (which is hard coded to 0x2).

The way this comes about is something like the following:

 (1) The object dispatcher is processing a work state for an object.  This
     is done in workqueue context.

 (2) An out-of-band event comes in that isn't masked, causing the object to
     be queued, say EV_KILL.

 (3) The object dispatcher finishes processing the current work state on
     that object and then sees there's another event to process, so,
     without returning to the workqueue core, it processes that event too.
     It then follows the chain of events that initiates until we reach
     OBJECT_DEAD without going through a wait state (such as
     WAIT_FOR_CLEARANCE).

     At this point, object->events may be 0, object->event_mask will be 0
     and oob_event_mask will be 0.

 (4) The object dispatcher returns to the workqueue processor, and in due
     course, this sees that the object's work item is still queued and
     invokes it again.

 (5) The current state is a work state (OBJECT_DEAD), so the dispatcher
     jumps to it - resulting in an OOPS.

When I'm seeing this, the work state in (1) appears to have been either
LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
fscache_osm_lookup_oob).

The window for (2) is very small:

 (A) object->event_mask is cleared whilst the event dispatch process is
     underway - though there's no memory barrier to force this to the top
     of the function.

     The window, therefore is from the time the object was selected by the
     workqueue processor and made requeueable to the time the mask was
     cleared.

 (B) fscache_raise_event() will only queue the object if it manages to set
     the event bit and the corresponding event_mask bit was set.

     The enqueuement is then deferred slightly whilst we get a ref on the
     object and get the per-CPU variable for workqueue congestion.  This
     slight deferral slightly increases the probability by allowing extra
     time for the workqueue to make the item requeueable.

Handle this by giving the dead state a processor function and checking the
for the dead state address rather than seeing if the processor function is
address 0x2.  The dead state processor function can then set a flag to
indicate that it's occurred and give a warning if it occurs more than once
per object.

If this race occurs, an oops similar to the following is seen (note the RIP
value):

BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
IP: [<0000000000000002>] 0x1
PGD 0
Oops: 0010 [#1] SMP
Modules linked in: ...
CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
Workqueue: fscache_object fscache_object_work_func [fscache]
task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000
RIP: 0010:[<0000000000000002>]  [<0000000000000002>] 0x1
RSP: 0018:ffff880717547df8  EFLAGS: 00010202
RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200
RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480
RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510
R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000
R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600
FS:  0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
 ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
 ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
Call Trace:
 [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
 [<ffffffff8109d5db>] process_one_work+0x17b/0x470
 [<ffffffff8109e4ac>] worker_thread+0x21c/0x400
 [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400
 [<ffffffff810a5acf>] kthread+0xcf/0xe0
 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
 [<ffffffff816460d8>] ret_from_fork+0x58/0x90
 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeremy McNicoll <jeremymc@redhat.com>
Tested-by: Frank Sorenson <sorenson@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>