]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
9 years agonfs: Remove unused argument in nfs_server_set_fsinfo()
Kinglong Mee [Wed, 1 Jul 2015 03:55:50 +0000 (11:55 +0800)]
nfs: Remove unused argument in nfs_server_set_fsinfo()

Commit e38eb6506f "NFS: set_pnfs_layoutdriver() from nfs4_proc_fsinfo()"
have remove the using of mntfh from nfs_server_set_fsinfo().

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: Fix a memory leak when meeting an unsupported state protect
Kinglong Mee [Wed, 1 Jul 2015 03:54:53 +0000 (11:54 +0800)]
nfs: Fix a memory leak when meeting an unsupported state protect

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: take extra reference to fl->fl_file when running a LOCKU operation
Jeff Layton [Tue, 30 Jun 2015 18:12:30 +0000 (14:12 -0400)]
nfs: take extra reference to fl->fl_file when running a LOCKU operation

Jean reported another crash, similar to the one fixed by feaff8e5b2cf:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
    IP: [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock cfg80211 rfkill coretemp crct10dif_pclmul ppdev vmw_balloon crc32_pclmul crc32c_intel ghash_clmulni_intel pcspkr vmxnet3 parport_pc i2c_piix4 microcode serio_raw parport nfsd floppy vmw_vmci acpi_cpufreq auth_rpcgss shpchp nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih ata_generic mptbase i2c_core pata_acpi
    CPU: 0 PID: 329 Comm: kworker/0:1H Not tainted 4.1.0-rc7+ #2
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc]
    30ec000
    RIP: 0010:[<ffffffff8124ef7f>]  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    RSP: 0018:ffff8802330efc08  EFLAGS: 00010296
    RAX: ffff8802330efc58 RBX: ffff880097187c80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffff8802330efc18 R08: ffff88023fc173d8 R09: 3038b7bf00000000
    R10: 00002f1a02000000 R11: 3038b7bf00000000 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff8802337a2300 R15: 0000000000000020
    FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000148 CR3: 000000003680f000 CR4: 00000000001407f0
    Stack:
     ffff880097187c80 ffff880097187cd8 ffff8802330efc98 ffffffff81250281
     ffff8802330efc68 ffffffffa013e7df ffff8802330efc98 0000000000000246
     ffff8801f6901c00 ffff880233d2b8d8 ffff8802330efc58 ffff8802330efc58
    Call Trace:
     [<ffffffff81250281>] __posix_lock_file+0x31/0x5e0
     [<ffffffffa013e7df>] ? rpc_wake_up_task_queue_locked.part.35+0xcf/0x240 [sunrpc]
     [<ffffffff8125088b>] posix_lock_file_wait+0x3b/0xd0
     [<ffffffffa03890b2>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa0365808>] ? nfs41_sequence_done+0xd8/0x300 [nfsv4]
     [<ffffffffa0367525>] do_vfs_lock+0x35/0x40 [nfsv4]
     [<ffffffffa03690c1>] nfs4_locku_done+0x81/0x120 [nfsv4]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e33c>] rpc_exit_task+0x2c/0x90 [sunrpc]
     [<ffffffffa0134400>] ? call_refreshresult+0x170/0x170 [sunrpc]
     [<ffffffffa013ece4>] __rpc_execute+0x84/0x410 [sunrpc]
     [<ffffffffa013f085>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810add67>] process_one_work+0x147/0x400
     [<ffffffff810ae42b>] worker_thread+0x11b/0x460
     [<ffffffff810ae310>] ? rescuer_thread+0x2f0/0x2f0
     [<ffffffff810b35d9>] kthread+0xc9/0xe0
     [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pmd+0xa0/0x160
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
     [<ffffffff8173c222>] ret_from_fork+0x42/0x70
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
    Code: a5 81 e8 85 75 e4 ff c6 05 31 ee aa 00 01 eb 98 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 9f 48 01 00 00 48 85 db 74 08 48 89 d8 5b 41 5c 5d c3 83
    RIP  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
     RSP <ffff8802330efc08>
    CR2: 0000000000000148
    ---[ end trace 64484f16250de7ef ]---

The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
We must take a reference to the struct file when running the LOCKU
compound to prevent the final fput from running until the operation is
complete.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFSv4: When returning a delegation, don't reclaim an incompatible open mode.
NeilBrown [Mon, 29 Jun 2015 04:28:54 +0000 (14:28 +1000)]
NFSv4: When returning a delegation, don't reclaim an incompatible open mode.

It is possible to have an active open with one mode, and a delegation
for the same file with a different mode.
In particular, a WR_ONLY open and an RD_ONLY delegation.
This happens if a WR_ONLY open is followed by a RD_ONLY open which
provides a delegation, but is then close.

When returning the delegation, we currently try to claim opens for
every open type (n_rdwr, n_rdonly, n_wronly).  As there is no harm
in claiming an open for a mode that we already have, this is often
simplest.

However if the delegation only provides a subset of the modes that we
currently have open, this will produce an error from the server.

So when claiming open modes prior to returning a delegation, skip the
open request if the mode is not covered by the delegation - the open_stateid
must already cover that mode, so there is nothing to do.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFSv4.2: LAYOUTSTATS is optional to implement
Trond Myklebust [Sat, 27 Jun 2015 15:45:46 +0000 (11:45 -0400)]
NFSv4.2: LAYOUTSTATS is optional to implement

Make it so, by checking the return value for NFS4ERR_MOTSUPP and
caching the information as a server capability.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFSv4.2: Fix up a decoding error in layoutstats
Trond Myklebust [Sat, 27 Jun 2015 15:30:57 +0000 (11:30 -0400)]
NFSv4.2: Fix up a decoding error in layoutstats

According to the spec, the server is only returning the status,
which we decode in the op header.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS/flexfiles: Fix the reset of struct pgio_header when resending
Trond Myklebust [Fri, 26 Jun 2015 19:37:58 +0000 (15:37 -0400)]
pNFS/flexfiles: Fix the reset of struct pgio_header when resending

hdr->good_bytes needs to be set to the length of the request, not
zero.

Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS/flexfiles: Turn off layoutcommit for servers that don't need it
Trond Myklebust [Fri, 26 Jun 2015 18:51:32 +0000 (14:51 -0400)]
pNFS/flexfiles: Turn off layoutcommit for servers that don't need it

This patch ensures that we record the value of 'ffl_flags' from
the layout, and then checks for the presence of the
FF_FLAGS_NO_LAYOUTCOMMIT flag before deciding whether or not to
call pnfs_set_layoutcommit().

The effect is that servers now can decide whether or not they want
the client to call layoutcommit before returning a writeable layout.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoMerge branch 'layoutstats'
Trond Myklebust [Fri, 26 Jun 2015 18:01:59 +0000 (14:01 -0400)]
Merge branch 'layoutstats'

* layoutstats:
  pnfs/flexfiles: protect ktime manipulation with mirror lock
  nfs: provide pnfs_report_layoutstat when NFS42 is disabled
  pnfs/flexfiles: report layoutstat regularly
  nfs42: serialize LAYOUTSTATS calls of the same file
  pnfs/flexfiles: encode LAYOUTSTATS flexfiles specific data
  pnfs/flexfiles: add ff_layout_prepare_layoutstats
  pNFS/flexfiles: track when layout is first used
  pNFS/flexfiles: add layoutstats tracking
  pNFS/flexfiles: Remove unused struct members user_name, group_name
  pnfs: add pnfs_report_layoutstat helper function
  pNFS: fill in nfs42_layoutstat_ops
  NFSv.2/pnfs Add a LAYOUTSTATS rpc function

9 years agopnfs/flexfiles: protect ktime manipulation with mirror lock
Peng Tao [Fri, 26 Jun 2015 01:45:49 +0000 (09:45 +0800)]
pnfs/flexfiles: protect ktime manipulation with mirror lock

It looks as if xchg() and cmpxchg() are not available for 64-bit integers on sparc32:

> New breakage seen in linux-next today:
>
> ERROR: "__xchg_called_with_bad_pointer" [fs/nfs/flexfilelayout/nfs_layout_flexfiles.ko] undefined!
> ERROR: "__cmpxchg_called_with_bad_pointer" [fs/nfs/flexfilelayout/nfs_layout_flexfiles.ko] undefined!
> make[2]: *** [__modpost] Error 1
> make[1]: *** [modules] Error 2

Given that mirror ktime manipulation is already under mirror->lock, let's make use of the fact.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: provide pnfs_report_layoutstat when NFS42 is disabled
Peng Tao [Thu, 25 Jun 2015 10:19:32 +0000 (18:19 +0800)]
nfs: provide pnfs_report_layoutstat when NFS42 is disabled

kbuild test robot reported:
   fs/built-in.o: In function `pnfs_report_layoutstat':
>> (.text+0x151a1c): undefined reference to `nfs42_proc_layoutstats_generic'

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: verify open flags before allowing open
Benjamin Coddington [Thu, 25 Jun 2015 13:25:50 +0000 (09:25 -0400)]
nfs: verify open flags before allowing open

Commit 9597c13b forbade opens with O_APPEND|O_DIRECT for NFSv4:

    nfs: verify open flags before allowing an atomic open

    Currently, you can open a NFSv4 file with O_APPEND|O_DIRECT, but cannot
    fcntl(F_SETFL,...) with those flags. This flag combination is explicitly
    forbidden on NFSv3 opens, and it seems like it should also be on NFSv4.

However, you can still open a file with O_DIRECT|O_APPEND if there exists a
cached dentry for the file because nfs4_file_open() is used instead of
nfs_atomic_open() and the check is bypassed.  Add the check in
nfs4_file_open() as well.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: always update creds in mirror, even when we have an already connected ds
Jeff Layton [Wed, 24 Jun 2015 16:10:24 +0000 (12:10 -0400)]
nfs: always update creds in mirror, even when we have an already connected ds

A ds can be associated with more than one mirror, but we currently skip
setting a mirror's credentials if we find that it's already set up with
a connected client.

The upshot is that we can end up sending DS writes with MDS credentials
instead of properly setting them up. Fix nfs4_ff_layout_prepare_ds to
always verify that the mirror's credentials are set up, even when we
have a DS that's already connected.

Reported-by: Tom Haynes <thomas.haynes@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: fix potential credential leak in ff_layout_update_mirror_cred
Jeff Layton [Wed, 24 Jun 2015 16:10:23 +0000 (12:10 -0400)]
nfs: fix potential credential leak in ff_layout_update_mirror_cred

If we have two tasks racing to update a mirror's credentials, then they
can end up leaking one (or more) sets of credentials. The first task
will set mirror->cred and then the second task will just overwrite it.

Use a cmpxchg to ensure that the creds are only set once. If we get to
the point where we would set mirror->cred and find that they're already
set, then we just release the creds that were just found.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopnfs/flexfiles: report layoutstat regularly
Peng Tao [Tue, 23 Jun 2015 11:52:04 +0000 (19:52 +0800)]
pnfs/flexfiles: report layoutstat regularly

As a simple scheme, report every minute if IO is still going on.

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs42: serialize LAYOUTSTATS calls of the same file
Peng Tao [Tue, 23 Jun 2015 11:52:03 +0000 (19:52 +0800)]
nfs42: serialize LAYOUTSTATS calls of the same file

There is no need to report concurrently.

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopnfs/flexfiles: encode LAYOUTSTATS flexfiles specific data
Peng Tao [Tue, 23 Jun 2015 11:52:02 +0000 (19:52 +0800)]
pnfs/flexfiles: encode LAYOUTSTATS flexfiles specific data

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopnfs/flexfiles: add ff_layout_prepare_layoutstats
Peng Tao [Tue, 23 Jun 2015 11:52:01 +0000 (19:52 +0800)]
pnfs/flexfiles: add ff_layout_prepare_layoutstats

It fills in the generic part of LAYOUTSTATS call. One thing to note
is that we don't really track if IO is continuous or not. So just fake
to use the completed bytes for it.

Still missing flexfiles specific part, which will be included in the next patch.

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS/flexfiles: track when layout is first used
Peng Tao [Tue, 23 Jun 2015 11:52:00 +0000 (19:52 +0800)]
pNFS/flexfiles: track when layout is first used

So that we can report cumulative time since the beginning
of statistics collection of the layout.

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS/flexfiles: add layoutstats tracking
Trond Myklebust [Tue, 23 Jun 2015 11:51:59 +0000 (19:51 +0800)]
pNFS/flexfiles: add layoutstats tracking

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS/flexfiles: Remove unused struct members user_name, group_name
Trond Myklebust [Tue, 23 Jun 2015 11:51:58 +0000 (19:51 +0800)]
pNFS/flexfiles: Remove unused struct members user_name, group_name

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopnfs: add pnfs_report_layoutstat helper function
Peng Tao [Tue, 23 Jun 2015 11:51:57 +0000 (19:51 +0800)]
pnfs: add pnfs_report_layoutstat helper function

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS: fill in nfs42_layoutstat_ops
Peng Tao [Tue, 23 Jun 2015 11:51:56 +0000 (19:51 +0800)]
pNFS: fill in nfs42_layoutstat_ops

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFSv.2/pnfs Add a LAYOUTSTATS rpc function
Trond Myklebust [Tue, 23 Jun 2015 11:51:55 +0000 (19:51 +0800)]
NFSv.2/pnfs Add a LAYOUTSTATS rpc function

Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoMerge branch 'bugfixes'
Trond Myklebust [Mon, 22 Jun 2015 13:55:08 +0000 (09:55 -0400)]
Merge branch 'bugfixes'

* bugfixes:
  NFS: Ensure we set NFS_CONTEXT_RESEND_WRITES when requeuing writes
  pNFS: Fix a memory leak when attempted pnfs fails
  NFS: Ensure that we update the sequence id under the slot table lock
  nfs: Initialize cb_sequenceres information before validate_seqid()
  nfs: Only update callback sequnce id when CB_SEQUENCE success
  NFSv4: nfs4_handle_delegation_recall_error should ignore EAGAIN

9 years agoSUNRPC: Set the TCP user timeout option on client sockets
Trond Myklebust [Sat, 20 Jun 2015 19:31:54 +0000 (15:31 -0400)]
SUNRPC: Set the TCP user timeout option on client sockets

Use the TCP_USER_TIMEOUT socket option to advertise to the server
how long we will keep the connection open if there is unacknowledged
data. See RFC5482.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoSUNRPC: Ensure we release the TCP socket once it has been closed
Trond Myklebust [Fri, 19 Jun 2015 20:17:57 +0000 (16:17 -0400)]
SUNRPC: Ensure we release the TCP socket once it has been closed

This fixes a regression introduced by commit caf4ccd4e88cf2 ("SUNRPC:
Make xs_tcp_close() do a socket shutdown rather than a sock_release").
Prior to that commit, the autoclose feature would ensure that an
idle connection would result in the socket being both disconnected and
released, whereas now only gets disconnected.

While the current behaviour is harmless, it does leave the port bound
until either RPC traffic resumes or the RPC client is shut down.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoSUNRPC: Handle connection issues correctly on the back channel
Trond Myklebust [Fri, 19 Jun 2015 17:04:13 +0000 (13:04 -0400)]
SUNRPC: Handle connection issues correctly on the back channel

If the back channel is disconnected, we can and should just fail the
transmission. The expectation is that the NFSv4.1 server will always
retransmit any outstanding callbacks once the connection is
re-established.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: Fix comment for nfs_pageio_init() and nfs_pageio_complete_mirror()
Yijing Wang [Thu, 18 Jun 2015 11:37:13 +0000 (19:37 +0800)]
nfs: Fix comment for nfs_pageio_init() and nfs_pageio_complete_mirror()

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agosunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key()
Fabian Frederick [Tue, 16 Jun 2015 19:57:52 +0000 (21:57 +0200)]
sunrpc: use sg_init_one() in krb5_rc4_setup_enc/seq_key()

Don't opencode sg_init_one()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFS: Ensure we set NFS_CONTEXT_RESEND_WRITES when requeuing writes
Trond Myklebust [Wed, 17 Jun 2015 23:56:22 +0000 (19:56 -0400)]
NFS: Ensure we set NFS_CONTEXT_RESEND_WRITES when requeuing writes

If a write attempt fails, and the write is queued up for resending to
the server, as opposed to being dropped, then we need to set the
appropriate flag so that nfs_file_fsync() does the right thing.

Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopNFS: Fix a memory leak when attempted pnfs fails
Trond Myklebust [Wed, 17 Jun 2015 23:41:51 +0000 (19:41 -0400)]
pNFS: Fix a memory leak when attempted pnfs fails

pnfs_do_write() expects the call to pnfs_write_through_mds() to free the
pgio header and to release the layout segment before exiting. The problem
is that nfs_pgio_data_destroy() doesn't actually do this; it only frees
the memory allocated by nfs_generic_pgio().

Ditto for pnfs_do_read()...

Fix in both cases is to add a call to hdr->release(hdr).

Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoMerge tag 'nfs-rdma-for-4.2' of git://git.linux-nfs.org/projects/anna/nfs-rdma
Trond Myklebust [Tue, 16 Jun 2015 15:36:47 +0000 (11:36 -0400)]
Merge tag 'nfs-rdma-for-4.2' of git://git.linux-nfs.org/projects/anna/nfs-rdma

NFS: NFSoRDMA Client Changes

These patches continue to build up for improving the rsize and wsize that the
NFS client uses when talking over RDMA.  In addition, these patches also add
in scalability enhancements and other bugfixes.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* tag 'nfs-rdma-for-4.2' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (142 commits)
  xprtrdma: Reduce per-transport MR allocation
  xprtrdma: Stack relief in fmr_op_map()
  xprtrdma: Split rb_lock
  xprtrdma: Remove rpcrdma_ia::ri_memreg_strategy
  xprtrdma: Remove ->ro_reset
  xprtrdma: Remove unused LOCAL_INV recovery logic
  xprtrdma: Acquire MRs in rpcrdma_register_external()
  xprtrdma: Introduce an FRMR recovery workqueue
  xprtrdma: Acquire FMRs in rpcrdma_fmr_register_external()
  xprtrdma: Introduce helpers for allocating MWs
  xprtrdma: Use ib_device pointer safely
  xprtrdma: Remove rr_func
  xprtrdma: Replace rpcrdma_rep::rr_buffer with rr_rxprt
  xprtrdma: Warn when there are orphaned IB objects
  ...

9 years agoNFSv4: Fix stateid recovery on revoked delegations
Trond Myklebust [Tue, 16 Jun 2015 15:26:35 +0000 (11:26 -0400)]
NFSv4: Fix stateid recovery on revoked delegations

Ensure that we fix the non-NULL stateid case as well.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoRecover from stateid-type error on SETATTR
Olga Kornievskaia [Fri, 12 Jun 2015 20:53:30 +0000 (16:53 -0400)]
Recover from stateid-type error on SETATTR

Client can receives stateid-type error (eg., BAD_STATEID) on SETATTR when
delegation stateid was used. When no open state exists, in case of application
calling truncate() on the file, client has no state to recover and fails with
EIO.

Instead, upon such error, return the bad delegation and then resend the
SETATTR with a zero stateid.

Signed-off: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: Fix showing truncated fsid/dev in, /proc/net/nfsfs/volumes
Kinglong Mee [Sat, 13 Jun 2015 02:07:00 +0000 (10:07 +0800)]
nfs: Fix showing truncated fsid/dev in, /proc/net/nfsfs/volumes

A truncated fsid showing from /proc/fs/nfsfs/volumes as,
NV SERVER   PORT DEV     FSID              FSC
v4 c0a80881  801 0:43    34931f044c2a439b  no

It should be as,
NV SERVER   PORT DEV          FSID                              FSC
v4 c0a80881  801 0:43         34931f044c2a439b:954c5d830fa4be8c no

The max buffer length for storing "%llx:%llx" format should be
 16 + 1 + 16 + 1 = 34 (16 for %llx, 1 for ':', 1 for '\0').

Also, for storing "%u:%u" of MAJOR() and MINOR() should be
 8 + 1 + 3 + 1 = 13 (8 for 2^24, 1 for ':', 3 for 2^8, 1 for '\0').

v2, add comments for dev/fsid buffer and use sizeof in snprintf.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: make nfs4_init_uniform_client_string use a dynamically allocated buffer
Jeff Layton [Tue, 9 Jun 2015 23:44:00 +0000 (19:44 -0400)]
nfs: make nfs4_init_uniform_client_string use a dynamically allocated buffer

Change the uniform client string generator to dynamically allocate the
NFSv4 client name string buffer. With this patch, we can eliminate the
buffers that are embedded within the "args" structs and simply use the
name string that is hanging off the client.

This uniform string case is a little simpler than the nonuniform since
we don't need to deal with RCU, but we do have two different cases,
depending on whether there is a uniquifier or not.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: make nfs4_init_nonuniform_client_string use a dynamically allocated buffer
Jeff Layton [Tue, 9 Jun 2015 23:43:59 +0000 (19:43 -0400)]
nfs: make nfs4_init_nonuniform_client_string use a dynamically allocated buffer

The way the *_client_string functions work is a little goofy. They build
the string in an on-stack buffer and then use kstrdup to copy it. This
is not only stack-heavy but artificially limits the size of the client
name string. Change it so that we determine the length of the string,
allocate it and then scnprintf into it.

Since the contents of the nonuniform string depend on rcu-managed data
structures, it's possible that they'll change between when we allocate
the string and when we go to fill it. If that happens, free the string,
recalculate the length and try again. If it the mismatch isn't resolved
on the second try then just give up and return -EINVAL.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: update maxsz values for SETCLIENTID and EXCHANGE_ID
Jeff Layton [Tue, 9 Jun 2015 23:43:58 +0000 (19:43 -0400)]
nfs: update maxsz values for SETCLIENTID and EXCHANGE_ID

The spec allows for up to NFS4_OPAQUE_LIMIT (1k). While we'll almost
certainly never use that much, these ops are generally the only ones
in the compound so we might as well allow for them to be that large.

Also, the existing code didn't add in a word for the opaque length
field for either name string. Fix that while we're in there.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: convert setclientid and exchange_id encoders to use clp->cl_owner_id
Jeff Layton [Tue, 9 Jun 2015 23:43:57 +0000 (19:43 -0400)]
nfs: convert setclientid and exchange_id encoders to use clp->cl_owner_id

...instead of buffers that are part of their arg structs. We already
hold a reference to the client, so we might as well use the allocated
buffer. In the event that we can't allocate the clp->cl_owner_id, then
just return -ENOMEM.

Note too that we switch from a GFP_KERNEL allocation here to GFP_NOFS.
It's possible we could end up trying to do a SETCLIENTID or EXCHANGE_ID
in order to reclaim some memory, and the GFP_KERNEL allocations in the
existing code could cause recursion back into NFS reclaim.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: increase size of EXCHANGE_ID name string buffer
Jeff Layton [Tue, 9 Jun 2015 23:43:56 +0000 (19:43 -0400)]
nfs: increase size of EXCHANGE_ID name string buffer

The current buffer is much too small if you have a relatively long
hostname. Bring it up to the size of the one that SETCLIENTID has.

Cc: <stable@vger.kernel.org>
Reported-by: Michael Skralivetsky <michael.skralivetsky@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agopnfs/flexfiles: use swap() in ff_layout_sort_mirrors()
Fabian Frederick [Fri, 12 Jun 2015 16:58:50 +0000 (18:58 +0200)]
pnfs/flexfiles: use swap() in ff_layout_sort_mirrors()

Use kernel.h macro definition.

Thanks to Julia Lawall for Coccinelle scripting support.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoSUNRPC: never enqueue a ->rq_cong request on ->sending
Neil Brown [Mon, 15 Jun 2015 05:55:30 +0000 (15:55 +1000)]
SUNRPC: never enqueue a ->rq_cong request on ->sending

If the sending queue has a task without ->rq_cong set at the front,
and then a number of tasks with ->rq_cong set such that they use
the entire congestion window, then the queue deadlocks.  The first
entry cannot be processed until later entries complete.

This scenario has been seen with a client using UDP to access a server,
and the network connection breaking for a period of time - it doesn't
recover.

It never really makes sense for an ->rq_cong request to be on the ->sending
queue, but it can happen when a request is being retried, and finds
the transport if locked (XPRT_LOCKED).  In this case we simple call
__xprt_put_cong() and the deadlock goes away.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoxprtrdma: Reduce per-transport MR allocation
Chuck Lever [Tue, 26 May 2015 15:53:33 +0000 (11:53 -0400)]
xprtrdma: Reduce per-transport MR allocation

Reduce resource consumption per-transport to make way for increasing
the credit limit and maximum r/wsize. Pre-allocate fewer MRs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Stack relief in fmr_op_map()
Chuck Lever [Tue, 26 May 2015 15:53:23 +0000 (11:53 -0400)]
xprtrdma: Stack relief in fmr_op_map()

fmr_op_map() declares a 64 element array of u64 in automatic
storage. This is 512 bytes (8 * 64) on the stack.

Instead, when FMR memory registration is in use, pre-allocate a
physaddr array for each rpcrdma_mw.

This is a pre-requisite for increasing the r/wsize maximum for
FMR on platforms with 4KB pages.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Split rb_lock
Chuck Lever [Tue, 26 May 2015 15:53:13 +0000 (11:53 -0400)]
xprtrdma: Split rb_lock

/proc/lock_stat showed contention between rpcrdma_buffer_get/put
and the MR allocation functions during I/O intensive workloads.

Now that MRs are no longer allocated in rpcrdma_buffer_get(),
there's no reason the rb_mws list has to be managed using the
same lock as the send/receive buffers. Split that lock. The
new lock does not need to disable interrupts because buffer
get/put is never called in an interrupt context.

struct rpcrdma_buffer is re-arranged to ensure rb_mwlock and rb_mws
are always in a different cacheline than rb_lock and the buffer
pointers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Remove rpcrdma_ia::ri_memreg_strategy
Chuck Lever [Tue, 26 May 2015 15:53:04 +0000 (11:53 -0400)]
xprtrdma: Remove rpcrdma_ia::ri_memreg_strategy

Clean up: This field is no longer used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Remove ->ro_reset
Chuck Lever [Tue, 26 May 2015 15:52:54 +0000 (11:52 -0400)]
xprtrdma: Remove ->ro_reset

An RPC can exit at any time. When it does so, xprt_rdma_free() is
called, and it calls ->op_unmap().

If ->ro_reset() is running due to a transport disconnect, the two
methods can race while processing the same rpcrdma_mw. The results
are unpredictable.

Because of this, in previous patches I've altered ->ro_map() to
handle MR reset. ->ro_reset() is no longer needed and can be
removed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Remove unused LOCAL_INV recovery logic
Chuck Lever [Tue, 26 May 2015 15:52:44 +0000 (11:52 -0400)]
xprtrdma: Remove unused LOCAL_INV recovery logic

Clean up: Remove functions no longer used to recover broken FRMRs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Acquire MRs in rpcrdma_register_external()
Chuck Lever [Tue, 26 May 2015 15:52:35 +0000 (11:52 -0400)]
xprtrdma: Acquire MRs in rpcrdma_register_external()

Acquiring 64 MRs in rpcrdma_buffer_get() while holding the buffer
pool lock is expensive, and unnecessary because most modern adapters
can transfer 100s of KBs of payload using just a single MR.

Instead, acquire MRs one-at-a-time as chunks are registered, and
return them to rb_mws immediately during deregistration.

Note: commit 539431a437d2 ("xprtrdma: Don't invalidate FRMRs if
registration fails") is reverted: There is now a valid case where
registration can fail (with -ENOMEM) but the QP is still in RTS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Introduce an FRMR recovery workqueue
Chuck Lever [Tue, 26 May 2015 15:52:25 +0000 (11:52 -0400)]
xprtrdma: Introduce an FRMR recovery workqueue

After a transport disconnect, FRMRs can be left in an undetermined
state. In particular, the MR's rkey is no good.

Currently, FRMRs are fixed up by the transport connect worker, but
that can race with ->ro_unmap if an RPC happens to exit while the
transport connect worker is running.

A better way of dealing with broken FRMRs is to detect them before
they are re-used by ->ro_map. Such FRMRs are either already invalid
or are owned by the sending RPC, and thus no race with ->ro_unmap
is possible.

Introduce a mechanism for handing broken FRMRs to a workqueue to be
reset in a context that is appropriate for allocating resources
(ie. an ib_alloc_fast_reg_mr() API call).

This mechanism is not yet used, but will be in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-By: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Acquire FMRs in rpcrdma_fmr_register_external()
Chuck Lever [Tue, 26 May 2015 15:52:16 +0000 (11:52 -0400)]
xprtrdma: Acquire FMRs in rpcrdma_fmr_register_external()

Acquiring 64 FMRs in rpcrdma_buffer_get() while holding the buffer
pool lock is expensive, and unnecessary because FMR mode can
transfer up to a 1MB payload using just a single ib_fmr.

Instead, acquire ib_fmrs one-at-a-time as chunks are registered, and
return them to rb_mws immediately during deregistration.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Introduce helpers for allocating MWs
Chuck Lever [Tue, 26 May 2015 15:52:06 +0000 (11:52 -0400)]
xprtrdma: Introduce helpers for allocating MWs

We eventually want to handle allocating MWs one at a time, as
needed, instead of grabbing 64 and throwing them at each RPC in the
pipeline.

Add a helper for grabbing an MW off rb_mws, and a helper for
returning an MW to rb_mws. These will be used in a subsequent patch.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Use ib_device pointer safely
Chuck Lever [Tue, 26 May 2015 15:51:56 +0000 (11:51 -0400)]
xprtrdma: Use ib_device pointer safely

The connect worker can replace ri_id, but prevents ri_id->device
from changing during the lifetime of a transport instance. The old
ID is kept around until a new ID is created and the ->device is
confirmed to be the same.

Cache a copy of ri_id->device in rpcrdma_ia and in rpcrdma_rep.
The cached copy can be used safely in code that does not serialize
with the connect worker.

Other code can use it to save an extra address generation (one
pointer dereference instead of two).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Remove rr_func
Chuck Lever [Tue, 26 May 2015 15:51:46 +0000 (11:51 -0400)]
xprtrdma: Remove rr_func

A posted rpcrdma_rep never has rr_func set to anything but
rpcrdma_reply_handler.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Replace rpcrdma_rep::rr_buffer with rr_rxprt
Chuck Lever [Tue, 26 May 2015 15:51:37 +0000 (11:51 -0400)]
xprtrdma: Replace rpcrdma_rep::rr_buffer with rr_rxprt

Clean up: Instead of carrying a pointer to the buffer pool and
the rpc_xprt, carry a pointer to the controlling rpcrdma_xprt.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoxprtrdma: Warn when there are orphaned IB objects
Chuck Lever [Tue, 26 May 2015 15:51:27 +0000 (11:51 -0400)]
xprtrdma: Warn when there are orphaned IB objects

WARN during transport destruction if ib_dealloc_pd() fails. This is
a sign that xprtrdma orphaned one or more RDMA API objects at some
point, which can pin lower layer kernel modules and cause shutdown
to hang.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Devesh Sharma <devesh.sharma@avagotech.com>
Tested-By: Devesh Sharma <devesh.sharma@avagotech.com>
Reviewed-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
9 years agoNFS: Ensure that we update the sequence id under the slot table lock
Trond Myklebust [Fri, 12 Jun 2015 01:13:58 +0000 (21:13 -0400)]
NFS: Ensure that we update the sequence id under the slot table lock

Fix a callback slot table regression.

Fixes: e937ee714b2d ("nfs: Only update callback sequnce id when CB_SEQUENCE success")
Cc: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: Initialize cb_sequenceres information before validate_seqid()
Kinglong Mee [Tue, 2 Jun 2015 10:59:01 +0000 (18:59 +0800)]
nfs: Initialize cb_sequenceres information before validate_seqid()

For a cb_layoutrecall replay, nfsd got CB_SEQUENCE status of zero,
but all informations of cb_sequenceres are zero too !!!

validate_seqid() return NFS4ERR_RETRY_UNCACHED_REP for a replay,
and skip the initlize cb_sequenceres.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: deny backchannel RPCs with an incorrect authflavor instead of dropping them
Jeff Layton [Thu, 4 Jun 2015 22:40:13 +0000 (18:40 -0400)]
nfs: deny backchannel RPCs with an incorrect authflavor instead of dropping them

A drop should really only be done when the frame is malformed or we have
reason to think that there is some sort of DoS going on. When we get an
RPC with bad auth, we should send back an error instead.

Cc: Andy Adamson <William.Adamson@netapp.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoSUNRPC: Address kbuild warning in net/sunrpc/debugfs.c
Chuck Lever [Thu, 11 Jun 2015 17:47:10 +0000 (13:47 -0400)]
SUNRPC: Address kbuild warning in net/sunrpc/debugfs.c

Cross-compile test on ARCH=mn10300:

In file included from include/linux/list.h:8:0,
                 from include/linux/wait.h:6,
                 from include/linux/fs.h:6,
                 from include/linux/debugfs.h:18,
                 from net/sunrpc/debugfs.c:7:
net/sunrpc/debugfs.c: In function 'fault_disconnect_write':
include/linux/kernel.h:723:17: warning: comparison of distinct pointer
types lacks a cast
    (void) (&_min1 == &_min2);  \
                   ^
>> net/sunrpc/debugfs.c:307:8: note: in expansion of macro 'min'
    len = min(len, sizeof(buffer) - 1);

Fixes: ('SUNRPC: Transport fault injection')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agonfs: Only update callback sequnce id when CB_SEQUENCE success
Kinglong Mee [Tue, 2 Jun 2015 10:58:46 +0000 (18:58 +0800)]
nfs: Only update callback sequnce id when CB_SEQUENCE success

When testing pnfs layout, nfsd got error NFS4ERR_SEQ_MISORDERED.
It is caused by nfs return NFS4ERR_DELAY before validate_seqid(),
don't update the sequnce id, but nfsd updates the sequnce id !!!

According to RFC5661 20.9.3,
" If CB_SEQUENCE returns an error, then the state of the slot
  (sequence ID, cached reply) MUST NOT change. "

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFS: Convert use of __constant_htonl to htonl
Vaishali Thakkar [Wed, 10 Jun 2015 14:45:29 +0000 (20:15 +0530)]
NFS: Convert use of __constant_htonl to htonl

In little endian cases, the macro htonl unfolds to __swab32 which
provides special case for constants. In big endian cases,
__constant_htonl and htonl expand directly to the same expression.
So, replace __constant_htonl with htonl with the goal of getting
rid of the definition of __constant_htonl completely.

The semantic patch that performs this transformation is as follows:

@@expression x;@@

- __constant_htonl(x)
+ htonl(x)

Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoSUNRPC: Transport fault injection
Chuck Lever [Mon, 11 May 2015 18:02:25 +0000 (14:02 -0400)]
SUNRPC: Transport fault injection

It has been exceptionally useful to exercise the logic that handles
local immediate errors and RDMA connection loss.  To enable
developers to test this regularly and repeatably, add logic to
simulate connection loss every so often.

Fault injection is disabled by default. It is enabled with

  $ sudo echo xxx > /sys/kernel/debug/sunrpc/inject_fault/disconnect

where "xxx" is a large positive number of transport method calls
before a disconnect. A value of several thousand is usually a good
number that allows reasonable forward progress while still causing a
lot of connection drops.

These hooks are disabled when SUNRPC_DEBUG is turned off.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFS: Remove unused nfs_rw_ops->rw_release() function
Anna Schumaker [Wed, 10 Jun 2015 20:54:28 +0000 (16:54 -0400)]
NFS: Remove unused nfs_rw_ops->rw_release() function

This was only ever set to nfs_writeback_release_common(), a function
which is completely empty.  Let's just drop this function pointer and
simplify the code a bit.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoNFSv4: handle nfs4_get_referral failure
Dominique Martinet [Thu, 4 Jun 2015 15:04:17 +0000 (17:04 +0200)]
NFSv4: handle nfs4_get_referral failure

nfs4_proc_lookup_common is supposed to return a posix error, we have to
handle any error returned that isn't errno

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Frank S. Filz <ffilzlnx@mindspring.com>
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agosunrpc: turn swapper_enable/disable functions into rpc_xprt_ops
Jeff Layton [Wed, 3 Jun 2015 20:14:29 +0000 (16:14 -0400)]
sunrpc: turn swapper_enable/disable functions into rpc_xprt_ops

RDMA xprts don't have a sock_xprt, but an rdma_xprt, so the
xs_swapper_enable/disable functions will likely oops when fed an RDMA
xprt. Turn these functions into rpc_xprt_ops so that that doesn't
occur. For now the RDMA versions are no-ops that just return -EINVAL
on an attempt to swapon.

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agosunrpc: lock xprt before trying to set memalloc on the sockets
Jeff Layton [Wed, 3 Jun 2015 20:14:28 +0000 (16:14 -0400)]
sunrpc: lock xprt before trying to set memalloc on the sockets

It's possible that we could race with a call to xs_reset_transport, in
which case the xprt->inet pointer could be zeroed out while we're
accessing it. Lock the xprt before we try to set memalloc on it.

Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agosunrpc: if we're closing down a socket, clear memalloc on it first
Jeff Layton [Wed, 3 Jun 2015 20:14:27 +0000 (16:14 -0400)]
sunrpc: if we're closing down a socket, clear memalloc on it first

We currently increment the memalloc_socks counter if we have a xprt that
is associated with a swapfile. That socket can be replaced however
during a reconnect event, and the memalloc_socks counter is never
decremented if that occurs.

When tearing down a xprt socket, check to see if the xprt is set up for
swapping and sk_clear_memalloc before releasing the socket if so.

Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agosunrpc: make xprt->swapper an atomic_t
Jeff Layton [Wed, 3 Jun 2015 20:14:26 +0000 (16:14 -0400)]
sunrpc: make xprt->swapper an atomic_t

Split xs_swapper into enable/disable functions and eliminate the
"enable" flag.

Currently, it's racy if you have multiple swapon/swapoff operations
running in parallel over the same xprt. Also fix it so that we only
set it to a memalloc socket on a 0->1 transition and only clear it
on a 1->0 transition.

Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agosunrpc: keep a count of swapfiles associated with the rpc_clnt
Jeff Layton [Wed, 3 Jun 2015 20:14:25 +0000 (16:14 -0400)]
sunrpc: keep a count of swapfiles associated with the rpc_clnt

Jerome reported seeing a warning pop when working with a swapfile on
NFS. The nfs_swap_activate can end up calling sk_set_memalloc while
holding the rcu_read_lock and that function can sleep.

To fix that, we need to take a reference to the xprt while holding the
rcu_read_lock, set the socket up for swapping and then drop that
reference. But, xprt_put is not exported and having NFS deal with the
underlying xprt is a bit of layering violation anyway.

Fix this by adding a set of activate/deactivate functions that take a
rpc_clnt pointer instead of an rpc_xprt, and have nfs_swap_activate and
nfs_swap_deactivate call those.

Also, add a per-rpc_clnt atomic counter to keep track of the number of
active swapfiles associated with it. When the counter does a 0->1
transition, we enable swapping on the xprt, when we do a 1->0 transition
we disable swapping on it.

This also allows us to be a bit more selective with the RPC_TASK_SWAPPER
flag. If non-swapper and swapper clnts are sharing a xprt, then we only
need to flag the tasks from the swapper clnt with that flag.

Acked-by: Mel Gorman <mgorman@suse.de>
Reported-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoLinux 4.1-rc7
Linus Torvalds [Mon, 8 Jun 2015 03:23:50 +0000 (20:23 -0700)]
Linux 4.1-rc7

9 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sun, 7 Jun 2015 23:56:10 +0000 (16:56 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus

Pull MIPS updates from Ralf Baechle:
 "Eight fixes across arch/mips.  Nothing stands particuarly out nor is
  complicated but fixes keep coming in at a higher than comfortable
  rate"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: KVM: Do not sign extend on unsigned MMIO load
  MIPS: BPF: Fix stack pointer allocation
  MIPS: Loongson-3: Fix a cpu-hotplug issue in loongson3_ipi_interrupt()
  MIPS: Fix enabling of DEBUG_STACKOVERFLOW
  MIPS: c-r4k: Fix typo in probe_scache()
  MIPS: Avoid an FPE exception in FCSR mask probing
  MIPS: ath79: Add a missing new line in log message
  MIPS: ralink: Fix clearing the illegal access interrupt

9 years agoMerge tag 'driver-core-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 7 Jun 2015 05:37:45 +0000 (22:37 -0700)]
Merge tag 'driver-core-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two fixes for the driver core that resolve some reported
  issues.

  One is a regression from 4.0, the other a fixes a reported oops that
  has been there since 3.19.

  Both have been in linux-next for a while with no problems"

* tag 'driver-core-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  drivers/base: cacheinfo: handle absence of caches
  drivers: of/base: move of_init to driver_init

9 years agoMerge tag 'staging-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 7 Jun 2015 05:33:08 +0000 (22:33 -0700)]
Merge tag 'staging-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull staging / IIO fixes from Greg KH:
 "Here are some IIO driver fixes to resolve reported issues, some ozwpan
  fixes for some reported CVE problems, and a rtl8712 driver fix for a
  reported regression.

  All have been in linux-next successfully"

* tag 'staging-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: rtl8712: fix stack dump
  ozwpan: unchecked signed subtraction leads to DoS
  ozwpan: divide-by-zero leading to panic
  ozwpan: Use unsigned ints to prevent heap overflow
  ozwpan: Use proper check to prevent heap overflow
  iio: adc: twl6030-gpadc: Fix modalias
  iio: adis16400: Fix burst transfer for adis16448
  iio: adis16400: Fix burst mode
  iio: adis16400: Compute the scan mask from channel indices
  iio: adis16400: Use != channel indices for the two voltage channels
  iio: adis16400: Report pressure channel scale

9 years agoMerge tag 'tty-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sun, 7 Jun 2015 05:14:23 +0000 (22:14 -0700)]
Merge tag 'tty-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are a few TTY and Serial driver fixes for reported regressions
  and crashes.

  All of these have been in linux-next with no reported problems"

* tag 'tty-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  n_tty: Fix auditing support for cannonical mode
  serial: 8250_omap: provide complete custom startup & shutdown callbacks
  n_tty: Fix calculation of size in canon_copy_from_read_buf
  serial: imx: Fix DMA handling for IDLE condition aborts
  serial/amba-pl011: Unconditionally poll for FIFO space before each TX char

9 years agoMerge tag 'usb-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 7 Jun 2015 05:06:53 +0000 (22:06 -0700)]
Merge tag 'usb-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB and PHY driver fixes from Greg KH:
 "Here are some USB and PHY driver fixes that resolve some reported
  regressions.  Also in here are some new device ids.

  All of the details are in the shortlog and these patches have been in
  linux-next with no problems"

* tag 'usb-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (22 commits)
  USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle
  usb: renesas_usbhs: Don't disable the pipe if Control write status stage
  usb: renesas_usbhs: Fix fifo unclear in usbhsf_prepare_pop
  usb: gadget: f_fs: fix check in read operation
  usb: musb: fix order of conditions for assigning end point operations
  usb: gadget: f_uac1: check return code from config_ep_by_speed
  usb: gadget: ffs: fix: Always call ffs_closed() in ffs_data_clear()
  usb: gadget: g_ffs: Fix counting of missing_functions
  usb: s3c2410_udc: correct reversed pullup logic
  usb: dwc3: gadget: Fix incorrect DEPCMD and DGCMD status macros
  usb: phy: tahvo: Pass the IRQF_ONESHOT flag
  usb: phy: ab8500-usb: Pass the IRQF_ONESHOT flag
  usb: renesas_usbhs: Revise the binding document about the dma-names
  usb: host: xhci: add mutex for non-thread-safe data
  usb: make module xhci_hcd removable
  USB: serial: ftdi_sio: Add support for a Motion Tracker Development Board
  usb: gadget: f_midi: fix segfault when reading empty id
  phy: phy-rcar-gen2: Fix USBHS_UGSTS_LOCK value
  phy: omap-usb2: invoke pm_runtime_disable on error path
  phy: fix Kconfig dependencies
  ...

9 years agoMerge tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 7 Jun 2015 04:43:29 +0000 (21:43 -0700)]
Merge tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux

Pull devicetree fix from Grant Likely:
 "Stupid typo fix for v4.1.  One of the IS_ENABLED() macro calls forgot
  the CONFIG_ prefix.  Only affects a tiny number of platforms, but
  still..."

* tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux:
  of/dynamic: Fix test for PPC_PSERIES

9 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Sat, 6 Jun 2015 16:15:14 +0000 (09:15 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "i915 has a bunch of fixes, and Russell found a bug in sysfs writing
  handling that results in userspace getting stuck"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm: fix writing to /sys/class/drm/*/status
  drm/i915: Move WaBarrierPerformanceFixDisable:skl to skl code from chv code
  drm/i915: Include G4X/VLV/CHV in self refresh status
  drm/i915: Initialize HWS page address after GPU reset
  drm/i915: Don't skip request retirement if the active list is empty
  drm/i915/hsw: Fix workaround for server AUX channel clock divisor

9 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sat, 6 Jun 2015 16:08:23 +0000 (09:08 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input subsystem fixes from Dmitry Torokhov:
 "Just a couple touchpad drivers fixups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: alps - do not reduce trackpoint speed by half
  Input: elantech - add new icbody type
  Input: elantech - fix detection of touchpads where the revision matches a known rate

9 years agoMerge branch 'stable/for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 6 Jun 2015 16:06:20 +0000 (09:06 -0700)]
Merge branch 'stable/for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb

Pull swiotlb fix from Konrad Rzeszutek Wilk:
 "Tiny little fix which just converts an function to be static.  Really
  tiny"

* 'stable/for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: do not export map_single function

9 years agoMerge branch 'stable/for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 6 Jun 2015 16:03:54 +0000 (09:03 -0700)]
Merge branch 'stable/for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft

Pull iBFT fix from Konrad Rzeszutek Wilk:
 "One single fix from Chris to workaround UEFI platforms failing with
  iSCSI IBFT"

* 'stable/for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft:
  iscsi_ibft: filter null v4-mapped v6 addresses

9 years agoMIPS: KVM: Do not sign extend on unsigned MMIO load
Nicholas Mc Guire [Thu, 7 May 2015 12:47:50 +0000 (14:47 +0200)]
MIPS: KVM: Do not sign extend on unsigned MMIO load

Fix possible unintended sign extension in unsigned MMIO loads by casting
to uint16_t in the case of mmio_needed != 2.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9985/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
9 years agoMIPS: BPF: Fix stack pointer allocation
Markos Chandras [Thu, 4 Jun 2015 10:56:13 +0000 (11:56 +0100)]
MIPS: BPF: Fix stack pointer allocation

Fix stack pointer offset which could potentially corrupt
argument registers in the previous frame. The calculated offset
reflects the size of all the registers we need to preserve so there
is no need for this erroneous subtraction.

[ralf@linux-mips.org: Fixed conflict due to only applying this fix part
of the entire series as part of 4.1 fixes.]

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/10527/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
9 years agoMIPS: Loongson-3: Fix a cpu-hotplug issue in loongson3_ipi_interrupt()
Huacai Chen [Thu, 4 Jun 2015 06:53:47 +0000 (14:53 +0800)]
MIPS: Loongson-3: Fix a cpu-hotplug issue in loongson3_ipi_interrupt()

setup_per_cpu_areas() only setup __per_cpu_offset[] for each possible
cpu, but loongson_sysconf.nr_cpus can be greater than possible cpus
(due to reserved_cpus_mask). So in loongson3_ipi_interrupt(), percpu
access will touch the original varible in .data..percpu section which
has been freed. Without this patch, cpu-hotplug will cause memery
corruption.

Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: http://patchwork.linux-mips.org/patch/10524/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
9 years agoMIPS: Fix enabling of DEBUG_STACKOVERFLOW
James Hogan [Thu, 4 Jun 2015 12:25:27 +0000 (13:25 +0100)]
MIPS: Fix enabling of DEBUG_STACKOVERFLOW

Commit 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection") added
kernel stack overflow detection, however it only enabled it conditional
upon the preprocessor definition DEBUG_STACKOVERFLOW, which is never
actually defined. The Kconfig option is called DEBUG_STACKOVERFLOW,
which manifests to the preprocessor as CONFIG_DEBUG_STACKOVERFLOW, so
switch it to using that definition instead.

Fixes: 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Adam Jiang <jiang.adam@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.37+
Patchwork: http://patchwork.linux-mips.org/patch/10531/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
9 years agoMIPS: c-r4k: Fix typo in probe_scache()
Joshua Kinard [Tue, 2 Jun 2015 20:55:22 +0000 (16:55 -0400)]
MIPS: c-r4k: Fix typo in probe_scache()

Fixes a typo in arch/mips/mm/c-r4k.c's probe_scache().

Signed-off-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
9 years agoiscsi_ibft: filter null v4-mapped v6 addresses
Chris Leech [Thu, 28 May 2015 19:51:51 +0000 (12:51 -0700)]
iscsi_ibft: filter null v4-mapped v6 addresses

I've had reports of UEFI platforms failing iSCSI boot in various
configurations, that ended up being caused by network initialization
scripts getting tripped up by unexpected null addresses (0.0.0.0) being
reported for gateways, dhcp servers, and dns servers.

The tianocore EDK2 iSCSI driver generates an iBFT table that always uses
IPv4-mapped IPv6 addresses for the NIC structure fields.  This results
in values that are "not present or not specified" being reported as
::ffff:0.0.0.0 rather than all zeros as specified.

The iscsi_ibft module filters unspecified fields from the iBFT from
sysfs, preventing userspace from using invalid values and making it easy
to check for the presence of a value.  This currently fails in regard to
these mapped null addresses.

In order to remain consistent with how the iBFT information is exposed,
we should accommodate the behavior of the tianocore iSCSI driver as it's
already in the wild in a large number of servers.

Tested under qemu using an OVMF build of tianocore EDK2.

Signed-off-by: Chris Leech <cleech@redhat.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years agoswiotlb: do not export map_single function
Alexandre Courbot [Tue, 26 May 2015 09:02:54 +0000 (18:02 +0900)]
swiotlb: do not export map_single function

The map_single() function is not defined as static, even though it
doesn't seem to be used anywhere else in the kernel. Make it static to
avoid namespace pollution since this is a rather generic symbol.

Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years agodrm: fix writing to /sys/class/drm/*/status
Russell King [Fri, 5 Jun 2015 22:27:30 +0000 (08:27 +1000)]
drm: fix writing to /sys/class/drm/*/status

Writing to a file is supposed to return the number of bytes written.
Returning zero unfortunately causes bash to constantly spin trying
to write to the sysfs file, to such an extent that even ^c and ^z
have no effect.  The only way out of that is to kill the shell and
log back in.  This isn't nice behaviour.

Fix it by returning the number of characters written to sysfs files.

[airlied: used suggestion from Al Viro]
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoMerge tag 'drm-intel-fixes-2015-06-05' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Fri, 5 Jun 2015 21:14:13 +0000 (07:14 +1000)]
Merge tag 'drm-intel-fixes-2015-06-05' of git://anongit.freedesktop.org/drm-intel into drm-fixes

bunch of i915 fixes.
* tag 'drm-intel-fixes-2015-06-05' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Move WaBarrierPerformanceFixDisable:skl to skl code from chv code
  drm/i915: Include G4X/VLV/CHV in self refresh status
  drm/i915: Initialize HWS page address after GPU reset
  drm/i915: Don't skip request retirement if the active list is empty
  drm/i915/hsw: Fix workaround for server AUX channel clock divisor

9 years agoMerge tag 'pci-v4.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Linus Torvalds [Fri, 5 Jun 2015 17:22:06 +0000 (10:22 -0700)]
Merge tag 'pci-v4.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "Resource management
   - Fix IOV sorting by alignment (Wei Yang)
   - Preserve resource size during alignment reordering (Yinghai Lu)

  Miscellaneous
    - MAINTAINERS: Add Pratyush for SPEAr13xx and DesignWare PCIe (Pratyush Anand)"

* tag 'pci-v4.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Preserve resource size during alignment reordering
  PCI: Fix IOV resource sorting by alignment requirement
  MAINTAINERS: Add Pratyush Anand as SPEAr13xx and DesignWare PCIe maintainer

9 years agoMerge tag 'sound-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 5 Jun 2015 17:16:52 +0000 (10:16 -0700)]
Merge tag 'sound-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "It was a fairly calm week; here you can find only a few trivial quirks
  and fixes for USB and HD-audio.  All changes are pretty device
  specific"

* tag 'sound-4.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: fix missing input volume controls in MAYA44 USB(+)
  ALSA: usb-audio: add MAYA44 USB+ mixer control names
  ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420
  ALSA: hda - Fix jack detection at resume with VT codecs
  ALSA: usb-audio: don't try to get Outlaw RR2150 sample rate
  ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion
  ALSA: hda/realtek - Suooprt Dell headset mode for ALC256

9 years agoMerge tag 'iommu-fixes-v4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 5 Jun 2015 17:14:51 +0000 (10:14 -0700)]
Merge tag 'iommu-fixes-v4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu fix from Joerg Roedel:
 "Only one patch:

   - Revert "iommu/amd: Don't allocate with __GFP_ZERO in
     alloc_coherent".

     This patch caused problems with some drivers, so it is better to
     revert it now until the drivers have been fixed"

* tag 'iommu-fixes-v4.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  Revert "iommu/amd: Don't allocate with __GFP_ZERO in alloc_coherent"

9 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 5 Jun 2015 17:03:48 +0000 (10:03 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - early_idt_handlers[] fix that fixes the build with bleeding edge
     tooling

   - build warning fix on GCC 5.1

   - vm86 fix plus self-test to make it harder to break it again"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers
  x86/asm/entry/32, selftests: Add a selftest for kernel entries from VM86 mode
  x86/boot: Add CONFIG_PARAVIRT_SPINLOCKS quirk to arch/x86/boot/compressed/misc.h
  x86/asm/entry/32: Really make user_mode() work correctly for VM86 mode

9 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 5 Jun 2015 17:00:53 +0000 (10:00 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "The biggest chunk of the changes are two regression fixes: a HT
  workaround fix and an event-group scheduling fix.  It's been verified
  with 5 days of fuzzer testing.

  Other fixes:

   - eBPF fix
   - a BIOS breakage detection fix
   - PMU driver fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/pt: Fix a refactoring bug
  perf/x86: Tweak broken BIOS rules during check_hw_exists()
  perf/x86/intel/pt: Untangle pt_buffer_reset_markers()
  perf: Disallow sparse AUX allocations for non-SG PMUs in overwrite mode
  perf/x86: Improve HT workaround GP counter constraint
  perf/x86: Fix event/group validation
  perf: Fix race in BPF program unregister

9 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 5 Jun 2015 16:58:36 +0000 (09:58 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Just two small fixes: one radeon, one amdkfd"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/amdkfd: fix topology bug with capability attr.
  drm/radeon: use proper ACR regisiter for DCE3.2

9 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 5 Jun 2015 16:54:43 +0000 (09:54 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c bug fixes from Wolfram Sang:
 "Two small bugfixes for I2C"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: s3c2410: fix oops in suspend callback for non-dt platforms
  i2c: hix5hd2: Fix modalias to make module auto-loading work

9 years agoSUNRPC: Fix a backchannel race
Trond Myklebust [Thu, 4 Jun 2015 19:37:10 +0000 (15:37 -0400)]
SUNRPC: Fix a backchannel race

We need to allow the server to send a new request immediately after we've
replied to the previous one. Right now, there is a window between the
send and the release of the old request in rpc_put_task(), where the
server could send us a new backchannel RPC call, and we have no
request to service it.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
9 years agoSUNRPC: Clean up allocation and freeing of back channel requests
Trond Myklebust [Mon, 1 Jun 2015 19:05:38 +0000 (15:05 -0400)]
SUNRPC: Clean up allocation and freeing of back channel requests

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>