]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
11 years agoaio: don't include aio.h in sched.h
Kent Overstreet [Tue, 7 May 2013 23:19:08 +0000 (16:19 -0700)]
aio: don't include aio.h in sched.h

Faster kernel compiles by way of fewer unnecessary includes.

[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: kill ki_retry
Kent Overstreet [Tue, 7 May 2013 23:19:11 +0000 (16:19 -0700)]
aio: kill ki_retry

Thanks to Zach Brown's work to rip out the retry infrastructure, we don't
need this anymore - ki_retry was only called right after the kiocb was
initialized.

This also refactors and trims some duplicated code, as well as cleaning up
the refcounting/error handling a bit.

[akpm@linux-foundation.org: use fmode_t in aio_run_iocb()]
[akpm@linux-foundation.org: fix file_start_write/file_end_write tests]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: kill ki_key
Kent Overstreet [Tue, 7 May 2013 23:19:10 +0000 (16:19 -0700)]
aio: kill ki_key

ki_key wasn't actually used for anything previously - it was always 0.
Drop it to trim struct kiocb a bit.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: give shared kioctx fields their own cachelines
Kent Overstreet [Tue, 7 May 2013 23:18:56 +0000 (16:18 -0700)]
aio: give shared kioctx fields their own cachelines

[akpm@linux-foundation.org: make reqs_active __cacheline_aligned_in_smp]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: kill struct aio_ring_info
Kent Overstreet [Tue, 7 May 2013 23:18:55 +0000 (16:18 -0700)]
aio: kill struct aio_ring_info

struct aio_ring_info was kind of odd, the only place it's used is where
it's embedded in struct kioctx - there's no real need for it.

The next patch rearranges struct kioctx and puts various things on their
own cachelines - getting rid of struct aio_ring_info now makes that
reordering a bit clearer.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: kill batch allocation
Kent Overstreet [Tue, 7 May 2013 23:18:53 +0000 (16:18 -0700)]
aio: kill batch allocation

Previously, allocating a kiocb required touching quite a few global
(well, per kioctx) cachelines...  so batching up allocation to amortize
those was worthwhile.  But we've gotten rid of some of those, and in
another couple of patches kiocb allocation won't require writing to any
shared cachelines, so that means we can just rip this code out.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: change reqs_active to include unreaped completions
Kent Overstreet [Tue, 7 May 2013 23:18:51 +0000 (16:18 -0700)]
aio: change reqs_active to include unreaped completions

The aio code tries really hard to avoid having to deal with the
completion ringbuffer overflowing.  To do that, it has to keep track of
the number of outstanding kiocbs, and the number of completions
currently in the ringbuffer - and it's got to check that every time we
allocate a kiocb.  Ouch.

But - we can improve this quite a bit if we just change reqs_active to
mean "number of outstanding requests and unreaped completions" - that
means kiocb allocation doesn't have to look at the ringbuffer, which is
a fairly significant win.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: use cancellation list lazily
Kent Overstreet [Tue, 7 May 2013 23:18:49 +0000 (16:18 -0700)]
aio: use cancellation list lazily

Cancelling kiocbs requires adding them to a per kioctx linked list,
which is one of the few things we need to take the kioctx lock for in
the fast path.  But most kiocbs can't be cancelled - so if we just do
this lazily, we can avoid quite a bit of locking overhead.

While we're at it, instead of using a flag bit switch to using ki_cancel
itself to indicate that a kiocb has been cancelled/completed.  This lets
us get rid of ki_flags entirely.

[akpm@linux-foundation.org: remove buggy BUG()]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: use flush_dcache_page()
Kent Overstreet [Tue, 7 May 2013 23:18:47 +0000 (16:18 -0700)]
aio: use flush_dcache_page()

This wasn't causing problems before because it's not needed on x86, but
it is needed on other architectures.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: make aio_read_evt() more efficient, convert to hrtimers
Kent Overstreet [Tue, 7 May 2013 23:18:45 +0000 (16:18 -0700)]
aio: make aio_read_evt() more efficient, convert to hrtimers

Previously, aio_read_event() pulled a single completion off the
ringbuffer at a time, locking and unlocking each time.  Change it to
pull off as many events as it can at a time, and copy them directly to
userspace.

This also fixes a bug where if copying the event to userspace failed,
we'd lose the event.

Also convert it to wait_event_interruptible_hrtimeout(), which
simplifies it quite a bit.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agowait: add wait_event_hrtimeout()
Kent Overstreet [Tue, 7 May 2013 23:18:43 +0000 (16:18 -0700)]
wait: add wait_event_hrtimeout()

Analagous to wait_event_timeout() and friends, this adds
wait_event_hrtimeout() and wait_event_interruptible_hrtimeout().

Note that unlike the versions that use regular timers, these don't
return the amount of time remaining when they return - instead, they
return 0 or -ETIME if they timed out.  because I was uncomfortable with
the semantics of doing it the other way (that I could get it right,
anyways).

If the timer expires, there's no real guarantee that expire_time -
current_time would be <= 0 - due to timer slack certainly, and I'm not
sure I want to know the implications of the different clock bases in
hrtimers.

If the timer does expire and the code calculates that the time remaining
is nonnegative, that could be even worse if the calling code then reuses
that timeout.  Probably safer to just return 0 then, but I could imagine
weird bugs or at least unintended behaviour arising from that too.

I came to the conclusion that if other users end up actually needing the
amount of time remaining, the sanest thing to do would be to create a
version that uses absolute timeouts instead of relative.

[akpm@linux-foundation.org: fix description of `timeout' arg]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: refcounting cleanup
Kent Overstreet [Tue, 7 May 2013 23:18:41 +0000 (16:18 -0700)]
aio: refcounting cleanup

The usage of ctx->dead was fubar - it makes no sense to explicitly check
it all over the place, especially when we're already using RCU.

Now, ctx->dead only indicates whether we've dropped the initial
refcount. The new teardown sequence is:

  set ctx->dead
  hlist_del_rcu();
  synchronize_rcu();

Now we know no system calls can take a new ref, and it's safe to drop
the initial ref:

  put_ioctx();

We also need to ensure there are no more outstanding kiocbs.  This was
done incorrectly - it was being done in kill_ctx(), and before dropping
the initial refcount.  At this point, other syscalls may still be
submitting kiocbs!

Now, we cancel and wait for outstanding kiocbs in free_ioctx(), after
kioctx->users has dropped to 0 and we know no more iocbs could be
submitted.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: make aio_put_req() lockless
Kent Overstreet [Tue, 7 May 2013 23:18:39 +0000 (16:18 -0700)]
aio: make aio_put_req() lockless

Freeing a kiocb needed to touch the kioctx for three things:

 * Pull it off the reqs_active list
 * Decrementing reqs_active
 * Issuing a wakeup, if the kioctx was in the process of being freed.

This patch moves these to aio_complete(), for a couple reasons:

 * aio_complete() already has to issue the wakeup, so if we drop the
   kioctx refcount before aio_complete does its wakeup we don't have to
   do it twice.
 * aio_complete currently has to take the kioctx lock, so it makes sense
   for it to pull the kiocb off the reqs_active list too.
 * A later patch is going to change reqs_active to include unreaped
   completions - this will mean allocating a kiocb doesn't have to look
   at the ringbuffer. So taking the decrement of reqs_active out of
   kiocb_free() is useful prep work for that patch.

This doesn't really affect cancellation, since existing (usb) code that
implements a cancel function still calls aio_complete() - we just have
to make sure that aio_complete does the necessary teardown for cancelled
kiocbs.

It does affect code paths where we free kiocbs that were never
submitted; they need to decrement reqs_active and pull the kiocb off the
reqs_active list.  This occurs in two places: kiocb_batch_free(), which
is going away in a later patch, and the error path in io_submit_one.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: do fget() after aio_get_req()
Kent Overstreet [Tue, 7 May 2013 23:18:37 +0000 (16:18 -0700)]
aio: do fget() after aio_get_req()

aio_get_req() will fail if we have the maximum number of requests
outstanding, which depending on the application may not be uncommon.  So
avoid doing an unnecessary fget().

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: dprintk() -> pr_debug()
Kent Overstreet [Tue, 7 May 2013 23:18:35 +0000 (16:18 -0700)]
aio: dprintk() -> pr_debug()

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: move private stuff out of aio.h
Kent Overstreet [Tue, 7 May 2013 23:18:33 +0000 (16:18 -0700)]
aio: move private stuff out of aio.h

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: add kiocb_cancel()
Kent Overstreet [Tue, 7 May 2013 23:18:31 +0000 (16:18 -0700)]
aio: add kiocb_cancel()

Minor refactoring, to get rid of some duplicated code

[akpm@linux-foundation.org: fix warning]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: kill return value of aio_complete()
Kent Overstreet [Tue, 7 May 2013 23:18:29 +0000 (16:18 -0700)]
aio: kill return value of aio_complete()

Nothing used the return value, and it probably wasn't possible to use it
safely for the locked versions (aio_complete(), aio_put_req()).  Just
kill it.

Signed-off-by: Kent Overstreet <koverstreet@google.com>
Acked-by: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agochar: add aio_{read,write} to /dev/{null,zero}
Zach Brown [Tue, 7 May 2013 23:18:27 +0000 (16:18 -0700)]
char: add aio_{read,write} to /dev/{null,zero}

These are handy for measuring the cost of the aio infrastructure with
operations that do very little and complete immediately.

Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: remove retry-based AIO
Zach Brown [Tue, 7 May 2013 23:18:25 +0000 (16:18 -0700)]
aio: remove retry-based AIO

This removes the retry-based AIO infrastructure now that nothing in tree
is using it.

We want to remove retry-based AIO because it is fundemantally unsafe.
It retries IO submission from a kernel thread that has only assumed the
mm of the submitting task.  All other task_struct references in the IO
submission path will see the kernel thread, not the submitting task.
This design flaw means that nothing of any meaningful complexity can use
retry-based AIO.

This removes all the code and data associated with the retry machinery.
The most significant benefit of this is the removal of the locking
around the unused run list in the submission path.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Zach Brown <zab@redhat.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agogadget: remove only user of aio retry
Zach Brown [Tue, 7 May 2013 23:18:23 +0000 (16:18 -0700)]
gadget: remove only user of aio retry

This removes the only in-tree user of aio retry.  This will let us
remove the retry code from the aio core.

Removing retry is relatively easy as the USB gadget wasn't using it to
retry IOs at all.  It always fully submitted the IO in the context of
the initial io_submit() call.  It only used the AIO retry facility to
get the submitter's mm context for copying the result of a read back to
user space.  This is easy to implement with use_mm() and a work struct,
much like kvm does with async_pf_execute() for get_user_pages().

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoaio: remove dead code from aio.h
Zach Brown [Tue, 7 May 2013 23:18:21 +0000 (16:18 -0700)]
aio: remove dead code from aio.h

Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm: remove old aio use_mm() comment
Zach Brown [Tue, 7 May 2013 23:18:19 +0000 (16:18 -0700)]
mm: remove old aio use_mm() comment

Bunch of performance improvements and cleanups Zach Brown and I have
been working on.  The code should be pretty solid at this point, though
it could of course use more review and testing.

The results in my testing are pretty impressive, particularly when an
ioctx is being shared between multiple threads.  In my crappy synthetic
benchmark, with 4 threads submitting and one thread reaping completions,
I saw overhead in the aio code go from ~50% (mostly ioctx lock
contention) to low single digits.  Performance with ioctx per thread
improved too, but I'd have to rerun those benchmarks.

The reason I've been focused on performance when the ioctx is shared is
that for a fair number of real world completions, userspace needs the
completions aggregated somehow - in practice people just end up
implementing this aggregation in userspace today, but if it's done right
we can do it much more efficiently in the kernel.

Performance wise, the end result of this patch series is that submitting
a kiocb writes to _no_ shared cachelines - the penalty for sharing an
ioctx is gone there.  There's still going to be some cacheline
contention when we deliver the completions to the aio ringbuffer (at
least if you have interrupts being delivered on multiple cores, which
for high end stuff you do) but I have a couple more patches not in this
series that implement coalescing for that (by taking advantage of
interrupt coalescing).  With that, there's basically no bottlenecks or
performance issues to speak of in the aio code.

This patch:

use_mm() is used in more places than just aio.  There's no need to mention
callers when describing the function.

Signed-off-by: Zach Brown <zab@redhat.com>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm/vmalloc.c: add vfree comment
Andrew Morton [Tue, 7 May 2013 23:18:18 +0000 (16:18 -0700)]
mm/vmalloc.c: add vfree comment

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoremove unused random32() and srandom32()
Akinobu Mita [Tue, 7 May 2013 23:18:17 +0000 (16:18 -0700)]
remove unused random32() and srandom32()

After finishing a naming transition, remove unused backward
compatibility wrapper macros

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/infiniband/hw: rename random32() to prandom_u32()
Andrew Morton [Tue, 7 May 2013 23:18:16 +0000 (16:18 -0700)]
drivers/infiniband/hw: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random number
generator.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/net: rename random32() to prandom_u32()
Akinobu Mita [Tue, 7 May 2013 23:18:15 +0000 (16:18 -0700)]
drivers/net: rename random32() to prandom_u32()

Use preferable function name which implies using a pseudo-random number
generator.

[akpm@linux-foundation.org: convert team_mode_random.c]
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Thomas Sailer <t.sailer@alumni.ethz.ch>
Acked-by: Bing Zhao <bzhao@marvell.com> [mwifiex]
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Thomas Sailer <t.sailer@alumni.ethz.ch>
Cc: Jean-Paul Roubelat <jpr@f6fbb.org>
Cc: Bing Zhao <bzhao@marvell.com>
Cc: Brett Rudley <brudley@broadcom.com>
Cc: Arend van Spriel <arend@broadcom.com>
Cc: "Franky (Zhenhui) Lin" <frankyl@broadcom.com>
Cc: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agohugetlbfs: fix mmap failure in unaligned size request
Naoya Horiguchi [Tue, 7 May 2013 23:18:13 +0000 (16:18 -0700)]
hugetlbfs: fix mmap failure in unaligned size request

The current kernel returns -EINVAL unless a given mmap length is
"almost" hugepage aligned.  This is because in sys_mmap_pgoff() the
given length is passed to vm_mmap_pgoff() as it is without being aligned
with hugepage boundary.

This is a regression introduced in commit 40716e29243d ("hugetlbfs: fix
alignment of huge page requests"), where alignment code is pushed into
hugetlb_file_setup() and the variable len in caller side is not changed.

To fix this, this patch partially reverts that commit, and adds
alignment code in caller side.  And it also introduces hstate_sizelog()
in order to get proper hstate to specified hugepage size.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=56881

[akpm@linux-foundation.org: fix warning when CONFIG_HUGETLB_PAGE=n]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: <iceman_dvd@yahoo.com>
Cc: Steven Truelove <steven.truelove@utoronto.ca>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoparisc: remove the second argument of kmap_atomic()
Zhao Hongjiang [Tue, 7 May 2013 23:18:12 +0000 (16:18 -0700)]
parisc: remove the second argument of kmap_atomic()

kmap_atomic() requires only one argument now.

Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agodrivers/rtc/rtc-rs5c372.c: add R2221T/L variant to the driver
Lucas Stach [Tue, 7 May 2013 23:18:11 +0000 (16:18 -0700)]
drivers/rtc/rtc-rs5c372.c: add R2221T/L variant to the driver

Register layout is the same, so just add the variant to the appropriate
places.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoinclude/linux/mm.h: complete the mm_walk definition
Andrew Morton [Tue, 7 May 2013 23:18:10 +0000 (16:18 -0700)]
include/linux/mm.h: complete the mm_walk definition

That nameless-function-arguments thing drives me batty.  Fix.

Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm, memcg: add rss_huge stat to memory.stat
David Rientjes [Tue, 7 May 2013 23:18:09 +0000 (16:18 -0700)]
mm, memcg: add rss_huge stat to memory.stat

This exports the amount of anonymous transparent hugepages for each
memcg via the new "rss_huge" stat in memory.stat.  The units are in
bytes.

This is helpful to determine the hugepage utilization for individual
jobs on the system in comparison to rss and opportunities where
MADV_HUGEPAGE may be helpful.

The amount of anonymous transparent hugepages is also included in "rss"
for backwards compatibility.

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agomm/SPARC: use common help functions to free reserved pages
Jiang Liu [Tue, 7 May 2013 23:18:08 +0000 (16:18 -0700)]
mm/SPARC: use common help functions to free reserved pages

Use common help functions to free reserved pages.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penber...
Linus Torvalds [Tue, 7 May 2013 15:42:20 +0000 (08:42 -0700)]
Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux

Pull slab changes from Pekka Enberg:
 "The bulk of the changes are more slab unification from Christoph.

  There's also few fixes from Aaron, Glauber, and Joonsoo thrown into
  the mix."

* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (24 commits)
  mm, slab_common: Fix bootstrap creation of kmalloc caches
  slab: Return NULL for oversized allocations
  mm: slab: Verify the nodeid passed to ____cache_alloc_node
  slub: tid must be retrieved from the percpu area of the current processor
  slub: Do not dereference NULL pointer in node_match
  slub: add 'likely' macro to inc_slabs_node()
  slub: correct to calculate num of acquired objects in get_partial_node()
  slub: correctly bootstrap boot caches
  mm/sl[au]b: correct allocation type check in kmalloc_slab()
  slab: Fixup CONFIG_PAGE_ALLOC/DEBUG_SLAB_LEAK sections
  slab: Handle ARCH_DMA_MINALIGN correctly
  slab: Common definition for kmem_cache_node
  slab: Rename list3/l3 to node
  slab: Common Kmalloc cache determination
  stat: Use size_t for sizes instead of unsigned
  slab: Common function to create the kmalloc array
  slab: Common definition for the array of kmalloc caches
  slab: Common constants for kmalloc boundaries
  slab: Rename nodelists to node
  slab: Common name for the per node structures
  ...

11 years agoMerge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Tue, 7 May 2013 14:59:19 +0000 (07:59 -0700)]
Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull misc kbuild updates from Michal Marek:
 "Non-critical kbuild changes:

   - make coccicheck improvements, but no new semantic patches this time

   - make rpm improvements

   - make tar-pkg change to include the architecture in the filename.

     This is a deliberate incompatibility, but nobody has complained so
     far and it is useful if you build for different architectures.  It
     also matches what the deb-pkg and rpm-pkg targets produce.

   - kbuild documentation fix"

* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  rpm-pkg: Remove pointless set -e statements
  rpm-pkg: Always regenerate the specfile
  rpm-pkg: Do not write to the parent directory
  rpm-pkg: Do not package the whole source directory
  buildtar: Add ARCH to the archive name
  Coccinelle: Fix patch output when coccicheck is used with M= and C=
  Coccinelle: Add support to the SPFLAGS variable
  Coccinelle: Cleanup the setting of the FLAGS and OPTIONS variables
  Coccinelle: Restore coccicheck verbosity in ONLINE mode (C=1 or C=2)
  scripts/package/Makefile: compare objtree with srctree instead of test KBUILD_OUTPUT
  doc: change example to existing Makefile fragment
  scripts/tags.sh: Add magic for OFFSET and DEFINE

11 years agoMerge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Tue, 7 May 2013 14:58:05 +0000 (07:58 -0700)]
Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kconfig updates from Michal Marek:
 - use pkg-config to detect curses libraries
 - clean up the way curses headers are searched
 - Some randconfig fixes, of which one had to be reverted
 - KCONFIG_SEED for randconfig debugging
 - memuconfig memory leak plugged
 - menuconfig > breadcrumbs > navigation
 - xconfig compilation fix
 - Other minor fixes

* 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  kconfig: fix lists definition for C++
  Revert "kconfig: fix randomising choice entries in presence of KCONFIG_ALLCONFIG"
  kconfig: implement KCONFIG_PROBABILITY for randconfig
  kconfig: allow specifying the seed for randconfig
  kconfig: fix randomising choice entries in presence of KCONFIG_ALLCONFIG
  kconfig: do not override symbols already set
  kconfig: fix randconfig tristate detection
  kconfig/lxdialog: rationalise the include paths where to find {.n}curses{,w}.h
  menuconfig: Add "breadcrumbs" navigation aid
  menuconfig: Fix memory leak introduced by jump keys feature
  merge_config.sh: Avoid creating unnessary source softlinks
  kconfig: optionally use pkg-config to detect ncurses libs
  menuconfig: optionally use pkg-config to detect ncurses libs

11 years agoMerge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Linus Torvalds [Tue, 7 May 2013 14:56:26 +0000 (07:56 -0700)]
Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild

Pull kbuild changes from Michal Marek:
 "Kbuild commits for v3.10-rc1:

   - Fix make mrproper after mod/file2alias rework
   - Fix ld-option Makefile function
   - Rewrite headers_install to shell to drop Perl dependency.

  There are some more patches I have to look at, so I might send another
  pull request later.  Or just queue them for 3.11."

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
  Fix cleaning in scripts/mod
  headers_install.pl: convert to headers_install.sh
  kbuild: fix ld-option function

11 years agomenuconfig: fix NULL pointer dereference when searching a symbol
Li Zefan [Tue, 7 May 2013 13:56:54 +0000 (15:56 +0200)]
menuconfig: fix NULL pointer dereference when searching a symbol

Searching for PPC_EFIKA results in a segmentation fault, and it's
because get_symbol_prop() returns NULL.

In this case CONFIG_PPC_EFIKA is defined in arch/powerpc/platforms/
52xx/Kconfig, so it won't be parsed if ARCH!=PPC, but menuconfig knows
this symbol when it parses sound/soc/fsl/Kconfig:

    config SND_MPC52xx_SOC_EFIKA
        tristate "SoC AC97 Audio support for bbplan Efika and STAC9766"
        depends on PPC_EFIKA

This bug was introduced by commit bcdedcc1afd6 ("menuconfig: print more
info for symbol without prompts").

Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Tested-by: Libo Chen <libo.chen@huawei.com>
Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoe1000e: fix scheduling while atomic bug
Bruce Allan [Tue, 7 May 2013 05:52:47 +0000 (22:52 -0700)]
e1000e: fix scheduling while atomic bug

A scheduling while atomic bug was introduced recently (by commit
ce43a2168c59: "e1000e: cleanup USLEEP_RANGE checkpatch checks").

Revert the particular instance of usleep_range() which causes the bug.

Reported-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branch 'slab/next' into slab/for-linus
Pekka Enberg [Tue, 7 May 2013 06:19:47 +0000 (09:19 +0300)]
Merge branch 'slab/next' into slab/for-linus

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 6 May 2013 22:51:10 +0000 (15:51 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "Just a small pile of fixes"

 1) Fix race conditions in IP fragmentation LRU list handling, from
    Konstantin Khlebnikov.

 2) vfree() is no longer verboten in interrupts, so deferring is
    pointless, from Al Viro.

 3) Conversion from mutex to semaphore in netpoll left trylock test
    inverted, caught by Dan Carpenter.

 4) 3c59x uses wrong base address when releasing regions, from Sergei
    Shtylyov.

 5) Bounds checking in TIPC from Dan Carpenter.

 6) Fastopen cookies should not be expired as aggressively as other TCP
    metrics.  From Eric Dumazet.

 7) Fix retrieval of MAC address in ibmveth, from Ben Herrenschmidt.

 8) Don't use "u16" in virtio user headers, from Stephen Hemminger

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  tipc: potential divide by zero in tipc_link_recv_fragment()
  tipc: add a bounds check in link_recv_changeover_msg()
  net/usb: new driver for RTL8152
  3c59x: fix freeing nonexistent resource on driver unload
  netpoll: inverted down_trylock() test
  rps_dev_flow_table_release(): no need to delay vfree()
  fib_trie: no need to delay vfree()
  net: frag, fix race conditions in LRU list maintenance
  tcp: do not expire TCP fastopen cookies
  net/eth/ibmveth: Fixup retrieval of MAC address
  virtio: don't expose u16 in userspace api

11 years agoMerge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney...
Linus Torvalds [Mon, 6 May 2013 22:41:42 +0000 (15:41 -0700)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds

Pull LED subsystem updates from Bryan Wu:
 - move LED trigger drivers into a new directory
 - lp55xx common driver updates
 - other led drivers updates and bug fixing

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
  leds: leds-asic3: switch to using SIMPLE_DEV_PM_OPS
  leds: leds-bd2802: add CONFIG_PM_SLEEP to suspend/resume functions
  leds: lp55xx: configure the clock detection
  leds: lp55xx: use common clock framework when external clock is used
  leds: leds-ns2: fix oops at module removal
  leds: leds-pwm: Defer led_pwm_set() if PWM can sleep
  leds: lp55xx: fix the sysfs read operation
  leds: lm355x, lm3642: support camera LED triggers for flash and torch
  leds: add camera LED triggers
  leds: trigger: use inline functions instead of macros
  leds: tca6507: Use of_match_ptr() macro
  leds: wm8350: Complain if we fail to reenable DCDC
  leds: renesas: set gpio_request_one() flags param correctly
  leds: leds-ns2: set devm_gpio_request_one() flags param correctly
  leds: leds-lt3593: set devm_gpio_request_one() flags param correctly
  leds: leds-bd2802: remove erroneous __exit annotation
  leds: atmel-pwm: remove erroneous __exit annotation
  leds: move LED trigger drivers into new subdirectory
  leds: add new LP5562 LED driver

11 years agoMerge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux
Linus Torvalds [Mon, 6 May 2013 22:40:55 +0000 (15:40 -0700)]
Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux

Pull GPIO changes from Grant Likely:
 "The usual selection of bug fixes and driver updates for GPIO.  Nothing
  really stands out except the addition of the GRGPIO driver and some
  enhacements to ACPI support"

I'm pulling this despite the earlier mess.  Let's hope it compiles these
days.

* tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux: (46 commits)
  gpio: grgpio: Add irq support
  gpio: grgpio: Add device driver for GRGPIO cores
  gpiolib-acpi: introduce acpi_get_gpio_by_index() helper
  GPIO: gpio-generic: remove kfree() from bgpio_remove call
  gpio / ACPI: Handle ACPI events in accordance with the spec
  gpio: lpc32xx: Fix off-by-one valid range checking for bank
  gpio: mcp23s08: convert driver to DT
  gpio/omap: force restore if context loss is not detectable
  gpio/omap: optimise interrupt service routine
  gpio/omap: remove extra context restores in *_runtime_resume()
  gpio/omap: free irq domain in probe() failure paths
  gpio: gpio-generic: Add 16 and 32 bit big endian byte order support
  gpio: samsung: Add terminating entry for exynos_pinctrl_ids
  gpio: mvebu: add dbg_show function
  MAX7301 GPIO: Do not force SPI speed when using OF Platform
  gpio: gpio-tps65910.c: fix checkpatch error
  gpio: gpio-timberdale.c: fix checkpatch error
  gpio: gpio-tc3589x.c: fix checkpatch errors
  gpio: gpio-stp-xway.c: fix checkpatch error
  gpio: gpio-sch.c: fix checkpatch error
  ...

11 years agoMerge tag 'for-3.10-rc1' of git://gitorious.org/linux-pwm/linux-pwm
Linus Torvalds [Mon, 6 May 2013 22:32:36 +0000 (15:32 -0700)]
Merge tag 'for-3.10-rc1' of git://gitorious.org/linux-pwm/linux-pwm

Pull pwm changes from Thierry Reding:
 "Nothing very exciting this time around.  A couple of bug fixes and a
  lot of cleanup across the board.  The DaVinci 8xx family of SoCs now
  use the same driver as the AM33xx family.

  Many thanks to Axel Lin and Jingoo Han who have done a great job
  fixing various bugs and inconsistencies."

* tag 'for-3.10-rc1' of git://gitorious.org/linux-pwm/linux-pwm: (27 commits)
  pwm: lpc32xx: Don't change PWM_ENABLE bit in lpc32xx_pwm_config
  pwm: lpc32xx: Properly set PWM_ENABLE bit in lpc32xx_pwm_[enable|disable]
  pwm: Constify OF match tables
  pwm: pwm-tiehrpwm: Update device-tree binding document
  pwm: pwm-tiecap: Update device-tree binding document
  pwm: puv3: Remove unused enabled filed from struct puv3_pwm_chip
  pwm: pxa: Remove PWM_ID_BASE macro
  pwm: spear: Remove unused *dev from struct spear_pwm_chip
  pwm: mxs: Remove unused *dev from struct mxs_pwm_chip
  pwm: twl: Return proper error if twl6030_pwm_enable() fails
  pwm: pxa: Remove clk_enabled field from struct pxa_pwm_chip
  pwm: imx: Remove enabled field from struct imx_chip
  pwm: twl: Add .owner to struct pwm_ops
  pwm: twl-led: Add .owner to struct pwm_ops
  pwm: atmel-tcb: Add .owner to struct pwm_ops
  pwm: ab8500: Add .owner to struct pwm_ops
  pwm: spear: Fix checking return value of clk_enable() and clk_prepare()
  pwm: tiehrpwm: Staticize non-exported symbols
  pwm: tiecap: Staticize non-exported symbols
  pwm: ab8500: Fix trivial typo in dev_err message
  ...

11 years agoMerge tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 6 May 2013 21:59:13 +0000 (14:59 -0700)]
Merge tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:
 "The updates are mostly about the x86 IOMMUs this time.

  Exceptions are the groundwork for the PAMU IOMMU from Freescale (for a
  PPC platform) and an extension to the IOMMU group interface.

  On the x86 side this includes a workaround for VT-d to disable
  interrupt remapping on broken chipsets.  On the AMD-Vi side the most
  important new feature is a kernel command-line interface to override
  broken information in IVRS ACPI tables and get interrupt remapping
  working this way.

  Besides that there are small fixes all over the place."

* tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (24 commits)
  iommu/tegra: Fix printk formats for dma_addr_t
  iommu: Add a function to find an iommu group by id
  iommu/vt-d: Remove warning for HPET scope type
  iommu: Move swap_pci_ref function to drivers/iommu/pci.h.
  iommu/vt-d: Disable translation if already enabled
  iommu/amd: fix error return code in early_amd_iommu_init()
  iommu/AMD: Per-thread IOMMU Interrupt Handling
  iommu: Include linux/err.h
  iommu/amd: Workaround for ERBT1312
  iommu/amd: Document ivrs_ioapic and ivrs_hpet parameters
  iommu/amd: Don't report firmware bugs with cmd-line ivrs overrides
  iommu/amd: Add ioapic and hpet ivrs override
  iommu/amd: Add early maps for ioapic and hpet
  iommu/amd: Extend IVRS special device data structure
  iommu/amd: Move add_special_device() to __init
  iommu: Fix compile warnings with forward declarations
  iommu/amd: Properly initialize irq-table lock
  iommu/amd: Use AMD specific data structure for irq remapping
  iommu/amd: Remove map_sg_no_iommu()
  iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets
  ...

11 years agoFix cleaning in scripts/mod
Andreas Schwab [Sat, 4 May 2013 14:32:53 +0000 (16:32 +0200)]
Fix cleaning in scripts/mod

Make sure devicetable-offsets.h is cleaned in the scripts/mod directory

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
11 years agomm, slab_common: Fix bootstrap creation of kmalloc caches
Christoph Lameter [Fri, 3 May 2013 18:04:18 +0000 (18:04 +0000)]
mm, slab_common: Fix bootstrap creation of kmalloc caches

For SLAB the kmalloc caches must be created in ascending sizes in order
for the OFF_SLAB sub-slab cache to work properly.

Create the non power of two caches immediately after the prior power of
two kmalloc cache. Do not create the non power of two caches before all
other caches.

Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Christoph Lamete <cl@linux.com>
Link: http://lkml.kernel.org/r/201305040348.CIF81716.OStQOHFJMFLOVF@I-love.SAKURA.ne.jp
Signed-off-by: Pekka Enberg <penberg@kernel.org>
11 years agotipc: potential divide by zero in tipc_link_recv_fragment()
Dan Carpenter [Mon, 6 May 2013 09:31:17 +0000 (09:31 +0000)]
tipc: potential divide by zero in tipc_link_recv_fragment()

The worry here is that fragm_sz could be zero since it comes from
skb->data.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agotipc: add a bounds check in link_recv_changeover_msg()
Dan Carpenter [Mon, 6 May 2013 08:28:41 +0000 (08:28 +0000)]
tipc: add a bounds check in link_recv_changeover_msg()

The bearer_id here comes from skb->data and it can be a number from 0 to
7.  The problem is that the ->links[] array has only 2 elements so I
have added a range check.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/usb: new driver for RTL8152
hayeswang [Thu, 2 May 2013 16:01:25 +0000 (16:01 +0000)]
net/usb: new driver for RTL8152

Add new driver for supporting Realtek RTL8152 Based USB 2.0 Ethernet Adapters

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph...
Linus Torvalds [Mon, 6 May 2013 20:11:19 +0000 (13:11 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph changes from Alex Elder:
 "This is a big pull.

  Most of it is culmination of Alex's work to implement RBD image
  layering, which is now complete (yay!).

  There is also some work from Yan to fix i_mutex behavior surrounding
  writes in cephfs, a sync write fix, a fix for RBD images that get
  resized while they are mapped, and a few patches from me that resolve
  annoying auth warnings and fix several bugs in the ceph auth code."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (254 commits)
  rbd: fix image request leak on parent read
  libceph: use slab cache for osd client requests
  libceph: allocate ceph message data with a slab allocator
  libceph: allocate ceph messages with a slab allocator
  rbd: allocate image object names with a slab allocator
  rbd: allocate object requests with a slab allocator
  rbd: allocate name separate from obj_request
  rbd: allocate image requests with a slab allocator
  rbd: use binary search for snapshot lookup
  rbd: clear EXISTS flag if mapped snapshot disappears
  rbd: kill off the snapshot list
  rbd: define rbd_snap_size() and rbd_snap_features()
  rbd: use snap_id not index to look up snap info
  rbd: look up snapshot name in names buffer
  rbd: drop obj_request->version
  rbd: drop rbd_obj_method_sync() version parameter
  rbd: more version parameter removal
  rbd: get rid of some version parameters
  rbd: stop tracking header object version
  rbd: snap names are pointer to constant data
  ...

11 years agoMerge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Mon, 6 May 2013 20:07:47 +0000 (13:07 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French:
 "A set of cifs cleanup fixes.

  The only big one of this set optimizes the cifs error logging,
  renaming cFYI and cERROR macros to cifs_dbg, and in the process makes
  it clearer and reduces module size."

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: small variable name cleanup
  CIFS: fix error return code in cifs_atomic_open()
  cifs: store the real expected sequence number in the mid
  cifs: on send failure, readjust server sequence number downward
  cifs: remove ENOSPC handling in smb_sendv
  [CIFS] cifs: Rename cERROR and cFYI to cifs_dbg
  fs: cifs: use kmemdup instead of kmalloc + memcpy
  cifs: replaced kmalloc + memset with kzalloc
  cifs: ignore the unc= and prefixpath= mount options

11 years agoautofs - remove autofs dentry mount check
David Jeffery [Mon, 6 May 2013 05:49:30 +0000 (13:49 +0800)]
autofs - remove autofs dentry mount check

When checking if an autofs mount point is busy it isn't sufficient to
only check if it's a mount point.

For example, if the mount of an offset mountpoint in a tree is denied
for this host by its export and the dentry becomes a process working
directory the check incorrectly returns the mount as not in use at
expire.

This can happen since the default when mounting within a tree is
nostrict, which means ingnore mount fails on mounts within the tree and
continue.  The nostrict option is meant to allow mounting in this case.

Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoautofs - fix sparse warning for autofs4_d_manage()
Claudiu Ghioc [Mon, 6 May 2013 05:47:16 +0000 (13:47 +0800)]
autofs - fix sparse warning for autofs4_d_manage()

Fixed the sparse warning:

  fs/autofs4/root.c:411:5: warning: symbol 'autofs4_d_manage' was not declared. Should it be static?"

[ Clearly it should be static as the function is declared static at the
  top of root.c.  - imk ]

Signed-off-by: Claudiu Ghioc <claudiu.ghioc@gmail.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Mon, 6 May 2013 19:34:53 +0000 (12:34 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Martin Schwidefsky:
 "This is the second batch of s390 patches for the 3.10 merge window.

  Heiko improved the memory detection, this fixes kdump for large memory
  sizes.  Some kvm related memory management work, new ipldev/condev
  keywords in cio and bug fixes."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mem_detect: remove artificial kdump memory types
  s390/mm: add pte invalidation notifier for kvm
  s390/zcrypt: ap bus rescan problem when toggle crypto adapters on/off
  s390/memory hotplug,sclp: get rid of per memory increment usecount
  s390/memory hotplug: provide memory_block_size_bytes() function
  s390/mem_detect: limit memory detection loop to "mem=" parameter
  s390/kdump,bootmem: fix bootmem allocator bitmap size
  s390: get rid of odd global real_memory_size
  s390/kvm: Change the virtual memory mapping location for Virtio devices
  s390/zcore: calculate real memory size using own get_mem_size function
  s390/mem_detect: add DAT sanity check
  s390/mem_detect: fix lockdep irq tracing
  s390/mem_detect: move memory detection code to mm folder
  s390/zfcpdump: exploit new cio_ignore keywords
  s390/cio: add condev keyword to cio_ignore
  s390/cio: add ipldev keyword to cio_ignore
  s390/uaccess: add "fallthrough" comments

11 years ago3c59x: fix freeing nonexistent resource on driver unload
Sergei Shtylyov [Thu, 2 May 2013 11:10:22 +0000 (11:10 +0000)]
3c59x: fix freeing nonexistent resource on driver unload

When unloading the driver that drives an EISA board, a message similar to the
following one is displayed:

Trying to free nonexistent resource <0000000000013000-000000000001301f>

Then an user is unable to reload the driver because the resource it requested in
the previous load hasn't been freed. This happens most probably due to a typo in
vortex_eisa_remove() which calls release_region() with 'dev->base_addr'  instead
of 'edev->base_addr'...

Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Tested-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonetpoll: inverted down_trylock() test
Dan Carpenter [Mon, 6 May 2013 02:15:13 +0000 (02:15 +0000)]
netpoll: inverted down_trylock() test

The return value is reversed from mutex_trylock().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agorps_dev_flow_table_release(): no need to delay vfree()
Al Viro [Sun, 5 May 2013 16:05:55 +0000 (16:05 +0000)]
rps_dev_flow_table_release(): no need to delay vfree()

The same story as with fib_trie patch - vfree() from RCU callbacks
is legitimate now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agofib_trie: no need to delay vfree()
Al Viro [Sun, 5 May 2013 16:03:46 +0000 (16:03 +0000)]
fib_trie: no need to delay vfree()

Now that vfree() can be called from interrupt contexts, there's no
need to play games with schedule_work() to escape calling vfree()
from RCU callbacks.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet: frag, fix race conditions in LRU list maintenance
Konstantin Khlebnikov [Sun, 5 May 2013 04:56:22 +0000 (04:56 +0000)]
net: frag, fix race conditions in LRU list maintenance

This patch fixes race between inet_frag_lru_move() and inet_frag_lru_add()
which was introduced in commit 3ef0eb0db4bf92c6d2510fe5c4dc51852746f206
("net: frag, move LRU list maintenance outside of rwlock")

One cpu already added new fragment queue into hash but not into LRU.
Other cpu found it in hash and tries to move it to the end of LRU.
This leads to NULL pointer dereference inside of list_move_tail().

Another possible race condition is between inet_frag_lru_move() and
inet_frag_lru_del(): move can happens after deletion.

This patch initializes LRU list head before adding fragment into hash and
inet_frag_lru_move() doesn't touches it if it's empty.

I saw this kernel oops two times in a couple of days.

[119482.128853] BUG: unable to handle kernel NULL pointer dereference at           (null)
[119482.132693] IP: [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.136456] PGD 2148f6067 PUD 215ab9067 PMD 0
[119482.140221] Oops: 0000 [#1] SMP
[119482.144008] Modules linked in: vfat msdos fat 8021q fuse nfsd auth_rpcgss nfs_acl nfs lockd sunrpc ppp_async ppp_generic bridge slhc stp llc w83627ehf hwmon_vid snd_hda_codec_hdmi snd_hda_codec_realtek kvm_amd k10temp kvm snd_hda_intel snd_hda_codec edac_core radeon snd_hwdep ath9k snd_pcm ath9k_common snd_page_alloc ath9k_hw snd_timer snd soundcore drm_kms_helper ath ttm r8169 mii
[119482.152692] CPU 3
[119482.152721] Pid: 20, comm: ksoftirqd/3 Not tainted 3.9.0-zurg-00001-g9f95269 #132 To Be Filled By O.E.M. To Be Filled By O.E.M./RS880D
[119482.161478] RIP: 0010:[<ffffffff812ede89>]  [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.166004] RSP: 0018:ffff880216d5db58  EFLAGS: 00010207
[119482.170568] RAX: 0000000000000000 RBX: ffff88020882b9c0 RCX: dead000000200200
[119482.175189] RDX: 0000000000000000 RSI: 0000000000000880 RDI: ffff88020882ba00
[119482.179860] RBP: ffff880216d5db58 R08: ffffffff8155c7f0 R09: 0000000000000014
[119482.184570] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88020882ba00
[119482.189337] R13: ffffffff81c8d780 R14: ffff880204357f00 R15: 00000000000005a0
[119482.194140] FS:  00007f58124dc700(0000) GS:ffff88021fcc0000(0000) knlGS:0000000000000000
[119482.198928] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[119482.203711] CR2: 0000000000000000 CR3: 00000002155f0000 CR4: 00000000000007e0
[119482.208533] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[119482.213371] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[119482.218221] Process ksoftirqd/3 (pid: 20, threadinfo ffff880216d5c000, task ffff880216d3a9a0)
[119482.223113] Stack:
[119482.228004]  ffff880216d5dbd8 ffffffff8155dcda 0000000000000000 ffff000200000001
[119482.233038]  ffff8802153c1f00 ffff880000289440 ffff880200000014 ffff88007bc72000
[119482.238083]  00000000000079d5 ffff88007bc72f44 ffffffff00000002 ffff880204357f00
[119482.243090] Call Trace:
[119482.248009]  [<ffffffff8155dcda>] ip_defrag+0x8fa/0xd10
[119482.252921]  [<ffffffff815a8013>] ipv4_conntrack_defrag+0x83/0xe0
[119482.257803]  [<ffffffff8154485b>] nf_iterate+0x8b/0xa0
[119482.262658]  [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
[119482.267527]  [<ffffffff815448e4>] nf_hook_slow+0x74/0x130
[119482.272412]  [<ffffffff8155c7f0>] ? inet_del_offload+0x40/0x40
[119482.277302]  [<ffffffff8155d068>] ip_rcv+0x268/0x320
[119482.282147]  [<ffffffff81519992>] __netif_receive_skb_core+0x612/0x7e0
[119482.286998]  [<ffffffff81519b78>] __netif_receive_skb+0x18/0x60
[119482.291826]  [<ffffffff8151a650>] process_backlog+0xa0/0x160
[119482.296648]  [<ffffffff81519f29>] net_rx_action+0x139/0x220
[119482.301403]  [<ffffffff81053707>] __do_softirq+0xe7/0x220
[119482.306103]  [<ffffffff81053868>] run_ksoftirqd+0x28/0x40
[119482.310809]  [<ffffffff81074f5f>] smpboot_thread_fn+0xff/0x1a0
[119482.315515]  [<ffffffff81074e60>] ? lg_local_lock_cpu+0x40/0x40
[119482.320219]  [<ffffffff8106d870>] kthread+0xc0/0xd0
[119482.324858]  [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
[119482.329460]  [<ffffffff816c32dc>] ret_from_fork+0x7c/0xb0
[119482.334057]  [<ffffffff8106d7b0>] ? insert_kthread_work+0x40/0x40
[119482.338661] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de 48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48 39 c8 74 7a <4c> 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89 42 08
[119482.343787] RIP  [<ffffffff812ede89>] __list_del_entry+0x29/0xd0
[119482.348675]  RSP <ffff880216d5db58>
[119482.353493] CR2: 0000000000000000

Oops happened on this path:
ip_defrag() -> ip_frag_queue() -> inet_frag_lru_move() -> list_move_tail() -> __list_del_entry()

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoslab: Return NULL for oversized allocations
Christoph Lameter [Fri, 3 May 2013 15:43:18 +0000 (15:43 +0000)]
slab: Return NULL for oversized allocations

The inline path seems to have changed the SLAB behavior for very large
kmalloc allocations with  commit e3366016 ("slab: Use common
kmalloc_index/kmalloc_size functions"). This patch restores the old
behavior but also adds diagnostics so that we can figure where in the
code these large allocations occur.

Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Christoph Lameter <cl@linux.com>
Link: http://lkml.kernel.org/r/201305040348.CIF81716.OStQOHFJMFLOVF@I-love.SAKURA.ne.jp
[ penberg@kernel.org: use WARN_ON_ONCE ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
11 years agoMerge tag 'mfd-3.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd...
Linus Torvalds [Mon, 6 May 2013 00:36:20 +0000 (17:36 -0700)]
Merge tag 'mfd-3.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-next

Pull MFD update from Samuel Ortiz:
 "For 3.10 we have a few new MFD drivers for:

   - The ChromeOS embedded controller which provides keyboard, battery
     and power management services.  This controller is accessible
     through i2c or SPI.

   - Silicon Laboratories 476x controller, providing access to their FM
     chipset and their audio codec.

   - Realtek's RTS5249, a memory stick, MMC and SD/SDIO PCI based
     reader.

   - Nokia's Tahvo power button and watchdog device.  This device is
     very similar to Retu and is thus supported by the same code base.

   - STMicroelectronics STMPE1801, a keyboard and GPIO controller
     supported by the stmpe driver.

   - ST-Ericsson AB8540 and AB8505 power management and voltage
     converter controllers through the existing ab8500 code.

  Some other drivers got cleaned up or improved.  In particular:

   - The Linaro/STE guys got the ab8500 driver in sync with their
     internal code through a series of optimizations, fixes and
     improvements.

   - The AS3711 and OMAP USB drivers now have DT support.

   - The arizona clock and interrupt handling code got improved.

   - The wm5102 register patch and boot mechanism also got improved."

* tag 'mfd-3.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-next: (104 commits)
  mfd: si476x: Don't use 0bNNN
  mfd: vexpress: Handle pending config transactions
  mfd: ab8500: Export ab8500_gpadc_sw_hw_convert properly
  mfd: si476x: Fix i2c warning
  mfd: si476x: Add header files and Kbuild plumbing
  mfd: si476x: Add chip properties handling code
  mfd: si476x: Add the bulk of the core driver
  mfd: si476x: Add commands abstraction layer
  mfd: rtsx: Support RTS5249
  mfd: retu: Add Tahvo support
  mfd: ucb1400: Pass ucb1400-gpio data through ac97 bus
  mfd: wm8994: Add some OF properties
  mfd: wm8994: Add device ID data to WM8994 OF device IDs
  input: Export matrix_keypad_parse_of_params()
  mfd: tps65090: Add compatible string for charger subnode
  mfd: db8500-prcmu: Support platform dependant device selection
  mfd: syscon: Fix warnings when printing resource_size_t
  of: Add stub of_get_parent for non-OF builds
  mfd: omap-usb-tll: Convert to devm_ioremap_resource()
  mfd: omap-usb-host: Convert to devm_ioremap_resource()
  ...

11 years agoMerge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 5 May 2013 21:47:31 +0000 (14:47 -0700)]
Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Gleb Natapov:
 "Highlights of the updates are:

  general:
   - new emulated device API
   - legacy device assignment is now optional
   - irqfd interface is more generic and can be shared between arches

  x86:
   - VMCS shadow support and other nested VMX improvements
   - APIC virtualization and Posted Interrupt hardware support
   - Optimize mmio spte zapping

  ppc:
    - BookE: in-kernel MPIC emulation with irqfd support
    - Book3S: in-kernel XICS emulation (incomplete)
    - Book3S: HV: migration fixes
    - BookE: more debug support preparation
    - BookE: e6500 support

  ARM:
   - reworking of Hyp idmaps

  s390:
   - ioeventfd for virtio-ccw

  And many other bug fixes, cleanups and improvements"

* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  kvm: Add compat_ioctl for device control API
  KVM: x86: Account for failing enable_irq_window for NMI window request
  KVM: PPC: Book3S: Add API for in-kernel XICS emulation
  kvm/ppc/mpic: fix missing unlock in set_base_addr()
  kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
  kvm/ppc/mpic: remove users
  kvm/ppc/mpic: fix mmio region lists when multiple guests used
  kvm/ppc/mpic: remove default routes from documentation
  kvm: KVM_CAP_IOMMU only available with device assignment
  ARM: KVM: iterate over all CPUs for CPU compatibility check
  KVM: ARM: Fix spelling in error message
  ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
  KVM: ARM: Fix API documentation for ONE_REG encoding
  ARM: KVM: promote vfp_host pointer to generic host cpu context
  ARM: KVM: add architecture specific hook for capabilities
  ARM: KVM: perform HYP initilization for hotplugged CPUs
  ARM: KVM: switch to a dual-step HYP init code
  ARM: KVM: rework HYP page table freeing
  ARM: KVM: enforce maximum size for identity mapped code
  ARM: KVM: move to a KVM provided HYP idmap
  ...

11 years agoGive the OID registry file module info to avoid kernel tainting
David Howells [Sat, 4 May 2013 07:48:27 +0000 (08:48 +0100)]
Give the OID registry file module info to avoid kernel tainting

Give the OID registry file module information so that it doesn't taint the
kernel when compiled as a module and loaded.

Reported-by: Dros Adamson <Weston.Adamson@netapp.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <Trond.Myklebust@netapp.com>
cc: stable@vger.kernel.org
cc: linux-nfs@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 years agotcp: do not expire TCP fastopen cookies
Eric Dumazet [Fri, 3 May 2013 19:12:45 +0000 (19:12 +0000)]
tcp: do not expire TCP fastopen cookies

TCP metric cache expires entries after one hour.

This probably make sense for TCP RTT/RTTVAR/CWND, but not
for TCP fastopen cookies.

Its better to try previous cookie. If it appears to be obsolete,
server will send us new cookie anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agonet/eth/ibmveth: Fixup retrieval of MAC address
Benjamin Herrenschmidt [Fri, 3 May 2013 17:19:01 +0000 (17:19 +0000)]
net/eth/ibmveth: Fixup retrieval of MAC address

Some ancient pHyp versions used to create a 8 bytes local-mac-address
property in the device-tree instead of a 6 bytes one for veth.

The Linux driver code to deal with that is an insane hack which also
happens to break with some choices of MAC addresses in qemu by testing
for a bit in the address rather than just looking at the size of the
property.

Sanitize this by doing the latter instead.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agovirtio: don't expose u16 in userspace api
stephen hemminger [Fri, 3 May 2013 14:49:41 +0000 (14:49 +0000)]
virtio: don't expose u16 in userspace api

Programs using virtio headers outside of kernel will no longer
build because u16 type does not exist in userspace. All user ABI
must use __u16 typedef instead.

Bug introduce by:
  commit 986a4f4d452dec004697f667439d27c3fda9c928
  Author: Jason Wang <jasowang@redhat.com>
  Date:   Fri Dec 7 07:04:56 2012 +0000

    virtio_net: multiqueue support

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 years agoMerge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 5 May 2013 20:23:27 +0000 (13:23 -0700)]
Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull 'full dynticks' support from Ingo Molnar:
 "This tree from Frederic Weisbecker adds a new, (exciting! :-) core
  kernel feature to the timer and scheduler subsystems: 'full dynticks',
  or CONFIG_NO_HZ_FULL=y.

  This feature extends the nohz variable-size timer tick feature from
  idle to busy CPUs (running at most one task) as well, potentially
  reducing the number of timer interrupts significantly.

  This feature got motivated by real-time folks and the -rt tree, but
  the general utility and motivation of full-dynticks runs wider than
  that:

   - HPC workloads get faster: CPUs running a single task should be able
     to utilize a maximum amount of CPU power.  A periodic timer tick at
     HZ=1000 can cause a constant overhead of up to 1.0%.  This feature
     removes that overhead - and speeds up the system by 0.5%-1.0% on
     typical distro configs even on modern systems.

   - Real-time workload latency reduction: CPUs running critical tasks
     should experience as little jitter as possible.  The last remaining
     source of kernel-related jitter was the periodic timer tick.

   - A single task executing on a CPU is a pretty common situation,
     especially with an increasing number of cores/CPUs, so this feature
     helps desktop and mobile workloads as well.

  The cost of the feature is mainly related to increased timer
  reprogramming overhead when a CPU switches its tick period, and thus
  slightly longer to-idle and from-idle latency.

  Configuration-wise a third mode of operation is added to the existing
  two NOHZ kconfig modes:

   - CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named
     as a config option.  This is the traditional Linux periodic tick
     design: there's a HZ tick going on all the time, regardless of
     whether a CPU is idle or not.

   - CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the
     periodic tick when a CPU enters idle mode.

   - CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the
     tick when a CPU is idle, also slows the tick down to 1 Hz (one
     timer interrupt per second) when only a single task is running on a
     CPU.

  The .config behavior is compatible: existing !CONFIG_NO_HZ and
  CONFIG_NO_HZ=y settings get translated to the new values, without the
  user having to configure anything.  CONFIG_NO_HZ_FULL is turned off by
  default.

  This feature is based on a lot of infrastructure work that has been
  steadily going upstream in the last 2-3 cycles: related RCU support
  and non-periodic cputime support in particular is upstream already.

  This tree adds the final pieces and activates the feature.  The pull
  request is marked RFC because:

   - it's marked 64-bit only at the moment - the 32-bit support patch is
     small but did not get ready in time.

   - it has a number of fresh commits that came in after the merge
     window.  The overwhelming majority of commits are from before the
     merge window, but still some aspects of the tree are fresh and so I
     marked it RFC.

   - it's a pretty wide-reaching feature with lots of effects - and
     while the components have been in testing for some time, the full
     combination is still not very widely used.  That it's default-off
     should reduce its regression abilities and obviously there are no
     known regressions with CONFIG_NO_HZ_FULL=y enabled either.

   - the feature is not completely idempotent: there is no 100%
     equivalent replacement for a periodic scheduler/timer tick.  In
     particular there's ongoing work to map out and reduce its effects
     on scheduler load-balancing and statistics.  This should not impact
     correctness though, there are no known regressions related to this
     feature at this point.

   - it's a pretty ambitious feature that with time will likely be
     enabled by most Linux distros, and we'd like you to make input on
     its design/implementation, if you dislike some aspect we missed.
     Without flaming us to crisp! :-)

  Future plans:

   - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off
     the periodic tick altogether when there's a single busy task on a
     CPU.  We'd first like 1 Hz to be exposed more widely before we go
     for the 0 Hz target though.

   - once we reach 0 Hz we can remove the periodic tick assumption from
     nr_running>=2 as well, by essentially interrupting busy tasks only
     as frequently as the sched_latency constraints require us to do -
     once every 4-40 msecs, depending on nr_running.

  I am personally leaning towards biting the bullet and doing this in
  v3.10, like the -rt tree this effort has been going on for too long -
  but the final word is up to you as usual.

  More technical details can be found in Documentation/timers/NO_HZ.txt"

* 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
  sched: Keep at least 1 tick per second for active dynticks tasks
  rcu: Fix full dynticks' dependency on wide RCU nocb mode
  nohz: Protect smp_processor_id() in tick_nohz_task_switch()
  nohz_full: Add documentation.
  cputime_nsecs: use math64.h for nsec resolution conversion helpers
  nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config
  nohz: Reduce overhead under high-freq idling patterns
  nohz: Remove full dynticks' superfluous dependency on RCU tree
  nohz: Fix unavailable tick_stop tracepoint in dynticks idle
  nohz: Add basic tracing
  nohz: Select wide RCU nocb for full dynticks
  nohz: Disable the tick when irq resume in full dynticks CPU
  nohz: Re-evaluate the tick for the new task after a context switch
  nohz: Prepare to stop the tick on irq exit
  nohz: Implement full dynticks kick
  nohz: Re-evaluate the tick from the scheduler IPI
  sched: New helper to prevent from stopping the tick in full dynticks
  sched: Kick full dynticks CPU that have more than one task enqueued.
  perf: New helper to prevent full dynticks CPUs from stopping tick
  perf: Kick full dynticks CPU if events rotation is needed
  ...

11 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 5 May 2013 18:37:16 +0000 (11:37 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Misc fixes plus a small hw-enablement patch for Intel IB model 58
  uncore events"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL
  perf/x86/intel/lbr: Fix LBR filter
  perf/x86: Blacklist all MEM_*_RETIRED events for Ivy Bridge
  perf: Fix vmalloc ring buffer pages handling
  perf/x86/intel: Fix unintended variable name reuse
  perf/x86/intel: Add support for IvyBridge model 58 Uncore
  perf/x86/intel: Fix typo in perf_event_intel_uncore.c
  x86: Eliminate irq_mis_count counted in arch_irq_stat

11 years agoMerge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 5 May 2013 17:58:06 +0000 (10:58 -0700)]
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull mudule updates from Rusty Russell:
 "We get rid of the general module prefix confusion with a binary config
  option, fix a remove/insert race which Never Happens, and (my
  favorite) handle the case when we have too many modules for a single
  commandline.  Seriously, the kernel is full, please go away!"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  modpost: fix unwanted VMLINUX_SYMBOL_STR expansion
  X.509: Support parse long form of length octets in Authority Key Identifier
  module: don't unlink the module until we've removed all exposure.
  kernel: kallsyms: memory override issue, need check destination buffer length
  MODSIGN: do not send garbage to stderr when enabling modules signature
  modpost: handle huge numbers of modules.
  modpost: add -T option to read module names from file/stdin.
  modpost: minor cleanup.
  genksyms: pass symbol-prefix instead of arch
  module: fix symbol versioning with symbol prefixes
  CONFIG_SYMBOL_PREFIX: cleanup.

11 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Sun, 5 May 2013 17:35:26 +0000 (10:35 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull single_open() leak fixes from Al Viro:
 "A bunch of fixes for a moderately common class of bugs: file with
  single_open() done by its ->open() and seq_release as its ->release().

  That leaks; fortunately, it's not _too_ common (either people manage
  to RTFM that says "When using single_open(), the programmer should use
  single_release() instead of seq_release() in the file_operations
  structure to avoid a memory leak", or they just copy a correct
  instance), but grepping through the tree has caught quite a pile.

  All of that is, AFAICS, -stable fodder, for as far as the patches
  apply.  I tried to carve it up into reasonably-sized pieces (more or
  less "comes from the same tree")"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  rcutrace: single_open() leaks
  gadget: single_open() leaks
  staging: single_open() leaks
  megaraid: single_open() leak
  wireless: single_open() leaks
  input: single_open() leak
  rtc: single_open() leaks
  ds1620: single_open() leak
  sh: single_open() leaks
  parisc: single_open() leaks
  mips: single_open() leaks
  ia64: single_open() leaks
  h8300: single_open() leaks
  cris: single_open() leaks
  arm: single_open() leaks

11 years agoMerge branch 'ipc-cleanups'
Linus Torvalds [Sun, 5 May 2013 17:13:44 +0000 (10:13 -0700)]
Merge branch 'ipc-cleanups'

Merge ipc fixes and cleanups from my IPC branch.

The ipc locking has always been pretty ugly, and the scalability fixes
to some degree made it even less readable.  We had two cases of double
unlocks in error paths due to this (one rcu read unlock, one semaphore
unlock), and this fixes the bugs I found while trying to clean things up
a bit so that we are less likely to have more.

* ipc-cleanups:
  ipc: simplify rcu_read_lock() in semctl_nolock()
  ipc: simplify semtimedop/semctl_main() common error path handling
  ipc: move sem_obtain_lock() rcu locking into the only caller
  ipc: fix double sem unlock in semctl error path
  ipc: move the rcu_read_lock() from sem_lock_and_putref() into callers
  ipc: sem_putref() does not need the semaphore lock any more
  ipc: move rcu_read_unlock() out of sem_unlock() and into callers

11 years agokvm: Add compat_ioctl for device control API
Scott Wood [Wed, 1 May 2013 01:00:45 +0000 (20:00 -0500)]
kvm: Add compat_ioctl for device control API

This API shouldn't have 32/64-bit issues, but VFS assumes it does
unless told otherwise.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
11 years agoperf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL
Peter Zijlstra [Fri, 3 May 2013 12:11:25 +0000 (14:11 +0200)]
perf/x86/intel/lbr: Demand proper privileges for PERF_SAMPLE_BRANCH_KERNEL

We should always have proper privileges when requesting kernel
data.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/20130503121256.230745028@chello.nl
[ Fix build error reported by fengguang.wu@intel.com, propagate error code back. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/n/tip-v0x9ky3ahzr6nm3c6ilwrili@git.kernel.org
11 years agorcutrace: single_open() leaks
Al Viro [Sun, 5 May 2013 04:16:35 +0000 (00:16 -0400)]
rcutrace: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agogadget: single_open() leaks
Al Viro [Sun, 5 May 2013 04:16:11 +0000 (00:16 -0400)]
gadget: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agostaging: single_open() leaks
Al Viro [Sun, 5 May 2013 04:15:43 +0000 (00:15 -0400)]
staging: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agomegaraid: single_open() leak
Al Viro [Sun, 5 May 2013 04:15:15 +0000 (00:15 -0400)]
megaraid: single_open() leak

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agowireless: single_open() leaks
Al Viro [Sun, 5 May 2013 04:13:20 +0000 (00:13 -0400)]
wireless: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agoinput: single_open() leak
Al Viro [Sun, 5 May 2013 04:12:56 +0000 (00:12 -0400)]
input: single_open() leak

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agortc: single_open() leaks
Al Viro [Sun, 5 May 2013 04:12:29 +0000 (00:12 -0400)]
rtc: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agods1620: single_open() leak
Al Viro [Sun, 5 May 2013 04:11:29 +0000 (00:11 -0400)]
ds1620: single_open() leak

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agosh: single_open() leaks
Al Viro [Sun, 5 May 2013 04:11:01 +0000 (00:11 -0400)]
sh: single_open() leaks

Cc: vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agoparisc: single_open() leaks
Al Viro [Sun, 5 May 2013 04:09:44 +0000 (00:09 -0400)]
parisc: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agomips: single_open() leaks
Al Viro [Sun, 5 May 2013 04:09:30 +0000 (00:09 -0400)]
mips: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agoia64: single_open() leaks
Al Viro [Sun, 5 May 2013 04:09:04 +0000 (00:09 -0400)]
ia64: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agoh8300: single_open() leaks
Al Viro [Sun, 5 May 2013 04:08:26 +0000 (00:08 -0400)]
h8300: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agocris: single_open() leaks
Al Viro [Sun, 5 May 2013 04:07:52 +0000 (00:07 -0400)]
cris: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agoarm: single_open() leaks
Al Viro [Sun, 5 May 2013 04:07:22 +0000 (00:07 -0400)]
arm: single_open() leaks

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
11 years agocifs: small variable name cleanup
Dan Carpenter [Wed, 10 Apr 2013 11:43:00 +0000 (14:43 +0300)]
cifs: small variable name cleanup

server and ses->server are the same, but it's a little bit ugly that we
lock &ses->server->srv_mutex and unlock &server->srv_mutex.  It causes
a false positive in Smatch about inconsistent locking.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agoCIFS: fix error return code in cifs_atomic_open()
Wei Yongjun [Thu, 4 Apr 2013 06:16:21 +0000 (14:16 +0800)]
CIFS: fix error return code in cifs_atomic_open()

Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agocifs: store the real expected sequence number in the mid
Jeff Layton [Wed, 3 Apr 2013 15:55:03 +0000 (11:55 -0400)]
cifs: store the real expected sequence number in the mid

Currently, the signing routines take a pointer to a place to store the
expected sequence number for the mid response. It then stores a value
that's one below what that sequence number should be, and then adds one
to it when verifying the signature on the response.

Increment the sequence number before storing the value in the mid, and
eliminate the "+1" when checking the signature.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agocifs: on send failure, readjust server sequence number downward
Jeff Layton [Wed, 3 Apr 2013 14:27:36 +0000 (10:27 -0400)]
cifs: on send failure, readjust server sequence number downward

If sending a call to the server fails for some reason (for instance, the
sending thread caught a signal), then we must readjust the sequence
number downward again or the next send will have it too high.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agocifs: remove ENOSPC handling in smb_sendv
Jeff Layton [Fri, 22 Mar 2013 12:36:45 +0000 (08:36 -0400)]
cifs: remove ENOSPC handling in smb_sendv

To my knowledge, no one ever reported seeing this pop.

Acked-by: Suresh Jayaraman <sjayaraman@novell.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years ago[CIFS] cifs: Rename cERROR and cFYI to cifs_dbg
Joe Perches [Sun, 5 May 2013 03:12:25 +0000 (22:12 -0500)]
[CIFS] cifs: Rename cERROR and cFYI to cifs_dbg

It's not obvious from reading the macro names that these macros
are for debugging.  Convert the names to a single more typical
kernel style cifs_dbg macro.

cERROR(1, ...)   -> cifs_dbg(VFS, ...)
cFYI(1, ...)     -> cifs_dbg(FYI, ...)
cFYI(DBG2, ...)  -> cifs_dbg(NOISY, ...)

Move the terminating format newline from the macro to the call site.

Add CONFIG_CIFS_DEBUG function cifs_vfs_err to emit the
"CIFS VFS: " prefix for VFS messages.

Size is reduced ~ 1% when CONFIG_CIFS_DEBUG is set (default y)

$ size fs/cifs/cifs.ko*
   text    data     bss     dec     hex filename
 265245    2525     132  267902   4167e fs/cifs/cifs.ko.new
 268359    2525     132  271016   422a8 fs/cifs/cifs.ko.old

Other miscellaneous changes around these conversions:

o Miscellaneous typo fixes
o Add terminating \n's to almost all formats and remove them
  from the macros to be more kernel style like.  A few formats
  previously had defective \n's
o Remove unnecessary OOM messages as kmalloc() calls dump_stack
o Coalesce formats to make grep easier,
  added missing spaces when coalescing formats
o Use %s, __func__ instead of embedded function name
o Removed unnecessary "cifs: " prefixes
o Convert kzalloc with multiply to kcalloc
o Remove unused cifswarn macro

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 5 May 2013 03:10:04 +0000 (20:10 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Several routines do not use netdev_features_t to hold such bitmasks,
    fixes from Patrick McHardy and Bjørn Mork.

 2) Update cpsw IRQ software state and the actual HW irq enabling in the
    correct order.  From Mugunthan V N.

 3) When sending tipc packets to multiple bearers, we have to make
    copies of the SKB rather than just giving the original SKB directly.
    Fix from Gerlando Falauto.

 4) Fix race with bridging topology change timer, from Stephen
    Hemminger.

 5) Fix TCPv6 segmentation handling in GRE and VXLAN, from Pravin B
    Shelar.

 6) Endian bug in USB pegasus driver, from Dan Carpenter.

 7) Fix crashes on MTU reduction in USB asix driver, from Holger
    Eitzenberger.

 8) Don't allow the kernel to BUG() just because the user puts some crap
    in an AF_PACKET mmap() ring descriptor.  Fix from Daniel Borkmann.

 9) Don't use variable sized arrays on the stack in xen-netback, from
    Wei Liu.

10) Fix stats reporting and an unbalanced napi_disable() in be2net
    driver.  From Somnath Kotur and Ajit Khaparde.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (25 commits)
  cxgb4: fix error recovery when t4_fw_hello returns a positive value
  sky2: Fix crash on receiving VLAN frames
  packet: tpacket_v3: do not trigger bug() on wrong header status
  asix: fix BUG in receive path when lowering MTU
  net: qmi_wwan: Add Telewell TW-LTE 4G
  usbnet: pegasus: endian bug in write_mii_word()
  vxlan: Fix TCPv6 segmentation.
  gre: Fix GREv4 TCPv6 segmentation.
  bridge: fix race with topology change timer
  tipc: pskb_copy() buffers when sending on more than one bearer
  tipc: tipc_bcbearer_send(): simplify bearer selection
  tipc: cosmetic: clean up comments and break a long line
  drivers: net: cpsw: irq not disabled in cpsw isr in particular sequence
  xen-netback: better names for thresholds
  xen-netback: avoid allocating variable size array on stack
  xen-netback: remove redundent parameter in netbk_count_requests
  be2net: Fix to fail probe if MSI-X enable fails for a VF
  be2net: avoid napi_disable() when it has not been enabled
  be2net: Fix firmware download for Lancer
  be2net: Fix to receive Multicast Packets when Promiscuous mode is enabled on certain devices
  ...

11 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next
Linus Torvalds [Sun, 5 May 2013 03:08:49 +0000 (20:08 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next

Pull sparc updates from David Miller:

 1) Hibernation support, as well as removal of excess interrupt
    twiddling in MMU context allocation on sparc64 from Kirill Tkhai.

 2) Kill references to __ARCH_WANT_UNLOCKED_CTXSW.

 3) Sparc32 LEON bug fixes from Daniel Hellstrom and Andreas Larsson.

 4) Provide cmpxchg64(), from Geert Uytterhoeven.

 5) Device refcount and registry bug fixes from Federico Vaga and Wei
    Yongjun.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
  serial: sunsu: add missing platform_driver_unregister() when module exit
  sparc32, leon: Do not overwrite previously set irq flow handlers
  sparc/kernel/vio.c: add put_device() after device_find_child()
  sparc64: Do not save/restore interrupts in get_new_mmu_context()
  sparc: Consistently use 'wr' and 'rd' instructions for ASRs.
  sparc64: Kill __ARCH_WANT_UNLOCKED_CTXSW
  sparc64: Provide cmpxchg64()
  sparc64: Do not change num_physpages during initmem freeing
  sparc64: Hibernation support
  sparc,leon: updated GRPCI2 config name
  sparc,leon: support for GRPCI1 PCI host bridge controller
  sparc32,leon: add support for PCI busn resource for GRPCI2

11 years agofs: cifs: use kmemdup instead of kmalloc + memcpy
Silviu-Mihai Popescu [Mon, 11 Mar 2013 16:22:32 +0000 (18:22 +0200)]
fs: cifs: use kmemdup instead of kmalloc + memcpy

This replaces calls to kmalloc followed by memcpy with a single call to
kmemdup. This was found via make coccicheck.

Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agocifs: replaced kmalloc + memset with kzalloc
Dia Vasile [Sun, 10 Mar 2013 12:29:04 +0000 (14:29 +0200)]
cifs: replaced kmalloc + memset with kzalloc

Signed-off-by: Diana Vasile <kill.elohim@hotmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>
11 years agocifs: ignore the unc= and prefixpath= mount options
Jeff Layton [Fri, 22 Mar 2013 12:42:57 +0000 (08:42 -0400)]
cifs: ignore the unc= and prefixpath= mount options

...as advertised for 3.10.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Steve French <smfrench@gmail.com>