git.proxmox.com Git - mirror_ubuntu-hirsute-kernel.git/log

ftrace: graph of a single function

This patch adds the file:

   /debugfs/tracing/set_graph_function

which can be used along with the function graph tracer.

When this file is empty, the function graph tracer will act as
usual. When the file has a function in it, the function graph
tracer will only trace that function.

For example:

# echo blk_unplug > /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/trace
[...]
------------------------------------------
| 2)  make-19003  =>  kjournald-2219
------------------------------------------

2)               |  blk_unplug() {
2)               |    dm_unplug_all() {
2)               |      dm_get_table() {
2)      1.381 us |        _read_lock();
2)      0.911 us |        dm_table_get();
2)      1. 76 us |        _read_unlock();
2) +   12.912 us |      }
2)               |      dm_table_unplug_all() {
2)               |        blk_unplug() {
2)      0.778 us |          generic_unplug_device();
2)      2.409 us |        }
2)      5.992 us |      }
2)      0.813 us |      dm_table_put();
2) +   29. 90 us |    }
2) +   34.532 us |  }

You can add up to 32 functions into this file. Currently we limit it
to 32, but this may change with later improvements.

To add another function, use the append '>>':

  # echo sys_read >> /debugfs/tracing/set_graph_function
  # cat /debugfs/tracing/set_graph_function
  blk_unplug
  sys_read

Using the '>' will clear out the function and write anew:

  # echo sys_write > /debug/tracing/set_graph_function
  # cat /debug/tracing/set_graph_function
  sys_write

Note, if you have function graph running while doing this, the small
time between clearing it and updating it will cause the graph to
record all functions. This should not be an issue because after
it sets the filter, only those functions will be recorded from then on.
If you need to only record a particular function then set this
file first before starting the function graph tracer. In the future
this side effect may be corrected.

The set_graph_function file is similar to the set_ftrace_filter but
it does not take wild cards nor does it allow for more than one
function to be set with a single write. There is no technical reason why
this is the case, I just do not have the time yet to implement that.

Note, dynamic ftrace must be enabled for this to appear because it
uses the dynamic ftrace records to match the name to the mcount
call sites.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core

Merge commit 'v2.6.28-rc7' into tracing/core

ftrace: fix race in function graph during fork

Impact: graph tracer race/crash fix

There is a nasy race in startup of a new process running the
function graph tracer. In fork.c:

total_forks++;
spin_unlock(&current->sighand->siglock);
write_unlock_irq(&tasklist_lock);
ftrace_graph_init_task(p);
proc_fork_connector(p);
cgroup_post_fork(p);
return p;

The new task is free to run as soon as the tasklist_lock is released.
This is before the ftrace_graph_init_task. If the task does run
it will be using the same ret_stack and curr_ret_stack as the parent.
This will cause crashes that are difficult to debug.

This patch moves the ftrace_graph_init_task to just after the alloc_pid
code. This fixes the above race.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

trace: fix output of stack trace

Impact: fix to output of stack trace

If a function is not found in the stack of the stack tracer, the
number printed is quite strange. This fixes the algorithm to handle
missing functions better.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/function-graph-tracer: enabled by default

CONFIG_FUNCTION_GRAPH_TRACER depends on FUNCTION_TRACER already,
(turning it non-default) so it so making it default-n is pointless.

So enable it by default - it's a nice extension of the function tracer.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/function-graph-tracer: improve duration output

Impact: better trace output of duration for long calls

The old duration output didn't exceeded 9999.999 us to fit the column
and the nanosecs were always 3 numbers. As Ingo suggested, it's better
to have the whole microseconds elapsed time and shift the nanosecs precision
if needed to fit the maximum 7 numbers. And usec need more number, the case
should be rare and important enough to break a bit the column alignment to
show it.

So, depending of the duration value, we now have these patterns:

    u.nnn us
   uu.nnn us
  uuu.nnn us
uuuu.nnn us
uuuuu.nn us
uuuuuu.n us
uuuuuuuu..... us

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

tracing/function-graph-tracer: display unified style cmdline and pid

Impact: extend function-graph output: let one know which thread called a function

This patch implements a helper function to print the couple cmdline/pid.
Its output is provided during task switching and on each row if the new
"funcgraph-proc" defualt-off option is set through trace_options file.

The output is center aligned and never exceeds 14 characters. The cmdline
is truncated over 7 chars.
But note that if the pid exceeds 6 characters, the column will overflow (but
the situation is abnormal).

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: add checks on ret stack in function graph

Import: robustness checks

Add more checks in the function graph code to detect errors and
perhaps print out better information if a bug happens.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: function graph return for function entry

Impact: feature, let entry function decide to trace or not

This patch lets the graph tracer entry function decide if the tracing
should be done at the end as well. This requires all function graph
entry functions return 1 if it should trace, or 0 if the return should
not be traced.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: print real return in dumpstack for function graph

Impact: better dumpstack output

I noticed in my crash dumps and even in the stack tracer that a
lot of functions listed in the stack trace are simply
return_to_handler which is ftrace graphs way to insert its own
call into the return of a function.

But we lose out where the actually function was called from.

This patch adds in hooks to the dumpstack mechanism that detects
this and finds the real function to print. Both are printed to
let the user know that a hook is still in place.

This does give a funny side effect in the stack tracer output:

        Depth   Size      Location    (80 entries)
        -----   ----      --------
  0)     4144      48   save_stack_trace+0x2f/0x4d
  1)     4096     128   ftrace_call+0x5/0x2b
  2)     3968      16   mempool_alloc_slab+0x16/0x18
  3)     3952     384   return_to_handler+0x0/0x73
  4)     3568    -240   stack_trace_call+0x11d/0x209
  5)     3808     144   return_to_handler+0x0/0x73
  6)     3664    -128   mempool_alloc+0x4d/0xfe
  7)     3792     128   return_to_handler+0x0/0x73
  8)     3664     -32   scsi_sg_alloc+0x48/0x4a [scsi_mod]

As you can see, the real functions are now negative. This is due
to them not being found inside the stack.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ring-buffer: change "page" variable names to "bpage"

Impact: clean up

Andrew Morton pointed out that the kernel convention of a variable
named page should be of type page struct. The ring buffer uses
a variable named "page" for a pointer to something else.

This patch converts those to be called "bpage" (as in "buffer page").

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: add ftrace_graph_stop()

Impact: new ftrace_graph_stop function

While developing more features of function graph, I hit a bug that
caused the WARN_ON to trigger in the prepare_ftrace_return function.
Well, it was hard for me to find out that was happening because the
bug would not print, it would just cause a hard lockup or reboot.
The reason is that it is not safe to call printk from this function.

Looking further, I also found that it calls unregister_ftrace_graph,
which grabs a mutex and calls kstop machine. This would definitely
lock the box up if it were to trigger.

This patch adds a fast and safe ftrace_graph_stop() which will
stop the function tracer. Then it is safe to call the WARN ON.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: have function graph use mcount caller address

Impact: consistency change for function graph

This patch makes function graph record the mcount caller address
the same way the function tracer does.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: clean up function graph asm

Impact: clean up

There exists macros for x86 asm to handle x86_64 and i386.
This patch updates function graph asm to use them.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ring-buffer: read page interface

Impact: new API to ring buffer

This patch adds a new interface into the ring buffer that allows a
page to be read from the ring buffer on a given CPU. For every page
read, one must also be given to allow for a "swap" of the pages.

rpage = ring_buffer_alloc_read_page(buffer);
if (!rpage)
goto err;
ret = ring_buffer_read_page(buffer, &rpage, cpu, full);
if (!ret)
goto empty;
process_page(rpage);
ring_buffer_free_read_page(rpage);

The caller of these functions must handle any waits that are
needed to wait for new data. The ring_buffer_read_page will simply
return 0 if there is no data, or if "full" is set and the writer
is still on the current page.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ring-buffer: move some metadata into buffer page

Impact: get ready for splice changes

This patch moves the commit and timestamp into the beginning of each
data page of the buffer. This change will allow the page to be moved
to another location (disk, network, etc) and still have information
in the page to be able to read it.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

ftrace: replace raw_local_irq_save with local_irq_save

Impact: fix for lockdep and ftrace

The raw_local_irq_save/restore confuses lockdep. This patch
converts them to the local_irq_save/restore variants.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge commit 'v2.6.28-rc7'; branch 'x86/dumpstack' into tracing/ftrace

Merge x86/dumpstack into tracing/ftrace because upcoming ftrace changes
depend on cleanups already in x86/dumpstack.

Also merge to latest upstream -rc.

Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core

tracing/function-graph-tracer: support for x86-64

Impact: extend and enable the function graph tracer to 64-bit x86

This patch implements the support for function graph tracer under x86-64.
Both static and dynamic tracing are supported.

This causes some small CPP conditional asm on arch/x86/kernel/ftrace.c I
wanted to use probe_kernel_read/write to make the return address
saving/patching code more generic but it causes tracing recursion.

That would be perhaps useful to implement a notrace version of these
function for other archs ports.

Note that arch/x86/process_64.c is not traced, as in X86-32. I first
thought __switch_to() was responsible of crashes during tracing because I
believed current task were changed inside but that's actually not the
case (actually yes, but not the "current" pointer).

So I will have to investigate to find the functions that harm here, to
enable tracing of the other functions inside (but there is no issue at
this time, while process_64.c stays out of -pg flags).

A little possible race condition is fixed inside this patch too. When the
tracer allocate a return stack dynamically, the current depth is not
initialized before but after. An interrupt could occur at this time and,
after seeing that the return stack is allocated, the tracer could try to
trace it with a random uninitialized depth. It's a prevention, even if I
hadn't problems with it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

function trace: fix a bug of single thread function trace

Impact: fix "no output from tracer" bug caused by ftrace_update_pid_func()

When disabling single thread function trace using
"echo -1 > set_ftrace_pid", the normal function trace
has to restore to original function, otherwise the normal
function trace will not work well.

Without this commit, something like below:

$ ps |grep 850
850 root 2556 S -/bin/sh
$ echo 850 > /debug/tracing/set_ftrace_pid
$ echo function > /debug/tracing/current_tracer
$ echo 1 > /debug/tracing/tracing_enabled
$ sleep 1
$ echo 0 > /debug/tracing/tracing_enabled
$ cat /debug/tracing/trace_pipe |wc -l
59704
$ echo -1 > /debug/tracing/set_ftrace_pid
$ echo 1 > /debug/tracing/tracing_enabled
$ sleep 1
$ echo 0 > /debug/tracing/tracing_enabled
$ more /debug/tracing/trace_pipe
<====== nothing output now!
it should output trace record.

Signed-off-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge branches 'tracing/branch-tracer', 'tracing/ftrace', 'tracing/function-graph-tracer', 'tracing/markers', 'tracing/powerpc', 'tracing/stack-tracer' and 'tracing/tracepoints' into tracing/core

Merge branch 'tracing/urgent' into tracing/core

Conflicts:
kernel/trace/ring_buffer.c

Linux 2.6.28-rc7

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (25 commits)
  em28xx: remove backward compat macro added on a previous fix
  V4L/DVB (9748): em28xx: fix compile warning
  V4L/DVB (9743): em28xx: fix oops audio
  V4L/DVB (9742): em28xx-alsa: implement another locking schema
  V4L/DVB (9732): sms1xxx: use new firmware for Hauppauge WinTV MiniStick
  V4L/DVB (9691): gspca: Move the video device to a separate area.
  V4L/DVB (9690): gspca: Lock the subdrivers via module_get/put.
  V4L/DVB (9689): gspca: Memory leak when disconnect while streaming.
  V4L/DVB (9668): em28xx: fix a race condition with hald
  V4L/DVB (9664): af9015: don't reconnect device in USB-bus
  V4L/DVB (9647): em28xx: void having two concurrent control URB's
  V4L/DVB (9646): em28xx: avoid allocating/dealocating memory on every control urb
  V4L/DVB (9645): em28xx: Avoid memory leaks if registration fails
  V4L/DVB (9639): Make dib0700 remote control support work with firmware v1.20
  V4L/DVB (9635): v4l: s2255drv fix firmware test on big-endian
  V4L/DVB (9634): Make sure the i2c gate is open before powering down tuner
  V4L/DVB (9632): make em28xx aux audio input work
  V4L/DVB (9631): Make s2api work for ATSC support
  V4L/DVB (9627): em28xx: Avoid i2c register error for boards without eeprom
  V4L/DVB (9608): Fix section mismatch warning for dm1105 during make
  ...

drivers/gpu/drm/i915/i915_irq.c: fix warning

drivers/gpu/drm/i915/i915_irq.c: In function 'i915_disable_pipestat':
drivers/gpu/drm/i915/i915_irq.c:101: warning: control may reach end of non-void function 'i915_pipestat' being inlined

Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i82875p_edac: fix module remove

Fix module removal bugs of i82875p_edac.  Also i82975x_edac code seems to
have the same module removal bugs as in i82875p_edac.

The problems were:

1. In module removal i82875p_remove_one() is never called.

   Variable i82875p_registered is newer changed from 1, which
   guarantees i82875p_remove_one() is not called (and even if it were
   called, it would be called in wrong order).

   As a result, the edac_mc workque is not stopped and keeps probing.
   If kernel debugging options are not enabled, user may not notice
   anything going wrong.

   if debugging options are enabled and I do "rmmod i82875p_edac", I
   get:

      edac debug: edac_pci_workq_function() checking
      BUG: unable to handle kernel paging request at f882d16f
      ...
      call trace:
       [<f8834df3>] ? edac_mc_workq_function+0x55/0x7e [edac_core]
       [<c0233974>] ? run_workqueue+0xd7/0x1a5
       [<c023392f>] ? run_workqueue+0x92/0x1a5
       [<f8834d9e>] ? edac_mc_workq_function+0x0/0x7e [edac_core]
       [<c0233af9>] ? worker_thread+0xb7/0xc3
       [<c0236a7b>] ? autoremove_wake_function+0x0/0x33
       [<c0233a42>] ? worker_thread+0x0/0xc3
       [<c0236809>] ? kthread+0x3b/0x61
       [<c02367ce>] ? kthread+0x0/0x61
       [<c0204587>] ? kernel_thread_helper+0x7/0x10

   Fix for this is to get rid of needles variable i82875p_registered
   altogether and run i82875p_remove_one() *before*
   pci_unregister_driver().

2. edac_mc_del_mc() uses mci after freeing mci

   edac_mc_del_mc() calls calls edac_remove_sysfs_mci_device().  The
   kobject refcount of mci drops to 0 and mci is freed.  After this
   mci is accessed via debug print and i82875p_remove_one() still
   uses mci->pvt and tries to free mci again with edac_mc_free().

   The fix for this is add kobject_get(&mci->edac_mci_kobj) after
   edac_mc_alloc(). Then the mci is still available after returning
   from edac_mc_del_mc() with refcount 1, and mci->pvt is still
   available. When i82875p_remove_one() finally calls edac_mc_free(),
   this will cause kobject_put() and mci is released properly.

Signed-off-by: Jarkko Lavinen <jlavi@iki.fi>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

i82875p_edac: fix overflow device resource setup

When I do "modprobe i82875p_edac" on my Asus P4C800 MB on kernels 2.6.26
or later, the module load fails due to BAR 0 collision.  On 2.6.25 the
module loads just fine.

The overflow device on the MB seems to be hidden and its resources are not
allocated at normal PCI bus init.  Log shows the missing resource problem:

  EDAC DEBUG: i82875p_probe1()
  PCI: 0000:00:06.0 reg 10 32bit mmio: [fecf0000, fecf0fff]
  pci 0000:00:06.0: device not available because of BAR 0
[0xfecf0000-0xfecf0fff] collisions
  EDAC i82875p: i82875p_setup_overfl_dev(): Failed to enable overflow
device

The patch below fixes this by calling pci_bus_assign_resources() after
the overflow device is revealed and added to the bus. With this patch
I am again able to load and use the module.

Signed-off-by: Jarkko Lavinen <jlavi@iki.fi>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbdev: fix FB console blanking

The commit aef7db4bd5a3b6068dfa05919a3d685199eed116 fixed the problem with
recursive locking in fb blanking code if blank is caused by user setting
the /sys/class/graphics/fb*/blank.  However this broke the fbcon timeout
blanking.

If you use a driver that defines ->fb_blank operation and at the same time
that driver relies on other driver (e.g.  backlight or lcd class) to blank
the screen, when the fbcon times out and tries to blank the fb, it will
call only fb driver blanker and won't notify the other driver.  Thus FB
output is disabled, but the screen isn't blanked.

Restore fbcon blanking and at the same time apply the proper fix for the
above problem: if fbcon_blank is called with FBINFO_FLAG_USEREVENT, we are
already called through notification from fb_blank, thus we don't have to
blank the fb again.

Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ntfs: don't fool kernel-doc

kernel-doc handles macros now (it has for quite some time), so change the
ntfs_debug() macro's kernel-doc to be just before the macro instead of
before a phony function prototype.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel-doc: handle varargs cleanly

The method for listing varargs in kernel-doc notation is:
* @...: these arguments are printed by the @fmt argument

but scripts/kernel-doc is confused: it always lists varargs as:
... variable arguments
and ignores the @...: line's description, but then prints that
line after the list of function parameters as though it's
not part of the function parameters.

This patch makes kernel-doc print the supplied @... description if it is
present; otherwise a boilerplate "variable arguments" is printed.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/idr.c: fix rcu related race with idr_find

2nd part of the fixes needed for
http://bugzilla.kernel.org/show_bug.cgi?id=11796.

When the idr tree is either grown or shrunk, then the update to the number
of layers and the top pointer were not atomic. This race caused crashes.

The attached patch fixes that by replicating the layers counter in each
layer, thus idr_find doesn't need idp->layers anymore.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Clement Calmels <cboulte@gmail.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

DMA-API.txt: fix description of pci_map_sg/dma_map_sg scatterlists handling

- pci_map_sg/dma_map_sg are used with a scatter gather list that doesn't
come from the block layer (e.g. some network drivers do).

- how IOMMUs merge adjacent elements of the scatter/gather list is
independent of how the block layer determines sees elements.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

frv: fix mmap2 error handling

Fix the error handling in sys_mmap2(). Currently, if the pgoff check
fails, fput() might have to be called (which it isn't), so do the pgoff
check first, before fget() is called.

Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

taint: add missing comment

The description for 'D' was missing in the comment... (causing me a
minute of WTF followed by looking at more of the code)

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

radeonfb: fix problem with color expansion & alignment

The engine on some radeon variants locks up if color expansion is called
for non aligned source data. This patch enables a feature of the core
fbdev to request aligned input pixmaps and uses the HW clipping engine to
clip the output to the requested size

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=11875

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: James Cloos <cloos@jhcloos.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi: fix spi_s3c24xx_gpio num_chipselect

The spi master driver must have num_chipselect set to allow the bus to
initialise. Pass this through the platform data.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi: fix spi_s3c24xx_gpio device handle lookup

The spidev_to_sg() call in spi_s3c24xx_gpio.c was using the wrong method
to convert the spi device into the private data for the driver. Fix this
by using spi_master_get_devdata.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi: au1550_spi full duplex dma fix

Fix unsafe order in dma mapping operation: always flush data from the
cache *BEFORE* invalidating it, to allow full duplex transfers where the
same buffer may be used for both writes and reads. Tested with mmc-spi.

Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi: fix spi_imx probe oopsing

Corrects spi_imx driver oops during initialization/probing: can't use
drv_data before it's allocated.

Signed-off-by: Julien Boibessot <julien.boibessot@armadeus.com>
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

epoll: introduce resource usage limits

It has been thought that the per-user file descriptors limit would also
limit the resources that a normal user can request via the epoll
interface.  Vegard Nossum reported a very simple program (a modified
version attached) that can make a normal user to request a pretty large
amount of kernel memory, well within the its maximum number of fds.  To
solve such problem, default limits are now imposed, and /proc based
configuration has been introduced.  A new directory has been created,
named /proc/sys/fs/epoll/ and inside there, there are two configuration
points:

  max_user_instances = Maximum number of devices - per user

  max_user_watches   = Maximum number of "watched" fds - per user

The current default for "max_user_watches" limits the memory used by epoll
to store "watches", to 1/32 of the amount of the low RAM.  As example, a
256MB 32bit machine, will have "max_user_watches" set to roughly 90000.
That should be enough to not break existing heavy epoll users.  The
default value for "max_user_instances" is set to 128, that should be
enough too.

This also changes the userspace, because a new error code can now come out
from EPOLL_CTL_ADD (-ENOSPC).  The EMFILE from epoll_create() was already
listed, so that should be ok.

[akpm@linux-foundation.org: use get_current_user()]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: <stable@kernel.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Reported-by: Vegard Nossum <vegardno@ifi.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi: mpc52xx_psc_spi chipselect bugfix

According to the manual the "tdfOnExit" flag must be set on the last byte
we want to send.  The PSC controller holds SS low until the flag is set.

However, the flag was set always on the last byte of the FIFO,
independently if it is the last byte of the transfer.  This generates
spurious toggling of the SS signals that breaks the protocol of some
peripherals.  Fix.

Signed-off-by: Stefano Babic <sbabic@denx.de>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi: avoid spidev crash when device is removed

I saw a kernel oops in spidev_remove() when a spidev device was registered
and I unloaded the SPI master driver:

Unable to handle kernel paging request for data at address 0x00000004
Faulting instruction address: 0xc01c0c50
Oops: Kernel access of bad area, sig: 11 [#1]
CDSPR
Modules linked in: spi_ppc4xx(-)
NIP: c01c0c50 LR: c01bf9e4 CTR: c01c0c34
REGS: cec89c30 TRAP: 0300 Not tainted (2.6.27.3izt)
MSR: 00021000 <ME> CR: 24000228 XER: 20000007
DEAR: 00000004, ESR: 00800000
TASK = cf889040[2070] 'rmmod' THREAD: cec88000
GPR00: 00000000 cec89ce0 cf889040 cec8e000 00000004 cec8e000 ffffffff 00000000
GPR08: 0000001c c0336380 00000000 c01c0c34 00000001 1001a338 100e0000 100df49c
GPR16: 100b54c0 100df49c 100ddd20 100f05a8 100b5340 100efd68 00000000 00000000
GPR24: 100ec008 100f0428 c0327788 c0327794 cec8e0ac cec8e000 c0336380 00000000
NIP [c01c0c50] spidev_remove+0x1c/0xe4
LR [c01bf9e4] spi_drv_remove+0x2c/0x3c
Call Trace:
[cec89d00] [c01bf9e4] spi_drv_remove+0x2c/0x3c
[cec89d10] [c01859a0] __device_release_driver+0x78/0xb4
[cec89d20] [c0185ab0] device_release_driver+0x28/0x44
[cec89d40] [c0184be8] bus_remove_device+0xac/0xd8
[cec89d60] [c0183094] device_del+0x100/0x194
[cec89d80] [c0183140] device_unregister+0x18/0x30
[cec89da0] [c01bf30c] __unregister+0x20/0x34
[cec89db0] [c0182778] device_for_each_child+0x38/0x74
[cec89de0] [c01bf2d0] spi_unregister_master+0x28/0x44
[cec89e00] [c01bfeac] spi_bitbang_stop+0x1c/0x58
[cec89e20] [d908a5e0] spi_ppc4xx_of_remove+0x24/0x7c [spi_ppc4xx]
[...]

IMHO a call to spi_set_drvdata() is missing in spidev_probe(). The patch
below helped.

Signed-off-by: Wolfgang Ocker <weo@reccoware.de>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

spi documentation: use __initdata on struct

Use __initdata for data, not __init.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hwmon: applesmc: make applesmc load automatically on startup

make use of the new dmi device loading support to automatically load the
applesmc driver based on the dmi_match table.

Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
Cc: Nicolas Boichat <nicolas@boichat.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

parport_serial: fix array overflow

The netmos_9xx5_combo type assumes that PCI SSID provides always the
correct value for the number of parallel and serial ports, but there are
indeed broken devices with wrong numbers, which may result in Oops.

This patch simply adds the check of the array range.

Reference: Novell bnc#447067
https://bugzilla.novell.com/show_bug.cgi?id=447067

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: memory hotplug fix for notifier callback

Fixes for memcg/memory hotplug.

While memory hotplug allocate/free memmap, page_cgroup doesn't free
page_cgroup at OFFLINE when page_cgroup is allocated via bootomem.
(Because freeing bootmem requires special care.)

Then, if page_cgroup is allocated by bootmem and memmap is freed/allocated
by memory hotplug, page_cgroup->page == page is no longer true.

But current MEM_ONLINE handler doesn't check it and update
page_cgroup->page if it's not necessary to allocate page_cgroup.  (This
was not found because memmap is not freed if SPARSEMEM_VMEMMAP is y.)

And I noticed that MEM_ONLINE can be called against "part of section".
So, freeing page_cgroup at CANCEL_ONLINE will cause trouble.  (freeing
used page_cgroup) Don't rollback at CANCEL.

One more, current memory hotplug notifier is stopped by slub because it
sets NOTIFY_STOP_MASK to return vaule.  So, page_cgroup's callback never
be called.  (low priority than slub now.)

I think this slub's behavior is not intentional(BUG). and fixes it.

Another way to be considered about page_cgroup allocation:
  - free page_cgroup at OFFLINE even if it's from bootmem
    and remove specieal handler. But it requires more changes.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12041

Signed-off-by: KAMEZAWA Hiruyoki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Tested-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmalloc fix lazy unmapping cache aliasing

Jim Radford has reported that the vmap subsystem rewrite was sometimes
causing his VIVT ARM system to behave strangely (seemed like going into
infinite loops trying to fault in pages to userspace).

We determined that the problem was most likely due to a cache aliasing
issue. flush_cache_vunmap was only being called at the moment the page
tables were to be taken down, however with lazy unmapping, this can happen
after the page has subsequently been freed and allocated for something
else. The dangling alias may still have dirty data attached to it.

The fix for this problem is to do the cache flushing when the caller has
called vunmap -- it would be a bug for them to write anything else to the
mapping at that point.

That appeared to solve Jim's problems.

Reported-by: Jim Radford <radford@blackbean.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: fix regression in ocfs2_read_blocks_sync()
  ocfs2: fix return value set in init_dlmfs_fs()
  ocfs2: Small documentation update
  ocfs2: fix wake_up in unlock_ast
  ocfs2: initialize stack_user lvbptr
  ocfs2: comments typo fix

ocfs2: fix regression in ocfs2_read_blocks_sync()

We're panicing in ocfs2_read_blocks_sync() if a jbd-managed buffer is seen.
At first glance, this seems ok but in reality it can happen. My test case
was to just run 'exorcist'. A struct inode is being pushed out of memory but
is then re-read at a later time, before the buffer has been checkpointed by
jbd. This causes a BUG to be hit in ocfs2_read_blocks_sync().

Reviewed-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: fix return value set in init_dlmfs_fs()

In init_dlmfs_fs(), if calling kmem_cache_create() failed, the code will use return value from
calling bdi_init(). The correct behavior should be set status as -ENOMEM before going to "bail:".

Signed-off-by: Coly Li <coyli@suse.de>
Acked-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: Small documentation update

Remove some features from the "not-supported" list that are actually
supported now.

Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: fix wake_up in unlock_ast

In ocfs2_unlock_ast(), call wake_up() on lockres before releasing
the spin lock on it. As soon as the spin lock is released, the
lockres can be freed.

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: initialize stack_user lvbptr

The locking_state dump, ocfs2_dlm_seq_show, reads the lvb on locks where it
has not yet been initialized by a lock call.

Signed-off-by: David Teigland <teigland@redhat.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

ocfs2: comments typo fix

This patch fixes two typos in comments of ocfs2.

Signed-off-by: Coly Li <coyli@suse.de>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

em28xx: remove backward compat macro added on a previous fix

commit 50f3beb50abe0cc0228363af804e50e710b3e5b0 fixed em28xx-alsa
locking schema. However, a backport macro was kept.

This patch removes the macro, since it is not needed for the module
compilation against upstream.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ
  [libata] pata_rb532_cf: fix signature of the xfer function
  [libata] pata_rb532_cf: fix and rename register definitions
  ata_piix: add borked Tecra M4 to broken suspend list

V4L/DVB (9748): em28xx: fix compile warning

Label fail_unreg is no longer used.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

V4L/DVB (9743): em28xx: fix oops audio

Replaced usb_kill_usb for usb_unlink_usb
(wait until urb to fully stop require USB core to put the calling process to sleep).

Oops:
http://www.kerneloops.org/raw.php?rawid=71799&msgid=

Signed-off-by: Douglas Schilling Landgraf <dougsland@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/mlx4: Fix MTT leakage in resize CQ
  IB/ehca: Fix problem with generated flush work completions
  IB/ehca: Change misleading error message on memory hotplug
  mlx4_core: Save/restore default port IB capability mask

libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ

Some recent Seagate harddrives have firmware bug which causes FLUSH
CACHE to timeout under certain circumstances if NCQ is being used.
This can be worked around by disabling NCQ and fixed by updating the
firmware. Implement ATA_HORKAGE_FIRMWARE_UPDATE and blacklist these
devices.

The wiki page has been updated to contain information on this issue.

http://ata.wiki.kernel.org/index.php/Known_issues

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

[libata] pata_rb532_cf: fix signature of the xfer function

Per definition, this function should return the number of bytes
consumed. As the original parameter "buflen" is being decremented inside
the read/write loop, save it in "retlen" at the beginning.

Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Acked-by: Sergei Shtyltov <sshtylyov@ru.mvista.com>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Acked-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

[libata] pata_rb532_cf: fix and rename register definitions

The original standalone driver uses a custom address for the error
register. Use it in pata_rb532_cf, too.

Rename two register definitions:
- The address offset 0x0800 in fact is the ATA base, not ATA command
address.
- The offset 0x0C00 is not a regular ATA data address, but a buffered one
allowing 4-byte IO.

Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

ata_piix: add borked Tecra M4 to broken suspend list

Tecra M4 sometimes forget what it is and reports bogus data via DMI
which makes the machine evade broken suspend matching and thus fail
suspend/resume. This patch updates piix_broken_suspend() such that it
can match such case. As the borked DMI data is a bit generic,
matching many entries to make the match more specific is necessary.
As the usual DMI matching is limited to four entries, this patch uses
hard coded manual matching.

This is reported by Alexandru Romanescu.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

Merge branches 'ehca' and 'mlx4' into for-linus

IB/mlx4: Fix MTT leakage in resize CQ

When resizing a CQ, MTTs associated with the old CQE buffer were not
freed. As a result, if any app used resize CQ repeatedly, all MTTs
were eventually exhausted, which led to all memory registration
operations failing until the driver is reloaded.

Once the RESIZE_CQ command returns successfully from FW, FW no longer
accesses the old CQ buffer, so it is safe to deallocate the MTT
entries used by the old CQ buffer.

Finally, if the RESIZE_CQ command fails, the MTTs allocated for the
new CQEs buffer also need to be de-allocated.

This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1416>.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: Fix problem with generated flush work completions

This fix enables ehca device driver to generate flush work completions
even if the application doesn't request completions for all work
requests. The current implementation of ehca will generate flush work
completions for the wrong work requests if an application uses non
signaled work completions.

Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: Change misleading error message on memory hotplug

The error message printed when the eHCA driver prevents memory hotplug
is misleading -- the user might think that hot-removing the lhca,
hotplugging memory, then hot-adding the lhca again will work, but it
actually doesn't.

Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  ieee1394: sbp2: fix race condition in state change
  ieee1394: fix list corruption (reported at module removal)
  firewire: fw-sbp2: another iPod mini quirk entry
  ieee1394: sbp2: another iPod mini quirk entry

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: Apple ALU wireless keyboards are bluetooth devices
  HID: remove setup mutex, fix possible deadlock
  HID: add USB ID for another dual gameron adapter
  HID: unignore mouse on unibody macbooks
  HID: fix blacklist entries for greenasia/pantherlord

Add kref to fake tty used by USB console

We alloc a fake tty in usb serial console setup function. we should
init the tty's kref otherwise we will face WARN_ON after following
invoke of tty_port_tty_set --> tty_kref_get.

Signed-off-by: Kevin Hao <kexin.hao@windriver.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/char/tty_io.c: Avoid panic when no console is configured.

When no console is configured tty_open tries to call kref_get on a NULL
pointer, return ENODEV instead.

Signed-off-by: Will Newton <will.newton@gmail.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: serial: add more Onda device ids to option driver
  USB: usb-storage: unusual_devs entry for Nikon D2H
  USB: storage: unusual_devs entry for Mio C520-GPS
  USB: fsl_usb2_udc: Report disconnect before unbinding
  USB: fsl_qe_udc: Report disconnect before unbinding
  USB: fix SB600 USB subsystem hang bug
  Revert "USB: improve ehci_watchdog's side effect in CPU power management"

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix build for 32-bit SMP configs

vmscan: protect zone rotation stats by lru lock

The zone's rotation statistics must not be accessed without the
corresponding LRU lock held. Fix an unprotected write in
shrink_active_list().

Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "of_platform_driver noise on sparce"

This reverts commit e669dae6141ff97d3c7566207f5de3b487dcf837, since it
is incomplete, and clashes with fuller patches and the sparc 32/64
unification effort.

Requested-by: David Miller <davem@davemloft.net>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

USB: serial: add more Onda device ids to option driver

Thanks to Domenico Riccio for pointing these out.

Cc: Domenico Riccio <domenico.riccio@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: usb-storage: unusual_devs entry for Nikon D2H

This patch adds an unusual_devs entry for the Nikon D2H camera.

From: Tobias Kunze Briseño <t@fictive.com>,
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: storage: unusual_devs entry for Mio C520-GPS

This patch (as1176) adds an unusual_devs entry for the Mio C520 GPS
unit. Other devices also based on the Mitac hardware use the same USB
interface firmware, so the Vendor and Product names are generalized.

This fixes Bugzilla #11583.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Tamas Kerecsen <kerecsen@bigfoot.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: fsl_usb2_udc: Report disconnect before unbinding

Gadgets disable endpoints in their disconnect callbacks, so
we must call disconnect before unbinding.

The patch fixes following badness:

root@b1:~# insmod fsl_usb2_udc.ko
Freescale High-Speed USB SOC Device Controller driver (Apr 20, 2007)
root@b1:~# insmod g_ether.ko
g_ether gadget: using random self ethernet address
g_ether gadget: using random host ethernet address
usb0: MAC 26:07:ba:c0:44:33
usb0: HOST MAC 96:81:0c:05:4d:e3
g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
g_ether gadget: g_ether ready
fsl-usb2-udc: bind to driver g_ether
g_ether gadget: high speed config #1: CDC Ethernet (ECM)
root@b1:~# rmmod g_ether.ko
------------[ cut here ]------------
Badness at drivers/usb/gadget/composite.c:871
[...]
NIP [e10c3454] composite_unbind+0x24/0x15c [g_ether]
LR [e10aa454] usb_gadget_unregister_driver+0x13c/0x164 [fsl_usb2_udc]
Call Trace:
[df145e80] [ffffff94] 0xffffff94 (unreliable)
[df145eb0] [e10aa454] usb_gadget_unregister_driver+0x13c/0x164 [fsl_usb2_udc]
[df145ed0] [e10c4c40] usb_composite_unregister+0x3c/0x4c [g_ether]
[df145ee0] [c006bcc0] sys_delete_module+0x130/0x19c
[df145f40] [c00142d8] ret_from_syscall+0x0/0x38
[...]
unregistered gadget driver 'g_ether'

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: fsl_qe_udc: Report disconnect before unbinding

Gadgets disable endpoints in their disconnect callbacks, so
we must call disconnect before unbinding. This also fixes
muram memory leak, since we free muram in the qe_ep_disable().

But mainly the patch fixes following badness:

root@b1:~# insmod fsl_qe_udc.ko
fsl_qe_udc: Freescale QE/CPM USB Device Controller driver, 1.0
fsl_qe_udc e01006c0.usb: QE USB controller initialized as device
root@b1:~# insmod g_ether.ko
g_ether gadget: using random self ethernet address
g_ether gadget: using random host ethernet address
usb0: MAC be:2d:3c:fa:be:f0
usb0: HOST MAC 62:b8:6a:df:38:66
g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
g_ether gadget: g_ether ready
fsl_qe_udc e01006c0.usb: fsl_qe_udc bind to driver g_ether
g_ether gadget: high speed config #1: CDC Ethernet (ECM)
root@b1:~# rmmod g_ether.ko
------------[ cut here ]------------
Badness at drivers/usb/gadget/composite.c:871
[...]
NIP [d10c1374] composite_unbind+0x24/0x15c [g_ether]
LR [d10a82f4] usb_gadget_unregister_driver+0x128/0x168 [fsl_qe_udc]
Call Trace:
[cfb93e80] [cfb1f3a0] 0xcfb1f3a0 (unreliable)
[cfb93eb0] [d10a82f4] usb_gadget_unregister_driver+0x128/0x168 [fsl_qe_udc]
[cfb93ed0] [d10c2a3c] usb_composite_unregister+0x3c/0x4c [g_ether]
[cfb93ee0] [c006bde0] sys_delete_module+0x130/0x19c
[cfb93f40] [c00142d8] ret_from_syscall+0x0/0x38
[...]
fsl_qe_udc e01006c0.usb: unregistered gadget driver 'g_ether'

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

USB: fix SB600 USB subsystem hang bug

This patch is required for all AMD SB600 revisions to avoid USB subsystem hang
symptom. The USB subsystem hang symptom is observed when the system has
multiple USB devices connected to it. In some cases a USB hub may be required
to observe this symptom.

Reported in bugzilla as #11599, the similar patch for SB700 old revision is:
commit b09bc6cbae4dd3a2d35722668ef2c502a7b8b093

Reported-by: raffaele <ralfconn@tele2.it>
Tested-by: Roman Mamedov <roman@rm.pp.ru>
Signed-off-by: Shane Huang <shane.huang@amd.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Revert "USB: improve ehci_watchdog's side effect in CPU power management"

This reverts commit f0d781d59cb621e1795d510039df973d0f8b23fc.

It was the wrong thing to do, and does not really do what it said
it did.

Cc: Yi Yang <yi.y.yang@intel.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

powerpc: Fix build for 32-bit SMP configs

attr_smt_snooze_delay is only defined for CONFIG_PPC64, so protect the
attribute removal with the same condition. This fixes this build error
on 32-bit SMP configurations:

/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c: In function ‘unregister_cpu_online’:
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: ‘attr_smt_snooze_delay’ undeclared (first use in this function)
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: (Each undeclared identifier is reported only once
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: for each function it appears in.)

Signed-off-by: Paul Mackerras <paulus@samba.org>

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: Fix copy'n'pasteo that broke VT switch if flushing was non-empty.

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc: Fix system calls on Cell entered with XER.SO=1
  powerpc/cell: Fix GDB watchpoints, again
  powerpc/mpic: Don't reset affinity for secondary MPIC on boot
  powerpc/cell/axon-msi: Retry on missing interrupt
  powerpc: Fix boot freeze on machine with empty memory node
  powerpc: Fix IRQ assignment for some PCIe devices
  powerpc/spufs: Fix spinning in spufs_ps_fault on signal
  powerpc/mpc832x_rdb: fix swapped ethernet ids
  powerpc: Use generic PHY driver for Marvell 88E1111 PHY on GE Fanuc SBC610
  powerpc/85xx: L2 cache size wrong in 8572DS dts
  powerpc/virtex: Update defconfigs
  powerpc/52xx: update defconfigs
  xsysace: Fix driver to use resource_size_t instead of unsigned long
  powerpc/virtex: fix various format/casting printk mismatches
  powerpc/mpc5200: fix bestcomm Kconfig dependencies
  powerpc/44x: Fix 460EX/460GT machine check handling
  powerpc/40x: Limit allocable DRAM during early mapping

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  Allow architectures to override copy_user_highpage()
  [ARM] pxa/palmtx: misc fixes to use generic GPIO API
  ARM: OMAP: Fixes for suspend / resume GPIO wake-up handling
  [ARM] pxa/corgi: update default config to exclude tosa from being built
  [ARM] pxa/pcm990: use negative number for an invalid GPIO in camera data
  ARM: OMAP: Typo fix for clock_allow_idle
  ARM: OMAP: Remove broken LCD driver for SX1
  [ARM] 5335/1: pxa25x_udc: Fix is_vbus_present to return 1 or 0
  [ARM] pxa/MioA701: bluetooth resume fix
  [ARM] pxa/MioA701: fix memory corruption.

drm/i915: Fix copy'n'pasteo that broke VT switch if flushing was non-empty.

Introduced in the "Avoid BUG_ONs on VT switch" commit.

Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>

powerpc: Fix system calls on Cell entered with XER.SO=1

It turns out that on Cell, on a kernel with CONFIG_VIRT_CPU_ACCOUNTING
= y, if a program sets the SO (summary overflow) bit in the XER and
then does a system call, the SO bit in CR0 will be set on return
regardless of whether the system call detected an error.  Since CR0.SO
is used as the error indication from the system call, this means that
all system calls appear to fail.

The reason is that the workaround for the timebase bug on Cell uses a
compare instruction.  With CONFIG_VIRT_CPU_ACCOUNTING = y, the
ACCOUNT_CPU_USER_ENTRY macro reads the timebase, so we end up doing a
compare instruction, which copies XER.SO to CR0.SO.  Since we were
doing this in the system call entry patch after clearing CR0.SO but
before saving the CR, this meant that the saved CR image had CR0.SO
set if XER.SO was set on entry.

This fixes it by moving the clearing of CR0.SO to after the
ACCOUNT_CPU_USER_ENTRY call in the system call entry path.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/cell: Fix GDB watchpoints, again

An earlier patch from Jens Osterkamp attempted to fix GDB
watchpoints by enabling the DABRX register at boot time.
Unfortunately, this did not work on SMP setups, where
secondary CPUs were still using the power-on DABRX value.

This introduces the same change for secondary CPUs on cell
as well.

Reported-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Tested-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

powerpc/mpic: Don't reset affinity for secondary MPIC on boot

Kexec/kdump currently fails on the IBM QS2x blades when the kexec happens
on a CPU other than the initial boot CPU. It turns out that this is the
result of mpic_init trying to set affinity of each interrupt vector to the
current boot CPU.

As far as I can tell, the same problem is likely to exist on any
secondary MPIC, because they have to deliver interrupts to the first
output all the time. There are two potential solutions for this: either
not set up affinity at all for secondary MPICs, or assume that a single
CPU output is connected to the upstream interrupt controller and hardcode
affinity to that per architecture.

This patch implements the second approach, defaulting to the first output.
Currently, all known secondary MPICs are routed to their upstream port
using the first destination, so we hardcode that.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

powerpc/cell/axon-msi: Retry on missing interrupt

The MSI capture logic for the axon bridge can sometimes
lose interrupts in case of high DMA and interrupt load,
when it signals an MSI interrupt to the MPIC interrupt
controller while we are already handling another MSI.

Each MSI vector gets written into a FIFO buffer in main
memory using DMA, and that DMA access is normally flushed
by the actual interrupt packet on the IOIF.  An MMIO
register in the MSIC holds the position of the last
entry in the FIFO buffer that was written.  However,
reading that position does not flush the DMA, so that
we can observe stale data in the buffer.

In a stress test, we have observed the DMA to arrive
up to 14 microseconds after reading the register.

This patch works around this problem by retrying the
access to the FIFO buffer.

We can reliably detect the conditioning by writing
an invalid MSI vector into the FIFO buffer after
reading from it, assuming that all MSIs we get
are valid.  After detecting an invalid MSI vector,
we udelay(1) in the interrupt cascade for up to
100 times before giving up.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>

powerpc: Fix boot freeze on machine with empty memory node

I got a bug report about a distro kernel not booting on a particular
machine.  It would freeze during boot:

> ...
> Could not find start_pfn for node 1
> [boot]0015 Setup Done
> Built 2 zonelists in Node order, mobility grouping on.  Total pages: 123783
> Policy zone: DMA
> Kernel command line:
> [boot]0020 XICS Init
> [boot]0021 XICS Done
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> clocksource: timebase mult[7d0000] shift[22] registered
> Console: colour dummy device 80x25
> console handover: boot [udbg0] -> real [hvc0]
> Dentry cache hash table entries: 1048576 (order: 7, 8388608 bytes)
> Inode-cache hash table entries: 524288 (order: 6, 4194304 bytes)
> freeing bootmem node 0

I've reproduced this on 2.6.27.7.  It is caused by commit
8f64e1f2d1e09267ac926e15090fd505c1c0cbcb ("powerpc: Reserve in bootmem
lmb reserved regions that cross NUMA nodes").

The problem is that Jon took a loop which was (in pseudocode):

for_each_node(nid)
NODE_DATA(nid) = careful_alloc(nid);
setup_bootmem(nid);
reserve_node_bootmem(nid);

and broke it up into:

for_each_node(nid)
NODE_DATA(nid) = careful_alloc(nid);
setup_bootmem(nid);
for_each_node(nid)
reserve_node_bootmem(nid);

The issue comes in when the 'careful_alloc()' is called on a node with
no memory.  It falls back to using bootmem from a previously-initialized
node.  But, bootmem has not yet been reserved when Jon's patch is
applied.  It gives back bogus memory (0xc000000000000000) and pukes
later in boot.

The following patch collapses the loop back together.  It also breaks
the mark_reserved_regions_for_nid() code out into a function and adds
some comments.  I think a huge part of introducing this bug is because
for loop was too long and hard to read.

The actual bug fix here is the:

+ if (end_pfn <= node->node_start_pfn ||
+     start_pfn >= node_end_pfn)
+ continue;

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

powerpc: Fix IRQ assignment for some PCIe devices

Currently, some PCIe devices on POWER6 machines do not get interrupts
assigned correctly. The problem is that OF doesn't create an
"interrupt" property for them. The fix is for of_irq_map_pci to fall
back to using the value in the PCI interrupt-pin register in config
space, as we do when there is no OF device-tree node for the device.

I have verified that this works fine with a pair of Squib-E SAS
adapter on a P6-570.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
  parisc: struct device - replace bus_id with dev_name(), dev_set_name()
  parisc: fix kernel crash when unwinding a userspace process
  parisc: __kernel_time_t is always long

Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] fix regression in cifs_write_begin/cifs_write_end

Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: prevent divide by zero error in cpu_avg_load_per_task, update
  sched, cpusets: fix warning in kernel/cpuset.c
  sched: prevent divide by zero error in cpu_avg_load_per_task

Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq.h: fix missing/extra kernel-doc
  genirq: __irq_set_trigger: change pr_warning to pr_debug
  irq: fix typo
  x86: apic honour irq affinity which was set in early boot
  genirq: fix the affinity setting in setup_irq
  genirq: keep affinities set from userspace across free/request_irq()

Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: consistent alignement for lockdep info