Steven Rostedt [Wed, 3 Dec 2008 20:36:57 +0000 (15:36 -0500)]
ftrace: graph of a single function
This patch adds the file:
/debugfs/tracing/set_graph_function
which can be used along with the function graph tracer.
When this file is empty, the function graph tracer will act as
usual. When the file has a function in it, the function graph
tracer will only trace that function.
Note, if you have function graph running while doing this, the small
time between clearing it and updating it will cause the graph to
record all functions. This should not be an issue because after
it sets the filter, only those functions will be recorded from then on.
If you need to only record a particular function then set this
file first before starting the function graph tracer. In the future
this side effect may be corrected.
The set_graph_function file is similar to the set_ftrace_filter but
it does not take wild cards nor does it allow for more than one
function to be set with a single write. There is no technical reason why
this is the case, I just do not have the time yet to implement that.
Note, dynamic ftrace must be enabled for this to appear because it
uses the dynamic ftrace records to match the name to the mcount
call sites.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
The new task is free to run as soon as the tasklist_lock is released.
This is before the ftrace_graph_init_task. If the task does run
it will be using the same ret_stack and curr_ret_stack as the parent.
This will cause crashes that are difficult to debug.
This patch moves the ftrace_graph_init_task to just after the alloc_pid
code. This fixes the above race.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 16:04:50 +0000 (11:04 -0500)]
trace: fix output of stack trace
Impact: fix to output of stack trace
If a function is not found in the stack of the stack tracer, the
number printed is quite strange. This fixes the algorithm to handle
missing functions better.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: better trace output of duration for long calls
The old duration output could not exceed 9999.999 us, so that it fit the
column, and the nanosecs were always printed with 3 digits. As Ingo suggested,
it's better to show the whole elapsed microseconds and reduce the nanosecs
precision if needed to fit the maximum of 7 digits. If the usecs need more
digits, the case should be rare and important enough to break the column
alignment a bit to show it.
So, depending on the duration value, we now have these patterns:
u.nnn us
uu.nnn us
uuu.nnn us
uuuu.nnn us
uuuuu.nn us
uuuuuu.n us
uuuuuuuu..... us
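The patterns above can be sketched in user-space C as follows. This is only an illustration of the digit budget, not the code from the patch: the name format_duration() is hypothetical, and truncating (rather than rounding) the dropped nanosecond digits is an assumption.

```c
#include <stdio.h>

/*
 * Illustrative only: format a duration given in nanoseconds following the
 * patterns listed in the commit message. The function name and the choice
 * to truncate dropped digits are assumptions, not taken from the patch.
 */
static int format_duration(unsigned long long ns, char *buf, size_t len)
{
	unsigned long long usecs = ns / 1000;
	unsigned int nsecs = (unsigned int)(ns % 1000);
	char usecs_str[24];
	int frac_digits;

	/* Digits left for the nanosecond part out of a budget of 7 */
	frac_digits = 7 - snprintf(usecs_str, sizeof(usecs_str), "%llu", usecs);
	if (frac_digits > 3)
		frac_digits = 3;

	/* Seven or more microsecond digits: no room for a fraction */
	if (frac_digits <= 0)
		return snprintf(buf, len, "%s us", usecs_str);

	/* Drop the low-order nanosecond digits that do not fit */
	if (frac_digits == 2)
		nsecs /= 10;
	else if (frac_digits == 1)
		nsecs /= 100;

	return snprintf(buf, len, "%s.%0*u us", usecs_str, frac_digits, nsecs);
}
```

A duration of 12345678 ns has five microsecond digits, so only two nanosecond digits are kept; longer durations progressively lose the fraction.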
tracing/function-graph-tracer: display unified style cmdline and pid
Impact: extend function-graph output: let one know which thread called a function
This patch implements a helper function to print the cmdline/pid pair.
Its output is provided during task switching and on each row if the new
"funcgraph-proc" default-off option is set through the trace_options file.
The output is center aligned and never exceeds 14 characters. The cmdline
is truncated to 7 chars.
But note that if the pid exceeds 6 characters, the column will overflow (but
the situation is abnormal).
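A minimal user-space sketch of the column layout described above: 7 characters of cmdline, a "-" separator and up to 6 pid digits fill the 14-character column, and shorter strings are centered. The name format_proc(), the "-" separator and putting the odd padding space on the right are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

#define PROCINFO_WIDTH	14	/* column width stated in the commit message */
#define COMM_MAX	7	/* cmdline is cut to this many characters */

/* Illustrative only: build a centered "cmdline-pid" column entry. */
static int format_proc(const char *cmdline, int pid, char *buf, size_t len)
{
	char text[32];
	int n, left, right;

	/* "%.*s" keeps at most COMM_MAX characters of the cmdline */
	n = snprintf(text, sizeof(text), "%.*s-%d", COMM_MAX, cmdline, pid);

	/* Too wide to pad: print as is (the column overflows for long pids) */
	if (n >= PROCINFO_WIDTH)
		return snprintf(buf, len, "%s", text);

	left = (PROCINFO_WIDTH - n) / 2;
	right = PROCINFO_WIDTH - n - left;
	return snprintf(buf, len, "%*s%s%*s", left, "", text, right, "");
}
```

With a 7-digit pid the text is 15 characters long, which is the overflow case the commit message mentions.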
Steven Rostedt [Wed, 3 Dec 2008 04:50:05 +0000 (23:50 -0500)]
ftrace: function graph return for function entry
Impact: feature, let entry function decide to trace or not
This patch lets the graph tracer entry function decide if the tracing
should be done at the end as well. This requires every function graph
entry function to return 1 if the return should be traced, or 0 if the
return should not be traced.
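The entry/return contract can be modeled in plain C as below. This is a toy model, not the kernel hook: run_traced(), trace_all(), trace_none() and struct call_model are hypothetical names, and the counters only stand in for the real entry and return callbacks.

```c
/* Illustrative model of "the entry hook decides whether the return is traced". */
struct call_model {
	int entry_calls;	/* how often the entry hook ran */
	int return_calls;	/* how often the return hook ran */
};

typedef int (*entry_fn)(unsigned long func);

/* Entry hooks return 1 to trace the return as well, 0 to skip it */
static int trace_all(unsigned long func)  { (void)func; return 1; }
static int trace_none(unsigned long func) { (void)func; return 0; }

static void run_traced(entry_fn entry, unsigned long func, struct call_model *m)
{
	int trace_return;

	m->entry_calls++;
	trace_return = entry(func);

	/* ... the traced function body would run here ... */

	/* The return side is only recorded if the entry hook asked for it */
	if (trace_return)
		m->return_calls++;
}
```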
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:04 +0000 (23:50 -0500)]
ftrace: print real return in dumpstack for function graph
Impact: better dumpstack output
I noticed in my crash dumps and even in the stack tracer that a
lot of functions listed in the stack trace are simply
return_to_handler, which is the function graph tracer's way of inserting
its own call into the return of a function.
But we lose where the function was actually called from.
This patch adds in hooks to the dumpstack mechanism that detects
this and finds the real function to print. Both are printed to
let the user know that a hook is still in place.
This does give a funny side effect in the stack tracer output:
Steven Rostedt [Wed, 3 Dec 2008 04:50:03 +0000 (23:50 -0500)]
ring-buffer: change "page" variable names to "bpage"
Impact: clean up
Andrew Morton pointed out that, by kernel convention, a variable
named page should be of type struct page. The ring buffer uses
a variable named "page" for a pointer to something else.
This patch converts those to be called "bpage" (as in "buffer page").
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Wed, 3 Dec 2008 04:50:02 +0000 (23:50 -0500)]
ftrace: add ftrace_graph_stop()
Impact: new ftrace_graph_stop function
While developing more features of function graph, I hit a bug that
caused the WARN_ON to trigger in the prepare_ftrace_return function.
Well, it was hard for me to find out that this was happening because the
warning would not print; it would just cause a hard lockup or reboot.
The reason is that it is not safe to call printk from this function.
Looking further, I also found that it calls unregister_ftrace_graph,
which grabs a mutex and calls kstop machine. This would definitely
lock the box up if it were to trigger.
This patch adds a fast and safe ftrace_graph_stop() which will
stop the function tracer. Then it is safe to call the WARN_ON.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:07 +0000 (15:34 -0500)]
ring-buffer: read page interface
Impact: new API to ring buffer
This patch adds a new interface into the ring buffer that allows a
page to be read from the ring buffer on a given CPU. For every page
read, one must also be given to allow for a "swap" of the pages.
	rpage = ring_buffer_alloc_read_page(buffer);
	if (!rpage)
		goto err;
	ret = ring_buffer_read_page(buffer, &rpage, cpu, full);
	if (!ret)
		goto empty;

	process_page(rpage);
	ring_buffer_free_read_page(rpage);
The caller of these functions must handle any waits that are
needed to wait for new data. The ring_buffer_read_page will simply
return 0 if there is no data, or if "full" is set and the writer
is still on the current page.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Tue, 2 Dec 2008 20:34:06 +0000 (15:34 -0500)]
ring-buffer: move some metadata into buffer page
Impact: get ready for splice changes
This patch moves the commit and timestamp into the beginning of each
data page of the buffer. This change will allow the page to be moved
to another location (disk, network, etc) and still have information
in the page to be able to read it.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend and enable the function graph tracer to 64-bit x86
This patch implements the support for function graph tracer under x86-64.
Both static and dynamic tracing are supported.
This causes some small CPP conditional asm in arch/x86/kernel/ftrace.c. I
wanted to use probe_kernel_read/write to make the return address
saving/patching code more generic, but it causes tracing recursion.
It would perhaps be useful to implement a notrace version of these
functions for ports to other archs.
Note that arch/x86/process_64.c is not traced, as in X86-32. I first
thought __switch_to() was responsible for crashes during tracing because I
believed the current task was changed inside it, but that's actually not the
case (actually yes, but not the "current" pointer).
So I will have to investigate to find the functions that cause harm here, to
enable tracing of the other functions inside (but there is no issue at
this time, while process_64.c stays out of -pg flags).
A small possible race condition is fixed inside this patch too. When the
tracer allocates a return stack dynamically, the current depth is not
initialized before but after. An interrupt could occur at this time and,
after seeing that the return stack is allocated, the tracer could try to
trace it with a random uninitialized depth. It's a precaution, even if I
haven't had problems with it.
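The ordering issue in that last paragraph can be sketched in user-space C. This is a simplified model, not the patch itself: struct task_model and graph_init_task() are hypothetical names, and atomic_signal_fence() is used here only as a stand-in for a compiler barrier that keeps the two stores in order with respect to code interrupting the same CPU.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Illustrative model of the two per-task fields involved in the race. */
struct task_model {
	int curr_ret_stack;		/* current depth, -1 when empty */
	unsigned long *ret_stack;	/* NULL until a stack is attached */
};

/* Returns 0 on success, -1 if the stack could not be allocated. */
static int graph_init_task(struct task_model *t, size_t entries)
{
	unsigned long *stack = calloc(entries, sizeof(*stack));

	if (!stack)
		return -1;

	/* Initialize the depth first ... */
	t->curr_ret_stack = -1;

	/* ... and keep the compiler from reordering the two stores ... */
	atomic_signal_fence(memory_order_seq_cst);

	/* ... so anything that sees a non-NULL stack also sees a valid depth */
	t->ret_stack = stack;
	return 0;
}
```

The point of the ordering is that the non-NULL ret_stack pointer acts as the "ready" flag, so it has to be written last.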
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Tim Bird <tim.bird@am.sony.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Liming Wang [Tue, 2 Dec 2008 02:33:08 +0000 (10:33 +0800)]
function trace: fix a bug of single thread function trace
Impact: fix "no output from tracer" bug caused by ftrace_update_pid_func()
When disabling single thread function trace using
"echo -1 > set_ftrace_pid", the normal function trace
has to be restored to the original function, otherwise the normal
function trace will not work well.
Linus Torvalds [Tue, 2 Dec 2008 03:56:34 +0000 (19:56 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (25 commits)
em28xx: remove backward compat macro added on a previous fix
V4L/DVB (9748): em28xx: fix compile warning
V4L/DVB (9743): em28xx: fix oops audio
V4L/DVB (9742): em28xx-alsa: implement another locking schema
V4L/DVB (9732): sms1xxx: use new firmware for Hauppauge WinTV MiniStick
V4L/DVB (9691): gspca: Move the video device to a separate area.
V4L/DVB (9690): gspca: Lock the subdrivers via module_get/put.
V4L/DVB (9689): gspca: Memory leak when disconnect while streaming.
V4L/DVB (9668): em28xx: fix a race condition with hald
V4L/DVB (9664): af9015: don't reconnect device in USB-bus
V4L/DVB (9647): em28xx: void having two concurrent control URB's
V4L/DVB (9646): em28xx: avoid allocating/dealocating memory on every control urb
V4L/DVB (9645): em28xx: Avoid memory leaks if registration fails
V4L/DVB (9639): Make dib0700 remote control support work with firmware v1.20
V4L/DVB (9635): v4l: s2255drv fix firmware test on big-endian
V4L/DVB (9634): Make sure the i2c gate is open before powering down tuner
V4L/DVB (9632): make em28xx aux audio input work
V4L/DVB (9631): Make s2api work for ATSC support
V4L/DVB (9627): em28xx: Avoid i2c register error for boards without eeprom
V4L/DVB (9608): Fix section mismatch warning for dm1105 during make
...
Andrew Morton [Mon, 1 Dec 2008 21:14:08 +0000 (13:14 -0800)]
drivers/gpu/drm/i915/i915_irq.c: fix warning
drivers/gpu/drm/i915/i915_irq.c: In function 'i915_disable_pipestat':
drivers/gpu/drm/i915/i915_irq.c:101: warning: control may reach end of non-void function 'i915_pipestat' being inlined
Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarkko Lavinen [Mon, 1 Dec 2008 21:14:08 +0000 (13:14 -0800)]
i82875p_edac: fix module remove
Fix module removal bugs of i82875p_edac. Also i82975x_edac code seems to
have the same module removal bugs as in i82875p_edac.
The problems were:
1. In module removal i82875p_remove_one() is never called.
Variable i82875p_registered is never changed from 1, which
guarantees i82875p_remove_one() is not called (and even if it were
called, it would be called in the wrong order).
As a result, the edac_mc workqueue is not stopped and keeps probing.
If kernel debugging options are not enabled, the user may not notice
anything going wrong.
If debugging options are enabled and I do "rmmod i82875p_edac", I
get:
The fix for this is to get rid of the needless variable i82875p_registered
altogether and run i82875p_remove_one() *before*
pci_unregister_driver().
2. edac_mc_del_mc() uses mci after freeing mci
edac_mc_del_mc() calls edac_remove_sysfs_mci_device(). The
kobject refcount of mci drops to 0 and mci is freed. After this
mci is accessed via debug print and i82875p_remove_one() still
uses mci->pvt and tries to free mci again with edac_mc_free().
The fix for this is to add kobject_get(&mci->edac_mci_kobj) after
edac_mc_alloc(). Then the mci is still available after returning
from edac_mc_del_mc() with refcount 1, and mci->pvt is still
available. When i82875p_remove_one() finally calls edac_mc_free(),
this will cause kobject_put() and mci is released properly.
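The reference counting in problem 2 can be modeled with a small user-space sketch. The names mci_alloc(), mci_get(), mci_put() and struct mci_model are hypothetical stand-ins for the edac and kobject calls named above; the sketch only shows why the extra reference keeps the object alive across the first put.

```c
#include <stdlib.h>

struct mci_model {
	int refcount;	/* stands in for the kobject reference count */
	int pvt;	/* stands in for mci->pvt */
};

static int release_count;	/* how many times an object was released */

/* Stand-in for edac_mc_alloc(): the object starts with one reference */
static struct mci_model *mci_alloc(void)
{
	struct mci_model *mci = calloc(1, sizeof(*mci));

	if (mci)
		mci->refcount = 1;
	return mci;
}

/* Stand-in for kobject_get() */
static void mci_get(struct mci_model *mci)
{
	mci->refcount++;
}

/* Stand-in for kobject_put(): the last put releases the object */
static void mci_put(struct mci_model *mci)
{
	if (--mci->refcount == 0) {
		release_count++;
		free(mci);
	}
}
```

In this model, mci_alloc() followed by mci_get() leaves two references, so the put done during removal leaves the object valid with refcount 1, and the final put releases it exactly once.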
Signed-off-by: Jarkko Lavinen <jlavi@iki.fi> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jarkko Lavinen [Mon, 1 Dec 2008 21:14:06 +0000 (13:14 -0800)]
i82875p_edac: fix overflow device resource setup
When I do "modprobe i82875p_edac" on my Asus P4C800 MB on kernels 2.6.26
or later, the module load fails due to BAR 0 collision. On 2.6.25 the
module loads just fine.
The overflow device on the MB seems to be hidden and its resources are not
allocated at normal PCI bus init. Log shows the missing resource problem:
EDAC DEBUG: i82875p_probe1()
PCI: 0000:00:06.0 reg 10 32bit mmio: [fecf0000, fecf0fff]
pci 0000:00:06.0: device not available because of BAR 0
[0xfecf0000-0xfecf0fff] collisions
EDAC i82875p: i82875p_setup_overfl_dev(): Failed to enable overflow
device
The patch below fixes this by calling pci_bus_assign_resources() after
the overflow device is revealed and added to the bus. With this patch
I am again able to load and use the module.
Signed-off-by: Jarkko Lavinen <jlavi@iki.fi> Cc: Doug Thompson <norsk5@yahoo.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The commit aef7db4bd5a3b6068dfa05919a3d685199eed116 fixed the problem with
recursive locking in fb blanking code if blank is caused by user setting
the /sys/class/graphics/fb*/blank. However this broke the fbcon timeout
blanking.
If you use a driver that defines the ->fb_blank operation and at the same time
that driver relies on another driver (e.g. backlight or lcd class) to blank
the screen, when the fbcon times out and tries to blank the fb, it will
call only the fb driver's blanker and won't notify the other driver. Thus FB
output is disabled, but the screen isn't blanked.
Restore fbcon blanking and at the same time apply the proper fix for the
above problem: if fbcon_blank is called with FBINFO_FLAG_USEREVENT, we are
already called through notification from fb_blank, thus we don't have to
blank the fb again.
Randy Dunlap [Mon, 1 Dec 2008 21:14:04 +0000 (13:14 -0800)]
ntfs: don't fool kernel-doc
kernel-doc handles macros now (it has for quite some time), so change the
ntfs_debug() macro's kernel-doc to be just before the macro instead of
before a phony function prototype.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Anton Altaparmakov <aia21@cantab.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Mon, 1 Dec 2008 21:14:03 +0000 (13:14 -0800)]
kernel-doc: handle varargs cleanly
The method for listing varargs in kernel-doc notation is:
* @...: these arguments are printed by the @fmt argument
but scripts/kernel-doc is confused: it always lists varargs as:
... variable arguments
and ignores the @...: line's description, but then prints that
line after the list of function parameters as though it's
not part of the function parameters.
This patch makes kernel-doc print the supplied @... description if it is
present; otherwise a boilerplate "variable arguments" is printed.
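For reference, a kernel-doc comment that documents its varargs with an "@...:" line could look like the sketch below. The function log_msg() is a hypothetical user-space example, not something from the kernel tree.

```c
#include <stdarg.h>
#include <stdio.h>

/**
 * log_msg - format a message into a buffer
 * @buf: destination buffer
 * @len: size of @buf in bytes
 * @fmt: printf-style format string
 * @...: these arguments are printed by the @fmt argument
 *
 * Returns the number of characters that would have been written,
 * not counting the terminating NUL.
 */
static int log_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```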
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Mon, 1 Dec 2008 21:14:00 +0000 (13:14 -0800)]
frv: fix mmap2 error handling
Fix the error handling in sys_mmap2(). Currently, if the pgoff check
fails, fput() might have to be called (which it isn't), so do the pgoff
check first, before fget() is called.
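The reordering can be illustrated with a toy model in which validation happens before any reference is taken, so the early error return has nothing to release. Everything here is hypothetical: mmap2_model(), model_fget(), model_fput(), the 16 KiB page size and the exact form of the pgoff check are assumptions for illustration, not the frv code.

```c
#include <errno.h>

#define PAGE_SHIFT_MODEL 14	/* hypothetical 16 KiB pages */

static int file_refs;	/* references currently held by the model */

static void model_fget(void) { file_refs++; }	/* stand-in for fget() */
static void model_fput(void) { file_refs--; }	/* stand-in for fput() */

/* pgoff is assumed to be in 4 KiB units and must be page aligned */
static long mmap2_model(unsigned long pgoff)
{
	/* Validate first: failing here leaves no reference to drop */
	if (pgoff & ((1UL << (PAGE_SHIFT_MODEL - 12)) - 1))
		return -EINVAL;

	model_fget();
	/* ... the mapping itself would be set up here ... */
	model_fput();
	return 0;
}
```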
Signed-off-by: David Howells <dhowells@redhat.com> Reported-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The description for 'D' was missing in the comment... (causing me a
minute of WTF followed by looking at more of the code)
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
radeonfb: fix problem with color expansion & alignment
The engine on some radeon variants locks up if color expansion is called
for non-aligned source data. This patch enables a feature of the core
fbdev to request aligned input pixmaps and uses the HW clipping engine to
clip the output to the requested size.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: James Cloos <cloos@jhcloos.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: "David S. Miller" <davem@davemloft.net> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Mon, 1 Dec 2008 21:13:56 +0000 (13:13 -0800)]
spi: fix spi_s3c24xx_gpio device handle lookup
The spidev_to_sg() call in spi_s3c24xx_gpio.c was using the wrong method
to convert the spi device into the private data for the driver. Fix this
by using spi_master_get_devdata.
Signed-off-by: Ben Dooks <ben-linux@fluff.org> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Nikitenko [Mon, 1 Dec 2008 21:13:56 +0000 (13:13 -0800)]
spi: au1550_spi full duplex dma fix
Fix unsafe order in dma mapping operation: always flush data from the
cache *BEFORE* invalidating it, to allow full duplex transfers where the
same buffer may be used for both writes and reads. Tested with mmc-spi.
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davide Libenzi [Mon, 1 Dec 2008 21:13:55 +0000 (13:13 -0800)]
epoll: introduce resource usage limits
It has been thought that the per-user file descriptors limit would also
limit the resources that a normal user can request via the epoll
interface. Vegard Nossum reported a very simple program (a modified
version attached) that lets a normal user request a pretty large
amount of kernel memory, well within its maximum number of fds. To
solve this problem, default limits are now imposed, and /proc based
configuration has been introduced. A new directory has been created,
named /proc/sys/fs/epoll/, and inside it there are two configuration
points:
max_user_instances = Maximum number of devices - per user
max_user_watches = Maximum number of "watched" fds - per user
The current default for "max_user_watches" limits the memory used by epoll
to store "watches" to 1/32 of the amount of low RAM. As an example, a
256MB 32bit machine will have "max_user_watches" set to roughly 90000.
That should be enough to not break existing heavy epoll users. The
default value for "max_user_instances" is set to 128, which should be
enough too.
This also changes the userspace, because a new error code can now come out
from EPOLL_CTL_ADD (-ENOSPC). The EMFILE from epoll_create() was already
listed, so that should be ok.
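On the userspace side, a caller of EPOLL_CTL_ADD may want to recognize the new error. The helper below is a minimal sketch: add_watch() is a hypothetical name and the message text is made up, while epoll_ctl(), EPOLL_CTL_ADD and ENOSPC are the real interfaces named above.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/epoll.h>

/* Illustrative helper: returns 0 on success, -errno on failure. */
static int add_watch(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data = { .fd = fd } };
	int err;

	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
		return 0;

	/* Save errno before calling anything that might change it */
	err = errno;
	if (err == ENOSPC)
		fprintf(stderr, "epoll: per-user watch limit reached, see "
			"/proc/sys/fs/epoll/max_user_watches\n");
	return -err;
}
```

Taking the quoted figures at face value, 1/32 of 256MB is 8MB, which for roughly 90000 watches works out to about 93 bytes per watch.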
[akpm@linux-foundation.org: use get_current_user()] Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <stable@kernel.org> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Reported-by: Vegard Nossum <vegardno@ifi.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefano Babic [Mon, 1 Dec 2008 21:13:53 +0000 (13:13 -0800)]
spi: mpc52xx_psc_spi chipselect bugfix
According to the manual the "tdfOnExit" flag must be set on the last byte
we want to send. The PSC controller holds SS low until the flag is set.
However, the flag was always set on the last byte of the FIFO,
regardless of whether it was the last byte of the transfer. This generates
spurious toggling of the SS signals that breaks the protocol of some
peripherals. Fix.
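The difference between "last byte of the FIFO" and "last byte of the transfer" can be shown with a small model that sends a transfer in FIFO-sized chunks. FIFO_SIZE, TDF_ON_EXIT and queue_transfer() are hypothetical values and names chosen for the sketch; they do not reflect the real PSC register layout.

```c
#include <stddef.h>

#define FIFO_SIZE	4	/* hypothetical FIFO depth in bytes */
#define TDF_ON_EXIT	0x100u	/* hypothetical flag bit carried with a byte */

/*
 * Illustrative only: queue len bytes in FIFO-sized chunks, writing one
 * output word per byte into out[]. Returns the number of chunks used.
 */
static size_t queue_transfer(const unsigned char *data, size_t len,
			     unsigned int *out)
{
	size_t sent = 0, chunks = 0, i;

	while (sent < len) {
		size_t n = len - sent;

		if (n > FIFO_SIZE)
			n = FIFO_SIZE;

		for (i = 0; i < n; i++) {
			unsigned int word = data[sent + i];

			/*
			 * The buggy variant would test i == n - 1 (end of
			 * this chunk). Flag only the end of the transfer.
			 */
			if (sent + i == len - 1)
				word |= TDF_ON_EXIT;
			out[sent + i] = word;
		}
		sent += n;
		chunks++;
	}
	return chunks;
}
```

A 10-byte transfer with a 4-byte FIFO takes three chunks, and only the tenth byte carries the flag, so SS would stay asserted across the chunk boundaries.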
Signed-off-by: Stefano Babic <sbabic@denx.de> Acked-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Takashi Iwai [Mon, 1 Dec 2008 21:13:49 +0000 (13:13 -0800)]
parport_serial: fix array overflow
The netmos_9xx5_combo type assumes that the PCI SSID always provides the
correct value for the number of parallel and serial ports, but there are
indeed broken devices with wrong numbers, which may result in an Oops.
This patch simply adds a check of the array range.
While memory hotplug allocates/frees memmap, page_cgroup doesn't free
page_cgroup at OFFLINE when page_cgroup was allocated via bootmem.
(Because freeing bootmem requires special care.)
Then, if page_cgroup is allocated by bootmem and memmap is freed/allocated
by memory hotplug, page_cgroup->page == page is no longer true.
But the current MEM_ONLINE handler doesn't check this or update
page_cgroup->page if it's not necessary to allocate page_cgroup. (This
was not found because memmap is not freed if SPARSEMEM_VMEMMAP is y.)
And I noticed that MEM_ONLINE can be called against "part of section".
So, freeing page_cgroup at CANCEL_ONLINE will cause trouble (freeing
a page_cgroup that is in use). Don't rollback at CANCEL.
One more thing: the current memory hotplug notifier is stopped by slub
because it sets NOTIFY_STOP_MASK in its return value. So page_cgroup's
callback is never called. (It has lower priority than slub now.)
I think this slub behavior is not intentional (a bug), and this patch fixes it.
Another way to be considered about page_cgroup allocation:
- free page_cgroup at OFFLINE even if it's from bootmem
and remove the special handler. But it requires more changes.
Nick Piggin [Mon, 1 Dec 2008 21:13:47 +0000 (13:13 -0800)]
mm: vmalloc fix lazy unmapping cache aliasing
Jim Radford has reported that the vmap subsystem rewrite was sometimes
causing his VIVT ARM system to behave strangely (seemed like going into
infinite loops trying to fault in pages to userspace).
We determined that the problem was most likely due to a cache aliasing
issue. flush_cache_vunmap was only being called at the moment the page
tables were to be taken down, however with lazy unmapping, this can happen
after the page has subsequently been freed and allocated for something
else. The dangling alias may still have dirty data attached to it.
The fix for this problem is to do the cache flushing when the caller has
called vunmap -- it would be a bug for them to write anything else to the
mapping at that point.
That appeared to solve Jim's problems.
Reported-by: Jim Radford <radford@blackbean.org> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 2 Dec 2008 02:56:55 +0000 (18:56 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
ocfs2: fix regression in ocfs2_read_blocks_sync()
ocfs2: fix return value set in init_dlmfs_fs()
ocfs2: Small documentation update
ocfs2: fix wake_up in unlock_ast
ocfs2: initialize stack_user lvbptr
ocfs2: comments typo fix
Mark Fasheh [Fri, 21 Nov 2008 22:06:55 +0000 (14:06 -0800)]
ocfs2: fix regression in ocfs2_read_blocks_sync()
We're panicking in ocfs2_read_blocks_sync() if a jbd-managed buffer is seen.
At first glance, this seems ok but in reality it can happen. My test case
was to just run 'exorcist'. A struct inode is being pushed out of memory but
is then re-read at a later time, before the buffer has been checkpointed by
jbd. This causes a BUG to be hit in ocfs2_read_blocks_sync().
Reviewed-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Coly Li [Mon, 17 Nov 2008 04:38:22 +0000 (12:38 +0800)]
ocfs2: fix return value set in init_dlmfs_fs()
In init_dlmfs_fs(), if calling kmem_cache_create() failed, the code will use the return value from
calling bdi_init(). The correct behavior should be to set status to -ENOMEM before going to "bail:".
Signed-off-by: Coly Li <coyli@suse.de> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Linus Torvalds [Mon, 1 Dec 2008 19:23:33 +0000 (11:23 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ
[libata] pata_rb532_cf: fix signature of the xfer function
[libata] pata_rb532_cf: fix and rename register definitions
ata_piix: add borked Tecra M4 to broken suspend list
Linus Torvalds [Mon, 1 Dec 2008 19:01:54 +0000 (11:01 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Fix MTT leakage in resize CQ
IB/ehca: Fix problem with generated flush work completions
IB/ehca: Change misleading error message on memory hotplug
mlx4_core: Save/restore default port IB capability mask
Tejun Heo [Thu, 27 Nov 2008 04:36:48 +0000 (13:36 +0900)]
libata: blacklist Seagate drives which time out FLUSH_CACHE when used with NCQ
Some recent Seagate harddrives have a firmware bug which causes FLUSH
CACHE to time out under certain circumstances if NCQ is being used.
This can be worked around by disabling NCQ and fixed by updating the
firmware. Implement ATA_HORKAGE_FIRMWARE_UPDATE and blacklist these
devices.
The wiki page has been updated to contain information on this issue.
http://ata.wiki.kernel.org/index.php/Known_issues
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Phil Sutter [Fri, 28 Nov 2008 19:48:35 +0000 (20:48 +0100)]
[libata] pata_rb532_cf: fix signature of the xfer function
Per definition, this function should return the number of bytes
consumed. As the original parameter "buflen" is being decremented inside
the read/write loop, save it in "retlen" at the beginning.
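The pattern is simple enough to show in a few lines. xfer_model() below is a hypothetical stand-in that copies memory instead of doing device IO; only the "buflen" and "retlen" names come from the commit message.

```c
/*
 * Illustrative only: buflen is consumed by the loop, so the original
 * length is saved in retlen first and returned as the byte count.
 */
static unsigned int xfer_model(unsigned char *dst, const unsigned char *src,
			       unsigned int buflen)
{
	unsigned int retlen = buflen;

	/* The loop counts buflen down to zero while moving the data */
	for (; buflen > 0; buflen--)
		*dst++ = *src++;

	return retlen;
}
```

Returning buflen after the loop would always yield 0, which is why the count has to be captured up front.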
Signed-off-by: Phil Sutter <n0-1@freewrt.org> Acked-by: Sergei Shtyltov <sshtylyov@ru.mvista.com> Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Acked-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Phil Sutter [Fri, 28 Nov 2008 19:48:26 +0000 (20:48 +0100)]
[libata] pata_rb532_cf: fix and rename register definitions
The original standalone driver uses a custom address for the error
register. Use it in pata_rb532_cf, too.
Rename two register definitions:
- The address offset 0x0800 in fact is the ATA base, not ATA command
address.
- The offset 0x0C00 is not a regular ATA data address, but a buffered one
allowing 4-byte IO.
Signed-off-by: Phil Sutter <n0-1@freewrt.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Sat, 29 Nov 2008 13:37:21 +0000 (22:37 +0900)]
ata_piix: add borked Tecra M4 to broken suspend list
Tecra M4 sometimes forgets what it is and reports bogus data via DMI
which makes the machine evade broken suspend matching and thus fail
suspend/resume. This patch updates piix_broken_suspend() such that it
can match such cases. As the borked DMI data is a bit generic,
matching many entries to make the match more specific is necessary.
As the usual DMI matching is limited to four entries, this patch uses
hard coded manual matching.
This is reported by Alexandru Romanescu.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Alexandru Romanescu <a_romanescu@yahoo.co.uk> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When resizing a CQ, MTTs associated with the old CQE buffer were not
freed. As a result, if any app used resize CQ repeatedly, all MTTs
were eventually exhausted, which led to all memory registration
operations failing until the driver is reloaded.
Once the RESIZE_CQ command returns successfully from FW, FW no longer
accesses the old CQ buffer, so it is safe to deallocate the MTT
entries used by the old CQ buffer.
Finally, if the RESIZE_CQ command fails, the MTTs allocated for the
new CQEs buffer also need to be de-allocated.
This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1416>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stefan Roscher [Mon, 1 Dec 2008 18:05:50 +0000 (10:05 -0800)]
IB/ehca: Fix problem with generated flush work completions
This fix enables ehca device driver to generate flush work completions
even if the application doesn't request completions for all work
requests. The current implementation of ehca will generate flush work
completions for the wrong work requests if an application uses non
signaled work completions.
Signed-off-by: Stefan Roscher <stefan.roscher@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Joachim Fenkes [Mon, 1 Dec 2008 18:05:44 +0000 (10:05 -0800)]
IB/ehca: Change misleading error message on memory hotplug
The error message printed when the eHCA driver prevents memory hotplug
is misleading -- the user might think that hot-removing the lhca,
hotplugging memory, then hot-adding the lhca again will work, but it
actually doesn't.
Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Mon, 1 Dec 2008 17:34:23 +0000 (09:34 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: sbp2: fix race condition in state change
ieee1394: fix list corruption (reported at module removal)
firewire: fw-sbp2: another iPod mini quirk entry
ieee1394: sbp2: another iPod mini quirk entry
Linus Torvalds [Mon, 1 Dec 2008 16:33:59 +0000 (08:33 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: Apple ALU wireless keyboards are bluetooth devices
HID: remove setup mutex, fix possible deadlock
HID: add USB ID for another dual gameron adapter
HID: unignore mouse on unibody macbooks
HID: fix blacklist entries for greenasia/pantherlord
Kevin Hao [Mon, 1 Dec 2008 11:36:16 +0000 (11:36 +0000)]
Add kref to fake tty used by USB console
We allocate a fake tty in the usb serial console setup function. We should
init the tty's kref, otherwise we will hit a WARN_ON after the following
call chain: tty_port_tty_set --> tty_kref_get.
Signed-off-by: Kevin Hao <kexin.hao@windriver.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Newton [Mon, 1 Dec 2008 11:36:06 +0000 (11:36 +0000)]
drivers/char/tty_io.c: Avoid panic when no console is configured.
When no console is configured tty_open tries to call kref_get on a NULL
pointer. Return ENODEV instead.
Signed-off-by: Will Newton <will.newton@gmail.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
USB: serial: add more Onda device ids to option driver
USB: usb-storage: unusual_devs entry for Nikon D2H
USB: storage: unusual_devs entry for Mio C520-GPS
USB: fsl_usb2_udc: Report disconnect before unbinding
USB: fsl_qe_udc: Report disconnect before unbinding
USB: fix SB600 USB subsystem hang bug
Revert "USB: improve ehci_watchdog's side effect in CPU power management"
Requested-by: David Miller <davem@davemloft.net> Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Stern [Fri, 21 Nov 2008 21:15:12 +0000 (16:15 -0500)]
USB: storage: unusual_devs entry for Mio C520-GPS
This patch (as1176) adds an unusual_devs entry for the Mio C520 GPS
unit. Other devices also based on the Mitac hardware use the same USB
interface firmware, so the Vendor and Product names are generalized.
Anton Vorontsov [Thu, 13 Nov 2008 11:57:20 +0000 (14:57 +0300)]
USB: fsl_qe_udc: Report disconnect before unbinding
Gadgets disable endpoints in their disconnect callbacks, so
we must call disconnect before unbinding. This also fixes
a muram memory leak, since we free muram in qe_ep_disable().
But mainly the patch fixes the following badness:
root@b1:~# insmod fsl_qe_udc.ko
fsl_qe_udc: Freescale QE/CPM USB Device Controller driver, 1.0
fsl_qe_udc e01006c0.usb: QE USB controller initialized as device
root@b1:~# insmod g_ether.ko
g_ether gadget: using random self ethernet address
g_ether gadget: using random host ethernet address
usb0: MAC be:2d:3c:fa:be:f0
usb0: HOST MAC 62:b8:6a:df:38:66
g_ether gadget: Ethernet Gadget, version: Memorial Day 2008
g_ether gadget: g_ether ready
fsl_qe_udc e01006c0.usb: fsl_qe_udc bind to driver g_ether
g_ether gadget: high speed config #1: CDC Ethernet (ECM)
root@b1:~# rmmod g_ether.ko
------------[ cut here ]------------
Badness at drivers/usb/gadget/composite.c:871
[...]
NIP [d10c1374] composite_unbind+0x24/0x15c [g_ether]
LR [d10a82f4] usb_gadget_unregister_driver+0x128/0x168 [fsl_qe_udc]
Call Trace:
[cfb93e80] [cfb1f3a0] 0xcfb1f3a0 (unreliable)
[cfb93eb0] [d10a82f4] usb_gadget_unregister_driver+0x128/0x168 [fsl_qe_udc]
[cfb93ed0] [d10c2a3c] usb_composite_unregister+0x3c/0x4c [g_ether]
[cfb93ee0] [c006bde0] sys_delete_module+0x130/0x19c
[cfb93f40] [c00142d8] ret_from_syscall+0x0/0x38
[...]
fsl_qe_udc e01006c0.usb: unregistered gadget driver 'g_ether'
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
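The ordering requirement can be illustrated with a small userspace model. The commit message does not say what the check at composite.c:871 tests, so this sketch simply assumes that unbind complains when the gadget is still active. All names are hypothetical.

```c
/* Hypothetical gadget state: endpoints still enabled, and a warning flag. */
struct gadget_sketch {
    int endpoints_enabled;
    int badness;
};

/* The disconnect callback is where the gadget disables its endpoints. */
void disconnect_sketch(struct gadget_sketch *g)
{
    g->endpoints_enabled = 0;
}

/* Assumption: unbind warns if the gadget was not quiesced first. */
void unbind_sketch(struct gadget_sketch *g)
{
    if (g->endpoints_enabled)
        g->badness = 1;
}

/* Returns 1 if unregistering the driver would trigger the warning. */
int unregister_driver_sketch(int disconnect_first)
{
    struct gadget_sketch g = { .endpoints_enabled = 2, .badness = 0 };

    if (disconnect_first)
        disconnect_sketch(&g);
    unbind_sketch(&g);
    return g.badness;
}
```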
Shane Huang [Tue, 25 Nov 2008 07:12:33 +0000 (15:12 +0800)]
USB: fix SB600 USB subsystem hang bug
This patch is required for all AMD SB600 revisions to avoid a USB subsystem
hang. The hang is observed when the system has multiple USB devices
connected to it. In some cases a USB hub may be required to observe it.
It was the wrong thing to do, and does not really do what it said
it did.
Cc: Yi Yang <yi.y.yang@intel.com> Cc: David Brownell <dbrownell@users.sourceforge.net> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Milton Miller [Sun, 16 Nov 2008 11:44:42 +0000 (11:44 +0000)]
powerpc: Fix build for 32-bit SMP configs
attr_smt_snooze_delay is only defined for CONFIG_PPC64, so protect the
attribute removal with the same condition. This fixes this build error
on 32-bit SMP configurations:
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c: In function ‘unregister_cpu_online’:
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: ‘attr_smt_snooze_delay’ undeclared (first use in this function)
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: (Each undeclared identifier is reported only once
/data/home/miltonm/next.git/arch/powerpc/kernel/sysfs.c:722: error: for each function it appears in.)
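A sketch of the guard pattern, assuming a simplified attribute-removal helper. Only attr_smt_snooze_delay and CONFIG_PPC64 come from the message above. The helper, the counter and the "common_attr" name are hypothetical. Because the definition and the use share the same condition, the code compiles whether or not CONFIG_PPC64 is defined.

```c
/* Define CONFIG_PPC64 (for example with -DCONFIG_PPC64) to model a 64-bit build. */

#ifdef CONFIG_PPC64
/* Only exists on 64-bit configurations. */
const char *attr_smt_snooze_delay = "smt_snooze_delay";
#endif

/* Hypothetical bookkeeping: how many attributes were removed. */
int removed_count;

void remove_attr_sketch(const char *name)
{
    if (name)
        removed_count++;
}

void unregister_cpu_online_sketch(void)
{
    /* Attributes common to all configurations (hypothetical name). */
    remove_attr_sketch("common_attr");

#ifdef CONFIG_PPC64
    /* Guarded with the same condition as the definition above. */
    remove_attr_sketch(attr_smt_snooze_delay);
#endif
}
```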
Linus Torvalds [Mon, 1 Dec 2008 00:45:13 +0000 (16:45 -0800)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: Fix copy'n'pasteo that broke VT switch if flushing was non-empty.
Linus Torvalds [Mon, 1 Dec 2008 00:44:18 +0000 (16:44 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Fix system calls on Cell entered with XER.SO=1
powerpc/cell: Fix GDB watchpoints, again
powerpc/mpic: Don't reset affinity for secondary MPIC on boot
powerpc/cell/axon-msi: Retry on missing interrupt
powerpc: Fix boot freeze on machine with empty memory node
powerpc: Fix IRQ assignment for some PCIe devices
powerpc/spufs: Fix spinning in spufs_ps_fault on signal
powerpc/mpc832x_rdb: fix swapped ethernet ids
powerpc: Use generic PHY driver for Marvell 88E1111 PHY on GE Fanuc SBC610
powerpc/85xx: L2 cache size wrong in 8572DS dts
powerpc/virtex: Update defconfigs
powerpc/52xx: update defconfigs
xsysace: Fix driver to use resource_size_t instead of unsigned long
powerpc/virtex: fix various format/casting printk mismatches
powerpc/mpc5200: fix bestcomm Kconfig dependencies
powerpc/44x: Fix 460EX/460GT machine check handling
powerpc/40x: Limit allocable DRAM during early mapping
Linus Torvalds [Mon, 1 Dec 2008 00:39:06 +0000 (16:39 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
Allow architectures to override copy_user_highpage()
[ARM] pxa/palmtx: misc fixes to use generic GPIO API
ARM: OMAP: Fixes for suspend / resume GPIO wake-up handling
[ARM] pxa/corgi: update default config to exclude tosa from being built
[ARM] pxa/pcm990: use negative number for an invalid GPIO in camera data
ARM: OMAP: Typo fix for clock_allow_idle
ARM: OMAP: Remove broken LCD driver for SX1
[ARM] 5335/1: pxa25x_udc: Fix is_vbus_present to return 1 or 0
[ARM] pxa/MioA701: bluetooth resume fix
[ARM] pxa/MioA701: fix memory corruption.
Paul Mackerras [Sun, 30 Nov 2008 11:49:45 +0000 (11:49 +0000)]
powerpc: Fix system calls on Cell entered with XER.SO=1
It turns out that on Cell, on a kernel with CONFIG_VIRT_CPU_ACCOUNTING
= y, if a program sets the SO (summary overflow) bit in the XER and
then does a system call, the SO bit in CR0 will be set on return
regardless of whether the system call detected an error. Since CR0.SO
is used as the error indication from the system call, this means that
all system calls appear to fail.
The reason is that the workaround for the timebase bug on Cell uses a
compare instruction. With CONFIG_VIRT_CPU_ACCOUNTING = y, the
ACCOUNT_CPU_USER_ENTRY macro reads the timebase, so we end up doing a
compare instruction, which copies XER.SO to CR0.SO. Since we were
doing this in the system call entry path after clearing CR0.SO but
before saving the CR, this meant that the saved CR image had CR0.SO
set if XER.SO was set on entry.
This fixes it by moving the clearing of CR0.SO to after the
ACCOUNT_CPU_USER_ENTRY call in the system call entry path.
Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
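The ordering problem can be modelled in a few lines of C. This is a userspace illustration, not the entry assembly: the register structure and function names are hypothetical, and the only hardware behaviour modelled is that a compare instruction copies XER.SO into CR0.SO.

```c
/* Hypothetical model of the two bits involved. */
struct cpu_sketch {
    int xer_so;     /* summary overflow bit in XER */
    int cr0_so;     /* SO bit of CR field 0, used as the syscall error flag */
};

/* A compare copies XER.SO into the SO bit of the target CR field. */
void compare_sketch(struct cpu_sketch *c)
{
    c->cr0_so = c->xer_so;
}

/* Stand-in for ACCOUNT_CPU_USER_ENTRY with the Cell timebase workaround,
 * which executes a compare as a side effect of reading the timebase. */
void account_cpu_user_entry_sketch(struct cpu_sketch *c)
{
    compare_sketch(c);
}

/* Old order: clear CR0.SO, then account. Returns the saved CR0.SO. */
int syscall_entry_old_sketch(int xer_so)
{
    struct cpu_sketch c = { .xer_so = xer_so, .cr0_so = 0 };

    c.cr0_so = 0;
    account_cpu_user_entry_sketch(&c);  /* undoes the clear if XER.SO is set */
    return c.cr0_so;
}

/* Fixed order: account first, then clear CR0.SO. Returns the saved CR0.SO. */
int syscall_entry_fixed_sketch(int xer_so)
{
    struct cpu_sketch c = { .xer_so = xer_so, .cr0_so = 0 };

    account_cpu_user_entry_sketch(&c);
    c.cr0_so = 0;
    return c.cr0_so;
}
```

With XER.SO set on entry, the old order saves CR0.SO as 1, so every system call looks like a failure. The fixed order always saves 0.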
Arnd Bergmann [Fri, 28 Nov 2008 09:51:24 +0000 (09:51 +0000)]
powerpc/cell: Fix GDB watchpoints, again
An earlier patch from Jens Osterkamp attempted to fix GDB
watchpoints by enabling the DABRX register at boot time.
Unfortunately, this did not work on SMP setups, where
secondary CPUs were still using the power-on DABRX value.
This introduces the same change for secondary CPUs on cell
as well.
Arnd Bergmann [Fri, 28 Nov 2008 09:51:23 +0000 (09:51 +0000)]
powerpc/mpic: Don't reset affinity for secondary MPIC on boot
Kexec/kdump currently fails on the IBM QS2x blades when the kexec happens
on a CPU other than the initial boot CPU. It turns out that this is the
result of mpic_init trying to set affinity of each interrupt vector to the
current boot CPU.
As far as I can tell, the same problem is likely to exist on any
secondary MPIC, because they have to deliver interrupts to the first
output all the time. There are two potential solutions for this: either
not set up affinity at all for secondary MPICs, or assume that a single
CPU output is connected to the upstream interrupt controller and hardcode
affinity to that per architecture.
This patch implements the second approach, defaulting to the first output.
Currently, all known secondary MPICs are routed to their upstream port
using the first destination, so we hardcode that.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
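A sketch of the destination choice described above, assuming the destination is expressed as a bitmask of CPU outputs. The function name and parameters are hypothetical, and boot_cpu is assumed to be less than 32.

```c
/* Returns the initial destination mask for an interrupt vector.
 * A primary MPIC targets the CPU that is booting. A secondary MPIC is
 * assumed to be wired to its upstream controller through the first output,
 * so that output is hardcoded. */
unsigned int mpic_initial_destination_sketch(int is_primary,
                                             unsigned int boot_cpu)
{
    unsigned int output = is_primary ? boot_cpu : 0;

    return 1u << output;
}
```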
Arnd Bergmann [Fri, 28 Nov 2008 09:51:22 +0000 (09:51 +0000)]
powerpc/cell/axon-msi: Retry on missing interrupt
The MSI capture logic for the axon bridge can sometimes
lose interrupts in case of high DMA and interrupt load,
when it signals an MSI interrupt to the MPIC interrupt
controller while we are already handling another MSI.
Each MSI vector gets written into a FIFO buffer in main
memory using DMA, and that DMA access is normally flushed
by the actual interrupt packet on the IOIF. An MMIO
register in the MSIC holds the position of the last
entry in the FIFO buffer that was written. However,
reading that position does not flush the DMA, so that
we can observe stale data in the buffer.
In a stress test, we have observed the DMA to arrive
up to 14 microseconds after reading the register.
This patch works around this problem by retrying the
access to the FIFO buffer.
We can reliably detect the condition by writing
an invalid MSI vector into the FIFO buffer after
reading from it, assuming that all MSIs we get
are valid. After detecting an invalid MSI vector,
we udelay(1) in the interrupt cascade for up to
100 times before giving up.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
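The retry scheme can be sketched in userspace as follows. This is a model under assumptions, not the driver code: the FIFO layout, the marker value 0xffffffff, and the simulated late DMA are all hypothetical. The limit of 100 retries and the idea of poisoning each slot after reading it come from the description above.

```c
#include <stdint.h>

#define FIFO_ENTRIES_SKETCH 8
#define INVALID_MSI_SKETCH  0xffffffffu  /* assumed never to be a real vector */
#define MAX_RETRIES_SKETCH  100

struct msi_fifo_sketch {
    uint32_t slot[FIFO_ENTRIES_SKETCH];
    int dma_delay;      /* delay calls left until the simulated DMA lands */
    uint32_t pending;   /* value the late DMA will write */
};

void fifo_init_sketch(struct msi_fifo_sketch *f)
{
    for (int i = 0; i < FIFO_ENTRIES_SKETCH; i++)
        f->slot[i] = INVALID_MSI_SKETCH;
    f->dma_delay = 0;
    f->pending = 0;
}

/* Stand-in for udelay(1): advances simulated time by one step and lets
 * the pending DMA write land once its delay has elapsed. */
void delay_sketch(struct msi_fifo_sketch *f, unsigned int idx)
{
    if (f->dma_delay > 0 && --f->dma_delay == 0)
        f->slot[idx] = f->pending;
}

/* Reads the vector at idx (idx must be < FIFO_ENTRIES_SKETCH), retrying
 * while the slot still holds the invalid marker. Returns the marker if
 * the data never arrives. */
uint32_t read_msi_sketch(struct msi_fifo_sketch *f, unsigned int idx)
{
    uint32_t msi = f->slot[idx];
    int retries = 0;

    while (msi == INVALID_MSI_SKETCH && retries < MAX_RETRIES_SKETCH) {
        delay_sketch(f, idx);
        retries++;
        msi = f->slot[idx];
    }
    if (msi != INVALID_MSI_SKETCH)
        f->slot[idx] = INVALID_MSI_SKETCH;  /* poison so stale data is detectable */
    return msi;
}
```

Poisoning the slot after each successful read is what makes a stale entry distinguishable from a fresh one the next time that slot is used.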
I've reproduced this on 2.6.27.7. It is caused by commit 8f64e1f2d1e09267ac926e15090fd505c1c0cbcb ("powerpc: Reserve in bootmem
lmb reserved regions that cross NUMA nodes").
The problem is that Jon took a loop which was (in pseudocode):
The issue comes in when the 'careful_alloc()' is called on a node with
no memory. It falls back to using bootmem from a previously-initialized
node. But, bootmem has not yet been reserved when Jon's patch is
applied. It gives back bogus memory (0xc000000000000000) and pukes
later in boot.
The following patch collapses the loop back together. It also breaks
the mark_reserved_regions_for_nid() code out into a function and adds
some comments. I think a large part of why this bug was introduced is that
the for loop was too long and hard to read.
Currently, some PCIe devices on POWER6 machines do not get interrupts
assigned correctly. The problem is that OF doesn't create an
"interrupt" property for them. The fix is for of_irq_map_pci to fall
back to using the value in the PCI interrupt-pin register in config
space, as we do when there is no OF device-tree node for the device.
I have verified that this works fine with a pair of Squib-E SAS
adapters on a P6-570.
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
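The fallback can be sketched as follows. This is a heavily simplified illustration: the real of_irq_map_pci does considerably more, and the structure and function here are hypothetical. In the PCI interrupt-pin register, 0 means the device uses no INTx pin and 1 to 4 mean INTA to INTD.

```c
/* Hypothetical view of what is known about a PCI device. */
struct pci_dev_sketch {
    int has_of_node;            /* device has an OF device-tree node */
    int has_of_interrupts;      /* that node carries interrupt information */
    unsigned char of_pin;       /* pin taken from the device tree */
    unsigned char config_pin;   /* PCI interrupt-pin register in config space */
};

/* Returns the interrupt pin to map, or 0 if the device uses none. */
int irq_pin_sketch(const struct pci_dev_sketch *d)
{
    /* Prefer the device tree when it actually describes the interrupt. */
    if (d->has_of_node && d->has_of_interrupts)
        return d->of_pin;

    /* Otherwise fall back to config space, whether the node is missing
     * entirely or merely lacks the interrupt information. */
    return d->config_pin;
}
```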
* git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
parisc: struct device - replace bus_id with dev_name(), dev_set_name()
parisc: fix kernel crash when unwinding a userspace process
parisc: __kernel_time_t is always long
Linus Torvalds [Sun, 30 Nov 2008 21:06:47 +0000 (13:06 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: prevent divide by zero error in cpu_avg_load_per_task, update
sched, cpusets: fix warning in kernel/cpuset.c
sched: prevent divide by zero error in cpu_avg_load_per_task
Linus Torvalds [Sun, 30 Nov 2008 21:06:20 +0000 (13:06 -0800)]
Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
irq.h: fix missing/extra kernel-doc
genirq: __irq_set_trigger: change pr_warning to pr_debug
irq: fix typo
x86: apic honour irq affinity which was set in early boot
genirq: fix the affinity setting in setup_irq
genirq: keep affinities set from userspace across free/request_irq()