Kyle McMartin [Fri, 6 Oct 2006 03:45:45 +0000 (23:45 -0400)]
[PARISC] Make firmware calls irqsafe-ish...
There's no reason why we shouldn't be using _irqsave instead of
_irq for any of these calls. fwiw, this fixes the
"start_kernel(): bug: interrupts were enabled early" message displayed
on bootup recently.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org> Signed-off-by: Matthew Wilcox <matthew@wil.cx>
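The difference between the _irq and _irqsave variants can be sketched with a small userspace model. This is not kernel code: fake_irqs_enabled and the model_* and firmware_call_* helpers are hypothetical stand-ins for the real spin_lock_irq()/spin_lock_irqsave() family. It only shows that the _irq variant turns interrupts back on unconditionally at unlock, which is presumably how they ended up enabled early during boot, while the _irqsave variant restores whatever state the caller had.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the CPU interrupt-enable flag (not a kernel API). */
static bool fake_irqs_enabled;

/* Models spin_lock_irq()/spin_unlock_irq(): disable, then always enable. */
static void model_lock_irq(void)   { fake_irqs_enabled = false; }
static void model_unlock_irq(void) { fake_irqs_enabled = true; }

/* Models spin_lock_irqsave()/spin_unlock_irqrestore(): save, then restore. */
static bool model_lock_irqsave(void)
{
	bool flags = fake_irqs_enabled;

	fake_irqs_enabled = false;
	return flags;
}

static void model_unlock_irqrestore(bool flags)
{
	fake_irqs_enabled = flags;
}

/* A pretend firmware call using the _irq variant; returns the IRQ state on exit. */
static bool firmware_call_irq(bool irqs_on_entry)
{
	fake_irqs_enabled = irqs_on_entry;
	model_lock_irq();
	/* ... the firmware call would run here with interrupts off ... */
	model_unlock_irq();
	return fake_irqs_enabled;
}

/* The same pretend call using the _irqsave variant. */
static bool firmware_call_irqsave(bool irqs_on_entry)
{
	bool flags;

	fake_irqs_enabled = irqs_on_entry;
	flags = model_lock_irqsave();
	assert(!fake_irqs_enabled);	/* interrupts are off inside the section */
	model_unlock_irqrestore(flags);
	return fake_irqs_enabled;
}
```

Called with interrupts disabled on entry, firmware_call_irq() leaves them enabled, whereas firmware_call_irqsave() leaves them disabled.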
Olaf Hering [Fri, 6 Oct 2006 18:53:10 +0000 (20:53 +0200)]
[PATCH] fix mesh compile errors after irq changes
drivers/scsi/mesh.c:469: error: too many arguments to function 'mesh_interrupt'
drivers/scsi/mesh.c:507: error: too many arguments to function 'mesh_interrupt'
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Fri, 6 Oct 2006 18:35:08 +0000 (11:35 -0700)]
Merge branch 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa
* 'linus' of master.kernel.org:/pub/scm/linux/kernel/git/perex/alsa:
[ALSA] version 1.0.13
[ALSA] sound/pci/au88x0/au88x0.c: ioremap balanced with iounmap
[ALSA] Handle file operations during snd_card disconnects using static file->f_op
[ALSA] emu10k1: Fix outl() in snd_emu10k1_resume_regs()
[ALSA] Repair snd-usb-usx2y for usb 2.6.18
[ALSA] Fix bug in snd-usb-usx2y's usX2Y_pcms_lock_check()
[ALSA] Dereference after free in snd_hwdep_release()
[ALSA] Fix memory leak in sound/isa/es18xx.c
[ALSA] hda-intel - New pci id for Nvidia MCP61
[ALSA] Add new subdevice ids for hda-intel
[ALSA] WM9712 fixes for ac97_patch.c
[ALSA] hda/patch_si3054: new codec vendor IDs
Karsten Wiese [Fri, 6 Oct 2006 14:08:27 +0000 (16:08 +0200)]
[ALSA] Handle file operations during snd_card disconnects using static file->f_op
ALSA used to kmalloc one file->f_op per file per disconnecting snd_card.
This led to oopses sometimes when file->f_op was freed before __fput()
finished.
Patch adds a virtual device for disconnect: VDD.
VDD consists of:
LIST_HEAD(shutdown_files)
protected by DEFINE_SPINLOCK(shutdown_mutex)
static struct file_operations snd_shutdown_f_ops
and functions assigned to it
Additions to struct snd_monitor_file
to specify if instance is hidden by VDD or not.
A VDD's instance is
created in snd_card_disconnect() under the card->files_lock.
cleaned up in snd_card_file_remove() under the card->files_lock.
Arnaud Patard [Wed, 4 Oct 2006 16:21:05 +0000 (18:21 +0200)]
[ALSA] emu10k1: Fix outl() in snd_emu10k1_resume_regs()
The emu10k1 driver saves the A_IOCFG and HCFG registers on suspend and restores
them on resume. Unfortunately, this doesn't work as the arguments to outl() are
reversed.
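For reference, Linux's outl() takes the value first and the port second. The sketch below models that order in userspace; fake_outl(), fake_inl(), the tiny port array and the DEMO_HCFG offset are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_PORT_SPACE 64	/* hypothetical, tiny I/O space for the demo */

static uint32_t fake_ports[FAKE_PORT_SPACE];

/* Same argument order as Linux outl(): value first, then port. */
static void fake_outl(uint32_t value, unsigned int port)
{
	assert(port < FAKE_PORT_SPACE);
	fake_ports[port] = value;
}

static uint32_t fake_inl(unsigned int port)
{
	assert(port < FAKE_PORT_SPACE);
	return fake_ports[port];
}

/* Hypothetical register offset, used only by this demo. */
enum { DEMO_HCFG = 0x14 };

/* Restore a saved register value and read it back. */
static uint32_t restore_hcfg(uint32_t saved)
{
	fake_outl(saved, DEMO_HCFG);	/* correct order: outl(value, port) */
	return fake_inl(DEMO_HCFG);
}
```

With the arguments swapped, the saved value would be treated as the port number, so the intended register would never be written.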
Karsten Wiese [Wed, 4 Oct 2006 15:16:46 +0000 (17:16 +0200)]
[ALSA] Fix bug in snd-usb-usx2y's usX2Y_pcms_lock_check()
Fix bug in snd-usb-usx2y's usX2Y_pcms_lock_check()
substream can be NULL......
in mainline, bug was introduced by:
2006-06-22 [ALSA] Add O_APPEND flag support to PCM
Tobin Davis [Tue, 26 Sep 2006 13:30:10 +0000 (15:30 +0200)]
[ALSA] Add new subdevice ids for hda-intel
This patch adds a couple of device ids for Acer laptops. In both cases,
the owners got the driver working by adding 'model=acer' to their
modprobe.conf files.
Signed-off-by: Tobin Davis <tdavis@dsl-only.net> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Luke Zhang [Tue, 26 Sep 2006 13:28:41 +0000 (15:28 +0200)]
[ALSA] WM9712 fixes for ac97_patch.c
This patch by Luke Zhang fixes a couple of issues with the WM9712
support in ac97_patch.c
Changes:-
o Fix Out3 ZC switch invert.
o Extend capture volume control to 6 bits.
o Change Mic 1 volume mask to 5 bits (31).
o Add Mic 2 volume.
Signed-off-by: Luke Zhang <lzhang@intrinsyc.com> Signed-off-by: Liam Girdwood <liam.girdwood@wolfsonmicro.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jaroslav Kysela <perex@suse.cz>
There are additional IDs for Si3054 codec based HDA modems. Most of
them were discovered on discuss@linmodems.org list - Thanks to MarvS
and all linmodems.org folks.
Jiri Kosina [Fri, 6 Oct 2006 09:11:56 +0000 (11:11 +0200)]
[PATCH] make kernels with CONFIG_X86_GENERIC and !CONFIG_SMP compilable
CONFIG_X86_GENERIC does not imply CONFIG_SMP, as mach-default/ could
also be compiled for UP archs. The patch fixes a compilation error in
include/asm/mach-summit/mach_apic.h in case CONFIG_X86_GENERIC && !CONFIG_SMP.
Linus Torvalds [Fri, 6 Oct 2006 16:13:53 +0000 (09:13 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] Use CONFIG_GENERIC_TIME and define TOD clock source.
[PATCH] sysrq: irq change build fix.
[S390] irq change build fixes.
[S390] cio: 0 is a valid chpid.
[S390] monwriter buffer limit.
[S390] ap bus poll thread priority.
Cc: David Howells <dhowells@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo "Blaisorblade" Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown [Fri, 6 Oct 2006 07:44:05 +0000 (00:44 -0700)]
[PATCH] knfsd: tidy up meaning of 'buffer size' in nfsd/sunrpc
There is some confusion about the meaning of 'bufsz' for a sunrpc server.
In some cases it is the largest message that can be sent or received. In
other cases it is the largest 'payload' that can be included in a NFS
message.
In either case, it is not possible for both the request and the reply to be
this large. One of the request or reply may only be one page long, which
fits nicely with NFS.
So we remove 'bufsz' and replace it with two numbers: 'max_payload' and
'max_mesg'. Max_payload is the size that the server requests. It is used
by the server to check the max size allowed on a particular connection:
depending on the protocol a lower limit might be used.
max_mesg is the largest single message that can be sent or received. It is
calculated as the max_payload, rounded up to a multiple of PAGE_SIZE, and
with PAGE_SIZE added to overhead. Only one of the request and reply may be
this size. The other must be at most one page.
Cc: Greg Banks <gnb@sgi.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
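The max_mesg calculation described above can be written out as a short sketch. DEMO_PAGE_SIZE and demo_max_mesg() are hypothetical names, and a 4 KiB page is assumed purely for the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * max_mesg = max_payload rounded up to a multiple of the page size,
 * plus one extra page for overhead.
 */
static size_t demo_max_mesg(size_t max_payload)
{
	size_t rounded = (max_payload + DEMO_PAGE_SIZE - 1)
			 / DEMO_PAGE_SIZE * DEMO_PAGE_SIZE;

	assert(rounded >= max_payload);
	return rounded + DEMO_PAGE_SIZE;
}
```

Under that assumption, a 32 KiB max_payload gives a 36 KiB max_mesg.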
NeilBrown [Fri, 6 Oct 2006 07:44:04 +0000 (00:44 -0700)]
[PATCH] md: fix bug where new drives added to an md array sometimes don't sync properly
This fixes a bug introduced in 2.6.18.
If a drive is added to a raid1 using older tools (mdadm-1.x or raidtools)
then it will be included in the array without any resync happening.
It has been submitted for 2.6.18.1.
Signed-off-by: Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pierre Ossman [Fri, 6 Oct 2006 07:44:03 +0000 (00:44 -0700)]
[PATCH] mmc: multi sector write transfers
SD cards extend the protocol by allowing the host to query a card how many
blocks were successfully stored on the medium. This allows us to safely write
chunks of blocks at once.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
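A rough userspace model of the idea: write a chunk of blocks, and if the transfer fails, ask the card how many blocks it stored and resume from there. struct demo_card and all demo_* functions are hypothetical; the real driver code and SD commands are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical card model: one multi-block write may stop early. */
struct demo_card {
	size_t fail_after;	/* blocks accepted before the one failure; 0 = never fail */
	size_t last_written;	/* what the card reports for the last transfer */
	size_t total_stored;	/* blocks that reached the medium */
};

/* Returns 0 on success, -1 on error; the card records what it stored. */
static int demo_write_multi(struct demo_card *c, size_t nblocks)
{
	size_t ok = nblocks;
	int err = 0;

	if (c->fail_after && nblocks > c->fail_after) {
		ok = c->fail_after;
		c->fail_after = 0;	/* fail only once in this demo */
		err = -1;
	}
	c->last_written = ok;
	c->total_stored += ok;
	return err;
}

/* Stands in for querying the card for the number of blocks written. */
static size_t demo_query_written(const struct demo_card *c)
{
	return c->last_written;
}

/* Write nblocks in as few transfers as possible; returns the transfer count. */
static int demo_write_all(struct demo_card *c, size_t nblocks)
{
	int transfers = 0;

	while (nblocks > 0) {
		int err = demo_write_multi(c, nblocks);
		size_t done = err ? demo_query_written(c) : nblocks;

		transfers++;
		assert(done <= nblocks);
		if (err && done == 0)
			return -1;	/* no progress at all, give up */
		nblocks -= done;
	}
	return transfers;
}
```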
Jim Cromie [Fri, 6 Oct 2006 07:43:59 +0000 (00:43 -0700)]
[PATCH] MAINTAINERS: take over scx200-* and pc8736* drivers
Add MAINTAINERS entries for new scx200_hrt and pc8736x_gpio drivers, and
take over maintenance of scx200_gpio, authored by Christer Weinigel (which
I've hacked at), who no longer has the hardware.
Also take over hwmon/pc87360, authored by Jean Delvare, who's dropped
maintenance to dedicate more time to hwmon subsystem.
Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Christer Weinigel <christer@weinigel.se> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add a way for a no_page() handler to request a retry of the faulting
instruction. It goes back to userland on page faults and just tries again
in get_user_pages(). I added a cond_resched() in the loop in that latter
case.
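The retry path can be modelled in userspace as follows. DEMO_NOPAGE_REFAULT, demo_no_page() and demo_fault_loop() are hypothetical names for this sketch; the sentinel is modelled as the address of a dummy object, which is an assumption and not necessarily how the kernel defines NOPAGE_REFAULT.

```c
#include <assert.h>
#include <stddef.h>

struct demo_page { int id; };

/* Hypothetical sentinel meaning "nothing to return yet, please retry". */
static struct demo_page demo_refault_marker;
#define DEMO_NOPAGE_REFAULT (&demo_refault_marker)

static struct demo_page demo_real_page = { 42 };
static int demo_busy_attempts;	/* the handler asks for this many retries first */

/* Models a no_page() handler that sometimes requests a retry. */
static struct demo_page *demo_no_page(void)
{
	if (demo_busy_attempts > 0) {
		demo_busy_attempts--;
		return DEMO_NOPAGE_REFAULT;
	}
	return &demo_real_page;
}

/* Models the get_user_pages() style loop: retry for as long as asked to. */
static struct demo_page *demo_fault_loop(int *tries)
{
	struct demo_page *page;

	*tries = 0;
	do {
		page = demo_no_page();
		(*tries)++;
		/* the kernel would call cond_resched() here */
	} while (page == DEMO_NOPAGE_REFAULT);

	assert(page != NULL);
	return page;
}
```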
The problem I have with signal and spufs is an actual bug affecting apps and I
don't see other ways of fixing it.
In addition, we are having issues with infiniband and 64k pages (related to
the way the hypervisor deals with some HV cards) that will require us to muck
around with the MMU from within the IB driver's no_page() (it's a pSeries
specific driver) and return to the caller the same way using NOPAGE_REFAULT.
And to add to this, the graphics folks have been following a new approach of
memory management that involves transparently swapping objects between video
ram and main memory. To do that, they need to install PTEs from a no_page()
handler as well and that also requires returning with NOPAGE_REFAULT.
(For the latter, they are currently using io_remap_pfn_range to install one PTE
from no_page() which is a bit racy, we need to add a check for the PTE having
already been installed after taking the lock, but that's ok, they are only at
the proof-of-concept stage. I'll send a patch adding a "clean" function to do
that, we can use that from spufs too and get rid of the sparsemem hacks we do
to create struct page for SPEs. Basically, that provides a generic solution
for being able to have no_page() map hardware devices, which is something that
I think sound driver folks have been asking for some time too).
All of these things depend on having the NOPAGE_REFAULT exit path from
no_page() handlers.
Signed-off-by: Benjamin Herrenchmidt <benh@kernel.crashing.org> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Fri, 6 Oct 2006 07:43:49 +0000 (00:43 -0700)]
[PATCH] Fix WARN_ON / WARN_ON_ONCE regression
Tim and Ananiev report that the recent WARN_ON_ONCE changes cause increased
cache misses with the tbench workload. Apparently due to the access to the
newly-added static variable.
Rearrange the code so that we don't touch that variable unless the warning is
going to trigger.
Also rework the logic so that the static variable starts out at zero, so we
can move it into bss.
It would seem logical to mark the static variable as __read_mostly too. But
it would be wrong, because that would put it back into the vmlinux image, and
the kernel will never read from this variable in normal operation anyway.
Unless the compiler or hardware go and do some prefetching on us?
For some reason this patch shrinks softirq.o text by 40 bytes.
Cc: Tim Chen <tim.c.chen@intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Ananiev, Leonid I" <leonid.i.ananiev@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
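The rearranged logic can be sketched as a plain C macro. DEMO_WARN_ON_ONCE, demo_warnings_printed and demo_run() are hypothetical and only mirror the idea described above: the static flag starts at zero and is not touched unless the condition is true.

```c
#include <assert.h>

static int demo_warnings_printed;	/* counts "printed" warnings for the demo */

/*
 * The static flag starts out at zero, so it can live in bss, and it is
 * only read or written once the condition has already evaluated true.
 */
#define DEMO_WARN_ON_ONCE(cond)					\
	do {							\
		static int warned;				\
		if (cond) {					\
			if (!warned) {				\
				warned = 1;			\
				demo_warnings_printed++;	\
			}					\
		}						\
	} while (0)

/* Evaluate the macro repeatedly; every bad_every-th pass trips the condition. */
static int demo_run(int iterations, int bad_every)
{
	int i;

	assert(bad_every > 0);
	demo_warnings_printed = 0;
	for (i = 1; i <= iterations; i++)
		DEMO_WARN_ON_ONCE(i % bad_every == 0);
	return demo_warnings_printed;
}
```

Because the flag belongs to the macro's expansion site, a second demo_run() call reports no further warnings even when the condition trips again.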
[S390] Use CONFIG_GENERIC_TIME and define TOD clock source.
Fix too slow clock by using CONFIG_GENERIC_TIME and adding a
clock source for the s390 time-of-day clock. As added benefit
we get rid of the s390 specific definition of do_gettimeofday
and do_settimeofday.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linus Torvalds [Thu, 5 Oct 2006 23:32:01 +0000 (16:32 -0700)]
Merge git://git.infradead.org/~dhowells/irq-2.6
* git://git.infradead.org/~dhowells/irq-2.6:
IRQ: Maintain regs pointer globally rather than passing to IRQ handlers
IRQ: Typedef the IRQ handler function type
IRQ: Typedef the IRQ flow handler function type
Trying to build both drivers results in the following error:
LD drivers/scsi/built-in.o
drivers/scsi/qla4xxx/built-in.o: In function `qla4xxx_slave_configure':
drivers/scsi/qla4xxx/ql4_os.c:1433: multiple definition of `extended_error_logging'
drivers/scsi/qla2xxx/built-in.o:drivers/scsi/qla2xxx/qla_os.c:2166:
first defined here
make[2]: *** [drivers/scsi/built-in.o] Error 1
make[1]: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2
The following patch simply adds a qla2_ (qla4_ respectively) prefix to
the variable name.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andi Kleen [Thu, 5 Oct 2006 16:47:22 +0000 (18:47 +0200)]
[PATCH] x86-64: Fix FPU corruption
This reverts an earlier patch that was found to cause FPU
state corruption. I think the corruption happens because
unlazy_fpu() can cause FPU exceptions and when it happens
after the current switch some processing would affect
the state in the wrong process.
Thanks to Douglas Crosher and Tom Hughes for testing.
Andi Kleen [Thu, 5 Oct 2006 16:47:22 +0000 (18:47 +0200)]
[PATCH] x86: Terminate the kernel stacks for the unwinder
Always make sure RIP/EIP is 0 in the registers stored on the top
of the stack of a kernel thread. This makes sure the unwinder code
won't try a fallback but knows the stack has ended.
AK: this patch is a bit mysterious. in theory they should be terminated
anyways, but it seems to fix at least one crash. Anyways double termination
probably doesn't hurt.
Jon Mason [Thu, 5 Oct 2006 16:47:21 +0000 (18:47 +0200)]
[PATCH] x86-64: Calgary IOMMU: Fix off by one when calculating register space location
The purpose of the code being modified is to determine the location
of the calgary chip address space. This is done by a magical formula
of FE0MB-8MB*OneBasedChassisNumber+1MB*(RioNodeId-ChassisBase) to
find the offset where BIOS puts it. In this formula,
OneBasedChassisNumber corresponds to the NUMA node, and RioNodeId is
always 2 or 3 depending on which chip in the system it is. The
problem was that we had an off by one error that caused us to account
some busses to the wrong chip and thus give them the wrong address
space.
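The formula above, written out as arithmetic. This sketch assumes "FE0MB" means 0xFE0 MiB, i.e. 0xFE000000; the DEMO_* constants and demo_calgary_address() are hypothetical names, and the chassis base is passed in as a parameter because its value is not stated here.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_START_ADDRESS 0xFE000000u	/* assumption: "FE0MB" read as 0xFE0 MiB */
#define DEMO_8MB           0x00800000u
#define DEMO_1MB           0x00100000u

/* FE0MB - 8MB * OneBasedChassisNumber + 1MB * (RioNodeId - ChassisBase) */
static uint32_t demo_calgary_address(unsigned int one_based_chassis,
				     unsigned int rio_node_id,
				     unsigned int chassis_base)
{
	assert(rio_node_id >= chassis_base);
	return DEMO_START_ADDRESS
	       - DEMO_8MB * one_based_chassis
	       + DEMO_1MB * (rio_node_id - chassis_base);
}
```

For chassis 1, RioNodeId 2 and a chassis base of 0, this yields 0xFDA00000; an off-by-one in the chassis number shifts the result by 8 MB.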
calgary_init's for loop does not correspond to the actual device being
checked, which makes its upperbound check for array overflow useless.
Changing this to a do-while loop is the correct way of doing this.
There should be no possibility of spinning forever in this loop, as
pci_get_device states that it will go through all iterations, then
return NULL (thus breaking the loop).
David Howells [Thu, 5 Oct 2006 13:55:46 +0000 (14:55 +0100)]
IRQ: Maintain regs pointer globally rather than passing to IRQ handlers
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq functions results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've built this code with allyesconfig for x86_64 and i386. I've run-tested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
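The save/restore pattern can be tried out in a single-threaded userspace model. The struct pt_regs here is a one-field stand-in, a plain global replaces the per-CPU variable, and demo_handler() and demo_do_IRQ() are hypothetical; only the set_irq_regs()/get_irq_regs() call pattern follows the description above.

```c
#include <assert.h>
#include <stddef.h>

struct pt_regs { unsigned long ip; };	/* stand-in, not the real layout */

/* One global stands in for the per-CPU pointer in this single-threaded demo. */
static struct pt_regs *demo_irq_regs;

static struct pt_regs *get_irq_regs(void)
{
	return demo_irq_regs;
}

/* Install a new regs pointer and hand back the previous one. */
static struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
{
	struct pt_regs *old_regs = demo_irq_regs;

	demo_irq_regs = new_regs;
	return old_regs;
}

/* A handler no longer receives regs; it asks for them when needed. */
static unsigned long demo_handler(int irq)
{
	(void)irq;
	assert(get_irq_regs() != NULL);
	return get_irq_regs()->ip;
}

/* Arch entry point: store regs at the start, restore the old ones at the end. */
static unsigned long demo_do_IRQ(int irq, struct pt_regs *regs)
{
	struct pt_regs *old_regs = set_irq_regs(regs);
	unsigned long seen = demo_handler(irq);

	set_irq_regs(old_regs);
	return seen;
}
```

Saving and restoring the old pointer keeps nested interrupts consistent: each level sees its own regs, and the outer level gets its pointer back on exit.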
In timer_interrupt(), this sort of change will be necessary:
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Mark Assad [Thu, 5 Oct 2006 02:25:05 +0000 (12:25 +1000)]
[PATCH] itmtouch: fix inverted flag to indicate touch location correctly, correct white space
There is a bug in the current version of the itmtouch USB touchscreen
driver. The if statement that checks if pressure is being applied to the
touch screen is now missing a ! (not), so events are no longer being
reported correctly.
The original source code for this line was as follows:
* git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] cell: fix bugs found by sparse
[POWERPC] spiderpic: enable new style devtree support
[POWERPC] Update cell_defconfig
[POWERPC] spufs: add infrastructure for finding elf objects
[POWERPC] spufs: support new OF device tree format
[POWERPC] spufs: add support for read/write on cntl
[POWERPC] spufs: remove support for ancient firmware
[POWERPC] spufs: make mailbox functions handle multiple elements
[POWERPC] spufs: use correct pg_prot for mapping SPU local store
[POWERPC] spufs: Add infrastructure needed for gang scheduling
[POWERPC] spufs: implement error event delivery to user space
[POWERPC] spufs: fix context switch during page fault
[POWERPC] spufs: scheduler support for NUMA.
[POWERPC] spufs: cell spu problem state mapping updates
Matthew Wilcox [Wed, 4 Oct 2006 21:12:52 +0000 (15:12 -0600)]
[PA-RISC] Fix time.c for new do_timer() calling convention
do_timer now wants to know how many ticks have elapsed. Now that we
have to calculate that, we can eliminate some of the clever code that
avoided having to calculate that. Also add some more documentation.
I'd like to thank Grant Grundler for helping me with this.
Signed-off-by: Matthew Wilcox <willy@parisc-linux.org>
Matthew Wilcox [Wed, 4 Oct 2006 19:16:10 +0000 (13:16 -0600)]
[PA-RISC] Fix sys32_sysctl
When CONFIG_SYSCTL_SYSCALL isn't defined, do_sysctl doesn't exist and
we fail to link. Fix with an ifdef, the same way sparc64 did.
Also add some minor changes to be more like sparc64.
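The ifdef approach looks roughly like this in a userspace sketch. demo_do_sysctl() and demo_sys32_sysctl() are hypothetical, and returning -ENOSYS when the option is off is an assumption about the fallback, not something stated above.

```c
#include <assert.h>
#include <errno.h>

/* CONFIG_SYSCTL_SYSCALL is deliberately left undefined in this demo. */

#ifdef CONFIG_SYSCTL_SYSCALL
/* Hypothetical stand-in for the helper that exists only with the option on. */
static int demo_do_sysctl(int name)
{
	return name;
}
#endif

static int demo_sys32_sysctl(int name)
{
#ifndef CONFIG_SYSCTL_SYSCALL
	/* The helper is not built, so it must not be referenced at all. */
	(void)name;
	return -ENOSYS;
#else
	return demo_do_sysctl(name);
#endif
}
```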
Arnd Bergmann [Wed, 4 Oct 2006 15:26:24 +0000 (17:26 +0200)]
[POWERPC] cell: fix bugs found by sparse
- Some long constants should be marked 'ul'.
- When using desc->handler_data to pass an __iomem
register area, we need to add casts to and from
__iomem.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Arnd Bergmann [Wed, 4 Oct 2006 15:26:21 +0000 (17:26 +0200)]
[POWERPC] spufs: add infrastructure for finding elf objects
This adds an 'object-id' file that the spe library can
use to store a pointer to its ELF object. This was
originally meant for use by oprofile, but is now
also used by the GNU debugger, if available.
In order for oprofile to find the location in an spu-elf
binary where an event counter triggered, we need a way
to identify the binary in the first place.
Unfortunately, that binary itself can be embedded in a
powerpc ELF binary. Since we can assume it is mapped into
the effective address space of the running process,
have that one write the pointer value into a new spufs
file.
When a context switch occurs, pass the user value to
the profiler so that it can look at the mapped file (with
some care).
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Arnd Bergmann [Wed, 4 Oct 2006 15:26:20 +0000 (17:26 +0200)]
[POWERPC] spufs: support new OF device tree format
The properties we used traditionally in the device tree are somewhat
nonstandard. This adds support for a more conventional format using
'interrupts' and 'reg' properties.
The interrupts are specified in three cells (class 0, 1 and 2) and
registered at the interrupt-parent.
The reg property contains either three or four register areas in the
order 'local-store', 'problem', 'priv2', and 'priv1', so the priv1 one
can be left out in case of hypervisor driven systems that access these
through hcalls.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Arnd Bergmann [Wed, 4 Oct 2006 15:26:17 +0000 (17:26 +0200)]
[POWERPC] spufs: make mailbox functions handle multiple elements
Since libspe2 will provide a function that can read/write
multiple mailbox elements at once, the kernel should handle
that efficiently.
read/write on the three mailbox files can now access the
spe context multiple times to operate on any number of
mailbox data elements.
If the spu application keeps writing to its outbound
mailbox, the read call will pick up all the data in a
single system call.
Unfortunately, if the user passes an invalid pointer,
we may lose a mailbox element on read, since we can't
put it back. This is probably impossible to solve, if the
user also accesses the mailbox through direct register
access.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
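A small model of a read that drains several mailbox elements in one call. struct demo_mbox and the demo_mbox_* functions are hypothetical and do not reflect the spufs implementation; the mailbox is modelled as a FIFO of 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MBOX_DEPTH 8

/* Hypothetical outbound mailbox: a small FIFO of 32-bit elements. */
struct demo_mbox {
	uint32_t data[DEMO_MBOX_DEPTH];
	size_t head;
	size_t count;
};

static int demo_mbox_push(struct demo_mbox *mb, uint32_t v)
{
	if (mb->count == DEMO_MBOX_DEPTH)
		return 0;
	mb->data[(mb->head + mb->count) % DEMO_MBOX_DEPTH] = v;
	mb->count++;
	return 1;
}

static int demo_mbox_pop(struct demo_mbox *mb, uint32_t *out)
{
	if (mb->count == 0)
		return 0;
	*out = mb->data[mb->head];
	mb->head = (mb->head + 1) % DEMO_MBOX_DEPTH;
	mb->count--;
	return 1;
}

/* Copy as many whole elements as fit into buf; returns the bytes copied. */
static size_t demo_mbox_read(struct demo_mbox *mb, void *buf, size_t len)
{
	size_t done = 0;
	uint32_t v;

	assert(mb != NULL && buf != NULL);
	/* Check for room first, so an element is never popped and then dropped. */
	while (len - done >= sizeof(v) && demo_mbox_pop(mb, &v)) {
		memcpy((char *)buf + done, &v, sizeof(v));
		done += sizeof(v);
	}
	return done;
}
```

The model checks the buffer size before popping, so it cannot lose an element. It has no counterpart for a failing copy to user space, which is the case where the element cannot be put back.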
Arnd Bergmann [Wed, 4 Oct 2006 15:26:16 +0000 (17:26 +0200)]
[POWERPC] spufs: use correct pg_prot for mapping SPU local store
This hopefully fixes a long-standing bug in the spu file system.
An spu context comes with local memory that can be either saved
in kernel pages or point directly to a physical SPE.
When mapping the physical SPE, that mapping needs to be cache-inhibited.
For simplicity, we used to map the kernel backing memory that way
too, but unfortunately that was not only inefficient, but also incorrect
because the same page could then be accessed simultaneously through
a cacheable and a cache-inhibited mapping, which is not allowed
by the powerpc specification and in our case caused data inconsistency
for which we did a really ugly workaround in user space.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Arnd Bergmann [Wed, 4 Oct 2006 15:26:15 +0000 (17:26 +0200)]
[POWERPC] spufs: Add infrastructure needed for gang scheduling
Add the concept of a gang to spufs as a new type of object.
So far, this has no impact whatsoever on scheduling, but makes
it possible to add that later.
A new type of object in spufs is now a spu_gang. It is created
with the spu_create system call with the flags argument set
to SPU_CREATE_GANG (0x2). Inside of a spu_gang, it
is then possible to create spu_context objects, which until
now was only possible at the root of spufs.
There is a new member in struct spu_context pointing to
the spu_gang it belongs to, if any. The spu_gang maintains
a list of spu_context structures that are its children.
This information can then be used in the scheduler in the
future.
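The relationship between the two objects can be pictured as below. struct demo_gang, struct demo_context and their fields are hypothetical, not the actual spufs definitions; only the SPU_CREATE_GANG value (0x2) comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define SPU_CREATE_GANG 0x2	/* flag value as given above */

struct demo_gang;

/* Hypothetical context: points back at its gang, if it has one. */
struct demo_context {
	struct demo_gang *gang;			/* NULL when created at the spufs root */
	struct demo_context *next_in_gang;
};

/* Hypothetical gang: keeps a list of its child contexts. */
struct demo_gang {
	struct demo_context *contexts;
	int nr_contexts;
};

/* Link a context into a gang; a context belongs to at most one gang. */
static void demo_gang_add(struct demo_gang *gang, struct demo_context *ctx)
{
	assert(ctx->gang == NULL);
	ctx->gang = gang;
	ctx->next_in_gang = gang->contexts;
	gang->contexts = ctx;
	gang->nr_contexts++;
}

/* Did the caller ask spu_create for a gang rather than a context? */
static int demo_is_gang_create(unsigned int flags)
{
	return (flags & SPU_CREATE_GANG) != 0;
}
```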
There is still a bug that needs to be resolved in this
basic infrastructure regarding the order in which objects
are removed. When the spu_gang file descriptor is closed
before the spu_context descriptors, we leak the dentry
and inode for the gang. Any ideas how to cleanly solve
this are appreciated.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Arnd Bergmann [Wed, 4 Oct 2006 15:26:14 +0000 (17:26 +0200)]
[POWERPC] spufs: implement error event delivery to user space
This tries to fix spufs so we have an interface closer to what is
specified in the man page for events returned in the third argument of
spu_run.
Fortunately, libspe has never been using the returned contents of that
register, as they were the same as the return code of spu_run (duh!).
Unlike the specification that we never implemented correctly, we now
require a SPU_CREATE_EVENTS_ENABLED flag passed to spu_create, in
order to get the new behavior. When this flag is not passed, spu_run
will simply ignore the third argument now.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
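The opt-in behaviour can be modelled as follows. struct demo_ctx and demo_spu_run() are hypothetical, the real spu_run arguments are reduced to the event pointer, and the flag value 0x1 is an assumption since the value is not given above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_EVENTS_ENABLED 0x1	/* assumption: actual flag value not stated */

/* Hypothetical context: remembers its create flags and one pending event. */
struct demo_ctx {
	unsigned int create_flags;
	uint32_t pending_event;
};

/* Reports events only if the context was created with the flag set. */
static int demo_spu_run(struct demo_ctx *ctx, uint32_t *event_out)
{
	int ret = 0;	/* stand-in for the normal return code */

	assert(ctx != NULL);
	if ((ctx->create_flags & DEMO_EVENTS_ENABLED) && event_out != NULL)
		*event_out = ctx->pending_event;
	/* Without the flag, the event argument is ignored entirely. */
	return ret;
}
```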
HyeonSeung Jang [Wed, 4 Oct 2006 15:26:13 +0000 (17:26 +0200)]
[POWERPC] spufs: fix context switch during page fault
For better explanation, I break down the page fault handling into steps:
1) There is a page fault caused by DMA operation initiated by SPU and
DMA is suspended.
2) The interrupt handler 'spu_irq_class_1()/__spu_trap_data_map()' is
called and it just wakes up the sleeping spe-manager thread.
3) The PPE scheduler then runs the corresponding bottom half,
spu_irq_class_1_bottom(), in process context, and DMA is
restarted.
There can be quite a large time gap between 2) and 3), and I found
the following problem:
Between 2) and 3), if the context becomes unbound, 3) is not executed
because when the spe-manager thread is awakened, the context is already
saved. (This situation can happen, for example, when a high priority spe
thread is newly started in that time gap.)
But the actual problem is that the corresponding SPU context does not
work even if it is bound again to a SPU.
Besides, I can see the following warning in the mambo simulator when the
context becomes unbound (in save_mfc_cmd()), i.e. when unbind() is called
for the context after step 2) and before step 3):
'WARNING: 61392752237: SPE2: MFC_CMD_QUEUE channel count of 15 is
inconsistent with number of available DMA queue entries of 16'
After going through the available documents, I found that the problem is
that the suspended DMA is not restarted when the context is bound again.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Mark Nutter [Wed, 4 Oct 2006 15:26:12 +0000 (17:26 +0200)]
[POWERPC] spufs: scheduler support for NUMA.
This patch adds NUMA support to the spufs scheduler.
The new arch/powerpc/platforms/cell/spufs/sched.c is greatly
simplified, in an attempt to reduce complexity while adding
support for NUMA scheduler domains. SPUs are allocated starting
from the calling thread's node, moving to others as supported by
current->cpus_allowed. Preemption is gone as it was buggy, but
should be re-enabled in another patch when stable.
The new arch/powerpc/platforms/cell/spu_base.c maintains idle
lists on a per-node basis, and allows the caller to specify which
node(s) an SPU should be allocated from, while passing -1 tells
spu_alloc() that any node is allowed.
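The allocation order described above can be sketched like this. demo_idle[], demo_spu_alloc() and demo_spu_get_idle() are hypothetical; the per-node idle lists are reduced to counters, and current->cpus_allowed is approximated by a plain bitmask of permitted nodes, which is a simplifying assumption.

```c
#include <assert.h>

#define DEMO_NODES 4

/* Hypothetical per-node count of idle SPUs, standing in for the idle lists. */
static int demo_idle[DEMO_NODES];

/* Take an idle SPU from one node, or from any node when node is -1. */
static int demo_spu_alloc(int node)
{
	int n;

	if (node == -1) {
		for (n = 0; n < DEMO_NODES; n++) {
			if (demo_idle[n] > 0) {
				demo_idle[n]--;
				return n;
			}
		}
		return -1;
	}
	assert(node >= 0 && node < DEMO_NODES);
	if (demo_idle[node] > 0) {
		demo_idle[node]--;
		return node;
	}
	return -1;
}

/* Start at the caller's node, then try the other nodes the mask permits. */
static int demo_spu_get_idle(int home_node, unsigned int allowed_mask)
{
	int i;

	for (i = 0; i < DEMO_NODES; i++) {
		int n = (home_node + i) % DEMO_NODES;

		if (!(allowed_mask & (1u << n)))
			continue;
		if (demo_spu_alloc(n) == n)
			return n;
	}
	return -1;	/* nothing idle on any permitted node */
}
```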
Since the patch removes the currently implemented preemptive
scheduling, it is technically a regression, but practically
all users have since migrated to this version, as it is
part of the IBM SDK and the yellowdog distribution, so there
is not much point holding it back while the new preemptive
scheduling patch gets delayed further.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>