Eric Dumazet [Thu, 6 Aug 2009 22:09:28 +0000 (15:09 -0700)]
execve: must clear current->clear_child_tid
While looking at Jens Rosenboom's bug report
(http://lkml.org/lkml/2009/7/27/35) about a strange sys_futex call done from
a dying "ps" program, we found the following problem.
clone() syscall has special support for TID of created threads. This
support includes two features.
One (CLONE_CHILD_SETTID) is to set an integer into user memory with the
TID value.
One (CLONE_CHILD_CLEARTID) is to clear this same integer once the created
thread dies.
The integer location is a user provided pointer, provided at clone()
time.
The kernel keeps this pointer value in current->clear_child_tid.
At execve() time, we should make sure the kernel doesn't keep this user
provided pointer, as full user memory is replaced by a new one.
As glibc fork() actually uses clone() syscall with CLONE_CHILD_SETTID and
CLONE_CHILD_CLEARTID set, chances are high that we might corrupt user
memory in forked processes.
The following sequence could happen:
1) bash (or any program) starts a new process, by a fork() call that
glibc maps to a clone( ... CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID
...) syscall
2) When new process starts, its current->clear_child_tid is set to a
location that has a meaning only in bash (or initial program) context
(&THREAD_SELF->tid)
3) This new process does the execve() syscall to start a new program.
current->clear_child_tid is left unchanged (a non NULL value)
4) If this new program creates some threads, and initial thread exits,
the kernel will attempt to clear the integer pointed to by
current->clear_child_tid from mm_release() :
/*
* We don't check the error code - if userspace has
* not set up a proper pointer then tough luck.
*/
<< here >> put_user(0, tidptr);
sys_futex(tidptr, FUTEX_WAKE, 1, NULL, NULL, 0);
}
5) OR : if new program is not multi-threaded, but spied by /proc/pid
users (ps command for example), mm_users > 1, and the exiting program
could corrupt 4 bytes in a persistent memory area (shm or memory mapped
file)
If current->clear_child_tid points to a writeable portion of memory of the
new program, kernel happily and silently corrupts 4 bytes of memory, with
unexpected effects.
Fix is straightforward and should not break any sane program.
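The sequence above can be sketched as a small userspace model. This is not the kernel patch: struct task, model_execve(), model_exit() and memory_intact_after_exit() are hypothetical names, and "user memory" is just a four-word array. It only illustrates why a stale clear_child_tid has to be forgotten when the address space is replaced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-task state discussed above. */
struct task {
	int mem[4];            /* models the task's user memory */
	int *clear_child_tid;  /* where to write 0 when the task exits */
};

/* Models execve(): user memory is replaced by a fresh image. */
void model_execve(struct task *t, int with_fix)
{
	for (size_t i = 0; i < 4; i++)
		t->mem[i] = 0x1234;        /* the new program's data */
	if (with_fix)
		t->clear_child_tid = NULL; /* the fix: forget the old pointer */
}

/* Models the exit path: clear the TID word if a pointer is registered. */
void model_exit(struct task *t)
{
	if (t->clear_child_tid)
		*t->clear_child_tid = 0;
}

/* Returns 1 if the new program's memory survives exit untouched, else 0. */
int memory_intact_after_exit(int with_fix)
{
	struct task t;

	memset(&t, 0, sizeof(t));
	t.clear_child_tid = &t.mem[2];   /* registered at clone() time */
	model_execve(&t, with_fix);
	model_exit(&t);
	for (size_t i = 0; i < 4; i++)
		if (t.mem[i] != 0x1234)
			return 0;
	return 1;
}
```

Without the reset, the exit path silently zeroes one word of the new program's data; with it, the write never happens.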
x = sdhci_alloc_host(...)
... when != x = E
(
* if (x == NULL || ...) S1 else S2
|
* if (x == NULL && ...) S1 else S2
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Acked-by: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: Matt Fleming <matt@console-pimps.org> Cc: Ian Molton <ian@mnementh.co.uk> Cc: "Roberto A. Foglietta" <roberto.foglietta@gmail.com> Cc: Philip Langdale <philipl@overt.org> Cc: Pierre Ossman <pierre@ossman.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recent framebuffer locking patches first made affected systems unbootable,
then the dead-lock has been fixed but as of 2.6.31-rc4 the framebuffer on
mx3 machines doesn't work. Fix this.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Aug 2009 22:07:39 +0000 (15:07 -0700)]
vfs: mnt_want_write_file(): fix special file handling
I suspect that mnt_want_write_file() may make a wrong assumption. I think
mnt_want_write_file() is assuming it increments ->mnt_writers if
(file->f_mode & FMODE_WRITE). But, if it's special_file(), it is false?
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sandeen [Thu, 6 Aug 2009 22:07:37 +0000 (15:07 -0700)]
compat_ioctl: hook up compat handler for FIEMAP ioctl
The FIEMAP_IOC_FIEMAP mapping ioctl was missing a 32-bit compat handler,
which means that 32-bit userspace on 64-bit kernels cannot use this ioctl
command.
The structure is nicely aligned, padded, and sized, so it is just this
simple.
Tested w/ 32-bit ioctl tester (from Josef) on a 64-bit kernel on ext4.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Mark Lord <lkml@rtr.ca> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Josef Bacik <josef@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
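To illustrate the "nicely aligned, padded, and sized" remark in the FIEMAP commit above, here is a userspace mirror of the request layout built from fixed-width types. The field names are written from memory of linux/fiemap.h and should be checked against the real header; the *_mirror structs and the three helper functions are hypothetical. Because every 64-bit field sits on an 8-byte offset and both sizes are multiples of 8, a 32-bit and a 64-bit build should agree on the layout, which is why no translation is needed in a compat handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the FIEMAP request header (fixed-width fields only). */
struct fiemap_mirror {
	uint64_t fm_start;          /* logical offset to start mapping from */
	uint64_t fm_length;         /* logical length of the mapping */
	uint32_t fm_flags;          /* request flags */
	uint32_t fm_mapped_extents; /* number of extents mapped (out) */
	uint32_t fm_extent_count;   /* size of the extent array (in) */
	uint32_t fm_reserved;
};

/* Mirror of one extent record that follows the header. */
struct fiemap_extent_mirror {
	uint64_t fe_logical;
	uint64_t fe_physical;
	uint64_t fe_length;
	uint64_t fe_reserved64[2];
	uint32_t fe_flags;
	uint32_t fe_reserved[3];
};

/* Small accessors so the layout can be checked at run time. */
size_t fiemap_header_size(void)  { return sizeof(struct fiemap_mirror); }
size_t fiemap_extent_size(void)  { return sizeof(struct fiemap_extent_mirror); }
size_t fiemap_flags_offset(void) { return offsetof(struct fiemap_mirror, fm_flags); }
```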
The common way for the VC drivers is to set the screen dimension
parameters manually in the init case and only call vc_resize() for
!init - which allocates a screen buffer according to the new
dimensions.
fbcon instead would do vc_resize() unconditionally and afterwards set
the dimensions manually (again) for !init - i.e. completely upside
down. The vc_resize() allocated buffer would then get lost by
vc_allocate() allocating a fresh one.
Use vc_resize() only for actual resizing to close the leak.
Set the dimensions manually only in initialization mode to remove the
redundant setting in resize mode.
Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Tested-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a bug caused by changing pointers (viafb_mode, viafb_mode1)
assigned by module_param. It reduces driver complexity by not needlessly
changing these vars as they are only read once and removing now
superfluous code.
On unpatched kernels loading viafb with viafb_mode or viafb_mode1 option
used and afterwards unloading it results in:
This is caused by the current code changing the pointers assigned by
module_param. During unload it tries to free the memory the pointers
point at which is now part of an internal structure.
The patch simply avoids changing the pointers. This is okay as they are
read only once during the initialization process.
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Cc: Scott Fang <ScottFang@viatech.com.cn> Cc: Joseph Chan <JosephChan@via.com.tw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: make set_mempolicy(MPOL_INTERLEAVE) N_HIGH_MEMORY aware
At first, init_task's mems_allowed is initialized as this.
init_task->mems_allowed == node_state[N_POSSIBLE]
And cpuset's top_cpuset mask is initialized as this
top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]
Before 2.6.29:
policy's mems_allowed is initialized as this.
1. update tasks->mems_allowed by its cpuset->mems_allowed.
2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask)
Updating task's mems_allowed in reference to top_cpuset's one.
cpuset's mems_allowed is aware of N_HIGH_MEMORY, always.
In 2.6.30: After commit 58568d2a8215cb6f55caf2332017d7bdff954e1c
("cpuset,mm: update tasks' mems_allowed in time"), policy's mems_allowed
is initialized as this.
Then, policy's mems_allowed can include a possible node which has no pgdat.
MPOL's INTERLEAVE just scans the nodemask of task->mems_allowed and accesses
this directly:
NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL
So what we need is to make policy->mems_allowed aware of
N_HIGH_MEMORY. This patch does that. But to do so, an extra nodemask will
be on the stack. Because I know cpumask has a new interface,
CPUMASK_ALLOC(), I added an equivalent for nodemask.
This patch stands on the old behavior. But I feel this fix itself is just a
Band-Aid. To do a fundamental fix, we have to take care of memory
hotplug, and that takes time. (task->mems_allowed should be N_HIGH_MEMORY, I
think.)
mpol_set_nodemask() should be aware of N_HIGH_MEMORY, and policy's nodemask
should include only online nodes.
In the old behavior, this was guaranteed by frequent reference to cpuset's
code. Now, most of those references are removed and mempolicy has to check
it by itself.
To do the check, a few nodemask_t will be used for calculating the nodemask.
But the size of nodemask_t can be big and it's not good to allocate them on
the stack.
Now, cpumask_t has CPUMASK_ALLOC/FREE, an easy way to get a scratch area.
NODEMASK_ALLOC/FREE should be there too.
[akpm@linux-foundation.org: cleanups & tweaks] Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Miao Xie <miaox@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Paul Menage <menage@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: David Rientjes <rientjes@google.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
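The idea of the mempolicy commit above, restricting the interleave mask to nodes that actually have memory before scanning it, can be sketched in userspace as follows. This is a simplified model and not the kernel code: nodemask64, policy_nodes() and next_interleave_node() are hypothetical, and a 64-bit word stands in for nodemask_t (which is exactly the type the commit avoids putting on the stack because it can be large).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-node mask; the kernel's nodemask_t can be much larger. */
typedef uint64_t nodemask64;

/* Restrict a user-supplied mask to nodes that actually have memory. */
nodemask64 policy_nodes(nodemask64 user_mask, nodemask64 nodes_with_memory)
{
	return user_mask & nodes_with_memory;
}

/*
 * Pick the next interleave node after 'prev' (use -1 to start), wrapping
 * around. Returns -1 if the mask is empty.
 */
int next_interleave_node(nodemask64 mask, int prev)
{
	for (int step = 1; step <= 64; step++) {
		int nid = (prev + step) % 64;

		if (mask & ((nodemask64)1 << nid))
			return nid;
	}
	return -1;
}
```

With possible nodes 0-3 but memory only on nodes 0 and 2, the interleave scan never lands on a node without backing data, which is the case that dereferenced a NULL NODE_DATA(nid).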
Stefani Seibold [Thu, 6 Aug 2009 22:07:30 +0000 (15:07 -0700)]
fbcon: fix rotate upside down crash
Fix the rotate_ud() function not to crash in case of a font whose width is
not a multiple of 8: the inner loop of the font pixel copy should not
access a bit outside the font memory area. Subtracting the shift offset from
the font width prevents this.
Signed-off-by: Stefani Seibold <stefani@seibold.net> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
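A userspace sketch of the loop bound described in the rotate_ud() commit above. It is written from the description, not copied from the driver: the helper names glyph_test_bit() and glyph_set_bit() are hypothetical, and glyphs are assumed to be 1 bit per pixel, MSB first, with each row padded to a whole byte.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Test one pixel; 'pitch' is the padded row length in pixels. */
static int glyph_test_bit(uint32_t x, uint32_t y, uint32_t pitch,
			  const uint8_t *pat)
{
	uint32_t pos = y * pitch + x;

	return pat[pos / 8] & (0x80u >> (pos % 8));
}

/* Set one pixel. */
static void glyph_set_bit(uint32_t x, uint32_t y, uint32_t pitch, uint8_t *pat)
{
	uint32_t pos = y * pitch + x;

	pat[pos / 8] |= (uint8_t)(0x80u >> (pos % 8));
}

/* Rotate a glyph by 180 degrees. 'out' must be zeroed and as large as 'in'. */
void rotate_ud(const uint8_t *in, uint8_t *out, uint32_t width, uint32_t height)
{
	uint32_t shift = (8 - (width % 8)) & 7; /* padding bits per row */
	uint32_t pitch = (width + 7) & ~7u;     /* row length rounded up to a byte */

	for (uint32_t i = 0; i < height; i++) {
		/*
		 * Stop at pitch - shift, i.e. the real width. Going further
		 * would make pitch - (1 + j + shift) wrap around and address
		 * a bit far outside the glyph.
		 */
		for (uint32_t j = 0; j < pitch - shift; j++) {
			if (glyph_test_bit(j, i, pitch, in))
				glyph_set_bit(pitch - (1 + j + shift),
					      height - (1 + i), pitch, out);
		}
	}
}
```

For a 6-pixel-wide, 2-row glyph with only the top-left pixel set, the rotated glyph has only the bottom-right visible pixel (column 5 of row 1) set.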
Linus Torvalds [Tue, 4 Aug 2009 22:39:55 +0000 (15:39 -0700)]
Merge branch 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'fix/hda' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Read buffer overflow
ALSA: hda: Correct EAPD for Dell Inspiron 1525
ALSA: hda: warn on spurious response
ALSA: hda: remember last command for each codec
ALSA: hda: read CORBWP inside reg_lock
ALSA: hda: take reg_lock in azx_init_cmd_io/azx_free_cmd_io
ALSA: hda: take cmd_mutex in probe_codec()
ALSA: hda: track CIRB/CORB command/response states for each codec
ALSA: hda - Fix quirk for Toshiba Satellite A135-S4527
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
tty-ldisc: be more careful in 'put_ldisc' locking
tty-ldisc: turn ldisc user count into a proper refcount
tty-ldisc: make refcount be atomic_t 'users' count
Linus Torvalds [Tue, 4 Aug 2009 22:39:16 +0000 (15:39 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Make SCSI SG v4 driver enabled by default and remove EXPERIMENTAL dependency, since udev depends on BSG
block: Update topology documentation
block: Stack optimal I/O size
block: Add a wrapper for setting minimum request size without a queue
block: Make blk_queue_stack_limits use the new stacking interface
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (23 commits)
[SCSI] sd: Avoid sending extended inquiry to legacy devices
[SCSI] libsas: fix wide port hotplug issues
[SCSI] libfc: fix a circular locking warning during sending RRQ
[SCSI] qla4xxx: Remove hiwat code so scsi eh does not get escalated when we can make progress
[SCSI] qla4xxx: Fix srb lookup in qla4xxx_eh_device_reset
[SCSI] qla4xxx: Fix Driver Fault Recovery Completion
[SCSI] qla4xxx: add timeout handler
[SCSI] qla4xxx: Correct Extended Sense Data Errors
[SCSI] libiscsi: disable bh in and abort handler.
[SCSI] zfcp: Fix tracing of request id for abort requests
[SCSI] zfcp: Fix wka port processing
[SCSI] zfcp: avoid double notify in lowmem scenario
[SCSI] zfcp: Add port only once to FC transport class
[SCSI] zfcp: Recover from stalled outbound queue
[SCSI] zfcp: Fix erp escalation procedure
[SCSI] zfcp: Fix logic for physical port close
[SCSI] zfcp: Use -EIO for SBAL allocation failures
[SCSI] zfcp: Use unchained mode for small ct and els requests
[SCSI] zfcp: Use correct flags for zfcp_erp_notify
[SCSI] zfcp: Return -ENOMEM for allocation failures in zfcp_fsf
...
Linus Torvalds [Tue, 4 Aug 2009 22:32:40 +0000 (15:32 -0700)]
Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Set the CONFIG_PERF_COUNTERS default to y if CONFIG_PROFILING=y
perf: Fix read buffer overflow
perf top: Add mwait_idle_with_hints to skip_symbols[]
perf tools: Fix faulty check
perf report: Update for the new FORK/EXIT events
perf_counter: Full task tracing
perf_counter: Collapse inherit on read()
tracing, perf_counter: Add help text to CONFIG_EVENT_PROFILE
perf_counter tools: Fix link errors with older toolchains
Linus Torvalds [Tue, 4 Aug 2009 22:32:22 +0000 (15:32 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix race in cpupri introduced by cpumask_var changes
sched: Fix latencytop and sleep profiling vs group scheduling
Linus Torvalds [Tue, 4 Aug 2009 22:32:08 +0000 (15:32 -0700)]
Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
posix-timers: Fix oops in clock_nanosleep() with CLOCK_MONOTONIC_RAW
Linus Torvalds [Tue, 4 Aug 2009 22:31:51 +0000 (15:31 -0700)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Fix missing function_graph events when we splice_read from trace_pipe
tracing: Fix invalid function_graph entry
trace: stop tracer in oops_enter()
ftrace: Only update $offset when we update $ref_func
ftrace: Fix the conditional that updates $ref_func
tracing: only truncate ftrace files when O_TRUNC is set
tracing: show proper address for trace-printk format
Linus Torvalds [Tue, 4 Aug 2009 22:28:59 +0000 (15:28 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Work around compilation warning in arch/x86/kernel/apm_32.c
x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq()
x86, 32-bit: Fix double accounting in reserve_top_address()
x86: Don't use current_cpu_data in x2apic phys_pkg_id
x86, UV: Fix UV apic mode
x86, UV: Fix macros for accessing large node numbers
x86, UV: Delete mapping of MMR ranges mapped by BIOS
x86, UV: Handle missing blade-local memory correctly
x86: fix assembly constraints in native_save_fl()
x86, msr: execute on the correct CPU subset
x86: Fix assert syntax in vmlinux.lds.S
x86: Make 64-bit efi_ioremap use ioremap on MMIO regions
x86: Add quirk to make Apple MacBook5,2 use reboot=pci
x86: Fix CPA memtype reserving in the set_pages_array*() cases
x86, pat: Fix set_memory_wc related corruption
x86: fix section mismatch for i386 init code
Linus Torvalds [Tue, 4 Aug 2009 22:28:46 +0000 (15:28 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] Make cpufreq suspend code conditional on powerpc.
[CPUFREQ] Fix a kobject reference bug related to managed CPUs
[CPUFREQ] Do not set policy for offline cpus
[CPUFREQ] Fix NULL pointer dereference regression in conservative governor
Linus Torvalds [Tue, 4 Aug 2009 22:28:23 +0000 (15:28 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix missing unlock in error path of nilfs_mdt_write_page
nilfs2: fix oops due to inconsistent state in page with discrete b-tree nodes
Linus Torvalds [Mon, 3 Aug 2009 21:54:56 +0000 (14:54 -0700)]
tty-ldisc: be more careful in 'put_ldisc' locking
Use 'atomic_dec_and_lock()' to make sure that we always hold the
tty_ldisc_lock when the ldisc count goes to zero. That way we can never
race against 'tty_ldisc_try()' increasing the count again.
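The "decrement, and take the lock only if the count hits zero" semantics can be modelled in userspace with C11 atomics. This is a sketch of the technique, not the kernel's atomic_dec_and_lock() implementation; the spinlock type and the function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock built on a C11 atomic flag. */
typedef struct { atomic_flag held; } spinlock;

void spin_lock(spinlock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

void spin_unlock(spinlock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Decrement *cnt. Returns true with the lock held if the count reached
 * zero, false (lock not held) otherwise.
 */
bool dec_and_lock(atomic_int *cnt, spinlock *l)
{
	int old = atomic_load(cnt);

	/* Fast path: not the last reference, so no lock is needed. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: this may be the last reference, so lock first. */
	spin_lock(l);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;
	spin_unlock(l);
	return false;
}
```

Because the final decrement only ever happens under the lock, a lookup path that takes the same lock before incrementing can never see the count go to zero underneath it.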
Linus Torvalds [Mon, 3 Aug 2009 18:11:19 +0000 (11:11 -0700)]
tty-ldisc: turn ldisc user count into a proper refcount
By using the user count for the actual lifetime rules, we can get rid of
the silly "wait_for_idle" logic, because any busy ldisc will
automatically stay around until the last user releases it. This avoids
a host of odd issues, and simplifies the code.
So now, when the last ldisc reference is dropped, we just release the
ldisc operations struct reference, and free the ldisc.
It looks obvious enough, and it does work for me, but the counting
_could_ be off. It probably isn't (bad counting in the new version would
generally imply that the old code did something really bad, like free an
ldisc with a non-zero count), but it does need some testing, and
preferably somebody looking at it.
With this change, both 'tty_ldisc_put()' and 'tty_ldisc_deref()' are
just aliases for the new ref-counting 'put_ldisc()'. Both of them
decrement the ldisc user count and free it if it goes down to zero.
They're identical functions, in other words.
But the reason they still exist as separate functions is that one of them
was exported (tty_ldisc_deref) and had a stupid name (so I don't want to
use it as the main name), and the other one was used in multiple places
(and I didn't want to make the patch larger just to rename the users).
In addition to the refcounting, I did do some minimal cleanup. For
example, now "tty_ldisc_try()" actually returns the ldisc it got under
the lock, rather than returning true/false and then the caller would
look up the ldisc again (now without the protection of the lock).
That said, there's tons of dubious use of 'tty->ldisc' without obviously
proper locking or refcounting left. I expressly did _not_ want to try to
fix it all, keeping the patch minimal. There may or may not be bugs in
that kind of code, but they wouldn't be _new_ bugs.
That said, even if the bugs aren't new, the timing and lifetime will
change. For example, some silly code may depend on the 'tty->ldisc'
pointer not changing because they hold a refcount on the 'ldisc'. And
that's no longer true - if you hold a ref on the ldisc, the 'ldisc'
itself is safe, but tty->ldisc may change.
So the proper locking (remains) to hold tty->ldisc_mutex if you expect
tty->ldisc to be stable. That's not really a _new_ rule, but it's an
example of something that the old code might have unintentionally
depended on and hidden bugs.
Whatever. The patch _looks_ sensible to me. The only users of
ldisc->users are:
- get_ldisc() - atomically increment the count
- put_ldisc() - atomically decrement the count and release if zero
- tty_ldisc_try_get() - create the ldisc, and set the count to 1.
The ldisc should then either be released, or be attached to a tty.
Linus Torvalds [Mon, 3 Aug 2009 17:58:29 +0000 (10:58 -0700)]
tty-ldisc: make refcount be atomic_t 'users' count
This is pure preparation of changing the ldisc reference counting to be
a true refcount that defines the lifetime of the ldisc. But this is a
purely syntactic change for now to make the next steps easier.
This patch should make no semantic changes at all. But I wanted to make
the ldisc refcount be an atomic (I will be touching it without locks
soon enough), and I wanted to rename it so that there isn't quite as
much confusion between 'ldo->refcount' (ldisc operations refcount) and
'ld->refcount' (ldisc refcount itself) in the same file.
So it's now an atomic 'ld->users' count. It still starts at zero,
despite having a reference from 'tty->ldisc', but that will change once
we turn it into a _real_ refcount.
Alexander Duyck [Tue, 4 Aug 2009 18:46:41 +0000 (11:46 -0700)]
igbvf: Allow VF driver to correctly recognize failure to set mac
The VF driver was not correctly recognizing that it had failed to set
its MAC address. As a result the VF driver was unable to receive network
traffic until being unloaded and reloaded. The issue was root-caused to
the fact that the CTS bit was not taken into account when checking for the
request being NAKed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
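The kind of check described in the igbvf commit above can be sketched as follows. The bit values and the command code are illustrative assumptions, not taken from the driver headers, and both function names are hypothetical. The point is only that an exact compare misses a NACK reply that also carries the CTS bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real definitions live in the driver. */
#define MSGTYPE_NACK    0x40000000u /* request was rejected */
#define MSGTYPE_CTS     0x20000000u /* clear-to-send, may accompany a reply */
#define VF_SET_MAC_ADDR 0x00000002u /* hypothetical command code */

/* Buggy check: an exact compare misses a NACK that also carries CTS. */
bool set_mac_failed_exact(uint32_t reply)
{
	return reply == (VF_SET_MAC_ADDR | MSGTYPE_NACK);
}

/* Fixed check: ignore CTS before comparing. */
bool set_mac_failed_masked(uint32_t reply)
{
	reply &= ~MSGTYPE_CTS;
	return reply == (VF_SET_MAC_ADDR | MSGTYPE_NACK);
}
```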
Dave Jones [Tue, 4 Aug 2009 18:03:25 +0000 (14:03 -0400)]
[CPUFREQ] Make cpufreq suspend code conditional on powerpc.
The suspend code runs with interrupts disabled, and the powerpc workaround we
do in the cpufreq suspend hook calls the driver's ->get method.
powernow-k8's ->get does an smp_call_function_single,
which needs interrupts enabled.
cpufreq's suspend/resume code was added in 42d4dc3f4e1e to work around
a hardware problem on ppc powerbooks. If we make all this code
conditional on powerpc, we avoid the issue above.
The policy is released when the cpufreq device is removed in:
__cpufreq_remove_dev():
/* if this isn't the CPU which is the parent of the kobj, we
* only need to unlink, put and exit
*/
Not creating the symlink is not severe at all,
as long as:
sysfs_remove_link(&sys_dev->kobj, "cpufreq");
gracefully handles the case where the symlink did not exist.
Possibly no error should be returned at all, because ondemand
governor would still provide the same functionality.
Userspace, in the userspace governor case, might be confused if the link
is missing.
Prarit Bhargava [Mon, 3 Aug 2009 14:58:11 +0000 (10:58 -0400)]
[CPUFREQ] Do not set policy for offline cpus
Suspend/Resume fails on multi socket, multi core systems because the cpufreq
code erroneously sets the per_cpu policy_cpu value when a logical cpu is
offline.
This most notably results in missing sysfs files that are used to set the
cpu frequencies of the various cpus.
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Dave Jones <davej@redhat.com>
[CPUFREQ] Fix NULL pointer dereference regression in conservative governor
Commit ee88415caf736b89500f16e0a545614541a45005
introduced this regression when it removed enable bit in cpu_dbs_info_s.
That added a possibility of dbs_cpufreq_notifier getting called for a
CPU that is not yet managed by conservative governor. That will happen
as the transition notifier is set as soon as one CPU switches to
conservative governor and other CPUs can get a NULL pointer dereference
without the enable bit check. Add the enable bit back again.
Reported-by: Lermytte Christophe <Christophe.Lermytte@thomson.net> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Dave Jones <davej@redhat.com>
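A simplified model of the guard restored by the conservative governor commit above. The structures are loosely modelled on the description (a per-CPU info block with an enable bit and a policy pointer); field and function names other than "enable" are assumptions for illustration, not the governor's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct policy {
	unsigned int min, max;
};

/* Hypothetical per-CPU governor state. */
struct cpu_dbs_info {
	struct policy *cur_policy;   /* NULL until this CPU is managed */
	bool enable;                 /* set when the governor starts here */
	unsigned int requested_freq;
};

/*
 * Transition notifier. It is registered as soon as one CPU uses the
 * governor, so it can fire for CPUs that are not managed yet.
 * Returns true if the CPU was managed and its state was examined.
 */
bool dbs_notifier(struct cpu_dbs_info *info, unsigned int new_freq)
{
	const struct policy *p;

	if (!info->enable)
		return false;        /* not managed: cur_policy may be NULL */

	p = info->cur_policy;
	if (info->requested_freq > p->max || info->requested_freq < p->min)
		info->requested_freq = new_freq;
	return true;
}
```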
Russell King [Mon, 27 Jul 2009 06:00:48 +0000 (11:30 +0530)]
mfd: twl4030 irq fixes
The TWL4030 IRQ handler has a bug which leads to spinlock lock-up. It is
calling the 'unmask' function in a process context. The mask/unmask/ack
functions are only designed to be called from the IRQ handler code,
or the proper API interfaces found in linux/interrupt.h.
Also there is no need to have an IRQ chaining mechanism. The right way to
handle this is to claim the parent interrupt as a standard interrupt
and arrange for handle_twl4030_pih to take care of the rest of the devices.
Mail thread on this issue can be found at:
http://marc.info/?l=linux-arm-kernel&m=124629940123396&w=2
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Ingo Molnar [Mon, 29 Jun 2009 08:40:20 +0000 (10:40 +0200)]
perf_counter: Set the CONFIG_PERF_COUNTERS default to y if CONFIG_PROFILING=y
If the user has already enabled profiling support in the kernel
(for oprofile, old-style profiling of ftrace) then offer up
perfcounters with a y default in interactive kconfig sessions.
Jack Steiner [Mon, 20 Jul 2009 14:28:41 +0000 (09:28 -0500)]
x86, UV: Complete IRQ interrupt migration in arch_enable_uv_irq()
In uv_setup_irq(), the call to create_irq() initially assigns
IRQ vectors to cpu 0. The subsequent call to
assign_irq_vector() in arch_enable_uv_irq() migrates the IRQ to
another cpu and frees the cpu 0 vector - at least it will be
freed as soon as the "IRQ move" completes.
arch_enable_uv_irq() needs to send a cleanup IPI to complete
the IRQ move. Otherwise, assignment of GRU interrupts on large
systems (>200 cpus) will exhaust the cpu 0 interrupt vectors
and initialization of the GRU driver will fail.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090720142840.GA8885@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jan Beulich [Thu, 30 Jul 2009 15:49:30 +0000 (16:49 +0100)]
x86, 32-bit: Fix double accounting in reserve_top_address()
With VMALLOC_END included in the calculation of MAXMEM (as of
2.6.28) it is no longer correct to also bump __VMALLOC_RESERVE
in reserve_top_address(). Doing so results in needlessly small
lowmem.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4A71DD2A020000780000D482@vpn.id2.novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
It turns out we cannot use current_cpu_data in phys_pkg_id
for x2apic.
identify_boot_cpu() is called by check_bugs() before
smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
for bsp is assigned with boot_cpu_data.
Just align phys_pkg_id for x2apic with xapic.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <4A6ADD0D.10002@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jack Steiner [Mon, 27 Jul 2009 14:38:08 +0000 (09:38 -0500)]
x86, UV: Fix macros for accessing large node numbers
The UV chipset automatically supplies the upper bits on nodes
being referenced by MMR accesses. These bits can be deleted from
the hub addressing macros.
Signed-off-by: Jack Steiner <steiner@sgi.com>
LKML-Reference: <20090727143808.GA8076@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jack Steiner [Mon, 27 Jul 2009 14:36:56 +0000 (09:36 -0500)]
x86, UV: Delete mapping of MMR ranges mapped by BIOS
The UV BIOS has added additional MMR ranges that are mapped via
EFI virtual mode mappings. These ranges should be deleted from
ranges mapped by uv_system_init().
UV blades may not have any blade-local memory. Add a field
(nid) to the UV blade structure to indicate whether the node
has local memory. This is needed by the GRU driver (pushed
separately).
Jean Delvare [Tue, 4 Aug 2009 04:10:01 +0000 (21:10 -0700)]
3c59x: Fix build failure with gcc 3.2
Fix the following build failure with gcc 3.2:
CC [M] drivers/net/3c59x.o
drivers/net/3c59x.c:2726:1: directives may not be used inside a macro argument
drivers/net/3c59x.c:2725:59: unterminated argument list invoking macro "pr_err"
drivers/net/3c59x.c: In function `dump_tx_ring':
drivers/net/3c59x.c:2727: implicit declaration of function `pr_err'
drivers/net/3c59x.c:2731: syntax error before ')' token
Apparently gcc 3.2 doesn't like #if interleaved with a macro call.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: David S. Miller <davem@davemloft.net>
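The portability problem in the 3c59x commit above is that a preprocessor directive inside the argument list of a function-like macro has undefined behaviour in ISO C, so compilers are free to reject it. A minimal sketch of the portable form follows; LOG(), DO_ZEROCOPY as used here, and dump_desc() are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DO_ZEROCOPY 1 /* hypothetical build switch */

/* Function-like macro standing in for pr_err(); it formats into a buffer. */
#define LOG(buf, size, ...) snprintf((buf), (size), __VA_ARGS__)

/*
 * Not portable, because the directives sit inside the macro's arguments:
 *
 *     LOG(buf, size, "%d: length %x\n", i,
 *     #if DO_ZEROCOPY
 *         frag_len
 *     #else
 *         len
 *     #endif
 *         );
 */

/* Portable: the conditional wraps complete macro invocations instead. */
int dump_desc(char *buf, size_t size, int i, unsigned frag_len, unsigned len)
{
#if DO_ZEROCOPY
	(void)len;
	return LOG(buf, size, "%d: length %x\n", i, frag_len);
#else
	(void)frag_len;
	return LOG(buf, size, "%d: length %x\n", i, len);
#endif
}
```

The cost is a duplicated format string, which is usually an acceptable price for building on older toolchains.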
Mike McCormack [Fri, 31 Jul 2009 01:57:42 +0000 (01:57 +0000)]
sky2: Avoid transmits during sky2_down()
This patch supersedes my previous patch "sky2: Avoid transmitting
during sky2_restart".
I have reworked the patch to avoid crashes during both sky2_restart()
and sky2_set_ringparam().
Without this patch, the sky2 driver can be crashed by doing:
# pktgen eth1 & (transmit many packets on eth1)
# ethtool -G eth1 tx 510
I am aware you object to storing extra state, but I can't see a way
around this. Without remembering that we're restarting,
netif_wake_queue() is called in the ISR from sky2_tx_complete(), and
netif_tx_lock() is used in sky2_tx_done(). If anybody can see a way
around this, please let me know.
Signed-off-by: Mike McCormack <mikem@ring3k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
H. Peter Anvin [Mon, 3 Aug 2009 23:33:40 +0000 (16:33 -0700)]
x86: fix assembly constraints in native_save_fl()
From Gabe Black in bugzilla 13888:
native_save_fl is implemented as follows:
    static inline unsigned long native_save_fl(void)
    {
            unsigned long flags;

            asm volatile("# __raw_save_flags\n\t"
                         "pushf ; pop %0"
                         : "=g" (flags)
                         : /* no input */
                         : "memory");

            return flags;
    }
If gcc chooses to put flags on the stack, for instance because this is
inlined into a larger function with more register pressure, the offset
of the flags variable from the stack pointer will change when the
pushf is performed. gcc doesn't attempt to understand that fact, and
the address used for pop will still be the same. It will write to
somewhere near flags on the stack but not actually into it and
overwrite some other value.
I saw this happen in the ide_device_add_all function when running in a
simulator I work on. I'm assuming that some quirk of how the simulated
hardware is set up caused the code path this is on to be executed when
it normally wouldn't.
A simple fix might be to change "=g" to "=r".
Reported-by: Gabe Black <spamforgabe@umich.edu> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Stable Team <stable@kernel.org>
This happens because the 64-bit version of efi_ioremap calls
init_memory_mapping for all addresses, regardless of whether they are
RAM or MMIO. The EFI tables on this machine ask for runtime access to
some MMIO regions:
This arranges to pass the EFI memory type through to efi_ioremap, and
makes efi_ioremap use ioremap rather than init_memory_mapping if the
type is EFI_MEMORY_MAPPED_IO. With this, the above warning goes away.
Signed-off-by: Paul Mackerras <paulus@samba.org>
LKML-Reference: <19062.55858.533494.471153@cargo.ozlabs.ibm.com> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Paul Mackerras [Mon, 3 Aug 2009 12:47:32 +0000 (22:47 +1000)]
x86: Add quirk to make Apple MacBook5,2 use reboot=pci
The latest Apple MacBook (MacBook5,2) doesn't reboot successfully
under Linux; neither the EFI reboot method nor the default method
using the keyboard controller works (the system just hangs and doesn't
reset). However, the method using the "PCI reset register" at 0xcf9
does work.
This adds a quirk to detect this machine via DMI and force the
reboot_type to BOOT_CF9. With this it reboots successfully without
requiring a command-line option. Note that the EFI code forces
reboot_type to BOOT_EFI when the machine is booted via EFI, but this
overrides that since the core_initcall runs after the EFI
initialization code.
Signed-off-by: Paul Mackerras <paulus@samba.org>
LKML-Reference: <19062.56420.501516.316181@cargo.ozlabs.ibm.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reinette Chatre [Mon, 3 Aug 2009 19:10:16 +0000 (12:10 -0700)]
iwlagn: do not send key clear commands when rfkill enabled
Do all key clearing except sending commands to the device when rfkill is
enabled. When rfkill is enabled the interface is brought down and will
be brought back up correctly after rfkill is enabled again.
Same change is not needed for iwl3945 as it ignores return code when
sending key clearing command to device.
This fixes http://bugzilla.kernel.org/show_bug.cgi?id=13742
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Frans Pop <elendil@planet.nl> Signed-off-by: John W. Linville <linville@tuxdriver.com>
"cfg80211: respect API on orig_flags on channel for beacon hint"
We did indeed respect _orig flags but the intention was not clearly
stated in the commit log. This patch fixes firmware issues picked
up by iwlwifi when we lift passive scan or beaconing restrictions
on channels its EEPROM has been configured to always enable.
By doing so, though, we also disallowed beacon hints on devices
registering their wiphy with custom world regulatory domains
enabled; these currently happen to be ath5k, ath9k and ar9170.
The passive scan and beacon restrictions on those devices would
never be lifted even if we did find a beacon and the hardware did
support such enhancements when world roaming.
Since Johannes indicates iwlwifi firmware cannot be changed to
allow beacon hinting we set up a flag now to specifically allow
drivers to disable beacon hints for devices which cannot use them.
We enable the flag on iwlwifi to disable beacon hints and by default
enable it for all other drivers. It should be noted beacon hints lift
passive scan flags and beacon restrictions when we receive a beacon from
an AP on any 5 GHz non-DFS channels, and channels 12-14 on the 2.4 GHz
band. We don't bother with channels 1-11 as those channels are allowed
world wide.
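The rule in the paragraph above can be written as a small predicate. This is a sketch under stated assumptions: beacon_hint_applies() and its parameters are hypothetical names, not the cfg80211 API, and the 5 GHz band is approximated as 5000 to 5999 MHz.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a beacon seen on freq_mhz may lift passive scan and
 * beaconing restrictions. hints_disabled models the new per-driver flag.
 */
bool beacon_hint_applies(unsigned int freq_mhz, bool is_dfs, bool hints_disabled)
{
    assert(freq_mhz > 0);

    if (hints_disabled)
        return false;

    /* 5 GHz band (approximate bounds): only non-DFS channels qualify. */
    if (freq_mhz >= 5000 && freq_mhz < 6000)
        return !is_dfs;

    /* 2.4 GHz band: only channels 12, 13 and 14 need a hint. */
    return freq_mhz == 2467 || freq_mhz == 2472 || freq_mhz == 2484;
}
```

Channels 1-11 fall through to false because they carry no restriction to lift.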
This should fix world roaming for ath5k, ath9k and ar9170, thereby
improving scan time when we receive the first beacon from any AP,
and also enabling beaconing operation (AP/IBSS/Mesh) on cards which
would otherwise not be allowed to do so. Drivers not using custom
regulatory stuff (wiphy_apply_custom_regulatory()) were not affected
by this as the orig_flags for the channels would have been cleared
upon wiphy registration.
I tested this with a world roaming ath5k card.
Cc: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Wed, 29 Jul 2009 20:07:44 +0000 (22:07 +0200)]
cfg80211: add two missing NULL pointer checks
These pointers can be NULL. The is_mesh() case isn't
ever hit in the current kernel, but cmp_ies() can be
hit under certain conditions.
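For illustration, a NULL-safe comparison of two optional buffers might look like the sketch below. cmp_elems() is a hypothetical helper, not the actual cmp_ies() code; the point is only that both pointers are checked before memcmp() touches them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare two optional byte buffers. A NULL pointer means "element absent"
 * and must come with a length of 0.
 */
int cmp_elems(const unsigned char *a, size_t alen,
              const unsigned char *b, size_t blen)
{
    size_t n;
    int r;

    assert(a != NULL || alen == 0);
    assert(b != NULL || blen == 0);

    if (!a && !b)
        return 0;     /* both absent: treat as equal */
    if (!a || !b)
        return -1;    /* only one present: they differ */

    n = alen < blen ? alen : blen;
    r = memcmp(a, b, n);
    if (r != 0)
        return r;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}
```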
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@kernel.org [2.6.29, 2.6.30]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ixgbe: Patch to modify 82598 PCIe completion timeout values
The default completion timeout values for 82598 should be in the
range of 50us to 50ms; however, the hardware default for these
parts is 500us to 1ms, which is less than the 10ms recommended by
the PCIe spec. To address this we need to increase the value to
either 10ms to 250ms for a capability version 1 configuration, or
16ms to 55ms for version 2.
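The selection can be sketched as follows. The struct and pick_completion_timeout() are hypothetical names, and the sketch only returns the ranges quoted above; it does not show how the driver programs the hardware.

```c
#include <assert.h>

/* Completion timeout range in microseconds. */
struct timeout_range {
    unsigned int min_us;
    unsigned int max_us;
};

/* Pick the range named in the message for a given capability version. */
struct timeout_range pick_completion_timeout(int cap_version)
{
    struct timeout_range r;

    assert(cap_version == 1 || cap_version == 2);

    if (cap_version == 2) {
        r.min_us = 16000;     /* 16ms */
        r.max_us = 55000;     /* 55ms */
    } else {
        r.min_us = 10000;     /* 10ms */
        r.max_us = 250000;    /* 250ms */
    }
    return r;
}
```

Both ranges start at or above the 10ms the message cites as the recommended minimum.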
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Young [Mon, 3 Aug 2009 04:26:16 +0000 (04:26 +0000)]
bluetooth: rfcomm_init bug fix
The rfcomm tty may be used before rfcomm_tty_driver is initialized.
The problem is that the socket layer is now initialized before the tty
layer; if a userspace program makes a socket call in that window, an
oops will happen.
Make 3 changes:
1. Remove the #ifdef in rfcomm/core.c and
make it a blank function in rfcomm.h when rfcomm tty is not selected.
2. Tune the rfcomm_init error path to ensure the
tty driver is initialized before rfcomm sockets are used.
3. Remove __exit from rfcomm_cleanup_sockets,
because the above change needs to call it from an __init function.
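The ordering in change 2 follows the usual init and unwind pattern, sketched here in userspace. All names are hypothetical stand-ins rather than the rfcomm functions, and fail_step exists only to exercise the error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical module state; not the real rfcomm structures. */
struct subsystem {
    bool tty_ready;
    bool sockets_ready;
};

/* fail_step is a demonstration knob: 1 fails the tty step, 2 the socket step. */
static int init_tty(struct subsystem *s, int fail_step)
{
    if (fail_step == 1)
        return -1;
    s->tty_ready = true;
    return 0;
}

static void cleanup_tty(struct subsystem *s)
{
    s->tty_ready = false;
}

static int init_sockets(struct subsystem *s, int fail_step)
{
    if (fail_step == 2)
        return -1;
    s->sockets_ready = true;
    return 0;
}

int subsystem_init(struct subsystem *s, int fail_step)
{
    int err;

    assert(s != NULL);

    err = init_tty(s, fail_step);       /* tty first: sockets may use it at once */
    if (err)
        return err;

    err = init_sockets(s, fail_step);   /* userspace can reach us from here on */
    if (err)
        goto undo_tty;

    return 0;

undo_tty:
    cleanup_tty(s);                     /* unwind in reverse order */
    return err;
}
```

Because the init path itself calls a cleanup routine on failure, that routine cannot live in a section that is discarded for built-in code, which is the reason given for change 3.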
Reported-by: Oliver Hartkopp <oliver@hartkopp.net>
Tested-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
x86: Fix CPA memtype reserving in the set_pages_array*() cases
The code was incorrectly reserving memtypes using the page
virtual address instead of the physical address. Furthermore,
the code was not ignoring highmem pages as it ought to.
(Upstream does not pass in highmem pages yet, but upcoming
graphics code will, and there's no reason not to handle
this properly in the CPA APIs.)
MIPS: MTX-1: Request button GPIO before setting its direction
This patch fixes the following warning at boot time:
WARNING: at drivers/gpio/gpiolib.c:83 0x8021d5e0()
autorequest GPIO-207
Modules linked in:
Call Trace:[<8011e0ec>] 0x8011e0ec
[<80110a28>] 0x80110a28
[<80110a28>] 0x80110a28
[..snip..]
The current code does not request the GPIO before attempting
to set its direction, which is a violation of the GPIO API.
This patch also stops hardcoding the GPIO we request and uses
the one we defined in the button driver.
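The required call order can be shown with a small mock. The mock_gpio_* functions below are hypothetical stand-ins for the GPIO API, and the mock is stricter than gpiolib: it refuses a direction change on an unrequested GPIO, where the kernel auto-requests it and prints the warning above.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_NGPIO 256

/* Mock GPIO bookkeeping; a stand-in for gpiolib, not its implementation. */
struct mock_gpio {
    bool requested;
    bool is_input;
};

struct mock_gpio mock_gpios[MOCK_NGPIO];

int mock_gpio_request(unsigned int gpio)
{
    assert(gpio < MOCK_NGPIO);
    if (mock_gpios[gpio].requested)
        return -1;                /* already owned by someone else */
    mock_gpios[gpio].requested = true;
    return 0;
}

void mock_gpio_free(unsigned int gpio)
{
    assert(gpio < MOCK_NGPIO);
    mock_gpios[gpio].requested = false;
}

int mock_gpio_direction_input(unsigned int gpio)
{
    assert(gpio < MOCK_NGPIO);
    if (!mock_gpios[gpio].requested)
        return -1;                /* direction set without a prior request */
    mock_gpios[gpio].is_input = true;
    return 0;
}

/* Request first, then set the direction; release the GPIO on failure. */
int button_setup(unsigned int gpio)
{
    int err = mock_gpio_request(gpio);

    if (err)
        return err;

    err = mock_gpio_direction_input(gpio);
    if (err)
        mock_gpio_free(gpio);
    return err;
}
```

Passing the GPIO number as a parameter mirrors the second half of the change: the caller supplies the number defined for the button instead of a hardcoded constant.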