David Howells [Wed, 5 Nov 2008 17:38:47 +0000 (17:38 +0000)]
Fix accidental implicit cast in HR-timer conversion
Fix the hrtimer_add_expires_ns() function. It should take a 'u64 ns' argument,
but rather takes an 'unsigned long ns' argument - which might only be 32-bits.
On FRV, this results in the kernel locking up because hrtimer_forward() passes
the result of a 64-bit multiplication to this function, for which the compiler
discards the top 32-bits - something that didn't happen when ktime_add_ns() was
called directly.
Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:55 +0000 (12:53 -0800)]
fat: Fix ATTR_RO for directory
FAT has the ATTR_RO (read-only) attribute. But on Windows, the ATTR_RO
of the directory will be just ignored actually, and is used by only
applications as flag. E.g. it's setted for the customized folder by
Explorer.
This adds "rodir" option. If user specified it, ATTR_RO is used as
read-only flag even if it's the directory. Otherwise, inode->i_mode
is not used to hold ATTR_RO (i.e. fat_mode_can_save_ro() returns 0).
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:54 +0000 (12:53 -0800)]
fat: Fix ATTR_RO in the case of (~umask & S_WUGO) == 0
If inode->i_mode doesn't have S_WUGO, current code assumes it means
ATTR_RO. However, if (~[ufd]mask & S_WUGO) == 0, inode->i_mode can't
hold S_WUGO. Therefore the updated directory entry will always have
ATTR_RO.
This adds fat_mode_can_hold_ro() to check it. And if inode->i_mode
can't hold, uses -i_attrs to hold ATTR_RO instead.
With this, we don't set ATTR_RO unless users change it via ioctl() if
(~[ufd]mask & S_WUGO) == 0.
And on FAT_IOCTL_GET_ATTRIBUTES path, this adds ->i_mutex to it for
not returning the partially updated attributes by FAT_IOCTL_SET_ATTRIBUTES
to userland.
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:54 +0000 (12:53 -0800)]
fat: Cleanup FAT attribute stuff
This adds three helpers:
fat_make_attrs() - makes FAT attributes from inode.
fat_make_mode() - makes mode_t from FAT attributes.
fat_save_attrs() - saves FAT attributes to inode.
Then this replaces: MSDOS_MKMODE() by fat_make_mode(), fat_attr() by
fat_make_attrs(), ->i_attrs = attr & ATTR_UNUSED by fat_save_attrs().
And for root inode, those is used with ATTR_DIR instead of bogus
ATTR_NONE.
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:52 +0000 (12:53 -0800)]
fat: Kill d_invalidate() in vfat_lookup()
d_invalidate() for positive dentry doesn't work in some cases
(vfsmount, nfsd, and maybe others). shrink_dcache_parent() by
d_invalidate() is pointless for vfat usage at all.
So, this kills it, and intead of it uses d_move().
To save old behavior, this returns alias simply for directory (don't
change pwd, etc..). the directory lookup shouldn't be important for
performance.
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:51 +0000 (12:53 -0800)]
fat: Fix/Cleanup dcache handling for vfat
- Add comments for handling dcache of vfat.
- Separate case-sensitive case and case-insensitive to
vfat_revalidate() and vfat_ci_revalidate().
vfat_revalidate() doesn't need to drop case-insensitive negative
dentry on creation path.
- Current code is missing to set ->d_revalidate to the negative dentry
created by unlink/etc..
This sets ->d_revalidate always, and returns 1 for positive
dentry. Now, we don't need to change ->d_op dynamically anymore,
so this just uses sb->s_root->d_op to set ->d_op.
- d_find_alias() may return DCACHE_DISCONNECTED dentry. It's not
the interesting dentry there. This checks it.
- Add missing LOOKUP_PARENT check. We don't need to drop the valid
negative dentry for (LOOKUP_CREATE | LOOKUP_PARENT) lookup.
- For consistent filename on creation path, this drops negative dentry
if we can't see intent.
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:47 +0000 (12:53 -0800)]
fat: use generic_file_llseek() for directory
Since fat_dir_ioctl() was already fixed (i.e. called under ->i_mutex),
and __fat_readdir() doesn't take BKL anymore. So, BKL for ->llseek()
is pointless, and we have to use generic_file_llseek().
OGAWA Hirofumi [Thu, 6 Nov 2008 20:53:47 +0000 (12:53 -0800)]
fat: Fix and cleanup timestamp conversion
This cleans date_dos2unix()/fat_date_unix2dos() up. New code should be
much more readable.
And this fixes those old functions. Those doesn't handle 2100
correctly. 2100 isn't leap year, but old one handles it as leap year.
Also, with this, centi sec is handled and is fixed.
Andrew Victor [Thu, 6 Nov 2008 20:53:42 +0000 (12:53 -0800)]
SAM9 watchdog: update for moved headers
The architecture header files were recently moved from
include/asm-arm/mach-at91/ to arch/arm/mach-at91/include/mach/. The SAM9
watchdog driver still includes a header from the old location.
Signed-off-by: Andrew Victor <linux@maxim.org.za> Cc: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Frans Pop [Thu, 6 Nov 2008 20:53:41 +0000 (12:53 -0800)]
rtc-cmos: fix boot log message
-rtc0: alarms up to one month, y3k, 114 bytes nvram, , hpet irqs irqs
+rtc0: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
Signed-off-by: Frans Pop <elendil@planet.nl> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Thu, 6 Nov 2008 20:53:40 +0000 (12:53 -0800)]
atmel_serial: keep clock off when it's not needed
The atmel_serial driver is mismanaging its clock by leaving it on at all
times ... the whole point of clock management is to leave it off unless
it's actively needed, which conserves power!!
Although the kernel doesn't actually hang without my fix, it does
discard quite a lot of early console output.
The result still looks correct:
usart users= 1 on 35000000 Hz, for atmel_usart.0
usart users= 0 off 35000000 Hz, for atmel_usart.2
when using ttyS0 as serial console.
[haavard.skinnemoen@atmel.com: Make sure clock is enabled early for console] Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit 3e680aae4e53ab54cdbb0c29257dae0cbb158e1c ("fb: convert
lock/unlock_kernel() into local fb mutex") introduced several deadlocks
in the fb_compat_ioctl() path, as mutex_lock() doesn't allow recursion,
unlike lock_kernel(). This broke frame buffer applications on 64-bit
systems with a 32-bit userland.
This patch fixes the remaining deadlocks:
- Revert commit 120a37470c2831fea49fdebaceb5a7039f700ce6,
- Extract the core logic of fb_ioctl() into a new function do_fb_ioctl(),
- Change all callsites of fb_ioctl() where info->lock is already held to
call do_fb_ioctl() instead,
- Add sparse annotations to all routines that take info->lock.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gerald Schaefer [Thu, 6 Nov 2008 20:53:36 +0000 (12:53 -0800)]
memory hotplug: fix page_zone() calculation in test_pages_isolated()
My last bugfix here (adding zone->lock) introduced a new problem: Using
page_zone(pfn_to_page(pfn)) to get the zone after the for() loop is wrong.
pfn will then be >= end_pfn, which may be in a different zone or not
present at all. This may lead to an addressing exception in page_zone()
or spin_lock_irqsave().
Now I use __first_valid_page() again after the loop to find a valid page
for page_zone().
Arthur Jones [Thu, 6 Nov 2008 20:53:35 +0000 (12:53 -0800)]
ext3: wait on all pending commits in ext3_sync_fs
In ext3_sync_fs, we only wait for a commit to finish if we started it, but
there may be one already in progress which will not be synced.
In the case of a data=ordered umount with pending long symlinks which are
delayed due to a long list of other I/O on the backing block device, this
causes the buffer associated with the long symlinks to not be moved to the
inode dirty list in the second phase of fsync_super. Then, before they
can be dirtied again, kjournald exits, seeing the UMOUNT flag and the
dirty pages are never written to the backing block device, causing long
symlink corruption and exposing new or previously freed block data to
userspace.
This can be reproduced with a script created
by Eric Sandeen <sandeen@redhat.com>:
#!/bin/bash
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
rm -f /mnt/test2/*
dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512
touch
/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
ln -s
/mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename
/mnt/test2/link
umount /mnt/test2
mount /dev/sdb4 /mnt/test2
ls /mnt/test2/
umount /mnt/test2
To ensure all commits are synced, we flush all journal commits now when
sync_fs'ing ext3.
Signed-off-by: Arthur Jones <ajones@riverbed.com> Cc: Eric Sandeen <sandeen@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: <linux-ext4@vger.kernel.org> Cc: <stable@kernel.org> [2.6.everything] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: dann frazier <dannf@hp.com> Acked-by: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Li Zefan [Thu, 6 Nov 2008 20:53:32 +0000 (12:53 -0800)]
cgroups: fix invalid cgrp->dentry before cgroup has been completely removed
This fixes an oops when reading /proc/sched_debug.
A cgroup won't be removed completely until finishing cgroup_diput(), so we
shouldn't invalidate cgrp->dentry in cgroup_rmdir(). Otherwise, when a
group is being removed while cgroup_path() gets called, we may trigger
NULL dereference BUG.
The bug can be reproduced:
# cat test.sh
#!/bin/sh
mount -t cgroup -o cpu xxx /mnt
for (( ; ; ))
{
mkdir /mnt/sub
rmdir /mnt/sub
}
# ./test.sh &
# cat /proc/sched_debug
It really does not matter when we flush the lru. The system is free to
add pages onto the lru even during migration which will make the page
migration either skip the page (mbind, migrate_pages) or return a busy
state (move_pages).
Fixes this lockdep warning (and potential deadlock):
Some VM place has
mmap_sem -> kevent_wq via lru_add_drain_all()
net/core/dev.c::dev_ioctl() has
rtnl_lock -> mmap_sem (*) the ioctl has copy_from_user() and it can do page fault.
linkwatch_event has
kevent_wq -> rtnl_lock
Signed-off-by: Christoph Lameter <cl@linux-foundation.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Hugh Dickins <hugh@veritas.com> Cc: Rik van Riel <riel@redhat.com> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fbdev: add new framebuffer driver for Fujitsu MB862xx GDCs
Add a framebuffer driver for the Fujitsu Carmine/Coral-P(A)/Lime graphics
controllers. Lime GDC support is known to work on PPC440EPx based lwmon5
and MPC8544E based socrates embedded boards, both equipped with Lime GDC.
Carmine/Coral-P PCI GDC support is known to work on PPC440EPx based
Sequoia board and also on x86 platform.
Signed-off-by: Anatolij Gustschin <agust@denx.de> Cc: Dmitry Baryshkov <dbaryshkov@gmail.com> Cc: Anton Vorontsov <avorontsov@ru.mvista.com> Cc: Matteo Fortini <m.fortini@selcomgroup.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Thu, 6 Nov 2008 20:53:29 +0000 (12:53 -0800)]
oom: do not dump task state for non thread group leaders
When /proc/sys/vm/oom_dump_tasks is enabled, it's only necessary to dump
task state information for thread group leaders. The kernel log gets
quickly overwhelmed on machines with a massive number of threads by
dumping non-thread group leaders.
Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Thu, 6 Nov 2008 20:53:27 +0000 (12:53 -0800)]
hugetlb: pull gigantic page initialisation out of the default path
As we can determine exactly when a gigantic page is in use we can optimise
the common regular page cases by pulling out gigantic page initialisation
into its own function. As gigantic pages are never released to buddy we
do not need a destructor. This effectivly reverts the previous change to
the main buddy allocator. It also adds a paranoid check to ensure we
never release gigantic pages from hugetlbfs to the main buddy.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Whitcroft [Thu, 6 Nov 2008 20:53:26 +0000 (12:53 -0800)]
hugetlbfs: handle pages higher order than MAX_ORDER
When working with hugepages, hugetlbfs assumes that those hugepages are
smaller than MAX_ORDER. Specifically it assumes that the mem_map is
contigious and uses that to optimise access to the elements of the mem_map
that represent the hugepage. Gigantic pages (such as 16GB pages on
powerpc) by definition are of greater order than MAX_ORDER (larger than
MAX_ORDER_NR_PAGES in size). This means that we can no longer make use of
the buddy alloctor guarentees for the contiguity of the mem_map, which
ensures that the mem_map is at least contigious for maximmally aligned
areas of MAX_ORDER_NR_PAGES pages.
This patch adds new mem_map accessors and iterator helpers which handle
any discontiguity at MAX_ORDER_NR_PAGES boundaries. It then uses these to
implement gigantic page versions of copy_huge_page and clear_huge_page,
and to allow follow_hugetlb_page handle gigantic pages.
Signed-off-by: Andy Whitcroft <apw@shadowen.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes a regression where the controller firmware version is not
displayed in procfs. The previous patch would be called anytime something
changed. This will get called only once for each controller.
Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes a broken symlink in sysfs that was introduced by the
above commit. We broke it in 2.6.27-rc on or about 20080804. Some
installers are broken if this symlink does not exist and they may not
detect the logical drives configured on the controller. It does not
require being backported into 2.6.26.x or earlier kernels.
Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: <stable@kernel.org> [2.6.27.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Kent [Thu, 6 Nov 2008 20:53:23 +0000 (12:53 -0800)]
autofs4: collect version check return
The function check_dev_ioctl_version() returns an error code upon fail but
it isn't captured and returned in validate_dev_ioctl() as it should be.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ian Kent [Thu, 6 Nov 2008 20:53:22 +0000 (12:53 -0800)]
autofs4: correct offset mount expire check
When checking a directory tree in autofs_tree_busy() we can incorrectly
decide that the tree isn't busy. This happens for the case of an active
offset mount as autofs4_follow_mount() follows past the active offset
mount, which has an open file handle used for expires, causing the file
handle not to count toward the busyness check.
Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Henrik Rydberg [Thu, 6 Nov 2008 20:53:20 +0000 (12:53 -0800)]
hwmon: applesmc: add support for Macbook 5
Add accelerometer, backlight and temperature sensor support for the new
unibody Macbook 5.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Tested-by: David M. Lary <dmlary@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Brown [Thu, 6 Nov 2008 20:53:18 +0000 (12:53 -0800)]
rtc: fix handling of missing tm_year data when reading alarms
When fixing up invalid years rtc_read_alarm() was calling rtc_valid_tm()
as a boolean but rtc_valid_tm() returns zero on success or a negative
number if the time is not valid so the test was inverted.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Alessandro Zummo <a.zummo@towertech.it> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Problem 1 (see patch below):
vc_tab_stop is declared as an array of 8 unsigned ints in struct
vc_data in include/linux/console_struct.h .
In drivers/char/vt.c only 5 of these 8 unsigned ints get initialized
leading to unintended tabulator placement on displays with more than
160 columns text.
Problem 2 (open):
Upcoming displays will have more than 256 columns of text leading to
invalid memory access in drivers/char/vt.c during tabulator
calculations:
if (vc->vc_tab_stop[vc->vc_x >> 5] & (1 << (vc->vc_x & 31)))
break;
Signed-off-by: Wolfgang Kroworsch <wolfgang@kroworsch.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Miller [Thu, 6 Nov 2008 08:37:40 +0000 (00:37 -0800)]
net: Fix recursive descent in __scm_destroy().
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.
Those, in turn, can close sockets and invoke __scm_destroy() again.
There is nothing which limits how deeply this can occur.
The idea for how to fix this is from Linus. Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput(). Inside of the initial __scm_destroy() we
keep running the list until it is empty.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[JFFS2] fix race condition in jffs2_lzo_compress()
deflate_mutex protects the globals lzo_mem and lzo_compress_buf. However,
jffs2_lzo_compress() unlocks deflate_mutex _before_ it has copied out the
compressed data from lzo_compress_buf. Correct this by moving the mutex
unlock after the copy.
In addition, document what deflate_mutex actually protects.
Cc: stable@kernel.org Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Richard Purdie <rpurdie@openedhand.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Ingo Molnar [Wed, 5 Nov 2008 15:52:08 +0000 (16:52 +0100)]
sched: re-tune balancing
Impact: improve wakeup affinity on NUMA systems, tweak SMP systems
Given the fixes+tweaks to the wakeup-buddy code, re-tweak the domain
balancing defaults on NUMA and SMP systems.
Turn on SD_WAKE_AFFINE which was off on x86 NUMA - there's no reason
why we would not want to have wakeup affinity across nodes as well.
(we already do this in the standard NUMA template.)
lat_ctx on a NUMA box is particularly happy about this change:
[MTD] [NOR] Fix cfi_send_gen_cmd handling of x16 devices in x8 mode (v4)
For "unlock" cycles to 16bit devices in 8bit compatibility mode we need
to use the byte addresses 0xaaa and 0x555. These effectively match
the word address 0x555 and 0x2aa, except the latter has its low bit set.
Most chips don't care about the value of the 'A-1' pin in x8 mode,
but some -- like the ST M29W320D -- do. So we need to be careful to
set it where appropriate.
cfi_send_gen_cmd is only ever passed addresses where the low byte
is 0x00, 0x55 or 0xaa. Of those, only addresses ending 0xaa are
affected by this patch, by masking in the extra low bit when the device
is known to be in compatibility mode.
[dwmw2: Do it only when (cmd_ofs & 0xff) == 0xaa]
v4: Fix stupid typo in cfi_build_cmd_addr that failed to compile
I'm writing this patch way to late at night.
v3: Bring all of the work back into cfi_build_cmd_addr
including calling of map_bankwidth(map) and cfi_interleave(cfi)
So every caller doesn't need to.
v2: Only modified the address if we our device_type is larger than our
bus width.
Cc: stable@kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Peter Zijlstra [Tue, 4 Nov 2008 20:25:10 +0000 (21:25 +0100)]
sched: fix buddies for group scheduling
Impact: scheduling order fix for group scheduling
For each level in the hierarchy, set the buddy to point to the right entity.
Therefore, when we do the hierarchical schedule, we have a fair chance of
ending up where we meant to.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Currently we only have a forward looking buddy, that is, we prefer to
schedule to the task we last woke up, under the presumption that its
going to consume the data we just produced, and therefore will have
cache hot benefits.
This allows co-waking producer/consumer task pairs to run ahead of the
pack for a little while, keeping their cache warm. Without this, we
would interleave all pairs, utterly trashing the cache.
This patch introduces a backward looking buddy, that is, suppose that
in the above scenario, the consumer preempts the producer before it
can go to sleep, we will therefore miss the wakeup from consumer to
producer (its already running, after all), breaking the cycle and
reverting to the cache-trashing interleaved schedule pattern.
The backward buddy will try to schedule back to the task that woke us
up in case the forward buddy is not available, under the assumption
that the last task will be the one with the most cache hot task around
barring current.
This will basically allow a task to continue after it got preempted.
In order to avoid starvation, we allow either buddy to get wakeup_gran
ahead of the pack.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
xfrm: Fix xfrm_policy_gc_lock handling.
niu: Use pci_ioremap_bar().
bnx2x: Version Update
bnx2x: Calling netif_carrier_off at the end of the probe
bnx2x: PCI configuration bug on big-endian
bnx2x: Removing the PMF indication when unloading
mv643xx_eth: fix SMI bus access timeouts
net: kconfig cleanup
fs_enet: fix polling
XFRM: copy_to_user_kmaddress() reports local address twice
SMC91x: Fix compilation on some platforms.
udp: Fix the SNMP counter of UDP_MIB_INERRORS
udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS
drivers/net/smc911x.c: Fix lockdep warning on xmit.
Linus Torvalds [Tue, 4 Nov 2008 16:19:01 +0000 (08:19 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: mask off DET when restoring SControl for detach
libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it
libata: Fix a potential race condition in ata_scsi_park_show()
sata_nv: fix generic, nf2/3 detection regression
sata_via: restore vt*_prepare_host error handling
sata_promise: add ATA engine reset to reset ops
Tejun Heo [Mon, 3 Nov 2008 10:27:07 +0000 (19:27 +0900)]
libata: mask off DET when restoring SControl for detach
libata restores SControl on detach; however, trying to restore
non-zero DET can cause undeterministic behavior including PMP device
going offline till power cycling. Mask off DET when restoring
SControl.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Mon, 3 Nov 2008 10:01:09 +0000 (19:01 +0900)]
libata: implement ATA_HORKAGE_ATAPI_MOD16_DMA and apply it
libata always uses PIO for ATAPI commands when the number of bytes to
transfer isn't multiple of 16 but quantum DAT72 chokes on odd bytes
PIO transfers. Implement a horkage to skip the mod16 check and apply
it to the quantum device.
This is reported by John Clark in the following thread.
http://thread.gmane.org/gmane.linux.ide/34748
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: John Clark <clarkjc@runbox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Elias Oltmanns [Mon, 3 Nov 2008 10:01:08 +0000 (19:01 +0900)]
libata: Fix a potential race condition in ata_scsi_park_show()
Peter Moulder has pointed out that there is a slight chance that a
negative value might be passed to jiffies_to_msecs() in
ata_scsi_park_show(). This is fixed by saving the value of jiffies in a
local variable, thus also reducing code since the volatile variable
jiffies is accessed only once.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de> Signed-off-by: Tejun Heo <tj.kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Mon, 3 Nov 2008 03:37:49 +0000 (12:37 +0900)]
sata_nv: fix generic, nf2/3 detection regression
All three flavors of sata_nv's are different in how their hardreset
behaves.
* generic: Hardreset is not reliable. Link often doesn't come online
after hardreset.
* nf2/3: A little bit better - link comes online with longer debounce
timing. However, nf2/3 can't reliable wait for the first D2H
Register FIS, so it can't wait for device readiness or classify the
device after hardreset. Follow-up SRST required.
* ck804: Hardreset finally works.
The core layer change to prefer hardreset and follow up changes
exposed the above issues and caused various detection regressions for
all three flavors. This patch, hopefully, fixes all the known issues
and should make sata_nv error handling more reliable.
Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
catched by gcc:
drivers/ata/sata_via.c: In function 'svia_init_one':
drivers/ata/sata_via.c:567: warning: 'host' may be used uninitialized in this function
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Joseph Chan <JosephChan@via.com.tw> Cc: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Promise ATA engines need to be reset when errors occur.
That's currently done for errors detected by sata_promise itself,
but it's not done for errors like timeouts detected outside of
the low-level driver.
The effect of this omission is that a timeout tends to result
in a sequence of failed COMRESETs after which libata EH gives
up and disables the port. At that point the port's ATA engine
hangs and even reloading the driver will not resume it.
To fix this, make sata_promise override ->hardreset on SATA
ports with code which calls pdc_reset_port() on the port in
question before calling libata's hardreset. PATA ports don't
use ->hardreset, so for those we override ->softreset instead.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
bnx2x: Calling netif_carrier_off at the end of the probe
netif_carrier_off was called too early at the probe. In case of failure
or simply bad timing, this can cause a fatal error since linkwatch_event
might run too soon.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The current code read nothing but zeros on big-endian (wrong part of the
32bits). This caused poor performance on big-endian machines. Though this
issue did not cause the system to crash, the performance is significantly
better with the fix so I view it as critical bug fix.
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When the PMF flag is set, the driver can access the HW freely. When the
driver is unloaded, it should not access the HW. The problem caused fatal
errors when "ethtool -i" was called after the calling instance was unloaded
and another instance was already loaded
Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The mv643xx_eth mii bus implementation uses wait_event_timeout() to
wait for SMI completion interrupts.
If wait_event_timeout() would return zero, mv643xx_eth would conclude
that the SMI access timed out, but this is not necessarily true --
wait_event_timeout() can also return zero in the case where the SMI
completion interrupt did happen in time but where it took longer than
the requested timeout for the process performing the SMI access to be
scheduled again. This would lead to occasional SMI access timeouts
when the system would be under heavy load.
The fix is to ignore the return value of wait_event_timeout(), and
to re-check the SMI done bit after wait_event_timeout() returns to
determine whether or not the SMI access timed out.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix renaming one hardlink on top of another
[CIFS] fix error in smb_send2
[CIFS] Reduce number of socket retries in large write path
Jeff Layton [Mon, 3 Nov 2008 19:05:08 +0000 (14:05 -0500)]
cifs: fix renaming one hardlink on top of another
cifs: fix renaming one hardlink on top of another
POSIX says that renaming one hardlink on top of another to the same
inode is a no-op. We had the logic mostly right, but forgot to clear
the return code.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>
Linus Torvalds [Mon, 3 Nov 2008 18:21:40 +0000 (10:21 -0800)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing, ring-buffer: add paranoid checks for loops
ftrace: use kretprobe trampoline name to test in output
tracing, alpha: undefined reference to `save_stack_trace'
Linus Torvalds [Mon, 3 Nov 2008 18:15:40 +0000 (10:15 -0800)]
Merge branch 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
io mapping: clean up #ifdefs
io mapping: improve documentation
i915: use io-mapping interfaces instead of a variety of mapping kludges
resources: add io-mapping functions to dynamically map large device apertures
x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps
Linus Torvalds [Mon, 3 Nov 2008 18:14:59 +0000 (10:14 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda: make a STAC_DELL_EQ option
ALSA: emu10k1 - Add more invert_shared_spdif flag to Audigy models
ALSA: hda - Add a quirk for another Acer Aspire (1025:0090)
ALSA: remove direct access of dev->bus_id in sound/isa/*
sound: struct device - replace bus_id with dev_name(), dev_set_name()
ALSA: Fix PIT lockup on some chipsets when using the PC-Speaker
ALSA: rawmidi - Add open check in rawmidi callbacks
ALSA: hda - Add digital-mic for ALC269 auto-probe mode
ALSA: hda - Disable broken mic auto-muting in Realtek codes
Linus Torvalds [Mon, 3 Nov 2008 17:58:40 +0000 (09:58 -0800)]
Merge branch 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
i915: Add GEM ioctl to get available aperture size.
drm/radeon: fixup further bus mastering confusion.
build fix: CONFIG_DRM_I915=y && CONFIG_ACPI=n
Steven Rostedt [Fri, 31 Oct 2008 13:58:35 +0000 (09:58 -0400)]
tracing, ring-buffer: add paranoid checks for loops
While writing a new tracer, I had a bug where I caused the ring-buffer
to recurse in a bad way. The bug was with the tracer I was writing
and not the ring-buffer itself. But it took a long time to find the
problem.
This patch adds paranoid checks into the ring-buffer infrastructure
that will catch bugs of this nature.
Note: I put the bug back in the tracer and this patch showed the error
nicely and prevented the lockup.
Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steven Rostedt [Fri, 31 Oct 2008 19:44:07 +0000 (15:44 -0400)]
ftrace: use kretprobe trampoline name to test in output
Impact: ia64+tracing build fix
When a function is kprobed, the return address is set to the
kprobe_trampoline, or something similar. This caused the output
of the trace to look confusing when the parent seemed to be this
"kprobe_trampoline" function.
To fix this, Abhishek Sagar added a test of the instruction pointer
of the parent to see if it matched the kprobe_trampoline. If it
did, the output would print a "[unknown/kretprobe'd]" instead.
Unfortunately, not all archs do this the same way, and the trampoline
function may not be exported, which causes failures in builds.
This patch will compare the name instead of the pointer to see
if it matches. This prevents us from depending on a function from
being exported, and should work on all archs. The worst that can
happen is that an arch might use a different name and then we
go back to the confusing output. At least the arch will still build.
Arnaud Ebalard [Mon, 3 Nov 2008 09:30:23 +0000 (01:30 -0800)]
XFRM: copy_to_user_kmaddress() reports local address twice
While adding support for MIGRATE/KMADDRESS in strongSwan (as specified
in draft-ebalard-mext-pfkey-enhanced-migrate-00), Andreas Steffen
noticed that XFRMA_KMADDRESS attribute passed to userland contains the
local address twice (remote provides local address instead of remote
one).
This bug in copy_to_user_kmaddress() affects only key managers that use
native XFRM interface (key managers that use PF_KEY are not affected).
For the record, the bug was in the initial changeset I posted which
added support for KMADDRESS (13c1d18931ebb5cf407cb348ef2cd6284d68902d
'xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)').
Signed-off-by: Arnaud Ebalard <arno@natisbad.org> Reported-by: Andreas Steffen <andreas.steffen@strongswan.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Fri, 31 Oct 2008 19:50:41 +0000 (19:50 +0000)]
tracing, alpha: undefined reference to `save_stack_trace'
Impact: build fix on !stacktrace architectures
only select STACKTRACE on architectures that have STACKTRACE_SUPPORT
... since we also need to ifdef out the guts of ftrace_trace_stack().
We also want to disallow setting TRACE_ITER_STACKTRACE in trace_flags
on such configs, but that can wait.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Takashi Iwai [Mon, 3 Nov 2008 09:07:43 +0000 (10:07 +0100)]
ALSA: hda - Add a quirk for another Acer Aspire (1025:0090)
Added a quirk for another Acer Aspier laptop (1025:0090) with ALC883
codec. Reported in Novell bnc#426935:
https://bugzilla.novell.com/show_bug.cgi?id=426935
David S. Miller [Mon, 3 Nov 2008 08:04:24 +0000 (00:04 -0800)]
SMC91x: Fix compilation on some platforms.
This reverts 51ac3beffd4afaea4350526cf01fe74aaff25eff ('SMC91x: delete
unused local variable "lp"') and adds __maybe_unused markers to these
(potentially) unused variables.
The issue is that in some configurations SMC_IO_SHIFT evaluates
to '(lp->io_shift)', but in some others it's plain '0'.
Based upon a build failure report from Manuel Lauss.
Signed-off-by: David S. Miller <davem@davemloft.net>