Chris Mason [Thu, 14 May 2009 17:24:30 +0000 (13:24 -0400)]
Btrfs: Don't loop forever on metadata IO failures
When a btrfs metadata read fails, the first thing we try to do is find
a good copy on another mirror of the block. If this fails, read_tree_block()
ends up returning a buffer that isn't up to date.
The btrfs btree reading code was reworked to drop locks and repeat
the search when IO was done, but the changes didn't add a check for failed
reads. The end result was looping forever on buffers that were never
going to become up to date.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Thu, 14 May 2009 17:10:02 +0000 (13:10 -0400)]
Btrfs: init inode ordered_data_close flag properly
This flag is used to decide when we need to send a given file through
the ordered code to make sure it is fully written before a transaction
commits. It was not being properly set to zero when the inode was
being setup.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 27 Apr 2009 15:47:50 +0000 (11:47 -0400)]
Btrfs: look for acls during btrfs_read_locked_inode
This changes btrfs_read_locked_inode() to peek ahead in the btree for acl items.
If it is certain a given inode has no acls, it will set the in memory acl
fields to null to avoid acl lookups completely.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 27 Apr 2009 14:49:53 +0000 (10:49 -0400)]
Btrfs: fix acl caching
Linus noticed the btrfs code to cache acls wasn't properly caching
a NULL acl when the inode didn't have any acls. This meant the common
case of no acls resulted in expensive btree searches every time the
kernel checked permissions (which is quite often).
This is a modified version of Linus' original patch:
Properly set initial acl fields to BTRFS_ACL_NOT_CACHED in the inode.
This forces an acl lookup when permission checks are done.
Fix btrfs_get_acl to avoid lookups and locking when the inode acls fields
are set to null.
Fix btrfs_get_acl to use the right return value from __btrfs_getxattr
when deciding to cache a NULL acl. It was storing a NULL acl when
__btrfs_getxattr return -ENOENT, but __btrfs_getxattr was actually returning
-ENODATA for this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Joel Becker [Tue, 21 Apr 2009 19:38:30 +0000 (12:38 -0700)]
Btrfs: Fix a trivial warning using max() of u64 vs ULL.
A small warning popped up on ia64 because inode-map.c was comparing a
u64 object id with the ULL FIRST_FREE_OBJECTID. My first thought was
that all the OBJECTID constants should contain the u64 cast because
btrfs code deals entirely in u64s. But then I saw how large that was,
and figured I'd just fix the max() call.
Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 27 Apr 2009 11:29:05 +0000 (07:29 -0400)]
Btrfs: ratelimit IO error printks
Btrfs has printks for various IO errors, including bad checksums and
mismatches between what we expect the block headers to contain and what
we actually find on the disk.
Longer term we need a real reporting mechanism for this, but for now
printk is going to have to do.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Ball [Mon, 27 Apr 2009 11:29:03 +0000 (07:29 -0400)]
Btrfs: When shrinking, only update disk size on success
Previously, we updated a device's size prior to attempting a shrink
operation. This patch moves the device resizing logic to only happen if
the shrink completes successfully. In the process, it introduces a new
field to btrfs_device -- disk_total_bytes -- to track the on-disk size.
Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Fri, 24 Apr 2009 18:39:25 +0000 (14:39 -0400)]
Btrfs: fix deadlocks and stalls on dead root removal
After a transaction commit, the old root of the subvol btrees are sent through
snapshot removal. This is what actually frees up any blocks replaced by
COW, and anything the old blocks pointed to.
Snapshot deletion will pause when a transaction commit has started, which
helps to avoid a huge amount of delayed reference count updates piling up
as the transaction is trying to close.
But, this pause happens after the snapshot deletion process has asked other
procs on the system to throttle back a bit so that it can make progress.
We don't want to throttle everyone while we're waiting for the transaction
commit, it leads to deadlocks in the user transaction ioctls used by Ceph
and makes things slower in general.
This patch changes things to avoid the throttling while we sleep.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Fri, 24 Apr 2009 18:39:24 +0000 (14:39 -0400)]
Btrfs: fix fallocate deadlock on inode extent lock
The btrfs fallocate call takes an extent lock on the entire range
being fallocated, and then runs through insert_reserved_extent on each
extent as they are allocated.
The problem with this is that btrfs_drop_extents may decide to try
and take the same extent lock fallocate was already holding. The solution
used here is to push down knowledge of the range that is already locked
going into btrfs_drop_extents.
It turns out that at least one other caller had the same bug.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Tue, 21 Apr 2009 21:40:57 +0000 (17:40 -0400)]
Btrfs: try to keep a healthy ratio of metadata vs data block groups
This patch makes the chunk allocator keep a good ratio of metadata vs data
block groups. By default for every 8 data block groups, we'll allocate 1
metadata chunk, or about 12% of the disk will be allocated for metadata. This
can be changed by specifying the metadata_ratio mount option.
This is simply the number of data block groups that have to be allocated to
force a metadata chunk allocation. By making sure we allocate metadata chunks
more often, we are less likely to get into situations where the whole disk
has been allocated as data block groups.
Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Tue, 21 Apr 2009 15:53:38 +0000 (11:53 -0400)]
Btrfs: fix btrfs fallocate oops and deadlock
Btrfs fallocate was incorrectly starting a transaction with a lock held
on the extent_io tree for the file, which could deadlock. Strictly
speaking it was using join_transaction which would be safe, but it is better
to move the transaction outside of the lock.
When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
being called on an unlocked buffer. This was triggering an assertion and
oops because the lock is supposed to be held.
The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
been run. btrfs_del_item takes care of dirtying things, so the solution is a
to skip the btrfs_mark_buffer_dirty call in this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:10 +0000 (15:50 -0400)]
Btrfs: use the right node in reada_for_balance
reada_for_balance was using the wrong index into the path node array,
so it wasn't reading the right blocks. We never directly used the
results of the read done by this function because the btree search is
started over at the end.
This fixes reada_for_balance to reada in the correct node and to
avoid searching past the last slot in the node. It also makes sure to
hold the parent lock while we are finding the nodes to read.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:09 +0000 (15:50 -0400)]
Btrfs: fix oops on page->mapping->host during writepage
The extent_io writepage call updates the writepage index in the inode
as it makes progress. But, it was doing the update after unlocking the page,
which isn't legal because page->mapping can't be trusted once the page
is unlocked.
This lead to an oops, especially common with compression turned on. The
fix here is to update the writeback index before unlocking the page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:09 +0000 (15:50 -0400)]
Btrfs: add a priority queue to the async thread helpers
Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a
higher priority. But, the checksumming helper threads prevent it
from being fully effective.
There are two problems. First, a big queue of pending checksumming
will delay the synchronous IO behind other lower priority writes. Second,
the checksumming uses an ordered async work queue. The ordering makes sure
that IOs are sent to the block layer in the same order they are sent
to the checksumming threads. Usually this gives us less seeky IO.
But, when we start mixing IO priorities, the lower priority IO can delay
the higher priority IO.
This patch solves both problems by adding a high priority list to the async
helper threads, and a new btrfs_set_work_high_prio(), which is used
to make put a new async work item onto the higher priority list.
The ordering is still done on high priority IO, but all of the high
priority bios are ordered separately from the low priority bios. This
ordering is purely an IO optimization, it is not involved in data
or metadata integrity.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:09 +0000 (15:50 -0400)]
Btrfs: use WRITE_SYNC for synchronous writes
Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
writes we plan on waiting on in the near future. This patch
mirrors recent changes in other filesystems and the generic code to
use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
other latency critical writes.
Btrfs uses async worker threads for checksumming before the write is done,
and then again to actually submit the bios. The bio submission code just
runs a per-device list of bios that need to be sent down the pipe.
This list is split into low priority and high priority lists so the
WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel
* 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel:
drm/i915: fix scheduling while holding the new active list spinlock
drm/i915: Allow tiling of objects with bit 17 swizzling by the CPU.
drm/i915: Correctly set the write flag for get_user_pages in pread.
drm/i915: Fix use of uninitialized var in 40a5f0de
drm/i915: indicate framebuffer restore key in SysRq help message
drm/i915: sync hdmi detection by hdmi identifier with 2D
drm/i915: Fix a mismerge of the IGD patch (new .find_pll hooks missed)
drm/i915: Implement batch and ring buffer dumping
That change is causing only one Intel CPU's microcode to be updated e.g.
microcode: CPU3 updated from revision 0x9 to 0x17, date = 2005-04-22
where before it announced that also for CPU0 and CPU1 and CPU2.
We cannot use work_on_cpu() in the CONFIG_MICROCODE_OLD_INTERFACE code,
because Intel's request_microcode_user() involves a copy_from_user() from
/sbin/microcode_ctl, which therefore needs to be on that CPU at the time.
Signed-off-by: Shaohua Li<shaohua.li@intel.com>
[anholt: Add more detail to the comment about the lock break that's added] Signed-off-by: Eric Anholt <eric@anholt.net>
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: warn about lockdep disabling after kernel taint, fix
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix "direct_io" private mmap
fuse: fix argument type in fuse_get_user_pages()
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix possible mismatch of sufile counters on recovery
nilfs2: segment usage file cleanups
nilfs2: fix wrong accounting and duplicate brelse in nilfs_sufile_set_error
nilfs2: simplify handling of active state of segments fix
nilfs2: remove module version
nilfs2: fix lockdep recursive locking warning on meta data files
nilfs2: fix lockdep recursive locking warning on bmap
nilfs2: return f_fsid for statfs2
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Add in PCI bus for DMA API debugging.
sh: Pre-allocate a reasonable number of DMA debug entries.
sh: sh7786: modify usb setup timeout judgment bug.
MAINTAINERS: Update sh architecture file patterns.
sh: ap325: use edge control for ov772x camera
sh: Plug in support for ARCH=sh64 using sh SRCARCH.
sh: urquell: Fix up address mapping in board comments.
sh: Add support for DMA API debugging.
sh: Provide cpumask_of_pcibus() to fix NUMA build.
sh: urquell: Add board comment
sh: wire up sys_preadv/sys_pwritev() syscalls.
sh: sh7785lcr: fix PCI address map for 32-bit mode
sh: intc: Added resume from hibernation support to the intc
David Howells [Tue, 14 Apr 2009 16:08:34 +0000 (17:08 +0100)]
Fix lpfc_parse_bg_err()'s use of do_div()
Fix lpfc_parse_bg_err()'s use of do_div(). It should be passing a 64-bit
variable as the first parameter. However, since it's only using a 32-bit
variable, it doesn't need to use do_div() at all, but can instead use the
division operator.
This deals with the following warnings:
CC drivers/scsi/lpfc/lpfc_scsi.o
drivers/scsi/lpfc/lpfc_scsi.c: In function 'lpfc_parse_bg_err':
drivers/scsi/lpfc/lpfc_scsi.c:1397: warning: comparison of distinct pointer types lacks a cast
drivers/scsi/lpfc/lpfc_scsi.c:1397: warning: right shift count >= width of type
drivers/scsi/lpfc/lpfc_scsi.c:1397: warning: passing argument 1 of '__div64_32' from incompatible pointer type
Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tony Breeds [Tue, 14 Apr 2009 13:58:41 +0000 (14:58 +0100)]
parport_pc: Fix build failure drivers/parport/parport_pc.c for powerpc
In commit 51dcdfec6a274afc1c6fce180d582add9ff512c0 ("parport: Use the
PCI IRQ if offered") parport_pc_probe_port() gained an irqflags arg.
This isn't being supplied on powerpc. This patch make powerpc fallback
to the old behaviour, that is using "0" for irqflags.
Fixes build failure:
In file included from drivers/parport/parport_pc.c:68:
arch/powerpc/include/asm/parport.h: In function 'parport_pc_find_nonpci_ports':
arch/powerpc/include/asm/parport.h:32: error: too few arguments to function 'parport_pc_probe_port'
arch/powerpc/include/asm/parport.h:32: error: too few arguments to function 'parport_pc_probe_port'
arch/powerpc/include/asm/parport.h:32: error: too few arguments to function 'parport_pc_probe_port'
make[3]: *** [drivers/parport/parport_pc.o] Error 1
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 14 Apr 2009 13:58:11 +0000 (14:58 +0100)]
tty: Fix leak in ti-usb
If the ti-usb adapter returns an zero data length frame (which happens)
then we leak a kref. Found by Christoph Mair <christoph.mair@gmail.com>
who proposed a patch. The patch here is different as Christoph's patch
didn't work for the case where tty = NULL and data arrived but Christoph
did all the hard work chasing it down.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alan Cox [Tue, 14 Apr 2009 13:57:36 +0000 (14:57 +0100)]
cdc-acm: Fix long standing abuse of tty->low_latency
ACM sets the low latency flag but calls the flip buffer routines from
IRQ context which isn't permitted (and as of 2.6.29 causes a warning
hence this one was caught)
Fortunatelt ACM doesn't need to set this flag in the first place as it
only set it to work around problems in ancient (pre tty flip rewrite)
kernels.
lockdep: warn about lockdep disabling after kernel taint, fix
Impact: build fix for Sparc and s390
Stephen Rothwell reported that the Sparc build broke:
In file included from kernel/panic.c:12:
include/linux/debug_locks.h: In function '__debug_locks_off':
include/linux/debug_locks.h:15: error: implicit declaration of function 'xchg'
due to:
9eeba61: lockdep: warn about lockdep disabling after kernel taint
There is some inconsistency between architectures about where exactly
xchg() is defined.
The traditional place is in system.h but the more logical point for it
is in atomic.h - where most architectures (especially new ones) have
it defined. These architecture also still offer it via system.h.
Some, such as Sparc or s390 only have it in asm/system.h and not available
via asm/atomic.h at all.
Use the widest set of headers in debug_locks.h and also include asm/system.h.
Paul Mundt [Tue, 14 Apr 2009 06:22:15 +0000 (15:22 +0900)]
sh: Pre-allocate a reasonable number of DMA debug entries.
This prevents the DMA API debugging from running out of entries right
away on boot. Defines 4096 entries by default, which while a bit on the
heavy side, ought to leave enough breathing room for some time.
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
tomoyo: version bump to 2.2.0.
tomoyo: add Documentation/tomoyo.txt
We ended up incorrectly using '&cur' instead of '&readin' in the
work_on_cpu() -> smp_call_function_single() transformation in commit 01599fca6758d2cd133e78f87426fc851c9ea725 ("cpufreq: use
smp_call_function_[single|many]() in acpi-cpufreq.c").
Andrew explains:
"OK, the acpi tree went and had conflicting changes merged into it after
I'd written the patch and it appears that I incorrectly reverted part
of 18b2646fe3babeb40b34a0c1751e0bf5adfdc64c while fixing the resulting
rejects.
Switching it to `readin' looks correct."
Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge branch 'for-rc1/xen/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'for-rc1/xen/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
xen: add FIX_TEXT_POKE to fixmap
xen: honour VCPU availability on boot
xen: clean up gate trap/interrupt constants
xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
xen: resume interrupts before system devices.
xen/mmu: weaken flush_tlb_other test
xen/mmu: some early pagetable cleanups
Xen: Add virt_to_pfn helper function
x86-64: remove PGE from must-have feature list
xen: mask XSAVE from cpuid
NULL noise: arch/x86/xen/smp.c
xen: remove xen_load_gdt debug
xen: make xen_load_gdt simpler
xen: clean up xen_load_gdt
xen: split construction of p2m mfn tables from registration
xen: separate p2m allocation from setting
xen: disable preempt for leave_lazy_mmu
sh: sh7786: modify usb setup timeout judgment bug.
This corrects a race with the PHY RST bit not being set properly if the
PLL status changes right before timeout. This resulted in it potentially
failing even if the device came up in time.
Special thanks to Mr. Juha Leppanen and Iwamatsu-san for reporting this
out and reviewing it.
Reported-by: Juha Leppanen <juha_motorsportcom@luukku.com> Reviewed-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com> Tested-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Jean Delvare [Mon, 13 Apr 2009 21:40:21 +0000 (14:40 -0700)]
edac: use to_delayed_work()
The edac-core driver includes code which assumes that the work_struct
which is included in every delayed_work is the first member of that
structure. This is currently the case but might change in the future, so
use to_delayed_work() instead, which doesn't make such an assumption.
linux-2.6.30-rc1 has the to_delayed_work() function that will allow this
patch to work
Signed-off-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Doug Thompson <dougthompson@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Mon, 13 Apr 2009 21:40:19 +0000 (14:40 -0700)]
sgi-xpc: clean up numerous globals
Introduce xpc_arch_ops and eliminate numerous individual global definitions.
Signed-off-by: Robin Holt <holt@sgi.com> Cc: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Mon, 13 Apr 2009 21:40:19 +0000 (14:40 -0700)]
sgi-xpc: implement opencomplete messaging
sgi-xpc has a window of failure where an open message can be sent and a
subsequent data message can get lost. We have added a new message
(opencomplete) which closes that window.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Mon, 13 Apr 2009 21:40:18 +0000 (14:40 -0700)]
sgi-xpc: prevent false heartbeat failures
The heartbeat timeout functionality in sgi-xpc is currently not trained to
the connection time. If a connection is made and the code is in the last
polling window prior to doing a timeout, the next polling window will see
the heartbeat as unchanged and initiate a no-heartbeat disconnect.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Dean Nelson <dcn@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yang Hongyang [Mon, 13 Apr 2009 21:40:14 +0000 (14:40 -0700)]
Replace all DMA_nBIT_MASK macro with DMA_BIT_MASK(n)
This is the second go through of the old DMA_nBIT_MASK macro,and there're not
so many of them left,so I put them into one patch.I hope this is the last round.
After this the definition of the old DMA_nBIT_MASK macro could be removed.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Greg KH <greg@kroah.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Mon, 13 Apr 2009 21:40:14 +0000 (14:40 -0700)]
ext2: fix data corruption for racing writes
If two writers allocating blocks to file race with each other (e.g.
because writepages races with ordinary write or two writepages race with
each other), ext2_getblock() can be called on the same inode in parallel.
Before we are going to allocate new blocks, we have to recheck the block
chain we have obtained so far without holding truncate_mutex. Otherwise
we could overwrite the indirect block pointer set by the other writer
leading to data loss.
The below test program by Ying is able to reproduce the data loss with ext2
on in BRD in a few minutes if the machine is under memory pressure:
long kMemSize = 50 << 20;
int kPageSize = 4096;
int main(int argc, char **argv) {
int status;
int count = 0;
int i;
char *fname = "/mnt/test.mmap";
char *mem;
unlink(fname);
int fd = open(fname, O_CREAT | O_EXCL | O_RDWR, 0600);
status = ftruncate(fd, kMemSize);
mem = mmap(0, kMemSize, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
// Fill the memory with 1s.
memset(mem, 1, kMemSize);
sleep(2);
for (i = 0; i < kMemSize; i++) {
int byte_good = mem[i] != 0;
if (!byte_good && ((i % kPageSize) == 0)) {
//printf("%d ", i / kPageSize);
count++;
}
}
munmap(mem, kMemSize);
close(fd);
unlink(fname);
if (count > 0) {
printf("Running %d bad page\n", count);
return 1;
}
return 0;
}
Cc: Ying Han <yinghan@google.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Jan Kara <jack@suse.cz> Cc: Mingming Cao <cmm@us.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SHMEM_MAX_BYTES was derived from the maximum size of its triple-indirect
swap vector, forgetting to take the MAX_LFS_FILESIZE limit into account.
Never mind 256kB pages, even 8kB pages on 32-bit kernels allowed files to
grow slightly bigger than that supposed maximum.
Fix this by using the min of both (at build time not run time). And it
happens that this calculation is good as far as 8MB pages on 32-bit or
16MB pages on 64-bit: though SHMSWP_MAX_INDEX gets truncated before that,
it's truncated to such large numbers that we don't need to care.
[akpm@linux-foundation.org: it needs pagemap.h]
[akpm@linux-foundation.org: fix sparc64 min() warnings] Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Yuri Tikhonov <yur@emcraft.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a division by zero which we have in shmem_truncate_range() and
shmem_unuse_inode() when using big PAGE_SIZE values (e.g. 256kB on
ppc44x).
With 256kB PAGE_SIZE, the ENTRIES_PER_PAGEPAGE constant becomes too large
(0x1.0000.0000) on a 32-bit kernel, so this patch just changes its type
from 'unsigned long' to 'unsigned long long'.
Hugh: reverted its unsigned long longs in shmem_truncate_range() and
shmem_getpage(): the pagecache index cannot be more than an unsigned long,
so the divisions by zero occurred in unreached code. It's a pity we need
any ULL arithmetic here, but I found no pretty way to avoid it.
Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefan Husemann [Mon, 13 Apr 2009 21:40:10 +0000 (14:40 -0700)]
intelfb: support i854
Support the Intel 854 Chipset in fbdev.
We test and use the patch on a Thomson IP1101 IPTV-Box. On the VGA-Port
we get a normal signal.
Here is the link to the Mambux-Project: http://www.mambux.de
Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@linux.ie> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Stefan Husemann <shusemann@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Mon, 13 Apr 2009 21:40:06 +0000 (14:40 -0700)]
jbd: update locking coments
Update information about locking in JBD revoke code.
Reported-by: Lin Tan <tammy000@gmail.com>. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Grover [Mon, 13 Apr 2009 21:40:05 +0000 (14:40 -0700)]
mm: document get_user_pages_fast()
While better than get_user_pages(), the usage of gupf(), especially the
return values and the fact that it can potentially only partially pin the
range, warranted some documentation.
Signed-off-by: Andy Grover <andy.grover@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Robertson [Mon, 13 Apr 2009 21:40:04 +0000 (14:40 -0700)]
initramfs: fix initramfs to work with hardlinked init
Change cb6ff208076b5f434db1b8c983429269d719cef5 ("NOMMU: Support XIP on
initramfs") seems to have broken booting from initramfs with /sbin/init
being a hardlink.
It seems like the logic required for XIP on nommu, i.e. ftruncate to
reported cpio header file size (body_len) is broken for hardlinks, which
have a reported size of 0, and the truncate thus nukes the contents of the
file (in my case busybox), making boot impossible and ending with runaway
loop modprobe binfmt-0000 - and of course 0000 is not a valid binary
format.
My fix is to only call ftruncate if size is non-zero which fixes things
for me, but I'm not certain whether this will break XIP for those files on
nommu systems, although I would guess not.
Signed-off-by: Randy Robertson <rmrobert@vmware.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ptrace: fix exit_ptrace() vs ptrace_traceme() race
Pointed out by Roland. The bug was recently introduced by me in
"forget_original_parent: split out the un-ptrace part", commit 39c626ae47c469abdfd30c6e42eff884931380d6.
Since that patch we have a window after exit_ptrace() drops tasklist and
before forget_original_parent() takes it again. In this window the child
can do ptrace(PTRACE_TRACEME) and nobody can untrace this child after
that.
Change ptrace_traceme() to not attach to the exiting ->real_parent. We
don't report the error in this case, we pretend we attach right before
->real_parent calls exit_ptrace() which should untrace us anyway.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Joe Perches <joe@perches.com> Cc: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: "Jani Monoses" <jani@ubuntu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Righi [Mon, 13 Apr 2009 21:39:58 +0000 (14:39 -0700)]
res_counter: update documentation
After the introduction of resource counters hierarchies
(28dbc4b6a01fb579a9441c7b81e3d3413dc452df) the prototypes of
res_counter_init() and res_counter_charge() have been changed.
Keep the documentation consistent with the actual function prototypes.
Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: Paul Menage <menage@google.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Mon, 13 Apr 2009 21:39:57 +0000 (14:39 -0700)]
spi: spi_write_then_read() bugfixes
The "simplify spi_write_then_read()" patch included two regressions from
the 2.6.27 behaviors:
- The data it wrote out during the (full duplex) read side
of the transfer was not zeroed.
- It fails completely on half duplex hardware, such as
Microwire and most "3-wire" SPI variants.
So, revert that patch. A revised version should be submitted at some
point, which can get the speedup on standard hardware (full duplex)
without breaking on less-capable half-duplex stuff.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Cc: <stable@kernel.org> [2.6.28.x, 2.6.29.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
forgot to update the NSIGTRAP define in asm-generic/siginfo.h to the new
number of sigtrap subcodes. Nothing in the tree seems to use it, but
presumably something in user space might. So update it.
Krzysztof Helt [Mon, 13 Apr 2009 21:39:55 +0000 (14:39 -0700)]
cirrusfb: do not allow unsupported pixel depth
Do not allow modes with unsupported pixel depth. Otherwise, one can hang
a computer by setting incorrect value with fbset command.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Sandeen [Mon, 13 Apr 2009 21:39:54 +0000 (14:39 -0700)]
include/linux/fiemap.h: include types.h now that it's exported
Include <linux/types.h> in fiemap.h. Sam Ravnborg pointed out that this
was missing in this newly-exported header which uses the __u32 and __u64
types.
Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sisfb: fix color component length for pseudocolor modes
sisfb incorrectly sets the length of the color fields to 6 bits
for PSEUDOCOLOR modes, even though 8 bits are always used per pixel.
Fix this by setting the length to 8.
Signed-off-by: Michal Januszewski <spock@gentoo.org> Cc: Thomas Winischhofer <thomas@winischhofer.net> Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sa1100fb: fix color component length for pseudocolor modes
sa1100fb incorrectly sets the length of the color fields to 8 bits for
PSEUDOCOLOR modes for which only 4 bits are used per pixel. Fix this by
setting the length to 4 bits for these modes.
Signed-off-by: Michal Januszewski <spock@gentoo.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
s3fb: fix color component length for pseudocolor modes
s3fb incorrectly sets the length of the color fields to 6 bits for
PSEUDOCOLOR modes, even though 8 or 4 bits are used per pixel. Fix this
by setting the length to 8 or 4, respectively.
Signed-off-by: Michal Januszewski <spock@gentoo.org> Acked-by: Ondrej Zajicek <santiago@crfreenet.org> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Anderson [Mon, 13 Apr 2009 21:39:46 +0000 (14:39 -0700)]
hfs: fix memory leak when unmounting
When an HFS filesystem is unmounted, it leaks a 2-page bitmap. Also,
under extreme memory pressure, it's possible that hfs_releasepage() may
use a tree pointer that has not been initialized, and if so, the release
request should just be rejected.
[akpm@linux-foundation.org: free_pages(0) is legal, remove obvious comment] Signed-off-by: Dave Anderson <anderson@redhat.com> Tested-by: Eugene Teo <eugeneteo@kernel.sg> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jonathan Cameron [Mon, 13 Apr 2009 21:39:45 +0000 (14:39 -0700)]
hwmon: sht15 humidity sensor driver
Data sheet at:
http://www.sensirion.ch/en/pdf/product_information/Datasheet-humidity-sensor-SHT1x.pdf
These sensors communicate over a 2 wire bus running a device specific
protocol. The complexity of the driver is mainly due to handling the
substantial delays between requesting a reading and the device pulling the
data line low to indicate that the data is available. This is handled by
an interrupt that is disabled under all other conditions.
I wasn't terribly clear on the best way to handle this, so comments on
that aspect would be particularly welcome!
Interpretation of the temperature depends on knowing the supply voltage.
If configured in a board config as a regulator consumer this is obtained
from the regulator subsystem. If not it should be provided in the
platform data.
I've placed this driver in the hwmon subsystem as it is definitely a
device that may be used for hardware monitoring and with it's relatively
slow response times (up to 120 millisecs to get a reading) a caching
strategy certainly seems to make sense!
Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Garrett [Mon, 13 Apr 2009 21:39:44 +0000 (14:39 -0700)]
efifb: exit if framebuffer address is invalid
efifb will attempt to ioremap a framebuffer even if its starting address
is 0, failing and causing an ugly backtrace in the process. Exit before
probing if this is the case.
Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Peter Jones <pjones@redhat.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
uvesafb: fix color component length for pseudocolor modes
uvesafb incorrectly sets the length of the color fields to 6 bits for
PSEUDOCOLOR modes, even though 8 bits are always used per pixel. Fix this
by setting the length to 8.
The switch of the DAC width from the default 6 bits to 8 bits is retained
and tracked internally in the driver, but never exposed to userspace.
Signed-off-by: Michal Januszewski <spock@gentoo.org> Acked-by: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: <syrjala@sci.fi> Cc: Geert Uytterhoeven <geert.uytterhoeven@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fbdev: fix color component field length documentation
The documentation about the meaning of the color component bitfield
lengths in pseudocolor modes is inconsistent. Fix it, so that it
indicates the correct interpretation everywhere, i.e. that 1 << length is
the number of palette entries.
Signed-off-by: Michal Januszewski <spock@gentoo.org> Acked-by: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: <syrjala@sci.fi> Acked-by: Geert Uytterhoeven <geert.uytterhoeven@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Righi [Mon, 13 Apr 2009 21:39:39 +0000 (14:39 -0700)]
fbdev: fix info->lock deadlock in fbcon_event_notify()
fb_notifier_call_chain() is called with info->lock held, i.e. in
do_fb_ioctl() => FBIOPUT_VSCREENINFO => fb_set_var() and the some
notifier callbacks, like fbcon_event_notify(), try to re-acquire
info->lock again.
Remove the lock/unlock_fb_info() in all the framebuffer notifier
callbacks' and be sure to always call fb_notifier_call_chain() with
info->lock held.
Reported-by: Pavel Roskin <proski@gnu.org> Reported-by: Eric Miao <eric.y.miao@gmail.com> Signed-off-by: Andrea Righi <righi.andrea@gmail.com> Cc: Stefan Richter <stefanr@s5r6.in-berlin.de> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>