Thomas Gleixner [Sat, 5 Feb 2011 20:39:28 +0000 (21:39 +0100)]
m32r: Fixup last __do_IRQ leftover
Somehow I managed to miss the last __do_IRQ caller when I cleanup the
remaining users. m32r is fully converted to the generic irq layer, but
I managed to not commit the conversion of __do_IRQ() to
generic_handle_irq() after compile testing the quilt series :(
Pointed-out-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (68 commits)
net: can: janz-ican3: world-writable sysfs termination file
net: can: at91_can: world-writable sysfs files
MAINTAINERS: update email ids of the be2net driver maintainers.
bridge: Don't put partly initialized fdb into hash
r8169: prevent RxFIFO induced loops in the irq handler.
r8169: RxFIFO overflow oddities with 8168 chipsets.
r8169: use RxFIFO overflow workaround for 8168c chipset.
include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument
net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.
net: Support compat SIOCGETVIFCNT ioctl in ipv4.
net: Fix bug in compat SIOCGETSGCNT handling.
niu: Fix races between up/down and get_stats.
tcp_ecn is an integer not a boolean
atl1c: Add missing PCI device ID
s390: Fix possibly wrong size in strncmp (smsgiucv)
s390: Fix wrong size in memcmp (netiucv)
qeth: allow OSA CHPARM change in suspend state
qeth: allow HiperSockets framesize change in suspend
qeth: add more strict MTU checking
qeth: show new mac-address if its setting fails
...
Vasiliy Kulikov [Fri, 4 Feb 2011 02:23:50 +0000 (02:23 +0000)]
net: can: at91_can: world-writable sysfs files
Don't allow everybody to write to mb0_id file.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 4 Feb 2011 21:02:36 +0000 (13:02 -0800)]
bridge: Don't put partly initialized fdb into hash
The fdb_create() puts a new fdb into hash with only addr set. This is
not good, since there are callers, that search the hash w/o the lock
and access all the other its fields.
Applies to current netdev tree.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Tetsuo Handa [Fri, 4 Feb 2011 18:13:24 +0000 (18:13 +0000)]
CRED: Fix kernel panic upon security_file_alloc() failure.
In get_empty_filp() since 2.6.29, file_free(f) is called with f->f_cred == NULL
when security_file_alloc() returned an error. As a result, kernel will panic()
due to put_cred(NULL) call within RCU callback.
Fix this bug by assigning f->f_cred before calling security_file_alloc().
Linus Torvalds [Fri, 4 Feb 2011 18:02:22 +0000 (10:02 -0800)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (27 commits)
gpu/stub: fix acpi_video build error, fix stub kconfig dependencies
drm/radeon/kms: dynamically allocate power state space
drm/radeon/kms: fix s/r issues with bios scratch regs
agp: ensure GART has an address before enabling it
Revert "agp: AMD AGP is used on UP1100 & UP1500 alpha boxen"
amd-k7-agp: remove non-x86 code
drm/radeon/kms/evergreen: always set certain VGT regs at CP init
drm/radeon/kms: add updated ib_execute function for evergreen
drm/radeon: remove 0x4243 pci id
drm/radeon/kms: Enable new pll calculation for avivo+ asics
drm/radeon/kms: add new pll algo for avivo asics
drm/radeon/kms: add pll debugging output
drm/radeon/kms: switch back to min->max pll post divider iteration
drm/radeon/kms: rv6xx+ thermal sensor fixes
drm/nv50: fix display on 0x50
drm/nouveau: correctly pair hwmon_init and hwmon_fini
drm/i915: Only bind to function 0 of the PCI device
drm/i915: Suppress spurious vblank interrupts
drm: Avoid leak of adjusted mode along quick set_mode paths
drm: Simplify and defend later checks when disabling a crtc
...
Keith Packard [Fri, 4 Feb 2011 00:57:28 +0000 (16:57 -0800)]
drm: Only set DPMS ON when actually configuring a mode
In drm_crtc_helper_set_config, instead of always forcing all outputs
to DRM_MODE_DPMS_ON, only set them if the CRTC is actually getting a
mode set, as any mode set will turn all outputs on.
This fixes https://lkml.org/lkml/2011/1/24/457
Signed-off-by: Keith Packard <keithp@keithp.com> Cc: stable@kernel.org (2.6.37) Reported-and-tested-by: Carlos R. Mafra <crmafra2@gmail.com> Tested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Francois Romieu [Thu, 3 Feb 2011 11:02:36 +0000 (12:02 +0100)]
r8169: RxFIFO overflow oddities with 8168 chipsets.
Some experiment-based action to prevent my 8168 chipsets locking-up hard
in the irq handler under load (pktgen ~1Mpps). Apparently a reset is not
always mandatory (is it at all ?).
- RTL_GIGA_MAC_VER_12
- RTL_GIGA_MAC_VER_25
Missed ~55% packets. Note:
- this is an old SiS 965L motherboard
- the 8168 chipset emits (lots of) control frames towards the sender
- RTL_GIGA_MAC_VER_26
The chipset does not go into a frenzy of mac control pause when it
crashes yet but it can still be crashed. It needs more work.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Ivan Vecera <ivecera@redhat.com> Cc: Hayes <hayeswang@realtek.com>
Ivan Vecera [Thu, 27 Jan 2011 11:24:11 +0000 (12:24 +0100)]
r8169: use RxFIFO overflow workaround for 8168c chipset.
I found that one of the 8168c chipsets (concretely XID 1c4000c0) starts
generating RxFIFO overflow errors. The result is an infinite loop in
interrupt handler as the RxFIFOOver is handled only for ...MAC_VER_11.
With the workaround everything goes fine.
Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Cc: Hayes <hayeswang@realtek.com>
Julia Lawall [Fri, 28 Jan 2011 05:43:40 +0000 (05:43 +0000)]
include/net/genetlink.h: Allow genlmsg_cancel to accept a NULL argument
nlmsg_cancel can accept NULL as its second argument, so for similarity,
this patch extends genlmsg_cancel to be able to accept a NULL second
argument as well.
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
The comments under "config STUB_POULSBO" are close to correct,
but they are not being followed. This patch updates them to reflect
the requirements for THERMAL.
This build error is caused by STUB_POULSBO selecting ACPI_VIDEO
when ACPI_VIDEO's config requirements are not met.
David S. Miller [Fri, 4 Feb 2011 01:21:31 +0000 (17:21 -0800)]
net: Fix bug in compat SIOCGETSGCNT handling.
Commit 709b46e8d90badda1898caea50483c12af178e96 ("net: Add compat
ioctl support for the ipv4 multicast ioctl SIOCGETSGCNT") added the
correct plumbing to handle SIOCGETSGCNT properly.
However, whilst definiting a proper "struct compat_sioc_sg_req" it
isn't actually used in ipmr_compat_ioctl().
Correct this oversight.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 4 Feb 2011 00:31:43 +0000 (16:31 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/hfsplus:
hfsplus: fix up a comparism in hfsplus_file_extend
hfsplus: fix two memory leaks in wrapper.c
hfsplus: do not leak buffer on error
hfsplus: fix failed mount handling
David S. Miller [Fri, 4 Feb 2011 00:12:50 +0000 (16:12 -0800)]
niu: Fix races between up/down and get_stats.
As reported by Flavio Leitner, there is no synchronization to protect
NIU's get_stats method from seeing a NULL pointer in either
np->rx_rings or np->tx_rings. In fact, as far as ->ndo_get_stats
is concerned, these values are set completely asynchronously.
Flavio attempted to fix this using a RW semaphore, which in fact
works most of the time. However, dev_get_stats() can be invoked
from non-sleepable contexts in some cases, so this fix doesn't
work in all cases.
So instead, control the visibility of the np->{rx,tx}_ring pointers
when the device is being brough up, and use properties of the device
down sequence to our advantage.
In niu_get_stats(), return immediately if netif_running() is false.
The device shutdown sequence first marks the device as not running (by
clearing the __LINK_STATE_START bit), then it performans a
synchronize_rcu() (in dev_deactive_many()), and then finally it
invokes the driver ->ndo_stop() method.
This guarentees that all invocations of niu_get_stats() either see
netif_running() as false, or they see the channel pointers before
->ndo_stop() clears them out.
If netif_running() is true, protect against startup races by loading
the np->{rx,tx}_rings pointer into a local variable, and punting if
it is NULL. Use ACCESS_ONCE to prevent the compiler from reloading
the pointer on us.
Also, during open, control the order in which the pointers and the
ring counts become visible globally using SMP write memory barriers.
We make sure the np->num_{rx,tx}_rings value is stable and visible
before np->{rx,tx}_rings is.
Such visibility control is not necessary on the niu_free_channels()
side because of the RCU sequencing that happens during device down as
described above. We are always guarenteed that all niu_get_stats
calls are finished, or will see netif_running() false, by the time
->ndo_stop is invoked.
Reported-by: Flavio Leitner <fleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Deucher [Wed, 2 Feb 2011 23:42:03 +0000 (18:42 -0500)]
drm/radeon/kms: dynamically allocate power state space
We previously used a static array, but some new systems
had more states then we had array space, so dynamically
allocate space based on the number of states in the vbios.
Alex Deucher [Thu, 3 Feb 2011 00:46:06 +0000 (19:46 -0500)]
drm/radeon/kms: fix s/r issues with bios scratch regs
The accelerate mode bit gets checked by certain atom
command tables to set up some register state. It needs
to be clear when setting modes and set when not.
Stephen Kitt [Mon, 31 Jan 2011 22:25:43 +0000 (14:25 -0800)]
agp: ensure GART has an address before enabling it
Some BIOSs (eg. the AMI BIOS on the Asus P4P800 motherboard) don't
initialise the GART address, and pcibios_assign_resources() can ignore it
because it can be marked as a host bridge (see
https://bugzilla.kernel.org/show_bug.cgi?id=24392#c5 for details). This
was handled correctly up to 2.6.35, but the pci_enable_device() cleanup in
2.6.36 96576a9e1a0cdb8 ("agp: intel-agp: do not use PCI resources before
pci_enable_device()") means that the kernel tries to enable the GART
before assigning it an address; in such cases the GART overlaps with other
device assignments and ends up being disabled.
This patch fixes https://bugzilla.kernel.org/show_bug.cgi?id=24392
Note that I imagine efficeon-agp.c probably has the same problem, but
I can't test that and I'd like to make sure this patch is suitable for
-stable (since 2.6.36 and 2.6.37 are affected).
Signed-off-by: Stephen Kitt <steve@sk2.org> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Maciej Rutecki <maciej.rutecki@gmail.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Kulikov Vasiliy <segooon@gmail.com> Cc: Florian Mickler <florian@mickler.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
The AMD 751 and 761 chipsets are used on the UP1000, UP1100, and UP1500
OEM motherboards, but they neglect to do anything to make AGP work.
According to Ivan Kokshaysky:
There is quite fundamental conflict between the Alpha architecture and
x86 AGP implementation - Alpha is entirely cache coherent by design,
while x86 AGP is not (I mean native AGP DMA transactions, not a PCI over
AGP). There are no such things as non-cacheable mappings or software
support for cache flushing/invalidation on Alpha, so x86 AGP code won't
work on Nautilus.
So there's no point in allowing this driver to be configured on Alpha.
Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Currently the error handling in hfsplus_fill_super is a mess, and can
lead to accessing fields in the superblock that haven't been even set
up yet. Fix this by making sure we do not set up sb->s_root until we
have the mount fully set up, and before that do proper step by step
unwinding instead of using hfsplus_put_super as a big hammer.
Reported-by: Dan Williams <dcbw@redhat.com> Signed-off-by: Christoph Hellwig <hch@tuxera.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] libsas: fix runaway error handler problem
[SCSI] fix incorrect value of SCSI_MAX_SG_CHAIN_SEGMENTS due to include file ordering
[SCSI] arcmsr: Fix the issue of system hangup after commands timeout on ARC-1200
[SCSI] mpt2sas: fix Integrated Raid unsynced on shutdown problem
[SCSI] mpt2sas: Kernel Panic during Large Topology discovery
[SCSI] mpt2sas: Fix the race between broadcast asyn event and scsi command completion
[SCSI] mpt2sas: Correct resizing calculation for max_queue_depth
[SCSI] mpt2sas: fix internal device reset for older firmware prior to MPI Rev K
[SCSI] mpt2sas: Fix device removal handshake for zoned devices
Suresh Siddha [Thu, 3 Feb 2011 20:20:04 +0000 (12:20 -0800)]
x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after switching mm
Clearing the cpu in prev's mm_cpumask early will avoid the flush tlb
IPI's while the cr3 is still pointing to the prev mm. And this window
can lead to the possibility of bogus TLB fills resulting in strange
failures. One such problematic scenario is mentioned below.
T1. CPU-1 is context switching from mm1 to mm2 context and got a NMI
etc between the point of clearing the cpu from the mm_cpumask(mm1)
and before reloading the cr3 with the new mm2.
T2. CPU-2 is tearing down a specific vma for mm1 and will proceed with
flushing the TLB for mm1. It doesn't send the flush TLB to CPU-1
as it doesn't see that cpu listed in the mm_cpumask(mm1).
T3. After the TLB flush is complete, CPU-2 goes ahead and frees the
page-table pages associated with the removed vma mapping.
T4. CPU-2 now allocates those freed page-table pages for something
else.
T5. As the CR3 and TLB caches for mm1 is still active on CPU-1, CPU-1
can potentially speculate and walk through the page-table caches
and can insert new TLB entries. As the page-table pages are
already freed and being used on CPU-2, this page walk can
potentially insert a bogus global TLB entry depending on the
(random) contents of the page that is being used on CPU-2.
T6. This bogus TLB entry being global will be active across future CR3
changes and can result in weird memory corruption etc.
To avoid this issue, for the prev mm that is handing over the cpu to
another mm, clear the cpu from the mm_cpumask(prev) after the cr3 is
changed.
Marking it for -stable, though we haven't seen any reported failure that
can be attributed to this.
Linus Torvalds [Thu, 3 Feb 2011 19:19:26 +0000 (11:19 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA: Update missed conversion of flush_scheduled_work()
RDMA/ucma: Copy iWARP route information on queries
RDMA/amso1100: Fix compile warnings
RDMA/cxgb4: Set the correct device physical function for iWARP connections
RDMA/cxgb4: Limit MAXBURST EQ context field to 256B
IB/qib: Hold link for TX SERDES settings
mlx4_core: Add ConnectX-3 device IDs
Linus Torvalds [Thu, 3 Feb 2011 16:55:07 +0000 (08:55 -0800)]
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: Fix update_curr_rt()
sched, docs: Update schedstats documentation to version 15
Will call update_curr_rt() on rq->curr, which at that time is
rq->stop. The problem is that rq->stop.prio matches an RT prio and
thus falsely assumes its a rt_sched_class task.
Reported-Debuged-Tested-Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission> Cc: stable@kernel.org # .37 Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra [Wed, 26 Jan 2011 14:38:35 +0000 (15:38 +0100)]
perf: Fix reading in perf_event_read()
It is quite possible for the event to have been disabled between
perf_event_read() sending the IPI and the CPU servicing the IPI and
calling __perf_event_read(), hence revalidate the state.
Apparently setting inode->bdi to one's own sb->s_bdi stops VFS from
sending *read-aheads*. This problem was bisected to this commit. A
revert fixes it. I'll investigate farther why is this happening for the
next Kernel, but for now a revert.
I'm sending to stable@kernel.org as well, since it exists also in
2.6.37. 2.6.36 is good and does not have this patch.
Linus Torvalds [Thu, 3 Feb 2011 01:52:19 +0000 (17:52 -0800)]
Merge branch 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6
* 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
[media] fix saa7111 non-detection
[media] rc/streamzap: fix reporting response times
[media] mceusb: really fix remaining keybounce issues
[media] rc: use time unit conversion macros correctly
[media] rc/ir-lirc-codec: add back debug spew
[media] ir-kbd-i2c: improve remote behavior with z8 behind usb
[media] lirc_zilog: z8 on usb doesn't like back-to-back i2c_master_send
[media] hdpvr: fix up i2c device registration
[media] rc/mce: add mappings for missing keys
[media] gspca - zc3xx: Discard the partial frames
[media] gspca - zc3xx: Fix bad images with the sensor hv7131r
[media] gspca - zc3xx: Bad delay when given by a table
Josef Bacik [Tue, 1 Feb 2011 23:52:47 +0000 (15:52 -0800)]
fs: make block fiemap mapping length at least blocksize long
Some filesystems don't deal well with being asked to map less than
blocksize blocks (GFS2 for example). Since we are always mapping at least
blocksize sections anyway, just make sure len is at least as big as a
blocksize so we don't trip up any filesystems. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Namhyung Kim [Tue, 1 Feb 2011 23:52:46 +0000 (15:52 -0800)]
vfs: sparse: add __FMODE_EXEC
FMODE_EXEC is a constant type of fmode_t but was used with normal integer
constants. This results in following warnings from sparse. Fix it using
new macro __FMODE_EXEC.
fs/exec.c:116:58: warning: restricted fmode_t degrades to integer
fs/exec.c:689:58: warning: restricted fmode_t degrades to integer
fs/fcntl.c:777:9: warning: restricted fmode_t degrades to integer
Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcg: fix event counting breakage from recent THP update
Changes in e401f1761 ("memcg: modify accounting function for supporting
THP better") adds nr_pages to support multiple page size in
memory_cgroup_charge_statistics.
But counting the number of event nees abs(nr_pages) for increasing
counters. This patch fixes event counting.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Tue, 1 Feb 2011 23:52:43 +0000 (15:52 -0800)]
memcg: prevent endless loop when charging huge pages to near-limit group
If reclaim after a failed charging was unsuccessful, the limits are
checked again, just in case they settled by means of other tasks.
This is all fine as long as every charge is of size PAGE_SIZE, because in
that case, being below the limit means having at least PAGE_SIZE bytes
available.
But with transparent huge pages, we may end up in an endless loop where
charging and reclaim fail, but we keep going because the limits are not
yet exceeded, although not allowing for a huge page.
Fix this up by explicitely checking for enough room, not just whether we
are within limits.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Tue, 1 Feb 2011 23:52:42 +0000 (15:52 -0800)]
memcg: prevent endless loop when charging huge pages
The charging code can encounter a charge size that is bigger than a
regular page in two situations: one is a batched charge to fill the
per-cpu stocks, the other is a huge page charge.
This code is distributed over two functions, however, and only the outer
one is aware of huge pages. In case the charging fails, the inner
function will tell the outer function to retry if the charge size is
bigger than regular pages--assuming batched charging is the only case.
And the outer function will retry forever charging a huge page.
This patch makes sure the inner function can distinguish between batch
charging and a single huge page charge. It will only signal another
attempt if batch charging failed, and go into regular reclaim when it is
called on behalf of a huge page.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jin Dongming [Tue, 1 Feb 2011 23:52:41 +0000 (15:52 -0800)]
thp: fix unsuitable behavior for hwpoisoned tail page
When a tail page of THP is poisoned, memory-failure will do nothing except
setting PG_hwpoison, while the expected behavior is that the process, who
is using the poisoned tail page, should be killed.
The above problem is caused by lru check of the poisoned tail page of THP.
Because PG_lru flag is only set on the head page of THP, the check always
consider the poisoned tail page as NON lru page.
So the lru check for the tail page of THP should be avoided, as like as
hugetlb.
This patch adds !PageTransCompound() before lru check for THP, because of
the check (!PageHuge() && !PageTransCompound()) the whole branch could be
optimized away at build time when both hugetlbfs and THP are set with "N"
(or in archs not supporting either of those).
[akpm@linux-foundation.org: fix unrelated typo in shake_page() comment] Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jin Dongming [Tue, 1 Feb 2011 23:52:40 +0000 (15:52 -0800)]
thp: fix the wrong reported address of hwpoisoned hugepages
When the tail page of THP is poisoned, the head page will be poisoned too.
And the wrong address, address of head page, will be sent with sigbus
always.
So when the poisoned page is used by Guest OS which is running on KVM,
after the address changing(hva->gpa) by qemu, the unexpected process on
Guest OS will be killed by sigbus.
What we expected is that the process using the poisoned tail page could be
killed on Guest OS, but not that the process using the healthy head page
is killed.
Since it is not good to poison the healthy page, avoid poisoning other
than the page which is really poisoned.
(While we poison all pages in a huge page in case of hugetlb,
we can do this for THP thanks to split_huge_page().)
Here we fix two parts:
1. Isolate the poisoned page only to make sure
the reported address is the address of poisoned page.
2. make the poisoned page work as the poisoned regular page.
[akpm@linux-foundation.org: fix spello in comment] Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jin Dongming [Tue, 1 Feb 2011 23:52:39 +0000 (15:52 -0800)]
thp: fix splitting of hwpoisoned hugepages
The poisoned THP is now split with split_huge_page() in
collect_procs_anon(). If kmalloc() is failed in collect_procs(),
split_huge_page() could not be called. And the work after
split_huge_page() for collecting the processes using poisoned page will
not be done, too. So the processes using the poisoned page could not be
killed.
The condition becomes worse when CONFIG_DEBUG_VM == "Y". Because the
poisoned THP could not be split, system panic will be caused by
VM_BUG_ON(PageTransHuge(page)) in try_to_unmap().
This patch does:
1. move split_huge_page() to the place before collect_procs().
This can be sure the failure of splitting THP is caused by itself.
2. when splitting THP is failed, stop the operations after it.
This can avoid unexpected system panic or non sense works.
[akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Tue, 1 Feb 2011 23:52:38 +0000 (15:52 -0800)]
MAINTAINERS: fixup Simtec support email entries
The support@simtec.co.uk address is for direct customer support only, the
EB2410ITX and EB110ATX entries should direct to the Simtec Linux Team
address of linux@simtec.co.uk
Also add correct email address for Vincent Sanders
[akpm@linux-foundation.org: fix Vincent's address] Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Vincent Sanders <vince@simtec.co.uk> Cc: Simtec Support <support@simtec.co.uk> Cc: Simtec Linux Team <linux@simtec.co.uk> Cc: Jack Stone <jwjstone@fastmail.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Tue, 1 Feb 2011 23:52:37 +0000 (15:52 -0800)]
MAINTAINERS: fixup file entries for "SIMTEC EB2410ITX (BAST)"
Add the correct files for the Simtec BAST machine, ensuring the IDE and
IRQ routing are added, and move to the machine specific file instead of
trying to catch all of arch/arm/mach-s3c2410
Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Simtec Linux Team <linux@simtec.co.uk> Cc: Simtec Support <support@simtec.co.uk> Cc: Jack Stone <jwjstone@fastmail.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Tue, 1 Feb 2011 23:52:36 +0000 (15:52 -0800)]
MAINTAINERS: move s3c2410 drivers to ARM/SAMSUNG ARM
There are currently two entries under the "SIMTEC EB2410ITX (BAST)"
machine entry for drivers/*/*s3c2410*, which is catching everything
s3c2410 driver related.
This entry is for a specific S3C2410 based machine, so move these two file
entries to the "ARM/SAMSUNG ARM ARCHITECTURES" entry, where it will reach
a wider audience of interested parties.
Signed-off-by: Ben Dooks <ben-linux@fluff.org> Cc: Simtec Linux Team <linux@simtec.co.uk> Acked-by: Kukjin Kim <kgene.kim@samsung.com> Cc: Jack Stone <jwjstone@fastmail.fm> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Dumazet [Tue, 1 Feb 2011 23:52:35 +0000 (15:52 -0800)]
epoll: epoll_wait() should not use timespec_add_ns()
commit 95aac7b1cd224f ("epoll: make epoll_wait() use the hrtimer range
feature") added a performance regression because it uses timespec_add_ns()
with potential very large 'ns' values.
[akpm@linux-foundation.org: s/epoll_set_mstimeout/ep_set_mstimeout/, per Davide] Reported-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Shawn Bohrer <shawn.bohrer@gmail.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: <stable@kernel.org> [2.6.37.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Tue, 1 Feb 2011 23:52:33 +0000 (15:52 -0800)]
mm/migration: fix page corruption during hugepage migration
If migrate_huge_page by memory-failure fails , it calls put_page in itself
to decrease page reference and caller of migrate_huge_page also calls
putback_lru_pages. It can do double free of page so it can make page
corruption on page holder.
In addtion, clean of pages on caller is consistent behavior with
migrate_pages by cf608ac19c ("mm: compaction: fix COMPACTPAGEFAILED
counting").
Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: when migrate_pages returns 0, all pages must have been released
In some cases migrate_pages could return zero while still leaving a few
pages in the pagelist (and some caller wouldn't notice it has to call
putback_lru_pages after commit cf608ac19c9 ("mm: compaction: fix
COMPACTPAGEFAILED counting")).
Add one missing putback_lru_pages not added by commit cf608ac19c95 ("mm:
compaction: fix COMPACTPAGEFAILED counting").
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Cc: Christoph Lameter <cl@linux.com> Acked-by: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Tue, 1 Feb 2011 23:52:31 +0000 (15:52 -0800)]
memsw: deprecate noswapaccount kernel parameter and schedule it for removal
noswapaccount couldn't be used to control memsw for both on/off cases so
we have added swapaccount[=0|1] parameter. This way we can turn the
feature in two ways noswapaccount resp. swapaccount=0. We have kept the
original noswapaccount but I think we should remove it after some time as
it just makes more command line parameters without any advantages and also
the code to handle parameters is uglier if we want both parameters.
Signed-off-by: Michal Hocko <mhocko@suse.cz> Requested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
__setup based kernel command line parameters handlers which are handled in
obsolete_checksetup are provided with the parameter value including =
(more precisely everything right after the parameter name).
This means that the current implementation of swapaccount[=1|0] doesn't
work at all because if there is a value for the parameter then we are
testing for "0" resp. "1" but we are getting "=0" resp. "=1" and if
there is no parameter value we are getting an empty string rather than
NULL.
The original noswapccount parameter, which doesn't care about the value,
works correctly.
Peter Chubb [Wed, 2 Feb 2011 23:39:58 +0000 (15:39 -0800)]
tcp_ecn is an integer not a boolean
There was some confusion at LCA as to why the sysctl tcp_ecn took one
of three values when it was documented as a Boolean. This patch fixes
the documentation.
Signed-off-by: Peter Chubb <peter.chubb@nicta.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Chuck Ebbert [Wed, 2 Feb 2011 23:02:08 +0000 (15:02 -0800)]
atl1c: Add missing PCI device ID
Commit 8f574b35f22fbb9b5e5f1d11ad6b55b6f35f4533 ("atl1c: Add AR8151 v2
support and change L0s/L1 routine") added support for a new adapter
but failed to add it to the PCI device table.
Signed-Off-By: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Weil [Wed, 2 Feb 2011 06:04:36 +0000 (06:04 +0000)]
s390: Fix possibly wrong size in strncmp (smsgiucv)
This error was reported by cppcheck:
drivers/s390/net/smsgiucv.c:63: error: Using sizeof for array given as
function argument returns the size of pointer.
Although there is no runtime problem as long as sizeof(u8 *) == 8,
this misleading code should get fixed.
Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Weil [Wed, 2 Feb 2011 06:04:35 +0000 (06:04 +0000)]
s390: Fix wrong size in memcmp (netiucv)
This error was reported by cppcheck:
drivers/s390/net/netiucv.c:568: error: Using sizeof for array given
as function argument returns the size of pointer.
sizeof(ipuser) did not result in 16 (as many programmers would have
expected) but sizeof(u8 *), so it is 4 or 8, too small here.
Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Wed, 2 Feb 2011 06:04:34 +0000 (06:04 +0000)]
qeth: allow OSA CHPARM change in suspend state
For OSA the CHPARM-definition determines the number of available
outbound queues.
A CHPARM-change may occur while a Linux system with probed
OSA device is in suspend state. This patch enables proper
resuming of an OSA device in this case.
Signed-off-by: Ursula braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Wed, 2 Feb 2011 06:04:33 +0000 (06:04 +0000)]
qeth: allow HiperSockets framesize change in suspend
For HiperSockets the framesize-definition determines the selected
mtu-size and the size of the allocated qdio buffers.
A framesize-change may occur while a Linux system with probed
HiperSockets device is in suspend state. This patch enables proper
resuming of a HiperSockets device in this case.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Wed, 2 Feb 2011 06:04:31 +0000 (06:04 +0000)]
qeth: show new mac-address if its setting fails
Setting of a MAC-address may fail because an already used MAC-address
is to bet set or because of authorization problems. In those cases
qeth issues a message, but the mentioned MAC-address is not the
new MAC-address to be set, but the actual MAC-address. This patch
chooses now the new MAC-address to be set for the error messages.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
this may not be necessary at this point, but we should still clean up
the skb->skb_iif. If not we may end up with an invalid valid for
skb->skb_iif when the skb is reused and the check is done in
__netif_receive_skb.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Gleixner [Fri, 28 Jan 2011 07:47:15 +0000 (08:47 +0100)]
genirq: Prevent irq storm on migration
move_native_irq() masks and unmasks the interrupt line
unconditionally, but the interrupt line might be masked due to a
threaded oneshot handler in progress. Unmasking the line in that case
can lead to interrupt storms. Observed on PREEMPT_RT.
Originally-from: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@kernel.org
Dave Airlie [Wed, 2 Feb 2011 01:22:34 +0000 (11:22 +1000)]
Merge remote branch 'intel/drm-intel-fixes' of /ssd/git/drm-next into drm-fixes
* 'intel/drm-intel-fixes' of /ssd/git/drm-next:
drm/i915: Only bind to function 0 of the PCI device
drm/i915: Suppress spurious vblank interrupts
drm: Avoid leak of adjusted mode along quick set_mode paths
drm: Simplify and defend later checks when disabling a crtc
drm: Don't switch fb when disabling an output
drm/i915: Reset crtc after resume
drm/i915/crt: Force the initial probe after reset
drm/i915: Reset state after a GPU reset or resume
drm: Add an interface to reset the device
drm/i915/sdvo: If at first we don't succeed in reading the response, wait
Ajit Khaparde [Tue, 1 Feb 2011 23:41:13 +0000 (15:41 -0800)]
be2net: fix a crash seen during insmod/rmmod test
While running insmod/rmood in a loop, an unnecessary netif_stop_queue
causes the system to crash. Remove the netif_stop_queue call
and netif_start_queue in the link status update path.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 1 Feb 2011 23:23:58 +0000 (10:23 +1100)]
Merge branch 'next' of git://git.monstr.eu/linux-2.6-microblaze
* 'next' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Fix ASM optimized code for LE
microblaze: Fix unaligned issue on MMU system with BS=0 DIV=1
microblaze: Fix DTB passing from bootloader
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix length checks in checkSMB
[CIFS] Update cifs minor version
cifs: No need to check crypto blockcipher allocation
cifs: clean up some compiler warnings
cifs: make CIFS depend on CRYPTO_MD4
cifs: force a reconnect if there are too many MIDs in flight
cifs: don't pop a printk when sending on a socket is interrupted
cifs: simplify SMB header check routine
cifs: send an NT_CANCEL request when a process is signalled
cifs: handle cancelled requests better
cifs: fix two compiler warning about uninitialized vars
mlock: operate on any regions with protection != PROT_NONE
As Tao Ma noticed, change 5ecfda0 breaks blktrace. This is because
blktrace mmaps a file with PROT_WRITE permissions but without PROT_READ,
so my attempt to not unnecessarity break COW during mlock ended up
causing mlock to fail with a permission problem.
I am proposing to let mlock ignore vma protection in all cases except
PROT_NONE. In particular, mlock should not fail for PROT_WRITE regions
(as in the blktrace case, which broke at 5ecfda0) or for PROT_EXEC
regions (which seem to me like they were always broken).
Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The comments under "config STUB_POULSBO" are close to correct,
but they are not being followed. This patch updates them to reflect
the requirements for THERMAL.
This build error is caused by STUB_POULSBO selecting ACPI_VIDEO
when ACPI_VIDEO's config requirements are not met.
Stefan Weil [Sun, 30 Jan 2011 10:31:26 +0000 (10:31 +0000)]
isdn: icn: Fix potentially wrong string handling
This warning was reported by cppcheck:
drivers/isdn/icn/icn.c:1641: error: Dangerous usage of 'rev' (strncpy doesn't always 0-terminate it)
If strncpy copied 20 bytes, the destination string rev was not terminated.
The patch adds one more byte to rev and makes sure that this byte is
always 0.
Cc: Karsten Keil <isdn@linux-pingi.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Stefan Weil <weil@mail.berlios.de> Signed-off-by: David S. Miller <davem@davemloft.net>
The chip was erroneously configured to accept all multicast frames
in a normal (none-promisc) rx mode both on the RSS and on the FCoE L2 rings
when in an NPAR mode. This caused packet duplication for every received multicast
frame in this mode.
Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Wilson [Tue, 1 Feb 2011 19:43:02 +0000 (19:43 +0000)]
drm/i915: Only bind to function 0 of the PCI device
Early chipsets (gen2/3) used function 1 as a placeholder for multi-head.
We used to ignore these since they were not assigned to
PCI_CLASS_DISPLAY_VGA. However with 934f992c7 we attempt to bind to all
Intel PCI_CLASS_DISPLAY devices (and functions) to work in multi-gpu
systems. This fails hard on gen2/3.
Reported-by: Ferenc Wágner <wferi@niif.hu> Tested-by: Ferenc Wágner <wferi@niif.hu>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=28012 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org
Stefan Weil [Fri, 28 Jan 2011 12:30:17 +0000 (12:30 +0000)]
vxge: Fix wrong boolean operator
This error is reported by cppcheck:
drivers/net/vxge/vxge-config.c:3693: warning: Mutual exclusion over || always evaluates to true. Did you intend to use && instead?
It looks like cppcheck is correct, so fix this. No test was run.
Cc: Ramkrishna Vepa <ramkrishna.vepa@exar.com> Cc: Sivakumar Subramani <sivakumar.subramani@exar.com> Cc: Sreenivasa Honnur <sreenivasa.honnur@exar.com> Cc: Jon Mason <jon.mason@exar.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Stefan Weil <weil@mail.berlios.de> Acked-by: Ram Vepa <ram.vepa@exar.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Note that the TCP protocol state is not included. For that reason
the CT event filtering is not very useful for conntrackd.
To resolve this issue, instead of conditionally setting the CT events
bits based on the ctmask, we always set them and perform the filtering
in the late stage, just before the delivery.
Thus, the event delivered looks like the following:
netfilter: arpt_mangle: fix return values of checkentry
In 135367b "netfilter: xtables: change xt_target.checkentry return type",
the type returned by checkentry was changed from boolean to int, but the
return values where not adjusted.
arptables: Input/output error
This broke arptables with the mangle target since it returns true
under success, which is interpreted by xtables as >0, thus
returning EIO.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
When built with rcu checks enabled, vhost triggers
bogus warnings as vhost features are read without
dev->mutex sometimes, and private pointer is read
with our kind of rcu where work serves as a
read side critical section.
Fixing it properly is not trivial.
Disable the warnings by stubbing out the checks for now.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>