Daniel Vetter [Wed, 4 Jul 2012 20:52:50 +0000 (22:52 +0200)]
drm/i915: don't return a spurious -EIO from intel_ring_begin
The issue with this check is that it results in userspace receiving an
-EIO while the gpu reset hasn't completed, resulting in fallback to sw
rendering or worse.
Now there's also a stern comment in intel_ring_wait_seqno saying that
intel_ring_begin should not return -EAGAIN, ever, because some callers
can't handle that. But after an audit of the callsites I don't see any
issues. I guess the last problematic spot disappeared with the removal
of the pipelined fencing code.
So do the right thing and call check_wedge, which should properly
decide whether an -EAGAIN or -EIO is appropriate if wedged is set.
Note that the early check for a wedged gpu before touching the ring is
rather important (and it took me quite some time of acting like the
densest doofus to figure that out): If we don't do that and the gpu
died for good, not having been resurrect by the reset code, userspace
can merrily fill up the entire ring until it notices that something is
amiss.
Allowing userspace to emit more render, despite that we know that it
will fail can't lead to anything good (and by experience can lead to
all sorts of havoc, including angering the OOM gods and hard-hanging
the hw for good).
v2: Fix EAGAIN mispell, noticed by Chris Wilson.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Wed, 4 Jul 2012 20:18:42 +0000 (22:18 +0200)]
drm/i915: properly SIGBUS on I/O errors
... instead of looping endless with no hope of ever serving that
page-fault. We only need to break out of this loop when the gpu died,
to run the reset work (and hopefully resurrect it).
To clarify questions Chris raised on irc: This is about handling I/O
errors not from our own code, but e.g. when the disk died when trying
to swap in a gem bo. So this patch remidies the issue that the current
handling only handles gpu-death-induced cases of -EIO. Admittedly,
dying disks are much rarer than hanging gpus ...To clarify questions
Chris raised on irc: This is about handling I/O errors not from our
own code, but e.g. when the disk died when trying to swap in a gem bo.
So this patch remidies the issue that the current handling only
handles gpu-death-induced cases of -EIO. Admittedly, dying disks are
much rarer than hanging gpus ...
drm/i915: Fix infinite loop regression from 21dd3734
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Wed, 4 Jul 2012 20:18:41 +0000 (22:18 +0200)]
drm/i915: don't hang userspace when the gpu reset is stuck
With the gpu reset no longer using a trylock we've increased the
chances of userspace getting stuck quite a bit. To make that
(hopefully) rare case more paletable time out when waiting for the gpu
reset code to complete and signal this little issue to the caller by
returning -EIO.
This should help userspace to somewhat gracefully fall back and
hopefully allow the user to grab some logs and reboot the machine
(instead of staring at a frozen X screen in agony).
Suggested by Chris Wilson because I've been stubborn about allowing
the gpu reset code no to fail, ever (by removing the trylock).
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So don't return -EAGAIN, even in the case of a gpu hang. Remap it to
-EIO instead. Note that this isn't really an issue with
interruptability, but more that we have quite a few codepaths (mostly
around kms stuff) that simply can't handle any errors and hence not
even -EAGAIN. Instead of adding proper failure paths so that we could
restart these ioctls we've opted for the cheap way out of sleeping
non-interruptibly. Which works everywhere but when the gpu dies,
which this patch fixes.
So essentially interruptible == false means 'wait for the gpu or die
trying'.'
This patch is a bit ugly because intel_ring_begin is all non-interruptible
and hence only returns -EIO. But as the comment in there says,
auditing all the callsites would be a pain.
To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno
and intel_wait_ring_buffer. Also use the opportunity to clarify the
different cases in i915_gem_check_wedge a bit with comments.
v2: Don't access dev_priv->mm.interruptible from check_wedge - we
might not hold dev->struct_mutex, making this racy. Instead pass
interruptible in as a parameter. I've noticed this because I've hit a
BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been
added in
although that commit is missing any justification for this. I guess
it's just copy&paste, because the same commit add the same BUG_ON
check to check_olr, where it indeed makes sense.
But in check_wedge everything we access is protected by other means,
so this is superflous. And because it now gets in the way (we add a
new caller in __wait_seqno, which can be called without
dev->struct_mutext) let's just remove it.
v3: Group all the i915_gem_check_wedge refactoring into this patch, so
that this patch here is all about not returning -EAGAIN to callsites
that can't handle syscall restarting.
v4: Add clarification what interuptible == fales means in our code,
requested by Ben Widawsky.
v5: Fix EAGAIN mispell noticed by Chris Wilson.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Wed, 4 Jul 2012 20:18:39 +0000 (22:18 +0200)]
drm/i915: don't trylock in the gpu reset code
Simply failing to reset the gpu because someone else might still hold
the mutex isn't a great idea - I see reliable silent reset failures.
And gpu reset simply needs to be reliable and Just Work.
"But ... the deadlocks!"
We already kick all processes waiting for the gpu before launching the
reset work item. New waiters need to check the wedging state anyway
and then bail out. If we have places that can deadlock, we simply need
to fix them.
"But ... testing!"
We have the gpu hangman, and if the current gpu load gem_exec_nop
isn't good enough to hit a specific case, we can add a new one.
"But ... don't we return -EAGAIN for non-interruptible calls to
wait_seqno now?"
Yep, but this problem already exists in the current code. A follow up
patch will remedy this by returning -EIO for non-interruptible sleeps
if the gpu died and the low-level wait bails out with -EAGAIN.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Eugeni Dodonov [Thu, 28 Jun 2012 18:55:35 +0000 (15:55 -0300)]
drm/i915: re-initialize DDI buffer translations after resume
This is necessary for the modesetting to work correctly after a
suspend-resume cycle. Without this, the pipes and clocks got the correct
configuration, but the underlying DDI buffers configuration was lost.
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 3 Jul 2012 18:57:32 +0000 (15:57 -0300)]
drm/i915: get rid of dev_priv->info->has_pch_split
Previously we had has_pch_split to tell us whether we had a PCH or not
and we also had dev_priv->pch_type to tell us which kind of PCH it
was, but it could only be used if we were 100% sure we did have a PCH.
Now that PCH_NONE was added to dev_priv->pch_type we don't need
has_pch_split anymore: we can just check for pch_type != PCH_NONE.
The HAS_PCH_{IBX,CPT,LPT} macros use dev_priv->pch_type, so they can
only be called after intel_detect_pch. The HAS_PCH_SPLIT macro looks
at dev_priv->info->has_pch_split, which is available earlier.
Since the goal is to implement HAS_PCH_SPLIT using dev_priv->pch_type
instead of dev_priv->info->has_pch_split, we need to make sure that
intel_detect_pch is called before any calls to HAS_PCH_SPLIT are made.
So we moved the intel_detect_pch call to an earlier stage.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Paulo Zanoni [Tue, 3 Jul 2012 21:48:16 +0000 (18:48 -0300)]
drm/i915: add PCH_NONE to enum intel_pch
And rely on the fact that it's 0 to assume that machines without a PCH
will have PCH_NONE as dev_priv->pch_type.
Just today I finally realized that HAS_PCH_IBX is true for machines
without a PCH. IMHO this is totally counter-intuitive and I don't
think it's a good idea to assume that we're going to check for
HAS_PCH_IBX only after we check for HAS_PCH_SPLIT.
I believe that in the future we'll have more PCH types and checks
like:
if (HAS_PCH_IBX(dev) || HAS_PCH_CPT(dev))
will become more and more common. There's a good chance that we may
break non-PCH machines by adding these checks in code that runs on all
machines. I also believe that the HAS_PCH_SPLIT check will become less
common as we add more and more different PCH types. We'll probably
start replacing checks like:
if (HAS_PCH_SPLIT(dev))
foo();
else
bar();
with:
if (HAS_PCH_NEW(dev))
baz();
else if (HAS_PCH_OLD(dev) || HAS_PCH_IBX(dev))
foo();
else
bar();
and this may break gen 2/3/4.
As far as we have investigated, this patch will affect the behavior of
intel_hdmi_dpms and intel_dp_link_down on gen 4. In both functions the
code inside the HAS_PCH_IBX check is for IBX-specific workarounds, so
we should be safe. If we start bisecting gen 2/3/4 bugs to this commit
we should consider replacing the HAS_PCH_IBX checks with something
else.
V2: Improve commit message, list possible side effects and solution.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jesse Barnes [Thu, 21 Jun 2012 22:13:50 +0000 (15:13 -0700)]
drm/i915: prefer wide & slow to fast & narrow in DP configs
High frequency link configurations have the potential to cause trouble
with long and/or cheap cables, so prefer slow and wide configurations
instead. This patch has the potential to cause trouble for eDP
configurations that lie about available lanes, so if we run into that we
can make it conditional on eDP.
I've botched up the handling of ironlake_disable_rc6. Fix this up by
calling it at the right place. Note though that ironlake_disable_rc6
does a bit more than just disabling rc6 - it also tears down all the
allocated context objects.
Hence we need to move intel_teardown_rc6 out and directly call it from
intel_modeset_cleanup.
Also properly mark ironlake_enable_rc6 as static and kill the un-used
declaration in i915_drv.h.
Note: In review a question popped out why disable_rc6 also tears down
the backing object and why we should move that out - it's simply for
consistency with gen6+ rps code, which does it that way.
Cc: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Most of the RPS and RC6 enabling functionality is similar to what we had
on Gen6/Gen7, so we preserve most of the registers.
Note that Haswell only has RC6, so account for that as well. As suggested
by Daniel Vetter, to reduce the amount of changes in the patch, we still
write the RC6p/RC6pp thresholds, but those are ignored on Haswell.
Note: Some discussion about the nature of the new tuning constants
popped up in review - the answer is that we don't know why they've
changed, but the guide from VPG with the magic numbers simply has
different values now.
v2: Squash fix for ?: vs | operation precende bug into this patch.
Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Added note to commit message. Squashed fix.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Mon, 2 Jul 2012 14:51:03 +0000 (11:51 -0300)]
drm/i915: Implement w/a for sporadic read failures on waking from rc6
As a w/a to prevent reads sporadically returning 0, we need to wait for
the GT thread to return to TC0 before proceeding to read the registers.
v2: adapt for Haswell changes (Eugeni).
v3: use wait_for_atomic_us for thread status polling.
v3: *really* use wait_for_atomic for polling.
References: https://bugs.freedesktop.org/show_bug.cgi?id=50243 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Mon, 2 Jul 2012 14:51:02 +0000 (11:51 -0300)]
drm/i915: Group the GT routines together in both code and vtable
Tidy up the routines for interacting with the GT (in particular the
forcewake dance) which are scattered throughout the code in a single
structure.
v2: use wait_for_atomic for polling.
v3: *really* use wait_for_atomic for polling.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
causes quite some decent regressions. We can fix this by setting the
CS_STALL bit to ensure that the following seqno write happens only
after the cache flush has completed. But only do that when the caller
actually wants the flush (and not also when we invalidate caches
before starting the next batch).
I've looked through all our ancient scrolls about gen6+ pipe control
workarounds, and this seems to be indeed a legal combination: We're
allowed to set the CS_STALL bit when we flush the render cache (which
we do).
While yelling at this code, also pass back the return value from
intel_emit_post_sync_nonzero_flush properly.
v2: Instead of emitting more pipe controls, set the CS_STALL bit on
the write flush as suggested by Chris Wilson. It seems to work, too.
Cc: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51436
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51429 Tested-by: Lu Hua <huax.lu@intel.com> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Jesper Juhl [Tue, 26 Jun 2012 22:55:37 +0000 (00:55 +0200)]
drm/i915/sprite: Fix mem leak in intel_plane_init()
If we ever hit the default case in the switch statement we'll return
from the function without freeing the memory we just allocated to
'intel_plane' (but that has not been used).
This patch gets rid of the leak by freeing the memory just before we
return.
Daniel Vetter [Mon, 25 Jun 2012 13:58:49 +0000 (15:58 +0200)]
drm/i915: disable drm agp support for !gen3 with kms enabled
This is the quick&dirty way Dave Airlie suggested to workaround the
midlayer drm agp brain-damange. Note that i915_probe is only called
when the driver has ksm enabled, so no need to check for that.
We also need to move the intel_agp_enabled check at the right place.
Note that the only thing this does is enforce the correct module load
order (by using a symbol from intel-agp.ko) to ensure that the fake
agp driver is ready before the drm core tries to set up the agp stuff.
v2: Add a comment to explain why gen3 needs all this legacy fake agp
stuff - we've shipped an XvMC library with a kms-enabled ddx that
requires it (but only on gen3).
v3: Make it clear that this is only a gen3 issue in the comment.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Sun, 24 Jun 2012 18:51:36 +0000 (20:51 +0200)]
drm/i915: don't use dev->agp
This single leftover use is due to a patch that went into 3.5 through
-fixes. With the fake agp stuff on demise, at least for gen6+ we can't
use this any more.
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Sun, 24 Jun 2012 14:42:33 +0000 (16:42 +0200)]
drm/i915: make enable/disable_gt_powersave locking consistent
The enable functions grabbed dev->struct_mutex themselves, whereas
the disable functions expected dev->struct_mutex to be held by the
caller. Move the locking out to the (currently only) callsite of
intel_enable_gt_powersave to make this more consistent.
Originally this was prep work for future patches, but I've chased down
a totally wrong alley. Still, I think this is a sensible
clarification.
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Sun, 24 Jun 2012 14:42:32 +0000 (16:42 +0200)]
drm/i915: wrap up gt powersave enabling functions
... instead of calling each one for each generation indiviudally.
Notice that we've already managed to be inconsistent, the resume path
is missing an IS_VLV check. As a nice benefit we can mark all the
platform specific enable/disable functions as static and hide them in
intel_pm.c
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel Vetter [Mon, 25 Jun 2012 17:06:12 +0000 (19:06 +0200)]
Merge tag 'v3.5-rc4' into drm-intel-next-queued
I want to merge the "no more fake agp on gen6+" patches into
drm-intel-next (well, the last pieces). But a patch in 3.5-rc4 also
adds a new use of dev->agp. Hence the backmarge to sort this out, for
otherwise drm-intel-next merged into Linus' tree would conflict in the
relevant code, things would compile but nicely OOPS at driver load :(
Conflicts in this merge are just simple cases of "both branches
changed/added lines at the same place". The only tricky part is to
keep the order correct wrt the unwind code in case of errors in
intel_ringbuffer.c (and the MI_DISPLAY_FLIP #defines in i915_reg.h
together, obviously).
Linus Torvalds [Sun, 24 Jun 2012 18:02:09 +0000 (11:02 -0700)]
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Avi Kivity:
"Fixing a scheduling-while-atomic bug in the ppc code, and a bug which
allowed pci bridges to be assigned to guests."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Book3S HV: Drop locks around call to kvmppc_pin_guest_page
KVM: Fix PCI header check on device assignment
Linus Torvalds [Sun, 24 Jun 2012 18:00:07 +0000 (11:00 -0700)]
Merge tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull InfiniBand/RDMA fixes from Roland Dreier:
- Fixes to new ocrdma driver
- Typo in test in CMA
* tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
RDMA/cma: QP type check on received REQs should be AND not OR
RDMA/ocrdma: Fix off by one in ocrdma_query_gid()
RDMA/ocrdma: Fixed RQ error CQE polling
RDMA/ocrdma: Correct queue SGE calculation
RDMA/ocrdma: Correct reported max queue sizes
RDMA/ocrdma: Fixed GID table for vlan and events
Linus Torvalds [Sun, 24 Jun 2012 17:57:59 +0000 (10:57 -0700)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Nothing very controversial in here. Most of the fixes are for OMAP
this time around, with some orion/kirkwood and a tegra patch mixed in."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: Orion: Fix Virtual/Physical mixup with watchdog
ARM: Kirkwood: clk_register_gate_fn: add fn assignment
ARM: Orion5x - Restore parts of io.h, with rework
ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled
ARM: OMAP2+: mux: fix sparse warning
ARM: OMAP2+: CM: increase the module disable timeout
ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks
ARM: OMAP4: hwmod data: fix 32k sync timer idle modes
ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby
ARM: OMAP: Fix Beagleboard DVI reset gpio
arm/dts: OMAP2: Fix interrupt controller binding
ARM: OMAP2: Fix tusb6010 GPIO interrupt for n8x0
ARM: OMAP2+: Fix MUSB ifdefs for platform init code
ARM: tegra: make tegra_cpu_reset_handler_enable() __init
ARM: OMAP: PM: Lock clocks list while generating summary
ARM: iconnect: Remove include of removed linux/spi/orion_spi.h
Linus Torvalds [Sun, 24 Jun 2012 17:57:15 +0000 (10:57 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Nothing major in here, one radeon SI fix for tiling, and one uninit
var fix, two minor header file fixes."
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm: drop comment about this header being autogenerated.
drm/edid: don't return stack garbage from supports_rb
vga_switcheroo: Add include guard
drm/radeon: SI tiling fixes for display
Andrew Lunn [Fri, 22 Jun 2012 06:54:02 +0000 (08:54 +0200)]
ARM: Orion: Fix Virtual/Physical mixup with watchdog
The orion watchdog is expecting to be passed the physcial address of
the hardware, and will ioremap() it to give a virtual address it will
use as the base address for the hardware. However, when creating the
platform resource record, a virtual address was being used.
Add the necassary #define's so we can pass the physical address as
expected.
Tested on Kirkwood and Orion5x.
Cc: stable <stable@vger.kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Olof Johansson <olof@lixom.net>
In commit: 98d9986 ARM: Kirkwood: Replace clock gating
the kirkwood clock gating has been reworked. A custom variant of
clock gating, that calls a custom function before gating the clock
off, has been introduced. However in clk_register_gate_fn() this
custom function "fn" is never assigned.
This patch adds the missing fn assignment.
Cc: stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org> Tested-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Olof Johansson <olof@lixom.net>
Andrew Lunn [Fri, 22 Jun 2012 18:57:57 +0000 (20:57 +0200)]
ARM: Orion5x - Restore parts of io.h, with rework
Commit 4d5fc58dbe34b78157c05b319669bb3e064ba8bd (ARM: remove bunch of
now unused mach/io.h files) removed the orion5x io.h. Unfortunately,
this is still needed for the definition of IO_SPACE_LIMIT which
overrides the default 64K. All Orion based systems have 1Mbyte of IO
space per PCI[e] bus, and try to request_resource() this size. Orion5x
has two such PCI buses.
It is likely that the original, removed version, was broken. This
version might be less broken. However, it has not been tested on
hardware with a PCI card, let alone hardware with a PCI card with IO
capabilities.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Sat, 23 Jun 2012 23:16:29 +0000 (16:16 -0700)]
Merge tag 'omap-fixes-a-for-3.5rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes
From Paul Walmsley (as per Tony Lindgren's request):
"Some uncontroversial OMAP clock, hwmod, and compiler warning fixes for 3.5-rc"
* tag 'omap-fixes-a-for-3.5rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending:
ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled
ARM: OMAP2+: mux: fix sparse warning
ARM: OMAP2+: CM: increase the module disable timeout
ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks
ARM: OMAP4: hwmod data: fix 32k sync timer idle modes
ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby
ARM: OMAP: PM: Lock clocks list while generating summary
Olof Johansson [Sat, 23 Jun 2012 23:11:50 +0000 (16:11 -0700)]
Merge tag 'omap-fixes-for-v3.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
"Here are a few fixes with the biggest one being fix for Beagle DVI
reset. All of them are regression fixes, except for the missing omap2
interrupt controller binding that somehow got missed earlier."
* tag 'omap-fixes-for-v3.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP: Fix Beagleboard DVI reset gpio
arm/dts: OMAP2: Fix interrupt controller binding
ARM: OMAP2: Fix tusb6010 GPIO interrupt for n8x0
ARM: OMAP2+: Fix MUSB ifdefs for platform init code
Linus Torvalds [Sat, 23 Jun 2012 00:47:08 +0000 (17:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"There are a couple of fixes from Yan for bad pointer dereferences in
the messenger code and when fiddling with page->private after page
migration, a fix from Alex for a use-after-free in the osd client
code, and a couple fixes for the message refcounting and shutdown
ordering."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: flush msgr queue during mon_client shutdown
rbd: Clear ceph_msg->bio_iter for retransmitted message
libceph: use con get/put ops from osd_client
libceph: osd_client: don't drop reply reference too early
ceph: check PG_Private flag before accessing page->private
Linus Torvalds [Fri, 22 Jun 2012 18:07:55 +0000 (11:07 -0700)]
Merge tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs
Pull XFS fixes from Ben Myers:
- Fix stale data exposure with unwritten extents
- Fix a warning in xfs_alloc_vextent with ODEBUG
- Fix overallocation and alignment of pages for xfs_bufs
- Fix a cursor leak
- Fix a log hang
- Fix a crash related to xfs_sync_worker
- Rename xfs log structure from struct log to struct xlog so we can use
crash dumps effectively
* tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs:
xfs: rename log structure to xlog
xfs: shutdown xfs_sync_worker before the log
xfs: Fix overallocation in xfs_buf_allocate_memory()
xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
xfs: check for stale inode before acquiring iflock on push
xfs: fix debug_object WARN at xfs_alloc_vextent()
xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
Linus Torvalds [Fri, 22 Jun 2012 17:58:57 +0000 (10:58 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar.
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ftrace: Make all inline tags also include notrace
perf: Use css_tryget() to avoid propping up css refcount
perf tools: Fix synthesizing tracepoint names from the perf.data headers
perf stat: Fix default output file
perf tools: Fix endianity swapping for adds_features bitmask
Ricardo Neri [Thu, 21 Jun 2012 08:08:53 +0000 (10:08 +0200)]
ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled
As per the OMAP4 documentation, audio over HDMI must be transmitted in
no-idle mode. This patch adds the HWMOD_SWSUP_SIDLE so that omap_hwmod uses
no-idle/force-idle settings instead of smart-idle mode.
This is required as the DSS interface clock is used as functional clock
for the HDMI wrapper audio FIFO. If no-idle mode is not used, audio could
be choppy, have bad quality or not be audible at all.
Signed-off-by: Ricardo Neri <ricardo.neri@ti.com>
[b-cousson@ti.com: Update the subject and align the .flags
location with the script template] Signed-off-by: Benoit Cousson <b-cousson@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com>
Paul Walmsley [Sun, 17 Jun 2012 17:57:53 +0000 (11:57 -0600)]
ARM: OMAP2+: CM: increase the module disable timeout
Increase the timeout for disabling an IP block to five milliseconds.
This is to handle the usb_host_fs idle latency, which takes almost
four milliseconds after a host controller reset.
This is the second of two patches needed to resolve the following
boot warning:
Paul Walmsley [Sun, 17 Jun 2012 17:57:52 +0000 (11:57 -0600)]
ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks
Until the OMAP4 code is converted to disable the use of the clock
framework-based clockdomain enable/disable sequence, any clock used as
a hwmod main_clk must have a clockdomain associated with it. This
patch populates some clock structure clockdomain names to resolve the
following warnings during kernel init:
omap_hwmod: dpll_mpu_m2_ck: missing clockdomain for dpll_mpu_m2_ck.
omap_hwmod: trace_clk_div_ck: missing clockdomain for trace_clk_div_ck.
omap_hwmod: l3_div_ck: missing clockdomain for l3_div_ck.
omap_hwmod: ddrphy_ck: missing clockdomain for ddrphy_ck.
The 32k sync timer IP block target idle modes in the hwmod data are
incorrect. The IP block does not support any smart-idle modes.
Update the data to reflect the correct modes.
This problem was initially identified and a diff fragment posted to
the lists by Benoît Cousson <b-cousson@ti.com>. A patch description
bug in the first version was also identified by Benoît.
Signed-off-by: Paul Walmsley <paul@pwsan.com> Cc: Benoît Cousson <b-cousson@ti.com> Cc: Tero Kristo <t-kristo@ti.com>
Djamil Elaidi [Sun, 17 Jun 2012 17:57:51 +0000 (11:57 -0600)]
ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby
If an IP is configured in Smart-Standby-Wakeup, when disabling wakeup feature the
IP will not go back to Smart-Standby, but will remain in Smart-Standby-Wakeup.
Signed-off-by: Djamil Elaidi <d-elaidi@ti.com> Signed-off-by: Paul Walmsley <paul@pwsan.com>
Linus Torvalds [Thu, 21 Jun 2012 23:05:43 +0000 (16:05 -0700)]
Merge tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix a write hang due to an uninitalised variable when
!defined(CONFIG_NFS_V4)
- Address upcall races in the legacy NFSv4 idmapper
- Remove an O_DIRECT refcounting issue
- Fix a pNFS refcounting bug when the file layout metadata server is
also acting as a data server
- Fix a pNFS module loading race.
* tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Force the legacy idmapper to be single threaded
NFS: Initialise commit_info.rpc_out when !defined(CONFIG_NFS_V4)
NFS: Fix a refcounting issue in O_DIRECT
NFSv4.1: Fix a race in set_pnfs_layoutdriver
NFSv4.1: Fix umount when filelayout DS is also the MDS
Linus Torvalds [Thu, 21 Jun 2012 20:41:07 +0000 (13:41 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This is a small pull with btrfs fixes. The biggest of the bunch is
another fix for the new backref walking code.
We're still hammering out one btrfs dio vs buffered reads problem, but
that one will have to wait for the next rc."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: delay iput with async extents
Btrfs: add a missing spin_lock
Btrfs: don't assume to be on the correct extent in add_all_parents
Btrfs: introduce btrfs_next_old_item
Linus Torvalds [Thu, 21 Jun 2012 20:40:40 +0000 (13:40 -0700)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"Two minor fixes in emc2103 and applesmc drivers."
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (emc2103) Fix use of an uninitilized variable in error case
hwmon: (applesmc) Limit key length in warning messages
Ben Myers [Fri, 25 May 2012 20:45:36 +0000 (15:45 -0500)]
xfs: shutdown xfs_sync_worker before the log
Revert commit 1307bbd, which uses the s_umount semaphore to provide
exclusion between xfs_sync_worker and unmount, in favor of shutting down
the sync worker before freeing the log in xfs_log_unmount. This is a
cleaner way of resolving the race between xfs_sync_worker and unmount
than using s_umount.
Signed-off-by: Ben Myers <bpm@sgi.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>
Jan Kara [Tue, 5 Jun 2012 22:32:26 +0000 (00:32 +0200)]
xfs: Fix overallocation in xfs_buf_allocate_memory()
Commit de1cbee which removed b_file_offset in favor of b_bn introduced a bug
causing xfs_buf_allocate_memory() to overestimate the number of necessary
pages. The problem is that xfs_buf_alloc() sets b_bn to -1 and thus effectively
every buffer is straddling a page boundary which causes
xfs_buf_allocate_memory() to allocate two pages and use vmalloc() for access
which is unnecessary.
Dave says xfs_buf_alloc() doesn't need to set b_bn to -1 anymore since the
buffer is inserted into the cache only after being fully initialized now.
So just make xfs_buf_alloc() fill in proper block number from the beginning.
CC: David Chinner <dchinner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Dave Chinner [Tue, 12 Jun 2012 04:20:26 +0000 (14:20 +1000)]
xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
When we fail to find an matching extent near the requested extent
specification during a left-right distance search in
xfs_alloc_ag_vextent_near, we fail to free the original cursor that
we used to look up the XFS_BTNUM_CNT tree and hence leak it.
Reported-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ben Myers <bpm@sgi.com>
Brian Foster [Mon, 11 Jun 2012 14:39:43 +0000 (10:39 -0400)]
xfs: check for stale inode before acquiring iflock on push
An inode in the AIL can be flush locked and marked stale if
a cluster free transaction occurs at the right time. The
inode item is then marked as flushing, which causes xfsaild
to spin and leaves the filesystem stalled. This is
reproduced by running xfstests 273 in a loop for an
extended period of time.
Check for stale inodes before the flush lock. This marks
the inode as pinned, leads to a log flush and allows the
filesystem to proceed.
Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
Josef Bacik [Fri, 15 Jun 2012 18:19:48 +0000 (12:19 -0600)]
Btrfs: delay iput with async extents
There is some concern that these iput()'s could be the final iputs and could
induce lockups on people waiting on writeback. This would happen in the
rare case that we don't create ordered extents because of an error, but it
is theoretically possible and we already have a mechanism to deal with this
so just make them delayed iputs to negate any worry.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Josef Bacik [Mon, 18 Jun 2012 13:23:18 +0000 (07:23 -0600)]
Btrfs: add a missing spin_lock
When fixing up the locking in the delayed ref destruction work I accidently
broke the locking myself ;(. Add back a spin_lock that should be there and
we are now all set. Thanks,
Btrfs: add a missing spin_lock
When fixing up the locking in the delayed ref destruction work I accidently
broke the locking myself ;(. Add back a spin_lock that should be there and
we are now all set. Thanks,
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Alexander Block [Tue, 19 Jun 2012 13:42:26 +0000 (07:42 -0600)]
Btrfs: don't assume to be on the correct extent in add_all_parents
add_all_parents did assume that path is already at a correct extent data
item, which may not be true in case of data extents that were partly
rewritten and splitted.
We need to check if we're on a matching extent for every item and only
for the ones after the first. The loop is changed to do this now.
This patch also fixes a bug introduced with commit 3b127fd8 "Btrfs:
remove obsolete btrfs_next_leaf call from __resolve_indirect_ref".
The removal of next_leaf did sometimes result in slot==nritems when
the above described case happens, and thus resulting in invalid values
(e.g. wanted_obejctid) in add_all_parents (leading to missed backrefs
or even crashes).
Signed-off-by: Alexander Block <ablock84@googlemail.com> Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
Daniel Vetter [Tue, 19 Jun 2012 09:33:06 +0000 (11:33 +0200)]
drm/edid: don't return stack garbage from supports_rb
We need to initialize this to false, because the is_rb callback only
ever sets it to true.
Noticed while reading through the code.
Cc: stable@vger.kernel.org Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Thu, 21 Jun 2012 05:12:52 +0000 (22:12 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"A few fixes in pl330 and imx-sdma drivers."
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
DMA: PL330: Fix racy mutex unlock
DMA: PL330: Add missing static storage class specifier
dma: imx-sdma: buf_tail should be initialize in prepare function
dmaengine: pl330: dont complete descriptor for cyclic dma
Linus Torvalds [Thu, 21 Jun 2012 05:11:04 +0000 (22:11 -0700)]
Merge branch 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull two cgroup fixes from Tejun Heo:
"This containes two patches fixing a refcnt race bug during css_put().
Decrementing and checking the value weren't atomic and two tasks could
think that they both pushed the counter to zero."
* 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroups: Account for CSS_DEACT_BIAS in __css_put
cgroup: make sure that decisions in __css_put are atomic
David Rientjes [Thu, 21 Jun 2012 01:00:12 +0000 (18:00 -0700)]
mm, mempolicy: fix mbind() to do synchronous migration
If the range passed to mbind() is not allocated on nodes set in the
nodemask, it migrates the pages to respect the constraint.
The final formal of migrate_pages() is a mode of type enum migrate_mode,
not a boolean. do_mbind() is currently passing "true" which is the
equivalent of MIGRATE_SYNC_LIGHT. This should instead be MIGRATE_SYNC
for synchronous page migration.
Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Thu, 21 Jun 2012 01:30:30 +0000 (18:30 -0700)]
media: pms.c needs linux/slab.h
drivers/media/video/pms.c uses kzalloc() and kfree() so it should
include <linux/slab.h> to fix build errors and a warning.
drivers/media/video/pms.c:1047:2: error: implicit declaration of function 'kzalloc'
drivers/media/video/pms.c:1047:6: warning: assignment makes pointer from integer without a cast
drivers/media/video/pms.c:1116:2: error: implicit declaration of function 'kfree'
Found in mmotm but applies to mainline.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Cc: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 20 Jun 2012 22:15:03 +0000 (15:15 -0700)]
Merge tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging tree fixes from Greg Kroah-Hartman:
"Here are a number of small fixes for the drivers/staging tree, as well
as iio and pstore drivers (which came from the staging tree in the
3.5-rc1 merge). All of these are tiny, but resolve issues that people
have been reporting.
There's also a documentation update to reflect what the iio drivers
really are doing, which is good to get straightened out.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8712u: Add new USB IDs
staging: gdm72xx: Release netlink socket properly
iio: drop wrong reference from Kconfig
pstore/inode: Make pstore_fill_super() static
pstore/ram: Should zap persistent zone on unlink
pstore/ram_core: Factor persistent_ram_zap() out of post_init()
pstore/ram_core: Do not reset restored zone's position and size
pstore/ram: Should update old dmesg buffer before reading
staging:iio:ad7298: Fix linker error due to missing IIO kfifo buffer
Revert "staging: usbip: bugfix for stack corruption on 64-bit architectures"
staging: usbip: bugfix for stack corruption on 64-bit architectures
staging/comedi: fix build for USB not enabled
staging: omapdrm: fix crash when freeing bad fb
staging:iio:ad7606: Re-add missing scale attribute
iio: Fix potential use after free
staging:iio: remove num_interrupt_lines from documentation
iio: documentation: Add out_altvoltage and friends
Linus Torvalds [Wed, 20 Jun 2012 22:14:28 +0000 (15:14 -0700)]
Merge tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and printk fixes from Greg Kroah-Hartman:
"Here are some fixes for 3.5-rc4 that resolve the kmsg problems that
people have reported showing up after the printk and kmsg changes went
into 3.5-rc1. There are also a smattering of other tiny fixes for the
extcon and hyper-v drivers that people have reported.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
extcon: max8997: Add missing kfree for info->edev in max8997_muic_remove()
extcon: Set platform drvdata in gpio_extcon_probe() and fix irq leak
extcon: Fix wrong index in max8997_extcon_cable[]
kmsg - kmsg_dump() fix CONFIG_PRINTK=n compilation
printk: return -EINVAL if the message len is bigger than the buf size
printk: use mutex lock to stop syslog_seq from going wild
kmsg - kmsg_dump() use iterator to receive log buffer content
vme: change maintainer e-mail address
Extcon: Don't try to create duplicate link names
driver core: fixup reversed deferred probe order
printk: Fix alignment of buf causing crash on ARM EABI
Tools: hv: verify origin of netlink connector message
Linus Torvalds [Wed, 20 Jun 2012 22:13:56 +0000 (15:13 -0700)]
Merge tag 'char-misc-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull misc tree updates from Greg Kroah-Hartman:
"Here are some drivers/misc bugfixes (really just drivers/misc/mei/
fixes) for a few problems that have been reported.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'char-misc-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: mei: set WDIOF_ALARMONLY on mei watchdog
misc: mei: Disable MSI when IRQ registration fails
misc: mei: fix stalled read
misc: mei: unregister misc device in pci_remove function
misc: mei: set IRQF_ONESHOT for msi request_threaded_irq
Linus Torvalds [Wed, 20 Jun 2012 22:13:13 +0000 (15:13 -0700)]
Merge tag 'tty-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull serial driver fixes from Greg Kroah-Hartman:
"Here are 3 patches resolving a boot regression (the mop500 fix), a
build warning fix, and a kernel-doc fix. All tiny, but should go into
the final 3.5 release.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial/amba-pl011: move custom pin control to driver
serial: fix serial_txx9.c build warning/typo
serial: fix kernel-doc warnings in 8250.c
Linus Torvalds [Wed, 20 Jun 2012 21:41:57 +0000 (14:41 -0700)]
Merge branch 'akpm' (Andrew's patch-bomb)
* emailed from Andrew Morton <akpm@linux-foundation.org>: (21 patches)
mm/memblock: fix overlapping allocation when doubling reserved array
c/r: prctl: Move PR_GET_TID_ADDRESS to a proper place
pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper
pidns: guarantee that the pidns init will be the last pidns process reaped
fault-inject: avoid call to random32() if fault injection is disabled
Viresh has moved
get_maintainer: Fix --help warning
mm/memory.c: fix kernel-doc warnings
mm: fix kernel-doc warnings
mm: correctly synchronize rss-counters at exit/exec
mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range
h8300: use the declarations provided by <asm/sections.h>
h8300: fix use of extinct _sbss and _ebss
xtensa: use the declarations provided by <asm/sections.h>
xtensa: use "test -e" instead of bashism "test -a"
xtensa: replace xtensa-specific _f{data,text} by _s{data,text}
memcg: fix use_hierarchy css_is_ancestor oops regression
mm, oom: fix and cleanup oom score calculations
nilfs2: ensure proper cache clearing for gc-inodes
thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE
...
Greg Pearson [Wed, 20 Jun 2012 19:53:05 +0000 (12:53 -0700)]
mm/memblock: fix overlapping allocation when doubling reserved array
__alloc_memory_core_early() asks memblock for a range of memory then try
to reserve it. If the reserved region array lacks space for the new
range, memblock_double_array() is called to allocate more space for the
array. If memblock is used to allocate memory for the new array it can
end up using a range that overlaps with the range originally allocated in
__alloc_memory_core_early(), leading to possible data corruption.
With this patch memblock_double_array() now calls memblock_find_in_range()
with a narrowed candidate range (in cases where the reserved.regions array
is being doubled) so any memory allocated will not overlap with the
original range that was being reserved. The range is narrowed by passing
in the starting address and size of the previously allocated range. Then
the range above the ending address is searched and if a candidate is not
found, the range below the starting address is searched.
Signed-off-by: Greg Pearson <greg.pearson@hp.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Wed, 20 Jun 2012 19:53:04 +0000 (12:53 -0700)]
pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper
find_new_reaper() changes pid_ns->child_reaper, see add0d4df ("pid_ns:
zap_pid_ns_processes: fix the ->child_reaper changing").
The original reason has gone away after the previous patch, ->children
list must be empty after zap_pid_ns_processes().
However now we can not switch to init_pid_ns.child_reaper.
__unhash_process() relies on the "->child_reaper == parent" check, but
this check does not work if the last exiting task is also the child
reaper.
As Eric sugested, we can change __unhash_process() to use the parent's
pid_ns and remove this code.
Also, with this change we can move detach_pid(PIDTYPE_PID) back, where it
was before the previous fix.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Louis Rilling <louis.rilling@kerlabs.com> Cc: Mike Galbraith <efault@gmx.de> Acked-by: Pavel Emelyanov <xemul@parallels.com> Tested-by: Andrew Wagin <avagin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pidns: guarantee that the pidns init will be the last pidns process reaped
Today we have a twofold bug. Sometimes release_task on pid == 1 in a pid
namespace can run before other processes in a pid namespace have had
release task called. With the result that pid_ns_release_proc can be
called before the last proc_flus_task() is done using upid->ns->proc_mnt,
resulting in the use of a stale pointer. This same set of circumstances
can lead to waitpid(...) returning for a processes started with
clone(CLONE_NEWPID) before the every process in the pid namespace has
actually exited.
To fix this modify zap_pid_ns_processess wait until all other processes in
the pid namespace have exited, even EXIT_DEAD zombies.
The delay_group_leader and related tests ensure that the thread gruop
leader will be the last thread of a process group to be reaped, or to
become EXIT_DEAD and self reap. With the change to zap_pid_ns_processes
we get the guarantee that pid == 1 in a pid namespace will be the last
task that release_task is called on.
With pid == 1 being the last task to pass through release_task
pid_ns_release_proc can no longer be called too early nor can wait return
before all of the EXIT_DEAD tasks in a pid namespace have exited.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Louis Rilling <louis.rilling@kerlabs.com> Cc: Mike Galbraith <efault@gmx.de> Acked-by: Pavel Emelyanov <xemul@parallels.com> Tested-by: Andrew Wagin <avagin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Wed, 20 Jun 2012 19:53:02 +0000 (12:53 -0700)]
get_maintainer: Fix --help warning
Using --help emits a concatenation error. Fix it.
Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Paul Bolle <pebolle@tiscali.nl> Tested-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Wed, 20 Jun 2012 19:53:02 +0000 (12:53 -0700)]
mm/memory.c: fix kernel-doc warnings
Fix kernel-doc warnings in mm/memory.c:
Warning(mm/memory.c:1377): No description found for parameter 'start'
Warning(mm/memory.c:1377): Excess function parameter 'address' description in 'zap_page_range'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wanpeng Li [Wed, 20 Jun 2012 19:53:01 +0000 (12:53 -0700)]
mm: fix kernel-doc warnings
Fix kernel-doc warnings such as
Warning(../mm/page_cgroup.c:432): No description found for parameter 'id'
Warning(../mm/page_cgroup.c:432): Excess function parameter 'mem' description in 'swap_cgroup_record'
Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: correctly synchronize rss-counters at exit/exec
do_exit() and exec_mmap() call sync_mm_rss() before mm_release() does
put_user(clear_child_tid) which can update task->rss_stat and thus make
mm->rss_stat inconsistent. This triggers the "BUG:" printk in check_mm().
Let's fix this bug in the safest way, and optimize/cleanup this later.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 20 Jun 2012 19:53:00 +0000 (12:53 -0700)]
mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range
Andrea asked for addr, end, vma->vm_start, and vma->vm_end to be emitted
when !rwsem_is_locked(&tlb->mm->mmap_sem). Otherwise, debugging the
underlying issue is more difficult.
Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
h8300: use the declarations provided by <asm/sections.h>
Cleanups:
- Include <asm/sections.h>,
- Remove the (different) extern declarations,
- Remove the no longer needed address-of ('&') operators,
- Remove the superfluous casts, use proper printk formatting instead.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xtensa: use the declarations provided by <asm/sections.h>
Cleanups:
- Include <asm/sections.h>,
- Remove the (different) extern declarations,
- Remove the no longer needed address-of ('&') operators,
- Use %p to format pointer differences.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xtensa: replace xtensa-specific _f{data,text} by _s{data,text}
commit a2d063ac216c161 ("extable, core_kernel_data(): Make sure all archs
define _sdata") missed xtensa. Xtensa does have a start of data marker,
but calls it _fdata, causing
kernel/built-in.o:(.text+0x964): undefined reference to `_sdata'
_stext was already defined, but it was duplicated by _fdata.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If use_hierarchy is set, reclaim testing soon oopses in css_is_ancestor()
called from __mem_cgroup_same_or_subtree() called from page_referenced():
when processes are exiting, it's easy for mm_match_cgroup() to pass along
a NULL memcg coming from a NULL mm->owner.
Check for that in __mem_cgroup_same_or_subtree(). Return true or false?
False because we cannot know if it was in the hierarchy, but also false
because it's better not to count a reference from an exiting process.
Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Wed, 20 Jun 2012 19:52:58 +0000 (12:52 -0700)]
mm, oom: fix and cleanup oom score calculations
The divide in p->signal->oom_score_adj * totalpages / 1000 within
oom_badness() was causing an overflow of the signed long data type.
This adds both the root bias and p->signal->oom_score_adj before doing the
normalization which fixes the issue and also cleans up the calculation.
Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>