git.proxmox.com Git - mirror_ubuntu-bionic-kernel.git/log

md/raid10: handle merge_bvec_fn in member devices.

Currently we don't honour merge_bvec_fn in member devices so if there
is one, we force all requests to be single-page at most.
This is not ideal.

So enhance the raid10 merge_bvec_fn to check that function in children
as well.

This introduces a small problem.  There is no locking around calls
the ->merge_bvec_fn and subsequent calls to ->make_request.  So a
device added between these could end up getting a request which
violates its merge_bvec_fn.

Currently the best we can do is synchronize_sched().  This will work
providing no preemption happens.  If there is preemption, we just
have to hope that new devices are largely consistent with old devices.

Signed-off-by: NeilBrown <neilb@suse.de>

md: add proper merge_bvec handling to RAID0 and Linear.

These personalities currently set a max request size of one page
when any member device has a merge_bvec_fn because they don't
bother to call that function.

This causes extra works in splitting and combining requests.

So make the extra effort to call the merge_bvec_fn when it exists
so that we end up with larger requests out the bottom.

Signed-off-by: NeilBrown <neilb@suse.de>

md: tidy up rdev_for_each usage.

md.h has an 'rdev_for_each()' macro for iterating the rdevs in an
mddev.  However it uses the 'safe' version of list_for_each_entry,
and so requires the extra variable, but doesn't include 'safe' in the
name, which is useful documentation.

Consequently some places use this safe version without needing it, and
many use an explicity list_for_each entry.

So:
- rename rdev_for_each to rdev_for_each_safe
- create a new rdev_for_each which uses the plain
   list_for_each_entry,
- use the 'safe' version only where needed, and convert all other
   list_for_each_entry calls to use rdev_for_each.

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid1,raid10: avoid deadlock during resync/recovery.

If RAID1 or RAID10 is used under LVM or some other stacking
block device, it is possible to enter a deadlock during
resync or recovery.
This can happen if the upper level block device creates
two requests to the RAID1 or RAID10. The first request gets
processed, blocks recovery and queue requests for underlying
requests in current->bio_list. A resync request then starts
which will wait for those requests and block new IO.

But then the second request to the RAID1/10 will be attempted
and it cannot progress until the resync request completes,
which cannot progress until the underlying device requests complete,
which are on a queue behind that second request.

So allow that second request to proceed even though there is
a resync request about to start.

This is suitable for any -stable kernel.

Cc: stable@vger.kernel.org
Reported-by: Ray Morris <support@bettercgi.com>
Tested-by: Ray Morris <support@bettercgi.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/bitmap: ensure to load bitmap when creating via sysfs.

When commit 69e51b449d383e (md/bitmap: separate out loading a bitmap...)
created bitmap_load, it missed calling it after bitmap_create when a
bitmap is created through the sysfs interface.
So if a bitmap is added this way, we don't allocate memory properly
and can crash.

This is suitable for any -stable release since 2.6.35.
Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

md: don't set md arrays to readonly on shutdown.

It seems that with recent kernel, writeback can still be happening
while shutdown is happening, and consequently data can be written
after the md reboot notifier switches all arrays to read-only.
This causes a BUG.

So don't switch them to read-only - just mark them clean and
set 'safemode' to '2' which mean that immediately after any
write the array will be switch back to 'clean'.

This could result in the shutdown happening when array is marked
dirty, thus forcing a resync on reboot. However if you reboot
without performing a "sync" first, you get to keep both halves.

This is suitable for any stable kernel (though there might be some
conflicts with obvious fixes in earlier kernels).

Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

md: allow re-add to failed arrays.

When an array is failed (some data inaccessible) then there is no
point attempting to add a spare as it could not possibly be recovered.

However that may be value in re-adding a recently removed device.
e.g. if there is a write-intent-bitmap and it is clear, then access
to the data could be restored by this action.

So don't reject a re-add to a failed array for RAID10 and RAID5 (the
only arrays types that check for a failed array).

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: use atomic_dec_return() instead of atomic_dec() and atomic_read().

Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md: Use existed macros instead of numbers

Signed-off-by: majianpeng <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: removed unused 'added_devices' variable.

commit 908f4fbd265733 removed the last user of this variable,
so we should discard it completely.

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid10: remove unnecessary smp_mb() from end_sync_write

Recent commit 4ca40c2ce099e4f1ce3 (md/raid10: Allow replacement device ...)
added an smp_mb in end_sync_write.
This was to close a possible race with raid10_remove_disk.
However there is no such race as it is never attempted to remove a
disk while resync (or recovery) is happening.
so the smp_mb is just noise.

Signed-off-by: NeilBrown <neilb@suse.de>

md/raid5: make sure reshape_position is cleared on error path.

Leaving a valid reshape_position value in place could be confusing.

Signed-off-by: NeilBrown <neilb@suse.de>

Linux 3.3-rc7

aio: fix the "too late munmap()" race

Current code has put_ioctx() called asynchronously from aio_fput_routine();
that's done *after* we have killed the request that used to pin ioctx,
so there's nothing to stop io_destroy() waiting in wait_for_all_aios()
from progressing. As the result, we can end up with async call of
put_ioctx() being the last one and possibly happening during exit_mmap()
or elf_core_dump(), neither of which expects stray munmap() being done
to them...

We do need to prevent _freeing_ ioctx until aio_fput_routine() is done
with that, but that's all we care about - neither io_destroy() nor
exit_aio() will progress past wait_for_all_aios() until aio_fput_routine()
does really_put_req(), so the ioctx teardown won't be done until then
and we don't care about the contents of ioctx past that point.

Since actual freeing of these suckers is RCU-delayed, we don't need to
bump ioctx refcount when request goes into list for async removal.
All we need is rcu_read_lock held just over the ->ctx_lock-protected
area in aio_fput_routine().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

aio: fix io_setup/io_destroy race

Have ioctx_alloc() return an extra reference, so that caller would drop it
on success and not bother with re-grabbing it on failure exit. The current
code is obviously broken - io_destroy() from another thread that managed
to guess the address io_setup() would've returned would free ioctx right
under us; gets especially interesting if aio_context_t * we pass to
io_setup() points to PROT_READ mapping, so put_user() fails and we end
up doing io_destroy() on kioctx another thread has just got freed...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs updates from Chris Mason:
"I have two additional and btrfs fixes in my for-linus branch.  One is
  a casting error that leads to memory corruption on i386 during scrub,
  and the other fixes a corner case in the backref walking code (also
  triggered by scrub)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix casting error in scrub reada code
  btrfs: fix locking issues in find_parent_nodes()

memcg: revert fix to mapcount check for this release

Respectfully revert commit e6ca7b89dc76 "memcg: fix mapcount check
in move charge code for anonymous page" for the 3.3 release, so that
it behaves exactly like releases 2.6.35 through 3.2 in this respect.

Horiguchi-san's commit is correct in itself, 1 makes much more sense
than 2 in that check; but it does not go far enough - swapcount
should be considered too - if we really want such a check at all.

We appear to have reached agreement now, and expect that 3.4 will
remove the mapcount check, but had better not make 3.3 different.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86: Derandom delay_tsc for 64 bit

Commit f0fbf0abc093 ("x86: integrate delay functions") converted
delay_tsc() into a random delay generator for 64 bit.  The reason is
that it merged the mostly identical versions of delay_32.c and
delay_64.c.  Though the subtle difference of the result was:

static void delay_tsc(unsigned long loops)
{
- unsigned bclock, now;
+ unsigned long bclock, now;

Now the function uses rdtscl() which returns the lower 32bit of the
TSC. On 32bit that's not problematic as unsigned long is 32bit. On 64
bit this fails when the lower 32bit are close to wrap around when
bclock is read, because the following check

       if ((now - bclock) >= loops)
           break;

evaluated to true on 64bit for e.g. bclock = 0xffffffff and now = 0
because the unsigned long (now - bclock) of these values results in
0xffffffff00000001 which is definitely larger than the loops
value. That explains Tvortkos observation:

"Because I am seeing udelay(500) (_occasionally_) being short, and
that by delaying for some duration between 0us (yep) and 491us."

Make those variables explicitely u32 again, so this works for both 32
and 64 bit.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # >= 2.6.27
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Nothing exciting here: just a few regression fixes for HD-audio and
  ASoC, also the support of missing 32bit compat ioctl for HDSPM."

* tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hdspm - Provide ioctl_compat
  ALSA: hda/realtek - Apply the coef-setup only to ALC269VB
  ALSA: hda - add quirk to detect CD input on Gigabyte EP45-DS3
  ASoC: neo1973: fix neo1973 wm8753 initialization

MAINTAINERS: new git entry for arm/mach-msm

The msm git tree moved to

git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm.git

Signed-off-by: David Brown <davidb@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull C6X fix from Mark Salter:
"Fix for C6X KSTK_EIP and KSTK_ESP macros."

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
C6X: fix KSTK_EIP and KSTK_ESP macros

Merge tag 'iommu-fixes-v3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull two IOMMU fixes from Joerg Roedel:
"The first is an additional fix for the OMAP initialization order issue
  and the second patch fixes a possible section mismatch which can lead
  to a kernel crash in the AMD IOMMU driver when suspend/resume is used
  and the compiler has not inlined the iommu_set_device_table function."

* tag 'iommu-fixes-v3.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  x86/amd: iommu_set_device_table() must not be __init
  ARM: OMAP: fix iommu, not mailbox

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull radeon drm stuff from Dave Airlie:
"Just some radeon fixes, one is for an oops where we run out of ioremap
  space on some big hardware systems in 32-bit mode, stuff doesn't work
  properly but at least the machine will boot.

  One regression fix, and two bugs, one hw, one blit code."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon/kms: fix hdmi duallink checks
  drm/radeon/kms: set SX_MISC in the r6xx blit code (v2)
  drm/radeon: deal with errors from framebuffer init path.
  drm/radeon: fix a semaphore deadlock on pre cayman asics

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking from David Miller:

1) IPV4 routing metrics can become stale when routes are changed by the
   administrator, fix from Steffen Klassert.

2) atl1c does "val |= XXX;" where XXX is a bit number not a bit mask,
   fix by using set_bit.  From Dan Carpenter.

3) Memory accounting bug in carl9170 driver results in wedged TX queue.
   Fix from Nicolas Cavallari.

4) iwlwifi accidently uses "sizeof(ptr)" instead of "sizeof(*ptr)", fix
   from Johannes Berg.

5) Openvswitch doesn't honor dp_ifindex when doing vport lookups, fix
   from Ben Pfaff.

6) ehea conversion to 64-bit stats lost multicast and rx_errors
   accounting, fix from Eric Dumazet.

7) Bridge state transition logging in br_stp_disable_port() is busted,
   it's emitted at the wrong time and the message is in the wrong tense,
   fix from Paulius Zaleckas.

8) mlx4 device erroneously invokes the queue resize firmware operation
   twice, fix from Jack Morgenstein.

9) Fix deadlock in usbnet, need to drop lock when invoking usb_unlink_urb()
   otherwise we recurse into taking it again.  Fix from Sebastian Siewior.

10) hyperv network driver uses the wrong driver name string, fix from
    Haiyang Zhang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net/hyperv: Use the built-in macro KBUILD_MODNAME for this driver
  net/usbnet: avoid recursive locking in usbnet_stop()
  route: Remove redirect_genid
  inetpeer: Invalidate the inetpeer tree along with the routing cache
  mlx4_core: fix bug in modify_cq wrapper for resize flow.
  atl1c: set ATL1C_WORK_EVENT_RESET bit correctly
  bridge: fix state reporting when port is disabled
  bridge: br_log_state() s/entering/entered/
  ehea: restore multicast and rx_errors fields
  openvswitch: Fix checksum update for actions on UDP packets.
  openvswitch: Honor dp_ifindex, when specified, for vport lookup by name.
  iwlwifi: fix wowlan suspend
  mwifiex: reset encryption mode flag before association
  carl9170: fix frame delivery if sta is in powersave mode
  carl9170: Fix memory accounting when sta is in power-save mode.

Merge tag 'fixes-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull last minute fixes from Olof Johansson:
"One samsung build fix due to a mis-applied patch, and a small set of
  OMAP fixes.  This should be the last from arm-soc for 3.3, hopefully."

* tag 'fixes-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: S3C2440: Fixed build error for s3c244x
  ARM: OMAP2+: Fix module build errors with CONFIG_OMAP4_ERRATA_I688
  ARM: OMAP: id: Add missing break statement in omap3xxx_check_revision
  ARM: OMAP2+: Remove apply_uV constraints for fixed regulator
  ARM: OMAP: irqs: Fix NR_IRQS value to handle PRCM interrupts

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
"Another small, clear fix in a specific driver."

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: tps65910: Configure correct value for VDDCTRL vout reg

Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull minor devicetree bug fixes and documentation updates from Grant Likely:
"Fixes up a duplicate #include, adds an empty implementation of
  of_find_compatible_node() and make git ignore .dtb files.  And fix up
  bus name on OF described PHYs.  Nothing exciting here."

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux-2.6:
  doc: dt: Fix broken reference in gpio-leds documentation
  of/mdio: fix fixed link bus name
  of/fdt.c: asm/setup.h included twice
  of: add picochip vendor prefix
  dt: add empty of_find_compatible_node function
  ARM: devicetree: Add .dtb files to arch/arm/boot/.gitignore

Merge tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6

Pull SPI section mismatch bug fix for v3.3-rc3 from Grant Likely:
"Minor fix for pl022_dma_probe() function which was put in the wrong
section."

* tag 'spi-for-linus' of git://git.secretlab.ca/git/linux-2.6:
Fix section mismatch in spi-pl022.c

Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull four hwmon patches from Guenter Roeck

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (jc42) Add support for AT30TS00, TS3000GB2, TSE2002GB2, and MCP9804
  hwmon: (zl6100) Maintain delay parameter in driver instance data
  hwmon: (pmbus_core) Fix maximum number of POUT alarm attributes
  hwmon: (jc42) Add support for ST Microelectronics STTS2002 and STTS3000

Merge tag 'dm-3.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper fixes for 3.3 from Alasdair Kergon

Eight small device-mapper bug fixes.

* tag 'dm-3.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
  dm raid: fix flush support
  dm raid: set MD_CHANGE_DEVS when rebuilding
  dm thin metadata: decrement counter after removing mapped block
  dm thin metadata: unlock superblock in init_pmd error path
  dm thin metadata: remove incorrect close_device on creation error paths
  dm flakey: fix crash on read when corrupt_bio_byte not set
  dm io: fix discard support
  dm ioctl: do not leak argv if target message only contains whitespace

net/hyperv: Use the built-in macro KBUILD_MODNAME for this driver

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: Fix module build errors with CONFIG_OMAP4_ERRATA_I688
  ARM: OMAP: id: Add missing break statement in omap3xxx_check_revision
  ARM: OMAP2+: Remove apply_uV constraints for fixed regulator
  ARM: OMAP: irqs: Fix NR_IRQS value to handle PRCM interrupts

ARM: S3C2440: Fixed build error for s3c244x

Fixed following:
arch/arm/mach-s3c2440/s3c244x.c: In function 's3c244x_restart':
arch/arm/mach-s3c2440/s3c244x.c:209: error: expected declaration or statement at end of input
make[1]: *** [arch/arm/mach-s3c24xx/s3c244x.o] Error 1
make: *** [arch/arm/mach-s3c24xx] Error 2

Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

ALSA: hdspm - Provide ioctl_compat

snd_hdspm uses its own ioctls to acquire config- and status information.
Expose the corresponding ioctl handler via ioctl_compat, so that 32bit
applications can use it on 64bit kernels.

Signed-off-by: Adrian Knoth <adi@drcomp.erfurt.thur.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

x86/amd: iommu_set_device_table() must not be __init

This function is called from enable_iommus(), which in turn is used
from amd_iommu_resume().

Cc: stable@vger.kernel.org
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

drm/radeon/kms: fix hdmi duallink checks

All pre-SI chips are limited to 165 Mhz for single link.
Code in question will be re-enabled when SI support is added.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=44755
https://bugzilla.kernel.org/show_bug.cgi?id=42887

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon/kms: set SX_MISC in the r6xx blit code (v2)

Mesa may set it to 1, causing all primitives to be killed.

v2: also update the r7xx code

Signed-off-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

net/usbnet: avoid recursive locking in usbnet_stop()

|kernel BUG at kernel/rtmutex.c:724!
|[<c029599c>] (rt_spin_lock_slowlock+0x108/0x2bc) from [<c01c2330>] (defer_bh+0x1c/0xb4)
|[<c01c2330>] (defer_bh+0x1c/0xb4) from [<c01c3afc>] (rx_complete+0x14c/0x194)
|[<c01c3afc>] (rx_complete+0x14c/0x194) from [<c01cac88>] (usb_hcd_giveback_urb+0xa0/0xf0)
|[<c01cac88>] (usb_hcd_giveback_urb+0xa0/0xf0) from [<c01e1ff4>] (musb_giveback+0x34/0x40)
|[<c01e1ff4>] (musb_giveback+0x34/0x40) from [<c01e2b1c>] (musb_advance_schedule+0xb4/0x1c0)
|[<c01e2b1c>] (musb_advance_schedule+0xb4/0x1c0) from [<c01e2ca8>] (musb_cleanup_urb.isra.9+0x80/0x8c)
|[<c01e2ca8>] (musb_cleanup_urb.isra.9+0x80/0x8c) from [<c01e2ed0>] (musb_urb_dequeue+0xec/0x108)
|[<c01e2ed0>] (musb_urb_dequeue+0xec/0x108) from [<c01cbb90>] (unlink1+0xbc/0xcc)
|[<c01cbb90>] (unlink1+0xbc/0xcc) from [<c01cc2ec>] (usb_hcd_unlink_urb+0x54/0xa8)
|[<c01cc2ec>] (usb_hcd_unlink_urb+0x54/0xa8) from [<c01c2a84>] (unlink_urbs.isra.17+0x2c/0x58)
|[<c01c2a84>] (unlink_urbs.isra.17+0x2c/0x58) from [<c01c2b44>] (usbnet_terminate_urbs+0x94/0x10c)
|[<c01c2b44>] (usbnet_terminate_urbs+0x94/0x10c) from [<c01c2d68>] (usbnet_stop+0x100/0x15c)
|[<c01c2d68>] (usbnet_stop+0x100/0x15c) from [<c020f718>] (__dev_close_many+0x94/0xc8)

defer_bh() takes the lock which is hold during unlink_urbs(). The safe
walk suggest that the skb will be removed from the list and this is done
by defer_bh() so it seems to be okay to drop the lock here.

Cc: stable@kernel.org
Reported-by: AnÃbal Almeida Pinto <anibal.pinto@efacec.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

route: Remove redirect_genid

As we invalidate the inetpeer tree along with the routing cache now,
we don't need a genid to reset the redirect handling when the routing
cache is flushed.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

inetpeer: Invalidate the inetpeer tree along with the routing cache

We initialize the routing metrics with the values cached on the
inetpeer in rt_init_metrics(). So if we have the metrics cached on the
inetpeer, we ignore the user configured fib_metrics.

To fix this issue, we replace the old tree with a fresh initialized
inet_peer_base. The old tree is removed later with a delayed work queue.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4_core: fix bug in modify_cq wrapper for resize flow.

The actual FW command is called in procedure "handle_resize".
Code incorrectly invoked the FW command again (in good flow), in
the modify_cq wrapper function.

Fix by skipping second FW invocation unconditionally for resize.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

atl1c: set ATL1C_WORK_EVENT_RESET bit correctly

ATL1C_WORK_EVENT_RESET is zero so the original code here is a nop. The
intent was to set the zero bit.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: fix state reporting when port is disabled

Now we have:
eth0: link *down*
br0: port 1(eth0) entered *forwarding* state

br_log_state(p) should be called *after* p->state is set
to BR_STATE_DISABLED.

Reported-by: Zilvinas Valinskas <zilvinas@wilibox.com>
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: br_log_state() s/entering/entered/

When br_log_state() is reporting state it should say "entered"
istead of "entering" since state at this point is already
changed.

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ehea: restore multicast and rx_errors fields

Commit 239c562c94d (ehea: Add 64bit statistics) added a regression,
since we no longer report multicast & rx_errors fields, taken from
port->stats structure. These fields are updated in ehea_update_stats()
every second.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Tested-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch

openvswitch: Fix checksum update for actions on UDP packets.

When modifying IP addresses or ports on a UDP packet we don't
correctly follow the rules for unchecksummed packets. This meant
that packets without a checksum can be given a incorrect new checksum
and packets with a checksum can become marked as being unchecksummed.
This fixes it to handle those requirements.

Signed-off-by: Jesse Gross <jesse@nicira.com>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

regulator: tps65910: Configure correct value for VDDCTRL vout reg

As per datasheet, the voltage output is defined as
from SEL[6:0] = 3 to 64 (dec)
Vout= (SEL[6:0] × 12.5 mV + 562.5 mV)

The list_voltage returns the vout as
600mV + selector * 12.5mV

and so equivalent VSEL is selector + 3.
Adding 3 on selector when configuring VSEL register for
VDDCTRL output.

Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>

dm raid: fix flush support

Fix dm-raid flush support.

Both md and dm have support for flush, but the dm-raid target
forgot to set the flag to indicate that flushes should be
passed on. (Important for data integrity e.g. with writeback cache
enabled.)

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm raid: set MD_CHANGE_DEVS when rebuilding

The 'rebuild' parameter is used to rebuild individual devices in an
array (e.g. resynchronize a RAID1 device or recalculate a parity device
in higher RAID).  The MD_CHANGE_DEVS flag must be set when this
parameter is given in order to write out the superblocks and make the
change take immediate effect.  The code that handles new devices in
super_load already sets MD_CHANGE_DEVS and 'FirstUse'.  (The 'FirstUse'
flag was being set as a special case for rebuilds in
super_init_validation.)

Add a condition for rebuilds in super_load to take care of both flags
without the special case in 'super_init_validation'.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm thin metadata: decrement counter after removing mapped block

Correct the number of mapped sectors shown on a thin device's
status line by decrementing td->mapped_blocks in __remove() each time
a block is removed.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm thin metadata: unlock superblock in init_pmd error path

If dm_sm_disk_create() fails the superblock must be unlocked.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm thin metadata: remove incorrect close_device on creation error paths

The __open_device() error paths in __create_thin() and __create_snap()
incorrectly call __close_device() even if td was not initialized by
__open_device(). Remove this.

Also document __open_device() return values, remove a redundant
td->changed = 1 in __create_thin(), and insert an additional
safeguard against creating an already-existing device.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm flakey: fix crash on read when corrupt_bio_byte not set

The following BUG is hit on the first read that is submitted to a dm
flakey test device while the device is "down" if the corrupt_bio_byte
feature wasn't requested when the device's table was loaded.

Example DM table that will hit this BUG:
0 2097152 flakey 8:0 2048 0 30

This bug was introduced by commit a3998799fb4df0b0af8271a7d50c4269032397aa
(dm flakey: add corrupt_bio_byte feature) in v3.1-rc1.

BUG: unable to handle kernel paging request at ffff8801cfce3fff
IP: [<ffffffffa008c233>] corrupt_bio_data+0x6e/0xae [dm_flakey]
PGD 1606063 PUD 0
Oops: 0002 [#1] SMP
...
Call Trace:
<IRQ>
[<ffffffffa008c2b5>] flakey_end_io+0x42/0x48 [dm_flakey]
[<ffffffffa00dca98>] clone_endio+0x54/0xb6 [dm_mod]
[<ffffffff81130587>] bio_endio+0x2d/0x2f
[<ffffffff811c819a>] req_bio_endio+0x96/0x9f
[<ffffffff811c94b9>] blk_update_request+0x1dc/0x3a9
[<ffffffff812f5ee2>] ? rcu_read_unlock+0x21/0x23
[<ffffffff811c96a6>] blk_update_bidi_request+0x20/0x6e
[<ffffffff811c9713>] blk_end_bidi_request+0x1f/0x5d
[<ffffffff811c978d>] blk_end_request+0x10/0x12
[<ffffffff8128f450>] scsi_io_completion+0x1e5/0x4b1
[<ffffffff812882a9>] scsi_finish_command+0xec/0xf5
[<ffffffff8128f830>] scsi_softirq_done+0xff/0x108
[<ffffffff811ce284>] blk_done_softirq+0x84/0x98
[<ffffffff81048d19>] __do_softirq+0xe3/0x1d5
[<ffffffff8138f83f>] ? _raw_spin_lock+0x62/0x69
[<ffffffff810997cf>] ? handle_irq_event+0x4c/0x61
[<ffffffff8139833c>] call_softirq+0x1c/0x30
[<ffffffff81003b37>] do_softirq+0x4b/0xa3
[<ffffffff81048a39>] irq_exit+0x53/0xca
[<ffffffff81398acd>] do_IRQ+0x9d/0xb4
[<ffffffff81390333>] common_interrupt+0x73/0x73
...

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org # 3.1+
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm io: fix discard support

This patch fixes a crash by recognising discards in dm_io.

Currently dm_mirror can send REQ_DISCARD bios if running over a
discard-enabled device and without support in dm_io the system
crashes badly.

BUG: unable to handle kernel paging request at 00800000
IP: __bio_add_page.part.17+0xf5/0x1e0
...
bio_add_page+0x56/0x70
dispatch_io+0x1cf/0x240 [dm_mod]
? km_get_page+0x50/0x50 [dm_mod]
? vm_next_page+0x20/0x20 [dm_mod]
? mirror_flush+0x130/0x130 [dm_mirror]
dm_io+0xdc/0x2b0 [dm_mod]
...

Introduced in 2.6.38-rc1 by commit 5fc2ffeabb9ee0fc0e71ff16b49f34f0ed3d05b4
(dm raid1: support discard).

Signed-off-by: Milan Broz <mbroz@redhat.com>
Cc: stable@kernel.org
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

dm ioctl: do not leak argv if target message only contains whitespace

If 'argc' is zero we jump to the 'out:' label, but this leaks the
(unused) memory that 'dm_split_args()' allocated for 'argv' if the
string being split consisted entirely of whitespace. Jump to the
'out_argv:' label instead to free up that memory.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>

hwmon: (jc42) Add support for AT30TS00, TS3000GB2, TSE2002GB2, and MCP9804

Also update IDT datasheet locations.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org # 3.0+
Acked-by: Jean Delvare <khali@linux-fr.org>

hwmon: (zl6100) Maintain delay parameter in driver instance data

A global delay parameter has the side effect of being overwritten with 0 if a
single ZL2004 or ZL6105 is instantiated. If other chips supported by the same
driver are in the system, this will result in access errors for those chips.

To solve the problem, keep a per-instance copy of the delay parameter, and do
not change the original parameter.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org # 3.1+
Acked-by: Jean Delvare <khali@linux-fr.org>

hwmon: (pmbus_core) Fix maximum number of POUT alarm attributes

There are up to three POUT alarm attributes, not two, since cap_alarm was added.

Reported-by: Michele Petracca <mi.petracca@gmail.com>
Cc: stable@vger.kernel.org # 3.0+ [3.0 will need backport]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (jc42) Add support for ST Microelectronics STTS2002 and STTS3000

These are fully compatible with Jedec JC 42.4 as far as I can see.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: stable@vger.kernel.org # 3.0+
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>

Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

Pull ARM updates from Russell King.

* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7358/1: perf: add PMU hotplug notifier
  ARM: 7357/1: perf: fix overflow handling for xscale2 PMUs
  ARM: 7356/1: perf: check that we have an event in the PMU IRQ handlers
  ARM: 7355/1: perf: clear overflow flag when disabling counter on ARMv7 PMU
  ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode
  ARM: ecard: ensure fake vma vm_flags is setup
  ARM: 7346/1: errata: fix PL310 erratum #753970 workaround selection
  ARM: 7345/1: errata: update workaround for A9 erratum #743622
  ARM: 7348/1: arm/spear600: fix one-shot timer
  ARM: 7339/1: amba/serial.h: Include types.h for resolving dependency of type bool

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov: "Just a few driver fixups,
nothing exciting."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wacom - fix 3rd-gen Bamboo MT when 4+ fingers are in use
  Input: twl4030-vibra - use proper guard for PM methods
  Input: evdev - fix variable initialisation
  Input: wacom - add missing LEDS_CLASS to Kconfig
  Input: ALPS - fix touchpad detection when buttons are pressed

C6X: fix KSTK_EIP and KSTK_ESP macros

There was a latent typo in the C6X KSTK_EIP and KSTK_ESP macros which
caused a problem with a new patch which used them. The broken definitions
were of the form:

#define KSTK_FOO(tsk) (task_pt_regs(task)->foo)

Note the use of task vs tsk. This actually worked before because the
only place in the kernel which used these macros passed in a local
pointer named task.

Signed-off-by: Mark Salter <msalter@redhat.com>

Revert "CPU hotplug, cpusets, suspend: Don't touch cpusets during suspend/resume"

This reverts commit 8f2f748b0656257153bcf0941df8d6060acc5ca6.

It causes some odd regression that we have not figured out, and it's too
late in the -rc series to try to figure it out now.

As reported by Konstantin Khlebnikov, it causes consistent hangs on his
laptop (Thinkpad x220: 2x cores + HT). They can be avoided by adding
calls to "rebuild_sched_domains();" in cpuset_cpu_[in]active() for the
CPU_{ONLINE/DOWN_FAILED/DOWN_PREPARE}_FROZEN cases, but it's not at all
clear why, and it makes no sense.

Konstantin's config doesn't even have CONFIG_CPUSETS enabled, just to
make things even more interesting. So it's not the cpusets, it's just
the scheduling domains.

So until this is understood, revert.

Bisected-reported-and-tested-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drm/radeon: deal with errors from framebuffer init path.

We've been getting occasional oops running a 32-bit kernel on a certain
system in our RHEL test hw. It appears that we fail to get sufficent ioremap
space for the framebuffer, and this leads to an oops.

This patch should fix the oops and leave a message in the logs we can
check for.

A future fix would probably to resize the console to a size that we can
ioremap.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/radeon: fix a semaphore deadlock on pre cayman asics

The out of order execution of semaphore commands on
pre cayman asics doesn't work correctly and can
cause deadlocks, so turn it off for now.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

ARM: 7358/1: perf: add PMU hotplug notifier

When a CPU is taken out of reset, either cold booted or hotplugged in,
some of its PMU registers can contain UNKNOWN values.

This patch adds a hotplug notifier to ARM core perf code so that upon
CPU restart the PMU unit is reset and becomes ready to use again.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7357/1: perf: fix overflow handling for xscale2 PMUs

xscale2 PMUs indicate overflow not via the PMU control register, but by
a separate overflow FLAG register instead.

This patch fixes the xscale2 PMU code to use this register to detect
to overflow and ensures that we clear any pending overflow when
disabling a counter.

Cc: <stable@vger.kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7356/1: perf: check that we have an event in the PMU IRQ handlers

The PMU IRQ handlers in perf assume that if a counter has overflowed
then perf must be responsible. In the paranoid world of crazy hardware,
this could be false, so check that we do have a valid event before
attempting to dereference NULL in the interrupt path.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7355/1: perf: clear overflow flag when disabling counter on ARMv7 PMU

When disabling a counter on an ARMv7 PMU, we should also clear the
overflow flag in case an overflow occurred whilst stopping the counter.
This prevents a spurious overflow being picked up later and leading to
either false accounting or a NULL dereference.

Cc: <stable@vger.kernel.org>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ARM: 7354/1: perf: limit sample_period to half max_period in non-sampling mode

On ARM, the PMU does not stop counting after an overflow and therefore
IRQ latency affects the new counter value read by the kernel. This is
significant for non-sampling runs where it is possible for the new value
to overtake the previous one, causing the delta to be out by up to
max_period events.

Commit a737823d ("ARM: 6835/1: perf: ensure overflows aren't missed due
to IRQ latency") attempted to fix this problem by allowing interrupt
handlers to pass an overflow flag to the event update function, causing
the overflow calculation to assume that the counter passed through zero
when going from prev to new. Unfortunately, this doesn't work when
overflow occurs on the perf_task_tick path because we have the flag
cleared and end up computing a large negative delta.

This patch removes the overflow flag from armpmu_event_update and
instead limits the sample_period to half of the max_period for
non-sampling profiling runs.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Input: wacom - fix 3rd-gen Bamboo MT when 4+ fingers are in use

The message count field uses three bits of storage, not two.

Signed-off-by: Jason Gerecke <killertofu@gmail.com>
Acked-by: Chris Bagwell <chris@cnpbagwell.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>

ALSA: hda/realtek - Apply the coef-setup only to ALC269VB

The coef setup in alc269_fill_coef() was designed only for ALC269VB
model, and this has some bad effects for other ALC269 variants, such
as turning off the external mic input. Apply it only to ALC269VB.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) TCP can chop up SACK'd SKBs below below the unacked send sequence and
   that breaks lots of stuff.  Fix from Neal Cardwell.

2) There is code in ipv6 to properly join and leave the all-routers
   multicast code when the forwarding setting is changed, but once
   forwarding is turned on, we don't do the join for newly registered
   devices.  Fix from Li Wei.

3) Netfilter's NAT module autoload in ctnetlink drops a spinlock around
   a sleeping call, problem is this code path doesn't actually hold that
   lock.  Fix from Pablo Neira Ayuso.

4) TG3 uses the wrong interfaces to hook into the new byte queue limit
   support.  It uses the device level interfaces, which is fine for
   single queue devices, but on more recent chips this driver supports
   multiqueue so we have to use the multiqueue BQL APIs.  Fix from Tom
   Herbert.

5) r8169 resume fix from Francois Romieu.

6) Add some cxgb4 device IDs, from Vipul Pandya.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  IPv6: Fix not join all-router mcast group when forwarding set.
  caif-hsi: Set default MTU to 4096
  cxgb4vf: Add support for Chelsio's T480-CR and T440-LP-CR adapters
  cxgb4: Add support for Chelsio's T480-CR and T440-LP-CR adapters
  mlx4_core: remove buggy sched_queue masking
  netfilter: nf_conntrack: fix early_drop with reliable event delivery
  bridge: netfilter: don't call iptables on vlan packets if sysctl is off
  netfilter: bridge: fix wrong pointer dereference
  netfilter: ctnetlink: remove incorrect spin_[un]lock_bh on NAT module autoload
  netfilter: ebtables: fix wrong name length while copying to user-space
  r8169: runtime resume before shutdown.
  tcp: fix tcp_shift_skb_data() to not shift SACKed data below snd_una
  tg3: Fix to use multi queue BQL interfaces

x86: fix typo in recent find_vma_prev purge

It turns out that test-compiling this file on x86-64 doesn't really
help, because much of it is x86-32-specific. And so I hadn't noticed
the slightly over-eager removal of the 'r' from 'addr' variable despite
thinking I had tested it.

Signed-off-by: Linus "oopsie" Torvalds <torvalds@linux-foundation.org>

vm: avoid using find_vma_prev() unnecessarily

Several users of "find_vma_prev()" were not in fact interested in the
previous vma if there was no primary vma to be found either. And in
those cases, we're much better off just using the regular "find_vma()",
and then "prev" can be looked up by just checking vma->vm_prev.

The find_vma_prev() semantics are fairly subtle (see Mikulas' recent
commit 83cd904d271b: "mm: fix find_vma_prev"), and the whole "return
prev by reference" means that it generates worse code too.

Thus this "let's avoid using this inconvenient and clearly too subtle
interface when we don't really have to" patch.

Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.samba.org/sfrench/cifs-2.6

Pull CIFS fixes from Steve French

* git://git.samba.org/sfrench/cifs-2.6:
cifs: fix dentry refcount leak when opening a FIFO on lookup
CIFS: Fix mkdir/rmdir bug for the non-POSIX case

mm: fix find_vma_prev

Commit 6bd4837de96e ("mm: simplify find_vma_prev()") broke memory
management on PA-RISC.

After application of the patch, programs that allocate big arrays on the
stack crash with segfault, for example, this will crash if compiled
without optimization:

  int main()
  {
char array[200000];
array[199999] = 0;
return 0;
  }

The reason is that PA-RISC has up-growing stack and the stack is usually
the last memory area.  In the above example, a page fault happens above
the stack.

Previously, if we passed too high address to find_vma_prev, it returned
NULL and stored the last VMA in *pprev.  After "simplify find_vma_prev"
change, it stores NULL in *pprev.  Consequently, the stack area is not
found and it is not expanded, as it used to be before the change.

This patch restores the old behavior and makes it return the last VMA in
*pprev if the requested address is higher than address of any other VMA.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

genirq: Clear action->thread_mask if IRQ_ONESHOT is not set

Xommit ac5637611(genirq: Unmask oneshot irqs when thread was not woken)
fails to unmask when a !IRQ_ONESHOT threaded handler is handled by
handle_level_irq.

This happens because thread_mask is or'ed unconditionally in
irq_wake_thread(), but for !IRQ_ONESHOT interrupts never cleared. So
the check for !desc->thread_active fails and keeps the interrupt
disabled.

Keep the thread_mask zero for !IRQ_ONESHOT interrupts.

Document the thread_mask magic while at it.

Reported-and-tested-by: Sven Joachim <svenjoac@gmx.de>
Reported-and-tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

openvswitch: Honor dp_ifindex, when specified, for vport lookup by name.

When OVS_VPORT_ATTR_NAME is specified and dp_ifindex is nonzero, the
logical behavior would be for the vport name lookup scope to be limited
to the specified datapath, but in fact the dp_ifindex value was ignored.
This commit causes the search scope to be honored.

Signed-off-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>

IPv6: Fix not join all-router mcast group when forwarding set.

When forwarding was set and a new net device is register,
we need add this device to the all-router mcast group.

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mmap: EINVAL not ENOMEM when rejecting VM_GROWS

Currently error is -ENOMEM when rejecting VM_GROWSDOWN|VM_GROWSUP
from shared anonymous: hoist the file case's -EINVAL up for both.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

caif-hsi: Set default MTU to 4096

Default MTU for CAIF HSI was wrongly set to 15 * 4092 bytes.
The patch sets default MTU size to 4096.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4vf: Add support for Chelsio's T480-CR and T440-LP-CR adapters

This patch adds PCI device ids for Chelsio's T480-CR and T440-LP-CR
adapters.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Add support for Chelsio's T480-CR and T440-LP-CR adapters

This patch adds PCI device ids for Chelsio's T480-CR and T440-LP-CR
adapters.

Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mlx4_core: remove buggy sched_queue masking

Fixes a bug introduced by commit fe9a2603c, where the priority bits
in the schedule queue field were masked out.

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: nf_conntrack: fix early_drop with reliable event delivery

If reliable event delivery is enabled and ctnetlink fails to deliver
the destroy event in early_drop, the conntrack subsystem cannot
drop any the candidate flow that was planned to be evicted.

Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: netfilter: don't call iptables on vlan packets if sysctl is off

When net.bridge.bridge-nf-filter-vlan-tagged is 0 (default), vlan packets
arriving should not be sent to ip(6)tables by bridge netfilter.

However, it turns out that we currently always send VLAN packets to
netfilter, if ..
a), CONFIG_VLAN_8021Q is enabled ; or
b), CONFIG_VLAN_8021Q is not set but rx vlan offload is enabled
on the bridge port.

This is because bridge netfilter treats skb with
skb->protocol == ETH_P_IP{V6} as "non-vlan packet".

With rx vlan offload on or CONFIG_VLAN_8021Q=y, the vlan header has
already been removed here, and we cannot rely on skb->protocol alone.

Fix this by only using skb->protocol if the skb has no vlan tag,
or if a vlan tag is present and filter-vlan-tagged bridge netfilter
sysctl is enabled.

We cannot remove the skb->protocol == htons(ETH_P_8021Q) test
because the vlan tag is still around in the CONFIG_VLAN_8021Q=n &&
"ethtool -K $itf rxvlan off" case.

reproducer:
iptables -t raw -I PREROUTING -i br0
iptables -t raw -I PREROUTING -i br0.1

Then send packets to an ip address configured on br0.1 interface.
Even with net.bridge.bridge-nf-filter-vlan-tagged=0, the 1st rule
will match instead of the 2nd one.

With this patch applied, the 2nd rule will match instead.
In the non-local address case, netfilter won't be consulted after
this patch unless the sysctl is switched on.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: bridge: fix wrong pointer dereference

In adf7ff8, a invalid dereference was added in ebt_make_names.

CC [M] net/bridge/netfilter/ebtables.o
net/bridge/netfilter/ebtables.c: In function `ebt_make_names':
net/bridge/netfilter/ebtables.c:1371:20: warning: `t' may be used uninitialized in this function [-Wuninitialized]

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: ctnetlink: remove incorrect spin_[un]lock_bh on NAT module autoload

Since 7d367e0, ctnetlink_new_conntrack is called without holding
the nf_conntrack_lock spinlock. Thus, ctnetlink_parse_nat_setup
does not require to release that spinlock anymore in the NAT module
autoload case.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: ebtables: fix wrong name length while copying to user-space

user-space ebtables expects 32 bytes-long names, but xt_match names
use 29 bytes. We have to copy less 29 bytes and then, make sure we
fill the remaining bytes with zeroes.

Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: runtime resume before shutdown.

With runtime PM, if the ethernet cable is disconnected, the device is
transitioned to D3 state to conserve energy. If the system is shutdown
in this state, any register accesses in rtl_shutdown are dropped on
the floor. As the device was programmed by .runtime_suspend() to wake
on link changes, it is thus brought back up as soon as the link recovers.

Resuming every suspended device through the driver core would slow things
down and it is not clear how many devices really need it now.

Original report and D0 transition patch by Sameer Nanda. Patch has been
changed to comply with advices by Rafael J. Wysocki and the PM folks.

Reported-by: Sameer Nanda <snanda@chromium.org>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Hayes Wang <hayeswang@realtek.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix tcp_shift_skb_data() to not shift SACKed data below snd_una

This commit fixes tcp_shift_skb_data() so that it does not shift
SACKed data below snd_una.

This fixes an issue whose symptoms exactly match reports showing
tp->sacked_out going negative since 3.3.0-rc4 (see "WARNING: at
net/ipv4/tcp_input.c:3418" thread on netdev).

Since 2008 (832d11c5cd076abc0aa1eaf7be96c81d1a59ce41)
tcp_shift_skb_data() had been shifting SACKed ranges that were below
snd_una. It checked that the *end* of the skb it was about to shift
from was above snd_una, but did not check that the end of the actual
shifted range was above snd_una; this commit adds that check.

Shifting SACKed ranges below snd_una is problematic because for such
ranges tcp_sacktag_one() short-circuits: it does not declare anything
as SACKed and does not increase sacked_out.

Before the fixes in commits cc9a672ee522d4805495b98680f4a3db5d0a0af9
and daef52bab1fd26e24e8e9578f8fb33ba1d0cb412, shifting SACKed ranges
below snd_una happened to work because tcp_shifted_skb() was always
(incorrectly) passing in to tcp_sacktag_one() an skb whose end_seq
tcp_shift_skb_data() had already guaranteed was beyond snd_una. Hence
tcp_sacktag_one() never short-circuited and always increased
tp->sacked_out in this case.

After those two fixes, my testing has verified that shifting SACKed
ranges below snd_una could cause tp->sacked_out to go negative with
the following sequence of events:

(1) tcp_shift_skb_data() sees an skb whose end_seq is beyond snd_una,
    then shifts a prefix of that skb that is below snd_una

(2) tcp_shifted_skb() increments the packet count of the
    already-SACKed prev sk_buff

(3) tcp_sacktag_one() sees the end of the new SACKed range is below
    snd_una, so it short-circuits and doesn't increase tp->sacked_out

(5) tcp_clean_rtx_queue() sees the SACKed skb has been ACKed,
    decrements tp->sacked_out by this "inflated" pcount that was
    missing a matching increase in tp->sacked_out, and hence
    tp->sacked_out underflows to a u32 like 0xFFFFFFFF, which casted
    to s32 is negative.

(6) this leads to the warnings seen in the recent "WARNING: at
    net/ipv4/tcp_input.c:3418" thread on the netdev list; e.g.:
    tcp_input.c:3418  WARN_ON((int)tp->sacked_out < 0);

More generally, I think this bug can be tickled in some cases where
two or more ACKs from the receiver are lost and then a DSACK arrives
that is immediately above an existing SACKed skb in the write queue.

This fix changes tcp_shift_skb_data() to abort this sequence at step
(1) in the scenario above by noticing that the bytes are below snd_una
and not shifting them.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem

Merge tag 'fixes-3.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull arm-soc bug fixes from Arnd Bergmann:
"Here are all the fixes I got after sending the last pull request.
  These fix mostly regressions on exynos, at91, pxa and ep93xx.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>"
* tag 'fixes-3.3-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: ep93xx: convert vision_ep9307 to MULTI_IRQ_HANDLER
  ARM: EXYNOS: fix touchscreen IRQ setup on Universal C210 board
  ARM: pxa: fix invalid mfp pin issue
  ARM: pxa: remove duplicated registeration on pxa-gpio
  ARM: pxa: add dummy clock for pxa25x and pxa27x
  ARM: S3C24XX: DMA resume regression fix
  ARM: S3C24XX: Fix restart on S3C2442
  ARM: SAMSUNG: Fix memory size for hsotg
  ARM: at91/dma: DMA controller registering with DT support
  ARM: at91/dma: remove platform data from DMA controller

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 regression fix from Martin Schwidefsky:
"It is a fix for a regression that has been introduced with git commit
  25f269f17316 - "[S390] qdio: EQBS retry after CCQ 96" - and if possible
  we would like to have working code for the fcp data router in 3.3."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  [S390] qdio: fix handler function arguments for zfcp data router

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator updates from Mark Brown:
"A simple fix that's obvious from inspection.  There's no mainline
  users of this driver yet (there's some i.MX platforms which will use
  it)."

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: Fix mask parameter in da9052_reg_update calls

vsprintf: make %pV handling compatible with kasprintf()

kasprintf() (and potentially other functions that I didn't run across so
far) want to evaluate argument lists twice. Caring to do so for the
primary list is obviously their job, but they can't reasonably be
expected to check the format string for instances of %pV, which however
need special handling too: On architectures like x86-64 (as opposed to
e.g. ix86), using the same argument list twice doesn't produce the
expected results, as an internally managed cursor gets updated during
the first run.

Fix the problem by always acting on a copy of the original list when
handling %pV.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

page_cgroup: fix horrid swap accounting regression

Why is memcg's swap accounting so broken? Insane counts, wrong
ownership, unfreeable structures, which later get freed and then
accessed after free.

Turns out to be a tiny a little 3.3-rc1 regression in 9fb4b7cc0724
"page_cgroup: add helper function to get swap_cgroup": the helper
function (actually named lookup_swap_cgroup()) returns an address using
void* arithmetic, but the structure in question is a short.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Bob Liu <lliubbo@gmail.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>