git.proxmox.com Git - mirror_ubuntu-bionic-kernel.git/log

Merge branch 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull zstd support from Chris Mason:
"Nick Terrell's patch series to add zstd support to the kernel has been
  floating around for a while. After talking with Dave Sterba, Herbert
  and Phillip, we decided to send the whole thing in as one pull
  request.

  zstd is a big win in speed over zlib and in compression ratio over
  lzo, and the compression team here at FB has gotten great results
  using it in production. Nick will continue to update the kernel side
  with new improvements from the open source zstd userland code.

  Nick has a number of benchmarks for the main zstd code in his lib/zstd
  commit:

      I ran the benchmarks on a Ubuntu 14.04 VM with 2 cores and 4 GiB
      of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel
      Core i7 processor, 16 GB of RAM, and a SSD. I benchmarked using
      `silesia.tar` [3], which is 211,988,480 B large. Run the following
      commands for the benchmark:

        sudo modprobe zstd_compress_test
        sudo mknod zstd_compress_test c 245 0
        sudo cp silesia.tar zstd_compress_test

      The time is reported by the time of the userland `cp`.
      The MB/s is computed with

        1,536,217,008 B / time(buffer size, hash)

      which includes the time to copy from userland.
      The Adjusted MB/s is computed with

        1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).

      The memory reported is the amount of memory the compressor
      requests.

        | Method   | Size (B) | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
        |----------|----------|----------|-------|---------|----------|----------|
        | none     | 11988480 |    0.100 |     1 | 2119.88 |        - |        - |
        | zstd -1  | 73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
        | zstd -3  | 66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
        | zstd -5  | 65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
        | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
        | zstd -15 | 58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
        | zstd -19 | 54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
        | zlib -1  | 77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
        | zlib -3  | 72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
        | zlib -6  | 68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
        | zlib -9  | 67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |

      I benchmarked zstd decompression using the same method on the same
      machine. The benchmark file is located in the upstream zstd repo
      under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The
      memory reported is the amount of memory required to decompress
      data compressed with the given compression level. If you know the
      maximum size of your input, you can reduce the memory usage of
      decompression irrespective of the compression level.

        | Method   | Time (s) | MB/s    | Adjusted MB/s | Memory (MB) |
        |----------|----------|---------|---------------|-------------|
        | none     |    0.025 | 8479.54 |             - |           - |
        | zstd -1  |    0.358 |  592.15 |        636.60 |        0.84 |
        | zstd -3  |    0.396 |  535.32 |        571.40 |        1.46 |
        | zstd -5  |    0.396 |  535.32 |        571.40 |        1.46 |
        | zstd -10 |    0.374 |  566.81 |        607.42 |        2.51 |
        | zstd -15 |    0.379 |  559.34 |        598.84 |        4.61 |
        | zstd -19 |    0.412 |  514.54 |        547.77 |        8.80 |
        | zlib -1  |    0.940 |  225.52 |        231.68 |        0.04 |
        | zlib -3  |    0.883 |  240.08 |        247.07 |        0.04 |
        | zlib -6  |    0.844 |  251.17 |        258.84 |        0.04 |
        | zlib -9  |    0.837 |  253.27 |        287.64 |        0.04 |

  I ran a long series of tests and benchmarks on the btrfs side and the
  gains are very similar to the core benchmarks Nick ran"

* 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  squashfs: Add zstd support
  btrfs: Add zstd support
  lib: Add zstd modules
  lib: Add xxhash module

powerpc: Fix handling of alignment interrupt on dcbz instruction

This fixes the emulation of the dcbz instruction in the alignment
interrupt handler. The error was that we were comparing just the
instruction type field of op.type rather than the whole thing,
and therefore the comparison "type != CACHEOP + DCBZ" was always
true.

Fixes: 31bfdb036f12 ("powerpc: Use instruction emulation infrastructure to handle alignment faults")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Tested-by: Michal Sojka <sojkam1@fel.cvut.cz>
Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

firmware: delete in-kernel firmware

The last firmware change for the in-kernel firmware source code was back
in 2013. Everyone has been relying on the out-of-tree linux-firmware
package for a long long time.

So let's drop it, it's baggage we don't need to keep dragging around
(and having to fix random kbuild issues over time...)

Cc: Kyle McMartin <kyle@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'kbuild-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

- Use Make-builtin $(abspath ...) helper to get absolute path

- Add W=2 extra warning option to detect unused macros

- Use more KCONFIG_CONFIG instead hard-coded .config

- Fix bugs of tar*-pkg targets

* tag 'kbuild-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: buildtar: do not print successful message if tar returns error
  kbuild: buildtar: fix tar error when CONFIG_MODULES is disabled
  kbuild: Use KCONFIG_CONFIG in buildtar
  Kbuild: enable -Wunused-macros warning for "make W=2"
  kbuild: use $(abspath ...) instead of $(shell cd ... && /bin/pwd)

Merge tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

- Some request-based DM core and DM multipath fixes and cleanups

- Constify a few variables in DM core and DM integrity

- Add bufio optimization and checksum failure accounting to DM
   integrity

- Fix DM integrity to avoid checking integrity of failed reads

- Fix DM integrity to use init_completion

- A couple DM log-writes target fixes

- Simplify DAX flushing by eliminating the unnecessary flush
   abstraction that was stood up for DM's use.

* tag 'for-4.14/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dax: remove the pmem_dax_ops->flush abstraction
  dm integrity: use init_completion instead of COMPLETION_INITIALIZER_ONSTACK
  dm integrity: make blk_integrity_profile structure const
  dm integrity: do not check integrity for failed read operations
  dm log writes: fix >512b sectorsize support
  dm log writes: don't use all the cpu while waiting to log blocks
  dm ioctl: constify ioctl lookup table
  dm: constify argument arrays
  dm integrity: count and display checksum failures
  dm integrity: optimize writing dm-bufio buffers that are partially changed
  dm rq: do not update rq partially in each ending bio
  dm rq: make dm-sq requeuing behavior consistent with dm-mq behavior
  dm mpath: complain about unsupported __multipath_map_bio() return values
  dm mpath: avoid that building with W=1 causes gcc 7 to complain about fall-through

Merge tag 'fbdev-v4.14' of git://github.com/bzolnier/linux

Pull fbdev updates from Bartlomiej Zolnierkiewicz:

- make fbcon a built-time depency for fbdev (fbcon was tristate option
   before, now it is a bool) - this is a first step in preparations for
   making console_lock usage saner (currently it acts like the BKL for
   all things fbdev/fbcon) (Daniel Vetter)

- add fbcon=margin:<color> command line option to select the fbcon
   margin color (David Lechner)

- add DMI quirk table for x86 systems which need fbcon rotation
   (devices like Asus T100HA, GPD Pocket, the GPD win and the I.T.Works
   TW891) (Hans de Goede)

- fix 1bpp logo support for unusual width (needed by LEGO MINDSTORMS
   EV3) (David Lechner)

- enable Xilinx FB driver for ARM ZynqMP platform (Michal Simek)

- fix use after free in the error path of udlfb driver (Anton Vasilyev)

- fix error return code handling in pxa3xx_gcu driver (Gustavo A. R.
   Silva)

- fix bootparams.screeninfo arguments checking in vgacon (Jan H.
   Schönherr)

- do not leak uninitialized padding in clk to userspace in the debug
   code of atyfb driver (Vladis Dronov)

- fix compiler warnings in fbcon code and matroxfb driver (Arnd
   Bergmann)

- convert fbdev susbsytem to using %pOF instead of full_name (Rob
   Herring)

- structures constifications (Arvind Yadav, Bhumika Goyal, Gustavo A.
   R. Silva, Julia Lawall)

- misc cleanups (Gustavo A. R. Silva, Hyun Kwon, Julia Lawall, Kuninori
   Morimoto, Lynn Lei)

* tag 'fbdev-v4.14' of git://github.com/bzolnier/linux: (75 commits)
  video/console: Update BIOS dates list for GPD win console rotation DMI quirk
  video/console: Add rotated LCD-panel DMI quirk for the VIOS LTH17
  video: fbdev: sis: fix duplicated code for different branches
  video: fbdev: make fb_var_screeninfo const
  video: fbdev: aty: do not leak uninitialized padding in clk to userspace
  vgacon: Prevent faulty bootparams.screeninfo from causing harm
  video: fbdev: make fb_videomode const
  video/console: Add new BIOS date for GPD pocket to dmi quirk table
  fbcon: remove restriction on margin color
  video: ARM CLCD: constify amba_id
  video: fm2fb: constify zorro_device_id
  video: fbdev: annotate fb_fix_screeninfo with const and __initconst
  omapfb: constify omap_video_timings structures
  video: fbdev: udlfb: Fix use after free on dlfb_usb_probe error path
  fbdev: i810: make fb_ops const
  fbdev: matrox: make fb_ops const
  video: fbdev: pxa3xx_gcu: fix error return code in pxa3xx_gcu_probe()
  video: fbdev: Enable Xilinx FB for ZynqMP
  video: fbdev: Fix multiple style issues in xilinxfb
  video: fbdev: udlfb: constify usb_device_id.
  ...

Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

- add support for the watchdog on Meson8 and Meson8m2

- add support for MediaTek MT7623 and MT7622 SoC

- add support for the r8a77995 wdt

- explicitly request exclusive reset control for asm9260_wdt,
   zx2967_wdt, rt2880_wdt and mt7621_wdt

- improvements to asm9260_wdt, aspeed_wdt, renesas_wdt and cadence_wdt

- add support for reading freq via CCF + suspend/resume support for
   of_xilinx_wdt

- constify watchdog_ops and various device-id structures

- revert of commit 1fccb73011ea ("iTCO_wdt: all versions count down
   twice") (Bug 196509)

* git://www.linux-watchdog.org/linux-watchdog: (40 commits)
  watchdog: mei_wdt: constify mei_cl_device_id
  watchdog: sp805: constify amba_id
  watchdog: ziirave: constify i2c_device_id
  watchdog: sc1200: constify pnp_device_id
  dt-bindings: watchdog: renesas-wdt: Add support for the r8a77995 wdt
  watchdog: renesas_wdt: update copyright dates
  watchdog: renesas_wdt: make 'clk' a variable local to probe()
  watchdog: renesas_wdt: consistently use RuntimePM for clock management
  watchdog: aspeed: Support configuration of external signal properties
  dt-bindings: watchdog: aspeed: External reset signal properties
  drivers/watchdog: Add optional ASPEED device tree properties
  drivers/watchdog: ASPEED reference dev tree properties for config
  watchdog: da9063_wdt: Simplify by removing unneeded struct...
  watchdog: bcm7038: Check the return value from clk_prepare_enable()
  watchdog: qcom: Check for platform_get_resource() failure
  watchdog: of_xilinx_wdt: Add suspend/resume support
  watchdog: of_xilinx_wdt: Add support for reading freq via CCF
  dt-bindings: watchdog: mediatek: add support for MediaTek MT7623 and MT7622 SoC
  watchdog: max77620_wdt: constify platform_device_id
  watchdog: pcwd_usb: constify usb_device_id
  ...

Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging

Pull dmi update from Jean Delvare:
"Mark all struct dmi_system_id instances const"

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
dmi: Mark all struct dmi_system_id instances const

Merge tag 'pinctrl-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
"This slew of fixes for pin control was noticed and patched up early,
  so to get the annoyance out of the way for -rc1 it would make sense to
  send them already.

   - Fix a build include in the Uniphier driver to keep pace with
     ongoing refactorings.

   - Fix a slew of minor semantic and syntactic issues as well as
     stricting up Kconfig for the new Spreadtrum driver.

   - Fix the GPIO interrupt set-up on the Marvell 37xx Armada as fallout
     for dynamically allocating irq descriptors from the core. (Also
     tagged for stable.)

   - Fix AMD register suspend/resume state spool/unspooling so that
     wakeup works as it should. (Also tagged for stable.)"

* tag 'pinctrl-v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl/amd: save pin registers over suspend/resume
  pinctrl: armada-37xx: Fix gpio interrupt setup
  pinctrl: sprd: fix off by one bugs
  pinctrl: sprd: check for allocation failure
  pinctrl: sprd: Restrict PINCTRL_SPRD to ARCH_SPRD or COMPILE_TEST
  pinctrl: sprd: fix build errors and dependencies
  pinctrl: sprd: make three local functions static
  pinctrl: uniphier: include <linux/build_bug.h> instead of <linux/bug.h>

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"A few leftovers"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, page_owner: skip unnecessary stack_trace entries
  arm64: stacktrace: avoid listing stacktrace functions in stacktrace
  mm: treewide: remove GFP_TEMPORARY allocation flag
  IB/mlx4: fix sprintf format warning
  fscache: fix fscache_objlist_show format processing
  lib/test_bitmap.c: use ULL suffix for 64-bit constants
  procfs: remove unused variable
  drivers/media/cec/cec-adap.c: fix build with gcc-4.4.4
  idr: remove WARN_ON_ONCE() when trying to replace negative ID

orangefs: Adjust three checks for null pointers

MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The script “checkpatch.pl” pointed information out like the following.

Comparison to NULL could be written !…

Thus fix affected source code places.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: Use kcalloc() in orangefs_prepare_cdm_array()

* A multiplication for the size determination of a memory allocation
  indicated that an array data structure should be processed.
  Thus use the corresponding function "kcalloc".

  This issue was detected by using the Coccinelle software.

* Replace the specification of a data structure by a pointer dereference
  to make the corresponding size determination a bit safer according to
  the Linux coding style convention.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: Delete error messages for a failed memory allocation in five functions

Omit an extra message for a memory allocation failure in these functions.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: constify xattr_handler structure

The xattr_handler structure is only stored in an array of const
structures. Thus the xattr_handler structure itself can be
const.

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: don't call filemap_write_and_wait from fsync

Orangefs doesn't do buffered writes yet, so there's no point in
initiating and waiting for writeback.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: off by ones in xattr size checks

A previous patch which claimed to remove off by ones actually introduced
them.

strlen() returns the length of the string not including the NUL
character. We are using strcpy() to copy "name" into a buffer which is
ORANGEFS_MAX_XATTR_NAMELEN characters long. We should make sure to
leave space for the NUL, otherwise we're writing one character beyond
the end of the buffer.

Fixes: e675c5ec51fe ("orangefs: clean up oversize xattr validation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: documentation clean up

Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: react properly to posix_acl_update_mode's aftermath.

posix_acl_update_mode checks to see if the permissions
described by the ACL can be encoded into the
object's mode. If so, it sets "acl" to NULL
and "mode" to the new desired value. Prior to this patch
we failed to actually propagate the new mode back to the
server.

Signed-off-by: Mike Marshall <hubcap@omnibond.com>

orangefs: Don't clear SGID when inheriting ACLs

When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.

Fix the problem by creating __orangefs_set_acl() function that does not
call posix_acl_update_mode() and use it when inheriting ACLs. That
prevents SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.

Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: stable@vger.kernel.org
CC: Mike Marshall <hubcap@omnibond.com>
CC: pvfs2-developers@beowulf-underground.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>

tg3: clean up redundant initialization of tnapi

tnapi is being initialized and then immediately updated and
hence the initialiation is redundant. Clean up the warning
by moving the declaration and initialization to the inside
of the for-loop.

Cleans up clang scan-build warning:
warning: Value stored to 'tnapi' during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sched/wait: Introduce wakeup boomark in wake_up_page_bit

Now that we have added breaks in the wait queue scan and allow bookmark
on scan position, we put this logic in the wake_up_page_bit function.

We can have very long page wait list in large system where multiple
pages share the same wait list. We break the wake up walk here to allow
other cpus a chance to access the list, and not to disable the interrupts
when traversing the list for too long. This reduces the interrupt and
rescheduling latency, and excessive page wait queue lock hold time.

[ v2: Remove bookmark_wake_function ]

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched/wait: Break up long wake list walk

We encountered workloads that have very long wake up list on large
systems. A waker takes a long time to traverse the entire wake list and
execute all the wake functions.

We saw page wait list that are up to 3700+ entries long in tests of
large 4 and 8 socket systems. It took 0.8 sec to traverse such list
during wake up. Any other CPU that contends for the list spin lock will
spin for a long time. It is a result of the numa balancing migration of
hot pages that are shared by many threads.

Multiple CPUs waking are queued up behind the lock, and the last one
queued has to wait until all CPUs did all the wakeups.

The page wait list is traversed with interrupt disabled, which caused
various problems. This was the original cause that triggered the NMI
watch dog timer in: https://patchwork.kernel.org/patch/9800303/ . Only
extending the NMI watch dog timer there helped.

This patch bookmarks the waker's scan position in wake list and break
the wake up walk, to allow access to the list before the waker resume
its walk down the rest of the wait list. It lowers the interrupt and
rescheduling latency.

This patch also provides a performance boost when combined with the next
patch to break up page wakeup list walk. We saw 22% improvement in the
will-it-scale file pread2 test on a Xeon Phi system running 256 threads.

[ v2: Merged in Linus' changes to remove the bookmark_wake_function, and
simply access to flags. ]

Reported-by: Kan Liang <kan.liang@intel.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

tls: make tls_sw_free_resources static

Make the needlessly global function tls_sw_free_resources static to fix
a gcc/sparse warning.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

KVM: trace events: update list of exit reasons

Adding entries for exit reasons 23 - 27:

  KVM_EXIT_EPR
  KVM_EXIT_SYSTEM_EVENT
  KVM_EXIT_S390_STSI
  KVM_EXIT_IOAPIC_EOI
  KVM_EXIT_HYPERV

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: async_pf: Fix #DF due to inject "Page not Present" and "Page Ready" exceptions simultaneously

qemu-system-x86-8600  [004] d..1  7205.687530: kvm_entry: vcpu 2
qemu-system-x86-8600  [004] ....  7205.687532: kvm_exit: reason EXCEPTION_NMI rip 0xffffffffa921297d info ffffeb2c0e44e018 80000b0e
qemu-system-x86-8600  [004] ....  7205.687532: kvm_page_fault: address ffffeb2c0e44e018 error_code 0
qemu-system-x86-8600  [004] ....  7205.687620: kvm_try_async_get_page: gva = 0xffffeb2c0e44e018, gfn = 0x427e4e
qemu-system-x86-8600  [004] .N..  7205.687628: kvm_async_pf_not_present: token 0x8b002 gva 0xffffeb2c0e44e018
    kworker/4:2-7814  [004] ....  7205.687655: kvm_async_pf_completed: gva 0xffffeb2c0e44e018 address 0x7fcc30c4e000
qemu-system-x86-8600  [004] ....  7205.687703: kvm_async_pf_ready: token 0x8b002 gva 0xffffeb2c0e44e018
qemu-system-x86-8600  [004] d..1  7205.687711: kvm_entry: vcpu 2

After running some memory intensive workload in guest, I catch the kworker
which completes the GUP too quickly, and queues an "Page Ready" #PF exception
after the "Page not Present" exception before the next vmentry as the above
trace which will result in #DF injected to guest.

This patch fixes it by clearing the queue for "Page not Present" if "Page Ready"
occurs before the next vmentry since the GUP has already got the required page
and shadow page table has already been fixed by "Page Ready" handler.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Fixes: 7c90705bf2a3 ("KVM: Inject asynchronous page fault into a PV guest if page is swapped out.")
[Changed indentation and added clearing of injected. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

i2c: i2c-stm32f7: add driver

This patch adds initial support for the STM32F7 I2C controller.

Signed-off-by: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

i2c: i2c-stm32f4: use generic definition of speed enum

This patch uses a more generic definition of speed enum for i2c-stm32f4
driver.

Signed-off-by: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Reviewed-by: Ludovic BARRE <ludovic.barre@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

dt-bindings: i2c-stm32: Document the STM32F7 I2C bindings

This patch adds the documentation of device tree bindings for STM32F7 I2C

Signed-off-by: M'boumba Cedric Madianga <cedric.madianga@gmail.com>
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

Merge branch 'kvm-ppc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

Bug fixes for stable.

KVM: X86: Don't block vCPU if there is pending exception

Don't block vCPU if there is pending exception.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: Add irqchip_split() checks before enabling AVIC

SVM AVIC hardware accelerates guest write to APIC_EOI register
(for edge-trigger interrupt), which means it does not trap to KVM.

So, only enable SVM AVIC only in split irqchip mode.
(e.g. launching qemu w/ option '-machine kernel_irqchip=split').

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Fixes: 44a95dae1d22 ("KVM: x86: Detect and Initialize AVIC support")
[Removed pr_debug - Radim.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

dmi: Mark all struct dmi_system_id instances const

... and __initconst if applicable.

Based on similar work for an older kernel in the Grsecurity patch.

[JD: fix toshiba-wmi build]
[JD: add htcpen]
[JD: move __initconst where checkscript wants it]

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jean Delvare <jdelvare@suse.de>

mm, page_owner: skip unnecessary stack_trace entries

The page_owner stacktrace always begin as follows:

  [<ffffff987bfd48f4>] save_stack+0x40/0xc8
  [<ffffff987bfd4da8>] __set_page_owner+0x3c/0x6c

These two entries do not provide any useful information and limits the
available stacktrace depth.  The page_owner stacktrace was skipping
caller function from stack entries but this was missed with commit
f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace")

Example page_owner entry after the patch:

  Page allocated via order 0, mask 0x8(ffffff80085fb714)
  PFN 654411 type Movable Block 639 type CMA Flags 0x0(ffffffbe5c7f12c0)
  [<ffffff9b64989c14>] post_alloc_hook+0x70/0x80
  ...
  [<ffffff9b651216e8>] msm_comm_try_state+0x5f8/0x14f4
  [<ffffff9b6512486c>] msm_vidc_open+0x5e4/0x7d0
  [<ffffff9b65113674>] msm_v4l2_open+0xa8/0x224

Link: http://lkml.kernel.org/r/1504078343-28754-2-git-send-email-guptap@codeaurora.org
Fixes: f2ca0b557107 ("mm/page_owner: use stackdepot to store stacktrace")
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

arm64: stacktrace: avoid listing stacktrace functions in stacktrace

The stacktraces always begin as follows:

  [<c00117b4>] save_stack_trace_tsk+0x0/0x98
  [<c0011870>] save_stack_trace+0x24/0x28
  ...

This is because the stack trace code includes the stack frames for
itself.  This is incorrect behaviour, and also leads to "skip" doing the
wrong thing (which is the number of stack frames to avoid recording.)

Perversely, it does the right thing when passed a non-current thread.
Fix this by ensuring that we have a known constant number of frames
above the main stack trace function, and always skip these.

This was fixed for arch arm by commit 3683f44c42e9 ("ARM: stacktrace:
avoid listing stacktrace functions in stacktrace")

Link: http://lkml.kernel.org/r/1504078343-28754-1-git-send-email-guptap@codeaurora.org
Signed-off-by: Prakash Gupta <guptap@codeaurora.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: treewide: remove GFP_TEMPORARY allocation flag

GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived
and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
primary motivation was to allow users to tell that an allocation is
short lived and so the allocator can try to place such allocations close
together and prevent long term fragmentation.  As much as this sounds
like a reasonable semantic it becomes much less clear when to use the
highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
context holding that memory sleep? Can it take locks? It seems there is
no good answer for those questions.

The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
__GFP_RECLAIMABLE which in itself is tricky because basically none of
the existing caller provide a way to reclaim the allocated memory.  So
this is rather misleading and hard to evaluate for any benefits.

I have checked some random users and none of them has added the flag
with a specific justification.  I suspect most of them just copied from
other existing users and others just thought it might be a good idea to
use without any measuring.  This suggests that GFP_TEMPORARY just
motivates for cargo cult usage without any reasoning.

I believe that our gfp flags are quite complex already and especially
those with highlevel semantic should be clearly defined to prevent from
confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
replace all existing users to simply use GFP_KERNEL.  Please note that
SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
so they will be placed properly for memory fragmentation prevention.

I can see reasons we might want some gfp flag to reflect shorterm
allocations but I propose starting from a clear semantic definition and
only then add users with proper justification.

This was been brought up before LSF this year by Matthew [1] and it
turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
seems to be a heuristic without any measured advantage for most (if not
all) its current users.  The follow up discussion has revealed that
opinions on what might be temporary allocation differ a lot between
developers.  So rather than trying to tweak existing users into a
semantic which they haven't expected I propose to simply remove the flag
and start from scratch if we really need a semantic for short term
allocations.

[1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

[akpm@linux-foundation.org: fix typo]
[akpm@linux-foundation.org: coding-style fixes]
[sfr@canb.auug.org.au: drm/i915: fix up]
Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

IB/mlx4: fix sprintf format warning

gcc-7 points out that a negative port_num value would overflow the
string buffer:

  drivers/infiniband/hw/mlx4/sysfs.c: In function 'mlx4_ib_device_register_sysfs':
  drivers/infiniband/hw/mlx4/sysfs.c:251:16: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
  drivers/infiniband/hw/mlx4/sysfs.c:251:2: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10
  drivers/infiniband/hw/mlx4/sysfs.c:303:17: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
  drivers/infiniband/hw/mlx4/sysfs.c:303:3: note: 'sprintf' output between 2 and 11 bytes into a destination of size 10

While we should be able to assume that port_num is positive here, making
the buffer one byte longer has no downsides and avoids the warning.

Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device")
Link: http://lkml.kernel.org/r/20170714120720.906842-23-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fscache: fix fscache_objlist_show format processing

gcc points out a minor bug in the handling of unknown cookie types,
which could result in a string overflow when the integer is copied into
a 3-byte string:

  fs/fscache/object-list.c: In function 'fscache_objlist_show':
  fs/fscache/object-list.c:265:19: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
   sprintf(_type, "%02u", cookie->def->type);
                  ^~~~~~
  fs/fscache/object-list.c:265:4: note: 'sprintf' output between 3 and 4 bytes into a destination of size 3

This is currently harmless as no code sets a type other than 0 or 1, but
it makes sense to use snprintf() here to avoid overflowing the array if
that changes.

Link: http://lkml.kernel.org/r/20170714120720.906842-22-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/test_bitmap.c: use ULL suffix for 64-bit constants

With gcc 4.1.2:

  lib/test_bitmap.c:189: warning: integer constant is too large for `long' type
  lib/test_bitmap.c:190: warning: integer constant is too large for `long' type
  lib/test_bitmap.c:194: warning: integer constant is too large for `long' type
  lib/test_bitmap.c:195: warning: integer constant is too large for `long' type

Add the missing "ULL" suffix to fix this.

Link: http://lkml.kernel.org/r/1505040523-31230-1-git-send-email-geert@linux-m68k.org
Fixes: 60ef690018b262dd ("bitmap: introduce BITMAP_FROM_U64()")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

procfs: remove unused variable

In NOMMU configurations, we get a warning about a variable that has become
unused:

fs/proc/task_nommu.c: In function 'nommu_vma_show':
fs/proc/task_nommu.c:148:28: error: unused variable 'priv' [-Werror=unused-variable]

Link: http://lkml.kernel.org/r/20170911200231.3171415-1-arnd@arndb.de
Fixes: 1240ea0dc3bb ("fs, proc: remove priv argument from is_stack")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/media/cec/cec-adap.c: fix build with gcc-4.4.4

gcc-4.4.4 has issues with initialization of anonymous unions:

drivers/media/cec/cec-adap.c: In function 'cec_queue_msg_fh':
drivers/media/cec/cec-adap.c:184: error: unknown field 'lost_msgs' specified in initializer

work around this.

Fixes: 6b2bbb08747a5 ("media: cec: rework the cec event handling")
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

idr: remove WARN_ON_ONCE() when trying to replace negative ID

IDR only supports non-negative IDs.  There used to be a 'WARN_ON_ONCE(id <
0)' in idr_replace(), but it was intentionally removed by commit
2e1c9b286765 ("idr: remove WARN_ON_ONCE() on negative IDs").

Then it was added back by commit 0a835c4f090a ("Reimplement IDR and IDA
using the radix tree").  However it seems that adding it back was a
mistake, given that some users such as drm_gem_handle_delete()
(DRM_IOCTL_GEM_CLOSE) pass in a value from userspace to idr_replace(),
allowing the WARN_ON_ONCE to be triggered.  drm_gem_handle_delete()
actually just wants idr_replace() to return an error code if the ID is
not allocated, including in the case where the ID is invalid (negative).

So once again remove the bogus WARN_ON_ONCE().

This bug was found by syzkaller, which encountered the following
warning:

    WARNING: CPU: 3 PID: 3008 at lib/idr.c:157 idr_replace+0x1d8/0x240 lib/idr.c:157
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 3 PID: 3008 Comm: syzkaller218828 Not tainted 4.13.0-rc4-next-20170811 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
     fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:190
     do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
     do_trap+0x260/0x390 arch/x86/kernel/traps.c:273
     do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:310
     do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:323
     invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:930
    RIP: 0010:idr_replace+0x1d8/0x240 lib/idr.c:157
    RSP: 0018:ffff8800394bf9f8 EFLAGS: 00010297
    RAX: ffff88003c6c60c0 RBX: 1ffff10007297f43 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8800394bfa78
    RBP: ffff8800394bfae0 R08: ffffffff82856487 R09: 0000000000000000
    R10: ffff8800394bf9a8 R11: ffff88006c8bae28 R12: ffffffffffffffff
    R13: ffff8800394bfab8 R14: dffffc0000000000 R15: ffff8800394bfbc8
     drm_gem_handle_delete+0x33/0xa0 drivers/gpu/drm/drm_gem.c:297
     drm_gem_close_ioctl+0xa1/0xe0 drivers/gpu/drm/drm_gem.c:671
     drm_ioctl_kernel+0x1e7/0x2e0 drivers/gpu/drm/drm_ioctl.c:729
     drm_ioctl+0x72e/0xa50 drivers/gpu/drm/drm_ioctl.c:825
     vfs_ioctl fs/ioctl.c:45 [inline]
     do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
     SYSC_ioctl fs/ioctl.c:700 [inline]
     SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
     entry_SYSCALL_64_fastpath+0x1f/0xbe

Here is a C reproducer:

    #include <fcntl.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <drm/drm.h>

    int main(void)
    {
            int cardfd = open("/dev/dri/card0", O_RDONLY);

            ioctl(cardfd, DRM_IOCTL_GEM_CLOSE,
                  &(struct drm_gem_close) { .handle = -1 } );
    }

Link: http://lkml.kernel.org/r/20170906235306.20534-1-ebiggers3@gmail.com
Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: <stable@vger.kernel.org> [v4.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sctp: potential read out of bounds in sctp_ulpevent_type_enabled()

This code causes a static checker warning because Smatch doesn't trust
anything that comes from skb->data. I've reviewed this code and I do
think skb->data can be controlled by the user here.

The sctp_event_subscribe struct has 13 __u8 fields and we want to see
if ours is non-zero. sn_type can be any value in the 0-USHRT_MAX range.
We're subtracting SCTP_SN_TYPE_BASE which is 1 << 15 so we could read
either before the start of the struct or after the end.

This is a very old bug and it's surprising that it would go undetected
for so long but my theory is that it just doesn't have a big impact so
it would be hard to notice.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

i2c: altera: Add Altera I2C Controller driver

Add driver support for the Altera I2C Controller. The I2C
controller is soft IP for use in FPGAs.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

dt-bindings: i2c: Add Altera I2C Controller

Add the documentation to support the Altera synthesizable
logic I2C Controller in FPGA.

Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

MAINTAINERS: review Renesas DT bindings as well

When adding myself as a reviewer for the Renesas Ethernet drivers
I somehow forgot about the bindings -- I want to review them as well.

Fixes: 8e6569af3a1b ("MAINTAINERS: add myself as Renesas Ethernet drivers reviewer")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

um: return negative in tuntap_open_tramp()

The intention is to return negative error codes. "pid" is already
negative but we accidentally negate it again back to positive.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: remove a stray tab

Static checkers would urge us to add curly braces to this code, but
actually the code works correctly. It just isn't indented right.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: Use relative modversions with LD_SCRIPT_DYN

When building a dynamic kernel image use relative symbols with MODVERSIONS.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: link vmlinux with -no-pie

Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option.
Link UML dynamic kernel image also with -no-pie to fix the build.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: Fix CONFIG_GCOV for modules.

Explicitly export symbols so modpost doesn't complain.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

Fix minor typos and grammar in UML start_up help

Signed-off-by: James Pack <jpack61108@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

um: defconfig: Cleanup from old Kconfig options

Remove old, dead Kconfig option INET_LRO. It is gone since
commit 7bbf3cae65b6 ("ipv4: Remove inet_lro library").

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>

net_sched: gen_estimator: fix scaling error in bytes/packets samples

Denys reported wrong rate estimations with HTB classes.

It appears the bug was added in linux-4.10, since my tests
where using intervals of one second only.

HTB using 4 sec default rate estimators, reported rates
were 4x higher.

We need to properly scale the bytes/packets samples before
integrating them in EWMA.

Tested:
echo 1 >/sys/module/sch_htb/parameters/htb_rate_est

Setup HTB with one class with a rate/cail of 5Gbit

Generate traffic on this class

tc -s -d cl sh dev eth0 classid 7002:11
class htb 7002:11 parent 7002:1 prio 5 quantum 200000 rate 5Gbit ceil
5Gbit linklayer ethernet burst 80000b/1 mpu 0b cburst 80000b/1 mpu 0b
level 0 rate_handle 1
Sent 1488215421648 bytes 982969243 pkt (dropped 0, overlimits 0
requeues 0)
rate 5Gbit 412814pps backlog 136260b 2p requeues 0
TCP pkts/rtx 982969327/45 bytes 1488215557414/68130
lended: 22732826 borrowed: 0 giants: 0
tokens: -1684 ctokens: -1684

Fixes: 1c0d32fde5bd ("net_sched: gen_estimator: complete rewrite of rate estimators")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'nfp-card-init'

Jakub Kicinski says:

====================
nfp: wait more carefully for card init

The first patch is a small fix for flower offload, we need a whitelist
of supported matches, otherwise the unsupported ones will be ignored.

The second and the third patch are adding wait/polling to the probe path.
We had reports of driver failing probe because it couldn't find the
control process (NSP) on the card. Turns out the NSP will only announce
its existence after it's fully initialized. Until now we assumed it
will be reachable, just not processing commands (hence we wait for
a NOOP command to execute successfully).

v2:
- fix a bad merge which resulted in a build warning and retest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: wait for the NSP resource to appear on boot

The control process (NSP) may take some time to complete its
initialization.  This is not a problem on most servers, but
on very fast-booting machines it may not be ready for operation
when driver probes the device.  There is also a version of the
flash in the wild where NSP tries to train the links as part
of init.  To wait for NSP initialization we should make sure
its resource has already been added to the resource table.
NSP adds itself there as last step of init.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: wait for board state before talking to the NSP

Board state informs us which low-level initialization stages the card
has completed. We should wait for the card to be fully initialized
before trying to communicate with it, not only before we configure
passing traffic.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: add whitelist of supported flow dissector

Previously we did not check the flow dissector against a list of allowed
and supported flow key dissectors. This patch introduces such a list and
correctly rejects unsupported flow keys.

Fixes: 43f84b72c50d ("nfp: add metadata to each flow offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

um: Fix FP register size for XSTATE/XSAVE

Hard code max size. Taken from
https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/common/x86-xstate.h

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

UBI: Fix two typos in comments

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>

ubi: fastmap: fix spelling mistake: "invalidiate" -> "invalidate"

Trivial fix to spelling mistake in ubi_err error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>

ubi: pr_err() strings should end with newlines

In build.c, the following pr_err calls should be terminated with
a new-line to avoid other messages being concatenated onto the end.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

ubi: pr_err() strings should end with newlines

In ubi_attach_mtd_dev() the pr_err() calls should have their
messgaes terminated with a new-line to avoid other messages
being concatenated onto the end.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

ubi: pr_err() strings should end with newlines

The ubi_init() function has a few error paths that use the
pr_err() to output errors. These should have new lines on
them as pr_err() does not automatically do this.

This fixes issues where if multiple mtd fail to bind to
ubi the console output starts wrapping around.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"A handful of tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf stat: Wait for the correct child
  perf tools: Support running perf binaries with a dash in their name
  perf config: Check not only section->from_system_config but also item's
  perf ui progress: Fix progress update
  perf ui progress: Make sure we always define step value
  perf tools: Open perf.data with O_CLOEXEC flag
  tools lib api: Fix make DEBUG=1 build
  perf tests: Fix compile when libunwind's unwind.h is available
  tools include linux: Guard against redefinition of some macros

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"Three CPU hotplug related fixes and a debugging improvement"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/debug: Add debugfs knob for "sched_debug"
  sched/core: WARN() when migrating to an offline CPU
  sched/fair: Plug hole between hotplug and active_load_balance()
  sched/fair: Avoid newidle balance for !active CPUs

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"The main changes are the PCID fixes from Andy, but there's also two
  hyperv fixes and two paravirt updates"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definition
  x86/hyper-V: Allocate the IDT entry early in boot
  paravirt: Switch maintainer
  x86/paravirt: Remove no longer used paravirt functions
  x86/mm/64: Initialize CR4.PCIDE early
  x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3
  x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off()

Merge tag 'openrisc-for-linus' of git://github.com/openrisc/linux

Pull OpenRISC fixlet from Stafford Horne:
"Fix warning for upcoming work to remove linux/vmalloc.h from
asm-generic/io.h"

* tag 'openrisc-for-linus' of git://github.com/openrisc/linux:
openrisc: add forward declaration for struct vm_area_struct

Merge tag 'modules-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux

Pull modules updates from Jessica Yu:
"Summary of modules changes for the 4.14 merge window:

   - minor code cleanups and fixes

   - modpost: avoid building modules that have names that exceed the
     size of the name field in struct module"

* tag 'modules-for-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: Remove const attribute from alias for MODULE_DEVICE_TABLE
  module: fix ddebug_remove_module()
  modpost: abort if module name is too long

Fix up MAINTAINERS file sorting

Another merge window, another MAINTAINERS file disaster.

People have serious problems with the alphabet and sorting, and poor
Jérôme Glisse and Radim Krčmář get their names mangled by locale issues,
turning them into some mangled mess (probably others do too, but those
two stood out when sorting things again).

And we now have two copies of the same 'AS3645A LED FLASH CONTROLLER
DRIVER' in the tree and in the MAINTAINERS file, but that's a separate
issue - the duplication is real, and I left them as two entries for the
same name.

This does not try to sort the actual section pattern entries, although I
may end up doing that later.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk updates from Stephen Boyd:
"The diff is dominated by the Allwinner A10/A20 SoCs getting converted
  to the sunxi-ng framework. Otherwise, the heavy hitters are various
  drivers for SoCs like AT91, Amlogic, Renesas, and Rockchip. There are
  some other new clk drivers in here too but overall this is just a
  bunch of clk drivers for various different pieces of hardware and a
  collection of non-critical fixes for clk drivers.

  New Drivers:
   - Allwinner R40 SoCs
   - Renesas R-Car Gen3 USB 2.0 clock selector PHY
   - Atmel AT91 audio PLL
   - Uniphier PXs3 SoCs
   - ARC HSDK Board PLLs
   - AXS10X Board PLLs
   - STMicroelectronics STM32H743 SoCs

  Removed Drivers:
   - Non-compiling mb86s7x support

  Updates:
   - Allwinner A10/A20 SoCs converted to sunxi-ng framework
   - Allwinner H3 CPU clk fixes
   - Renesas R-Car D3 SoC
   - Renesas V2H and M3-W modules
   - Samsung Exynos5420/5422/5800 audio fixes
   - Rockchip fractional clk approximation fixes
   - Rockchip rk3126 SoC support within the rk3128 driver
   - Amlogic gxbb CEC32 and sd_emmc clks
   - Amlogic meson8b reset controller support
   - IDT VersaClock 5P49V5925/5P49V6901 support
   - Qualcomm MSM8996 SMMU clks
   - Various 'const' applications for struct clk_ops
   - si5351 PLL reset bugfix
   - Uniphier audio on LD11/LD20 and ethernet support on LD11/LD20/Pro4/PXs2
   - Assorted Tegra clk driver fixes"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (120 commits)
  clk: si5351: fix PLL reset
  ASoC: atmel-classd: remove aclk clock
  ASoC: atmel-classd: remove aclk clock from DT binding
  clk: at91: clk-generated: make gclk determine audio_pll rate
  clk: at91: clk-generated: create function to find best_diff
  clk: at91: add audio pll clock drivers
  dt-bindings: clk: at91: add audio plls to the compatible list
  clk: at91: clk-generated: remove useless divisor loop
  clk: mb86s7x: Drop non-building driver
  clk: ti: check for null return in strrchr to avoid null dereferencing
  clk: Don't write error code into divider register
  clk: uniphier: add video input subsystem clock
  clk: uniphier: add audio system clock
  clk: stm32h7: Add stm32h743 clock driver
  clk: gate: expose clk_gate_ops::is_enabled
  clk: nxp: clk-lpc32xx: rename clk_gate_is_enabled()
  clk: uniphier: add PXs3 clock data
  clk: hi6220: change watchdog clock source
  clk: Kconfig: Name RK805 in Kconfig for COMMON_CLK_RK808
  clk: cs2000: Add cs2000_set_saved_rate
  ...

Merge tag 'rtc-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
"Subsystem:
   - remove .open() and .release() RTC ops
   - constify i2c_device_id

  New driver:
   - Realtek RTD1295
   - Android emulator (goldfish) RTC

  Drivers:
   - ds1307: Beginning of a huge cleanup
   - s35390a: handle invalid RTC time
   - sun6i: external oscillator gate support"

* tag 'rtc-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (40 commits)
  rtc: ds1307: use octal permissions
  rtc: ds1307: fix braces
  rtc: ds1307: fix alignments and blank lines
  rtc: ds1307: use BIT
  rtc: ds1307: use u32
  rtc: ds1307: use sizeof
  rtc: ds1307: remove regs member
  rtc: Add Realtek RTD1295
  dt-bindings: rtc: Add Realtek RTD1295
  rtc: sun6i: Add support for the external oscillator gate
  rtc: goldfish: Add RTC driver for Android emulator
  dt-bindings: Add device tree binding for Goldfish RTC driver
  rtc: ds1307: add basic support for ds1341 chip
  rtc: ds1307: remove member nvram_offset from struct ds1307
  rtc: ds1307: factor out offset to struct chip_desc
  rtc: ds1307: factor out rtc_ops to struct chip_desc
  rtc: ds1307: factor out irq_handler to struct chip_desc
  rtc: ds1307: improve irq setup
  rtc: ds1307: constify struct chip_desc variables
  rtc: ds1307: improve trickle charger initialization
  ...

Merge tag 'sound-fix-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Most of the commits are trivial cleanup patches, while one commit is a
  significant fix for the race at ALSA sequencer that was spotted by
  syzkaller"

* tag 'sound-fix-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq: Cancel pending autoload work at unbinding device
  ALSA: firewire: Use common error handling code in snd_motu_stream_start_duplex()
  ALSA: asihpi: Kill BUG_ON() usages
  ALSA: core: Use %pS printk format for direct addresses
  ALSA: ymfpci: Use common error handling code in snd_ymfpci_create()
  ALSA: ymfpci: Use common error handling code in snd_card_ymfpci_probe()
  ALSA: 6fire: Use common error handling code in usb6fire_chip_probe()
  ALSA: usx2y: Use common error handling code in submit_urbs()
  ALSA: us122l: Use common error handling code in us122l_create_card()
  ALSA: hdspm: Use common error handling code in snd_hdspm_probe()
  ALSA: rme9652: Use common code in hdsp_get_iobox_version()
  ALSA: maestro3: Use common error handling code in two functions

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"A tiny update: one patch corrects a Kconfig problem with the shift of
  the SAS SMP code to BSG and the other removes a vestige of user space
  target mode"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: scsi_transport_sas: select BLK_DEV_BSGLIB
  scsi: Remove Scsi_Host.uspace_req_q

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Small collection of fixes that would be nice to have in -rc1. This
  contains:

   - NVMe pull request form Christoph, mostly with fixes for nvme-pci,
     host memory buffer in particular.

   - Error handling fixup for cgwb_create(), in case allocation of 'wb'
     fails. From Christophe Jaillet.

   - Ensure that trace_block_getrq() gets the 'dev' in an appropriate
     fashion, to avoid a potential NULL deref. From Greg Thelen.

   - Regression fix for dm-mq with blk-mq, fixing a problem with
     stacking IO schedulers. From me.

   - string.h fixup, fixing an issue with memcpy_and_pad(). This
     original change came in through an NVMe dependency, which is why
     I'm including it here. From Martin Wilck.

   - Fix potential int overflow in __blkdev_sectors_to_bio_pages(), from
     Mikulas.

   - MBR enable fix for sed-opal, from Scott"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: directly insert blk-mq request from blk_insert_cloned_request()
  mm/backing-dev.c: fix an error handling path in 'cgwb_create()'
  string.h: un-fortify memcpy_and_pad
  nvme-pci: implement the HMB entry number and size limitations
  nvme-pci: propagate (some) errors from host memory buffer setup
  nvme-pci: use appropriate initial chunk size for HMB allocation
  nvme-pci: fix host memory buffer allocation fallback
  nvme: fix lightnvm check
  block: fix integer overflow in __blkdev_sectors_to_bio_pages()
  block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled
  block: tolerate tracing of NULL bio

Merge tag 'docs-4.14' of git://git.lwn.net/linux

Pull documentation fixes from Jonathan Corbet:
"A cleanup from Mauro that needed to wait for the media pull, plus a
  handful of other fixes that wandered in"

* tag 'docs-4.14' of git://git.lwn.net/linux:
  kokr/memory-barriers.txt: Apply atomic_t.txt change
  kokr/doc: Update memory-barriers.txt for read-to-write dependencies
  docs-rst: don't require adjustbox anymore
  docs-rst: conf.py: only setup notice box colors if Sphinx < 1.6
  docs-rst: conf.py: remove lscape from LaTeX preamble

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:
"This fixes a regression (spotted by the Sandstorm.io folks) in the pid
  namespace handling introduced in 4.12.

  There's also a fix for honoring sync/dsync flags for pwritev2()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: getattr cleanup
  fuse: honor iocb sync flags on write
  fuse: allow server to run in different pid_ns

net: sched: fix use-after-free in tcf_action_destroy and tcf_del_walker

Recent commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu") removed
freeing in call_rcu, which changed already existing hard-to-hit
race condition into 100% hit:

[  598.599825] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  598.607782] IP: tcf_action_destroy+0xc0/0x140

Or:

[   40.858924] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[   40.862840] IP: tcf_generic_walker+0x534/0x820

Fix this by storing the ops and use them directly for module_put call.

Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

KVM: Add struct kvm_vcpu pointer parameter to get_enable_apicv()

Modify struct kvm_x86_ops.arch.apicv_active() to take struct kvm_vcpu
pointer as parameter in preparation to subsequent changes.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: Refactor AVIC vcpu initialization into avic_init_vcpu()

Preparing the base code for subsequent changes. This does not change
existing logic.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

be2net: fix TSO6/GSO issue causing TX-stall on Lancer/BEx

IPv6 TSO requests with extension hdrs are a problem to the
Lancer and BEx chips. Workaround is to disable TSO6 feature
for such packets.

Also in Lancer chips, MSS less than 256 was resulting in TX stall.
Fix this by disabling GSO when MSS less than 256.

Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs updates from Miklos Szeredi:
"This fixes d_ino correctness in readdir, which brings overlayfs on par
  with normal filesystems regarding inode number semantics, as long as
  all layers are on the same filesystem.

  There are also some bug fixes, one in particular (random ioctl's
  shouldn't be able to modify lower layers) that touches some vfs code,
  but of course no-op for non-overlay fs"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix false positive ESTALE on lookup
  ovl: don't allow writing ioctl on lower layer
  ovl: fix relatime for directories
  vfs: add flags to d_real()
  ovl: cleanup d_real for negative
  ovl: constant d_ino for non-merge dirs
  ovl: constant d_ino across copy up
  ovl: fix readdir error value
  ovl: check snprintf return

KVM: x86: fix clang build

Clang resolves __builtin_constant_p() to false even if the expression is
constant in the end. The only purpose of that expression was to
differentiate a case where the following expression couldn't be checked
at compile-time, so we can just remove the check.

Clang handles the following two correctly. Turn it into BUG_ON if there
are any more problems with this.

Fixes: d6321d493319 ("KVM: x86: generalize guest_cpuid_has_ helpers")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: fix rcu warning on VM_CREATE errors

commit 3898da947bba ("KVM: avoid using rcu_dereference_protected") can
trigger the following lockdep/rcu splat if the VM_CREATE ioctl fails,
for example if kvm_arch_init_vm fails:

WARNING: suspicious RCU usage
4.13.0+ #105 Not tainted
-----------------------------
./include/linux/kvm_host.h:481 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
no locks held by qemu-system-s39/79.
stack backtrace:
CPU: 0 PID: 79 Comm: qemu-system-s39 Not tainted 4.13.0+ #105
Hardware name: IBM 2964 NC9 704 (KVM/Linux)
Call Trace:
([<00000000001140b2>] show_stack+0xea/0xf0)
[<00000000008a68a4>] dump_stack+0x94/0xd8
[<0000000000134c12>] kvm_dev_ioctl+0x372/0x7a0
[<000000000038f940>] do_vfs_ioctl+0xa8/0x6c8
[<0000000000390004>] SyS_ioctl+0xa4/0xb8
[<00000000008c7a8c>] system_call+0xc4/0x27c
no locks held by qemu-system-s39/79.

We have to reset the just created users_count back to 0 to
tell the check to not trigger.

Reported-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 3898da947bba ("KVM: avoid using rcu_dereference_protected")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: Fix immediate_exit handling for uninitialized AP

When user space sets kvm_run->immediate_exit, KVM is supposed to
return quickly. However, when a vCPU is in KVM_MP_STATE_UNINITIALIZED,
the value is not considered and the vCPU blocks.

Fix that oversight.

Fixes: 460df4c1fc7c008 ("KVM: race-free exit from KVM_RUN without POSIX signals")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: Fix handling of pending signal on uninitialized AP

KVM API says that KVM_RUN will return with -EINTR when a signal is
pending. However, if a vCPU is in KVM_MP_STATE_UNINITIALIZED, then
the return value is unconditionally -EAGAIN.

Copy over some code from vcpu_run(), so that the case of a pending
signal results in the expected return value.

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: SVM: Add a missing 'break' statement

Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Fixes: f6511935f424 ("KVM: SVM: Add checks for IO instructions")
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

KVM: x86: Remove .get_pkru() from kvm_x86_ops

The commit

9dd21e104bc ('KVM: x86: simplify handling of PKRU')

removed all users and providers of that call-back, but
didn't remove it. Remove it now.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>

x86/hyper-v: Remove duplicated HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED definition

Commits:

7dcf90e9e032 ("PCI: hv: Use vPCI protocol version 1.2")
628f54cc6451 ("x86/hyper-v: Support extended CPU ranges for TLB flush hypercalls")

added the same definition and they came in through different trees.
Fix the duplication.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: devel@linuxdriverproject.org
Link: http://lkml.kernel.org/r/20170911150620.3998-1-vkuznets@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/hyper-V: Allocate the IDT entry early in boot

Allocate the hypervisor callback IDT entry early in the boot sequence.

The previous code would allocate the entry as part of registering the handler
when the vmbus driver loaded, and this caused a problem for the IDT cleanup
that Thomas is working on for v4.15.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: apw@canonical.com
Cc: devel@linuxdriverproject.org
Cc: gregkh@linuxfoundation.org
Cc: jasowang@redhat.com
Cc: olaf@aepfle.de
Link: http://lkml.kernel.org/r/20170908231557.2419-1-kys@exchange.microsoft.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

paravirt: Switch maintainer

Jeremy Fitzhardinge is stepping down as a paravirt maintainer. I'll
replace him.

While at it, update the file list to the actual pattern.

Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: chrisw@sous-sol.org
Cc: jeremy@goop.org
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170905143407.9227-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/paravirt: Remove no longer used paravirt functions

With removal of lguest some of the paravirt functions are no longer
needed:

->read_cr4()
->store_idt()
->set_pmd_at()
->set_pud_at()
->pte_update()

Remove them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: chrisw@sous-sol.org
Cc: jeremy@goop.org
Cc: rusty@rustcorp.com.au
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20170904102527.25409-1-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/mm/64: Initialize CR4.PCIDE early

cpu_init() is weird: it's called rather late (after early
identification and after most MMU state is initialized) on the boot
CPU but is called extremely early (before identification) on secondary
CPUs.  It's called just late enough on the boot CPU that its CR4 value
isn't propagated to mmu_cr4_features.

Even if we put CR4.PCIDE into mmu_cr4_features, we'd hit two
problems.  First, we'd crash in the trampoline code.  That's
fixable, and I tried that.  It turns out that mmu_cr4_features is
totally ignored by secondary_start_64(), though, so even with the
trampoline code fixed, it wouldn't help.

This means that we don't currently have CR4.PCIDE reliably initialized
before we start playing with cpu_tlbstate.  This is very fragile and
tends to cause boot failures if I make even small changes to the TLB
handling code.

Make it more robust: initialize CR4.PCIDE earlier on the boot CPU
and propagate it to secondary CPUs in start_secondary().

( Yes, this is ugly.  I think we should have improved mmu_cr4_features
  to actually control CR4 during secondary bootup, but that would be
  fairly intrusive at this stage. )

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/hibernate/64: Mask off CR3's PCID bits in the saved CR3

Jiri reported a resume-from-hibernation failure triggered by PCID.
The root cause appears to be rather odd.  The hibernation asm
restores a CR3 value that comes from the image header.  If the image
kernel has PCID on, it's entirely reasonable for this CR3 value to
have one of the low 12 bits set.  The restore code restores it with
CR4.PCIDE=0, which means that those low 12 bits are accepted by the
CPU but are either ignored or interpreted as a caching mode.  This
is odd, but still works.  We blow up later when the image kernel
restores CR4, though, since changing CR4.PCIDE with CR3[11:0] != 0
is illegal.  Boom!

FWIW, it's entirely unclear to me what's supposed to happen if a PAE
kernel restores a non-PAE image or vice versa.  Ditto for LA57.

Reported-by: Jiri Kosina <jikos@kernel.org>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Link: http://lkml.kernel.org/r/18ca57090651a6341e97083883f9e814c4f14684.1504847163.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/mm: Get rid of VM_BUG_ON in switch_tlb_irqs_off()

If we hit the VM_BUG_ON(), we're detecting a genuinely bad situation,
but we're very unlikely to get a useful call trace.

Make it a warning instead.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3b4e06bbb382ca54a93218407c93925ff5871546.1504847163.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-urgent-for-mingo-4.14-20170912' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix TUI progress bar when delta from new total from that of the
  previous update is greater than the progress "step" (screen width
  progress bar block))  (Jiri Olsa)

- Make tools/lib/api make DEBUG=1 build use -D_FORTIFY_SOURCE=2 not
  to cripple debuginfo, just like tools/perf/ does (Jiri Olsa)

- Avoid leaking the 'perf.data' file to workloads started from the
  'perf record' command line by using the O_CLOEXEC open flag (Jiri Olsa)

- Fix building when libunwind's 'unwind.h' file is present in the
  include path, clashing with tools/perf/util/unwind.h (Milian Wolff)

- Check per .perfconfig section entry flag, not just per section (Taeung Song)

- Support running perf binaries with a dash in their name, needed to
  run perf as an AppImage (Milian Wolff)

- Wait for the right child by using waitpid() when running workloads
  from 'perf stat', also to fix using perf as an AppImage (Milian Wolff)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge branch 'drm-next-4.14' of git://people.freedesktop.org/~agd5f/linux into drm-next

A few fixes for 4.14. Nothing too major.

w90p910_ether: include linux/interrupt.h

A randconfig build caused a compile failure:

drivers/net/ethernet/nuvoton/w90p910_ether.c: In function 'w90p910_ether_close':
drivers/net/ethernet/nuvoton/w90p910_ether.c:580:2: error: implicit declaration of function 'free_irq'; did you mean 'free_uid'? [-Werror=implicit-function-declaration]

Adding the correct include fixes the problem.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bonding: fix tlb_dynamic_lb default value

Commit 8b426dc54cf4 ("bonding: remove hardcoded value") changed the
default value for tlb_dynamic_lb which lead to either broken ALB mode
(since tlb_dynamic_lb can be changed only in TLB) or setting TLB mode
with tlb_dynamic_lb equal to 0.
The first issue was recently fixed by setting tlb_dynamic_lb to 1 always
when switching to ALB mode, but the default value is still wrong and
we'll enter TLB mode with tlb_dynamic_lb equal to 0 if the mode is
changed via netlink or sysfs. In order to restore the previous behaviour
and default value simply remove the mode check around the default param
initialization for tlb_dynamic_lb which will always set it to 1 as
before.

Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip6_tunnel: fix ip6 tunnel lookup in collect_md mode

In collect_md mode, if the tun dev is down, it still can call
__ip6_tnl_rcv to receive on packets, and the rx statistics increase
improperly.

When the md tunnel is down, it's not neccessary to increase RX drops
for the tunnel device, packets would be recieved on fallback tunnel,
and the RX drops on fallback device will be increased as expected.

Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ip_tunnel: fix ip tunnel lookup in collect_md mode

In collect_md mode, if the tun dev is down, it still can call
ip_tunnel_rcv to receive on packets, and the rx statistics increase
improperly.

When the md tunnel is down, it's not neccessary to increase RX drops
for the tunnel device, packets would be recieved on fallback tunnel,
and the RX drops on fallback device will be increased as expected.

Fixes: 2e15ea390e6f ("ip_gre: Add support to collect tunnel metadata.")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>