]> git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/log
mirror_ubuntu-focal-kernel.git
4 years agoipv6: Fix use of anycast address with loopback
David Ahern [Tue, 7 Jul 2020 13:39:24 +0000 (07:39 -0600)]
ipv6: Fix use of anycast address with loopback

BugLink: https://bugs.launchpad.net/bugs/1888560
[ Upstream commit aea23c323d89836bcdcee67e49def997ffca043b ]

Thomas reported a regression with IPv6 and anycast using the following
reproducer:

    echo 1 >  /proc/sys/net/ipv6/conf/all/forwarding
    ip -6 a add fc12::1/16 dev lo
    sleep 2
    echo "pinging lo"
    ping6 -c 2 fc12::

The conversion of addrconf_f6i_alloc to use ip6_route_info_create missed
the use of fib6_is_reject which checks addresses added to the loopback
interface and sets the REJECT flag as needed. Update fib6_is_reject for
loopback checks to handle RTF_ANYCAST addresses.

Fixes: c7a1ce397ada ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create")
Reported-by: thomas.gambier@nexedi.com
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoipv6: fib6_select_path can not use out path for nexthop objects
David Ahern [Mon, 6 Jul 2020 17:45:07 +0000 (11:45 -0600)]
ipv6: fib6_select_path can not use out path for nexthop objects

BugLink: https://bugs.launchpad.net/bugs/1888560
[ Upstream commit 34fe5a1cf95c3f114068fc16d919c9cf4b00e428 ]

Brian reported a crash in IPv6 code when using rpfilter with a setup
running FRR and external nexthop objects. The root cause of the crash
is fib6_select_path setting fib6_nh in the result to NULL because of
an improper check for nexthop objects.

More specifically, rpfilter invokes ip6_route_lookup with flowi6_oif
set causing fib6_select_path to be called with have_oif_match set.
fib6_select_path has early check on have_oif_match and jumps to the
out label which presumes a builtin fib6_nh. This path is invalid for
nexthop objects; for external nexthops fib6_select_path needs to just
return if the fib6_nh has already been set in the result otherwise it
returns after the call to nexthop_path_fib6_result. Update the check
on have_oif_match to not bail on external nexthops.

Update selftests for this problem.

Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info")
Reported-by: Brian Rak <brak@choopa.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg
Sabrina Dubroca [Fri, 3 Jul 2020 15:00:32 +0000 (17:00 +0200)]
ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg

BugLink: https://bugs.launchpad.net/bugs/1888560
[ Upstream commit 5eff06902394425c722f0a44d9545909a8800f79 ]

IPv4 ping sockets don't set fl4.fl4_icmp_{type,code}, which leads to
incomplete IPsec ACQUIRE messages being sent to userspace. Currently,
both raw sockets and IPv6 ping sockets set those fields.

Expected output of "ip xfrm monitor":
    acquire proto esp
      sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 8 code 0 dev ens4
      policy src 10.0.2.15/32 dst 8.8.8.8/32
        <snip>

Currently with ping sockets:
    acquire proto esp
      sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 0 code 0 dev ens4
      policy src 10.0.2.15/32 dst 8.8.8.8/32
        <snip>

The Libreswan test suite found this problem after Fedora changed the
value for the sysctl net.ipv4.ping_group_range.

Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Paul Wouters <pwouters@redhat.com>
Tested-by: Paul Wouters <pwouters@redhat.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agogenetlink: remove genl_bind
Sean Tranchetti [Tue, 30 Jun 2020 17:50:17 +0000 (11:50 -0600)]
genetlink: remove genl_bind

BugLink: https://bugs.launchpad.net/bugs/1888560
[ Upstream commit 1e82a62fec613844da9e558f3493540a5b7a7b67 ]

A potential deadlock can occur during registering or unregistering a
new generic netlink family between the main nl_table_lock and the
cb_lock where each thread wants the lock held by the other, as
demonstrated below.

1) Thread 1 is performing a netlink_bind() operation on a socket. As part
   of this call, it will call netlink_lock_table(), incrementing the
   nl_table_users count to 1.
2) Thread 2 is registering (or unregistering) a genl_family via the
   genl_(un)register_family() API. The cb_lock semaphore will be taken for
   writing.
3) Thread 1 will call genl_bind() as part of the bind operation to handle
   subscribing to GENL multicast groups at the request of the user. It will
   attempt to take the cb_lock semaphore for reading, but it will fail and
   be scheduled away, waiting for Thread 2 to finish the write.
4) Thread 2 will call netlink_table_grab() during the (un)registration
   call. However, as Thread 1 has incremented nl_table_users, it will not
   be able to proceed, and both threads will be stuck waiting for the
   other.

genl_bind() is a noop, unless a genl_family implements the mcast_bind()
function to handle setting up family-specific multicast operations. Since
no one in-tree uses this functionality as Cong pointed out, simply removing
the genl_bind() function will remove the possibility for deadlock, as there
is no attempt by Thread 1 above to take the cb_lock semaphore.

Fixes: c380d9a7afff ("genetlink: pass multicast bind/unbind to families")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Johannes Berg <johannes.berg@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agobridge: mcast: Fix MLD2 Report IPv6 payload length check
Linus Lüssing [Sun, 5 Jul 2020 19:10:17 +0000 (21:10 +0200)]
bridge: mcast: Fix MLD2 Report IPv6 payload length check

BugLink: https://bugs.launchpad.net/bugs/1888560
[ Upstream commit 5fc6266af7b427243da24f3443a50cd4584aac06 ]

Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in
igmp3/mld2 report handling") introduced a bug in the IPv6 header payload
length check which would potentially lead to rejecting a valid MLD2 Report:

The check needs to take into account the 2 bytes for the "Number of
Sources" field in the "Multicast Address Record" before reading it.
And not the size of a pointer to this field.

Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling")
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agonet: rmnet: fix lower interface leak
Taehee Yoo [Thu, 2 Jul 2020 17:08:18 +0000 (17:08 +0000)]
net: rmnet: fix lower interface leak

BugLink: https://bugs.launchpad.net/bugs/1888560
commit 2a762e9e8cd1cf1242e4269a2244666ed02eecd1 upstream.

There are two types of the lower interface of rmnet that are VND
and BRIDGE.
Each lower interface can have only one type either VND or BRIDGE.
But, there is a case, which uses both lower interface types.
Due to this unexpected behavior, lower interface leak occurs.

Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add rmnet0 link dummy0 type rmnet mux_id 1
    ip link set dummy1 master rmnet0
    ip link add rmnet1 link dummy1 type rmnet mux_id 2
    ip link del rmnet0

The dummy1 was attached as BRIDGE interface of rmnet0.
Then, it also was attached as VND interface of rmnet1.
This is unexpected behavior and there is no code for handling this case.
So that below splat occurs when the rmnet0 interface is deleted.

Splat looks like:
[   53.254112][    C1] WARNING: CPU: 1 PID: 1192 at net/core/dev.c:8992 rollback_registered_many+0x986/0xcf0
[   53.254117][    C1] Modules linked in: rmnet dummy openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nfx
[   53.254182][    C1] CPU: 1 PID: 1192 Comm: ip Not tainted 5.8.0-rc1+ #620
[   53.254188][    C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   53.254192][    C1] RIP: 0010:rollback_registered_many+0x986/0xcf0
[   53.254200][    C1] Code: 41 8b 4e cc 45 31 c0 31 d2 4c 89 ee 48 89 df e8 e0 47 ff ff 85 c0 0f 84 cd fc ff ff 0f 0b e5
[   53.254205][    C1] RSP: 0018:ffff888050a5f2e0 EFLAGS: 00010287
[   53.254214][    C1] RAX: ffff88805756d658 RBX: ffff88804d99c000 RCX: ffffffff8329d323
[   53.254219][    C1] RDX: 1ffffffff0be6410 RSI: 0000000000000008 RDI: ffffffff85f32080
[   53.254223][    C1] RBP: dffffc0000000000 R08: fffffbfff0be6411 R09: fffffbfff0be6411
[   53.254228][    C1] R10: ffffffff85f32087 R11: 0000000000000001 R12: ffff888050a5f480
[   53.254233][    C1] R13: ffff88804d99c0b8 R14: ffff888050a5f400 R15: ffff8880548ebe40
[   53.254238][    C1] FS:  00007f6b86b370c0(0000) GS:ffff88806c200000(0000) knlGS:0000000000000000
[   53.254243][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.254248][    C1] CR2: 0000562c62438758 CR3: 000000003f600005 CR4: 00000000000606e0
[   53.254253][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   53.254257][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   53.254261][    C1] Call Trace:
[   53.254266][    C1]  ? lockdep_hardirqs_on_prepare+0x379/0x540
[   53.254270][    C1]  ? netif_set_real_num_tx_queues+0x780/0x780
[   53.254275][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[   53.254279][    C1]  ? __kasan_slab_free+0x126/0x150
[   53.254283][    C1]  ? kfree+0xdc/0x320
[   53.254288][    C1]  ? rmnet_unregister_real_device+0x56/0x90 [rmnet]
[   53.254293][    C1]  unregister_netdevice_many.part.135+0x13/0x1b0
[   53.254297][    C1]  rtnl_delete_link+0xbc/0x100
[   53.254301][    C1]  ? rtnl_af_register+0xc0/0xc0
[   53.254305][    C1]  rtnl_dellink+0x2dc/0x840
[   53.254309][    C1]  ? find_held_lock+0x39/0x1d0
[   53.254314][    C1]  ? valid_fdb_dump_strict+0x620/0x620
[   53.254318][    C1]  ? rtnetlink_rcv_msg+0x457/0x890
[   53.254322][    C1]  ? lock_contended+0xd20/0xd20
[   53.254326][    C1]  rtnetlink_rcv_msg+0x4a8/0x890
[ ... ]
[   73.813696][ T1192] unregister_netdevice: waiting for rmnet0 to become free. Usage count = 1

Fixes: 037f9cdf72fb ("net: rmnet: use upper/lower device infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agonet: atlantic: fix ip dst and ipv6 address filters
Dmitry Bogdanov [Wed, 8 Jul 2020 14:17:10 +0000 (17:17 +0300)]
net: atlantic: fix ip dst and ipv6 address filters

BugLink: https://bugs.launchpad.net/bugs/1888560
commit a42e6aee7f47a8a68d09923c720fc8f605a04207 upstream.

This patch fixes ip dst and ipv6 address filters.
There were 2 mistakes in the code, which led to the issue:
* invalid register was used for ipv4 dst address;
* incorrect write order of dwords for ipv6 addresses.

Fixes: 23e7a718a49b ("net: aquantia: add rx-flow filter definitions")
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Alexander Lobakin <alobakin@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agocrypto: atmel - Fix build error of CRYPTO_AUTHENC
YueHaibing [Wed, 13 Nov 2019 09:55:50 +0000 (17:55 +0800)]
crypto: atmel - Fix build error of CRYPTO_AUTHENC

BugLink: https://bugs.launchpad.net/bugs/1888560
commit aee1f9f3c30e1e20e7f74729ced61eac7d74ca68 upstream.

If CRYPTO_DEV_ATMEL_AUTHENC is m, CRYPTO_DEV_ATMEL_SHA is m,
but CRYPTO_DEV_ATMEL_AES is y, building will fail:

drivers/crypto/atmel-aes.o: In function `atmel_aes_authenc_init_tfm':
atmel-aes.c:(.text+0x670): undefined reference to `atmel_sha_authenc_get_reqsize'
atmel-aes.c:(.text+0x67a): undefined reference to `atmel_sha_authenc_spawn'
drivers/crypto/atmel-aes.o: In function `atmel_aes_authenc_setkey':
atmel-aes.c:(.text+0x7e5): undefined reference to `atmel_sha_authenc_setkey'

Make CRYPTO_DEV_ATMEL_AUTHENC depend on CRYPTO_DEV_ATMEL_AES,
and select CRYPTO_DEV_ATMEL_SHA and CRYPTO_AUTHENC for it under there.

Reported-by: Hulk Robot <hulkci@huawei.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 89a82ef87e01 ("crypto: atmel-authenc - add support to...")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agocrypto: atmel - Fix selection of CRYPTO_AUTHENC
Tudor Ambarus [Fri, 1 Nov 2019 16:40:37 +0000 (16:40 +0000)]
crypto: atmel - Fix selection of CRYPTO_AUTHENC

BugLink: https://bugs.launchpad.net/bugs/1888560
commit d158367682cd822aca811971e988be6a8d8f679f upstream.

The following error is raised when CONFIG_CRYPTO_DEV_ATMEL_AES=y and
CONFIG_CRYPTO_DEV_ATMEL_AUTHENC=m:
drivers/crypto/atmel-aes.o: In function `atmel_aes_authenc_setkey':
atmel-aes.c:(.text+0x9bc): undefined reference to `crypto_authenc_extractkeys'
Makefile:1094: recipe for target 'vmlinux' failed

Fix it by moving the selection of CRYPTO_AUTHENC under
config CRYPTO_DEV_ATMEL_AES.

Fixes: 89a82ef87e01 ("crypto: atmel-authenc - add support to...")
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoUBUNTU: [Debian] Disallow building linux-libc-dev from linux-riscv
Seth Forshee [Wed, 8 Jul 2020 16:27:24 +0000 (11:27 -0500)]
UBUNTU: [Debian] Disallow building linux-libc-dev from linux-riscv

BugLink: https://bugs.launchpad.net/bugs/1886188
Since we are no longer building linux-libc-dev from linux-riscv,
remove the exception allowing linux-riscv to build the package.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoUBUNTU: [Packaging] Produce linux-libc-dev package for riscv64
Seth Forshee [Wed, 8 Jul 2020 16:27:23 +0000 (11:27 -0500)]
UBUNTU: [Packaging] Produce linux-libc-dev package for riscv64

BugLink: https://bugs.launchpad.net/bugs/1886188
Having linux-libc-dev for riscv64 built from a different source
package with potentially different version numbers breaks cross-
building and multi-arch. Move building of linux-libc-dev back
to the primary kernel package to fix this.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Move allocation of the shost object to after xconf- and xport-data
Benjamin Block [Fri, 8 May 2020 17:23:35 +0000 (19:23 +0200)]
scsi: zfcp: Move allocation of the shost object to after xconf- and xport-data

BugLink: https://bugs.launchpad.net/bugs/1887124
At the moment we allocate and register the Scsi_Host object corresponding
to a zfcp adapter (FCP device) very early in the life cycle of the adapter
- even before we fully discover and initialize the underlying
firmware/hardware. This had the advantage that we could already use the
Scsi_Host object, and fill in all its information during said discover and
initialize.

Due to commit 737eb78e82d5 ("block: Delay default elevator initialization")
(first released in v5.4), we noticed a regression that would prevent us
from using any storage volume if zfcp is configured with support for DIF or
DIX (zfcp.dif=1 || zfcp.dix=1). Doing so would result in an illegal memory
access as soon as the first request is sent with such an configuration. As
example for a crash resulting from this:

  scsi host0: scsi_eh_0: sleeping
  scsi host0: zfcp
  qdio: 0.0.1900 ZFCP on SC 4bd using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP
  scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 0000000000000000 TEID: 0000000000000483
  Fault in home space mode while using kernel ASCE.
  AS:0000000035c7c007 R3:00000001effcc007 S:00000001effd1000 P:000000000000003d
  Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: ...
  CPU: 1 PID: 783 Comm: kworker/u760:5 Kdump: loaded Not tainted 5.6.0-rc2-bb-next+ #1
  Hardware name: ...
  Workqueue: scsi_wq_0 fc_scsi_scan_rport [scsi_transport_fc]
  Krnl PSW : 0704e00180000000 000003ff801fcdae (scsi_queue_rq+0x436/0x740 [scsi_mod])
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
  Krnl GPRS: 0fffffffffffffff 0000000000000000 0000000187150120 0000000000000000
             000003ff80223d20 000000000000018e 000000018adc6400 0000000187711000
             000003e0062337e8 00000001ae719000 0000000187711000 0000000187150000
             00000001ab808100 0000000187150120 000003ff801fcd74 000003e0062336a0
  Krnl Code: 000003ff801fcd9ee310a35c0012        lt      %r1,860(%r10)
             000003ff801fcda4a7840010           brc     8,000003ff801fcdc4
            #000003ff801fcda8e310b2900004       lg      %r1,656(%r11)
            >000003ff801fcdaed71710001000       xc      0(24,%r1),0(%r1)
             000003ff801fcdb4e310b2900004       lg      %r1,656(%r11)
             000003ff801fcdba41201018           la      %r2,24(%r1)
             000003ff801fcdbee32010000024       stg     %r2,0(%r1)
             000003ff801fcdc4b904002b           lgr     %r2,%r11
  Call Trace:
   [<000003ff801fcdae>] scsi_queue_rq+0x436/0x740 [scsi_mod]
  ([<000003ff801fcd74>] scsi_queue_rq+0x3fc/0x740 [scsi_mod])
   [<00000000349c9970>] blk_mq_dispatch_rq_list+0x390/0x680
   [<00000000349d1596>] blk_mq_sched_dispatch_requests+0x196/0x1a8
   [<00000000349c7a04>] __blk_mq_run_hw_queue+0x144/0x160
   [<00000000349c7ab6>] __blk_mq_delay_run_hw_queue+0x96/0x228
   [<00000000349c7d5a>] blk_mq_run_hw_queue+0xd2/0xe0
   [<00000000349d194a>] blk_mq_sched_insert_request+0x192/0x1d8
   [<00000000349c17b8>] blk_execute_rq_nowait+0x80/0x90
   [<00000000349c1856>] blk_execute_rq+0x6e/0xb0
   [<000003ff801f8ac2>] __scsi_execute+0xe2/0x1f0 [scsi_mod]
   [<000003ff801fef98>] scsi_probe_and_add_lun+0x358/0x840 [scsi_mod]
   [<000003ff8020001c>] __scsi_scan_target+0xc4/0x228 [scsi_mod]
   [<000003ff80200254>] scsi_scan_target+0xd4/0x100 [scsi_mod]
   [<000003ff802d8b96>] fc_scsi_scan_rport+0x96/0xc0 [scsi_transport_fc]
   [<0000000034245ce8>] process_one_work+0x458/0x7d0
   [<00000000342462a2>] worker_thread+0x242/0x448
   [<0000000034250994>] kthread+0x15c/0x170
   [<0000000034e1979c>] ret_from_fork+0x30/0x38
  INFO: lockdep is turned off.
  Last Breaking-Event-Address:
   [<000003ff801fbc36>] scsi_add_cmd_to_list+0x9e/0xa8 [scsi_mod]
  Kernel panic - not syncing: Fatal exception: panic_on_oops

While this issue is exposed by the commit named above, this is only by
accident. The real issue exists for longer already - basically since it's
possible to use blk-mq via scsi-mq, and blk-mq pre-allocates all requests
for a tag-set during initialization of the same. For a given Scsi_Host
object this is done when adding the object to the midlayer
(`scsi_add_host()` and such). In `scsi_mq_setup_tags()` the midlayer
calculates how much memory is required for a single scsi_cmnd, and its
additional data, which also might include space for additional protection
data - depending on whether the Scsi_Host has any form of protection
capabilities (`scsi_host_get_prot()`).

The problem is now thus, because zfcp does this step before we actually
know whether the firmware/hardware has these capabilities, we don't set any
protection capabilities in the Scsi_Host object. And so, no space is
allocated for additional protection data for requests in the Scsi_Host
tag-set.

Once we go through discover and initialize the FCP device firmware/hardware
fully (this is done via the firmware commands "Exchange Config Data" and
"Exchange Port Data") we find out whether it actually supports DIF and DIX,
and we set the corresponding capabilities in the Scsi_Host object (in
`zfcp_scsi_set_prot()`). Now the Scsi_Host potentially has protection
capabilities, but the already allocated requests in the tag-set don't have
any space allocated for that.

When we then trigger target scanning or add scsi_devices manually, the
midlayer will use requests from that tag-set, and before sending most
requests, it will also call `scsi_mq_prep_fn()`. To prepare the scsi_cmnd
this function will check again whether the used Scsi_Host has any
protection capabilities - and now it potentially has - and if so, it will
try to initialize the assumed to be preallocated structures and thus it
causes the crash, like shown above.

Before delaying the default elevator initialization with the commit named
above, we always would also allocate an elevator for any scsi_device before
ever sending any requests - in contrast to now, where we do it after
device-probing. That elevator in turn would have its own tag-set, and that
is initialized after we went through discovery and initialization of the
underlying firmware/hardware. So requests from that tag-set can be
allocated properly, and if used - unless the user changes/disabled the
default elevator - this would hide the underlying issue.

To fix this for any configuration - with or without an elevator - we move
the allocation and registration of the Scsi_Host object for a given FCP
device to after the first complete discovery and initialization of the
underlying firmware/hardware. By doing that we can make all basic
properties of the Scsi_Host known to the midlayer by the time we call
`scsi_add_host()`, including whether we have any protection capabilities.

To do that we have to delay all the accesses that we would have done in the
past during discovery and initialization, and do them instead once we are
finished with it. The previous patches ramp up to this by fencing and
factoring out all these accesses, and make it possible to re-do them later
on. In addition we make also use of the diagnostic buffers we recently
added with

commit 92953c6e0aa7 ("scsi: zfcp: signal incomplete or error for sync exchange config/port data")
commit 7e418833e689 ("scsi: zfcp: diagnostics buffer caching and use for exchange port data")
commit 088210233e6f ("scsi: zfcp: add diagnostics buffer for exchange config data")

(first released in v5.5), because these already cache all the information
we need for that "re-do operation" - the information cached are always
updated during xconf or xport data, so it won't be stale.

In addition to the move and re-do, this patch also updates the
function-documentation of `zfcp_scsi_adapter_register()` and changes how it
reports if a Scsi_Host object already exists. In that case future
recovery-operations can skip this step completely and behave much like they
would do in the past - zfcp does not release a once allocated Scsi_Host
object unless the corresponding FCP device is deconstructed completely.

Link: https://lore.kernel.org/r/030dd6da318bbb529f0b5268ec65cebcd20fc0a3.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d0dff2ac98dd41d7d451127d9eae2f6478fc40b0)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Fence early sysfs interfaces for accesses of shost objects
Benjamin Block [Fri, 8 May 2020 17:23:34 +0000 (19:23 +0200)]
scsi: zfcp: Fence early sysfs interfaces for accesses of shost objects

BugLink: https://bugs.launchpad.net/bugs/1887124
When setting an adapter online for the first time, we also create a couple
of entries for it in the sysfs device tree. This is also true even if the
adapter has not yet ever gone successfully through exchange config and
exchange port data.

When moving the scsi host object allocation and registration to after the
first exchange config and exchange port data, this make the `port_rescan`
attribute susceptible to invalid pointer-dereferences of the shost field
before the adapter is fully initialized.

When written to, it schedules a `scan_work` item that will in turn make use
of the associated fibre channel host object to check the topology used for
this FCP device.

Because scanning for remote ports can't be done successfully without
completing exchange config and exchange port data first, we can simply
fence `port_rescan`, and so prevent the illegal access.

As with cases where we can't get a reference to the adapter, we also return
-ENODEV here. Applications need to handle that errno today already.

After a successful allocation of the scsi host object nothing changes in
the work flow.

Link: https://lore.kernel.org/r/ef65366d309993ca91b6917727590ca7ca166c8f.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 71159b6ecb067565fb41faba616116e8c0fc10ea)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Fence adapter status propagation for common statuses
Benjamin Block [Fri, 8 May 2020 17:23:33 +0000 (19:23 +0200)]
scsi: zfcp: Fence adapter status propagation for common statuses

BugLink: https://bugs.launchpad.net/bugs/1887124
Common status flags that all main objects - adapter, port, and unit -
support are propagated to sub-objects when set or cleared. For instance,
when setting the status ZFCP_STATUS_COMMON_ERP_INUSE for an adapter object,
we will propagate this to all its child ports and units - same for when
clearing a common status flag.

Units of an adapter object are enumerated via __shost_for_each_device()
over the scsi host object of the corresponding adapter.

Once we move the scsi host object allocation and registration to after the
first exchange config and exchange port data, this won't be possible for
cases where we set or clear common statuses during the very first adapter
recovery.

But since we won't have any port or unit objects yet at that point of time,
we can just fence the status propagation for cases where the scsi host
object is not yet set in the adapter object. It won't change any effective
status propagations, but will prevent us from dereferencing invalid
pointers.

For any later point in the work flow the scsi host object will be set and
thus nothing is changed then.

Link: https://lore.kernel.org/r/f51fe5f236a1e3d1ce53379c308777561bfe35e1.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 971f2abb4ca4095bb4bd7c2ba119d1cca078b433)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Move p-t-p port allocation to after xport data
Benjamin Block [Fri, 8 May 2020 17:23:32 +0000 (19:23 +0200)]
scsi: zfcp: Move p-t-p port allocation to after xport data

BugLink: https://bugs.launchpad.net/bugs/1887124
When doing the very first adapter recovery - initialization - for a FCP
device in a point-to-point topology we also allocate the port object
corresponding to the attached remote port, and trigger a port recovery for
it that will run after the adapter recovery finished.

Right now this happens right after we finished with the exchange config
data command, and uses the fibre channel host object corresponding to the
FCP device to determine whether a point-to-point topology is used.

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, this use of the fc_host object is not
possible anymore at that point in the work flow.

But the allocation and recovery trigger doesn't have notable side-effects
on the following exchange port data processing, so we can move those to
after xport data, and thus also to after the scsi host object allocation,
once we move it. Then the fc_host object can be used again, like it is now.

For any further adapter recoveries this doesn't change anything, because at
that point the port object already exists and recovery is triggered
elsewhere for existing port objects.

Link: https://lore.kernel.org/r/73e5d4ac21e2b37bf0c3ca8e530bc5a5c6e74f8f.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ac007adc4d2d9258fdf27abd25cc77a4e0e8d19f)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Fence fc_host updates during link-down handling
Benjamin Block [Fri, 8 May 2020 17:23:31 +0000 (19:23 +0200)]
scsi: zfcp: Fence fc_host updates during link-down handling

BugLink: https://bugs.launchpad.net/bugs/1887124
When receiving a notification that a FCP device lost its local link we
usually update the fibre channel host object which represents that FCP
device to reflect that.

This notification/information can also surface when the FCP device is
running through adapter recovery (exchange config and exchange port data
return incomplete).

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, and this happens during the very first
adapter recovery, these updates can not be done until after the scsi host
object is allocated.

Reorder the fc_host updates in zfcp_fsf_fc_host_link_down() so that they
only happen after a check of whether the scsi host object is already
allocated or not.

During the first adapter recovery this will cause the skip of these updates
if a link-down condition is detected, but we can repeat them after we
allocated the scsi host object, if necessary.

For any further link-down handling the only changes in the work flow are
the slightly reordered assignments in zfcp_fsf_fc_host_link_down().

Link: https://lore.kernel.org/r/f841f2cda61dcd7b8549910c44e1831927459edf.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 990486f3a8508494dab2a7ff0fcc3eb977557d89)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Move fc_host updates during xport data handling into fenced function
Benjamin Block [Fri, 8 May 2020 17:23:30 +0000 (19:23 +0200)]
scsi: zfcp: Move fc_host updates during xport data handling into fenced function

BugLink: https://bugs.launchpad.net/bugs/1887124
When executing exchange port data for a FCP device for the first time, or
after an adapter recovery, we update several properties of the fibre
channel host object which represents that FCP device.

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, this is not possible for the former case.

Move all these update into separate, and fenced function that first checks
whether the scsi host object already exists or not, before making the
updates.

During the first ever exchange port data in the adapter life cycle this
will make the exchange port data handler skip over this update step, but we
can repeat it later, after we allocated the scsi host object.

For any further recovery of that adapter the work flow is only changed
slightly because then the scsi host object already exists and we don't free
it until we release the adapter completely at the end of its life cycle.

Link: https://lore.kernel.org/r/ae454c2dc6da0b02907c489af91d0b211d331825.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 52e61fde5ec95cb4011784fb0bc6b436e16fcaa8)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Move shost updates during xconfig data handling into fenced function
Benjamin Block [Fri, 8 May 2020 17:23:29 +0000 (19:23 +0200)]
scsi: zfcp: Move shost updates during xconfig data handling into fenced function

BugLink: https://bugs.launchpad.net/bugs/1887124
When executing exchange config data for a FCP device for the first time, or
after an adapter recovery, we update several properties of the scsi host or
fibre channel host object that represent that FCP device.

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, this is not possible for the former case.

Move all these update into separate, and fenced function that first checks
whether the scsi host object already exists or not, before making the
updates.

During the first ever exchange config data in the adapter life cycle this
will make the exchange config data handler skip over this update step, but
we can repeat it later, after we allocated the scsi host object.

For any further recovery of that adapter the work flow is only changed
slightly because then the scsi host object already exists and we don't free
it until we release the adapter completely at the end of its life cycle.

Link: https://lore.kernel.org/r/5fc3f4d38d4334f7aa595497c6f7865fb1102e0f.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit bd1684817d7d8d1a3b95a4347166246ad1f7670b)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: Move shost modification after QDIO (re-)open into fenced function
Benjamin Block [Fri, 8 May 2020 17:23:28 +0000 (19:23 +0200)]
scsi: zfcp: Move shost modification after QDIO (re-)open into fenced function

BugLink: https://bugs.launchpad.net/bugs/1887124
When establishing and activating the QDIO queue pair for a FCP device for
the first time, or after an adapter recovery, we publish some of its
characteristics to the scsi host object representing that FCP device.

When moving the scsi host object allocation and registration to after the
first exchange config and exchange port data, this is not possible for the
former case - QDIO open for the first time - because that happens before
exchange config and exchange port data.

Move the scsi host object update into a fenced function that checks whether
the object already exists or not. This way we can repeat that step later,
once we are past the allocation.

Once the first recovery succeeds we don't release the scsi host object
anymore, so further recoveries do work as before.

Link: https://lore.kernel.org/r/a214ebf508f71e3690113e3e90edab1cea0e24e3.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 978857c7e367d6841f71c4ded5a8c244520f5e22)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: use fallthrough;
Joe Perches [Tue, 31 Mar 2020 14:21:48 +0000 (16:21 +0200)]
scsi: zfcp: use fallthrough;

BugLink: https://bugs.launchpad.net/bugs/1887124
Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
[bblock@linux.ibm.com: resolved merge conflict with recently upstream-sent patch "zfcp: expose fabric name as common fc_host sysfs attribute"]
Link: https://lore.kernel.org/r/d14669a67a17392490d3184117941123765db1a4.1585663010.git.bblock@linux.ibm.com
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit cec9cbac5244b017f2671e3770abfacc939d753d)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: log FC Endpoint Security errors
Jens Remus [Thu, 12 Mar 2020 17:45:05 +0000 (18:45 +0100)]
scsi: zfcp: log FC Endpoint Security errors

BugLink: https://bugs.launchpad.net/bugs/1887124
Log any FC Endpoint Security errors to the kernel ring buffer with rate-
limiting.

Link: https://lore.kernel.org/r/20200312174505.51294-11-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 42cabdaf103be174adb6f1ca61383eb2b35a013a)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: enhance handling of FC Endpoint Security errors
Jens Remus [Thu, 12 Mar 2020 17:45:04 +0000 (18:45 +0100)]
scsi: zfcp: enhance handling of FC Endpoint Security errors

BugLink: https://bugs.launchpad.net/bugs/1887124
Enable for explicit FCP channel FC Endpoint Security error reporting and
handle any FSF security errors according to specification. Take the
following recovery actions when a FSF_SECURITY_ERROR is reported for the
specified FSF commands:

- Open Port: Retry the command if possible
- Send FCP : Physically close the remote port and reopen

For Open Port the command status is set to error, which triggers a retry.
For Send FCP the command status is set to error and recovery is triggered
to physically reopen the remote port.

Link: https://lore.kernel.org/r/20200312174505.51294-10-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e53d92856e9f1cfa0be284fa1dc3367130ce433a)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: trace FC Endpoint Security of FCP devices and connections
Jens Remus [Thu, 12 Mar 2020 17:45:03 +0000 (18:45 +0100)]
scsi: zfcp: trace FC Endpoint Security of FCP devices and connections

BugLink: https://bugs.launchpad.net/bugs/1887124
Trace changes in Fibre Channel Endpoint Security capabilities of FCP
devices as well as changes in Fibre Channel Endpoint Security state of
their connections to FC remote ports as FC Endpoint Security changes with
trace level 3 in HBA DBF.

A change in FC Endpoint Security capabilities of FCP devices is traced as
response to FSF command FSF_QTCB_EXCHANGE_PORT_DATA with a trace tag of
"fsfcesa" and a WWPN of ZFCP_DBF_INVALID_WWPN = 0x0000000000000000 (see
FC-FS-4 §18 "Name_Identifier Formats", NAA field).

A change in FC Endpoint Security state of connections between FCP devices
and FC remote ports is traced as response to FSF command
FSF_QTCB_OPEN_PORT_WITH_DID with a trace tag of "fsfcesp".

Example trace record of FC Endpoint Security capability change of FCP
device formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : HBA
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ...
Caller         : 0x...
Record ID      : 5                    ZFCP_DBF_HBA_FCES
Tag            : fsfcesa              FSF FC Endpoint Security adapter
Request ID     : 0x...
Request status : 0x00000010
FSF cmnd       : 0x0000000e           FSF_QTCB_EXCHANGE_PORT_DATA
FSF sequence no: 0x...
FSF issued     : ...
FSF stat       : 0x00000000           FSF_GOOD
FSF stat qual  : n/a
Prot stat      : n/a
Prot stat qual : n/a
Port handle    : 0x00000000           none (invalid)
LUN handle     : n/a
WWPN           : 0x0000000000000000   ZFCP_DBF_INVALID_WWPN
FCES old       : 0x00000000           old FC Endpoint Security
FCES new       : 0x00000007           new FC Endpoint Security

Example trace record of FC Endpoint Security change of connection to
FC remote port formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : HBA
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ...
Caller         : 0x...
Record ID      : 5                    ZFCP_DBF_HBA_FCES
Tag            : fsfcesp              FSF FC Endpoint Security port
Request ID     : 0x...
Request status : 0x00000010
FSF cmnd       : 0x00000005           FSF_QTCB_OPEN_PORT_WITH_DID
FSF sequence no: 0x...
FSF issued     : ...
FSF stat       : 0x00000000           FSF_GOOD
FSF stat qual  : n/a
Prot stat      : n/a
Prot stat qual : n/a
Port handle    : 0x...
WWPN           : 0x500507630401120c   WWPN
FCES old       : 0x00000000           old FC Endpoint Security
FCES new       : 0x00000004           new FC Endpoint Security

Link: https://lore.kernel.org/r/20200312174505.51294-9-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 616da39e0060f3b8bbc0f36f7d911bb5abb31746)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: log FC Endpoint Security of connections
Jens Remus [Thu, 12 Mar 2020 17:45:02 +0000 (18:45 +0100)]
scsi: zfcp: log FC Endpoint Security of connections

BugLink: https://bugs.launchpad.net/bugs/1887124
Log the usage of and subsequent changes in FC Endpoint Security of
connections between FCP devices and FC remote ports to the kernel ring
buffer. Activation of FC Endpoint Security is logged as informational.
Change and deactivation are logged as warning.

No logging takes place, if FC Endpoint Security is not used (i.e. never
activated) on a connection or if it does not change during reopen of a port
(e.g. due to adapter or port recovery).

Link: https://lore.kernel.org/r/20200312174505.51294-8-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f0d26ae847489850509b793ef3f74be62f69ab0f)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: report FC Endpoint Security in sysfs
Jens Remus [Thu, 12 Mar 2020 17:45:01 +0000 (18:45 +0100)]
scsi: zfcp: report FC Endpoint Security in sysfs

BugLink: https://bugs.launchpad.net/bugs/1887124
Add an interface to read Fibre Channel Endpoint Security information of FCP
channels and their connections to FC remote ports. It comes in the form of
new sysfs attributes that are attached to the CCW device representing the
FCP device and its zfcp port objects.

The read-only sysfs attribute "fc_security" of a CCW device representing a
FCP device shows the FC Endpoint Security capabilities of the device.
Possible values are: "unknown", "unsupported", "none", or a comma-
separated list of one or more mnemonics and/or one hexadecimal value
representing the supported FC Endpoint Security:

  Authentication: Authentication supported
  Encryption    : Encryption supported

The read-only sysfs attribute "fc_security" of a zfcp port object shows the
FC Endpoint Security used on the connection between its parent FCP device
and the FC remote port. Possible values are: "unknown", "unsupported",
"none", or a mnemonic or hexadecimal value representing the FC Endpoint
Security used:

  Authentication: Connection has been authenticated
  Encryption    : Connection is encrypted

Both sysfs attributes may return hexadecimal values instead of mnemonics,
if the mnemonic lookup table does not contain an entry for the FC Endpoint
Security reported by the FCP device.

Link: https://lore.kernel.org/r/20200312174505.51294-7-maier@linux.ibm.com
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a17c78460093aad8fb97fc6905c22355b7d1c923)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: auto variables for dereferenced structs in open port handler
Jens Remus [Thu, 12 Mar 2020 17:45:00 +0000 (18:45 +0100)]
scsi: zfcp: auto variables for dereferenced structs in open port handler

BugLink: https://bugs.launchpad.net/bugs/1887124
Introduce automatic variables for adapter and QTCB bottom in
zfcp_fsf_open_port_handler(). This facilitates subsequent changes to meet
the 80 character per line limit.

Link: https://lore.kernel.org/r/20200312174505.51294-6-maier@linux.ibm.com
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 185f2d2d595c29c6e2ed6e2897b9ccc52c50c917)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: fix fc_host attributes that should be unknown on local link down
Steffen Maier [Thu, 12 Mar 2020 17:44:59 +0000 (18:44 +0100)]
scsi: zfcp: fix fc_host attributes that should be unknown on local link down

BugLink: https://bugs.launchpad.net/bugs/1887124
When we get an unsolicited notification on local link went down,
zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval().
This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and
ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to
"Linkdown", because zfcp_scsi_get_host_port_state() is an active callback
and uses the adapter status.

Other fc_host attributes model, port_id, port_type, speed, fabric_name (and
zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which
depend on a local link, continued to show their last known "good" value.

Only if something triggered an exchange config data, some values were
updated to their unknown equivalent via case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down.  Triggers for
exchange config data are adapter recovery, or reading any of the following
zfcp-specific scsi host sysfs attributes "requests", "megabytes", or
"seconds_active" in /sys/devices/css*/*.*.*/*.*.*/host*/scsi_host/host*/.

The other fc_host attributes active_fc4s and permanent_port_name continued
to show their last known "good" value.  Only if something triggered an
exchange port data, some values changed.  Active_fc4s became all zeros as
unknown equivalent during link down.  Permanent_port_name does not depend
on a local link. But for non-NPIV FCP devices, permanent_port_name
erroneously became whatever value fc_host port_name had at that point in
time (see previous paragraph).  Triggers for exchange port data are the
zfcp-specific scsi host sysfs attribute "utilization", or
[{reset,get}_fc_host_stats] write anything into "reset_statistics" or read
any of the other attributes under
/sys/devices/css*/*.*.*/*.*.*/host*/fc_host/host*/statistics/.

(cf. v4.9 commit bd77befa5bcf ("zfcp: fix fc_host port_type with NPIV"))

This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or
dbginfo.sh which read fc_host attributes and also scsi_host attributes.
After link down, the first invocation produces (abbreviated):

Class = "fc_host"
    active_fc4s         = "0x00 0x00 0x01 0x00 ..."
    ...
    fabric_name         = "0x10000027f8e04c49"
    ...
    permanent_port_name = "0xc05076e4588059c1"
    port_id             = "0x244800"
    port_state          = "Linkdown"
    port_type           = "NPort (fabric via point-to-point)"
    ...
    speed               = "16 Gbit"
Class = "scsi_host"
    ...
    megabytes           = "0 0"
    ...
    requests            = "0 0 0"
    seconds_active      = "37"
    ...
    utilization         = "0 0 0"

The second and next invocations produce (abbreviated):

Class = "fc_host"
    active_fc4s         = "0x00 0x00 0x00 0x00 ..."
    ...
    fabric_name         = "0x0"
    ...
    permanent_port_name = "0x0"
    port_id             = "0x000000"
    port_state          = "Linkdown"
    port_type           = "Unknown"
    ...
    speed               = "unknown"
Class = "scsi_host"
    ...
    megabytes           = "0 0"
    ...
    requests            = "0 0 0"
    seconds_active      = "38"
    ...
    utilization         = "0 0 0"

Factor out the resetting of local link dependent fc_host attributes from
zfcp_fsf_exchange_config_data_handler() case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function
zfcp_fsf_fc_host_link_down().  All code places that detect local link down
(SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call
zfcp_fsf_link_down_info_eval().  Call the new helper from there. This works
because zfcp_fsf_link_down_info_eval() and thus the helper is called before
zfcp_fsf_exchange_{config,port}_evaluate().

Port_name and node_name are always valid, so never reset them.

Get the permanent_port_name from exchange port data unconditionally as it
always has a valid known good value, even during link down.

Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host
supported_classes could theoretically get its value from
fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate().

When the link comes back, we get a different notification, perform adapter
recovery, and this triggers an implicit exchange config data followed by
exchange port data filling in the link dependent fc_host attributes with
known good values again.

Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 7e0e4e0958ef794ee868838249880d5c521ff761)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: wire previously driver-specific sysfs attributes also to fc_host
Steffen Maier [Thu, 12 Mar 2020 17:44:58 +0000 (18:44 +0100)]
scsi: zfcp: wire previously driver-specific sysfs attributes also to fc_host

BugLink: https://bugs.launchpad.net/bugs/1887124
Manufacturer, HBA model, firmware version, and hardware version.  Use the
same value format as for the driver-specific attributes.  Keep the
driver-specific attributes for stable user space sysfs API.

Link: https://lore.kernel.org/r/20200312174505.51294-4-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 538c6e910baea9a94ba2a816c19c3e071892b49c)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: expose fabric name as common fc_host sysfs attribute
Steffen Maier [Thu, 12 Mar 2020 17:44:57 +0000 (18:44 +0100)]
scsi: zfcp: expose fabric name as common fc_host sysfs attribute

BugLink: https://bugs.launchpad.net/bugs/1887124
FICON Express8S or older, as well as card features newer than FICON
Express16S+ have no certain firmware level requirement.

FICON Express16S or FICON Express16S+ have the following
minimum firmware level requirements to show a proper fabric name value:

 z13 machine
  FICON Express16S  , MCL P08424.005 , LIC version 0x00000721
 z14 machine
  FICON Express16S  , MCL P42611.008 , LIC version 0x10200069
  FICON Express16S+ , MCL P42625.010 , LIC version 0x10300147

Otherwise, the read value is not the fabric name.

Each FCP channel of these card features might need one SAN fabric re-login
after concurrent microcode update in order to show the proper fabric name.
Possible ways to trigger a SAN fabric re-login are one of: Pull fibres
between FCP channel port and SAN switch port on either side and re-plug,
disable SAN switch port adjacent to FCP channel port and re-enable switch
port, or at Service Element toggle off all CHPIDs of FCP channel over all
LPARs and toggle CHPIDs on again.  Zfcp operating subchannels (FCP devices)
on such FCP channel recovers a fabric re-login.

Initialize fabric name for any topology and have it an invalid WWPN 0x0 for
anything but fabric topology.  Otherwise for e.g. point-to-point topology
one could see the initial -1 from fc_host_setup() and after a link unplug
our fabric name would turn to 0x0 (with subsequent commit ("zfcp: fix
fc_host attributes that should be unknown on local link down") and stay 0x0
on link replug.  I did not initialize to 0x0 somewhere even earlier in the
code path such that it would not flap from real to 0x0 to real on e.g. an
exchange config data with fabric topology.

Link: https://lore.kernel.org/r/20200312174505.51294-3-maier@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e05a10a055098bf55168a9d682156e38c6b00cfa)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: fix wrong data and display format of SFP+ temperature
Benjamin Block [Wed, 19 Feb 2020 15:09:25 +0000 (16:09 +0100)]
scsi: zfcp: fix wrong data and display format of SFP+ temperature

BugLink: https://bugs.launchpad.net/bugs/1887124
When implementing support for retrieval of local diagnostic data from the
FCP channel, the wrong data format was assumed for the temperature of the
local SFP+ connector. The Fibre Channel Link Services (FC-LS-3)
specification is not clear on the format of the stored integer, and only
after consulting the SNIA specification SFF-8472 did we realize it is
stored as two's complement. Thus, the used data and display format is
wrong, and highly misleading for users when the temperature should drop
below 0°C (however unlikely that may be).

To fix this, change the data format in `struct fsf_qtcb_bottom_port` from
unsigned to signed, and change the printf format string used to generate
`zfcp_sysfs_adapter_diag_sfp_temperature_show()` from `%hu` to `%hd`.

Link: https://lore.kernel.org/r/d6e3be5428da5c9490cfff4df7cae868bc9f1a7e.1582039501.git.bblock@linux.ibm.com
Fixes: a10a61e807b0 ("scsi: zfcp: support retrieval of SFP Data via Exchange Port Data")
Fixes: 6028f7c4cd87 ("scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver")
Cc: <stable@vger.kernel.org> # 5.5+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a3fd4bfe85fbb67cf4ec1232d0af625ece3c508b)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: proper indentation to reduce confusion in zfcp_erp_required_act
Steffen Maier [Fri, 25 Oct 2019 16:12:52 +0000 (18:12 +0200)]
scsi: zfcp: proper indentation to reduce confusion in zfcp_erp_required_act

BugLink: https://bugs.launchpad.net/bugs/1887124
No functional change.

The unary not operator only applies to the sub expression before the
logical or. So we return early if (not running) or failed.

Link: https://lore.kernel.org/r/df4f897f6e83eaa528465d0858d5a22daac47a2f.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e76acc51942649398660ca50655af5afecf29c42)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: move maximum age of diagnostic buffers into a per-adapter variable
Benjamin Block [Fri, 25 Oct 2019 16:12:51 +0000 (18:12 +0200)]
scsi: zfcp: move maximum age of diagnostic buffers into a per-adapter variable

BugLink: https://bugs.launchpad.net/bugs/1887124
Replace the static define (ZFCP_DIAG_MAX_AGE) with a per-adapter variable
(${adapter}->diagnostics->max_age). This new variable is exported via
sysfs, along with other, already existing adapter variables, and can both
be read and written. This way users can choose how much time should pass
between refreshes of diagnostic buffers. The default value for the age
remains to be five seconds.

By setting this new variable to 0, the caching of diagnostic buffers for
userspace accesses can also be completely removed.

All diagnostic buffers of a given adapter are subject to this setting in
the same way.

Link: https://lore.kernel.org/r/b1d0977cc884b16dd4ca6418e4320c56a4c31d63.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 48910f8c35cfd250d806f3e03150d256f40b6d4c)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: implicitly refresh config-data diagnostics when reading sysfs
Benjamin Block [Fri, 25 Oct 2019 16:12:50 +0000 (18:12 +0200)]
scsi: zfcp: implicitly refresh config-data diagnostics when reading sysfs

BugLink: https://bugs.launchpad.net/bugs/1887124
Adds implicit updates of cached diagnostics via Exchange Config Data when
reading sysfs attributes interfacing them. Right now this only affects the
new B2B-Credit diagnostic attribute.

This uses the same mechanism previously also used for cached diagnostics
of Exchange Port Data.

Link: https://lore.kernel.org/r/60a94f55f2630b74b468fed5f39880208abb2679.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8a72db70b5ca3c3feb3ca25519e8a9516cc60cfe)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: introduce sysfs interface to read the local B2B-Credit
Benjamin Block [Fri, 25 Oct 2019 16:12:49 +0000 (18:12 +0200)]
scsi: zfcp: introduce sysfs interface to read the local B2B-Credit

BugLink: https://bugs.launchpad.net/bugs/1887124
In addition to the diagnostic data from the local SFP transceiver this
patch adds an interface to read the advertised buffer-to-buffer credit from
the local FC_Port.

With this patch the userspace-interface will only read data stored in the
corresponding "diagnostic buffer" (that was stored during completion of a
previous Exchange Config Data command). Implicit updating will follow later
in this series.

Link: https://lore.kernel.org/r/8a53aef87b53c50cfb1a3425b799bacb6f82b832.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5a2876f0d1ef26b76755749f978d46e4666013dd)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: implicitly refresh port-data diagnostics when reading sysfs
Benjamin Block [Fri, 25 Oct 2019 16:12:48 +0000 (18:12 +0200)]
scsi: zfcp: implicitly refresh port-data diagnostics when reading sysfs

BugLink: https://bugs.launchpad.net/bugs/1887124
This patch adds implicit updates to the sysfs entries that read the
diagnostic data stored in the "caching buffer" for Exchange Port Data.

An update is triggered once the buffer is older than ZFCP_DIAG_MAX_AGE
milliseconds (5s). This entails sending an Exchange Port Data command to
the FCP-Channel, and during its ingress path updating the cached data and
the timestamp. To prevent multiple concurrent userspace-applications from
triggering this update in parallel we synchronize all of them using a
wait-queue (waiting threads are interruptible; the updating thread is not).From: Benjamin Block <bblock@linux.ibm.com>

Link: https://lore.kernel.org/r/c145b5cfc99a63b6a018b1184fbd27bb09c955f5.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8155eb0785279728b6b2e29aba2ca52d16aa526f)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver
Benjamin Block [Fri, 25 Oct 2019 16:12:47 +0000 (18:12 +0200)]
scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver

BugLink: https://bugs.launchpad.net/bugs/1887124
This adds an interface to read the diagnostics of the local SFP transceiver
of an FCP-Channel from userspace. This comes in the form of new sysfs
entries that are attached to the CCW device representing the FCP
device. Each type of data gets its own sysfs entry; the whole collection of
entries is pooled into a new child-directory of the CCW device node:
"diagnostics".

Adds sysfs entries for:
 * sfp_invalid:    boolean value evaluating to whether the following 5
                   fields are invalid; {0, 1}; 1 - invalid
 * temperature:    transceiver temp.; unit 1/256°C;
                   range [-128°C, +128°C]
 * vcc:            supply voltage; unit 100μV; range [0, 6.55V]
 * tx_bias:        transmitter laser bias current; unit 2μA;
                   range [0, 131mA]
 * tx_power:       coupled TX output power; unit 0.1μW; range [0, 6.5mW]
 * rx_power:       received optical power; unit 0.1μW; range [0, 6.5mW]

 * optical_port:   boolean value evaluating to whether the FCP-Channel has
                   an optical port; {0, 1}; 1 - optical
 * fec_active:     boolean value evaluating to whether 16G FEC is active;
                   {0, 1}; 1 - active
 * port_tx_type:   nibble describing the port type; {0, 1, 2, 3};
                   0 - unknown,             1 - short wave,
                   2 - long wave LC 1310nm, 3 - long wave LL 1550nm
 * connector_type: two bits describing the connector type; {0, 1};
                   0 - unknown,             1 - SFP+

This is only supported if the FCP-Channel in turn supports reporting the
SFP Diagnostic Data, otherwise read() on these new entries will return
EOPNOTSUPP (this affects only adapters older than FICON Express8S, on
Mainframe generations older than z14). Other possible errors for read()
include ENOLINK, ENODEV and ENOMEM.

With this patch the userspace-interface will only read data stored in
the corresponding "diagnostic buffer" (that was stored during completion
of an previous Exchange Port Data command). Implicit updating will
follow later in this series.

Link: https://lore.kernel.org/r/1f9cce7c829c881e7d71a3f10c5b57f3dd84ab32.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 6028f7c4cd87cac13481255d7e35dd2c9207ecae)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: support retrieval of SFP Data via Exchange Port Data
Benjamin Block [Fri, 25 Oct 2019 16:12:46 +0000 (18:12 +0200)]
scsi: zfcp: support retrieval of SFP Data via Exchange Port Data

BugLink: https://bugs.launchpad.net/bugs/1887124
A new FCP channel feature allows us to read the diagnostics from our local
SFP transceivers. To make use of that add a flag
(FSF_FEATURE_REQUEST_SFP_DATA) to the feature-set we request from the FCP
channel. Whether the channel actually implements this can be determined via
an other new flag (FSF_FEATURE_REPORT_SFP_DATA), that is set in the
adapter_features field of the adapter structure after Exchange Config Data
finished.

Also add the corresponding definitions in the QTCB Bottom for Exchange Port
Data. These new definitions are only valid, if FSF_FEATURE_REPORT_SFP_DATA
is set.

Link: https://lore.kernel.org/r/ee1eba4de71eb06b4d82207ad4f428429346156f.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a10a61e807b0a226b78e2041843cbf0521bd0c35)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: add diagnostics buffer for exchange config data
Benjamin Block [Fri, 25 Oct 2019 16:12:45 +0000 (18:12 +0200)]
scsi: zfcp: add diagnostics buffer for exchange config data

BugLink: https://bugs.launchpad.net/bugs/1887124
In the same vein as the previous patch, add diagnostic data capture for the
Exchange Config Data command.

Link: https://lore.kernel.org/r/7d8ac0a6cad403fa8f8b888693476a84e80a277b.1572018131.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 088210233e6fc039fd2c0bfe44b06bb94328d09e)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: diagnostics buffer caching and use for exchange port data
Benjamin Block [Fri, 25 Oct 2019 16:12:44 +0000 (18:12 +0200)]
scsi: zfcp: diagnostics buffer caching and use for exchange port data

BugLink: https://bugs.launchpad.net/bugs/1887124
The FCP channel exposes two central interfaces to receive information about
the local FCP-Adapter/-Port: Exchange Port and Exchange Config Data. Using
these commands can negatively impact the adapter if we allow them to be
sent at a very high rate.

The later parts of this patchset will introduce new user-interfaces to
receive more diagnostics from the adapter. To prevent any negative impact
from using those, this patch adds a simple caching-mechanism that will
prevent a malicious/faulty userspace-application from generating an
abnormal high amount of Exchange Port/Config Data traffic.

Relevant diagnostic data that is received via Exchange Config/Port Data is
cached in buffers associated with the corresponding adapter-struct.  Each
buffer is associated with a timestamp that signals how old the data is,
and, added via a following patch in this series, lets userspace-interfaces
determine when the data is too old and needs to be updated.

Buffer-updates are made during the normal response path of the
corresponding command. With this patch only the output of the Exchange Port
Data command is captured.

Link: https://lore.kernel.org/r/054ca020ce0a53dc0d9176428bea373898944e6a.1572018130.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 7e418833e68948cb9ed15262889173b7db2960cb)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoscsi: zfcp: signal incomplete or error for sync exchange config/port data
Benjamin Block [Fri, 25 Oct 2019 16:12:43 +0000 (18:12 +0200)]
scsi: zfcp: signal incomplete or error for sync exchange config/port data

BugLink: https://bugs.launchpad.net/bugs/1887124
Adds a new FSF-Request status flag (ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE)
that signal that the data received using Exchange Config Data or Exchange
Port Data was incomplete. This new flags is set in the respective handlers
during the response path.

With this patch, only the synchronous FSF-functions for each command got
support for the new flag, otherwise it is transparent.

Together with this new flag and already existing status flags the
synchronous FSF-functions are extended to now detect whether the received
data is complete, incomplete or completely invalid (this includes cases
where a command ran into a timeout). This is now signaled back to the
caller, where previously only failures on the request path would result in
a bad return-code.

For complete data the return-code remains 0. For incomplete data a new
return-code -EAGAIN is added to the function-interface. For completely
invalid data the already existing return-code -EIO is reused - formerly
this was used to signal failures on the request path.

Existing callers of the FSF-functions are adjusted so that they behave as
before for return-code 0 and -EAGAIN, to not change the user-interface. As
-EIO existed all along, it was already exposed to the user - and needed
handling - and will now also be exposed in this new special case.

Link: https://lore.kernel.org/r/e14f0702fa2b00a4d1f37c7981a13f2dd1ea2c83.1572018130.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 92953c6e0aa77d4febcca6dd691e8192910c8a28)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoUSB: serial: option: add Quectel EG95 LTE modem
AceLan Kao [Wed, 15 Jul 2020 06:55:31 +0000 (14:55 +0800)]
USB: serial: option: add Quectel EG95 LTE modem

BugLink: https://bugs.launchpad.net/bugs/1886744
Add support for Quectel Wireless Solutions Co., Ltd. EG95 LTE modem

T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  5 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0195 Rev=03.18
S:  Manufacturer=Android
S:  Product=Android
C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
(cherry picked from commit da6902e5b6dbca9081e3d377f9802d4fd0c5ea59
linux-next)
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agonet: usb: qmi_wwan: add support for Quectel EG95 LTE modem
AceLan Kao [Wed, 15 Jul 2020 06:55:30 +0000 (14:55 +0800)]
net: usb: qmi_wwan: add support for Quectel EG95 LTE modem

BugLink: https://bugs.launchpad.net/bugs/1886744
Add support for Quectel Wireless Solutions Co., Ltd. EG95 LTE modem

T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  5 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0195 Rev=03.18
S:  Manufacturer=Android
S:  Product=Android
C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f815dd5cf48b905eeecf0a2b990e9b7ab048b4f1)
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoASoC: SOF: Intel: hda: move i915 init earlier
Kai Vehmanen [Tue, 14 Jul 2020 07:34:20 +0000 (15:34 +0800)]
ASoC: SOF: Intel: hda: move i915 init earlier

BugLink: https://bugs.launchpad.net/bugs/1886341
To be compliant with i915 display driver requirements, i915 power-up
must be done before any HDA communication takes place, including
parsing the bus capabilities. Otherwise the initial codec probe
may fail.

Move i915 initialization earlier in the SOF HDA sequence. This
sequence is now aligned with the snd-hda-intel driver where the
display_power() call is before snd_hdac_bus_parse_capabilities()
and rest of the capability parsing.

Also remove unnecessary ifdef around hda_codec_i915_init(). There's
a dummy implementation provided if CONFIG_SND_SOC_SOF_HDA is not
enabled.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20200206200223.7715-3-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(backported from commit af7aae1b1f6306a1cda4da393e920a1334eaa3d4)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoASoC: SOF: Intel: drop HDA codec upon probe failure
Kai Vehmanen [Tue, 14 Jul 2020 07:34:19 +0000 (15:34 +0800)]
ASoC: SOF: Intel: drop HDA codec upon probe failure

BugLink: https://bugs.launchpad.net/bugs/1886341
In case a HDA codec probe fails, do not raise error immediately,
but instead remove the codec from bus->codec_mask and continue
probe for other codecs.

This allows for more robust behaviour in cases where one codec
in the system is faulty. SOF driver load can still proceed with
the codecs that can be probed successfully. Probe may still
fail if suitable machine driver is not found, but in many
cases the generic HDA machine driver can operate with a subset
of codecs.

Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20191218002616.7652-6-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit 91dce767cd0b08be9f1c87bb2de8e63391a72692)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoASoC: SOF: intel: hda: Modify signature for hda_codec_probe_bus()
Ranjani Sridharan [Tue, 14 Jul 2020 07:34:18 +0000 (15:34 +0800)]
ASoC: SOF: intel: hda: Modify signature for hda_codec_probe_bus()

BugLink: https://bugs.launchpad.net/bugs/1886341
The machine driver selection for HDA platforms will be
consolidated and moved out of the SOF DSP
probe callback. In preparation for that, modify the
signature for hda_codec_probe_bus() to pass the
hda_codec_use_common_hdmi as a variable while probing the
HDA codecs.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20191204211556.12671-10-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(backported from commit 80acdd4f8ff763183dc1cd7f1cd31db9eaaecdc8)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agonet/smc: tolerate future SMCD versions
Ursula Braun [Fri, 10 Jul 2020 19:21:45 +0000 (21:21 +0200)]
net/smc: tolerate future SMCD versions

BugLink: https://bugs.launchpad.net/bugs/1882088
CLC proposal messages of future SMCD versions could be larger than SMCD
V1 CLC proposal messages.
To enable toleration in SMC V1 the receival of CLC proposal messages
is adapted:
* accept larger length values in CLC proposal
* check trailing eye catcher for incoming CLC proposal with V1 length only
* receive the whole CLC proposal even in cases it does not fit into the
  V1 buffer

Fixes: e7b7a64a8493d ("smc: support variable CLC proposal messages")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fb4f79264c0fc6fd5a68ffe3e31bfff97311e1f1 linux-next)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agodebian/dkms-versions: update ZFS dkms package version (LP: #1881107)
Colin Ian King [Wed, 8 Jul 2020 10:17:40 +0000 (11:17 +0100)]
debian/dkms-versions: update ZFS dkms package version (LP: #1881107)

BugLink: https://bugs.launchpad.net/bugs/1881107
Update the ZFS dkms version to pick up latest ZFS that contains
a backport of the AES-GCM performance accelleration updates.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agobcache: check and adjust logical block size for backing devices
Mauricio Faria de Oliveira [Tue, 7 Jul 2020 15:32:50 +0000 (12:32 -0300)]
bcache: check and adjust logical block size for backing devices

BugLink: https://bugs.launchpad.net/bugs/1867916
It's possible for a block driver to set logical block size to
a value greater than page size incorrectly; e.g. bcache takes
the value from the superblock, set by the user w/ make-bcache.

This causes a BUG/NULL pointer dereference in the path:

  __blkdev_get()
  -> set_init_blocksize() // set i_blkbits based on ...
     -> bdev_logical_block_size()
        -> queue_logical_block_size() // ... this value
  -> bdev_disk_changed()
     ...
     -> blkdev_readpage()
        -> block_read_full_page()
           -> create_page_buffers() // size = 1 << i_blkbits
              -> create_empty_buffers() // give size/take pointer
                 -> alloc_page_buffers() // return NULL
                 .. BUG!

Because alloc_page_buffers() is called with size > PAGE_SIZE,
thus it initializes head = NULL, skips the loop, return head;
then create_empty_buffers() gets (and uses) the NULL pointer.

This has been around longer than commit ad6bf88a6c19 ("block:
fix an integer overflow in logical block size"); however, it
increased the range of values that can trigger the issue.

Previously only 8k/16k/32k (on x86/4k page size) would do it,
as greater values overflow unsigned short to zero, and queue_
logical_block_size() would then use the default of 512.

Now the range with unsigned int is much larger, and users w/
the 512k value, which happened to be zero'ed previously and
work fine, started to hit this issue -- as the zero is gone,
and queue_logical_block_size() does return 512k (>PAGE_SIZE.)

Fix this by checking the bcache device's logical block size,
and if it's greater than page size, fallback to the backing/
cached device's logical page size.

This doesn't affect cache devices as those are still checked
for block/page size in read_super(); only the backing/cached
devices are not.

Apparently it's a regression from commit 2903381fce71 ("bcache:
Take data offset from the bdev superblock."), moving the check
into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks
of backing devices out there with this larger value, we cannot
refuse to load them (i.e., have a similar check in _BDEV.)

Ideally perhaps bcache should use all values from the backing
device (physical/logical/io_min block size)? But for now just
fix the problematic case.

Test-case:

    # IMG=/root/disk.img
    # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G
    # DEV=$(losetup --find --show $IMG)
    # make-bcache --bdev $DEV --block 8k
      < see dmesg >

Before:

    # uname -r
    5.7.0-rc7

    [   55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000
    ...
    [   55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4
    ...
    [   55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100
    ...
    [   55.966434] Call Trace:
    [   55.967021]  create_page_buffers+0x48/0x50
    [   55.967834]  block_read_full_page+0x49/0x380
    [   55.972181]  do_read_cache_page+0x494/0x610
    [   55.974780]  read_part_sector+0x2d/0xaa
    [   55.975558]  read_lba+0x10e/0x1e0
    [   55.977904]  efi_partition+0x120/0x5a6
    [   55.980227]  blk_add_partitions+0x161/0x390
    [   55.982177]  bdev_disk_changed+0x61/0xd0
    [   55.982961]  __blkdev_get+0x350/0x490
    [   55.983715]  __device_add_disk+0x318/0x480
    [   55.984539]  bch_cached_dev_run+0xc5/0x270
    [   55.986010]  register_bcache.cold+0x122/0x179
    [   55.987628]  kernfs_fop_write+0xbc/0x1a0
    [   55.988416]  vfs_write+0xb1/0x1a0
    [   55.989134]  ksys_write+0x5a/0xd0
    [   55.989825]  do_syscall_64+0x43/0x140
    [   55.990563]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [   55.991519] RIP: 0033:0x7f7d60ba3154
    ...

After:

    # uname -r
    5.7.0.bcachelbspgsz

    [   31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512)
    [   31.675133] bcache: register_bdev() registered backing device loop0

    # grep ^ /sys/block/bcache0/queue/*_block_size
    /sys/block/bcache0/queue/logical_block_size:512
    /sys/block/bcache0/queue/physical_block_size:8192

Reported-by: Ryan Finnie <ryan@finnie.org>
Reported-by: Sebastian Marsching <sebastian@marsching.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(backported from commit dcacbc1242c71e18fa9d2eadc5647e115c9c627d)
[mfo: backport: hunks 1/3/4: adjust bcache_device_init() signature
 and calls for the lack of parameter make_request_fn from upstream.]
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Acked-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agor8169: fix firmware not resetting tp->ocp_base
Heiner Kallweit [Sun, 28 Jun 2020 07:58:42 +0000 (03:58 -0400)]
r8169: fix firmware not resetting tp->ocp_base

BugLink: https://bugs.launchpad.net/bugs/1885072
Typically the firmware takes care that tp->ocp_base is reset to its
default value. That's not the case (at least) for RTL8117.
As a result subsequent PHY access reads/writes the wrong page and
the link is broken. Fix this be resetting tp->ocp_base explicitly.

Fixes: 229c1e0dfd3d ("r8169: load firmware for RTL8168fp/RTL8117")
Reported-by: Aaron Ma <mapengyu@gmail.com>
Tested-by: Aaron Ma <mapengyu@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 89fbd26cca7ec9e82ec4787a4b6e95939b57d073)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agor8169: fix OCP access on RTL8117
Heiner Kallweit [Sun, 28 Jun 2020 07:58:41 +0000 (03:58 -0400)]
r8169: fix OCP access on RTL8117

BugLink: https://bugs.launchpad.net/bugs/1885072
According to r8168 vendor driver DASHv3 chips like RTL8168fp/RTL8117
need a special addressing for OCP access.
Fix is compile-tested only due to missing test hardware.

Fixes: 1287723aa139 ("r8169: add support for RTL8117")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 561535b0f23961ced071b82575d5e83e6351a814)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agor8169: load firmware for RTL8168fp/RTL8117
Heiner Kallweit [Sun, 28 Jun 2020 07:58:40 +0000 (03:58 -0400)]
r8169: load firmware for RTL8168fp/RTL8117

BugLink: https://bugs.launchpad.net/bugs/1885072
Load Realtek-provided firmware for RTL8168fp/RTL8117. Unlike the
firmware for other chip versions which is for the PHY, firmware for
RTL8168fp/RTL8117 is for the MAC.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 229c1e0dfd3dfac23b33b7cd88bcb7a9e391b000)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agor8169: add support for RTL8117
Heiner Kallweit [Sun, 28 Jun 2020 07:58:39 +0000 (03:58 -0400)]
r8169: add support for RTL8117

BugLink: https://bugs.launchpad.net/bugs/1885072
Add support for chip version RTL8117. Settings have been copied from
Realtek's r8168 driver, there however chip ID 54a belongs to a chip
version called RTL8168FP. It was confirmed that RTL8117 works with
Realtek's driver, so both chip versions seem to be the same or at
least compatible.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(backported from commit 1287723aa139b46dc0b9b53c7212af71f1acfc39)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agor8169: add helper r8168g_phy_param
Heiner Kallweit [Sun, 28 Jun 2020 07:58:38 +0000 (03:58 -0400)]
r8169: add helper r8168g_phy_param

BugLink: https://bugs.launchpad.net/bugs/1885072
Integrated PHY's from RTL8168g support an indirect access method for
PHY parameters. On page 0x0a43 parameter number is written to register
0x13, then the parameter can be accessed via register 0x14.
Add a helper for this to improve readability and simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8bfdce1defb1fb62bcc0e4929abd756585dabd0d)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agos390/cpum_cf: Add new extended counters for IBM z15
Thomas Richter [Wed, 24 Jun 2020 20:11:05 +0000 (22:11 +0200)]
s390/cpum_cf: Add new extended counters for IBM z15

BugLink: https://bugs.launchpad.net/bugs/1881096
Add CPU measurement counter facility event description for IBM z15.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
(cherry picked from commit d68d5d51dc898895b4e15bea52e5668ca9e76180)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoUBUNTU: SAUCE: shiftfs: prevent ESTALE for LOOKUP_JUMP lookups
Christian Brauner [Tue, 23 Jun 2020 17:46:16 +0000 (19:46 +0200)]
UBUNTU: SAUCE: shiftfs: prevent ESTALE for LOOKUP_JUMP lookups

BugLink: https://bugs.launchpad.net/bugs/1872757
Users reported that creating temporary files shiftfs reports ESTALE.
This can be reproduced via:

import tempfile
import os

def test():
    with tempfile.TemporaryFile() as fd:
        fd.write("data".encode('utf-8'))
        # re-open the file to get a read-only file descriptor
        return open(f"/proc/self/fd/{fd.fileno()}", "r")

def main():
   fd = test()
   fd.close()

if __name__ == "__main__":
    main()

a similar issue was reported here:
https://github.com/systemd/systemd/issues/14861

Our revalidate methods were very opinionated about whether or not a
lower dentry was valid especially when it became unlinked we simply
invalidated the lower dentry which caused above bug to surface. This has
led to bugs where a ESTALE was returned for e.g.  temporary files that
were created and directly re-opened afterwards through
/proc/<pid>/fd/<nr-of-deleted-file>. When a file is re-opened through
/proc/<pid>/fd/<nr> LOOKUP_JUMP is set and the vfs will revalidate via
d_weak_revalidate(). Since the file has been unhashed or even already
gone negative we'd fail the open when we should've succeeded.

Reported-by: Christian Kellner <ckellner@redhat.com>
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Cc: Seth Forshee <seth.forshee@canonical.com>
Link: https://github.com/systemd/systemd/issues/14861
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoUBUNTU: SAUCE: Revert "UBUNTU: SAUCE: shiftfs: fix dentry revalidation"
Christian Brauner [Tue, 23 Jun 2020 17:46:15 +0000 (19:46 +0200)]
UBUNTU: SAUCE: Revert "UBUNTU: SAUCE: shiftfs: fix dentry revalidation"

BugLink: https://bugs.launchpad.net/bugs/1884767
With patch cfaa482afb97 ("UBUNTU: SAUCE: shiftfs: fix dentry revalidation")
we tried to fix
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1872757 but it
introduces a regression for shiftfs on btrfs users. Creating a btrfs
subvolume, deleting it, and then trying to recreate it will cause EEXIST
to be returned or the file be left in a half-created state. Faulty
behavior such as this can be reproduced via:

btrfs subvolume create my-subvol
ls -al my-subvol
btrfs subvolume delete my-subvol

We thus need to revert this change restoring the old behavior.This will
briefly resurface https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1872757
which is fixed in a follow-up patch on top of this revert. We simply
split out the minimal fix into a separate tiny patch.

This reverts commit cfaa482afb97e3c05d020af80b897b061109d51f.

Cc: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoUBUNTU: upstream stable to v5.4.52
Kamal Mostafa [Thu, 16 Jul 2020 18:43:24 +0000 (11:43 -0700)]
UBUNTU: upstream stable to v5.4.52

BugLink: https://bugs.launchpad.net/bugs/1887853
Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoLinux 5.4.52
Greg Kroah-Hartman [Thu, 16 Jul 2020 06:16:48 +0000 (08:16 +0200)]
Linux 5.4.52

BugLink: https://bugs.launchpad.net/bugs/1887853
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agos390/maccess: add no DAT mode to kernel_write
Vasily Gorbik [Wed, 24 Jun 2020 15:39:14 +0000 (17:39 +0200)]
s390/maccess: add no DAT mode to kernel_write

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit d6df52e9996dcc2062c3d9c9123288468bb95b52 ]

To be able to patch kernel code before paging is initialized do plain
memcpy if DAT is off. This is required to enable early jump label
initialization.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agos390: Change s390_kernel_write() return type to match memcpy()
Josh Poimboeuf [Wed, 29 Apr 2020 15:24:47 +0000 (10:24 -0500)]
s390: Change s390_kernel_write() return type to match memcpy()

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit cb2cceaefb4c4dc28fc27ff1f1b2d258bfc10353 ]

s390_kernel_write()'s function type is almost identical to memcpy().
Change its return type to "void *" so they can be used interchangeably.

Cc: linux-s390@vger.kernel.org
Cc: heiko.carstens@de.ibm.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agopwm: jz4740: Fix build failure
Uwe Kleine-König [Fri, 10 Jul 2020 10:27:58 +0000 (12:27 +0200)]
pwm: jz4740: Fix build failure

BugLink: https://bugs.launchpad.net/bugs/1887853
When commit 9017dc4fbd59 ("pwm: jz4740: Enhance precision in calculation
of duty cycle") from v5.8-rc1 was backported to v5.4.x its dependency on
commit ce1f9cece057 ("pwm: jz4740: Use clocks from TCU driver") was not
noticed which made the pwm-jz4740 driver fail to build.

As ce1f9cece057 depends on still more rework, just backport a small part
of this commit to make the driver build again. (There is no dependency
on the functionality introduced in ce1f9cece057, just the rate variable
is needed.)

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoperf scripts python: exported-sql-viewer.py: Fix unexpanded 'Find' result
Adrian Hunter [Mon, 29 Jun 2020 09:19:52 +0000 (12:19 +0300)]
perf scripts python: exported-sql-viewer.py: Fix unexpanded 'Find' result

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 3a3cf7c570a486b07d9a6e68a77548aea6a8421f upstream.

Using Python version 3.8.2 and PySide2 version 5.14.0, ctrl-F ('Find')
would not expand the tree to the result. Fix by using setExpanded().

Example:

  $ perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB perf.data ]
  $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
  2020-06-26 15:32:14.928997 Creating database ...
  2020-06-26 15:32:14.933971 Writing records...
  2020-06-26 15:32:15.535251 Adding indexes
  2020-06-26 15:32:15.542993 Dropping unused tables
  2020-06-26 15:32:15.549716 Done
  $ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db

  Select: Reports -> Context-Sensitive Call Graph    or     Reports -> Call Tree
  Press: Ctrl-F
  Enter: main
  Press: Enter

Before: line showing 'main' does not display

After: tree is expanded to line showing 'main'

Fixes: ebd70c7dc2f5f ("perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoperf scripts python: exported-sql-viewer.py: Fix zero id in call tree 'Find' result
Adrian Hunter [Mon, 29 Jun 2020 09:19:54 +0000 (12:19 +0300)]
perf scripts python: exported-sql-viewer.py: Fix zero id in call tree 'Find' result

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 031c8d5edb1ddeb6d398f7942ce2a01a1a51ada9 upstream.

Using ctrl-F ('Find') would not find 'unknown' because it matches id
zero.  Fix by excluding id zero from selection.

Example:

   $ perf record -e intel_pt//u uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.034 MB perf.data ]
   $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
   2020-06-26 15:32:14.928997 Creating database ...
   2020-06-26 15:32:14.933971 Writing records...
   2020-06-26 15:32:15.535251 Adding indexes
   2020-06-26 15:32:15.542993 Dropping unused tables
   2020-06-26 15:32:15.549716 Done
   $ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db

   Select: Reports -> Call Tree
   Press: Ctrl-F
   Enter: unknown
   Press: Enter

Before: displays 'unknown' not found
After: tree is expanded to line showing 'unknown'

Fixes: ae8b887c00d3f ("perf scripts python: exported-sql-viewer.py: Add call tree")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoperf scripts python: exported-sql-viewer.py: Fix zero id in call graph 'Find' result
Adrian Hunter [Mon, 29 Jun 2020 09:19:53 +0000 (12:19 +0300)]
perf scripts python: exported-sql-viewer.py: Fix zero id in call graph 'Find' result

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 7ff520b0a71dd2db695b52ad117d81b7eaf6ff9d upstream.

Using ctrl-F ('Find') would not find 'unknown' because it matches id zero.
Fix by excluding id zero from selection.

Example:

  $ perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB perf.data ]
  $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
  2020-06-26 15:32:14.928997 Creating database ...
  2020-06-26 15:32:14.933971 Writing records...
  2020-06-26 15:32:15.535251 Adding indexes
  2020-06-26 15:32:15.542993 Dropping unused tables
  2020-06-26 15:32:15.549716 Done
  $ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db

  Select: Reports -> Context-Sensitive Call Graph
  Press: Ctrl-F
  Enter: unknown
  Press: Enter

Before: gets stuck
After: tree is expanded to line showing 'unknown'

Fixes: 254c0d820b86d ("perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoperf scripts python: export-to-postgresql.py: Fix struct.pack() int argument
Adrian Hunter [Mon, 29 Jun 2020 09:19:50 +0000 (12:19 +0300)]
perf scripts python: export-to-postgresql.py: Fix struct.pack() int argument

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 640432e6bed08e9d5d2ba26856ba3f55008b07e3 upstream.

Python 3.8 is requiring that arguments being packed as integers are also
integers.  Add int() accordingly.

 Before:

   $ perf record -e intel_pt//u uname
   $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py perf_data_db branches calls
   2020-06-25 16:09:10.547256 Creating database...
   2020-06-25 16:09:10.733185 Writing to intermediate files...
   Traceback (most recent call last):
     File "/home/ahunter/libexec/perf-core/scripts/python/export-to-postgresql.py", line 1106, in synth_data
       cbr(id, raw_buf)
     File "/home/ahunter/libexec/perf-core/scripts/python/export-to-postgresql.py", line 1058, in cbr
       value = struct.pack("!hiqiiiiii", 4, 8, id, 4, cbr, 4, MHz, 4, percent)
   struct.error: required argument is not an integer
   Fatal Python error: problem in Python trace event handler
   Python runtime state: initialized

   Current thread 0x00007f35d3695780 (most recent call first):
   <no Python frame>
   Aborted (core dumped)

 After:

   $ dropdb perf_data_db
   $ rm -rf perf_data_db-perf-data
   $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py perf_data_db branches calls
   2020-06-25 16:09:40.990267 Creating database...
   2020-06-25 16:09:41.207009 Writing to intermediate files...
   2020-06-25 16:09:41.270915 Copying to database...
   2020-06-25 16:09:41.382030 Removing intermediate files...
   2020-06-25 16:09:41.384630 Adding primary keys
   2020-06-25 16:09:41.541894 Adding foreign keys
   2020-06-25 16:09:41.677044 Dropping unused tables
   2020-06-25 16:09:41.703761 Done

Fixes: aba44287a224 ("perf scripts python: export-to-postgresql.py: Export Intel PT power and ptwrite events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agodm writecache: reject asynchronous pmem devices
Michal Suchanek [Tue, 30 Jun 2020 15:49:24 +0000 (17:49 +0200)]
dm writecache: reject asynchronous pmem devices

BugLink: https://bugs.launchpad.net/bugs/1887853
commit a46624580376a3a0beb218d94cbc7f258696e29f upstream.

DM writecache does not handle asynchronous pmem. Reject it when
supplied as cache.

Link: https://lore.kernel.org/linux-nvdimm/87lfk5hahc.fsf@linux.ibm.com/
Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 5.3+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoblk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight()
Ming Lei [Tue, 7 Jul 2020 15:04:33 +0000 (11:04 -0400)]
blk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight()

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 05a4fed69ff00a8bd83538684cb602a4636b07a7 upstream.

dm-multipath is the only user of blk_mq_queue_inflight().  When
dm-multipath calls blk_mq_queue_inflight() to check if it has
outstanding IO it can get a false negative.  The reason for this is
blk_mq_rq_inflight() doesn't consider requests that are no longer
MQ_RQ_IN_FLIGHT but that are now MQ_RQ_COMPLETE (->complete isn't
called or finished yet) as "inflight".

This causes request-based dm-multipath's dm_wait_for_completion() to
return before all outstanding dm-multipath requests have actually
completed.  This breaks DM multipath's suspend functionality because
blk-mq requests complete after DM's suspend has finished -- which
shouldn't happen.

Fix this by considering any request not in the MQ_RQ_IDLE state
(so either MQ_RQ_COMPLETE or MQ_RQ_IN_FLIGHT) as "inflight" in
blk_mq_rq_inflight().

Fixes: 3c94d83cb3526 ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agos390/mm: fix huge pte soft dirty copying
Janosch Frank [Tue, 7 Jul 2020 13:38:54 +0000 (15:38 +0200)]
s390/mm: fix huge pte soft dirty copying

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 528a9539348a0234375dfaa1ca5dbbb2f8f8e8d2 upstream.

If the pmd is soft dirty we must mark the pte as soft dirty (and not dirty).
This fixes some cases for guest migration with huge page backings.

Cc: <stable@vger.kernel.org> # 4.8
Fixes: bc29b7ac1d9f ("s390/mm: clean up pte/pmd encoding")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agos390/setup: init jump labels before command line parsing
Vasily Gorbik [Thu, 18 Jun 2020 15:17:19 +0000 (17:17 +0200)]
s390/setup: init jump labels before command line parsing

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 95e61b1b5d6394b53d147c0fcbe2ae70fbe09446 upstream.

Command line parameters might set static keys. This is true for s390 at
least since commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1
and init_on_free=1 boot options"). To avoid the following WARN:

static_key_enable_cpuslocked(): static key 'init_on_alloc+0x0/0x40' used
before call to jump_label_init()

call jump_label_init() just before parse_early_param().
jump_label_init() is safe to call multiple times (x86 does that), doesn't
do any memory allocations and hence should be safe to call that early.

Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Cc: <stable@vger.kernel.org> # 5.3: d6df52e9996d: s390/maccess: add no DAT mode to kernel_write
Cc: <stable@vger.kernel.org> # 5.3
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoARC: elf: use right ELF_ARCH
Vineet Gupta [Wed, 27 May 2020 21:18:45 +0000 (14:18 -0700)]
ARC: elf: use right ELF_ARCH

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b7faf971081a4e56147f082234bfff55135305cb upstream.

Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE
Vineet Gupta [Wed, 20 May 2020 05:28:32 +0000 (22:28 -0700)]
ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 00fdec98d9881bf5173af09aebd353ab3b9ac729 upstream.

Trap handler for syscall tracing reads EFA (Exception Fault Address),
in case strace wants PC of trap instruction (EFA is not part of pt_regs
as of current code).

However this EFA read is racy as it happens after dropping to pure
kernel mode (re-enabling interrupts). A taken interrupt could
context-switch, trigger a different task's trap, clobbering EFA for this
execution context.

Fix this by reading EFA early, before re-enabling interrupts. A slight
side benefit is de-duplication of FAKE_RET_FROM_EXCPN in trap handler.
The trap handler is common to both ARCompact and ARCv2 builds too.

This just came out of code rework/review and no real problem was reported
but is clearly a potential problem specially for strace.

Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agommc: meson-gx: limit segments to 1 when dram-access-quirk is needed
Neil Armstrong [Mon, 8 Jun 2020 08:44:58 +0000 (10:44 +0200)]
mmc: meson-gx: limit segments to 1 when dram-access-quirk is needed

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 27a5e7d36d383970affae801d77141deafd536a8 upstream.

The actual max_segs computation leads to failure while using the broadcom
sdio brcmfmac/bcmsdh driver, since the driver tries to make usage of
scatter gather.

But with the dram-access-quirk we use a 1,5K SRAM bounce buffer, and the
max_segs current value of 3 leads to max transfers to 4,5k, which doesn't
work.

This patch sets max_segs to 1 to better describe the hardware limitation,
and fix the SDIO functionality with the brcmfmac/bcmsdh driver on Amlogic
G12A/G12B SoCs on boards like SEI510 or Khadas VIM3.

Reported-by: Art Nikpal <art@khadas.com>
Reported-by: Christian Hewitt <christianshewitt@gmail.com>
Fixes: acdc8e71d9bb ("mmc: meson-gx: add dram-access-quirk")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200608084458.32014-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agodm: use noio when sending kobject event
Mikulas Patocka [Wed, 8 Jul 2020 16:25:20 +0000 (12:25 -0400)]
dm: use noio when sending kobject event

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6958c1c640af8c3f40fa8a2eee3b5b905d95b677 upstream.

kobject_uevent may allocate memory and it may be called while there are dm
devices suspended. The allocation may recurse into a suspended device,
causing a deadlock. We must set the noio flag when sending a uevent.

The observed deadlock was reported here:
https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html

Reported-by: Khazhismel Kumykov <khazhy@google.com>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agodrm/amdgpu: don't do soft recovery if gpu_recovery=0
Marek Olšák [Mon, 6 Jul 2020 22:23:17 +0000 (18:23 -0400)]
drm/amdgpu: don't do soft recovery if gpu_recovery=0

BugLink: https://bugs.launchpad.net/bugs/1887853
commit f4892c327a8e5df7ce16cab40897daf90baf6bec upstream.

It's impossible to debug shader hangs with soft recovery.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agodrm/radeon: fix double free
Tom Rix [Mon, 6 Jul 2020 12:28:57 +0000 (05:28 -0700)]
drm/radeon: fix double free

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 41855a898650803e24b284173354cc3e44d07725 upstream.

clang static analysis flags this error

drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
                kfree(rdev->pm.dpm.ps[i].ps_priv);
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
        kfree(rdev->pm.dpm.ps);
        ^~~~~~~~~~~~~~~~~~~~~~

problem is reported in ci_dpm_fini, with these code blocks.

for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
kfree(rdev->pm.dpm.ps[i].ps_priv);
}
kfree(rdev->pm.dpm.ps);

The first free happens in ci_parse_power_table where it cleans up locally
on a failure.  ci_dpm_fini also does a cleanup.

ret = ci_parse_power_table(rdev);
if (ret) {
ci_dpm_fini(rdev);
return ret;
}

So remove the cleanup in ci_parse_power_table and
move the num_ps calculation to inside the loop so ci_dpm_fini
will know how many array elements to free.

Fixes: cc8dbbb4f62a ("drm/radeon: add dpm support for CI dGPUs (v2)")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agobtrfs: fix double put of block group with nocow
Josef Bacik [Mon, 6 Jul 2020 13:14:12 +0000 (09:14 -0400)]
btrfs: fix double put of block group with nocow

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 230ed397435e85b54f055c524fcb267ae2ce3bc4 upstream.

While debugging a patch that I wrote I was hitting use-after-free panics
when accessing block groups on unmount.  This turned out to be because
in the nocow case if we bail out of doing the nocow for whatever reason
we need to call btrfs_dec_nocow_writers() if we called the inc.  This
puts our block group, but a few error cases does

if (nocow) {
    btrfs_dec_nocow_writers();
    goto error;
}

unfortunately, error is

error:
if (nocow)
btrfs_dec_nocow_writers();

so we get a double put on our block group.  Fix this by dropping the
error cases calling of btrfs_dec_nocow_writers(), as it's handled at the
error label now.

Fixes: 762bf09893b4 ("btrfs: improve error handling in run_delalloc_nocow")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agobtrfs: fix fatal extent_buffer readahead vs releasepage race
Boris Burkov [Wed, 17 Jun 2020 18:35:19 +0000 (11:35 -0700)]
btrfs: fix fatal extent_buffer readahead vs releasepage race

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6bf9cd2eed9aee6d742bb9296c994a91f5316949 upstream.

Under somewhat convoluted conditions, it is possible to attempt to
release an extent_buffer that is under io, which triggers a BUG_ON in
btrfs_release_extent_buffer_pages.

This relies on a few different factors. First, extent_buffer reads done
as readahead for searching use WAIT_NONE, so they free the local extent
buffer reference while the io is outstanding. However, they should still
be protected by TREE_REF. However, if the system is doing signficant
reclaim, and simultaneously heavily accessing the extent_buffers, it is
possible for releasepage to race with two concurrent readahead attempts
in a way that leaves TREE_REF unset when the readahead extent buffer is
released.

Essentially, if two tasks race to allocate a new extent_buffer, but the
winner who attempts the first io is rebuffed by a page being locked
(likely by the reclaim itself) then the loser will still go ahead with
issuing the readahead. The loser's call to find_extent_buffer must also
race with the reclaim task reading the extent_buffer's refcount as 1 in
a way that allows the reclaim to re-clear the TREE_REF checked by
find_extent_buffer.

The following represents an example execution demonstrating the race:

            CPU0                                                         CPU1                                           CPU2
reada_for_search                                            reada_for_search
  readahead_tree_block                                        readahead_tree_block
    find_create_tree_block                                      find_create_tree_block
      alloc_extent_buffer                                         alloc_extent_buffer
                                                                  find_extent_buffer // not found
                                                                  allocates eb
                                                                  lock pages
                                                                  associate pages to eb
                                                                  insert eb into radix tree
                                                                  set TREE_REF, refs == 2
                                                                  unlock pages
                                                              read_extent_buffer_pages // WAIT_NONE
                                                                not uptodate (brand new eb)
                                                                                                            lock_page
                                                                if !trylock_page
                                                                  goto unlock_exit // not an error
                                                              free_extent_buffer
                                                                release_extent_buffer
                                                                  atomic_dec_and_test refs to 1
        find_extent_buffer // found
                                                                                                            try_release_extent_buffer
                                                                                                              take refs_lock
                                                                                                              reads refs == 1; no io
          atomic_inc_not_zero refs to 2
          mark_buffer_accessed
            check_buffer_tree_ref
              // not STALE, won't take refs_lock
              refs == 2; TREE_REF set // no action
    read_extent_buffer_pages // WAIT_NONE
                                                                                                              clear TREE_REF
                                                                                                              release_extent_buffer
                                                                                                                atomic_dec_and_test refs to 1
                                                                                                                unlock_page
      still not uptodate (CPU1 read failed on trylock_page)
      locks pages
      set io_pages > 0
      submit io
      return
    free_extent_buffer
      release_extent_buffer
        dec refs to 0
        delete from radix tree
        btrfs_release_extent_buffer_pages
          BUG_ON(io_pages > 0)!!!

We observe this at a very low rate in production and were also able to
reproduce it in a test environment by introducing some spurious delays
and by introducing probabilistic trylock_page failures.

To fix it, we apply check_tree_ref at a point where it could not
possibly be unset by a competing task: after io_pages has been
incremented. All the codepaths that clear TREE_REF check for io, so they
would not be able to clear it after this point until the io is done.

Stack trace, for reference:
[1417839.424739] ------------[ cut here ]------------
[1417839.435328] kernel BUG at fs/btrfs/extent_io.c:4841!
[1417839.447024] invalid opcode: 0000 [#1] SMP
[1417839.502972] RIP: 0010:btrfs_release_extent_buffer_pages+0x20/0x1f0
[1417839.517008] Code: ed e9 ...
[1417839.558895] RSP: 0018:ffffc90020bcf798 EFLAGS: 00010202
[1417839.570816] RAX: 0000000000000002 RBX: ffff888102d6def0 RCX: 0000000000000028
[1417839.586962] RDX: 0000000000000002 RSI: ffff8887f0296482 RDI: ffff888102d6def0
[1417839.603108] RBP: ffff88885664a000 R08: 0000000000000046 R09: 0000000000000238
[1417839.619255] R10: 0000000000000028 R11: ffff88885664af68 R12: 0000000000000000
[1417839.635402] R13: 0000000000000000 R14: ffff88875f573ad0 R15: ffff888797aafd90
[1417839.651549] FS:  00007f5a844fa700(0000) GS:ffff88885f680000(0000) knlGS:0000000000000000
[1417839.669810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1417839.682887] CR2: 00007f7884541fe0 CR3: 000000049f609002 CR4: 00000000003606e0
[1417839.699037] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1417839.715187] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1417839.731320] Call Trace:
[1417839.737103]  release_extent_buffer+0x39/0x90
[1417839.746913]  read_block_for_search.isra.38+0x2a3/0x370
[1417839.758645]  btrfs_search_slot+0x260/0x9b0
[1417839.768054]  btrfs_lookup_file_extent+0x4a/0x70
[1417839.778427]  btrfs_get_extent+0x15f/0x830
[1417839.787665]  ? submit_extent_page+0xc4/0x1c0
[1417839.797474]  ? __do_readpage+0x299/0x7a0
[1417839.806515]  __do_readpage+0x33b/0x7a0
[1417839.815171]  ? btrfs_releasepage+0x70/0x70
[1417839.824597]  extent_readpages+0x28f/0x400
[1417839.833836]  read_pages+0x6a/0x1c0
[1417839.841729]  ? startup_64+0x2/0x30
[1417839.849624]  __do_page_cache_readahead+0x13c/0x1a0
[1417839.860590]  filemap_fault+0x6c7/0x990
[1417839.869252]  ? xas_load+0x8/0x80
[1417839.876756]  ? xas_find+0x150/0x190
[1417839.884839]  ? filemap_map_pages+0x295/0x3b0
[1417839.894652]  __do_fault+0x32/0x110
[1417839.902540]  __handle_mm_fault+0xacd/0x1000
[1417839.912156]  handle_mm_fault+0xaa/0x1c0
[1417839.921004]  __do_page_fault+0x242/0x4b0
[1417839.930044]  ? page_fault+0x8/0x30
[1417839.937933]  page_fault+0x1e/0x30
[1417839.945631] RIP: 0033:0x33c4bae
[1417839.952927] Code: Bad RIP value.
[1417839.960411] RSP: 002b:00007f5a844f7350 EFLAGS: 00010206
[1417839.972331] RAX: 000000000000006e RBX: 1614b3ff6a50398a RCX: 0000000000000000
[1417839.988477] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002
[1417840.004626] RBP: 00007f5a844f7420 R08: 000000000000006e R09: 00007f5a94aeccb8
[1417840.020784] R10: 00007f5a844f7350 R11: 0000000000000000 R12: 00007f5a94aecc79
[1417840.036932] R13: 00007f5a94aecc78 R14: 00007f5a94aecc90 R15: 00007f5a94aecc40

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agobpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()
Kees Cook [Thu, 2 Jul 2020 22:45:23 +0000 (15:45 -0700)]
bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 63960260457a02af2a6cb35d75e6bdb17299c882 upstream.

When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agokprobes: Do not expose probe addresses to non-CAP_SYSLOG
Kees Cook [Thu, 2 Jul 2020 22:20:22 +0000 (15:20 -0700)]
kprobes: Do not expose probe addresses to non-CAP_SYSLOG

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 60f7bb66b88b649433bf700acfc60c3f24953871 upstream.

The kprobe show() functions were using "current"'s creds instead
of the file opener's creds for kallsyms visibility. Fix to use
seq_file->file->f_cred.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 81365a947de4 ("kprobes: Show address of kprobes if kallsyms does")
Fixes: ffb9bd68ebdb ("kprobes: Show blacklist addresses as same as kallsyms does")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agomodule: Do not expose section addresses to non-CAP_SYSLOG
Kees Cook [Thu, 2 Jul 2020 21:43:59 +0000 (14:43 -0700)]
module: Do not expose section addresses to non-CAP_SYSLOG

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b25a7c5af9051850d4f3d93ca500056ab6ec724b upstream.

The printing of section addresses in /sys/module/*/sections/* was not
using the correct credentials to evaluate visibility.

Before:

 # cat /sys/module/*/sections/.*text
 0xffffffffc0458000
 ...
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0xffffffffc0458000
 ...

After:

 # cat /sys/module/*/sections/*.text
 0xffffffffc0458000
 ...
 # capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
 0x0000000000000000
 ...

Additionally replaces the existing (safe) /proc/modules check with
file->f_cred for consistency.

Reported-by: Dominik Czarnota <dominik.czarnota@trailofbits.com>
Fixes: be71eda5383f ("module: Fix display of wrong module .text address")
Cc: stable@vger.kernel.org
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agomodule: Refactor section attr into bin attribute
Kees Cook [Thu, 2 Jul 2020 20:47:20 +0000 (13:47 -0700)]
module: Refactor section attr into bin attribute

BugLink: https://bugs.launchpad.net/bugs/1887853
commit ed66f991bb19d94cae5d38f77de81f96aac7813f upstream.

In order to gain access to the open file's f_cred for kallsym visibility
permission checks, refactor the module section attributes to use the
bin_attribute instead of attribute interface. Additionally removes the
redundant "name" struct member.

Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agokallsyms: Refactor kallsyms_show_value() to take cred
Kees Cook [Thu, 2 Jul 2020 18:49:23 +0000 (11:49 -0700)]
kallsyms: Refactor kallsyms_show_value() to take cred

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 160251842cd35a75edfb0a1d76afa3eb674ff40a upstream.

In order to perform future tests against the cred saved during open(),
switch kallsyms_show_value() to operate on a cred, and have all current
callers pass current_cred(). This makes it very obvious where callers
are checking the wrong credential in their "read" contexts. These will
be fixed in the coming patches.

Additionally switch return value to bool, since it is always used as a
direct permission check, not a 0-on-success, negative-on-error style
function return.

Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE
Steven Price [Wed, 17 Jun 2020 10:54:56 +0000 (11:54 +0100)]
KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE

BugLink: https://bugs.launchpad.net/bugs/1887853
If SVE is enabled then 'ret' can be assigned the return value of
kvm_vcpu_enable_sve() which may be 0 causing future "goto out" sites to
erroneously return 0 on failure rather than -EINVAL as expected.

Remove the initialisation of 'ret' and make setting the return value
explicit to avoid this situation in the future.

Fixes: 9a3cdf26e336 ("KVM: arm64/sve: Allow userspace to enable SVE for vcpus")
Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200617105456.28245-1-steven.price@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: x86: Mark CR4.TSD as being possibly owned by the guest
Sean Christopherson [Fri, 3 Jul 2020 04:04:21 +0000 (21:04 -0700)]
KVM: x86: Mark CR4.TSD as being possibly owned by the guest

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 7c83d096aed055a7763a03384f92115363448b71 upstream.

Mark CR4.TSD as being possibly owned by the guest as that is indeed the
case on VMX.  Without TSD being tagged as possibly owned by the guest, a
targeted read of CR4 to get TSD could observe a stale value.  This bug
is benign in the current code base as the sole consumer of TSD is the
emulator (for RDTSC) and the emulator always "reads" the entirety of CR4
when grabbing bits.

Add a build-time assertion in to ensure VMX doesn't hand over more CR4
bits without also updating x86.

Fixes: 52ce3c21aec3 ("x86,kvm,vmx: Don't trap writes to CR4.TSD")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode
Sean Christopherson [Fri, 3 Jul 2020 02:17:14 +0000 (19:17 -0700)]
KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode

BugLink: https://bugs.launchpad.net/bugs/1887853
commit d74fcfc1f0ff4b6c26ecef1f9e48d8089ab4eaac upstream.

Inject a #GP on MOV CR4 if CR4.LA57 is toggled in 64-bit mode, which is
illegal per Intel's SDM:

  CR4.LA57
    57-bit linear addresses (bit 12 of CR4) ... blah blah blah ...
    This bit cannot be modified in IA-32e mode.

Note, the pseudocode for MOV CR doesn't call out the fault condition,
which is likely why the check was missed during initial development.
This is arguably an SDM bug and will hopefully be fixed in future
release of the SDM.

Fixes: fd8cb433734ee ("KVM: MMU: Expose the LA57 feature to VM.")
Cc: stable@vger.kernel.org
Reported-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703021714.5549-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: x86: bit 8 of non-leaf PDPEs is not reserved
Paolo Bonzini [Tue, 30 Jun 2020 11:07:20 +0000 (07:07 -0400)]
KVM: x86: bit 8 of non-leaf PDPEs is not reserved

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 5ecad245de2ae23dc4e2dbece92f8ccfbaed2fa7 upstream.

Bit 8 would be the "global" bit, which does not quite make sense for non-leaf
page table entries.  Intel ignores it; AMD ignores it in PDEs and PDPEs, but
reserves it in PML4Es.

Probably, earlier versions of the AMD manual documented it as reserved in PDPEs
as well, and that behavior made it into KVM as well as kvm-unit-tests; fix it.

Cc: stable@vger.kernel.org
Reported-by: Nadav Amit <namit@vmware.com>
Fixes: a0c0feb57992 ("KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD", 2014-09-03)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: arm64: Annotate hyp NMI-related functions as __always_inline
Alexandru Elisei [Thu, 18 Jun 2020 17:12:54 +0000 (18:12 +0100)]
KVM: arm64: Annotate hyp NMI-related functions as __always_inline

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 7733306bd593c737c63110175da6c35b4b8bb32c upstream.

The "inline" keyword is a hint for the compiler to inline a function.  The
functions system_uses_irq_prio_masking() and gic_write_pmr() are used by
the code running at EL2 on a non-VHE system, so mark them as
__always_inline to make sure they'll always be part of the .hyp.text
section.

This fixes the following splat when trying to run a VM:

[   47.625273] Kernel panic - not syncing: HYP panic:
[   47.625273] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[   47.625273] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[   47.625273] VCPU:0000000000000000
[   47.647261] CPU: 1 PID: 217 Comm: kvm-vcpu-0 Not tainted 5.8.0-rc1-ARCH+ #61
[   47.654508] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[   47.661139] Call trace:
[   47.663659]  dump_backtrace+0x0/0x1cc
[   47.667413]  show_stack+0x18/0x24
[   47.670822]  dump_stack+0xb8/0x108
[   47.674312]  panic+0x124/0x2f4
[   47.677446]  panic+0x0/0x2f4
[   47.680407] SMP: stopping secondary CPUs
[   47.684439] Kernel Offset: disabled
[   47.688018] CPU features: 0x240402,20002008
[   47.692318] Memory Limit: none
[   47.695465] ---[ end Kernel panic - not syncing: HYP panic:
[   47.695465] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[   47.695465] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[   47.695465] VCPU:0000000000000000 ]---

The instruction abort was caused by the code running at EL2 trying to fetch
an instruction which wasn't mapped in the EL2 translation tables. Using
objdump showed the two functions as separate symbols in the .text section.

Fixes: 85738e05dc38 ("arm64: kvm: Unmask PMR before entering guest")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200618171254.1596055-1-alexandru.elisei@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART
Andrew Scull [Mon, 6 Jul 2020 09:52:59 +0000 (10:52 +0100)]
KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b9e10d4a6c9f5cbe6369ce2c17ebc67d2e5a4be5 upstream.

HVC_SOFT_RESTART is given values for x0-2 that it should installed
before exiting to the new address so should not set x0 to stub HVC
success or failure code.

Fixes: af42f20480bf1 ("arm64: hyp-stub: Zero x0 on successful stub handling")
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200706095259.1338221-1-ascull@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoKVM: arm64: Fix definition of PAGE_HYP_DEVICE
Will Deacon [Wed, 8 Jul 2020 16:25:46 +0000 (17:25 +0100)]
KVM: arm64: Fix definition of PAGE_HYP_DEVICE

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 68cf617309b5f6f3a651165f49f20af1494753ae upstream.

PAGE_HYP_DEVICE is intended to encode attribute bits for an EL2 stage-1
pte mapping a device. Unfortunately, it includes PROT_DEVICE_nGnRE which
encodes attributes for EL1 stage-1 mappings such as UXN and nG, which are
RES0 for EL2, and DBM which is meaningless as TCR_EL2.HD is not set.

Fix the definition of PAGE_HYP_DEVICE so that it doesn't set RES0 bits
at EL2.

Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200708162546.26176-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: hda/realtek: Enable headset mic of Acer Veriton N4660G with ALC269VC
Jian-Hong Pan [Mon, 6 Jul 2020 07:18:29 +0000 (15:18 +0800)]
ALSA: hda/realtek: Enable headset mic of Acer Veriton N4660G with ALC269VC

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 781c90c034d994c6a4e2badf189128a95ed864c2 upstream.

The Acer Veriton N4660G desktop's audio (1025:1248) with ALC269VC cannot
detect the headset microphone until ALC269VC_FIXUP_ACER_MIC_NO_PRESENCE
quirk maps the NID 0x18 as the headset mic pin.

Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200706071826.39726-3-jian-hong@endlessm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: hda/realtek: Enable headset mic of Acer C20-820 with ALC269VC
Jian-Hong Pan [Mon, 6 Jul 2020 07:18:27 +0000 (15:18 +0800)]
ALSA: hda/realtek: Enable headset mic of Acer C20-820 with ALC269VC

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6e15d1261d522d1d222f8f89b23c6966905e9049 upstream.

The Acer Aspire C20-820 AIO's audio (1025:1065) with ALC269VC can't
detect the headset microphone until ALC269VC_FIXUP_ACER_HEADSET_MIC
quirk maps the NID 0x18 as the headset mic pin.

Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200706071826.39726-2-jian-hong@endlessm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: hda/realtek - Enable audio jacks of Acer vCopperbox with ALC269VC
Jian-Hong Pan [Mon, 6 Jul 2020 07:18:25 +0000 (15:18 +0800)]
ALSA: hda/realtek - Enable audio jacks of Acer vCopperbox with ALC269VC

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 8eae7e9b3967f08efaa4d70403aec513cbe45ad0 upstream.

The Acer desktop vCopperbox with ALC269VC cannot detect the MIC of
headset, the line out and internal speaker until
ALC269VC_FIXUP_ACER_VCOPPERBOX_PINS quirk applied.

Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200706071826.39726-1-jian-hong@endlessm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: hda/realtek - Fix Lenovo Thinkpad X1 Carbon 7th quirk subdevice id
Benjamin Poirier [Fri, 3 Jul 2020 08:00:04 +0000 (17:00 +0900)]
ALSA: hda/realtek - Fix Lenovo Thinkpad X1 Carbon 7th quirk subdevice id

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 9774dc218bb628974dcbc76412f970e9258e5f27 upstream.

1)
In snd_hda_pick_fixup(), quirks are first matched by PCI SSID and then, if
there is no match, by codec SSID. The Lenovo "ThinkPad X1 Carbon 7th" has
an audio chip with PCI SSID 0x2292 and codec SSID 0x2293[1]. Therefore, fix
the quirk meant for that device to match on .subdevice == 0x2292.

2)
The "Thinkpad X1 Yoga 7th" does not exist. The companion product to the
Carbon 7th is the Yoga 4th. That device has an audio chip with PCI SSID
0x2292 and codec SSID 0x2292[2]. Given the behavior of
snd_hda_pick_fixup(), it is not possible to have a separate quirk for the
Yoga based on SSID. Therefore, merge the quirks meant for the Carbon and
Yoga. This preserves the current behavior for the Yoga.

[1] This is the case on my own machine and can also be checked here
https://github.com/linuxhw/LsPCI/tree/master/Notebook/Lenovo/ThinkPad
https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3225701
[2]
https://github.com/linuxhw/LsPCI/tree/master/Convertible/Lenovo/ThinkPad
https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3176355

Fixes: d2cd795c4ece ("ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen")
Fixes: 54a6a7dc107d ("ALSA: hda/realtek - Add quirk for the bass speaker on Lenovo Yoga X1 7th gen")
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Kailang Yang <kailang@realtek.com>
Tested-by: Vincent Bernat <vincent@bernat.ch>
Tested-by: Even Brenden <evenbrenden@gmail.com>
Signed-off-by: Benjamin Poirier <benjamin.poirier@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200703080005.8942-2-benjamin.poirier@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: usb-audio: Add implicit feedback quirk for RTX6001
Pavel Hofman [Fri, 3 Jul 2020 10:04:33 +0000 (12:04 +0200)]
ALSA: usb-audio: Add implicit feedback quirk for RTX6001

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b6a1e78b96a5d7f312f08b3a470eb911ab5feec0 upstream.

USB Audio analyzer RTX6001 uses the same implicit feedback quirk
as other XMOS-based devices.

Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Tested-by: Pavel Hofman <pavel.hofman@ivitera.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/822f0f20-1886-6884-a6b2-d11c685cbafa@ivitera.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: usb-audio: add quirk for MacroSilicon MS2109
Hector Martin [Thu, 2 Jul 2020 07:14:33 +0000 (16:14 +0900)]
ALSA: usb-audio: add quirk for MacroSilicon MS2109

BugLink: https://bugs.launchpad.net/bugs/1887853
commit e337bf19f6af38d5c3fa6d06cd594e0f890ca1ac upstream.

These devices claim to be 96kHz mono, but actually are 48kHz stereo with
swapped channels and unaligned transfers.

Cc: stable@vger.kernel.org
Signed-off-by: Hector Martin <marcan@marcan.st>
Link: https://lore.kernel.org/r/20200702071433.237843-1-marcan@marcan.st
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: hda - let hs_mic be picked ahead of hp_mic
Hui Wang [Thu, 25 Jun 2020 08:38:33 +0000 (16:38 +0800)]
ALSA: hda - let hs_mic be picked ahead of hp_mic

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6a6ca7881b1ab1c13fe0d70bae29211a65dd90de upstream.

We have a Dell AIO, there is neither internal speaker nor internal
mic, only a multi-function audio jack on it.

Users reported that after freshly installing the OS and plug
a headset to the audio jack, the headset can't output sound. I
reproduced this bug, at that moment, the Input Source is as below:
Simple mixer control 'Input Source',0
  Capabilities: cenum
  Items: 'Headphone Mic' 'Headset Mic'
  Item0: 'Headphone Mic'

That is because the patch_realtek will set this audio jack as mic_in
mode if Input Source's value is hp_mic.

If it is not fresh installing, this issue will not happen since the
systemd will run alsactl restore -f /var/lib/alsa/asound.state, this
will set the 'Input Source' according to history value.

If there is internal speaker or internal mic, this issue will not
happen since there is valid sink/source in the pulseaudio, the PA will
set the 'Input Source' according to active_port.

To fix this issue, change the parser function to let the hs_mic be
stored ahead of hp_mic.

Cc: stable@vger.kernel.org
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20200625083833.11264-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoALSA: opl3: fix infoleak in opl3
xidongwang [Mon, 6 Jul 2020 03:27:38 +0000 (20:27 -0700)]
ALSA: opl3: fix infoleak in opl3

BugLink: https://bugs.launchpad.net/bugs/1887853
commit ad155712bb1ea2151944cf06a0e08c315c70c1e3 upstream.

The stack object “info” in snd_opl3_ioctl() has a leaking problem.
It has 2 padding bytes which are not initialized and leaked via
“copy_to_user”.

Signed-off-by: xidongwang <wangxidong_97@163.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1594006058-30362-1-git-send-email-wangxidong_97@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoIB/hfi1: Do not destroy link_wq when the device is shut down
Kaike Wan [Tue, 23 Jun 2020 20:40:53 +0000 (16:40 -0400)]
IB/hfi1: Do not destroy link_wq when the device is shut down

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 2315ec12ee8e8257bb335654c62e0cae71dc278d upstream.

The workqueue link_wq should only be destroyed when the hfi1 driver is
unloaded, not when the device is shut down.

Fixes: 71d47008ca1b ("IB/hfi1: Create workqueue for link events")
Link: https://lore.kernel.org/r/20200623204053.107638.70315.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agoIB/hfi1: Do not destroy hfi1_wq when the device is shut down
Kaike Wan [Tue, 23 Jun 2020 20:40:47 +0000 (16:40 -0400)]
IB/hfi1: Do not destroy hfi1_wq when the device is shut down

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 28b70cd9236563e1a88a6094673fef3c08db0d51 upstream.

The workqueue hfi1_wq is destroyed in function shutdown_device(), which is
called by either shutdown_one() or remove_one(). The function
shutdown_one() is called when the kernel is rebooted while remove_one() is
called when the hfi1 driver is unloaded. When the kernel is rebooted,
hfi1_wq is destroyed while all qps are still active, leading to a kernel
crash:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
  IP: [<ffffffff94cb7b02>] __queue_work+0x32/0x3e0
  PGD 0
  Oops: 0000 [#1] SMP
  Modules linked in: dm_round_robin nvme_rdma(OE) nvme_fabrics(OE) nvme_core(OE) ib_isert iscsi_target_mod target_core_mod ib_ucm mlx4_ib iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm rpcrdma sunrpc irqbypass crc32_pclmul ghash_clmulni_intel rdma_ucm aesni_intel ib_uverbs lrw gf128mul opa_vnic glue_helper ablk_helper ib_iser cryptd ib_umad rdma_cm iw_cm ses enclosure libiscsi scsi_transport_sas pcspkr joydev ib_ipoib(OE) scsi_transport_iscsi ib_cm sg ipmi_ssif mei_me lpc_ich i2c_i801 mei ioatdma ipmi_si dm_multipath ipmi_devintf ipmi_msghandler wmi acpi_pad acpi_power_meter hangcheck_timer ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm hfi1(OE)
  crct10dif_pclmul crct10dif_common crc32c_intel drm ahci mlx4_core libahci rdmavt(OE) igb megaraid_sas ib_core libata drm_panel_orientation_quirks ptp pps_core devlink dca i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod
  CPU: 19 PID: 0 Comm: swapper/19 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
  Hardware name: Phegda X2226A/S2600CW, BIOS SE5C610.86B.01.01.0024.021320181901 02/13/2018
  task: ffff8a799ba0d140 ti: ffff8a799bad8000 task.ti: ffff8a799bad8000
  RIP: 0010:[<ffffffff94cb7b02>] [<ffffffff94cb7b02>] __queue_work+0x32/0x3e0
  RSP: 0018:ffff8a90dde43d80 EFLAGS: 00010046
  RAX: 0000000000000082 RBX: 0000000000000086 RCX: 0000000000000000
  RDX: ffff8a90b924fcb8 RSI: 0000000000000000 RDI: 000000000000001b
  RBP: ffff8a90dde43db8 R08: ffff8a799ba0d6d8 R09: ffff8a90dde53900
  R10: 0000000000000002 R11: ffff8a90dde43de8 R12: ffff8a90b924fcb8
  R13: 000000000000001b R14: 0000000000000000 R15: ffff8a90d2890000
  FS: 0000000000000000(0000) GS:ffff8a90dde40000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000102 CR3: 0000001a70410000 CR4: 00000000001607e0
  Call Trace:
  [<ffffffff94cb8105>] queue_work_on+0x45/0x50
  [<ffffffffc03f781e>] _hfi1_schedule_send+0x6e/0xc0 [hfi1]
  [<ffffffffc03f78a2>] hfi1_schedule_send+0x32/0x70 [hfi1]
  [<ffffffffc02cf2d9>] rvt_rc_timeout+0xe9/0x130 [rdmavt]
  [<ffffffff94ce563a>] ? trigger_load_balance+0x6a/0x280
  [<ffffffffc02cf1f0>] ? rvt_free_qpn+0x40/0x40 [rdmavt]
  [<ffffffff94ca7f58>] call_timer_fn+0x38/0x110
  [<ffffffffc02cf1f0>] ? rvt_free_qpn+0x40/0x40 [rdmavt]
  [<ffffffff94caa3bd>] run_timer_softirq+0x24d/0x300
  [<ffffffff94ca0f05>] __do_softirq+0xf5/0x280
  [<ffffffff9537832c>] call_softirq+0x1c/0x30
  [<ffffffff94c2e675>] do_softirq+0x65/0xa0
  [<ffffffff94ca1285>] irq_exit+0x105/0x110
  [<ffffffff953796c8>] smp_apic_timer_interrupt+0x48/0x60
  [<ffffffff95375df2>] apic_timer_interrupt+0x162/0x170
  <EOI>
  [<ffffffff951adfb7>] ? cpuidle_enter_state+0x57/0xd0
  [<ffffffff951ae10e>] cpuidle_idle_call+0xde/0x230
  [<ffffffff94c366de>] arch_cpu_idle+0xe/0xc0
  [<ffffffff94cfc3ba>] cpu_startup_entry+0x14a/0x1e0
  [<ffffffff94c57db7>] start_secondary+0x1f7/0x270
  [<ffffffff94c000d5>] start_cpu+0x5/0x14

The solution is to destroy the workqueue only when the hfi1 driver is
unloaded, not when the device is shut down. In addition, when the device
is shut down, no more work should be scheduled on the workqueues and the
workqueues are flushed.

Fixes: 8d3e71136a08 ("IB/{hfi1, qib}: Add handling of kernel restart")
Link: https://lore.kernel.org/r/20200623204047.107638.77646.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
4 years agomlxsw: pci: Fix use-after-free in case of failed devlink reload
Ido Schimmel [Fri, 10 Jul 2020 13:41:39 +0000 (16:41 +0300)]
mlxsw: pci: Fix use-after-free in case of failed devlink reload

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit c4317b11675b99af6641662ebcbd3c6010600e64 ]

In case devlink reload failed, it is possible to trigger a
use-after-free when querying the kernel for device info via 'devlink dev
info' [1].

This happens because as part of the reload error path the PCI command
interface is de-initialized and its mailboxes are freed. When the
devlink '->info_get()' callback is invoked the device is queried via the
command interface and the freed mailboxes are accessed.

Fix this by initializing the command interface once during probe and not
during every reload.

This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
and also allows user space to query the running firmware version (for
example) from the device after a failed reload.

[1]
BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355

CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xf6/0x16e lib/dump_stack.c:118
 print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
 memcpy+0x39/0x60 mm/kasan/common.c:106
 memcpy include/linux/string.h:406 [inline]
 mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
 mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
 mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
 mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
 mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
 mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
 mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
 devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
 devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
 genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
 netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
 __netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
 genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
 genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
 genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
 netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
 genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
 netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
 netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
 sock_sendmsg_nosec net/socket.c:652 [inline]
 sock_sendmsg+0x150/0x190 net/socket.c:672
 ____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
 ___sys_sendmsg+0xff/0x170 net/socket.c:2417
 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
 do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: a9c8336f6544 ("mlxsw: core: Add support for devlink info command")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>