git.proxmox.com Git - mirror_ubuntu-focal-kernel.git/log

scsi: zfcp: Fence adapter status propagation for common statuses

BugLink: https://bugs.launchpad.net/bugs/1887124
Common status flags that all main objects - adapter, port, and unit -
support are propagated to sub-objects when set or cleared. For instance,
when setting the status ZFCP_STATUS_COMMON_ERP_INUSE for an adapter object,
we will propagate this to all its child ports and units - same for when
clearing a common status flag.

Units of an adapter object are enumerated via __shost_for_each_device()
over the scsi host object of the corresponding adapter.

Once we move the scsi host object allocation and registration to after the
first exchange config and exchange port data, this won't be possible for
cases where we set or clear common statuses during the very first adapter
recovery.

But since we won't have any port or unit objects yet at that point of time,
we can just fence the status propagation for cases where the scsi host
object is not yet set in the adapter object. It won't change any effective
status propagations, but will prevent us from dereferencing invalid
pointers.

For any later point in the work flow the scsi host object will be set and
thus nothing is changed then.

Link: https://lore.kernel.org/r/f51fe5f236a1e3d1ce53379c308777561bfe35e1.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 971f2abb4ca4095bb4bd7c2ba119d1cca078b433)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: Move p-t-p port allocation to after xport data

BugLink: https://bugs.launchpad.net/bugs/1887124
When doing the very first adapter recovery - initialization - for a FCP
device in a point-to-point topology we also allocate the port object
corresponding to the attached remote port, and trigger a port recovery for
it that will run after the adapter recovery finished.

Right now this happens right after we finished with the exchange config
data command, and uses the fibre channel host object corresponding to the
FCP device to determine whether a point-to-point topology is used.

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, this use of the fc_host object is not
possible anymore at that point in the work flow.

But the allocation and recovery trigger doesn't have notable side-effects
on the following exchange port data processing, so we can move those to
after xport data, and thus also to after the scsi host object allocation,
once we move it. Then the fc_host object can be used again, like it is now.

For any further adapter recoveries this doesn't change anything, because at
that point the port object already exists and recovery is triggered
elsewhere for existing port objects.

Link: https://lore.kernel.org/r/73e5d4ac21e2b37bf0c3ca8e530bc5a5c6e74f8f.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ac007adc4d2d9258fdf27abd25cc77a4e0e8d19f)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: Fence fc_host updates during link-down handling

BugLink: https://bugs.launchpad.net/bugs/1887124
When receiving a notification that a FCP device lost its local link we
usually update the fibre channel host object which represents that FCP
device to reflect that.

This notification/information can also surface when the FCP device is
running through adapter recovery (exchange config and exchange port data
return incomplete).

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, and this happens during the very first
adapter recovery, these updates can not be done until after the scsi host
object is allocated.

Reorder the fc_host updates in zfcp_fsf_fc_host_link_down() so that they
only happen after a check of whether the scsi host object is already
allocated or not.

During the first adapter recovery this will cause the skip of these updates
if a link-down condition is detected, but we can repeat them after we
allocated the scsi host object, if necessary.

For any further link-down handling the only changes in the work flow are
the slightly reordered assignments in zfcp_fsf_fc_host_link_down().

Link: https://lore.kernel.org/r/f841f2cda61dcd7b8549910c44e1831927459edf.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 990486f3a8508494dab2a7ff0fcc3eb977557d89)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: Move fc_host updates during xport data handling into fenced function

BugLink: https://bugs.launchpad.net/bugs/1887124
When executing exchange port data for a FCP device for the first time, or
after an adapter recovery, we update several properties of the fibre
channel host object which represents that FCP device.

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, this is not possible for the former case.

Move all these update into separate, and fenced function that first checks
whether the scsi host object already exists or not, before making the
updates.

During the first ever exchange port data in the adapter life cycle this
will make the exchange port data handler skip over this update step, but we
can repeat it later, after we allocated the scsi host object.

For any further recovery of that adapter the work flow is only changed
slightly because then the scsi host object already exists and we don't free
it until we release the adapter completely at the end of its life cycle.

Link: https://lore.kernel.org/r/ae454c2dc6da0b02907c489af91d0b211d331825.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 52e61fde5ec95cb4011784fb0bc6b436e16fcaa8)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: Move shost updates during xconfig data handling into fenced function

BugLink: https://bugs.launchpad.net/bugs/1887124
When executing exchange config data for a FCP device for the first time, or
after an adapter recovery, we update several properties of the scsi host or
fibre channel host object that represent that FCP device.

When moving the scsi host object allocation and registration - and thus
also the fibre channel host object allocation - to after the first exchange
config and exchange port data, this is not possible for the former case.

Move all these update into separate, and fenced function that first checks
whether the scsi host object already exists or not, before making the
updates.

During the first ever exchange config data in the adapter life cycle this
will make the exchange config data handler skip over this update step, but
we can repeat it later, after we allocated the scsi host object.

For any further recovery of that adapter the work flow is only changed
slightly because then the scsi host object already exists and we don't free
it until we release the adapter completely at the end of its life cycle.

Link: https://lore.kernel.org/r/5fc3f4d38d4334f7aa595497c6f7865fb1102e0f.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit bd1684817d7d8d1a3b95a4347166246ad1f7670b)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: Move shost modification after QDIO (re-)open into fenced function

BugLink: https://bugs.launchpad.net/bugs/1887124
When establishing and activating the QDIO queue pair for a FCP device for
the first time, or after an adapter recovery, we publish some of its
characteristics to the scsi host object representing that FCP device.

When moving the scsi host object allocation and registration to after the
first exchange config and exchange port data, this is not possible for the
former case - QDIO open for the first time - because that happens before
exchange config and exchange port data.

Move the scsi host object update into a fenced function that checks whether
the object already exists or not. This way we can repeat that step later,
once we are past the allocation.

Once the first recovery succeeds we don't release the scsi host object
anymore, so further recoveries do work as before.

Link: https://lore.kernel.org/r/a214ebf508f71e3690113e3e90edab1cea0e24e3.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 978857c7e367d6841f71c4ded5a8c244520f5e22)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: use fallthrough;

BugLink: https://bugs.launchpad.net/bugs/1887124
Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe.com/
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
[bblock@linux.ibm.com: resolved merge conflict with recently upstream-sent patch "zfcp: expose fabric name as common fc_host sysfs attribute"]
Link: https://lore.kernel.org/r/d14669a67a17392490d3184117941123765db1a4.1585663010.git.bblock@linux.ibm.com
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit cec9cbac5244b017f2671e3770abfacc939d753d)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: log FC Endpoint Security errors

BugLink: https://bugs.launchpad.net/bugs/1887124
Log any FC Endpoint Security errors to the kernel ring buffer with rate-
limiting.

Link: https://lore.kernel.org/r/20200312174505.51294-11-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 42cabdaf103be174adb6f1ca61383eb2b35a013a)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: enhance handling of FC Endpoint Security errors

BugLink: https://bugs.launchpad.net/bugs/1887124
Enable for explicit FCP channel FC Endpoint Security error reporting and
handle any FSF security errors according to specification. Take the
following recovery actions when a FSF_SECURITY_ERROR is reported for the
specified FSF commands:

- Open Port: Retry the command if possible
- Send FCP : Physically close the remote port and reopen

For Open Port the command status is set to error, which triggers a retry.
For Send FCP the command status is set to error and recovery is triggered
to physically reopen the remote port.

Link: https://lore.kernel.org/r/20200312174505.51294-10-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e53d92856e9f1cfa0be284fa1dc3367130ce433a)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: trace FC Endpoint Security of FCP devices and connections

BugLink: https://bugs.launchpad.net/bugs/1887124
Trace changes in Fibre Channel Endpoint Security capabilities of FCP
devices as well as changes in Fibre Channel Endpoint Security state of
their connections to FC remote ports as FC Endpoint Security changes with
trace level 3 in HBA DBF.

A change in FC Endpoint Security capabilities of FCP devices is traced as
response to FSF command FSF_QTCB_EXCHANGE_PORT_DATA with a trace tag of
"fsfcesa" and a WWPN of ZFCP_DBF_INVALID_WWPN = 0x0000000000000000 (see
FC-FS-4 §18 "Name_Identifier Formats", NAA field).

A change in FC Endpoint Security state of connections between FCP devices
and FC remote ports is traced as response to FSF command
FSF_QTCB_OPEN_PORT_WITH_DID with a trace tag of "fsfcesp".

Example trace record of FC Endpoint Security capability change of FCP
device formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : HBA
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ...
Caller         : 0x...
Record ID      : 5                    ZFCP_DBF_HBA_FCES
Tag            : fsfcesa              FSF FC Endpoint Security adapter
Request ID     : 0x...
Request status : 0x00000010
FSF cmnd       : 0x0000000e           FSF_QTCB_EXCHANGE_PORT_DATA
FSF sequence no: 0x...
FSF issued     : ...
FSF stat       : 0x00000000           FSF_GOOD
FSF stat qual  : n/a
Prot stat      : n/a
Prot stat qual : n/a
Port handle    : 0x00000000           none (invalid)
LUN handle     : n/a
WWPN           : 0x0000000000000000   ZFCP_DBF_INVALID_WWPN
FCES old       : 0x00000000           old FC Endpoint Security
FCES new       : 0x00000007           new FC Endpoint Security

Example trace record of FC Endpoint Security change of connection to
FC remote port formatted with zfcpdbf from s390-tools:

Timestamp      : ...
Area           : HBA
Subarea        : 00
Level          : 3
Exception      : -
CPU ID         : ...
Caller         : 0x...
Record ID      : 5                    ZFCP_DBF_HBA_FCES
Tag            : fsfcesp              FSF FC Endpoint Security port
Request ID     : 0x...
Request status : 0x00000010
FSF cmnd       : 0x00000005           FSF_QTCB_OPEN_PORT_WITH_DID
FSF sequence no: 0x...
FSF issued     : ...
FSF stat       : 0x00000000           FSF_GOOD
FSF stat qual  : n/a
Prot stat      : n/a
Prot stat qual : n/a
Port handle    : 0x...
WWPN           : 0x500507630401120c   WWPN
FCES old       : 0x00000000           old FC Endpoint Security
FCES new       : 0x00000004           new FC Endpoint Security

Link: https://lore.kernel.org/r/20200312174505.51294-9-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 616da39e0060f3b8bbc0f36f7d911bb5abb31746)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: log FC Endpoint Security of connections

BugLink: https://bugs.launchpad.net/bugs/1887124
Log the usage of and subsequent changes in FC Endpoint Security of
connections between FCP devices and FC remote ports to the kernel ring
buffer. Activation of FC Endpoint Security is logged as informational.
Change and deactivation are logged as warning.

No logging takes place, if FC Endpoint Security is not used (i.e. never
activated) on a connection or if it does not change during reopen of a port
(e.g. due to adapter or port recovery).

Link: https://lore.kernel.org/r/20200312174505.51294-8-maier@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f0d26ae847489850509b793ef3f74be62f69ab0f)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: report FC Endpoint Security in sysfs

BugLink: https://bugs.launchpad.net/bugs/1887124
Add an interface to read Fibre Channel Endpoint Security information of FCP
channels and their connections to FC remote ports. It comes in the form of
new sysfs attributes that are attached to the CCW device representing the
FCP device and its zfcp port objects.

The read-only sysfs attribute "fc_security" of a CCW device representing a
FCP device shows the FC Endpoint Security capabilities of the device.
Possible values are: "unknown", "unsupported", "none", or a comma-
separated list of one or more mnemonics and/or one hexadecimal value
representing the supported FC Endpoint Security:

  Authentication: Authentication supported
  Encryption    : Encryption supported

The read-only sysfs attribute "fc_security" of a zfcp port object shows the
FC Endpoint Security used on the connection between its parent FCP device
and the FC remote port. Possible values are: "unknown", "unsupported",
"none", or a mnemonic or hexadecimal value representing the FC Endpoint
Security used:

  Authentication: Connection has been authenticated
  Encryption    : Connection is encrypted

Both sysfs attributes may return hexadecimal values instead of mnemonics,
if the mnemonic lookup table does not contain an entry for the FC Endpoint
Security reported by the FCP device.

Link: https://lore.kernel.org/r/20200312174505.51294-7-maier@linux.ibm.com
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a17c78460093aad8fb97fc6905c22355b7d1c923)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: auto variables for dereferenced structs in open port handler

BugLink: https://bugs.launchpad.net/bugs/1887124
Introduce automatic variables for adapter and QTCB bottom in
zfcp_fsf_open_port_handler(). This facilitates subsequent changes to meet
the 80 character per line limit.

Link: https://lore.kernel.org/r/20200312174505.51294-6-maier@linux.ibm.com
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 185f2d2d595c29c6e2ed6e2897b9ccc52c50c917)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: fix fc_host attributes that should be unknown on local link down

BugLink: https://bugs.launchpad.net/bugs/1887124
When we get an unsolicited notification on local link went down,
zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval().
This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and
ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to
"Linkdown", because zfcp_scsi_get_host_port_state() is an active callback
and uses the adapter status.

Other fc_host attributes model, port_id, port_type, speed, fabric_name (and
zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which
depend on a local link, continued to show their last known "good" value.

Only if something triggered an exchange config data, some values were
updated to their unknown equivalent via case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down.  Triggers for
exchange config data are adapter recovery, or reading any of the following
zfcp-specific scsi host sysfs attributes "requests", "megabytes", or
"seconds_active" in /sys/devices/css*/*.*.*/*.*.*/host*/scsi_host/host*/.

The other fc_host attributes active_fc4s and permanent_port_name continued
to show their last known "good" value.  Only if something triggered an
exchange port data, some values changed.  Active_fc4s became all zeros as
unknown equivalent during link down.  Permanent_port_name does not depend
on a local link. But for non-NPIV FCP devices, permanent_port_name
erroneously became whatever value fc_host port_name had at that point in
time (see previous paragraph).  Triggers for exchange port data are the
zfcp-specific scsi host sysfs attribute "utilization", or
[{reset,get}_fc_host_stats] write anything into "reset_statistics" or read
any of the other attributes under
/sys/devices/css*/*.*.*/*.*.*/host*/fc_host/host*/statistics/.

(cf. v4.9 commit bd77befa5bcf ("zfcp: fix fc_host port_type with NPIV"))

This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or
dbginfo.sh which read fc_host attributes and also scsi_host attributes.
After link down, the first invocation produces (abbreviated):

Class = "fc_host"
    active_fc4s         = "0x00 0x00 0x01 0x00 ..."
    ...
    fabric_name         = "0x10000027f8e04c49"
    ...
    permanent_port_name = "0xc05076e4588059c1"
    port_id             = "0x244800"
    port_state          = "Linkdown"
    port_type           = "NPort (fabric via point-to-point)"
    ...
    speed               = "16 Gbit"
Class = "scsi_host"
    ...
    megabytes           = "0 0"
    ...
    requests            = "0 0 0"
    seconds_active      = "37"
    ...
    utilization         = "0 0 0"

The second and next invocations produce (abbreviated):

Class = "fc_host"
    active_fc4s         = "0x00 0x00 0x00 0x00 ..."
    ...
    fabric_name         = "0x0"
    ...
    permanent_port_name = "0x0"
    port_id             = "0x000000"
    port_state          = "Linkdown"
    port_type           = "Unknown"
    ...
    speed               = "unknown"
Class = "scsi_host"
    ...
    megabytes           = "0 0"
    ...
    requests            = "0 0 0"
    seconds_active      = "38"
    ...
    utilization         = "0 0 0"

Factor out the resetting of local link dependent fc_host attributes from
zfcp_fsf_exchange_config_data_handler() case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function
zfcp_fsf_fc_host_link_down().  All code places that detect local link down
(SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call
zfcp_fsf_link_down_info_eval().  Call the new helper from there. This works
because zfcp_fsf_link_down_info_eval() and thus the helper is called before
zfcp_fsf_exchange_{config,port}_evaluate().

Port_name and node_name are always valid, so never reset them.

Get the permanent_port_name from exchange port data unconditionally as it
always has a valid known good value, even during link down.

Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host
supported_classes could theoretically get its value from
fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate().

When the link comes back, we get a different notification, perform adapter
recovery, and this triggers an implicit exchange config data followed by
exchange port data filling in the link dependent fc_host attributes with
known good values again.

Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 7e0e4e0958ef794ee868838249880d5c521ff761)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: wire previously driver-specific sysfs attributes also to fc_host

BugLink: https://bugs.launchpad.net/bugs/1887124
Manufacturer, HBA model, firmware version, and hardware version. Use the
same value format as for the driver-specific attributes. Keep the
driver-specific attributes for stable user space sysfs API.

Link: https://lore.kernel.org/r/20200312174505.51294-4-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 538c6e910baea9a94ba2a816c19c3e071892b49c)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: expose fabric name as common fc_host sysfs attribute

BugLink: https://bugs.launchpad.net/bugs/1887124
FICON Express8S or older, as well as card features newer than FICON
Express16S+ have no certain firmware level requirement.

FICON Express16S or FICON Express16S+ have the following
minimum firmware level requirements to show a proper fabric name value:

z13 machine
  FICON Express16S  , MCL P08424.005 , LIC version 0x00000721
z14 machine
  FICON Express16S  , MCL P42611.008 , LIC version 0x10200069
  FICON Express16S+ , MCL P42625.010 , LIC version 0x10300147

Otherwise, the read value is not the fabric name.

Each FCP channel of these card features might need one SAN fabric re-login
after concurrent microcode update in order to show the proper fabric name.
Possible ways to trigger a SAN fabric re-login are one of: Pull fibres
between FCP channel port and SAN switch port on either side and re-plug,
disable SAN switch port adjacent to FCP channel port and re-enable switch
port, or at Service Element toggle off all CHPIDs of FCP channel over all
LPARs and toggle CHPIDs on again.  Zfcp operating subchannels (FCP devices)
on such FCP channel recovers a fabric re-login.

Initialize fabric name for any topology and have it an invalid WWPN 0x0 for
anything but fabric topology.  Otherwise for e.g. point-to-point topology
one could see the initial -1 from fc_host_setup() and after a link unplug
our fabric name would turn to 0x0 (with subsequent commit ("zfcp: fix
fc_host attributes that should be unknown on local link down") and stay 0x0
on link replug.  I did not initialize to 0x0 somewhere even earlier in the
code path such that it would not flap from real to 0x0 to real on e.g. an
exchange config data with fabric topology.

Link: https://lore.kernel.org/r/20200312174505.51294-3-maier@linux.ibm.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e05a10a055098bf55168a9d682156e38c6b00cfa)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: fix wrong data and display format of SFP+ temperature

BugLink: https://bugs.launchpad.net/bugs/1887124
When implementing support for retrieval of local diagnostic data from the
FCP channel, the wrong data format was assumed for the temperature of the
local SFP+ connector. The Fibre Channel Link Services (FC-LS-3)
specification is not clear on the format of the stored integer, and only
after consulting the SNIA specification SFF-8472 did we realize it is
stored as two's complement. Thus, the used data and display format is
wrong, and highly misleading for users when the temperature should drop
below 0°C (however unlikely that may be).

To fix this, change the data format in `struct fsf_qtcb_bottom_port` from
unsigned to signed, and change the printf format string used to generate
`zfcp_sysfs_adapter_diag_sfp_temperature_show()` from `%hu` to `%hd`.

Link: https://lore.kernel.org/r/d6e3be5428da5c9490cfff4df7cae868bc9f1a7e.1582039501.git.bblock@linux.ibm.com
Fixes: a10a61e807b0 ("scsi: zfcp: support retrieval of SFP Data via Exchange Port Data")
Fixes: 6028f7c4cd87 ("scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver")
Cc: <stable@vger.kernel.org> # 5.5+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Fedor Loshakov <loshakov@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a3fd4bfe85fbb67cf4ec1232d0af625ece3c508b)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: proper indentation to reduce confusion in zfcp_erp_required_act

BugLink: https://bugs.launchpad.net/bugs/1887124
No functional change.

The unary not operator only applies to the sub expression before the
logical or. So we return early if (not running) or failed.

Link: https://lore.kernel.org/r/df4f897f6e83eaa528465d0858d5a22daac47a2f.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit e76acc51942649398660ca50655af5afecf29c42)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: move maximum age of diagnostic buffers into a per-adapter variable

BugLink: https://bugs.launchpad.net/bugs/1887124
Replace the static define (ZFCP_DIAG_MAX_AGE) with a per-adapter variable
(${adapter}->diagnostics->max_age). This new variable is exported via
sysfs, along with other, already existing adapter variables, and can both
be read and written. This way users can choose how much time should pass
between refreshes of diagnostic buffers. The default value for the age
remains to be five seconds.

By setting this new variable to 0, the caching of diagnostic buffers for
userspace accesses can also be completely removed.

All diagnostic buffers of a given adapter are subject to this setting in
the same way.

Link: https://lore.kernel.org/r/b1d0977cc884b16dd4ca6418e4320c56a4c31d63.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 48910f8c35cfd250d806f3e03150d256f40b6d4c)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: implicitly refresh config-data diagnostics when reading sysfs

BugLink: https://bugs.launchpad.net/bugs/1887124
Adds implicit updates of cached diagnostics via Exchange Config Data when
reading sysfs attributes interfacing them. Right now this only affects the
new B2B-Credit diagnostic attribute.

This uses the same mechanism previously also used for cached diagnostics
of Exchange Port Data.

Link: https://lore.kernel.org/r/60a94f55f2630b74b468fed5f39880208abb2679.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8a72db70b5ca3c3feb3ca25519e8a9516cc60cfe)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: introduce sysfs interface to read the local B2B-Credit

BugLink: https://bugs.launchpad.net/bugs/1887124
In addition to the diagnostic data from the local SFP transceiver this
patch adds an interface to read the advertised buffer-to-buffer credit from
the local FC_Port.

With this patch the userspace-interface will only read data stored in the
corresponding "diagnostic buffer" (that was stored during completion of a
previous Exchange Config Data command). Implicit updating will follow later
in this series.

Link: https://lore.kernel.org/r/8a53aef87b53c50cfb1a3425b799bacb6f82b832.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5a2876f0d1ef26b76755749f978d46e4666013dd)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: implicitly refresh port-data diagnostics when reading sysfs

BugLink: https://bugs.launchpad.net/bugs/1887124
This patch adds implicit updates to the sysfs entries that read the
diagnostic data stored in the "caching buffer" for Exchange Port Data.

An update is triggered once the buffer is older than ZFCP_DIAG_MAX_AGE
milliseconds (5s). This entails sending an Exchange Port Data command to
the FCP-Channel, and during its ingress path updating the cached data and
the timestamp. To prevent multiple concurrent userspace-applications from
triggering this update in parallel we synchronize all of them using a
wait-queue (waiting threads are interruptible; the updating thread is not).From: Benjamin Block <bblock@linux.ibm.com>

Link: https://lore.kernel.org/r/c145b5cfc99a63b6a018b1184fbd27bb09c955f5.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8155eb0785279728b6b2e29aba2ca52d16aa526f)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: introduce sysfs interface for diagnostics of local SFP transceiver

BugLink: https://bugs.launchpad.net/bugs/1887124
This adds an interface to read the diagnostics of the local SFP transceiver
of an FCP-Channel from userspace. This comes in the form of new sysfs
entries that are attached to the CCW device representing the FCP
device. Each type of data gets its own sysfs entry; the whole collection of
entries is pooled into a new child-directory of the CCW device node:
"diagnostics".

Adds sysfs entries for:
* sfp_invalid:    boolean value evaluating to whether the following 5
                   fields are invalid; {0, 1}; 1 - invalid
* temperature:    transceiver temp.; unit 1/256°C;
                   range [-128°C, +128°C]
* vcc:            supply voltage; unit 100μV; range [0, 6.55V]
* tx_bias:        transmitter laser bias current; unit 2μA;
                   range [0, 131mA]
* tx_power:       coupled TX output power; unit 0.1μW; range [0, 6.5mW]
* rx_power:       received optical power; unit 0.1μW; range [0, 6.5mW]

* optical_port:   boolean value evaluating to whether the FCP-Channel has
                   an optical port; {0, 1}; 1 - optical
* fec_active:     boolean value evaluating to whether 16G FEC is active;
                   {0, 1}; 1 - active
* port_tx_type:   nibble describing the port type; {0, 1, 2, 3};
                   0 - unknown,             1 - short wave,
                   2 - long wave LC 1310nm, 3 - long wave LL 1550nm
* connector_type: two bits describing the connector type; {0, 1};
                   0 - unknown,             1 - SFP+

This is only supported if the FCP-Channel in turn supports reporting the
SFP Diagnostic Data, otherwise read() on these new entries will return
EOPNOTSUPP (this affects only adapters older than FICON Express8S, on
Mainframe generations older than z14). Other possible errors for read()
include ENOLINK, ENODEV and ENOMEM.

With this patch the userspace-interface will only read data stored in
the corresponding "diagnostic buffer" (that was stored during completion
of an previous Exchange Port Data command). Implicit updating will
follow later in this series.

Link: https://lore.kernel.org/r/1f9cce7c829c881e7d71a3f10c5b57f3dd84ab32.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 6028f7c4cd87cac13481255d7e35dd2c9207ecae)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: support retrieval of SFP Data via Exchange Port Data

BugLink: https://bugs.launchpad.net/bugs/1887124
A new FCP channel feature allows us to read the diagnostics from our local
SFP transceivers. To make use of that add a flag
(FSF_FEATURE_REQUEST_SFP_DATA) to the feature-set we request from the FCP
channel. Whether the channel actually implements this can be determined via
an other new flag (FSF_FEATURE_REPORT_SFP_DATA), that is set in the
adapter_features field of the adapter structure after Exchange Config Data
finished.

Also add the corresponding definitions in the QTCB Bottom for Exchange Port
Data. These new definitions are only valid, if FSF_FEATURE_REPORT_SFP_DATA
is set.

Link: https://lore.kernel.org/r/ee1eba4de71eb06b4d82207ad4f428429346156f.1572018132.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a10a61e807b0a226b78e2041843cbf0521bd0c35)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: add diagnostics buffer for exchange config data

BugLink: https://bugs.launchpad.net/bugs/1887124
In the same vein as the previous patch, add diagnostic data capture for the
Exchange Config Data command.

Link: https://lore.kernel.org/r/7d8ac0a6cad403fa8f8b888693476a84e80a277b.1572018131.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 088210233e6fc039fd2c0bfe44b06bb94328d09e)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: diagnostics buffer caching and use for exchange port data

BugLink: https://bugs.launchpad.net/bugs/1887124
The FCP channel exposes two central interfaces to receive information about
the local FCP-Adapter/-Port: Exchange Port and Exchange Config Data. Using
these commands can negatively impact the adapter if we allow them to be
sent at a very high rate.

The later parts of this patchset will introduce new user-interfaces to
receive more diagnostics from the adapter. To prevent any negative impact
from using those, this patch adds a simple caching-mechanism that will
prevent a malicious/faulty userspace-application from generating an
abnormal high amount of Exchange Port/Config Data traffic.

Relevant diagnostic data that is received via Exchange Config/Port Data is
cached in buffers associated with the corresponding adapter-struct. Each
buffer is associated with a timestamp that signals how old the data is,
and, added via a following patch in this series, lets userspace-interfaces
determine when the data is too old and needs to be updated.

Buffer-updates are made during the normal response path of the
corresponding command. With this patch only the output of the Exchange Port
Data command is captured.

Link: https://lore.kernel.org/r/054ca020ce0a53dc0d9176428bea373898944e6a.1572018130.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 7e418833e68948cb9ed15262889173b7db2960cb)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

scsi: zfcp: signal incomplete or error for sync exchange config/port data

BugLink: https://bugs.launchpad.net/bugs/1887124
Adds a new FSF-Request status flag (ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE)
that signal that the data received using Exchange Config Data or Exchange
Port Data was incomplete. This new flags is set in the respective handlers
during the response path.

With this patch, only the synchronous FSF-functions for each command got
support for the new flag, otherwise it is transparent.

Together with this new flag and already existing status flags the
synchronous FSF-functions are extended to now detect whether the received
data is complete, incomplete or completely invalid (this includes cases
where a command ran into a timeout). This is now signaled back to the
caller, where previously only failures on the request path would result in
a bad return-code.

For complete data the return-code remains 0. For incomplete data a new
return-code -EAGAIN is added to the function-interface. For completely
invalid data the already existing return-code -EIO is reused - formerly
this was used to signal failures on the request path.

Existing callers of the FSF-functions are adjusted so that they behave as
before for return-code 0 and -EAGAIN, to not change the user-interface. As
-EIO existed all along, it was already exposed to the user - and needed
handling - and will now also be exposed in this new special case.

Link: https://lore.kernel.org/r/e14f0702fa2b00a4d1f37c7981a13f2dd1ea2c83.1572018130.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 92953c6e0aa77d4febcca6dd691e8192910c8a28)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

USB: serial: option: add Quectel EG95 LTE modem

BugLink: https://bugs.launchpad.net/bugs/1886744
Add support for Quectel Wireless Solutions Co., Ltd. EG95 LTE modem

T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  5 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0195 Rev=03.18
S:  Manufacturer=Android
S:  Product=Android
C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
(cherry picked from commit da6902e5b6dbca9081e3d377f9802d4fd0c5ea59
linux-next)
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net: usb: qmi_wwan: add support for Quectel EG95 LTE modem

BugLink: https://bugs.launchpad.net/bugs/1886744
Add support for Quectel Wireless Solutions Co., Ltd. EG95 LTE modem

T:  Bus=01 Lev=01 Prnt=01 Port=02 Cnt=02 Dev#=  5 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2c7c ProdID=0195 Rev=03.18
S:  Manufacturer=Android
S:  Product=Android
C:  #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
I:  If#=0x4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit f815dd5cf48b905eeecf0a2b990e9b7ab048b4f1)
Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ASoC: SOF: Intel: hda: move i915 init earlier

BugLink: https://bugs.launchpad.net/bugs/1886341
To be compliant with i915 display driver requirements, i915 power-up
must be done before any HDA communication takes place, including
parsing the bus capabilities. Otherwise the initial codec probe
may fail.

Move i915 initialization earlier in the SOF HDA sequence. This
sequence is now aligned with the snd-hda-intel driver where the
display_power() call is before snd_hdac_bus_parse_capabilities()
and rest of the capability parsing.

Also remove unnecessary ifdef around hda_codec_i915_init(). There's
a dummy implementation provided if CONFIG_SND_SOC_SOF_HDA is not
enabled.

Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20200206200223.7715-3-kai.vehmanen@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(backported from commit af7aae1b1f6306a1cda4da393e920a1334eaa3d4)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ASoC: SOF: Intel: drop HDA codec upon probe failure

BugLink: https://bugs.launchpad.net/bugs/1886341
In case a HDA codec probe fails, do not raise error immediately,
but instead remove the codec from bus->codec_mask and continue
probe for other codecs.

This allows for more robust behaviour in cases where one codec
in the system is faulty. SOF driver load can still proceed with
the codecs that can be probed successfully. Probe may still
fail if suitable machine driver is not found, but in many
cases the generic HDA machine driver can operate with a subset
of codecs.

Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20191218002616.7652-6-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit 91dce767cd0b08be9f1c87bb2de8e63391a72692)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ASoC: SOF: intel: hda: Modify signature for hda_codec_probe_bus()

BugLink: https://bugs.launchpad.net/bugs/1886341
The machine driver selection for HDA platforms will be
consolidated and moved out of the SOF DSP
probe callback. In preparation for that, modify the
signature for hda_codec_probe_bus() to pass the
hda_codec_use_common_hdmi as a variable while probing the
HDA codecs.

Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20191204211556.12671-10-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(backported from commit 80acdd4f8ff763183dc1cd7f1cd31db9eaaecdc8)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net/smc: tolerate future SMCD versions

BugLink: https://bugs.launchpad.net/bugs/1882088
CLC proposal messages of future SMCD versions could be larger than SMCD
V1 CLC proposal messages.
To enable toleration in SMC V1 the receival of CLC proposal messages
is adapted:
* accept larger length values in CLC proposal
* check trailing eye catcher for incoming CLC proposal with V1 length only
* receive the whole CLC proposal even in cases it does not fit into the
V1 buffer

Fixes: e7b7a64a8493d ("smc: support variable CLC proposal messages")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit fb4f79264c0fc6fd5a68ffe3e31bfff97311e1f1 linux-next)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

debian/dkms-versions: update ZFS dkms package version (LP: #1881107)

BugLink: https://bugs.launchpad.net/bugs/1881107
Update the ZFS dkms version to pick up latest ZFS that contains
a backport of the AES-GCM performance accelleration updates.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

bcache: check and adjust logical block size for backing devices

BugLink: https://bugs.launchpad.net/bugs/1867916
It's possible for a block driver to set logical block size to
a value greater than page size incorrectly; e.g. bcache takes
the value from the superblock, set by the user w/ make-bcache.

This causes a BUG/NULL pointer dereference in the path:

  __blkdev_get()
  -> set_init_blocksize() // set i_blkbits based on ...
     -> bdev_logical_block_size()
        -> queue_logical_block_size() // ... this value
  -> bdev_disk_changed()
     ...
     -> blkdev_readpage()
        -> block_read_full_page()
           -> create_page_buffers() // size = 1 << i_blkbits
              -> create_empty_buffers() // give size/take pointer
                 -> alloc_page_buffers() // return NULL
                 .. BUG!

Because alloc_page_buffers() is called with size > PAGE_SIZE,
thus it initializes head = NULL, skips the loop, return head;
then create_empty_buffers() gets (and uses) the NULL pointer.

This has been around longer than commit ad6bf88a6c19 ("block:
fix an integer overflow in logical block size"); however, it
increased the range of values that can trigger the issue.

Previously only 8k/16k/32k (on x86/4k page size) would do it,
as greater values overflow unsigned short to zero, and queue_
logical_block_size() would then use the default of 512.

Now the range with unsigned int is much larger, and users w/
the 512k value, which happened to be zero'ed previously and
work fine, started to hit this issue -- as the zero is gone,
and queue_logical_block_size() does return 512k (>PAGE_SIZE.)

Fix this by checking the bcache device's logical block size,
and if it's greater than page size, fallback to the backing/
cached device's logical page size.

This doesn't affect cache devices as those are still checked
for block/page size in read_super(); only the backing/cached
devices are not.

Apparently it's a regression from commit 2903381fce71 ("bcache:
Take data offset from the bdev superblock."), moving the check
into BCACHE_SB_VERSION_CDEV only. Now that we have superblocks
of backing devices out there with this larger value, we cannot
refuse to load them (i.e., have a similar check in _BDEV.)

Ideally perhaps bcache should use all values from the backing
device (physical/logical/io_min block size)? But for now just
fix the problematic case.

Test-case:

    # IMG=/root/disk.img
    # dd if=/dev/zero of=$IMG bs=1 count=0 seek=1G
    # DEV=$(losetup --find --show $IMG)
    # make-bcache --bdev $DEV --block 8k
      < see dmesg >

Before:

    # uname -r
    5.7.0-rc7

    [   55.944046] BUG: kernel NULL pointer dereference, address: 0000000000000000
    ...
    [   55.949742] CPU: 3 PID: 610 Comm: bcache-register Not tainted 5.7.0-rc7 #4
    ...
    [   55.952281] RIP: 0010:create_empty_buffers+0x1a/0x100
    ...
    [   55.966434] Call Trace:
    [   55.967021]  create_page_buffers+0x48/0x50
    [   55.967834]  block_read_full_page+0x49/0x380
    [   55.972181]  do_read_cache_page+0x494/0x610
    [   55.974780]  read_part_sector+0x2d/0xaa
    [   55.975558]  read_lba+0x10e/0x1e0
    [   55.977904]  efi_partition+0x120/0x5a6
    [   55.980227]  blk_add_partitions+0x161/0x390
    [   55.982177]  bdev_disk_changed+0x61/0xd0
    [   55.982961]  __blkdev_get+0x350/0x490
    [   55.983715]  __device_add_disk+0x318/0x480
    [   55.984539]  bch_cached_dev_run+0xc5/0x270
    [   55.986010]  register_bcache.cold+0x122/0x179
    [   55.987628]  kernfs_fop_write+0xbc/0x1a0
    [   55.988416]  vfs_write+0xb1/0x1a0
    [   55.989134]  ksys_write+0x5a/0xd0
    [   55.989825]  do_syscall_64+0x43/0x140
    [   55.990563]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [   55.991519] RIP: 0033:0x7f7d60ba3154
    ...

After:

    # uname -r
    5.7.0.bcachelbspgsz

    [   31.672460] bcache: bcache_device_init() bcache0: sb/logical block size (8192) greater than page size (4096) falling back to device logical block size (512)
    [   31.675133] bcache: register_bdev() registered backing device loop0

    # grep ^ /sys/block/bcache0/queue/*_block_size
    /sys/block/bcache0/queue/logical_block_size:512
    /sys/block/bcache0/queue/physical_block_size:8192

Reported-by: Ryan Finnie <ryan@finnie.org>
Reported-by: Sebastian Marsching <sebastian@marsching.com>
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(backported from commit dcacbc1242c71e18fa9d2eadc5647e115c9c627d)
[mfo: backport: hunks 1/3/4: adjust bcache_device_init() signature
and calls for the lack of parameter make_request_fn from upstream.]
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Acked-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

r8169: fix firmware not resetting tp->ocp_base

BugLink: https://bugs.launchpad.net/bugs/1885072
Typically the firmware takes care that tp->ocp_base is reset to its
default value. That's not the case (at least) for RTL8117.
As a result subsequent PHY access reads/writes the wrong page and
the link is broken. Fix this be resetting tp->ocp_base explicitly.

Fixes: 229c1e0dfd3d ("r8169: load firmware for RTL8168fp/RTL8117")
Reported-by: Aaron Ma <mapengyu@gmail.com>
Tested-by: Aaron Ma <mapengyu@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 89fbd26cca7ec9e82ec4787a4b6e95939b57d073)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

r8169: fix OCP access on RTL8117

BugLink: https://bugs.launchpad.net/bugs/1885072
According to r8168 vendor driver DASHv3 chips like RTL8168fp/RTL8117
need a special addressing for OCP access.
Fix is compile-tested only due to missing test hardware.

Fixes: 1287723aa139 ("r8169: add support for RTL8117")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 561535b0f23961ced071b82575d5e83e6351a814)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

r8169: load firmware for RTL8168fp/RTL8117

BugLink: https://bugs.launchpad.net/bugs/1885072
Load Realtek-provided firmware for RTL8168fp/RTL8117. Unlike the
firmware for other chip versions which is for the PHY, firmware for
RTL8168fp/RTL8117 is for the MAC.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 229c1e0dfd3dfac23b33b7cd88bcb7a9e391b000)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

r8169: add support for RTL8117

BugLink: https://bugs.launchpad.net/bugs/1885072
Add support for chip version RTL8117. Settings have been copied from
Realtek's r8168 driver, there however chip ID 54a belongs to a chip
version called RTL8168FP. It was confirmed that RTL8117 works with
Realtek's driver, so both chip versions seem to be the same or at
least compatible.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(backported from commit 1287723aa139b46dc0b9b53c7212af71f1acfc39)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

r8169: add helper r8168g_phy_param

BugLink: https://bugs.launchpad.net/bugs/1885072
Integrated PHY's from RTL8168g support an indirect access method for
PHY parameters. On page 0x0a43 parameter number is written to register
0x13, then the parameter can be accessed via register 0x14.
Add a helper for this to improve readability and simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 8bfdce1defb1fb62bcc0e4929abd756585dabd0d)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

s390/cpum_cf: Add new extended counters for IBM z15

BugLink: https://bugs.launchpad.net/bugs/1881096
Add CPU measurement counter facility event description for IBM z15.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
(cherry picked from commit d68d5d51dc898895b4e15bea52e5668ca9e76180)
Signed-off-by: Frank Heimes <frank.heimes@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

UBUNTU: SAUCE: shiftfs: prevent ESTALE for LOOKUP_JUMP lookups

BugLink: https://bugs.launchpad.net/bugs/1872757
Users reported that creating temporary files shiftfs reports ESTALE.
This can be reproduced via:

import tempfile
import os

def test():
    with tempfile.TemporaryFile() as fd:
        fd.write("data".encode('utf-8'))
        # re-open the file to get a read-only file descriptor
        return open(f"/proc/self/fd/{fd.fileno()}", "r")

def main():
   fd = test()
   fd.close()

if __name__ == "__main__":
    main()

a similar issue was reported here:
https://github.com/systemd/systemd/issues/14861

Our revalidate methods were very opinionated about whether or not a
lower dentry was valid especially when it became unlinked we simply
invalidated the lower dentry which caused above bug to surface. This has
led to bugs where a ESTALE was returned for e.g.  temporary files that
were created and directly re-opened afterwards through
/proc/<pid>/fd/<nr-of-deleted-file>. When a file is re-opened through
/proc/<pid>/fd/<nr> LOOKUP_JUMP is set and the vfs will revalidate via
d_weak_revalidate(). Since the file has been unhashed or even already
gone negative we'd fail the open when we should've succeeded.

Reported-by: Christian Kellner <ckellner@redhat.com>
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Cc: Seth Forshee <seth.forshee@canonical.com>
Link: https://github.com/systemd/systemd/issues/14861
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

UBUNTU: SAUCE: Revert "UBUNTU: SAUCE: shiftfs: fix dentry revalidation"

BugLink: https://bugs.launchpad.net/bugs/1884767
With patch cfaa482afb97 ("UBUNTU: SAUCE: shiftfs: fix dentry revalidation")
we tried to fix
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1872757 but it
introduces a regression for shiftfs on btrfs users. Creating a btrfs
subvolume, deleting it, and then trying to recreate it will cause EEXIST
to be returned or the file be left in a half-created state. Faulty
behavior such as this can be reproduced via:

btrfs subvolume create my-subvol
ls -al my-subvol
btrfs subvolume delete my-subvol

We thus need to revert this change restoring the old behavior.This will
briefly resurface https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1872757
which is fixed in a follow-up patch on top of this revert. We simply
split out the minimal fix into a separate tiny patch.

This reverts commit cfaa482afb97e3c05d020af80b897b061109d51f.

Cc: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

UBUNTU: upstream stable to v5.4.52

BugLink: https://bugs.launchpad.net/bugs/1887853
Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

Linux 5.4.52

BugLink: https://bugs.launchpad.net/bugs/1887853
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

s390/maccess: add no DAT mode to kernel_write

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit d6df52e9996dcc2062c3d9c9123288468bb95b52 ]

To be able to patch kernel code before paging is initialized do plain
memcpy if DAT is off. This is required to enable early jump label
initialization.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

s390: Change s390_kernel_write() return type to match memcpy()

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit cb2cceaefb4c4dc28fc27ff1f1b2d258bfc10353 ]

s390_kernel_write()'s function type is almost identical to memcpy().
Change its return type to "void *" so they can be used interchangeably.

Cc: linux-s390@vger.kernel.org
Cc: heiko.carstens@de.ibm.com
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

pwm: jz4740: Fix build failure

BugLink: https://bugs.launchpad.net/bugs/1887853
When commit 9017dc4fbd59 ("pwm: jz4740: Enhance precision in calculation
of duty cycle") from v5.8-rc1 was backported to v5.4.x its dependency on
commit ce1f9cece057 ("pwm: jz4740: Use clocks from TCU driver") was not
noticed which made the pwm-jz4740 driver fail to build.

As ce1f9cece057 depends on still more rework, just backport a small part
of this commit to make the driver build again. (There is no dependency
on the functionality introduced in ce1f9cece057, just the rate variable
is needed.)

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

perf scripts python: exported-sql-viewer.py: Fix unexpanded 'Find' result

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 3a3cf7c570a486b07d9a6e68a77548aea6a8421f upstream.

Using Python version 3.8.2 and PySide2 version 5.14.0, ctrl-F ('Find')
would not expand the tree to the result. Fix by using setExpanded().

Example:

  $ perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB perf.data ]
  $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
  2020-06-26 15:32:14.928997 Creating database ...
  2020-06-26 15:32:14.933971 Writing records...
  2020-06-26 15:32:15.535251 Adding indexes
  2020-06-26 15:32:15.542993 Dropping unused tables
  2020-06-26 15:32:15.549716 Done
  $ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db

  Select: Reports -> Context-Sensitive Call Graph    or     Reports -> Call Tree
  Press: Ctrl-F
  Enter: main
  Press: Enter

Before: line showing 'main' does not display

After: tree is expanded to line showing 'main'

Fixes: ebd70c7dc2f5f ("perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

perf scripts python: exported-sql-viewer.py: Fix zero id in call tree 'Find' result

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 031c8d5edb1ddeb6d398f7942ce2a01a1a51ada9 upstream.

Using ctrl-F ('Find') would not find 'unknown' because it matches id
zero.  Fix by excluding id zero from selection.

Example:

   $ perf record -e intel_pt//u uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.034 MB perf.data ]
   $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
   2020-06-26 15:32:14.928997 Creating database ...
   2020-06-26 15:32:14.933971 Writing records...
   2020-06-26 15:32:15.535251 Adding indexes
   2020-06-26 15:32:15.542993 Dropping unused tables
   2020-06-26 15:32:15.549716 Done
   $ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db

   Select: Reports -> Call Tree
   Press: Ctrl-F
   Enter: unknown
   Press: Enter

Before: displays 'unknown' not found
After: tree is expanded to line showing 'unknown'

Fixes: ae8b887c00d3f ("perf scripts python: exported-sql-viewer.py: Add call tree")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

perf scripts python: exported-sql-viewer.py: Fix zero id in call graph 'Find' result

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 7ff520b0a71dd2db695b52ad117d81b7eaf6ff9d upstream.

Using ctrl-F ('Find') would not find 'unknown' because it matches id zero.
Fix by excluding id zero from selection.

Example:

  $ perf record -e intel_pt//u uname
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.034 MB perf.data ]
  $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py perf.data.db branches calls
  2020-06-26 15:32:14.928997 Creating database ...
  2020-06-26 15:32:14.933971 Writing records...
  2020-06-26 15:32:15.535251 Adding indexes
  2020-06-26 15:32:15.542993 Dropping unused tables
  2020-06-26 15:32:15.549716 Done
  $ python3 ~/libexec/perf-core/scripts/python/exported-sql-viewer.py perf.data.db

  Select: Reports -> Context-Sensitive Call Graph
  Press: Ctrl-F
  Enter: unknown
  Press: Enter

Before: gets stuck
After: tree is expanded to line showing 'unknown'

Fixes: 254c0d820b86d ("perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

perf scripts python: export-to-postgresql.py: Fix struct.pack() int argument

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 640432e6bed08e9d5d2ba26856ba3f55008b07e3 upstream.

Python 3.8 is requiring that arguments being packed as integers are also
integers.  Add int() accordingly.

Before:

   $ perf record -e intel_pt//u uname
   $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py perf_data_db branches calls
   2020-06-25 16:09:10.547256 Creating database...
   2020-06-25 16:09:10.733185 Writing to intermediate files...
   Traceback (most recent call last):
     File "/home/ahunter/libexec/perf-core/scripts/python/export-to-postgresql.py", line 1106, in synth_data
       cbr(id, raw_buf)
     File "/home/ahunter/libexec/perf-core/scripts/python/export-to-postgresql.py", line 1058, in cbr
       value = struct.pack("!hiqiiiiii", 4, 8, id, 4, cbr, 4, MHz, 4, percent)
   struct.error: required argument is not an integer
   Fatal Python error: problem in Python trace event handler
   Python runtime state: initialized

   Current thread 0x00007f35d3695780 (most recent call first):
   <no Python frame>
   Aborted (core dumped)

After:

   $ dropdb perf_data_db
   $ rm -rf perf_data_db-perf-data
   $ perf script --itrace=bep -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py perf_data_db branches calls
   2020-06-25 16:09:40.990267 Creating database...
   2020-06-25 16:09:41.207009 Writing to intermediate files...
   2020-06-25 16:09:41.270915 Copying to database...
   2020-06-25 16:09:41.382030 Removing intermediate files...
   2020-06-25 16:09:41.384630 Adding primary keys
   2020-06-25 16:09:41.541894 Adding foreign keys
   2020-06-25 16:09:41.677044 Dropping unused tables
   2020-06-25 16:09:41.703761 Done

Fixes: aba44287a224 ("perf scripts python: export-to-postgresql.py: Export Intel PT power and ptwrite events")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200629091955.17090-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

dm writecache: reject asynchronous pmem devices

BugLink: https://bugs.launchpad.net/bugs/1887853
commit a46624580376a3a0beb218d94cbc7f258696e29f upstream.

DM writecache does not handle asynchronous pmem. Reject it when
supplied as cache.

Link: https://lore.kernel.org/linux-nvdimm/87lfk5hahc.fsf@linux.ibm.com/
Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # 5.3+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

blk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight()

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 05a4fed69ff00a8bd83538684cb602a4636b07a7 upstream.

dm-multipath is the only user of blk_mq_queue_inflight().  When
dm-multipath calls blk_mq_queue_inflight() to check if it has
outstanding IO it can get a false negative.  The reason for this is
blk_mq_rq_inflight() doesn't consider requests that are no longer
MQ_RQ_IN_FLIGHT but that are now MQ_RQ_COMPLETE (->complete isn't
called or finished yet) as "inflight".

This causes request-based dm-multipath's dm_wait_for_completion() to
return before all outstanding dm-multipath requests have actually
completed.  This breaks DM multipath's suspend functionality because
blk-mq requests complete after DM's suspend has finished -- which
shouldn't happen.

Fix this by considering any request not in the MQ_RQ_IDLE state
(so either MQ_RQ_COMPLETE or MQ_RQ_IN_FLIGHT) as "inflight" in
blk_mq_rq_inflight().

Fixes: 3c94d83cb3526 ("blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

s390/mm: fix huge pte soft dirty copying

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 528a9539348a0234375dfaa1ca5dbbb2f8f8e8d2 upstream.

If the pmd is soft dirty we must mark the pte as soft dirty (and not dirty).
This fixes some cases for guest migration with huge page backings.

Cc: <stable@vger.kernel.org> # 4.8
Fixes: bc29b7ac1d9f ("s390/mm: clean up pte/pmd encoding")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

s390/setup: init jump labels before command line parsing

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 95e61b1b5d6394b53d147c0fcbe2ae70fbe09446 upstream.

Command line parameters might set static keys. This is true for s390 at
least since commit 6471384af2a6 ("mm: security: introduce init_on_alloc=1
and init_on_free=1 boot options"). To avoid the following WARN:

static_key_enable_cpuslocked(): static key 'init_on_alloc+0x0/0x40' used
before call to jump_label_init()

call jump_label_init() just before parse_early_param().
jump_label_init() is safe to call multiple times (x86 does that), doesn't
do any memory allocations and hence should be safe to call that early.

Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Cc: <stable@vger.kernel.org> # 5.3: d6df52e9996d: s390/maccess: add no DAT mode to kernel_write
Cc: <stable@vger.kernel.org> # 5.3
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ARC: elf: use right ELF_ARCH

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b7faf971081a4e56147f082234bfff55135305cb upstream.

Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 00fdec98d9881bf5173af09aebd353ab3b9ac729 upstream.

Trap handler for syscall tracing reads EFA (Exception Fault Address),
in case strace wants PC of trap instruction (EFA is not part of pt_regs
as of current code).

However this EFA read is racy as it happens after dropping to pure
kernel mode (re-enabling interrupts). A taken interrupt could
context-switch, trigger a different task's trap, clobbering EFA for this
execution context.

Fix this by reading EFA early, before re-enabling interrupts. A slight
side benefit is de-duplication of FAKE_RET_FROM_EXCPN in trap handler.
The trap handler is common to both ARCompact and ARCv2 builds too.

This just came out of code rework/review and no real problem was reported
but is clearly a potential problem specially for strace.

Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

mmc: meson-gx: limit segments to 1 when dram-access-quirk is needed

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 27a5e7d36d383970affae801d77141deafd536a8 upstream.

The actual max_segs computation leads to failure while using the broadcom
sdio brcmfmac/bcmsdh driver, since the driver tries to make usage of
scatter gather.

But with the dram-access-quirk we use a 1,5K SRAM bounce buffer, and the
max_segs current value of 3 leads to max transfers to 4,5k, which doesn't
work.

This patch sets max_segs to 1 to better describe the hardware limitation,
and fix the SDIO functionality with the brcmfmac/bcmsdh driver on Amlogic
G12A/G12B SoCs on boards like SEI510 or Khadas VIM3.

Reported-by: Art Nikpal <art@khadas.com>
Reported-by: Christian Hewitt <christianshewitt@gmail.com>
Fixes: acdc8e71d9bb ("mmc: meson-gx: add dram-access-quirk")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200608084458.32014-1-narmstrong@baylibre.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

dm: use noio when sending kobject event

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6958c1c640af8c3f40fa8a2eee3b5b905d95b677 upstream.

kobject_uevent may allocate memory and it may be called while there are dm
devices suspended. The allocation may recurse into a suspended device,
causing a deadlock. We must set the noio flag when sending a uevent.

The observed deadlock was reported here:
https://www.redhat.com/archives/dm-devel/2020-March/msg00025.html

Reported-by: Khazhismel Kumykov <khazhy@google.com>
Reported-by: Tahsin Erdogan <tahsin@google.com>
Reported-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

drm/amdgpu: don't do soft recovery if gpu_recovery=0

BugLink: https://bugs.launchpad.net/bugs/1887853
commit f4892c327a8e5df7ce16cab40897daf90baf6bec upstream.

It's impossible to debug shader hangs with soft recovery.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

drm/radeon: fix double free

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 41855a898650803e24b284173354cc3e44d07725 upstream.

clang static analysis flags this error

drivers/gpu/drm/radeon/ci_dpm.c:5652:9: warning: Use of memory after it is freed [unix.Malloc]
                kfree(rdev->pm.dpm.ps[i].ps_priv);
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/radeon/ci_dpm.c:5654:2: warning: Attempt to free released memory [unix.Malloc]
        kfree(rdev->pm.dpm.ps);
        ^~~~~~~~~~~~~~~~~~~~~~

problem is reported in ci_dpm_fini, with these code blocks.

for (i = 0; i < rdev->pm.dpm.num_ps; i++) {
kfree(rdev->pm.dpm.ps[i].ps_priv);
}
kfree(rdev->pm.dpm.ps);

The first free happens in ci_parse_power_table where it cleans up locally
on a failure.  ci_dpm_fini also does a cleanup.

ret = ci_parse_power_table(rdev);
if (ret) {
ci_dpm_fini(rdev);
return ret;
}

So remove the cleanup in ci_parse_power_table and
move the num_ps calculation to inside the loop so ci_dpm_fini
will know how many array elements to free.

Fixes: cc8dbbb4f62a ("drm/radeon: add dpm support for CI dGPUs (v2)")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

btrfs: fix double put of block group with nocow

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 230ed397435e85b54f055c524fcb267ae2ce3bc4 upstream.

While debugging a patch that I wrote I was hitting use-after-free panics
when accessing block groups on unmount.  This turned out to be because
in the nocow case if we bail out of doing the nocow for whatever reason
we need to call btrfs_dec_nocow_writers() if we called the inc.  This
puts our block group, but a few error cases does

if (nocow) {
    btrfs_dec_nocow_writers();
    goto error;
}

unfortunately, error is

error:
if (nocow)
btrfs_dec_nocow_writers();

so we get a double put on our block group.  Fix this by dropping the
error cases calling of btrfs_dec_nocow_writers(), as it's handled at the
error label now.

Fixes: 762bf09893b4 ("btrfs: improve error handling in run_delalloc_nocow")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

btrfs: fix fatal extent_buffer readahead vs releasepage race

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6bf9cd2eed9aee6d742bb9296c994a91f5316949 upstream.

Under somewhat convoluted conditions, it is possible to attempt to
release an extent_buffer that is under io, which triggers a BUG_ON in
btrfs_release_extent_buffer_pages.

This relies on a few different factors. First, extent_buffer reads done
as readahead for searching use WAIT_NONE, so they free the local extent
buffer reference while the io is outstanding. However, they should still
be protected by TREE_REF. However, if the system is doing signficant
reclaim, and simultaneously heavily accessing the extent_buffers, it is
possible for releasepage to race with two concurrent readahead attempts
in a way that leaves TREE_REF unset when the readahead extent buffer is
released.

Essentially, if two tasks race to allocate a new extent_buffer, but the
winner who attempts the first io is rebuffed by a page being locked
(likely by the reclaim itself) then the loser will still go ahead with
issuing the readahead. The loser's call to find_extent_buffer must also
race with the reclaim task reading the extent_buffer's refcount as 1 in
a way that allows the reclaim to re-clear the TREE_REF checked by
find_extent_buffer.

The following represents an example execution demonstrating the race:

            CPU0                                                         CPU1                                           CPU2
reada_for_search                                            reada_for_search
  readahead_tree_block                                        readahead_tree_block
    find_create_tree_block                                      find_create_tree_block
      alloc_extent_buffer                                         alloc_extent_buffer
                                                                  find_extent_buffer // not found
                                                                  allocates eb
                                                                  lock pages
                                                                  associate pages to eb
                                                                  insert eb into radix tree
                                                                  set TREE_REF, refs == 2
                                                                  unlock pages
                                                              read_extent_buffer_pages // WAIT_NONE
                                                                not uptodate (brand new eb)
                                                                                                            lock_page
                                                                if !trylock_page
                                                                  goto unlock_exit // not an error
                                                              free_extent_buffer
                                                                release_extent_buffer
                                                                  atomic_dec_and_test refs to 1
        find_extent_buffer // found
                                                                                                            try_release_extent_buffer
                                                                                                              take refs_lock
                                                                                                              reads refs == 1; no io
          atomic_inc_not_zero refs to 2
          mark_buffer_accessed
            check_buffer_tree_ref
              // not STALE, won't take refs_lock
              refs == 2; TREE_REF set // no action
    read_extent_buffer_pages // WAIT_NONE
                                                                                                              clear TREE_REF
                                                                                                              release_extent_buffer
                                                                                                                atomic_dec_and_test refs to 1
                                                                                                                unlock_page
      still not uptodate (CPU1 read failed on trylock_page)
      locks pages
      set io_pages > 0
      submit io
      return
    free_extent_buffer
      release_extent_buffer
        dec refs to 0
        delete from radix tree
        btrfs_release_extent_buffer_pages
          BUG_ON(io_pages > 0)!!!

We observe this at a very low rate in production and were also able to
reproduce it in a test environment by introducing some spurious delays
and by introducing probabilistic trylock_page failures.

To fix it, we apply check_tree_ref at a point where it could not
possibly be unset by a competing task: after io_pages has been
incremented. All the codepaths that clear TREE_REF check for io, so they
would not be able to clear it after this point until the io is done.

Stack trace, for reference:
[1417839.424739] ------------[ cut here ]------------
[1417839.435328] kernel BUG at fs/btrfs/extent_io.c:4841!
[1417839.447024] invalid opcode: 0000 [#1] SMP
[1417839.502972] RIP: 0010:btrfs_release_extent_buffer_pages+0x20/0x1f0
[1417839.517008] Code: ed e9 ...
[1417839.558895] RSP: 0018:ffffc90020bcf798 EFLAGS: 00010202
[1417839.570816] RAX: 0000000000000002 RBX: ffff888102d6def0 RCX: 0000000000000028
[1417839.586962] RDX: 0000000000000002 RSI: ffff8887f0296482 RDI: ffff888102d6def0
[1417839.603108] RBP: ffff88885664a000 R08: 0000000000000046 R09: 0000000000000238
[1417839.619255] R10: 0000000000000028 R11: ffff88885664af68 R12: 0000000000000000
[1417839.635402] R13: 0000000000000000 R14: ffff88875f573ad0 R15: ffff888797aafd90
[1417839.651549] FS:  00007f5a844fa700(0000) GS:ffff88885f680000(0000) knlGS:0000000000000000
[1417839.669810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1417839.682887] CR2: 00007f7884541fe0 CR3: 000000049f609002 CR4: 00000000003606e0
[1417839.699037] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1417839.715187] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1417839.731320] Call Trace:
[1417839.737103]  release_extent_buffer+0x39/0x90
[1417839.746913]  read_block_for_search.isra.38+0x2a3/0x370
[1417839.758645]  btrfs_search_slot+0x260/0x9b0
[1417839.768054]  btrfs_lookup_file_extent+0x4a/0x70
[1417839.778427]  btrfs_get_extent+0x15f/0x830
[1417839.787665]  ? submit_extent_page+0xc4/0x1c0
[1417839.797474]  ? __do_readpage+0x299/0x7a0
[1417839.806515]  __do_readpage+0x33b/0x7a0
[1417839.815171]  ? btrfs_releasepage+0x70/0x70
[1417839.824597]  extent_readpages+0x28f/0x400
[1417839.833836]  read_pages+0x6a/0x1c0
[1417839.841729]  ? startup_64+0x2/0x30
[1417839.849624]  __do_page_cache_readahead+0x13c/0x1a0
[1417839.860590]  filemap_fault+0x6c7/0x990
[1417839.869252]  ? xas_load+0x8/0x80
[1417839.876756]  ? xas_find+0x150/0x190
[1417839.884839]  ? filemap_map_pages+0x295/0x3b0
[1417839.894652]  __do_fault+0x32/0x110
[1417839.902540]  __handle_mm_fault+0xacd/0x1000
[1417839.912156]  handle_mm_fault+0xaa/0x1c0
[1417839.921004]  __do_page_fault+0x242/0x4b0
[1417839.930044]  ? page_fault+0x8/0x30
[1417839.937933]  page_fault+0x1e/0x30
[1417839.945631] RIP: 0033:0x33c4bae
[1417839.952927] Code: Bad RIP value.
[1417839.960411] RSP: 002b:00007f5a844f7350 EFLAGS: 00010206
[1417839.972331] RAX: 000000000000006e RBX: 1614b3ff6a50398a RCX: 0000000000000000
[1417839.988477] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002
[1417840.004626] RBP: 00007f5a844f7420 R08: 000000000000006e R09: 00007f5a94aeccb8
[1417840.020784] R10: 00007f5a844f7350 R11: 0000000000000000 R12: 00007f5a94aecc79
[1417840.036932] R13: 00007f5a94aecc78 R14: 00007f5a94aecc90 R15: 00007f5a94aecc40

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

bpf: Check correct cred for CAP_SYSLOG in bpf_dump_raw_ok()

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 63960260457a02af2a6cb35d75e6bdb17299c882 upstream.

When evaluating access control over kallsyms visibility, credentials at
open() time need to be used, not the "current" creds (though in BPF's
case, this has likely always been the same). Plumb access to associated
file->f_cred down through bpf_dump_raw_ok() and its callers now that
kallsysm_show_value() has been refactored to take struct cred.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

kprobes: Do not expose probe addresses to non-CAP_SYSLOG

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 60f7bb66b88b649433bf700acfc60c3f24953871 upstream.

The kprobe show() functions were using "current"'s creds instead
of the file opener's creds for kallsyms visibility. Fix to use
seq_file->file->f_cred.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 81365a947de4 ("kprobes: Show address of kprobes if kallsyms does")
Fixes: ffb9bd68ebdb ("kprobes: Show blacklist addresses as same as kallsyms does")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

module: Do not expose section addresses to non-CAP_SYSLOG

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b25a7c5af9051850d4f3d93ca500056ab6ec724b upstream.

The printing of section addresses in /sys/module/*/sections/* was not
using the correct credentials to evaluate visibility.

Before:

# cat /sys/module/*/sections/.*text
0xffffffffc0458000
...
# capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
0xffffffffc0458000
...

After:

# cat /sys/module/*/sections/*.text
0xffffffffc0458000
...
# capsh --drop=CAP_SYSLOG -- -c "cat /sys/module/*/sections/.*text"
0x0000000000000000
...

Additionally replaces the existing (safe) /proc/modules check with
file->f_cred for consistency.

Reported-by: Dominik Czarnota <dominik.czarnota@trailofbits.com>
Fixes: be71eda5383f ("module: Fix display of wrong module .text address")
Cc: stable@vger.kernel.org
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

module: Refactor section attr into bin attribute

BugLink: https://bugs.launchpad.net/bugs/1887853
commit ed66f991bb19d94cae5d38f77de81f96aac7813f upstream.

In order to gain access to the open file's f_cred for kallsym visibility
permission checks, refactor the module section attributes to use the
bin_attribute instead of attribute interface. Additionally removes the
redundant "name" struct member.

Cc: stable@vger.kernel.org
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

kallsyms: Refactor kallsyms_show_value() to take cred

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 160251842cd35a75edfb0a1d76afa3eb674ff40a upstream.

In order to perform future tests against the cred saved during open(),
switch kallsyms_show_value() to operate on a cred, and have all current
callers pass current_cred(). This makes it very obvious where callers
are checking the wrong credential in their "read" contexts. These will
be fixed in the coming patches.

Additionally switch return value to bool, since it is always used as a
direct permission check, not a 0-on-success, negative-on-error style
function return.

Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: arm64: Fix kvm_reset_vcpu() return code being incorrect with SVE

BugLink: https://bugs.launchpad.net/bugs/1887853
If SVE is enabled then 'ret' can be assigned the return value of
kvm_vcpu_enable_sve() which may be 0 causing future "goto out" sites to
erroneously return 0 on failure rather than -EINVAL as expected.

Remove the initialisation of 'ret' and make setting the return value
explicit to avoid this situation in the future.

Fixes: 9a3cdf26e336 ("KVM: arm64/sve: Allow userspace to enable SVE for vcpus")
Cc: stable@vger.kernel.org
Reported-by: James Morse <james.morse@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200617105456.28245-1-steven.price@arm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: x86: Mark CR4.TSD as being possibly owned by the guest

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 7c83d096aed055a7763a03384f92115363448b71 upstream.

Mark CR4.TSD as being possibly owned by the guest as that is indeed the
case on VMX. Without TSD being tagged as possibly owned by the guest, a
targeted read of CR4 to get TSD could observe a stale value. This bug
is benign in the current code base as the sole consumer of TSD is the
emulator (for RDTSC) and the emulator always "reads" the entirety of CR4
when grabbing bits.

Add a build-time assertion in to ensure VMX doesn't hand over more CR4
bits without also updating x86.

Fixes: 52ce3c21aec3 ("x86,kvm,vmx: Don't trap writes to CR4.TSD")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703040422.31536-2-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: x86: Inject #GP if guest attempts to toggle CR4.LA57 in 64-bit mode

BugLink: https://bugs.launchpad.net/bugs/1887853
commit d74fcfc1f0ff4b6c26ecef1f9e48d8089ab4eaac upstream.

Inject a #GP on MOV CR4 if CR4.LA57 is toggled in 64-bit mode, which is
illegal per Intel's SDM:

  CR4.LA57
    57-bit linear addresses (bit 12 of CR4) ... blah blah blah ...
    This bit cannot be modified in IA-32e mode.

Note, the pseudocode for MOV CR doesn't call out the fault condition,
which is likely why the check was missed during initial development.
This is arguably an SDM bug and will hopefully be fixed in future
release of the SDM.

Fixes: fd8cb433734ee ("KVM: MMU: Expose the LA57 feature to VM.")
Cc: stable@vger.kernel.org
Reported-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200703021714.5549-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: x86: bit 8 of non-leaf PDPEs is not reserved

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 5ecad245de2ae23dc4e2dbece92f8ccfbaed2fa7 upstream.

Bit 8 would be the "global" bit, which does not quite make sense for non-leaf
page table entries. Intel ignores it; AMD ignores it in PDEs and PDPEs, but
reserves it in PML4Es.

Probably, earlier versions of the AMD manual documented it as reserved in PDPEs
as well, and that behavior made it into KVM as well as kvm-unit-tests; fix it.

Cc: stable@vger.kernel.org
Reported-by: Nadav Amit <namit@vmware.com>
Fixes: a0c0feb57992 ("KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD", 2014-09-03)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: arm64: Annotate hyp NMI-related functions as __always_inline

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 7733306bd593c737c63110175da6c35b4b8bb32c upstream.

The "inline" keyword is a hint for the compiler to inline a function.  The
functions system_uses_irq_prio_masking() and gic_write_pmr() are used by
the code running at EL2 on a non-VHE system, so mark them as
__always_inline to make sure they'll always be part of the .hyp.text
section.

This fixes the following splat when trying to run a VM:

[   47.625273] Kernel panic - not syncing: HYP panic:
[   47.625273] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[   47.625273] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[   47.625273] VCPU:0000000000000000
[   47.647261] CPU: 1 PID: 217 Comm: kvm-vcpu-0 Not tainted 5.8.0-rc1-ARCH+ #61
[   47.654508] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[   47.661139] Call trace:
[   47.663659]  dump_backtrace+0x0/0x1cc
[   47.667413]  show_stack+0x18/0x24
[   47.670822]  dump_stack+0xb8/0x108
[   47.674312]  panic+0x124/0x2f4
[   47.677446]  panic+0x0/0x2f4
[   47.680407] SMP: stopping secondary CPUs
[   47.684439] Kernel Offset: disabled
[   47.688018] CPU features: 0x240402,20002008
[   47.692318] Memory Limit: none
[   47.695465] ---[ end Kernel panic - not syncing: HYP panic:
[   47.695465] PS:a00003c9 PC:0000ca0b42049fc4 ESR:86000006
[   47.695465] FAR:0000ca0b42049fc4 HPFAR:0000000010001000 PAR:0000000000000000
[   47.695465] VCPU:0000000000000000 ]---

The instruction abort was caused by the code running at EL2 trying to fetch
an instruction which wasn't mapped in the EL2 translation tables. Using
objdump showed the two functions as separate symbols in the .text section.

Fixes: 85738e05dc38 ("arm64: kvm: Unmask PMR before entering guest")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20200618171254.1596055-1-alexandru.elisei@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: arm64: Stop clobbering x0 for HVC_SOFT_RESTART

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b9e10d4a6c9f5cbe6369ce2c17ebc67d2e5a4be5 upstream.

HVC_SOFT_RESTART is given values for x0-2 that it should installed
before exiting to the new address so should not set x0 to stub HVC
success or failure code.

Fixes: af42f20480bf1 ("arm64: hyp-stub: Zero x0 on successful stub handling")
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Scull <ascull@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200706095259.1338221-1-ascull@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

KVM: arm64: Fix definition of PAGE_HYP_DEVICE

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 68cf617309b5f6f3a651165f49f20af1494753ae upstream.

PAGE_HYP_DEVICE is intended to encode attribute bits for an EL2 stage-1
pte mapping a device. Unfortunately, it includes PROT_DEVICE_nGnRE which
encodes attributes for EL1 stage-1 mappings such as UXN and nG, which are
RES0 for EL2, and DBM which is meaningless as TCR_EL2.HD is not set.

Fix the definition of PAGE_HYP_DEVICE so that it doesn't set RES0 bits
at EL2.

Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200708162546.26176-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: hda/realtek: Enable headset mic of Acer Veriton N4660G with ALC269VC

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 781c90c034d994c6a4e2badf189128a95ed864c2 upstream.

The Acer Veriton N4660G desktop's audio (1025:1248) with ALC269VC cannot
detect the headset microphone until ALC269VC_FIXUP_ACER_MIC_NO_PRESENCE
quirk maps the NID 0x18 as the headset mic pin.

Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200706071826.39726-3-jian-hong@endlessm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: hda/realtek: Enable headset mic of Acer C20-820 with ALC269VC

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6e15d1261d522d1d222f8f89b23c6966905e9049 upstream.

The Acer Aspire C20-820 AIO's audio (1025:1065) with ALC269VC can't
detect the headset microphone until ALC269VC_FIXUP_ACER_HEADSET_MIC
quirk maps the NID 0x18 as the headset mic pin.

Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200706071826.39726-2-jian-hong@endlessm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: hda/realtek - Enable audio jacks of Acer vCopperbox with ALC269VC

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 8eae7e9b3967f08efaa4d70403aec513cbe45ad0 upstream.

The Acer desktop vCopperbox with ALC269VC cannot detect the MIC of
headset, the line out and internal speaker until
ALC269VC_FIXUP_ACER_VCOPPERBOX_PINS quirk applied.

Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200706071826.39726-1-jian-hong@endlessm.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: hda/realtek - Fix Lenovo Thinkpad X1 Carbon 7th quirk subdevice id

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 9774dc218bb628974dcbc76412f970e9258e5f27 upstream.

1)
In snd_hda_pick_fixup(), quirks are first matched by PCI SSID and then, if
there is no match, by codec SSID. The Lenovo "ThinkPad X1 Carbon 7th" has
an audio chip with PCI SSID 0x2292 and codec SSID 0x2293[1]. Therefore, fix
the quirk meant for that device to match on .subdevice == 0x2292.

2)
The "Thinkpad X1 Yoga 7th" does not exist. The companion product to the
Carbon 7th is the Yoga 4th. That device has an audio chip with PCI SSID
0x2292 and codec SSID 0x2292[2]. Given the behavior of
snd_hda_pick_fixup(), it is not possible to have a separate quirk for the
Yoga based on SSID. Therefore, merge the quirks meant for the Carbon and
Yoga. This preserves the current behavior for the Yoga.

[1] This is the case on my own machine and can also be checked here
https://github.com/linuxhw/LsPCI/tree/master/Notebook/Lenovo/ThinkPad
https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3225701
[2]
https://github.com/linuxhw/LsPCI/tree/master/Convertible/Lenovo/ThinkPad
https://gist.github.com/hamidzr/dd81e429dc86f4327ded7a2030e7d7d9#gistcomment-3176355

Fixes: d2cd795c4ece ("ALSA: hda - fixup for the bass speaker on Lenovo Carbon X1 7th gen")
Fixes: 54a6a7dc107d ("ALSA: hda/realtek - Add quirk for the bass speaker on Lenovo Yoga X1 7th gen")
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Kailang Yang <kailang@realtek.com>
Tested-by: Vincent Bernat <vincent@bernat.ch>
Tested-by: Even Brenden <evenbrenden@gmail.com>
Signed-off-by: Benjamin Poirier <benjamin.poirier@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20200703080005.8942-2-benjamin.poirier@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: usb-audio: Add implicit feedback quirk for RTX6001

BugLink: https://bugs.launchpad.net/bugs/1887853
commit b6a1e78b96a5d7f312f08b3a470eb911ab5feec0 upstream.

USB Audio analyzer RTX6001 uses the same implicit feedback quirk
as other XMOS-based devices.

Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Tested-by: Pavel Hofman <pavel.hofman@ivitera.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/822f0f20-1886-6884-a6b2-d11c685cbafa@ivitera.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: usb-audio: add quirk for MacroSilicon MS2109

BugLink: https://bugs.launchpad.net/bugs/1887853
commit e337bf19f6af38d5c3fa6d06cd594e0f890ca1ac upstream.

These devices claim to be 96kHz mono, but actually are 48kHz stereo with
swapped channels and unaligned transfers.

Cc: stable@vger.kernel.org
Signed-off-by: Hector Martin <marcan@marcan.st>
Link: https://lore.kernel.org/r/20200702071433.237843-1-marcan@marcan.st
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: hda - let hs_mic be picked ahead of hp_mic

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 6a6ca7881b1ab1c13fe0d70bae29211a65dd90de upstream.

We have a Dell AIO, there is neither internal speaker nor internal
mic, only a multi-function audio jack on it.

Users reported that after freshly installing the OS and plug
a headset to the audio jack, the headset can't output sound. I
reproduced this bug, at that moment, the Input Source is as below:
Simple mixer control 'Input Source',0
  Capabilities: cenum
  Items: 'Headphone Mic' 'Headset Mic'
  Item0: 'Headphone Mic'

That is because the patch_realtek will set this audio jack as mic_in
mode if Input Source's value is hp_mic.

If it is not fresh installing, this issue will not happen since the
systemd will run alsactl restore -f /var/lib/alsa/asound.state, this
will set the 'Input Source' according to history value.

If there is internal speaker or internal mic, this issue will not
happen since there is valid sink/source in the pulseaudio, the PA will
set the 'Input Source' according to active_port.

To fix this issue, change the parser function to let the hs_mic be
stored ahead of hp_mic.

Cc: stable@vger.kernel.org
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20200625083833.11264-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

ALSA: opl3: fix infoleak in opl3

BugLink: https://bugs.launchpad.net/bugs/1887853
commit ad155712bb1ea2151944cf06a0e08c315c70c1e3 upstream.

The stack object “info” in snd_opl3_ioctl() has a leaking problem.
It has 2 padding bytes which are not initialized and leaked via
“copy_to_user”.

Signed-off-by: xidongwang <wangxidong_97@163.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1594006058-30362-1-git-send-email-wangxidong_97@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

IB/hfi1: Do not destroy link_wq when the device is shut down

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 2315ec12ee8e8257bb335654c62e0cae71dc278d upstream.

The workqueue link_wq should only be destroyed when the hfi1 driver is
unloaded, not when the device is shut down.

Fixes: 71d47008ca1b ("IB/hfi1: Create workqueue for link events")
Link: https://lore.kernel.org/r/20200623204053.107638.70315.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

IB/hfi1: Do not destroy hfi1_wq when the device is shut down

BugLink: https://bugs.launchpad.net/bugs/1887853
commit 28b70cd9236563e1a88a6094673fef3c08db0d51 upstream.

The workqueue hfi1_wq is destroyed in function shutdown_device(), which is
called by either shutdown_one() or remove_one(). The function
shutdown_one() is called when the kernel is rebooted while remove_one() is
called when the hfi1 driver is unloaded. When the kernel is rebooted,
hfi1_wq is destroyed while all qps are still active, leading to a kernel
crash:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
  IP: [<ffffffff94cb7b02>] __queue_work+0x32/0x3e0
  PGD 0
  Oops: 0000 [#1] SMP
  Modules linked in: dm_round_robin nvme_rdma(OE) nvme_fabrics(OE) nvme_core(OE) ib_isert iscsi_target_mod target_core_mod ib_ucm mlx4_ib iTCO_wdt iTCO_vendor_support mxm_wmi sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm rpcrdma sunrpc irqbypass crc32_pclmul ghash_clmulni_intel rdma_ucm aesni_intel ib_uverbs lrw gf128mul opa_vnic glue_helper ablk_helper ib_iser cryptd ib_umad rdma_cm iw_cm ses enclosure libiscsi scsi_transport_sas pcspkr joydev ib_ipoib(OE) scsi_transport_iscsi ib_cm sg ipmi_ssif mei_me lpc_ich i2c_i801 mei ioatdma ipmi_si dm_multipath ipmi_devintf ipmi_msghandler wmi acpi_pad acpi_power_meter hangcheck_timer ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm hfi1(OE)
  crct10dif_pclmul crct10dif_common crc32c_intel drm ahci mlx4_core libahci rdmavt(OE) igb megaraid_sas ib_core libata drm_panel_orientation_quirks ptp pps_core devlink dca i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod
  CPU: 19 PID: 0 Comm: swapper/19 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
  Hardware name: Phegda X2226A/S2600CW, BIOS SE5C610.86B.01.01.0024.021320181901 02/13/2018
  task: ffff8a799ba0d140 ti: ffff8a799bad8000 task.ti: ffff8a799bad8000
  RIP: 0010:[<ffffffff94cb7b02>] [<ffffffff94cb7b02>] __queue_work+0x32/0x3e0
  RSP: 0018:ffff8a90dde43d80 EFLAGS: 00010046
  RAX: 0000000000000082 RBX: 0000000000000086 RCX: 0000000000000000
  RDX: ffff8a90b924fcb8 RSI: 0000000000000000 RDI: 000000000000001b
  RBP: ffff8a90dde43db8 R08: ffff8a799ba0d6d8 R09: ffff8a90dde53900
  R10: 0000000000000002 R11: ffff8a90dde43de8 R12: ffff8a90b924fcb8
  R13: 000000000000001b R14: 0000000000000000 R15: ffff8a90d2890000
  FS: 0000000000000000(0000) GS:ffff8a90dde40000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000102 CR3: 0000001a70410000 CR4: 00000000001607e0
  Call Trace:
  [<ffffffff94cb8105>] queue_work_on+0x45/0x50
  [<ffffffffc03f781e>] _hfi1_schedule_send+0x6e/0xc0 [hfi1]
  [<ffffffffc03f78a2>] hfi1_schedule_send+0x32/0x70 [hfi1]
  [<ffffffffc02cf2d9>] rvt_rc_timeout+0xe9/0x130 [rdmavt]
  [<ffffffff94ce563a>] ? trigger_load_balance+0x6a/0x280
  [<ffffffffc02cf1f0>] ? rvt_free_qpn+0x40/0x40 [rdmavt]
  [<ffffffff94ca7f58>] call_timer_fn+0x38/0x110
  [<ffffffffc02cf1f0>] ? rvt_free_qpn+0x40/0x40 [rdmavt]
  [<ffffffff94caa3bd>] run_timer_softirq+0x24d/0x300
  [<ffffffff94ca0f05>] __do_softirq+0xf5/0x280
  [<ffffffff9537832c>] call_softirq+0x1c/0x30
  [<ffffffff94c2e675>] do_softirq+0x65/0xa0
  [<ffffffff94ca1285>] irq_exit+0x105/0x110
  [<ffffffff953796c8>] smp_apic_timer_interrupt+0x48/0x60
  [<ffffffff95375df2>] apic_timer_interrupt+0x162/0x170
  <EOI>
  [<ffffffff951adfb7>] ? cpuidle_enter_state+0x57/0xd0
  [<ffffffff951ae10e>] cpuidle_idle_call+0xde/0x230
  [<ffffffff94c366de>] arch_cpu_idle+0xe/0xc0
  [<ffffffff94cfc3ba>] cpu_startup_entry+0x14a/0x1e0
  [<ffffffff94c57db7>] start_secondary+0x1f7/0x270
  [<ffffffff94c000d5>] start_cpu+0x5/0x14

The solution is to destroy the workqueue only when the hfi1 driver is
unloaded, not when the device is shut down. In addition, when the device
is shut down, no more work should be scheduled on the workqueues and the
workqueues are flushed.

Fixes: 8d3e71136a08 ("IB/{hfi1, qib}: Add handling of kernel restart")
Link: https://lore.kernel.org/r/20200623204047.107638.77646.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

mlxsw: pci: Fix use-after-free in case of failed devlink reload

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit c4317b11675b99af6641662ebcbd3c6010600e64 ]

In case devlink reload failed, it is possible to trigger a
use-after-free when querying the kernel for device info via 'devlink dev
info' [1].

This happens because as part of the reload error path the PCI command
interface is de-initialized and its mailboxes are freed. When the
devlink '->info_get()' callback is invoked the device is queried via the
command interface and the freed mailboxes are accessed.

Fix this by initializing the command interface once during probe and not
during every reload.

This is consistent with the other bus used by mlxsw (i.e., 'mlxsw_i2c')
and also allows user space to query the running firmware version (for
example) from the device after a failed reload.

[1]
BUG: KASAN: use-after-free in memcpy include/linux/string.h:406 [inline]
BUG: KASAN: use-after-free in mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
Write of size 4096 at addr ffff88810ae32000 by task syz-executor.1/2355

CPU: 1 PID: 2355 Comm: syz-executor.1 Not tainted 5.8.0-rc2+ #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0xf6/0x16e lib/dump_stack.c:118
print_address_description.constprop.0+0x1c/0x250 mm/kasan/report.c:383
__kasan_report mm/kasan/report.c:513 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
check_memory_region_inline mm/kasan/generic.c:186 [inline]
check_memory_region+0x14e/0x1b0 mm/kasan/generic.c:192
memcpy+0x39/0x60 mm/kasan/common.c:106
memcpy include/linux/string.h:406 [inline]
mlxsw_pci_cmd_exec+0x177/0xa60 drivers/net/ethernet/mellanox/mlxsw/pci.c:1675
mlxsw_cmd_exec+0x249/0x550 drivers/net/ethernet/mellanox/mlxsw/core.c:2335
mlxsw_cmd_access_reg drivers/net/ethernet/mellanox/mlxsw/cmd.h:859 [inline]
mlxsw_core_reg_access_cmd drivers/net/ethernet/mellanox/mlxsw/core.c:1938 [inline]
mlxsw_core_reg_access+0x2f6/0x540 drivers/net/ethernet/mellanox/mlxsw/core.c:1985
mlxsw_reg_query drivers/net/ethernet/mellanox/mlxsw/core.c:2000 [inline]
mlxsw_devlink_info_get+0x17f/0x6e0 drivers/net/ethernet/mellanox/mlxsw/core.c:1090
devlink_nl_info_fill.constprop.0+0x13c/0x2d0 net/core/devlink.c:4588
devlink_nl_cmd_info_get_dumpit+0x246/0x460 net/core/devlink.c:4648
genl_lock_dumpit+0x85/0xc0 net/netlink/genetlink.c:575
netlink_dump+0x515/0xe50 net/netlink/af_netlink.c:2245
__netlink_dump_start+0x53d/0x830 net/netlink/af_netlink.c:2353
genl_family_rcv_msg_dumpit.isra.0+0x296/0x300 net/netlink/genetlink.c:638
genl_family_rcv_msg net/netlink/genetlink.c:733 [inline]
genl_rcv_msg+0x78d/0x9d0 net/netlink/genetlink.c:753
netlink_rcv_skb+0x152/0x440 net/netlink/af_netlink.c:2469
genl_rcv+0x24/0x40 net/netlink/genetlink.c:764
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x53a/0x750 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x850/0xd90 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0x150/0x190 net/socket.c:672
____sys_sendmsg+0x6d8/0x840 net/socket.c:2363
___sys_sendmsg+0xff/0x170 net/socket.c:2417
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2450
do_syscall_64+0x56/0xa0 arch/x86/entry/common.c:359
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: a9c8336f6544 ("mlxsw: core: Add support for devlink info command")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

mlxsw: spectrum_router: Remove inappropriate usage of WARN_ON()

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit d9d5420273997664a1c09151ca86ac993f2f89c1 ]

We should not trigger a warning when a memory allocation fails. Remove
the WARN_ON().

The warning is constantly triggered by syzkaller when it is injecting
faults:

[ 2230.758664] FAULT_INJECTION: forcing a failure.
[ 2230.758664] name failslab, interval 1, probability 0, space 0, times 0
[ 2230.762329] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
...
[ 2230.898175] WARNING: CPU: 3 PID: 1407 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:6265 mlxsw_sp_router_fib_event+0xfad/0x13e0
[ 2230.898179] Kernel panic - not syncing: panic_on_warn set ...
[ 2230.898183] CPU: 3 PID: 1407 Comm: syz-executor.0 Not tainted 5.8.0-rc2+ #28
[ 2230.898190] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014

Fixes: 3057224e014c ("mlxsw: spectrum_router: Implement FIB offload in deferred work")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net: macb: fix call to pm_runtime in the suspend/resume functions

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 6c8f85cac98a4c6b767c4c4f6af7283724c32b47 ]

The calls to pm_runtime_force_suspend/resume() functions are only
relevant if the device is not configured to act as a WoL wakeup source.
Add the device_may_wakeup() test before calling them.

Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net: macb: mark device wake capable when "magic-packet" property present

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit ced4799d06375929e013eea04ba6908207afabbe ]

Change the way the "magic-packet" DT property is handled in the
macb_probe() function, matching DT binding documentation.
Now we mark the device as "wakeup capable" instead of calling the
device_init_wakeup() function that would enable the wakeup source.

For Ethernet WoL, enabling the wakeup_source is done by
using ethtool and associated macb_set_wol() function that
already calls device_set_wakeup_enable() for this purpose.

That would reduce power consumption by cutting more clocks if
"magic-packet" property is set but WoL is not configured by ethtool.

Fixes: 3e2a5e153906 ("net: macb: add wake-on-lan support via magic packet")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Cc: Sergio Prado <sergio.prado@e-labworks.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net: macb: fix wakeup test in runtime suspend/resume routines

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 515a10a701d570e26dfbe6ee373f77c8bf11053f ]

Use the proper struct device pointer to check if the wakeup flag
and wakeup source are positioned.
Use the one passed by function call which is equivalent to
&bp->dev->dev.parent.

It's preventing the trigger of a spurious interrupt in case the
Wake-on-Lan feature is used.

Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support")
Cc: Claudiu Beznea <claudiu.beznea@microchip.com>
Cc: Harini Katakam <harini.katakam@xilinx.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

bnxt_en: fix NULL dereference in case SR-IOV configuration fails

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit c8b1d7436045d3599bae56aef1682813ecccaad7 ]

we need to set 'active_vfs' back to 0, if something goes wrong during the
allocation of SR-IOV resources: otherwise, further VF configurations will
wrongly assume that bp->pf.vf[x] are valid memory locations, and commands
like the ones in the following sequence:

# echo 2 >/sys/bus/pci/devices/${ADDR}/sriov_numvfs
# ip link set dev ens1f0np0 up
# ip link set dev ens1f0np0 vf 0 trust on

will cause a kernel crash similar to this:

bnxt_en 0000:3b:00.0: not enough MMIO resources for SR-IOV
BUG: kernel NULL pointer dereference, address: 0000000000000014
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 43 PID: 2059 Comm: ip Tainted: G          I       5.8.0-rc2.upstream+ #871
Hardware name: Dell Inc. PowerEdge R740/08D89F, BIOS 2.2.11 06/13/2019
RIP: 0010:bnxt_set_vf_trust+0x5b/0x110 [bnxt_en]
Code: 44 24 58 31 c0 e8 f5 fb ff ff 85 c0 0f 85 b6 00 00 00 48 8d 1c 5b 41 89 c6 b9 0b 00 00 00 48 c1 e3 04 49 03 9c 24 f0 0e 00 00 <8b> 43 14 89 c2 83 c8 10 83 e2 ef 45 84 ed 49 89 e5 0f 44 c2 4c 89
RSP: 0018:ffffac6246a1f570 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000000b
RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98b28f538900
RBP: ffff98b28f538900 R08: 0000000000000000 R09: 0000000000000008
R10: ffffffffb9515be0 R11: ffffac6246a1f678 R12: ffff98b28f538000
R13: 0000000000000001 R14: 0000000000000000 R15: ffffffffc05451e0
FS:  00007fde0f688800(0000) GS:ffff98baffd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000014 CR3: 000000104bb0a003 CR4: 00000000007606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
  do_setlink+0x994/0xfe0
  __rtnl_newlink+0x544/0x8d0
  rtnl_newlink+0x47/0x70
  rtnetlink_rcv_msg+0x29f/0x350
  netlink_rcv_skb+0x4a/0x110
  netlink_unicast+0x21d/0x300
  netlink_sendmsg+0x329/0x450
  sock_sendmsg+0x5b/0x60
  ____sys_sendmsg+0x204/0x280
  ___sys_sendmsg+0x88/0xd0
  __sys_sendmsg+0x5e/0xa0
  do_syscall_64+0x47/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.")
Reported-by: Fei Liu <feliu@redhat.com>
CC: Jonathan Toppins <jtoppins@redhat.com>
CC: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net/mlx5e: Fix 50G per lane indication

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 6a1cf4e443a3b0a4d690d3c93b84b1e9cbfcb1bd ]

Some released FW versions mistakenly don't set the capability that 50G
per lane link-modes are supported for VFs (ptys_extended_ethernet
capability bit). When the capability is unset, read
PTYS.ext_eth_proto_capability (always reliable).
If PTYS.ext_eth_proto_capability is valid (has a non-zero value)
conclude that the HCA supports 50G per lane. Otherwise, conclude that
the HCA doesn't support 50G per lane.

Fixes: a08b4ed1373d ("net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register")
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

net/mlx5: Fix eeprom support for SFP module

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 47afbdd2fa4c5775c383ba376a3d1da7d7f694dc ]

Fix eeprom SFP query support by setting i2c_addr, offset and page number
correctly. Unlike QSFP modules, SFP eeprom params are as follow:
- i2c_addr is 0x50 for offset 0 - 255 and 0x51 for offset 256 - 511.
- Page number is always zero.
- Page offset is always relative to zero.

As part of eeprom query, query the module ID (SFP / QSFP*) via helper
function to set the params accordingly.

In addition, change mlx5_qsfp_eeprom_page() input type to be u16 to avoid
unnecessary casting.

Fixes: a708fb7b1f8d ("net/mlx5e: ethtool, Add support for EEPROM high pages query")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

qed: Populate nvm-file attributes while reading nvm config partition.

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 13cf8aab7425a253070433b5a55b4209ceac8b19 ]

NVM config file address will be modified when the MBI image is upgraded.
Driver would return stale config values if user reads the nvm-config
(via ethtool -d) in this state. The fix is to re-populate nvm attribute
info while reading the nvm config values/partition.

Changes from previous version:
-------------------------------
v3: Corrected the formatting in 'Fixes' tag.
v2: Added 'Fixes' tag.

Fixes: 1ac4329a1cff ("qed: Add configuration information to register dump and debug data")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

IB/mlx5: Fix 50G per lane indication

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 530c8632b547ff72f11ff83654b22462a73f1f7b ]

Some released FW versions mistakenly don't set the capability that 50G per
lane link-modes are supported for VFs (ptys_extended_ethernet capability
bit).

Use PTYS.ext_eth_proto_capability instead, as this indication is always
accurate. If PTYS.ext_eth_proto_capability is valid
(has a non-zero value) conclude that the HCA supports 50G per lane.

Otherwise, conclude that the HCA doesn't support 50G per lane.

Fixes: 08e8676f1607 ("IB/mlx5: Add support for 50Gbps per lane link modes")
Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

cxgb4: fix all-mask IP address comparison

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 76c4d85c9260c3d741cbd194c30c61983d0a4303 ]

Convert all-mask IP address to Big Endian, instead, for comparison.

Fixes: f286dd8eaad5 ("cxgb4: use correct type for all-mask IP address comparison")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

nbd: Fix memory leak in nbd_add_socket

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 579dd91ab3a5446b148e7f179b6596b270dace46 ]

When adding first socket to nbd, if nsock's allocation failed, the data
structure member "config->socks" was reallocated, but the data structure
member "config->num_connections" was not updated. A memory leak will occur
then because the function "nbd_config_put" will free "config->socks" only
when "config->num_connections" is not zero.

Fixes: 03bf73c315ed ("nbd: prevent memory leak")
Reported-by: syzbot+934037347002901b8d2a@syzkaller.appspotmail.com
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

arm64: kgdb: Fix single-step exception handling oops

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 8523c006264df65aac7d77284cc69aac46a6f842 ]

After entering kdb due to breakpoint, when we execute 'ss' or 'go' (will
delay installing breakpoints, do single-step first), it won't work
correctly, and it will enter kdb due to oops.

It's because the reason gotten in kdb_stub() is not as expected, and it
seems that the ex_vector for single-step should be 0, like what arch
powerpc/sh/parisc has implemented.

Before the patch:
Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0xffff8000101486cc (printk)
is enabled addr at ffff8000101486cc, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 due to Breakpoint @ 0xffff8000101486cc
[3]kdb> ss

Entering kdb (current=0xffff0000fa878040, pid 266) on processor 3 Oops: (null)
due to oops @ 0xffff800010082ab8
CPU: 3 PID: 266 Comm: sh Not tainted 5.7.0-rc4-13839-gf0e5ad491718 #6
Hardware name: linux,dummy-virt (DT)
pstate: 00000085 (nzcv daIf -PAN -UAO)
pc : el1_irq+0x78/0x180
lr : __handle_sysrq+0x80/0x190
sp : ffff800015003bf0
x29: ffff800015003d20 x28: ffff0000fa878040
x27: 0000000000000000 x26: ffff80001126b1f0
x25: ffff800011b6a0d8 x24: 0000000000000000
x23: 0000000080200005 x22: ffff8000101486cc
x21: ffff800015003d30 x20: 0000ffffffffffff
x19: ffff8000119f2000 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000
x9 : 0000000000000000 x8 : ffff800015003e50
x7 : 0000000000000002 x6 : 00000000380b9990
x5 : ffff8000106e99e8 x4 : ffff0000fadd83c0
x3 : 0000ffffffffffff x2 : ffff800011b6a0d8
x1 : ffff800011b6a000 x0 : ffff80001130c9d8
Call trace:
el1_irq+0x78/0x180
printk+0x0/0x84
write_sysrq_trigger+0xb0/0x118
proc_reg_write+0xb4/0xe0
__vfs_write+0x18/0x40
vfs_write+0xb0/0x1b8
ksys_write+0x64/0xf0
__arm64_sys_write+0x14/0x20
el0_svc_common.constprop.2+0xb0/0x168
do_el0_svc+0x20/0x98
el0_sync_handler+0xec/0x1a8
el0_sync+0x140/0x180

[3]kdb>

After the patch:
Entering kdb (current=0xffff8000119e2dc0, pid 0) on processor 0 due to Keyboard Entry
[0]kdb> bp printk
Instruction(i) BP #0 at 0xffff8000101486cc (printk)
is enabled addr at ffff8000101486cc, hardtype=0 installed=0

[0]kdb> g

/ # echo h > /proc/sysrq-trigger

Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc
[0]kdb> g

Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to Breakpoint @ 0xffff8000101486cc
[0]kdb> ss

Entering kdb (current=0xffff0000fa852bc0, pid 268) on processor 0 due to SS trap @ 0xffff800010082ab8
[0]kdb>

Fixes: 44679a4f142b ("arm64: KGDB: Add step debugging support")
Signed-off-by: Wei Li <liwei391@huawei.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200509214159.19680-2-liwei391@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>

RDMA/siw: Fix reporting vendor_part_id

BugLink: https://bugs.launchpad.net/bugs/1887853
[ Upstream commit 04340645f69ab7abb6f9052688a60f0213b3f79c ]

Move the initialization of the vendor_part_id to be before calling
ib_register_device(), this is needed because the query_device() callback
is called from the context of ib_register_device() before initializing the
vendor_part_id, so the reported value is wrong.

Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface")
Link: https://lore.kernel.org/r/20200707130931.444724-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>