Commit f45709df7731 ("serial: 8250: Don't touch RTS modem control while
in rs485 mode") sought to prevent user space from interfering with rs485
communication by ignoring a TIOCMSET ioctl() which changes RTS polarity.
It did so in serial8250_do_set_mctrl(), which turns out to be too deep
in the call stack: When a uart_port is opened, RTS polarity is set by
the rs485-aware function uart_port_dtr_rts(). It calls down to
serial8250_do_set_mctrl() and that particular RTS polarity change should
*not* be ignored.
The user-visible result is that on 8250_omap ports which use rs485 with
inverse polarity (RTS bit in MCR register is 1 to receive, 0 to send),
a newly opened port initially sets up RTS for sending instead of
receiving. That's because omap_8250_startup() sets the cached value
up->mcr to 0 and omap_8250_restore_regs() subsequently writes it to the
MCR register. Due to the commit, serial8250_do_set_mctrl() preserves
that incorrect register value:
Have pericom_do_set_divisor() use the uartclk instead of a hard coded
value to work with different speed crystals. Tested with 14.7456 and 24
MHz crystals.
Have pericom_do_set_divisor() always calculate the divisor rather than
call serial8250_do_set_divisor() for rates below baud_base.
Do not write registers or call serial8250_do_set_divisor() if valid
divisors could not be found.
Fixes: 6bf4e42f1d19 ("serial: 8250: Add support for higher baud rates to Pericom chips") Cc: stable <stable@vger.kernel.org> Signed-off-by: Jay Dolan <jay.dolan@accesio.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211122120604.3909-3-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix error in table for PCI_DEVICE_ID_ACCESIO_PCIE_ICM_4S that caused it
and PCI_DEVICE_ID_ACCESIO_PCIE_ICM232_4 to be missing their fourth port.
Fixes: 78d3820b9bd3 ("serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()") Cc: stable <stable@vger.kernel.org> Signed-off-by: Jay Dolan <jay.dolan@accesio.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20211122120604.3909-2-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 761ed4a94582 ("tty: serial_core: convert uart_close to use
tty_port_close") converted serial core to use tty_port_close() but
failed to notice that the transmit buffer still needs to be freed on
final close.
Not freeing the transmit buffer means that the buffer is no longer
cleared on next open so that any ioctl() waiting for the buffer to drain
might wait indefinitely (e.g. on termios changes) or that stale data can
end up being transmitted in case tx is restarted.
Furthermore, the buffer of any port that has been opened would leak on
driver unbind.
Note that the port lock is held when clearing the buffer pointer due to
the ldisc race worked around by commit a5ba1d95e46e ("uart: fix race
between uart_put_char() and uart_shutdown()").
Also note that the tty-port shutdown() callback is not called for
console ports so it is not strictly necessary to free the buffer page
after releasing the lock (cf. d72402145ace ("tty/serial: do not free
trasnmit buffer page under port lock")).
The current implementation uses 0 as lower limit for the baud rate
tolerance for tegra20 and tegra30 chips which causes isses on UART
initialization as soon as baud rate clock is lower than required even
when within the standard UART tolerance of +/- 4%.
This fix aligns the implementation with the initial commit description
of +/- 4% tolerance for tegra chips other than tegra186 and
tegra194.
Fixes: d781ec21bae6 ("serial: tegra: report clk rate errors") Cc: stable <stable@vger.kernel.org> Signed-off-by: Patrik John <patrik.john@u-blox.com> Link: https://lore.kernel.org/r/sig.19614244f8.20211123132737.88341-1-patrik.john@u-blox.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The document 'ACPI for Arm Components 1.0' defines the following
_HID mappings:
-'Prime cell UART (PL011)': ARMH0011
-'SBSA UART': ARMHB000
Use the sbsa-uart driver when a device is described with
the 'ARMHB000' _HID.
Note:
PL011 devices currently use the sbsa-uart driver instead of the
uart-pl011 driver. Indeed, PL011 devices are not bound to a clock
in ACPI. It is not possible to change their baudrate.
The CONSOLE_POLLING mode is used for tools like k(g)db. In this kind of
setup, it is often sharing a serial device with the normal system console.
This is usually no problem because the polling helpers can consume input
values directly (when in kgdb context) and the normal Linux handlers can
only consume new input values after kgdb switched back.
This is not true anymore when RX DMA is enabled for UARTDM controllers.
Single input values can no longer be received correctly. Instead following
seems to happen:
* on 1. input, some old input is read (continuously)
* on 2. input, two old inputs are read (continuously)
* on 3. input, three old input values are read (continuously)
* on 4. input, 4 previous inputs are received
This repeats then for each group of 4 input values.
This behavior changes slightly depending on what state the controller was
when the first input was received. But this makes working with kgdb
basically impossible because control messages are always corrupted when
kgdboc tries to parse them.
RX DMA should therefore be off when CONSOLE_POLLING is enabled to avoid
these kind of problems. No such problem was noticed for TX DMA.
Fixes: 99693945013a ("tty: serial: msm: Add RX DMA support") Cc: stable@vger.kernel.org Signed-off-by: Sven Eckelmann <sven@narfation.org> Link: https://lore.kernel.org/r/20211113121050.7266-1-sven@narfation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The trampoline_pgd only maps the 0xfffffff000000000-0xffffffffffffffff
range of kernel memory (with 4-level paging). This range contains the
kernel's text+data+bss mappings and the module mapping space but not the
direct mapping and the vmalloc area.
This is enough to get the application processors out of real-mode, but
for code that switches back to real-mode the trampoline_pgd is missing
important parts of the address space. For example, consider this code
from arch/x86/kernel/reboot.c, function machine_real_restart() for a
64-bit kernel:
The code switches to the trampoline_pgd, which unmaps the direct mapping
and also the kernel stack. The call to cr4_clear_bits() will find no
stack and crash the machine. The real_mode_header pointer below points
into the direct mapping, and dereferencing it also causes a crash.
The reason this does not crash always is only that kernel mappings are
global and the CR3 switch does not flush those mappings. But if theses
mappings are not in the TLB already, the above code will crash before it
can jump to the real-mode stub.
Extend the trampoline_pgd to contain all kernel mappings to prevent
these crashes and to make code which runs on this page-table more
robust.
Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20211202153226.22946-5-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There are cases that the TSC clocksource is wrongly judged as unstable by
the clocksource watchdog mechanism which tries to validate the TSC against
HPET, PM_TIMER or jiffies. While there is hardly a general reliable way to
check the validity of a watchdog, Thomas Gleixner proposed [1]:
"I'm inclined to lift that requirement when the CPU has:
After two decades of horrors we're finally at a point where TSC seems
to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
was really key as we can now detect even small modifications reliably
and the important point is that we can cure them as well (not pretty
but better than all other options)."
As feature #3 X86_FEATURE_NONSTOP_TSC_S3 only exists on several generations
of Atom processorz, and is always coupled with X86_FEATURE_CONSTANT_TSC
and X86_FEATURE_NONSTOP_TSC, skip checking it, and also be more defensive
to use maximal 2 sockets.
The check is done inside tsc_init() before registering 'tsc-early' and
'tsc' clocksources, as there were cases that both of them had been
wrongly judged as unreliable.
For more background of tsc/watchdog, there is a good summary in [2]
[tglx} Update vs. jiffies:
On systems where the only remaining clocksource aside of TSC is jiffies
there is no way to make this work because that creates a circular
dependency. Jiffies accuracy depends on not missing a periodic timer
interrupt, which is not guaranteed. That could be detected by TSC, but as
TSC is not trusted this cannot be compensated. The consequence is a
circulus vitiosus which results in shutting down TSC and falling back to
the jiffies clocksource which is even more unreliable.
Fixes: 6e3cd95234dc ("x86/hpet: Use another crystalball to evaluate HPET usability") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211117023751.24190-2-feng.tang@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The TSC_ADJUST register is checked every time a CPU enters idle state, but
Thomas Gleixner mentioned there is still a caveat that a system won't enter
idle [1], either because it's too busy or configured purposely to not enter
idle.
Setup a periodic timer (every 10 minutes) to make sure the check is
happening on a regular base.
Fixes: 6e3cd95234dc ("x86/hpet: Use another crystalball to evaluate HPET usability") Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211117023751.24190-1-feng.tang@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In cdnsp_endpoint_init(), cdnsp_ring_alloc() is assigned to pep->ring
and there is a dereference of it in cdnsp_endpoint_init(), which could
lead to a NULL pointer dereference on failure of cdnsp_ring_alloc().
Fix this bug by adding a check of pep->ring.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_USB_CDNSP_GADGET=y show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@vger.kernel.org> Acked-by: Pawel Laszczak <pawell@cadence.com> Acked-by: Peter Chen <peter.chen@kernel.org> Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130172700.206650-1-zhou1615@umn.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This issue was found at android12 MTP.
1. MTP submit many out urb request.
2. Cancel left requests (>20) when enough data get from host
3. Send ACK by IN endpoint.
4. MTP submit new out urb request.
5. 4's urb never complete.
Actually DMA pos already bigger than previous submit request afbccb7d's TRB (184-184). The reason of (not handled) is that deq position is wrong.
The TRB link is below when irq happen.
DEQ LINK LINK LINK LINK LINK .... TRB(afbccb7d):START DMA(EP_TRADDR).
Original code check LINK TRB, but DEQ just move one step.
LINK DEQ LINK LINK LINK LINK .... TRB(afbccb7d):START DMA(EP_TRADDR).
This patch skip all LINK TRB and sync DEQ to trb's start.
LINK LINK LINK LINK LINK .... DEQ = TRB(afbccb7d):START DMA(EP_TRADDR).
Acked-by: Peter Chen <peter.chen@kernel.org> Cc: stable <stable@vger.kernel.org> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Jun Li <jun.li@nxp.com> Link: https://lore.kernel.org/r/20211130154239.8029-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Stub from the spec:
"4.5.2.2.4.2 Exiting from AttachWait.SNK State
A Sink shall transition to Unattached.SNK when the state of both
the CC1 and CC2 pins is SNK.Open for at least tPDDebounce.
A DRP shall transition to Unattached.SRC when the state of both
the CC1 and CC2 pins is SNK.Open for at least tPDDebounce."
This change makes TCPM to wait in SNK_DEBOUNCED state until
CC1 and CC2 pins is SNK.Open for at least tPDDebounce. Previously,
TCPM resets the port if vbus is not present in PD_T_PS_SOURCE_ON.
This causes TCPM to loop continuously when connected to a
faulty power source that does not present vbus. Waiting in
SNK_DEBOUNCED also ensures that TCPM is adherant to
"4.5.2.2.4.2 Exiting from AttachWait.SNK State" requirements.
This is another branded 8153 device that doesn't work well with LPM:
r8152 2-2.1:1.0 enp0s13f0u2u1: Stop submitting intr, status -71
Disable LPM to resolve the issue.
Signed-off-by: Ole Ernst <olebowle@gmx.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211127090546.52072-1-olebowle@gmx.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Turns out some xHC controllers require all 64 bits in the CRCR register
to be written to execute a command abort.
The lower 32 bits containing the command abort bit is written first.
In case the command ring stops before we write the upper 32 bits then
hardware may use these upper bits to set the commnd ring dequeue pointer.
Solve this by making sure the upper 32 bits contain a valid command
ring dequeue pointer.
The original patch that only wrote the first 32 to stop the ring went
to stable, so this fix should go there as well.
Fixes: ff0e50d3564f ("xhci: Fix command ring pointer corruption while aborting a command") Cc: stable@vger.kernel.org Tested-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20211126122340.1193239-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
caused by `c->vc_cell_height' not having been initialized. This has
only started to trigger with commit 860dafa90259 ("vt: Fix character
height handling with VT_RESIZEX"), however the ultimate offender is
commit 50ec42edd978 ("[PATCH] Detaching fbcon: fix vgacon to allow
retaking of the console").
Said commit has added a call to `vc_resize' whenever `vgacon_init' is
called with the `init' argument set to 0, which did not happen before.
And the call is made before a key vgacon boot parameter retrieved in
`vgacon_startup' has been propagated in `vgacon_init' for `vc_resize' to
use to the console structure being worked on. Previously the parameter
was `c->vc_font.height' and now it is `c->vc_cell_height'.
In this particular scenario the registration of fbcon has failed and vt
resorts to vgacon. Now fbcon does have initialized `c->vc_font.height'
somehow, unlike `c->vc_cell_height', which is why this code did not
crash before, but either way the boot parameters should have been copied
to the console structure ahead of the call to `vc_resize' rather than
afterwards, so that first the call has a chance to use them and second
they do not change the console structure to something possibly different
from what was used by `vc_resize'.
Move the propagation of the vgacon boot parameters ahead of the call to
`vc_resize' then. Adjust the comment accordingly.
Fixes: 50ec42edd978 ("[PATCH] Detaching fbcon: fix vgacon to allow retaking of the console") Cc: stable@vger.kernel.org # v2.6.18+ Reported-by: Wim Osterholt <wim@djo.tudelft.nl> Reported-by: Pavel V. Panteleev <panteleev_p@mcst.ru> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Link: https://lore.kernel.org/r/alpine.DEB.2.21.2110252317110.58149@angie.orcam.me.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In commit c8c3735997a3 ("parisc: Enhance detection of synchronous cr16
clocksources") I assumed that CPUs on the same physical core are syncronous.
While booting up the kernel on two different C8000 machines, one with a
dual-core PA8800 and one with a dual-core PA8900 CPU, this turned out to be
wrong. The symptom was that I saw a jump in the internal clocks printed to the
syslog and strange overall behaviour. On machines which have 4 cores (2
dual-cores) the problem isn't visible, because the current logic already marked
the cr16 clocksource unstable in this case.
This patch now marks the cr16 interval timers unstable if we have more than one
CPU in the system, and it fixes this issue.
Fixes: c8c3735997a3 ("parisc: Enhance detection of synchronous cr16 clocksources") Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v5.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On newer debian releases the debian-provided "installkernel" script is
installed in /usr/sbin. Fix the kernel install.sh script to look for the
script in this directory as well.
Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v3.13+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Default KBUILD_IMAGE to $(boot)/bzImage if a self-extracting
(CONFIG_PARISC_SELF_EXTRACT=y) kernel is to be built.
This fixes the bindeb-pkg make target.
Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
There is a small window in time during resume where the hardware
flow control signal RTS can be asserted (which allows a sender to
resume sending data to the UART) but the baud rate has not yet
been restored. This will cause corrupted data and FRAMING, OVERRUN
and BREAK errors. This is happening because the MCTRL register is
shadowed in uart_port struct and is later used during resume to set
the MCTRL register during both serial8250_do_startup() and
uart_resume_port(). Unfortunately, serial8250_do_startup()
happens before the UART baud rate is restored. The fix is to clear
the shadowed mctrl value at the end of suspend and restore it at the
end of resume.
Fixes: 41a469482de2 ("serial: 8250: Add new 8250-core based Broadcom STB driver") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Al Cooper <alcooperx@gmail.com> Link: https://lore.kernel.org/r/20211201201402.47446-1-alcooperx@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Transport Interface Receive (TIR) objects perform the packet processing and
reassembly and is also responsible for demultiplexing the packets into the
different RQs.
There are certain TIR context attributes that propagate to the pointed RQs
and applied to them (like packet_merge offloads (LRO/SHAMPO) and
tunneled_offload_en). When TIRs do not agree on attributes values, a "last
one wins" policy is applied. Hence, if not synced properly, a race between
TIR params update and a concurrent TIR create/modify operation might yield
to a mismatch between the shadow parameters in SW and the actual applied
state of the RQs in HW.
tunneled_offload_en is a fixed attribute per profile, while packet merge
offload state might be toggled and get out-of-sync. When this happens,
packet_merge offload might be working although not requested, or the
opposite.
All updates to packet_merge state and all create/modify operations of
regular redirection/steering TIRs are done under the same priv->state_lock,
so they do not run in parallel, and no race is possible.
However, there are other kind of TIRs (acceleration offloads TIRs, like TLS
TIRs) which are created on demand for each new connection without holding
the coarse priv->state_lock, hence might race.
Fix this by synchronizing all packet_merge state reads and writes against
all TIR create/modify operations. Include the modify operations of the
regular redirection steering TIRs under the new lock, for better code
layering and division of responsibilities.
This series introduces new packet merge type, therefore rename lro
functions to packet merge to support the new merge type:
- Generalize + rename mlx5e_build_tir_ctx_lro to
mlx5e_build_tir_ctx_packet_merge.
- Rename mlx5e_modify_tirs_lro to mlx5e_modify_tirs_packet_merge.
- Rename lro bit in mlx5_ifc_modify_tir_bitmask_bits to packet_merge.
- Rename lro_en in mlx5e_params to packet_merge_type type and combine
packet_merge params into one struct mlx5e_packet_merge_param.
Signed-off-by: Khalid Manaa <khalidm@nvidia.com> Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
TIR stands for transport interface receive, the TIR object is
responsible for performing all transport related operations on
the receive side like packet processing, demultiplexing the packets
to different RQ's, etc.
lro_timeout is a field in the TIR that is used to set the timeout for lro
session, this series introduces new packet merge type, therefore rename
lro_timeout to packet_merge_timeout for all packet merge types.
Signed-off-by: Ben Ben-Ishay <benishay@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Drop the "flush" param and return values to/from the TDP MMU's helper for
zapping collapsible SPTEs. Because the helper runs with mmu_lock held
for read, not write, it uses tdp_mmu_zap_spte_atomic(), and the atomic
zap handles the necessary remote TLB flush.
Similarly, because mmu_lock is dropped and re-acquired between zapping
legacy MMUs and zapping TDP MMUs, kvm_mmu_zap_collapsible_sptes() must
handle remote TLB flushes from the legacy MMU before calling into the TDP
MMU.
Fixes: e2209710ccc5d ("KVM: x86/mmu: Skip rmap operations if rmaps not allocated") Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211120045046.3940942-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
slot_handle_leaf is a misnomer because it only operates on 4K SPTEs
whereas "leaf" is used to describe any valid terminal SPTE (4K or
large page). Rename slot_handle_leaf to slot_handle_level_4k to
avoid confusion.
Making this change makes it more obvious there is a benign discrepency
between the legacy MMU and the TDP MMU when it comes to dirty logging.
The legacy MMU only iterates through 4K SPTEs when zapping for
collapsing and when clearing D-bits. The TDP MMU, on the other hand,
iterates through SPTEs on all levels.
The TDP MMU behavior of zapping SPTEs at all levels is technically
overkill for its current dirty logging implementation, which always
demotes to 4k SPTES, but both the TDP MMU and legacy MMU zap if and only
if the SPTE can be replaced by a larger page, i.e. will not spuriously
zap 2m (or larger) SPTEs. Opportunistically add comments to explain this
discrepency in the code.
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20211019162223.3935109-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Return appropriate error codes if setting up the GHCB scratch area for an
SEV-ES guest fails. In particular, returning -EINVAL instead of -ENOMEM
when allocating the kernel buffer could be confusing as userspace would
likely suspect a guest issue.
Fixes: 8f423a80d299 ("KVM: SVM: Support MMIO for an SEV-ES guest") Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20211109222350.2266045-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit d81ae8aac85c ("sched/uclamp: Fix initialization of struct
uclamp_rq") introduced a bug where uclamp_max of the rq is not reset to
match the woken up task's uclamp_max when the rq is idle.
The code was relying on rq->uclamp_max initialized to zero, so on first
enqueue
if (uc_se->value > READ_ONCE(uc_rq->value))
WRITE_ONCE(uc_rq->value, uc_se->value);
}
was actually resetting it. But since commit d81ae8aac85c changed the
default to 1024, this no longer works. And since rq->uclamp_flags is
also initialized to 0, neither above code path nor uclamp_idle_reset()
update the rq->uclamp_max on first wake up from idle.
This is only visible from first wake up(s) until the first dequeue to
idle after enabling the static key. And it only matters if the
uclamp_max of this task is < 1024 since only then its uclamp_max will be
effectively ignored.
Fix it by properly initializing rq->uclamp_flags = UCLAMP_FLAG_IDLE to
ensure uclamp_idle_reset() is called which then will update the rq
uclamp_max value as expected.
Fixes: d81ae8aac85c ("sched/uclamp: Fix initialization of struct uclamp_rq") Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lkml.kernel.org/r/20211202112033.1705279-1-qais.yousef@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
__setup() callbacks expect 1 for success and 0 for failure. Correct the
usage here to reflect that.
Fixes: 826bfeb37bb4 ("preempt/dynamic: Support dynamic preempt with preempt= boot option") Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Halaney <ahalaney@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211203233203.133581-1-ahalaney@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In the native case, PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is the
trampoline stack. But XEN pv doesn't use trampoline stack, so
PER_CPU_VAR(cpu_tss_rw + TSS_sp0) is also the kernel stack.
In that case, source and destination stacks are identical, which means
that reusing swapgs_restore_regs_and_return_to_usermode() in XEN pv
would cause %rsp to move up to the top of the kernel stack and leave the
IRET frame below %rsp.
This is dangerous as it can be corrupted if #NMI / #MC hit as either of
these events occurring in the middle of the stack pushing would clobber
data on the (original) stack.
And, with XEN pv, swapgs_restore_regs_and_return_to_usermode() pushing
the IRET frame on to the original address is useless and error-prone
when there is any future attempt to modify the code.
[ bp: Massage commit message. ]
Fixes: 7f2590a110b8 ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lkml.kernel.org/r/20211126101209.8613-4-jiangshanlai@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
removed a CR3 write in the faulting path of load_gs_index().
But the path's FENCE_SWAPGS_USER_ENTRY has no fence operation if PTI is
enabled, see spectre_v1_select_mitigation().
Rather, it depended on the serializing CR3 write of SWITCH_TO_KERNEL_CR3
and since it got removed, add a FENCE_SWAPGS_KERNEL_ENTRY call to make
sure speculation is blocked.
[ bp: Massage commit message and comment. ]
Fixes: c75890700455 ("x86/entry/64: Remove unneeded kernel CR3 switching") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211126101209.8613-3-jiangshanlai@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
18ec54fdd6d18 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
added FENCE_SWAPGS_{KERNEL|USER}_ENTRY for conditional SWAPGS. In
paranoid_entry(), it uses only FENCE_SWAPGS_KERNEL_ENTRY for both
branches. This is because the fence is required for both cases since the
CR3 write is conditional even when PTI is enabled.
But
96b2371413e8f ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry")
changed the order of SWAPGS and the CR3 write. And it missed the needed
FENCE_SWAPGS_KERNEL_ENTRY for the user gsbase case.
Add it back by changing the branches so that FENCE_SWAPGS_KERNEL_ENTRY
can cover both branches.
[ bp: Massage, fix typos, remove obsolete comment while at it. ]
Fixes: 96b2371413e8f ("x86/entry/64: Switch CR3 before SWAPGS in paranoid entry") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20211126101209.8613-2-jiangshanlai@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Properly type the operands being passed to __put_user()/__get_user().
Otherwise, these routines truncate data for dependent instructions
(e.g., INSW) and only read/write one byte.
This has been tested by sending a string with REP OUTSW to a port and
then reading it back in with REP INSW on the same port.
Previous behavior was to only send and receive the first char of the
size. For example, word operations for "abcd" would only read/write
"ac". With change, the full string is now written and read back.
Fixes: f980f9c31a923 (x86/sev-es: Compile early handler code into kernel image) Signed-off-by: Michael Sterritt <sterritt@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Marc Orr <marcorr@google.com> Reviewed-by: Peter Gonda <pgonda@google.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/20211119232757.176201-1-sterritt@google.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We don't want to be retrying task_work creation failure if there's
an actual signal pending for the parent task. If we do, then we can
enter an infinite loop of perpetually retrying and each retry failing
with -ERESTARTNOINTR because a signal is pending.
Fixes: 3146cba99aa2 ("io-wq: make worker creation resilient against signals") Reported-by: Florian Fischer <florian.fl.fischer@fau.de> Link: https://lore.kernel.org/io-uring/20211202165606.mqryio4yzubl7ms5@pasture/ Tested-by: Florian Fischer <florian.fl.fischer@fau.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We need to test against the existing route type, not
the rtm_type in the netlink request.
Fixes: 83f0a0b7285b ("mctp: Specify route types, require rtm_type in RTM_*ROUTE messages") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The error paths in the prepare_vmcs02() function are supposed to set
*entry_failure_code but this path does not. It leads to using an
uninitialized variable in the caller.
Fixes: 71f7347025bf ("KVM: nVMX: Load GUEST_IA32_PERF_GLOBAL_CTRL MSR on VM-Entry") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Message-Id: <20211130125337.GB24578@kili> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
dmesg will report a #GP warning from an unchecked MSR access
error on MSR_F15H_PERF_CTLx.
This is because according to APM (Revision: 4.03) Figure 13-7,
the bits [35:32] of AMD PerfEvtSeln register is a part of the
event select encoding, which extends the EVENT_SELECT field
from 8 bits to 12 bits.
Opportunistically update pmu->reserved_bits for reserved bit 19.
Reported-by: Jim Mattson <jmattson@google.com> Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM") Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20211118130320.95997-1-likexu@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Always use MLX5_FLOW_TABLE_OTHER_VPORT flag when creating egress ACL
table for single FDB. Not doing so on BlueField will make firmware fail
the command. On BlueField the E-Switch manager is the ECPF (vport 0xFFFE)
which is filled in the flow table creation command but as the
other_vport field wasn't set the firmware complains about a bad parameter.
This is different from a regular HCA where the E-Switch manager vport is
the PF (vport 0x0). Passing MLX5_FLOW_TABLE_OTHER_VPORT will make the
firmware happy both on BlueField and on regular HCAs without special
condition for each.
This fixes the bellow firmware syndrome:
mlx5_cmd_check:819:(pid 571): CREATE_FLOW_TABLE(0x930) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x754a4)
Fixes: db202995f503 ("net/mlx5: E-Switch, add logic to enable shared FDB") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
To enable transmit schduler on vport FW require non-zero configuration
for vport's TSAR. If vport added to the group which has configured BW
share value and TX rate values of the vport are zero, then scheduler
wouldn't be enabled on this vport.
Fix that by calling BW normalization if BW share of the new group is
configured.
Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The cited patch added the IPsec support to uplink representor, however
as uplink representors have his private statistics where IPsec stats
is not part of it, that effectively makes IPsec stats hidden when uplink
representor stats queried.
Resolve by adding IPsec stats to uplink representor private statistics.
Fixes: 5589b8f1a2c7 ("net/mlx5e: Add IPsec support to uplink representor") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
This was broken before the introduction of KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM,
but technically harmless because the region list was unused for a mirror
VM. However, it is untidy and it now causes a NULL pointer access when
attempting to move the encryption context of a mirror VM.
Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context")
Message-Id: <20211123005036.2954379-7-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If the is an L1 with nNPT in 32bit, the shadow walk starts with
pae_root.
Fixes: a717a780fc4e ("KVM: x86/mmu: Support shadowing NPT when 5-level paging is enabled in host) Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <20211124122055.64424-2-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Should an error occur (invalid TLV len or memory allocation failure), the
memory already allocated in 'reduce_power_data' should be freed before
returning, otherwise it is leaking.
When converting to use an idr to map userspace fence seqno values back
to a dma_fence, we lost the error return when userspace passes seqno
that is larger than the last submitted fence. Restore this check.
Reported-by: Akhil P Oommen <akhilpo@codeaurora.org> Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org> Link: https://lore.kernel.org/r/20211111192457.747899-3-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
We weren't dropping the submitqueue reference in all paths. In
particular, when the fence has already been signalled. Split out
a helper to simplify handling this in the various different return
paths.
Fixes: a61acbbe9cf8 ("drm/msm: Track "seqno" fences by idr") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211111192457.747899-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In commit 510410bfc034 ("drm/msm: Implement mmap as GEM object
function") we switched to a new/cleaner method of doing things. That's
good, but we missed a little bit.
Before that commit, we used to _first_ run through the
drm_gem_mmap_obj() case where `obj->funcs->mmap()` was NULL. That meant
that we ran:
...and _then_ we modified those mappings with our own. Now that
`obj->funcs->mmap()` is no longer NULL we don't run the default
code. It looks like the fact that the vm_flags got VM_IO / VM_DONTDUMP
was important because we're now getting crashes on Chromebooks that
use ARC++ while logging out. Specifically a crash that looks like this
(this is on a 5.10 kernel w/ relevant backports but also seen on a
5.15 kernel):
Unable to handle kernel paging request at virtual address ffffffc008000000
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=000000008293d000
[ffffffc008000000] pgd=00000001002b3003, p4d=00000001002b3003,
pud=00000001002b3003, pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
[...]
CPU: 7 PID: 15734 Comm: crash_dump64 Tainted: G W 5.10.67 #1 [...]
Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
pc : __arch_copy_to_user+0xc0/0x30c
lr : copyout+0xac/0x14c
[...]
Call trace:
__arch_copy_to_user+0xc0/0x30c
copy_page_to_iter+0x1a0/0x294
process_vm_rw_core+0x240/0x408
process_vm_rw+0x110/0x16c
__arm64_sys_process_vm_readv+0x30/0x3c
el0_svc_common+0xf8/0x250
do_el0_svc+0x30/0x80
el0_svc+0x10/0x1c
el0_sync_handler+0x78/0x108
el0_sync+0x184/0x1c0
Code: f8408423f80008c3910020c636100082 (b8404423)
Let's add the two flags back in.
While we're at it, the fact that we aren't running the default means
that we _don't_ need to clear out VM_PFNMAP, so remove that and save
an instruction.
NOTE: it was confirmed that VM_IO was the important flag to fix the
problem I was seeing, but adding back VM_DONTDUMP seems like a sane
thing to do so I'm doing that too.
Fixes: 510410bfc034 ("drm/msm: Implement mmap as GEM object function") Reported-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Tested-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20211110113334.1.I1687e716adb2df746da58b508db3f25423c40b27@changeid Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Reported-by: Douglas Anderson <dianders@chromium.org> Fixes: 9bc95570175a ("drm/msm: Devfreq tuning") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Tested-By: Steev Klimaszewski <steev@kali.org> Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org> Link: https://lore.kernel.org/r/20211105202021.181092-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Since tlb flush has been done for legacy MMU before
kvm_tdp_mmu_zap_collapsible_sptes(), so the parameter flush
should be false for kvm_tdp_mmu_zap_collapsible_sptes().
Fixes: e2209710ccc5d ("KVM: x86/mmu: Skip rmap operations if rmaps not allocated") Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Message-Id: <21453a1d2533afb6e59fb6c729af89e771ff2e76.1637140154.git.houwenlong93@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
If the parameter flush is set, zap_gfn_range() would flush remote tlb
when yield, then tlb flush is not needed outside. So use the return
value of zap_gfn_range() directly instead of OR on it in
kvm_unmap_gfn_range() and kvm_tdp_mmu_unmap_gfn_range().
Fixes: 3039bcc744980 ("KVM: Move x86's MMU notifier memslot walkers to generic code") Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com>
Message-Id: <5e16546e228877a4d974f8c0e448a93d52c7a5a9.1637140154.git.houwenlong93@linux.alibaba.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
At the late production stages new dev ids were introduced. These are
now in production, so its important for the driver to recognize these.
And also fix the board caps for AQC115C adapter.
Fixes: b3f0c79cba206 ("net: atlantic: A2 hw_ops skeleton") Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The correct way to reflect firmware version is to use bundle version.
Hence populating the same instead of MAC fw version.
Fixes: c1be0bf092bd2 ("net: atlantic: common functions needed for basic A2 init/deinit hw_ops") Signed-off-by: Sameer Saurabh <ssaurabh@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When 2.5G is advertised, N-Base should be advertised against the T-base
caps. N5G is out of use in baseline code and driver should treat both 5G
and N5G (and also 2.5G and N2.5G) equally from user perspective.
Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Nikita Danilov <ndanilov@aquantia.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The max waiting period (of 1 ms) while reading the data from FW shared
buffer is too small for certain types of data (e.g., stats). There's a
chance that FW could be updating buffer at the same time and driver
would be unsuccessful in reading data. Firmware manual recommends to
have 1 sec timeout to fix this issue.
Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Dmitry Bogdanov <dbezrukov@marvell.com> Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Our current code is supposed to serialise the commits by waiting for all
the drm_crtc_commits associated to the previous HVS state.
However, assuming we have two CRTCs running and being configured and we
configure each one alternately, we end up in a situation where we're
not waiting at all.
Indeed, starting with a state (state 0) where both CRTCs are running,
and doing a commit (state 1) on the first CRTC (CRTC 0), we'll associate
its commit to its assigned FIFO in vc4_hvs_state.
If we get a new commit (state 2), this time affecting the second CRTC
(CRTC 1), the DRM core will allow both commits to execute in parallel
(assuming they don't have any share resources).
Our code in vc4_atomic_commit_tail is supposed to make sure we only get
one commit at a time and serialised by order of submission. It does so
by using for_each_old_crtc_in_state, making sure that the CRTC has a
FIFO assigned, is used, and has a commit pending. If it does, then we'll
wait for the commit before going forward.
During the transition from state 0 to state 1, as our old CRTC state we
get the CRTC 0 state 0, its commit, we wait for it, everything works fine.
During the transition from state 1 to state 2 though, the use of
for_each_old_crtc_in_state is wrong. Indeed, while the code assumes it's
returning the state of the CRTC in the old state (so CRTC 0 state 1), it
actually returns the old state of the CRTC affected by the current
commit, so CRTC 0 state 0 since it wasn't part of state 1.
Due to this, if we alternate between the configuration of CRTC 0 and
CRTC 1, we never actually wait for anything since we should be waiting
on the other every time, but it never is affected by the previous
commit.
Change the logic to, at every commit, look at every FIFO in the previous
HVS state, and if it's in use and has a commit associated to it, wait
for that commit.
Fixes: 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a commit") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-by: Jian-Hong Pan <jhp@endlessos.org> Link: https://lore.kernel.org/r/20211117094527.146275-7-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Our HVS global state, when duplicated, will also copy the pointer to the
drm_crtc_commit (and increase the reference count) for each FIFO if the
pointer is not NULL.
However, our atomic_setup function will overwrite that pointer without
putting the reference back leading to a memory leak.
Since the commit is only relevant during the atomic commit process, it
doesn't make sense to duplicate the reference to the commit anyway.
Let's remove it.
Fixes: 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a commit") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-by: Jian-Hong Pan <jhp@endlessos.org> Link: https://lore.kernel.org/r/20211117094527.146275-6-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a
commit") introduced a wait on the previous commit done on a given HVS
FIFO.
However, we never cleared that pointer once done. Since
drm_crtc_commit_put can free the drm_crtc_commit structure directly if
we were the last user, this means that it can lead to a use-after free
if we were to duplicate the state, and that stale pointer would even be
copied to the new state.
Set the pointer to NULL once we're done with the wait so that we don't
carry over a pointer to a free'd structure.
Fixes: 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a commit") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-by: Jian-Hong Pan <jhp@endlessos.org> Link: https://lore.kernel.org/r/20211117094527.146275-5-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Commit 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a
commit") introduced a global state for the HVS, with each FIFO storing
the current CRTC commit so that we can properly synchronize commits.
However, the refcounting was off and we thus ended up leaking the
drm_crtc_commit structure every commit. Add a drm_crtc_commit_put to
prevent the leakage.
Fixes: 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a commit") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-by: Jian-Hong Pan <jhp@endlessos.org> Link: https://lore.kernel.org/r/20211117094527.146275-4-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The HVS global state functions return an error pointer, but in most
cases we check if it's NULL, possibly resulting in an invalid pointer
dereference.
Fixes: 9ec03d7f1ed3 ("drm/vc4: kms: Wait on previous FIFO users before a commit") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-by: Jian-Hong Pan <jhp@endlessos.org> Link: https://lore.kernel.org/r/20211117094527.146275-3-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Several DRM/KMS atomic commits can run in parallel if they affect
different CRTC. These commits share the global HVS state, so we have
some code to make sure we run commits in sequence. This synchronization
code is one of the first thing that runs in vc4_atomic_commit_tail().
Another constraints we have is that we need to make sure the HVS clock
gets a boost during the commit. That code relies on clk_set_min_rate and
will remove the old minimum and set a new one. We also need another,
temporary, minimum for the duration of the commit.
The algorithm is thus to set a temporary minimum, drop the previous
one, do the commit, and finally set the minimum for the current mode.
However, the part that sets the temporary minimum and drops the older
one runs before the commit synchronization code.
Thus, under the proper conditions, we can end up mixing up the minimums
and ending up with the wrong one for our current step.
To avoid it, let's move the clock setup in the protected section.
Fixes: d7d96c00e585 ("drm/vc4: hvs: Boost the core clock during modeset") Signed-off-by: Maxime Ripard <maxime@cerno.tech> Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com> Tested-by: Jian-Hong Pan <jhp@endlessos.org> Link: https://lore.kernel.org/r/20211117094527.146275-2-maxime@cerno.tech Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In particular, we need to ensure all the necessary blocks are switched
to 64b mode (a5xx+) otherwise the high bits of the address of the BO to
snapshot state into will be ignored, resulting in:
Fixes: 4f776f4511c7 ("drm/msm/gpu: Convert the GPU show function to use the GPU state") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211108180122.487859-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In commit 142639a52a01 ("drm/msm/a6xx: fix crashstate capture for
A650") we changed a6xx_get_gmu_registers() to read 3 sets of
registers. Unfortunately, we didn't change the memory allocation for
the array. That leads to a KASAN warning (this was on the chromeos-5.4
kernel, which has the problematic commit backported to it):
BUG: KASAN: slab-out-of-bounds in _a6xx_get_gmu_registers+0x144/0x430
Write of size 8 at addr ffffff80c89432b0 by task A618-worker/209
CPU: 5 PID: 209 Comm: A618-worker Tainted: G W 5.4.156-lockdep #22
Hardware name: Google Lazor Limozeen without Touchscreen (rev5 - rev8) (DT)
Call trace:
dump_backtrace+0x0/0x248
show_stack+0x20/0x2c
dump_stack+0x128/0x1ec
print_address_description+0x88/0x4a0
__kasan_report+0xfc/0x120
kasan_report+0x10/0x18
__asan_report_store8_noabort+0x1c/0x24
_a6xx_get_gmu_registers+0x144/0x430
a6xx_gpu_state_get+0x330/0x25d4
msm_gpu_crashstate_capture+0xa0/0x84c
recover_worker+0x328/0x838
kthread_worker_fn+0x32c/0x574
kthread+0x2dc/0x39c
ret_from_fork+0x10/0x18
Fixes: 142639a52a01 ("drm/msm/a6xx: fix crashstate capture for A650") Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20211103153049.1.Idfa574ccb529d17b69db3a1852e49b580132035c@changeid Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When smc_close_final() returns error, the return code overwrites by
kernel_sock_shutdown() in smc_close_active(). The return code of
smc_close_final() is more important than kernel_sock_shutdown(), and it
will pass to userspace directly.
Fix it by keeping both return codes, if smc_close_final() raises an
error, return it or kernel_sock_shutdown()'s.
Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Suggested-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Correct an error where setting /proc/sys/net/rds/tcp/rds_tcp_rcvbuf would
instead modify the socket's sk_sndbuf and would leave sk_rcvbuf untouched.
Fixes: c6a58ffed536 ("RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket") Signed-off-by: William Kucharski <william.kucharski@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
smc_lgr_cleanup_early() meant to delete the link
group from the link group list, but it deleted
the list head by mistake.
This may cause memory corruption since we didn't
remove the real link group from the list and later
memseted the link group structure.
We got a list corruption panic when testing:
Fixes: a0a62ee15a829 ("net/smc: separate locks for SMCD and SMCR link group lists") Signed-off-by: Dust Li <dust.li@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh")
changes to net->ipv4.fib_num_tclassid_users were protected by RTNL.
After the change, this is no longer the case, as free_fib_info_rcu()
runs after rcu grace period, without rtnl being held.
Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 28712 Comm: syz-executor.0 Tainted: G W 5.16.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20211130170155.2331929-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In rvu_mbox_init(), mbox_regions is not freed or passed out
under the switch-default region, which could lead to a memory leak.
Fix this bug by changing 'return err' to 'goto free_regions'.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_OCTEONTX2_AF=y show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: 98c561116360 (“octeontx2-af: cn10k: Add mbox support for CN10K platform”) Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130165039.192426-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The commit c55211892f46 ("dpaa2-eth: support PTP Sync packet one-step
timestamping") forgets to destroy workqueue at the end of remove
function.
Fix this by adding destroy_workqueue before fsl_mc_portal_free and
free_netdev.
Fixes: c55211892f46 ("dpaa2-eth: support PTP Sync packet one-step timestamping") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
'bitmap_fill()' fills a bitmap one 'long' at a time.
It is likely that an exact number of bits is expected.
Use 'bitmap_set()' instead in order not to set unexpected bits.
Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On most systems request for IRQ 0 will fail, phylib will print an error message
and fall back to polling. To fix this set the phydev->irq to PHY_POLL if no IRQ
is available.
Fixes: cc89c323a30e ("lan78xx: Use irq_domain for phy interrupt from USB Int. EP") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Sven Schuchmann <schuchmann@schleissheimer.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The driver assumes that split headers can be enabled/disabled without
stopping/starting the device, so it writes DMA_CHAN_CONTROL from
stmmac_set_features(). However, on my system (IP v5.10a without Split
Header support), simply writing DMA_CHAN_CONTROL when DMA is running
(for example, with the commands below) leads to a TX watchdog timeout.
Note that since my IP is configured without Split Header support, the
driver always just reads and writes the same value to the
DMA_CHAN_CONTROL register.
I don't have access to any platforms with Split Header support so I
don't know if these writes to the DMA_CHAN_CONTROL while DMA is running
actually work properly on such systems. I could not find anything in
the databook that says that DMA_CHAN_CONTROL should not be written when
the DMA is running.
But on systems without Split Header support, there is in any case no
need to call enable_sph() in stmmac_set_features() at all since SPH can
never be toggled, so we can avoid the watchdog timeout there by skipping
this call.
Fixes: 8c6fc097a2f4acf ("net: stmmac: gmac4+: Add Split Header support") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Need to call rxrpc_put_local() for peer candidate before kfree() as it
holds a ref to rxrpc_local.
[DH: v2: Changed to abstract the peer freeing code out into a function]
Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/all/20211121041608.133740-2-eiichi.tsukata@nutanix.com/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kcontrol put callback is expected to return 1 when there is change
in HW or when the update is acknowledged by driver. This would ensure
that change notifications are sent to subscribed applications. Update
the AHUB driver accordingly.
Fixes: 16e1bcc2caf4 ("ASoC: tegra: Add Tegra210 based AHUB driver") Signed-off-by: Sameer Pujar <spujar@nvidia.com> Suggested-by: Jaroslav Kysela <perex@perex.cz> Suggested-by: Mark Brown <broonie@kernel.org> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/1637219231-406-12-git-send-email-spujar@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kcontrol put callback is expected to return 1 when there is change
in HW or when the update is acknowledged by driver. This would ensure
that change notifications are sent to subscribed applications. Update
the DSPK driver accordingly.
Fixes: 327ef6470266 ("ASoC: tegra: Add Tegra186 based DSPK driver") Suggested-by: Jaroslav Kysela <perex@perex.cz> Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sameer Pujar <spujar@nvidia.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/1637219231-406-11-git-send-email-spujar@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kcontrol put callback is expected to return 1 when there is change
in HW or when the update is acknowledged by driver. This would ensure
that change notifications are sent to subscribed applications. Update
the DMIC driver accordingly.
Fixes: 8c8ff982e9e2 ("ASoC: tegra: Add Tegra210 based DMIC driver") Suggested-by: Jaroslav Kysela <perex@perex.cz> Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sameer Pujar <spujar@nvidia.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/1637219231-406-10-git-send-email-spujar@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kcontrol put callback is expected to return 1 when there is change
in HW or when the update is acknowledged by driver. This would ensure
that change notifications are sent to subscribed applications. Update
the I2S driver accordingly.
Fixes: c0bfa98349d1 ("ASoC: tegra: Add Tegra210 based I2S driver") Suggested-by: Jaroslav Kysela <perex@perex.cz> Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sameer Pujar <spujar@nvidia.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/1637219231-406-9-git-send-email-spujar@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
The kcontrol put callback is expected to return 1 when there is change
in HW or when the update is acknowledged by driver. This would ensure
that change notifications are sent to subscribed applications. Update
the ADMAIF driver accordingly.
Fixes: f74028e159bb ("ASoC: tegra: Add Tegra210 based ADMAIF driver") Suggested-by: Jaroslav Kysela <perex@perex.cz> Suggested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sameer Pujar <spujar@nvidia.com> Reviewed-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/1637219231-406-8-git-send-email-spujar@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fixes: 37f4ca907c46 ("mt76: mt7915: register per-phy HE capabilities for each interface") Fixes: e57b7901469f ("mt76: add mac80211 driver for MT7915 PCIe-based chipsets") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/ddae419a740f1fb9e48afd432035e9f394f512ee.1637239456.git.lorenzo@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Currently autoloading for SPI devices does not use the DT ID table, it
uses SPI modalises. Supporting OF modalises is going to be difficult if
not impractical, an attempt was made but has been reverted, so ensure
that module autoloading works for this driver by adding an id_table
listing the SPI IDs for everything.
Fixes: 96c8395e2166 ("spi: Revert modalias changes") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
ipv6_addr_bind/ipv4_addr_bind are function names. Previously, bind test
would not be run by default due to the wrong case names
Fixes: 34d0302ab861 ("selftests: Add ipv6 address bind tests to fcnal-test") Fixes: 75b2b2b3db4c ("selftests: Add ipv4 address bind tests to fcnal-test") Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In mlx4_en_try_alloc_resources(), mlx4_en_copy_priv() is called and
tmp->tx_cq will be freed on the error path of mlx4_en_copy_priv().
After that mlx4_en_alloc_resources() is called and there is a dereference
of &tmp->tx_cq[t][i] in mlx4_en_alloc_resources(), which could lead to
a use after free problem on failure of mlx4_en_copy_priv().
Fix this bug by adding a check of mlx4_en_copy_priv()
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_MLX4_EN=m show no new warnings,
and our static analyzer no longer warns about this code.
Fixes: ec25bc04ed8e ("net/mlx4_en: Add resilience in low memory systems") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211130164438.190591-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Current code wrongly uses the skb->protocol field which reflects the
outer l3 protocol to set the inner l3 type in Software Parser (SWP)
fields settings in the ethernet segment (eseg) in flows where inner
l3 exists like in Vxlan over ESP flow, the above method wrongly use
the outer protocol type instead of the inner one. thus breaking cases
where inner and outer headers have different protocols.
Fix by setting the inner l3 type in SWP according to the inner l3 ip
header version.
Fixes: 2ac9cfe78223 ("net/mlx5e: IPSec, Add Innova IPSec offload TX data path") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Fix warnings produced by:
- lockdep_assert_wiphy() in function reg_process_self_managed_hint(),
- wiphy_dereference() in function iwl_mvm_init_fw_regd().
Both function are expected to be called in critical section.
The warnings were discovered when running v5.15 kernel
with debug options enabled:
Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Signed-off-by: Łukasz Bartosik <lb@semihalf.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211110215744.5487-1-lukasz.bartosik@semihalf.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
When branch target identifiers are in use, code reachable via an
indirect branch requires a BTI landing pad at the branch target site.
When building FTRACE_WITH_REGS atop patchable-function-entry, we miss
BTIs at the start start of the `ftrace_caller` and `ftrace_regs_caller`
trampolines, and when these are called from a module via a PLT (which
will use a `BR X16`), we will encounter a BTI failure, e.g.
Fix this by adding the required `BTI C`, as we only require these to be
reachable via BL for direct calls or BR X16/X17 for PLTs. For now, these
are open-coded in the function prologue, matching the style of the
`__hwasan_tag_mismatch` trampoline.
In future we may wish to consider adding a new SYM_CODE_START_*()
variant which has an implicit BTI.
When ftrace is built atop mcount, the trampolines are marked with
SYM_FUNC_START(), and so get an implicit BTI. We may need to change
these over to SYM_CODE_START() in future for RELIABLE_STACKTRACE, in
case we need to apply special care aroud the return address being
rewritten.
Fixes: 97fed779f2a6 ("arm64: bti: Provide Kconfig for kernel mode BTI") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211129135709.2274019-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
because the ordinary load/store instructions (ldr, ldrh, ldrb) can
tolerate any misalignment of the memory address. However, load/store
double and load/store multiple instructions (ldrd, ldm) may still only
be used on memory addresses that are 32-bit aligned, and so we have to
use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we
may end up with a severe performance hit due to alignment traps that
require fixups by the kernel. Testing shows that this currently happens
with clang-13 but not gcc-11. In theory, any compiler version can
produce this bug or other problems, as we are dealing with undefined
behavior in C99 even on architectures that support this in hardware,
see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363.
Fortunately, the get_unaligned() accessors do the right thing: when
building for ARMv6 or later, the compiler will emit unaligned accesses
using the ordinary load/store instructions (but avoid the ones that
require 32-bit alignment). When building for older ARM, those accessors
will emit the appropriate sequence of ldrb/mov/orr instructions. And on
architectures that can truly tolerate any kind of misalignment, the
get_unaligned() accessors resolve to the leXX_to_cpup accessors that
operate on aligned addresses.
Since the compiler will in fact emit ldrd or ldm instructions when
building this code for ARM v6 or later, the solution is to use the
unaligned accessors unconditionally on architectures where this is
known to be fast. The _aligned version of the hash function is
however still needed to get the best performance on architectures
that cannot do any unaligned access in hardware.
This new version avoids the undefined behavior and should produce
the fastest hash on all architectures we support.
There are various problems related to netlink notifications for mpls route
changes in response to interfaces being deleted:
* delete interface of only nexthop
DELROUTE notification is missing RTA_OIF attribute
* delete interface of non-last nexthop
NEWROUTE notification is missing entirely
* delete interface of last nexthop
DELROUTE notification is missing nexthop
All of these problems stem from the fact that existing routes are modified
in-place before sending a notification. Restructure mpls_ifdown() to avoid
changing the route in the DELROUTE cases and to create a copy in the
NEWROUTE case.
Fixes: f8efb73c97e2 ("mpls: multipath route support") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
In qlcnic_83xx_add_rings(), the indirect function of
ahw->hw_ops->alloc_mbx_args will be called to allocate memory for
cmd.req.arg, and there is a dereference of it in qlcnic_83xx_add_rings(),
which could lead to a NULL pointer dereference on failure of the
indirect function like qlcnic_83xx_alloc_mbx_args().
Fix this bug by adding a check of alloc_mbx_args(), this patch
imitates the logic of mbx_cmd()'s failure handling.
This bug was found by a static analyzer. The analysis employs
differential checking to identify inconsistent security operations
(e.g., checks or kfrees) between two code paths and confirms that the
inconsistent operations are not recovered in the current function or
the callers, so they constitute bugs.
Note that, as a bug found by static analysis, it can be a false
positive or hard to trigger. Multiple researchers have cross-reviewed
the bug.
Builds with CONFIG_QLCNIC=m show no new warnings, and our
static analyzer no longer warns about this code.
Fixes: 7f9664525f9c ("qlcnic: 83xx memory map and HW access routine") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Link: https://lore.kernel.org/r/20211130110848.109026-1-zhou1615@umn.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Steffen reported a TCP stream corruption for HTTP requests
served by the apache web-server using a cifs mount-point
and memory mapping the relevant file.
The root cause is quite similar to the one addressed by
commit 20eb4f29b602 ("net: fix sk_page_frag() recursion from
memory reclaim"). Here the nested access to the task page frag
is caused by a page fault on the (mmapped) user-space memory
buffer coming from the cifs file.
The page fault handler performs an smb transaction on a different
socket, inside the same process context. Since sk->sk_allaction
for such socket does not prevent the usage for the task_frag,
the nested allocation modify "under the hood" the page frag
in use by the outer sendmsg call, corrupting the stream.
The overall relevant stack trace looks like the following:
The cifs filesystem rightfully sets sk_allocations to GFP_NOFS,
we can avoid the nesting using the sk page frag for allocation
lacking the __GFP_FS flag. Do not define an additional mm-helper
for that, as this is strictly tied to the sk page frag usage.
v1 -> v2:
- use a stricted sk_page_frag() check instead of reordering the
code (Eric)
Reported-by: Steffen Froemer <sfroemer@redhat.com> Fixes: 5640f7685831 ("net: use a per task frag allocator") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>