If 'devm_regulator_get()' fails, we should go through the existing error
handling path instead of returning directly, as done is all the other
error handling paths in this function.
Commit dec08194ffec ("xhci: Limit USB2 port wake support for AMD Promontory
hosts") makes all high speed USB ports on ASUS PRIME B350M-A cease to
function after enabling runtime PM.
All boards with this chipsets will be affected, so revert the commit.
The original patch was added to stable 4.9, 4.11 and 4.12 and needs
to reverted from there as well
The flow control workaround for ASM1042A xHC hosts sleeps between
register polling. The workaround gets called in several places, among
them with spin_lock_irq() held when xHC host is resumed or hoplug removed.
This was noticed as kernel panics at resume on a Dell XPS15 9550 with
TB16 thunderbolt dock.
Avoid sleeping with spin_lock_irq() held, use udelay() instead
The original workaround was added to 4.9 and 4.12 stable releases,
this patch needs to be applied to those as well.
Fixes: 9da5a1092b13 ("xhci: Bad Ethernet performance plugged in ASM1042A host") Reported-by: Jose Marino <marinoj@nso.edu> Tested-by: Jose Marino <marinoj@nso.edu> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Commit 4c39d4b949d3 ("usb: xhci: use bus->sysdev for DMA configuration")
updated the method determining DMA for XHCI from sysdev. However, this
patch broke the ability to enumerate the FWNODE from parent ACPI devices
from the child plat XHCI device.
Currently, xhci_plat is not set up properly when the parent device is an
ACPI node. The conditions that xhci_plat_probe should satisfy are
1. xhci_plat comes from firmware
2. xhci_plat is child of a device from firmware (dwc3-plat)
3. xhci_plat is grandchild of a pci device (dwc3-pci)
Case 2 is covered when the child is an OF node (by checking
sysdev->parent->of_node), however, an ACPI parent will return NULL in
the of_node check and will thus not result in sysdev being set to
sysdev->parent
[ 17.591549] xhci-hcd: probe of xhci-hcd.6.auto failed with error -5
This change adds a check for ACPI to completely allow for condition 2.
This is done by first checking if the parent node is of type ACPI (e.g.,
dwc3-plat) and set sysdev to sysdev->parent if either of the two
following conditions are met:
1: If fwnode is empty (in the case that platform_device_add_properties
was not called on the allocated platform device)
2: fwnode exists but is not of type ACPI (this would happen if
platform_device_add_properties was called on the allocated device.
Instead of type FWNODE_ACPI, you would end up with FWNODE_PDATA)
Fixes: 4c39d4b949d3 ("usb: xhci: use bus->sysdev for DMA configuration") Tested-by: Thang Q. Nguyen <tqnguyen@apm.com> Signed-off-by: Adam Wallis <awallis@codeaurora.org> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Read the endpiont ESIT from endpiont context using correct macro.
Add a macro for reading the high bits of ESIT for Large ESIT Payload
Capable hosts (LEC=1)
In the xhci_add_endpoint(), a new ring was allocated and saved at
xhci_virt_ep->new_ring. Hence, when error happens, we need to free
the allocated ring before returning error.
Current code frees xhci_virt_ep->ring instead of the new_ring. This
patch fixes this.
Andrey Konovalov reported a possible out-of-bounds problem for a USB interface
association descriptor. He writes:
It seems there's no proper size check of a USB_DT_INTERFACE_ASSOCIATION
descriptor. It's only checked that the size is >= 2 in
usb_parse_configuration(), so find_iad() might do out-of-bounds access
to intf_assoc->bInterfaceCount.
And he's right, we don't check for crazy descriptors of this type very well, so
resolve this problem. Yet another issue found by syzkaller...
Commit e0429362ab15
("usb: Add device quirk for Logitech HD Pro Webcams C920 and C930e")
introduced quirk to workaround an issue with some Logitech webcams.
The workaround is introducing delay for some USB operations.
According to our testing, delay introduced by original commit
is not long enough and in rare cases we still see issues described
by the aforementioned commit.
This patch increases delays introduced by original commit.
Having this patch applied we do not see those problems anymore.
Andrey Konovalov reported a possible out-of-bounds problem for the
cdc_parse_cdc_header function. He writes:
It looks like cdc_parse_cdc_header() doesn't validate buflen
before accessing buffer[1], buffer[2] and so on. The only check
present is while (buflen > 0).
So fix this issue up by properly validating the buffer length matches
what the descriptor says it is.
The uas driver has a subtle bug in the way it handles alternate
settings. The uas_find_uas_alt_setting() routine returns an
altsetting value (the bAlternateSetting number in the descriptor), but
uas_use_uas_driver() then treats that value as an index to the
intf->altsetting array, which it isn't.
Normally this doesn't cause any problems because the various
alternate settings have bAlternateSetting values 0, 1, 2, ..., so the
value is equal to the index in the array. But this is not guaranteed,
and Andrey Konovalov used the syzkaller fuzzer with KASAN to get a
slab-out-of-bounds error by violating this assumption.
This patch fixes the bug by making uas_find_uas_alt_setting() return a
pointer to the altsetting entry rather than either the value or the
index. Pointers are less subject to misinterpretation.
As a holdover from the old g_file_storage gadget, the g_mass_storage
legacy gadget driver attempts to unregister itself when its main
operating thread terminates (if it hasn't been unregistered already).
This is not strictly necessary; it was never more than an attempt to
have the gadget fail cleanly if something went wrong and the main
thread was killed.
However, now that the UDC core manages gadget drivers independently of
UDC drivers, this scheme doesn't work any more. A simple test:
What happens is that removing the dummy-hcd driver causes the UDC core
to unbind the gadget driver, which it does while holding the udc_lock
mutex. The unbind routine in g_mass_storage tells the main thread to
exit and waits for it to terminate.
But as mentioned above, when the main thread exits it tries to
unregister the mass-storage function driver. Via the composite
framework this ends up calling usb_gadget_unregister_driver(), which
tries to acquire the udc_lock mutex. The result is deadlock.
The simplest way to fix the problem is not to be so clever: The main
thread doesn't have to unregister the function driver. The side
effects won't be so terrible; if the gadget is still attached to a USB
host when the main thread is killed, it will appear to the host as
though the gadget's firmware has crashed -- a reasonably accurate
interpretation, and an all-too-common occurrence for USB mass-storage
devices.
In fact, the code to unregister the driver when the main thread exits
is specific to g-mass-storage; it is not used when f-mass-storage is
included as a function in a larger composite device. Therefore the
entire mechanism responsible for this (the fsg_operations structure
with its ->thread_exits method, the fsg_common_set_ops() routine, and
the msg_thread_exits() callback routine) can all be eliminated. Even
the msg_registered bitflag can be removed, because now the driver is
unregistered in only one place rather than in two places.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The user buffer has "uurb->buffer_length" bytes. If the kernel has more
information than that, we should truncate it instead of writing past
the end of the user's buffer. I added a WARN_ONCE() to help the user
debug the issue.
Reported-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
There used to be an integer overflow check in proc_do_submiturb() but
we removed it. It turns out that it's still required. The
uurb->buffer_length variable is a signed integer and it's controlled by
the user. It can lead to an integer overflow when we do:
A recent change to the synchronization in dummy-hcd was incorrect.
The issue was that dummy_udc_stop() contained no locking and therefore
could race with various gadget driver callbacks, and the fix was to
add locking and issue the callbacks with the private spinlock held.
UDC drivers aren't supposed to do this. Gadget driver callback
routines are allowed to invoke functions in the UDC driver, and these
functions will generally try to acquire the private spinlock. This
would deadlock the driver.
The correct solution is to drop the spinlock before issuing callbacks,
and avoid races by emulating the synchronize_irq() call that all real
UDC drivers must perform in their ->udc_stop() routines after
disabling interrupts. This involves adding a flag to dummy-hcd's
private structure to keep track of whether interrupts are supposed to
be enabled, and adding a counter to keep track of ongoing callbacks so
that dummy_udc_stop() can wait for them all to finish.
A real UDC driver won't receive disconnect, reset, suspend, resume, or
setup events once it has disabled interrupts. dummy-hcd will receive
them but won't try to issue any gadget driver callbacks, which should
be just as good.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: f16443a034c7 ("USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks") Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The dummy-hcd HCD/UDC emulator tries not to do too much work during
each timer interrupt. But it doesn't try very hard; currently all
it does is limit the total amount of bulk data transferred. Other
transfer types aren't limited, and URBs that transfer no data (because
of an error, perhaps) don't count toward the limit, even though on a
real USB bus they would consume at least a minimum overhead.
This means it's possible to get the driver stuck in an infinite loop,
for example, if the host class driver resubmits an URB every time it
completes (which is common for interrupt URBs). Each time the URB is
resubmitted it gets added to the end of the pending-URBs list, and
dummy-hcd doesn't stop until that list is empty. Andrey Konovalov was
able to trigger this failure mode using the syzkaller fuzzer.
This patch fixes the infinite-loop problem by restricting the URBs
handled during each timer interrupt to those that were already on the
pending list when the interrupt routine started. Newly added URBs
won't be processed until the next timer interrupt. The problem of
properly accounting for non-bulk bandwidth (as well as packet and
transaction overhead) is not addressed here.
The dummy-hcd UDC driver is not careful about the way it handles
connection speeds. It ignores the module parameter that is supposed
to govern the maximum connection speed and it doesn't set the HCD
flags properly for the case where it ends up running at full speed.
The result is that in many cases, gadget enumeration over dummy-hcd
fails because the bMaxPacketSize byte in the device descriptor is set
incorrectly. For example, the default settings call for a high-speed
connection, but the maxpacket value for ep0 ends up being set for a
Super-Speed connection.
This patch fixes the problem by initializing the gadget's max_speed
and the HCD flags correctly.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The driver will forward errors to userspace after turning most of them
into -EIO. But all status codes are not equal. The -EPIPE (stall) in
particular can be seen more as a result of normal USB signaling than
an actual error. The state is automatically cleared by the USB core
without intervention from either driver or userspace.
And most devices and firmwares will never trigger a stall as a result
of GetEncapsulatedResponse. This is in fact a requirement for CDC WDM
devices. Quoting from section 7.1 of the CDC WMC spec revision 1.1:
The function shall not return STALL in response to
GetEncapsulatedResponse.
But this driver is also handling GetEncapsulatedResponse on behalf of
the qmi_wwan and cdc_mbim drivers. Unfortunately the relevant specs
are not as clear wrt stall. So some QMI and MBIM devices *will*
occasionally stall, causing the GetEncapsulatedResponse to return an
-EPIPE status. Translating this into -EIO for userspace has proven to
be harmful. Treating it as an empty read is safer, making the driver
behave as if the device was conforming to the CDC WDM spec.
There have been numerous reports of issues related to -EPIPE errors
from some newer CDC MBIM devices in particular, like for example the
Fibocom L831-EAU. Testing on this device has shown that the issues
go away if we simply ignore the -EPIPE status. Similar handling of
-EPIPE is already known from e.g. usb_get_string()
The -EPIPE log message is still kept to let us track devices with this
unexpected behaviour, hoping that it attracts attention from firmware
developers.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100938 Reported-and-tested-by: Christian Ehrig <christian.ehrig@mediamarktsaturn-bt.com> Reported-and-tested-by: Patrick Chilton <chpatrick@gmail.com> Reported-and-tested-by: Andreas Böhler <news@aboehler.at> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Servers were emitting failed handoff messages but were not
waiting the full 1 second as designated in section 4.22.1 of
the eXtensible Host Controller Interface specifications. The
handshake was using wrong units so calls were made with milliseconds
not microseconds. Comments referenced 5 seconds not 1 second as
in specs.
The wrong units were also corrected in a second handshake call.
When a USB-audio device receives a maliciously adjusted or corrupted
buffer descriptor, the USB-audio driver may access an out-of-bounce
value at its parser. This was detected by syzkaller, something like:
This patch fixes an issue that the usbhsf_fifo_clear() is possible
to cause 10 msec delay if the pipe is RX direction and empty because
the FRDY bit will never be set to 1 in such case.
Fixes: e8d548d54968 ("usb: renesas_usbhs: fifo became independent from pipe.") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
This patch fixes an issue that the driver sets the BCLR bit of
{C,Dn}FIFOCTR register to 1 even when it's non-DCP pipe and
the FRDY bit of {C,Dn}FIFOCTR register is set to 1.
Fixes: e8d548d54968 ("usb: renesas_usbhs: fifo became independent from pipe.") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Ever since commit a621bac3044e ("scsi_lib: correctly retry failed zero
length REQ_TYPE_FS commands"), people have been getting bogus error
messages for USB disk drives using ATA pass-thru. For example:
These messages can be quite annoying, because programs like udisks2
provoke them every 10 minutes or so. Other programs can also have
this effect, such as those in smartmontools.
I don't fully understand how that commit induced the SCSI core to log
these error messages, but the underlying cause for them is code added
to usb-storage by commit f1a0743bc0e7 ("USB: storage: When a device
returns no sense data, call it a Hardware Error"). At the time it was
necessary to do this, in order to prevent an infinite retry loop with
some not-so-great mass storage devices.
However, the ATA pass-thru protocol uses SCSI sense data to return
command status values, and some devices always report Check Condition
status for ATA pass-thru commands to ensure that the host retrieves
the sense data, even if the command succeeded. This violates the USB
mass-storage protocol (Check Condition status is supposed to mean the
command failed), but we can't help that.
This patch attempts to mitigate the problem of these bogus error
reports by changing usb-storage. The HARDWARE ERROR sense key will be
inserted only for commands that aren't ATA pass-thru.
Thanks to Ewan Milne for pointing out that this mechanism was present
in usb-storage. 8 years after writing it, I had completely forgotten
its existence.
Kris Lindgren reports that without the NO_WP_DETECT flag, his Seagate
external disk drive fails all write accesses. This regresssion dates
back approximately to the start of the 4.x kernel releases.
This patch fixes an issue that this driver cannot go status stage
in control read when the req.zero is set to 1 and the len in
usb3_write_pipe() is set to 0. Otherwise, if we use g_ncm driver,
usb enumeration takes long time (5 seconds or more).
Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
According to the datasheet of R-Car Gen3, the Pn_RAMMAP.Pn_MPKT should
be set to one of 8, 16, 32, 64, 512 and 1024. Otherwise, when a gadget
driver uses an interrupt endpoint, unexpected behavior happens. So,
this patch fixes it.
Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
When bRequestType & USB_DIR_IN is false and req.length is 0 in control
transfer, since it means non-data, this driver should not set the mode
as control write. So, this patch fixes it.
Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The driver triggers actions on both edges of the vbus signal.
The former PIO controller was triggering IRQs on both falling and rising edges
by default. Newer PIO controller don't, so it's better to set it explicitly to
IRQF_TRIGGER_FALLING | IRQF_TRIGGER_RISING.
Without this patch we may trigger the connection with host but only on some
bouncing signal conditions and thus lose connecting events.
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The gadgetfs driver as a long-outstanding FIXME, regarding a call of
copy_to_user() made while holding a spinlock. This patch fixes the
issue by dropping the spinlock and using the dev->udc_usage mechanism
introduced by another recent patch to guard against status changes
while the lock isn't held.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The gadgetfs driver (drivers/usb/gadget/legacy/inode.c) was written
before the UDC and composite frameworks were adopted; it is a legacy
driver. As such, it expects that once bound to a UDC controller, it
will not be unbound until it unregisters itself.
However, the UDC framework does unbind function drivers while they are
still registered. When this happens, it can cause the gadgetfs driver
to misbehave or crash. For example, userspace can cause a crash by
opening the device file and doing an ioctl call before setting up a
configuration (found by Andrey Konovalov using the syzkaller fuzzer).
This patch adds checks and synchronization to prevent these bad
behaviors. It adds a udc_usage counter that the driver increments at
times when it is using a gadget interface without holding the private
spinlock. The unbind routine waits for this counter to go to 0 before
returning, thereby ensuring that the UDC is no longer in use.
The patch also adds a check in the dev_ioctl() routine to make sure
the driver is bound to a UDC before dereferencing the gadget pointer,
and it makes destroy_ep_files() synchronize with the endpoint I/O
routines, to prevent the user from accessing an endpoint data
structure after it has been removed.
Consider the following case: udc controller supports SuperSpeed. If we
first load a HighSpeed gadget followed by a SuperSpeed gadget, the
SuperSpeed gadget will be limited to HighSpeed as UDC core driver
doesn't call ->udc_set_speed() in the second case.
Call ->udc_set_speed() unconditionally to fix this issue.
This will also fix the case for dwc3 controller driver when SuperSpeed
gadget is loaded first and works in HighSpeed only as udc_set_speed()
was never being called.
- bpf prog_array just like all other types of bpf array accepts 32-bit index.
Clarify that in the comment.
- fix x64 JIT of bpf_tail_call which was incorrectly loading 8 instead of 4 bytes
- tighten corresponding check in the interpreter to stay consistent
The JIT bug can be triggered after introduction of BPF_F_NUMA_NODE flag
in commit 96eabe7a40aa in 4.14. Before that the map_flags would stay zero and
though JIT code is wrong it will check bounds correctly.
Hence two fixes tags. All other JITs don't have this problem.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Fixes: 96eabe7a40aa ("bpf: Allow selecting numa node during map creation") Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper") Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
When RTM_GETSTATS was added the fields of its header struct were not all
initialized when returning the result thus leaking 4 bytes of information
to user-space per rtnl_fill_statsinfo call, so initialize them now. Thanks
to Alexander Potapenko for the detailed report and bisection.
Reported-by: Alexander Potapenko <glider@google.com> Fixes: 10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump link stats") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Starting from linux-4.4, 3WHS no longer takes the listener lock.
Since this time, we might hit a use-after-free in sk_filter_charge(),
if the filter we got in the memcpy() of the listener content
just happened to be replaced by a thread changing listener BPF filter.
To fix this, we need to make sure the filter refcount is not already
zero before incrementing it again.
Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The l2tp_eth module crashes if its netlink callbacks are run when the
pernet data aren't initialised.
We should normally register_pernet_device() before the genl callbacks.
However, the pernet data only maintain a list of l2tpeth interfaces,
and this list is never used. So let's just drop pernet handling
instead.
Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
In commit e3a77561e7d32 ("tipc: split up function tipc_msg_eval()"),
we have updated the function tipc_msg_lookup_dest() to set the error
codes to negative values at destination lookup failures. Thus when
the function sets the error code to -TIPC_ERR_NO_NAME, its inserted
into the 4 bit error field of the message header as 0xf instead of
TIPC_ERR_NO_NAME (1). The value 0xf is an unknown error code.
In this commit, we set only positive error code.
Fixes: e3a77561e7d32 ("tipc: split up function tipc_msg_eval()") Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Currently no error is emitted, but this infrastructure will
used by the next patch to allow source address validation
for mcast sockets.
Since early demux can do a route lookup and an ipv4 route
lookup can return an error code this is consistent with the
current ipv4 route infrastructure.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Now when updating mtu in tx path, it doesn't consider ARPHRD_ETHER tunnel
device, like ip6gre_tap tunnel, for which it should also subtract ether
header to get the correct mtu.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The patch 'ip_gre: ipgre_tap device should keep dst' fixed
a issue that ipgre_tap mtu couldn't be updated in tx path.
The same fix is needed for ip6gre_tap as well.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Drivers that use the start method for netlink dumping rely on dumpit not
being called if start fails. For example, ila_xlat.c allocates memory
and assigns it to cb->args[0] in its start() function. It might fail to
do that and return -ENOMEM instead. However, even when returning an
error, dumpit will be called, which, in the example above, quickly
dereferences the memory in cb->args[0], which will OOPS the kernel. This
is but one example of how this goes wrong.
Since start() has always been a function with an int return type, it
therefore makes sense to use it properly, rather than ignoring it. This
patch thus returns early and does not call dumpit() when start() fails.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
sk->sk_prot and sk->sk_prot_creator can differ when the app uses
IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one).
Which is why sk_prot_creator is there to make sure that sk_prot_free()
does the kmem_cache_free() on the right kmem_cache slab.
Now, if such a socket gets transformed back to a listening socket (using
connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through
sk_clone_lock() when a new connection comes in. But sk_prot_creator will
still point to the IPv6 kmem_cache (as everything got copied in
sk_clone_lock()). When freeing, we will thus put this
memory back into the IPv6 kmem_cache although it was allocated in the
IPv4 cache. I have seen memory corruption happening because of this.
With slub-debugging and MEMCG_KMEM enabled this gives the warning
"cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP"
A C-program to trigger this:
void main(void)
{
int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
int new_fd, newest_fd, client_fd;
struct sockaddr_in6 bind_addr;
struct sockaddr_in bind_addr4, client_addr1, client_addr2;
struct sockaddr unsp;
int val;
As far as I can see, this bug has been there since the beginning of the
git-days.
Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
mv88e6xxx_g2_irq_free locks the registers mutex, but not
mv88e6xxx_g1_irq_free, which results in a stack trace from
assert_reg_lock when unloading the mv88e6xxx module. Fix this.
Fixes: 3460a5770ce9 ("net: dsa: mv88e6xxx: Mask g1 interrupts and free interrupt") Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Packet socket option po->has_vnet_hdr can be updated concurrently with
other operations if no ring is attached.
Do not test the option twice in packet_snd, as the value may change in
between calls. A race on setsockopt disable may cause a packet > mtu
to be sent without having GSO options set.
Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Once a socket has po->fanout set, it remains a member of the group
until it is destroyed. The prot_hook must be constant and identical
across sockets in the group.
If fanout_add races with packet_do_bind between the test of po->fanout
and taking the lock, the bind call may make type or dev inconsistent
with that of the fanout group.
Hold po->bind_lock when testing po->fanout to avoid this race.
I had to introduce artificial delay (local_bh_enable) to actually
observe the race.
Fixes: dc99f600698d ("packet: Add fanout support.") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
We cannot be registering the network device first, then setting its
carrier off and finally connecting it to a PHY, doing that leaves a
window during which the carrier is at best inconsistent, and at worse
the device is not usable without a down/up sequence since the network
device is visible to user space with possibly no PHY device attached.
Re-order steps so that they make logical sense. This fixes some devices
where the port was not usable after e.g: an unbind then bind of the
driver.
Fixes: 0071f56e46da ("dsa: Register netdev before phy") Fixes: 91da11f870f0 ("net: Distributed Switch Architecture protocol support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Ports with the same VLAN must all be in the same bridge. However the
CPU and DSA ports need to be in multiple VLANs spread over multiple
bridges. So exclude them when performing this test.
Fixes: b2f81d304cee ("net: dsa: add CPU and DSA ports as VLAN members") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
If we try to delete the same tunnel twice, the first delete operation
does a lookup (l2tp_tunnel_get), finds the tunnel, calls
l2tp_tunnel_delete, which queues it for deletion by
l2tp_tunnel_del_work.
The second delete operation also finds the tunnel and calls
l2tp_tunnel_delete. If the workqueue has already fired and started
running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the
same tunnel a second time, and try to free the socket again.
Add a dead flag to prevent firing the workqueue twice. Then we can
remove the check of queue_work's result that was meant to prevent that
race but doesn't.
Reproducer:
ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000
ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000
ip link set l2tp1 up
ip l2tp del tunnel tunnel_id 3000
ip l2tp del tunnel tunnel_id 3000
Fixes: f8ccac0e4493 ("l2tp: put tunnel socket release on a workqueue") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Can be fixed if we get skb->len before dst_output().
Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code") Fixes: 22e1b23dafa8 ("vti6: Support inter address family tunneling.") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
When mapping the RX DMA buffers, the driver was accidentally specifying
zero for the buffer length. Under normal circumstances, SWIOTLB does not
need to allocate a bounce buffer, so the address is just mapped without
checking the size field. This is why the error was not detected earlier.
Fixes: b9b17debc69d ("net: emac: emac gigabit ethernet controller driver") Cc: stable@vger.kernel.org Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Make sure (of/i2c/platform)_device_id tables are NULL terminated.
Found by coccinelle spatch "misc/of_table.cocci"
Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
In isdn_ppp_write(), the header (i.e., protobuf) of the buffer is
fetched twice from userspace. The first fetch is used to peek at the
protocol of the message and reset the huptimer if necessary; while the
second fetch copies in the whole buffer. However, given that buf resides
in userspace memory, a user process can race to change its memory content
across fetches. By doing so, we can either avoid resetting the huptimer
for any type of packets (by first setting proto to PPP_LCP and later
change to the actual type) or force resetting the huptimer for LCP
packets.
This patch changes this double-fetch behavior into two single fetches
decided by condition (lp->isdn_device < 0 || lp->isdn_channel <0).
A more detailed discussion can be found at
https://marc.info/?l=linux-kernel&m=150586376926123&w=2
Signed-off-by: Meng Xu <mengxu.gatech@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Since XDP's view of the packet includes the MAC header, moving the start-
of-packet with bpf_xdp_adjust_head needs to also update the offset of the
MAC header (which is relative to skb->head, not to the skb->data that was
changed).
Without this, tcpdump sees packets starting from the old MAC header rather
than the new one, at least in my tests on the loopback device.
Fixes: b5cdae3291f7 ("net: Generic XDP") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
This patch fixes a bug exhibited by the following scenario:
1. fd1 = perf_event_open with attr.config = ID1
2. attach bpf program prog1 to fd1
3. fd2 = perf_event_open with attr.config = ID1
<this will be successful>
4. user program closes fd2 and prog1 is detached from the tracepoint.
5. user program with fd1 does not work properly as tracepoint
no output any more.
The issue happens at step 4. Multiple perf_event_open can be called
successfully, but only one bpf prog pointer in the tp_event. In the
current logic, any fd release for the same tp_event will free
the tp_event->prog.
The fix is to free tp_event->prog only when the closing fd
corresponds to the one which registered the program.
Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Packet socket bind operations must hold the po->bind_lock. This keeps
po->running consistent with whether the socket is actually on a ptype
list to receive packets.
fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
binds the fanout object to receive through packet_rcv_fanout.
Make it hold the po->bind_lock when testing po->running and rebinding.
Else, it can race with other rebind operations, such as that in
packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
can result in a socket being added to a fanout group twice, causing
use-after-free KASAN bug reports, among others.
Reported independently by both trinity and syzkaller.
Verified that the syzkaller reproducer passes after this patch.
Fixes: dc99f600698d ("packet: Add fanout support.") Reported-by: nixioaming <nixiaoming@huawei.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Commit f784ad3d79e5 ("ipv6: do not send RTM_DELADDR for tentative
addresses") incorrectly assumes that no RTM_NEWADDR are sent for
addresses in tentative state, as this does happen for the standard
IPv6 use-case of DAD failure, see the call to ipv6_ifa_notify() in
addconf_dad_stop(). So as a result of this change, no RTM_DELADDR is
sent after DAD failure for a link-local when strict DAD (accept_dad=2)
is configured, or on the next admin down in other cases. The absence
of this notification breaks backwards compatibility and causes problems
after DAD failure if this notification was being relied on. The
solution is to allow RTM_DELADDR to still be sent after DAD failure.
Fixes: f784ad3d79e5 ("ipv6: do not send RTM_DELADDR for tentative addresses") Signed-off-by: Mike Manning <mmanning@brocade.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
This patch is pretty much a carbon copy of
commit 3079c652141f ("caif: Fix napi poll list corruption")
with "caif" replaced by "emac".
The commit d75b1ade567f ("net: less interrupt masking in NAPI")
breaks emac.
It is now required that if the entire budget is consumed when poll
returns, the napi poll_list must remain empty. However, like some
other drivers emac tries to do a last-ditch check and if there is
more work it will call napi_reschedule and then immediately process
some of this new work. Should the entire budget be consumed while
processing such new work then we will violate the new caller
contract.
This patch fixes this by not touching any work when we reschedule
in emac.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Since hard irq are disabled by the caller, bpf_map_free_id()
should not try to enable/disable BH.
Another solution would be to change htab_map_delete_elem() to
defer the free_htab_elem() call after
raw_spin_unlock_irqrestore(&b->lock, flags), but this might be not
enough to cover other code paths.
[1]
WARNING: CPU: 1 PID: 8052 at kernel/softirq.c:161 __local_bh_enable_ip
+0x1e/0x160 kernel/softirq.c:161
Kernel panic - not syncing: panic_on_warn set ...
Fixes: f3f1c054c288 ("bpf: Introduce bpf_map ID") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
It seems we have to be more careful in napi_complete_done()
use. This patch is not a revert, as it seems we can
avoid bug that Ville reported by moving the napi_complete_done()
test in the spinlock section.
Many thanks to Ville for detective work and all tests.
Fixes: 617f01211baf ("8139too: use napi_complete_done()") Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
this script, edited from Linux Advanced Routing and Traffic Control guide
tc q a dev en0 root handle 1: htb default a
tc c a dev en0 parent 1: classid 1:1 htb rate 6mbit burst 15k
tc c a dev en0 parent 1:1 classid 1:a htb rate 5mbit ceil 6mbit burst 15k
tc c a dev en0 parent 1:1 classid 1:b htb rate 1mbit ceil 6mbit burst 15k
tc f a dev en0 parent 1:0 prio 1 $clsname $clsargs classid 1:b
ping $address -c1
tc -s c s dev en0
classifies traffic to 1:b or 1:a, depending on whether the packet matches
or not the pattern $clsargs of filter $clsname. However, when $clsname is
'matchall', a systematic crash can be observed in htb_classify(). HTB and
classful qdiscs don't assign initial value to struct tcf_result, but then
they expect it to contain valid values after filters have been run. Thus,
current 'matchall' ignores the TCA_MATCHALL_CLASSID attribute, configured
by user, and makes HTB (and classful qdiscs) dereference random pointers.
By assigning head->res to *res in mall_classify(), before the actions are
invoked, we fix this crash and enable TCA_MATCHALL_CLASSID functionality,
that had no effect on 'matchall' classifier since its first introduction.
If ipv6 has been disabled from cmdline since kernel started, it makes
no sense to allow users to create any ip6 tunnel. Otherwise, it could
some potential problem.
Jianlin found a kernel crash caused by this in ip6_gre when he set
ipv6.disable=1 in grub:
This patch is to return -EOPNOTSUPP in ip6_tunnel_init if ipv6 has been
disabled from cmdline.
Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
To clear Speed Selection in MDIO control register(0x10),
ie, clear bits 6 and 13 to zero while keeping other bits same.
Before AND operation,The Mask value has to be perform with bitwise NOT
operation (ie, ~ operator)
This patch clears current speed selection before writing the
new speed settings to gmii2rgmii converter
Fixes: f411a6160bd4 ("net: phy: Add gmiitorgmii converter support") Signed-off-by: Fahad Kunnathadi <fahad.kunnathadi@dexceldesigns.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Now in ip6gre_header before packing the ipv6 header, it skb_push t->hlen
which only includes encap_hlen + tun_hlen. It means greh and inner header
would be over written by ipv6 stuff and ipv6h might have no chance to set
up.
Jianlin found this issue when using remote any on ip6_gre, the packets he
captured on gre dev are truncated:
It should also skb_push ipv6hdr so that ipv6h points to the right position
to set ipv6 stuff up.
This patch is to skb_push hlen + sizeof(*ipv6h) and also fix some indents
in ip6gre_header.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
While trying an ESP transport mode encryption for UDPv6 packets of
datagram size 1436 with MTU 1500, checksum error was observed in
the secondary fragment.
This error occurs due to the UDP payload checksum being missed out
when computing the full checksum for these packets in
udp6_hwcsum_outgoing().
Fixes: d39d938c8228 ("ipv6: Introduce udpv6_send_skb()") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
liujian reported a problem in TCP_USER_TIMEOUT processing with a patch
in tcp_probe_timer() :
https://www.spinics.net/lists/netdev/msg454496.html
After investigations, the root cause of the problem is that we update
skb->skb_mstamp of skbs in write queue, even if the attempt to send a
clone or copy of it failed. One reason being a routing problem.
This patch prevents this, solving liujian issue.
It also removes a potential RTT miscalculation, since
__tcp_retransmit_skb() is not OR-ing TCP_SKB_CB(skb)->sacked with
TCPCB_EVER_RETRANS if a failure happens, but skb->skb_mstamp has
been changed.
A future ACK would then lead to a very small RTT sample and min_rtt
would then be lowered to this too small value.
This code causes a static checker warning because Smatch doesn't trust
anything that comes from skb->data. I've reviewed this code and I do
think skb->data can be controlled by the user here.
The sctp_event_subscribe struct has 13 __u8 fields and we want to see
if ours is non-zero. sn_type can be any value in the 0-USHRT_MAX range.
We're subtracting SCTP_SN_TYPE_BASE which is 1 << 15 so we could read
either before the start of the struct or after the end.
This is a very old bug and it's surprising that it would go undetected
for so long but my theory is that it just doesn't have a big impact so
it would be hard to notice.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Recent commit d7fb60b9cafb ("net_sched: get rid of tcfa_rcu") removed
freeing in call_rcu, which changed already existing hard-to-hit
race condition into 100% hit:
[ 598.599825] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 598.607782] IP: tcf_action_destroy+0xc0/0x140
Or:
[ 40.858924] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 40.862840] IP: tcf_generic_walker+0x534/0x820
Fix this by storing the ops and use them directly for module_put call.
Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Commit 8b426dc54cf4 ("bonding: remove hardcoded value") changed the
default value for tlb_dynamic_lb which lead to either broken ALB mode
(since tlb_dynamic_lb can be changed only in TLB) or setting TLB mode
with tlb_dynamic_lb equal to 0.
The first issue was recently fixed by setting tlb_dynamic_lb to 1 always
when switching to ALB mode, but the default value is still wrong and
we'll enter TLB mode with tlb_dynamic_lb equal to 0 if the mode is
changed via netlink or sysfs. In order to restore the previous behaviour
and default value simply remove the mode check around the default param
initialization for tlb_dynamic_lb which will always set it to 1 as
before.
Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
When removing the offloading of mirred actions under
matchall classifiers, mlxsw would find the destination port
associated with the offloaded action and utilize it for undoing
the configuration.
Depending on the order by which ports are removed, it's possible that
the destination port would get removed before the source port.
In such a scenario, when actions would be flushed for the source port
mlxsw would perform an illegal dereference as the destination port is
no longer listed.
Since the only item necessary for undoing the configuration on the
destination side is the port-id and that in turn is already maintained
by mlxsw on the source-port, simply stop trying to access the
destination port and use the port-id directly instead.
Fixes: 763b4b70af ("mlxsw: spectrum: Add support in matchall mirror TC offloading") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Commit cbf5ecb30560 ("net: bonding: Fix transmit load balancing in
balance-alb mode") tried to fix transmit dynamic load balancing in
balance-alb mode, which wasn't working after commit 8b426dc54cf4
("bonding: remove hardcoded value").
It turned out that my previous patch only fixed the case when
balance-alb was specified as bonding module parameter, and not when
balance-alb mode was set using /sys/class/net/*/bonding/mode (the most
common usage). In the latter case, tlb_dynamic_lb was set up according
to the default mode of the bonding interface, which happens to be
balance-rr.
This additional patch addresses this issue by setting up tlb_dynamic_lb
to 1 if "mode" is set to balance-alb through the sysfs interface.
I didn't add code to change tlb_balance_lb back to the default value for
other modes, because "mode" is usually set up only once during
initialization, and it's not worthwhile to change the static variable
bonding_defaults in bond_main.c to a global variable just for this
purpose.
Commit 8b426dc54cf4 also changes the value of tlb_dynamic_lb for
balance-tlb mode if it is set up using the sysfs interface. I didn't
change that behavior, because the value of tlb_balance_lb can be changed
using the sysfs interface for balance-tlb, and I didn't like changing
the default value back and forth for balance-tlb.
As for balance-alb, /sys/class/net/*/bonding/tlb_balance_lb cannot be
written to. However, I think balance-alb with tlb_dynamic_lb set to 0
is not an intended usage, so there is little use making it writable at
this moment.
Fixes: 8b426dc54cf4 ("bonding: remove hardcoded value") Reported-by: Reinis Rozitis <r@roze.lv> Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com> Cc: stable@vger.kernel.org # v4.12+ Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The current code does not handle correctly the access to the upper page
in case of SFP/SFP+ EEPROM. In that case the offset should be local
and the I2C address should be changed.
Fixes: 2ea109039cd3 ("mlxsw: spectrum: Add support for access cable info via ethtool") Reported-by: Florian Klink <flokli@flokli.de> Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
We get a harmless warning about a potential uninitialized variable
use in the driver:
drivers/staging/media/imx/imx-media-of.c: In function 'of_parse_subdev':
drivers/staging/media/imx/imx-media-of.c:216:4: warning: 'remote_np' may be used uninitialized in this function [-Wmaybe-uninitialized]
I reworked that code to be easier to understand by gcc in mainline,
but that commit is too large to backport. This is a much simpler
workaround, avoiding the warning by adding a fake initialization
to the variable. The driver was only introduced in linux-4.13,
so the workaround is not needed for earlier stable kernels.
Fixes: e130291212df ("[media] media: Add i.MX media core driver") Cc: stable@vger.kernel.org Cc: Philipp Zabel <p.zabel@pengutronix.de> Cc: Steve Longerbeam <steve_longerbeam@mentor.com> Cc: Hans Verkuil <hans.verkuil@cisco.com> Cc: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Seth Forshee [Wed, 11 Oct 2017 14:00:47 +0000 (09:00 -0500)]
UBUNTU: [Config] Update configs for v4.13.5
BugLink: http://bugs.launchpad.net/bugs/1721777
Changes in the 4.13.5 stable update caused a couple of crypto
modules to become built-in. Run updateconfigs to reflect this and
ignore these modules in the abi check.
'clk' is copied to a userland with padding byte(s) after 'vclk_post_div'
field unitialized, leaking data from the stack. Fix this ensuring all of
'clk' is initialized to zero.
WARN_ON_ONCE(pi_test_sn(&vmx->pi_desc)) in kvm_vcpu_trigger_posted_interrupt()
intends to detect the violation of invariant that VT-d PI notification
event is not suppressed when vcpu is in the guest mode. Because the
two checks for the target vcpu mode and the target suppress field
cannot be performed atomically, the target vcpu mode may change in
between. If that does happen, WARN_ON_ONCE() here may raise false
alarms.
As the previous patch fixed the real invariant breaker, remove this
WARN_ON_ONCE() to avoid false alarms, and document the allowed cases
instead.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reported-by: "Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Fixes: 28b835d60fcc ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
In kvm_vcpu_trigger_posted_interrupt() and pi_pre_block(), KVM
assumes that PI notification events should not be suppressed when the
target vCPU is not blocked.
vmx_update_pi_irte() sets the SN field before changing an interrupt
from posting to remapping, but it does not check the vCPU mode.
Therefore, the change of SN field may break above the assumption.
Besides, I don't see reasons to suppress notification events here, so
remove the changes of SN field to avoid race condition.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reported-by: "Ramamurthy, Venkatesh" <venkatesh.ramamurthy@intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Fixes: 28b835d60fcc ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
On x86, userspace can use the ptrace() or rt_sigreturn() system calls to
set a task's extended state (xstate) or "FPU" registers. ptrace() can
set them for another task using the PTRACE_SETREGSET request with
NT_X86_XSTATE, while rt_sigreturn() can set them for the current task.
In either case, registers can be set to any value, but the kernel
assumes that the XSAVE area itself remains valid in the sense that the
CPU can restore it.
However, in the case where the kernel is using the uncompacted xstate
format (which it does whenever the XSAVES instruction is unavailable),
it was possible for userspace to set the xcomp_bv field in the
xstate_header to an arbitrary value. However, all bits in that field
are reserved in the uncompacted case, so when switching to a task with
nonzero xcomp_bv, the XRSTOR instruction failed with a #GP fault. This
caused the WARN_ON_FPU(err) in copy_kernel_to_xregs() to be hit. In
addition, since the error is otherwise ignored, the FPU registers from
the task previously executing on the CPU were leaked.
Fix the bug by checking that the user-supplied value of xcomp_bv is 0 in
the uncompacted case, and returning an error otherwise.
The reason for validating xcomp_bv rather than simply overwriting it
with 0 is that we want userspace to see an error if it (incorrectly)
provides an XSAVE area in compacted format rather than in uncompacted
format.
Note that as before, in case of error we clear the task's FPU state.
This is perhaps non-ideal, especially for PTRACE_SETREGSET; it might be
better to return an error before changing anything. But it seems the
"clear on error" behavior is fine for now, and it's a little tricky to
do otherwise because it would mean we couldn't simply copy the full
userspace state into kernel memory in one __copy_from_user().
This bug was found by syzkaller, which hit the above-mentioned
WARN_ON_FPU():
Here is a C reproducer. The expected behavior is that the program spin
forever with no output. However, on a buggy kernel running on a
processor with the "xsave" feature but without the "xsaves" feature
(e.g. Sandy Bridge through Broadwell for Intel), within a second or two
the program reports that the xmm registers were corrupted, i.e. were not
restored correctly. With CONFIG_X86_DEBUG_FPU=y it also hits the above
kernel warning.
Note: the program only tests for the bug using the ptrace() system call.
The bug can also be reproduced using the rt_sigreturn() system call, but
only when called from a 32-bit program, since for 64-bit programs the
kernel restores the FPU state from the signal frame by doing XRSTOR
directly from userspace memory (with proper error checking).
Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Kevin Hao <haokexin@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Yu-cheng Yu <yu-cheng.yu@intel.com> Cc: kernel-hardening@lists.openwall.com Fixes: 0b29643a5843 ("x86/xsaves: Change compacted format xsave area header") Link: http://lkml.kernel.org/r/20170922174156.16780-2-ebiggers3@gmail.com Link: http://lkml.kernel.org/r/20170923130016.21448-25-mingo@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
commit 7b2d0dbac489 ("x86/mm/pkeys: Pass VMA down in to fault signal
generation code") passes down a vma pointer to the error path, but that is
done once the mmap_sem is released when calling mm_fault_error() from
__do_page_fault().
This is dangerous as the vma structure is no more safe to be used once the
mmap_sem has been released. As only the protection key value is required in
the error processing, we could just pass down this value.
Fix it by passing a pointer to a protection key value down to the fault
signal generation code. The use of a pointer allows to keep the check
generating a warning message in fill_sig_info_pkey() when the vma was not
known. If the pointer is valid, the protection value can be accessed by
deferencing the pointer.
[ tglx: Made *pkey u32 as that's the type which is passed in siginfo ]
Fixes: 7b2d0dbac489 ("x86/mm/pkeys: Pass VMA down in to fault signal generation code") Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/1504513935-12742-1-git-send-email-ldufour@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The notifier callbacks may want to call some OPP helper routines which
may try to take the same opp_table->lock again and cause a deadlock. One
such usecase was reported by Chanwoo Choi, where calling
dev_pm_opp_disable() leads us to the devfreq's OPP notifier handler,
which further calls dev_pm_opp_find_freq_floor() and it deadlocks.
We don't really need the opp_table->lock to be held across the notifier
call though, all we want to make sure is that the 'opp' doesn't get
freed while being used from within the notifier chain. We can do it with
help of dev_pm_opp_get/put() as well. Let's do it.
Fixes: 5b650b388844 "PM / OPP: Take kref from _find_opp_table()" Reported-by: Chanwoo Choi <cw00.choi@samsung.com> Tested-by: Chanwoo Choi <cw00.choi@samsung.com> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
My Fujitsu-Siemens Lifebook S6120 doesn't have the FUJ02E3 device,
but it does have FUJ02B1. That means we do register the backlight
device (and it even seems to work), but the code will oops as soon
as we try to set the backlight brightness because it's trying to
call call_fext_func() with a NULL device. Let's just skip those
function calls when the FUJ02E3 device is not present.
Cc: Jonathan Woithe <jwoithe@just42.net> Cc: Andy Shevchenko <andy@infradead.org> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
`btrfs sub set-default` succeeds to set an ID which isn't corresponding to any
fs/file tree. If such the bad ID is set to a filesystem, we can't mount this
filesystem without specifying `subvol` or `subvolid` mount options.
Fixes: 6ef5ed0d386b ("Btrfs: add ioctl and incompat flag to set the default mount subvol") Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com> Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
btrfs_cmp_data_prepare() (almost) always returns 0 i.e. ignoring errors
from gather_extent_pages(). While the pages are freed by
btrfs_cmp_data_free(), cmp->num_pages still has > 0. Then,
btrfs_extent_same() try to access the already freed pages causing faults
(or violates PageLocked assertion).
This patch just return the error as is so that the caller stop the process.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Fixes: f441460202cb ("btrfs: fix deadlock with extent-same and readpage") Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
__endio_write_update_ordered() repeats the search until it reaches the end
of the specified range. This works well with direct IO path, because before
the function is called, it's ensured that there are ordered extents filling
whole the range. It's not the case, however, when it's called from
run_delalloc_range(): it is possible to have error in the midle of the loop
in e.g. run_delalloc_nocow(), so that there exisits the range not covered
by any ordered extents. By cleaning such "uncomplete" range,
__endio_write_update_ordered() stucks at offset where there're no ordered
extents.
Since the ordered extents are created from head to tail, we can stop the
search if there are no offset progress.
Commit 524272607e88 ("btrfs: Handle delalloc error correctly to avoid
ordered extent hang") introduced btrfs_cleanup_ordered_extents() to cleanup
submitted ordered extents. However, it does not clear the ordered bit
(Private2) of corresponding pages. Thus, the following BUG occurs from
free_pages_check_bad() (on btrfs/125 with nospace_cache).
__del_reloc_root should be called before freeing up reloc_root->node.
If not, calling __del_reloc_root() dereference reloc_root->node, causing
the system BUG.
Fixes: 6bdf131fac23 ("Btrfs: don't leak reloc root nodes on error") Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The driver_override implementation is susceptible to a race condition when
different threads are reading vs. storing a different driver override. Add
locking to avoid the race condition.
This is in close analogy to commit 6265539776a0 ("driver core: platform:
fix race condition with driver_override") from Adrian Salido.
With commit cc27b0c78c79, pers->make_request could bail out without handling
the bio. If that happens, we should retry. The commit fixes md_make_request
but not other call sites. Separate the request handling part, so other call
sites can use it.
Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start()) Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
There was a reported suspicion about a race between exit_pi_state_list()
and put_pi_state(). The same report mentioned the comment with
put_pi_state() said it should be called with hb->lock held, and it no
longer is in all places.
As it turns out, the pi_state->owner serialization is indeed broken. As per
the new rules:
pi_state->owner should be serialized by pi_state->pi_mutex.wait_lock.
For the sites setting pi_state->owner we already hold wait_lock (where
required) but exit_pi_state_list() and put_pi_state() were not and
raced on clearing it.
All manipulations of the gem_object list need to be protected by
the list mutex, as GEM objects can be created and freed in parallel.
This fixes a kernel memory corruption.