Hans de Goede [Wed, 30 Aug 2017 09:48:06 +0000 (11:48 +0200)]
staging: typec: fusb302: Set max supply voltage to 5V
Anything higher then 5V may damage hardware not capable of it, so
the only sane default here is 5V. If a board is able to handle a
higher voltage that should come from board specific data such as
device-tree and not be hard coded into the fusb302 code.
Cc: "Yueyao (Nathan) Zhu" <yueyao@google.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A Rp signalling the default current limit indicates that we're possibly
connected to an USB2 power-source. In some cases the type-c port-controller
may provide the capability to detect the current-limit in this case,
through e.g. BC1.2 detection.
This commit adds an optional get_current_limit tcpc_dev callback which
allows the port-controller to provide current-limit detection for when
the CC pin is pulled up with Rp.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
staging:rtl8188eu Use __func__ instead of function name
This patch makes use of predefined identifier __func__ inorder to clear
the warning: Prefer using '"%s...", __func__' to using 'update_bmc_sta',
this function's name, in a string
Simo Koskinen [Tue, 29 Aug 2017 12:08:16 +0000 (14:08 +0200)]
staging: lustre: coding style fixes found by checkpatch.pl
The patch removes "WARNING: Prefer using '"%s...", __func__'
to using 'xxxxxxxx', this function's name, in a string" warnings
reported by checkpatch.pl script.
Signed-off-by: Simo Koskinen <koskisoft@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Kershner [Wed, 30 Aug 2017 17:36:29 +0000 (13:36 -0400)]
staging: unisys: visorbus: use all 80 characters for multi-line messages
The file visorchipset had a bunch of comments that were not using the full
screen before they wrapped, update the comments to wrap at 80 characters
instead.
Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Kershner [Wed, 30 Aug 2017 17:36:27 +0000 (13:36 -0400)]
staging: unisys: visorbus: Move parser functions location in file.
The parser functions were defined at the top of the file even though they
were not referenced until later in the file. This patch moves them closer
to where they are defined so they can be easily referenced.
Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Kershner [Wed, 30 Aug 2017 17:36:24 +0000 (13:36 -0400)]
staging: unisys: Change data to point to visor_controlvm_parameters_header.
The data field was being defined as a character array and then casted into
a visor_controlvm_parameters_header structure. This patch converts it to
just point to the visor_controlvm_parameters_header structure. The data
following the header is still behind the header_info.
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Kershner [Wed, 30 Aug 2017 17:36:23 +0000 (13:36 -0400)]
staging: unisys: visorbus: Split else if blocks into multiple if.
Visorbus_configure had a block of "else if" clauses at the beginning of the
function. Simplify this to just being "if" clauses since each code block
ended with a goto.
Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The functions to create the controlvm channel were disjointed and ignoring
information that was available. This patch consolidates it so it clearer
what is happening.
Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Kershner [Wed, 30 Aug 2017 17:36:17 +0000 (13:36 -0400)]
staging: unisys: Don't check for null before getting driver device.
The macro to convert to the driver object was giving a checkpatch warning
when it ateempted to check for a null driver. It would return NULL if it
found it, but only one location was checking to see if it was NULL.
Remove the check in the MACRO and do it prior to calling the macro if
required.
Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Kershner [Wed, 30 Aug 2017 17:36:13 +0000 (13:36 -0400)]
staging: unisys: visorbus: Clean up vmcall address function.
The function vmcall address needed to be cleaned up. The structure
vmcall_controlvm_addr was not needed so it was removed and was replaced
with vmcall_io_controlvm_addr_params since it needs to be allocated on the
heap for DMA access.
With the structure removed and the fields as local variables, it helped
clean up the formatting of the function.
Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
staging: unisys: visorbus: visorchipset.c: Fix bug in parser_init_byte_stream.
This patch fixes a bug in the function parser_init_byte_stream()
by removing the call to parser_done from goto err_finish_ctx.
The function parser_done() decrements
chipset_dev->controlvm_payload_bytes_buffered which is not
incremented before this gets called.
Signed-off-by: Sameer Wadgaonkar <sameer.wadgaonkar@unisys.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David Kershner <david.kershner@unisys.com> Reviewed-by: Tim Sell <timothy.sell@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Simo Koskinen [Mon, 28 Aug 2017 13:01:32 +0000 (15:01 +0200)]
staging: comedi: coding style fixes found by checkpatch.pl
The patch removes "WARNING: Prefer using '"%s...", __func__'
to using 'xxxxxxxx', this function's name, in a string" warnings
reported by checkpatch.pl script.
Signed-off-by: Simo Koskinen <koskisoft@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
staging: typec: tcpm: Switch to PORT_RESET instead of SNK_UNATTACHED
When VBUS is not discovered within PD_T_PS_SOURCE_ON although Rp
is detected on CC, TCPM switches the port to SNK_UNATTACHED
state. SNK_UNATTACHED, however does not force TYPEC_CC_OPEN which
makes the partner(source) to think that it is connected.
To overcome this issue, force the port into PORT_RESET state
to make sure the CC lines are open.
staging: typec: tcpm: typec: tcpm: Wait for CC debounce before PD excg
Once, Rp or Rd is switched, wait for PD_T_CC_DEBOUNCE. If not the
PS_RDY message transmitted might result in failure.
Also, Only wait for PD_T_SRCSWAPSTDBY while in
PR_SWAP_SRC_SNK_TRANSITION_OFF. PD_T_PS_SOURCE_OFF is the overall
time after which the initial sink would issue hard reset.
staging: typec: tcpm: add cc change handling in src states
In the case that the lower layer driver reports a cc change directly
from SINK state to SOURCE state, TCPM doesn't handle these cc change
in SRC_SEND_CAPABILITIES, SRC_READY states. And with SRC_ATTACHED
state, the change is not handled as the port is still considered
connected.
[49606.131672] state change DRP_TOGGLING -> SRC_ATTACH_WAIT
[49606.131701] pending state change SRC_ATTACH_WAIT -> SRC_ATTACHED @
200 ms
[49606.329952] state change SRC_ATTACH_WAIT -> SRC_ATTACHED [delayed 200
ms]
[49606.329978] polarity 0
[49606.329989] Requesting mux mode 1, config 0, polarity 0
[49606.349416] vbus:=1 charge=0
[49606.372274] pending state change SRC_ATTACHED -> SRC_UNATTACHED @ 480
ms
[49606.372431] VBUS on
[49606.372488] state change SRC_ATTACHED -> SRC_STARTUP
...
(the lower layer driver reports a direct change from source to sink)
[49606.536927] pending state change SRC_SEND_CAPABILITIES ->
SRC_SEND_CAPABILITIES @ 150 ms
[49606.547244] CC1: 2 -> 5, CC2: 0 -> 0 [state SRC_SEND_CAPABILITIES,
polarity 0, connected]
This can happen when the lower layer driver and/or the hardware
handles a portion of the Type-C state machine work, and quietly goes
through the unattached state.
staging: typec: tcpm: Consider port_type while determining unattached_state
While performing PORT_RESET, upon receiving the cc disconnect
signal from the underlaying tcpc device, TCPM transitions into
unattached state. Consider the current type of port while determining
the unattached state.
In the below logs, although the port_type was set to sink, TCPM
transitioned into SRC_UNATTACHED.
[ 762.290654] state change SRC_READY -> PORT_RESET
[ 762.324531] Setting voltage/current limit 0 mV 0 mA
[ 762.327912] polarity 0
[ 762.334864] cc:=0
[ 762.347193] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms
[ 762.347200] VBUS off
[ 762.347203] CC1: 2 -> 0, CC2: 0 -> 0 [state PORT_RESET, polarity 0, disconnected]
[ 762.347206] state change PORT_RESET -> SRC_UNATTACHED
staging: typec: tcpm: Comply with TryWait.SNK State
According to the spec:
"4.5.2.2.10.2 Exiting from TryWait.SNK State
The port shall transition to Attached.SNK after tCCDebounce if or when VBUS
is detected. Note the Source may initiate USB PD communications which will
cause brief periods of the SNK.Open state on both the CC1 and CC2 pins,
but this event will not exceed tPDDebounce. The port shall transition to
Unattached.SNK when the state of both of the CC1 and CC2 pins is SNK.Open
for at least tPDDebounce."
According to spec:
" 4.5.2.2.9.2 Exiting from Try.SRC State:
The port shall transition to Attached.SRC when the SRC.Rd
state is detected on exactly one of the CC1 or CC2 pins for
at least tPDDebounce. The port shall transition to
TryWait.SNK after tDRPTry and the SRC.Rd state has not been
detected."
staging: typec: tcpm: Check for Rp for tPDDebounce
According the spec, the following is the conditions for exiting Try.SNK
state:
"The port shall wait for tDRPTry and only then begin monitoring the CC1 and
CC2 pins for the SNK.Rp state. The port shall then transition to
Attached.SNK when the SNK.Rp state is detected on exactly one of the CC1
or CC2 pins for at least tPDDebounce and V BUS is detected. Alternatively,
the port shall transition to TryWait.SRC if SNK.Rp state is not detected
for tPDDebounce."
staging: typec: tcpm: Prevent TCPM from looping in SRC_TRYWAIT
According to the spec the following is the condition
for exiting TryWait.SRC:
"The port shall transition to Attached.SRC when V BUS is at vSafe0V
and the SRC.Rd state is detected on exactly one of the CC pins for at
least tCCDebounce. The port shall transition to Unattached.SNK after
tDRPTry if neither of the CC1 or CC2 pins are in the SRC.Rd state"
TCPM at present keeps re-entering the SRC_TRYWAIT and keeps restarting
tDRPTry if the CC presents Rp and disconnects within tCCDebounce.
In TCPM, tDRPTry is set tp 100ms (min 75ms and max 150ms)
and tCCdebounce is set to 200ms (min 100ms and max 200ms).
To overcome the issue, record the time at which the port
enters TryWait.SRC(SRC_TRYWAIT) and re-enter SRC_TRYWAIT
only when CC keeps debouncing within tDRPTry.
The port type callback call enquires the tcpc_dev if
the requested port type is supported. If supported, then
performs a tcpm reset if required after setting the tcpm
internal port_type variable.
Check against the tcpm port_type instead of checking
against caps.type as port_type reflects the current
configuration.
tty: resolve tty contention between kernel and user space
The commit 12e84c71b7d4 ("tty: export tty_open_by_driver") exports
tty_open_by_device to allow tty to be opened from inside kernel which
works fine except that it doesn't handle contention with user space or
another kernel-space open of the same tty. For example, opening a tty
from user space while it is kernel opened results in failure and a
kernel log message about mismatch between tty->count and tty's file
open count.
This patch makes kernel access to tty exclusive, so that if a user
process or kernel opens a kernel opened tty, it gets -EBUSY. It does
this by adding TTY_KOPENED flag to tty->flags. When this flag is set,
tty_open_by_driver returns -EBUSY. Instead of overloading
tty_open_by_driver for both kernel and user space, this
patch creates a separate function tty_kopen which closely follows
tty_open_by_driver. tty_kclose closes the tty opened by tty_kopen.
To address the mismatch between tty->count and #fd's, this patch adds
#kopen's to the count before comparing it with tty->count. That way
check_tty_count reflects correct usage count.
Returning -EBUSY on tty open is a change in the interface. I have
tested this with minicom, picocom and commands like "echo foo >
/dev/ttyS0". They all correctly report "Device or resource busy" when
the tty is already kernel opened.
1. Remove module_init()/module_exit() macroes and
visorbus_register_visor_driver/visorbus_unregister_visor_driver
functions.
2. Replace with a short module_driver macro
Signed-off-by: Alex Briskin <br.shurik@gmail.com> Acked-by: David Kershner <david.kershner@unisys.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cihangir Akturk [Fri, 25 Aug 2017 08:21:02 +0000 (11:21 +0300)]
staging: pi433: fix interrupt handler signatures
Remove "struct pt_regs *" parameter from interrupt handlers, since
it is no longer passed to interrupt handlers. Also, convert return
types to irqreturn_t.
Additionally, move DIO_irq_handler variable into the setup_GPIO
function, as it's not used outside of this function.
Colin Ian King [Wed, 23 Aug 2017 15:07:10 +0000 (16:07 +0100)]
staging: r8822be: fix null pointer dereference with a null driver_adapter
The call to _rtl_dbg_trace via macro HALMAC_RT_TRACE will trigger a null
pointer deference on the null driver_adapter. Fix this by assigning
driver_adapter earlier to halmac_adapter->driver_adapter before the tracing
call so that a non-null driver_adapter is passed instead.
Detected by CoverityScan, CID#1454613 ("Explicit null dereferenced")
Fixes: 938a0447f094 ("staging: r8822be: Add code for halmac sub-driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Wed, 23 Aug 2017 14:34:33 +0000 (15:34 +0100)]
staging: r8822be: fix memory leak of eeprom_map on error exit return
A memory leak of eeprom_map occurs if the call to halmac_eeprom_parser_88xx
fails. Fix this by kfree'ing it before returning.
Detected by CoverityScan, CID#1454569 ("Resource leak") Fixes: 938a0447f094 ("staging: r8822be: Add code for halmac sub-driver") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Sat, 26 Aug 2017 06:04:18 +0000 (09:04 +0300)]
staging: lustre: obdclass: return -EFAULT if copy_from_user() fails
The copy_from_user() function returns the number of bytes which we
weren't able to copy. We don't want to return that to the user but
instead we want to return -EFAULT.
Fixes: d7e09d0397e8 ("staging: add Lustre file system client support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Sat, 26 Aug 2017 06:02:55 +0000 (09:02 +0300)]
staging: lustre: obdclass: return -EFAULT if copy_to_user() fails
We recently changed from using obd_ioctl_popdata() to calling
copy_to_user() directly. This if statement was supposed to be deleted
but it was over looked. "err" is zero at this point so it means we
return success.
Larry Finger [Thu, 24 Aug 2017 21:28:08 +0000 (16:28 -0500)]
staging: rtlwifi: Improve debugging by using debugfs
The changes in this commit are also being sent to the main rtlwifi
drivers in wireless-next; however, these changes will also be useful for
any debugging of r8822be before it gets moved into the main tree.
Use debugfs to dump register and btcoex status, and also write registers
and h2c.
We create topdir in /sys/kernel/debug/rtlwifi/, and use the MAC address
as subdirectory with several entries to dump mac_reg, bb_reg, rf_reg etc.
An example is
/sys/kernel/debug/rtlwifi/00-11-22-33-44-55-66/mac_0
This change permits examination of device registers in a dynamic manner,
a feature not available with the current debug mechanism.
We use seq_file to replace RT_TRACE to dump status, then we can use 'cat'
to access btcoex's status through debugfs.
(i.e. /sys/kernel/debug/rtlwifi/00-11-22-33-44-55-66/btcoex)
Other related changes are
1. implement btc_disp_dbg_msg() to access btcoex's common status.
2. remove obsolete field bt_exist
Dan Carpenter [Thu, 24 Aug 2017 10:08:32 +0000 (13:08 +0300)]
staging: rtlwifi: check for array overflow
Smatch is distrustful of the "capab" value and marks it as user
controlled. I think it actually comes from the firmware? Anyway, I
looked at other drivers and they added a bounds check and it seems like
a harmless thing to have so I have added it here as well.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Mon, 28 Aug 2017 00:10:34 +0000 (17:10 -0700)]
Merge tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fix from Joerg Roedel:
"Another fix, this time in common IOMMU sysfs code.
In the conversion from the old iommu sysfs-code to the
iommu_device_register interface, I missed to update the release path
for the struct device associated with an IOMMU. It freed the 'struct
device', which was a pointer before, but is now embedded in another
struct.
Freeing from the middle of allocated memory had all kinds of nasty
side effects when an IOMMU was unplugged. Unfortunatly nobody
unplugged and IOMMU until now, so this was not discovered earlier. The
fix is to make the 'struct device' a pointer again"
* tag 'iommu-fixes-v4.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Fix wrong freeing of iommu_device->dev
Linus Torvalds [Mon, 28 Aug 2017 00:08:37 +0000 (17:08 -0700)]
Merge tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fix from Greg KH:
"Here is a single misc driver fix for 4.13-rc7. It resolves a reported
problem in the Android binder driver due to previous patches in
4.13-rc.
It's been in linux-next with no reported issues"
* tag 'char-misc-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
ANDROID: binder: fix proc->tsk check.
Linus Torvalds [Mon, 28 Aug 2017 00:03:33 +0000 (17:03 -0700)]
Merge tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/iio fixes from Greg KH:
"Here are few small staging driver fixes, and some more IIO driver
fixes for 4.13-rc7. Nothing major, just resolutions for some reported
problems.
All of these have been in linux-next with no reported problems"
* tag 'staging-4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: magnetometer: st_magn: remove ihl property for LSM303AGR
iio: magnetometer: st_magn: fix status register address for LSM303AGR
iio: hid-sensor-trigger: Fix the race with user space powering up sensors
iio: trigger: stm32-timer: fix get trigger mode
iio: imu: adis16480: Fix acceleration scale factor for adis16480
PATCH] iio: Fix some documentation warnings
staging: rtl8188eu: add RNX-N150NUB support
Revert "staging: fsl-mc: be consistent when checking strcmp() return"
iio: adc: stm32: fix common clock rate
iio: adc: ina219: Avoid underflow for sleeping time
iio: trigger: stm32-timer: add enable attribute
iio: trigger: stm32-timer: fix get/set down count direction
iio: trigger: stm32-timer: fix write_raw return value
iio: trigger: stm32-timer: fix quadrature mode get routine
iio: bmp280: properly initialize device for humidity reading
Linus Torvalds [Mon, 28 Aug 2017 00:01:54 +0000 (17:01 -0700)]
Merge tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb
Pull NTB fixes from Jon Mason:
"NTB bug fixes to address an incorrect ntb_mw_count reference in the
NTB transport, improperly bringing down the link if SPADs are
corrupted, and an out-of-order issue regarding link negotiation and
data passing"
* tag 'ntb-4.13-bugfixes' of git://github.com/jonmason/ntb:
ntb: ntb_test: ensure the link is up before trying to configure the mws
ntb: transport shouldn't disable link due to bogus values in SPADs
ntb: use correct mw_count function in ntb_tool and ntb_transport
Linus Torvalds [Sun, 27 Aug 2017 23:25:09 +0000 (16:25 -0700)]
Avoid page waitqueue race leaving possible page locker waiting
The "lock_page_killable()" function waits for exclusive access to the
page lock bit using the WQ_FLAG_EXCLUSIVE bit in the waitqueue entry
set.
That means that if it gets woken up, other waiters may have been
skipped.
That, in turn, means that if it sees the page being unlocked, it *must*
take that lock and return success, even if a lethal signal is also
pending.
So instead of checking for lethal signals first, we need to check for
them after we've checked the actual bit that we were waiting for. Even
if that might then delay the killing of the process.
This matches the order of the old "wait_on_bit_lock()" infrastructure
that the page locking used to use (and is still used in a few other
areas).
Note that if we still return an error after having unsuccessfully tried
to acquire the page lock, that is ok: that means that some other thread
was able to get ahead of us and lock the page, and when that other
thread then unlocks the page, the wakeup event will be repeated. So any
other pending waiters will now get properly woken up.
Fixes: 62906027091f ("mm: add PageWaiters indicating tasks are waiting for a page bit") Cc: Nick Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Jan Kara <jack@suse.cz> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 27 Aug 2017 20:55:12 +0000 (13:55 -0700)]
Minor page waitqueue cleanups
Tim Chen and Kan Liang have been battling a customer load that shows
extremely long page wakeup lists. The cause seems to be constant NUMA
migration of a hot page that is shared across a lot of threads, but the
actual root cause for the exact behavior has not been found.
Tim has a patch that batches the wait list traversal at wakeup time, so
that we at least don't get long uninterruptible cases where we traverse
and wake up thousands of processes and get nasty latency spikes. That
is likely 4.14 material, but we're still discussing the page waitqueue
specific parts of it.
In the meantime, I've tried to look at making the page wait queues less
expensive, and failing miserably. If you have thousands of threads
waiting for the same page, it will be painful. We'll need to try to
figure out the NUMA balancing issue some day, in addition to avoiding
the excessive spinlock hold times.
That said, having tried to rewrite the page wait queues, I can at least
fix up some of the braindamage in the current situation. In particular:
(a) we don't want to continue walking the page wait list if the bit
we're waiting for already got set again (which seems to be one of
the patterns of the bad load). That makes no progress and just
causes pointless cache pollution chasing the pointers.
(b) we don't want to put the non-locking waiters always on the front of
the queue, and the locking waiters always on the back. Not only is
that unfair, it means that we wake up thousands of reading threads
that will just end up being blocked by the writer later anyway.
Also add a comment about the layout of 'struct wait_page_key' - there is
an external user of it in the cachefiles code that means that it has to
match the layout of 'struct wait_bit_key' in the two first members. It
so happens to match, because 'struct page *' and 'unsigned long *' end
up having the same values simply because the page flags are the first
member in struct page.
Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Christopher Lameter <cl@linux.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 27 Aug 2017 19:12:25 +0000 (12:12 -0700)]
Clarify (and fix) MAX_LFS_FILESIZE macros
We have a MAX_LFS_FILESIZE macro that is meant to be filled in by
filesystems (and other IO targets) that know they are 64-bit clean and
don't have any 32-bit limits in their IO path.
It turns out that our 32-bit value for that limit was bogus. On 32-bit,
the VM layer is limited by the page cache to only 32-bit index values,
but our logic for that was confusing and actually wrong. We used to
define that value to
(((loff_t)PAGE_SIZE << (BITS_PER_LONG-1))-1)
which is actually odd in several ways: it limits the index to 31 bits,
and then it limits files so that they can't have data in that last byte
of a page that has the highest 31-bit index (ie page index 0x7fffffff).
Neither of those limitations make sense. The index is actually the full
32 bit unsigned value, and we can use that whole full page. So the
maximum size of the file would logically be "PAGE_SIZE << BITS_PER_LONG".
However, we do wan tto avoid the maximum index, because we have code
that iterates over the page indexes, and we don't want that code to
overflow. So the maximum size of a file on a 32-bit host should
actually be one page less than the full 32-bit index.
So the actual limit is ULONG_MAX << PAGE_SHIFT. That means that we will
not actually be using the page of that last index (ULONG_MAX), but we
can grow a file up to that limit.
The wrong value of MAX_LFS_FILESIZE actually caused problems for Doug
Nazar, who was still using a 32-bit host, but with a 9.7TB 2 x RAID5
volume. It turns out that our old MAX_LFS_FILESIZE was 8TiB (well, one
byte less), but the actual true VM limit is one page less than 16TiB.
This was invisible until commit c2a9737f45e2 ("vfs,mm: fix a dead loop
in truncate_inode_pages_range()"), which started applying that
MAX_LFS_FILESIZE limit to block devices too.
NOTE! On 64-bit, the page index isn't a limiter at all, and the limit is
actually just the offset type itself (loff_t), which is signed. But for
clarity, on 64-bit, just use the maximum signed value, and don't make
people have to count the number of 'f' characters in the hex constant.
So just use LLONG_MAX for the 64-bit case. That was what the value had
been before too, just written out as a hex constant.
Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") Reported-and-tested-by: Doug Nazar <nazard@nazar.ca> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 26 Aug 2017 19:48:29 +0000 (12:48 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- a tweak to the IBM Trackpoint driver that helps recognizing
trackpoints on never Lenovo Carbons
- a fix to the ALPS driver solving scroll issues on some Dells
- yet another ACPI ID has been added to Elan I2C toucpad driver
- quieted diagnostic message in soc_button_array driver
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ALPS - fix two-finger scroll breakage in right side on ALPS touchpad
Input: soc_button_array - silence -ENOENT error on Dell XPS13 9365
Input: trackpoint - add new trackpoint firmware ID
Input: elan_i2c - add ELAN0602 ACPI ID to support Lenovo Yoga310
Linus Torvalds [Sat, 26 Aug 2017 19:46:14 +0000 (12:46 -0700)]
Merge tag 'pci-v4.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Remove needlessly alarming MSI affinity warning (this is not actually
a bug fix, but the warning prompts unnecessary bug reports)"
* tag 'pci-v4.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL
Linus Torvalds [Sat, 26 Aug 2017 16:06:28 +0000 (09:06 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two fixes: one for an ldt_struct handling bug and a cherry-picked
objtool fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix use-after-free of ldt_struct
objtool: Fix '-mtune=atom' decoding support in objtool 2.0
Linus Torvalds [Sat, 26 Aug 2017 16:02:18 +0000 (09:02 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix a timer granularity handling race+bug, which would manifest itself
by spuriously increasing timeouts of some timers (from 1 jiffy to ~500
jiffies in the worst case measured) in certain nohz states"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timers: Fix excessive granularity of new timers after a nohz idle
Linus Torvalds [Sat, 26 Aug 2017 01:02:27 +0000 (18:02 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"6 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/memblock.c: reversed logic in memblock_discard()
fork: fix incorrect fput of ->exe_file causing use-after-free
mm/madvise.c: fix freeing of locked page with MADV_FREE
dax: fix deadlock due to misaligned PMD faults
mm, shmem: fix handling /sys/kernel/mm/transparent_hugepage/shmem_enabled
PM/hibernate: touch NMI watchdog when creating snapshot
Linus Torvalds [Sat, 26 Aug 2017 00:46:23 +0000 (17:46 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull Paolo Bonzini:
"Bugfixes for x86, PPC and s390"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PPC: Book3S: Fix race and leak in kvm_vm_ioctl_create_spapr_tce()
KVM, pkeys: do not use PKRU value in vcpu->arch.guest_fpu.state
KVM: x86: simplify handling of PKRU
KVM: x86: block guest protection keys unless the host has them enabled
KVM: PPC: Book3S HV: Add missing barriers to XIVE code and document them
KVM: PPC: Book3S HV: Workaround POWER9 DD1.0 bug causing IPB bit loss
KVM: PPC: Book3S HV: Use msgsync with hypervisor doorbells on POWER9
KVM: s390: sthyi: fix specification exception detection
KVM: s390: sthyi: fix sthyi inline assembly
Linus Torvalds [Sat, 26 Aug 2017 00:40:03 +0000 (17:40 -0700)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes two obvious bugs in virtio pci"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_pci: fix cpu affinity support
virtio_blk: fix incorrect message when disk is resized
Linus Torvalds [Sat, 26 Aug 2017 00:32:35 +0000 (17:32 -0700)]
Merge tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Just one fix, to add a barrier in the switch_mm() code to make sure
the mm cpumask update is ordered vs the MMU starting to load
translations. As far as we know no one's actually hit the bug, but
that's just luck.
Thanks to Benjamin Herrenschmidt, Nicholas Piggin"
* tag 'powerpc-4.13-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Ensure cpumask update is ordered
Linus Torvalds [Sat, 26 Aug 2017 00:27:26 +0000 (17:27 -0700)]
Merge tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Two nfsd bugfixes, neither 4.13 regressions, but both potentially
serious"
* tag 'nfsd-4.13-2' of git://linux-nfs.org/~bfields/linux:
net: sunrpc: svcsock: fix NULL-pointer exception
nfsd: Limit end of page list when decoding NFSv4 WRITE
Linus Torvalds [Sat, 26 Aug 2017 00:22:33 +0000 (17:22 -0700)]
Merge tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Some bug fixes for stable for cifs"
* tag 'cifs-fixes-for-4.13-rc6-and-stable' of git://git.samba.org/sfrench/cifs-2.6:
cifs: return ENAMETOOLONG for overlong names in cifs_open()/cifs_lookup()
cifs: Fix df output for users with quota limits
Linus Torvalds [Sat, 26 Aug 2017 00:09:19 +0000 (17:09 -0700)]
Merge tag 'for-linus-20170825' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
"Two fixes - one for a 4.13 regression, and the other for an older one:
- Atmel NAND: since we started utilizing ONFI timings, we found that
we were being too restrict at rejecting them, partly due to
discrepancies in ONFI 4.0 and earlier versions. Relax the
restriction to keep these platforms booting. This is a 4.13-rc1
regression.
- nandsim: repeated probe/removal may not work after a failed init,
because we didn't free up our debugfs files properly on the failure
path. This has been around since 3.8, but it's nice to get this
fixed now in a nice easy patch that can target -stable, since
there's already refactoring work (that also fixes the issue)
targeted for the next merge window"
* tag 'for-linus-20170825' of git://git.infradead.org/linux-mtd:
mtd: nand: atmel: Relax tADL_min constraint
mtd: nandsim: remove debugfs entries in error path
Linus Torvalds [Sat, 26 Aug 2017 00:02:59 +0000 (17:02 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A small batch of fixes that should be included for the 4.13 release.
This contains:
- Revert of the 4k loop blocksize support. Even with a recent batch
of 4 fixes, we're still not really happy with it. Rather than be
stuck with an API issue, let's revert it and get it right for 4.14.
- Trivial patch from Bart, adding a few flags to the blk-mq debugfs
exports that were added in this release, but not to the debugfs
parts.
- Regression fix for bsg, fixing a potential kernel panic. From
Benjamin.
- Tweak for the blk throttling, improving how we account discards.
From Shaohua"
* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq-debugfs: Add names for recently added flags
bsg-lib: fix kernel panic resulting from missing allocation of reply-buffer
Revert "loop: support 4k physical blocksize"
blk-throttle: cap discard request size
Linus Torvalds [Fri, 25 Aug 2017 23:59:38 +0000 (16:59 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has some bugfixes for you: mainly Jarkko fixed up a few things in
the designware driver regarding the new slave mode. But Ulf also fixed
a long-standing and now agreed suspend problem. Plus, some simple
stuff which nonetheless needs fixing"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: designware: Fix runtime PM for I2C slave mode
i2c: designware: Remove needless pm_runtime_put_noidle() call
i2c: aspeed: fixed potential null pointer dereference
i2c: simtec: use release_mem_region instead of release_resource
i2c: core: Make comment about I2C table requirement to reflect the code
i2c: designware: Fix standard mode speed when configuring the slave mode
i2c: designware: Fix oops from i2c_dw_irq_handler_slave
i2c: designware: Fix system suspend
PCI/MSI: Don't warn when irq_create_affinity_masks() returns NULL
irq_create_affinity_masks() can return NULL on non-SMP systems, when there
are not enough "free" vectors available to spread, or if memory allocation
for the CPU masks fails. Only the allocation failure is of interest, and
even then the system will work just fine except for non-optimally spread
vectors. Thus remove the warnings.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David S. Miller <davem@davemloft.net>