The PixArt OEM mice are known for disconnecting every minute in
runlevel 1 or 3 if they are not always polled. So add quirk
ALWAYS_POLL for two Primax mice as well.
0x4e22 is the Dell MS111-P and 0x4d0f is the unbranded HP Portia
mouse HP 697738-001. Both were built until approx. 2014.
Those were the standard mice from those vendors and are still
around - even as new old stock.
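
As a rough userspace illustration of what such a quirk entry amounts to, the sketch below models a table lookup keyed on vendor and product ID. The two product IDs come from the text above; the vendor ID 0x0461 for Primax, the flag value, and the names quirk_entry, quirk_table and lookup_quirks are assumptions made for this example and are not the kernel's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag value only; the real quirk flag is defined by the kernel. */
#define QUIRK_ALWAYS_POLL (1u << 10)

/* Assumed vendor ID for Primax; the product IDs are the two named in the commit. */
#define VENDOR_PRIMAX 0x0461u

struct quirk_entry {
	uint16_t vendor;
	uint16_t product;
	uint32_t quirks;
};

static const struct quirk_entry quirk_table[] = {
	{ VENDOR_PRIMAX, 0x4e22, QUIRK_ALWAYS_POLL },	/* Dell MS111-P */
	{ VENDOR_PRIMAX, 0x4d0f, QUIRK_ALWAYS_POLL },	/* HP 697738-001 */
};

/* Return the quirk flags for a device, or 0 if it has no table entry. */
uint32_t lookup_quirks(uint16_t vendor, uint16_t product)
{
	for (size_t i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
		if (quirk_table[i].vendor == vendor &&
		    quirk_table[i].product == product)
			return quirk_table[i].quirks;
	}
	return 0;
}
```

A device without an entry gets no quirk flags, which is why each affected mouse needs its own line in the table.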
Kai-Heng Feng [Wed, 14 Nov 2018 07:24:57 +0000 (07:24 +0000)]
HID: i2c-hid: Disable runtime PM for LG touchscreen
LG touchscreen (1fd2:8001) stops working after reboot:
[ 4.859153] i2c_hid i2c-SAPS2101:00: i2c_hid_get_input: incomplete report (64/66)
[ 4.936070] i2c_hid i2c-SAPS2101:00: i2c_hid_get_input: incomplete report (64/66)
[ 9.948224] i2c_hid i2c-SAPS2101:00: failed to reset device.
The device in question stops working after it receives SLEEP, ON, SLEEP
commands in a short period. The scenario is like this:
- Once the desktop session closes, it also closes the hid device, so the
device gets runtime suspended and receives a SLEEP command.
- Before the shutdown callback is called, it gets runtime resumed and
receives an ON command.
- In the shutdown callback, it receives another SLEEP command.
I failed to find a reliable interval between ON/SLEEP commands that can
make it work, so let's simply disable runtime PM for the device.
HID: steam: remove input device when a hid client is running.
Previously, when a HID client such as the Steam Client was running, this
driver disabled its input device to avoid doubling the input events.
While it worked mostly fine, some games got confused by the idle gamepad,
and switched to two player mode, or asked the user to choose which gamepad
to use. Other games just crashed, probably a bug in Unity [1].
With this commit, when a HID client starts, the input device is removed;
when the HID client ends the input device is recreated.
Please note that `strlcpy()` does *NOT* do what you think it does.
strlcpy() *ALWAYS* reads the full input string, regardless of the
'length' parameter. That is, if the input is not zero-terminated,
strlcpy() will *READ* beyond input boundaries. It does this, because it
always returns the size it *would* copy if the target was big enough,
not the truncated size it actually copied.
The original code was perfectly fine. The hid device is
zero-initialized and the strncpy() functions copied up to n-1
characters. The result is always zero-terminated this way.
This is the third time someone has tried to replace strncpy with strlcpy
in this function and got it wrong. I have now added a comment that should
at least make people reconsider.
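
The difference can be sketched in userspace. The code below is a minimal model, not the kernel implementation: model_strlcpy() reproduces the documented strlcpy() contract (return strlen(src)), and copy_bounded() shows the zero-initialize-then-strncpy(n-1) pattern described above. Both function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of strlcpy() semantics, for illustration only.
 * The return value is strlen(src), so src is always scanned up to its
 * terminating NUL, no matter how small size is.
 */
size_t model_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);	/* reads all of src; src must be NUL-terminated */

	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/*
 * The pattern the commit defends: dst is zeroed first and at most
 * size - 1 bytes are copied, so dst stays NUL-terminated and no more
 * than size - 1 bytes of src are ever read.
 */
void copy_bounded(char *dst, const char *src, size_t size)
{
	memset(dst, 0, size);
	if (size > 0)
		strncpy(dst, src, size - 1);
}
```

With a source buffer that is not NUL-terminated, copy_bounded() stays within the first size - 1 bytes, while model_strlcpy() would keep reading until it happened to find a zero byte.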
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Eric Biggers [Wed, 14 Nov 2018 21:55:09 +0000 (13:55 -0800)]
HID: uhid: forbid UHID_CREATE under KERNEL_DS or elevated privileges
When a UHID_CREATE command is written to the uhid char device, a
copy_from_user() is done from a user pointer embedded in the command.
When the address limit is KERNEL_DS, e.g. as is the case during
sys_sendfile(), this can read from kernel memory. Alternatively,
information can be leaked from a setuid binary that is tricked to write
to the file descriptor. Therefore, forbid UHID_CREATE in these cases.
No other commands in uhid_char_write() are affected by this bug and
UHID_CREATE is marked as "obsolete", so apply the restriction to
UHID_CREATE only rather than to uhid_char_write() entirely.
Thanks to Dmitry Vyukov for adding uhid definitions to syzkaller and to
Jann Horn for commit 9da3f2b740544 ("x86/fault: BUG() when uaccess
helpers fault on kernel addresses"), allowing this bug to be found.
Benson Leung [Thu, 8 Nov 2018 23:59:21 +0000 (15:59 -0800)]
HID: input: Ignore battery reported by Symbol DS4308
The Motorola/Zebra Symbol DS4308-HD is a handheld USB barcode scanner
which does not have a battery, but reports one anyway that always has
capacity 2.
Let's apply the IGNORE quirk to prevent it from being treated like a
power supply, so that userspace doesn't wrongly conclude that this
accessory is almost out of power and warn users that they need to charge
their wired barcode scanner.
The PixArt OEM mice are known for disconnecting every minute in
runlevel 1 or 3 if they are not always polled. So add quirk
ALWAYS_POLL for this one as well.
Linus Walleij [Sun, 4 Nov 2018 10:32:47 +0000 (11:32 +0100)]
HID: fix up .raw_event() documentation
The documentation for the .raw_event() callback says that if the
driver returns 1, there will be no further processing of the event,
but this is not true. The actual code in hid-core.c looks like this:
    if (hdrv && hdrv->raw_event && hid_match_report(hid, report)) {
            ret = hdrv->raw_event(hid, report, data, size);
            if (ret < 0)
                    goto unlock;
    }

    ret = hid_report_raw_event(hid, type, data, size, interrupt);
The only return value that has any effect on the processing is
a negative error.
Correct this as it seems to confuse people: I found bogus code in
the Razer out-of-tree driver attempting to return 1 here.
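
The control flow quoted above can be modeled in userspace to make the point concrete. This is a sketch with hypothetical names (dispatch_report, core_report_raw_event and the two sample handlers are not kernel functions); it only mirrors the rule that a negative return is the sole value that skips core processing.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a driver's raw-event callback. */
typedef int (*raw_event_fn)(const unsigned char *data, int size);

static bool core_processed;

/* Hypothetical stand-in for the core's own report processing. */
static int core_report_raw_event(const unsigned char *data, int size)
{
	(void)data;
	(void)size;
	core_processed = true;
	return 0;
}

/* Mirrors the quoted flow: only ret < 0 skips the core processing step. */
int dispatch_report(raw_event_fn raw_event, const unsigned char *data, int size)
{
	int ret;

	core_processed = false;
	if (raw_event) {
		ret = raw_event(data, size);
		if (ret < 0)
			return ret;
	}
	return core_report_raw_event(data, size);
}

/* A handler that returns 1, expecting (wrongly) to stop further processing. */
int handler_returns_one(const unsigned char *data, int size)
{
	(void)data;
	(void)size;
	return 1;
}

/* A handler that reports an error, which does stop further processing. */
int handler_returns_error(const unsigned char *data, int size)
{
	(void)data;
	(void)size;
	return -1;
}

bool was_core_processed(void)
{
	return core_processed;
}
```

In this model, a handler returning 1 still lets the core step run; only the handler returning a negative value prevents it.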
asus_wmi_evaluate_method() is an empty dummy function when CONFIG_ASUS_WMI
is disabled, or not reachable from a built-in device driver. This leads to
a theoretical evaluation of an uninitialized variable that the compiler
complains about, failing to check that the hardcoded return value makes
this an unreachable code path:
In file included from include/linux/printk.h:336,
from include/linux/kernel.h:14,
from include/linux/list.h:9,
from include/linux/dmi.h:5,
from drivers/hid/hid-asus.c:29:
drivers/hid/hid-asus.c: In function 'asus_input_configured':
include/linux/dynamic_debug.h:135:3: error: 'value' may be used uninitialized in this function [-Werror=maybe-uninitialized]
__dynamic_dev_dbg(&descriptor, dev, fmt, \
^~~~~~~~~~~~~~~~~
drivers/hid/hid-asus.c:359:6: note: 'value' was declared here
u32 value;
^~~~~
With an extra IS_ENABLED() check, the warning goes away.
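
A userspace sketch of the idea follows. HAVE_WMI is a hypothetical compile-time switch standing in for the IS_ENABLED() test, wmi_evaluate_stub() stands in for the empty dummy function, and the bit mask is invented for the example; none of these are the driver's actual names.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical compile-time switch standing in for the IS_ENABLED() check. */
#ifndef HAVE_WMI
#define HAVE_WMI 0
#endif

/* Stub variant: fails without ever writing *retval, like the dummy described above. */
static int wmi_evaluate_stub(uint32_t *retval)
{
	(void)retval;
	return -ENODEV;
}

/*
 * Testing the constant first lets the compiler discard the rest of the
 * function when the feature is off, so value is never read on a path
 * where it was not set.
 */
int backlight_driven_by_wmi(void)
{
	uint32_t value;

	if (!HAVE_WMI)
		return 0;
	if (wmi_evaluate_stub(&value) != 0)
		return 0;
	return (value & 0x1) != 0;	/* hypothetical "present" bit */
}
```

The early return makes the unreachable path obvious to the compiler instead of relying on it to prove that the stub always fails.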
Fixes: 3b692c55e58d ("HID: asus: only support backlight when it's not driven by WMI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Jiri Kosina [Tue, 6 Nov 2018 12:57:02 +0000 (13:57 +0100)]
Merge branch 'master' into for-4.20/upstream-fixes
Pull in a merge commit that brought in 3b692c55e58d ("HID: asus: only
support backlight when it's not driven by WMI") so that fixup could be
applied on top of it.
Linus Torvalds [Thu, 1 Nov 2018 15:42:21 +0000 (08:42 -0700)]
Merge tag 'platform-drivers-x86-v4.20-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Darren Hart:
- Move the Dell dcdbas and dell_rbu drivers into drivers/platform/x86
as they are closely coupled with other drivers in this location.
- Improve _init* usage for acerhdf and fix some usage issues with
messages and module parameters.
- Simplify asus-wmi by calling ACPI/WMI methods directly, eliminating
workqueue overhead, and eliminate double reporting of keyboard backlight.
- Fix wake from USB failure on Bay Trail devices (intel_int0002_vgpio).
- Notify intel_telemetry users when IPC1 device is not enabled.
- Update various drivers with new laptop model IDs.
- Update several intel drivers to use SPDX identifiers and order headers
alphabetically.
* tag 'platform-drivers-x86-v4.20-1' of git://git.infradead.org/linux-platform-drivers-x86: (64 commits)
HID: asus: only support backlight when it's not driven by WMI
platform/x86: asus-wmi: export function for evaluating WMI methods
platform/x86: asus-wmi: Only notify kbd LED hw_change by fn-key pressed
platform/x86: wmi: declare device_type structure as constant
platform/x86: ideapad: Add Y530-15ICH to no_hw_rfkill
platform/x86: Add Intel AtomISP2 dummy / power-management driver
platform/x86: touchscreen_dmi: Add min-x and min-y settings for various models
platform/x86: touchscreen_dmi: Add info for the Onda V80 Plus v3 tablet
platform/x86: touchscreen_dmi: Add info for the Trekstor Primetab T13B tablet
platform/x86: intel_telemetry: Get rid of custom macro
platform/x86: intel_telemetry: report debugfs failure
MAINTAINERS: intel_telemetry: Update maintainers info
platform/x86: Add LG Gram laptop special features driver
platform/x86: asus-wmi: Simplify the keyboard brightness updating process
platform/x86: touchscreen_dmi: Add info for the Trekstor Primebook C11 convertible
platform/x86: mlx-platform: Properly use mlxplat_mlxcpld_msn201x_items
MAINTAINERS: intel_pmc_core: Update MAINTAINERS
firmware: dcdbas: include linux/io.h
platform/x86: intel-wmi-thunderbolt: Add dynamic debugging
platform/x86: intel-wmi-thunderbolt: Convert to use SPDX identifier
...
Linus Torvalds [Wed, 31 Oct 2018 23:47:55 +0000 (16:47 -0700)]
Merge tag 'tag-chrome-platform-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
Pull chrome-platform updates from Benson Leung:
- Move mfd/cros_ec_lpc* includes to drivers/platform from mfd
- Add a new interrupt path for cros_ec_lpc
* tag 'tag-chrome-platform-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
platform/chrome: chromeos_tbmc - Remove unneeded const
platform/chrome: Add a new interrupt path for cros_ec_lpc
mfd: cros_ec: Fix and improve kerneldoc comments.
platform/chrome: Move mfd/cros_ec_lpc* includes to drivers/platform.
Linus Torvalds [Wed, 31 Oct 2018 23:20:28 +0000 (16:20 -0700)]
Merge tag 'riscv-for-linus-4.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull more RISC-V updates from Palmer Dabbelt:
"This contains the follow-on patches I'd like to target for the 4.20
merge window. I'm being somewhat conservative here, as while there are
a few patches on the mailing list that were posted early in the merge
window I'd like to let those bake for another round -- this was a
fairly big release as far as RISC-V is concerned, and we need to walk
before we can run.
As far as the patches that made it go:
- A patch to ignore offline CPUs when calculating AT_HWCAP. This
should fix GDB on the HiFive unleashed, which has an embedded core
for hart 0 which is exposed to Linux as an offline CPU.
- A move of EM_RISCV to elf-em.h, which is where it should have been
to begin with.
- I've also removed the 64-bit divide routines. I know I'm not really
playing by my own rules here because I posted the patches this
morning, but since they shouldn't be in the kernel I think it's
better to err on the side of going too fast here.
I don't anticipate any more patch sets for the merge window"
* tag 'riscv-for-linus-4.20-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Move EM_RISCV into elf-em.h
RISC-V: properly determine hardware caps
Revert "lib: Add umoddi3 and udivmoddi4 of GCC library routines"
Revert "RISC-V: Select GENERIC_LIB_UMODDI3 on RV32"
Linus Torvalds [Wed, 31 Oct 2018 22:46:16 +0000 (15:46 -0700)]
Merge branch 'for-linus-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- removal of old and dead code
- a bug fix for our tty driver
- other minor cleanups across the code base
* 'for-linus-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Make line/tty semantics use true write IRQ
um: trap: fix spelling mistake, EACCESS -> EACCES
um: Don't hardcode path as it is architecture dependent
um: NULL check before kfree is not needed
um: remove unused AIO code
um: Give start_idle_thread() a return code
um: Remove update_debugregs()
um: Drop own definition of PTRACE_SYSEMU/_SINGLESTEP
Linus Torvalds [Wed, 31 Oct 2018 21:50:02 +0000 (14:50 -0700)]
Merge tag 'fuse-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
"As well as the usual bug fixes, this adds the following new features:
- cached readdir and readlink
- max I/O size increased from 128k to 1M
- improved performance and scalability of request queues
- copy_file_range support
The only non-fuse bits are trivial cleanups of macros in
<linux/bitops.h>"
* tag 'fuse-update-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (31 commits)
fuse: enable caching of symlinks
fuse: only invalidate atime in direct read
fuse: don't need GETATTR after every READ
fuse: allow fine grained attr cache invaldation
bitops: protect variables in bit_clear_unless() macro
bitops: protect variables in set_mask_bits() macro
fuse: realloc page array
fuse: add max_pages to init_out
fuse: allocate page array more efficiently
fuse: reduce size of struct fuse_inode
fuse: use iversion for readdir cache verification
fuse: use mtime for readdir cache verification
fuse: add readdir cache version
fuse: allow using readdir cache
fuse: allow caching readdir
fuse: extract fuse_emit() helper
fuse: add FOPEN_CACHE_DIR
fuse: split out readdir.c
fuse: Use hash table to link processing request
fuse: kill req->intr_unique
...
Linus Torvalds [Wed, 31 Oct 2018 21:42:31 +0000 (14:42 -0700)]
Merge tag 'ceph-for-4.20-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The highlights are:
- a series that fixes some old memory allocation issues in libceph
(myself). We no longer allocate memory in places where allocation
failures cannot be handled and BUG when the allocation fails.
- support for copy_file_range() syscall (Luis Henriques). If size and
alignment conditions are met, it leverages RADOS copy-from
operation. Otherwise, a local copy is performed.
- a patch that reduces memory requirement of ceph_sync_read() from
the size of the entire read to the size of one object (Zheng Yan).
- fallocate() syscall is now restricted to FALLOC_FL_PUNCH_HOLE (Luis
Henriques)"
* tag 'ceph-for-4.20-rc1' of git://github.com/ceph/ceph-client: (25 commits)
ceph: new mount option to disable usage of copy-from op
ceph: support copy_file_range file operation
libceph: support the RADOS copy-from operation
ceph: add non-blocking parameter to ceph_try_get_caps()
libceph: check reply num_data_items in setup_request_data()
libceph: preallocate message data items
libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls
libceph: introduce alloc_watch_request()
libceph: assign cookies in linger_submit()
libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get()
ceph: num_ops is off by one in ceph_aio_retry_work()
libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op()
ceph: set timeout conditionally in __cap_delay_requeue
libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist()
libceph: introduce ceph_pagelist_alloc()
libceph: osd_req_op_cls_init() doesn't need to take opcode
libceph: bump CEPH_MSG_MAX_DATA_LEN
ceph: only allow punch hole mode in fallocate
ceph: refactor ceph_sync_read()
ceph: check if LOOKUPNAME request was aborted when filling trace
...
Palmer Dabbelt [Wed, 31 Oct 2018 19:13:54 +0000 (12:13 -0700)]
lib: Remove umoddi3 and udivmoddi4
These were only necessary for an out-of-tree driver that has since been
fixed to use the proper divide routines. I've simply reverted the pair
of commits we made last week.
Andreas Schwab [Tue, 23 Oct 2018 07:33:47 +0000 (09:33 +0200)]
RISC-V: properly determine hardware caps
On the Hifive-U platform, cpu 0 is a masked cpu with fewer capabilities
than the other cpus. Ignore it for the purpose of determining the
hardware capabilities of the system.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Linus Torvalds [Wed, 31 Oct 2018 18:41:37 +0000 (11:41 -0700)]
Merge tag 'fbdev-v4.20' of https://github.com/bzolnier/linux
Pull fbdev updates from Bartlomiej Zolnierkiewicz:
"No major changes to the subsystem itself, mainly fb drivers fixes &
cleanups (atyfb & udlfb updates stand out from the rest) + removal of
no longer needed old clps711xfb driver.
Details:
- update atyfb driver - improvements for ATI Mach64 chips: detect the
dot clock divider correctly on Sparc, fix display corruptions (due
to endianness issues and improper reading of accelerator
registers), optimize scrolling performance and also fix debugging
printks (Mikulas Patocka)
- rewrite USB unplug handling in udlfb driver using framebuffer
subsystem reference counting (Mikulas Patocka)
- fix support for native-mode display-timings in atmel_lcdfb driver
(Sam Ravnborg)
- fix information leak & add missing access_ok() checks in sbuslib
(Dan Carpenter)
- allow using GPIO expanders that can sleep in ssd1307fb driver
(Michal Vokáč)
- convert omapfb driver to use GPIO descriptors instead of GPIO
numbers for Amstrad Delta board (Janusz Krzysztofik)
- fix broken Kconfig menu dependencies (Randy Dunlap)
- convert fbdev subsystem to use %pOFn instead of device_node.name
(Rob Herring)
- remove the dead old CLPS711x LCD support driver (the new CLPS711x
LCD support driver is still available)
* tag 'fbdev-v4.20' of https://github.com/bzolnier/linux: (22 commits)
video: fbdev: remove redundant 'default n' from Kconfig-s
video: fbdev: remove dead old CLPS711x LCD support driver
Revert "video: ssd1307fb: Do not hard code active-low reset sequence"
video: fbdev: arcfb: mark expected switch fall-through
pxa168fb: remove set but not used variables 'mi'
video: ssd1307fb: Do not hard code active-low reset sequence
video: ssd1307fb: Use gpiod_set_value_cansleep() for reset
fbdev: fix broken menu dependencies
video: fbdev: sis: Remove unnecessary parentheses and commented code
video: fbdev: omapfb: lcd_ams_delta: use GPIO lookup table
fbdev: sbuslib: integer overflow in sbusfb_ioctl_helper()
fbdev: sbuslib: use checked version of put_user()
fbdev: Convert to using %pOFn instead of device_node.name
atmel_lcdfb: support native-mode display-timings
Video: vgastate: fixed a spacing coding style
atyfb: fix debugging printks
mach64: optimize wait_for_fifo
mach64: fix image corruption due to reading accelerator registers
mach64: fix display corruption on big endian machines
mach64: detect the dot clock divider correctly on sparc
...
Linus Torvalds [Wed, 31 Oct 2018 18:28:12 +0000 (11:28 -0700)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux
Pull thermal management updates from Zhang Rui:
- Fix a use-after-free issue when unregistering a thermal cooling
device (Dmitry Osipenko)
- Use power_efficient_wq for thermal worker to save more power (Jeson
Gao)
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
thermal: core: using power_efficient_wq for thermal worker
thermal: core: Fix use-after-free in thermal_cooling_device_destroy_sysfs
Linus Torvalds [Wed, 31 Oct 2018 18:08:30 +0000 (11:08 -0700)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"This time it looks like a quieter release cycle in the clk tree. I
guess that's because of summer time holidays/vacations. The biggest
change in the diffstat is in the Qualcomm clk driver, where they got
support for CPUs and a handful of SoCs. After that, the at91 driver got
a major rewrite for newer DT bindings that should make things easier
going forward and the TI code moved to a clockdomain based design.
The long tail is mostly small driver updates for newer clks and some
simpler SoC clock drivers such as the Hisilicon and imx support.
In the core framework, we only have two small changes this time.
One is a new clk API to get all clks for a device with the bulk clk
APIs. This allows drivers that don't care about doing anything besides
turning on all the clks to just clk_get() them all and turn them on.
The other change is the beginning of a way to support save and restore
of clk settings in the clk framework. TI is the only user right now,
but we will want to expand upon this design in the future to support
more save and restore of clk registers. At least this gets us started
and works well enough for one SoC, but there's more work in the
future.
Core:
- clk_bulk_get_all() API and friends to get all the clks for a device
- Basic clk state save/restore hooks
New Drivers:
- Renesas RZ/A2 (R7S9210) SoC, including early clocks
- Renesas RZ/G1N (R8A7744) and RZ/G2E (R8A774C0) SoCs
- Renesas RZ/G2M (r8a774a1) SoC
- Qualcomm Krait CPU clk support
- Qualcomm QCS404 GCC support
- Qualcomm SDM660 GCC support
- Qualcomm SDM845 camera clock controller
- Ingenic jz4725b CGU
- Hisilicon 3670 SoC support
- TI SCI clks on K3 SoCs
- iMX6 MMDC clks
- Reset Controller (RMU) support for Actions Semi Owl S900 and S700 SoCs
Updates:
- Rework at91 PMC clock driver for new DT bindings
- Nvidia Tegra clk driver MBIST workaround fix
- S2RAM support for Marvell mvebu periph clks
- Use updated printk format for OF node names
- Fix TI code to only search DT subnodes
- Various static analysis finds
- Tag various drivers with SPDX license tags
- Support dynamic frequency switching (DFS) on qcom SDM845 GCC
- Only use s2mps11 dt-binding defines instead of redefining them in the driver
- Add some more missing clks to qcom MSM8996 GCC
- Quad SPI clks on qcom SDM845
- Add support for CMT timer clocks on R-Car V3H
- Add support for SDHI and various timer clocks on R-Car V3M
- Improve OSC and RCLK (watchdog) handling on R-Car Gen3 SoCs
- Amlogic clk-pll driver improvements and updates
- Amlogic axg audio controller system clocks
- Register Amlogic meson8b clock controller early
- Add support for SATA and Fine Display Processor (FDP) clocks on R-Car M3-N
- Consolidation of system suspend related code in Exynos, S5P, S3C SoC clk drivers
- Fixes for system suspend support on Exynos542x (Odroid boards) and Exynos5433 SoC
- Remove obsoleted Exynos4212 ISP clock definitions
- Migrated TI am3/4/5 and dra7 SoCs to clockdomain based design
- TI RTC+DDR sleep mode support for clock save/restore
- Allwinner A64 display engine support and fixes
- Allwinner A83t display engine support and fixes"
- Generically select Type-1 IOMMU model support on ARM/ARM64 (Geert
Uytterhoeven)
- Quirk for VFs reporting INTx pin (Alex Williamson)
- Fix error path memory leak in MSI support (Li Qiang)
* tag 'vfio-v4.20-rc1.v2' of git://github.com/awilliam/linux-vfio:
vfio: add edid support to mbochs sample driver
vfio: add edid api for display (vgpu) devices.
drivers/vfio: Allow type-1 IOMMU instantiation with all ARM/ARM64 IOMMUs
vfio/pci: Mask buggy SR-IOV VF INTx support
vfio/pci: Fix potential memory leak in vfio_msi_cap_len
Linus Torvalds [Wed, 31 Oct 2018 17:53:29 +0000 (10:53 -0700)]
Merge tag 'media/v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull new experimental media request API from Mauro Carvalho Chehab:
"A new media request API
This API is needed to support device drivers that can dynamically
change their parameters for each new frame. The latest versions of
Google camera and codec HAL depend on such a feature.
At this stage, it supports only stateless codecs.
It has been discussed for a long time (at least over the last 3-4
years), and we finally reached something that seems to work.
This series contains both the API and core changes required to support
it and a new m2m decoder driver (cedrus).
As the current API is still experimental, the only real driver using
it (cedrus) was added to staging[1]. We intend to keep it there for a
while, in order to test the API. Only when we're sure that this API
works for other cases (like encoders) will we move this driver out of
staging and set the API in stone.
[1] We added support for the vivid virtual driver (used only for
testing) to it too, as it makes it easier to test the API for those
who don't have the cedrus hardware"
* tag 'media/v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (53 commits)
media: dt-bindings: Document the Rockchip VPU bindings
media: platform: Add Cedrus VPU decoder driver
media: dt-bindings: media: Document bindings for the Cedrus VPU driver
media: v4l: Add definition for the Sunxi tiled NV12 format
media: v4l: Add definitions for MPEG-2 slice format and metadata
media: videobuf2-core: Rework and rename helper for request buffer count
media: v4l2-ctrls.c: initialize an error return code with zero
media: v4l2-compat-ioctl32.c: add missing documentation for a field
media: media-request: update documentation
media: media-request: EPERM -> EACCES/EBUSY
media: v4l2-ctrls: improve media_request_(un)lock_for_update
media: v4l2-ctrls: use media_request_(un)lock_for_access
media: media-request: add media_request_(un)lock_for_access
media: vb2: set reqbufs/create_bufs capabilities
media: videodev2.h: add new capabilities for buffer types
media: buffer.rst: only set V4L2_BUF_FLAG_REQUEST_FD for QBUF
media: v4l2-ctrls: return -EACCES if request wasn't completed
media: media-request: return -EINVAL for invalid request_fds
media: vivid: add request support
media: vivid: add mc
...
Linus Torvalds [Wed, 31 Oct 2018 16:25:15 +0000 (09:25 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
- the rest of MM
- lib/bitmap updates
- hfs updates
- fatfs updates
- various other misc things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
mm/gup.c: fix __get_user_pages_fast() comment
mm: Fix warning in insert_pfn()
memory-hotplug.rst: add some details about locking internals
powerpc/powernv: hold device_hotplug_lock when calling memtrace_offline_pages()
powerpc/powernv: hold device_hotplug_lock when calling device_online()
mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
mm/memory_hotplug: make add_memory() take the device_hotplug_lock
mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
mm/memblock.c: warn if zero alignment was requested
memblock: stop using implicit alignment to SMP_CACHE_BYTES
docs/boot-time-mm: remove bootmem documentation
mm: remove include/linux/bootmem.h
memblock: replace BOOTMEM_ALLOC_* with MEMBLOCK variants
mm: remove nobootmem
memblock: rename __free_pages_bootmem to memblock_free_pages
memblock: rename free_all_bootmem to memblock_free_all
memblock: replace free_bootmem_late with memblock_free_late
memblock: replace free_bootmem{_node} with memblock_free
mm: nobootmem: remove bootmem allocation APIs
memblock: replace alloc_bootmem with memblock_alloc
...
Jan Kara [Tue, 30 Oct 2018 22:10:47 +0000 (15:10 -0700)]
mm: Fix warning in insert_pfn()
In DAX mode a write pagefault can race with write(2) in the following
way:
CPU0                                    CPU1
                                        write fault for mapped zero page (hole)
dax_iomap_rw()
  iomap_apply()
    xfs_file_iomap_begin()
      - allocates blocks
    dax_iomap_actor()
      invalidate_inode_pages2_range()
        - invalidates radix tree entries in given range
                                        dax_iomap_pte_fault()
                                          grab_mapping_entry()
                                            - no entry found, creates empty
                                          ...
                                          xfs_file_iomap_begin()
                                            - finds already allocated block
                                          ...
                                          vmf_insert_mixed_mkwrite()
                                            - WARNs and does nothing because there
                                              is still zero page mapped in PTE
        unmap_mapping_pages()
This race results in a WARN_ON from insert_pfn() and is occasionally
triggered by fstest generic/344. Note that the race is otherwise
harmless: before write(2) on CPU0 finishes, we invalidate the page
tables properly, so users of mmap will see the modified data from
write(2) from that point on. So just restrict the warning to the
case when the PFN in the PTE is not the zero page.
Link: http://lkml.kernel.org/r/20180824154542.26872-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memory-hotplug.rst: add some details about locking internals
Let's document the magic a bit, especially why device_hotplug_lock is
required when adding/removing memory and how it all plays together with
requests to online/offline memory from user space.
Link: http://lkml.kernel.org/r/20180925091457.28651-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
powerpc/powernv: hold device_hotplug_lock when calling memtrace_offline_pages()
Let's perform all checking + offlining + removing under
device_hotplug_lock, so nobody can mess with these devices via sysfs
concurrently.
[david@redhat.com: take device_hotplug_lock outside of loop]
Link: http://lkml.kernel.org/r/20180927092554.13567-6-david@redhat.com
Link: http://lkml.kernel.org/r/20180925091457.28651-6-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
powerpc/powernv: hold device_hotplug_lock when calling device_online()
device_online() should be called with device_hotplug_lock held.
Link: http://lkml.kernel.org/r/20180925091457.28651-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Rashmica Gupta <rashmica.g@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
There seem to be some problems as a result of 30467e0b3be ("mm, hotplug:
fix concurrent memory hot-add deadlock"), which tried to fix a possible
lock inversion reported and discussed in [1] due to the two locks
a) device_lock()
b) mem_hotplug_lock
While add_memory() first takes b), followed by a) during
bus_probe_device(), onlining of memory from user space first took a),
followed by b), exposing a possible deadlock.
In [1], it was decided not to make use of device_hotplug_lock, but
rather to enforce a locking order.
The problems I spotted related to this:
1. Memory block device attributes: While .state first calls
mem_hotplug_begin() and then calls device_online() - which takes
device_lock() - .online no longer calls mem_hotplug_begin(), so it
effectively calls online_pages() without mem_hotplug_lock.
2. device_online() should be called under device_hotplug_lock, however
onlining memory during add_memory() does not take care of that.
In addition, I think there is also something wrong about the locking in
3. arch/powerpc/platforms/powernv/memtrace.c calls offline_pages()
without locks. This was introduced after 30467e0b3be. And skimming over
the code, I assume it could need some more care in regards to locking
(e.g. device_online() called without device_hotplug_lock). This will
be addressed in the following patches.
Now that we hold the device_hotplug_lock when
- adding memory (e.g. via add_memory()/add_memory_resource())
- removing memory (e.g. via remove_memory())
- device_online()/device_offline()
We can move mem_hotplug_lock usage back into
online_pages()/offline_pages().
Why is mem_hotplug_lock still needed? Essentially to make
get_online_mems()/put_online_mems() be very fast (relying on
device_hotplug_lock would be very slow), and to serialize against
addition of memory that does not create memory block devices (hmm).
This patch is partly based on a patch by Vitaly Kuznetsov.
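The resulting lock nesting can be sketched in user space as follows. This is only an illustration under stated assumptions: pthread mutexes stand in for the kernel locks (in the kernel, mem_hotplug_lock is not a plain mutex), and online_pages_sketch(), device_online_sketch() and the present_pages counter are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <pthread.h>

/* User-space stand-ins for the two kernel locks named in the text. */
static pthread_mutex_t device_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mem_hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical state that only the inner lock protects. */
static long present_pages;

/* Inner operation: takes mem_hotplug_lock itself, as online_pages() would. */
static long online_pages_sketch(long nr_pages)
{
    long total;

    assert(nr_pages >= 0);
    pthread_mutex_lock(&mem_hotplug_lock);
    present_pages += nr_pages;
    total = present_pages;
    pthread_mutex_unlock(&mem_hotplug_lock);
    return total;
}

/* Outer path: always device_hotplug_lock first, then the inner lock. */
long device_online_sketch(long nr_pages)
{
    long total;

    pthread_mutex_lock(&device_hotplug_lock);
    total = online_pages_sketch(nr_pages);
    pthread_mutex_unlock(&device_hotplug_lock);
    return total;
}
```

Because every path acquires the outer lock before the inner one, no two callers can take them in opposite order, which is the property the lock inversion violated.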
Link: http://lkml.kernel.org/r/20180925091457.28651-4-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memory_hotplug: make add_memory() take the device_hotplug_lock
add_memory() currently does not take the device_hotplug_lock; however, it
is already called under the lock from
arch/powerpc/platforms/pseries/hotplug-memory.c
drivers/acpi/acpi_memhotplug.c
to synchronize against CPU hot-remove and similar.
In general, we should hold the device_hotplug_lock when adding memory to
synchronize against online/offline request (e.g. from user space) - which
already resulted in lock inversions due to device_lock() and
mem_hotplug_lock - see 30467e0b3be ("mm, hotplug: fix concurrent memory
hot-add deadlock"). add_memory()/add_memory_resource() will create memory
block devices, so this really feels like the right thing to do.
Holding the device_hotplug_lock makes sure that a memory block device
can really only be accessed (e.g. via .online/.state) from user space,
once the memory has been fully added to the system.
The lock is not held yet in
drivers/xen/balloon.c
arch/powerpc/platforms/powernv/memtrace.c
drivers/s390/char/sclp_cmd.c
drivers/hv/hv_balloon.c
So, let's either use the locked variants or take the lock.
Don't export add_memory_resource(), as it once was exported to be used by
XEN, which is never built as a module. If somebody requires it, we also
have to export a locked variant (as device_hotplug_lock is never
exported).
Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mathieu Malaterre <malat@debian.org> Cc: Pavel Tatashin <pavel.tatashin@microsoft.com> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memory_hotplug: make remove_memory() take the device_hotplug_lock
Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3.
Reading through the code and studying how mem_hotplug_lock is to be used,
I noticed that there are two places where we can end up calling
device_online()/device_offline() - online_pages()/offline_pages() without
the mem_hotplug_lock. And there are other places where we call
device_online()/device_offline() without the device_hotplug_lock.
While e.g.
echo "online" > /sys/devices/system/memory/memory9/state
is fine, e.g.
echo 1 > /sys/devices/system/memory/memory9/online
will not take the mem_hotplug_lock; it takes only the device_lock() and
device_hotplug_lock.
E.g. via memory_probe_store(), we can end up calling
add_memory()->online_pages() without the device_hotplug_lock. So we can
have concurrent callers in online_pages(). In that case, online_pages()
touches zone->present_pages essentially unprotected.
Looks like there is a longer history to that (see Patch #2 for details),
and fixing it to work the way it was intended is not really possible. We
would e.g. have to take the mem_hotplug_lock in drivers/base/core.c, which
sounds wrong.
Summary: We had a lock inversion on mem_hotplug_lock and device_lock().
More details can be found in patch 3 and patch 6.
I propose the general rules (documentation added in patch 6):
1. add_memory/add_memory_resource() must only be called with
device_hotplug_lock.
2. remove_memory() must only be called with device_hotplug_lock. This is
already documented and holds for all callers.
3. device_online()/device_offline() must only be called with
device_hotplug_lock. This is already documented and true for now in core
code. Other callers (related to memory hotplug) have to be fixed up.
4. mem_hotplug_lock is taken inside of add_memory/remove_memory/
online_pages/offline_pages.
To me, this looks way cleaner than what we have right now (and easier to
verify). And looking at the documentation of remove_memory, using
lock_device_hotplug also for add_memory() feels natural.
This patch (of 6):
remove_memory() is exported right now but requires the
device_hotplug_lock, which is not exported. So let's provide a variant
that takes the lock and only export that one.
The lock is already held in
arch/powerpc/platforms/pseries/hotplug-memory.c
drivers/acpi/acpi_memhotplug.c
arch/powerpc/platforms/powernv/memtrace.c
Apart from that, there are no other users in the tree.
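The "variant that takes the lock" pattern can be sketched as below. This is a minimal user-space illustration, not the kernel code: the mutex, the blocks_present counter, and the names remove_block_locked() and remove_block() are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for a lock that is private to the core and not exported. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical resource count guarded by hotplug_lock. */
static int blocks_present = 3;

/* Internal variant: the caller must already hold hotplug_lock. */
static int remove_block_locked(void)
{
    assert(blocks_present >= 0);
    if (blocks_present == 0)
        return -1; /* nothing left to remove */
    blocks_present--;
    return 0;
}

/* Public variant: takes the lock itself, so outside users never need it. */
int remove_block(void)
{
    int ret;

    pthread_mutex_lock(&hotplug_lock);
    ret = remove_block_locked();
    pthread_mutex_unlock(&hotplug_lock);
    return ret;
}
```

Exposing only the self-locking variant means code that cannot see the lock has no way to call the operation without it.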
Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Rashmica Gupta <rashmica.g@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com> Cc: Mathieu Malaterre <malat@debian.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Juergen Gross <jgross@suse.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Tue, 30 Oct 2018 22:10:01 +0000 (15:10 -0700)]
mm/memblock.c: warn if zero alignment was requested
After updating all memblock users to explicitly specify SMP_CACHE_BYTES
alignment rather than use 0, it is still possible that uncovered users may
sneak in. Add a WARN_ON_ONCE for such cases.
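A user-space sketch of a warn-once check for zero alignment follows. It assumes that the allocator still falls back to the cache-line alignment after warning; CACHE_LINE_BYTES (a stand-in for SMP_CACHE_BYTES with an assumed value of 64), effective_align() and the power-of-two precondition are hypothetical choices for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CACHE_LINE_BYTES 64 /* assumed stand-in for SMP_CACHE_BYTES */

/* Set after the first warning so later zero-align calls stay quiet. */
static int zero_align_warned;

/* Return the alignment to use; warn once if the caller passed 0. */
size_t effective_align(size_t align)
{
    /* This sketch expects 0 or a power of two. */
    assert((align & (align - 1)) == 0);

    if (align == 0) {
        if (!zero_align_warned) {
            fprintf(stderr, "warning: zero alignment requested\n");
            zero_align_warned = 1;
        }
        return CACHE_LINE_BYTES;
    }
    return align;
}
```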
[sfr@canb.auug.org.au: use dump_stack() instead of WARN_ON_ONCE for the alignment checks] Link: http://lkml.kernel.org/r/20181016131927.6ceba6ab@canb.auug.org.au
[akpm@linux-foundation.org: add apologetic comment] Link: http://lkml.kernel.org/r/20181011060850.GA19822@rapoport-lnx Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Tue, 30 Oct 2018 22:09:57 +0000 (15:09 -0700)]
memblock: stop using implicit alignment to SMP_CACHE_BYTES
When memblock allocation APIs are called with align = 0, the alignment
is implicitly set to SMP_CACHE_BYTES.
Implicit alignment is done deep in the memblock allocator and it can
come as a surprise. Not that such an alignment would be wrong even
when used incorrectly but it is better to be explicit for the sake of
clarity and the principle of least surprise.
Replace all such uses of memblock APIs with the 'align' parameter
explicitly set to SMP_CACHE_BYTES and stop implicit alignment assignment
in the memblock internal allocation functions.
For the case when memblock APIs are used via helper functions, e.g. like
iommu_arena_new_node() in Alpha, the helper functions were detected with
Coccinelle's help and then manually examined and updated where
appropriate.
The direct memblock APIs users were updated using the semantic patch below:
Mike Rapoport [Tue, 30 Oct 2018 22:09:21 +0000 (15:09 -0700)]
memblock: replace free_bootmem{_node} with memblock_free
The free_bootmem and free_bootmem_node are merely wrappers for
memblock_free. Replace their usage with a call to memblock_free using the
following semantic patch:
Mike Rapoport [Tue, 30 Oct 2018 22:09:09 +0000 (15:09 -0700)]
memblock: replace alloc_bootmem with memblock_alloc
The alloc_bootmem(size) is a shortcut for allocation of SMP_CACHE_BYTES
aligned memory. When the align parameter of memblock_alloc() is 0, the
alignment is implicitly set to SMP_CACHE_BYTES and thus alloc_bootmem(size)
and memblock_alloc(size, 0) are equivalent.
The conversion is done using the following semantic patch:
Mike Rapoport [Tue, 30 Oct 2018 22:08:54 +0000 (15:08 -0700)]
memblock: replace alloc_bootmem_low_pages with memblock_alloc_low
The alloc_bootmem_low_pages() function allocates PAGE_SIZE aligned regions
from low memory. memblock_alloc_low() with alignment set to PAGE_SIZE does
exactly the same thing.
The conversion is done using the following semantic patch:
Mike Rapoport [Tue, 30 Oct 2018 22:08:49 +0000 (15:08 -0700)]
memblock: replace alloc_bootmem_node with memblock_alloc_node
Both functions attempt to allocate memory with specified alignment from a
particular node. If the allocation from that node fails, they both fall
back to allocating from any node in the system.
Usage of native memblock API eliminates the nobootmem translation layer.
Link: http://lkml.kernel.org/r/1536927045-23536-18-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Tue, 30 Oct 2018 22:08:36 +0000 (15:08 -0700)]
memblock: add align parameter to memblock_alloc_node()
With the align parameter memblock_alloc_node() can be used as drop in
replacement for alloc_bootmem_pages_node() and __alloc_bootmem_node(),
which is done in the following patches.
Link: http://lkml.kernel.org/r/1536927045-23536-15-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Tue, 30 Oct 2018 22:08:31 +0000 (15:08 -0700)]
memblock: replace __alloc_bootmem_nopanic with memblock_alloc_from_nopanic
When __alloc_bootmem_nopanic() is used with explicit lower limit for the
allocation it attempts to allocate memory at or above that limit and falls
back to allocation with no limit set.
The memblock_alloc_from_nopanic() does exactly the same thing and can be
used as a replacement for __alloc_bootmem_nopanic() in such cases.
Link: http://lkml.kernel.org/r/1536927045-23536-14-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Tue, 30 Oct 2018 22:08:22 +0000 (15:08 -0700)]
memblock: replace alloc_bootmem_pages_nopanic with memblock_alloc_nopanic
The alloc_bootmem_pages_nopanic(size) is a shortcut for
__alloc_bootmem_nopanic(size, PAGE_SIZE, BOOTMEM_LOW_LIMIT) which allocates
PAGE_SIZE aligned memory. Since BOOTMEM_LOW_LIMIT is hardwired to 0 there
are no restrictions on where the allocated memory should reside.
The memblock_alloc_nopanic(size, PAGE_SIZE) also allocates PAGE_SIZE
aligned memory without any restrictions and thus can be used as a
replacement for alloc_bootmem_pages_nopanic().
Link: http://lkml.kernel.org/r/1536927045-23536-12-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Tue, 30 Oct 2018 22:08:18 +0000 (15:08 -0700)]
memblock: replace __alloc_bootmem_node_nopanic with memblock_alloc_try_nid_nopanic
The __alloc_bootmem_node_nopanic() attempts to allocate memory for a
specified node. If the allocation fails it then retries to allocate memory
from any node. Upon success, the allocated memory is set to 0.
The memblock_alloc_try_nid_nopanic() does exactly the same thing and can be
used instead.
Link: http://lkml.kernel.org/r/1536927045-23536-11-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ley Foon Tan <lftan@altera.com> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Paul Burton <paul.burton@mips.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Serge Semin <fancer.lancer@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Tue, 30 Oct 2018 22:07:32 +0000 (15:07 -0700)]
kbuild: fix kernel/bounds.c 'W=1' warning
Building any configuration with 'make W=1' produces a warning:
kernel/bounds.c:16:6: warning: no previous prototype for 'foo' [-Wmissing-prototypes]
When also passing -Werror, this prevents us from building any other files.
Nobody ever calls the function, but we can't make it 'static' either
since we want the compiler output.
Calling it 'main' instead however avoids the warning, because gcc
does not insist on having a declaration for main.
Gao Xiang [Tue, 30 Oct 2018 22:07:28 +0000 (15:07 -0700)]
lib/lz4: update LZ4 decompressor module
Update the LZ4 compression module based on LZ4 v1.8.3 so that the
erofs file system can use the newest LZ4_decompress_safe_partial(), which
can now decode exactly the number of bytes requested [1], in place of the
open-coded hack in the erofs file system itself.
Currently, apart from the erofs file system, nothing else uses
LZ4_decompress_safe_partial(), so the interface is not a concern.
In addition, LZ4 v1.8.x boosts up decompression speed compared to the
current code which is based on LZ4 v1.7.3, mainly due to shortcut
optimization for the specific common LZ4-sequences [2].
lzbench testdata (tested on kirin710, 8 cores, 4 big cores
at 2189 MHz, 2GB DDR RAM at 1622 MHz, with enwik8 testdata [3]):
Waiman Long [Tue, 30 Oct 2018 22:07:24 +0000 (15:07 -0700)]
ipc: IPCMNI limit check for semmni
For SysV semaphores, the semmni value is the last part of the 4-element
sem number array. To make semmni behave in a similar way to msgmni and
shmmni, we can't directly use the _minmax handler. Instead, a special sem
specific handler is added to check the last argument to make sure that it
is limited to the [0, IPCMNI] range. An error will be returned if this is
not the case.
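The check described above might look roughly like the following sketch. The function name check_sem_array(), the constant IPCMNI_SKETCH and the choice of -EINVAL as the error value are assumptions for illustration; the text only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define IPCMNI_SKETCH 32768 /* 32k, the limit named in the text */

/*
 * The sem sysctl is a 4-element array; semmni is the last element.
 * Only that element is range-checked here.
 */
int check_sem_array(const int sem[4])
{
    assert(sem != NULL);
    if (sem[3] < 0 || sem[3] > IPCMNI_SKETCH)
        return -EINVAL; /* assumed error code */
    return 0;
}
```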
Link: http://lkml.kernel.org/r/1536352137-12003-3-git-send-email-longman@redhat.com Signed-off-by: Waiman Long <longman@redhat.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Waiman Long [Tue, 30 Oct 2018 22:07:20 +0000 (15:07 -0700)]
ipc: IPCMNI limit check for msgmni and shmmni
Patch series "ipc: IPCMNI limit check for *mni & increase that limit", v9.
The sysctl parameters msgmni, shmmni and semmni have an inherent limit of
IPCMNI (32k). However, users may not be aware of that because they can
write a value much higher than that without getting any error or
notification. Reading the parameters back will show the newly written
values which are not real.
The real IPCMNI limit is now enforced to make sure that users won't put in
an unrealistic value. The first 2 patches enforce the limits.
There are also users out there requesting increase in the IPCMNI value.
The last 2 patches attempt to do that by using a boot kernel parameter
"ipcmni_extend" to increase the IPCMNI limit from 32k to 8M if the users
really want the extended value.
This patch (of 4):
A user can write arbitrary integer values to msgmni and shmmni sysctl
parameters without getting an error, but the actual limit is really IPCMNI
(32k). This can mislead users as they think they can get a value that is
not real.
The right limits are now set for msgmni and shmmni so that the users will
become aware if they set a value outside of the acceptable range.
Link: http://lkml.kernel.org/r/1536352137-12003-2-git-send-email-longman@redhat.com Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Cc: Kees Cook <keescook@chromium.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Borislav Petkov [Tue, 30 Oct 2018 22:07:17 +0000 (15:07 -0700)]
kernel/panic.c: filter out a potential trailing newline
If a call to panic() terminates the string with a \n, the result puts the
closing brace ']---' on a newline because panic() itself adds \n too.
Now, if one goes and removes the newline chars from all panic()
invocations - and the stats right now look like this:
~300 calls with a \n
~500 calls without a \n
one is destined to a neverending game of whack-a-mole because the usual
thing to do is add a newline at the end of a string a function is supposed
to print.
Therefore, simply zap any \n at the end of the panic string to avoid
touching so many places in the kernel.
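Zapping a trailing newline is a small in-place string operation. A minimal user-space sketch follows; the helper name strip_trailing_newline() is hypothetical and this is not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove one trailing '\n' from buf in place; return the new length. */
size_t strip_trailing_newline(char *buf)
{
    size_t len;

    assert(buf != NULL);
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';
    return len;
}
```

Messages with and without the newline then produce the same output, so none of the existing callers has to change.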
Frank Sorenson [Tue, 30 Oct 2018 22:06:53 +0000 (15:06 -0700)]
fat: add functions to update and truncate timestamps appropriately
Add the fat-specific inode_operation ->update_time() and
fat_truncate_time() function to truncate the inode timestamps from 1
nanosecond to the appropriate granularity.
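Truncating a timestamp to a coarser granularity can be sketched as below. The helper name truncate_to_seconds() is hypothetical, and the 2-second granularity used in the usage note is an assumption about FAT modification times that the text above does not state.

```c
#include <assert.h>
#include <time.h>

/*
 * Round a timestamp down to a multiple of granularity_sec seconds and
 * drop the sub-second part. Assumes a non-negative tv_sec.
 */
struct timespec truncate_to_seconds(struct timespec ts, long granularity_sec)
{
    assert(granularity_sec > 0);
    assert(ts.tv_sec >= 0);
    ts.tv_sec -= ts.tv_sec % granularity_sec;
    ts.tv_nsec = 0;
    return ts;
}
```

Called with a granularity of 2, a timestamp of 1001.5 seconds becomes 1000 seconds, which is the value that would be read back after a remount if the on-disk field only has 2-second resolution.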
Frank Sorenson [Tue, 30 Oct 2018 22:06:50 +0000 (15:06 -0700)]
fat: create a function to calculate the timezone offset
Patch series "fat: timestamp updates", v5.
fat/msdos timestamps are stored on-disk with several different
granularities, some of them lower resolution than timespec64_trunc() can
provide. In addition, they are only truncated as they are written to
disk, so the timestamps in-memory for new or modified files/directories
may be different from the same timestamps after a remount, as the
now-truncated times are re-read from the on-disk format.
These patches allow finer granularity for the timestamps where possible
and add fat-specific ->update_time inode operation and fat_truncate_time
functions to truncate each timestamp correctly, giving consistent times
across remounts.
This patch (of 4):
Move the calculation of the number of seconds in the timezone offset to a
common function.
Mihir Mehta [Tue, 30 Oct 2018 22:06:46 +0000 (15:06 -0700)]
fat: expand a slightly out-of-date comment
The file namei.c seems to have been renamed to namei_msdos.c, so I decided
to update the comment with the correct name, and expand it a bit to tell
the reader what to look for.
Jann Horn [Tue, 30 Oct 2018 22:06:38 +0000 (15:06 -0700)]
reiserfs: propagate errors from fill_with_dentries() properly
fill_with_dentries() failed to propagate errors up to
reiserfs_for_each_xattr() properly. Plumb them through.
Note that reiserfs_for_each_xattr() is only used by
reiserfs_delete_xattrs() and reiserfs_chown_xattrs(). The result of
reiserfs_delete_xattrs() is discarded anyway, the only difference there is
whether a warning is printed to dmesg. The result of
reiserfs_chown_xattrs() does matter because it can block chowning of the
file to which the xattrs belong; but either way, the resulting state can
have misaligned ownership, so my patch doesn't improve things greatly.
Credit for making me look at this code goes to Al Viro, who pointed out
that the ->actor calling convention is suboptimal and should be changed.
Link: http://lkml.kernel.org/r/20180802163335.83312-1-jannh@google.com Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Colin Ian King [Tue, 30 Oct 2018 22:06:35 +0000 (15:06 -0700)]
fs/hfs/extent.c: fix array out of bounds read of array extent
Currently extent and index i are both being incremented causing an array
out of bounds read on extent[i]. Fix this by removing the extraneous
increment of extent.
Ernesto said:
: This is only triggered when deleting a file with a resource fork. I
: may be wrong because the documentation isn't clear, but I don't think
: you can create those under linux. So I guess nobody was testing them.
:
: > A disk space leak, perhaps?
:
: That's what it looks like in general. hfs_free_extents() won't do
: anything if the block count doesn't add up, and the error will be
: ignored. Now, if the block count randomly does add up, we could see
: some corruption.
Detected by CoverityScan, CID#711541 ("Out of bounds read")
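The double increment described above can be illustrated with a small sketch. The struct extent_sketch type and total_blocks() are hypothetical and only mirror the shape of the loop, not the hfs code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent record: a start block and a block count. */
struct extent_sketch {
    unsigned short block;
    unsigned short count;
};

/*
 * Sum the block counts of n extents.
 *
 * The buggy shape would be "for (i = 0; i < n; extent++, i++)" combined
 * with extent[i]: pointer and index both advance, so iteration i reads
 * element 2*i of the original array and runs past the end. Here only
 * the index advances.
 */
unsigned int total_blocks(const struct extent_sketch *extent, size_t n)
{
    unsigned int total = 0;
    size_t i;

    assert(extent != NULL);
    for (i = 0; i < n; i++)
        total += extent[i].count;
    return total;
}
```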
Link: http://lkml.kernel.org/r/20180831140538.31566-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Vyacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Direct writes to empty inodes fail with EIO. The generic direct-io code
is in part to blame (a patch has been submitted as "direct-io: allow
direct writes to empty inodes"), but hfs is worse affected than the other
filesystems because the fallback to buffered I/O doesn't happen.
The problem is the return value of hfs_get_block() when called with
!create. Change it to be more consistent with the other modules.
Direct writes to empty inodes fail with EIO. The generic direct-io code
is in part to blame (a patch has been submitted as "direct-io: allow
direct writes to empty inodes"), but hfsplus is worse affected than the
other filesystems because the fallback to buffered I/O doesn't happen.
The problem is the return value of hfsplus_get_block() when called with
!create. Change it to be more consistent with the other modules.
Inserting a new record in a btree may require splitting several of its
nodes. If we hit ENOSPC halfway through, the new nodes will be left
orphaned and their records will be lost. This could mean lost inodes or
extents.
Henceforth, check the available disk space before making any changes.
This still leaves the potential problem of corruption on ENOMEM.
There is no need to reserve space before deleting a catalog record, as we
do for hfsplus. This difference is because hfs index nodes have fixed
length keys.
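
The "check before changing anything" idea can be sketched as follows. This is a toy model, not the hfs implementation: `toy_btree`, `toy_worst_case_nodes()` and `toy_insert()` are hypothetical names, and the worst-case bound (one split per level plus a new root) is an assumption made for the example.

```c
#include <errno.h>

struct toy_btree {
    unsigned int depth;      /* levels from root to leaf */
    unsigned int free_nodes; /* nodes that can still be allocated */
};

/* Assumed worst case: each level on the path splits once, plus one new root. */
static unsigned int toy_worst_case_nodes(const struct toy_btree *tree)
{
    return tree->depth + 1;
}

/*
 * Insert that needs `splits` new nodes (splits <= worst case).
 * The space check happens before any state is touched, so a failure
 * leaves the tree exactly as it was.
 */
static int toy_insert(struct toy_btree *tree, unsigned int splits)
{
    if (tree->free_nodes < toy_worst_case_nodes(tree))
        return -ENOSPC;

    tree->free_nodes -= splits;
    if (splits == toy_worst_case_nodes(tree))
        tree->depth++; /* the root itself was split */
    return 0;
}
```

Because the reservation is checked up front, ENOSPC can no longer surface after some nodes have already been split.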
Inserting or deleting a record in a btree may require splitting several of
its nodes. If we hit ENOSPC halfway through, the new nodes will be left
orphaned and their records will be lost. This could mean lost inodes,
extents or xattrs.
Henceforth, check the available disk space before making any changes.
This still leaves the potential problem of corruption on ENOMEM.
The patch can be tested with xfstests generic/027.
hfs_brec_update_parent() may hit BUG_ON() if the first record of both a
leaf node and its parent are changed, and if this forces the parent to
be split. It is not possible for this to happen on a valid hfs
filesystem because the index nodes have fixed length keys.
For reasons unknown to me, the hfs module does have support for a number of
hfsplus features. A corrupt btree header may report variable length
keys and trigger this BUG, so it's better to fix it.
This bug is triggered whenever hfs_brec_update_parent() needs to split
the root node. The height of the btree is not increased, which leaves
the new node orphaned and its records lost. It is not possible for this
to happen on a valid hfs filesystem because the index nodes have fixed
length keys.
For reasons unknown to me, the hfs module does have support for a number of
hfsplus features. A corrupt btree header may report variable length
keys and trigger this bug, so it's better to fix it.
Creating, renaming or deleting a file may hit BUG_ON() if the first
record of both a leaf node and its parent are changed, and if this
forces the parent to be split. This bug is triggered by xfstests
generic/027, somewhat rarely; here is a more reliable reproducer:
    truncate -s 50M fs.iso
    mkfs.hfsplus fs.iso
    mount fs.iso /mnt

    i=1000
    while [ $i -le 2400 ]; do
        touch /mnt/$i &>/dev/null
        ((++i))
    done

    i=2400
    while [ $i -ge 1000 ]; do
        mv /mnt/$i /mnt/$(perl -e "print $i x61") &>/dev/null
        ((--i))
    done
The issue is that a newly created bnode is being put twice. Reset
new_node to NULL in hfs_brec_update_parent() before reaching goto again.
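
A minimal sketch of the double put, assuming a simplified retry loop: `toy_bnode`, `toy_bnode_put()` and `toy_update_parent()` are hypothetical stand-ins and do not mirror the real function's control flow. The first pass creates and puts a node. If the pointer is not cleared, the second pass puts the same node again.

```c
#include <stddef.h>

struct toy_bnode {
    int refs;
};

static void toy_bnode_put(struct toy_bnode *node)
{
    node->refs--;
}

/*
 * Runs a two-pass "goto again" loop. The first pass creates a node
 * (one reference) and puts it; the second pass creates nothing.
 * Returns the node's final reference count: 0 is balanced, -1 means
 * the same node was put twice.
 */
static int toy_update_parent(struct toy_bnode *fresh, int reset_new_node)
{
    struct toy_bnode *new_node = NULL;
    int pass = 0;

again:
    if (pass == 0) {
        fresh->refs = 1; /* pretend a split allocated this node */
        new_node = fresh;
    }
    if (new_node) {
        toy_bnode_put(new_node);
        if (reset_new_node)
            new_node = NULL; /* the fix: forget the node once it is put */
    }
    if (++pass < 2)
        goto again;
    return fresh->refs;
}
```

With `reset_new_node` set to 0 the count ends at -1, and with it set to 1 the count stays balanced at 0.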
Creating, renaming or deleting a file may cause catalog corruption and
data loss. This bug is randomly triggered by xfstests generic/027, but
here is a faster reproducer:
    truncate -s 50M fs.iso
    mkfs.hfsplus fs.iso
    mount fs.iso /mnt

    i=100
    while [ $i -le 150 ]; do
        touch /mnt/$i &>/dev/null
        ((++i))
    done

    i=100
    while [ $i -le 150 ]; do
        mv /mnt/$i /mnt/$(perl -e "print $i x82") &>/dev/null
        ((++i))
    done

    umount /mnt
    fsck.hfsplus -n fs.iso
The bug is triggered whenever hfs_brec_update_parent() needs to split the
root node. The height of the btree is not increased, which leaves the new
node orphaned and its records lost.
Nikolaus Voss [Tue, 30 Oct 2018 22:05:57 +0000 (15:05 -0700)]
init/do_mounts.c: add root=PARTLABEL=<name> support
Support referencing the root partition label from GPT as argument
to the root= option on the kernel command line in analogy to
referencing the partition uuid as root=PARTUUID=<uuid>.
Specifying the partition label instead of the uuid is often much
easier, e.g. in embedded environments when there is an A/B rootfs
partition scheme for interruptible firmware updates (i.e.
rootfsA/rootfsB).
The partition label can be queried with the blkid command.
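
To illustrate how a root= value might be told apart by its prefix, here is a small userspace sketch. It is not the init/do_mounts.c code. `toy_classify_root()` and the enum are hypothetical, and the sketch only splits the string; it does not look up any partition.

```c
#include <string.h>

enum toy_root_kind { TOY_ROOT_OTHER, TOY_ROOT_PARTUUID, TOY_ROOT_PARTLABEL };

/* Classifies the text after "root=" and points *value at the part after the prefix. */
static enum toy_root_kind toy_classify_root(const char *spec, const char **value)
{
    static const char uuid_prefix[] = "PARTUUID=";
    static const char label_prefix[] = "PARTLABEL=";

    if (strncmp(spec, uuid_prefix, sizeof(uuid_prefix) - 1) == 0) {
        *value = spec + sizeof(uuid_prefix) - 1;
        return TOY_ROOT_PARTUUID;
    }
    if (strncmp(spec, label_prefix, sizeof(label_prefix) - 1) == 0) {
        *value = spec + sizeof(label_prefix) - 1;
        return TOY_ROOT_PARTLABEL;
    }
    *value = spec; /* e.g. a device path; left untouched */
    return TOY_ROOT_OTHER;
}
```

On the kernel command line, an A/B setup like the one described would then pass something such as root=PARTLABEL=rootfsA.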
Christophe Leroy [Tue, 30 Oct 2018 22:05:53 +0000 (15:05 -0700)]
checkpatch: remove GCC_BINARY_CONSTANT warning
This warning was there to avoid the use of 0bxxx values, as they are not
supported by gcc prior to v4.3.
Since commit cafa0010cd51f ("Raise the minimum required gcc version to
4.6"), this is no longer an issue, and using such values can improve the
readability of code.
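
As a small example of the readability argument, the sketch below defines a few bit masks with binary literals. The register layout and all `TOY_` names are hypothetical. Binary literals are a GCC/Clang extension in older C dialects and are standard as of C23.

```c
/* Hypothetical control-register bits, written so the bit positions are visible. */
#define TOY_CTRL_ENABLE 0b00000001
#define TOY_CTRL_IRQ    0b00000100
#define TOY_CTRL_MODE   0b00110000 /* two-bit field at bits 4-5 */

/* Builds a register value with ENABLE and IRQ set and the given mode. */
static unsigned int toy_ctrl_value(unsigned int mode)
{
    return TOY_CTRL_ENABLE | TOY_CTRL_IRQ | ((mode << 4) & TOY_CTRL_MODE);
}
```

The same masks in hex (0x01, 0x04, 0x30) are equivalent; the binary form just makes the field widths easier to see.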
Joe said:
: Seems sensible as the other compilers also support binary literals from
: relatively old versions.
: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3472.pdf
: https://software.intel.com/en-us/articles/c14-features-supported-by-intel-c-compiler