Damien Le Moal [Fri, 12 Oct 2018 10:08:50 +0000 (19:08 +0900)]
block: Introduce blk_revalidate_disk_zones()
Drivers exposing zoned block devices have to initialize and maintain
correctness (i.e. revalidate) of the device zone bitmaps attached to
the device request queue (seq_zones_bitmap and seq_zones_wlock).
To simplify coding this, introduce a generic helper function
blk_revalidate_disk_zones() suitable for most (and likely all) cases.
This new function always updates the seq_zones_bitmap and seq_zones_wlock
bitmaps as well as the queue nr_zones field when called for a disk
using a request based queue. For a disk using a BIO based queue, only
the number of zones is updated since these queues do not have
schedulers and so do not need the zone bitmaps.
With this change, the zone bitmap initialization code in sd_zbc.c can be
replaced with a call to this function in sd_zbc_read_zones(), which is
called from the disk revalidate block operation method.
A call to blk_revalidate_disk_zones() is also added to the null_blk
driver for devices created with the zoned mode enabled.
Finally, to ensure that zoned devices created with dm-linear or
dm-flakey expose the correct number of zones through sysfs, a call to
blk_revalidate_disk_zones() is added to dm_table_set_restrictions().
The zone bitmaps allocated and initialized with
blk_revalidate_disk_zones() are freed automatically from
__blk_release_queue() using the block internal function
blk_queue_free_zone_bitmaps().
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
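The following userspace sketch illustrates the kind of per-zone bitmap
described above. It assumes that seq_zones_bitmap holds one bit per zone,
set when the zone is sequential; that layout is an assumption made for
illustration and is not stated in the commit message. The names
build_seq_zones_bitmap() and zone_bit_is_set() are hypothetical and are not
kernel functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Allocate a zeroed bitmap with one bit per zone and set the bit of
 * every zone flagged as sequential (assumed layout, see lead-in).
 * Returns NULL on allocation failure; the caller frees the bitmap.
 */
unsigned long *build_seq_zones_bitmap(const bool *zone_is_seq,
				      unsigned int nr_zones)
{
	size_t words = (nr_zones + BITS_PER_WORD - 1) / BITS_PER_WORD;
	unsigned long *bitmap;
	unsigned int i;

	assert(zone_is_seq != NULL || nr_zones == 0);

	/* Always allocate at least one word so the result is usable. */
	bitmap = calloc(words ? words : 1, sizeof(*bitmap));
	if (!bitmap)
		return NULL;

	for (i = 0; i < nr_zones; i++)
		if (zone_is_seq[i])
			bitmap[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);

	return bitmap;
}

/* Test the bit of a single zone. */
bool zone_bit_is_set(const unsigned long *bitmap, unsigned int zone_no)
{
	return (bitmap[zone_no / BITS_PER_WORD] >> (zone_no % BITS_PER_WORD)) & 1UL;
}
```

A revalidation pass would rebuild such a bitmap from a fresh zone report
and release the old one, which is the bookkeeping the helper is meant to
take out of the drivers.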
Dispatching a report zones command through the request queue is a major
pain because of the command reply payload rewriting it requires. Given that
blkdev_report_zones() is executing everything synchronously, implement
report zones as a block device file operation instead, allowing major
simplification of the code in many places.
Since sd, null-blk, dm-linear and dm-flakey are the only block device
drivers that expose zoned block devices, these drivers are
modified to provide the device side implementation of the
report_zones() block device file operation.
For device mappers, a new report_zones() target type operation is
defined so that upper block layer calls to blkdev_report_zones() can
be propagated down to the underlying devices of the dm targets.
Implementation for this new operation is added to the dm-linear and
dm-flakey targets.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[Damien]
* Changed method block_device argument to gendisk
* Various bug fixes and improvements
* Added support for null_blk, dm-linear and dm-flakey.
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 12 Oct 2018 10:08:48 +0000 (19:08 +0900)]
block: Expose queue nr_zones in sysfs
Expose through sysfs the nr_zones field of struct request_queue.
Exposing this value helps in debugging disk issues as well as
facilitating script-based use of the disk (e.g. blktests).
For zoned block devices, the nr_zones field indicates the total number
of zones of the device calculated using the known disk capacity and
zone size. This number of zones is always 0 for regular block devices.
Since nr_zones is defined conditionally with CONFIG_BLK_DEV_ZONED,
introduce the blk_queue_nr_zones() function to return the correct value
for any device, regardless of whether CONFIG_BLK_DEV_ZONED is set.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
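As an illustration of script or tool usage, a userspace program could read
the new attribute roughly as follows. This is a minimal sketch: it assumes
the attribute appears as /sys/block/<disk>/queue/nr_zones alongside the
other queue attributes, and read_queue_nr_zones() is a hypothetical helper
name, not part of any kernel or libc API.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the nr_zones queue attribute of a disk (e.g. "sda").
 * Returns the number of zones (0 for a regular block device), or -1
 * if the attribute cannot be read, for instance on a kernel that
 * does not expose it.
 */
long read_queue_nr_zones(const char *disk)
{
	char path[256];
	unsigned long nr_zones;
	FILE *f;
	int n;

	assert(disk != NULL);

	n = snprintf(path, sizeof(path), "/sys/block/%s/queue/nr_zones", disk);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	/* The attribute is expected to hold a single decimal number. */
	n = fscanf(f, "%lu", &nr_zones);
	fclose(f);

	return n == 1 ? (long)nr_zones : -1;
}
```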
Damien Le Moal [Fri, 12 Oct 2018 10:08:47 +0000 (19:08 +0900)]
block: Improve zone reset execution
There is no need to synchronously execute all REQ_OP_ZONE_RESET BIOs
necessary to reset a range of zones. Similarly to what is done for
discard BIOs in blk-lib.c, all zone reset BIOs can be chained and
executed asynchronously and a synchronous call done only for the last
BIO of the chain.
Modify blkdev_reset_zones() to operate similarly to
blkdev_issue_discard() using the next_bio() helper for chaining BIOs. To
avoid code duplication of that function in blk_zoned.c, rename
next_bio() into blk_next_bio() and declare it as a block internal
function in blk.h.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 12 Oct 2018 10:08:46 +0000 (19:08 +0900)]
block: Introduce BLKGETNRZONES ioctl
Get the total number of zones of a zoned block device. The device can be a
partition of the whole device. The number of zones is always 0 for
regular block devices.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 12 Oct 2018 10:08:45 +0000 (19:08 +0900)]
block: Introduce BLKGETZONESZ ioctl
Get the zone size of a zoned block device, in number of 512 B sectors.
The zone size is always 0 for regular block devices.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
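A userspace caller might combine BLKGETZONESZ and BLKGETNRZONES as sketched
below. The ioctl names come from the two commits above; the helper name
query_zone_geometry() is hypothetical, and the fallback definitions are
only there for userspace headers that predate these ioctls (they assume
both ioctls return a 32-bit value).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/blkzoned.h>

/* Fallbacks for userspace headers that predate these ioctls. */
#ifndef BLKGETZONESZ
#define BLKGETZONESZ	_IOR(0x12, 132, __u32)
#endif
#ifndef BLKGETNRZONES
#define BLKGETNRZONES	_IOR(0x12, 133, __u32)
#endif

/*
 * Query the zone size (in 512 B sectors) and the number of zones of
 * the block device at @path. Both values are 0 for a regular block
 * device. Returns 0 on success and -1 on any failure (errno is left
 * as set by the failing call).
 */
int query_zone_geometry(const char *path, uint32_t *zone_sectors,
			uint32_t *nr_zones)
{
	int fd, ret = -1;

	assert(path != NULL && zone_sectors != NULL && nr_zones != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (ioctl(fd, BLKGETZONESZ, zone_sectors) == 0 &&
	    ioctl(fd, BLKGETNRZONES, nr_zones) == 0)
		ret = 0;

	close(fd);
	return ret;
}
```

A caller would typically treat a zone size of 0 as "not a zoned device"
and skip any zone handling.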
Damien Le Moal [Fri, 12 Oct 2018 10:08:44 +0000 (19:08 +0900)]
block: Limit allocation of zone descriptors for report zones
There is no point in allocating more zone descriptors than the number of
zones a block device has for doing a zone report. Avoid doing that in
blkdev_report_zones_ioctl() by limiting the number of zone descriptors
allocated internally to process the user request.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 12 Oct 2018 10:08:43 +0000 (19:08 +0900)]
block: Introduce blkdev_nr_zones() helper
Introduce the blkdev_nr_zones() helper function to get the total
number of zones of a zoned block device. This number is always 0 for a
regular block device (q->limits.zoned == BLK_ZONED_NONE case).
Replace hard-coded number of zones calculation in dmz_get_zoned_device()
with a call to this helper.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
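The calculation such a helper performs can be sketched in userspace as
follows. This is not the kernel implementation: nr_zones_from_capacity() is
a hypothetical name, and rounding up to count a smaller last zone is an
assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of zones of a device with @capacity_sectors 512 B sectors
 * and a zone size of @zone_sectors sectors. A zone size of 0 stands
 * for a regular (non-zoned) device, which has 0 zones.
 */
uint32_t nr_zones_from_capacity(uint64_t capacity_sectors,
				uint64_t zone_sectors)
{
	uint64_t n;

	if (zone_sectors == 0)
		return 0;

	/* Round up so that a smaller last zone is still counted. */
	n = capacity_sectors / zone_sectors;
	if (capacity_sectors % zone_sectors)
		n++;

	assert(n <= UINT32_MAX);
	return (uint32_t)n;
}
```

For example, with 524288-sector (256 MiB) zones, a capacity of 1048576
sectors gives 2 zones, and one extra sector of capacity gives 3.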
The unsigned 32 bits overflow check for the zone size value is already
done within sd_zbc_check_zones() with the test:
} else if (logical_to_sectors(sdkp->device, zone_blocks) > UINT_MAX) {
so there is no need to check again for an out of range value in
sd_zbc_read_zones(). Simplify the code and fix sd_zbc_check_zones()
error return to -EFBIG instead of -ENODEV if the zone size is too large.
Change the return type of sd_zbc_check_zones() to an int for the error
code and return the zone size (zone_blocks) through a u32 pointer to
avoid overflowing the signed 32-bit return value.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 12 Oct 2018 10:08:41 +0000 (19:08 +0900)]
scsi: sd_zbc: Reduce boot device scan and revalidate time
Handling checks of ZBC device capacity using the max_lba field of the
REPORT ZONES command reply for disks with rc_basis == 0 can be done
using the same report zones command reply used to check the "same"
field.
Avoid executing a report zones command solely to check the disk capacity
by merging sd_zbc_check_capacity() into sd_zbc_check_zone_size() and
renaming that function to sd_zbc_check_zones(). This removes a costly
execution of a full report zones command and so reduces device scan
duration at boot time as well as the duration of disk revalidate calls.
Furthermore, setting the partial report bit in the REPORT ZONES command
cdb can significantly reduce this command execution time as the device
does not have to count and report the total number of zones that could
be reported assuming a large enough reply buffer. A non-partial zone
report is necessary only for the first execution of report zones used to
check the same field value (to ensure that this value applies to all
zones of the disk). All other calls to sd_zbc_report_zones() can use a
partial report to reduce execution time.
Using a 14 TB ZBC disk, these simple changes reduce device scan time at
boot from about 3.5s down to about 900ms. Disk revalidate times are also
reduced from about 450ms down to 230ms.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Fri, 12 Oct 2018 10:08:40 +0000 (19:08 +0900)]
scsi: sd_zbc: Rearrange code
Move the urswrz check out of sd_zbc_read_zones() and into
sd_zbc_read_zoned_characteristics() where that value is obtained (read
from the disk zoned characteristics VPD page). Since this function now
does more than simply reading the VPD page, rename it to
sd_zbc_check_zoned_characteristics().
Also fix the error message displayed when reading that VPD page fails.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Check return values of dma_set_mask_and_coherent().
Otherwise, if dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
fails, the following piece of code will be executed even when the call
to dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32)); returns 0:
xen/blkfront: avoid NULL blkfront_info dereference on device removal
If a block device is hot-added when we are out of grants,
gnttab_grant_foreign_access fails with -ENOSPC (log message "28
granting access to ring page") in this code path:
Linus Torvalds [Thu, 25 Oct 2018 16:00:15 +0000 (09:00 -0700)]
Merge tag 'sound-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"There have been little changes in ALSA core stuff, but ASoC core still
kept rolling for the continued restructuring. The rest are lots of
small driver-specific changes and some minor API updates. Here are
highlights:
General:
- Appropriate fall-through annotations everywhere
- Some code cleanup in memalloc code, handling non-cached pages more
commonly in the helper
- Deployment of SNDRV_PCM_INFO_SYNC_APPLPTR flag consistently
Drivers:
- More HD-audio CA0132 codec improvement for supporting other Creative
boards
- Plumbing legacy HD-audio codecs as ASoC BE on Intel SST; this will
give more support of existing HD-audio devices with DSP
- A few device-specific HD-audio quirks as usual
- New quirk for RME CC devices and correction for B&W PX for USB-audio
- FireWire: code refactoring including devres usages
ASoC Core:
- Continued componentization works; it's almost done!
- A bunch of new for_each_foo macros
- Cleanups and fixes in DAPM code
ASoC Drivers:
- MCLK support for several different devices, including CS42L51, STM32
SAI, and MAX98373
- Support for Allwinner A64 CODEC analog, Intel boards with DA7219 and
MAX98927, Meson AXG PDM inputs, Nuvoton NAU8822, Renesas R8A7744 and
TI PCM3060"
* tag 'sound-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (299 commits)
ASoC: stm32: sai: fix master clock naming
ASoC: stm32: add clock dependency for sai
ALSA: hda/ca0132 - Actually fix microphone issue
ASoC: sun4i-i2s: move code from startup/shutdown hooks into pm_runtime hooks
ASoC: wm2000: Remove wm2000_read helper function
ASoC: cs42l51: fix mclk support
ASoC: wm_adsp: Log addresses as 8 digits in wm_adsp_buffer_populate
ASoC: wm_adsp: Rename memory fields in wm_adsp_buffer
ASoC: cs42l51: add mclk support
ASoC: stm32: sai: set sai as mclk clock provider
ASoC: dt-bindings: add mclk support to cs42l51
ASoC: dt-bindings: add mclk provider support to stm32 sai
ASoC: soc-core: fix trivial checkpatch issues
ASoC: dapm: Add support for hw_free on CODEC to CODEC links
ASoC: Intel: kbl_da7219_max98927: minor white space clean up
ALSA: i2c/cs8427: Fix int to char conversion
ALSA: doc: Brush up the old writing-an-alsa-driver
ASoC: rsnd: tidyup SSICR::SWSP for TDM
ASoC: rsnd: enable TDM settings for SSI parent
ASoC: pcm3168a: add hw constraint for capture channel
...
Linus Torvalds [Thu, 25 Oct 2018 14:40:30 +0000 (07:40 -0700)]
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380,
qla2xxx, lpfc, libsas, hisi_sas.
In addition there's a set of mostly small updates to the target
subsystem, a set of conversions to the generic DMA API, which do have
some potential for issues in the older drivers but we'll handle those
as case by case fixes.
A new myrs driver for the DAC960/mylex raid controllers to replace the
block based DAC960 which is also being removed by Jens in this merge
window.
Plus the usual slew of trivial changes"
[ "myrs" stands for "MYlex Raid Scsi". Obviously. Silly of me to even
wonder. There's also a "myrb" driver, where the 'b' stands for
'block'. Truly, somebody has got mad naming skillz. - Linus ]
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (237 commits)
scsi: myrs: Fix the processor absent message in processor_show()
scsi: myrs: Fix a logical vs bitwise bug
scsi: hisi_sas: Fix NULL pointer dereference
scsi: myrs: fix build failure on 32 bit
scsi: fnic: replace gross legacy tag hack with blk-mq hack
scsi: mesh: switch to generic DMA API
scsi: ips: switch to generic DMA API
scsi: smartpqi: fully convert to the generic DMA API
scsi: vmw_pscsi: switch to generic DMA API
scsi: snic: switch to generic DMA API
scsi: qla4xxx: fully convert to the generic DMA API
scsi: qla2xxx: fully convert to the generic DMA API
scsi: qla1280: switch to generic DMA API
scsi: qedi: fully convert to the generic DMA API
scsi: qedf: fully convert to the generic DMA API
scsi: pm8001: switch to generic DMA API
scsi: nsp32: switch to generic DMA API
scsi: mvsas: fully convert to the generic DMA API
scsi: mvumi: switch to generic DMA API
scsi: mpt3sas: switch to generic DMA API
...
Linus Torvalds [Thu, 25 Oct 2018 13:40:00 +0000 (06:40 -0700)]
Merge tag 'edac_for_4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp
Pull EDAC updates from Borislav Petkov:
"The EDAC tree was busier than usual this cycle as the shortlog below
shows.
Also, this pull request is carrying an ACPI DSM driver which is used
to ask the platform to supply the DIMM location of a reported hardware
error and thus simplify all the EDAC logic when trying to map the
error address to the respective DIMM.
Core EDAC updates:
- amd64_edac: AMD family 0x17, models 0x10-0x2f support (Michael Jin)
Hygon Dhyana support (Pu Wen)
- sb_edac: New maintainer + fixes (Tony Luck) Error reporting
improvements and fixes (Qiuxu Zhuo)
- ghes_edac: SMBIOS handle type 17 for DIMM locating and per-DIMM
error accounting (Fan Wu)
- altera_edac: Stratix10 support and refactoring (Thor Thayer)
Out of tree addition:
- acpi_adxl: Address Translation interface using an ACPI DSM (Tony
Luck)
- the usual amount of other misc fixes and cleanups all over"
* tag 'edac_for_4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (22 commits)
ACPI/ADXL: Add address translation interface using an ACPI DSM
EDAC, thunderx: Fix memory leak in thunderx_l2c_threaded_isr()
EDAC, skx_edac: Fix logical channel intermediate decoding
EDAC, {i7core,sb,skx}_edac: Fix uncorrected error counting
EDAC, altera: Work around int-to-pointer-cast warnings
EDAC, amd64: Add Hygon Dhyana support
EDAC: Raise the maximum number of memory controllers
arm64: dts: stratix10: Add peripheral EDAC nodes
EDAC, altera: Add Stratix10 peripheral support
EDAC, altera: Merge Stratix10 into the Arria10 SDRAM probe routine
arm64: dts: stratix10: Add SDRAM node
EDAC, altera: Combine Stratix10 and Arria10 probe functions
arm64: dts: stratix10: Additions to EDAC System Manager
EDAC, i7core: Remove set but not used variable pvt
EDAC, ghes: Use CPER module handles to locate DIMMs
EDAC: Correct DIMM capacity unit symbol
EDAC, sb_edac: Fix signedness bugs in *_get_ha() functions
EDAC, sb_edac: Fix reporting for patrol scrubber errors
EDAC, sb_edac: Return early on ADDRV bit and address type test
MAINTAINERS: Update maintainer for drivers/edac/sb_edac.c
...
Linus Torvalds [Thu, 25 Oct 2018 13:31:56 +0000 (06:31 -0700)]
Merge tag 'libnvdimm-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
- Improve the efficiency and performance of reading nvdimm-namespace
labels. Reduce the amount of label data read at driver load time by a
few orders of magnitude. Reduce heavyweight call-outs to
platform-firmware routines.
- Handle media errors located in the 'struct page' array stored on a
persistent memory namespace. Let the kernel clear these errors rather
than an awkward userspace workaround.
- Fix Address Range Scrub (ARS) completion tracking. Correct occasions
where the kernel indicates completion of ARS before submission.
- Add support for reporting an nvdimm dirty-shutdown-count via sysfs.
- Fix various small libnvdimm core and uapi issues.
* tag 'libnvdimm-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
acpi, nfit: Further restrict userspace ARS start requests
acpi, nfit: Fix Address Range Scrub completion tracking
UAPI: ndctl: Remove use of PAGE_SIZE
UAPI: ndctl: Fix g++-unsupported initialisation in headers
tools/testing/nvdimm: Populate dirty shutdown data
acpi, nfit: Collect shutdown status
acpi, nfit: Introduce nfit_mem flags
libnvdimm, label: Fix sparse warning
nvdimm: Use namespace index data to reduce number of label reads needed
nvdimm: Split label init out from the logic for getting config data
nvdimm: Remove empty if statement
nvdimm: Clarify comment in sizeof_namespace_index
nvdimm: Sanity check labeloff
libnvdimm, dimm: Maximize label transfer size
libnvdimm, pmem: Fix badblocks population for 'raw' namespaces
libnvdimm, namespace: Drop the repeat assignment for variable dev->parent
libnvdimm, region: Fail badblocks listing for inactive regions
libnvdimm, pfn: during init, clear errors in the metadata area
libnvdimm: Set device node in nd_device_register
libnvdimm: Hold reference on parent while scheduling async init
...
Linus Torvalds [Thu, 25 Oct 2018 13:28:08 +0000 (06:28 -0700)]
Merge tag 'for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
- Add Spreadtrum SC2731 charger driver
- bq25890-charger: Add BQ25896 support
- bq27xxx-battery: Add support for BQ27411
- qcom-pon: Add pms405 pon support
- cros-charger: add support for dedicated port
- misc fixes
* tag 'for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (28 commits)
power: max8925: mark expected switch fall-through
power: supply: fix spelling mistake "Gauage" -> "Gauge"
power: reset: qcom-pon: Add pms405 pon support
power: supply: bq27xxx: Add support for BQ27411
power: supply: Add Spreadtrum SC2731 charger support
dt-bindings: power: Add Spreadtrum SC2731 charger documentation
power: supply: twl4030_charger: disable eoc interrupt on linear charge
power: supply: twl4030_charger: fix charging current out-of-bounds
power: supply: bq25890_charger: fix semicolon.cocci warnings
power: supply: max8998-charger: Fix platform data retrieval
power: supply: cros: add support for dedicated port
mfd: cros: add charger port count command definition
power: reset: at91-poweroff: do not procede if at91_shdwc is allocated
power: reset: at91-poweroff: rename at91_shdwc_base member of struct shdwc
power: reset: at91-poweroff: make sclk part of struct shdwc
power: reset: at91-poweroff: make mpddrc_base part of struct shdwc
power: reset: at91-poweroff: use only one poweroff function
power: reset: at91-poweroff: switch to slow clock before shutdown
power: reset: convert to SPDX identifiers
power: supply: ab8500_fg: silence uninitialized variable warnings
...
Linus Torvalds [Thu, 25 Oct 2018 13:23:07 +0000 (06:23 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
- rumble support for Xbox One S, from Andrey Smirnov
- high-resolution support for Logitech mice, from Harry Cutts
- support for recent devices requiring the HID parse to be able to cope
with tag report sizes > 256
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (35 commits)
HID: usbhid: Add quirk for Redragon/Dragonrise Seymur 2
HID: wacom: Work around HID descriptor bug in DTK-2451 and DTH-2452
HID: google: add dependency on Cros EC for Hammer
HID: elan: fix spelling mistake "registred" -> "registered"
HID: google: drop superfluous const before SIMPLE_DEV_PM_OPS()
HID: google: add support tablet mode switch for Whiskers
mfd: cros: add "base attached" MKBP switch definition
Input: reserve 2 events code because of HID
HID: magicmouse: add support for Apple Magic Trackpad 2
HID: i2c-hid: override HID descriptors for certain devices
HID: hid-bigbenff: driver for BigBen Interactive PS3OFMINIPAD gamepad
HID: logitech: fix a used uninitialized GCC warning
HID: intel-ish-hid: using list_head for ipc write queue
HID: intel-ish-hid: use resource-managed api
HID: intel_ish-hid: Enhance API to get ring buffer sizes
HID: intel-ish-hid: use helper function to search client id
HID: intel-ish-hid: ishtp: add helper function for client search
HID: intel-ish-hid: use helper function to access client buffer
HID: intel-ish-hid: ishtp: add helper functions for client buffer operation
HID: intel-ish-hid: use helper function for private driver data set/get
...
Linus Torvalds [Wed, 24 Oct 2018 17:01:11 +0000 (18:01 +0100)]
Merge tag 'docs-4.20' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"This is a fairly typical cycle for documentation. There's some welcome
readability improvements for the formatted output, some LICENSES
updates including the addition of the ISC license, the removal of the
unloved and unmaintained 00-INDEX files, the deprecated APIs document
from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
fixes and corrections"
* tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
docs: Fix typos in histogram.rst
docs: Introduce deprecated APIs list
kernel-doc: fix declaration type determination
doc: fix a typo in adding-syscalls.rst
docs/admin-guide: memory-hotplug: remove table of contents
doc: printk-formats: Remove bogus kobject references for device nodes
Documentation: preempt-locking: Use better example
dm flakey: Document "error_writes" feature
docs/completion.txt: Fix a couple of punctuation nits
LICENSES: Add ISC license text
LICENSES: Add note to CDDL-1.0 license that it should not be used
docs/core-api: memory-hotplug: add some details about locking internals
docs/core-api: rename memory-hotplug-notifier to memory-hotplug
docs: improve readability for people with poorer eyesight
yama: clarify ptrace_scope=2 in Yama documentation
docs/vm: split memory hotplug notifier description to Documentation/core-api
docs: move memory hotplug description into admin-guide/mm
doc: Fix acronym "FEKEK" in ecryptfs
docs: fix some broken documentation references
iommu: Fix passthrough option documentation
...
Linus Torvalds [Wed, 24 Oct 2018 16:42:24 +0000 (17:42 +0100)]
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
- further restructure ext4 documentation
- fix up ext4's delayed allocation for bigalloc file systems
- fix up some syzbot-detected races in EXT4_IOC_MOVE_EXT,
EXT4_IOC_SWAP_BOOT, and ext4_remount
- ... and a few other miscellaneous bugs and optimizations.
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
ext4: fix use-after-free race in ext4_remount()'s error path
ext4: cache NULL when both default_acl and acl are NULL
docs: promote the ext4 data structures book to top level
docs: move ext4 administrative docs to admin-guide/
jbd2: fix use after free in jbd2_log_do_checkpoint()
ext4: propagate error from dquot_initialize() in EXT4_IOC_FSSETXATTR
ext4: fix setattr project check in fssetxattr ioctl
docs: make ext4 readme tables readable
docs: fix ext4 documentation table formatting problems
docs: generate a separate ext4 pdf file from the documentation
ext4: convert fault handler to use vm_fault_t type
ext4: initialize retries variable in ext4_da_write_inline_data_begin()
ext4: fix EXT4_IOC_SWAP_BOOT
ext4: fix build error when DX_DEBUG is defined
ext4: fix argument checking in EXT4_IOC_MOVE_EXT
ext4: fix reserved cluster accounting at page invalidation time
ext4: adjust reserved cluster count when removing extents
ext4: reduce reserved cluster count by number of allocated clusters
ext4: fix reserved cluster accounting at delayed write time
ext4: add new pending reservation mechanism
...
Linus Torvalds [Wed, 24 Oct 2018 16:39:36 +0000 (17:39 +0100)]
Merge tag 'f2fs-for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've added 1) superblock checksum feature, 2)
implemented a new mount option with which we can disable/enable checkpoint to
provide atomic updates of entire filesystem, 3) refactored quota
operations to enhance its consistency along with checkpoint, 4) fixed
subtle IO hang conditions and roll-forward recovery flow to resurrect
any fsync'ed inode metadata.
Enhancements:
- add checksum to keep superblock contents more safe
- add checkpoint=disable/enable to support A/B update of entire filesystem
- use plug for readahead IO in readdir
- add more IO counts to avoid block layer hacks
Bug fixes:
- prevent data corruption issue for hardware encryption
- fix IO hang issues when GC is heavily triggered
- add missing up_read in __write_node_page
- recover inode metadata during roll-forward recovery flow
- fix null pointer dereference issue in wrongly configured discard map
There are some more sanity checks and minor bug fixes as well"
* tag 'f2fs-for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (62 commits)
f2fs: fix to keep project quota consistent
f2fs: guarantee journalled quota data by checkpoint
f2fs: cleanup dirty pages if recover failed
f2fs: fix data corruption issue with hardware encryption
f2fs: fix to recover inode->i_flags of inode block during POR
f2fs: spread f2fs_set_inode_flags()
f2fs: fix to spread clear_cold_data()
Revert "f2fs: fix to clear PG_checked flag in set_page_dirty()"
f2fs: account read IOs and use IO counts for is_idle
f2fs: fix to account IO correctly for cgroup writeback
f2fs: fix to account IO correctly
f2fs: remove request_list check in is_idle()
f2fs: allow to mount, if quota is failed
f2fs: update REQ_TIME in f2fs_cross_rename()
f2fs: do not update REQ_TIME in case of error conditions
f2fs: remove unneeded disable_nat_bits()
f2fs: remove unused sbi->trigger_ssr_threshold
f2fs: shrink sbi->sb_lock coverage in set_file_temperature()
f2fs: use rb_*_cached friends
f2fs: fix to recover cold bit of inode block during POR
...
Linus Torvalds [Wed, 24 Oct 2018 16:36:12 +0000 (17:36 +0100)]
Merge tag 'xfs-4.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Dave Chinner:
"There's not a huge amount of change in this cycle - Darrick has been
out of action for a couple of months (hence me sending the last few
pull requests), so we decided a quiet cycle mainly focussed on bug
fixes was a good idea. Darrick will take the helm again at the end of
this merge window.
FYI, I may be sending another update later in the cycle - there's a
pending rework of the clone/dedupe_file_range code that fixes numerous
bugs that is spread amongst the VFS, XFS and ocfs2 code. It has been
reviewed and tested, Al and I just need to work out the details of the
merge, so it may come from him rather than me.
Summary:
- only support filesystems with unwritten extents
- add definition for statfs XFS magic number
- remove unused parameters around reflink code
- more debug for dangling delalloc extents
- cancel COW extents on extent swap targets
- fix quota stats output and clean up the code
- refactor some of the attribute code in preparation for parent
pointers
- fix several buffer handling bugs"
* tag 'xfs-4.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (21 commits)
xfs: cancel COW blocks before swapext
xfs: clear ail delwri queued bufs on unmount of shutdown fs
xfs: use offsetof() in place of offset macros for __xfsstats
xfs: Fix xqmstats offsets in /proc/fs/xfs/xqmstat
xfs: fix use-after-free race in xfs_buf_rele
xfs: Add attibute remove and helper functions
xfs: Add attibute set and helper functions
xfs: Add helper function xfs_attr_try_sf_addname
xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
xfs: issue log message on user force shutdown
xfs: fix buffer state management in xrep_findroot_block
xfs: always assign buffer verifiers when one is provided
xfs: xrep_findroot_block should reject root blocks with siblings
xfs: add a define for statfs magic to uapi
xfs: print dangling delalloc extents
xfs: fix fork selection in xfs_find_trim_cow_extent
xfs: remove the unused trimmed argument from xfs_reflink_trim_around_shared
xfs: remove the unused shared argument to xfs_reflink_reserve_cow
xfs: handle zeroing in xfs_file_iomap_begin_delay
xfs: remove suport for filesystems without unwritten extent flag
...
Linus Torvalds [Wed, 24 Oct 2018 16:30:39 +0000 (17:30 +0100)]
Merge tag 'gfs2-4.20.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Bob Peterson:
"We've got 18 patches for this merge window, none of which are very
major:
- clean up the gfs2 block allocator to prepare for future performance
enhancements (Andreas Gruenbacher)
- fix a use-after-free problem (Andy Price)
- patches that fix gfs2's broken rgrplvb mount option (me)
- cleanup patches and error message improvements (me)
- enable getlabel support (Steve Whitehouse and Abhi Das)
- flush the glock delete workqueue at exit (Tim Smith)"
* tag 'gfs2-4.20.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Fix minor typo: couln't versus couldn't.
gfs2: write revokes should traverse sd_ail1_list in reverse
gfs2: Pass resource group to rgblk_free
gfs2: Remove unnecessary gfs2_rlist_alloc parameter
gfs2: Fix marking bitmaps non-full
gfs2: Fix some minor typos
gfs2: Rename bitmap.bi_{len => bytes}
gfs2: Remove unused RGRP_RSRV_MINBYTES definition
gfs2: Move rs_{sizehint, rgd_gh} fields into the inode
gfs2: Clean up out-of-bounds check in gfs2_rbm_from_block
gfs2: Always check the result of gfs2_rbm_from_block
gfs2: getlabel support
GFS2: Flush the GFS2 delete workqueue before stopping the kernel threads
gfs2: Don't leave s_fs_info pointing to freed memory in init_sbd
gfs2: Use fs_* functions instead of pr_* function where we can
gfs2: slow the deluge of io error messages
gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
gfs2: improve debug information when lvb mismatches are found
Linus Torvalds [Wed, 24 Oct 2018 16:28:03 +0000 (17:28 +0100)]
Merge tag 'for-linus-4.20-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs updates from Mike Marshall:
"Fixes and a cleanup.
Fixes:
- fix superfluous service_operation return code check in
orangefs_lookup
- fix some error code paths that missed kmem_cache_free
- don't let orangefs_iget return NULL
- don't let orangefs_new_inode return NULL
- cache NULL when both default_acl and acl are NULL
Cleanup:
- rate limit the client not running info message"
* tag 'for-linus-4.20-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: no need to check for service_operation returns > 0
orangefs: some error code paths missed kmem_cache_free
orangefs: don't let orangefs_iget return NULL.
orangefs: don't let orangefs_new_inode return NULL
orangefs: rate limit the client not running info message
orangefs: cache NULL when both default_acl and acl are NULL
Linus Torvalds [Wed, 24 Oct 2018 16:24:04 +0000 (17:24 +0100)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro.
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
gfs2_meta: ->mount() can get NULL dev_name
ecryptfs_rename(): verify that lower dentries are still OK after lock_rename()
cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)
Linus Torvalds [Wed, 24 Oct 2018 16:22:16 +0000 (17:22 +0100)]
Merge tag 'jfs-for-4.20' of git://github.com/kleikamp/linux-shaggy
Pull jfs updates from David Kleikamp:
"Just a few small fixes"
* tag 'jfs-for-4.20' of git://github.com/kleikamp/linux-shaggy:
jfs: remove redundant dquot_initialize() in jfs_evict_inode()
jfs: remove quota option from ignore list
jfs: cache NULL when both default_acl and acl are NULL
Linus Torvalds [Wed, 24 Oct 2018 16:15:26 +0000 (17:15 +0100)]
Merge tag 'for-4.20-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This is the first batch with fixes and some nice performance
improvements.
Preliminary results show eg. more files/sec in fsmark, better perf on
multi-threaded workloads (filebench, dbench), fewer context switches
and overall better memory allocation characteristics (multiple
benchmarks).
Apart from general performance, there's an improvement for qgroups +
balance workload that's been troubling our users.
Note for stable: there are 20+ patches tagged for stable, out of 90.
Not all of them apply cleanly on all stable versions but the conflicts
are mostly due to simple cleanups and resolving should be obvious. The
fixes are otherwise independent.
Performance improvements:
- transition between blocking and spinning modes of path is gone,
which originally resulted in more unnecessary wakeups and updates
to the path locks, the effects are measurable and improve latency
and scalability
- qgroups: first batch of changes that should speedup balancing with
qgroups on, skip quota accounting on unchanged subtrees, overall
gain is about 30+% in runtime
- use rb-tree with cached first node for several structures, small
improvement to avoid pointer chasing
Fixes:
- trim
- fix: some blockgroups could have been missed if their logical
address was past the total filesystem size (ie. after a lot of
balancing)
- better error reporting, after processing blockgroups and whole
device
- fix: continue trimming block groups after an error is
encountered
- check for trim support of the device earlier and avoid some
unnecessary work
- less interaction with transaction commit that improves latency
on slower storage (eg. image files over NFS)
- fsync
- fix warning when replaying log after fsync of a O_TMPFILE
- fix wrong dentries after fsync of file that got its parent
replaced
- qgroups: fix rescan that might miss some dirty groups
- don't clean dirty pages during buffered writes, this could lead to
lost updates in some corner cases
- some block groups could have been delayed in creation, if the
allocation triggered another one
- error handling improvements
Cleanups:
- removed unused struct members and variables
- function return type cleanups
- delayed refs code refactoring
- protect against deadlock that could be caused by crafted image that
tries to allocate from a tree that's locked already"
* tag 'for-4.20-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (93 commits)
btrfs: switch return_bigger to bool in find_ref_head
btrfs: remove fs_info from btrfs_should_throttle_delayed_refs
btrfs: remove fs_info from btrfs_check_space_for_delayed_refs
btrfs: delayed-ref: pass delayed_refs directly to btrfs_delayed_ref_lock
btrfs: delayed-ref: pass delayed_refs directly to btrfs_select_ref_head
btrfs: qgroup: move the qgroup->members check out from (!qgroup)'s else branch
btrfs: relocation: Remove redundant tree level check
btrfs: relocation: Cleanup while loop using rbtree_postorder_for_each_entry_safe
btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled
Btrfs: fix wrong dentries after fsync of file that got its parent replaced
Btrfs: fix warning when replaying log after fsync of a tmpfile
btrfs: drop min_size from evict_refill_and_join
btrfs: assert on non-empty delayed iputs
btrfs: make sure we create all new block groups
btrfs: reset max_extent_size on clear in a bitmap
btrfs: protect space cache inode alloc with GFP_NOFS
btrfs: release metadata before running delayed refs
Btrfs: kill btrfs_clear_path_blocking
btrfs: dev-replace: remove pointless assert in write unlock
btrfs: dev-replace: move replace members out of fs_info
...
Linus Torvalds [Wed, 24 Oct 2018 13:43:41 +0000 (14:43 +0100)]
Merge branch 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull tty ioctl updates from Al Viro:
"This is the compat_ioctl work related to tty ioctls.
Quite a bit of dead code taken out, all tty-related stuff gone from
fs/compat_ioctl.c. A bunch of compat bugs fixed - some still remain,
but all more or less generic tty-related ioctls should be covered
(remaining issues are in things like driver-private ioctls in a pcmcia
serial card driver not getting properly handled in 32bit processes on
64bit host, etc)"
* 'work.tty-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (53 commits)
kill TIOCSERGSTRUCT
change semantics of ldisc ->compat_ioctl()
kill TIOCSER[SG]WILD
synclink_gt(): fix compat_ioctl()
pty: fix compat ioctls
compat_ioctl - kill keyboard ioctl handling
gigaset: add ->compat_ioctl()
vt_compat_ioctl(): clean up, use compat_ptr() properly
gigaset: don't try to printk userland buffer contents
dgnc: don't bother with (empty) stub for TCXONC
dgnc: leave TIOC[GS]SOFTCAR to ldisc
remove fallback to drivers for TIOCGICOUNT
dgnc: break-related ioctls won't reach ->ioctl()
kill the rest of tty COMPAT_IOCTL() entries
dgnc: TIOCM... won't reach ->ioctl()
isdn_tty: TCSBRK{,P} won't reach ->ioctl()
kill capinc_tty_ioctl()
take compat TIOC[SG]SERIAL treatment into tty_compat_ioctl()
synclink: reduce pointless checks in ->ioctl()
complete ->[sg]et_serial() switchover
...
Linus Torvalds [Wed, 24 Oct 2018 10:49:35 +0000 (11:49 +0100)]
Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
"In this patchset, there are a couple of minor updates, as well as some
reworking of the LSM initialization code from Kees Cook (these prepare
the way for ordered stackable LSMs, but are a valuable cleanup on
their own)"
* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
LSM: Don't ignore initialization failures
LSM: Provide init debugging infrastructure
LSM: Record LSM name in struct lsm_info
LSM: Convert security_initcall() into DEFINE_LSM()
vmlinux.lds.h: Move LSM_TABLE into INIT_DATA
LSM: Convert from initcall to struct lsm_info
LSM: Remove initcall tracing
LSM: Rename .security_initcall section to .lsm_info
vmlinux.lds.h: Avoid copy/paste of security_init section
LSM: Correctly announce start of LSM initialization
security: fix LSM description location
keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
seccomp: remove unnecessary unlikely()
security: tomoyo: Fix obsolete function
security/capabilities: remove check for -EINVAL
Linus Torvalds [Wed, 24 Oct 2018 10:47:32 +0000 (11:47 +0100)]
Merge tag 'selinux-pr-20181022' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"Three SELinux patches for v4.20, all fall under the bug-fix or
behave-better category, which is good. All three have pretty good
descriptions too, which is even better"
* tag 'selinux-pr-20181022' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: Add __GFP_NOWARN to allocation at str_read()
selinux: refactor mls_context_to_sid() and make it stricter
selinux: fix mounting of cgroup2 under older policies
Linus Torvalds [Wed, 24 Oct 2018 10:22:39 +0000 (11:22 +0100)]
Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull siginfo updates from Eric Biederman:
"I have been slowly sorting out siginfo and this is the culmination of
that work.
The primary result is in several ways the signal infrastructure has
been made less error prone. The code has been updated so that manually
specifying SEND_SIG_FORCED is never necessary. The conversion to the
new siginfo sending functions is now complete, which makes it
difficult to send a signal without filling in the proper siginfo
fields.
At the tail end of the patchset comes the optimization of decreasing
the size of struct siginfo in the kernel from 128 bytes to about 48
bytes on 64bit. The fundamental observation that enables this is by
definition none of the known ways to use struct siginfo uses the extra
bytes.
This comes at the cost of a small user space observable difference.
For the rare case of siginfo being injected into the kernel only what
can be copied into kernel_siginfo is delivered to the destination, the
rest of the bytes are set to 0. For cases where the signal and the
si_code are known this is safe, because we know those bytes are not
used. For cases where the signal and si_code combination is unknown
the bits that won't fit into struct kernel_siginfo are tested to
verify they are zero, and the send fails if they are not.
I made an extensive search through userspace code and I could not find
anything that would break because of the above change. If it turns out
I did break something it will take just the revert of a single change
to restore kernel_siginfo to the same size as userspace siginfo.
Testing did reveal dependencies on preferring the signo passed to
sigqueueinfo over si->signo, so I bit the bullet and added the
complexity necessary to handle that case.
Testing also revealed bad things can happen if a negative signal
number is passed into the system calls. Something no sane application
will do but something a malicious program or a fuzzer might do. So I
have fixed the code that performs the bounds checks to ensure negative
signal numbers are handled"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
signal: Guard against negative signal numbers in copy_siginfo_from_user32
signal: Guard against negative signal numbers in copy_siginfo_from_user
signal: In sigqueueinfo prefer sig not si_signo
signal: Use a smaller struct siginfo in the kernel
signal: Distinguish between kernel_siginfo and siginfo
signal: Introduce copy_siginfo_from_user and use it's return value
signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
signal: Fail sigqueueinfo if si_signo != sig
signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
signal/unicore32: Use force_sig_fault where appropriate
signal/unicore32: Generate siginfo in ucs32_notify_die
signal/unicore32: Use send_sig_fault where appropriate
signal/arc: Use force_sig_fault where appropriate
signal/arc: Push siginfo generation into unhandled_exception
signal/ia64: Use force_sig_fault where appropriate
signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
signal/ia64: Use the generic force_sigsegv in setup_frame
signal/arm/kvm: Use send_sig_mceerr
signal/arm: Use send_sig_fault where appropriate
signal/arm: Use force_sig_fault where appropriate
...
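The copy rule described in the siginfo pull message above can be modelled in a few lines. This is a hypothetical Python sketch, not kernel code: the 128 and 48 byte sizes come from the message, while `NSIG`, `KNOWN_LAYOUTS`, and the function body are illustrative assumptions about how such a check could look.

```python
USER_SIGINFO_SIZE = 128    # userspace struct siginfo size, per the message
KERNEL_SIGINFO_SIZE = 48   # approximate in-kernel size on 64bit, per the message
NSIG = 64                  # assumed upper bound on signal numbers (illustrative)

# Hypothetical set of (signo, si_code) pairs whose layout is known.
KNOWN_LAYOUTS = {(11, 1), (17, 1)}


def copy_siginfo_from_user(signo: int, si_code: int, raw: bytes) -> bytes:
    """Return the bytes kept in the kernel, or raise ValueError on rejection."""
    if len(raw) != USER_SIGINFO_SIZE:
        raise ValueError("unexpected siginfo size")
    # Bounds check that also rejects negative signal numbers.
    if signo < 0 or signo > NSIG:
        raise ValueError("signal number out of range")
    head = raw[:KERNEL_SIGINFO_SIZE]
    tail = raw[KERNEL_SIGINFO_SIZE:]
    if (signo, si_code) not in KNOWN_LAYOUTS and any(tail):
        # Unknown combination: the extra bytes must all be zero.
        raise ValueError("non-zero bytes beyond kernel_siginfo")
    # Known combination: the tail is unused, so it is simply dropped.
    return head
```

For a known combination the extra bytes are discarded silently; for an unknown one the send fails unless those bytes are already zero, which matches the behaviour the message describes.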
Linus Torvalds [Wed, 24 Oct 2018 07:11:35 +0000 (08:11 +0100)]
net/kconfig: Make QCOM_QMI_HELPERS available when COMPILE_TEST
The networking merge brought in the experimental support for the
Qualcomm ath10k system NOC, which selects QCOM_QMI_HELPERS as a
dependency.
But the ATH10K_SNOC option (which selects QCOM_QMI_HELPERS) depends on
ARCH_QCOM || COMPILE_TEST in order to get wider build testing than just
the unusual QCOM architecture build, while the QCOM_QMI_HELPERS option
doesn't have that COMPILE_TEST option and is limited to only ARCH_QCOM.
As a result, a "make allmodconfig" complains
WARNING: unmet direct dependencies detected for QCOM_QMI_HELPERS
Depends on [n]: ARCH_QCOM && NET [=y]
Selected by [m]:
- ATH10K_SNOC [=m] && NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_ATH [=y] && ATH10K [=m] && (ARCH_QCOM || COMPILE_TEST [=y])
Fix the config-time warning by making QCOM_QMI_HELPERS available when
COMPILE_TEST, since the result seems to build fine.
Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Govind Singh <govinds@codeaurora.org> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
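The select/depends mismatch behind this warning can be illustrated with a toy evaluator. This is a minimal Python sketch, not the Kconfig implementation: the symbol names come from the message, while `unmet_select_warnings` and the exact form of the fixed dependency expression are assumptions made for illustration.

```python
from typing import Callable, Dict, List

Config = Dict[str, bool]


def unmet_select_warnings(
    config: Config,
    depends: Dict[str, Callable[[Config], bool]],
    selects: Dict[str, List[str]],
) -> List[str]:
    """List symbols that are selected while their own dependencies are unmet."""
    warnings = []
    for selector, targets in selects.items():
        if not config.get(selector, False):
            continue  # a disabled selector forces nothing on
        for target in targets:
            if not depends[target](config):
                warnings.append(f"unmet direct dependencies detected for {target}")
    return warnings


# allmodconfig-like situation on a non-QCOM build.
config = {"ARCH_QCOM": False, "COMPILE_TEST": True, "NET": True, "ATH10K_SNOC": True}
selects = {"ATH10K_SNOC": ["QCOM_QMI_HELPERS"]}

# Dependency as quoted in the warning: ARCH_QCOM && NET.
before = {"QCOM_QMI_HELPERS": lambda c: c["ARCH_QCOM"] and c["NET"]}
# Assumed shape of the fix: also allow COMPILE_TEST.
after = {"QCOM_QMI_HELPERS": lambda c: (c["ARCH_QCOM"] or c["COMPILE_TEST"]) and c["NET"]}

warnings_before = unmet_select_warnings(config, before, selects)
warnings_after = unmet_select_warnings(config, after, selects)
```

With the original dependency the evaluator reports one warning; once COMPILE_TEST is accepted, the list is empty.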
1) Add VF IPSEC offload support in ixgbe, from Shannon Nelson.
2) Add zero-copy AF_XDP support to i40e, from Björn Töpel.
3) All in-tree drivers are converted to {g,s}et_link_ksettings() so we
can get rid of the {g,s}et_settings ethtool callbacks, from Michal
Kubecek.
4) Add software timestamping to veth driver, from Michael Walle.
5) More work to make packet classifiers and actions lockless, from Vlad
Buslov.
6) Support sticky FDB entries in bridge, from Nikolay Aleksandrov.
7) Add ipv6 version of IP_MULTICAST_ALL sockopt, from Andre Naujoks.
8) Support batching of XDP buffers in vhost_net, from Jason Wang.
9) Add flow dissector BPF hook, from Petar Penkov.
10) i40e vf --> generic iavf conversion, from Jesse Brandeburg.
11) Add NLA_REJECT netlink attribute policy type, to signal when users
provide attributes in situations which don't make sense. From
Johannes Berg.
12) Switch TCP and fair-queue scheduler over to earliest departure time
model. From Eric Dumazet.
13) Improve guest receive performance by doing rx busy polling in tx
path of vhost networking driver, from Tonghao Zhang.
14) Add per-cgroup local storage to bpf
15) Add reference tracking to BPF, from Joe Stringer. The verifier can
now make sure that references taken to objects are properly released
by the program.
16) Support in-place encryption in TLS, from Vakul Garg.
17) Add new taprio packet scheduler, from Vinicius Costa Gomes.
18) Lots of selftests additions, too numerous to mention one by one here
but all of which are very much appreciated.
19) Support offloading of eBPF programs containing BPF to BPF calls in
nfp driver, from Quentin Monnet.
20) Move dpaa2_ptp driver out of staging, from Yangbo Lu.
21) Lots of u32 classifier cleanups and simplifications, from Al Viro.
22) Add new strict versions of netlink message parsers, and enable them
for some situations. From David Ahern.
23) Evict neighbour entries on carrier down, also from David Ahern.
24) Support BPF sk_msg verdict programs with kTLS, from Daniel Borkmann
and John Fastabend.
25) Add support for filtering route dumps, from David Ahern.
26) New igc Intel driver for 2.5G parts, from Sasha Neftin et al.
27) Allow vxlan enslavement to bridges in mlxsw driver, from Ido
Schimmel.
28) Add queue and stack map types to eBPF, from Mauricio Vasquez B.
29) Add back byte-queue-limit support to r8169, with all the bug fixes
in other areas of the driver it works now! From Florian Westphal and
Heiner Kallweit.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2147 commits)
tcp: add tcp_reset_xmit_timer() helper
qed: Fix static checker warning
Revert "be2net: remove desc field from be_eq_obj"
Revert "net: simplify sock_poll_wait"
net: socionext: Reset tx queue in ndo_stop
net: socionext: Add dummy PHY register read in phy_write()
net: socionext: Stop PHY before resetting netsec
net: stmmac: Set OWN bit for jumbo frames
arm64: dts: stratix10: Support Ethernet Jumbo frame
tls: Add maintainers
net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
octeontx2-af: Support for setting MAC address
octeontx2-af: Support for changing RSS algorithm
octeontx2-af: NIX Rx flowkey configuration for RSS
octeontx2-af: Install ucast and bcast pkt forwarding rules
octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response
octeontx2-af: NPC MCAM and LDATA extract minimal configuration
octeontx2-af: Enable packet length and csum validation
octeontx2-af: Support for VTAG strip and capture
...
Eric Dumazet [Tue, 23 Oct 2018 18:54:16 +0000 (11:54 -0700)]
tcp: add tcp_reset_xmit_timer() helper
With EDT model, SRTT no longer is inflated by pacing delays.
This means that RTO and some other xmit timers might be setup
incorrectly. This is particularly visible with either :
- Very small enforced pacing rates (SO_MAX_PACING_RATE)
- Reduced rto (from the default 200 ms)
This can lead to TCP flow aborts in the worst case,
or spurious retransmits in other cases.
For example, this session gets far more throughput
than the requested 80kbit :
$ netperf -H 127.0.0.2 -l 100 -- -q 10000
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.2 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
540000 262144 262144 104.00 2.66
With the fix :
$ netperf -H 127.0.0.2 -l 100 -- -q 10000
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.2 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
540000 262144 262144 104.00 0.12
EDT allows for better control of rtx timers, since TCP has
a better idea of the earliest departure time of each skb
in the rtx queue. We only have to eventually add to the
timer the difference of the EDT time with current time.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
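The last paragraph of the commit message (add the gap between the EDT and the current time to the timer) can be sketched as follows. This is an illustrative Python model, not the kernel helper: the function names, the use of nanoseconds throughout, and the clamp to a maximum are assumptions.

```python
def pacing_delay_ns(edt_ns: int, now_ns: int) -> int:
    """Time still to wait before the skb may leave; zero if the EDT has passed."""
    return max(0, edt_ns - now_ns)


def reset_xmit_timer(when_ns: int, edt_ns: int, now_ns: int, max_when_ns: int) -> int:
    """Timer duration: the base timeout plus the pacing delay, clamped to a maximum."""
    return min(when_ns + pacing_delay_ns(edt_ns, now_ns), max_when_ns)


MS = 1_000_000  # nanoseconds per millisecond
now = 5_000 * MS

# Heavily paced flow: the head of the rtx queue departs 500 ms from now,
# so a 200 ms RTO is armed as 700 ms instead of firing before the skb leaves.
paced = reset_xmit_timer(200 * MS, now + 500 * MS, now, 120_000 * MS)

# Unpaced flow: the EDT is already in the past, so the timer is unchanged.
unpaced = reset_xmit_timer(200 * MS, now - 10 * MS, now, 120_000 * MS)
```

In this model the timer is only ever extended, never shortened, which is why spurious retransmits disappear for paced flows while unpaced flows behave as before.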
Linus Torvalds [Tue, 23 Oct 2018 19:02:03 +0000 (20:02 +0100)]
Merge branch 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Lots of small fixes and enhancements, most notably:
- Many TLB and cache flush optimizations (Dave)
- Fixed HPMC/crash handler on 64-bit kernel (Dave and myself)
- Added alternative infrastructure. The kernel now live-patches itself
for various situations, e.g. replace SMP code when running on one
CPU only or drop cache flushes when system has no cache installed.
- vmlinuz now contains a full copy of the compressed vmlinux file.
This simplifies debugging the currently booted kernel.
- Unused driver removal (Christoph)
- Reduced warnings of Dino PCI bridge when running in qemu
- Removed gcc version check (Masahiro)"
* 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (23 commits)
parisc: Retrieve and display the PDC PAT capabilities
parisc: Optimze cache flush algorithms
parisc: Remove pte_inserted define
parisc: Add PDC PAT cell_info() and pd_get_pdc_revisions() functions
parisc: Drop two instructions from pte lookup code
parisc: Use zdep for shlw macro on PA1.1 and PA2.0
parisc: Add alternative coding infrastructure
parisc: Include compressed vmlinux file in vmlinuz boot kernel
extract-vmlinux: Check for uncompressed image as fallback
parisc: Fix address in HPMC IVA
parisc: Fix exported address of os_hpmc handler
parisc: Fix map_pages() to not overwrite existing pte entries
parisc: Purge TLB entries after updating page table entry and set page accessed flag in TLB handler
parisc: Release spinlocks using ordered store
parisc: Ratelimit dino stuck interrupt warnings
parisc: dino: Utilize DINO_MASK_IRQ() macro
parisc: Clean up crash header output
parisc: Add SYSTEM_INFO and REGISTER TOC PAT functions
parisc: Remove PTE load and fault check from L2_ptep macro
parisc: Reorder TLB flush timing calculation
...
Linus Torvalds [Tue, 23 Oct 2018 18:32:10 +0000 (19:32 +0100)]
Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
"The main item in this pull request are the Spectre variant 1.1 fixes
from Julien Thierry.
A few other patches to improve various areas, and removal of some
obsolete mcount bits and a redundant kbuild conditional"
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8802/1: Call syscall_trace_exit even when system call skipped
ARM: 8797/1: spectre-v1.1: harden __copy_to_user
ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()
ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit
ARM: 8793/1: signal: replace __put_user_error with __put_user
ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()
ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state
ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context
ARM: 8789/1: signal: copy registers using __copy_to_user()
ARM: 8801/1: makefile: use ARMv3M mode for RiscPC
ARM: 8800/1: use choice for kernel unwinders
ARM: 8798/1: remove unnecessary KBUILD_SRC ifeq conditional
ARM: 8788/1: ftrace: remove old mcount support
ARM: 8786/1: Debug kernel copy by printing
Linus Torvalds [Tue, 23 Oct 2018 18:07:25 +0000 (19:07 +0100)]
Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
"Two main changes:
- Cleanups, simplifications and CLOCK_TAI support (Thomas Gleixner)
- Improve code generation (Andy Lutomirski)"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Rearrange do_hres() to improve code generation
x86/vdso: Document vgtod_ts better
x86/vdso: Remove "memory" clobbers in the vDSO syscall fallbacks
x66/vdso: Add CLOCK_TAI support
x86/vdso: Move cycle_last handling into the caller
x86/vdso: Simplify the invalid vclock case
x86/vdso: Replace the clockid switch case
x86/vdso: Collapse coarse functions
x86/vdso: Collapse high resolution functions
x86/vdso: Introduce and use vgtod_ts
x86/vdso: Use unsigned int consistently for vsyscall_gtod_data:: Seq
x86/vdso: Enforce 64bit clocksource
x86/time: Implement clocksource_arch_init()
clocksource: Provide clocksource_arch_init()
Rahul Verma [Tue, 23 Oct 2018 15:04:24 +0000 (08:04 -0700)]
qed: Fix static checker warning
Static Checker Warnings:
drivers/net/ethernet/qlogic/qed/qed_main.c:1510 qed_fill_link_capability()
error: uninitialized symbol 'tcvr_state'.
drivers/net/ethernet/qlogic/qed/qed_mcp.c:1951 qed_mcp_trans_speed_mask()
error: uninitialized symbol 'transceiver_state'.
drivers/net/ethernet/qlogic/qed/qed_mcp.c:1951 qed_mcp_trans_speed_mask()
error: uninitialized symbol 'transceiver_type'.
Symbols tcvr_state, transceiver_state and transceiver_type
are initialized with respective default state.
Fixes: c56a8be7e7aa ("qed: Add supported link and advertise link to display in ethtool.") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Tue, 23 Oct 2018 14:40:26 +0000 (16:40 +0200)]
Revert "be2net: remove desc field from be_eq_obj"
The mentioned commit needs to be reverted because we cannot pass
a string allocated on the stack to request_irq(). This function stores
the pointer for later use (e.g. /proc/interrupts), so the string
must remain valid for as long as the interrupt is registered.
Fixes: d6d9704af8f4 ("be2net: remove desc field from be_eq_obj") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
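The hazard behind this revert is that the callee keeps a reference to the name rather than a copy. Python has no stack buffers, so the sketch below is only an analogy: a reused `bytearray` stands in for the stack storage, and `IrqTable` and `EventQueue` are hypothetical names, not kernel or driver APIs.

```python
class IrqTable:
    """Toy stand-in for the IRQ core: keeps the caller's buffer, not a copy."""

    def __init__(self) -> None:
        self._names = {}

    def request_irq(self, irq: int, name: bytearray) -> None:
        self._names[irq] = name  # reference is stored for later display

    def show(self, irq: int) -> str:
        return self._names[irq].decode()


class EventQueue:
    """Long-lived object that owns its description string."""

    def __init__(self, desc: bytes) -> None:
        self.desc = bytearray(desc)  # lives as long as the queue does


table = IrqTable()

# Broken pattern: the name sits in scratch storage that is reused afterwards.
scratch = bytearray(b"eth0-q0")
table.request_irq(1, scratch)
scratch[:] = b"\x00garbage"  # the "stack frame" is overwritten
broken_view = table.show(1)

# Pattern restored by the revert: the name sits in a persistent field.
eq = EventQueue(b"eth0-q0")
table.request_irq(2, eq.desc)
scratch[:] = b"reused again"
stable_view = table.show(2)
```

The first registration ends up displaying whatever was last written to the scratch buffer, while the second keeps showing the intended name.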
This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Oct 2018 17:55:35 +0000 (10:55 -0700)]
Merge branch 'netsec-fixes'
Masahisa Kojima says:
====================
Bugfix for the netsec driver
This patch series includes bugfixes for the netsec ethernet
controller driver, fixing problems with interface down/up.
changes in v2:
- change the place to perform the PHY power down
- use the MACROs defined in include/uapi/linux/mii.h
- update commit comment
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Masahisa Kojima [Tue, 23 Oct 2018 11:24:28 +0000 (20:24 +0900)]
net: socionext: Reset tx queue in ndo_stop
We observed that the packet and byte counts are not reset
when the user brings the interface down. Eventually, the tx queue is
exhausted and packets will not be sent out.
To avoid this problem, reset the tx queue in ndo_stop.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org> Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Masahisa Kojima [Tue, 23 Oct 2018 11:24:27 +0000 (20:24 +0900)]
net: socionext: Add dummy PHY register read in phy_write()
There is a compatibility issue between RTL8211E implemented
in Developerbox and netsec ethernet controller IP.
Our MDIO controller stops the MDC clock right after a write
access, but the RTL8211E expects the MDC clock to keep toggling
for several clock cycles with MDIO high before entering
the IDLE state. If the clock is not kept running after a write access,
the write is not handled correctly and the register is not
updated.
To meet this requirement, the netsec driver needs to issue a dummy
read (e.g. read the PHYID1 (offset 0x2) register) right after a write
access, to keep the MDC clock running.
We think this compatibility issue is a problem specific to
our MDIO controller and RTL8211E.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org> Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
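The write-then-dummy-read workaround can be modelled with a toy bus. This is a hypothetical Python sketch under simplifying assumptions: `FakeMdioBus` and its latching rule are invented for illustration, and only the PHYID1 offset (0x2) comes from the commit text.

```python
MII_PHYSID1 = 0x02  # PHY identifier register 1 (offset 0x2, per the commit text)


class FakeMdioBus:
    """Toy bus: a write only takes effect if another access follows it."""

    def __init__(self) -> None:
        self.regs = {MII_PHYSID1: 0x001C}  # illustrative ID value
        self._pending = None  # write not yet latched by the PHY

    def _latch(self) -> None:
        # A following access keeps MDC toggling, so the PHY latches the write.
        if self._pending is not None:
            reg, val = self._pending
            self.regs[reg] = val
            self._pending = None

    def raw_write(self, reg: int, val: int) -> None:
        self._latch()
        self._pending = (reg, val)

    def raw_read(self, reg: int) -> int:
        self._latch()
        return self.regs.get(reg, 0)

    def idle(self) -> None:
        # MDC stops: a write that was never followed by an access is lost.
        self._pending = None


def phy_write(bus: FakeMdioBus, reg: int, val: int) -> None:
    """Write a register, then issue a dummy read so the write is latched."""
    bus.raw_write(reg, val)
    bus.raw_read(MII_PHYSID1)  # dummy read, result ignored
```

In this model a bare `raw_write` followed by `idle()` loses the value, whereas `phy_write` always leaves the register updated.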
Masahisa Kojima [Tue, 23 Oct 2018 11:24:26 +0000 (20:24 +0900)]
net: socionext: Stop PHY before resetting netsec
In ndo_stop, driver resets the netsec ethernet controller IP.
When the netsec IP is reset, HW running mode turns to NRM mode
and driver has to wait until this mode transition completes.
But mode transition to NRM will not complete if the PHY is
in normal operation state. The netsec IP requires the PHY to be in the
power down state when it is reset.
This modification stops the PHY before resetting netsec.
Together with this modification, phy_addr is stored in netsec_priv
structure because ndev->phydev is not yet ready in ndo_init.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver") Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org> Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
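The ordering constraint (power the PHY down first, then reset the controller) can be sketched as below. This is an illustrative Python model, not the driver: `FakeNetsec` and its methods are hypothetical, while `MII_BMCR` and `BMCR_PDOWN` mirror the register offset and bit value defined in include/uapi/linux/mii.h.

```python
MII_BMCR = 0x00      # basic mode control register
BMCR_PDOWN = 0x0800  # power-down bit in BMCR


class FakeNetsec:
    """Toy controller whose reset only completes if the PHY is powered down."""

    def __init__(self) -> None:
        self.phy_regs = {MII_BMCR: 0x1140}  # illustrative "normal operation" value
        self.mode = "running"

    def phy_read(self, reg: int) -> int:
        return self.phy_regs[reg]

    def phy_write(self, reg: int, val: int) -> None:
        self.phy_regs[reg] = val

    def reset(self) -> bool:
        # The transition to NRM mode completes only with the PHY powered down.
        if self.phy_regs[MII_BMCR] & BMCR_PDOWN:
            self.mode = "NRM"
            return True
        return False  # stands in for the mode-transition wait timing out


def stop_phy(dev: FakeNetsec) -> None:
    """Set the power-down bit while preserving the other BMCR bits."""
    dev.phy_write(MII_BMCR, dev.phy_read(MII_BMCR) | BMCR_PDOWN)


def ndo_stop(dev: FakeNetsec) -> bool:
    stop_phy(dev)       # must come first
    return dev.reset()  # now the mode transition can complete
```

Calling `reset()` on a fresh `FakeNetsec` fails in this model, while `ndo_stop()` succeeds because the power-down bit is set beforehand.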
Linus Torvalds [Tue, 23 Oct 2018 17:43:04 +0000 (18:43 +0100)]
Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti updates from Ingo Molnar:
"The main changes:
- Make the IBPB barrier more strict and add STIBP support (Jiri
Kosina)
- Micro-optimize and clean up the entry code (Andy Lutomirski)
- ... plus misc other fixes"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation: Propagate information about RSB filling mitigation to sysfs
x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
x86/CPU: Fix unused variable warning when !CONFIG_IA32_EMULATION
x86/pti/64: Remove the SYSCALL64 entry trampoline
x86/entry/64: Use the TSS sp2 slot for SYSCALL/SYSRET scratch space
x86/entry/64: Document idtentry
Linus Torvalds [Tue, 23 Oct 2018 17:17:11 +0000 (18:17 +0100)]
Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
"Two minor OLPC changes: a build fix and a new quirk"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/olpc: Fix build error with CONFIG_MFD_CS5535=m
x86/olpc: Indicate that legacy PC XO-1 platform should not register RTC
Linus Torvalds [Tue, 23 Oct 2018 16:54:58 +0000 (17:54 +0100)]
Merge branch 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 paravirt updates from Ingo Molnar:
"Two main changes:
- Remove no longer used parts of the paravirt infrastructure and put
large quantities of paravirt ops under a new config option
PARAVIRT_XXL=y, which is selected by XEN_PV only. (Juergen Gross)
- Enable PV spinlocks on Hyperv (Yi Sun)"
* 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyperv: Enable PV qspinlock for Hyper-V
x86/hyperv: Add GUEST_IDLE_MSR support
x86/paravirt: Clean up native_patch()
x86/paravirt: Prevent redefinition of SAVE_FLAGS macro
x86/xen: Make xen_reservation_lock static
x86/paravirt: Remove unneeded mmu related paravirt ops bits
x86/paravirt: Move the Xen-only pv_mmu_ops under the PARAVIRT_XXL umbrella
x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella
x86/paravirt: Move the Xen-only pv_cpu_ops under the PARAVIRT_XXL umbrella
x86/paravirt: Move items in pv_info under PARAVIRT_XXL umbrella
x86/paravirt: Introduce new config option PARAVIRT_XXL
x86/paravirt: Remove unused paravirt bits
x86/paravirt: Use a single ops structure
x86/paravirt: Remove clobbers from struct paravirt_patch_site
x86/paravirt: Remove clobbers parameter from paravirt patch functions
x86/paravirt: Make paravirt_patch_call() and paravirt_patch_jmp() static
x86/xen: Add SPDX identifier in arch/x86/xen files
x86/xen: Link platform-pci-unplug.o only if CONFIG_XEN_PVHVM
x86/xen: Move pv specific parts of arch/x86/xen/mmu.c to mmu_pv.c
x86/xen: Move pv irq related functions under CONFIG_XEN_PV umbrella
Linus Torvalds [Tue, 23 Oct 2018 16:05:28 +0000 (17:05 +0100)]
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
"Lots of changes in this cycle:
- Lots of CPA (change page attribute) optimizations and related
cleanups (Thomas Gleixner, Peter Zijlstra)
- Make lazy TLB mode even lazier (Rik van Riel)
- Fault handler cleanups and improvements (Dave Hansen)
- kdump, vmcore: Enable kdumping encrypted memory with AMD SME
enabled (Lianbo Jiang)
- Clean up VM layout documentation (Baoquan He, Ingo Molnar)
- ... plus misc other fixes and enhancements"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
x86/stackprotector: Remove the call to boot_init_stack_canary() from cpu_startup_entry()
x86/mm: Kill stray kernel fault handling comment
x86/mm: Do not warn about PCI BIOS W+X mappings
resource: Clean it up a bit
resource: Fix find_next_iomem_res() iteration issue
resource: Include resource end in walk_*() interfaces
x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
x86/mm: Remove spurious fault pkey check
x86/mm/vsyscall: Consider vsyscall page part of user address space
x86/mm: Add vsyscall address helper
x86/mm: Fix exception table comments
x86/mm: Add clarifying comments for user addr space
x86/mm: Break out user address space handling
x86/mm: Break out kernel address space handling
x86/mm: Clarify hardware vs. software "error_code"
x86/mm/tlb: Make lazy TLB mode lazier
x86/mm/tlb: Add freed_tables element to flush_tlb_info
x86/mm/tlb: Add freed_tables argument to flush_tlb_mm_range
smp,cpumask: introduce on_each_cpu_cond_mask
smp: use __cpumask_set_cpu in on_each_cpu_cond
...
Linus Torvalds [Tue, 23 Oct 2018 15:47:41 +0000 (16:47 +0100)]
Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 hyperv updates from Ingo Molnar:
"Two small changes: a boot warning removal and a minor cleanup"
* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyperv: Remove unused include
x86/hyperv: Suppress "PCI: Fatal: No config space access function found"
Linus Torvalds [Tue, 23 Oct 2018 15:31:33 +0000 (16:31 +0100)]
Merge branch 'x86-grub2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 grub2 updates from Ingo Molnar:
"This extends the x86 boot protocol to include an address for the RSDP
table - utilized by Xen currently.
Matching Grub2 patches are pending as well. (Juergen Gross)"
* 'x86-grub2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi, x86/boot: Take RSDP address for boot params if available
x86/boot: Add ACPI RSDP address to setup_header
x86/xen: Fix boot loader version reported for PVH guests
Linus Torvalds [Tue, 23 Oct 2018 15:16:40 +0000 (16:16 +0100)]
Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Ingo Molnar:
"The main changes in this cycle were:
- Add support for the "Dhyana" x86 CPUs by Hygon: these are licensed
based on the AMD Zen architecture, and are built and sold in China,
for domestic datacenter use. The code is pretty close to AMD
support, mostly with a few quirks and enumeration differences. (Pu
Wen)
- Enable CPUID support on Cyrix 6x86/6x86L processors"
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/cpupower: Add Hygon Dhyana support
cpufreq: Add Hygon Dhyana support
ACPI: Add Hygon Dhyana support
x86/xen: Add Hygon Dhyana support to Xen
x86/kvm: Add Hygon Dhyana support to KVM
x86/mce: Add Hygon Dhyana support to the MCA infrastructure
x86/bugs: Add Hygon Dhyana to the respective mitigation machinery
x86/apic: Add Hygon Dhyana support
x86/pci, x86/amd_nb: Add Hygon Dhyana support to PCI and northbridge
x86/amd_nb: Check vendor in AMD-only functions
x86/alternative: Init ideal_nops for Hygon Dhyana
x86/events: Add Hygon Dhyana support to PMU infrastructure
x86/smpboot: Do not use BSP INIT delay and MWAIT to idle on Dhyana
x86/cpu/mtrr: Support TOP_MEM2 and get MTRR number
x86/cpu: Get cache info and setup cache cpumap for Hygon Dhyana
x86/cpu: Create Hygon Dhyana architecture support file
x86/CPU: Change query logic so CPUID is enabled before testing
x86/CPU: Use correct macros for Cyrix calls
Linus Torvalds [Tue, 23 Oct 2018 15:04:22 +0000 (16:04 +0100)]
Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build update from Ingo Molnar:
"A small cleanup to x86 Kconfigs"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kconfig: Remove redundant 'default n' lines from all x86 Kconfig's
Linus Torvalds [Tue, 23 Oct 2018 14:54:42 +0000 (15:54 +0100)]
Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
"Two cleanups and a bugfix for a rare boot option combination"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/KASLR: Remove return value from handle_mem_options()
x86/corruption-check: Use pr_*() instead of printk()
x86/corruption-check: Fix panic in memory_corruption_check() when boot option without value is provided
Linus Torvalds [Tue, 23 Oct 2018 14:24:22 +0000 (15:24 +0100)]
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were the fsgsbase related preparatory
patches from Chang S. Bae - but there's also an optimized
memcpy_flushcache() and a cleanup for the __cmpxchg_double() assembly
glue"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fsgsbase/64: Clean up various details
x86/segments: Introduce the 'CPUNODE' naming to better document the segment limit CPU/node NR trick
x86/vdso: Initialize the CPU/node NR segment descriptor earlier
x86/vdso: Introduce helper functions for CPU and node number
x86/segments/64: Rename the GDT PER_CPU entry to CPU_NUMBER
x86/fsgsbase/64: Factor out FS/GS segment loading from __switch_to()
x86/fsgsbase/64: Convert the ELF core dump code to the new FSGSBASE helpers
x86/fsgsbase/64: Make ptrace use the new FS/GS base helpers
x86/fsgsbase/64: Introduce FS/GS base helper functions
x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately
x86/asm: Use CC_SET()/CC_OUT() in __cmpxchg_double()
x86/asm: Optimize memcpy_flushcache()
Linus Torvalds [Tue, 23 Oct 2018 14:15:20 +0000 (15:15 +0100)]
Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Ingo Molnar:
"Improve the spreading of managed IRQs at allocation time"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq/matrix: Spread managed interrupts on allocation
irq/matrix: Split out the CPU selection code into a helper
- Topology handling improvements, in particular when CPU capacity
changes and related load-balancing fixes/improvements (Morten
Rasmussen)
- ... plus misc other improvements, fixes and updates"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions
sched/completions/Documentation: Clean up the document some more
sched/completions/Documentation: Fix a couple of punctuation nits
cpu/SMT: State SMT is disabled even with nosmt and without "=force"
sched/core: Fix comment regarding nr_iowait_cpu() and get_iowait_load()
sched/fair: Remove setting task's se->runnable_weight during PELT update
sched/fair: Disable LB_BIAS by default
sched/pelt: Fix warning and clean up IRQ PELT config
sched/topology: Make local variables static
sched/debug: Use symbolic names for task state constants
sched/numa: Remove unused numa_stats::nr_running field
sched/numa: Remove unused code from update_numa_stats()
sched/debug: Explicitly cast sched_feat() to bool
sched/core: Disable SD_PREFER_SIBLING on asymmetric CPU capacity domains
sched/fair: Don't move tasks to lower capacity CPUs unless necessary
sched/fair: Set rq->rd->overload when misfit
sched/fair: Wrap rq->rd->overload accesses with READ/WRITE_ONCE()
sched/core: Change root_domain->overload type to int
sched/fair: Change 'prefer_sibling' type to bool
sched/fair: Kick nohz balance if rq->misfit_task_load
...
Linus Torvalds [Tue, 23 Oct 2018 12:46:36 +0000 (13:46 +0100)]
Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
"Misc smaller fixes and cleanups"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mcelog: Remove one mce_helper definition
x86/mce: Add macros for the corrected error count bit field
x86/mce: Use BIT_ULL(x) for bit mask definitions
x86/mce-inject: Reset injection struct after injection
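For readers unfamiliar with the BIT_ULL() style named in the shortlog above, here is a minimal user-space sketch. The macro bodies are stand-ins written for this note, and EXAMPLE_STATUS_VALID and example_status_is_valid() are hypothetical names, not the definitions touched by the commit:

```c
#include <assert.h>
#include <stdint.h>

/*
 * User-space stand-ins for the kernel's BIT()/BIT_ULL() helpers.
 * These are assumptions for illustration, not copied kernel code.
 */
#define BIT(nr)      (1UL << (nr))
#define BIT_ULL(nr)  (1ULL << (nr))

/* Hypothetical 64-bit status register flag, for illustration only. */
#define EXAMPLE_STATUS_VALID  BIT_ULL(63)

/* Returns nonzero when the hypothetical "valid" bit is set. */
static inline int example_status_is_valid(uint64_t status)
{
    return (status & EXAMPLE_STATUS_VALID) != 0;
}
```

The ULL variant matters for bits above 31: it keeps the shift 64 bits wide even where unsigned long is only 32 bits.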
Linus Torvalds [Tue, 23 Oct 2018 12:32:18 +0000 (13:32 +0100)]
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The main updates in this cycle were:
- Lots of perf tooling changes too voluminous to list (big perf trace
and perf stat improvements, lots of libtraceevent reorganization,
etc.), so I'll list the authors and refer to the changelog for
details:
Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter
Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven
Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas
Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir
Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa.
... with the bulk of the changes written by Jiri Olsa, Tzvetomir
Stoyanov and Arnaldo Carvalho de Melo.
- Continued intel_rdt work with a focus on playing well with perf
events. This also imported some non-perf RDT work due to
dependencies. (Reinette Chatre)
- Implement counter freezing for Arch Perfmon v4 (Skylake and newer).
This speeds up the PMI handler by avoiding unnecessary MSR writes
and makes it more accurate. (Andi Kleen)
- kprobes cleanups and simplification (Masami Hiramatsu)
- Intel Goldmont PMU updates (Kan Liang)
- ... plus misc other fixes and updates"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits)
kprobes/x86: Use preempt_enable() in optimized_callback()
x86/intel_rdt: Prevent pseudo-locking from using stale pointers
kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
perf/x86/intel: Export mem events only if there's PEBS support
x86/cpu: Drop pointless static qualifier in punit_dev_state_show()
x86/intel_rdt: Fix initial allocation to consider CDP
x86/intel_rdt: CBM overlap should also check for overlap with CDP peer
x86/intel_rdt: Introduce utility to obtain CDP peer
tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file
tools lib traceevent: Separate out tep_strerror() for strerror_r() issues
perf python: More portable way to make CFLAGS work with clang
perf python: Make clang_has_option() work on Python 3
perf tools: Free temporary 'sys' string in read_event_files()
perf tools: Avoid double free in read_event_file()
perf tools: Free 'printk' string in parse_ftrace_printk()
perf tools: Cleanup trace-event-info 'tdata' leak
perf strbuf: Match va_{add,copy} with va_end
perf test: S390 does not support watchpoints in test 22
perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG
tools include: Adopt linux/bits.h
...
Linus Torvalds [Tue, 23 Oct 2018 12:08:53 +0000 (13:08 +0100)]
Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and misc x86 updates from Ingo Molnar:
"Lots of changes in this cycle - in part because locking/core attracted
a number of related x86 low level work which was easier to handle in a
single tree:
- Linux Kernel Memory Consistency Model updates (Alan Stern, Paul E.
McKenney, Andrea Parri)
- lockdep scalability improvements and micro-optimizations (Waiman
Long)
- rwsem improvements (Waiman Long)
- spinlock micro-optimization (Matthew Wilcox)
- qspinlocks: Provide a liveness guarantee (more fairness) on x86.
(Peter Zijlstra)
- Add support for relative references in jump tables on arm64, x86
and s390 to optimize jump labels (Ard Biesheuvel, Heiko Carstens)
- Be a lot less permissive on weird (kernel address) uaccess faults
on x86: BUG() when uaccess helpers fault on kernel addresses (Jann
Horn)
- Macrofy x86 asm statements to un-confuse the GCC inliner. (Nadav
Amit)
- ... and a handful of other smaller changes as well"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
locking/lockdep: Make global debug_locks* variables read-mostly
locking/lockdep: Fix debug_locks off performance problem
locking/pvqspinlock: Extend node size when pvqspinlock is configured
locking/qspinlock_stat: Count instances of nested lock slowpaths
locking/qspinlock, x86: Provide liveness guarantee
x86/asm: 'Simplify' GEN_*_RMWcc() macros
locking/qspinlock: Rework some comments
locking/qspinlock: Re-order code
locking/lockdep: Remove duplicated 'lock_class_ops' percpu array
x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y
futex: Replace spin_is_locked() with lockdep
locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y
x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs
x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs
x86/extable: Macrofy inline assembly code to work around GCC inlining bugs
x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops
x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs
x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs
x86/refcount: Work around GCC inlining bug
x86/objtool: Use asm macros to work around GCC inlining bugs
...
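The "relative references in jump tables" item in the pull message above can be sketched in user-space C as follows. This is only an illustration of the general idea (store a 32-bit offset from the field's own address instead of a full pointer); struct rel_entry, rel_set() and rel_get() are hypothetical names, and the sketch assumes the target lies within 2 GiB of the entry:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table entry: two 32-bit self-relative references. */
struct rel_entry {
    int32_t code;    /* offset from &code to the patch site   */
    int32_t target;  /* offset from &target to the jump label */
};

/* Store addr as an offset relative to the field's own location. */
static void rel_set(int32_t *field, const void *addr)
{
    *field = (int32_t)((intptr_t)addr - (intptr_t)field);
}

/* Recover the absolute address from a self-relative field. */
static void *rel_get(const int32_t *field)
{
    return (void *)((intptr_t)field + *field);
}
```

On a 64-bit build each reference takes 4 bytes instead of 8, and because the stored value is a difference of two addresses it stays valid if the whole image is loaded at a different base.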
Linus Torvalds [Tue, 23 Oct 2018 12:04:03 +0000 (13:04 +0100)]
Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes are:
- Add support for enlisting the help of the EFI firmware to create
memory reservations that persist across kexec.
- Add page fault handling to the runtime services support code on x86
so we can more gracefully recover from buggy EFI firmware.
- Fix command line handling on x86 for the boot path that omits the
stub's PE/COFF entry point.
- Other assorted fixes and updates"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: boot: Fix EFI stub alignment
efi/x86: Call efi_parse_options() from efi_main()
efi/x86: earlyprintk - Add 64bit efi fb address support
efi/x86: drop task_lock() from efi_switch_mm()
efi/x86: Handle page faults occurring while running EFI runtime services
efi: Make efi_rts_work accessible to efi page fault handler
efi/efi_test: add exporting ResetSystem runtime service
efi/libstub: arm: support building with clang
efi: add API to reserve memory persistently across kexec reboot
efi/arm: libstub: add a root memreserve config table
efi: honour memory reservations passed via a linux specific config table
Linus Torvalds [Tue, 23 Oct 2018 11:31:17 +0000 (12:31 +0100)]
Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The biggest change in this cycle is the conclusion of the big
'simplify RCU to two primary flavors' consolidation work - i.e.
there's a single RCU flavor for any kernel variant (PREEMPT and
!PREEMPT):
- Consolidate the RCU-bh, RCU-preempt, and RCU-sched flavors into a
single flavor similar to RCU-sched in !PREEMPT kernels and into a
single flavor similar to RCU-preempt (but also waiting on
preempt-disabled sequences of code) in PREEMPT kernels.
This branch also includes a refactoring of
rcu_{nmi,irq}_{enter,exit}() from Byungchul Park.
- Now that there is only one RCU flavor in any given running kernel,
the many "rsp" pointers are no longer required, and this cleanup
series removes them.
- This branch carries out additional cleanups made possible by the
RCU flavor consolidation, including inlining now-trivial
functions, updating comments and definitions, and removing
now-unneeded rcutorture scenarios.
- Now that there is only one flavor of RCU in any running kernel,
there is also only one rcu_data structure per CPU. This means that
the rcu_dynticks structure can be merged into the rcu_data
structure, a task taken on by this branch. This branch also
contains a -rt-related fix from Mike Galbraith.
There were also other updates:
- Documentation updates, including some good-eye catches from Joel
Fernandes.
- SRCU updates, most notably changes enabling call_srcu() to be
invoked very early in the boot sequence.
- Torture-test updates, including some preliminary work towards
making rcutorture better able to find problems that result in
insufficient grace-period forward progress.
- Initial changes to RCU to better promote forward progress of grace
periods, including fixing a bug found by Marius Hillenbrand and
David Woodhouse, with the fix suggested by Peter Zijlstra"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits)
srcu: Make early-boot call_srcu() reuse workqueue lists
rcutorture: Test early boot call_srcu()
srcu: Make call_srcu() available during very early boot
rcu: Convert rcu_state.ofl_lock to raw_spinlock_t
rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed
rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks
rcu: Switch dyntick nesting counters to rcu_data structure
rcu: Switch urgent quiescent-state requests to rcu_data structure
rcu: Switch lazy counts to rcu_data structure
rcu: Switch last accelerate/advance to rcu_data structure
rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure
rcu: Merge rcu_dynticks structure into rcu_data structure
rcu: Remove unused rcu_dynticks_snap() from Tiny RCU
rcu: Convert "1UL << x" to "BIT(x)"
rcu: Avoid resched_cpu() when rescheduling the current CPU
rcu: More aggressively enlist scheduler aid for nohz_full CPUs
rcu: Compute jiffies_till_sched_qs from other kernel parameters
rcu: Provide functions for determining if call_rcu() has been invoked
rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure
rcu: Motivate Tiny RCU forward progress
...
Linus Torvalds [Tue, 23 Oct 2018 10:14:47 +0000 (11:14 +0100)]
Merge tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- Improved access control for the zcrypt driver: multiple device nodes
can now be created with different access control lists
- Extend the pkey API to provide random protected keys, which is useful
for encrypted swap devices with ephemeral protected keys
- Add support for virtually mapped kernel stacks
- Rework the early boot code: this moves the memory detection into the
boot code that runs prior to decompression.
- Add KASAN support
- Bug fixes and cleanups
* tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
s390/pkey: move pckmo subfunction available checks away from module init
s390/kasan: support preemptible kernel build
s390/pkey: Load pkey kernel module automatically
s390/perf: Return error when debug_register fails
s390/sthyi: Fix machine name validity indication
s390/zcrypt: fix broken zcrypt_send_cprb in-kernel api function
s390/vmalloc: fix VMALLOC_START calculation
s390/mem_detect: add missing include
s390/dumpstack: print psw mask and address again
s390/crypto: Enhance paes cipher to accept variable length key material
s390/pkey: Introduce new API for transforming key blobs
s390/pkey: Introduce new API for random protected key verification
s390/pkey: Add sysfs attributes to emit secure key blobs
s390/pkey: Add sysfs attributes to emit protected key blobs
s390/pkey: Define protected key blob format
s390/pkey: Introduce new API for random protected key generation
s390/zcrypt: add ap_adapter_mask sysfs attribute
s390/zcrypt: provide apfs failure code on type 86 error reply
s390/zcrypt: zcrypt device driver cleanup
s390/kasan: add support for mem= kernel parameter
...
Linus Torvalds [Tue, 23 Oct 2018 10:06:43 +0000 (11:06 +0100)]
Merge tag 'please-pull-next' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull ia64 updates from Tony Luck:
"Miscellaneous ia64 fixes from Christoph"
* tag 'please-pull-next' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
intel-iommu: mark intel_dma_ops static
ia64: remove machvec_dma_sync_{single,sg}
ia64/sn2: remove no-ops dma sync methods
ia64: remove the unused iommu_dma_init function
ia64: remove the unused pci_iommu_shutdown function
ia64: remove the unused bad_dma_address symbol
ia64: remove iommu_dma_supported
ia64: remove the dead iommu_sac_force variable
ia64: remove the kern_mem_attribute export
Linus Torvalds [Tue, 23 Oct 2018 10:02:05 +0000 (11:02 +0100)]
Merge branch 'stable/for-linus-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull xen swiotlb fix from Konrad Rzeszutek Wilk:
"One tiny fix for a problem in the Xen SWIOTLB mechanism that
occasionally showed up with devices that allocated odd sizes rather
than power-of-two sizes.
We neglected to make the memory coherent, leading to all kinds of fun
crashes"
* 'stable/for-linus-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
xen-swiotlb: use actually allocated size on check physical continuous
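To illustrate why the requested size and the actually allocated size can differ, here is a minimal user-space sketch. It assumes allocations are rounded up to a power-of-two number of 4 KiB pages; EX_PAGE_SHIFT, ex_get_order() and ex_allocated_size() are hypothetical names, not the driver's code:

```c
#include <assert.h>
#include <stddef.h>

#define EX_PAGE_SHIFT 12  /* assumed 4 KiB pages, for illustration */

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int ex_get_order(size_t size)
{
    size_t pages = (size + ((size_t)1 << EX_PAGE_SHIFT) - 1) >> EX_PAGE_SHIFT;
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Bytes actually allocated for a request of size bytes. */
static size_t ex_allocated_size(size_t size)
{
    return (size_t)1 << (ex_get_order(size) + EX_PAGE_SHIFT);
}
```

A request for three pages is served from a four-page block under this assumption, so a check that only covers the requested length would miss the last page.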
Linus Torvalds [Tue, 23 Oct 2018 09:33:16 +0000 (10:33 +0100)]
Merge tag 'acpi-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These fix ACPICA issues related to the handling of module-level AML,
fix an ordering issue during ACPI initialization, update ACPICA to
upstream revision 20181003 (including fixes mostly), fix issues with
system-wide suspend/resume related to the ACPI driver for Intel SoCs
(LPSS), fix device enumeration issues on boards with Dollar Cove or
Whiskey Cove Intel PMICs, prevent ACPICA from calling ktime_get() in
unsuitable conditions, update a few drivers and clean up some code in
several places.
Specifics:
- Fix ACPICA issues related to the handling of module-level AML and
make the ACPI initialization code parse ECDT before loading the
definition block tables (Erik Schmauss).
- Update ACPICA to upstream revision 20181003 including fixes related
to the ill-defined "generic serial bus" and the handling of the
_REG object (Bob Moore).
- Fix some issues with system-wide suspend/resume on Intel BYT/CHT
related to the handling of I2C controllers in the ACPI LPSS driver
for Intel SoCs (Hans de Goede).
- Modify the ACPI namespace scanning code to enumerate INT33FE HID
devices as platform devices with I2C resources to avoid device
enumeration problems on boards with Dollar Cove or Whiskey Cove
Intel PMICs (Hans de Goede).
- Prevent ACPICA from using ktime_get() during early resume from
system-wide suspend before resuming the timekeeping which generally
is unsafe and triggers a warning from the timekeeping code (Bart
Van Assche).
- Add low-level real time clock support to the ACPI Time and Alarm
Device (TAD) driver (Rafael Wysocki).
- Fix the ACPI SBS driver to avoid GPE storms on MacBook Pro and
Oopses when removing modules (Ronald Tschalär).
- Fix the ACPI PPTT parsing code to handle architecturally unknown
cache types properly (Jeffrey Hugo).
- Fix initialization issue in the ACPI processor driver (Dou Liyang).
- Clean up the code in several places (Andy Shevchenko, Bartlomiej
Zolnierkiewicz, David Arcari, zhong jiang)"
* tag 'acpi-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (33 commits)
ACPI / scan: Create platform device for INT33FE ACPI nodes
ACPI / OSL: Use 'jiffies' as the time bassis for acpi_os_get_timer()
ACPI: probe ECDT before loading AML tables regardless of module-level code flag
ACPICA: Remove acpi_gbl_group_module_level_code and only use acpi_gbl_execute_tables_as_methods instead
ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodes
ACPICA: AML interpreter: add region addresses in global list during initialization
ACPI: TAD: Add low-level support for real time capability
ACPI: remove redundant 'default n' from Kconfig
ACPI / SBS: Fix rare oops when removing modules
ACPI / SBS: Fix GPE storm on recent MacBookPro's
ACPI/PPTT: Handle architecturally unknown cache types
drivers: base: cacheinfo: Do not populate sysfs for unknown cache types
ACPICA: Update version to 20181003
ACPICA: Never run _REG on system_memory and system_IO
ACPICA: Split large interpreter file
ACPICA: Update for field unit access
ACPICA: Rename some of the Field Attribute defines
ACPICA: Update for generic_serial_bus and attrib_raw_process_bytes protocol
ACPI / processor: Fix the return value of acpi_processor_ids_walk()
ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq
...