Linus Torvalds [Fri, 24 Aug 2018 02:20:12 +0000 (19:20 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
- the rest of MM
- various misc fixes and tweaks
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
mm: Change return type int to vm_fault_t for fault handlers
lib/fonts: convert comments to utf-8
s390: ebcdic: convert comments to UTF-8
treewide: convert ISO_8859-1 text comments to utf-8
drivers/gpu/drm/gma500/: change return type to vm_fault_t
docs/core-api: mm-api: add section about GFP flags
docs/mm: make GFP flags descriptions usable as kernel-doc
docs/core-api: split memory management API to a separate file
docs/core-api: move *{str,mem}dup* to "String Manipulation"
docs/core-api: kill trailing whitespace in kernel-api.rst
mm/util: add kernel-doc for kvfree
mm/util: make strndup_user description a kernel-doc comment
fs/proc/vmcore.c: hide vmcoredd_mmap_dumps() for nommu builds
treewide: correct "differenciate" and "instanciate" typos
fs/afs: use new return type vm_fault_t
drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t
mm: soft-offline: close the race against page allocation
mm: fix race on soft-offlining free huge pages
namei: allow restricted O_CREAT of FIFOs and regular files
hfs: prevent crash on exit from failed search
...
Souptick Joarder [Fri, 24 Aug 2018 00:01:36 +0000 (17:01 -0700)]
mm: Change return type int to vm_fault_t for fault handlers
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")
The aim is to change the return type of finish_fault() and
handle_mm_fault() to vm_fault_t type. As part of that clean up return
type of all other recursively called functions have been changed to
vm_fault_t type.
The places from where handle_mm_fault() is getting invoked will be
change to vm_fault_t type but in a separate patch.
vmf_error() is the newly introduce inline function in 4.17-rc6.
[akpm@linux-foundation.org: don't shadow outer local `ret' in __do_huge_pmd_anonymous_page()] Link: http://lkml.kernel.org/r/20180604171727.GA20279@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 24 Aug 2018 00:01:29 +0000 (17:01 -0700)]
s390: ebcdic: convert comments to UTF-8
The ebcdic.c file contains tables for converting between ebcdic and PC
codepage 437. I could however not identify which encoding was used for
the comments. This seems to be some variation of ISO_8859-1 with
non-UTF-8 escape characters.
I have converted this to UTF-8 by manually removing the escape
characters and then running it through recode, to get the same encoding
that we use for the rest of the kernel.
Arnd Bergmann [Fri, 24 Aug 2018 00:01:26 +0000 (17:01 -0700)]
treewide: convert ISO_8859-1 text comments to utf-8
Almost all files in the kernel are either plain text or UTF-8 encoded. A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.
This converts them all to UTF-8 for consistency.
Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Simon Horman <horms@verge.net.au> [IPVS portion] Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [IIO] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Rob Herring <robh@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Fri, 24 Aug 2018 00:01:22 +0000 (17:01 -0700)]
drivers/gpu/drm/gma500/: change return type to vm_fault_t
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
Ref-> 1c8f422059ae ("mm: change return type to vm_fault_t")
Previously vm_insert_{pfn,mixed} returns err which driver mapped into
VM_FAULT_* type. The new function vmf_insert_{pfn,mixed} will replace
this inefficiency by returning VM_FAULT_* type.
vmf_error() is the newly introduce inline function in 4.17-rc6.
Link: http://lkml.kernel.org/r/20180713154541.GA3345@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: David Airlie <airlied@linux.ie> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:19 +0000 (17:01 -0700)]
docs/core-api: mm-api: add section about GFP flags
Link: http://lkml.kernel.org/r/1532626360-16650-8-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:15 +0000 (17:01 -0700)]
docs/mm: make GFP flags descriptions usable as kernel-doc
This patch adds DOC: headings for GFP flag descriptions and adjusts the
formatting to fit sphinx expectations of paragraphs.
Link: http://lkml.kernel.org/r/1532626360-16650-7-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:12 +0000 (17:01 -0700)]
docs/core-api: split memory management API to a separate file
This is basically copy-paste of the memory management section from
kernel-api.rst with some minor adjustments:
* The "User Space Memory Access" is moved to the beginning
* The get_user_pages_fast reference is now a part of "User Space Memory
Access"
* And, of course, headings are adjusted with section being promoted to
chapters
Link: http://lkml.kernel.org/r/1532626360-16650-6-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:09 +0000 (17:01 -0700)]
docs/core-api: move *{str,mem}dup* to "String Manipulation"
The string and memory duplication routines fit better to the "String
Manipulation" section than to "The SLAB Cache".
Link: http://lkml.kernel.org/r/1532626360-16650-5-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:05 +0000 (17:01 -0700)]
docs/core-api: kill trailing whitespace in kernel-api.rst
Link: http://lkml.kernel.org/r/1532626360-16650-4-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:01:02 +0000 (17:01 -0700)]
mm/util: add kernel-doc for kvfree
Link: http://lkml.kernel.org/r/1532626360-16650-3-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 24 Aug 2018 00:00:59 +0000 (17:00 -0700)]
mm/util: make strndup_user description a kernel-doc comment
Patch series "memory management documentation updates", v3.
Here are several updates to the mm documentation.
Aside from really minor changes in the first three patches, the updates
are:
* move the documentation of kstrdup and friends to "String Manipulation"
section
* split memory management API into a separate .rst file
* adjust formating of the GFP flags description and include it in the
reference documentation.
This patch (of 7):
The description of the strndup_user function misses '*' character at the
beginning of the comment to be proper kernel-doc. Add the missing
character.
Link: http://lkml.kernel.org/r/1532626360-16650-2-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 24 Aug 2018 00:00:55 +0000 (17:00 -0700)]
fs/proc/vmcore.c: hide vmcoredd_mmap_dumps() for nommu builds
Without CONFIG_MMU, we get a build warning:
fs/proc/vmcore.c:228:12: error: 'vmcoredd_mmap_dumps' defined but not used [-Werror=unused-function]
static int vmcoredd_mmap_dumps(struct vm_area_struct *vma, unsigned long dst,
The function is only referenced from an #ifdef'ed caller, so
this uses the same #ifdef around it.
Souptick Joarder [Fri, 24 Aug 2018 00:00:48 +0000 (17:00 -0700)]
fs/afs: use new return type vm_fault_t
Use new return type vm_fault_t for fault handler in struct
vm_operations_struct. For now, this is just documenting that the
function returns a VM_FAULT value rather than an errno. Once all
instances are converted, vm_fault_t will become a distinct type.
See 1c8f422059ae ("mm: change return type to vm_fault_t") for reference.
Link: http://lkml.kernel.org/r/20180702152017.GA3780@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Fri, 24 Aug 2018 00:00:45 +0000 (17:00 -0700)]
drivers/hwtracing/intel_th/msu.c: change return type to vm_fault_t
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
See 1c8f422059ae ("mm: change return type to vm_fault_t") for reference.
Link: http://lkml.kernel.org/r/20180702155801.GA4010@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 24 Aug 2018 00:00:42 +0000 (17:00 -0700)]
mm: soft-offline: close the race against page allocation
A process can be killed with SIGBUS(BUS_MCEERR_AR) when it tries to
allocate a page that was just freed on the way of soft-offline. This is
undesirable because soft-offline (which is about corrected error) is
less aggressive than hard-offline (which is about uncorrected error),
and we can make soft-offline fail and keep using the page for good
reason like "system is busy."
Two main changes of this patch are:
- setting migrate type of the target page to MIGRATE_ISOLATE. As done
in free_unref_page_commit(), this makes kernel bypass pcplist when
freeing the page. So we can assume that the page is in freelist just
after put_page() returns,
- setting PG_hwpoison on free page under zone->lock which protects
freelists, so this allows us to avoid setting PG_hwpoison on a page
that is decided to be allocated soon.
[akpm@linux-foundation.org: tweak set_hwpoison_free_buddy_page() comment] Link: http://lkml.kernel.org/r/1531452366-11661-3-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com> Tested-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <zy.zhengyi@alibaba-inc.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Fri, 24 Aug 2018 00:00:38 +0000 (17:00 -0700)]
mm: fix race on soft-offlining free huge pages
Patch series "mm: soft-offline: fix race against page allocation".
Xishi recently reported the issue about race on reusing the target pages
of soft offlining. Discussion and analysis showed that we need make
sure that setting PG_hwpoison should be done in the right place under
zone->lock for soft offline. 1/2 handles free hugepage's case, and 2/2
hanldes free buddy page's case.
This patch (of 2):
There's a race condition between soft offline and hugetlb_fault which
causes unexpected process killing and/or hugetlb allocation failure.
The process killing is caused by the following flow:
CPU 0 CPU 1 CPU 2
soft offline
get_any_page
// find the hugetlb is free
mmap a hugetlb file
page fault
...
hugetlb_fault
hugetlb_no_page
alloc_huge_page
// succeed
soft_offline_free_page
// set hwpoison flag
mmap the hugetlb file
page fault
...
hugetlb_fault
hugetlb_no_page
find_lock_page
return VM_FAULT_HWPOISON
mm_fault_error
do_sigbus
// kill the process
The hugetlb allocation failure comes from the following flow:
CPU 0 CPU 1
mmap a hugetlb file
// reserve all free page but don't fault-in
soft offline
get_any_page
// find the hugetlb is free
soft_offline_free_page
// set hwpoison flag
dissolve_free_huge_page
// fail because all free hugepages are reserved
page fault
...
hugetlb_fault
hugetlb_no_page
alloc_huge_page
...
dequeue_huge_page_node_exact
// ignore hwpoisoned hugepage
// and finally fail due to no-mem
The root cause of this is that current soft-offline code is written based
on an assumption that PageHWPoison flag should be set at first to avoid
accessing the corrupted data. This makes sense for memory_failure() or
hard offline, but does not for soft offline because soft offline is about
corrected (not uncorrected) error and is safe from data lost. This patch
changes soft offline semantics where it sets PageHWPoison flag only after
containment of the error page completes successfully.
Link: http://lkml.kernel.org/r/1531452366-11661-2-git-send-email-n-horiguchi@ah.jp.nec.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com> Suggested-by: Xishi Qiu <xishi.qiuxishi@alibaba-inc.com> Tested-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: <zy.zhengyi@alibaba-inc.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
namei: allow restricted O_CREAT of FIFOs and regular files
Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of the
directory or the file is opened without the O_CREAT flag. The purpose
is to make data spoofing attacks harder. This protection can be turned
on and off separately for FIFOs and regular files via sysctl, just like
the symlinks/hardlinks protection. This patch is based on Openwall's
"HARDEN_FIFO" feature by Solar Designer.
This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:
This list is not meant to be complete. It's difficult to track down all
vulnerabilities of this kind because they were often reported without any
mention of this particular attack vector. In fact, before
hardlinks/symlinks restrictions, fifos/regular files weren't the favorite
vehicle to exploit them.
[s.mesoraca16@gmail.com: fix bug reported by Dan Carpenter] Link: https://lkml.kernel.org/r/20180426081456.GA7060@mwanda Link: http://lkml.kernel.org/r/1524829819-11275-1-git-send-email-s.mesoraca16@gmail.com
[keescook@chromium.org: drop pr_warn_ratelimited() in favor of audit changes in the future]
[keescook@chromium.org: adjust commit subjet] Link: http://lkml.kernel.org/r/20180416175918.GA13494@beast Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Suggested-by: Solar Designer <solar@openwall.com> Suggested-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hfs_find_exit() expects fd->bnode to be NULL after a search has failed.
hfs_brec_insert() may instead set it to an error-valued pointer. Fix
this to prevent a crash.
hfs_find_exit() expects fd->bnode to be NULL after a search has failed.
hfs_brec_insert() may instead set it to an error-valued pointer. Fix
this to prevent a crash.
An HFS+ filesystem can be mounted read-only without having a metadata
directory, which is needed to support hardlinks. But if the catalog
data is corrupted, a directory lookup may still find dentries claiming
to be hardlinks.
hfsplus_lookup() does check that ->hidden_dir is not NULL in such a
situation, but mistakenly does so after dereferencing it for the first
time. Reorder this check to prevent a crash.
This happens when looking up corrupted catalog data (dentry) on a
filesystem with no metadata directory (this could only ever happen on a
read-only mount). Wen Xu sent the replication steps in detail to the
fsdevel list: https://bugzilla.kernel.org/show_bug.cgi?id=200297
Will Deacon [Thu, 23 Aug 2018 23:23:04 +0000 (00:23 +0100)]
arm64: tlb: Provide forward declaration of tlb_flush() before including tlb.h
As of commit fd1102f0aade ("mm: mmu_notifier fix for tlb_end_vma"),
asm-generic/tlb.h now calls tlb_flush() from a static inline function,
so we need to make sure that it's declared before #including the
asm-generic header in the arch header.
Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 23 Aug 2018 23:03:58 +0000 (16:03 -0700)]
Merge tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"These patches include adding async support for the v4.2 COPY
operation. I think Bruce is planning to send the server patches for
the next release, but I figured we could get the client side out of
the way now since it's been in my tree for a while. This shouldn't
cause any problems, since the server will still respond with
synchronous copies even if the client requests async.
Features:
- Add support for asynchronous server-side COPY operations
Stable bufixes:
- Fix an off-by-one in bl_map_stripe() (v3.17+)
- NFSv4 client live hangs after live data migration recovery (v4.9+)
- xprtrdma: Fix disconnect regression (v4.18+)
- Fix locking in pnfs_generic_recover_commit_reqs (v4.14+)
- Fix a sleep in atomic context in nfs4_callback_sequence() (v4.9+)
Other bugfixes and cleanups:
- Optimizations and fixes involving NFS v4.1 / pNFS layout handling
- Optimize lseek(fd, SEEK_CUR, 0) on directories to avoid locking
- Immediately reschedule writeback when the server replies with an
error
- Fix excessive attribute revalidation in nfs_execute_ok()
- Add error checking to nfs_idmap_prepare_message()
- Use new vm_fault_t return type
- Return a delegation when reclaiming one that the server has
recalled
- Referrals should inherit proto setting from parents
- Make rpc_auth_create_args a const
- Improvements to rpc_iostats tracking
- Fix a potential reference leak when there is an error processing a
callback
- Fix rmdir / mkdir / rename nlink accounting
- Fix updating inode change attribute
- Fix error handling in nfsn4_sp4_select_mode()
- Use an appropriate work queue for direct-write completion
- Don't busy wait if NFSv4 session draining is interrupted"
* tag 'nfs-for-4.19-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (54 commits)
pNFS: Remove unwanted optimisation of layoutget
pNFS/flexfiles: ff_layout_pg_init_read should exit on error
pNFS: Treat RECALLCONFLICT like DELAY...
pNFS: When updating the stateid in layoutreturn, also update the recall range
NFSv4: Fix a sleep in atomic context in nfs4_callback_sequence()
NFSv4: Fix locking in pnfs_generic_recover_commit_reqs
NFSv4: Fix a typo in nfs4_init_channel_attrs()
NFSv4: Don't busy wait if NFSv4 session draining is interrupted
NFS recover from destination server reboot for copies
NFS add a simple sync nfs4_proc_commit after async COPY
NFS handle COPY ERR_OFFLOAD_NO_REQS
NFS send OFFLOAD_CANCEL when COPY killed
NFS export nfs4_async_handle_error
NFS handle COPY reply CB_OFFLOAD call race
NFS add support for asynchronous COPY
NFS COPY xdr handle async reply
NFS OFFLOAD_CANCEL xdr
NFS CB_OFFLOAD xdr
NFS: Use an appropriate work queue for direct-write completion
NFSv4: Fix error handling in nfs4_sp4_select_mode()
...
Linus Torvalds [Thu, 23 Aug 2018 23:00:10 +0000 (16:00 -0700)]
Merge tag 'nfsd-4.19-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Chuck Lever fixed a problem with NFSv4.0 callbacks over GSS from
multi-homed servers.
The only new feature is a minor bit of protocol (change_attr_type)
which the client doesn't even use yet.
Other than that, various bugfixes and cleanup"
* tag 'nfsd-4.19-1' of git://linux-nfs.org/~bfields/linux: (27 commits)
sunrpc: Add comment defining gssd upcall API keywords
nfsd: Remove callback_cred
nfsd: Use correct credential for NFSv4.0 callback with GSS
sunrpc: Extract target name into svc_cred
sunrpc: Enable the kernel to specify the hostname part of service principals
sunrpc: Don't use stack buffer with scatterlist
rpc: remove unneeded variable 'ret' in rdma_listen_handler
nfsd: use true and false for boolean values
nfsd: constify write_op[]
fs/nfsd: Delete invalid assignment statements in nfsd4_decode_exchange_id
NFSD: Handle full-length symlinks
NFSD: Refactor the generic write vector fill helper
svcrdma: Clean up Read chunk path
svcrdma: Avoid releasing a page in svc_xprt_release()
nfsd: Mark expected switch fall-through
sunrpc: remove redundant variables 'checksumlen','blocksize' and 'data'
nfsd: fix leaked file lock with nfs exported overlayfs
nfsd: don't advertise a SCSI layout for an unsupported request_queue
nfsd: fix corrupted reply to badly ordered compound
nfsd: clarify check_op_ordering
...
Linus Torvalds [Thu, 23 Aug 2018 22:58:04 +0000 (15:58 -0700)]
Merge tag 'upstream-4.19-rc1' of git://git.infradead.org/linux-ubifs
Pull UBI/UBIFS updates from Richard Weinberger:
- Year 2038 preparations
- New UBI feature to skip CRC checks of static volumes
- A new Kconfig option to disable xattrs in UBIFS
- Lots of fixes in UBIFS, found by our new test framework
* tag 'upstream-4.19-rc1' of git://git.infradead.org/linux-ubifs: (21 commits)
ubifs: Set default assert action to read-only
ubifs: Allow setting assert action as mount parameter
ubifs: Rework ubifs_assert()
ubifs: Pass struct ubifs_info to ubifs_assert()
ubifs: Turn two ubifs_assert() into a WARN_ON()
ubi: expose the volume CRC check skip flag
ubi: provide a way to skip CRC checks
ubifs: Use kmalloc_array()
ubifs: Check data node size before truncate
Revert "UBIFS: Fix potential integer overflow in allocation"
ubifs: Add comment on c->commit_sem
ubifs: introduce Kconfig symbol for xattr support
ubifs: use swap macro in swap_dirty_idx
ubifs: tnc: use monotonic znode timestamp
ubifs: use timespec64 for inode timestamps
ubifs: xattr: Don't operate on deleted inodes
ubifs: gc: Fix typo
ubifs: Fix memory leak in lprobs self-check
ubi: Initialize Fastmap checkmapping correctly
ubifs: Fix synced_i_size calculation for xattr inodes
...
Linus Torvalds [Thu, 23 Aug 2018 22:51:09 +0000 (15:51 -0700)]
Merge tag 'pwm/for-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"This contains mostly minor bug fixes as well as some new chip support
for existing drivers"
* tag 'pwm/for-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: mediatek: Add MT7628 support
dt-bindings: pwm: Add MT7628 information
dt-bindings: pwm: rcar: Add bindings for R-Car E3 support
pwm: meson: Fix mux clock names
pwm: stm32-lp: Remove useless loop in stm32_pwm_lp_remove()
pwm: omap-dmtimer: Return -EPROBE_DEFER if no dmtimer platform data
pwm: mxs: Switch to SPDX identifier
dt-bindings: pwm: fsl-ftm: Add compatible string for i.MX8QM
pwm: fsl-ftm: Enable support for the new SoC i.MX8QM
pwm: fsl-ftm: Added the support of per-compatible data
pwm: fsl-ftm: Added a dedicated IP interface clock
pwm: cros-ec: Switch to SPDX identifier
pwm: imx: Switch to SPDX identifier
pwm: tiehrpwm: Fix disabling of output of PWMs
pwm: tiehrpwm: Don't use emulation mode bits to control PWM output
pwm: berlin: Don't use broken prescaler values
Linus Torvalds [Thu, 23 Aug 2018 22:44:58 +0000 (15:44 -0700)]
Merge tag 'fbdev-v4.19' of https://github.com/bzolnier/linux
Pull fbdev updates from Bartlomiej Zolnierkiewicz:
"Mostly small fixes and cleanups for fb drivers (the biggest updates
are for udlfb and pxafb drivers). This also adds deferred console
takeover support to the console code and efifb driver.
Summary:
- add support for deferred console takeover, when enabled defers
fbcon taking over the console from the dummy console until the
first text is displayed on the console - together with the "quiet"
kernel commandline option this allows fbcon to still be used
together with a smooth graphical bootup (Hans de Goede)
- copy the ACPI BGRT boot graphics to the framebuffer when deferred
console takeover support is used in efifb driver (Hans de Goede)
- update udlfb driver - fix lost console when the user unplugs a USB
adapter, fix the screen corruption issue, fix locking and add some
performance optimizations (Mikulas Patocka)
- update pxafb driver - fix using uninitialized memory, switch to
devm_* API, handle initialization errors and add support for
lcd-supply regulator (Daniel Mack)
- add support for boards booted with a DeviceTree in pxa3xx_gcu
driver (Daniel Mack)
- rename omap2 module to omap2fb.ko to avoid conflicts with omap1
driver (Arnd Bergmann)
- enable ACPI-based enumeration for goldfishfb driver (Yu Ning)
- fix goldfishfb driver to make user space Android code use 60 fps
(Christoffer Dall)
- print big fat warning when nomodeset kernel parameter is used in
vgacon driver (Lyude Paul)
- remove VLA usage from fsl-diu-fb driver (Kees Cook)
- misc fixes (Julia Lawall, Geert Uytterhoeven, Fredrik Noring,
Yisheng Xie, Dan Carpenter, Daniel Vetter, Anton Vasilyev, Randy
Dunlap, Gustavo A. R. Silva, Colin Ian King, Fengguang Wu)
- misc cleanups (Roman Kiryanov, Yisheng Xie, Colin Ian King)"
* tag 'fbdev-v4.19' of https://github.com/bzolnier/linux: (54 commits)
Documentation/fb: corrections for fbcon.txt
fbcon: Do not takeover the console from atomic context
dummycon: Stop exporting dummycon_[un]register_output_notifier
fbcon: Only defer console takeover if the current console driver is the dummycon
fbcon: Only allow FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER if fbdev is builtin
fbdev: omap2: omapfb: fix ifnullfree.cocci warnings
fbdev: omap2: omapfb: fix bugon.cocci warnings
fbdev: omap2: omapfb: fix boolreturn.cocci warnings
fb: amifb: fix build warnings when not builtin
fbdev/core: Disable console-lock warnings when fb.lockless_register_fb is set
console: Replace #if 0 with atomic var 'ignore_console_lock_warning'
udlfb: use spin_lock_irq instead of spin_lock_irqsave
udlfb: avoid prefetch
udlfb: optimization - test the backing buffer
udlfb: allow reallocating the framebuffer
udlfb: set line_length in dlfb_ops_set_par
udlfb: handle allocation failure
udlfb: set optimal write delay
udlfb: make a local copy of fb_ops
udlfb: don't switch if we are switching to the same videomode
...
Linus Torvalds [Thu, 23 Aug 2018 22:37:24 +0000 (15:37 -0700)]
Merge tag 'sound-fix-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"No surprises here: a regression fix for virmidi code refactoring,
three fixes for the new AC97 bus compat and runtime PM, and a usual
HD-audio quirk"
* tag 'sound-fix-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Fix HP Headset Mic can't record
ALSA: ac97: fix unbalanced pm_runtime_enable
ALSA: ac97: fix check of pm_runtime_get_sync failure
ALSA: ac97: fix device initialization in the compat layer
ALSA: seq: virmidi: Fix discarding the unsubscribed output
Linus Torvalds [Thu, 23 Aug 2018 22:34:48 +0000 (15:34 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull more rdma updates from Jason Gunthorpe:
"This is the SMC cleanup promised, a randconfig regression fix, and
kernel oops fix.
Summary:
- Switch SMC over to rdma_get_gid_attr and remove the compat
- Fix a crash in HFI1 with some BIOS's
- Fix a randconfig failure"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/ucm: fix UCM link error
IB/hfi1: Invalid NUMA node information can cause a divide by zero
RDMA/smc: Replace ib_query_gid with rdma_get_gid_attr
Linus Torvalds [Thu, 23 Aug 2018 21:55:01 +0000 (14:55 -0700)]
Merge branch 'tlb-fixes'
Merge fixes for missing TLB shootdowns.
This fixes a couple of cases that involved us possibly freeing page
table structures before the required TLB shootdown had been done.
There are a few cleanup patches to make the code easier to follow, and
to avoid some of the more problematic cases entirely when not necessary.
To make this easier for backports, it undoes the recent lazy TLB
patches, because the cleanups and fixes are more important, and Rik is
ok with re-doing them later when things have calmed down.
The missing TLB flush was only delayed, and the wrong ordering only
happened under memory pressure (and in theory under a couple of other
fairly theoretical situations), so this may have been all very unlikely
to have hit people in practice.
But getting the TLB shootdown wrong is _so_ hard to debug and see that I
consider this a crticial fix.
Many thanks to Jann Horn for having debugged this.
* tlb-fixes:
x86/mm: Only use tlb_remove_table() for paravirt
mm: mmu_notifier fix for tlb_end_vma
mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
mm/tlb: Remove tlb_remove_table() non-concurrent condition
mm: move tlb_table_flush to tlb_flush_mmu_free
x86/mm/tlb: Revert the recent lazy TLB patches
Linus Torvalds [Thu, 23 Aug 2018 21:52:23 +0000 (14:52 -0700)]
Merge tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes and cleanups from Juergen Gross:
"Some cleanups, some minor fixes and a fix for a bug introduced in this
merge window hitting 32-bit PV guests"
* tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: enable early use of set_fixmap in 32-bit Xen PV guest
xen: remove unused hypercall functions
x86/xen: remove unused function xen_auto_xlated_memory_setup()
xen/ACPI: don't upload Px/Cx data for disabled processors
x86/Xen: further refine add_preferred_console() invocations
xen/mcelog: eliminate redundant setting of interface version
x86/Xen: mark xen_setup_gdt() __init
Linus Torvalds [Thu, 23 Aug 2018 21:23:08 +0000 (14:23 -0700)]
Merge tag 'mips_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
- Fix microMIPS build failures by adding a .insn directive to the
barrier_before_unreachable() asm statement in order to convince the
toolchain that the asm statement is a valid branch target rather
than a bogus attempt to switch ISA.
- Clean up our declarations of TLB functions that we overwrite with
generated code in order to prevent the compiler making assumptions
about alignment that cause microMIPS kernels built with GCC 7 &
above to die early during boot.
- Fix up a regression for MIPS32 kernels which slipped into the main
MIPS pull for 4.19, causing CONFIG_32BIT=y kernels to contain
inappropriate MIPS64 instructions.
- Extend our existing workaround for MIPSr6 builds that end up using
the __multi3 intrinsic to GCC 7 & below, rather than just GCC 7.
* tag 'mips_4.19_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: lib: Provide MIPS64r6 __multi3() for GCC < 7
MIPS: Workaround GCC __builtin_unreachable reordering bug
compiler.h: Allow arch-specific asm/compiler.h
MIPS: Avoid move psuedo-instruction whilst using MIPS_ISA_LEVEL
MIPS: Consistently declare TLB functions
MIPS: Export tlbmiss_handler_setup_pgd near its definition
Linus Torvalds [Thu, 23 Aug 2018 21:09:37 +0000 (14:09 -0700)]
Merge tag 'for-linus' of git://github.com/openrisc/linux
Pull OpenRISC update from Stafford Horne:
"Just one change for 4.19: refactoring from Christoph Hellwig to use
generic DMA facilities"
* tag 'for-linus' of git://github.com/openrisc/linux:
openrisc: use generic dma_noncoherent_ops
openrisc: fix cache maintainance the the sync_single_for_device DMA operation
openrisc: remove the no-op unmap_page and unmap_sg DMA operations
openrisc: remove the sync_single_for_cpu DMA operation
Linus Torvalds [Thu, 23 Aug 2018 21:02:22 +0000 (14:02 -0700)]
Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM device-tree updates from Olof Johansson:
"Business as usual -- the bulk of our changes are to devicetree files
with new hardware support, new SoCs and platforms, and new board
types.
New SoCs/platforms:
- Raspberry Pi Compute Module (CM1) and IO board
- i.MX6SSL from NXP
- Renesas RZ/N1D SoC (R9A06G032), Dual Cortex-A7 with Ethernet, CAN
and PLC interfaces
- TI AM654 SoC, Quad Cortex-A53, safety subsystem with Cortex-R5
controllers, communication and PRU subsystem and lots of other
interfaces (PCIe, USB3, etc).
New boards and systems:
- Several Atmel at91-based boards from Laird
- Marvell Armada388-based Helios4 board from SolidRun
- Samsung Aires-based phones (s5pv210)
- Allwinner A64-based Pinebook laptop
In addition to the above, there's the usual amount of new devices
described on existing platforms, fixes and tweaks and new minor
variants of boards/platforms"
Linus Torvalds [Thu, 23 Aug 2018 21:00:05 +0000 (14:00 -0700)]
Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC defconfig updates from Olof Johansson:
"We keep these separate since some files are shared and conflict-prone,
but there isn't really much to write about here.
Some of the churnier pieces is for the Aspeed platforms, which did an
overdue refresh of the defconfig, and enabled USB gadget and some
drivers from there. Most of the rest are minor additions here and
there to turn on drivers that are needed or useful on the various
platforms"
* tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
ARM: multi_v7_defconfig: add CONFIG_UNIPHIER_THERMAL and CONFIG_SNI_AVE
ARM: config: aspeed: Enable new FSI drivers
ARM: config: multi_v5: Enable ASPEED drivers
ARM: config: multi_v5: Refresh configuration
ARM: config: aspeed: Update defconfig
ARM: multi_v7_defconfig: Enable support for RZN1D-DB
ARM: shmobile: defconfig: Disable /sbin/hotplug fork-bomb
ARM: shmobile: defconfig: Enable support for RZN1D-DB
ARM: shmobile: defconfig: Enable reset controller support
ARM: shmobile: defconfig: Drop NET_VENDOR_<FOO>=n
arm64: defconfig: Enable more peripherals for Samsung Chromebook Plus.
arm64: defconfig: Enable CONFIG_MTD_NAND_QCOM for IPQ8074
ARM: qcom_defconfig: Enable QCOM NAND related configs
ARM: imx_v6_v7_defconfig: add DMATEST support
ARM: mvebu_v7_defconfig: enable SFP support
ARM: mvebu_v7_defconfig: sync defconfig
ARM: multi_v7_defconfig: Add Marvell NAND controller support
arm: configs: Add USB gadget to Aspeed G5 defconfig
arm: configs: Add USB gadget to Aspeed G4 defconfig
arm64: defconfig: enable HiSilicon PMU driver
...
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (52 commits)
drivers/firmware: psci_checker: stash and use topology_core_cpumask for hotplug tests
soc: fsl: cleanup Kconfig menu
soc: fsl: dpio: Convert DPIO documentation to .rst
staging: fsl-mc: Remove remaining files
staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl
staging: fsl-dpaa2: eth: move generic FD defines to DPIO
soc: fsl: qe: gpio: Add qe_gpio_set_multiple
usb: host: exynos: Remove support for Exynos5440
clk: samsung: Remove support for Exynos5440
soc: sunxi: Add the A13, A23 and H3 system control compatibles
reset: uniphier: add reset control support for SPI
cpufreq: exynos: Remove support for Exynos5440
ata: ahci-platform: Remove support for Exynos5440
soc: imx6qp: Use GENPD_FLAG_ALWAYS_ON for PU errata
soc: mediatek: pwrap: add mt6351 driver for mt6797 SoCs
soc: mediatek: pwrap: add pwrap driver for mt6797 SoCs
soc: mediatek: pwrap: fix cipher init setting error
dt-bindings: pwrap: mediatek: add pwrap support for MT6797
reset: uniphier: add USB3 core reset control
dt-bindings: reset: uniphier: add USB3 core reset support
...
Linus Torvalds [Thu, 23 Aug 2018 20:37:01 +0000 (13:37 -0700)]
Merge tag 'riscv-for-linus-4.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V fixes from Palmer Dabbelt:
"This contains a pair of fixes to the RISC-V port:
- The removal of our compat.h, which didn't do anything.
- Fixes to sys_riscv_flush_icache to ensure it actually shows up.
We're going to just call this a bug in the ABI, as it was always
supposed to be there.
I've given these a simple build+boot test, both individually and as
the actual tag"
* tag 'riscv-for-linus-4.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
riscv: Delete asm/compat.h
RISC-V: Don't use a global include guard for uapi/asm/syscalls.h
RISC-V: Define sys_riscv_flush_icache when SMP=n
Linus Torvalds [Thu, 23 Aug 2018 20:07:00 +0000 (13:07 -0700)]
Merge tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Masami found an off by one bug in the code that keeps "notrace"
functions from being traced by kprobes. During my testing, I found
that there's places that we may want to add kprobes to notrace, thus
we may end up changing this code before 4.19 is released.
The history behind this change is that we found that adding kprobes to
various notrace functions caused the kernel to crashed. We took the
safe route and decided not to allow kprobes to trace any notrace
function.
But because notrace is added to functions that just cause weird side
effects to the function tracer, but are still safe, preventing kprobes
for all notrace functios may be too much of a big hammer.
One such place is __schedule() is marked notrace, to keep function
tracer from doing strange recursive loops when it gets traced with
NEED_RESCHED set. With this change, one can not add kprobes to the
scheduler.
Masami also added code to use gcov on ftrace"
* tag 'trace-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Fix to check notrace function with correct range
tracing: Allow gcov profiling on only ftrace subsystem
Nicholas Piggin [Thu, 23 Aug 2018 08:47:09 +0000 (18:47 +1000)]
mm: mmu_notifier fix for tlb_end_vma
The generic tlb_end_vma does not call invalidate_range mmu notifier, and
it resets resets the mmu_gather range, which means the notifier won't be
called on part of the range in case of an unmap that spans multiple
vmas.
ARM64 seems to be the only arch I could see that has notifiers and uses
the generic tlb_end_vma. I have not actually tested it.
[ Catalin and Will point out that ARM64 currently only uses the
notifiers for KVM, which doesn't use the ->invalidate_range()
callback right now, so it's a bug, but one that happens to
not affect them. So not necessary for stable. - Linus ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Wed, 22 Aug 2018 15:30:15 +0000 (17:30 +0200)]
mm/tlb, x86/mm: Support invalidating TLB caches for RCU_TABLE_FREE
Jann reported that x86 was missing required TLB invalidates when he
hit the !*batch slow path in tlb_remove_table().
This is indeed the case; RCU_TABLE_FREE does not provide TLB (cache)
invalidates, the PowerPC-hash where this code originated and the
Sparc-hash where this was subsequently used did not need that. ARM
which later used this put an explicit TLB invalidate in their
__p*_free_tlb() functions, and PowerPC-radix followed that example.
But when we hooked up x86 we failed to consider this. Fix this by
(optionally) hooking tlb_remove_table() into the TLB invalidate code.
NOTE: s390 was also needing something like this and might now
be able to use the generic code again.
[ Modified to be on top of Nick's cleanups, which simplified this patch
now that tlb_flush_mmu_tlbonly() really only flushes the TLB - Linus ]
Fixes: 9e52fc2b50de ("x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rik van Riel <riel@surriel.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Will Deacon <will.deacon@arm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will noted that only checking mm_users is incorrect; we should also
check mm_count in order to cover CPUs that have a lazy reference to
this mm (and could do speculative TLB operations).
If removing this turns out to be a performance issue, we can
re-instate a more complete check, but in tlb_table_flush() eliding the
call_rcu_sched().
Fixes: 267239116987 ("mm, powerpc: move the RCU page-table freeing into generic code") Reported-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rik van Riel <riel@surriel.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicholas Piggin [Thu, 23 Aug 2018 08:47:08 +0000 (18:47 +1000)]
mm: move tlb_table_flush to tlb_flush_mmu_free
There is no need to call this from tlb_flush_mmu_tlbonly, it logically
belongs with tlb_flush_mmu_free. This makes future fixes simpler.
[ This was originally done to allow code consolidation for the
mmu_notifier fix, but it also ends up helping simplify the
HAVE_RCU_TABLE_INVALIDATE fix. - Linus ]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Desaulniers [Wed, 22 Aug 2018 23:37:24 +0000 (16:37 -0700)]
include/linux/compiler*.h: make compiler-*.h mutually exclusive
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
recently exposed a brittle part of the build for supporting non-gcc
compilers.
Both Clang and ICC define __GNUC__, __GNUC_MINOR__, and
__GNUC_PATCHLEVEL__ for quick compatibility with code bases that haven't
added compiler specific checks for __clang__ or __INTEL_COMPILER.
This is brittle, as they happened to get compatibility by posing as a
certain version of GCC. This broke when upgrading the minimal version
of GCC required to build the kernel, to a version above what ICC and
Clang claim to be.
Rather than always including compiler-gcc.h then undefining or
redefining macros in compiler-intel.h or compiler-clang.h, let's
separate out the compiler specific macro definitions into mutually
exclusive headers, do more proper compiler detection, and keep shared
definitions in compiler_types.h.
Fixes: cafa0010cd51 ("Raise the minimum required gcc version to 4.6") Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Suggested-by: Eli Friedman <efriedma@codeaurora.org> Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chuck Lever [Thu, 16 Aug 2018 16:06:04 +0000 (12:06 -0400)]
nfsd: Use correct credential for NFSv4.0 callback with GSS
I've had trouble when operating a multi-homed Linux NFS server with
Kerberos using NFSv4.0. Lately, I've seen my clients reporting
this (and then hanging):
The client-side commit f11b2a1cfbf5 ("nfs4: copy acceptor name from
context to nfs_client") appears to be related, but I suspect this
problem has been going on for some time before that.
RFC 7530 Section 3.3.3 says:
> For Kerberos V5, nfs/hostname would be a server principal in the
> Kerberos Key Distribution Center database. This is the same
> principal the client acquired a GSS-API context for when it issued
> the SETCLIENTID operation ...
In other words, an NFSv4.0 client expects that the server will use
the same GSS principal for callback that the client used to
establish its lease. For example, if the client used the service
principal "nfs@server.domain" to establish its lease, the server
is required to use "nfs@server.domain" when performing NFSv4.0
callback operations.
The Linux NFS server currently does not. It uses a common service
principal for all callback connections. Sometimes this works as
expected, and other times -- for example, when the server is
accessible via multiple hostnames -- it won't work at all.
This patch scrapes the target name from the client credential,
and uses that for the NFSv4.0 callback credential. That should
be correct much more often.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Chuck Lever [Thu, 16 Aug 2018 16:05:59 +0000 (12:05 -0400)]
sunrpc: Extract target name into svc_cred
NFSv4.0 callback needs to know the GSS target name the client used
when it established its lease. That information is available from
the GSS context created by gssproxy. Make it available in each
svc_cred.
Note this will also give us access to the real target service
principal name (which is typically "nfs", but spec does not require
that).
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Chuck Lever [Thu, 16 Aug 2018 16:05:54 +0000 (12:05 -0400)]
sunrpc: Enable the kernel to specify the hostname part of service principals
A multi-homed NFS server may have more than one "nfs" key in its
keytab. Enable the kernel to pick the key it wants as a machine
credential when establishing a GSS context.
This is useful for GSS-protected NFSv4.0 callbacks, which are
required by RFC 7530 S3.3.3 to use the same principal as the service
principal the client used when establishing its lease.
A complementary modification to rpc.gssd is required to fully enable
this feature.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This is BUG_ON(!virt_addr_valid(buf)) triggered by using a stack
allocated buffer with a scatterlist. Convert the buffer for
rc4salt to be dynamically allocated instead.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1615258 Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Linus Torvalds [Wed, 22 Aug 2018 21:14:15 +0000 (14:14 -0700)]
Merge tag 'platform-drivers-x86-v4.19-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Andy Shevchenko:
- The driver for Silead touchscreen configurations has been renamed
from silead_dmi to touchscreen_dmi since it starts supporting other
touchscreens which require some DMI quirks
It also gets expanded to cover cases for Chuwi Vi10, ONDA V891W,
Connect Tablet 9, Onda V820w, and Cube KNote i1101 tablets.
- Another bunch of changes is related to Mellanox platform code to
allow user space to communicate with Mellanox for system control and
monitoring purposes. The driver notifies user on hotplug device
signal receiving.
- ASUS WMI drivers recognize lid flip action on UX360, and correctly
toggles airplane mode LED. In addition the keyboard backlight toggle
gets support.
- ThinkPad ACPI driver enables support for calculator key (on at least
P52). It also has been fixed to support three characters model
designators, which are used for modern laptops. Earlier the battery,
marked as BAT1, on ThinkPad laptops has not been configured properly,
which is fixed. On the opposite the multi-battery configurations now
probed correctly.
- Dell SMBIOS driver starts working on some Dell servers which do not
support token interface. The regression with backlight detection has
also been fixed. In order to support dock mode on some laptops, Intel
virtual button driver has been fixed. The last but not least is the
fix to Intel HID driver due to changes in Dell systems that prevented
to use power button.
* tag 'platform-drivers-x86-v4.19-1' of git://git.infradead.org/linux-platform-drivers-x86: (47 commits)
platform/x86: acer-wmi: Silence "unsupported" message a bit
platform/x86: intel_punit_ipc: fix build errors
platform/x86: ideapad: Add Y520-15IKBM and Y720-15IKBM to no_hw_rfkill
platform/x86: asus-nb-wmi: Add keymap entry for lid flip action on UX360
platform/x86: acer-wmi: refactor function has_cap
platform/x86: thinkpad_acpi: Fix multi-battery bug
platform/x86: thinkpad_acpi: extend battery quirk coverage
platform/x86: touchscreen_dmi: Add info for the Cube KNote i1101 tablet
platform/x86: mlx-platform: Fix copy-paste error in mlxplat_init()
platform/x86: mlx-platform: Remove unused define
platform/x86: mlx-platform: Change mlxreg-io configuration for MSN274x systems
Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
platform/x86: mlx-platform: Allow mlxreg-io driver activation for more systems
platform/x86: mlx-platform: Add ASIC hotplug device configuration
platform/mellanox: mlxreg-hotplug: Add hotplug hwmon uevent notification
platform/mellanox: mlxreg-hotplug: Improve mechanism of ASIC health discovery
platform/x86: mlx-platform: Add mlxreg-fan platform driver activation
platform/x86: dell-laptop: Fix backlight detection
platform/x86: toshiba_acpi: Fix defined but not used build warnings
platform/x86: thinkpad_acpi: Support battery quirk
...
Linus Torvalds [Wed, 22 Aug 2018 21:04:41 +0000 (14:04 -0700)]
Merge tag 'xtensa-20180820' of git://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa updates from Max Filippov:
- switch xtensa arch to the generic noncoherent direct mapping
operations
- add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
- clean up users of platform/hardware.h in generic Xtensa code
- fix assembly cache maintenance code for long cache lines
- rework noMMU cache attributes initialization
- add big-endian HiFi2 test_kc705_be CPU variant
* tag 'xtensa-20180820' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: add test_kc705_be variant
xtensa: clean up boot-elf/bootstrap.S
xtensa: make bootparam parsing optional
xtensa: drop variant IRQ support
xtensa: drop unneeded platform/hardware.h headers
xtensa: move PLATFORM_NR_IRQS to Kconfig
xtensa: rework {CONFIG,PLATFORM}_DEFAULT_MEM_START
xtensa: drop unused {CONFIG,PLATFORM}_DEFAULT_MEM_SIZE
xtensa: rework noMMU cache attributes initialization
xtensa: increase ranges in ___invalidate_{i,d}cache_all
xtensa: limit offsets in __loop_cache_{all,page}
xtensa: platform-specific handling of coherent memory
xtensa: support DMA_ATTR_NO_KERNEL_MAPPING attribute
xtensa: use generic dma_noncoherent_ops
Linus Torvalds [Wed, 22 Aug 2018 20:52:44 +0000 (13:52 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull second set of KVM updates from Paolo Bonzini:
"ARM:
- Support for Group0 interrupts in guests
- Cache management optimizations for ARMv8.4 systems
- Userspace interface for RAS
- Fault path optimization
- Emulated physical timer fixes
- Random cleanups
x86:
- fixes for L1TF
- a new test case
- non-support for SGX (inject the right exception in the guest)
- fix lockdep false positive"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (49 commits)
KVM: VMX: fixes for vmentry_l1d_flush module parameter
kvm: selftest: add dirty logging test
kvm: selftest: pass in extra memory when create vm
kvm: selftest: include the tools headers
kvm: selftest: unify the guest port macros
tools: introduce test_and_clear_bit
KVM: x86: SVM: Call x86_spec_ctrl_set_guest/host() with interrupts disabled
KVM: vmx: Inject #UD for SGX ENCLS instruction in guest
KVM: vmx: Add defines for SGX ENCLS exiting
x86/kvm/vmx: Fix coding style in vmx_setup_l1d_flush()
x86: kvm: avoid unused variable warning
KVM: Documentation: rename the capability of KVM_CAP_ARM_SET_SERROR_ESR
KVM: arm/arm64: Skip updating PTE entry if no change
KVM: arm/arm64: Skip updating PMD entry if no change
KVM: arm: Use true and false for boolean values
KVM: arm/arm64: vgic: Do not use spin_lock_irqsave/restore with irq disabled
KVM: arm/arm64: vgic: Move DEBUG_SPINLOCK_BUG_ON to vgic.h
KVM: arm: vgic-v3: Add support for ICC_SGI0R and ICC_ASGI1R accesses
KVM: arm64: vgic-v3: Add support for ICC_SGI0R_EL1 and ICC_ASGI1R_EL1 accesses
KVM: arm/arm64: vgic-v3: Add core support for Group0 SGIs
...
* tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-block: (31 commits)
block/DAC960.c: make some arrays static const, shrinks object size
blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter
blk-mq: init hctx sched after update ctx and hctx mapping
block: remove duplicate initialization
tracing/blktrace: Fix to allow setting same value
pktcdvd: fix setting of 'ret' error return for a few cases
block: change return type to bool
block, bfq: return nbytes and not zero from struct cftype .write() method
block, bfq: improve code of bfq_bfqq_charge_time
block, bfq: reduce write overcharge
block, bfq: always update the budget of an entity when needed
block, bfq: readd missing reset of parent-entity service
blk-wbt: fix IO hang in wbt_wait()
block: don't warn for flush on read-only device
bcache: add the missing comments for smp_mb()/smp_wmb()
bcache: remove unnecessary space before ioctl function pointer arguments
bcache: add missing SPDX header
bcache: move open brace at end of function definitions to next line
bcache: add static const prefix to char * array declarations
bcache: fix code comments style
...
Linus Torvalds [Wed, 22 Aug 2018 20:29:39 +0000 (13:29 -0700)]
Merge tag 'f2fs-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've tuned f2fs to improve general performance by
serializing block allocation and enhancing discard flows like fstrim
which avoids user IO contention. And we've added fsync_mode=nobarrier
which gives an option to user where it skips issuing cache_flush
commands to underlying flash storage. And there are many bug fixes
related to fuzzed images, revoked atomic writes, quota ops, and minor
direct IO.
Enhancements:
- add fsync_mode=nobarrier which bypasses cache_flush command
- enhance the discarding flow which avoids user IOs and issues in
LBA order
- readahead some encrypted blocks during GC
- enable in-memory inode checksum to verify the blocks if
F2FS_CHECK_FS is set
- enhance nat_bits behavior
- set -o discard by default
- set REQ_RAHEAD to bio in ->readpages
Bug fixes:
- fix a corner case to corrupt atomic_writes revoking flow
- revisit i_gc_rwsem to fix race conditions
- fix some dio behaviors captured by xfstests
- correct handling errors given by quota-related failures
- add many sanity check flows to avoid fuzz test failures
- add more error number propagation to their callers
- fix several corner cases to continue fault injection w/ shutdown
loop"
* tag 'f2fs-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (89 commits)
f2fs: readahead encrypted block during GC
f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc
f2fs: fix performance issue observed with multi-thread sequential read
f2fs: fix to skip verifying block address for non-regular inode
f2fs: rework fault injection handling to avoid a warning
f2fs: support fault_type mount option
f2fs: fix to return success when trimming meta area
f2fs: fix use-after-free of dicard command entry
f2fs: support discard submission error injection
f2fs: split discard command in prior to block layer
f2fs: wake up gc thread immediately when gc_urgent is set
f2fs: fix incorrect range->len in f2fs_trim_fs()
f2fs: refresh recent accessed nat entry in lru list
f2fs: fix avoid race between truncate and background GC
f2fs: avoid race between zero_range and background GC
f2fs: fix to do sanity check with block address in main area v2
f2fs: fix to do sanity check with inline flags
f2fs: fix to reset i_gc_failures correctly
f2fs: fix invalid memory access
f2fs: fix to avoid broken of dnode block list
...
Linus Torvalds [Wed, 22 Aug 2018 19:34:08 +0000 (12:34 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
- the rest of MM
- procfs updates
- various misc things
- more y2038 fixes
- get_maintainer updates
- lib/ updates
- checkpatch updates
- various epoll updates
- autofs updates
- hfsplus
- some reiserfs work
- fatfs updates
- signal.c cleanups
- ipc/ updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (166 commits)
ipc/util.c: update return value of ipc_getref from int to bool
ipc/util.c: further variable name cleanups
ipc: simplify ipc initialization
ipc: get rid of ids->tables_initialized hack
lib/rhashtable: guarantee initial hashtable allocation
lib/rhashtable: simplify bucket_table_alloc()
ipc: drop ipc_lock()
ipc/util.c: correct comment in ipc_obtain_object_check
ipc: rename ipcctl_pre_down_nolock()
ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
ipc: reorganize initialization of kern_ipc_perm.seq
ipc: compute kern_ipc_perm.id under the ipc lock
init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
adfs: use timespec64 for time conversion
kernel/sysctl.c: fix typos in comments
drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md
fork: don't copy inconsistent signal handler state to child
signal: make get_signal() return bool
signal: make sigkill_pending() return bool
...
Manfred Spraul [Wed, 22 Aug 2018 05:02:00 +0000 (22:02 -0700)]
ipc/util.c: further variable name cleanups
The varable names got a mess, thus standardize them again:
id: user space id. Called semid, shmid, msgid if the type is known.
Most functions use "id" already.
idx: "index" for the idr lookup
Right now, some functions use lid, ipc_addid() already uses idx as
the variable name.
seq: sequence number, to avoid quick collisions of the user space id
key: user space key, used for the rhash tree
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:52 +0000 (22:01 -0700)]
ipc: get rid of ids->tables_initialized hack
In sysvipc we have an ids->tables_initialized regarding the rhashtable,
introduced in 0cfb6aee70bd ("ipc: optimize semget/shmget/msgget for lots
of keys")
It's there, specifically, to prevent nil pointer dereferences, from using
an uninitialized api. Considering how rhashtable_init() can fail
(probably due to ENOMEM, if anything), this made the overall ipc
initialization capable of failure as well. That alone is ugly, but fine,
however I've spotted a few issues regarding the semantics of
tables_initialized (however unlikely they may be):
- There is inconsistency in what we return to userspace: ipc_addid()
returns ENOSPC which is certainly _wrong_, while ipc_obtain_object_idr()
returns EINVAL.
- After we started using rhashtables, ipc_findkey() can return nil upon
!tables_initialized, but the caller expects nil for when the ipc
structure isn't found, and can therefore call into ipcget() callbacks.
Now that rhashtable initialization cannot fail, we can properly get rid of
the hack altogether.
[manfred@colorfullife.com: commit id extended to 12 digits] Link: http://lkml.kernel.org/r/20180712185241.4017-10-manfred@colorfullife.com Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rhashtable_init() may fail due to -ENOMEM, thus making the entire api
unusable. This patch removes this scenario, however unlikely. In order
to guarantee memory allocation, this patch always ends up doing
GFP_KERNEL|__GFP_NOFAIL for both the tbl as well as
alloc_bucket_spinlocks().
Upon the first table allocation failure, we shrink the size to the
smallest value that makes sense and retry with __GFP_NOFAIL semantics.
With the defaults, this means that from 64 buckets, we retry with only 4.
Any later issues regarding performance due to collisions or larger table
resizing (when more memory becomes available) is the least of our
problems.
Link: http://lkml.kernel.org/r/20180712185241.4017-9-manfred@colorfullife.com Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:45 +0000 (22:01 -0700)]
lib/rhashtable: simplify bucket_table_alloc()
As of ce91f6ee5b3b ("mm: kvmalloc does not fallback to vmalloc for
incompatible gfp flags") we can simplify the caller and trust kvzalloc()
to just do the right thing. For the case of the GFP_ATOMIC context, we
can drop the __GFP_NORETRY flag for obvious reasons, and for the
__GFP_NOWARN case, however, it is changed such that the caller passes the
flag instead of making bucket_table_alloc() handle it.
This slightly changes the gfp flags passed on to nested_table_alloc() as
it will now also use GFP_ATOMIC | __GFP_NOWARN. However, I consider this
a positive consequence as for the same reasons we want nowarn semantics in
bucket_table_alloc().
[manfred@colorfullife.com: commit id extended to 12 digits, line wraps updated] Link: http://lkml.kernel.org/r/20180712185241.4017-8-manfred@colorfullife.com Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:41 +0000 (22:01 -0700)]
ipc: drop ipc_lock()
ipc/util.c contains multiple functions to get the ipc object pointer given
an id number.
There are two sets of function: One set verifies the sequence counter part
of the id number, other functions do not check the sequence counter.
The standard for function names in ipc/util.c is
- ..._check() functions verify the sequence counter
- ..._idr() functions do not verify the sequence counter
ipc_lock() is an exception: It does not verify the sequence counter value,
but this is not obvious from the function name.
Furthermore, shm.c is the only user of this helper. Thus, we can simply
move the logic into shm_lock() and get rid of the function altogether.
[manfred@colorfullife.com: most of changelog] Link: http://lkml.kernel.org/r/20180712185241.4017-7-manfred@colorfullife.com Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Kees Cook <keescook@chromium.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manfred Spraul [Wed, 22 Aug 2018 05:01:37 +0000 (22:01 -0700)]
ipc/util.c: correct comment in ipc_obtain_object_check
The comment that explains ipc_obtain_object_check is wrong: The function
checks the sequence number, not the reference counter.
Note that checking the reference counter would be meaningless: The
reference counter is decreased without holding any locks, thus an object
with kern_ipc_perm.deleted=true may disappear at the end of the next rcu
grace period.
Manfred Spraul [Wed, 22 Aug 2018 05:01:34 +0000 (22:01 -0700)]
ipc: rename ipcctl_pre_down_nolock()
Both the comment and the name of ipcctl_pre_down_nolock() are misleading:
The function must be called while holdling the rw semaphore.
Therefore the patch renames the function to ipcctl_obtain_check(): This
name matches the other names used in util.c:
- "obtain" function look up a pointer in the idr, without
acquiring the object lock.
- The caller is responsible for locking.
- _check means that the sequence number is checked.
Manfred Spraul [Wed, 22 Aug 2018 05:01:29 +0000 (22:01 -0700)]
ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
ipc_addid() is impossible to use:
- for certain failures, the caller must not use ipc_rcu_putref(),
because the reference counter is not yet initialized.
- for other failures, the caller must use ipc_rcu_putref(),
because parallel operations could be ongoing already.
The patch cleans that up, by initializing the refcount early, and by
modifying all callers.
The issues is related to the finding of
syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com: syzbot found an
issue with reading kern_ipc_perm.seq, here both read and write to already
released memory could happen.
Manfred Spraul [Wed, 22 Aug 2018 05:01:25 +0000 (22:01 -0700)]
ipc: reorganize initialization of kern_ipc_perm.seq
ipc_addid() initializes kern_ipc_perm.seq after having called idr_alloc()
(within ipc_idr_alloc()).
Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
may see an uninitialized value.
The patch moves the initialization of kern_ipc_perm.seq before the calls
of idr_alloc().
Notes:
1) This patch has a user space visible side effect:
If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
if semget()/msgget()/shmget() fails in the final step of adding the id
to the rhash tree, then .._next_id is cleared. Before the patch, is
remained unmodified.
There is no change of the behavior after a successful ..get() call: It
always clears .._next_id, there is no impact to non checkpoint/restore
code as that code does not use .._next_id.
2) The patch correctly documents that after a call to ipc_idr_alloc(),
the full tear-down sequence must be used. The callers of ipc_addid()
do not fullfill that, i.e. more bugfixes are required.
The patch is a squash of a patch from Dmitry and my own changes.
Adrian Reber [Wed, 22 Aug 2018 05:01:17 +0000 (22:01 -0700)]
init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
The CHECKPOINT_RESTORE configuration option was introduced in 2012 and
combined with EXPERT. CHECKPOINT_RESTORE is already enabled in many
distribution kernels and also part of the defconfigs of various
architectures.
To make it easier for distributions to enable CHECKPOINT_RESTORE this
removes EXPERT and moves the configuration option out of the EXPERT block.
Link: http://lkml.kernel.org/r/20180712130733.11510-1-adrian@lisas.de Signed-off-by: Adrian Reber <adrian@lisas.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Acked-by: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Wed, 22 Aug 2018 05:01:13 +0000 (22:01 -0700)]
fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
get_seconds() is deprecated in favor of ktime_get_real_seconds(), which
returns a 64-bit timestamp.
In the SYSV file system, the superblock timestamp is only 32 bits wide,
and it is used to check whether a file system is clean, so the best
solution seems to be to force a wraparound and explicitly convert it to an
unsigned 32-bit value.
This is independent of the inode timestamps that are also 32-bit wide on
disk and that come from current_time().
Link: http://lkml.kernel.org/r/20180713145236.3152513-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pointer md is being assigned but is never used hence it is redundant and
can be removed.
Cleans up clang warning:
warning: variable 'md' set but not used [-Wunused-but-set-variable]
Link: http://lkml.kernel.org/r/20180711082346.5223-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Alexandre Bounine <alex.bou9@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jann Horn [Wed, 22 Aug 2018 05:00:58 +0000 (22:00 -0700)]
fork: don't copy inconsistent signal handler state to child
Before this change, if a multithreaded process forks while one of its
threads is changing a signal handler using sigaction(), the memcpy() in
copy_sighand() can race with the struct assignment in do_sigaction(). It
isn't clear whether this can cause corruption of the userspace signal
handler pointer, but it definitely can cause inconsistency between
different fields of struct sigaction.
Take the appropriate spinlock to avoid this.
I have tested that this patch prevents inconsistency between sa_sigaction
and sa_flags, which is possible before this patch.
Link: http://lkml.kernel.org/r/20180702145108.73189-1-jannh@google.com Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Rik van Riel <riel@redhat.com> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Kees Cook <keescook@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "signal: refactor some functions", v3.
This series refactors a bunch of functions in signal.c to simplify parts
of the code.
The greatest single change is declaring the static do_sigpending() helper
as void which makes it possible to remove a bunch of unnecessary checks in
the syscalls later on.
This patch (of 17):
force_sigsegv() returned 0 unconditionally so it doesn't make sense to have
it return at all. In addition, there are no callers that check
force_sigsegv()'s return value.
Link: http://lkml.kernel.org/r/20180602103653.18181-2-christian@brauner.io Signed-off-by: Christian Brauner <christian@brauner.io> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Morris <james.morris@microsoft.com> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>