Jan Kara [Tue, 5 Nov 2019 16:44:31 +0000 (17:44 +0100)]
jbd2: Fine tune estimate of necessary descriptor blocks
Currently we reserve j_max_transaction_buffers / 32 for transaction
descriptor blocks. Now that revoke descriptors are accounted for
separately this estimate is unnecessarily high and we can actually
compute much tighter estimate. In the common case of 32k journal blocks
and 4k blocksize this actually reduces the amount of reserved descriptor
blocks from 256 to ~25 which allows us to fit more real data into a
transaction.
Jan Kara [Tue, 5 Nov 2019 16:44:29 +0000 (17:44 +0100)]
ext4: Reserve revoke credits for freed blocks
So far we have reserved only relatively high fixed amount of revoke
credits for each transaction. We over-reserved by large amount for most
cases but when freeing large directories or files with data journalling,
the fixed amount is not enough. In fact the worst case estimate is
inconveniently large (maximum extent size) for freeing of one extent.
We fix this by doing proper estimate of the amount of blocks that need
to be revoked when removing blocks from the inode due to truncate or
hole punching and otherwise reserve just a small amount of revoke
credits for each transaction to accommodate freeing of xattrs block or
so.
Jan Kara [Tue, 5 Nov 2019 16:44:28 +0000 (17:44 +0100)]
jbd2: Make credit checking more strict
Make checking of available credits in jbd2_journal_dirty_metadata() more
strict. There should be always enough credits in the handle to write all
potential revoke descriptors. Also we warn in case there are not enough
credits since this is a bug in the filesystem.
Jan Kara [Tue, 5 Nov 2019 16:44:27 +0000 (17:44 +0100)]
jbd2: Rename h_buffer_credits to h_total_credits
The credit counter now contains both buffer and revoke descriptor block
credits. Rename to counter to h_total_credits to reflect that. No
functional change.
Jan Kara [Tue, 5 Nov 2019 16:44:26 +0000 (17:44 +0100)]
jbd2: Reserve space for revoke descriptor blocks
Extend functions for starting, extending, and restarting transaction
handles to take number of revoke records handle must be able to
accommodate. These functions then make sure transaction has enough
credits to be able to store resulting revoke descriptor blocks. Also
revoke code tracks number of revoke records created by a handle to catch
situation where some place didn't reserve enough space for revoke
records. Similarly to standard transaction credits, space for unused
reserved revoke records is released when the handle is stopped.
On the ext4 side we currently take a simplistic approach of reserving
space for 1024 revoke records for any transaction. This grows amount of
credits reserved for each handle only by a few and is enough for any
normal workload so that we don't hit warnings in jbd2. We will refine
the logic in following commits.
Jan Kara [Tue, 5 Nov 2019 16:44:24 +0000 (17:44 +0100)]
jbd2: Account descriptor blocks into t_outstanding_credits
Currently, journal descriptor blocks were not accounted in
transaction->t_outstanding_credits and we were just leaving some slack
space in the journal for them (in jbd2_log_space_left() and
jbd2_space_needed()). This is making proper accounting (and reservation
we want to add) of descriptor blocks difficult so switch to accounting
descriptor blocks in transaction->t_outstanding_credits and just reserve
the same amount of credits in t_outstanding credits for journal
descriptor blocks when creating transaction.
Jan Kara [Tue, 5 Nov 2019 16:44:23 +0000 (17:44 +0100)]
jbd2: Factor out common parts of stopping and restarting a handle
jbd2__journal_restart() has quite some code that is common with
jbd2_journal_stop(). Factor this functionality into stop_this_handle()
helper and use it from both functions. Note that this also drops
t_handle_lock protection from jbd2__journal_restart() as
jbd2_journal_stop() does the same thing without it.
Jan Kara [Tue, 5 Nov 2019 16:44:22 +0000 (17:44 +0100)]
jbd2: Drop pointless wakeup from jbd2_journal_stop()
When we drop last handle from a transaction and journal->j_barrier_count
> 0, jbd2_journal_stop() wakes up journal->j_wait_transaction_locked
wait queue. This looks pointless - wait for outstanding handles always
happens on journal->j_wait_updates waitqueue.
journal->j_wait_transaction_locked is used to wait for transaction state
changes and by start_this_handle() for waiting until
journal->j_barrier_count drops to 0. The first case is clearly
irrelevant here since only jbd2 thread changes transaction state. The
second case looks related but jbd2_journal_unlock_updates() is
responsible for the wakeup in this case. So just drop the wakeup.
Jan Kara [Tue, 5 Nov 2019 16:44:21 +0000 (17:44 +0100)]
jbd2: Drop pointless check from jbd2_journal_stop()
If a transaction is larger than journal->j_max_transaction_buffers, that
is a bug and not a trigger for transaction commit. Also the very next
attempt to start new handle will start transaction commit anyway. So
just remove the pointless check. Arguably, we could start transaction
commit whenever the transaction size is *close* to
journal->j_max_transaction_buffers. This has a potential to reduce
latency of the next jbd2_journal_start() at the cost of somewhat smaller
transactions. However for this to have any effect, it would mean that
there isn't someone already waiting in jbd2_journal_start() which means
metadata load for the fs is pretty light anyway so probably this
optimization is not worth it.
Jan Kara [Tue, 5 Nov 2019 16:44:20 +0000 (17:44 +0100)]
jbd2: Reorganize jbd2_journal_stop()
Move code in jbd2_journal_stop() around a bit. It removes some
unnecessary code duplication and will make factoring out parts common
with jbd2__journal_restart() easier.
Jan Kara [Tue, 5 Nov 2019 16:44:19 +0000 (17:44 +0100)]
jbd2: Fix statistics for the number of logged blocks
jbd2 statistics counting number of blocks logged in a transaction was
wrong. It didn't count the commit block and more importantly it didn't
count revoke descriptor blocks. Make sure these get properly counted.
Jan Kara [Tue, 5 Nov 2019 16:44:17 +0000 (17:44 +0100)]
ext4, jbd2: Provide accessor function for handle credits
Provide accessor function to get number of credits available in a handle
and use it from ext4. Later, computation of available credits won't be
so straightforward.
Jan Kara [Tue, 5 Nov 2019 16:44:16 +0000 (17:44 +0100)]
ext4: Provide function to handle transaction restarts
Provide ext4_journal_ensure_credits_fn() function to ensure transaction
has given amount of credits and call helper function to prepare for
restarting a transaction. This allows to remove some boilerplate code
from various places, add proper error handling for the case where
transaction extension or restart fails, and reduces following changes
needed for proper revoke record reservation tracking.
Jan Kara [Tue, 5 Nov 2019 16:44:15 +0000 (17:44 +0100)]
ext4: Avoid unnecessary revokes in ext4_alloc_branch()
Error cleanup path in ext4_alloc_branch() calls ext4_forget() on freshly
allocated indirect blocks with 'metadata' set to 1. This results in
generating revoke records for these blocks. However this is unnecessary
as the freed blocks are only allocated in the current transaction and
thus they will never be journalled. Make this cleanup path similar to
e.g. cleanup in ext4_splice_branch() and use ext4_free_blocks() to
handle block forgetting by passing EXT4_FREE_BLOCKS_FORGET and not
EXT4_FREE_BLOCKS_METADATA to ext4_free_blocks(). This also allows
allocating transaction not to reserve any credits for revoke records.
Jan Kara [Tue, 5 Nov 2019 16:44:13 +0000 (17:44 +0100)]
ext4: Fix ext4_should_journal_data() for EA inodes
Similarly to directories, EA inodes do only journalled modifications to
their data. Change ext4_should_journal_data() to return true for them so
that we don't have to special-case them during truncate.
Jan Kara [Tue, 5 Nov 2019 16:44:12 +0000 (17:44 +0100)]
ext4: Fix credit estimate for final inode freeing
Estimate for the number of credits needed for final freeing of inode in
ext4_evict_inode() was to small. We may modify 4 blocks (inode & sb for
orphan deletion, bitmap & group descriptor for inode freeing) and not
just 3.
Jan Kara [Tue, 5 Nov 2019 16:44:11 +0000 (17:44 +0100)]
ext4: Do not iput inode under running transaction
When ext4_mkdir(), ext4_symlink(), ext4_create(), or ext4_mknod() fail
to add entry into directory, it ends up dropping freshly created inode
under the running transaction and thus inode truncation happens under
that transaction. That breaks assumptions that evict() does not get
called from a transaction context and at least in ext4_symlink() case it
can result in inode eviction deadlocking in inode_wait_for_writeback()
when flush worker finds symlink inode, starts to write it back and
blocks on starting a transaction. So change the code in ext4_mkdir() and
ext4_add_nondir() to drop inode reference only after the transaction is
stopped. We also have to add inode to the orphan list in that case as
otherwise the inode would get leaked in case we crash before inode
deletion is committed.
Jan Kara [Tue, 5 Nov 2019 16:44:10 +0000 (17:44 +0100)]
ext4: Move marking of handle as sync to ext4_add_nondir()
Every caller of ext4_add_nondir() marks handle as sync if directory has
DIRSYNC set. Move this marking to ext4_add_nondir() so reduce some
duplication.
Jan Kara [Tue, 5 Nov 2019 16:44:09 +0000 (17:44 +0100)]
jbd2: Completely fill journal descriptor blocks
With 32-bit block numbers, we don't allocate the array for journal
buffer heads large enough for corresponding descriptor tags to fill the
descriptor block. Thus we end up writing out half-full descriptor blocks
to the journal unnecessarily growing the transaction. Fix the logic to
allocate the array large enough.
Jan Kara [Tue, 5 Nov 2019 16:44:07 +0000 (17:44 +0100)]
jbd2: Fix possible overflow in jbd2_log_space_left()
When number of free space in the journal is very low, the arithmetic in
jbd2_log_space_left() could underflow resulting in very high number of
free blocks and thus triggering assertion failure in transaction commit
code complaining there's not enough space in the journal:
Ritesh Harjani [Wed, 16 Oct 2019 07:37:11 +0000 (13:07 +0530)]
ext4: Enable blocksize < pagesize for dioread_nolock
All support is now added for blocksize < pagesize for dioread_nolock.
This patch removes those checks which disables dioread_nolock
feature for blocksize != pagesize.
Ritesh Harjani [Wed, 16 Oct 2019 07:37:10 +0000 (13:07 +0530)]
ext4: Add support for blocksize < pagesize in dioread_nolock
This patch adds the support for blocksize < pagesize for
dioread_nolock feature.
Since in case of blocksize < pagesize, we can have multiple
small buffers of page as unwritten extents, we need to
maintain a vector of these unwritten extents which needs
the conversion after the IO is complete. Thus, we maintain
a list of tuple <offset, size> pair (io_end_vec) for this &
traverse this list to do the unwritten to written conversion.
Ritesh Harjani [Wed, 16 Oct 2019 07:37:09 +0000 (13:07 +0530)]
ext4: Refactor mpage_map_and_submit_buffers function
This patch refactors mpage_map_and_submit_buffers to take
out the page buffers processing, as a separate function.
This will be required to add support for blocksize < pagesize
for dioread_nolock feature.
Ritesh Harjani [Wed, 16 Oct 2019 07:37:08 +0000 (13:07 +0530)]
ext4: Add API to bring in support for unwritten io_end_vec conversion
This patch just brings in the API for conversion of unwritten io_end_vec
extents which will be required for blocksize < pagesize support
for dioread_nolock feature.
Ritesh Harjani [Wed, 16 Oct 2019 07:37:07 +0000 (13:07 +0530)]
ext4: keep uniform naming convention for io & io_end variables
Let's keep uniform naming convention for ext4_submit_io (io)
& ext4_end_io_t (io_end) structures, to avoid any confusion.
No functionality change in this patch.
Thomas Gleixner [Fri, 9 Aug 2019 12:42:33 +0000 (14:42 +0200)]
jbd2: Free journal head outside of locked region
On PREEMPT_RT bit-spinlocks have the same semantics as on PREEMPT_RT=n,
i.e. they disable preemption. That means functions which are not safe to be
called in preempt disabled context on RT trigger a might_sleep() assert.
The journal head bit spinlock is mostly held for short code sequences with
trivial RT safe functionality, except for one place:
jbd2_journal_put_journal_head() invokes __journal_remove_journal_head()
with the journal head bit spinlock held. __journal_remove_journal_head()
invokes kmem_cache_free() which must not be called with preemption disabled
on RT.
Jan suggested to rework the removal function so the actual free happens
outside the bit-spinlocked region.
Split it into two parts:
- Do the sanity checks and the buffer head detach under the lock
- Do the actual free after dropping the lock
There is error case handling in the free part which needs to dereference
the b_size field of the now detached buffer head. Due to paranoia (caused
by ignorance) the size is retrieved in the detach function and handed into
the free function. Might be over-engineered, but better safe than sorry.
This makes the journal head bit-spinlock usage RT compliant and also avoids
nested locking which is not covered by lockdep.
Suggested-by: Jan Kara <jack@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-ext4@vger.kernel.org Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20190809124233.13277-8-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Thomas Gleixner [Fri, 9 Aug 2019 12:42:32 +0000 (14:42 +0200)]
jbd2: Make state lock a spinlock
Bit-spinlocks are problematic on PREEMPT_RT if functions which might sleep
on RT, e.g. spin_lock(), alloc/free(), are invoked inside the lock held
region because bit spinlocks disable preemption even on RT.
A first attempt was to replace state lock with a spinlock placed in struct
buffer_head and make the locking conditional on PREEMPT_RT and
DEBUG_BIT_SPINLOCKS.
Jan pointed out that there is a 4 byte hole in struct journal_head where a
regular spinlock fits in and he would not object to convert the state lock
to a spinlock unconditionally.
Aside of solving the RT problem, this also gains lockdep coverage for the
journal head state lock (bit-spinlocks are not covered by lockdep as it's
hard to fit a lockdep map into a single bit).
The trivial change would have been to convert the jbd_*lock_bh_state()
inlines, but that comes with the downside that these functions take a
buffer head pointer which needs to be converted to a journal head pointer
which adds another level of indirection.
As almost all functions which use this lock have a journal head pointer
readily available, it makes more sense to remove the lock helper inlines
and write out spin_*lock() at all call sites.
Fixup all locking comments as well.
Suggested-by: Jan Kara <jack@suse.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jan Kara <jack@suse.cz> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jan Kara <jack@suse.com> Cc: linux-ext4@vger.kernel.org Link: https://lore.kernel.org/r/20190809124233.13277-7-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jan Kara [Fri, 9 Aug 2019 12:42:31 +0000 (14:42 +0200)]
jbd2: Don't call __bforget() unnecessarily
jbd2_journal_forget() jumps to 'not_jbd' branch which calls __bforget()
in cases where the buffer is clean which is pointless. In case of failed
assertion, it can be even argued that it is safer not to touch buffer's
dirty bits. Also logically it makes more sense to just jump to 'drop'
and that will make logic also simpler when we switch bh_state_lock to a
spinlock.
Jan Kara [Fri, 9 Aug 2019 12:42:30 +0000 (14:42 +0200)]
jbd2: Drop unnecessary branch from jbd2_journal_forget()
We have cleared both dirty & jbddirty bits from the bh. So there's no
difference between bforget() and brelse(). Thus there's no point jumping
to no_jbd branch.
Jan Kara [Fri, 9 Aug 2019 12:42:29 +0000 (14:42 +0200)]
jbd2: Move dropping of jh reference out of un/re-filing functions
__jbd2_journal_unfile_buffer() and __jbd2_journal_refile_buffer() drop
transaction's jh reference when they remove jh from a transaction. This
will be however inconvenient once we move state lock into journal_head
itself as we still need to unlock it and we'd need to grab jh reference
just for that. Move dropping of jh reference out of these functions into
the few callers.
Thomas Gleixner [Fri, 9 Aug 2019 12:42:28 +0000 (14:42 +0200)]
jbd2: Remove jbd_trylock_bh_state()
No users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: linux-ext4@vger.kernel.org Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20190809124233.13277-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Thomas Gleixner [Fri, 9 Aug 2019 12:42:27 +0000 (14:42 +0200)]
jbd2: Simplify journal_unmap_buffer()
journal_unmap_buffer() checks first whether the buffer head is a journal.
If so it takes locks and then invokes jbd2_journal_grab_journal_head()
followed by another check whether this is journal head buffer.
The double checking is pointless.
Replace the initial check with jbd2_journal_grab_journal_head() which
alredy checks whether the buffer head is actually a journal.
Allows also early access to the journal head pointer for the upcoming
conversion of state lock to a regular spinlock.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jan Kara <jack@suse.cz> Cc: linux-ext4@vger.kernel.org Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20190809124233.13277-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Linus Torvalds [Sun, 13 Oct 2019 21:47:10 +0000 (14:47 -0700)]
Merge tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"A few tracing fixes:
- Remove lockdown from tracefs itself and moved it to the trace
directory. Have the open functions there do the lockdown checks.
- Fix a few races with opening an instance file and the instance
being deleted (Discovered during the lockdown updates). Kept
separate from the clean up code such that they can be backported to
stable easier.
- Clean up and consolidated the checks done when opening a trace
file, as there were multiple checks that need to be done, and it
did not make sense having them done in each open instance.
- Fix a regression in the record mcount code.
- Small hw_lat detector tracer fixes.
- A trace_pipe read fix due to not initializing trace_seq"
* tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
tracing/hwlat: Report total time spent in all NMIs during the sample
recordmcount: Fix nop_mcount() function
tracing: Do not create tracefs files if tracefs lockdown is in effect
tracing: Add locked_down checks to the open calls of files created for tracefs
tracing: Add tracing_check_open_get_tr()
tracing: Have trace events system open call tracing_open_generic_tr()
tracing: Get trace_array reference for available_tracers files
ftrace: Get a reference counter for the trace_array on filter files
tracefs: Revert ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")
Linus Torvalds [Sun, 13 Oct 2019 15:40:31 +0000 (08:40 -0700)]
Merge tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- Update/fix inspur-ipsps1 and k10temp Documentation
- Fix nct7904 driver
- Fix HWMON_P_MIN_ALARM mask in hwmon core
* tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: docs: Extend inspur-ipsps1 title underline
hwmon: (nct7904) Add array fan_alarm and vsen_alarm to store the alarms in nct7904_data struct.
docs: hwmon: Include 'inspur-ipsps1.rst' into docs
hwmon: Fix HWMON_P_MIN_ALARM mask
hwmon: (k10temp) Update documentation and add temp2_input info
hwmon: (nct7904) Fix the incorrect value of vsen_mask in nct7904_data struct
Linus Torvalds [Sun, 13 Oct 2019 15:26:54 +0000 (08:26 -0700)]
Merge tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Richard Weinberger:
"Two fixes for MTD:
- spi-nor: Fix for a regression in write_sr()
- rawnand: Regression fix for the au1550nd driver"
* tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: au1550nd: Fix au_read_buf16() prototype
mtd: spi-nor: Fix direction of the write_sr() transfer
It caught the process in the middle of trace_access_unlock(). There is
no loop. So, it must be looping in the caller tracing_read_pipe()
via the "waitagain" label.
Crashdump analyze uncovered that iter->seq was completely zeroed
at this point, including iter->seq.seq.size. It means that
print_trace_line() was never able to print anything and
there was no forward progress.
The culprit seems to be in the code:
/* reset all but tr, trace, and overruns */
memset(&iter->seq, 0,
sizeof(struct trace_iterator) -
offsetof(struct trace_iterator, seq));
It was added by the commit 53d0aa773053ab182877 ("ftrace:
add logic to record overruns"). It was v2.6.27-rc1.
It was the time when iter->seq looked like:
struct trace_seq {
unsigned char buffer[PAGE_SIZE];
unsigned int len;
};
There was no "size" variable and zeroing was perfectly fine.
The solution is to reinitialize the structure after or without
zeroing.
tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
max_latency is intended to record the maximum ever observed hardware
latency, which may occur in either part of the loop (inner/outer). So
we need to also consider the outer-loop sample when updating
max_latency.
tracing/hwlat: Report total time spent in all NMIs during the sample
nmi_total_ts is supposed to record the total time spent in *all* NMIs
that occur on the given CPU during the (active portion of the)
sampling window. However, the code seems to be overwriting this
variable for each NMI, thereby only recording the time spent in the
most recent NMI. Fix it by accumulating the duration instead.
The removal of the longjmp code in recordmcount.c mistakenly made the return
of make_nop() being negative an exit of nop_mcount(). It should not exit the
routine, but instead just not process that part of the code. By exiting with
an error code, it would cause the update of recordmcount to fail some files
which would fail the build if ftrace function tracing was enabled.
tracing: Do not create tracefs files if tracefs lockdown is in effect
If on boot up, lockdown is activated for tracefs, don't even bother creating
the files. This can also prevent instances from being created if lockdown is
in effect.
tracing: Add locked_down checks to the open calls of files created for tracefs
Added various checks on open tracefs calls to see if tracefs is in lockdown
mode, and if so, to return -EPERM.
Note, the event format files (which are basically standard on all machines)
as well as the enabled_functions file (which shows what is currently being
traced) are not lockde down. Perhaps they should be, but it seems counter
intuitive to lockdown information to help you know if the system has been
modified.
Currently, most files in the tracefs directory test if tracing_disabled is
set. If so, it should return -ENODEV. The tracing_disabled is called when
tracing is found to be broken. Originally it was done in case the ring
buffer was found to be corrupted, and we wanted to prevent reading it from
crashing the kernel. But it's also called if a tracing selftest fails on
boot. It's a one way switch. That is, once it is triggered, tracing is
disabled until reboot.
As most tracefs files can also be used by instances in the tracefs
directory, they need to be carefully done. Each instance has a trace_array
associated to it, and when the instance is removed, the trace_array is
freed. But if an instance is opened with a reference to the trace_array,
then it requires looking up the trace_array to get its ref counter (as there
could be a race with it being deleted and the open itself). Once it is
found, a reference is added to prevent the instance from being removed (and
the trace_array associated with it freed).
Combine the two checks (tracing_disabled and trace_array_get()) into a
single helper function. This will also make it easier to add lockdown to
tracefs later.
tracing: Have trace events system open call tracing_open_generic_tr()
Instead of having the trace events system open call open code the taking of
the trace_array descriptor (with trace_array_get()) and then calling
trace_open_generic(), have it use the tracing_open_generic_tr() that does
the combination of the two. This requires making tracing_open_generic_tr()
global.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
tracing: Get trace_array reference for available_tracers files
As instances may have different tracers available, we need to look at the
trace_array descriptor that shows the list of the available tracers for the
instance. But there's a race between opening the file and an admin
deleting the instance. The trace_array_get() needs to be called before
accessing the trace_array.
Cc: stable@vger.kernel.org Fixes: 607e2ea167e56 ("tracing: Set up infrastructure to allow tracers for instances") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
ftrace: Get a reference counter for the trace_array on filter files
The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for
an instance now. They need to take a reference to the instance otherwise
there could be a race between accessing the files and deleting the instance.
It wasn't until the :mod: caching where these file operations started
referencing the trace_array directly.
Cc: stable@vger.kernel.org Fixes: 673feb9d76ab3 ("ftrace: Add :mod: caching infrastructure to trace_array") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
I bisected this down to the addition of the proxy_ops into tracefs for
lockdown. It appears that the allocation of the proxy_ops and then freeing
it in the destroy_inode callback, is causing havoc with the memory system.
Reading the documentation about destroy_inode and talking with Linus about
this, this is buggy and wrong. When defining the destroy_inode() method, it
is expected that the destroy_inode() will also free the inode, and not just
the extra allocations done in the creation of the inode. The faulty commit
causes a memory leak of the inode data structure when they are deleted.
Instead of allocating the proxy_ops (and then having to free it) the checks
should be done by the open functions themselves, and not hack into the
tracefs directory. First revert the tracefs updates for locked_down and then
later we can add the locked_down checks in the kernel/trace files.
Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home Fixes: ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Linus Torvalds [Sat, 12 Oct 2019 22:47:19 +0000 (15:47 -0700)]
Merge tag 'char-misc-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for 5.4-rc3.
Nothing huge here. Some binder driver fixes (although it is still
being discussed if these all fix the reported issues or not, so more
might be coming later), some mei device ids and fixes, and a google
firmware driver bugfix that fixes a regression, as well as some other
tiny fixes.
All have been in linux-next with no reported issues"
* tag 'char-misc-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
firmware: google: increment VPD key_len properly
w1: ds250x: Fix build error without CRC16
virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr
binder: Fix comment headers on binder_alloc_prepare_to_free()
binder: prevent UAF read in print_binder_transaction_log_entry()
misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach
mei: avoid FW version request on Ibex Peak and earlier
mei: me: add comet point (lake) LP device ids
Linus Torvalds [Sat, 12 Oct 2019 22:44:46 +0000 (15:44 -0700)]
Merge tag 'staging-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO driver fixes from Greg KH:
"Here are some staging and IIO driver fixes for 5.4-rc3.
The "biggest" thing here is a removal of the fbtft device and flexfb
code as they have been abandoned by their authors and are no longer
needed for that hardware.
Other than that, the usual amount of staging driver and iio driver
fixes for reported issues, and some speakup sysfs file documentation,
which has been long awaited for.
All have been in linux-next with no reported issues"
* tag 'staging-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (32 commits)
iio: Fix an undefied reference error in noa1305_probe
iio: light: opt3001: fix mutex unlock race
iio: adc: ad799x: fix probe error handling
iio: light: add missing vcnl4040 of_compatible
iio: light: fix vcnl4000 devicetree hooks
iio: imu: st_lsm6dsx: fix waitime for st_lsm6dsx i2c controller
iio: adc: axp288: Override TS pin bias current for some models
iio: imu: adis16400: fix memory leak
iio: imu: adis16400: release allocated memory on failure
iio: adc: stm32-adc: fix a race when using several adcs with dma and irq
iio: adc: stm32-adc: move registers definitions
iio: accel: adxl372: Perform a reset at start up
iio: accel: adxl372: Fix push to buffers lost samples
iio: accel: adxl372: Fix/remove limitation for FIFO samples
iio: adc: hx711: fix bug in sampling of data
staging: vt6655: Fix memory leak in vt6655_probe
staging: exfat: Use kvzalloc() instead of kzalloc() for exfat_sb_info
Staging: fbtft: fix memory leak in fbtft_framebuffer_alloc
staging: speakup: document sysfs attributes
staging: rtl8188eu: fix HighestRate check in odm_ARFBRefresh_8188E()
...
Linus Torvalds [Sat, 12 Oct 2019 22:42:19 +0000 (15:42 -0700)]
Merge tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are some small tty and serial driver fixes for 5.4-rc3 that
resolve a number of reported issues and regressions.
None of these are huge, full details are in the shortlog. There's also
a MAINTAINERS update that I think you might have already taken in your
tree already, but git should handle that merge easily.
All have been in linux-next with no reported issues"
* tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
MAINTAINERS: kgdb: Add myself as a reviewer for kgdb/kdb
tty: serial: imx: Use platform_get_irq_optional() for optional IRQs
serial: fix kernel-doc warning in comments
serial: 8250_omap: Fix gpio check for auto RTS/CTS
serial: mctrl_gpio: Check for NULL pointer
tty: serial: fsl_lpuart: Fix lpuart_flush_buffer()
tty: serial: Fix PORT_LINFLEXUART definition
tty: n_hdlc: fix build on SPARC
serial: uartps: Fix uartps_major handling
serial: uartlite: fix exit path null pointer
tty: serial: linflexuart: Fix magic SysRq handling
serial: sh-sci: Use platform_get_irq_optional() for optional interrupts
dt-bindings: serial: sh-sci: Document r8a774b1 bindings
serial/sifive: select SERIAL_EARLYCON
tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()'
tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
Linus Torvalds [Sat, 12 Oct 2019 22:37:12 +0000 (15:37 -0700)]
Merge tag 'usb-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a lot of small USB driver fixes for 5.4-rc3.
syzbot has stepped up its testing of the USB driver stack, now able to
trigger fun race conditions between disconnect and probe functions.
Because of that we have a lot of fixes in here from Johan and others
fixing these reported issues that have been around since almost all
time.
We also are just deleting the rio500 driver, making all of the syzbot
bugs found in it moot as it turns out no one has been using it for
years as there is a userspace version that is being used instead.
There are also a number of other small fixes in here, all resolving
reported issues or regressions.
All have been in linux-next without any reported issues"
* tag 'usb-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (65 commits)
USB: yurex: fix NULL-derefs on disconnect
USB: iowarrior: use pr_err()
USB: iowarrior: drop redundant iowarrior mutex
USB: iowarrior: drop redundant disconnect mutex
USB: iowarrior: fix use-after-free after driver unbind
USB: iowarrior: fix use-after-free on release
USB: iowarrior: fix use-after-free on disconnect
USB: chaoskey: fix use-after-free on release
USB: adutux: fix use-after-free on release
USB: ldusb: fix NULL-derefs on driver unbind
USB: legousbtower: fix use-after-free on release
usb: cdns3: Fix for incorrect DMA mask.
usb: cdns3: fix cdns3_core_init_role()
usb: cdns3: gadget: Fix full-speed mode
USB: usb-skeleton: drop redundant in-urb check
USB: usb-skeleton: fix use-after-free after driver unbind
USB: usb-skeleton: fix NULL-deref on disconnect
usb:cdns3: Fix for CV CH9 running with g_zero driver.
usb: dwc3: Remove dev_err() on platform_get_irq() failure
usb: dwc3: Switch to platform_get_irq_byname_optional()
...
Linus Torvalds [Sat, 12 Oct 2019 22:29:54 +0000 (15:29 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Two fixes: a guest-cputime accounting fix, and a cgroup bandwidth
quota precision fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/vtime: Fix guest/system mis-accounting on task switch
sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
Linus Torvalds [Sat, 12 Oct 2019 22:15:17 +0000 (15:15 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also a couple of updates for new Intel
models (which are technically hw-enablement, but to users it's a fix
to perf behavior on those new CPUs - hope this is fine), an AUX
inheritance fix, event time-sharing fix, and a fix for lost non-perf
NMI events on AMD systems"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf/x86/cstate: Add Tiger Lake CPU support
perf/x86/msr: Add Tiger Lake CPU support
perf/x86/intel: Add Tiger Lake CPU support
perf/x86/cstate: Update C-state counters for Ice Lake
perf/x86/msr: Add new CPU model numbers for Ice Lake
perf/x86/cstate: Add Comet Lake CPU support
perf/x86/msr: Add Comet Lake CPU support
perf/x86/intel: Add Comet Lake CPU support
perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
perf/core: Fix corner case in perf_rotate_context()
perf/core: Rework memory accounting in perf_mmap()
perf/core: Fix inheritance of aux_output groups
perf annotate: Don't return -1 for error when doing BPF disassembly
perf annotate: Return appropriate error code for allocation failures
perf annotate: Fix arch specific ->init() failure errors
perf annotate: Propagate the symbol__annotate() error return
perf annotate: Fix the signedness of failure returns
perf annotate: Propagate perf_env__arch() error
perf evsel: Fall back to global 'perf_env' in perf_evsel__env()
perf tools: Propagate get_cpuid() error
...
Linus Torvalds [Sat, 12 Oct 2019 22:08:24 +0000 (15:08 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Misc EFI fixes all across the map: CPER error report fixes, fixes to
TPM event log parsing, fix for a kexec hang, a Sparse fix and other
fixes"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/tpm: Fix sanity check of unsigned tbl_size being less than zero
efi/x86: Do not clean dummy variable in kexec path
efi: Make unexported efi_rci2_sysfs_init() static
efi/tpm: Only set 'efi_tpm_final_log_size' after successful event log parsing
efi/tpm: Don't traverse an event log with no events
efi/tpm: Don't access event->count when it isn't mapped
efivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified
efi/cper: Fix endianness of PCIe class code
Linus Torvalds [Sat, 12 Oct 2019 21:46:14 +0000 (14:46 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"A handful of fixes: a kexec linking fix, an AMD MWAITX fix, a vmware
guest support fix when built under Clang, and new CPU model number
definitions"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add Comet Lake to the Intel CPU models header
lib/string: Make memzero_explicit() inline instead of external
x86/cpu/vmware: Use the full form of INL in VMWARE_PORT
x86/asm: Fix MWAITX C-state hint value
Linus Torvalds [Sat, 12 Oct 2019 21:25:38 +0000 (14:25 -0700)]
Merge tag 'riscv/for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- Fix several bugs in the breakpoint trap handler
- Drop an unnecessary loop around calls to preempt_schedule_irq()
* tag 'riscv/for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: entry: Remove unneeded need_resched() loop
riscv: Correct the handling of unexpected ebreak in do_trap_break()
riscv: avoid sending a SIGTRAP to a user thread trapped in WARN()
riscv: avoid kernel hangs when trapped in BUG()
Linus Torvalds [Sat, 12 Oct 2019 21:16:51 +0000 (14:16 -0700)]
Merge tag 'mips_fixes_5.4_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
- Build fixes for CONFIG_OPTIMIZE_INLINING=y builds in which the
compiler may choose not to inline __xchg() & __cmpxchg().
- A build fix for Loongson configurations with GCC 9.x.
- Expose some extra HWCAP bits to indicate support for various
instruction set extensions to userland.
- Fix bad stack access in firmware handling code for old SNI
RM200/300/400 machines.
* tag 'mips_fixes_5.4_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Disable Loongson MMI instructions for kernel build
MIPS: elf_hwcap: Export userspace ASEs
MIPS: fw: sni: Fix out of bounds init of o32 stack
MIPS: include: Mark __xchg as __always_inline
MIPS: include: Mark __cmpxchg as __always_inline
Linus Torvalds [Sat, 12 Oct 2019 21:13:55 +0000 (14:13 -0700)]
Merge tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix a kernel crash in spufs_create_root() on Cell machines, since the
new mount API went in.
Fix a regression in our KVM code caused by our recent PCR changes.
Avoid a warning message about a failing hypervisor API on systems that
don't have that API.
A couple of minor build fixes.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do
Rosario, Emmanuel Nicolet, Jordan Niethe, Laurent Dufour, Stephen
Rothwell"
* tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
spufs: fix a crash in spufs_create_root()
powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
powerpc/pseries: Remove confusing warning message.
powerpc/64s/radix: Fix build failure with RADIX_MMU=n
Linus Torvalds [Sat, 12 Oct 2019 21:11:21 +0000 (14:11 -0700)]
Merge tag 'for-linus-5.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- correct panic handling when running as a Xen guest
- cleanup the Xen grant driver to remove printing a pointer being
always NULL
- remove a soon to be wrong call of of_dma_configure()
* tag 'for-linus-5.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Stop abusing DT of_dma_configure API
xen/grant-table: remove unnecessary printing
x86/xen: Return from panic notifier
Kan Liang [Tue, 8 Oct 2019 15:50:08 +0000 (08:50 -0700)]
perf/x86/intel: Add Tiger Lake CPU support
Tiger Lake is the followon to Ice Lake. From the perspective of Intel
core PMU, there is little changes compared with Ice Lake, e.g. small
changes in event list. But it doesn't impact on core PMU functionality.
Share the perf code with Ice Lake. The event list patch will be submitted
later separately.
The patch has been tested on real hardware.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1570549810-25049-8-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Kan Liang [Tue, 8 Oct 2019 15:50:05 +0000 (08:50 -0700)]
perf/x86/cstate: Add Comet Lake CPU support
Comet Lake is the new 10th Gen Intel processor. From the perspective of
Intel cstate residency counters, there is nothing changed compared with
Kaby Lake.
Share hswult_cstates with Kaby Lake.
Update the comments for Comet Lake.
Kaby Lake is missed in the comments for some Residency Counters. Update
the comments for Kaby Lake as well.
The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.
The patch has been tested on real hardware.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1570549810-25049-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Kan Liang [Tue, 8 Oct 2019 15:50:03 +0000 (08:50 -0700)]
perf/x86/intel: Add Comet Lake CPU support
Comet Lake is the new 10th Gen Intel processor. From the perspective
of Intel PMU, there is nothing changed compared with Sky Lake.
Share the perf code with Sky Lake.
The patch has been tested on real hardware.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1570549810-25049-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Fri, 11 Oct 2019 21:28:59 +0000 (14:28 -0700)]
Merge tag 'nfs-for-5.4-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client bugfixes from Anna Schumaker:
"Stable bugfixes:
- Fix O_DIRECT accounting of number of bytes read/written # v4.1+
Other fixes:
- Fix nfsi->nrequests count error on nfs_inode_remove_request()
- Remove redundant mirror tracking in O_DIRECT
- Fix leak of clp->cl_acceptor string
- Fix race to sk_err after xs_error_report"
* tag 'nfs-for-5.4-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
SUNRPC: fix race to sk_err after xs_error_report
NFSv4: Fix leak of clp->cl_acceptor string
NFS: Remove redundant mirror tracking in O_DIRECT
NFS: Fix O_DIRECT accounting of number of bytes read/written
nfs: Fix nfsi->nrequests count error on nfs_inode_remove_request
Linus Torvalds [Fri, 11 Oct 2019 21:01:13 +0000 (14:01 -0700)]
Merge tag '5.4-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Eight small SMB3 fixes, four for stable, and important fix for the
recent regression introduced by filesystem timestamp range patches"
* tag '5.4-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6:
CIFS: Force reval dentry if LOOKUP_REVAL flag is set
CIFS: Force revalidate inode when dentry is stale
smb3: Fix regression in time handling
smb3: remove noisy debug message and minor cleanup
CIFS: Gracefully handle QueryInfo errors during open
cifs: use cifsInodeInfo->open_file_lock while iterating to avoid a panic
fs: cifs: mute -Wunused-const-variable message
smb3: cleanup some recent endian errors spotted by updated sparse
Linus Torvalds [Fri, 11 Oct 2019 17:19:24 +0000 (10:19 -0700)]
Merge tag 'modules-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module fixes from Jessica Yu:
"Code cleanups and kbuild/namespace related fixups from Masahiro.
Most importantly, it fixes a namespace-related modpost issue for
external module builds
- Fix broken external module builds due to a modpost bug in
read_dump(), where the namespace was not being strdup'd and
sym->namespace would be set to bogus data.
- Various namespace-related kbuild fixes and cleanups thanks to
Masahiro Yamada"
* tag 'modules-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
doc: move namespaces.rst from kbuild/ to core-api/
nsdeps: make generated patches independent of locale
nsdeps: fix hashbang of scripts/nsdeps
kbuild: fix build error of 'make nsdeps' in clean tree
module: rename __kstrtab_ns_* to __kstrtabns_* to avoid symbol conflict
modpost: fix broken sym->namespace for external module builds
module: swap the order of symbol.namespace
scripts: add_namespace: Fix coccicheck failed
Linus Torvalds [Fri, 11 Oct 2019 17:12:45 +0000 (10:12 -0700)]
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V fixes from Sasha Levin:
"Two fixes from Dexuan Cui:
- Fix a (harmless) warning when building vmbus without
CONFIG_PM_SLEEP
- Fix for a memory leak (and optimization) in the hyperv mouse code"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Fix harmless building warnings without CONFIG_PM_SLEEP
HID: hyperv: Use in-place iterator API in the channel callback
Convert the coding-style.rst example to the keyword style.
Add description and links to deprecated.rst.
Miguel Ojeda comments on the eventual [[fallthrough]] syntax:
"Note that C17/C18 does not have [[fallthrough]].
C++17 introduced it, as it is mentioned above. I would keep the
__attribute__((fallthrough)) -> [[fallthrough]] change you did,
though, since that is indeed the standard syntax (given the paragraph
references C++17).
I was told by Aaron Ballman (who is proposing them for C) that it is
more or less likely that it becomes standardized in C2x. However, it
is still not added to the draft (other attributes are already,
though). See N2268 and N2269:
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2268.pdf (fallthrough)
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2269.pdf (attributes in general)"
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Sat, 5 Oct 2019 16:46:42 +0000 (09:46 -0700)]
compiler_attributes.h: Add 'fallthrough' pseudo keyword for switch/case use
Reserve the pseudo keyword 'fallthrough' for the ability to convert the
various case block /* fallthrough */ style comments to appear to be an
actual reserved word with the same gcc case block missing fallthrough
warning capability.
Linus Torvalds [Fri, 11 Oct 2019 16:02:33 +0000 (09:02 -0700)]
Merge tag 'drm-fixes-2019-10-11' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"The regular fixes pull for rc3. The i915 team found some fixes they
(or I) missed for rc1, which is why this is a bit bigger than usual,
otherwise there is a single amdgpu fix, some spi panel aliases, and a
bridge fix.
i915:
- execlist access fixes
- list deletion fix
- CML display fix
- HSW workaround extension to GT2
- chicken bit whitelist
- GGTT resume issue
- SKL GPU hangs for Vulkan compute
amdgpu:
- memory leak fix
panel:
- spi aliases
tc358767:
- bridge artifacts fix"
* tag 'drm-fixes-2019-10-11' of git://anongit.freedesktop.org/drm/drm: (22 commits)
drm/bridge: tc358767: fix max_tu_symbol value
drm/i915/gt: execlists->active is serialised by the tasklet
drm/i915/execlists: Protect peeking at execlists->active
drm/i915: Fixup preempt-to-busy vs reset of a virtual request
drm/i915: Only enqueue already completed requests
drm/i915/execlists: Drop redundant list_del_init(&rq->sched.link)
drm/i915/cml: Add second PCH ID for CMP
drm/amdgpu: fix memory leak
drm/panel: tpo-td043mtea1: Fix SPI alias
drm/panel: tpo-td028ttec1: Fix SPI alias
drm/panel: sony-acx565akm: Fix SPI alias
drm/panel: nec-nl8048hl11: Fix SPI alias
drm/panel: lg-lb035q02: Fix SPI alias
drm/i915: Mark contents as dirty on a write fault
drm/i915: Prevent bonded requests from overtaking each other on preemption
drm/i915: Bump skl+ max plane width to 5k for linear/x-tiled
drm/i915: Verify the engine after acquiring the active.lock
drm/i915: Extend Haswell GT1 PSMI workaround to all
drm/i915: Don't mix srcu tag and negative error codes
drm/i915: Whitelist COMMON_SLICE_CHICKEN2
...
Linus Torvalds [Fri, 11 Oct 2019 15:45:32 +0000 (08:45 -0700)]
Merge tag 'for-linus-20191010' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Fix wbt performance regression introduced with the blk-rq-qos
refactoring (Harshad)
- Fix io_uring fileset removal inadvertently killing the workqueue (me)
- Fix io_uring typo in linked command nonblock submission (Pavel)
- Remove spurious io_uring wakeups on request free (Pavel)
- Fix null_blk zoned command error return (Keith)
- Don't use freezable workqueues for backing_dev, also means we can
revert a previous libata hack (Mika)
- Fix nbd sysfs mutex dropped too soon at removal time (Xiubo)
* tag 'for-linus-20191010' of git://git.kernel.dk/linux-block:
nbd: fix possible sysfs duplicate warning
null_blk: Fix zoned command return code
io_uring: only flush workqueues on fileset removal
io_uring: remove wait loop spurious wakeups
blk-wbt: fix performance regression in wbt scale_up/scale_down
Revert "libata, freezer: avoid block device removal while system is frozen"
bdi: Do not use freezable workqueue
io_uring: fix reversed nonblock flag for link submission
Depending on inlining decisions by the compiler, __get/put_user_fn
might become out of line. Then the compiler is no longer able to tell
that size can only be 1,2,4 or 8 due to the check in __get/put_user
resulting in false positives like
./arch/s390/include/asm/uaccess.h: In function ‘__put_user_fn’:
./arch/s390/include/asm/uaccess.h:113:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized]
113 | return rc;
| ^~
./arch/s390/include/asm/uaccess.h: In function ‘__get_user_fn’:
./arch/s390/include/asm/uaccess.h:143:9: warning: ‘rc’ may be used uninitialized in this function [-Wmaybe-uninitialized]
143 | return rc;
| ^~
These functions are supposed to be always inlined. Mark it as such.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Brian Norris [Mon, 30 Sep 2019 21:45:22 +0000 (14:45 -0700)]
firmware: google: increment VPD key_len properly
Commit 4b708b7b1a2c ("firmware: google: check if size is valid when
decoding VPD data") adds length checks, but the new vpd_decode_entry()
function botched the logic -- it adds the key length twice, instead of
adding the key and value lengths separately.
On my local system, this means vpd.c's vpd_section_create_attribs() hits
an error case after the first attribute it parses, since it's no longer
looking at the correct offset. With this patch, I'm back to seeing all
the correct attributes in /sys/firmware/vpd/...
Fixes: 4b708b7b1a2c ("firmware: google: check if size is valid when decoding VPD data") Cc: <stable@vger.kernel.org> Cc: Hung-Te Lin <hungte@chromium.org> Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Link: https://lore.kernel.org/r/20190930214522.240680-1-briannorris@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Douglas Anderson [Fri, 20 Sep 2019 17:44:47 +0000 (10:44 -0700)]
MAINTAINERS: kgdb: Add myself as a reviewer for kgdb/kdb
I'm interested in kdb / kgdb and have sent various fixes over the
years. I'd like to get CCed on patches so I can be aware of them and
also help review.
The spu_fs_context was not set in fc->fs_private, this caused a crash
when accessing ctx->mode in spufs_create_root().
Fixes: d2e0981c3b9a ("vfs: Convert spufs to use the new mount API") Signed-off-by: Emmanuel Nicolet <emmanuel.nicolet@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20191008141342.GA266797@gmail.com
Jens Axboe [Fri, 11 Oct 2019 03:42:58 +0000 (21:42 -0600)]
io_uring: fix sequence logic for timeout requests
We have two ways a request can be deferred:
1) It's a regular request that depends on another one
2) It's a timeout that tracks completions
We have a shared helper to determine whether to defer, and that
attempts to make the right decision based on the request. But we
only have some of this information in the caller. Un-share the
two timeout/defer helpers so the caller can use the right one.
Dave Airlie [Fri, 11 Oct 2019 00:09:01 +0000 (10:09 +1000)]
Merge tag 'drm-intel-fixes-2019-10-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix CML display by adding a missing ID.
- Drop redundant list_del_init
- Only enqueue already completed requests to avoid races
- Fixup preempt-to-busy vs reset of a virtual request
- Protect peeking at execlists->active
- execlists->active is serialised by the tasklet
drm-intel-next-fixes-2019-09-19:
- Extend old HSW workaround to fix some GPU hangs on Haswell GT2
- Fix return error code on GEM mmap.
- White list a chicken bit register for push constants legacy mode on Mesa
- Fix resume issue related to GGTT restore
- Remove incorrect BUG_ON on execlist's schedule-out
- Fix unrecoverable GPU hangs with Vulkan compute workloads on SKL
drm-intel-next-fixes-2019-09-26:
- Fix concurrence on cases where requests where getting retired at same time as resubmitted to HW
- Fix gen9 display resolutions by setting the right max plane width
- Fix GPU hang on preemption
- Mark contents as dirty on a write fault. This was breaking cursor sprite with dumb buffers.
Dave Airlie [Fri, 11 Oct 2019 00:08:02 +0000 (10:08 +1000)]
Merge tag 'drm-misc-fixes-2019-10-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull (less than what git shortlog provides):
- SPI Aliases fixes for panels
- One fix for the tc358767 bridge dealing with visual artifacts
Since commit 4f8943f80883 ("SUNRPC: Replace direct task wakeups from
softirq context") there has been a race to the value of the sk_err if both
XPRT_SOCK_WAKE_ERROR and XPRT_SOCK_WAKE_DISCONNECT are set. In that case,
we may end up losing the sk_err value that existed when xs_error_report was
called.
Fix this by reverting to the previous behavior: instead of using SO_ERROR
to retrieve the value at a later time (which might also return sk_err_soft),
copy the sk_err value onto struct sock_xprt, and use that value to wake
pending tasks.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 4f8943f80883 ("SUNRPC: Replace direct task wakeups from softirq context") Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Fri, 4 Oct 2019 13:58:54 +0000 (09:58 -0400)]
NFSv4: Fix leak of clp->cl_acceptor string
Our client can issue multiple SETCLIENTID operations to the same
server in some circumstances. Ensure that calls to
nfs4_proc_setclientid() after the first one do not overwrite the
previously allocated cl_acceptor string.
Fixes: f11b2a1cfbf5 ("nfs4: copy acceptor name from context ... ") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Paul Burton [Thu, 10 Oct 2019 18:54:03 +0000 (18:54 +0000)]
MIPS: Disable Loongson MMI instructions for kernel build
GCC 9.x automatically enables support for Loongson MMI instructions when
using some -march= flags, and then errors out when -msoft-float is
specified with:
cc1: error: ‘-mloongson-mmi’ must be used with ‘-mhard-float’
The kernel shouldn't be using these MMI instructions anyway, just as it
doesn't use floating point instructions. Explicitly disable them in
order to fix the build with GCC 9.x.
Signed-off-by: Paul Burton <paul.burton@mips.com> Fixes: 3702bba5eb4f ("MIPS: Loongson: Add GCC 4.4 support for Loongson2E") Fixes: 6f7a251a259e ("MIPS: Loongson: Add basic Loongson 2F support") Fixes: 5188129b8c9f ("MIPS: Loongson-3: Improve -march option and move it to Platform") Cc: Huacai Chen <chenhc@lemote.com> Cc: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: stable@vger.kernel.org # v2.6.32+ Cc: linux-mips@vger.kernel.org
Jiaxun Yang [Thu, 10 Oct 2019 15:01:57 +0000 (23:01 +0800)]
MIPS: elf_hwcap: Export userspace ASEs
A Golang developer reported MIPS hwcap isn't reflecting instructions
that the processor actually supported so programs can't apply optimized
code at runtime.
Thus we export the ASEs that can be used in userspace programs.
Reported-by: Meng Zhuo <mengzhuo1203@gmail.com> Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Cc: linux-mips@vger.kernel.org Cc: Paul Burton <paul.burton@mips.com> Cc: <stable@vger.kernel.org> # 4.14+ Signed-off-by: Paul Burton <paul.burton@mips.com>
Linus Torvalds [Thu, 10 Oct 2019 18:47:16 +0000 (11:47 -0700)]
Merge tag 'xfs-5.4-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"A couple of small code cleanups and bug fixes for rounding errors,
metadata logging errors, and an extra layer of safeguards against
leaking memory contents.
- Fix a rounding error in the fallocate code
- Minor code cleanups
- Make sure to zero memory buffers before formatting metadata blocks
- Fix a few places where we forgot to log an inode metadata update
- Remove broken error handling that tried to clean up after a failure
but still got it wrong"
* tag 'xfs-5.4-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: move local to extent inode logging into bmap helper
xfs: remove broken error handling on failed attr sf to leaf change
xfs: log the inode on directory sf to block format change
xfs: assure zeroed memory buffers for certain kmem allocations
xfs: removed unused error variable from xchk_refcountbt_rec
xfs: remove unused flags arg from xfs_get_aghdr_buf()
xfs: Fix tail rounding in xfs_alloc_file_space()