nilfs2: use inode_set_flags() in nilfs_set_inode_flags()
Use inode_set_flags() to atomically set i_flags instead of clearing out
the S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
FS_IMMUTABLE_FL, FS_APPEND_FL flags to avoid a race where an immutable
file has the immutable flag cleared for a brief window of time.
This is a similar fix to commit 5f16f3225b06 ("ext4: atomically set
inode->i_flags in ext4_set_inode_flags()").
nilfs2: put out gfp mask manipulation from nilfs_set_inode_flags()
nilfs_set_inode_flags() function adjusts gfp-mask of inode->i_mapping as
well as i_flags, however, this coupling of operations is not appropriate.
For instance, nilfs_ioctl_setflags(), one of three callers of
nilfs_set_inode_flags(), doesn't need to reinitialize the gfp-mask at all.
In addition, nilfs_new_inode(), another caller of
nilfs_set_inode_flags(), doesn't either because it has already initialized
the gfp-mask.
Only __nilfs_read_inode(), the remaining caller, needs it. So, this moves
the gfp mask manipulation to __nilfs_read_inode() from
nilfs_set_inode_flags().
nilfs2: fix gcc warning at nilfs_checkpoint_is_mounted()
Fix the following build warning:
fs/nilfs2/super.c: In function 'nilfs_checkpoint_is_mounted':
fs/nilfs2/super.c:1023:10: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
if (cno < 0 || cno > nilfs->ns_cno)
^
This warning indicates that the comparision "cno < 0" is useless because
variable "cno" has an unsigned integer type "__u64".
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nilfs2: improve execution time of NILFS_IOCTL_GET_CPINFO ioctl
The older a filesystem gets, the slower lscp command becomes. This is
because nilfs_cpfile_do_get_cpinfo() function meets more hole blocks
as the start offset of valid checkpoint numbers gets bigger.
This reduces the overhead by skipping hole blocks efficiently with
nilfs_mdt_find_block() helper.
A measurement result of this patch is as follows:
Before:
$ time lscp
CNO DATE TIME MODE FLG BLKCNT ICNT 5769303 2015-02-22 19:31:33 cp - 108 1 5769304 2015-02-22 19:38:54 cp - 108 1
real 0m0.182s
user 0m0.003s
sys 0m0.180s
After:
$ time lscp
CNO DATE TIME MODE FLG BLKCNT ICNT 5769303 2015-02-22 19:31:33 cp - 108 1 5769304 2015-02-22 19:38:54 cp - 108 1
nilfs2: add helper to find existent block on metadata file
Add a new metadata file function, nilfs_mdt_find_block(), which finds
an existent block on a metadata file in a given range of blocks. This
function skips continuous hole blocks efficiently by using
nilfs_bmap_seek_key().
Add a new bmap function, nilfs_bmap_seek_key(), which seeks a valid
entry and returns its key starting from a given key. This function
can be used to skip hole blocks efficiently.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
nilfs2: unify type of key arguments in bmap interface
The type of key arguments in block mapping interface varies depending
on function. For instance, nilfs_bmap_lookup_at_level() takes "__u64"
for its key argument whereas nilfs_bmap_lookup() takes "unsigned
long".
This fits them to "__u64" to eliminate the variation.
nilfs2: use set_mask_bits() for operations on buffer state bitmap
nilfs_forget_buffer(), nilfs_clear_dirty_page(), and
nilfs_segctor_complete_write() are using a bunch of atomic bit operations
against buffer state bitmap.
This reduces the number of them by utilizing set_mask_bits() macro.
nilfs2: do not use async write flag for segment summary buffers
The async write flag is introduced to nilfs2 in the commit 7f42ec394156
("nilfs2: fix issue with race condition of competition between segments
for dirty blocks"), but the flag only makes sense for data buffers and
btree node buffers. It is not needed for segment summary buffers.
This gets rid of the latter uses as part of refactoring of atomic bit
operations on buffer state bitmap.
drivers/rtc/rtc-hym8563.c: fix swapped enable/disable of clockout control bit
The hym8563 datasheet describes the clock output control-bit as "when set
to logic 0, the square wave output is enable, when set to logic 1, the
CLKOUT output is inhibited". But in reality the setting is exactly
opposite.
Before now, the clock output was not really used, but on the rk3288 soc
this generated clock is used to supply the temperature sensor block and
the swapped bit value prevented it from working. With the corrected
value, the tsadc now reports correct values.
drivers/rtc/rtc-omap.c: unlock and lock rtc registers before and after register writes
RTC module contains a kicker mechanism to prevent any spurious writes from
changing the register values. This mechanism requires two MMR writes to
the KICK0 and KICK1 registers with exact data values before the kicker
lock mechanism is released.
Currently the driver release the lock in the probe and leaves it enabled
until the rtc driver removal. This eliminates the idea of preventing
spurious writes when RTC driver is loaded. So implement rtc lock and
unlock functions before and after register writes.
This is as advised by Paul to implement lock and unlock functions in the
driver and not to unlock and leave it in probe. The same discussion can
be seen here:
http://www.mail-archive.com/linux-omap%40vger.kernel.org/msg111588.html
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Paul Walmsley <paul@pwsan.com> Cc: Tero Kristo <t-kristo@ti.com> Cc: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If there's a real error, it's likely that lower level or higher level
code will tell it anyway. Make these logs debug logs, and also print
the error code for the read failure.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Aaro Koskinen [Thu, 16 Apr 2015 19:45:48 +0000 (12:45 -0700)]
drivers/rtc/class.c: initialize rtc name early
In some error cases RTC name is used before it is initialized:
rtc-rs5c372 0-0032: clock needs to be set
rtc-rs5c372 0-0032: rs5c372b found, 24hr, driver version 0.6
rtc (null): read_time: fail to read
rtc-rs5c372 0-0032: rtc core: registered rtc-rs5c372 as rtc0
Fix by initializing the name early.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/rtc/rtc-s5m.c: add support for S2MPS13 RTC
The S2MPS13 RTC is almost the same as S2MPS14. The differences when
updating alarm are:
1. Set WUDR+AUDR field instead of WUDR+RUDR.
2. Clear the AUDR field later (it is not auto-cleared).
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MAINTAINERS: Add Alexandre Belloni as an RTC maintainer
I've noticed that most of the patches for the RTC subsystem are currently
either taken directly by Andrew or going through another maintainer's
tree, quite often without an Acked-by or Reviewed-by tag.
I'd like to propose myself as the RTC subsystem co-maintainer, to mainly
help Alessandro reviewing incoming patches and maintain a subsystem tree
to avoid having the RTC patches going through trees when they have no
particular dependencies.
rtc: driver for Conexant Digicolor CX92755 on-chip RTC
Add driver for the RTC hardware block on the Conexant CX92755 SoC, from
the Digicolor series of SoCs. Tested on the Equinox evaluation board for
the CX92755 chip.
Add a device tree binding documentation to the Real Time Clock hardware
block on the Conexant CX92755 SoC. The CX92755 is from the Digicolor SoCs
series. Other SoCs in that series may share the same hardware block.
drivers/rtc/rtc-hym8563.c: return clock rate even when clock is disabled
When the clock is disabled, do not return a rate of 0 but instead return
the rate the clock will be running at after it gets enabled. This
prevents problems when the core clock code is trying to determine a
suitable rate, while the clock is still off.
CHECK drivers/rtc/rtc-ds1685.c
drivers/rtc/rtc-ds1685.c:2178:1: warning: function 'ds1685_rtc_poweroff' with external linkage has definition
drivers/rtc/rtc-ds1685.c:802:23: warning: Using plain integer as NULL pointer
Fixes: aaaf5fbf56f1 ("rtc: add driver for DS1685 family of real time clocks") Signed-off-by: Joshua Kinard <kumba@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/rtc/rtc-ds1685.c: remove .owner assignment from platform_driver
The rtc driver core now sets the platform_driver 'owner' property, so
remove the assignment from the DS1685 driver.
Fixes: aaaf5fbf56f1: "rtc: add driver for DS1685 family of real time clocks" Signed-off-by: Joshua Kinard <kumba@gentoo.org> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/rtc/rtc-s3c.c: delete duplicate clock control
The current functions in s3c-rtc driver execute clk_enable/disable() to
control clocks and some functions execute s3c_rtc_alarm_clk_enable()
unnecessarily. So this patch deletes the duplicate clock control and
spilts s3c_rtc_alarm_clk_enable() out as
s3c_rtc_enable_clk()/s3c_rtc_disable_clk() to improve readability.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Kukjin Kim <kgene@kernel.org> Cc: Inki Dae <inki.dae@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rtc: hctosys: do not treat lack of RTC device as error
When using device trees on the ARM platform, it is not certain at compile
time whether or not the system will have a RTC.
If one enables CONFIG_HCTOSYS just in case the system booted has a RTC,
and it turns out not to be, this will result in a big fat "unable to open
rtc device" error being printed to console, even when "quiet" is set in
the kernel cmdline.
Fix this by outputting the message with loglevel info instead.
Peter Robinson [Thu, 16 Apr 2015 19:45:07 +0000 (12:45 -0700)]
drivers/rtc/rtc-em3027.c: add device tree support
Set the of_match_table for this driver so that devices can be described in
the device tree. This device is used in the Trimslice and is already
defined in the Trimslice device tree.
Signed-off-by: Peter Robinson <pbrobinson@gmail.com> Cc: Mike Rapoport <mike@compulab.co.il> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: Grant Likely <grant.likely@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
rtc: stmp3xxx: use optional crystal in low power states
The rtc's status register allows to determine if a 32k crystal is
connected to keep the rtc running in low power states provided the
corresponding fuse bits were blown correctly during production. (In case
they were not, the right frequency can be stated in the device tree.) If
there is no such crystal available force the 24 MHz XTAL clock to keep
running to retain the right date and time. Otherwise use the crystal to
save some power.
It would be nice to only switch to the crystal when the XTAL clock is
about to be disabled and keep the crystal off when unneeded because XTAL
is always on while the chip is powered on. But as sudden power loss isn't
detectable this is not save.
sprintf() reliably returns the number of characters printed, so we don't
need to ask strlen() where we are. Also replace calling sprintf("%02x")
in a loop with the much simpler bin2hex().
[akpm@linux-foundation.org: it's odd to include kernel.h after everything else] Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 16 Apr 2015 19:44:53 +0000 (12:44 -0700)]
checkpatch: avoid "spaces required around that ':'" false positive
Since commit 1f65f947a6a8 ("checkpatch: add checks for question mark and
colon spacing") back in 2008, checkpatch has reported false positive for
asm volatile uses of "::" checkpatch thinks colons should always have
spaces around it.
Add an exception for colons with colons on either side for this valid asm
volatile (and c++) use.
Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Yehuda Yitschak <yehuday@marvell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 16 Apr 2015 19:44:50 +0000 (12:44 -0700)]
checkpatch: fix --fix use with a patch of multiple files
If a patch touches multiple files, the --fix and --fix-inplace option
doesn't keep the proper line count and makes the new patch file not able
to be applied via bad offset line numbers when lines are added or deleted
by the --fix option.
Dunno how that extra backslash snuck in there.
Signed-off-by: Joe Perches <joe@perches.com> Cc: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Thu, 16 Apr 2015 19:44:47 +0000 (12:44 -0700)]
errno.h: Improve ENOSYS's comment
ENOSYS is the mechanism used by user code to detect whether the running
kernel implements a given system call. It should not be returned by
anything except an unimplemented system call.
Unfortunately, it is rather frequently used in the kernel to indicate that
various new functions of existing system calls are not implemented. This
should be discouraged.
Improve the comment in errno.h to help clarify ENOSYS's purpose.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Lutomirski [Thu, 16 Apr 2015 19:44:44 +0000 (12:44 -0700)]
checkpatch.pl: new instances of ENOSYS are errors
ENOSYS means that a nonexistent system call was called. We have a
bad habit of using it for things like invalid operations on
otherwise valid syscalls. We should avoid this in new code.
Pervasive incorrect usage of ENOSYS came up at the kernel summit ABI
review discussion. Let's see if checkpatch can help.
I'll submit a separate patch for include/uapi/asm-generic/errno.h.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sam Bobroff [Thu, 16 Apr 2015 19:44:39 +0000 (12:44 -0700)]
checkpatch: improve operator spacing check
Code such as:
x = timercmp(&now, &end, <);
Will currently trigger a checkpatch error. e.g.
ERROR: spaces required around that '<'
This is because the "Ignore operators passed as parameters" check looks
only for a comma following the operator. Improve the check by also
looking for a close parenthesis.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 16 Apr 2015 19:44:28 +0000 (12:44 -0700)]
checkpatch, SubmittingPatches: suggest line wrapping commit messages at 75 columns
Commit messages lines are sometimes overly long.
Suggest line wrapping at 75 columns so the default git commit log
indentation of 4 plus the commit message text still fits on an 80 column
screen.
Add a checkpatch test for long commit messages lines too.
Signed-off-by: Joe Perches <joe@perches.com> Cc: David Miller <davem@davemloft.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ian Morris <ipm@chirality.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fabian Frederick [Thu, 16 Apr 2015 19:44:25 +0000 (12:44 -0700)]
checkpatch: don't ask for asm/file.h to linux/file.h unconditionally
Currently checkpatch warns when asm/file.h is included and linux/file.h
exists. That conversion can be made when linux/file.h includes asm/file.h
which is not always the case.(See signal.h)
Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Andy Whitcroft <apw@canonical.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 16 Apr 2015 19:44:16 +0000 (12:44 -0700)]
checkpatch: match more world writable permissions
Currently checkpatch will fuss if one uses world writable settings in
debugfs files and DEVICE_ATTR uses by testing S_IWUGO but not testing
S_IWOTH, S_IRWXUGO or S_IALLUGO.
Extend the check to catch all cases exporting world writable permissions
including octal values.
[akpm@linux-foundation.org: remove stray $] Signed-off-by: Joe Perches <joe@perches.com> Original-patch-by: Nicholas Mc Guire <hofrat@osadl.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Thu, 16 Apr 2015 19:44:08 +0000 (12:44 -0700)]
checkpatch: add spell checking of email subject line
Only commit log and patch additions are checked for typos and spelling
errors currently. Add a check of the email subject line too.
Signed-off-by: Joe Perches <joe@perches.com> Suggested-by: Jani Nikula <jani.nikula@linux.intel.com> Tested-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicolas Iooss [Thu, 16 Apr 2015 19:44:02 +0000 (12:44 -0700)]
firmware/ihex2fw.c: restore missing default in switch statement
Commit 2473238eac95 ("ihex: add support for CS:IP/EIP records") removes
the "default:" statement in the switch block, making the "return
usage();" line dead code and ihex2fw silently ignoring unknown options.
Restore this statement.
This bug was found by building with HOSTCC=clang and adding
-Wunreachable-code-return to HOSTCFLAGS.
Fixes: 2473238eac95 ("ihex: add support for CS:IP/EIP records") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We have grown a number of different implementations of
DIV_ROUND_CLOSEST_ULL throughout the kernel. Move the i915 one to
kernel.h so that it can be reused.
Signed-off-by: Javi Merino <javi.merino@arm.com> Reviewed-by: Jeff Epler <jepler@unpythonic.net> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: Guenter Roeck <linux@roeck-us.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Alex Elder <elder@linaro.org> Cc: Antti Palosaari <crope@iki.fi> Cc: Javi Merino <javi.merino@arm.com> Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Turquette <mturquette@linaro.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I hadn't had enough coffee when I wrote this. Currently, the final
increment of buf depends on the value loaded from the table, and
causes gcc to emit a cmov immediately before the return. It is smarter
to let it depend on r, since the increment can then be computed in
parallel with the final load/store pair. It also shaves 16 bytes of
.text.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This series unduplicates the code used to find the member in an array
closest to 'x'.
The first patch adds a macro implementing the algorithm in two flavors -
for arrays sorted in ascending and descending order. The second updates
Documentation/CodingStyle on the naming convention for local variables in
macros resembling functions. Other three patches replace duplicated code
with calls to one of these macros in some hwmon drivers.
This patch (of 5):
Searching for the member of an array closest to 'x' is duplicated in
several places.
Add a new include - util_macros.h - and two macros that implement this
algorithm for arrays sorted both in ascending and descending order.
Uses linear search.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Ott [Thu, 16 Apr 2015 19:43:25 +0000 (12:43 -0700)]
lib/dma-debug: fix bucket_find_contain()
bucket_find_contain() will search the bucket list for a dma_debug_entry.
When the entry isn't found it needs to search other buckets too, since
only the start address of a dma range is hashed (which might be in a
different bucket).
A copy of the dma_debug_entry is used to get the previous hash bucket
but when its list is searched the original dma_debug_entry is to be used
not its modified copy.
This fixes false "device driver tries to sync DMA memory it has not allocated"
warnings.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Horia Geanta <horia.geanta@freescale.com> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lib/vsprintf.c: even faster binary to decimal conversion
The most expensive part of decimal conversion is the divisions by 10
(albeit done using reciprocal multiplication with appropriately chosen
constants). I decided to see if one could eliminate around half of
these multiplications by emitting two digits at a time, at the cost of a
200 byte lookup table, and it does indeed seem like there is something
to be gained, especially on 64 bits. Microbenchmarking shows
improvements ranging from -50% (for numbers uniformly distributed in [0,
2^64-1]) to -25% (for numbers heavily biased toward the smaller end, a
more realistic distribution).
On a larger scale, perf shows that top, one of the big consumers of /proc
data, uses 0.5-1.0% fewer cpu cycles.
I had to jump through some hoops to get the 32 bit code to compile and run
on my 64 bit machine, so I'm not sure how relevant these numbers are, but
just for comparison the microbenchmark showed improvements between -30%
and -10%.
The bloat-o-meter costs are around 150 bytes (the generated code is a
little smaller, so it's not the full 200 bytes) on both 32 and 64 bit.
I'm aware that extra cache misses won't show up in a microbenchmark as
used above, but on the other hand decimal conversions often happen in bulk
(for example in the case of top).
I have of course tested that the new code generates the same output as the
old, for both the first and last 1e10 numbers in [0,2^64-1] and 4e9
'random' numbers in-between.
Test and verification code on github: https://github.com/Villemoes/dec.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Tested-by: Jeff Epler <jepler@unpythonic.net> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently all 'find_*_bit' family is located in lib/find_next_bit.c,
except 'find_last_bit', which is in lib/find_last_bit.c. It seems,
there's no major benefit to have it separated.
Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: George Spelvin <linux@horizon.com> Cc: Alexey Klimov <klimov.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mark Salter <msalter@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset does rework to find_bit function family to achieve better
performance, and decrease size of text. All rework is done in patch 1.
Patches 2 and 3 are about code moving and renaming.
It was boot-tested on x86_64 and MIPS (big-endian) machines.
Performance tests were ran on userspace with code like this:
/* addr[] is filled from /dev/urandom */
start = clock();
while (ret < nbits)
ret = find_next_bit(addr, nbits, ret + 1);
end = clock();
printf("%ld\t", (unsigned long) end - start);
On Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz measurements are: (for
find_next_bit, nbits is 8M, for find_first_bit - 80K)
find_next_bit performance gain is 35-40%;
find_first_bit - no measurable difference.
On ARM machine, there is arch-specific implementation for find_bit.
Thanks a lot to George Spelvin and Rasmus Villemoes for hints and
helpful discussions.
This patch (of 3):
New implementations takes less space in source file (see diffstat) and in
object. For me it's 710 vs 453 bytes of text. It also shows better
performance.
find_last_bit description fixed due to obvious typo.
[akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus] Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: George Spelvin <linux@horizon.com> Cc: Alexey Klimov <klimov.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mark Salter <msalter@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alpha: forward declare struct pt_regs in processor.h
Removal of exec domains uncovered this new warning. processor.h re-used
struct pt_regs from personality.h which is now gone.
./arch/alpha/include/asm/processor.h:47:33: warning: 'struct pt_regs' declared inside parameter list [enabled by default]
Signed-off-by: Richard Weinberger <richard@nod.at> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 9c521a200bc3 ("crypto: api - remove instance when test failed")
tried to grab a module reference count before the module was even set.
Worse, it then goes on to free the module reference count after it is
set so you quickly end up with a negative module reference count which
prevents people from using any instances belonging to that module.
This patch moves the module initialisation before the reference
count.
* akpm: (114 commits)
parisc: remove use of seq_printf return value
lru_cache: remove use of seq_printf return value
tracing: remove use of seq_printf return value
cgroup: remove use of seq_printf return value
proc: remove use of seq_printf return value
s390: remove use of seq_printf return value
cris fasttimer: remove use of seq_printf return value
cris: remove use of seq_printf return value
openrisc: remove use of seq_printf return value
ARM: plat-pxa: remove use of seq_printf return value
nios2: cpuinfo: remove use of seq_printf return value
microblaze: mb: remove use of seq_printf return value
ipc: remove use of seq_printf return value
rtc: remove use of seq_printf return value
power: wakeup: remove use of seq_printf return value
x86: mtrr: if: remove use of seq_printf return value
linux/bitmap.h: improve BITMAP_{LAST,FIRST}_WORD_MASK
MAINTAINERS: CREDITS: remove Stefano Brivio from B43
.mailmap: add Ricardo Ribalda
CREDITS: add Ricardo Ribalda Delgado
...
Joe Perches [Wed, 15 Apr 2015 23:18:22 +0000 (16:18 -0700)]
tracing: remove use of seq_printf return value
The seq_printf return value, because it's frequently misused,
will eventually be converted to void.
See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
seq_has_overflowed() and make public")
Miscellanea:
o Remove unused return value from trace_lookup_stack
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Wed, 15 Apr 2015 23:18:02 +0000 (16:18 -0700)]
ARM: plat-pxa: remove use of seq_printf return value
The seq_printf return value, because it's frequently misused,
(as it is here, it doesn't return # of chars emitted) will
eventually be converted to void.
See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
seq_has_overflowed() and make public")
Signed-off-by: Joe Perches <joe@perches.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Wed, 15 Apr 2015 23:17:48 +0000 (16:17 -0700)]
power: wakeup: remove use of seq_printf return value
The seq_printf return value, because it's frequently misused,
will eventually be converted to void.
See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
seq_has_overflowed() and make public")
Signed-off-by: Joe Perches <joe@perches.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Len Brown <len.brown@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The macro BITMAP_LAST_WORD_MASK can be implemented without a conditional,
which will generally lead to slightly better generated code (221 bytes
saved for allmodconfig-GCOV_KERNEL, ~2k with GCOV_KERNEL). As a small
bonus, this also ensures that the nbits parameter is expanded exactly
once.
In BITMAP_FIRST_WORD_MASK, if start is signed gcc is technically allowed
to assume it is positive (or divisible by BITS_PER_LONG), and hence just
do the simple mask. It doesn't seem to use this, and even on an
architecture like x86 where the shift only depends on the lower 5 or 6
bits, and these bits are not affected by the signedness of the expression,
gcc still generates code to compute the C99 mandated value of start %
BITS_PER_LONG. So just use a mask explicitly, also for consistency with
BITMAP_LAST_WORD_MASK.