Bob Peterson [Mon, 5 Mar 2012 15:19:35 +0000 (10:19 -0500)]
GFS2: make sure rgrps are up to date in func gfs2_blk2rgrpd
This patch adds a call to gfs2_rindex_update from function gfs2_blk2rgrpd
and removes calls to it that are made redundant by it. The problem is
that a gfs2_grow can add rgrps to the rindex, then put those rgrps into
use, thus rendering the rindex we read in at mount time incomplete.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Mon, 5 Mar 2012 14:20:59 +0000 (09:20 -0500)]
GFS2: Eliminate sd_rindex_mutex
Over time, we've slowly eliminated the use of sd_rindex_mutex.
Up to this point, it was only used in two places: function
gfs2_ri_total (which totals the file system size by reading
and parsing the rindex file) and function gfs2_rindex_update
which updates the rgrps in memory. Both of these functions have
the rindex glock to protect them, so the rindex is unnecessary.
Since gfs2_grow writes to the rindex via the meta_fs, the mutex
is in the wrong order according to the normal rules. This patch
eliminates the mutex entirely to avoid the problem.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Tue, 14 Feb 2012 19:49:57 +0000 (14:49 -0500)]
GFS2: Sort the ordered write list
This patch sorts the ordered write list for GFS2 writes.
This increases the throughput for simultaneous writes.
For example, if you have ten processes, all doing:
dd if=/dev/zero of=/mnt/gfs2/fileX
on different files, the throughput will be much better.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The FITRIM ioctl provides an alternative way to send discard requests to
the underlying device. Using the discard mount option results in every
freed block generating a discard request to the block device. This can
be slow, since many block devices can only process discard requests of
larger sizes, and also such operations can be time consuming.
Rather than using the discard mount option, FITRIM allows a sweep of the
filesystem on an occasional basis, and also to optionally avoid sending
down discard requests for smaller regions.
In GFS2 FITRIM will work at resource group granularity. There is a flag
for each resource group which keeps track of which resource groups have
been trimmed. This flag is reset whenever a deallocation occurs in the
resource group, and set whenever a successful FITRIM of that resource
group has taken place. This helps to reduce repeated discard requests
for the same block ranges, again improving performance.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
The stats are divided into two sets: those relating to the
super block and those relating to an individual glock. The
super block stats are done on a per cpu basis in order to
try and reduce the overhead of gathering them. They are also
further divided by glock type.
In the case of both the super block and glock statistics,
the same information is gathered in each case. The super
block statistics are used to provide default values for
most of the glock statistics, so that newly created glocks
should have, as far as possible, a sensible starting point.
The statistics are divided into three pairs of mean and
variance, plus two counters. The mean/variance pairs are
smoothed exponential estimates and the algorithm used is
one which will be very familiar to those used to calculation
of round trip times in network code.
The three pairs of mean/variance measure the following
things:
1. DLM lock time (non-blocking requests)
2. DLM lock time (blocking requests)
3. Inter-request time (again to the DLM)
A non-blocking request is one which will complete right
away, whatever the state of the DLM lock in question. That
currently means any requests when (a) the current state of
the lock is exclusive (b) the requested state is either null
or unlocked or (c) the "try lock" flag is set. A blocking
request covers all the other lock requests.
There are two counters. The first is there primarily to show
how many lock requests have been made, and thus how much data
has gone into the mean/variance calculations. The other counter
is counting queueing of holders at the top layer of the glock
code. Hopefully that number will be a lot larger than the number
of dlm lock requests issued.
So why gather these statistics? There are several reasons
we'd like to get a better idea of these timings:
1. To be able to better set the glock "min hold time"
2. To spot performance issues more easily
3. To improve the algorithm for selecting resource groups for
allocation (to base it on lock wait time, rather than blindly
using a "try lock")
Due to the smoothing action of the updates, a step change in
some input quantity being sampled will only fully be taken
into account after 8 samples (or 4 for the variance) and this
needs to be carefully considered when interpreting the
results.
Knowing both the time it takes a lock request to complete and
the average time between lock requests for a glock means we
can compute the total percentage of the time for which the
node is able to use a glock vs. time that the rest of the
cluster has its share. That will be very useful when setting
the lock min hold time.
The other point to remember is that all times are in
nanoseconds. Great care has been taken to ensure that we
measure exactly the quantities that we want, as accurately
as possible. There are always inaccuracies in any
measuring system, but I hope this is as accurate as we
can reasonably make it.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: Read resource groups on mount
GFS2: Ensure rindex is uptodate for fallocate
GFS2: Read in rindex if necessary during unlink
GFS2: Fix race between lru_list and glock ref count
Linus Torvalds [Tue, 28 Feb 2012 17:15:31 +0000 (09:15 -0800)]
Merge tag 'iommu-fixes-v3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
IOMMU fixes for Linux 3.3-rc5
All the fixes are for the OMAP IOMMU driver. The first patch is the
biggest one. It fixes the calls of the function omap_find_iovm_area() in
the omap-iommu-debug module which expects a 'struct device' parameter
since commit fabdbca instead of an omap_iommu handle. The
omap-iommu-debug code still passed the handle to the function which
caused a crash.
The second patch fixes a NULL pointer dereference in the OMAP code and
the third patch makes sure that the omap-iommu is initialized before the
omap-isp driver, which relies on the iommu. The last patch is only a
workaround until defered probing is implemented.
* tag 'iommu-fixes-v3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
ARM: OMAP: make iommu subsys_initcall to fix builtin omap3isp
iommu/omap: fix NULL pointer dereference
iommu/omap: fix erroneous omap-iommu-debug API calls
This makes mount take slightly longer, but at the same time, the first
write to the filesystem will be faster too. It also means that if there
is a problem in the resource index, then we can refuse to mount rather
than having to try and report that when the first write occurs.
In addition, to avoid recursive locking, we hvae to take account of
instances when the rindex glock may already be held when we are
trying to update the rbtree of resource groups.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Fri, 17 Feb 2012 14:15:52 +0000 (09:15 -0500)]
GFS2: Ensure rindex is uptodate for fallocate
This patch fixes a problem whereby gfs2_grow was failing and causing GFS2
to assert. The problem was that when GFS2's fallocate operation tried to
acquire an "allocation" it made sure the rindex was up to date, and if not,
it called gfs2_rindex_update. However, if the file being fallocated was
the rindex itself, it was already locked at that point. By calling
gfs2_rindex_update at an earlier point in time, we bring rindex up to date
and thereby avoid trying to lock it when the "allocation" is acquired.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Bob Peterson [Thu, 16 Feb 2012 16:31:04 +0000 (11:31 -0500)]
GFS2: Read in rindex if necessary during unlink
This patch fixes a problem whereby you were unable to delete
files until other file system operations were done (such as
statfs, touch, writes, etc.) that caused the rindex to be
read in.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Linus Torvalds [Mon, 27 Feb 2012 23:43:05 +0000 (15:43 -0800)]
Merge tag 'ktest-fix-make-min-failed-build-for-real' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
While demoing ktest at ELC in 2012, it was embarrassing that the
make_min_config test failed to work because the snowball board I was
testing it against had a config that would not build. But the
make_min_config only tested the testing part and ignored build failures.
The end result was a config file that would not boot.
This time, for real.
* tag 'ktest-fix-make-min-failed-build-for-real' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Fix make_min_config test when build fails
Steven Rostedt [Mon, 27 Feb 2012 18:58:49 +0000 (13:58 -0500)]
ktest: Fix make_min_config test when build fails
The make_min_config does not take into account when the build fails,
resulting in a invalid MIN_CONFIG .config file. When the build fails,
it is ignored and the boot test is executed, using the previous built
kernel. The configs that should be tested are not tested and they may
be added or removed depending on the result of the last kernel that
succeeded to be built.
If the build fails, mark the current config as a failure and the
configs that were disabled may still be needed.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Here are some trivial NTFS changes (a spelling fix and two use before
NULL check cases found by Coverity as well as an update in MAINTAINERS
for the path to the ntfs git repo) together with a simple LDM fix for
parsing fragmented VBLKs.
* git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs:
NTFS: Update git repo path in MAINTAINERS file.
LDM: Fix reassembly of extended VBLKs.
NTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c.
NTFS: Do not dereference pointer before checking for NULL.
NTFS: Remove unused variable.
Linus Torvalds [Mon, 27 Feb 2012 15:55:51 +0000 (07:55 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce/AMD: Fix UP build error
x86: Specify a size for the cmp in the NMI handler
x86/nmi: Test saved %cs in NMI to determine nested NMI case
x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors
x86/microcode: Remove noisy AMD microcode warning
Linus Torvalds [Mon, 27 Feb 2012 15:54:57 +0000 (07:54 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Handle pending irqs in irq_startup()
genirq: Unmask oneshot irqs when thread was not woken
Linus Torvalds [Mon, 27 Feb 2012 05:03:16 +0000 (21:03 -0800)]
Merge tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Two fixes to fix a memory corruption bug when WC pages never get
converted back to WB but end up being recycled in the general memory
pool as WC.
There is a better way of fixing this, but there is not enough time to do
the full benchmarking to pick one of the right options - so picking the
one that favors stability for right now.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pat: Disable PAT support for now.
xen/setup: Remove redundant filtering of PTE masks.
Andreas Bießmann [Fri, 24 Feb 2012 07:23:53 +0000 (08:23 +0100)]
mod/file2alias: make modpost compile on darwin again
commit e49ce14150c64b29a8dd211df785576fa19a9858 breaks cross compiling
the linux kernel on darwin hosts.
This fix introduce some minimal glue to adopt linker section handling
for darwin hosts.
Signed-off-by: Andreas Bießmann <andreas@biessmann.de> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Jochen Friedrich <jochen@scram.de> CC: Samuel Ortiz <sameo@linux.intel.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Bernhard Walle <bernhard@bwalle.de>
1) ICMP sockets leave err uninitialized but we try to return it for the
unsupported MSG_OOB case, reported by Dave Jones.
2) Add new Zaurus device ID entries, from Dave Jones.
3) Pointer calculation in hso driver memset is wrong, from Dan
Carpenter.
4) ks8851_probe() checks unsigned value as negative, fix also from Dan
Carpenter.
5) Fix crashes in atl1c driver due to TX queue handling, from Eric
Dumazet. I anticipate some TX side locking fixes coming in the near
future for this driver as well.
6) The inline directive fix in Bluetooth which was breaking the build
only with very new versions of GCC, from Johan Hedberg.
7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge
window, reported by Meelis Roos and fixed by Eric Dumazet.
9) Some ip6_route_output() callers test the return value for NULL, but
this never happens as the convention is to return a dst entry with
dst->error set. Fixes from RonQing Li.
10) Logitech Harmony 900 should be handled by zaurus driver not
cdc_ether, update white lists and black lists accordingly. From
Scott Talbert.
11) Receiving from certain kinds of devices there won't be a MAC header,
so there is no MAC header to fixup in the IPSEC code, and if we try
to do it we'll crash. Fix from Eric Dumazet.
12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny
Petrilin.
13) Fix regression in link-down handling in davinci_emac which causes
all RX descriptors to be freed up and therefore RX to wedge
completely, from Christian Riesch.
14) It took two attempts, but ctnetlink soft lockups seem to be
cured now, from Pablo Neira Ayuso.
15) Endianness bug fix in ENIC driver, from Santosh Nayak.
16) The long ago conversion of the PPP fragmentation code over to
abstracted SKB list handling wasn't perfect, once we get an
out of sequence SKB we don't flush the rest of them like we
should. From Ben McKeegan.
17) Fix regression of ->ip_summed initialization in sfc driver.
From Ben Hutchings.
18) Bluetooth timeout mistakenly using msecs instead of jiffies,
from Andrzej Kaczmarek.
19) Using _sync variant of work cancellation results in deadlocks,
use the non _sync variants instead. From Andre Guedes.
20) Bluetooth rfcomm code had reference counting problems leading
to crashes, fix from Octavian Purdila.
21) The conversion of netem over to classful qdisc handling added
two bugs to netem_dequeue(), fixes from Eric Dumazet.
22) Missing pci_iounmap() in ATM Solos driver. Fix from Julia Lawall.
23) b44_pci_exit() should not have __exit tag since it's invoked from
non-__exit code. From Nikola Pajkovsky.
24) The conversion of the neighbour hash tables over to RCU added a
race, fixed here by adding the necessary reread of tbl->nht, fix
from Michel Machado.
25) When we added VF (virtual function) attributes for network device
dumps, this potentially bloats up the size of the dump of one
network device such that the dump size is too large for the buffer
allocated by properly written netlink applications.
In particular, if you add 255 VFs to a network device, parts of
GLIBC stop working.
To fix this, we add an attribute that is used to turn on these
extended portions of the network device dump. Sophisticaed
applications like 'ip' that want to see this stuff will be changed
to set the attribute, whereas things like GLIBC that don't care
about VFs simply will not, and therefore won't be busted by the
mere presence of VFs on a network device.
Thanks to the tireless work of Greg Rose on this fix.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
sfc: Fix assignment of ip_summed for pre-allocated skbs
ppp: fix 'ppp_mp_reconstruct bad seq' errors
enic: Fix endianness bug.
gre: fix spelling in comments
netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
davinci_emac: Do not free all rx dma descriptors during init
mlx4_core: Fixing array indexes when setting port types
phy: IC+101G and PHY_HAS_INTERRUPT flag
netdev/phy/icplus: Correct broken phy_init code
ipsec: be careful of non existing mac headers
Move Logitech Harmony 900 from cdc_ether to zaurus
hso: memsetting wrong data in hso_get_count()
netfilter: ip6_route_output() never returns NULL.
ethernet/broadcom: ip6_route_output() never returns NULL.
ipv6: ip6_route_output() never returns NULL.
jme: Fix FIFO flush issue
atm: clip: remove clip_tbl
ipv4: ping: Fix recvmsg MSG_OOB error handling.
rtnetlink: Fix problem with buffer allocation
...
Linus Torvalds [Sun, 26 Feb 2012 17:44:55 +0000 (09:44 -0800)]
Fix autofs compile without CONFIG_COMPAT
The autofs compat handling fix caused a compile failure when
CONFIG_COMPAT isn't defined.
Instead of adding random #ifdef'fery in autofs, let's just make the
compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
just hardcodes to zero.
We could probably do something similar for a number of other cases where
we have #ifdef's in code, but this is the low-hanging fruit.
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 25 Feb 2012 20:12:08 +0000 (12:12 -0800)]
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Couple of minor driver fixes.
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (max34440) Fix resetting temperature history
hwmon: (f75375s) Fix register write order when setting fans to full speed
hwmon: (ads1015) Fix file leak in probe function
hwmon: (max6639) Fix PPR register initialization to set both channels
hwmon: (max6639) Fix FAN_FROM_REG calculation
Linus Torvalds [Sat, 25 Feb 2012 20:11:25 +0000 (12:11 -0800)]
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
three kbuild fixes for 3.3:
- make deb-pkg symlink race fix.
- make coccicheck fix.
- Dropping the check for modutils. This is not a regression, but
allows the module-init-tools replacement kmod work with the 3.3
kernel.
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
coccicheck: change handling of C={1,2} when M= is set
builddeb: Don't create files in /tmp with predictable names
kbuild: do not check for ancient modutils tools
Ian Kent [Wed, 22 Feb 2012 12:45:44 +0000 (20:45 +0800)]
autofs: work around unhappy compat problem on x86-64
When the autofs protocol version 5 packet type was added in commit 5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Hutchings [Sat, 25 Feb 2012 00:03:10 +0000 (00:03 +0000)]
sfc: Fix assignment of ip_summed for pre-allocated skbs
When pre-allocating skbs for received packets, we set ip_summed =
CHECKSUM_UNNCESSARY. We used to change it back to CHECKSUM_NONE when
the received packet had an incorrect checksum or unhandled protocol.
Commit bc8acf2c8c3e43fcc192762a9f964b3e9a17748b ('drivers/net: avoid
some skb->ip_summed initializations') mistakenly replaced the latter
assignment with a DEBUG-only assertion that ip_summed ==
CHECKSUM_NONE. This assertion is always false, but it seems no-one
has exercised this code path in a DEBUG build.
Fix this by moving our assignment of CHECKSUM_UNNECESSARY into
efx_rx_packet_gro().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Linus Torvalds [Sat, 25 Feb 2012 00:08:51 +0000 (16:08 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
SCSI fixes on 20120224:
"This is a set of assorted bug fixes for power management, mpt2sas,
ipr, the rdac device handler and quite a big chunk for qla2xxx (plus a
use after free of scsi_host in scsi_scan.c). "
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] scsi_dh_rdac: Fix for unbalanced reference count
[SCSI] scsi_pm: Fix bug in the SCSI power management handler
[SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost'
[SCSI] qla2xxx: Update version number to 8.03.07.13-k.
[SCSI] qla2xxx: Proper detection of firmware abort error code for ISP82xx.
[SCSI] qla2xxx: Remove resetting memory during device initialization for ISP82xx.
[SCSI] qla2xxx: Complete mailbox command timedout to avoid initialization failures during next reset cycle.
[SCSI] qla2xxx: Remove check for null fcport from host reset handler.
[SCSI] qla2xxx: Correct out of bounds read of ISP2200 mailbox registers.
[SCSI] qla2xxx: Remove errant clearing of MBX_INTERRUPT flag during CT-IOCB processing.
[SCSI] qla2xxx: Clear options-flags while issuing stop-firmware mbx command.
[SCSI] qla2xxx: Add an "is reset active" helper.
[SCSI] qla2xxx: Add check for null fcport references in qla2xxx_queuecommand.
[SCSI] qla2xxx: Propagate up abort failures.
[SCSI] isci: Fix NULL ptr dereference when no firmware is being loaded
[SCSI] ipr: fix eeh recovery for 64-bit adapters
[SCSI] mpt2sas: Fix mismatch in mpt2sas_base_hard_reset_handler() mutex lock-unlock
Ben McKeegan [Fri, 24 Feb 2012 06:33:56 +0000 (06:33 +0000)]
ppp: fix 'ppp_mp_reconstruct bad seq' errors
This patch fixes a (mostly cosmetic) bug introduced by the patch
'ppp: Use SKB queue abstraction interfaces in fragment processing'
found here: http://www.spinics.net/lists/netdev/msg153312.html
The above patch rewrote and moved the code responsible for cleaning
up discarded fragments but the new code does not catch every case
where this is necessary. This results in some discarded fragments
remaining in the queue, and triggering a 'bad seq' error on the
subsequent call to ppp_mp_reconstruct. Fragments are discarded
whenever other fragments of the same frame have been lost.
This can generate a lot of unwanted and misleading log messages.
This patch also adds additional detail to the debug logging to
make it clearer which fragments were lost and which other fragments
were discarded as a result of losses. (Run pppd with 'kdebug 1'
option to enable debug logging.)
Signed-off-by: Ben McKeegan <ben@netservers.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 24 Feb 2012 20:32:51 +0000 (12:32 -0800)]
Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] hdpvr: update picture controls to support firmware versions > 0.15
[media] wl128x: fix build errors when GPIOLIB is not enabled
[media] hdpvr: fix race conditon during start of streaming
[media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes
[media] imon: don't wedge hardware after early callbacks
Oleg Nesterov [Fri, 24 Feb 2012 19:07:29 +0000 (20:07 +0100)]
epoll: ep_unregister_pollwait() can use the freed pwq->whead
signalfd_cleanup() ensures that ->signalfd_wqh is not used, but
this is not enough. eppoll_entry->whead still points to the memory
we are going to free, ep_unregister_pollwait()->remove_wait_queue()
is obviously unsafe.
Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL,
change ep_unregister_pollwait() to check pwq->whead != NULL under
rcu_read_lock() before remove_wait_queue(). We add the new helper,
ep_remove_wait_queue(), for this.
This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because
->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand.
ep_unregister_pollwait()->remove_wait_queue() can play with already
freed and potentially reused ->sighand, but this is fine. This memory
must have the valid ->signalfd_wqh until rcu_read_unlock().
Oleg Nesterov [Fri, 24 Feb 2012 19:07:11 +0000 (20:07 +0100)]
epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.
epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.
This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.
__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.
ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.
The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.
In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.
Note:
- we do not have wake_up_all_poll() but wake_up_poll()
is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.
- signalfd_cleanup() uses POLLHUP along with POLLFREE,
we need a couple of simple changes in eventpoll.c to
make sure it can't be "lost".
Linus Torvalds [Fri, 24 Feb 2012 17:02:53 +0000 (09:02 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Quoth Chris:
"This is later than I wanted because I got backed up running through
btrfs bugs from the Oracle QA teams. But they are all bug fixes that
we've queued and tested since rc1.
Nothing in particular stands out, this just reflects bug fixing and QA
done in parallel by all the btrfs developers. The most user visible
of these is:
Btrfs: clear the extent uptodate bits during parent transid failures
Because that helps deal with out of date drives (say an iscsi disk
that has gone away and come back). The old code wasn't always
properly retrying the other mirror for this type of failure."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (24 commits)
Btrfs: fix compiler warnings on 32 bit systems
Btrfs: increase the global block reserve estimates
Btrfs: clear the extent uptodate bits during parent transid failures
Btrfs: add extra sanity checks on the path names in btrfs_mksubvol
Btrfs: make sure we update latest_bdev
Btrfs: improve error handling for btrfs_insert_dir_item callers
Btrfs: be less strict on finding next node in clear_extent_bit
Btrfs: fix a bug on overcommit stuff
Btrfs: kick out redundant stuff in convert_extent_bit
Btrfs: skip states when they does not contain bits to clear
Btrfs: check return value of lookup_extent_mapping() correctly
Btrfs: fix deadlock on page lock when doing auto-defragment
Btrfs: fix return value check of extent_io_ops
btrfs: honor umask when creating subvol root
btrfs: silence warning in raid array setup
btrfs: fix structs where bitfields and spinlock/atomic share 8B word
btrfs: delalloc for page dirtied out-of-band in fixup worker
Btrfs: fix memory leak in load_free_space_cache()
btrfs: don't check DUP chunks twice
Btrfs: fix trim 0 bytes after a device delete
...
Linus Torvalds [Fri, 24 Feb 2012 17:01:46 +0000 (09:01 -0800)]
Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming
This is the arch/c6x part of commit 7c43185138cf ("Kbuild: Use dtc's -d
(dependency) option") which was dropped because c6x had not yet been
merged at the time.
* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
Kbuild: Use dtc's -d (dependency) option
Kyle McMartin [Fri, 24 Feb 2012 15:36:16 +0000 (10:36 -0500)]
MAINTAINERS: drop me from PA-RISC maintenance
I don't even live in the same country as any of my PA-RISC hardware
these days, so the odds of me touching the code are pretty low.
(Also re-order things to ensure jejb gets CC'd since he's been the
primary maintainer for the last few years.)
David Howells [Thu, 23 Feb 2012 13:50:35 +0000 (13:50 +0000)]
NOMMU: Lock i_mmap_mutex for access to the VMA prio list
Lock i_mmap_mutex for access to the VMA prio list to prevent concurrent
access. Currently, certain parts of the mmap handling are protected by
the region mutex, but not all.
Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 24 Feb 2012 16:56:51 +0000 (08:56 -0800)]
Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh
SuperH fixes for 3.3-rc5
* tag 'sh-for-linus' of git://github.com/pmundt/linux-sh:
sh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH
sh: modify a resource of sh_eth_giga1_resources in board-sh7757lcr
arch/sh: remove references to cpu_*_map.
sh: Fix typo in pci-sh7780.c
sh: add platform_device for SPI1 in setup-sh7757
sh: modify resource for SPI0 in setup-sh7757
sh: se7724: fix compile breakage
sh: clkfwk: bugfix: use clk_reparent() for div6 clocks
sh: clock-sh7724: fixup sh_fsi clock settings
sh: sh7757lcr: update to the new MMCIF DMA configuration
sh: fix the sh_mmcif_plat_data in board-sh7757lcr
video: pvr2fb: Fix up spurious section mismatch warnings.
sh: Defer to asm-generic/device.h.
Anton Vorontsov [Fri, 24 Feb 2012 01:14:46 +0000 (05:14 +0400)]
mm: memcg: Correct unregistring of events attached to the same eventfd
There is an issue when memcg unregisters events that were attached to
the same eventfd:
- On the first call mem_cgroup_usage_unregister_event() removes all
events attached to a given eventfd, and if there were no events left,
thresholds->primary would become NULL;
- Since there were several events registered, cgroups core will call
mem_cgroup_usage_unregister_event() again, but now kernel will oops,
as the function doesn't expect that threshold->primary may be NULL.
That's a good question whether mem_cgroup_usage_unregister_event()
should actually remove all events in one go, but nowadays it can't
do any better as cftype->unregister_event callback doesn't pass
any private event-associated cookie. So, let's fix the issue by
simply checking for threshold->primary.
FWIW, w/o the patch the following oops may be observed:
Ohad Ben-Cohen [Wed, 22 Feb 2012 08:52:51 +0000 (10:52 +0200)]
iommu/omap: fix erroneous omap-iommu-debug API calls
Adapt omap-iommu-debug to the latest omap-iommu API changes, which
were introduced by commit fabdbca "iommu/omap: eliminate the public
omap_find_iommu_device() method".
In a nutshell, iommu users are not expected to provide the omap_iommu
handle anymore - instead, iommus are attached using their user's device
handle.
omap-iommu-debug is a hybrid beast though: it invokes both public and
private omap iommu API, so fix it as necessary (otherwise a crash
is imminent).
Note: omap-iommu-debug is a bit disturbing, as it fiddles with internal
omap iommu data and requires exposing API which is otherwise not needed.
It should better be more tightly coupled with omap-iommu, to prevent
further bit rot and avoid exposing redundant API. Naturally that's out
of scope for the -rc cycle, so for now just fix the obvious.
Reported-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Cc: Tony Lindgren <tony@atomide.com> Cc: Hiroshi Doyu <hdoyu@nvidia.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Joerg Roedel <Joerg.Roedel@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Jozsef Kadlecsik [Fri, 24 Feb 2012 10:45:49 +0000 (11:45 +0100)]
netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
Marcell Zambo and Janos Farago noticed and reported that when
new conntrack entries are added via netlink and the conntrack table
gets full, soft lockup happens. This is because the nf_conntrack_lock
is held while nf_conntrack_alloc is called, which is in turn wants
to lock nf_conntrack_lock while evicting entries from the full table.
The patch fixes the soft lockup with limiting the holding of the
nf_conntrack_lock to the minimum, where it's absolutely required.
It required to extend (and thus change) nf_conntrack_hash_insert
so that it makes sure conntrack and ctnetlink do not add the same entry
twice to the conntrack table.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Extended VBLKs (those larger than the preset VBLK size) are divided
into fragments, each with its own VBLK header. Our LDM implementation
generally assumes that each VBLK is contiguous in memory, so these
fragments must be assembled before further processing.
Currently the reassembly seems to be done quite wrongly - no VBLK
header is copied into the contiguous buffer, and the length of the
header is subtracted twice from each fragment. Also the total
length of the reassembled VBLK is calculated incorrectly.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
Said commit adds a check whether the carrier link is ok. If the link is
not ok, the skb is freed and no new dma descriptor added to the rx dma
channel. This causes trouble during initialization when the carrier
status has not yet been updated. If a lot of packets are received while
netif_carrier_ok returns false, all dma descriptors are freed and the
rx dma transfer is stopped.
The bug occurs when the board is connected to a network with lots of
traffic and the ifconfig down/up is done, e.g., when reconfiguring
the interface with DHCP.
The bug can be reproduced by flood pinging the davinci board while doing
ifconfig eth0 down
ifconfig eth0 up
on the board.
After that, the rx path stops working and the overrun value reported
by ifconfig is counting up.
Rusty Russell [Wed, 15 Feb 2012 04:58:04 +0000 (15:28 +1030)]
arch/sh: remove references to cpu_*_map.
This has been obsolescent for a while; time for the final push.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Paul Mundt <lethal@linux-sh.org> Cc: linux-sh@vger.kernel.org Signed-off-by: Paul Mundt <lethal@linux-sh.org>
With kernel 3.1, Christoph removed i_alloc_sem and replaced it with
calls (namely inode_dio_wait() and inode_dio_done()) which are
EXPORT_SYMBOL_GPL() thus they cannot be used by non-GPL file systems and
further inode_dio_wait() was pushed from notify_change() into the file
system ->setattr() method but no non-GPL file system can make this call.
That means non-GPL file systems cannot exist any more unless they do not
use any VFS functionality related to reading/writing as far as I can
tell or at least as long as they want to implement direct i/o.
Both Linus and Al (and others) have said on LKML that this breakage of
the VFS API should not have happened and that the change was simply
missed as it was not documented in the change logs of the patches that
did those changes.
This patch changes the two function exports in question to be
EXPORT_SYMBOL() thus restoring the VFS API as it used to be - accessible
for all modules.
Christoph, who introduced the two functions and exported them GPL-only
is CC-ed on this patch to give him the opportunity to object to the
symbols being changed in this manner if he did indeed intend them to be
GPL-only and does not want them to become available to all modules.
Signed-off-by: Anton Altaparmakov <anton@tuxera.com> CC: Christoph Hellwig <hch@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 23 Feb 2012 23:38:57 +0000 (15:38 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
A fix from Jesper Juhl removes an assignment in an ASSERT when a compare
is intended. Two fixes from Mitsuo Hayasaka address off-by-ones in XFS
quota enforcement.
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: make inode quota check more general
xfs: change available ranges of softlimit and hardlimit in quota check
XFS: xfs_trans_add_item() - don't assign in ASSERT() when compare is intended
This patch adds the PHY_HAS_INTERRUPT flag for IC+101 device series.
Also the patch does a simple dity-up to signal that
the driver actually is for IP101A LF and IP101G devices.
In fact, these are two similar PHYs that have the same IDs
and mainly differ for the EEE capability supported in the
G series.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David McKay [Tue, 21 Feb 2012 21:24:57 +0000 (21:24 +0000)]
netdev/phy/icplus: Correct broken phy_init code
The code for ip1001_config_init() was totally broken if you were not
using RGMII. Instead of returning an error code or zero it actually
returned the value in the IP1001_SPEC_CTRL_STATUS_2 register. It was
also trying to set the IP1001_APS_ON bit , but never actually wrote
back the register.
The error checking was also incorrect in both this function and the
reset function, so this patch fixes that up in a consistent fashion.
Signed-off-by: David McKay <david.mckay@st.com> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Niccolò Belli <darkbasic@linuxsystems.it> Tested-by: Niccolò Belli <darkbasic@linuxsystems.it> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 23 Feb 2012 19:48:36 +0000 (11:48 -0800)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
BenH says:
'Here are a few more powerpc bits for you. A stupid regression I
introduced with my previous commit to "fix" program check exceptions
(brown paper bag for me), fix the cpuidle default, a bug fix for
something that isn't strictly speaking a regression but some upstream
changes causes it to show in lockdep now while it didn't before, and
finally a trivial one for rusty to make his life easier later on
removing the old cpumask cruft. '
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix various issues with return to userspace
cpuidle: Default y on powerpc pSeries
powerpc: Fix program check handling when lockdep is enabled
powerpc: Remove references to cpu_*_map
Chris Mason [Wed, 22 Feb 2012 17:36:24 +0000 (12:36 -0500)]
Btrfs: clear the extent uptodate bits during parent transid failures
If btrfs reads a block and finds a parent transid mismatch, it clears
the uptodate flags on the extent buffer, and the pages inside it. But
we only clear the uptodate bits in the state tree if the block straddles
more than one page.
This is from an old optimization from to reduce contention on the extent
state tree. But it is buggy because the code that retries a read from
a different copy of the block is going to find the uptodate state bits
set and skip the IO.
The end result of the bug is that we'll never actually read the good
copy (if there is one).
The fix here is to always clear the uptodate state bits, which is safe
because this code is only called when the parent transid fails.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Nikolaus Schulz [Wed, 22 Feb 2012 22:18:44 +0000 (23:18 +0100)]
hwmon: (f75375s) Fix register write order when setting fans to full speed
By hwmon sysfs interface convention, setting pwm_enable to zero sets a fan
to full speed. In the f75375s driver, this need be done by enabling
manual fan control, plus duty mode for the F875387 chip, and then setting
the maximum duty cycle. Fix a bug where the two necessary register writes
were swapped, effectively discarding the setting to full-speed.
Guenter Roeck [Wed, 22 Feb 2012 16:13:52 +0000 (08:13 -0800)]
hwmon: (ads1015) Fix file leak in probe function
An error while creating sysfs attribute files in the driver's probe function
results in an error abort, but already created files are not removed. This patch
fixes the problem.
Doug Ledford [Mon, 20 Feb 2012 17:19:03 +0000 (12:19 -0500)]
mlx4_core: Exported functions can't be static
At least on powerpc, it breaks the build if exported functions are
static. Fix some static exported functions introduced with the mlx4
SR-IOV support added in 3.3-rc1.
Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
Scott Talbert [Tue, 21 Feb 2012 13:06:00 +0000 (13:06 +0000)]
Move Logitech Harmony 900 from cdc_ether to zaurus
In the current kernel implementation, the Logitech Harmony 900 remote
control is matched to the cdc_ether driver through the generic
USB_CDC_SUBCLASS_MDLM entry. However, this device appears to be of the
pseudo-MDLM (Belcarra) type, rather than the standard one. This patch
blacklists the Harmony 900 from the cdc_ether driver and whitelists it for
the pseudo-MDLM driver in zaurus.
Signed-off-by: Scott Talbert <talbert@techie.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 21 Feb 2012 21:30:25 +0000 (21:30 +0000)]
hso: memsetting wrong data in hso_get_count()
The intent was to clear out the icount struct here, but we accidentally
clear stack memory instead. It probably will lead to a NULL dereference
right away.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 22 Feb 2012 19:58:30 +0000 (11:58 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Intel, radeon, exynos fixes.
Intel: fixes a few Ivybridge hangs, along with fixing RC6 on SNB (still
not on, but at least allows for distros to patch it on easily).
radeon: oops reading some files in debugfs that weren't meant to appear,
a fix that touches a lot of files, so looks worse than it is, it fixes
an oops if a GPU reset fails and userspace keeps submitting more data,
along with a minor BIOS fix for newer boards.
exynos: a group of fixes for exynos, they've sent me a few more but
these were all I got through, and its no hw vanilla kernel users see a
lot off yet.
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon/kms/atom: dpms bios scratch reg updates
drm/radeon/kms: properly set accel working flag and bailout when false
drm/radeon: Only create additional ring debugfs files on Cayman or newer.
drm/exynos: added postclose to release resource.
drm/exynos: removed exynos_drm_fbdev_recreate function.
drm/exynos: fixed page flip issue.
drm/exynos: added possible_clones setup function.
drm/exynos: removed pageflip_event_list init code when closed.
drm/exynos: changed priority of mixer layers.
drm/exynos: Fix typo in exynos_mixer.c
drm/i915: do not enable RC6p on Sandy Bridge
drm/i915: gen7: Disable the RHWO optimization as it can cause GPU hangs.
drm/i915: gen7: work around a system hang on IVB
drm/i915: gen7: Implement an L3 caching workaround.
drm/i915: gen7: implement rczunit workaround
Guo-Fu Tseng [Wed, 22 Feb 2012 08:58:10 +0000 (08:58 +0000)]
jme: Fix FIFO flush issue
Set the RX FIFO flush watermark lower.
According to Federico and JMicron's reply,
setting it to 16QW would be stable on most platforms.
Otherwise, user might experience packet drop issue.
CC: stable@kernel.org Reported-by: Federico Quagliata <federico@quagliata.org> Fixed-by: Federico Quagliata <federico@quagliata.org> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Moger, Babu [Thu, 2 Feb 2012 15:21:54 +0000 (15:21 +0000)]
[SCSI] scsi_dh_rdac: Fix for unbalanced reference count
This patch fixes an unbalanced refcount issue.
Elevating the lock for both kref_put and also for controller node deletion.
Previously, controller deletion was protected but the not the kref_put. This
was causing the other thread to pick up the controller structure which was
already kref'd zero.
This was causing the following WARN_ON and also sometimes panic.
Linus Torvalds [Wed, 22 Feb 2012 16:45:08 +0000 (08:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
It contains 3 important fixes for ColdFire based machines:
- fix processes getting stuck when running from strace
- fix kernel vmalloced pages not being visible in all kernel contexts
- fix shared user pages sometimes being visible in another process
context
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: Do not set global share for non-kernel shared pages
m68k: Add shared bit to Coldfire kernel page entries
m68knommu: fix syscall tracing stuck process
Linus Torvalds [Wed, 22 Feb 2012 16:43:35 +0000 (08:43 -0800)]
Merge tag 'nfs-for-3.3-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Bugfixes for the NFS client.
Fix a nasty Oops in the NFSv4 getacl code, another source of infinite
loops in the NFSv4 state recovery code, and a regression in NFSv4.1
session initialisation.
Also deal with an NFSv4.1 memory leak.
* tag 'nfs-for-3.3-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFSv4: fix server_scope memory leak
NFSv4.1: Fix a NFSv4.1 session initialisation regression
NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEID
NFSv4: Fix an Oops in the NFSv4 getacl code