First remove items from work_list as soon as we start working on them. This
means we don't have to track any pending or visited state and can get
rid of all the RCU magic freeing the work items - we can simply free
them once the operation has finished. Second use a real completion for
tracking synchronous requests - if the caller sets the completion pointer
we complete it, otherwise use it as a boolean indicator that we can free
the work item directly. Third unify struct wb_writeback_args and struct
bdi_work into a single data structure, wb_writeback_work. Previous we
set all parameters into a struct wb_writeback_args, copied it into
struct bdi_work, copied it again on the stack to use it there. Instead
of just allocate one structure dynamically or on the stack and use it
all the way through the stack.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The case where we have a superblock doesn't require a loop here as we scan
over all inodes in writeback_sb_inodes. Split it out into a separate helper
to make the code simpler. This also allows to get rid of the sb member in
struct writeback_control, which was rather out of place there.
Also update the comments in writeback_sb_inodes that explain the handling
of inodes from wrong superblocks.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
This was just an odd wrapper around writeback_inodes_wb. Removing this
also allows to get rid of the bdi member of struct writeback_control
which was rather out of place there.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Randy Dunlap [Thu, 1 Jul 2010 06:26:34 +0000 (08:26 +0200)]
fs-writeback: fix kernel-doc warnings
Fix kernel-doc to match the function's changed args.
Warning(fs/fs-writeback.c:190): No description found for parameter 'args'
Warning(fs/fs-writeback.c:190): Excess function parameter 'sb' description in 'bdi_queue_work_onstack'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Linus Torvalds [Tue, 29 Jun 2010 17:42:52 +0000 (10:42 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: Don't count_vm_events for discard bio in submit_bio.
cfq: fix recursive call in cfq_blkiocg_update_completion_stats()
cfq-iosched: Fixed boot warning with BLK_CGROUP=y and CFQ_GROUP_IOSCHED=n
cfq: Don't allow queue merges for queues that have no process references
block: fix DISCARD_BARRIER requests
cciss: set SCSI max cmd len to 16, as default is wrong
cpqarray: fix two more wrong section type
cpqarray: fix wrong __init type on pci probe function
drbd: Fixed a race between disk-attach and unexpected state changes
writeback: fix pin_sb_for_writeback
writeback: add missing requeue_io in writeback_inodes_wb
writeback: simplify and split bdi_start_writeback
writeback: simplify wakeup_flusher_threads
writeback: fix writeback_inodes_wb from writeback_inodes_sb
writeback: enforce s_umount locking in writeback_inodes_sb
writeback: queue work on stack in writeback_inodes_sb
writeback: fix writeback completion notifications
npiggin@suse.de [Thu, 24 Jun 2010 03:02:14 +0000 (13:02 +1000)]
fs: fix superblock iteration race
list_for_each_entry_safe is not suitable to protect against concurrent
modification of the list. 6754af6 introduced a race in sb walking.
list_for_each_entry can use the trick of pinning the current entry in
the list before we drop and retake the lock because it subsequently
follows cur->next. However list_for_each_entry_safe saves n=cur->next
for following before entering the loop body, so when the lock is
dropped, n may be deleted.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Christoph Hellwig <hch@infradead.org> Cc: John Stultz <johnstul@us.ibm.com> Cc: Frank Mayhar <fmayhar@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 28 Jun 2010 19:06:25 +0000 (12:06 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, Calgary: Increase max PHB number
x86: Fix rebooting on Dell Precision WorkStation T7400
x86: Fix vsyscall on gcc 4.5 with -Os
x86, pat: Proper init of memtype subtree_max_end
um, hweight: Fix UML boot crash due to x86 optimized hweight
x86, setup: Set ax register in boot vga query
percpu, x86: Avoid warnings of unused variables in per cpu
x86, irq: Rename gsi_end gsi_top, and fix off by one errors
x86: use __ASSEMBLY__ rather than __ASSEMBLER__
Linus Torvalds [Mon, 28 Jun 2010 19:06:00 +0000 (12:06 -0700)]
Merge branch 'fixes' of ssh://master.kernel.org/~sfr/next-fixes
* 'fixes' of ssh://master.kernel.org/~sfr/next-fixes:
acpi: update gfp/slab.h includes
ocfs2: update gfp/slab.h includes
davinci: update gfp/slab.h includes
arm: update gfp/slab.h includes
v4l-dvb: update gfp/slab.h includes
Linus Torvalds [Mon, 28 Jun 2010 05:56:32 +0000 (22:56 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md/raid5: don't include 'spare' drives when reshaping to fewer devices.
md/raid5: add a missing 'continue' in a loop.
md/raid5: Allow recovered part of partially recovered devices to be in-sync
md/raid5: More careful check for "has array failed".
md: Don't update ->recovery_offset when reshaping an array to fewer devices.
md/raid5: avoid oops when number of devices is reduced then increased.
md: enable raid4->raid0 takeover
md: clear layout after ->raid0 takeover
md: fix raid10 takeover: use new_layout for setup_conf
md: fix handling of array level takeover that re-arranges devices.
md: raid10: Fix null pointer dereference in fix_read_error()
Restore partition detection of newly created md arrays.
Tejun Heo [Mon, 29 Mar 2010 17:52:44 +0000 (02:52 +0900)]
acpi: update gfp/slab.h includes
Implicit slab.h inclusion via percpu.h is about to go away. Make sure
gfp.h or slab.h is included as necessary.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tejun Heo [Mon, 29 Mar 2010 17:52:45 +0000 (02:52 +0900)]
arm: update gfp/slab.h includes
Implicit slab.h inclusion via percpu.h is about to go away. Make sure
gfp.h or slab.h is included as necessary.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Linus Torvalds [Sun, 27 Jun 2010 18:33:44 +0000 (11:33 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: fix first chunk match in per_cpu_ptr_to_phys()
percpu: fix trivial bugs in pcpu_build_alloc_info()
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits)
phylib: Add autoload support for the LXT973 phy.
ISDN: hysdn, fix potential NULL dereference
vxge: fix memory leak in vxge_alloc_msix() error path
isdn/gigaset: correct CAPI connection state storage
isdn/gigaset: encode HLC and BC together
isdn/gigaset: correct CAPI DATA_B3 Delivery Confirmation
isdn/gigaset: correct CAPI voice connection encoding
isdn/gigaset: honor CAPI application's buffer size request
cpmac: do not leak struct net_device on phy_connect errors
smc91c92_cs: fix the problem that lan & modem does not work simultaneously
ipv6: fix NULL reference in proxy neighbor discovery
Bluetooth: Bring back var 'i' increment
xfrm: check bundle policy existance before dereferencing it
sky2: enable rx/tx in sky2_phy_reinit()
cnic: Disable statistics initialization for eth clients that do not support statistics
net: add dependency on fw class module to qlcnic and netxen_nic
snmp: fix SNMP_ADD_STATS()
hso: remove setting of low_latency flag
udp: Fix bogus UFO packet generation
lasi82596: fix netdev_mc_count conversion
...
Linus Torvalds [Sun, 27 Jun 2010 16:04:02 +0000 (09:04 -0700)]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFSv4: Fix an embarassing typo in encode_attrs()
NFSv4: Ensure that /proc/self/mountinfo displays the minor version number
NFSv4.1: Ensure that we initialise the session when following a referral
SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()
nfs4 use mandatory attribute file type in nfs4_get_root
Linus Torvalds [Sun, 27 Jun 2010 15:03:00 +0000 (08:03 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
MAINTAINERS - Add an entry for the input MT protocol
Input: wacom - fix serial number handling on Cintiq 21UX2
Input: fixup X86_MRST selects
Input: sysrq - fix "stuck" SysRq mode
Input: ad7877 - fix spi word size to 16 bit
Input: pcf8574_keypad - fix off by one in pcf8574_kp_irq_handler()
Linus Torvalds [Sun, 27 Jun 2010 14:50:47 +0000 (07:50 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
ext3: update ctime when changing the file's permission by setfacl
ext2: update ctime when changing the file's permission by setfacl
Linus Torvalds [Sun, 27 Jun 2010 14:39:57 +0000 (07:39 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: usb/endpoint, fix dangling pointer use
ALSA: asihpi - Get rid of incorrect "long" types and casts.
ASoC: DaVinci: Fix McASP hardware FIFO configuration
ALSA: hda - Fix line-in for mb5 model MacBook (Pro) 5,1 / 5,2
ALSA: usb-audio: fix UAC2 control value queries
ALSA: usb-audio: parse UAC2 sample rate ranges correctly
ALSA: usb-audio: fix control messages for USB_RECIP_INTERFACE
ALSA: usb-audio: add check for faulty clock in parse_audio_format_rates_v2()
ALSA: hda - Don't check capture source mixer if no ADC is available
Linus Torvalds [Sun, 27 Jun 2010 14:39:38 +0000 (07:39 -0700)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (k8temp) Bypass core swapping on single-core processors
hwmon: (i5k_amb) Fix sysfs attribute for lockdep
hwmon: (k10temp) Do not blacklist known working CPU models
Linus Torvalds [Sun, 27 Jun 2010 14:37:51 +0000 (07:37 -0700)]
Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
intel-iommu: Force-disable IOMMU for iGFX on broken Cantiga revisions.
intel-iommu: Fix double lock in get_domain_for_dev()
intel-iommu: Fix reference by physical address in intel_iommu_attach_device()
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
MAINTAINERS: change mailing list address for CIFS
cifs: remove bogus first_time check in NTLMv2 session setup code
cifs: don't call cifs_new_fileinfo unless cifs_open succeeds
cifs: don't ignore cifs_posix_open_inode_helper return value
cifs: clean up arguments to cifs_open_inode_helper
cifs: pass instantiated filp back after open call
cifs: move cifs_new_fileinfo call out of cifs_posix_open
cifs: implement drop_inode superblock op
cifs: don't attempt busy-file rename unless it's in same directory
Linus Torvalds [Sun, 27 Jun 2010 14:30:25 +0000 (07:30 -0700)]
Merge branch 'merge' of git://git.secretlab.ca/git/linux-2.6
* 'merge' of git://git.secretlab.ca/git/linux-2.6:
powerpc/5200: fix lite5200 ethernet phy address
powerpc/5200: Fix build error in sound code.
powerpc/5200: fix oops during going to standby
powerpc/5200: add lite5200 onboard I2C eeprom and flash
maintainers: Add git trees for SPI and device tree
of: Drop properties with "/" in their name
Linus Torvalds [Sun, 27 Jun 2010 14:29:19 +0000 (07:29 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
alpha: Fix de2104x driver failing to readout MAC address correctly
alpha: Detect Super IO chip, no IDE on Avanti, enable EPP19
alpha: fix pci_mmap_resource API breakage
alpha: fix __arch_hweight32 typo
Linus Torvalds [Sun, 27 Jun 2010 14:15:53 +0000 (07:15 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: Fix mpic_resume on early G5 macs
powerpc: rtas_flash needs to use rtas_data_buf
powerpc: Unconditionally enabled irq stacks
powerpc/kexec: Wait for online/possible CPUs only.
powerpc: Disable CONFIG_SYSFS_DEPRECATED
powerpc/boot: Remove addRamdisk.c since it is now unused
powerpc: Move kdump default base address to 64MB on 64bit
powerpc: Remove dead CONFIG_HIGHPTE
powerpc/fsl-booke: Move loadcam_entry back to asm code to fix SMP ftrace
powerpc/fsl-booke: Fix InstructionTLBError execute permission check
Linus Torvalds [Sun, 27 Jun 2010 14:05:02 +0000 (07:05 -0700)]
Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: fix LOCALVERSION handling to match description
kbuild: Fix modpost segfault
Linus Torvalds [Sun, 27 Jun 2010 14:03:12 +0000 (07:03 -0700)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI / PM: Do not enable GPEs for system wakeup in advance
ACPICA: Truncate I/O addresses to 16 bits for Windows compatibility
ACPICA: Limit maximum time for Sleep() operator
ACPICA: Fix namestring associated with AE_NO_HANDLER exception
ACPI / ACPICA: Fix sysfs GPE interface
ACPI / ACPICA: Fix GPE initialization
ACPI / ACPICA: Avoid writing full enable masks to GPE registers
ACPI / ACPICA: Fix low-level GPE manipulation code
ACPI / ACPICA: Use helper function for computing GPE masks
ACPI / ACPICA: Do not attempt to disable GPE when installing handler
ACPI: Disable Vista compatibility for Sony VGN-NS50B_L
ACPI: fan: fix unbalanced code block
ACPI: Store NVS state even when entering suspend to RAM
suspend: Move NVS save/restore code to generic suspend functionality
ACPI: Do not try to set up acpi processor stuff on cores exceeding maxcpus=
ACPI: acpi_pad: Don't needlessly mark LAPIC unstable
Dan Carpenter [Fri, 11 Jun 2010 16:30:05 +0000 (17:30 +0100)]
KEYS: Propagate error code instead of returning -EINVAL
This is from a Smatch check I'm writing.
strncpy_from_user() returns -EFAULT on error so the first change just
silences a warning but doesn't change how the code works.
The other change is a bug fix because install_thread_keyring_to_cred()
can return a variety of errors such as -EINVAL, -EEXIST, -ENOMEM or
-EKEYREVOKED.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jiri Slaby [Tue, 22 Jun 2010 01:41:36 +0000 (01:41 +0000)]
ISDN: hysdn, fix potential NULL dereference
Stanse found that lp is dereferenced earlier than checked for being
NULL in hysdn_rx_netpkt. Move the initialization below the test.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Karsten Keil <isdn@linux-pingi.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Patrick McHardy <kaber@trash.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Mon, 21 Jun 2010 13:55:20 +0000 (13:55 +0000)]
isdn/gigaset: correct CAPI connection state storage
CAPI applications can handle several connections in parallel,
so one connection state per application isn't sufficient.
Store the connection state in the channel structure instead.
Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Mon, 21 Jun 2010 13:55:05 +0000 (13:55 +0000)]
isdn/gigaset: encode HLC and BC together
Adapt to buggy device firmware which accepts setting HLC only in the
same command line as BC, by encoding HLC and BC in a single command
if both are specified, and rejecting HLC without BC.
Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
The Gigaset CAPI driver handled all DATA_B3_REQ messages as if the
Delivery Confirmation flag bit was set, delaying the emission of the
DATA_B3_CONF reply until the data was actually transmitted. Some
CAPI applications (notably Asterisk) aren't happy with that
behaviour. Change it to actually evaluate the Delivery Confirmation
flag as described the CAPI specification.
Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Make the Gigaset CAPI driver select L2_VOICE (AT^SBPR=2) as the
layer 2 encoding for transparent connections, like the ISDN4Linux
variant. L2_BITSYNC (AT^SBPR=0) mutes internal connections and
distorts external ones.
Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the Gigaset CAPI driver to limit the length of a connection's
payload data receive buffers to the corresponding CAPI application's
data buffer size, as some real-life CAPI applications tend to be
rather unhappy if they receive bigger data blocks than requested.
Impact: bugfix Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
Ken Kawasaki [Sat, 19 Jun 2010 15:24:27 +0000 (15:24 +0000)]
smc91c92_cs: fix the problem that lan & modem does not work simultaneously
smc91c92_cs:
Fix the problem that lan & modem does not work simultaneously
in the Megahertz multi-function card.
We need to write MEGAHERTZ_ISR to retrigger interrupt.
Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6: fix NULL reference in proxy neighbor discovery
The addition of TLLAO option created a kernel OOPS regression
for the case where neighbor advertisement is being sent via
proxy path. When using proxy, ipv6_get_ifaddr() returns NULL
causing the NULL dereference.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Darrick J. Wong [Thu, 24 Jun 2010 21:26:47 +0000 (14:26 -0700)]
x86, Calgary: Increase max PHB number
Newer systems (x3950M2) can have 48 PHBs per chassis and 8
chassis, so bump the limits up and provide an explanation
of the requirements for each class.
Will Deacon [Mon, 24 May 2010 19:11:43 +0000 (12:11 -0700)]
sched: Prevent compiler from optimising the sched_avg_update() loop
GCC 4.4.1 on ARM has been observed to replace the while loop in
sched_avg_update with a call to uldivmod, resulting in the
following build failure at link-time:
kernel/built-in.o: In function `sched_avg_update':
kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
kernel/sched.c:1261: undefined reference to `__aeabi_uldivmod'
make: *** [.tmp_vmlinux1] Error 1
This patch introduces a fake data hazard to the loop body to
prevent the compiler optimising the loop away.
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Henrik Rydberg [Fri, 25 Jun 2010 02:10:40 +0000 (19:10 -0700)]
MAINTAINERS - Add an entry for the input MT protocol
This patch adds a maintainer for the input multitouch (MT) protocol,
such that get_maintainer.pl selects it whenever an MT event is present
in the patch.
Signed-off-by: Henrik Rydberg <rydberg@euromail.se> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Timo Teräs [Thu, 24 Jun 2010 21:35:00 +0000 (14:35 -0700)]
xfrm: check bundle policy existance before dereferencing it
Fix the bundle validation code to not assume having a valid policy.
When we have multiple transformations for a xfrm policy, the bundle
instance will be a chain of bundles with only the first one having
the policy reference. When policy_genid is bumped it will expire the
first bundle in the chain which is equivalent of expiring the whole
chain.
Reported-bisected-and-tested-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
Tao Ma [Wed, 23 Jun 2010 23:43:57 +0000 (07:43 +0800)]
block: Don't count_vm_events for discard bio in submit_bio.
In submit_bio, we count vm events by check READ/WRITE.
But actually DISCARD_NOBARRIER also has the WRITE flag set.
It looks as if in blkdev_issue_discard, we also add a
page as the payload and the bio_has_data check isn't enough.
So add another check for discard bio.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
NeilBrown [Thu, 17 Jun 2010 07:48:26 +0000 (17:48 +1000)]
md/raid5: don't include 'spare' drives when reshaping to fewer devices.
There are few situations where it would make any sense to add a spare
when reducing the number of devices in an array, but it is
conceivable: A 6 drive RAID6 with two missing devices could be
reshaped to a 5 drive RAID6, and a spare could become available
just in time for the reshape, but not early enough to have been
recovered first. 'freezing' recovery can make this easy to
do without any races.
However doing such a thing is a bad idea. md will not record the
partially-recovered state of the 'spare' and when the reshape
finished it will think that the spare is still spare.
Easiest way to avoid this confusion is to simply disallow it.
NeilBrown [Thu, 17 Jun 2010 07:41:03 +0000 (17:41 +1000)]
md/raid5: add a missing 'continue' in a loop.
As the comment says, the tail of this loop only applies to devices
that are not fully in sync, so if In_sync was set, we should avoid
the rest of the loop.
This bug will hardly ever cause an actual problem. The worst it
can do is allow an array to be assembled that is dirty and degraded,
which is not generally a good idea (without warning the sysadmin
first).
This will only happen if the array is RAID4 or a RAID5/6 in an
intermediate state during a reshape and so has one drive that is
all 'parity' - no data - while some other device has failed.
This is certainly possible, but not at all common.
NeilBrown [Thu, 17 Jun 2010 07:25:21 +0000 (17:25 +1000)]
md/raid5: Allow recovered part of partially recovered devices to be in-sync
During a recovery of reshape the early part of some devices might be
in-sync while the later parts are not.
We we know we are looking at an early part it is good to treat that
part as in-sync for stripe calculations.
This is particularly important for a reshape which suffers device
failure. Treating the data as in-sync can mean the difference between
data-safety and data-loss.
NeilBrown [Wed, 16 Jun 2010 07:17:53 +0000 (17:17 +1000)]
md/raid5: More careful check for "has array failed".
When we are reshaping an array, the device failure combinations
that cause us to decide that the array as failed are more subtle.
In particular, any 'spare' will be fully in-sync in the section
of the array that has already been reshaped, thus failures that
affect only that section are less critical.
So encode this subtlety in a new function and call it as appropriate.
The case that showed this problem was a 4 drive RAID5 to 8 drive RAID6
conversion where the last two devices failed.
This resulted in:
good good good good incomplete good good failed failed
while converting a 5-drive RAID6 to 8 drive RAID5
The incomplete device causes the whole array to look bad,
bad as it was actually good for the section that had been
converted to 8-drives, all the data was actually safe.
Reported-by: Terry Morris <tbmorris@tbmorris.com> Signed-off-by: NeilBrown <neilb@suse.de>
NeilBrown [Wed, 16 Jun 2010 07:01:25 +0000 (17:01 +1000)]
md: Don't update ->recovery_offset when reshaping an array to fewer devices.
When an array is reshaped to have fewer devices, the reshape proceeds
from the end of the devices to the beginning.
If a device happens to be non-In_sync (which is possible but rare)
we would normally update the ->recovery_offset as the reshape
progresses. However that would be wrong as the recover_offset records
that the early part of the device is in_sync, while in fact it would
only be the later part that is in_sync, and in any case the offset
number would be measured from the wrong end of the device.
Relatedly, if after a reshape a spare is discovered to not be
recoverred all the way to the end, not allow spare_active
to incorporate it in the array.
This becomes relevant in the following sample scenario:
A 4 drive RAID5 is converted to a 6 drive RAID6 in a combined
operation.
The RAID5->RAID6 conversion will cause a 5 drive to be included as a
spare, then the 5drive -> 6drive reshape will effectively rebuild that
spare as it progresses. The 6th drive is treated as in_sync the whole
time as there is never any case that we might consider reading from
it, but must not because there is no valid data.
If we interrupt this reshape part-way through and reverse it to return
to a 5-drive RAID6 (or event a 4-drive RAID5), we don't want to update
the recovery_offset - as that would be wrong - and we don't want to
include that spare as active in the 5-drive RAID6 when the reversed
reshape completed and it will be mostly out-of-sync still.
NeilBrown [Wed, 16 Jun 2010 06:45:16 +0000 (16:45 +1000)]
md/raid5: avoid oops when number of devices is reduced then increased.
The entries in the stripe_cache maintained by raid5 are enlarged
when we increased the number of devices in the array, but not
shrunk when we reduce the number of devices.
So if entries are added after reducing the number of devices, we
much ensure to initialise the whole entry, not just the part that
is currently relevant. Otherwise if we enlarge the array again,
we will reference uninitialised values.
As grow_buffers/shrink_buffer now want to use a count that is stored
explicity in the raid_conf, they should get it from there rather than
being passed it as a parameter.
NeilBrown [Tue, 15 Jun 2010 08:36:03 +0000 (09:36 +0100)]
md: fix handling of array level takeover that re-arranges devices.
Most array level changes leave the list of devices largely unchanged,
possibly causing one at the end to become redundant.
However conversions between RAID0 and RAID10 need to renumber
all devices (except 0).
This renumbering is currently being done in the ->run method when the
new personality takes over. However this is too late as the common
code in md.c might already have invalidated some of the devices if
they had a ->raid_disk number that appeared to high.
Moving it into the ->takeover method is too early as the array is
still active at that time and wrong ->raid_disk numbers could cause
confusion.
So add a ->new_raid_disk field to mdk_rdev_s and use it to communicate
the new raid_disk number.
Now the common code knows exactly which devices need to be renumbered,
and which can be invalidated, and can do it all at a convenient time
when the array is suspend.
It can also update some symlinks in sysfs which previously were not be
updated correctly.
Reported-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
md: raid10: Fix null pointer dereference in fix_read_error()
Such NULL pointer dereference can occur when the driver was fixing the
read errors/bad blocks and the disk was physically removed
causing a system crash. This patch check if the
rcu_dereference() returns valid rdev before accessing it in fix_read_error().
Cc: stable@kernel.org Signed-off-by: Prasanna S. Panchamukhi <prasanna.panchamukhi@riverbed.com> Signed-off-by: Rob Becker <rbecker@riverbed.com> Signed-off-by: NeilBrown <neilb@suse.de>
The logic was almost right. However if revalidate_disk is called
when the device is not yet open, bdev->bd_disk won't be set, so the
flush_disk() Call will not set bd_invalidated.
So when md_open is called we still need to ensure that
->bd_invalidated gets set. This is easily done with a call to
check_disk_size_change in the place where the offending commit removed
check_disk_change. At the important times, the size will have changed
from 0 to non-zero, so check_disk_size_change will set bd_invalidated.
Peter Zijlstra [Tue, 22 Jun 2010 09:44:53 +0000 (11:44 +0200)]
sched: silence PROVE_RCU in sched_fork()
Because cgroup_fork() is ran before sched_fork() [ from copy_process() ]
and the child's pid is not yet visible the child is pinned to its
cgroup. Therefore we can silence this warning.
A nicer solution would be moving cgroup_fork() to right after
dup_task_struct() and exclude PF_STARTING from task_subsys_state().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Brandon Philips [Wed, 16 Jun 2010 16:21:58 +0000 (16:21 +0000)]
sky2: enable rx/tx in sky2_phy_reinit()
sky2_phy_reinit is called by the ethtool helpers sky2_set_settings,
sky2_nway_reset and sky2_set_pauseparam when netif_running.
However, at the end of sky2_phy_init GM_GP_CTRL has GM_GPCR_RX_ENA and
GM_GPCR_TX_ENA cleared. So, doing these commands causes the device to
stop working:
$ ethtool -r eth0
$ ethtool -A eth0 autoneg off
Fix this issue by enabling Rx/Tx after running sky2_phy_init in
sky2_phy_reinit.
Signed-off-by: Brandon Philips <bphilips@suse.de> Tested-by: Brandon Philips <bphilips@suse.de> Cc: stable@kernel.org Tested-by: Mike McCormack <mikem@ring3k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: add dependency on fw class module to qlcnic and netxen_nic
netxen_nic and qlcnic driver depends on firmware_class module.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Convert to rcu_dereference_raw() given that many callers may have many
different locking models.
Located-by: Miles Lane <miles.lane@gmail.com> Tested-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
The task_group() function returns a pointer that must be protected
by either RCU, the ->alloc_lock, or the cgroup lock (see the
rcu_dereference_check() in task_subsys_state(), which is invoked by
task_group()). The wake_affine() function currently does none of these,
which means that a concurrent update would be within its rights to free
the structure returned by task_group(). Because wake_affine() uses this
structure only to compute load-balancing heuristics, there is no reason
to acquire either of the two locks.
Therefore, this commit introduces an RCU read-side critical section that
starts before the first call to task_group() and ends after the last use
of the "tg" pointer returned from task_group(). Thanks to Li Zefan for
pointing out the need to extend the RCU read-side critical section from
that proposed by the original patch.
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
virtio-pci resets the device at startup by writing to the status
register, but this does not clear the pci config space,
specifically msi enable status which affects register
layout.
This breaks things like kdump when they try to use e.g. virtio-blk.
Fix by forcing msi off at startup. Since pci.c already has
a routine to do this, we export and use it instead of duplicating code.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: linux-pci@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org
add_buf returns ring size on out of memory,
this is not what devices expect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org # .34.x
Randy Dunlap [Fri, 18 Jun 2010 05:31:17 +0000 (22:31 -0700)]
Input: fixup X86_MRST selects
Some of the recent X86_MRST additions make some "select"s
conditional on X86_MRST but missed some related kconfig symbols,
causing:
drivers/built-in.o: In function `ps2_end_command':
(.text+0x257ab2): undefined reference to `i8042_check_port_owner'
drivers/built-in.o: In function `ps2_end_command':
(.text+0x257ae1): undefined reference to `i8042_unlock_chip'
drivers/built-in.o: In function `ps2_begin_command':
(.text+0x257b40): undefined reference to `i8042_check_port_owner'
drivers/built-in.o: In function `ps2_begin_command':
(.text+0x257b6f): undefined reference to `i8042_lock_chip'
when SERIO_I8042=m, SERIO_LIBPS2=y, KEYBOARD_ATKBD=y.
We need to make i8042 dependant upon !X86_MRST and allow deselecting
atkbd on Moorestown even when !CONFIG_EMBEDDED.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Jacob Pan <jacob.jun.pan@intel.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Trond Myklebust [Wed, 16 Jun 2010 17:57:32 +0000 (13:57 -0400)]
SUNRPC: Fix a re-entrancy bug in xs_tcp_read_calldir()
If the attempt to read the calldir fails, then instead of storing the read
bytes, we currently discard them. This leads to a garbage final result when
upon re-entry to the same routine, we read the remaining bytes.
Fixes the regression in bugzilla number 16213. Please see
https://bugzilla.kernel.org/show_bug.cgi?id=16213
Filip Aben [Tue, 22 Jun 2010 17:10:35 +0000 (10:10 -0700)]
hso: remove setting of low_latency flag
This patch removes the setting of the low_latency flag.
tty_flip_buffer_push() is occasionally being called in irq context, which
causes a hang if the low_latency flag is set.
Removing the low_latency flag only seems to impact the flush to ldisc,
which will now be put on a workqueue.
Signed-off-by: Filip Aben <f.aben@option.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Tue, 15 Jun 2010 01:52:25 +0000 (01:52 +0000)]
udp: Fix bogus UFO packet generation
It has been reported that the new UFO software fallback path
fails under certain conditions with NFS. I tracked the problem
down to the generation of UFO packets that are smaller than the
MTU. The software fallback path simply discards these packets.
This patch fixes the problem by not generating such packets on
the UFO path.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ralf Baechle [Mon, 21 Jun 2010 03:44:50 +0000 (03:44 +0000)]
NET: MIPSsim: Fix modpost warning.
$ make CONFIG_DEBUG_SECTION_MISMATCH=y
[...]
WARNING: drivers/net/built-in.o(.data+0x0): Section mismatch in reference from the variable mipsnet_driver to the function .init.text:mipsnet_probe()
The variable mipsnet_driver references
the function __init mipsnet_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,
[...]
Fixed by making mipsnet_probe __devinit.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
drivers/net/mipsnet.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-) Signed-off-by: David S. Miller <davem@davemloft.net>
Wu Zhangjin [Mon, 21 Jun 2010 11:09:09 +0000 (19:09 +0800)]
tracing: Fix undeclared ENOSYS in include/linux/tracepoint.h
The header file include/linux/tracepoint.h may be included without
include/linux/errno.h and then the compiler will fail on building for
undelcared ENOSYS. This patch fixes this problem via including <linux/errno.h>
to include/linux/tracepoint.h.
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
LKML-Reference: <1277118549-622-1-git-send-email-wuzhangjin@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Jiri Slaby [Mon, 21 Jun 2010 15:03:21 +0000 (17:03 +0200)]
ALSA: usb/endpoint, fix dangling pointer use
Stanse found that in snd_usb_parse_audio_endpoints, there is a
dangling pointer dereference. When snd_usb_parse_audio_format fails,
fp is freed, and continue invoked. On the next loop, there is
"fp && fp->altsetting == 1 && fp->channels == 1" test, but fp is set
from the last iteration (but is bogus) and thus ilegally dereferenced.