PM: Allow SCSI devices to suspend/resume asynchronously
Set power.async_suspend for all SCSI devices, targets and hosts, so
that they can be suspended and resumed in parallel with the main
suspend/resume thread and possibly with other devices they don't
depend on in a known way (i.e. devices which are not their parents or
children).
The power.async_suspend flag is also set for devices that don't have
suspend or resume callbacks, because otherwise they would make the
main suspend/resume thread wait for their "asynchronous" children
(during suspend) or parents (during resume), effectively negating the
possible gains from executing these devices' suspend and resume
callbacks asynchronously.
PM: Allow USB devices to suspend/resume asynchronously
Set power.async_suspend for USB devices, endpoints and interfaces,
allowing them to be suspended and resumed asynchronously during
system sleep transitions.
The power.async_suspend flag is also set for devices that don't have
suspend or resume callbacks, because otherwise they would make the
main suspend/resume thread wait for their "asynchronous" children
(during suspend) or parents (during resume), effectively negating the
possible gains from executing these devices' suspend and resume
callbacks asynchronously.
Alan Stern [Fri, 12 Feb 2010 11:21:11 +0000 (12:21 +0100)]
USB: implement non-tree resume ordering constraints for PCI host controllers
This patch (as1331) adds non-tree ordering constraints needed for
proper resume of PCI USB host controllers from hibernation. The main
issue is that non-high-speed devices must not be resumed before the
high-speed root hub, because it is the ehci_bus_resume() routine which
takes care of handing the device connection over to the companion
controller. If the device resume is attempted before the handover
then the device won't be found and it will be treated as though it had
disconnected.
The patch adds a new field to the usb_bus structure; for each
full/low-speed bus this field will contain a pointer to the companion
high-speed bus (if one exists). It is used during normal device
resume; if the hs_companion pointer isn't NULL then we wait for the
root-hub device on the hs_companion bus.
A secondary issue is that an EHCI controlller shouldn't be resumed
before any of its companions. On some machines I have observed
handovers failing if the companion controller is reinitialized after
the handover. Thus, the EHCI resume routine must wait for the
companion controllers to be resumed.
The patch also fixes a small bug in usb_hcd_pci_probe(); an error path
jumps to the wrong label, causing a memory leak.
[rjw: Fixed compilation for CONFIG_PM_SLEEP unset.]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
PM: Allow PCI devices to suspend/resume asynchronously
Set power.async_suspend for all PCI devices and PCIe port services,
so that they can be suspended and resumed in parallel with other
devices they don't depend on in a known way (i.e. devices which are
not their parents or children).
This only affects the "regular" suspend and resume stages, which
means in particular that the restoration of the PCI devices' standard
configuration registers during resume will still be carried out
synchronously (at the "early" resume stage).
Jiri Slaby [Wed, 27 Jan 2010 22:47:56 +0000 (23:47 +0100)]
PM / Hibernate: Swap, remove useless check from swsusp_read()
It will never reach here if the sws_resume_bdev is erratic.
swsusp_read() is called only from software_resume(), but after
swsusp_check() which would catch the error state.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
There are some dependencies between devices (in particular, between
EHCI USB controllers and their OHCI/UHCI siblings) which are not
reflected by the structure of the device tree. With synchronous
suspend and resume these dependencies are taken into accout
automatically, because the devices in question are always registered
in the right order, but to meet these constraints with asynchronous
suspend and resume the drivers of these devices will need to use
dpm_wait() in their suspend/resume routines, so introduce a helper
function allowing them to do that.
It has been shown by testing that total device resume time can be
reduced significantly (by as much as 50% or more) if the async
threads executing some devices' resume routines are all started
before the main resume thread starts to handle the "synchronous"
devices.
This is a consequence of the fact that the slowest devices tend to be
located at the end of dpm_list, so their resume routines are started
very late. Consequently, they have to wait for all the preceding
"synchronous" devices before their resume routines can be started
by the main resume thread, even if they are "asynchronous". By
starting their async threads upfront we effectively move those
devices towards the beginning of dpm_list, without breaking their
ordering with respect to their parents and children. As a result,
their resume routines are started much earlier and we are able to
save much more device resume time this way.
PM: Add facility for advanced testing of async suspend/resume
Add configuration switch CONFIG_PM_ADVANCED_DEBUG for compiling in
extra PM debugging/testing code allowing one to access some
PM-related attributes of devices from the user space via sysfs.
If CONFIG_PM_ADVANCED_DEBUG is set, add sysfs attribute power/async
for every device allowing the user space to access the device's
power.async_suspend flag and modify it, if desired.
Theoretically, the total time of system sleep transitions (suspend
to RAM, hibernation) can be reduced by running suspend and resume
callbacks of device drivers in parallel with each other. However,
there are dependencies between devices such that we're not allowed
to suspend the parent of a device before suspending the device
itself. Analogously, we're not allowed to resume a device before
resuming its parent.
The most straightforward way to take these dependencies into accout
is to start the async threads used for suspending and resuming
devices at the core level, so that async_schedule() is called for
each suspend and resume callback supposed to be executed
asynchronously.
For this purpose, introduce a new device flag, power.async_suspend,
used to mark the devices whose suspend and resume callbacks are to be
executed asynchronously (ie. in parallel with the main suspend/resume
thread and possibly in parallel with each other) and helper function
device_enable_async_suspend() allowing one to set power.async_suspend
for given device (power.async_suspend is unset by default for all
devices). For each device with the power.async_suspend flag set the
PM core will use async_schedule() to execute its suspend and resume
callbacks.
The async threads started for different devices as a result of
calling async_schedule() are synchronized with each other and with
the main suspend/resume thread with the help of completions, in the
following way:
(1) There is a completion, power.completion, for each device object.
(2) Each device's completion is reset before calling async_schedule()
for the device or, in the case of devices with the
power.async_suspend flags unset, before executing the device's
suspend and resume callbacks.
(3) During suspend, right before running the bus type, device type
and device class suspend callbacks for the device, the PM core
waits for the completions of all the device's children to be
completed.
(4) During resume, right before running the bus type, device type and
device class resume callbacks for the device, the PM core waits
for the completion of the device's parent to be completed.
(5) The PM core completes power.completion for each device right
after the bus type, device type and device class suspend (or
resume) callbacks executed for the device have returned.
Add new device sysfs attribute, power/control, allowing the user
space to block the run-time power management of the devices. If this
attribute is set to "on", the driver of the device won't be able to power
manage it at run time (without breaking the rules) and the device will
always be in the full power state (except when the entire system goes
into a sleep state).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Alan Stern <stern@rowland.harvard.edu>
Linus Torvalds [Fri, 26 Feb 2010 18:35:27 +0000 (10:35 -0800)]
Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (48 commits)
x86/PCI: Prevent mmconfig memory corruption
ACPI: Use GPE reference counting to support shared GPEs
x86/PCI: use host bridge _CRS info by default on 2008 and newer machines
PCI: augment bus resource table with a list
PCI: add pci_bus_for_each_resource(), remove direct bus->resource[] refs
PCI: read bridge windows before filling in subtractive decode resources
PCI: split up pci_read_bridge_bases()
PCIe PME: use pci_pcie_cap()
PCI PM: Run-time callbacks for PCI bus type
PCIe PME: use pci_is_pcie()
PCI / ACPI / PM: Platform support for PCI PME wake-up
ACPI / ACPICA: Multiple system notify handlers per device
ACPI / PM: Add more run-time wake-up fields
ACPI: Use GPE reference counting to support shared GPEs
PCI PM: Make it possible to force using INTx for PCIe PME signaling
PCI PM: PCIe PME root port service driver
PCI PM: Add function for checking PME status of devices
PCI: mark is_pcie obsolete
PCI: set PCI_PREF_RANGE_TYPE_64 in pci_bridge_check_ranges
PCI: pciehp: second try to get big range for pcie devices
...
Linus Torvalds [Fri, 26 Feb 2010 18:03:22 +0000 (10:03 -0800)]
Lower USB storage settling delay to something more reasonable
The five-second delay can be rather annoying, and makes the system
appear much less responsive when you connect a USB drive.
It's also not entirely clear that it is needed - the settling delay has
at least historically been an issue on some Apple iPods, for example,
and some devices have been reported to need even more than the old 5s
delay.
But before we penalize them all, let's see how bad it really is. Some
of the reasons for long delays seem to be actual historical kernel bugs
that should probably never have been papered over with a delay in the
first place (there's a Ubuntu bug report for 2.6.20 about a NULL pointer
dereference unless 'delay_use' is 8 or more, for example).
It also looks like some distros have already shipped with delay_use=0,
so the five second default may well be totally historical.
Linus Torvalds [Thu, 25 Feb 2010 23:38:37 +0000 (15:38 -0800)]
Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6: (41 commits)
of: remove undefined request_OF_resource & release_OF_resource
of/sparc: Remove sparc-local declaration of allnodes and devtree_lock
of: move definition of of_chosen into common code.
of: remove unused extern reference to devtree_lock
of: put default string compare and #a/s-cell values into common header
of/flattree: Don't assume HAVE_LMB
of: protect linux/of.h with CONFIG_OF
proc_devtree: fix THIS_MODULE without module.h
of: Remove old and misplaced function declarations
of/flattree: Make the kernel accept ePAPR style phandle information
of/flattree: endian-convert members of boot_param_header
of: assume big-endian properties, adding conversions where necessary
of: use __be32 for cell value accessors
of/flattree: use OF_ROOT_NODE_{SIZE,ADDR}_CELLS DEFAULT for fdt parsing
of/flattree: use callback to setup initrd from /chosen
proc_devtree: include linux/of.h
of: make set_node_proc_entry private to proc_devtree.c
of: include linux/proc_fs.h
of/flattree: merge early_init_dt_scan_memory() common code
of: add 'of_' prefix to machine_is_compatible()
...
Linus Torvalds [Thu, 25 Feb 2010 23:38:03 +0000 (15:38 -0800)]
Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (31 commits)
spi: Correct SPI clock frequency setting in spi_mpc8xxx
spi/spi_s3c64xx.c: Fix continuation line formats
spi/dw_spi: Fix dw_spi_mmio to depend on HAVE_CLK
spi/dw_spi: Allow dw_spi.c to be a module
spi/dw_spi: mmio code style fixups
Memory-mapped dw_spi driver
spi/dw_spi: fix missing export of dw_spi_remove_host
spi/dw_spi: conditional transfer mode changes
spi/dw_spi: remove conditional from 'poll_transfer'.
spi/dw_spi: fixed a spelling typo in a warning message.
spi/dw_spi: add return value to empty mrst_spi_debugfs_init()
spi/dw_spi: enable platform specific chipselect.
spi/dw_spi: add a FIFO depth detection
spi/dw_spi: fix __init/__devinit section mismatch
spi: xilinx_spi: Fix up I/O routine wrapping bogosity.
spi/spi_imx: add device information by switching pr_debug() to dev_dbg()
spi: update MSIOF includes
spi/dw_spi: refine the IRQ mode working flow
spi/dw_spi: add a missed dw_spi_remove_host() in exit sequence
spi/dw_spi: bug fix in wait_till_not_busy()
...
Linus Torvalds [Thu, 25 Feb 2010 22:44:33 +0000 (14:44 -0800)]
Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-kconfig
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-kconfig:
kconfig: Simplify LSMOD= handling
kconfig: Add LSMOD=file to override the lsmod for localmodconfig
kconfig: Look in both /bin and /sbin for lsmod in streamline_config.pl
kconfig: Check for if conditions in Kconfig for localmodconfig
kconfig: Create include/generated for localmodconfig
Linus Torvalds [Thu, 25 Feb 2010 22:42:39 +0000 (14:42 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (41 commits)
HID: usbhid: initialize interface pointers early enough
HID: extend mask for BUTTON usage page
HID: hid-ntrig: Single touch mode tap
HID: hid-ntrig: multitouch cleanup and fix
HID: n-trig: remove unnecessary tool switching
HID: hid-ntrig add multi input quirk and clean up
HID: usbhid: introduce timeout for stuck ctrl/out URBs
HID: magicmouse: coding style and probe failure fixes
HID: remove MODULE_VERSION from new drivers
HID: fix up Kconfig entry for MagicMouse
HID: add a device driver for the Apple Magic Mouse.
HID: Export hid_register_report
HID: Support for MosArt multitouch panel
HID: add pressure support for the Stantum multitouch panel
HID: fixed bug in single-touch emulation on the stantum panel
HID: fix typo in error message
HID: add mapping for "AL Network Chat" usage
HID: use multi input quirk for TouchPack touchscreen
HID: make full-fledged hid-bus drivers properly selectable
HID: make Wacom modesetting failures non-fatal
...
Thomas Gleixner [Thu, 25 Feb 2010 15:42:11 +0000 (16:42 +0100)]
x86/PCI: Prevent mmconfig memory corruption
commit ff097ddd4 (x86/PCI: MMCONFIG: manage pci_mmcfg_region as a
list, not a table) introduced a nasty memory corruption when
pci_mmcfg_list is empty.
pci_mmcfg_check_end_bus_number() dereferences pci_mmcfg_list.prev even
when the list is empty. The following write hits some variable near to
pci_mmcfg_list.
Further down a similar problem exists, where cfg->list.next is
dereferenced unconditionally and a comparison with some variable near
to pci_mmcfg_list happens.
Add a check for the last element into the for_each_entry() loop and
remove all the other crappy logic which is just a leftover of the old
array based code which was replaced by the list conversion.
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (44 commits)
Add MAINTAINERS entry for virtio_console
virtio: console: Fill ports' entire in_vq with buffers
virtio: console: Error out if we can't allocate buffers for control queue
virtio: console: Add ability to remove module
virtio: console: Ensure no memleaks in case of unused buffers
virtio: console: show error message if hvc_alloc fails for console ports
virtio: console: Add debugfs files for each port to expose debug info
virtio: console: Add ability to hot-unplug ports
virtio: console: Handle port hot-plug
virtio: console: Remove cached data on port close
virtio: console: Register with sysfs and create a 'name' attribute for ports
virtio: console: Ensure only one process can have a port open at a time
virtio: console: Add file operations to ports for open/read/write/poll
virtio: console: Associate each port with a char device
virtio: console: Prepare for writing to userspace buffers
virtio: console: Add a new MULTIPORT feature, support for generic ports
virtio: console: Introduce a send_buf function for a common path for sending data to host
virtio: console: Introduce function to hand off data from host to readers
virtio: console: Separate out find_vqs operation into a different function
virtio: console: Separate out console init into a new function
...
Joshua Roys [Wed, 24 Feb 2010 23:52:44 +0000 (18:52 -0500)]
netlabel: fix export of SELinux categories > 127
This fixes corrupted CIPSO packets when SELinux categories greater than 127
are used. The bug occured on the second (and later) loops through the
while; the inner for loop through the ebitmap->maps array used the same
index as the NetLabel catmap->bitmap array, even though the NetLabel bitmap
is twice as long as the SELinux bitmap.
Signed-off-by: Joshua Roys <joshua.roys@gtri.gatech.edu> Acked-by: Paul Moore <paul.moore@hp.com> Signed-off-by: James Morris <jmorris@namei.org>
Robert Hancock [Wed, 27 Jan 2010 04:33:23 +0000 (22:33 -0600)]
ahci: disable FPDMA auto-activate optimization on NVIDIA AHCI
Mike Cui reported that his system with an NVIDIA MCP79 (aka MCP7A)
chipset stopped working with 2.6.32. The problem appears to be that
2.6.32 now enables the FPDMA auto-activate optimization in the ahci
driver. The drive works fine with this enabled on an Intel AHCI so
this appears to be a chipset bug. Since MCP79 is a fairly recent
NVIDIA chipset and we don't have any info on whether any other NVIDIA
chipsets have this issue, disable FPDMA AA optimization on all NVIDIA
AHCI controllers for now.
Should address http://bugzilla.kernel.org/show_bug.cgi?id=14922
Signed-off-by: Robert Hancock <hancockrwd@gmail.com>
While-we-investigate-issue-this-patch-looks-good-to-me-by:
Prajakta Gudadhe <pgudadhe@nvidia.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Cc: stable@kernel.org
ACPI: Use GPE reference counting to support shared GPEs
To fix a bug and address the reviewers' comments regarding the ACPI
GPE refcounting patch, do the following additional changes:
o Remove the second argument of acpi_ev_enable_gpe(),
'write_to_hardware', because it is not necessary any more.
o Add the "bad parameter" test against 'type' in
acpi_enable_gpe() and acpi_disable_gpe().
o Make acpi_enable_gpe() only check 'status' for runtime GPEs if
acpi_ev_enable_gpe() was actually called.
o Make acpi_disable_gpe() return 'status' returned by
acpi_ev_disable_gpe() and fix a bug where ACPI_GPE_TYPE_WAKE
and ACPI_GPE_TYPE_RUNTIME were exchanged by mistake.
o Add comments explaining why acpi_set_gpe() is used by the ACPI EC
driver.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Carlos O'Donell [Mon, 22 Feb 2010 23:25:59 +0000 (23:25 +0000)]
parisc: Set PCI CLS early in boot.
Set the PCI CLS early in the boot process to prevent
device failures. In pcibios_set_master use the new
pci_cache_line_size instead of a hard-coded value.
Signed-off-by: Carlos O'Donell <carlos@codesourcery.com> Reviewed-by: Grant Grundler <grundler@google.com> Signed-off-by: Kyle McMartin <kyle@redhat.com>
Michal Simek [Mon, 15 Feb 2010 09:50:42 +0000 (10:50 +0100)]
microblaze: Fix cache loop function for cache range
I create wrong asm code but none test shows that this part of code is wrong.
I am not convinces that were good idea to create asm optimized macros
for caches. The reason is that there is not optimization with previous code
that's why make sense to add old code and do some benchmarking which
functions are faster.
Amit Shah [Fri, 12 Feb 2010 05:02:18 +0000 (10:32 +0530)]
virtio: console: Fill ports' entire in_vq with buffers
Instead of allocating just one buffer for a port's in_vq, fill
the entire in_vq with buffers so the host need not stall while
an application consumes the data and makes the buffer available
again for the host.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Fri, 12 Feb 2010 05:02:17 +0000 (10:32 +0530)]
virtio: console: Error out if we can't allocate buffers for control queue
With MULTIPORT support, the control queue is an integral part of the
functioning of the device. If we can't get any buffers allocated, the
host won't be able to relay important information and the device may not
function as intended.
Ensure 'probe' doesn't succeed until the control queue has at least one
buffer allocated for its ivq.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 21 Dec 2009 16:27:40 +0000 (21:57 +0530)]
virtio: console: Register with sysfs and create a 'name' attribute for ports
The host can set a name for ports so that they're easily discoverable
instead of going by the /dev/vportNpn naming. This attribute will be
placed in /sys/class/virtio-ports/vportNpn/name. udev scripts can then
create symlinks to the port using the name.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Thu, 26 Nov 2009 05:55:38 +0000 (11:25 +0530)]
virtio: console: Ensure only one process can have a port open at a time
Add a guest_connected field that ensures only one process
can have a port open at a time.
This also ensures we don't have a race when we later add support for
dropping buffers when closing the char dev and buffer caching is turned
off for the particular port.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 21 Dec 2009 16:19:30 +0000 (21:49 +0530)]
virtio: console: Add file operations to ports for open/read/write/poll
Allow guest userspace applications to open, read from, write to, poll
the ports via the char dev interface.
When a port gets opened, a notification is sent to the host via a
control message indicating a connection has been established. Similarly,
on closing of the port, a notification is sent indicating disconnection.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 21 Dec 2009 15:56:45 +0000 (21:26 +0530)]
virtio: console: Prepare for writing to userspace buffers
When ports get advertised as char devices, the buffers will come from
userspace. Equip the fill_readbuf function with the ability to write
to userspace buffers.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 21 Dec 2009 15:33:25 +0000 (21:03 +0530)]
virtio: console: Add a new MULTIPORT feature, support for generic ports
This commit adds a new feature, MULTIPORT. If the host supports this
feature as well, the config space has the number of ports defined for
that device. New ports are spawned according to this information.
The config space also has the maximum number of ports that can be
spawned for a particular device. This is useful in initializing the
appropriate number of virtqueues in advance, as ports might be
hot-plugged in later.
Using this feature, generic ports can be created which are not tied to
hvc consoles.
We also open up a private channel between the host and the guest via
which some "control" messages are exchanged for the ports, like whether
the port being spawned is a console port, resizing the console window,
etc.
Next commits will add support for hotplugging and presenting char
devices in /dev/ for bi-directional guest-host communication.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 18 Jan 2010 13:45:12 +0000 (19:15 +0530)]
virtio: console: Introduce function to hand off data from host to readers
In preparation for serving data to userspace (generic ports) as well as
in-kernel users (hvc consoles), separate out the functionality common to
both in a 'fill_readbuf()' function.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 18 Jan 2010 13:45:10 +0000 (19:15 +0530)]
virtio: console: Separate out console init into a new function
Console ports could be hot-added. Also, with the new multiport support,
a port is identified as a console port only if the host sends a control
message.
Move the console port init into a separate function so it can be invoked
from other places.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 18 Jan 2010 13:45:09 +0000 (19:15 +0530)]
virtio: console: Separate out console-specific data into a separate struct
Move out console-specific stuff into a separate struct from 'struct
port' as we need to maintain two lists: one for all the ports (which
includes consoles) and one only for consoles since the hvc callbacks
only give us the vtermno.
This makes console handling cleaner.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 18 Jan 2010 13:45:06 +0000 (19:15 +0530)]
virtio: console: remove global var
Now we can use an allocation function to remove our global console variable.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Amit Shah [Mon, 18 Jan 2010 13:45:05 +0000 (19:15 +0530)]
virtio: console: don't assume a single console port.
Keep a list of all ports being used as a console, and provide a lock
and a lookup function. The hvc callbacks only give us a vterm number,
so we need to map this.
Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 18 Jan 2010 13:45:04 +0000 (19:15 +0530)]
virtio: console: use vdev->priv to avoid accessing global var.
Part of removing our "one console" assumptions, use vdev->priv to point
to the port (currently == the global console).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Mon, 18 Jan 2010 13:45:00 +0000 (19:15 +0530)]
virtio: console: port encapsulation
We are heading towards a multiple-"port" system, so as part of weaning off
globals we encapsulate the information into 'struct port'.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Sat, 28 Nov 2009 06:50:26 +0000 (12:20 +0530)]
hvc_console: make the ops pointer const.
This is nicer for modern R/O protection. And noone needs it non-const, so
constify the callers as well.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linuxppc-dev@ozlabs.org
That way, we can make it const as is good kernel style. We use a separate
indirection for the early console, rather than mugging ops.put_chars.
We rename it hv_ops, too.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Shirley Ma [Mon, 18 Jan 2010 13:45:23 +0000 (19:15 +0530)]
virtio: Add ability to detach unused buffers from vrings
There's currently no way for a virtio driver to ask for unused
buffers, so it has to keep a list itself to reclaim them at shutdown.
This is redundant, since virtio_ring stores that information. So
add a new hook to do this.
Signed-off-by: Shirley Ma <xma@us.ibm.com> Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Allow reading various alignment values from the config page. This
allows the guest to much better align I/O requests depending on the
storage topology.
Note that the formats for the config values appear a bit messed up,
but we follow the formats used by ATA and SCSI so they are expected in
the storage world.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
virtio is communicating with a virtual "device" that actually runs on
another host processor. Thus SMP barriers can be used to control
memory access ordering.
Where possible, we should use SMP barriers which are more lightweight than
mandatory barriers, because mandatory barriers also control MMIO effects on
accesses through relaxed memory I/O windows (which virtio does not use)
(compare specifically smp_rmb and rmb on x86_64).
We can't just use smp_mb and friends though, because
we must force memory ordering even if guest is UP since host could be
running on another CPU, but SMP barriers are defined to barrier() in
that configuration. So, for UP fall back to mandatory barriers instead.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rusty Russell [Wed, 24 Feb 2010 20:22:22 +0000 (14:22 -0600)]
virtio: remove bogus barriers from DEBUG version of virtio_ring.c
With DEBUG defined, we add an ->in_use flag to detect if the caller
invokes two virtio methods in parallel. The barriers attempt to ensure
timely update of the ->in_use flag.
But they're voodoo: if we need these barriers it implies that the
calling code doesn't have sufficient synchronization to ensure the
code paths aren't invoked at the same time anyway, and we want to
detect it.
Also, adding barriers changes timing, so turning on debug has more
chance of hiding real problems.
Thanks to MST for drawing my attention to this code...
CC: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Adam Litke [Thu, 10 Dec 2009 22:35:15 +0000 (16:35 -0600)]
virtio: Fix scheduling while atomic in virtio_balloon stats
This is a fix for my earlier patch: "virtio: Add memory statistics reporting to
the balloon driver (V4)".
I discovered that all_vm_events() can sleep and therefore stats collection
cannot be done in interrupt context. One solution is to handle the interrupt
by noting that stats need to be collected and waking the existing vballoon
kthread which will complete the work via stats_handle_request(). Rusty, is
this a saner way of doing business?
There is one issue that I would like a broader opinion on. In stats_request, I
update vb->need_stats_update and then wake up the kthread. The kthread uses
vb->need_stats_update as a condition variable. Do I need a memory barrier
between the update and wake_up to ensure that my kthread sees the correct
value? My testing suggests that it is not needed but I would like some
confirmation from the experts.
Signed-off-by: Adam Litke <agl@us.ibm.com>
To: Rusty Russell <rusty@rustcorp.com.au> Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Adam Litke [Mon, 30 Nov 2009 16:14:15 +0000 (10:14 -0600)]
virtio: Add memory statistics reporting to the balloon driver (V4)
Changes since V3:
- Do not do endian conversions as they will be done in the host
- Report stats that reference a quantity of memory in bytes
- Minor coding style updates
Changes since V2:
- Increase stat field size to 64 bits
- Report all sizes in kb (not pages)
- Drop anon_pages stat and fix endianness conversion
Changes since V1:
- Use a virtqueue instead of the device config space
When using ballooning to manage overcommitted memory on a host, a system for
guests to communicate their memory usage to the host can provide information
that will minimize the impact of ballooning on the guests. The current method
employs a daemon running in each guest that communicates memory statistics to a
host daemon at a specified time interval. The host daemon aggregates this
information and inflates and/or deflates balloons according to the level of
host memory pressure. This approach is effective but overly complex since a
daemon must be installed inside each guest and coordinated to communicate with
the host. A simpler approach is to collect memory statistics in the virtio
balloon driver and communicate them directly to the hypervisor.
This patch enables the guest-side support by adding stats collection and
reporting to the virtio balloon driver.
Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Anthony Liguori <anthony@codemonkey.ws> Cc: virtualization@lists.linux-foundation.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor fixes)
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net: bug fix for vlan + gro issue
tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON
cdc_ether: new PID for Ericsson C3607w to the whitelist (resubmit)
IPv6: better document max_addresses parameter
MAINTAINERS: update mv643xx_eth maintenance status
e1000: Fix DMA mapping error handling on RX
iwlwifi: sanity check before counting number of tfds can be free
iwlwifi: error checking for number of tfds in queue
iwlwifi: set HT flags after channel in rxon
Ajit Khaparde [Tue, 16 Feb 2010 20:25:43 +0000 (20:25 +0000)]
net: bug fix for vlan + gro issue
Traffic (tcp) doesnot start on a vlan interface when gro is enabled.
Even the tcp handshake was not taking place.
This is because, the eth_type_trans call before the netif_receive_skb
in napi_gro_finish() resets the skb->dev to napi->dev from the previously
set vlan netdev interface. This causes the ip_route_input to drop the
incoming packet considering it as a packet coming from a martian source.
I could repro this on 2.6.32.7 (stable) and 2.6.33-rc7.
With this fix, the traffic starts and the test runs fine on both vlan
and non-vlan interfaces.
CC: Herbert Xu <herbert@gondor.apana.org.au> CC: Patrick McHardy <kaber@trash.net> Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 24 Feb 2010 02:15:05 +0000 (18:15 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: Be in TS_POLLING state during mwait based C-state entry
ACPI: Fix regression where _PPC is not read at boot even when ignore_ppc=0
acer-wmi: Respect current backlight level when loading
Linus Torvalds [Wed, 24 Feb 2010 02:13:34 +0000 (18:13 -0800)]
Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/vmwgfx: Fix queries if no dma buffer thrashing is occuring.
drm/nv50: fix vram ptes on IGPs to point at stolen system memory
drm/nv50: fix instmem binding on IGPs to point at stolen system memory
drm/nv50: improve vram page table construction
drm/nv50: more efficient clearing of gpu page table entries
drm/nv50: make nv50_mem_vm_{bind,unbind} operate only on vram
drm/nouveau: Fix up pre-nv17 analog load detection.
Fixing the build the b94b08081fcecf83fa690d6c5664f6316fe72208 way
breaks xpc because genksyms then fails to generate an CRC for
per_cpu____sn_cnodeid_to_nasid because of limitations in the
generic genksyms code.
Signed-off-by: Hedi Berriche <hedi@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Bjorn Helgaas [Tue, 23 Feb 2010 17:24:41 +0000 (10:24 -0700)]
x86/PCI: use host bridge _CRS info by default on 2008 and newer machines
The main benefit of using ACPI host bridge window information is that
we can do better resource allocation in systems with multiple host bridges,
e.g., http://bugzilla.kernel.org/show_bug.cgi?id=14183
Sometimes we need _CRS information even if we only have one host bridge,
e.g., https://bugs.launchpad.net/ubuntu/+source/linux/+bug/341681
Most of these systems are relatively new, so this patch turns on
"pci=use_crs" only on machines with a BIOS date of 2008 or newer.
Bjorn Helgaas [Tue, 23 Feb 2010 17:24:36 +0000 (10:24 -0700)]
PCI: augment bus resource table with a list
Previously we used a table of size PCI_BUS_NUM_RESOURCES (16) for resources
forwarded to a bus by its upstream bridge. We've increased this size
several times when the table overflowed.
But there's no good limit on the number of resources because host bridges
and subtractive decode bridges can forward any number of ranges to their
secondary buses.
This patch reduces the table to only PCI_BRIDGE_RESOURCE_NUM (4) entries,
which corresponds to the number of windows a PCI-to-PCI (3) or CardBus (4)
bridge can positively decode. Any additional resources, e.g., PCI host
bridge windows or subtractively-decoded regions, are kept in a list.
I'd prefer a single list rather than this split table/list approach, but
that requires simultaneous changes to every architecture. This approach
only requires immediate changes where we set up (a) host bridges with more
than four windows and (b) subtractive-decode P2P bridges, and we can
incrementally change other architectures to use the list.
Bjorn Helgaas [Tue, 23 Feb 2010 17:24:31 +0000 (10:24 -0700)]
PCI: add pci_bus_for_each_resource(), remove direct bus->resource[] refs
No functional change; this converts loops that iterate from 0 to
PCI_BUS_NUM_RESOURCES through pci_bus resource[] table to use the
pci_bus_for_each_resource() iterator instead.
This doesn't change the way resources are stored; it merely removes
dependencies on the fact that they're in a table.
Bjorn Helgaas [Tue, 23 Feb 2010 17:24:26 +0000 (10:24 -0700)]
PCI: read bridge windows before filling in subtractive decode resources
No functional change; this fills in the bus subtractive decode resources
after reading the bridge window information rather than before. Also,
print out the subtractive decode resources as we already do for the
positive decode windows.
Bjorn Helgaas [Tue, 23 Feb 2010 17:24:21 +0000 (10:24 -0700)]
PCI: split up pci_read_bridge_bases()
No functional change; this breaks up pci_read_bridge_bases() into separate
pieces for the I/O, memory, and prefetchable memory windows, similar to how
Yinghai recently split up pci_setup_bridge() in 68e84ff3bdc.
Atsushi Nemoto [Fri, 19 Feb 2010 05:13:58 +0000 (05:13 +0000)]
tc35815: Remove a wrong netif_wake_queue() call which triggers BUG_ON
The netif_wake_queue() is called correctly (i.e. only on !txfull
condition) from txdone routine. So Unconditional call to the
netif_wake_queue() here is wrong. This might cause calling of
start_xmit routine on txfull state and trigger BUG_ON.
This bug does not happen when NAPI disabled. After txdone there
must be at least one free tx slot. But with NAPI, this is not
true anymore and the BUG_ON can hits on heavy load.
In this driver NAPI was enabled on 2.6.33-rc1 so this is
regression from 2.6.32 kernel.
Reported-by: Ralf Roesch <ralf.roesch@rw-gmbh.de> Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
Torgny Johansson [Fri, 19 Feb 2010 01:59:15 +0000 (01:59 +0000)]
cdc_ether: new PID for Ericsson C3607w to the whitelist (resubmit)
This patch adds a new vid/pid to the cdc_ether whitelist.
Device added:
- Ericsson Mobile Broadband variant C3607w
Signed-off-by: Torgny Johansson <torgny.johansson@gmail.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html Signed-off-by: David S. Miller <davem@davemloft.net>
Brian Haley [Mon, 22 Feb 2010 12:27:21 +0000 (12:27 +0000)]
IPv6: better document max_addresses parameter
Andrew Morton wrote:
>> >From ip-sysctl.txt file in kernel documentation I can see following description
>> for max_addresses:
>> max_addresses - INTEGER
>> Number of maximum addresses per interface. 0 disables limitation.
>> It is recommended not set too large value (or 0) because it would
>> be too easy way to crash kernel to allow to create too much of
>> autoconfigured addresses.
^^^^^^^^^^^^^^
>> If this parameter applies only for auto-configured IP addressed, please state
>> it more clearly in docs or rename the parameter to show that it refers to
>> auto-configuration.
It did mention autoconfigured in the text, but the below makes it more obvious.
More clearly document IPv6 max_addresses parameter.
Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Blanchard [Fri, 19 Feb 2010 17:54:53 +0000 (17:54 +0000)]
e1000: Fix DMA mapping error handling on RX
Check for error return from pci_map_single/pci_map_page and clean up.
With this and the previous patch the driver was able to handle a significant
percentage of errors (I set the fault injection rate to 10% and could still
download large files at a reasonable speed).
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
"Benjamin S." <sbenni@gmx.de> reports that the patch in question
causes a big drop in sequential throughput for him, dropping from
200MB/sec down to only 70MB/sec.
Needs to be investigated more fully, for now lets just revert the
offending commit.