Ryusuke Konishi [Sun, 3 Oct 2010 08:44:03 +0000 (17:44 +0900)]
nilfs2: change license of exported header file
This allows other projects to carry copies of the header file related
to ABI and disk format (i.e. "nilfs2_fs.h") without it or distributors
having to worry about effects on the project's overall license terms.
It's also desired for switching the license of nilfs library to LGPL.
Jiro SEKIBA pointed out these license issues (Message-ID:
<87tylo7msw.wl%jir@sekiba.com>), and he suggested switching license of
the library and nilfs2_fs.h to GNU Lesser General Public License. We
take in his suggestion to avoid the license issues.
Nilfs hasn't supported the freeze/thaw feature because it didn't work
due to the peculiar design that multiple super block instances could
be allocated for a device. This limitation was removed by the patch
"nilfs2: do not allocate multiple super block instances for a device".
So now this adds the freeze/thaw support to nilfs.
nilfs2: accept 64-bit checkpoint numbers in cp mount option
The current implementation doesn't mount snapshots with checkpoint
numbers larger than INT_MAX since it uses match_int() for parsing
"cp=" mount option.
This uses simple_strtoull() for the conversion to resolve the issue.
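The difference between an int-sized parse and a 64-bit parse can be illustrated in userspace. The following is a minimal sketch, not the kernel code: parse_cp_option() is a hypothetical helper, and it uses the C library's strtoull() as a stand-in for the kernel's simple_strtoull().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a nilfs2 function): parse "cp=<number>" into a
 * 64-bit checkpoint number. Returns 0 on success, -1 on malformed input
 * or overflow.
 */
static int parse_cp_option(const char *opt, uint64_t *cno)
{
    const char *prefix = "cp=";
    size_t plen = strlen(prefix);
    unsigned long long val;
    char *end;

    if (strncmp(opt, prefix, plen) != 0)
        return -1;
    opt += plen;

    /* Reject an empty value, a sign, or leading whitespace. */
    if (*opt < '0' || *opt > '9')
        return -1;

    errno = 0;
    val = strtoull(opt, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return -1;

    *cno = (uint64_t)val;
    return 0;
}
```

With this shape, a value such as "cp=4294967296" (larger than INT_MAX) is accepted, whereas an int-based parser would have to reject it.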
nilfs2: remove own inode allocator and destructor for metadata files
This finally removes own inode allocator and destructor functions for
metadata files. Several routines, nilfs_mdt_new(),
nilfs_mdt_new_common(), nilfs_mdt_clear(), nilfs_mdt_destroy(), and
nilfs_alloc_inode_common() will be gone.
nilfs2: get rid of back pointer to writable sb instance
Nilfs object holds a back pointer to a writable super block instance
in nilfs->ns_writer, and this became eliminable since sb is now made
per device and all inodes have a valid pointer to it.
This deletes the ns_writer pointer and a reader/writer semaphore
protecting it.
Ryusuke Konishi [Tue, 31 Aug 2010 02:40:34 +0000 (11:40 +0900)]
nilfs2: add routines to redirect access to buffers of DAT file
During garbage collection (GC), DAT file, which converts virtual block
number to real block number, may return disk block number that is not
yet written to the device.
To avoid access to unwritten blocks, the current implementation stores
changes to the caches of GCDAT during GC and atomically commits the
changes into the DAT file after they are written to the device.
This patch, instead, adds a function that makes a copy of specified
buffer and stores it in nilfs_shadow_map, and a function to get the
backup copy as needed (nilfs_mdt_freeze_buffer and
nilfs_mdt_get_frozen_buffer respectively).
Before DAT changes block number in an entry block, it makes a copy and
redirects access to the buffer so that address conversion function
(i.e. nilfs_dat_translate) refers to the old address saved in the
copy.
nilfs2: add routines to roll back state of DAT file
This adds an optional function to metadata files which makes a copy of
bmap, page caches, and b-tree node cache, and rolls back to the copy
as needed.
This enhancement is intended to displace gcdat inode that provides a
similar function in a different way.
In this patch, nilfs_shadow_map structure is added to store a copy of
the foregoing states. nilfs_mdt_setup_shadow_map relates this
structure to a metadata file. nilfs_mdt_save_to_shadow_map() and
nilfs_mdt_restore_from_shadow_map() provide save and restore
functions respectively. Finally, nilfs_mdt_clear_shadow_map() clears
states of nilfs_shadow_map.
The copy of b-tree node cache and page cache is made by duplicating
only dirty pages into corresponding caches in nilfs_shadow_map. Their
restoration is done by clearing dirty pages from original caches and
by copying dirty pages back from nilfs_shadow_map.
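The save and restore steps for the caches can be modeled in a few lines of userspace C. This is only a sketch under simplifying assumptions: page_model, cache_model, save_to_shadow() and restore_from_shadow() are hypothetical names, a fixed array stands in for a page cache, and a discarded dirty page is simply zeroed instead of being re-read from disk.

```c
#include <stdbool.h>
#include <string.h>

#define NPAGES  4
#define PAGE_SZ 8

/* Toy stand-in for a cached page. */
struct page_model {
    bool dirty;
    char data[PAGE_SZ];
};

/* Toy stand-in for a page cache or b-tree node cache. */
struct cache_model {
    struct page_model pages[NPAGES];
};

/* Save: duplicate only the dirty pages into the shadow cache. */
static void save_to_shadow(const struct cache_model *orig,
                           struct cache_model *shadow)
{
    memset(shadow, 0, sizeof(*shadow));
    for (int i = 0; i < NPAGES; i++) {
        if (orig->pages[i].dirty)
            shadow->pages[i] = orig->pages[i];
    }
}

/* Restore: clear current dirty pages, then copy saved dirty pages back. */
static void restore_from_shadow(struct cache_model *orig,
                                const struct cache_model *shadow)
{
    for (int i = 0; i < NPAGES; i++) {
        if (orig->pages[i].dirty)
            memset(&orig->pages[i], 0, sizeof(orig->pages[i]));
    }
    for (int i = 0; i < NPAGES; i++) {
        if (shadow->pages[i].dirty)
            orig->pages[i] = shadow->pages[i];
    }
}
```

Copying only dirty pages keeps the shadow small: clean pages are, by definition, identical to what is already on the device.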
nilfs2: simplify life cycle management of nilfs object
This stops pre-allocating nilfs object in nilfs_get_sb routine, and
stops managing its life cycle by reference counting.
nilfs_find_or_create_nilfs() function, nilfs->ns_mount_mutex,
nilfs_objects list, and the reference counter will be removed through
the simplification.
Ryusuke Konishi [Sun, 15 Aug 2010 16:54:52 +0000 (01:54 +0900)]
nilfs2: do not allocate multiple super block instances for a device
This stops allocating multiple super block instances for a device.
All snapshots and a current mode mount (i.e. latest tree) will be
controlled with nilfs_root objects that are kept within an sb
instance.
nilfs_get_sb() is rewritten so that it always has a root object for
the latest tree and snapshots make additional root objects.
The root dentry of the latest tree is bound to sb->s_root even if it
isn't attached on a directory. Root dentries of snapshots or the
latest tree are bound to mnt->mnt_root on which they are mounted.
With this patch, nilfs_find_sbinfo() function, nilfs->ns_supers list,
and nilfs->ns_current back pointer, are deleted. In addition,
init_nilfs() and load_nilfs() are simplified since they will be called
once for a device, not repeatedly called for mount points.
Ryusuke Konishi [Sun, 15 Aug 2010 14:33:57 +0000 (23:33 +0900)]
nilfs2: deny write access to inodes in snapshots
Snapshots of nilfs are read-only.
Once super block instances (sb) are unified, nilfs will need to
check write access by a way other than the implicit test with
IS_RDONLY(inode). This is because IS_RDONLY() refers to MS_RDONLY bit
of inode->i_sb->s_flags and it will become inaccurate after the
unification of sb.
To prepare for the issue, this uses i_op->permission to deny write
access to inodes in snapshots.
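As a rough illustration of a permission hook that refuses writes on snapshot inodes, consider the userspace model below. All names are hypothetical (root_model, inode_model, model_permission, MODEL_MAY_WRITE), and both the per-inode root pointer and the -EROFS return value are assumptions of this sketch, not details taken from the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAY_WRITE 0x2 /* stand-in for the kernel's MAY_WRITE bit */

/* Hypothetical per-tree state: is this tree a read-only snapshot? */
struct root_model {
    bool is_snapshot;
};

/* Hypothetical inode that knows which tree it belongs to. */
struct inode_model {
    const struct root_model *root;
};

/*
 * Model of a permission callback: deny write access when the inode
 * belongs to a snapshot, independent of any superblock-wide flag.
 */
static int model_permission(const struct inode_model *inode, int mask)
{
    if ((mask & MODEL_MAY_WRITE) && inode->root != NULL &&
        inode->root->is_snapshot)
        return -EROFS;
    return 0;
}
```

The point of checking per inode is that a single superblock flag can no longer describe both a writable latest tree and read-only snapshots at once.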
Ryusuke Konishi [Sat, 14 Aug 2010 12:44:51 +0000 (21:44 +0900)]
nilfs2: use checkpoint tree for mount check of snapshots
This rewrites nilfs_checkpoint_is_mounted() function so that it
decides whether a checkpoint is mounted by whether the corresponding
root object is found in checkpoint tree.
Ryusuke Konishi [Sat, 14 Aug 2010 04:07:15 +0000 (13:07 +0900)]
nilfs2: use root object to get ifile
This rewrites functions using ifile so that they get ifile from
nilfs_root object, and will remove sbi->s_ifile. Some functions that
don't know the root object are extended to receive it from caller.
Ryusuke Konishi [Thu, 26 Aug 2010 15:23:02 +0000 (00:23 +0900)]
nilfs2: make snapshots in checkpoint tree exportable
The previous export operations cannot handle multiple versions of
a filesystem if they belong to the same sb instance.
This adds a new type of file handle and extends export operations so
that they can get the inode specified by a checkpoint number as well
as an inode number and a generation number.
Ryusuke Konishi [Wed, 25 Aug 2010 08:45:44 +0000 (17:45 +0900)]
nilfs2: set pointer to root object in inodes
This puts a pointer to nilfs_root object in the private part of
on-memory inode, and makes nilfs_iget function pick up the inode with
the same root object.
Non-root inodes inherit their nilfs_root object from the parent inode. That
of the root inode is allocated through nilfs_attach_checkpoint()
function.
Ryusuke Konishi [Sat, 14 Aug 2010 03:59:15 +0000 (12:59 +0900)]
nilfs2: add checkpoint tree to nilfs object
To hold multiple versions of a filesystem in one sb instance, a new
on-memory structure is necessary to handle one or more checkpoints.
This adds a red-black tree of checkpoints to nilfs object, and adds
lookup and create functions for them.
Each checkpoint is represented by "nilfs_root" structure, and this
structure has rb_node to configure the rb-tree.
The nilfs_root object is identified with a checkpoint number. For
each snapshot, a nilfs_root object is allocated and the checkpoint
number of snapshot is assigned to it. For a regular mount
(i.e. current mode mount), NILFS_CPTREE_CURRENT_CNO constant is
assigned to the corresponding nilfs_root object.
Each nilfs_root object has an ifile inode and some counters. These
items will displace those of nilfs_sb_info structure in successive
patches.
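The lookup and create functions over checkpoint numbers can be sketched as follows. This is a simplified userspace model: a plain, unbalanced binary search tree stands in for the kernel red-black tree, and root_model, lookup_root(), find_or_create_root() and the value of MODEL_CURRENT_CNO are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for NILFS_CPTREE_CURRENT_CNO; the real value is not given here. */
#define MODEL_CURRENT_CNO 0

/* One node per mounted checkpoint, keyed by checkpoint number. */
struct root_model {
    uint64_t cno;
    struct root_model *left, *right;
};

/* Lookup: return the root object for cno, or NULL if none exists. */
static struct root_model *lookup_root(struct root_model *tree, uint64_t cno)
{
    while (tree != NULL) {
        if (cno < tree->cno)
            tree = tree->left;
        else if (cno > tree->cno)
            tree = tree->right;
        else
            return tree;
    }
    return NULL;
}

/* Create: return the existing root object for cno, or insert a new one. */
static struct root_model *find_or_create_root(struct root_model **tree,
                                              uint64_t cno)
{
    struct root_model **link = tree;

    while (*link != NULL) {
        if (cno < (*link)->cno)
            link = &(*link)->left;
        else if (cno > (*link)->cno)
            link = &(*link)->right;
        else
            return *link;
    }
    *link = calloc(1, sizeof(**link));
    if (*link != NULL)
        (*link)->cno = cno;
    return *link; /* NULL on allocation failure */
}
```

A regular mount would use the current-mode constant as its key, and each snapshot mount would use its own checkpoint number, so both kinds of root object live in the same tree.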
Ryusuke Konishi [Fri, 20 Aug 2010 10:06:11 +0000 (19:06 +0900)]
nilfs2: remove own inode hash used for GC
This uses inode hash function that vfs provides instead of the own
hash table for caching gc inodes. This finally removes the own inode
hash from nilfs.
Ryusuke Konishi [Sat, 21 Aug 2010 13:01:51 +0000 (22:01 +0900)]
nilfs2: separate initializer of metadata file inode
This separates a part of initialization code of metadata file inode,
and makes it available from the nilfs iget function that a later patch
will add to.
Ryusuke Konishi [Fri, 20 Aug 2010 11:10:38 +0000 (20:10 +0900)]
nilfs2: keep zero value in i_cno except for gc-inodes
On-memory inode structures of nilfs have a member "i_cno" which stores
a checkpoint number related to the inode. For gc-inodes, this field
indicates version of data each gc-inode caches for GC. Log writer
temporarily uses "i_cno" to transfer the latest checkpoint number.
This stops the latter use and lets only gc-inodes use it.
The purpose of this patch is to allow the successive change to use
"i_cno" for inode lookup.
Ryusuke Konishi [Mon, 9 Aug 2010 15:58:41 +0000 (00:58 +0900)]
nilfs2: accept future revisions
Compatibility of nilfs partitions is now managed with three feature
sets. This changes old compatibility check with revision number so
that it can accept future revisions.
Note that we can stop support of experimental versions of nilfs that
don't know the feature sets by incrementing NILFS_CURRENT_REV. We
don't have to do it soon, but it would be a possible option whenever
the need arises.
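A feature-set check of this kind is commonly structured as below. This is a sketch under an assumption: it takes the three sets to be compatible, read-only compatible, and incompatible features, following the convention used by other Linux filesystems. The names feature_sets, mount_verdict and check_features() are hypothetical.

```c
#include <stdint.h>

enum mount_verdict { MOUNT_OK, MOUNT_RO_ONLY, MOUNT_REFUSED };

/* Assumed layout: one bitmask per feature set. */
struct feature_sets {
    uint64_t compat;    /* unknown bits are harmless */
    uint64_t compat_ro; /* unknown bits forbid writing */
    uint64_t incompat;  /* unknown bits forbid mounting */
};

/*
 * Decide how a partition may be mounted by comparing the features
 * recorded on disk with those this implementation supports.
 */
static enum mount_verdict check_features(const struct feature_sets *on_disk,
                                         const struct feature_sets *supported)
{
    if (on_disk->incompat & ~supported->incompat)
        return MOUNT_REFUSED;
    if (on_disk->compat_ro & ~supported->compat_ro)
        return MOUNT_RO_ONLY;
    return MOUNT_OK;
}
```

Because the decision depends only on feature bits, a partition written by a newer revision can still be mounted as long as it sets no unknown incompatible feature.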
Linus Torvalds [Fri, 22 Oct 2010 18:23:42 +0000 (11:23 -0700)]
Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6
* 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
pcmcia: fix ni_daq_700 compilation
pcmcia: IOCARD is also required for using IRQs
* git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic/io.h: allow people to override individual funcs
bitops: remove duplicated extern declarations
bitops: make asm-generic/bitops/find.h more generic
asm-generic: kdebug.h: Checkpatch cleanup
asm-generic: fcntl: make exported headers use strict posix types
asm-generic: cmpxchg does not handle non-long arguments
asm-generic: make atomic_add_unless a function
Linus Torvalds [Fri, 22 Oct 2010 17:52:56 +0000 (10:52 -0700)]
Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
vfs: make no_llseek the default
vfs: don't use BKL in default_llseek
llseek: automatically add .llseek fop
libfs: use generic_file_llseek for simple_attr
mac80211: disallow seeks in minstrel debug code
lirc: make chardev nonseekable
viotape: use noop_llseek
raw: use explicit llseek file operations
ibmasmfs: use generic_file_llseek
spufs: use llseek in all file operations
arm/omap: use generic_file_llseek in iommu_debug
lkdtm: use generic_file_llseek in debugfs
net/wireless: use generic_file_llseek in debugfs
drm: use noop_llseek
Linus Torvalds [Fri, 22 Oct 2010 17:52:01 +0000 (10:52 -0700)]
Merge branch 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: (30 commits)
BKL: remove BKL from freevxfs
BKL: remove BKL from qnx4
autofs4: Only declare function when CONFIG_COMPAT is defined
autofs: Only declare function when CONFIG_COMPAT is defined
ncpfs: Lock socket in ncpfs while setting its callbacks
fs/locks.c: prepare for BKL removal
BKL: Remove BKL from ncpfs
BKL: Remove BKL from OCFS2
BKL: Remove BKL from squashfs
BKL: Remove BKL from jffs2
BKL: Remove BKL from ecryptfs
BKL: Remove BKL from afs
BKL: Remove BKL from USB gadgetfs
BKL: Remove BKL from autofs4
BKL: Remove BKL from isofs
BKL: Remove BKL from fat
BKL: Remove BKL from ext2 filesystem
BKL: Remove BKL from do_new_mount()
BKL: Remove BKL from cgroup
BKL: Remove BKL from NTFS
...
Linus Torvalds [Fri, 22 Oct 2010 17:43:11 +0000 (10:43 -0700)]
Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl
* 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
BKL: introduce CONFIG_BKL.
dabusb: remove the BKL
sunrpc: remove the big kernel lock
init/main.c: remove BKL notations
blktrace: remove the big kernel lock
rtmutex-tester: make it build without BKL
dvb-core: kill the big kernel lock
dvb/bt8xx: kill the big kernel lock
tlclk: remove big kernel lock
fix rawctl compat ioctls breakage on amd64 and itanic
uml: kill big kernel lock
parisc: remove big kernel lock
cris: autoconvert trivial BKL users
alpha: kill big kernel lock
isapnp: BKL removal
s390/block: kill the big kernel lock
hpet: kill BKL, add compat_ioctl
Dave Hinds pointed out to me that 37979e1546a7 will break b43 and
ray_cs, as IOCARD is not -- as the name would suggest -- only needed
for cards using IO ports. Instead, as it re-defines several pins, it
is also required for using interrupts.
Linus Torvalds [Fri, 22 Oct 2010 04:19:54 +0000 (21:19 -0700)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (71 commits)
powerpc/44x: Update ppc44x_defconfig
powerpc/watchdog: Make default timeout for Book-E watchdog a Kconfig option
fsl_rio: Add comments for sRIO registers.
powerpc/fsl-booke: Add e55xx (64-bit) smp defconfig
powerpc/fsl-booke: Add p5020 DS board support
powerpc/fsl-booke64: Use TLB CAMs to cover linear mapping on FSL 64-bit chips
powerpc/fsl-booke: Add support for FSL Arch v1.0 MMU in setup_page_sizes
powerpc/fsl-booke: Add support for FSL 64-bit e5500 core
powerpc/85xx: add cache-sram support
powerpc/85xx: add ngPIXIS FPGA device tree node to the P1022DS board
powerpc: Fix compile error with paca code on ppc64e
powerpc/fsl-booke: Add p3041 DS board support
oprofile/fsl emb: Don't set MSR[PMM] until after clearing the interrupt.
powerpc/fsl-booke: Add PCI device ids for P2040/P3041/P5010/P5020 QoirQ chips
powerpc/mpc8xxx_gpio: Add support for 'qoriq-gpio' controllers
powerpc/fsl_booke: Add support to boot from core other than 0
powerpc/p1022: Add probing for individual DMA channels
powerpc/fsl_soc: Search all global-utilities nodes for rstccr
powerpc: Fix invalid page flags in create TLB CAM path for PTE_64BIT
powerpc/mpc83xx: Support for MPC8308 P1M board
...
Fix up conflict with the generic irq_work changes in arch/powerpc/kernel/time.c
Linus Torvalds [Fri, 22 Oct 2010 02:03:38 +0000 (19:03 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: (26 commits)
include/linux/libata.h: fix typo
pata_bf54x: fix return type of bfin_set_devctl
Drivers: ata: Makefile: replace the use of <module>-objs with <module>-y
libahci: fix result_tf handling after an ATA PIO data-in command
pata_sl82c105: implement sff_irq_check() method
pata_sil680: implement sff_irq_check() method
pata_pdc202xx_old: implement sff_irq_check() method
pata_cmd640: implement sff_irq_check() method
ata_piix: Add device ID for ICH4-L
pata_sil680: make sil680_sff_exec_command() 'static'
ata: Intel IDE-R support
libata: reorder ata_queued_cmd to remove alignment padding on 64 bit builds
libata: Signal that our SATL supports WRITE SAME(16) with UNMAP
ata_piix: remove SIDPR locking
libata: implement cross-port EH exclusion
libata: add @ap to ata_wait_register() and introduce ata_msleep()
ata_piix: implement LPM support
libata: implement LPM support for port multipliers
libata: reimplement link power management
libata: implement sata_link_scr_lpm() and make ata_dev_set_feature() global
...
Linus Torvalds [Fri, 22 Oct 2010 02:01:34 +0000 (19:01 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (48 commits)
ocfs2: Avoid to evaluate xattr block flags again.
ocfs2/cluster: Release debugfs file elapsed_time_in_ms
ocfs2: Add a mount option "coherency=*" to handle cluster coherency for O_DIRECT writes.
Initialize max_slots early
When I tried to compile I got the following warning:
fs/ocfs2/slot_map.c: In function ‘ocfs2_init_slot_info’:
fs/ocfs2/slot_map.c:360: warning: ‘bytes’ may be used uninitialized in this function
fs/ocfs2/slot_map.c:360: note: ‘bytes’ was declared here
Compiler: gcc version 4.4.3 (GCC) on Mandriva
I'm not sure why this warning occurs. I think the compiler doesn't know
that the variable "bytes" is initialized when it is passed by reference
to ocfs2_slot_map_physical_size, so it emits the warning. However,
simply initializing "bytes" to 0 fixes it.
ocfs2: validate bg_free_bits_count after update
ocfs2/cluster: Bump up dlm protocol to version 1.1
ocfs2/cluster: Show per region heartbeat elapsed time
ocfs2/cluster: Add mlogs for heartbeat up/down events
ocfs2/cluster: Create debugfs dir/files for each region
ocfs2/cluster: Create debugfs files for live, quorum and failed region bitmaps
ocfs2/cluster: Maintain bitmap of failed regions
ocfs2/cluster: Maintain bitmap of quorum regions
ocfs2/cluster: Track bitmap of live heartbeat regions
ocfs2/cluster: Track number of global heartbeat regions
ocfs2/cluster: Maintain live node bitmap per heartbeat region
ocfs2/cluster: Reorganize o2hb debugfs init
ocfs2/cluster: Check slots for unconfigured live nodes
ocfs2/cluster: Print messages when adding/removing nodes
ocfs2/cluster: Print messages when adding/removing heartbeat regions
...
Linus Torvalds [Fri, 22 Oct 2010 01:52:11 +0000 (18:52 -0700)]
Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (74 commits)
x86-64: Only set max_pfn_mapped to 512 MiB if we enter via head_64.S
xen: Cope with unmapped pages when initializing kernel pagetable
memblock, bootmem: Round pfn properly for memory and reserved regions
memblock: Annotate memblock functions with __init_memblock
memblock: Allow memblock_init to be called early
memblock/arm: Fix memblock_region_is_memory() typo
x86, memblock: Remove __memblock_x86_find_in_range_size()
memblock: Fix wraparound in find_region()
x86-32, memblock: Make add_highpages honor early reserved ranges
x86, memblock: Fix crashkernel allocation
arm, memblock: Fix the sparsemem build
memblock: Fix section mismatch warnings
powerpc, memblock: Fix memblock API change fallout
memblock, microblaze: Fix memblock API change fallout
x86: Remove old bootmem code
x86, memblock: Use memblock_memory_size()/memblock_free_memory_size() to get correct dma_reserve
x86: Remove not used early_res code
x86, memblock: Replace e820_/_early string with memblock_
x86: Use memblock to replace early_res
x86, memblock: Use memblock_debug to control debug message print out
...
Fix up trivial conflicts in arch/x86/kernel/setup.c and kernel/Makefile
Mike Frysinger [Thu, 21 Oct 2010 08:00:40 +0000 (04:00 -0400)]
pata_bf54x: fix return type of bfin_set_devctl
The new devctl func added for us to the driver has the wrong return
type. Which is to say there shouldn't be any. This fixes compile
time warnings as there shouldn't be any runtime difference.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Fri, 15 Oct 2010 09:00:08 +0000 (11:00 +0200)]
libahci: fix result_tf handling after an ATA PIO data-in command
ATA devices don't send D2H Reg FIS after a successful ATA PIO data-in
command. The host is supposed to take the TF and E_Status of the
preceding PIO Setup FIS. Update ahci_qc_fill_rtf() such that it takes
TF + E_Status from PIO Setup FIS after a successful ATA PIO data-in
command.
Without this patch, result_tf for such a command is filled with the
content of the previous D2H Reg FIS which belongs to a previous
command, which can make the command incorrectly seen as failed.
* Patch updated to grab the whole TF + E_Status from PIO Setup FIS
instead of just E_Status as suggested by Robert Hancock.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mark Lord <kernel@teksavvy.com>
Cc: Robert Hancock <hancockrwd@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Ben Hutchings [Sun, 10 Oct 2010 21:42:21 +0000 (22:42 +0100)]
ata_piix: Add device ID for ICH4-L
ICH4-L is a variant of ICH4 lacking USB2 functionality and with some
different device IDs.
It is documented in Intel specification update 290745-025, currently
at <http://www.intel.com/assets/pdf/specupdate/290745.pdf>, and is
included in the device ID table for piix.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Alan Cox [Tue, 28 Sep 2010 12:19:38 +0000 (13:19 +0100)]
ata: Intel IDE-R support
Intel IDE-R devices are part of the Intel AMT management setup. They don't
have any special configuration registers or settings so the ata_generic
driver will support them fully.
Rather than add a huge table of IDs for each chipset and keep sending in
new ones this patch autodetects them.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Richard Kennedy [Fri, 10 Sep 2010 11:19:43 +0000 (12:19 +0100)]
libata: reorder ata_queued_cmd to remove alignment padding on 64 bit builds
Reorder structure ata_queued_cmd to remove 8 bytes of alignment padding
on 64 bit builds & therefore reduce the size of structure ata_port by
256 bytes.
Overall this will have little impact, other than reducing the amount of
memory that is cleared when allocating ata_ports.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
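The effect of member ordering on alignment padding can be reproduced with two toy structures. This is a sketch only: cmd_padded and cmd_reordered are hypothetical and do not mirror the real ata_queued_cmd layout, and the exact sizes depend on the target ABI. The figures in the commit message are consistent with a port embedding 32 command slots (256 / 8 = 32), which is an inference here rather than something the message states.

```c
#include <stddef.h>

/* Small members interleaved with pointer-sized ones force padding. */
struct cmd_padded {
    char  flag_a;
    void *ptr;
    char  flag_b;
    long  count;
};

/* Same members, largest first: the small ones share one padded tail. */
struct cmd_reordered {
    void *ptr;
    long  count;
    char  flag_a;
    char  flag_b;
};

/* Bytes saved in a container that embeds nslots commands by value. */
static size_t bytes_saved_per_port(size_t nslots)
{
    return nslots *
           (sizeof(struct cmd_padded) - sizeof(struct cmd_reordered));
}
```

On a typical 64-bit target the reordered structure is 8 bytes smaller, so an array of such structures shrinks by 8 bytes per element.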
libata: Signal that our SATL supports WRITE SAME(16) with UNMAP
Until now identifying that a device supports WRITE SAME(16) with the
UNMAP bit set has been black magic. Implement support for the SBC-3
Thin Provisioning VPD page and set the TPWS bit.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
ata_piix: remove SIDPR locking
Now that libata provides proper cross-port EH exclusion, the SIDPR
locking added by commit 213373cf (ata_piix: fix locking around SIDPR
access) is no longer necessary. Remove it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata: implement cross-port EH exclusion
In libata, the non-EH code paths should always take and release
ap->lock explicitly when accessing hardware or shared data structures.
However, once EH is active, it's assumed that the port is owned by EH
and EH methods don't explicitly take ap->lock unless races from the irq
handler or other code paths are expected. However, libata EH didn't
guarantee exclusion among EHs for ports of the same host. IOW,
multiple EHs may execute in parallel on multiple ports of the same
controller.
In many cases, especially in SATA, the ports are completely
independent of each other and this doesn't cause problems; however,
there are cases where different ports share the same resource, which
leads to obscure timing-related bugs such as the one fixed by commit
213373cf (ata_piix: fix locking around SIDPR access).
This patch implements exclusion among EHs of the same host. When EH
begins, it acquires per-host EH ownership by calling ata_eh_acquire().
When EH finishes, the ownership is released by calling
ata_eh_release(). EH ownership is also released whenever the EH
thread goes to sleep from ata_msleep() or explicitly and reacquired
after waking up.
This ensures that while EH is actively accessing the hardware, it has
exclusive access to it while allowing EHs to interleave and progress
in parallel as they hit waiting stages, which dominate the time spent
in EH. This achieves cross-port EH exclusion without pervasive and
fragile changes while still allowing parallel EH for the most part.
This was first reported by yuanding02@gmail.com more than three years
ago in the following bugzilla. :-)
https://bugzilla.kernel.org/show_bug.cgi?id=8223
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Reported-by: yuanding02@gmail.com
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata: add @ap to ata_wait_register() and introduce ata_msleep()
Add optional @ap argument to ata_wait_register() and replace msleep()
calls with ata_msleep() which take optional @ap in addition to the
duration. These will be used to implement EH exclusion.
This patch doesn't cause any behavior difference.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata: implement LPM support for port multipliers
Port multipliers can do DIPM on fan-out links fine. Implement support
for it. Tested w/ SIMG 57xx and marvell PMPs. Both the host and
fan-out links enter power save modes nicely.
SIMG 37xx and 47xx report link offline on SStatus causing EH to detach
the devices. Blacklisted.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata: reimplement link power management
The current LPM implementation has the following issues.
* Operation order isn't well thought-out. e.g. HIPM should be
configured after IPM in SControl is properly configured. Not the
other way around.
* Suspend/resume paths call ata_lpm_enable/disable() which must only
be called from EH context directly. Also, ata_lpm_enable/disable()
were called whether LPM was in use or not.
* Implementation is per-port when it should be per-link. As a result,
it can't be used for controllers with slave links or PMP.
* LPM state isn't managed consistently. After a link reset for
whatever reason including suspend/resume the actual LPM state would
be reset leaving ap->lpm_policy inconsistent.
* Generic/driver-specific logic boundary isn't clear. Currently,
libahci has to mangle stuff which libata EH proper should be
handling. This makes the implementation unnecessarily complex and
fragile.
* Tied to ALPM. Doesn't consider DIPM only cases and doesn't check
whether the device allows HIPM.
* Error handling isn't implemented.
Given the extent of mismatch with the rest of libata, I don't think
trying to fix it piecewise makes much sense. This patch reimplements
LPM support.
* The new implementation is per-link. The target policy is still
port-wide (ap->target_lpm_policy) but all the mechanisms and states
are per-link and integrate well with the rest of link abstraction
and can work with slave and PMP links.
* Core EH has proper control of LPM state. LPM state is reconfigured
when and only when reconfiguration is necessary. It makes sure that
LPM state is reset when probing for new device on the link.
Controller agnostic logic is now implemented in libata EH proper and
driver implementation only has to deal with controller specifics.
* Proper error handling. LPM config failure is attributed to the
device on the link and LPM is disabled for the link if it fails
repeatedly.
* ops->enable/disable_pm() are replaced with single ops->set_lpm()
which takes @policy and @hints. This simplifies driver specific
implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata: clean up lpm related symbols and sysfs show/store functions
Link power management related symbols are in confusing state w/ mixed
usages of lpm, ipm and pm. This patch cleans up lpm related symbols
and sysfs show/store functions as follows.
* lpm states - NOT_AVAILABLE, MIN_POWER, MAX_PERFORMANCE and
MEDIUM_POWER are renamed to ATA_LPM_UNKNOWN and
ATA_LPM_{MIN|MAX|MED}_POWER.
* Pre/postfixes are unified to lpm.
* sysfs show/store functions for link_power_management_policy were
curiously named get/put and unnecessarily complex. Renamed to
show/store and simplified.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Luck, Tony [Mon, 23 Aug 2010 20:18:02 +0000 (13:18 -0700)]
[libata] Fix section mismatch: ata_sff_exit
This build error showed up in linux-next tag next-20100820 for ia64:
WARNING: vmlinux.o(.init.text+0x4a952): Section mismatch in reference from the function ata_init() to the function .exit.text:ata_sff_exit()
The function __init ata_init() references
a function __exit ata_sff_exit().
This is often seen when error handling in the init function
uses functionality in the exit path.
The fix is often to remove the __exit annotation of
ata_sff_exit() so it may be used outside an exit section.
Sure enough, dropping the __exit fixes the problem.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Grant Grundler [Tue, 17 Aug 2010 17:56:53 +0000 (10:56 -0700)]
[libata] support for > 512 byte sectors (e.g. 4K Native)
This change enables my x86 machine to recognize and talk to a
"Native 4K" SATA device.
When I started working on this, I didn't know Matthew Wilcox had
posted a similar patch 2 years ago:
http://git.kernel.org/?p=linux/kernel/git/willy/ata.git;a=shortlog;h=refs/heads/ata-large-sectors
Gwendal Grignou pointed me at the above code and small portions of
this patch include Matthew's work. That's why Matthew is first on the
"Signed-off-by:". I've NOT included his use of a bitmap to determine
512 vs Native for ATA command block size - just used a simple table.
And bugs are almost certainly mine.
Lastly, the patch has been tested with a native 4K 'Engineering
Sample' drive provided by Hitachi GST.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Grant Grundler <grundler@google.com>
Reviewed-by: Gwendal Grignou <gwendal@google.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Tejun Heo [Fri, 25 Jun 2010 13:03:34 +0000 (15:03 +0200)]
libata: always use ata_qc_complete_multiple() for NCQ command completions
Currently, sata_fsl, mv and nv call ata_qc_complete() multiple times
from their interrupt handlers to indicate completion of NCQ commands.
This limits the visibility the libata core layer has into how commands
are being executed and completed, which is necessary to support IRQ
expecting in a generic way. libata already has an interface to complete
multiple commands at once - ata_qc_complete_multiple() which ahci and
sata_sil24 already use.
This patch updates the three drivers to use ata_qc_complete_multiple()
too and updates comments on ata_qc_complete[_multiple]() regarding
their usages with NCQ completions. This change not only provides
better visibility into command execution to the core layer but also
simplifies low level drivers.
* sata_fsl: It already builds done_mask. Conversion is straight
forward.
* sata_mv: mv_process_crpb_response() no longer checks for illegal
completions, it just returns whether the tag is completed or not.
mv_process_crpb_entries() builds done_mask from it and passes it to
ata_qc_complete_multiple() which will check for illegal completions.
* sata_nv adma: Similar to sata_mv. nv_adma_check_cpb() now just
returns the tag status and nv_adma_interrupt() builds done_mask from
it and passes it to ata_qc_complete_multiple().
* sata_nv swncq: It already builds done_mask. Drop unnecessary
illegal transition checks and call ata_qc_complete_multiple().
In the long run, it might be a good idea to make ata_qc_complete()
whine if called when multiple NCQ commands are in flight.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ashish Kalra <ashish.kalra@freescale.com>
Cc: Saeed Bishara <saeed@marvell.com>
Cc: Mark Lord <liml@rtr.ca>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
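The pattern described in this commit, where the driver only reports which tags are done and a single core routine validates and completes them, can be modeled in userspace as follows. Everything here is hypothetical (port_model, build_done_mask(), complete_multiple(), the hw_status array); in particular, the signature of complete_multiple() is an assumption of this sketch and not the libata ata_qc_complete_multiple() interface.

```c
#include <stdint.h>

#define MODEL_MAX_TAGS 32

/* Toy port state kept by the "core layer". */
struct port_model {
    uint32_t active_mask;           /* tags currently in flight */
    int completed[MODEL_MAX_TAGS];  /* set to 1 once a tag is completed */
};

/* Driver side: turn per-tag hardware status into one bitmask. */
static uint32_t build_done_mask(const uint8_t *hw_status)
{
    uint32_t mask = 0;

    for (unsigned int tag = 0; tag < MODEL_MAX_TAGS; tag++) {
        if (hw_status[tag] != 0)
            mask |= (uint32_t)1 << tag;
    }
    return mask;
}

/*
 * Core side: complete every tag in done_mask in one call.
 * Returns -1 without completing anything if the mask names a tag that
 * is not in flight (an illegal completion), 0 otherwise.
 */
static int complete_multiple(struct port_model *ap, uint32_t done_mask)
{
    if (done_mask & ~ap->active_mask)
        return -1;

    for (unsigned int tag = 0; tag < MODEL_MAX_TAGS; tag++) {
        if (done_mask & ((uint32_t)1 << tag))
            ap->completed[tag] = 1;
    }
    ap->active_mask &= ~done_mask;
    return 0;
}
```

Centralizing the illegal-completion check in one routine is what lets the individual drivers drop their own checks and simply report tag status.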
Gwendal Grignou [Tue, 25 May 2010 19:31:38 +0000 (12:31 -0700)]
[libata] Add ATA transport class
This is a skeleton for libata transport class.
All information is read only, exporting information from libata:
- ata_port class: one per ATA port
- ata_link class: one per ATA port or 15 for SATA Port Multiplier
- ata_device class: up to 2 for PATA link, usually one for SATA.
Signed-off-by: Gwendal Grignou <gwendal@google.com>
Reviewed-by: Grant Grundler <grundler@google.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Linus Torvalds [Thu, 21 Oct 2010 23:42:32 +0000 (16:42 -0700)]
Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (278 commits)
arm: remove machine_desc.io_pg_offst and .phys_io
arm: use addruart macro to establish debug mappings
arm: return both physical and virtual addresses from addruart
arm/debug: consolidate addruart macros for CONFIG_DEBUG_ICEDCC
ARM: make struct machine_desc definition coherent with its comment
eukrea_mbimxsd-baseboard: Pass the correct GPIO to gpio_free
cpuimx27: fix compile when ULPI is selected
mach-pcm037_eet: fix compile errors
Fixing ethernet driver compilation error for i.MX31 ADS board
cpuimx51: update board support
mx5: add cpuimx51sd module and its baseboard
iomux-mx51: fix GPIO_1_xx 's IOMUX configuration
imx-esdhc: update devices registration
mx51: add resources for SD/MMC on i.MX51
iomux-mx51: fix SD1 and SD2's iomux configuration
clock-mx51: rename CLOCK1 to CLOCK_CCGR for better readability
clock-mx51: factorize clk_set_parent and clk_get_rate
eukrea_mbimxsd: add support for DVI displays
cpuimx25 & cpuimx35: fix OTG port registration in host mode
i.MX31 and i.MX35 : fix errate TLSbo65953 and ENGcm09472
...
* git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-irqflags:
Fix IRQ flag handling naming
MIPS: Add missing #inclusions of <linux/irq.h>
smc91x: Add missing #inclusion of <linux/irq.h>
Drop a couple of unnecessary asm/system.h inclusions
SH: Add missing consts to sys_execve() declaration
Blackfin: Rename IRQ flags handling functions
Blackfin: Add missing dep to asm/irqflags.h
Blackfin: Rename DES PC2() symbol to avoid collision
Blackfin: Split the BF532 BFIN_*_FIO_FLAG() functions to their own header
Blackfin: Split PLL code from mach-specific cdef headers
Linus Torvalds [Thu, 21 Oct 2010 21:37:00 +0000 (14:37 -0700)]
Merge branch 'next-spi' of git://git.secretlab.ca/git/linux-2.6
* 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (53 commits)
spi/omap2_mcspi: Verify TX reg is empty after TX only xfer with DMA
spi/omap2_mcspi: disable channel after TX_ONLY transfer in PIO mode
spi/bfin_spi: namespace local structs
spi/bfin_spi: init early
spi/bfin_spi: check per-transfer bits_per_word
spi/bfin_spi: warn when CS is driven by hardware (CPHA=0)
spi/bfin_spi: cs should be always low when a new transfer begins
spi/bfin_spi: fix typo in comment
spi/bfin_spi: reject unsupported SPI modes
spi/bfin_spi: use dma_disable_irq_nosync() in irq handler
spi/bfin_spi: combine duplicate SPI_CTL read/write logic
spi/bfin_spi: reset ctl_reg bits when setup is run again on a device
spi/bfin_spi: push all size checks into the transfer function
spi/bfin_spi: use nosync when disabling the IRQ from the IRQ handler
spi/bfin_spi: sync hardware state before reprogramming everything
spi/bfin_spi: save/restore state when suspending/resuming
spi/bfin_spi: redo GPIO CS handling
Blackfin: SPI: expand SPI bitmasks
spi/bfin_spi: use the SPI namespaced bit names
spi/bfin_spi: drop extra memory we don't need
...
Linus Torvalds [Thu, 21 Oct 2010 21:11:46 +0000 (14:11 -0700)]
Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (96 commits)
apic, x86: Use BIOS settings for IBS and MCE threshold interrupt LVT offsets
apic, x86: Check if EILVT APIC registers are available (AMD only)
x86: ioapic: Call free_irte only if interrupt remapping enabled
arm: Use ARCH_IRQ_INIT_FLAGS
genirq, ARM: Fix boot on ARM platforms
genirq: Fix CONFIG_GENIRQ_NO_DEPRECATED=y build
x86: Switch sparse_irq allocations to GFP_KERNEL
genirq: Switch sparse_irq allocator to GFP_KERNEL
genirq: Make sparse_lock a mutex
x86: lguest: Use new irq allocator
genirq: Remove the now unused sparse irq leftovers
genirq: Sanitize dynamic irq handling
genirq: Remove arch_init_chip_data()
x86: xen: Sanitise sparse_irq handling
x86: Use sane enumeration
x86: uv: Clean up the direct access to irq_desc
x86: Make io_apic.c local functions static
genirq: Remove irq_2_iommu
x86: Speed up the irq_remapped check in hot pathes
intr_remap: Simplify the code further
...
Linus Torvalds [Thu, 21 Oct 2010 21:04:25 +0000 (14:04 -0700)]
Merge branch 'stable/swiotlb-0.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6
* 'stable/swiotlb-0.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
swiotlb: Use page alignment for early buffer allocation
swiotlb: make io_tlb_overflow static
Linus Torvalds [Thu, 21 Oct 2010 20:54:05 +0000 (13:54 -0700)]
Merge branch 'x86-x2apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-x2apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, x2apic: Simplify apic init in SMP and UP builds
x86, intr-remap: Remove IRTE setup duplicate code
x86, intr-remap: Set redirection hint in the IRTE
Linus Torvalds [Thu, 21 Oct 2010 20:53:24 +0000 (13:53 -0700)]
Merge branch 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-vmware-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, paravirt: Remove alloc_pmd_clone hook, only used by VMI
x86, vmware: Remove deprecated VMI kernel support
Fix up trivial #include conflict in arch/x86/kernel/smpboot.c
Linus Torvalds [Thu, 21 Oct 2010 20:51:41 +0000 (13:51 -0700)]
Merge branch 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mtrr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mtrr: Support mtrr lookup for range spanning across MTRR range
x86, mtrr: Refactor MTRR type overlap check code
Linus Torvalds [Thu, 21 Oct 2010 20:47:54 +0000 (13:47 -0700)]
Merge branch 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mrst-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: sfi: Make local functions static
x86, earlyprintk: Add hsu early console for Intel Medfield platform
x86, earlyprintk: Add earlyprintk for Intel Moorestown platform
x86: Add two helper macros for fixed address mapping
x86, mrst: A function in a header file needs to be marked "inline"
Linus Torvalds [Thu, 21 Oct 2010 20:47:29 +0000 (13:47 -0700)]
Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-32, percpu: Correct the ordering of the percpu readmostly section
x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64G
x86: Spread tlb flush vector between nodes
percpu: Introduce a read-mostly percpu API
x86, mm: Fix incorrect data type in vmalloc_sync_all()
x86, mm: Hold mm->page_table_lock while doing vmalloc_sync
x86, mm: Fix bogus whitespace in sync_global_pgds()
x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation
x86, mm: Add RESERVE_BRK_ARRAY() helper
mm, x86: Saving vmcore with non-lazy freeing of vmas
x86, kdump: Change copy_oldmem_page() to use cached addressing
x86, mm: fix uninitialized addr in kernel_physical_mapping_init()
x86, kmemcheck: Remove double test
x86, mm: Make spurious_fault check explicitly check the PRESENT bit
x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes
x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions
x86, mm: Avoid unnecessary TLB flush
Linus Torvalds [Thu, 21 Oct 2010 20:45:38 +0000 (13:45 -0700)]
Merge branch 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, hotplug: In the MWAIT case of play_dead, CLFLUSH the cache line
x86, hotplug: Move WBINVD back outside the play_dead loop
x86, hotplug: Use mwait to offline a processor, fix the legacy case
x86, mwait: Move mwait constants to a common header file
Linus Torvalds [Thu, 21 Oct 2010 20:20:32 +0000 (13:20 -0700)]
Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Remove pr_<level> uses of KERN_<level>
therm_throt.c: Trivial printk message fix for a unsuitable abbreviation of 'thermal'
x86: Use {push,pop}{l,q}_cfi in more places
i386: Add unwind directives to syscall ptregs stubs
x86-64: Use symbolics instead of raw numbers in entry_64.S
x86-64: Adjust frame type at paranoid_exit:
x86-64: Fix unwind annotations in syscall stubs