]> git.proxmox.com Git - mirror_zfs.git/log
mirror_zfs.git
7 years agoFix lookup_bdev() on Ubuntu
Hajo Möller [Wed, 26 Oct 2016 17:30:43 +0000 (19:30 +0200)]
Fix lookup_bdev() on Ubuntu

Ubuntu added support for checking inode permissions to lookup_bdev() in kernel
commit 193fb6a2c94fab8eb8ce70a5da4d21c7d4023bee (merged in 4.4.0-6.21).
Upstream bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1636517

This patch adds a test for Ubuntu's variant of lookup_bdev() to configure and
calls the function in the correct way.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Hajo Möller <dasjoe@gmail.com>
Closes #5336

7 years agoDisable zio_dva_throttle_enabled by default
Brian Behlendorf [Wed, 26 Oct 2016 16:13:43 +0000 (09:13 -0700)]
Disable zio_dva_throttle_enabled by default

Until it can be determined definitively that a performance
regression wasn't introduced accidentally by 3dfb57a this
functionality is being disabled by default.  It can be re-
enabled by setting zio_dva_throttle_enabled=1.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5335
Issue #5289

7 years agoAllow for '-o feature@<feature>=disabled' on the command line
LOLi [Tue, 25 Oct 2016 23:17:47 +0000 (01:17 +0200)]
Allow for '-o feature@<feature>=disabled' on the command line

Sometimes it is desirable to specifically disable one or several
features directly on the 'zpool create' command line.

$ zpool create -o feature@<feature>=disabled ...

Original-patch-by: Turbo Fredriksson <turbo@bayour.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #3460
Closes #5142
Closes #5324

7 years agoDo not upgrade userobj accounting for snapshot dataset
jxiong [Tue, 25 Oct 2016 20:21:05 +0000 (04:21 +0800)]
Do not upgrade userobj accounting for snapshot dataset

'zfs recv' could disown a living objset without calling
dmu_objset_disown(). This will cause the problem that the objset
would be released while the upgrading thread is still running.

This patch avoids the problem by checking if a dataset is a snapshot
before calling dmu_objset_userobjspace_upgrade().  Snapshots
are immutable and therefore it doesn't make sense to update them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Closes #5295
Closes #5328

7 years agoFix statechange-led.sh & unnecessary libdevmapper warning
Tony Hutter [Tue, 25 Oct 2016 18:05:30 +0000 (11:05 -0700)]
Fix statechange-led.sh & unnecessary libdevmapper warning

- Fix autoreplace behaviour on statechange-led.sh script.

ZED sends the following events on an auto-replace:

1. statechange: Disk goes UNAVAIL->ONLINE
2. statechange: Disk goes ONLINE->UNAVAIL
3. vdev_attach: Disk goes ONLINE

Events 1-2 happen when ZED first attempts to do an auto-online.  When that
fails, ZED then tries an auto-replace, generating the vdev_attach event in #3.

In the previous code, statechange-led was only looking at the UNAVAIL->ONLINE
transition to turn off the LED.  It ignored the #2 ONLINE->UNAVAIL transition,
assuming it was just the "old" VDEV going offline.  This is problematic, as
a drive can go from ONLINE->UNAVAIL when it's malfunctioning, and we don't want
to ignore that.

This new patch correctly turns on the fault LED every time a drive becomes
UNAVAIL.  It also monitors vdev_attach events to trigger turning off the LED
when an auto-replaced disk comes online.

- Remove unnecessary libdevmapper warning with --with-config=kernel

This fixes an unnecessary libdevmapper warning when building
--with-config=kernel.  Kernel code does not use libdevmapper, so the warning
is not needed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #2375
Closes #5312
Closes #5331

7 years agoicp: mark asm files with noexec stack
Jason Zaman [Tue, 25 Oct 2016 17:44:09 +0000 (01:44 +0800)]
icp: mark asm files with noexec stack

Similar to commit a3600a106.  Asm files need an explicit note
that they do not require an executable stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jason Zaman <jason@perfinion.com>
Closes #5332

7 years agoFix cred leak in zpl_fallocate_common
tuxoko [Mon, 24 Oct 2016 23:41:56 +0000 (16:41 -0700)]
Fix cred leak in zpl_fallocate_common

This is caught by kmemleak when running compress_004_pos

Reviewed-by: Tim Chase <tim@chase2k.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Closes #5244
Closes #5330

7 years agoDisable zpool_upgrade_002_pos test case
Brian Behlendorf [Mon, 24 Oct 2016 23:39:47 +0000 (16:39 -0700)]
Disable zpool_upgrade_002_pos test case

This test case frequently triggers issue #4034.  There exists a
fix for this which is in the process of being upstreamed.  Until
that fix is available disable the test case.

Reviewed by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5329
Issue #4034

7 years agoFix coverity defects: CID 147511, 147513
cao [Mon, 24 Oct 2016 20:37:38 +0000 (04:37 +0800)]
Fix coverity defects: CID 147511, 147513

CID 147511: Type:Dereference before null check
CID 147513: Type:Dereference before null check

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5306

7 years agoFix taskq creation failure in vdev_open_children()
Brian Behlendorf [Mon, 24 Oct 2016 20:28:58 +0000 (13:28 -0700)]
Fix taskq creation failure in vdev_open_children()

When creating and destroying pools in tight loop it's possible to
exhaust the number of allowed threads on a system.  This results
in taskq_create() failling and a NULL dereference.

Resolve the issue by falling back to opening the vdevs all
synchronously.

Reviewed-by: Denys Rtveliashvili <denys@rtveliashvili.name>
Reviewed-by: Håkan Johansson <f96hajo@chalmers.se>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes zfsonlinux/spl#521
Closes #4637

7 years agoTurn on/off enclosure slot fault LED even when disk isn't present
Tony Hutter [Mon, 24 Oct 2016 17:45:59 +0000 (10:45 -0700)]
Turn on/off enclosure slot fault LED even when disk isn't present

Previously when a drive faulted, the statechange-led.sh script would lookup
the drive's LED sysfs entry in /sys/block/sd*/device/enclosure_device, and
turn it on.  During testing we noticed that if you pulled out a drive, or if
the drive was so badly broken that it no longer appeared to Linux, that the
/sys/block/sd* path would be removed, and the script could not lookup the
LED entry.

To fix this, this patch looks up the disks's more persistent
"/sys/class/enclosure/X:X:X:X/Slot N" LED sysfs path at pool import.  It then
passes that path to the statechange-led script to use, rather than having the
script look it up on the fly.  This allows the script to turn on/off the slot
LEDs even when the drive is missing.

Closes #5309
Closes #2375

7 years agoChange location of current symlink created by test-runner
Giuseppe Di Natale [Mon, 24 Oct 2016 17:24:10 +0000 (10:24 -0700)]
Change location of current symlink created by test-runner

test-runner should be creating the current symlink in the
directory above the output directory. In a previous commit,
the current symlink was placed in the current working
directory, which could be inaccessible. It is more likely
that the output directory is always accessible.

This is needed because without this there's no deterministic
way to get the path to ZFS Test Suite results until after the
test suite has started. This makes it difficult for buildbot to
follow the log file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5314

7 years agoFletcher4 algorithm implemented in pure NEON for Aarch64 / ARMv8 64 bits
Romain Dolbeau [Fri, 21 Oct 2016 17:55:49 +0000 (19:55 +0200)]
Fletcher4 algorithm implemented in pure NEON for Aarch64 / ARMv8 64 bits

This is not useful on micro-architecture with a weak NEON
implementation (only 64 bits); the native version is slower &
the byteswap barely faster than scalar.  On A53 or A57, it's
a small improvement on scalar but OK for byteswap.

Results from an A53 system:
0 0 0x01 -1 0 1499068294333000 1499101101878000
implementation   native         byteswap
scalar           1008227510     755880264
aarch64_neon     1198098720     1044818671
fastest          aarch64_neon   aarch64_neon

Results from a A57 system:
0 0 0x01 -1 0 4407214734807033 4407233933777404
implementation   native         byteswap
scalar           2302071241     1124873346
aarch64_neon     2542214946     2245570352
fastest          aarch64_neon   aarch64_neon

Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@atos.net>
Closes #5248

7 years agoFix userquota_compare() function
Brian Behlendorf [Fri, 21 Oct 2016 15:23:27 +0000 (08:23 -0700)]
Fix userquota_compare() function

The AVL tree compare function requires that either -1, 0, or 1 be
returned.  However the strcmp() function only guarantees that a
negative, zero, or positive value is returned.  Therefore, the
return value of strcmp() needs to be sanitized with AVL_ISIGN.

This was initially overlooked because the x86_64 implementation
of strcmp() happens to only returns the allowed values.  This
was observed on an aarch64 platform which behaves correctly but
differently as described above.

Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5311
Closes #5313

7 years agoFix coverity defects: CID 153459
luozhengzheng [Thu, 20 Oct 2016 18:54:02 +0000 (02:54 +0800)]
Fix coverity defects: CID 153459

CID 153459: Null pointer dereferences (FORWARD_NULL)
Accidentally introduced by #5159.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5310

7 years agoFix coverity defects: CID 147551, 147552
cao [Thu, 20 Oct 2016 18:49:50 +0000 (02:49 +0800)]
Fix coverity defects: CID 147551, 147552

CID 147551: Type:dereference null return value
CID 147552: Type:dereference null return value

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5279

7 years agoFix coverity defects: CID 147472
cao [Thu, 20 Oct 2016 18:24:01 +0000 (02:24 +0800)]
Fix coverity defects: CID 147472

CID 147472: Type: 'Constant' variable guards dead code

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5288

7 years agoFix coverity defects: CID 150919, 150923
luozhengzheng [Thu, 20 Oct 2016 18:09:39 +0000 (02:09 +0800)]
Fix coverity defects: CID 150919, 150923

CID 150919: Buffer not null terminated (BUFFER_SIZE_WARNING)
CID 150923: Buffer not null terminated (BUFFER_SIZE_WARNING)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5298

7 years agoUpdate migration_004_pos, migration_005_pos, migration_006_pos
legend-hua [Thu, 20 Oct 2016 18:04:30 +0000 (02:04 +0800)]
Update migration_004_pos, migration_005_pos, migration_006_pos

Log function should be "log_fail", rather than "log_failED"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: legend-hua <liu.hua130@zte.com.cn>
Closes #5300

7 years agoFix make distclean Makefile.am removal
Brian Behlendorf [Thu, 20 Oct 2016 16:55:03 +0000 (09:55 -0700)]
Fix make distclean Makefile.am removal

The file tests/zfs-tests/tests/stress/Makefile.am gets mistakenly
removed by the distclean target because it's empty.  Adding a
`SUBDIRS =` line prevents the removal.

This directory is being preserved as the location to add assorted
stress tests.  These may include but are not limited to.

  http://kernel.ubuntu.com/~cking/stress-ng/
  https://github.com/zfsonlinux/zfsstress/

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5308

7 years agoLinux 4.9 compat: inode_change_ok() renamed setattr_prepare()
Brian Behlendorf [Tue, 18 Oct 2016 23:49:23 +0000 (23:49 +0000)]
Linux 4.9 compat: inode_change_ok() renamed setattr_prepare()

In torvalds/linux@31051c8 the inode_change_ok() function was
renamed setattr_prepare() and updated to take a dentry ratheri
than an inode.  Update the code to call the setattr_prepare()
and add a wrapper function which call inode_change_ok() for
older kernels.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
Requires-spl: refs/pull/581/head

7 years agoLinux 4.9 compat: remove iops->{set,get,remove}xattr
Chunwei Chen [Wed, 19 Oct 2016 18:19:17 +0000 (11:19 -0700)]
Linux 4.9 compat: remove iops->{set,get,remove}xattr

In Linux 4.9, torvalds/linux@fd50eca, iops->{set,get,remove}xattr and
generic_{set,get,remove}xattr are removed. xattr operations will directly
go through sb->s_xattr.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
7 years agoLinux 4.9 compat: iops->rename() wants flags
Chunwei Chen [Wed, 19 Oct 2016 18:19:01 +0000 (11:19 -0700)]
Linux 4.9 compat: iops->rename() wants flags

In Linux 4.9, torvalds/linux@2773bf0, iops->rename() and iops->rename2() are
merged together into iops->rename(), it now wants flags.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
7 years agoRemove dir inode operations from zpl_inode_operations
Chunwei Chen [Wed, 19 Oct 2016 18:12:20 +0000 (11:12 -0700)]
Remove dir inode operations from zpl_inode_operations

These operations are dir specific, there's no point putting them in
zpl_inode_operations which is for regular files.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>
7 years agoUpdate .gitignore
Brian Behlendorf [Wed, 19 Oct 2016 21:29:33 +0000 (14:29 -0700)]
Update .gitignore

Two additional files were recently introduced and should be
ignored by git.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5299

7 years agoMultipath autoreplace, control enclosure LEDs, event rate limiting
Tony Hutter [Wed, 19 Oct 2016 19:55:59 +0000 (12:55 -0700)]
Multipath autoreplace, control enclosure LEDs, event rate limiting

1. Enable multipath autoreplace support for FMA.

This extends FMA autoreplace to work with multipath disks.  This
requires libdevmapper to be installed at build time.

2. Turn on/off fault LEDs when VDEVs become degraded/faulted/online

Set ZED_USE_ENCLOSURE_LEDS=1 in zed.rc to have ZED turn on/off the enclosure
LED for a drive when a drive becomes FAULTED/DEGRADED.  Your enclosure must
be supported by the Linux SES driver for this to work.  The enclosure LED
scripts work for multipath devices as well.  The scripts will clear the LED
when the fault is cleared.

3. Rate limit ZIO delay and checksum events so as not to flood ZED

ZIO delay and checksum events are rate limited to 5/sec in the zfs module.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #2449
Closes #3017
Closes #5159

7 years agoFix coverity defects: CID 150926
luozhengzheng [Tue, 18 Oct 2016 18:32:59 +0000 (02:32 +0800)]
Fix coverity defects: CID 150926

CID 150926: Unchecked return value (CHECKED_RETURN)
- This case cannot occur given the existing taskq implementation
  and flags passed to task_dispatch().

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5272

7 years agoFix unused variable
Brian Behlendorf [Tue, 18 Oct 2016 17:44:44 +0000 (10:44 -0700)]
Fix unused variable

Accidentally introduced by 3dfb57a, when building with debugging
disabled several variables are unused.  Resolve this by wrapping
them in ASSERTV to remove them for non-debug builds.

Reviewed by: Don Brady <don.brady@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5284

7 years agoFix coverity defects: CID 147643, 152204, 49339
GeLiXin [Tue, 18 Oct 2016 17:43:22 +0000 (01:43 +0800)]
Fix coverity defects: CID 147643, 152204, 49339

CID 147643: Type: String not null terminated
- make sure that the string is null terminated before strlen
  and fprintf.

CID 152204: Type: Copy into fixed size buffer
- since strlcpy isn't availabe here, use strncpy and terminate
  the string manually.

CID 49339: Type: Buffer not null terminated
- since strlcpy isn't availabe here, terminate the string
  manually before fprintf.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes #5283

7 years agoFix coverity defects: CID 49339, 153393
cao [Tue, 18 Oct 2016 17:31:57 +0000 (01:31 +0800)]
Fix coverity defects: CID 49339, 153393

CID 49339: Type:Buffer not null terminated
CID 153393: Type:Buffer not null terminated

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: <cao.xuewen cao.xuewen@zte.com.cn>
Closes #5296

7 years agoCreate a symlink to current test-runner output
Giuseppe Di Natale [Tue, 18 Oct 2016 17:19:28 +0000 (10:19 -0700)]
Create a symlink to current test-runner output

Generate a symlink in the current working directory to
test-runner.py output. This will make it easier for the
ZFS buildbot to collect logs.

Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Closes #5293

7 years agoFix coverity defects: CID 150924
luozhengzheng [Mon, 17 Oct 2016 19:03:52 +0000 (03:03 +0800)]
Fix coverity defects: CID 150924

CID 150924: Unchecked return value (CHECKED_RETURN)
- On taskq_dispatch failure the reference must be dropped and
  this entry can be safely skipped.  This case should be impossible
  in the existing implementation but should be handled regardless.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5278

7 years agoProperly use the Dracut cleanup hook to order pool shutdown
Rudd-O [Mon, 17 Oct 2016 18:51:15 +0000 (18:51 +0000)]
Properly use the Dracut cleanup hook to order pool shutdown

When Dracut starts up, it needs to determine whether a pool will remain
"hanging open" before the system shuts off. In such a case, then the
code to clean up the pool (using the previous export -F work) must
be invoked. Since Dracut has had a recent change that makes
mount-zfs.sh simply not run when the root dataset is already mounted,
we must use the cleanup hook to order Dracut to do shutdown cleanup.

Important note: this code will not accomplish its stated goal until this
bug is fixed: https://bugzilla.redhat.com/show_bug.cgi?id=1385432

That bug impacts more than just ZFS. It impacts LUKS, dmraid, and
unmount during poweroff. It is a Fedora-wide bug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Closes #5287

7 years agoPass status_cbdata_t to print_status_config() and friends
Håkan Johansson [Mon, 17 Oct 2016 18:46:35 +0000 (20:46 +0200)]
Pass status_cbdata_t to print_status_config() and friends

First rename spare_cbdata_t cb -> spare_cb in print_status_config(),
to free up cb.

Using the structure removes the explicit parameters namewidth
and name_flags from several functions.  Also use status_cbdata_t
for print_import_config().  This simplifies print_logs().

Remove the parameter 'verbose' for print_logs().  It does not really
mean verbose, it selected between the print_status_config and
print_import_config() paths.  This selection is now done by
cb_print_config of spare_cbdata_t.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Håkan Johansson <f96hajo@chalmers.se>
Closes #5259

7 years agoUse -F to export pools so as not to dirty up device labels
Rudd-O [Sun, 16 Oct 2016 03:30:53 +0000 (03:30 +0000)]
Use -F to export pools so as not to dirty up device labels

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Closes #5228
Closes #5238

7 years agoAllow partition aliases in vdev_id.conf (#5266)
Brian Behlendorf [Fri, 14 Oct 2016 23:11:16 +0000 (16:11 -0700)]
Allow partition aliases in vdev_id.conf (#5266)

When pools are assembled from partitions, vdev_id.conf aliases
do not work.  The directory /dev/disk/by-vdev is not created because
the associated udev rule for parsing vdev_id.conf is never called.
Extend to logic to match "disk" and "partition".

Patch-proposed-by: @sparksh
Reviewed-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3859
Closes #5266

7 years agoFix coverity defects: CID 147488, 147490
cao [Fri, 14 Oct 2016 18:00:47 +0000 (02:00 +0800)]
Fix coverity defects: CID 147488, 147490

CID 147488, Type:explicit null dereferenced
CID 147490, Type:dereference null return value

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5237

7 years agoOpenZFS 6877 - zfs_rename_006_pos fails due to missing zvol snapshot device file
Akash Ayare [Wed, 20 Apr 2016 04:07:54 +0000 (21:07 -0700)]
OpenZFS 6877 - zfs_rename_006_pos fails due to missing zvol snapshot device file

Authored by: Akash Ayare <aayare@delphix.com>
Reviewed by: John Kennedy <john.kennedy@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Reviewed-by: yuxiang <guo.yong33@zte.com.cn>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Bug was caused due to a change in functionality. At some point, ZFS
snapshots no longer created associated device files which were being
used in the test. To resolve this issue, a clone of the snapshot can be
produced which will also create the expected device files; then, the
test will behave as it did historically.

OpenZFS-issue: https://www.illumos.org/issues/6877
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2200f27
Closes #5275

Porting Notes:
- Hardcoded /dev/zvol/rdsk changed to $ZVOL_RDEVDIR for compatibility.
- Enabled in linux runfile.

7 years agoEnable zfs_rename_002_pos, zfs_rename_005_neg, zfs_rename_007_pos
Brian Behlendorf [Thu, 13 Oct 2016 23:00:26 +0000 (16:00 -0700)]
Enable zfs_rename_002_pos, zfs_rename_005_neg, zfs_rename_007_pos

These tests all pass once updated to wait for udev to create the
expected linked under /dev/zvol/.

Reviewed-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Reviewed-by: yuxiang <guo.yong33@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5275

7 years agoFix coverity defects: CID 150921, 150927
luozhengzheng [Fri, 14 Oct 2016 16:40:08 +0000 (00:40 +0800)]
Fix coverity defects: CID 150921, 150927

CID 150921: Unchecked return value (CHECKED_RETURN)
CID 150927 : Unchecked return value (CHECKED_RETURN)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5267

7 years agoEnable quota_002_pos, quota_004_pos and quota_005_pos
liaoyuxiangqin [Fri, 14 Oct 2016 16:33:51 +0000 (00:33 +0800)]
Enable quota_002_pos, quota_004_pos and quota_005_pos

In this test the 'ls -ls' command was used to print testfile size in
blocks.  Because the environment variable BLOCK_SIZE was set
the 'ls -ls' command detected this and output its block count as the
number of 8192 blocks.  Rather than change the variable name
the -k was was added to force ls to return 1k blocks.  This has the
additional advantage of behaving consistently across platforms.

For additional details on GNU 'ls' behavior regarding block size:

https://www.gnu.org/software/coreutils/manual/html_node/Block-size.html

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes #5269

7 years agoEnable zfs_receive_011_pos
Brian Behlendorf [Fri, 14 Oct 2016 16:17:56 +0000 (09:17 -0700)]
Enable zfs_receive_011_pos

The zfs_receive_011_pos test can be enabled now that OpenZFS 6562
has been merged.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5276

7 years agoOpenZFS 7090 - zfs should throttle allocations
Don Brady [Fri, 14 Oct 2016 00:59:18 +0000 (18:59 -0600)]
OpenZFS 7090 - zfs should throttle allocations

OpenZFS 7090 - zfs should throttle allocations

Authored by: George Wilson <george.wilson@delphix.com>
Reviewed by: Alex Reece <alex@delphix.com>
Reviewed by: Christopher Siden <christopher.siden@delphix.com>
Reviewed by: Dan Kimmel <dan.kimmel@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <paul.dagnelie@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Sebastien Roy <sebastien.roy@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Ported-by: Don Brady <don.brady@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
When write I/Os are issued, they are issued in block order but the ZIO
pipeline will drive them asynchronously through the allocation stage
which can result in blocks being allocated out-of-order. It would be
nice to preserve as much of the logical order as possible.

In addition, the allocations are equally scattered across all top-level
VDEVs but not all top-level VDEVs are created equally. The pipeline
should be able to detect devices that are more capable of handling
allocations and should allocate more blocks to those devices. This
allows for dynamic allocation distribution when devices are imbalanced
as fuller devices will tend to be slower than empty devices.

The change includes a new pool-wide allocation queue which would
throttle and order allocations in the ZIO pipeline. The queue would be
ordered by issued time and offset and would provide an initial amount of
allocation of work to each top-level vdev. The allocation logic utilizes
a reservation system to reserve allocations that will be performed by
the allocator. Once an allocation is successfully completed it's
scheduled on a given top-level vdev. Each top-level vdev maintains a
maximum number of allocations that it can handle (mg_alloc_queue_depth).
The pool-wide reserved allocations (top-levels * mg_alloc_queue_depth)
are distributed across the top-level vdevs metaslab groups and round
robin across all eligible metaslab groups to distribute the work. As
top-levels complete their work, they receive additional work from the
pool-wide allocation queue until the allocation queue is emptied.

OpenZFS-issue: https://www.illumos.org/issues/7090
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4756c3d7
Closes #5258

Porting Notes:
- Maintained minimal stack in zio_done
- Preserve linux-specific io sizes in zio_write_compress
- Added module params and documentation
- Updated to use optimize AVL cmp macros

7 years agoFix coverity defects: CID 147692, 147693, 147694
cao [Thu, 13 Oct 2016 21:38:59 +0000 (05:38 +0800)]
Fix coverity defects: CID 147692, 147693, 147694

CID:147692, Type:Uninitialized scalar variable
CID:147693, Type:Uninitialized scalar variable
CID:147694, Type:Uninitialized scalar variable

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5252

7 years agoFix coverity defects: CID 150943, 150938
cao [Thu, 13 Oct 2016 21:30:50 +0000 (05:30 +0800)]
Fix coverity defects: CID 150943, 150938

CID:150943, Type:Unintentional integer overflow
CID:150938, Type:Explicit null dereferenced

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5255

7 years agoFix coverity defects: CID 147571, 147574
luozhengzheng [Thu, 13 Oct 2016 21:25:05 +0000 (05:25 +0800)]
Fix coverity defects: CID 147571, 147574

CID 147571: Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
CID 147574: Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5268

7 years agoEnable refquota_002_pos and refquota_004_pos
liaoyuxiangqin [Thu, 13 Oct 2016 21:21:15 +0000 (05:21 +0800)]
Enable refquota_002_pos and refquota_004_pos

The refquota_002_pos and refquota_004_pos test cases can pass
without modification.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes #5273

7 years agoFix coverity defects: CID 147654, 147690
GeLiXin [Thu, 13 Oct 2016 21:02:07 +0000 (05:02 +0800)]
Fix coverity defects: CID 147654, 147690

coverity scan CID:147654,type: Copy into fixed size buffer
- string operation may write past the end of the fixed-size
  destination buffer

coverity scan CID:147690,type: Uninitialized scalar variable
- call zfs_prop_get first in case we use sourcetype and
  share_sourcetype without initialization

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes #5253

7 years agoFix coverity defects: CID 153394
luozhengzheng [Wed, 12 Oct 2016 20:24:03 +0000 (04:24 +0800)]
Fix coverity defects: CID 153394

coverity scan CID 153394, Type:String overflow

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5263

7 years agoFix ICP memleak introduced in #4760
Tom Caputi [Wed, 12 Oct 2016 19:52:30 +0000 (15:52 -0400)]
Fix ICP memleak introduced in #4760

The ICP requires destructors to for each crypto module that is added.
These do not necessarily exist in Illumos because they assume that
these modules can never be unloaded from the kernel. Some of this
cleanup code was missed when #4760 was merged, resulting in leaks.
This patch simply fixes that.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #4760
Closes #5265

7 years agoFix coverity defects: CID 147606, 147609
cao [Wed, 12 Oct 2016 18:16:47 +0000 (02:16 +0800)]
Fix coverity defects: CID 147606, 147609

coverity scan CID:147606, Type:resource leak
coverity scan CID:147609, Type:resource leak

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5245

7 years agoFix coverity defects: CID 147452, 147447, 147446
cao [Tue, 11 Oct 2016 18:32:34 +0000 (02:32 +0800)]
Fix coverity defects: CID 147452, 147447, 147446

coverity scan CID:147452, Type:Unchecked return value from library
coverity scan CID:147447, Type:Unchecked return value from library
coverity scan CID:147446, Type:Unchecked return value from library

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5264

7 years agoFix memory leak in recv_skip
luozhengzheng [Tue, 11 Oct 2016 17:24:18 +0000 (01:24 +0800)]
Fix memory leak in recv_skip

When the exception branch exits, the buf is leaked.

Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5262

7 years agoFix zfsctl_snapshot_{,un}mount() issues
Brian Behlendorf [Tue, 11 Oct 2016 16:56:28 +0000 (09:56 -0700)]
Fix zfsctl_snapshot_{,un}mount() issues

Fix use after free in zfsctl_snapshot_unmount(). Use /usr/bin/env
instead of /bin/sh to fix a shell code injection flaw and allow use
with grsecurity.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Stian Ellingsen <stian@plaimi.net>
Closes #5250
Closes #4377

7 years agoEnable zfs_snapshot_008_neg and zfs_snapshot_009_pos (#5260)
Brian Behlendorf [Tue, 11 Oct 2016 16:32:31 +0000 (09:32 -0700)]
Enable zfs_snapshot_008_neg and zfs_snapshot_009_pos (#5260)

The zfs_snapshot_008_neg test case does not use nested pools and
can be safely enabled.  The zfs_snapshot_009_pos test case is
also passing without modification.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5260

7 years agoEnable reservation_012_pos, reservation_015_pos and reservation_016_pos
liaoyuxiangqin [Tue, 11 Oct 2016 16:28:49 +0000 (00:28 +0800)]
Enable reservation_012_pos, reservation_015_pos and reservation_016_pos

Enable reservation_012_pos, reservation_015_pos and reservation_016_pos
test cases which are passing.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: yuxiang <guo.yong33@zte.com.cn>
Closes #5254

7 years agoFix coverity defects: CID 147639
GeLiXin [Mon, 10 Oct 2016 22:30:22 +0000 (06:30 +0800)]
Fix coverity defects: CID 147639

When array is passed as a parameter it degenerates into a
pointer so the sizeof(path) in is_shorthand_path() and always
get return value of 8, instead of the string length we want.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes #5198

7 years agoWrite issue taskq shouldn't be dynamic
Tim Chase [Mon, 10 Oct 2016 22:19:14 +0000 (17:19 -0500)]
Write issue taskq shouldn't be dynamic

This is as much an upstream compatibility as it's a bit of a performance
gain.

The illumos taskq implemention doesn't allow a TASKQ_THREADS_CPU_PCT type
to be dynamic and in fact enforces as much with an ASSERT.

As to performance, if this taskq is dynamic, it can cause excessive
contention on tq_lock as the threads are created and destroyed because it
can see bursts of many thousands of tasks in a short time, particularly
in heavy high-concurrency zvol write workloads.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Chase <tim@chase2k.com>
Closes #5236

7 years agoPorting over some ICP code that was missed in #4760
Tom Caputi [Mon, 10 Oct 2016 18:34:57 +0000 (14:34 -0400)]
Porting over some ICP code that was missed in #4760

When #4760 was merged tests were added to ensure that the new checksums
were working properly. However, some of the functionality for sha2
functions were not ported over, resulting in some Coverity defects and
code that would be unstable when needed in the future. This patch
simply ports over the missing code and fixes the defects in the
process.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tom Caputi <tcaputi@datto.com>
Issue #4760
Closes #5251

7 years agoEnable readonly_001_pos
ChaoyuZhang [Mon, 10 Oct 2016 00:50:16 +0000 (08:50 +0800)]
Enable readonly_001_pos

Enable readonly_001_pos this test is now passing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: ChaoyuZhang <zhang.chaoyu@zte.com.cn>
7 years agoFix file permissions
Brian Behlendorf [Sat, 8 Oct 2016 21:57:56 +0000 (14:57 -0700)]
Fix file permissions

The following new test cases need to have execute permissions set:

  userquota/groupspace_003_pos.ksh
  userquota/userquota_013_pos.ksh
  userquota/userspace_003_pos.ksh
  upgrade/upgrade_userobj_001_pos.ksh
  upgrade/setup.ksh
  upgrade/cleanup.ksh

The following source files accidentally were marked executable:

  lib/libzpool/kernel.c
  lib/libshare/nfs.c
  lib/libzfs/libzfs_dataset.c
  lib/libzfs/libzfs_util.c
  tests/zfs-tests/cmd/rm_lnkcnt_zero_file/rm_lnkcnt_zero_file.c
  tests/zfs-tests/cmd/dir_rd_update/dir_rd_update.c
  cmd/zed/zed_exec.c
  module/icp/core/kcf_sched.c
  module/zfs/dsl_pool.c
  module/zfs/arc.c
  module/nvpair/nvpair.c
  man/man5/zfs-module-parameters.5

Reviewed-by: GeLiXin <ge.lixin@zte.com.cn>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5241

7 years agoUse env, not sh in zfsctl_snapshot_{,un}mount()
Stian Ellingsen [Thu, 6 Oct 2016 18:03:41 +0000 (20:03 +0200)]
Use env, not sh in zfsctl_snapshot_{,un}mount()

Call mount and umount via /usr/bin/env instead of /bin/sh in
zfsctl_snapshot_mount() and zfsctl_snapshot_unmount().

This change fixes a shell code injection flaw.  The call to /bin/sh
passed the mountpoint unescaped, only surrounded by single quotes.  A
mountpoint containing one or more single quotes would cause the command
to fail or potentially execute arbitrary shell code.

This change also provides compatibility with grsecurity patches.
Grsecurity only allows call_usermodehelper() to use helper binaries in
certain paths.  /usr/bin/* is allowed, /bin/* is not.

7 years agoFix use after free in zfsctl_snapshot_unmount()
Stian Ellingsen [Thu, 6 Oct 2016 17:53:27 +0000 (19:53 +0200)]
Fix use after free in zfsctl_snapshot_unmount()

7 years agoRename hole_birth tunable to match OpenZFS
Brian Behlendorf [Sat, 8 Oct 2016 04:02:24 +0000 (21:02 -0700)]
Rename hole_birth tunable to match OpenZFS

OpenZFS decided that ignore_hole_birth was too imprecise and
incorrect a name (and went with send_holes_without_birth_time).
Rename it in ZoL too, while keeping the name "ignore_hole_birth"
pointing to the same variable for existing consumers.

Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #5239

7 years agoFix vdev_open_child() race on updating vdev_parent->vdev_nonrot
Håkan Johansson [Fri, 7 Oct 2016 20:25:35 +0000 (22:25 +0200)]
Fix vdev_open_child() race on updating vdev_parent->vdev_nonrot

Updating vd->vdev_parent->vdev_nonrot in vdev_open_child()
is a race when vdev_open_child is called for many children
from a task queue.

vdev_open_child() is only called by vdev_open_children(), let
the latter update the parent vdev_nonrot member.  The update
was already there, so done twice previously.  Thus using the
same logic at the end in vdev_open_children() to update
vdev_nonrot, either we are vdev_uses_zvols() or not.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Haakan T Johansson <f96hajo@chalmers.se>
Closes #5162

7 years agoFix coverity defects: CID 147565-147567
cao [Fri, 7 Oct 2016 20:19:43 +0000 (04:19 +0800)]
Fix coverity defects: CID 147565-147567

coverity scan CID:147567, Type:dereference null return value
coverity scan CID:147566, Type:dereference null return value
coverity scan CID:147565, Type:dereference null return value

Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5166

7 years agoFletcher4: Incremental updates and ctx calculation
Brian Behlendorf [Fri, 7 Oct 2016 19:44:12 +0000 (12:44 -0700)]
Fletcher4: Incremental updates and ctx calculation

Fixes ABI issues with fletcher4 code, adds support for
incremental updates, and adds ztest method for testing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #5164

7 years agoFix uninitialized variable snapprops_nvlist in zfs_receive_one
LOLi [Fri, 7 Oct 2016 17:05:06 +0000 (19:05 +0200)]
Fix uninitialized variable snapprops_nvlist in zfs_receive_one

The variable snapprops_nvlist was never initialized, so properties
were not applied to the received snapshot.

Additionally, add zfs_receive_013_pos.ksh script to ZFS test suite to exercise
'zfs receive' functionality for user properties.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: loli10K <ezomori.nozomu@gmail.com>
Closes #4338

7 years agoAdd python style checking
Brian Behlendorf [Fri, 7 Oct 2016 16:54:02 +0000 (09:54 -0700)]
Add python style checking

Introduce a make recipe for flake8 to enable python
style checking. Ensure all python scripts pass flake8.
Return an error code of 0 for arcstat.py -v and
dbufstat.py -v.  Add test cases for python scripts.

Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ian Lee <IanLee1521@gmail.com>
Closes #5230

7 years agoOpenZFS 6988 spa_sync() spends half its time in dmu_objset_do_userquota_updates
Jinshan Xiong [Wed, 21 Sep 2016 20:49:47 +0000 (13:49 -0700)]
OpenZFS 6988 spa_sync() spends half its time in dmu_objset_do_userquota_updates

Using a benchmark which creates 2 million files in one TXG, I observe
that the thread running spa_sync() is on CPU almost the entire time we
are syncing, and therefore can be a performance bottleneck. About 50% of
the time in spa_sync() is in dmu_objset_do_userquota_updates().

The problem is that dmu_objset_do_userquota_updates() calls
zap_increment_int(DMU_USERUSED_OBJECT) once for every file that was
modified (or created). In this benchmark, all the files are owned by the
same user/group, so all 2 million calls to zap_increment_int() are
modifying the same entry in the zap. The same issue exists for the
DMU_GROUPUSED_OBJECT.

We should keep an in-memory map from user to space delta while we are
syncing, and when we finish, iterate over the in-memory map and modify
the ZAP once per entry. This reduces the number of calls to
zap_increment_int() from "number of objects modified" to "number of
owners/groups of modified files".

This reduced the time spent in spa_sync() in the file create benchmark
by ~33%, from 11 seconds to 7 seconds.

Upstream bugs: DLPX-44799
Ported by: Ned Bass <bass6@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/6988
ZFSonLinux-issue: https://github.com/zfsonlinux/zfs/issues/4642
OpenZFS-commit: unmerged

Porting notes:
- Added curly braces around declaration of userquota_cache_t cache to
  quiet compiler warning;
- Handled the userobj accounting the same way it proposed in this path.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
7 years agoAdd support for user/group dnode accounting & quota
Jinshan Xiong [Tue, 4 Oct 2016 18:46:10 +0000 (11:46 -0700)]
Add support for user/group dnode accounting & quota

This patch tracks dnode usage for each user/group in the
DMU_USER/GROUPUSED_OBJECT ZAPs. ZAP entries dedicated to dnode
accounting have the key prefixed with "obj-" followed by the UID/GID
in string format (as done for the block accounting).
A new SPA feature has been added for dnode accounting as well as
a new ZPL version. The SPA feature must be enabled in the pool
before upgrading the zfs filesystem. During the zfs version upgrade,
a "quotacheck" will be executed by marking all dnode as dirty.

ZoL-bug-id: https://github.com/zfsonlinux/zfs/issues/3500

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
7 years agoIntroduce tests for python scripts
Giuseppe Di Natale [Tue, 4 Oct 2016 22:13:42 +0000 (15:13 -0700)]
Introduce tests for python scripts

Implement tests to ensure that python scripts
that are distributed with ZFS continue to at
minimum run without errors. This will help prevent
accidental breaking of these scripts.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
7 years agoIntroduce a make recipe for flake8
Giuseppe Di Natale [Thu, 6 Oct 2016 17:50:15 +0000 (10:50 -0700)]
Introduce a make recipe for flake8

Add a make recipe to enable developers
to easily run flake8 if it is available.
This will help enforce good python coding
standards.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
7 years agoCorrect exit code for dbufstat -v and arcstat -v
Giuseppe Di Natale [Wed, 5 Oct 2016 15:41:26 +0000 (08:41 -0700)]
Correct exit code for dbufstat -v and arcstat -v

Both scripts were returning an error code of 1
when using the -v argument. -v should exit with
an error code of 0.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
7 years agoAdd icp kernel module to dkms build
Marcel Huber [Thu, 6 Oct 2016 17:31:42 +0000 (19:31 +0200)]
Add icp kernel module to dkms build

Added new section to build icp module.

Reviewed by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5232
Closes #5234

7 years agoUse a different technique to detect whether to mount-zfs
Rudd-O [Thu, 6 Oct 2016 17:26:47 +0000 (17:26 +0000)]
Use a different technique to detect whether to mount-zfs

The behavior of the Dracut module was very wrong before.

The correct behavior: initramfs should not run `zfs-mount` to completion
if the two generator files exist.  If, however, one of them is missing,
it indicates one of three cases:

* The kernel command line did not specify a root ZFS file system, and
  another Dracut module is already handling root mount (via systemd).
  `mount-zfs` can run, but it will do nothing.
* There is no systemd to run `sysroot.mount` to begin with.
  `mount-zfs` must run.
* The root parameter is zfs:AUTO, which cannot be run in sysroot.mount.
  `mount-zfs` must run.

In any of these three cases, it is safe to run `zfs-mount` to completion.

`zfs-mount` must also delete itself if it determines it should not run,
or else Dracut will do the insane thing of running it over and over again.
Literally, the definition of insanity, doing the same thing that did not
work before, expecting different results.  Doing that may have had a great
result before, when we had a race between devices appearing and pools
being mounted, and `mount-zfs` was tasked with the full responsibility
of importing the needed pool, but nowadays it is wrong behavior and
should be suppressed.

I deduced that self-deletion was the correct thing to do by looking at
other Dracut code, because (as we all are very fully aware of) Dracut
is entirely, ahem, "implementation-defined".

Tested-by: @wphilips
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Manuel Amador (Rudd-O) <rudd-o@rudd-o.com>
Closes #5157
Closes #5204

7 years agoCorrect style in test-runner
Giuseppe Di Natale [Thu, 6 Oct 2016 17:11:06 +0000 (10:11 -0700)]
Correct style in test-runner

Correct test-runner.py so it passes flake8
python style checking.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
7 years agoCorrect style in arcstat and arc_summary
Giuseppe Di Natale [Thu, 6 Oct 2016 17:04:54 +0000 (10:04 -0700)]
Correct style in arcstat and arc_summary

Fix arcstat and arc_summary so they pass
flake8 python code style checks.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
7 years agoRefactor updating of immutable/appendonly flags
lorddoskias [Wed, 5 Oct 2016 21:47:29 +0000 (00:47 +0300)]
Refactor updating of immutable/appendonly flags

Move the synchronization of inode/znode i_flgas/pflags into
the respective internal zfs function. This is mostly
mechanical work and shouldn't introduce any functional
changes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Issue #227
Closes #5223

7 years agoFletcher4: save/reload implementation context
Gvozden Neskovic [Sat, 24 Sep 2016 22:56:22 +0000 (00:56 +0200)]
Fletcher4: save/reload implementation context

Init, compute, and fini methods are changed to work on internal context object.
This is necessary because ABI does not guarantee that SIMD registers will be preserved
on function calls. This is technically the case in Linux kernel in between
`kfpu_begin()/kfpu_end()`, but it breaks user-space tests and some kernels that
don't require disabling preemption for using SIMD (osx).

Use scalar compute methods in-place for small buffers, and when the buffer size
does not meet SIMD size alignment.

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
7 years agoFletcher4: Incremental using SIMD
Gvozden Neskovic [Fri, 23 Sep 2016 01:52:29 +0000 (03:52 +0200)]
Fletcher4: Incremental using SIMD

Combine incrementally computed fletcher4 checksums. Checksums are combined
a posteriori, allowing for parallel computation on chunks to be implemented if
required. The algorithm is general, and does not add changes in each SIMD
implementation.
New test in ztest verifies incremental fletcher computations.

Checksum combining matrix for two buffers `a` and `b`, where `Ca` and `Cb` are
respective fletcher4 checksums, `Cab` is combined checksum, `s` is size of buffer
`b` (divided by sizeof(uint32_t)) is:

Cab[A] = Cb[A] + Ca[A]
Cab[B] = Cb[B] + Ca[B] + s * Ca[A]
Cab[C] = Cb[C] + Ca[C] + s * Ca[B] + s(s+1)/2 * Ca[A]
Cab[D] = Cb[D] + Ca[D] + s * Ca[C] + s(s+1)/2 * Ca[B] + s(s+1)(s+2)/6 * Ca[A]

NOTE: this calculation overflows for larger buffers. Thus, internally, the calculation
is performed on 8MiB chunks.

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
7 years agoFix coverity defects: CID 150953, 147603, 147610
luozhengzheng [Wed, 5 Oct 2016 01:15:57 +0000 (09:15 +0800)]
Fix coverity defects: CID 150953, 147603, 147610

coverity scan CID:150953,type: uninitialized scalar variable
coverity scan CID:147603,type: Resource leak
coverity scan CID:147610,type: Resource leak

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5209

7 years agoMerge OpenZFS 4185
Brian Behlendorf [Tue, 4 Oct 2016 18:20:38 +0000 (11:20 -0700)]
Merge OpenZFS 4185

OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R

Reviewed-by: Chunwei Chen <david.chen@osnexus.com>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: David Quigley <david.quigley@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #4760

7 years agozloop: check if core file is generated by zdb
Gvozden Neskovic [Mon, 3 Oct 2016 22:42:13 +0000 (00:42 +0200)]
zloop: check if core file is generated by zdb

Run `gdb` core file inspection with correct program.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Closes #5215

7 years agoUse 100MB pool for filetest_001_pos.ksh checksum test
Tony Hutter [Mon, 15 Aug 2016 23:34:02 +0000 (19:34 -0400)]
Use 100MB pool for filetest_001_pos.ksh checksum test

As part of its tests, filetest_001_pos.ksh wipes the entire vdev to
create checksum errors.  This patch uses the setup/cleanup scripts from
the scrub_mirror test to create a custom 100MB pool, rather than
using the entire device size that is passed into zfs-tests.sh
(which defaults to 2GB).  This speeds up the buildbot tests, and also
makes it possible for someone to use real disks (say, 1TB) without the
test taking an insanely long amount of time.

7 years agoOpenZFS 6585 - sha512, skein, and edonr have an unenforced dependency on extensible...
ilovezfs [Thu, 28 Jan 2016 12:51:19 +0000 (04:51 -0800)]
OpenZFS 6585 - sha512, skein, and edonr have an unenforced dependency on extensible dataset

Authored by: ilovezfs <ilovezfs@icloud.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported by: Tony Hutter <hutter2@llnl.gov>

In any pool without the extensible dataset feature flag already enabled,
creating a dataset with dedup set to use one of the new checksums would
result in the following panic as soon as any data was added:

panic[cpu0]/thread=ffffff0006761c40: feature_get_refcount(spa, feature,
&refcount) != 48 (0x30 != 0x30), file: ../../common/fs/zfs/zfeature.c
line 390

Inpsection showed that feature->fi_feature was 7, which is the value of
SPA_FEATURE_EXTENSIBLE_DATASET in the spa_feature enum.  This commit
adds extensible dataset as a dependency for the sha512, edonr, and skein
feature flags, which prevents the panic.

OpenZFS-issue: https://www.illumos.org/issues/6585
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/892586e8a147c02d7f4053cc405229a13e796928
Porting Notes:
This code was originally from Illumos, but I actually ported it from:
openzfsonosx/zfs@b62a652

7 years agoOpenZFS 6541 - Pool feature-flag check defeated if "verify" is included in the dedup...
ilovezfs [Tue, 26 Jan 2016 07:41:11 +0000 (23:41 -0800)]
OpenZFS 6541 - Pool feature-flag check defeated if "verify" is included in the dedup property value

Authored by: ilovezfs <ilovezfs@icloud.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Richard Laager <rlaager@wiktel.com>
Approved by: Robert Mustacchi <rm@joyent.com>
Ported-by: Tony Hutter <hutter2@llnl.gov>
zio_checksum_to_feature() expects a zio_checksum enum not a raw property
intval, so the new checksums weren't being detected when the
ZIO_CHECKSUM_VERIFY flag got in the way.

Given a pool without feature@sha512,

    zfs create -o dedup=sha512 naughty/fivetwelve_noverify_ds

would fail as expected since the raw intval would indeed be equal to
SPA_FEATURE_SHA512.

However,

    zfs create -o dedup=sha512,verify naughty/fivetwelve_verify_ds

would incorrectly succeed because ZIO_CHECKSUM_VERIFY would be in the
way, the raw intval would not be a member of the enum, and
zio_checksum_to_feature() would return SPA_FEATURE_NONE, with the result
that spa_feature_is_enabled() would never be called.

This was first detected with edonr, since in that case verify is
required.

This commit clears the ZIO_CHECKSUM_VERIFY flag before calling
zio_checksum_to_feature() using the ZIO_CHECKSUM_MASK and verifies in
zio_checksum_to_feature() that ZIO_CHECKSUM_MASK has been applied by the
caller to attempt to prevent the same bug from occurring again in the
future.

OpenZFS-issue: https://www.illumos.org/issues/6541
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/971640e6aa954c91b0706543741aa4570299f4d7

Porting notes:
This code was originally from Illumos, but I actually ported it from:
openzfsonosx/zfs@bef06e1

7 years agoOpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R
Tony Hutter [Wed, 15 Jun 2016 22:47:05 +0000 (15:47 -0700)]
OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-R

Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Saso Kiselkov <saso.kiselkov@nexenta.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Garrett D'Amore <garrett@damore.org>
Ported by: Tony Hutter <hutter2@llnl.gov>

OpenZFS-issue: https://www.illumos.org/issues/4185
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/45818ee

Porting Notes:
This code is ported on top of the Illumos Crypto Framework code:

    https://github.com/zfsonlinux/zfs/pull/4329/commits/b5e030c8dbb9cd393d313571dee4756fbba8c22d

The list of porting changes includes:

- Copied module/icp/include/sha2/sha2.h directly from illumos

- Removed from module/icp/algs/sha2/sha2.c:
#pragma inline(SHA256Init, SHA384Init, SHA512Init)

- Added 'ctx' to lib/libzfs/libzfs_sendrecv.c:zio_checksum_SHA256() since
  it now takes in an extra parameter.

- Added CTASSERT() to assert.h from for module/zfs/edonr_zfs.c

- Added skein & edonr to libicp/Makefile.am

- Added sha512.S.  It was generated from sha512-x86_64.pl in Illumos.

- Updated ztest.c with new fletcher_4_*() args; used NULL for new CTX argument.

- In icp/algs/edonr/edonr_byteorder.h, Removed the #if defined(__linux) section
  to not #include the non-existant endian.h.

- In skein_test.c, renane NULL to 0 in "no test vector" array entries to get
  around a compiler warning.

- Fixup test files:
- Rename <sys/varargs.h> -> <varargs.h>, <strings.h> -> <string.h>,
- Remove <note.h> and define NOTE() as NOP.
- Define u_longlong_t
- Rename "#!/usr/bin/ksh" -> "#!/bin/ksh -p"
- Rename NULL to 0 in "no test vector" array entries to get around a
  compiler warning.
- Remove "for isa in $($ISAINFO); do" stuff
- Add/update Makefiles
- Add some userspace headers like stdio.h/stdlib.h in places of
  sys/types.h.

- EXPORT_SYMBOL *_Init/*_Update/*_Final... routines in ICP modules.

- Update scripts/zfs2zol-patch.sed

- include <sys/sha2.h> in sha2_impl.h

- Add sha2.h to include/sys/Makefile.am

- Add skein and edonr dirs to icp Makefile

- Add new checksums to zpool_get.cfg

- Move checksum switch block from zfs_secpolicy_setprop() to
  zfs_check_settable()

- Fix -Wuninitialized error in edonr_byteorder.h on PPC

- Fix stack frame size errors on ARM32
   - Don't unroll loops in Skein on 32-bit to save stack space
   - Add memory barriers in sha2.c on 32-bit to save stack space

- Add filetest_001_pos.ksh checksum sanity test

- Add option to write psudorandom data in file_write utility

7 years agoFletcher4: Init in libzfs_init()
Gvozden Neskovic [Sun, 25 Sep 2016 08:35:12 +0000 (10:35 +0200)]
Fletcher4: Init in libzfs_init()

All users of fletcher4 methods must call `fletcher_4_init()/_fini()`
There's no benchmarking overhead when called from user-space.

Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
7 years agoAdd parity generation/rebuild using 128-bits NEON for Aarch64
Romain Dolbeau [Mon, 3 Oct 2016 16:44:00 +0000 (18:44 +0200)]
Add parity generation/rebuild using 128-bits NEON for Aarch64

This re-use the framework established for SSE2, SSSE3 and
AVX2. However, GCC is using FP registers on Aarch64, so
unlike SSE/AVX2 we can't rely on the registers being left alone
between ASM statements. So instead, the NEON code uses
C variables and GCC extended ASM syntax. Note that since
the kernel explicitly disable vector registers, they
have to be locally re-enabled explicitly.

As we use the variable's number to define the symbolic
name, and GCC won't allow duplicate symbolic names,
numbers have to be unique. Even when the code is not
going to be used (e.g. the case for 4 registers when
using the macro with only 2). Only the actually used
variables should be declared, otherwise the build
will fails in debug mode.

This requires the replacement of the XOR(X,X) syntax
by a new ZERO(X) macro, which does the same thing but
without repeating the argument. And perhaps someday
there will be a machine where there is a more efficient
way to zero a register than XOR with itself. This affects
scalar, SSE2, SSSE3 and AVX2 as they need the new macro.

It's possible to write faster implementations (different
scheduling, different unrolling, interleaving NEON and
scalar, ...) for various cores, but this one has the
advantage of fitting in the current state of the code,
and thus is likely easier to review/check/merge.

The only difference between aarch64-neon and aarch64-neonx2
is that aarch64-neonx2 unroll some functions some more.

Reviewed-by: Gvozden Neskovic <neskovic@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Romain Dolbeau <romain.dolbeau@atos.net>
Closes #4801

7 years agoCorrect zpool_vdev_remove() error message
Richard Laager [Sun, 2 Oct 2016 18:34:17 +0000 (13:34 -0500)]
Correct zpool_vdev_remove() error message

The error message in zpool_vdev_remove() said top-level devices
could be removed, but that has never been true.

Reported-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Laager <rlaager@wiktel.com>
Closes #4506
Closes #5213

7 years agoFix coverity defects: CID 147448, 147449, 147450, 147453, 147454
luozhengzheng [Sun, 2 Oct 2016 18:24:54 +0000 (02:24 +0800)]
Fix coverity defects: CID 147448, 147449, 147450, 147453, 147454

coverity scan CID:147448,type: unchecked return value
coverity scan CID:147449,type: unchecked return value
coverity scan CID:147450,type: unchecked return value
coverity scan CID:147453,type: unchecked return value
coverity scan CID:147454,type: unchecked return value

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: luozhengzheng <luo.zhengzheng@zte.com.cn>
Closes #5206

7 years agoFix NULL deref in kcf_remove_mech_provider
candychencan [Fri, 30 Sep 2016 23:04:43 +0000 (07:04 +0800)]
Fix NULL deref in kcf_remove_mech_provider

In the default case the function must return to avoid dereferencing
'prov_mech' which will be NULL.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: candychencan <chen.can2@zte.com.cn>
Closes #5134

7 years agoFix coverity defects: CID 147563, 147560
cao [Fri, 30 Sep 2016 22:56:17 +0000 (06:56 +0800)]
Fix coverity defects: CID 147563, 147560

coverity scan CID:147563, Type:dereference null return value
coverity scan CID:147560, Type:dereference null return value

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5168

7 years agoFix coverity defects: CID 147531 147532 147533 147535
GeLiXin [Fri, 30 Sep 2016 22:47:57 +0000 (06:47 +0800)]
Fix coverity defects: CID 147531 147532 147533 147535

coverity scan CID:147531,type: Argument cannot be negative
- may copy data with negative size
coverity scan CID:147532,type: resource leaks
- may close a fd which is negative
coverity scan CID:147533,type: resource leaks
- may call pwrite64 with a negative size
coverity scan CID:147535,type: resource leaks
- may call fdopen with a negative fd

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes #5176

7 years agoFix coverity defects: CID 147536, 147537, 147538
GeLiXin [Fri, 30 Sep 2016 22:40:07 +0000 (06:40 +0800)]
Fix coverity defects: CID 147536, 147537, 147538

coverity scan CID:147536, type: Argument cannot be negative
- may write or close fd which is negative
coverity scan CID:147537, type: Argument cannot be negative
- may call dup2 with a negative fd
coverity scan CID:147538, type: Argument cannot be negative
- may read or fchown with a negative fd

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: GeLiXin <ge.lixin@zte.com.cn>
Closes #5185

7 years agoraidz_test: respect wall time
Gvozden Neskovic [Fri, 30 Sep 2016 22:19:51 +0000 (00:19 +0200)]
raidz_test: respect wall time

When timeout is specified (-t), stop worker threads in the middle of work units.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gvozden Neskovic <neskovic@gmail.com>
Issue #5180
Closes #5190

7 years agoFix cppcheck warning in buf_init()
Brian Behlendorf [Fri, 30 Sep 2016 22:04:21 +0000 (15:04 -0700)]
Fix cppcheck warning in buf_init()

Cppcheck 1.63 erroneously complains about an uninitialized value
in buf_init().  Newer versions of cppcheck (1.72) handle this
correctly but we'll initialize the value anyway to silence the
warning.

Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5203

7 years agoDisable zpool_import_002_pos and ro_props_001_pos
Brian Behlendorf [Fri, 30 Sep 2016 19:12:53 +0000 (12:12 -0700)]
Disable zpool_import_002_pos and ro_props_001_pos

These test cases fail some percentage of the time resulting
in automated testing failures.  Disable the offending tests
until they can be made reliable.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #5201
Issue #5202
Closes #5194

7 years agoFix coverity defects: CID 147707
cao [Fri, 30 Sep 2016 17:49:16 +0000 (01:49 +0800)]
Fix coverity defects: CID 147707

coverity scan CID:147707, Type:Double free.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: cao.xuewen <cao.xuewen@zte.com.cn>
Closes #5097