Dominik Csapak [Thu, 6 Oct 2016 09:42:28 +0000 (11:42 +0200)]
allow rbd images < 1M to be detected
without this, having an efidisk on a ceph storage
prevents creating another disk on the same
ceph storage, because it will not be detected
and we try to allocate one with the same name
the smart checks are only needed for the API call(s) that
list all disks and their status, but get_disks is also used
in disk usage checks and in the Ceph code, where the smart
status is completely irrelevant.
drop the implicit skipping of smart checks if $disk is set,
since we have an explicit parameter for this now.
because we never ever want to die in get_disks because of a
single disk, but the nodes/xyz/disks/smart API path is
allowed to fail if a disk device is unsupported by smartctl
or something else goes wrong.
While the mkdir option deals with the case where we don't
want to clobber a mount point with directories (like ZFS,
gluster or NFS), putting a directory storage directly onto a
mount point is still risky:
If the path exists - which it usually does even if not
mounted - the storage will be considered successfully
activated, but empty (or with unexpected content). Some
operations will then lead to unexpected problems: the
free_disk operation for instance only warns if the disk does
not exist, but does not throw an error. In this case the
configuration might be updated without the real disk being
deleted. Once it's mounted back in, later operations which
check existing disks which are not part of the current VM
configuration (like migration) might error unexpectedly.
This adds an 'is_mountpoint' option to directory storages
which assumes the directory is an externally managed mount
point (eg. fstab or zfs) and changes activate_storage() to
throw an error if the path is not mounted.
So far this only prevented the creation of the toplevel
directory. This does not cover all problem cases,
particularly when said directory is supposed to be a mount
point, including NFS and glusterfs beside ZFS.
The directory based storages we have already use mkpath
whenever they need to create files, and for actions on files
which are supposed to exist it's fine if it errors out.
So it should also be safe to skip the creation of standard
subdirectories in activate_storage().
Additionally NFS and glusterfs storages should also accept
the mkdir option as they otherwise may exhibit similar
issues, eg. when an NFS storage is mounted onto a directory
inside a ZFS subvolume.
since the rbd images themselves are named differently than
the volumes in our config files, we need to recreate this
information from the parent relation in the ceph metadata,
otherwise list_images() might return wrong volume names/IDs
since list_images is used by PVE::Storage::vdisk_free() to
check for children still referencing a base image, because
of the wrong volume id RBDPlugin->parse_volname() does not
detect the base image of linked clones and the check fails.
this is thankfully mitigated by the protected status of the
base snapshot, but creates a rather confusing error message.
scenario (VM 701 is a linked clone of template VM 700):
before (pvesm list reports wrong volume ID, check fails):
$ pvesm list ceph_qemu
ceph_qemu:base-700-disk-1 raw 2147483648 700
ceph_qemu:vm-701-disk-1 raw 2147483648 701
$ pvesm free ceph_qemu:base-700-disk-1
snap_unprotect: can't unprotect; at least 1 child(ren) in pool rbd
rbd unprotect base-700-disk-1 snap '__base__' error: snap_unprotect: can't unprotect; at least 1 child(ren) in pool rbd
after (correct volume ID, check works as intended):
$ pvesm list ceph_qemu
ceph_qemu:base-700-disk-1 raw 2147483648 700
ceph_qemu:base-700-disk-1/vm-701-disk-1 raw 2147483648 701
$ pvesm free ceph_qemu:base-700-disk-1
base volume 'base-700-disk-1' is still in use (use by 'base-700-disk-1/vm-701-disk-1')
Dominik Csapak [Tue, 23 Aug 2016 10:20:46 +0000 (12:20 +0200)]
add api entries for disk management
adds a new class (intended to be used under nodes in pve-manager)
which adds the three api calls: list, smart and init
list being a general list of the available disk with infos
smart being a call to get the smart data from a given device
init being a call to write a gpt header to an unused disk
Dominik Csapak [Tue, 23 Aug 2016 10:20:45 +0000 (12:20 +0200)]
add Diskmanage Utilities
this adds the functions for listing the disks (mostly copied from
the ceph code), checking if a disk is a valid blockdevice, if it
is used/in a zfs pool/as an lvm pv, and an init function (just to add a gpt header;
this is important if one wants to use a fresh disk for ceph journals)
Dmitry Petuhov [Fri, 26 Aug 2016 13:06:33 +0000 (16:06 +0300)]
Add support for custom storage plugins
PVE team cannot support specialized vendor-specific storage
plugins because of lack of hardware. But we can allow users to
add own plugins for their storages without need to rewrite any
PVE code and thus ease PVE updates to them.
Idea of this patch is to add folder /usr/share/perl5/PVE/Storage/Custom
where user can place his plugins and PVE will automatically load
them on start or warn if it could not and continue. Maybe we could
even load all plugins (except PVE::Storage::Plugin itself) this way,
because current storage plugins are not really plugins, if they
need to be explicitly loaded in PVE code :-).
Custom plugins MUST have api() method returning version for which
it was designed. If API changes from PVE side, module is just not
being registered and warnig message is printed do log, so user have
to update module. Until module update, corresponding storage will
just disappear from PVE, so it shall not impose any data damage
because of API change.
This approach works (with some limitations) if plugin works in
generic PVE way: full control of volumes lifecycle. And will not
currently work for custom plugins like iSCSI, which needs to select
pre-existing volumes. Maybe someone will add more flexible way to
pve-manager to select input elements for storage plugins to target
this.
This way we get parameter verification on monitor addresses
as well as the ability to pass multiple `--monhost`
arguments to `pvesm add`.
Since our '-list' schemas default to using commas we now
need to properly support these, so all uses of the monhost
property now replace all of semicolon, space or comma into
the currently required character.
This should fix the issues reported by Alwin Antreich on the
pve-user list.
Since this schema supports both ipv6+port notations we need
to make sure we convert to the bracket enclosed variant.
Added a helper for this.
"ceph version" retrieves the version from the cluster (i.e.,
from the queried monitor), but what is needed here is the
local ceph version, which is returned by "ceph --version".
Ideally we don't need this, but this with the directory
storage this is a user-input field which gets returned
by the storage's path() method which is used in various
external command calls.
By default a directory storage creates its path. In some
cases this can be undesired, mostly when storages have
nested paths (eg. a dir storage on a ZFS path or in an NFS
share, or inside custom mount points).
As a simple fix to this the 'mkdir' option (default ON)
can now be used to disable this behavior.
otherwise mapping those images will fail. disabling the
features only needs to be done once per image, so it makes
sense to do this when creating the images.
unfortunately, the command does not work in hammer, so
it needs a version check for jewel or higher.
extract_vzdump_config_tar is an adapted combination
of tar_archive_search_conf() and the first part of
recover_config(), both from PVE::LXC::Create.
a compressed vma backup file needs special error
handling because vma exits as soon as it found the config
file, which the used decompressors treat as error.