Thomas Lamprecht [Mon, 25 Apr 2022 09:37:57 +0000 (11:37 +0200)]
datastore: add helpers to destroy whole namespaces
The behavior on "any snapshot was protected" isn't yet ideal, as we
then do not clean up any (sub) namespace, even if some of them were
cleared of groups & snapshots completely. But that isn't easy to do
with our current depth-first pre-order iterator behavior, and it's
also not completely wrong either; the user can re-do the removal on
the sub-namespaces, so leave that for later.
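A minimal, self-contained sketch of the rule described above; the real
datastore helpers operate on directories and locks, this only models the
"remove namespaces only if every snapshot could be removed" behavior, and
all type names are stand-ins:

```rust
struct Snapshot { protected: bool }
struct Namespace { snapshots: Vec<Snapshot> }

/// Returns true if everything (snapshots and namespace entries) was removed.
fn destroy_namespaces(namespaces: &mut Vec<Namespace>) -> bool {
    let mut removed_all_snapshots = true;

    for ns in namespaces.iter_mut() {
        // depth-first pre-order in the real code; order doesn't matter here
        ns.snapshots.retain(|snap| {
            if snap.protected {
                removed_all_snapshots = false; // protected snapshots stay
                true
            } else {
                false // everything else gets removed
            }
        });
    }

    if removed_all_snapshots {
        // only now is it safe to drop the (sub) namespaces themselves,
        // even though some of them may already be completely empty
        namespaces.clear();
    }
    removed_all_snapshots
}
```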
Should get moved to a datastore::BackupNamespace type once/if we get
one
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The idea is to have namespaces in a datastore to allow grouping and
namespacing backups from different (but similarly trusted) sources,
e.g., different PVE clusters, geo sites, use-cases or company
service-branches, without separating the underlying
deduplication domain and thus blowing up data and (GC/verify)
resource usage.
To avoid namespace ID clashes with anything existing or with future
use cases, use an intermediate `ns` level at *each* depth.
The current implementation treats that as internal and thus hides
that fact from the API; IOW, the namespace path the user passes
along or gets returned won't include the `ns` levels, they do not
matter there at all.
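A minimal sketch of the user-visible path vs. the on-disk layout with the
intermediate `ns` directory at each depth; the real type lives in
pbs-api-types, this is only an illustration:

```rust
fn namespace_to_relative_path(components: &[&str]) -> String {
    let mut path = String::new();
    for comp in components {
        // every depth gets its own `ns` directory on disk
        path.push_str("ns/");
        path.push_str(comp);
        path.push('/');
    }
    path
}

fn main() {
    // the user passes "a/b/c"; on disk this becomes "ns/a/ns/b/ns/c/"
    assert_eq!(namespace_to_relative_path(&["a", "b", "c"]), "ns/a/ns/b/ns/c/");
}
```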
The max-depth of 8 is chosen with the following in mind:
- assume that end-users are already at a deeper level of a hierarchy;
  most often they'll start at level one or two, as the higher ones
  are used by the seller/admin to namespace different users/groups,
  so fewer than four would be very limiting for a lot of target use
  cases
- all the more, a PBS could be used as a huge second-level archive in
  a big company, so one could imagine a namespace structure like:
  /<state>/<intra-state-location>/<datacenter>/<company-branch>/<workload-type>/<service-type>/
  e.g.: /us/east-coast/dc12345/financial/report-storage/cassandra/
  that's six levels that one can imagine for a reasonable use-case,
  so leave some room for the ones harder to imagine ;-)
- on the other hand, we do not want to allow unlimited levels, as we
  have request parameter limits and deep nesting can create other
  issues as well (e.g., stack exhaustion), so by doubling the minimum
  level of 4 (1st point) we get room to breathe even for the odder
  (or huge) use cases (2nd point)
- a per-level length of 32 (-1 due to the separator) is enough to use
  telling names, making the lives of users and admins simpler, while
  not blowing up the total parameter length with the max depth of 8
- 8 * 32 = 256, which is a nice buffer size (see the sketch below)
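Illustrative constants only, derived from the reasoning above; the real
definitions live in pbs-api-types and may differ in name and exact value:

```rust
const MAX_NAMESPACE_DEPTH: usize = 8; // doubled minimum of 4 from the first point
const NAMESPACE_COMPONENT_MAX_LEN: usize = 32 - 1; // 32 per level minus the separator

// 8 levels * 32 bytes (component + separator) = 256 bytes total path budget
const MAX_NAMESPACE_PATH_LEN: usize = MAX_NAMESPACE_DEPTH * 32;

fn main() {
    assert_eq!(MAX_NAMESPACE_PATH_LEN, 256);
    println!("up to {MAX_NAMESPACE_DEPTH} levels, {NAMESPACE_COMPONENT_MAX_LEN} chars each");
}
```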
Many thanks to Wolfgang for all the great work on the type
implementation and for assisting greatly with the design.
Co-authored-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Co-authored-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Dominik Csapak [Mon, 9 May 2022 10:41:18 +0000 (12:41 +0200)]
api: tape/restore: fix wrong datastore locking
used_datastores returned the 'target', but in the full_restore_worker
we interpreted it as the source and searched for a mapping
(which we then locked).
Since we cannot return a HashSet of Arc<T> (DataStore is missing a
Hash impl), we now return a map of source -> target.
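A sketch of the data-structure change with stub types: since Arc<DataStore>
cannot go into a HashSet (no Hash impl on DataStore), keep a map keyed by the
*source* store name instead, so the restore worker locks the right store. The
names and signature here are illustrative, not the actual API:

```rust
use std::collections::HashMap;
use std::sync::Arc;

struct DataStore;

fn used_datastores(
    mappings: &[(String, Arc<DataStore>)], // (source name, target store)
) -> HashMap<String, Arc<DataStore>> {
    let mut map = HashMap::new();
    for (source, target) in mappings {
        // keyed by source, so looking up a source name yields its target
        map.insert(source.clone(), Arc::clone(target));
    }
    map
}
```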
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
[ T: squash in cargo fmt fixup for some trailing ws ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
proxmox-backup-proxy: stop accept() loop on daemon shutdown
On reload the old process hands over to the new process but needs to
keep running until all its worker tasks are finished to avoid
breaking an in-progress action like a xterm.js web shell or a backup
creation/restore.
During that wait time the receiving channel was already closed, but
the TCP socket accept listener was still left active by mistake.
That, paired with `SO_REUSEPORT` being set on the underlying socket,
made the kernel choose either the old or the new process for new
incoming connections; after all, both were still listening for them,
and reuse-port + multiple processes is often used as a load-balancing
mechanism.
As the old proxy accepted connections but didn't process them anymore,
one could observe sporadic connection failures on any API call, well,
any new connection to the proxy, depending on which process got it
assigned.
The fix is to stop accepting new connections once we shut down, so
poll the shutdown_future too during accept and just exit the
accept-loop on shutdown.
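A hedged sketch of the fixed loop, assuming a tokio TcpListener and some
shutdown future; the real proxy code is structured differently, this only
shows polling the shutdown signal alongside accept():

```rust
use std::future::Future;
use tokio::net::TcpListener;

async fn accept_loop(listener: TcpListener, shutdown: impl Future<Output = ()>) {
    tokio::pin!(shutdown);
    loop {
        tokio::select! {
            // stop accepting as soon as shutdown is requested, so the old
            // process no longer competes for SO_REUSEPORT connections
            _ = &mut shutdown => break,
            res = listener.accept() => match res {
                Ok((conn, _peer)) => {
                    // hand the connection off to the HTTP stack (omitted here)
                    drop(conn);
                }
                Err(err) => eprintln!("accept failed: {err}"),
            },
        }
    }
}
```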
Note: neither this part of the code nor other parts that could
influence it were changed at all in recent times, so it's still
unresolved why it pops up only now.
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Co-authored-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[ T: add more (root cause) info and reword a bit ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Returning the GC status was dropped by mistake in commit 762f7d15
("datastore status: factor out api type DataStoreStatusListItem").
As this is considered a breaking change, which we also felt ourselves
because the gc-status is used in the web interface for the datastore
overview list (not the dashboard), re-add it.
Fixes: 762f7d15
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
[ T: add reference to breaking commit, reword message ]
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
The duration of mounting zpools corresponds not only to the number of
disks, but also to their content (many subvols, for example), which we
cannot know beforehand. So avoid mounting them at the start, and mount
them only when the user requests a listing/extraction with the zpool
in the path.
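A hedged sketch of the lazy-mount idea, assuming some `mount_zpool` helper
and a set of already-mounted pools; both are stand-ins, the restore daemon's
real state handling is more involved:

```rust
use std::collections::HashSet;

fn ensure_mounted(mounted: &mut HashSet<String>, path: &str) -> Result<(), String> {
    // only the first path component can name a zpool
    if let Some(pool) = path.trim_start_matches('/').split('/').next() {
        if !pool.is_empty() && !mounted.contains(pool) {
            // expensive: duration depends on the pool's content, so only do
            // it on demand instead of at daemon startup
            mount_zpool(pool)?;
            mounted.insert(pool.to_string());
        }
    }
    Ok(())
}

fn mount_zpool(_pool: &str) -> Result<(), String> {
    // stand-in for the actual zpool import/mount logic
    Ok(())
}
```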
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
restore-daemon: put blocking code into 'block_in_place'
DISK_STATE.lock() and '.resolve()' can both block since they access
the disks. Putting them into a 'block_in_place' makes tokio move them
out into their own thread, so the executor is still able to make
progress on other futures in the meantime.
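A minimal sketch of the pattern; `DISK_STATE` and the path lookup here are
simplified stand-ins for the restore daemon's real state, only the
block_in_place usage itself is the point (it requires tokio's multi-threaded
runtime):

```rust
use std::sync::Mutex;

static DISK_STATE: Mutex<Vec<String>> = Mutex::new(Vec::new());

async fn list_path(path: &str) -> Vec<String> {
    // both locking and resolving may touch the disks and block, so run them
    // on a blocking-friendly thread instead of stalling the async executor
    tokio::task::block_in_place(|| {
        let state = DISK_STATE.lock().unwrap();
        state.iter().filter(|p| p.starts_with(path)).cloned().collect()
    })
}
```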
Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Stefan Sterz [Tue, 12 Apr 2022 10:34:23 +0000 (12:34 +0200)]
fix #3067: ui: add a separate notes view for longer markdown notes
since markdown notes might be rather long, this commit adds a tab
similar to pve's datacenter or node notes. requires a bump of the
widget toolkit in order to use the `pmxNotesView`.
Thomas Lamprecht [Sun, 24 Apr 2022 17:09:38 +0000 (19:09 +0200)]
datastore: move blob loading into BackupDir impl and adapt call sites
data blobs can only appear in a BackupDir (snapshot) in the backup
hierarchy, so it makes more sense for it to live in there.
As it wasn't widely used anyway it's easy to move the single
non-package call site over to the new one directly and drop the
implementation from Datastore completely.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Sun, 24 Apr 2022 16:06:17 +0000 (18:06 +0200)]
datastore: improve backup group/snapshot iters
Move the check for being a directory before doing the OsString ->
String conversion, which should be a bit more efficient.
Also let the match return the entry in the non-skip/return case to
reduce indentation level for the inner "yield element" part, making
it slightly easier to follow.
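A rough, self-contained sketch of the resulting loop shape: the directory
check happens on the raw entry first, and the match yields the entry in the
common case instead of nesting the whole body. std::fs is used here as a
stand-in for the real directory-iteration helpers:

```rust
use std::fs;

fn next_group_dir(read_dir: &mut fs::ReadDir) -> Option<String> {
    loop {
        let entry = match read_dir.next()? {
            // skip errors and non-directories before paying for any conversion
            Ok(entry) if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) => entry,
            _ => continue,
        };
        // only now do the OsString -> String conversion
        if let Ok(name) = entry.file_name().into_string() {
            return Some(name);
        }
    }
}
```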
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Thu, 21 Apr 2022 13:54:59 +0000 (15:54 +0200)]
server pull: fix comment w.r.t. initial downloaded chunk capacity
> The hash set will be able to hold at least capacity elements
> without reallocating. If capacity is 0, the hash set will not
> allocate.
-- rustdoc, HashSet::with_capacity
So, the number we pass is the amount of chunk "IDs" we save, which is
then 64Ki, not 16Ki, and thus the size we can reference is also
256 GiB, not 64 GiB.
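The worked numbers behind the corrected comment, assuming the usual 4 MiB
chunk size:

```rust
fn main() {
    let capacity: u64 = 64 * 1024;          // chunk digests the HashSet can hold
    let chunk_size: u64 = 4 * 1024 * 1024;  // 4 MiB per chunk (assumed average)
    let referenced = capacity * chunk_size; // total data those chunks can cover
    assert_eq!(referenced, 256 * 1024 * 1024 * 1024); // 256 GiB, not 64 GiB
}
```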
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
pbs-tape: sgutils2: check sense data when status is 'CHECK_CONDITION'
Some RAID controllers return a 'transport error' when we expected a
'sense error'. It seems the correct way is to check the sense data
when either the result category is 'SENSE' or the status is
'CHECK_CONDITION', so do that (similar to how 'sg_raw' handles the
errors).
These are supposed to be instances for a datastore; the pure
specifications are the ones in pbs_api_types, which should be
preferred in crates like clients that do not need to deal
with the datastore directly.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Thomas Lamprecht [Fri, 15 Apr 2022 09:02:36 +0000 (11:02 +0200)]
datastore: implement Iterator for backup group listing
While it's currently still only used in a collect()ed way, most call
sites can be switched over to use the iterator directly, as they often
convert the not-so-cheap, in-memory vector back into an iterator via
.into_iter() anyway.
somewhat also preparatory (yak shaving) work for namespaces
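To illustrate the call-site difference with plain std types (the real group
listing types are omitted): iterating directly avoids building the
intermediate Vec that would just be .into_iter()'d again:

```rust
// old style: allocate a full Vec first, then iterate it anyway
fn total_len_collected(names: Vec<String>) -> usize {
    names.into_iter().map(|n| n.len()).sum()
}

// new style: consume the iterator directly, no intermediate allocation
fn total_len_streaming(names: impl Iterator<Item = String>) -> usize {
    names.map(|n| n.len()).sum()
}
```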
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Thomas Lamprecht [Fri, 15 Apr 2022 07:03:13 +0000 (09:03 +0200)]
datastore: move list_backup_groups into Datastore impl
Having that as a static method in BackupInfo makes zero sense and just
complicates call sites, which need to extract the base_path from the
store manually upfront.
Mark the old fn as deprecated so that we can do the move in a separate
step.
It's also planned to add an Iterator impl for this to allow more
efficient usage in the future.
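A minimal sketch of the call-site simplification with stub types; the real
signatures in proxmox-backup differ, only the deprecation/forwarding pattern
is the point:

```rust
use std::path::{Path, PathBuf};

struct BackupGroup;
struct BackupInfo;
struct DataStore { base: PathBuf }

impl BackupInfo {
    #[deprecated(note = "use DataStore::list_backup_groups instead")]
    fn list_backup_groups(_base_path: &Path) -> Vec<BackupGroup> {
        Vec::new() // placeholder for the actual directory scan
    }
}

impl DataStore {
    fn base_path(&self) -> PathBuf {
        self.base.clone()
    }

    // new home of the listing; callers no longer fetch base_path themselves,
    // and an Iterator-based variant can be layered on top later
    #[allow(deprecated)]
    fn list_backup_groups(&self) -> Vec<BackupGroup> {
        BackupInfo::list_backup_groups(&self.base_path())
    }
}
```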
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>