Stoiko Ivanov [Mon, 28 Nov 2022 18:17:29 +0000 (19:17 +0100)]
user accesslists: reword logging and hits for newer SA rule sets
This commit adapts the sa-hits and the logging for the user
block/welcomelist to be consistent with the terms used in the
SpamAssassin 4.0 release, which tries to avoid some terms that might
be interpreted as racially charged.
This patch is a (small) part of the fix for #3755, which will be
addressed along with the upgrade to SpamAssassin 4.0 (to be
consistent with the (quite well thought-through) naming used by SA).
Keeping the USER_IN_BLACKLIST hit when loading the descriptions
catches mails put in quarantine before the patched version was
installed.
Stoiko Ivanov [Mon, 28 Nov 2022 18:17:28 +0000 (19:17 +0100)]
user-bl: use custom description of USER_IN_BLACKLIST consistently
The USER_IN_BLACKLIST spamassassin hit is created by the Spam What
object, if the sender's e-mail address is in the receiver's blacklist.
This 'hit' is kept on the PMG only - it is not written to the SPAMINFO
macro - and only visible in the quarantine interface afaict.
The description shown in the quarantine interface, however, is read
from SpamAssassin sources.
They have recently changed to include a 'DEPRECATED' prefix, since the
rules containing 'blacklist' and 'whitelist' have been renamed to
'blocklist' and 'welcomelist' for the upcoming 4.0 series of
spamassassin.
In any case we should keep our description consistent, thus the move
to a sub of its own for reusing in both locations.
The mechanism for welcomelisted/whitelisted mails does not create an
'internal' sa-rule (but simply drops the SA hits for analysis) - so no
symmetric change is needed.
Reported-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
The read_tasklog API call now streams the whole log file if the query
parameter 'download' is set to true.
This is done in preparation for the task log download button to be
added in the TaskViewer.
I saw an opportunity here to clear some redundant code for displaying
the tasklog and replaced it with a call to dump_logfile(), akin to how
this is handled in pve-manager.
Signed-off-by: Daniel Tschlatscher <d.tschlatscher@proxmox.com>
Dominik Csapak [Thu, 24 Nov 2022 12:21:11 +0000 (13:21 +0100)]
ldap: improve unicode support
when we receive mails with SMTPUTF8 encoded sender/recipient,
we have to encode these values for our ldapcache to work,
otherwise pmg-smtp-filter fails when trying to insert
perl strings.
on read from the cache we have to decode these values again so
that the webui can show them correctly
also encode/decode dn and group names, since according to rfc4514[0]
utf-8 should be ok here
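The encode-on-write, decode-on-read pattern described above can be sketched as follows. This is a Python illustration only, not PMG's Perl code: the dict stands in for a byte-oriented cache, and `cache_put` / `cache_get` are hypothetical names.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a byte-oriented cache; the real ldapcache is
# implemented in Perl and is not reproduced here.
cache: Dict[bytes, bytes] = {}

def cache_put(address: str, dn: str) -> None:
    # Encode text to UTF-8 bytes before handing it to the byte store.
    cache[address.encode("utf-8")] = dn.encode("utf-8")

def cache_get(address: str) -> Optional[str]:
    raw = cache.get(address.encode("utf-8"))
    # Decode on the way out so callers (e.g. a web UI) see text again.
    return raw.decode("utf-8") if raw is not None else None
```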
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:07 +0000 (13:21 +0100)]
pmgqm: handle smtputf8 data
$data->{pmail} is both used in the template rendering ('Spam Report for
$pmail'), and as content for the To header, which need different
treatment. Thus introduce 'pmail_raw' additionally.
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:06 +0000 (13:21 +0100)]
quarantine: handle utf8 data
use try_decode_utf8 for sender/receiver of the smtp dialog and mail
headers since they're either ASCII (not SMTPUTF8) or UTF-8 (with SMTPUTF8)
encoded
change the mail regex for wl/bl to basic email/domain syntax without
the restriction of ascii only. (whitespace and backslashes are
forbidden, but they shouldn't normally occur in email addresses and
domains)
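A rough Python sketch of what such a relaxed check could look like. The pattern is an assumption made for illustration (it is not the regex used in the patch); it only forbids whitespace, backslashes and a stray '@', and does not restrict entries to ASCII.

```python
import re

# Assumed pattern: optional local part plus '@', then a domain-like part.
# Neither part may contain whitespace, a backslash or another '@'.
ENTRY_RE = re.compile(r"^(?:[^\s\\@]+@)?[^\s\\@]+$")

def is_valid_entry(entry: str) -> bool:
    return ENTRY_RE.match(entry) is not None
```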
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:05 +0000 (13:21 +0100)]
partially fix #2465: handle smtputf8 addresses in the rule-system
the envelope addresses are used in the rule-system for lookups and
statistics. When the mail is received with smtputf8 the addresses are
decoded (multi-byte perl-strings) and thus need encoding before using
them as parameter in a database query.
This patch encodes the addresses as utf-8 for the relevant queries
unconditionally, because envelope-senders should either be:
* (a subset of) ascii (no smtputf8) - which is invariant for utf-8
encoding
* valid utf-8 (smtputf8)
The patch does not address the issues with multi-byte addresses in our
LDAP-implementation (hence the partial fix), but should still be an
improvement for many deployments
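The reasoning for encoding unconditionally can be checked with a small Python sketch (an illustration, not the Perl code of the patch; `db_param` is a hypothetical helper name): ASCII input comes out byte-for-byte unchanged, while multi-byte input becomes valid UTF-8.

```python
def db_param(address: str) -> bytes:
    # UTF-8 encoding leaves pure ASCII text unchanged, so no check for
    # "was this mail received with SMTPUTF8?" is needed before encoding.
    return address.encode("utf-8")
```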
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:03 +0000 (13:21 +0100)]
fix #2541 ruledb: encode relevant values as utf-8 in database
This patch adds support for storing rule names, comments(info), and
most relevant values (e.g. the header content to match) in utf-8 in
the database.
backwards-compatibility should not be an issue:
* currently the database should not contain any utf-8 multibyte
characters, as our tooling prevented this due to sending
wide-characters, which causes an exception in DBI.
* any character > 127 and < 256 will be correctly interpreted when
stored in a perl-string (this happens if the decode fails in
try_decode_utf8), and will be correctly encoded when storing into
the database.
the database is created with SQL_ASCII encoding - which behaves by
interpreting bytes <= 127 as ascii and those > 127 are not interpreted
(see [0], which just means that we have to explicitly en-/decode upon
storing/reading from there)
This patch currently omits most Who objects:
* for email/domain we'd still need to consider how to store them
(puny-code for the domain part, or everything as UTF-8) and it would
need changes to the API-types.
* the LDAP objects currently would not work too well, since our LDAPCache
is not UTF-8 safe - and fixing it warrants its own patch-series
* WhoRegex should work and be able to handle many use-cases
The ContentType values should also contain only ascii characters per
RFC6838 [1] and RFC2045 [2].
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:02 +0000 (13:21 +0100)]
ruledb: properly substitute prox_vars in headers
by storing the variables as perl-string (not mime-encoded, and not
utf-8 encoded), and appropriately dealing with multi-line values to
input (folding the headers and encoding as mime).
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:01 +0000 (13:21 +0100)]
utils: return perl string from decode_rfc1522
decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context) - adapt it to return a perlstring, under the
assumption that data is utf-8 encoded (holds true for ascii and
smtputf8 mails)
the try_decode_utf8 helper sub will be used extensively in
later patches and is inspired by commit 43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that the valid multibyte utf-8 characters do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£") so if something decodes without error
from utf-8 it will in all likelihood have been utf-8 to begin with
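The heuristic can be sketched in Python as below. This is an analogy rather than the Perl helper itself: the Latin-1 fallback is an assumption that mimics Perl treating undecoded bytes as characters 0-255.

```python
def try_decode_utf8(data: bytes) -> str:
    try:
        # Strict decoding: raises if the bytes are not valid UTF-8.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Fallback: map every byte to the code point of the same value.
        return data.decode("latin-1")
```

With this, valid UTF-8 input and single bytes in the range 128-255 both end up as the expected characters.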
Dominik Csapak [Wed, 23 Nov 2022 14:52:21 +0000 (15:52 +0100)]
fix #3287: add pmail parameter to virus/attch. quarantine list
so that we can filter by the recipient email
for that we also have to add the quarantine type to the 'spamusers' api
call, or else we cannot list which recipients have mails in the
respective quarantine
Stoiko Ivanov [Wed, 9 Nov 2022 18:27:25 +0000 (19:27 +0100)]
ruledb: add deprecation warnings for unused actions
* ReportSpam
* Attach
* Counter
are all still present since (at least) the release of PMG 5.0, but
were never exposed in the API/GUI.
All of them in their current form don't seem to fit well nowadays, or
their functionality was taken over by some other Action:
* Attach - the functionality is currently present in the Notify action
(attach original mail)
* Counter - without a matching What object simply increasing a counter
by one in the database serves no purpose
* ReportSpam - sending potentially sensitive mail automatically to the
public SpamAssassin project does not seem to fit well nowadays
Instead of dropping them right away - this patch adds logging when
they are encountered while loading or when they are run, to keep
backwards-compatibility for users who have very long-running PMG
instances (not sure if the actions were ever used in the pre git-days
of PMG)
MIME::Words::encode_mimewords does not deal with multiline headers
(the warning about this being a 'quick and dirty' solution [0]
partially tells as much).
Instead - split the replacement value after variable substitution on:
'\r?\n\s*' (to capture multi-line values like __SPAM_INFO__, but also
already folded headers, which are separated by '\r?\n\s+') and do the
substitution for each line separately.
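A Python sketch of the split-then-encode idea, for illustration only (the patch itself is Perl and uses MIME::Words). `encode_word` and `encode_header_value` are hypothetical names, and the sketch ignores the 75-character limit for encoded-words.

```python
import base64
import re

def encode_word(line: str) -> str:
    # Leave plain ASCII alone; wrap anything else as a base64 encoded-word.
    if line.isascii():
        return line
    b64 = base64.b64encode(line.encode("utf-8")).decode("ascii")
    return f"=?UTF-8?B?{b64}?="

def encode_header_value(value: str) -> str:
    # Split on line breaks plus any following whitespace, which covers both
    # multi-line values and already folded headers.
    lines = [part for part in re.split(r"\r?\n\s*", value) if part]
    # Encode each line on its own and fold the result again.
    return "\n\t".join(encode_word(part) for part in lines)
```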
reported in our community forum:
https://forum.proxmox.com/threads/.118001/
Dominik Csapak [Fri, 4 Nov 2022 15:04:20 +0000 (16:04 +0100)]
api/quarantine: allow 'listattachments' for quarantine users
we use 'get_attachments' which uses 'get_and_check_mail'. that already
checks the correct permissions (qusers are only able to retrieve their
own mails/attachments) so it's ok here to allow it
Dominik Csapak [Wed, 5 Oct 2022 07:49:41 +0000 (09:49 +0200)]
RuleDB/Notify: properly en-/decode the mail subject
we need to mime decode the subject after reading it, so that we get
the 'real' subject instead of the (possibly) mime encoded one (which
might be base64 or quoted-printable encoded). To get a proper subject in
the notification mail again, we have to encode it again before passing
it to MIME::Entity->build
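The decode/re-encode round trip can be illustrated with Python's standard `email.header` module, used here as a stand-in for the Perl MIME modules; `decode_subject` and `encode_subject` are hypothetical helper names.

```python
from email.header import Header, decode_header, make_header

def decode_subject(raw: str) -> str:
    # Turns base64 or quoted-printable encoded-words into readable text.
    return str(make_header(decode_header(raw)))

def encode_subject(text: str) -> str:
    # Produces a header-safe representation for the outgoing notification.
    return Header(text, "utf-8").encode()
```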
fix #4269: rule cache: from match: cope with undefined IP
No semantic change, just avoids an ugly warning. Can normally only
happen if a mail is sent/injected directly from the PMG host.
fwiw, all the rule implementations that actually use $ip got an early
return 0 if $ip evaluates to false(y), one might actually consider
checking the counterpart too for false-y in this case, and return a
match if both are false (or maybe better, make the check a
definedness one); but as this is for an edge case we might just keep
it as is for now, worked ok for more than a decade..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this was forgotten when introducing the more flexible kernel-keeping
logic with proxmox-boot-tool (in 6.4).
with this file present no pve-kernel gets autoremoved.
this patch uses d/maintscript for removing instead of using
debian/conffiles (deb-conffiles(5)) 'remove-on-upgrade'
sticking with d/maintscript was chosen, since else it depends on the
installed debhelper version if the removal is done at all (debhelper
from bullseye simply ignores remove-on-upgrade in d/conffiles)
Tested the following with a local version bump to 7.1-5 and a VM:
* regular unchanged /etc/apt/apt.conf.d/75pmgconf
* manually modified /etc/apt/apt.conf.d/75pmgconf
* manually removed /etc/apt/apt.conf.d/75pmgconf
Stoiko Ivanov [Tue, 17 May 2022 10:19:50 +0000 (12:19 +0200)]
rulesystem: matchfield: match all headers not only the first
currently the match field uses $entity->head->get in scalar context,
which only returns the first matching header (see [0])
switching over to using get_all in list context and iterating over all
headers makes it possible to match subsequent headers.
while it is uncommon in general - the Received headers are usually not
restricted to one - reported in our community forum:
https://forum.proxmox.com/threads/.109629/
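The difference between the two lookups can be shown with Python's `email` package, used here only as an analogy for MIME::Head (`match_first` and `match_any` are hypothetical names): `get` returns the first matching header, `get_all` returns every one.

```python
import re
from email import message_from_string

def match_first(msg, field: str, pattern: str) -> bool:
    # Old behaviour: only the first header with that name is inspected.
    value = msg.get(field)
    return value is not None and re.search(pattern, value) is not None

def match_any(msg, field: str, pattern: str) -> bool:
    # New behaviour: iterate over every header with that name.
    return any(re.search(pattern, v) for v in msg.get_all(field, []))
```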
Thomas Lamprecht [Sat, 14 May 2022 15:21:56 +0000 (17:21 +0200)]
d/control: bump versioned dependencies
for namespace support, but note that proxmox-backup-client 2.1.10-1
is still missing some changes that are only in git so far, i.e., making
the CLI prune command NS aware.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Markus Frank [Wed, 30 Mar 2022 12:32:15 +0000 (14:32 +0200)]
fix #3924: ldap: accept only valid email-address
If a mail attribute contains special characters in ldap at the first
line, it will be set as primary email and results in a
"400 invalid format - value does not look like a valid email address"
Error-Statement in the webconsole. This mostly can happen if SIP
Addresses are in Active-Directory's proxyAddresses which begin with "SIP:".
To make the validation more strict I changed the api to use
pmg-email-address and added a regex which looks for protocolnames (sip:)
that could be in proxyAddresses but are not compatible and skips these
addresses.
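A minimal Python sketch of skipping such entries. The pattern and the function name are assumptions for illustration; the regex actually added by the patch is not shown in this log.

```python
import re

# Assumed pattern: values that start with a "sip:" prefix, in any case.
INCOMPATIBLE_PREFIX = re.compile(r"^sip:", re.IGNORECASE)

def usable_addresses(values):
    # Keep only attribute values that are not protocol-prefixed.
    return [v for v in values if not INCOMPATIBLE_PREFIX.match(v)]
```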
avoid an overly long line and uselessly overwriting a scalar only to
extend another one with its value, really no biggie especially in the
context that's used, but it's so easy to avoid that it still has some
merit.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Stoiko Ivanov [Thu, 25 Nov 2021 17:48:13 +0000 (18:48 +0100)]
fix #2795: add support for DSN
store the esmtp parameters for the MAIL and RCPT command needed to
support Delivery status notifications (DSN - RFC 3464 [0]) and pass
them to the outbound postfix instance (port 10025) used for sending
the mail further (see also [1]).
Postfix does syntax-checking before passing the mail to the proxy
also in before-queue filtering mode.
Since the handling is done by postfix we don't need to generate any
DSN in the regular case.
For mail put into quarantine I decided to skip sending a delivery
notification (on the expectation, that few people are using quarantine
outbound, and that I would not consider a mail put in quarantine as
delivered successfully)
We only store a whitelist of parameters, instead of passing all,
because some parameters might not be valid anymore after processing
(e.g. SIZE)
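The allowlist approach could look roughly like the Python sketch below. The keyword sets are an assumption based on the parameters defined by the DSN SMTP extension (RFC 3461); the log does not state which parameters the patch actually keeps, and `filter_params` is a hypothetical name.

```python
# Assumed allowlist: DSN-related ESMTP keywords per command.
ALLOWED = {
    "MAIL": {"RET", "ENVID"},
    "RCPT": {"NOTIFY", "ORCPT"},
}

def filter_params(command: str, params: str) -> str:
    kept = []
    for token in params.split():
        keyword = token.split("=", 1)[0].upper()
        # Anything not explicitly allowed (e.g. SIZE) is dropped.
        if keyword in ALLOWED.get(command, set()):
            kept.append(token)
    return " ".join(kept)
```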
The DSN EHLO keyword was added for the after-queue filtering case -
else the inbound postfix is the system that sends out the
notification.
tested with various combinations of the -V, -N and -R parameters to
sendmail (e.g.):
```
/usr/sbin/sendmail -N success,delay,failure \
-V '<xxxxxxxx@test.proxmox.com>'\
-R hdrs test@test.domain.example
```
tested the following scenarios in before and after-queue filter mode:
* successful delivery
* successful delivery with set DSN
* failed delivery (recipient rejects with 544)
* failed delivery with DSN
* delivering a mail with empty envelope sender (bounce)
some tests with invalid combinations were also done with netcat.
Stoiko Ivanov [Thu, 25 Nov 2021 17:44:11 +0000 (18:44 +0100)]
partially fix #2795: allow for '>' in smtp parameters
The regular expressions parsing the MAIL and RCPT commands do not
cover the case where an esmtp parameter may contain angle brackets
(e.g. the ENVID parameter for the delivery status notification
extension - RFC3464 [0]).
following section 4.1.2 of RFC5321 [1] the regex is changed to:
* consider everything up to the first '>' the mailbox
* consider everything afterwards (if it starts with a ' ') as
parameters
* since the parameter group might not match (in case no parameters are
set - e.g. after-queue filtering) - default to '' if it's not
defined
This is fairly robust, only not parsing correctly if the local part
contains '>' (as quoted text) - but this did not work before anyways
(and causes problems in other places as well).
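The three rules above can be sketched in Python as follows. The expression is an illustration of the described behaviour, not the regex from the patch, and `parse_envelope_command` is a hypothetical name.

```python
import re

# Mailbox: everything up to the first '>'. Parameters: the rest, but only
# if it is introduced by a single space.
COMMAND_RE = re.compile(
    r"^(MAIL FROM|RCPT TO):\s*<([^>]*)>(?: (.*))?$", re.IGNORECASE
)

def parse_envelope_command(line: str):
    m = COMMAND_RE.match(line.rstrip("\r\n"))
    if m is None:
        return None
    mailbox = m.group(2)
    # The parameter group is unset when no parameters follow; default to ''.
    params = m.group(3) or ""
    return mailbox, params
```

As noted in the commit message, a quoted local part containing '>' would still be split at the wrong position.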