Stoiko Ivanov [Thu, 24 Nov 2022 12:21:07 +0000 (13:21 +0100)]
pmgqm: handle smtputf8 data
$data->{pmail} is both used in the template rendering ('Spam Report for
$pmail'), and as content for the To header, which need different
treatment. Thus introduce 'pmail_raw' additionally.
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:06 +0000 (13:21 +0100)]
quarantine: handle utf8 data
use try_decode_utf8 for sender/receiver of the smtp dialog and mail
headers since they're either ASCII (not SMTPUTF8) or UTF-8 (with SMTPUTF8)
encoded
change the mail regex for wl/bl to basic email/domain syntax without
the restriction of ascii only. (whitespace and backslashes are
forbidden, but they shouldn't normally occur in email addresses and
domains)
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:05 +0000 (13:21 +0100)]
partially fix #2465: handle smtputf8 addresses in the rule-system
the envelope addresses are used in the rule-system for lookups and
statistics. When the mail is received with smtputf8 the addresses are
decoded (multi-byte perl-strings) and thus need encoding before using
them as parameter in a database query.
This patch encodes the addresses as utf-8 for the relevant queries
unconditionally, because envelope-senders should either be:
* (a subset of) ascii (no smtputf8) - which is invariant for utf-8
encoding
* valid utf-8 (smtputf8)
The patch does not address the issues with multi-byte addresses in our
LDAP-implementation (hence the partial fix), but should still be an
improvement for many deployments
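
A minimal Python sketch of the idea (the actual code is Perl; the helper name `encode_for_query` is hypothetical and only illustrates why unconditional encoding is safe for both cases listed above):

```python
def encode_for_query(address: str) -> bytes:
    """Encode a decoded envelope address as UTF-8 bytes for a query parameter.

    Hypothetical helper, not part of PMG: it only demonstrates that plain
    ASCII addresses are unchanged while SMTPUTF8 addresses become
    multi-byte sequences.
    """
    return address.encode("utf-8")


if __name__ == "__main__":
    # ASCII is invariant under UTF-8 encoding
    print(encode_for_query("user@example.com"))
    # a non-ASCII local part is encoded as multiple bytes
    print(encode_for_query("jürgen@example.com"))
```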
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:03 +0000 (13:21 +0100)]
fix #2541 ruledb: encode relevant values as utf-8 in database
This patch adds support for storing rule names, comments(info), and
most relevant values (e.g. the header content to match) in utf-8 in
the database.
backwards-compatibility should not be an issue:
* currently the database should not contain any utf-8 multibyte
characters, as our tooling prevented this due to sending
wide-characters, which causes an exception in DBI.
* any character > 127 and < 256 will be correctly interpreted when
stored in a perl-string (this happens if the decode fails in
try_decode_utf8), and will be correctly encoded when storing into
the database.
the database is created with SQL_ASCII encoding - which behaves by
interpreting bytes <= 127 as ascii and those > 127 are not interpreted
(see [0]), which just means that we have to explicitly en-/decode upon
storing/reading from there
This patch currently omits most Who objects:
* for email/domain we'd still need to consider how to store them
(puny-code for the domain part, or everything as UTF-8) and it would
need changes to the API-types.
* the LDAP objects currently would not work too well, since our LDAPCache
is not UTF-8 safe - and fixing that warrants its own patch-series
* WhoRegex should work and be able to handle many use-cases
The ContentType values should also contain only ascii characters per
RFC6838 [1] and RFC2045 [2].
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:02 +0000 (13:21 +0100)]
ruledb: properly substitute prox_vars in headers
by storing the variables as perl-string (not mime-encoded, and not
utf-8 encoded), and appropriately dealing with multi-line values to
input (folding the headers and encoding as mime).
Stoiko Ivanov [Thu, 24 Nov 2022 12:21:01 +0000 (13:21 +0100)]
utils: return perl string from decode_rfc1522
decode_rfc1522 is a more robust version of decode_mimewords (in
scalar context) - adapt it to return a perlstring, under the
assumption that data is utf-8 encoded (holds true for ascii and
smtputf8 mails)
the try_decode_utf8 helper sub will be used extensively in
later patches and is inspired by commit 43f8112f0bb424f99057106d57d32276d7d422a6 in pve-storage:
We consider that the valid multibyte utf-8 characters do not really
yield sensible combinations of single-byte perl characters (starting
with a byte > 127 - e.g. "£") so if something decodes without error
from utf-8 it will in all likelihood have been utf-8 to begin with
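
The heuristic can be sketched in Python as follows. This is an illustration under assumptions, not the Perl helper itself: the function reuses the name `try_decode_utf8` for readability, and the latin-1 fallback stands in for Perl keeping the undecoded bytes as single-byte characters.

```python
def try_decode_utf8(data: bytes) -> str:
    """Decode as strict UTF-8; on failure keep each byte as one character."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # assumption: mirrors Perl treating the bytes as characters 0..255
        return data.decode("latin-1")
```

A lone byte > 127 such as `b"\xa3"` is not valid UTF-8, so it falls through to the single-byte interpretation, while the two-byte sequence `b"\xc2\xa3"` decodes as one character.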
Dominik Csapak [Wed, 23 Nov 2022 14:52:21 +0000 (15:52 +0100)]
fix #3287: add pmail parameter to virus/attch. quarantine list
so that we can filter by the recipient email
for that we also have to add the quarantine type to the 'spamusers' api
call, or else we cannot list which recipients have mails in the
respective quarantine
Stoiko Ivanov [Wed, 9 Nov 2022 18:27:25 +0000 (19:27 +0100)]
ruledb: add deprecation warnings for unused actions
* ReportSpam
* Attach
* Counter
are all still present since (at least) the release of PMG 5.0, but
were never exposed in the API/GUI.
All of them in their current form don't seem to fit well nowadays, or
their functionality was taken over by some other Action:
* Attach - the functionality is currently present in the Notify action
(attach original mail)
* Counter - without a matching What object simply increasing a counter
by one in the database serves no purpose
* ReportSpam - sending potentially sensitive mail automatically to the
public SpamAssassin project does not seem to fit well nowadays
Instead of dropping them right away - this patch adds logging when
they are encountered while loading or when they are run, to keep
backwards-compatibility for users who have very long-running PMG
instances (not sure if the actions were ever used in the pre git-days
of PMG)
MIME::Words::encode_mimewords does not deal with multiline headers
(the warning about this being a 'quick and dirty' solution [0]
partially tells as much).
Instead - split the replacement value after variable substitution on:
'\r?\n\s*' (to capture multi-line values like __SPAM_INFO__, but also
already folded headers, which are separated by '\r?\n\s+') and do the
substitution for each line separately.
reported in our community forum:
https://forum.proxmox.com/threads/.118001/
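
A rough Python sketch of the split-then-encode approach described above. It is an assumption-laden illustration: `encode_multiline` is a hypothetical name, and Python's `email.header.Header` stands in for the MIME encoding step done in Perl.

```python
import re
from email.header import Header
from typing import List


def encode_multiline(value: str) -> List[str]:
    """Split a substituted header value into lines and MIME-encode each one."""
    # same pattern as in the commit message: covers multi-line values
    # and already folded headers
    parts = [p for p in re.split(r"\r?\n\s*", value) if p]
    return [Header(p, "utf-8").encode() for p in parts]
```

Each returned element is one encoded line; joining them with a newline followed by whitespace would yield a folded header.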
Dominik Csapak [Fri, 4 Nov 2022 15:04:20 +0000 (16:04 +0100)]
api/quarantine: allow 'listattachments' for quarantine users
we use 'get_attachments' which uses 'get_and_check_mail'. that already
checks the correct permissions (quser are only able to retrieve their
own mails/attachments) so it's ok here to allow it
Dominik Csapak [Wed, 5 Oct 2022 07:49:41 +0000 (09:49 +0200)]
RuleDB/Notify: properly en-/decode the mail subject
we need to mime decode the subject after reading it, so that we get
the 'real' subject instead of the (possibly) mime encoded one (which
might be base64 or quoted-printable encoded). To get a proper subject in
the notification mail again, we have to encode it again before passing
it to MIME::Entity->build
fix #4269: rule cache: from match: cope with undefined IP
No semantic change, just avoids an ugly warning. Can normally only
happen if a mail is sent/injected directly from the PMG host.
fwiw, all the rule implementations that actually use $ip got an early
return 0 if $ip evaluates to false(y), one might actually consider
checking the counterpart too for false-y in this case, and return a
match if both are false (or maybe better, make the check a
definedness one); but as this is for an edge case we might just keep
it as is for now, worked ok for more than a decade..
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
this was forgotten when introducing the more flexible kernel-keeping
logic with proxmox-boot-tool (in 6.4).
with this file present no pve-kernel gets autoremoved.
this patch uses d/maintscript for removing instead of using
debian/conffiles (deb-conffiles(5)) 'remove-on-upgrade'
sticking with d/maintscript was chosen, since else it depends on the
installed debhelper version if the removal is done at all (debhelper
from bullseye simply ignores remove-on-upgrade in d/conffiles)
Tested the following with a local version bump to 7.1-5 and a VM:
* regular unchanged /etc/apt/apt.conf.d/75pmgconf
* manually modified /etc/apt/apt.conf.d/75pmgconf
* manually removed /etc/apt/apt.conf.d/75pmgconf
Stoiko Ivanov [Tue, 17 May 2022 10:19:50 +0000 (12:19 +0200)]
rulesystem: matchfield: match all headers not only the first
currently the match field uses $entity->head->get in scalar context,
which only returns the first matching header (see [0])
switching over to using get_all in list context and iterating over all
headers makes it possible to match subsequent headers.
while it is uncommon in general - the Received headers are usually not
restricted to one - reported in our community forum:
https://forum.proxmox.com/threads/.109629/
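
The difference can be illustrated with Python's `email` module, which stands in here for the Perl MIME tooling. This is a sketch only; `header_matches` is a hypothetical name.

```python
import re
from email import message_from_string


def header_matches(msg, field: str, pattern: str) -> bool:
    """Return True if any occurrence of the header field matches the pattern."""
    regex = re.compile(pattern)
    # get_all returns every occurrence of the field, not just one
    return any(regex.search(value) for value in msg.get_all(field, []))


RAW = (
    "Received: from a.example\n"
    "Received: from b.example\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)

if __name__ == "__main__":
    msg = message_from_string(RAW)
    # matches although only the second Received header contains b.example
    print(header_matches(msg, "Received", r"b\.example"))
```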
Thomas Lamprecht [Sat, 14 May 2022 15:21:56 +0000 (17:21 +0200)]
d/control: bump versioned dependencies
for namespace support, but note that proxmox-backup-client 2.1.10-1
is still missing some changes only in git yet, i.e., making the CLI
prune command NS aware.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Markus Frank [Wed, 30 Mar 2022 12:32:15 +0000 (14:32 +0200)]
fix #3924: ldap: accept only valid email-address
If a mail attribute in ldap contains special characters in its first
line, it will be set as primary email and results in a
"400 invalid format - value does not look like a valid email address"
error in the web console. This mostly happens if SIP addresses, which
begin with "SIP:", are in Active Directory's proxyAddresses.
To make the validation more strict I changed the api to use
pmg-email-address and added a regex which looks for protocol names (sip:)
that could be in proxyAddresses but are not compatible and skips these
addresses.
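
A hedged Python sketch of such a filter. Both regexes and the prefix list are assumptions made for illustration; the source only names "sip:" and does not show the pattern actually used.

```python
import re
from typing import Iterable, List

# assumption: prefixes other than "sip" are examples, not taken from the patch
INCOMPATIBLE_PREFIX = re.compile(r"^(?:sip|x400|x500):", re.IGNORECASE)
# assumption: a deliberately simple address check, not pmg-email-address
EMAIL = re.compile(r"^[^\s\\@]+@[^\s\\@]+$")


def usable_addresses(values: Iterable[str]) -> List[str]:
    """Keep values that look like email addresses and carry no protocol prefix."""
    result = []
    for value in values:
        if INCOMPATIBLE_PREFIX.match(value):
            continue  # e.g. "SIP:alice@example.com"
        if EMAIL.match(value):
            result.append(value)
    return result
```

The prefix check comes first because a value like "SIP:alice@example.com" would otherwise pass a lenient address pattern.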
avoid an overly long line and a useless overwriting of a scalar only to
extend another one with its value, really no biggie especially in the
context that's used, but it's so easy to avoid that it still has some
merit.
Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Stoiko Ivanov [Thu, 25 Nov 2021 17:48:13 +0000 (18:48 +0100)]
fix #2795: add support for DSN
store the esmtp parameters for the MAIL and RCPT command needed to
support Delivery status notifications (DSN - RFC 3464 [0]) and pass
them to the outbound postfix instance (port 10025) used for sending
the mail further (see also [1]).
Postfix does syntax-checking before passing the mail to the proxy
also in before-queue filtering mode.
Since the handling is done by postfix we don't need to generate any
DSN in the regular case.
For mail put into quarantine I decided to skip sending a delivery
notification (on the expectation, that few people are using quarantine
outbound, and that I would not consider a mail put in quarantine as
delivered successfully)
We only store a whitelist of parameters, instead of passing all,
because some parameters might not be valid anymore after processing
(e.g. SIZE)
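
As an illustration of the whitelist idea, a Python sketch that keeps only DSN-related parameters. The set below (RET, ENVID, NOTIFY, ORCPT) is an assumption for this example; the parameters PMG actually stores are not listed here.

```python
# assumption: DSN parameter keywords chosen for illustration only
DSN_PARAMS = {"RET", "ENVID", "NOTIFY", "ORCPT"}


def filter_params(params: str) -> str:
    """Drop every esmtp parameter whose keyword is not whitelisted."""
    kept = []
    for param in params.split():
        keyword = param.split("=", 1)[0].upper()
        if keyword in DSN_PARAMS:
            kept.append(param)
    return " ".join(kept)
```

With this, a parameter such as SIZE, which may no longer be accurate after processing, is not passed on.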
The DSN EHLO keyword was added for the after-queue filtering case -
else the inbound postfix is the system that sends out the
notification.
tested with various combinations of the -V, -N and -R parameters to
sendmail (e.g.):
```
/usr/sbin/sendmail -N success,delay,failure \
-V '<xxxxxxxx@test.proxmox.com>'\
-R hdrs test@test.domain.example
```
tested the following scenarios in before and after-queue filter mode:
* successful delivery
* successful delivery with set DSN
* failed delivery (recipient rejects with 544)
* failed delivery with DSN
* delivering a mail with empty envelope sender (bounce)
some tests with invalid combinations were also done with netcat.
Stoiko Ivanov [Thu, 25 Nov 2021 17:44:11 +0000 (18:44 +0100)]
partially fix #2795: allow for '>' in smtp parameters
The regular expressions parsing the MAIL and RCPT commands do not
cover the case where an esmtp parameter may contain angle brackets
(e.g. the ENVID parameter for the delivery status notification
extension - RFC3464 [0]).
following section 4.1.2 of RFC5321 [1] the regex is changed to:
* consider everything up to the first '>' the mailbox
* consider everything afterwards (if it starts with a ' ') as
parameters
* since the parameter group might not match (in case no parameters are
set - e.g. after-queue filtering) - default to '' if it's not
defined
This is fairly robust, only not parsing correctly if the local part
contains '>' (as quoted text) - but this did not work before anyways
(and causes problems in other places as well).
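
The three rules above can be sketched in Python. The regex is written for this example and is not the expression used in the Perl code; `parse_mail_from` is a hypothetical name.

```python
import re
from typing import Optional, Tuple

# everything up to the first '>' is the mailbox; anything after it counts
# as parameters only if it starts with a space
MAIL_RE = re.compile(r"^MAIL\s+FROM:\s*<([^>]*)>(?: (.*))?$", re.IGNORECASE)


def parse_mail_from(line: str) -> Optional[Tuple[str, str]]:
    """Return (mailbox, parameters) or None if the line does not parse."""
    match = MAIL_RE.match(line)
    if match is None:
        return None
    mailbox = match.group(1)
    # the parameter group is unset when no parameters were sent
    params = match.group(2) or ""
    return mailbox, params
```

An ENVID value containing '>' ends up in the parameter string because the mailbox stops at the first '>'.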
Dominik Csapak [Thu, 25 Nov 2021 14:14:41 +0000 (15:14 +0100)]
fix #3734: scrub 'url' from style tags/attributes
if 'view images' for the quarantine is disabled, it is expected that
*no* images will be loaded. but in addition to img (src/href/etc.)
also css can load external images via the 'url' directive
since html scrubber does not parse/iterate over css, we simply remove
the url+protocol part of those tags/attributes. this technically leaves behind
invalid css, but the browsers should cope with that.
(we cannot 'cleanly' remove without much more effort because of quoting)
also we have to scrub the style tags in 'dump_html' since HTML::Scrubber
does not have a way to modify the *content* of a tag, only the
attributes...
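
A small Python sketch of removing the url+protocol part from css text. The pattern is an assumption for illustration and not the expression used in the patch; as noted above, the result is intentionally left as invalid css.

```python
import re

# matches "url(", an optional quote, an optional scheme and an optional "//"
CSS_URL = re.compile(
    r"url\s*\(\s*['\"]?\s*(?:[a-z][a-z0-9+.\-]*:)?(?://)?",
    re.IGNORECASE,
)


def scrub_css_urls(css: str) -> str:
    """Strip the url( + protocol prefix so the browser cannot load the target."""
    return CSS_URL.sub("", css)
```

The closing quote and parenthesis are left behind, which matches the trade-off described above: cleanly removing them would require handling css quoting.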
Stoiko Ivanov [Wed, 24 Nov 2021 21:00:48 +0000 (22:00 +0100)]
rulesystem: limit linelength of disclaimer to 998 bytes
As described in
http://www.postfix.org/postconf.5.html#smtp_line_length_limit
postfix splits lines which are longer by inserting <cr><lf><space> to
adhere to RFC 5322 (section 2.1.1):
https://datatracker.ietf.org/doc/html/rfc5322#section-2.1.1
(or actually section 4.5.3.1.6. where characters are translated to
octets)
If a longer line is part of the disclaimer pmg-smtp-filter adds it
without this modification, which breaks DKIM signatures (since the
body is modified by postfix after the body hash is computed)
regular-expression matching is used instead of length(), because the
limit is on line-length (and a disclaimer can contain multiple lines)
reported in our community forum:
https://forum.proxmox.com/threads/.97919/
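
A Python sketch of a per-line length check via regex. It only detects overlong lines; what pmg-smtp-filter does once one is found is not shown here, and `has_overlong_line` is a hypothetical name.

```python
import re

# 999 or more bytes without a line break means a line exceeds the 998 limit
OVERLONG = re.compile(rb"[^\r\n]{999,}")


def has_overlong_line(text: str) -> bool:
    """Check the encoded text line by line, counting bytes, not characters."""
    return OVERLONG.search(text.encode("utf-8")) is not None
```

Matching on the encoded bytes matters because a line of 500 two-byte characters is already over the limit, and a multi-line disclaimer can be longer than 998 bytes in total while every single line is fine.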
Stoiko Ivanov [Wed, 24 Nov 2021 16:04:09 +0000 (17:04 +0100)]
api-daemons: set oom-policy to continue
OOMPolicy [0] defaults to stop - resulting in the complete daemon
being killed.
Our Daemon class does start new workers automatically if it detects
that fewer than configured are running.
Dominik Csapak [Wed, 24 Nov 2021 14:48:52 +0000 (15:48 +0100)]
api: journal: stream the journal data to the client
instead of accumulating the whole output of 'mini-journalreader' in
the api call (this can be quite big), use the download mechanic of the
http-server to stream the output to the client.
we lose some error handling possibilities, but we do not have
to allocate anything here, and since perl does not free memory after
allocating[0] this is our desired behaviour.
to keep api compatibility, we need to give the journalreader the '-j'
flag to let it output json.
also tell the http server that the encoding is gzip and pipe
the output through it.
Stoiko Ivanov [Mon, 22 Nov 2021 19:49:39 +0000 (20:49 +0100)]
fix #3712: strip trailing dot from searchdomain
having a trailing '.' in the search domain is perfectly legal syntax
(for domain names in general). postfix refuses to use a fqdn with
trailing dot as hostname[0].
The restriction might be due to section 2.3.5 (Domain Names) of
RFC5321 (a top-level domain is a single string without any dots) [1]
[0] src/util/valid_hostname.c in the postfix source
[1] https://datatracker.ietf.org/doc/html/rfc5321#section-2.3.5
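
For illustration, the normalization amounts to something like this Python sketch (`strip_trailing_dot` is a hypothetical name; the actual change is in Perl):

```python
def strip_trailing_dot(domain: str) -> str:
    """Remove a single trailing '.' from a search domain, if present."""
    return domain[:-1] if domain.endswith(".") else domain
```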