1 Welcome to Apache SpamAssassin!
2 -------------------------------
3
4 What is Apache SpamAssassin
5 ---------------------------
6
7 Apache SpamAssassin is the #1 Open Source anti-spam platform giving
8 system administrators a filter to classify email and block "spam"
9 (unsolicited bulk email). It uses a robust scoring framework and plug-ins
10 to integrate a wide range of advanced heuristic and statistical analysis
11 tests on email headers and body text including text analysis, Bayesian
12 filtering, DNS blocklists, and collaborative filtering databases.
13
14 Apache SpamAssassin is a project of the Apache Software Foundation (ASF).
15
16
17 What Apache SpamAssassin Is Not
18 -------------------------------
19
20 Apache SpamAssassin is not a program to delete spam, route spam and ham to
21 separate mailboxes or folders, or send bounces when you receive spam.
22 Those are mail routing functions, and Apache SpamAssassin is not a mail
23 router. Apache SpamAssassin is a mail filter or classifier. It will examine
24 each message presented to it, and assign a score indicating the
25 likelihood that the mail is spam. An external program must then
26 examine this score and do any routing the user wants done. There are
27 many programs that will easily perform these functions after examining
28 the score assigned by Apache SpamAssassin.
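
As a minimal sketch of the external routing step described above, the shell function below reads an already-scanned message on standard input and prints a folder name. It assumes the message carries the "X-Spam-Flag: YES" header that the default configuration adds to spam; the function name and the folder names "spam" and "inbox" are hypothetical.

```shell
# Hypothetical helper: choose a destination folder for one scanned message.
# Reads the message on stdin. Only the header block (everything up to the
# first blank line) is inspected, so text in the body is ignored.
route_by_spam_flag() {
    if sed '/^$/q' | grep -qi '^X-Spam-Flag:[[:space:]]*YES'; then
        echo spam
    else
        echo inbox
    fi
}
```

A delivery script could call it as "route_by_spam_flag < message" and file the message accordingly. Messages with CRLF line endings would need the carriage returns stripped first.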
29
30
31 How Apache SpamAssassin Works
32 -----------------------------
33
34 Apache SpamAssassin uses a wide range of heuristic tests on mail headers and
35 body text to identify "spam", also known as unsolicited commercial
36 email.
37
38 Once identified, the mail can then be optionally tagged as spam for
39 later filtering using the user's own mail user-agent application.
40
41 Apache SpamAssassin typically differentiates successfully between spam and
42 non-spam in between 95% and 100% of cases, depending on what kind of mail
43 you get and your training of its Bayesian filter. Specifically,
44 Apache SpamAssassin has been shown to produce around 1.5% false negatives (spam
45 that was missed) and around 0.06% false positives (ham incorrectly marked
46 as spam). See the rules/STATISTICS*.txt files for more information.
47
48 Apache SpamAssassin also includes plugins to support reporting spam messages
49 automatically or manually to collaborative filtering databases such as
50 Pyzor, DCC, and Vipul's Razor.
51
52 The distribution provides "spamassassin", a command-line tool to
53 perform filtering, along with the "Mail::SpamAssassin" module set,
54 which allows Apache SpamAssassin to be used in a spam-protection proxy for
55 SMTP or POP/IMAP servers, or in a variety of other spam-blocking scenarios.
56
57 In addition, "spamd", a daemonized version of Apache SpamAssassin which
58 runs persistently, is available. Using its counterpart, "spamc",
59 a lightweight client written in C, an MTA can process large volumes of
60 mail through Apache SpamAssassin without having to fork/exec a Perl interpreter
61 for each message.
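
A minimal sketch of the spamc side of this arrangement follows. It assumes spamd is already running on the default local address and port; the wrapper name and its two file arguments are hypothetical.

```shell
# Hypothetical wrapper: pass one message file through spamd via spamc and
# write the processed message to a second file.
# $1 = input message file, $2 = output file.
scan_with_spamc() {
    in_file=$1
    out_file=$2
    # Fail early with a clear message if the client is not installed.
    if ! command -v spamc >/dev/null 2>&1; then
        echo "spamc not found in PATH" >&2
        return 127
    fi
    # spamc reads the message on stdin and prints the result on stdout.
    spamc < "$in_file" > "$out_file"
}
```

For example, "scan_with_spamc incoming.eml tagged.eml" (both file names are placeholders).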
62
63
64 Questions? Need Help?
65 ---------------------
66
67 If you have questions about Apache SpamAssassin, please check the Wiki[1] to
68 see if someone has already posted an answer to your question. (The
69 Wiki doubles as a FAQ.) Failing that, post a message to the
70 spamassassin-users mailing list[2]. If you've found a bug (and you're
71 sure it's a bug after checking the Wiki), please file a report in our
72 Bugzilla[3].
73
74 [1]: https://wiki.apache.org/spamassassin/
75 [2]: https://wiki.apache.org/spamassassin/MailingLists
76 [3]: https://issues.apache.org/SpamAssassin/
77
78 Please also be sure to read the man pages.
79
80
81 Upgrading Apache SpamAssassin
82 -----------------------------
83
84 IMPORTANT: If you are upgrading from a previous major version of Apache
85 SpamAssassin, please be sure to read the notes in UPGRADE to find out
86 what has changed in a backward-incompatible way.
87
88
89 Installing Apache SpamAssassin
90 ------------------------------
91
92 See the INSTALL file.
93
94
95 Customizing Apache SpamAssassin
96 -------------------------------
97
98 These are the configuration files installed by Apache SpamAssassin. The commands
99 that can be used therein are listed in the POD documentation for the
100 Mail::SpamAssassin::Conf class (run the following command to read it:
101 "perldoc Mail::SpamAssassin::Conf"). Note: The following directories are
102 the standard defaults that people use. All the default locations that
103 Apache SpamAssassin searches are explained under "File Locations" below.
104
105 - /usr/share/spamassassin/*.cf:
106
107 Distributed configuration files, with all defaults. Do not modify
108 these, as they are overwritten when you upgrade.
109
110 - /var/lib/spamassassin/*/*.cf:
111
112 Local state directory; updated rulesets, overriding the
113 distributed configuration files, downloaded using "sa-update". Do
114 not modify these, as they are overwritten when you run
115 "sa-update".
116
117 - /etc/mail/spamassassin/*.cf:
118
119 Site config files, for system admins to create, modify, and
120 add local rules and scores to. Modifications here will be
121 appended to the config loaded from the above directory.
122
123 - /etc/mail/spamassassin/*.pre:
124
125 Plugin control files, installed from the distribution. These are
126 used to control what plugins are loaded. Modifications here will
127 be loaded before any configuration loaded from the above
128 directories.
129
130 You want to modify these files if you want to load additional
131 plugins, or inhibit loading a plugin that is enabled by default.
132 If the files exist in /etc/mail/spamassassin, they will not
133 be overwritten during future installs.
134
135 - /usr/share/spamassassin/user_prefs.template:
136
137 Distributed default user preferences. Do not modify this, as it is
138 overwritten when you upgrade.
139
140 - /etc/mail/spamassassin/user_prefs.template:
141
142 Default user preferences, for system admins to create, modify, and
143 set defaults for users' preferences files. Takes precedence over
144 the above prefs file, if it exists.
145
146 Do not put system-wide settings in here; put them in a file in the
147 "/etc/mail/spamassassin" directory ending in ".cf". This file is
148 just a template, which will be copied to a user's home directory
149 for them to change.
150
151 - $USER_HOME/.spamassassin:
152
153 User state directory. Used to hold spamassassin state, such
154 as a per-user automatic welcomelist, and the user's preferences
155 file.
156
157 - $USER_HOME/.spamassassin/user_prefs:
158
159 User preferences file. If it does not exist, one of the
160 default prefs files from above will be copied here for the
161 user to edit later, if they wish.
162
163 Unless you're using spamd, there is no difference in
164 interpretation between the rules file and the preferences file, so
165 users can add new rules for their own use in the
166 "~/.spamassassin/user_prefs" file, if they like. (spamd disables
167 this for security and increased speed.)
168
169 - $USER_HOME/.spamassassin/bayes*
170
171 Statistics databases used for Bayesian filtering. If they do
172 not exist, they will be created by Apache SpamAssassin.
173
174 Spamd users may wish to create a shared set of bayes databases;
175 the "bayes_path" and "bayes_file_mode" configuration settings
176 can be used to do this.
177
178 See "perldoc sa-learn" for more documentation on how
179 to train this.
180
181 File Locations:
182
183 Apache SpamAssassin will look in a number of areas to find the default
184 configuration files that are used. The "__*__" strings are variables
185 whose values you can see by looking at the first several lines of the
186 "spamassassin" or "spamd" scripts.
187
188 They are set at install time and can be overridden with the Makefile.PL
189 command line options DATADIR (for __def_rules_dir__) and CONFDIR (for
190 __local_rules_dir__). If none of these options were given, FHS-compliant
191 locations based on the PREFIX (which becomes __prefix__) are chosen.
192 These are:
193
194   __prefix__   __def_rules_dir__               __local_rules_dir__
195   -------------------------------------------------------------------------
196   /usr         /usr/share/spamassassin         /etc/mail/spamassassin
197   /usr/local   /usr/local/share/spamassassin   /etc/mail/spamassassin
198   /opt/$DIR    /opt/$DIR/share/spamassassin    /etc/opt/mail/spamassassin
199   $DIR         $DIR/share/spamassassin         $DIR/etc/mail/spamassassin
200
201 The files themselves are then looked for in these paths:
202
203 - Distributed Configuration Files
204 '__def_rules_dir__'
205 '__prefix__/share/spamassassin'
206 '/usr/local/share/spamassassin'
207 '/usr/share/spamassassin'
208
209 - Site Configuration Files
210 '__local_rules_dir__'
211 '__prefix__/etc/mail/spamassassin'
212 '__prefix__/etc/spamassassin'
213 '/usr/local/etc/spamassassin'
214 '/usr/pkg/etc/spamassassin'
215 '/usr/etc/spamassassin'
216 '/etc/mail/spamassassin'
217 '/etc/spamassassin'
218
219 - Default User Preferences File
220 '__local_rules_dir__/user_prefs.template'
221 '__prefix__/etc/mail/spamassassin/user_prefs.template'
222 '__prefix__/share/spamassassin/user_prefs.template'
223 '/etc/spamassassin/user_prefs.template'
224 '/etc/mail/spamassassin/user_prefs.template'
225 '/usr/local/share/spamassassin/user_prefs.template'
226 '/usr/share/spamassassin/user_prefs.template'
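
The search over the paths listed above can be sketched in shell as follows. This assumes that the first candidate that exists is the one used; the helper name is hypothetical, and the example call uses the site configuration candidates for a /usr/local prefix, leaving out __local_rules_dir__ because its value depends on the install.

```shell
# Hypothetical helper: print the first argument that is an existing
# directory, mimicking a first-match search over candidate paths.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example: site configuration candidates, in the order listed above.
first_existing_dir \
    /usr/local/etc/mail/spamassassin \
    /usr/local/etc/spamassassin \
    /usr/pkg/etc/spamassassin \
    /usr/etc/spamassassin \
    /etc/mail/spamassassin \
    /etc/spamassassin \
    || echo "no site configuration directory found"
```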
227
228
229 In addition, the "Distributed Configuration Files" location is overridden
230 by a "Local State Directory", used to store an updated copy of the
231 ruleset:
232
233   __prefix__   __local_state_dir__
234   -------------------------------------------------------------------------
235   /usr         /var/lib/spamassassin/__version__
236   /usr/local   /var/lib/spamassassin/__version__
237   /opt/$DIR    /var/opt/spamassassin/__version__
238   $DIR         $DIR/var/spamassassin/__version__
239
240 This is normally written to by the "sa-update" script. "__version__" is
241 replaced by a representation of the version number, so that multiple
242 versions of Apache SpamAssassin will not interfere with each other's rulesets.
243
244
245 After installation, try "perldoc Mail::SpamAssassin::Conf" to see what
246 can be set. Common first-time tweaks include:
247
248 - required_score
249
250 Set this higher to make Apache SpamAssassin less sensitive.
251 If you are installing Apache SpamAssassin system-wide, this is
252 **strongly** recommended!
253
254 Statistics on how many false positives to expect at various
255 different thresholds are available in the "STATISTICS.txt" file in
256 the "rules" directory.
257
258 - rewrite_header, add_header
259
260 These options affect the way messages are tagged as spam or
261 non-spam. This makes it easy to identify incoming mail.
262
263 - ok_locales
264
265 If you expect to receive mail in non-ISO-8859 character sets (e.g.
266 Chinese, Cyrillic, Japanese, Korean, or Thai), then set this.
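
As a rough illustration of the tweaks listed above, the sketch below writes a small site configuration file. The helper name, the file name "local-example.cf", and every value shown are assumptions chosen for the example, not recommendations; on a default install the target directory would be /etc/mail/spamassassin.

```shell
# Hypothetical helper: write an example site config file into the
# directory given as $1. All values are illustrative only.
write_example_site_config() {
    conf_dir=$1
    cat > "$conf_dir/local-example.cf" <<'EOF'
# Raise the spam threshold to make filtering less sensitive
required_score 6.0
# Mark the Subject of messages classified as spam
rewrite_header Subject *****SPAM*****
# Add an X-Spam-Flag header to messages classified as spam
add_header spam Flag _YESNOCAPS_
# Treat English and Japanese as expected locales
ok_locales en ja
EOF
}
```

After writing such a file, running "spamassassin --lint" should report any syntax errors in the loaded configuration.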
267
268
269 Learning
270 --------
271
272 Apache SpamAssassin includes a Bayesian learning filter, so it is worthwhile
273 training Apache SpamAssassin with your collection of non-spam and spam,
274 if possible. This will make it more accurate for your incoming mail.
275 Do this using the "sa-learn" tools, like so:
276
277 sa-learn --spam ~/Mail/saved-spam-folder
278 sa-learn --ham ~/Mail/inbox
279 sa-learn --ham ~/Mail/other-nonspam-folder
280
281
282 If these are mail folders in mbox format, use the --mbox switch; for
283 Maildirs, use a trailing slash, like Maildir/cur/.
284
285 Use as many mailboxes as you like. Note that Apache SpamAssassin will remember
286 what mails it has learnt from, so you can re-run this as often as you like.
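
The mbox and Maildir cases above can be sketched as a small helper that prints the matching sa-learn command. The helper name is hypothetical, and it assumes that a path ending in "/" is a Maildir directory and anything else is an mbox file; paths containing spaces would need extra quoting.

```shell
# Hypothetical helper: print the sa-learn command for one folder.
# $1 is "spam" or "ham"; $2 is the folder path.
# Assumption: a trailing slash means Maildir, anything else means mbox.
sa_learn_command() {
    class=$1
    folder=$2
    case $folder in
        */) printf 'sa-learn --%s %s\n' "$class" "$folder" ;;
        *)  printf 'sa-learn --%s --mbox %s\n' "$class" "$folder" ;;
    esac
}
```

For instance, "sa_learn_command ham ~/Maildir/cur/" prints a command without --mbox, which can be reviewed before being run.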
287
288
289 Localization
290 ------------
291
292 All text displayed to users is taken from the configuration files. This
293 means that you can translate messages, test descriptions, and templates
294 into other languages.
295
296 If you do so, we would *really* appreciate it if you could contribute
297 these translations, so that they can be added to the
298 distribution. Please file a bug in our Bugzilla[4], and attach your
299 translations. You will, of course, be credited for this work!
300
301 [4]: https://issues.apache.org/SpamAssassin/
302
303
304 Disabled code
305 -------------
306
307 There are some tests and code in Apache SpamAssassin that are turned off by
308 default: experimental code, slow code, or code that depends on
309 non-open-source software or services that are not always free. These
310 disabled tests include:
311
312 - DCC: depends on non-open-source software (disabled in init.pre)
313 - MAPS: commercial service (disabled in 50_scores.cf)
314 - TextCat: slow (disabled in init.pre)
315 - various optional plugins, disabled for speed (disabled in *.pre)
316
317 To turn on tests disabled in 50_scores.cf, simply assign them a non-zero
318 score, e.g. by adding score lines to your ~/.spamassassin/user_prefs file.
319
320 To turn on tests disabled by commenting out the required plugin in
321 init.pre, you need to uncomment the loadplugin line and make sure the
322 prerequisites for proper operation of the plugin are present.
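
Uncommenting a loadplugin line can be sketched as below. The helper name is hypothetical, and it assumes the disabled line has the form "# loadplugin <module>" with nothing after the module name. It prints the edited file to standard output and leaves the input untouched, so the result can be reviewed before it replaces the original.

```shell
# Hypothetical helper: uncomment the loadplugin line for plugin $2 in
# the .pre file $1 and write the result to stdout.
enable_plugin() {
    pre_file=$1
    plugin=$2
    # Strip the leading "#" (and any spaces after it) from the matching
    # line only; all other lines pass through unchanged.
    sed "s/^#[[:space:]]*\(loadplugin[[:space:]]\{1,\}$plugin\)[[:space:]]*\$/\1/" "$pre_file"
}
```

For example, "enable_plugin some.pre Mail::SpamAssassin::Plugin::TextCat", where "some.pre" stands for whichever .pre file holds the commented-out line on your system.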
323
324
325 Automatic Reputation System
326 ---------------------------
327
328 Apache SpamAssassin includes an automatic reputation system. It works by
329 tracking, for each sender address, a rolling average score of the messages
330 seen so far from that address. It then combines this long-term average score
331 for the sender with the score for the particular message being evaluated,
332 after all other rules have been applied.
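
To make the idea of combining a long-term average with the current score concrete, here is an illustrative sketch. It is not the formula the reputation plugin uses: it is a plain linear blend, and the function name and the weight parameter are assumptions made for this example.

```shell
# Illustrative only: NOT the actual reputation formula.
# Blend the current message score ($1) with the sender's historical
# mean score ($2) using a weight ($3) between 0 (ignore history) and
# 1 (use only history).
blend_with_reputation() {
    LC_ALL=C awk -v score="$1" -v mean="$2" -v weight="$3" \
        'BEGIN { printf "%.2f\n", score + (mean - score) * weight }'
}
```

With a message score of 8.0, a sender mean of 2.0 and a weight of 0.5, the blended score is 5.00: a sender with a good history pulls a high-scoring message back toward its usual level.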
333
334 This functionality can be enabled or disabled with the
335 "use_txrep" option.
336
337 For more information, read sql/README.txrep
338
339 (end of README)
340
341 // vim:tw=74: