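SpamAssassin: Is Bayesian learning working here?

Problem description

I'm trying to train a recently installed copy of SpamAssassin, and I get the impression that Bayesian learning isn't working.

First: yes, spamd is running with the --allow-tell option.

Now, I have a spam message. I first run it through SpamAssassin and get the following score: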

[paulo@myserver ~]$ spamc -R < spam6.txt 
2.9/5.0
Spam detection software, running on the system "myserver", has NOT identified this incoming email as spam.  The original
message has been attached to this so you can view it or label
similar future email.  If you have any questions, see
the administrator of that system for details.

Content preview:  Nombre - herbertrl1 E-mail: - mu18@atsushi1010.masumi76.pushmail.fun
   Asunto - Mensaje - New sexy website is available on the web http://porndreamscene.sexjanet.com/?katarina
   porn star carl paula blum porn double d hamster porn video oiled porn clitoris
   massage free young nubile porn [...] 

Content analysis details:   (2.9 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
              [Blocked - see <https://www.spamcop.net/bl.shtml?164.132.34.35>]
 1.7 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: sexjanet.com]
 0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record

So I fed it to spamc using the -L option:

[paulo@myserver ~]$ spamc -L spam < spam6.txt
Message successfully un/learned

Then I tried analyzing it with spamc again... and I got exactly the same score:

[paulo@myserver ~]$ spamc -R < spam6.txt 
2.9/5.0
Spam detection software, [...]

Content analysis details:   (2.9 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
              [Blocked - see <https://www.spamcop.net/bl.shtml?164.132.34.35>]
 1.7 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: sexjanet.com]
 0.0 SPF_HELO_NONE          SPF: HELO does not publish an SPF Record

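Am I missing something?

Solution

SpamAssassin: how much learning does Bayes need?

The default SpamAssassin configuration requires at least 200 spam and 200 ham messages to have been learned before Bayes is used. You can run sa-learn --dump magic to check how many messages have been passed to Bayes learning.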

man Mail::SpamAssassin::Conf (SpamAssassin version 3.1)

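bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.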
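
The activation rule quoted above can be sketched as a simple threshold check. This is only an illustration, not SpamAssassin's actual code: the function name bayes_is_active is hypothetical, and its defaults simply mirror the documented value of 200 for bayes_min_spam_num and bayes_min_ham_num.

```python
def bayes_is_active(nspam, nham, min_spam=200, min_ham=200):
    """Return True when both learned-message counts reach their minimums.

    Hypothetical helper: min_spam and min_ham stand in for
    bayes_min_spam_num and bayes_min_ham_num (documented default: 200).
    """
    return nspam >= min_spam and nham >= min_ham


# A single learned spam message is far below the default thresholds.
print(bayes_is_active(nspam=1, nham=0))
# Counts above both thresholds.
print(bayes_is_active(nspam=2508, nham=508))
```

Under this rule, learning one message with spamc -L on a fresh install would not be expected to change the score, which is consistent with the identical 2.9 result in the question.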

$ sa-learn --dump magic
[…]
0.000          0       2508          0  non-token data: nspam
0.000          0        508          0  non-token data: nham
[…]
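
To read those two counters programmatically, a minimal sketch follows. It assumes the column layout shown in the output above, where the third column carries the value and the label follows "non-token data:". The function name parse_magic_counts is hypothetical.

```python
def parse_magic_counts(dump_text):
    """Extract nspam and nham from `sa-learn --dump magic` output.

    Assumes the layout shown above: the third whitespace-separated
    column is the value, and the label follows "non-token data:".
    """
    counts = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue  # skip elided or unrelated lines
        fields, label = line.split("non-token data:", 1)
        label = label.strip()
        columns = fields.split()
        if label in ("nspam", "nham") and len(columns) >= 3:
            counts[label] = int(columns[2])
    return counts


sample = """\
0.000          0       2508          0  non-token data: nspam
0.000          0        508          0  non-token data: nham
"""
counts = parse_magic_counts(sample)
print(counts)
```

With the sample above, both counts exceed the default minimum of 200. To use live data, the text could be captured with subprocess.run(["sa-learn", "--dump", "magic"], capture_output=True, text=True).stdout and passed to the same function.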
