WO2006066444A1 - Connection-oriented junk mail filtering system and method - Google Patents

Publication number
WO2006066444A1
Authority
WO
WIPO (PCT)
Prior art keywords
filtering
connection
data
module
mail
Prior art date
Application number
PCT/CN2004/001480
Other languages
French (fr)
Chinese (zh)
Inventor
Shengyu Cheng
Dongxin Lu
Qiang Li
Yingjie Bai
Zhiyun Luo
Zuoliang Zhu
Original Assignee
Zte Corporation
Priority date
Filing date
Publication date
Application filed by Zte Corporation
Priority to PCT/CN2004/001480 (WO2006066444A1)
Priority to CN2004800441850A (CN101040279B)
Publication of WO2006066444A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management
    • G06Q10/107: Computer-aided management of electronic mailing [e-mailing]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21: Monitoring or handling of messages
    • H04L51/212: Monitoring or handling of messages using filtering or selective blocking

Abstract

A connection-oriented junk mail filtering system and method. The system includes at least a data acquisition module, a filtering strategy management module, a filtering analysis module, and a data processing module. The data acquisition module captures packets from the monitored network and submits them to the filtering analysis module as the data input of the whole filtering system; the filtering strategy management module configures and manages filtering strategies; the filtering analysis module analyses each input packet against the configured filtering strategy and checks whether it contains information of interest to that strategy; the data processing module performs various kinds of processing on the analysis results produced by the filtering analysis module. The invention solves the missed-alarm and false-alarm problems of packet filtering; its main characteristic is that it is independent of any specific mail server and completely transparent to mail clients and servers. Compared with the prior art, the invention greatly improves the reliability of the junk mail filtering system and widens its applicability.

Description

Connection-oriented spam filtering system and method
Technical field
The invention relates to a method for monitoring network content security, and in particular to a spam filtering system and method in the field of network information security.
Background
E-mail is one of the most important applications on the Internet and has gradually become an indispensable part of work and daily life. Spam usually refers to e-mail containing objectionable content such as reactionary speech, pornography, or violence; it also includes unsolicited bulk e-mail and unsolicited commercial advertising sent by e-mail. Such messages are often sent in large batches, which consumes considerable network resources, reduces productivity, and may also disturb social stability and harm the physical and mental health of young people. According to statistics, spam costs the global economy tens of billions of US dollars every year. Effective protection against spam has therefore become an urgent problem.
Existing spam filtering systems fall mainly into two categories. The first is filtering at the mail client, usually implemented as a plug-in to the mail client program; such a system monitors only a single machine, so its scope of application is limited. The second is filtering at the mail server, which usually requires a bidirectional connection to the mail server and cooperation with it; the monitoring scope of such a system is likewise limited to the directly connected mail server. Both categories require some modification of the original mail client or mail server program and must work together with the original system, so they are not transparent. There are also spam filtering systems that do not depend on the mail client or server and can be deployed at the entry and exit points of the monitored network. Most of these work much like a firewall: they typically check the IP addresses of mail packets and filter on mail header fields (for example the sender, the recipient, and the subject). Because they use simple packet filtering, they cannot avoid missed alarms when content is split across packets, and they are vulnerable to fragmentation attacks.
In summary, existing spam filtering techniques have two main shortcomings: first, they depend too heavily on the mail server or mail client and require some modification of it; second, they cannot filter mail content, or cannot handle fragmentation attacks.
Summary of the invention
The technical problem addressed by the invention is to provide a connection-oriented spam filtering system that performs full-text filtering of e-mail content, is not vulnerable to fragmentation attacks, and is independent of any particular e-mail server. The system can be deployed inside a shared local area network or at the entry and exit points of an enterprise network or an inter-provincial or international backbone network; it has a wide range of application and high reliability.
Another technical problem addressed by the invention is to provide a connection-oriented spam filtering method that performs full-text filtering of e-mail content and is not vulnerable to fragmentation attacks, thereby improving the reliability of the spam filtering system.
A further technical problem addressed by the invention is to provide a connection-oriented spam filtering method that avoids missed alarms and false alarms.
To achieve the above objects, the invention provides a connection-oriented spam filtering system comprising at least a data acquisition module, a filtering policy management module, a filtering analysis module, and a data processing module. The data acquisition module captures packets from the monitored network and submits them to the filtering analysis module as the data input of the whole filtering system; the filtering policy management module configures and manages filtering policies; the filtering analysis module analyses the input packets according to the configured filtering policies and checks whether they contain information of interest to those policies; the data processing module performs various kinds of processing on the analysis results of the filtering analysis module.
The connection-oriented spam filtering system further comprises an operation and maintenance module and a storage and backup module. The operation and maintenance module is used to maintain the system, and the storage and backup module is used to store and back up system data and packets.
A filtering policy comprises a filtering condition and a corresponding action; the filtering condition may be a logical combination of several conditions.
The filtering analysis module comprises a TCP connection maintenance submodule, a mail protocol parsing submodule, and a MIME decoding and content scanning submodule. The TCP connection maintenance submodule maintains a TCP connection hash table; the mail protocol parsing submodule parses the mail protocol; the MIME decoding and content scanning submodule determines the encoding of the input mail data, calls the corresponding conversion function to decode it, and then performs a full-text scan of the mail content.
The hash table takes the four-tuple of a packet (source IP address, destination IP address, source port, destination port) as the input for computing the hash key. Any of several fast hash algorithms may be used to compute the hash value, and hash collisions may be resolved by separate chaining.
Each TCP connection node in the hash table maintained by the TCP connection maintenance submodule contains at least:
(1) The IP addresses and transport-layer port numbers of the client and the server: these four parameters uniquely identify the connection to which a packet belongs;
(2) Protocol type: SMTP, POP3, or IMAP;
(3) Connection lifetime: used to prevent connections that have been inactive for a long time from occupying system resources;
(4) Packet buffer queue: buffers the mail packets on this connection so that, if unsafe data is found on the connection, the mail data can be reassembled and saved;
(5) State of the session on this connection: command interaction state or data transfer state;
(6) Automaton temporary state: used to solve the missed-alarm problem of per-packet keyword filtering;
(7) Security flag of this connection: once unsafe information has been found on the connection, it is marked in this field and subsequent data on the connection is no longer scanned.
To better achieve the above objects, the invention also provides a connection-oriented spam filtering method comprising at least the following steps:
(1) A data acquisition step, which captures packets from the monitored network and submits them to the filtering analysis module as the data input of the whole filtering system;
(2) A filtering policy management step, which configures and manages filtering policies;
(3) A filtering analysis step, which analyses the input packets according to the configured filtering policies and checks whether they contain information of interest to those policies;
(4) A data processing step, which performs various kinds of processing on the analysis results of the filtering analysis module.
Step (3) further comprises: when e-mail is transferred over SMTP, POP3, or IMAP, in the command interaction state, extracting the interactive commands and their parameters from the input packets and analysing them; in the data transfer state, extracting the mail data from the packets, performing MIME decoding and content scanning, and submitting the scan results to the data processing module.
Step (3) further comprises the following steps:
(111) A TCP connection maintenance step, which maintains a TCP connection hash table;
(112) A mail protocol parsing step, which parses the mail protocol;
(113) A MIME decoding and content scanning step, which determines the encoding of the input mail data, calls the corresponding conversion function to decode it, and then performs a full-text scan of the mail content.
Step (113) further comprises: after each packet is scanned, saving the current state in the automaton temporary state field of the connection node to which the packet belongs; when the next packet is scanned, matching starts from the state indicated by that field, so as to avoid missed alarms.
Step (113) further comprises: sorting out-of-order packets on the same TCP connection and scanning the content in the correct order, so as to avoid false alarms.
Because the spam filtering system and method of the invention adopt a "connection-oriented" approach together with suitable algorithms, they solve the missed-alarm and false-alarm problems of packet filtering, do not depend on any particular mail server, and are completely transparent to both mail clients and servers. Compared with the prior art, the invention greatly improves the reliability of the spam filtering system and broadens its scope of application.
Brief description of the drawings
Figure 1 is a schematic diagram of the deployment of the spam filtering system in a shared local area network;
Figure 2 is a schematic diagram of the deployment of the spam filtering system at a network entry and exit point;
Figure 3 is a schematic structural diagram of the spam filtering system of the invention;
Figure 4 is a schematic structural diagram of the filtering analysis module of the invention;
Figure 5 is a schematic diagram of the structure of the TCP connection hash table;
Figure 6 is a schematic diagram of the hash algorithm for TCP connection lookup;
Figures 7A and 7B are schematic diagrams of the missed-alarm problem of packet filtering;
Figures 8A and 8B are schematic diagrams of the false-alarm problem caused by out-of-order packets.
Detailed description
The implementation of the technical solution is described in further detail below with reference to the drawings, largely in the order of the figures:
The spam filtering system monitors e-mail transferred over SMTP (Simple Mail Transfer Protocol), POP3 (Post Office Protocol version 3), and IMAP (Internet Message Access Protocol).
The spam filtering system described by the invention can be deployed inside a shared local area network (see Figure 1) or at the entry and exit points of an enterprise network or an inter-provincial or international backbone network (see Figure 2).
Figure 1 shows the deployment of the spam filtering system inside a shared local area network. In this arrangement, network packets can be captured by putting the network interface card into promiscuous mode, but the system can only listen passively.
Figure 2 shows the deployment of the spam filtering system at a network entry and exit point. In this arrangement, dedicated equipment can be used to capture network packets, and the packets can be fully monitored and controlled.
Figure 3 shows the basic structure of the spam filtering system of the invention. It comprises at least the following modules: data acquisition module 31, filtering policy management module 32, filtering analysis module 33, and data processing module 34.
The data acquisition module 31 captures packets from the monitored network and submits them to the filtering analysis module as the data input of the whole filtering system. Data acquisition can be implemented with an ordinary packet capture tool or with dedicated equipment.
The filtering policy management module 32 is responsible for configuring and managing filtering policies. The filtering policy is the core on which the system operates; it should contain at least a filtering condition and a corresponding action, and the filtering condition may be a logical combination of several conditions. Two example filtering policies follow:
Filtering policy example 1: filtering condition = "the destination IP address is 168.168.192.* and the sender is seqing@nopermit.com", action = "save the mail and raise an alarm";
Filtering policy example 2: filtering condition = "the sender is xxx@fishy.net and the recipient is fishy@xxx.com", action = "cut off the user connection and raise an alarm".
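One way to picture such a policy is as a condition built from smaller predicates plus an action label. The following is a minimal Python sketch under stated assumptions: the names `Policy`, `all_of`, `dest_ip_matches`, `sender_is`, `recipient_is`, and `first_action`, and the dictionary layout of a mail record, are illustrative and do not come from the patent.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Callable, Dict, List, Optional

# Assumption: a mail record is a plain dict with "dst_ip", "sender", "recipient".
Mail = Dict[str, str]
Condition = Callable[[Mail], bool]


def dest_ip_matches(pattern: str) -> Condition:
    """True when the destination IP matches a wildcard such as 168.168.192.*"""
    return lambda mail: fnmatch(mail.get("dst_ip", ""), pattern)


def sender_is(address: str) -> Condition:
    return lambda mail: mail.get("sender", "").lower() == address.lower()


def recipient_is(address: str) -> Condition:
    return lambda mail: mail.get("recipient", "").lower() == address.lower()


def all_of(*conditions: Condition) -> Condition:
    """Logical AND of several conditions."""
    return lambda mail: all(c(mail) for c in conditions)


@dataclass
class Policy:
    condition: Condition
    action: str


def first_action(policies: List[Policy], mail: Mail) -> Optional[str]:
    """Return the action of the first policy whose condition holds, else None."""
    for policy in policies:
        if policy.condition(mail):
            return policy.action
    return None


# The two example policies from the text, expressed with the helpers above.
POLICIES = [
    Policy(all_of(dest_ip_matches("168.168.192.*"),
                  sender_is("seqing@nopermit.com")),
           "save the mail and raise an alarm"),
    Policy(all_of(sender_is("xxx@fishy.net"),
                  recipient_is("fishy@xxx.com")),
           "cut off the user connection and raise an alarm"),
]
```

Representing conditions as composable functions keeps "logical combination" open-ended: an OR or NOT combinator could be added in the same style without changing `Policy`.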
The filtering analysis module 33 analyses the input packets according to the configured filtering policies and checks whether they contain information of interest to those policies. The structure of this module is shown in Figure 4.
This module contains three submodules: TCP (Transmission Control Protocol) connection maintenance 41, mail protocol parsing 42, and MIME (Multipurpose Internet Mail Extensions) decoding and content scanning 43. The TCP connection referred to here is the one established between the monitored mail client and mail server to transfer e-mail; the filtering system is not a party to that connection and only monitors the data carried on it.
The TCP connection maintenance submodule 41 maintains a TCP connection hash table (see Figure 5). The hash table takes the four-tuple of a packet (source IP address, destination IP address, source port, destination port) as the input for computing the hash key (see Figure 6). Any of several fast hash algorithms may be used to compute the hash value, and hash collisions may be resolved by separate chaining. Each TCP connection node in the hash table contains at least the IP addresses and transport-layer port numbers of both endpoints and some current state information for the connection. Depending on circumstances, a separate TCP connection hash table may also be maintained for each of the SMTP, POP3, and IMAP protocols.
For each input packet, the submodule first checks whether it belongs to an already established TCP connection. If so, the packet is processed according to the current state of that connection; otherwise, a new TCP connection node is created for it.
The protocol parsing submodule 42 parses the mail protocol: if the current connection is in the command interaction state, it extracts the protocol commands and parameters from the input packet and processes them; if the current connection is in the data transfer state, it extracts the mail data from the input packet and submits it to the MIME decoding and content scanning submodule.
Figure 4 shows the basic structure of the filtering analysis module. For each input packet, the module first computes the hash key from the four-tuple (source IP address, destination IP address, source port, destination port) and determines whether the packet belongs to an already established TCP connection. If it does, the packet is processed according to the current state of that connection. For example, if the connection is already known to violate the security policy, the content of the input packet need not be scanned; the packet is simply buffered, and once all the data of the mail has arrived the mail is reassembled and saved. If it is not yet known whether the data on the connection violates the security policy, the current packet is scanned and the scan result is stored in the connection node. If the input packet does not belong to any established connection, a TCP connection node is created for it, the packet content is scanned, and the scan result is likewise stored in the connection node.
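The per-packet flow just described can be sketched as follows. This is a simplified illustration, not the patented implementation: it uses a Python `dict` in place of the hand-built hash table, and the names `make_key`, `handle_packet`, and the `is_unsafe` callback are assumptions introduced for the example.

```python
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, int]
Key = Tuple[Endpoint, Endpoint]


def make_key(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> Key:
    """Order the two endpoints so both directions of a connection share a key."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def handle_packet(table: Dict[Key, dict], key: Key, payload: bytes,
                  is_unsafe: Callable[[bytes], bool]) -> dict:
    """Look up or create the connection node, buffer the packet, scan if needed."""
    node = table.get(key)
    if node is None:
        # Packet belongs to no known connection: create a node for it.
        node = table[key] = {"queue": [], "unsafe": False}
    # Always buffer, so the mail can be reassembled if it turns out to be unsafe.
    node["queue"].append(payload)
    # Skip scanning once the connection is already known to violate the policy.
    if not node["unsafe"]:
        node["unsafe"] = is_unsafe(payload)
    return node
```

In this sketch the scan result is reduced to a single boolean stored in the node; a fuller version would also store the matcher state between packets.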
When e-mail is transferred over SMTP, POP3, or IMAP, a session has two basic states: the command interaction state and the data transfer state. In the command interaction state, the mail client and server exchange a series of commands and do not transfer mail data itself; in the data transfer state, the mail client and server transfer the e-mail data. Transitions between the two states can be determined from the captured commands. For example, in SMTP, capturing the "DATA" command moves the session into the data transfer state, and capturing the end-of-mail marker "." returns it to the command interaction state; in POP3, capturing the "RETR" command moves the session into the data transfer state, and capturing the end-of-mail marker "." returns it to the command interaction state. Because packets may be missed during capture, so that the transition between the command interaction state and the data transfer state cannot be determined correctly, the system must also take certain protective measures. For example, if the "DATA" packet sent by the client to the server is missed, the start of the mail data transfer state can be determined from the response with reply code "354" that the server returns to the client.
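The two-state session tracking can be sketched as a small transition function. This is a simplified illustration under stated assumptions: `SessionState` and `next_state` are hypothetical names, input is assumed to arrive as whole protocol lines, IMAP is left out, and a rejected DATA command is not handled.

```python
from enum import Enum


class SessionState(Enum):
    COMMAND = "command interaction"
    DATA = "data transfer"


def next_state(state: SessionState, protocol: str, line: bytes,
               from_client: bool) -> SessionState:
    """Return the session state after observing one protocol line."""
    text = line.rstrip(b"\r\n")
    if state is SessionState.COMMAND:
        if protocol == "SMTP":
            # Normal case: the client's DATA command was captured.
            if from_client and text.upper() == b"DATA":
                return SessionState.DATA
            # Fallback when DATA was missed: the server's 354 reply.
            if not from_client and text.startswith(b"354"):
                return SessionState.DATA
        elif protocol == "POP3":
            if from_client and text.upper().startswith(b"RETR"):
                return SessionState.DATA
        return state
    # Data transfer state: a line consisting only of "." ends the mail.
    if text == b".":
        return SessionState.COMMAND
    return state
```

For instance, feeding `b"354 Start mail input\r\n"` from the server while in the command state moves an SMTP session into the data state even if the client's DATA packet was never captured.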
Figure 5 shows the structure of the TCP connection hash table, which resolves hash collisions by separate chaining. Each node in the hash table is a TCP connection node structure and represents one mail protocol session currently in progress.
Figure 6 shows the implementation of the hash function used for TCP connection lookup. The hash function takes the four-tuple of a packet (source IP address, destination IP address, source port, destination port) as input and computes a hash value. This hash value is used to look up, in the hash table shown in Figure 5, whether the input four-tuple belongs to an already established connection. Because the session packets on a TCP connection travel in both directions, the hash algorithm must be designed so that traffic in both directions on the same connection maps to the same hash value. For example, the following two four-tuples should have the same hash value:
Four-tuple 1: (168.168.192.1, 10.198.60.2, 1386, 25);
Four-tuple 2: (10.198.60.2, 168.168.192.1, 25, 1386).
In addition, because TCP connection lookup is performed very frequently (once for every mail packet), the hash algorithm used should be fast and should produce few key collisions.
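A direction-independent lookup of this kind can be sketched by ordering the two endpoints before hashing. The sketch below is an assumption-laden illustration: CRC32 from `zlib` is swapped in as a convenient fast hash (the patent does not name one), and `connection_key`, `bucket_index`, and `ConnectionTable` are hypothetical names.

```python
import ipaddress
import zlib
from typing import List, Tuple

Endpoint = Tuple[int, int]            # (IP address as integer, port)
Key = Tuple[Endpoint, Endpoint]


def connection_key(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> Key:
    """Canonical key: sort the endpoints so both directions give the same key."""
    a = (int(ipaddress.ip_address(src_ip)), src_port)
    b = (int(ipaddress.ip_address(dst_ip)), dst_port)
    return (a, b) if a <= b else (b, a)


def bucket_index(key: Key, table_size: int = 4096) -> int:
    """Hash the canonical key to a bucket number (CRC32 is an assumed choice)."""
    (ip1, port1), (ip2, port2) = key
    raw = (ip1.to_bytes(16, "big") + port1.to_bytes(2, "big")
           + ip2.to_bytes(16, "big") + port2.to_bytes(2, "big"))
    return zlib.crc32(raw) % table_size


class ConnectionTable:
    """Hash table with separate chaining: each bucket is a list of (key, node)."""

    def __init__(self, size: int = 4096) -> None:
        self.size = size
        self.buckets: List[list] = [[] for _ in range(size)]

    def lookup_or_create(self, src_ip: str, dst_ip: str,
                         src_port: int, dst_port: int):
        """Return (node, created) for the connection the four-tuple belongs to."""
        key = connection_key(src_ip, dst_ip, src_port, dst_port)
        chain = self.buckets[bucket_index(key, self.size)]
        for existing_key, node in chain:
            if existing_key == key:
                return node, False
        node = {"key": key}
        chain.append((key, node))
        return node, True
```

With this sketch, looking up (168.168.192.1, 10.198.60.2, 1386, 25) and then (10.198.60.2, 168.168.192.1, 25, 1386) returns the same node, because both four-tuples reduce to the same canonical key before hashing.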
The MIME decoding and content scanning submodule 43 first determines the encoding of the input mail data, calls the corresponding conversion function to decode it, and then performs a full-text scan of the mail content. Because packet filtering easily produces missed alarms (see Figures 7A and 7B), a suitable algorithm is needed for content scanning. If packets arrive out of order, false alarms may also occur (see Figures 8A and 8B); the packets on the same TCP connection must therefore be sorted and the content scanned in the correct order.
The content scanning referred to in the invention is aimed mainly at the text parts of the mail body and attachments, but, provided the performance of the algorithm allows, it is equally applicable to filtering other types of media (for example pictures and sound).
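The decoding step can be illustrated with Python's standard `email` package, which undoes base64 and quoted-printable transfer encodings. This is a sketch that assumes the complete message has already been reassembled into one byte string; `extract_text_parts` is a hypothetical helper name, not something from the patent.

```python
from email import message_from_bytes
from typing import List


def extract_text_parts(raw_message: bytes) -> List[str]:
    """Return the decoded text of every text/* part of a reassembled mail."""
    message = message_from_bytes(raw_message)
    texts: List[str] = []
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and non-text attachments
        # decode=True reverses the Content-Transfer-Encoding (base64, QP, ...).
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            texts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label: fall back to UTF-8 with replacement.
            texts.append(payload.decode("utf-8", errors="replace"))
    return texts
```

Decoding before scanning matters because an encoded body need not contain the keyword literally: in quoted-printable, the bytes `ba=62b` decode to `babb`.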
Figures 7A and 7B illustrate the missed-alarm problem of packet filtering. Suppose the keyword the mail filtering system must check for is "babb". A user data stream containing this pattern string is shown in Figure 7A, where the remaining segments denote arbitrary strings that contain neither "babb" nor "bab" as a substring. When transmitted over the network, the user data is split into two packets, as shown in Figure 7B. A packet-filtering mail filter then cannot find the "babb" string contained in the user data stream, whether it examines packet 1 or packet 2: a missed alarm has clearly occurred. A suitable algorithm is therefore needed for content scanning. If each scan checks only one keyword, a modified finite-automaton single-keyword matching algorithm can be used (without limitation): after each packet is scanned, the current state is saved in the "automaton temporary state" field of the connection node to which the packet belongs, and when the next packet is scanned, matching starts from the state indicated by that field rather than from the automaton's initial state. If each scan checks several keywords, a modified Aho-Corasick multi-keyword matching algorithm can be used (without limitation); likewise, after each packet is scanned the current state is saved in the "automaton temporary state" field of the connection node, and scanning of the next packet starts from that saved state rather than from the automaton's initial state.
Figures 8A and 8B illustrate the false-alarm problem caused by out-of-order packets. Assume the same keyword as before. The user data stream is shown in Figure 8A; when transmitted over the network it is split into two packets, as shown in Figure 8B. In the figures, "*" denotes an arbitrary string that does not contain "babb", "bab", or "abb" as a substring. Scanned in the correct order, keyword matching does not find "babb" in this stream. With the algorithm above, however, if packet 2 arrives first and packet 1 arrives afterwards, the "b" at the end of packet 2 and the "abb" at the start of packet 1 together form the filtered keyword "babb": a false alarm has clearly occurred. The mail body must therefore be scanned in the correct order. If received packets are out of order, the TCP connection maintenance submodule of the filtering analysis module first sorts them and only then submits them to the subsequent submodules.
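Both problems, the missed alarm across a packet boundary and the false alarm from out-of-order arrival, can be illustrated together. The sketch below is one possible realization, not the patent's own code: it uses a Knuth-Morris-Pratt automaton as the single-keyword matcher (an assumed choice), and `build_failure`, `scan_packet`, and `Reorderer` are hypothetical names. The reorder buffer ignores sequence-number wraparound and overlapping retransmissions.

```python
from typing import Dict, List, Tuple


def build_failure(pattern: bytes) -> List[int]:
    """KMP failure table: longest proper prefix that is also a suffix."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def scan_packet(pattern: bytes, fail: List[int], state: int,
                payload: bytes) -> Tuple[int, bool]:
    """Scan one payload starting from a saved state; return (new_state, matched)."""
    matched = False
    for byte in payload:
        while state and byte != pattern[state]:
            state = fail[state - 1]
        if byte == pattern[state]:
            state += 1
        if state == len(pattern):
            matched = True
            state = fail[state - 1]   # keep going to allow later matches
    return state, matched


class Reorderer:
    """Release payloads in TCP sequence order (simplified, no wraparound)."""

    def __init__(self, first_seq: int) -> None:
        self.next_seq = first_seq
        self.pending: Dict[int, bytes] = {}

    def push(self, seq: int, payload: bytes) -> List[bytes]:
        if seq < self.next_seq:
            return []                 # already delivered: treat as retransmission
        self.pending[seq] = payload
        ready: List[bytes] = []
        while self.next_seq in self.pending:
            data = self.pending.pop(self.next_seq)
            ready.append(data)
            self.next_seq += len(data)
        return ready
```

In use, the state returned by `scan_packet` would be stored in the connection node's automaton temporary state field and passed back in for the next packet of the same connection, after `Reorderer.push` has released that packet in order.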
To support mail protocol parsing and content filtering, the current state of each connection is recorded in its TCP connection node. The node structure contains at least the following information:
1. The IP addresses and transport-layer port numbers of the client and the server: these four parameters uniquely identify the connection to which a data packet belongs;
2. Protocol type: SMTP, POP3 or IMAP;
3. Connection lifetime: used to prevent connections that have been inactive for a long time from occupying system resources;
4. Packet cache queue: caches the mail data packets on this connection so that, if unsafe data is found on the connection, the mail data can be restored and saved;
5. State of the session on this connection: either the command interaction state or the data transmission state;
6. Automaton temporary state: used to solve the missed-alarm problem when keyword filtering is performed packet by packet. When a mail message ends, this field must be reset, that is, pointed back at the initial state of the automaton;
7. Security flag of this connection: when unsafe information is found on the connection, this field is marked and subsequent data on the connection is no longer scanned.
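The node structure listed above might be represented as in the following sketch. All identifiers here (`ConnectionNode`, `find_or_create`, `MAIL_PORTS` and the field names) are hypothetical, a Python `dict` stands in for the TCP connection hash table, and identifying the protocol and the server side by the well-known ports 25, 110 and 143 is an assumption of this example rather than something the description specifies.

```python
import time
from dataclasses import dataclass, field
from enum import Enum


class Protocol(Enum):
    SMTP = "SMTP"
    POP3 = "POP3"
    IMAP = "IMAP"


class SessionState(Enum):
    COMMAND = "command"  # command interaction state
    DATA = "data"        # data transmission state


# Assumption: the server side is recognised by its well-known port.
MAIL_PORTS = {25: Protocol.SMTP, 110: Protocol.POP3, 143: Protocol.IMAP}


@dataclass
class ConnectionNode:
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int
    protocol: Protocol
    last_active: float = field(default_factory=time.monotonic)  # lifetime
    packet_queue: list = field(default_factory=list)            # cached packets
    session_state: SessionState = SessionState.COMMAND
    automaton_state: int = 0   # automaton temporary state, 0 = initial state
    unsafe: bool = False       # security flag

    def end_of_message(self):
        """Reset the per-message fields when a mail message ends."""
        self.automaton_state = 0
        self.session_state = SessionState.COMMAND

    def expired(self, now, idle_timeout):
        """True if the connection has been idle longer than `idle_timeout` seconds."""
        return now - self.last_active > idle_timeout


def find_or_create(table, src_ip, src_port, dst_ip, dst_port):
    """Map a packet's four-tuple to its node, whichever direction it travels."""
    if dst_port in MAIL_PORTS:    # client -> server
        key = (src_ip, src_port, dst_ip, dst_port)
    elif src_port in MAIL_PORTS:  # server -> client
        key = (dst_ip, dst_port, src_ip, src_port)
    else:
        return None               # not mail traffic under this assumption
    node = table.get(key)
    if node is None:
        node = ConnectionNode(*key, protocol=MAIL_PORTS[key[3]])
        table[key] = node
    node.last_active = time.monotonic()
    return node
```

Normalizing the key to (client, server) order means packets in both directions of a connection reach the same node, so the session state and the automaton temporary state are shared across the whole exchange.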
In the command interaction state, the interactive commands and their parameters are extracted from the input data packets and analyzed. In the data transmission state, the mail data is extracted from the data packets, MIME decoding and content scanning are performed, and the scan results are submitted to the data processing module.
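For SMTP, the switch between the two session states might look like the sketch below. It is a deliberately simplified assumption-laden example: the function name `feed_line` and the session dictionary are hypothetical, only client-to-server lines are handled, the server's reply codes are ignored (a real parser would wait for the server's reply to DATA before switching state), and MIME decoding is left out.

```python
def feed_line(session, line):
    """Process one client-to-server line (CRLF already stripped)."""
    if session["state"] == "command":
        # Command interaction state: split the verb from its parameters.
        verb, _, arg = line.partition(" ")
        verb = verb.upper()
        if verb == "DATA":
            session["state"] = "data"
            session["body"] = []
        return ("command", verb, arg)
    # Data transmission state: a lone "." ends the message.
    if line == ".":
        session["state"] = "command"
        return ("message", "\r\n".join(session["body"]))
    # Undo SMTP dot-stuffing: a leading "." was doubled by the sender.
    session["body"].append(line[1:] if line.startswith(".") else line)
    return ("data", line)


session = {"state": "command", "body": []}
events = [feed_line(session, text) for text in (
    "MAIL FROM:<a@example.com>",
    "DATA",
    "Subject: hi",
    "",
    "..leading dot",
    ".",
)]
```

The final event carries the complete message text, which is the point at which MIME decoding and content scanning would be applied and the automaton temporary state reset.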
The data processing module 34 processes the analysis result data of the filtering analysis module in the manner specified by the security filtering policy. Examples include forwarding data packets, discarding data packets, cutting off the user connection, raising an alarm, or restoring the e-mail data packets, reassembling them into an application-layer data stream and saving it to a database.
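One way to picture how a policy selects among these processing modes is the sketch below. The policy representation is entirely hypothetical: each rule pairs a set of keywords that must all have been found (a simple AND combination of conditions) with the actions to apply, and the names `Action` and `choose_actions` are assumptions of this example.

```python
from enum import Enum, auto


class Action(Enum):
    FORWARD = auto()     # forward the data packet
    DROP = auto()        # discard the data packet
    DISCONNECT = auto()  # cut off the user connection
    ALERT = auto()       # raise an alarm
    ARCHIVE = auto()     # restore the mail and save it to the database


def choose_actions(policy, hits):
    """Return the actions of the first rule whose keywords all appear in `hits`."""
    found = set(hits)
    for keywords, actions in policy:
        if set(keywords) <= found:
            return list(actions)
    return [Action.FORWARD]  # nothing matched: let the traffic through


policy = [
    (("babb", "offer"), (Action.DROP, Action.ALERT)),
    (("babb",), (Action.ARCHIVE,)),
]
```

Rules are tried in order, so the more specific combination is listed first; traffic that matches no rule is simply forwarded.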
Depending on actual needs, an operation and maintenance module 36, a storage backup module 35 and the like may also be added. The operation and maintenance module is used for maintaining the system, and the storage backup module is used for storing and backing up system data and data packets.

Industrial Applicability

By adopting the "connection-oriented" technical measures and suitable algorithms, the spam filtering system of the present invention solves the missed-alarm and false-alarm problems of packet filtering. Its most notable feature is that it does not depend on any particular mail server and is completely transparent to both mail clients and mail servers. Compared with the prior art, the present invention greatly improves the reliability of the spam filtering system and broadens the range of applications of the system.

Claims

1. A connection-oriented spam filtering system, characterized in that it comprises at least a data collection module, a filtering policy management module, a filtering analysis module and a data processing module, wherein the data collection module is used to capture data packets from the monitored network and submit them to the filtering analysis module as the data input of the entire filtering system; the filtering policy management module is used to configure and manage filtering policies; the filtering analysis module is used to analyze the input data packets according to the configured filtering policy and check whether they contain information that the filtering policy is concerned with; and the data processing module is used to perform various kinds of processing on the analysis result data of the filtering analysis module.
2. The connection-oriented spam filtering system according to claim 1, characterized in that the system further comprises an operation and maintenance module and a storage backup module, wherein the operation and maintenance module is used for maintaining the system and the storage backup module is used for storing and backing up system data and data packets.
3. The connection-oriented spam filtering system according to claim 1, characterized in that the filtering policy comprises a filtering condition and a corresponding processing mode, the filtering condition being a logical combination of multiple conditions.
4. The connection-oriented spam filtering system according to claim 1, characterized in that the filtering analysis module comprises a TCP connection maintenance submodule, a mail protocol parsing submodule, and a MIME decoding and content scanning submodule, wherein the TCP connection maintenance submodule is used to maintain a TCP connection hash table; the mail protocol parsing submodule is used to parse the mail protocol; and the MIME decoding and content scanning submodule is used to determine the encoding of the input mail data, call the corresponding encoding conversion function to convert it, and then perform a full-text scan of the mail content.
5. The connection-oriented spam filtering system according to claim 4, characterized in that the hash table takes the four-tuple of source IP address, destination IP address, source port and destination port of a data packet as the input for computing the hash key, computes the hash value using various fast hash algorithms, and resolves hash collisions by separate chaining.
6. The connection-oriented spam filtering system according to claim 4, characterized in that each TCP connection node in the hash table contains at least the IP addresses and transport-layer port numbers of both ends of the connection and some current state information of the connection.
7. The connection-oriented spam filtering system according to claim 4, characterized in that the current state of the connection is recorded in the TCP connection node of the TCP connection maintenance submodule.
8. The connection-oriented spam filtering system according to claim 7, characterized in that the structure of the connection node comprises at least:
(1) the IP addresses and transport-layer port numbers of the client and the server, these four parameters uniquely identifying the connection to which a data packet belongs;
(2) protocol type: SMTP, POP3 or IMAP;
(3) connection lifetime: used to prevent connections that have been inactive for a long time from occupying system resources;
(4) packet cache queue: caches the mail data packets on this connection so that, if unsafe data is found on the connection, the mail data can be restored and saved;
(5) state of the session on this connection: either the command interaction state or the data transmission state;
(6) automaton temporary state: used to solve the missed-alarm problem when keyword filtering is performed packet by packet;
(7) security flag of this connection: when unsafe information is found on the connection, this field is marked and subsequent data on the connection is no longer scanned.
9. A connection-oriented spam filtering method, characterized in that the method comprises at least the following steps:
(1) a data collection step, for capturing data packets from the monitored network and submitting them to the filtering analysis module as the data input of the entire filtering system;
(2) a filtering policy management step, for configuring and managing filtering policies;
(3) a filtering analysis step, for analyzing the input data packets according to the configured filtering policy and checking whether they contain information that the filtering policy is concerned with;
(4) a data processing step, for performing various kinds of processing on the analysis result data of the filtering analysis module.
10. The connection-oriented spam filtering method according to claim 9, characterized in that step (3) further comprises: when e-mail is transmitted using SMTP, POP3 or IMAP, in the command interaction state, extracting the interactive commands and their parameters from the input data packets and analyzing them; and in the data transmission state, extracting the mail data from the data packets, performing MIME decoding and content scanning, and submitting the scan results to the data processing module.
11. The connection-oriented spam filtering method according to claim 9, characterized in that step (3) further comprises the following steps:
(111) a TCP connection maintenance step, for maintaining a TCP connection hash table;
(112) a mail protocol parsing step, for parsing the mail protocol;
(113) a MIME decoding and content scanning step, for determining the encoding of the input mail data, calling the corresponding encoding conversion function to convert it, and then performing a full-text scan of the mail content.
12. The connection-oriented spam filtering method according to claim 10, characterized in that step (113) further comprises: after each packet is scanned, saving the current state in the automaton temporary state field of the connection node to which the connection belongs, and, when the next packet is scanned, starting the match from the state indicated by the automaton temporary state of that connection node, so as to avoid missed alarms.
13. The connection-oriented spam filtering method according to claim 10, characterized in that step (113) further comprises: sorting out-of-order data packets on the same TCP connection and scanning the content in the correct order, so as to avoid false alarms.
PCT/CN2004/001480 2004-12-21 2004-12-21 Connection-oriented junk mail filtering system and method WO2006066444A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2004/001480 WO2006066444A1 (en) 2004-12-21 2004-12-21 Connection-oriented junk mail filtering system and method
CN2004800441850A CN101040279B (en) 2004-12-21 2004-12-21 System and method for filter rubbish e-mails faced to connection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2004/001480 WO2006066444A1 (en) 2004-12-21 2004-12-21 Connection-oriented junk mail filtering system and method

Publications (1)

Publication Number Publication Date
WO2006066444A1 true WO2006066444A1 (en) 2006-06-29

Family

ID=36601337

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2004/001480 WO2006066444A1 (en) 2004-12-21 2004-12-21 Connection-oriented junk mail filtering system and method

Country Status (2)

Country Link
CN (1) CN101040279B (en)
WO (1) WO2006066444A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103580919A (en) * 2013-11-04 2014-02-12 复旦大学 Method and system for marking mail user by utilizing mail server blog
CN104796318A (en) * 2014-07-30 2015-07-22 北京中科同向信息技术有限公司 Behavior pattern identification technology
CN106789232A (en) * 2016-12-16 2017-05-31 武汉奥浦信息技术有限公司 A kind for the treatment of control system of efficient information flow
CN112702356A (en) * 2020-12-29 2021-04-23 中孚安全技术有限公司 Network security teaching method, system, equipment and readable storage medium
CN113067765A (en) * 2020-01-02 2021-07-02 中国移动通信有限公司研究院 Multimedia message monitoring method, device and equipment

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102594624A (en) * 2012-03-06 2012-07-18 上海纳轩电子科技有限公司 Method for efficiently capturing network data packets at high speed based on field programmable gate array (FPGA)
CN102857917B (en) * 2012-08-24 2015-06-03 北京拓明科技有限公司 Method for identifying internet access of mobile phone through personal computer (PC) based on signaling analysis
CN103077090B (en) * 2012-12-28 2016-03-23 盘石软件(上海)有限公司 A kind of Outlook deletes the restoration methods of mail
CN106027369A (en) * 2016-05-09 2016-10-12 哈尔滨工程大学 Email address characteristic oriented email address matching method
CN106302491A (en) * 2016-08-23 2017-01-04 浪潮电子信息产业股份有限公司 A kind of mail Monitoring method based on Linux

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6654787B1 (en) * 1998-12-31 2003-11-25 Brightmail, Incorporated Method and apparatus for filtering e-mail
US20030225841A1 (en) * 2002-05-31 2003-12-04 Sang-Hern Song System and method for preventing spam mails
JP2004021623A (en) * 2002-06-17 2004-01-22 Nec Soft Ltd E-mail filter system using directory server and server program
JP2004171169A (en) * 2002-11-19 2004-06-17 Msd Japan:Kk Mail filtering method, mail server and mail filtering program
US20040210640A1 (en) * 2003-04-17 2004-10-21 Chadwick Michael Christopher Mail server probability spam filter

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1145316C (en) * 2001-01-23 2004-04-07 联想(北京)有限公司 Method for filtering electronic mail contents in interconnection network

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103580919A (en) * 2013-11-04 2014-02-12 复旦大学 Method and system for marking mail user by utilizing mail server blog
CN104796318A (en) * 2014-07-30 2015-07-22 北京中科同向信息技术有限公司 Behavior pattern identification technology
CN106789232A (en) * 2016-12-16 2017-05-31 武汉奥浦信息技术有限公司 A kind for the treatment of control system of efficient information flow
CN106789232B (en) * 2016-12-16 2019-12-06 武汉奥浦信息技术有限公司 efficient information-flow processing control system
CN113067765A (en) * 2020-01-02 2021-07-02 中国移动通信有限公司研究院 Multimedia message monitoring method, device and equipment
CN113067765B (en) * 2020-01-02 2023-01-13 中国移动通信有限公司研究院 Multimedia message monitoring method, device and equipment
CN112702356A (en) * 2020-12-29 2021-04-23 中孚安全技术有限公司 Network security teaching method, system, equipment and readable storage medium

Also Published As

Publication number Publication date
CN101040279B (en) 2010-04-28
CN101040279A (en) 2007-09-19

Similar Documents

Publication Publication Date Title
EP2446411B1 (en) Real-time spam look-up system
US8769020B2 (en) Systems and methods for managing the transmission of electronic messages via message source data
Jung et al. An empirical study of spam traffic and the use of DNS black lists
US7886066B2 (en) Zero-minute virus and spam detection
US7873695B2 (en) Managing connections and messages at a server by associating different actions for both different senders and different recipients
US7870200B2 (en) Monitoring the flow of messages received at a server
US20060224673A1 (en) Throttling inbound electronic messages in a message processing system
WO2014101758A1 (en) Method, apparatus and device for detecting e-mail bomb
CN101729542A (en) Multi-protocol information resolving system based on network packet
US20070271611A1 (en) Determining a source of malicious computer element in a computer network
US20060265459A1 (en) Systems and methods for managing the transmission of synchronous electronic messages
WO2006066444A1 (en) Connection-oriented junk mail filtering system and method
US7958187B2 (en) Systems and methods for managing directory harvest attacks via electronic messages
Chiou et al. Blocking spam sessions with greylisting and block listing based on client behavior
Zhang et al. A new approach for detecting abnormal email traffic in backbone network
Sun et al. A novel method to characterize unwanted email Traffic
Sun A new spam traffic characterizatione mechanism
Huang et al. A Collaborative Anti-Spam E-mail Filter

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 200480044185.0

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 04802493

Country of ref document: EP

Kind code of ref document: A1

WWW Wipo information: withdrawn in national office

Ref document number: 4802493

Country of ref document: EP