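RFC3164 specification: https://tools.ietf.org/html/rfc3164
Syslog is commonly used as a transport protocol for logs and similar data. The message format mainly follows one of two specifications, RFC3164 and RFC5424.
RFC5424 differs from RFC3164 mainly in its message format. The RFC3164 format is simpler and fits most use cases, but it has been obsoleted; RFC5424 is now the industry standard for Syslog. The sections below look at each protocol in turn.
RFC5424 (the section numbers below follow the original RFC, for easy cross-reference)
6. Syslog Message Format: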
# Structure of a message
SYSLOG-MSG = HEADER SP STRUCTURED-DATA [SP MSG] # the trailing MSG is optional
# HEADER = priority, version, space, timestamp, space, hostname, space, app name, space, process ID, space, message ID
HEADER = PRI VERSION SP TIMESTAMP SP HOSTNAME
SP APP-NAME SP PROCID SP MSGID
# PRI: priority
PRI = "<" PRIVAL ">" # priority value in angle brackets, e.g. <0>
# Priority value
PRIVAL = 1*3DIGIT ; range 0 .. 191 # 1 to 3 digits, 0 to 191
# syslog protocol version
VERSION = NONZERO-DIGIT 0*2DIGIT # 1 for RFC5424
# Host name
HOSTNAME = NILVALUE / 1*255PRINTUSASCII # "-" or 1 to 255 printable ASCII characters
# Application name
APP-NAME = NILVALUE / 1*48PRINTUSASCII # "-" or 1 to 48 printable ASCII characters
# Process ID
PROCID = NILVALUE / 1*128PRINTUSASCII # "-" or 1 to 128 printable ASCII characters
# Message ID
MSGID = NILVALUE / 1*32PRINTUSASCII # "-" or 1 to 32 printable ASCII characters
# Timestamp
TIMESTAMP = NILVALUE / FULL-DATE "T" FULL-TIME # "-" or date "T" time, e.g. 2003-10-11T22:14:15.003Z
# Full date
FULL-DATE = DATE-FULLYEAR "-" DATE-MONTH "-" DATE-MDAY # yyyy-mm-dd
# Year
DATE-FULLYEAR = 4DIGIT # four digits
# Month
DATE-MONTH = 2DIGIT ; 01-12 # two digits
# Day of month
DATE-MDAY = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on month/year
# Full time (with time zone offset)
FULL-TIME = PARTIAL-TIME TIME-OFFSET
# Time (without time zone offset)
PARTIAL-TIME = TIME-HOUR ":" TIME-MINUTE ":" TIME-SECOND # e.g. 23:59:59
[TIME-SECFRAC]
# Hour
TIME-HOUR = 2DIGIT ; 00-23 # two digits
# Minute
TIME-MINUTE = 2DIGIT ; 00-59 # two digits
# Second
TIME-SECOND = 2DIGIT ; 00-59 # two digits
# Fractional seconds
TIME-SECFRAC = "." 1*6DIGIT # 1 to 6 digits
TIME-OFFSET = "Z" / TIME-NUMOFFSET # offset from UTC: "Z" or +/-hh:mm
# Numeric offset from UTC
TIME-NUMOFFSET = ("+" / "-") TIME-HOUR ":" TIME-MINUTE # +/-hh:mm, up to 23:59
# Structured data
STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT # "-" or one or more SD-ELEMENTs
SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]" # [SD-ID PARAM-NAME="PARAM-VALUE" ...]
SD-PARAM = PARAM-NAME "=" %d34 PARAM-VALUE %d34 # PARAM-NAME="PARAM-VALUE"
SD-ID = SD-NAME # element identifier
PARAM-NAME = SD-NAME # parameter name
PARAM-VALUE = UTF-8-STRING # UTF-8 string; '"', '\' and ']' must be escaped
SD-NAME = 1*32PRINTUSASCII # 1 to 32 printable ASCII characters, except '=', space, ']' and '"'
MSG = MSG-ANY / MSG-UTF8 # message body
MSG-ANY = *OCTET ; not starting with BOM # arbitrary octets (bytes), not starting with the BOM
MSG-UTF8 = BOM UTF-8-STRING # UTF-8 string preceded by the BOM
BOM = %xEF.BB.BF # the bytes EF BB BF mark the body as UTF-8
UTF-8-STRING = *OCTET # octets forming a UTF-8 string as specified in RFC 3629
OCTET = %d00-255 # any byte value, 0 to 255
SP = %d32 # space
PRINTUSASCII = %d33-126 # ASCII 33 to 126: digits, upper- and lower-case letters, punctuation
NONZERO-DIGIT = %d49-57 # ASCII 49 to 57, the digits 1 to 9
DIGIT = %d48 / NONZERO-DIGIT # ASCII 48 to 57, the digits 0 to 9
NILVALUE = "-" # no value available
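As a rough illustration of the grammar above, the following Python sketch splits a message into its HEADER fields with a regular expression. The names HEADER_RE and parse_header are made up for this example; the timestamp is only captured here, not validated, and everything after MSGID is returned untouched as "rest".

```python
import re

# Regex sketch of the RFC5424 HEADER. [!-~] is PRINTUSASCII (ASCII 33 to 126);
# the length limits are the ones given in the ABNF.
HEADER_RE = re.compile(
    r"<(?P<prival>[0-9]{1,3})>"
    r"(?P<version>[1-9][0-9]{0,2}) "
    r"(?P<timestamp>\S+) "
    r"(?P<hostname>[!-~]{1,255}) "
    r"(?P<app_name>[!-~]{1,48}) "
    r"(?P<procid>[!-~]{1,128}) "
    r"(?P<msgid>[!-~]{1,32}) "
    r"(?P<rest>.*)",
    re.DOTALL,
)


def parse_header(line):
    """Split a syslog line into header fields.

    The 'rest' key holds STRUCTURED-DATA and the optional MSG, unparsed.
    """
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("not an RFC5424 message")
    fields = m.groupdict()
    fields["prival"] = int(fields["prival"])
    if fields["prival"] > 191:
        raise ValueError("PRIVAL out of range 0..191")
    # NILVALUE "-" means the field is absent; map it to None.
    for name in ("timestamp", "hostname", "app_name", "procid", "msgid"):
        if fields[name] == "-":
            fields[name] = None
    return fields
```

Mapping NILVALUE to None is a design choice of this sketch, so callers can tell "no value" apart from a literal string.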
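6.1 Message Length
RFC 5424 itself sets no upper limit on message length; the limit comes from the transport mapping in use. Every receiver must be able to accept messages of up to 480 octets and should be able to accept messages of up to 2048 octets. A receiver that gets a message longer than it supports may truncate or discard it:
- Truncate: when truncating, take care that the remaining content is still valid. With UTF-8, for example, a Chinese character usually takes 3 bytes, so cutting in the middle of a character leaves an invalid sequence;
- Discard: if the syslog application treats over-length messages as invalid in its scenario, it can simply drop them;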
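The truncation pitfall can be sketched in a few lines of Python. The function name truncate_utf8 is hypothetical, and the sketch assumes the input is valid UTF-8 to begin with; it backs up from the byte limit until it reaches a character boundary.

```python
def truncate_utf8(data: bytes, limit: int) -> bytes:
    """Cut data to at most limit bytes without splitting a UTF-8 character."""
    if len(data) <= limit:
        return data
    cut = limit
    # UTF-8 continuation bytes look like 0b10xxxxxx. If the first dropped
    # byte is one of them, the character straddles the limit, so back up
    # to its lead byte and drop the whole character.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]
```

For example, cutting the 9-byte encoding of "日志abc" at 4 bytes keeps only the first 3 bytes (the character "日") instead of leaving a dangling lead byte.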
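6.2 Header
6.2.1 PRI
PRI is the message priority, enclosed in "<" and ">". It combines two values:
- Facility: a numeric code identifying the kind of program or subsystem that produced the message (kernel, mail system, a local-use facility, and so on).
- Severity: a numeric code from 0 to 7 indicating how severe the event is.
The value is computed as PRI = Facility * 8 + Severity (for example, 165 is a local4 message with severity Notice: 20 * 8 + 5 = 165).
Facility values and their meanings: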
Numerical Code    Facility
0 kernel messages
1 user-level messages
2 mail system
3 system daemons
4 security/authorization messages
5 messages generated internally by syslogd
6 line printer subsystem
7 network news subsystem
8 UUCP subsystem
9 clock daemon
10 security/authorization messages
11 FTP daemon
12 NTP subsystem
13 log audit
14 log alert
15 clock daemon (note 2)
16 local use 0 (local0)
17 local use 1 (local1)
18 local use 2 (local2)
19 local use 3 (local3)
20 local use 4 (local4)
21 local use 5 (local5)
22 local use 6 (local6)
23 local use 7 (local7)
Severity values and their meanings:
Numerical Code    Severity
0 Emergency: system is unusable
1 Alert: action must be taken immediately
2 Critical: critical conditions
3 Error: error conditions
4 Warning: warning conditions
5 Notice: normal but significant condition
6 Informational: informational messages
7 Debug: debug-level messages
These generally correspond to the eight log levels.
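The PRI arithmetic is easy to check in code. Below is a minimal Python sketch; the names encode_pri, decode_pri and SEVERITIES are illustrative only and are not taken from any library.

```python
# Severity names in code order (0 to 7), as listed in the table above.
SEVERITIES = (
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
)


def encode_pri(facility: int, severity: int) -> int:
    """Combine facility (0..23) and severity (0..7) into a PRIVAL."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be in 0..23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in 0..7")
    return facility * 8 + severity


def decode_pri(prival: int):
    """Split a PRIVAL (0..191) back into (facility, severity)."""
    if not 0 <= prival <= 191:
        raise ValueError("PRIVAL must be in 0..191")
    return divmod(prival, 8)
```

With these helpers, decode_pri(165) gives facility 20 (local4) and severity 5 (Notice), matching the example above.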
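6.2.2 Version
VERSION gives the version of the Syslog protocol; for RFC5424 it is "1".
6.2.3 TIMESTAMP
The timestamp format is: yyyy-mm-ddTHH:MM:SS.xxxxxx+/-HH:MM (or "Z" in place of the numeric offset)
The following rules apply:
- "T" and "Z" must be upper case
- "T" is required
- Leap seconds must not be used
- If no timestamp can be obtained, "-" must be used instead
Examples: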
1985-04-12T23:20:50.52Z # valid
1985-04-12T19:20:50.52-04:00 # valid
2003-10-11T22:14:15.003Z # valid
2003-08-24T05:14:15.000003-07:00 # valid
2003-08-24T05:14:15.000000003-07:00 # invalid: more than 6 digits after the decimal point
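The rules and examples above can be checked with a short Python sketch. The name is_valid_timestamp is made up for this example. One simplification to be aware of: it reuses datetime for the range checks, so year 0000 is rejected even though the grammar allows it.

```python
import re
from datetime import datetime

# Upper-case "T" and "Z" only; 1 to 6 fractional digits; offset is Z or +/-hh:mm.
TIMESTAMP_RE = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})"
    r"T([0-9]{2}):([0-9]{2}):([0-9]{2})(?:\.[0-9]{1,6})?"
    r"(?:Z|[+-]([0-9]{2}):([0-9]{2}))"
)


def is_valid_timestamp(value: str) -> bool:
    """Return True if value is "-" or a well-formed RFC5424 timestamp."""
    if value == "-":
        return True
    m = TIMESTAMP_RE.fullmatch(value)
    if m is None:
        return False
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    try:
        # datetime rejects impossible dates and second == 60 (leap second).
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    off_hour, off_minute = m.group(7), m.group(8)
    if off_hour is not None and (int(off_hour) > 23 or int(off_minute) > 59):
        return False
    return True
```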
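6.2.4 HOSTNAME
HOSTNAME identifies the machine that originally sent the syslog message. The value should be chosen in this order of preference:
- FQDN
Fully qualified domain name: the host name and the domain name joined by ".".
For example, if the host name is bigserver and the domain is mycompany.com, the FQDN is bigserver.mycompany.com.
- Static IP address
- hostname (the bare host name)
- Dynamic IP address
- the NILVALUE
"-"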
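6.2.5 APP-NAME
Identifies the device or application that produced the message; use "-" if it is not available.
6.2.6 PROCID
The process name or process ID; use "-" if it is not available. PROCID is often used to detect discontinuities in the logging process, but it offers no reliable guarantee: a restarted process may be assigned the same process ID again.
6.2.7 MSGID
Identifies the type of message. For example, TCPIN and TCPOUT could denote incoming and outgoing TCP traffic. Use "-" if the type is not available. MSGID can be used to filter messages by type.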
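6.3 STRUCTURED-DATA
Structured data provides a well-defined, easily parsed format for recording data. It can carry meta-information about the syslog message or application-specific information.
It can contain one or more structured data elements ("SD-ELEMENT"); if there are none, "-" is used instead.
6.3.1 SD-ELEMENT
An SD-ELEMENT consists of a name (SD-ID) and zero or more name-value pairs (SD-PARAM).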
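6.3.2 SD-ID
An SD-ID must be unique within a message; it identifies the type and purpose of the SD-ELEMENT.
There are two formats:
- IANA-registered SD-IDs
Standardized SD-IDs registered with IANA; these do not contain "@".
- Custom SD-IDs
Custom SD-IDs take the form name@<private enterprise number>, for example example1@32473. The name part likewise must not contain "@", "=", "]", the double quote ("), spaces, or control characters.
Note that 32473 is registered with IANA as the enterprise number reserved for documentation examples, so it must not be used in real deployments.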
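6.3.3 SD-PARAM
A name-value pair. For IANA-registered SD-IDs the SD-PARAMs are restricted as well, because their PARAM-NAMEs are also registered with IANA; custom SD-IDs are not restricted in this way. A PARAM-NAME is scoped to its SD-ID:
- PARAM-NAMEs with the same name under different SD-IDs are unrelated;
- An SD-ELEMENT may contain the same SD-PARAM more than once;
- Once an SD-ID and its PARAM-NAMEs are defined they must not be changed; new requirements can only be met by defining new ones.
Examples: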
[exampleSDID@32473 iut="3" eventSource="Application"
eventID="1011"]
One element of the custom type exampleSDID@32473, with three parameters.
[exampleSDID@32473 iut="3" eventSource="Application"
eventID="1011"][examplePriority@32473 class="high"]
Two elements.
[exampleSDID@32473 iut="3" eventSource="Application"
eventID="1011"] [examplePriority@32473 class="high"]
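Invalid: there is a space between the two elements. The STRUCTURED-DATA field ends after the first element, and the second one is interpreted as part of MSG.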
[ exampleSDID@32473 iut="3" eventSource="Application"
eventID="1011"][examplePriority@32473 class="high"]
Invalid: there is a space between "[" and the SD-ID.
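To make the valid and invalid cases above concrete, here is a small Python sketch of a STRUCTURED-DATA parser. The name parse_structured_data and the return shape are assumptions of this example, not part of any library. It expects each element on a single line, without the line wrapping used in the listings above.

```python
import re

# SD-NAME: printable ASCII except '=', space, ']' and '"', 1 to 32 characters.
SD_NAME = r'[!#-<>-\\^-~]{1,32}'
ID_RE = re.compile(r'\[(' + SD_NAME + r')')
# SP PARAM-NAME="PARAM-VALUE"; inside the value, '"', '\' and ']' are escaped.
PARAM_RE = re.compile(r' (' + SD_NAME + r')="((?:[^"\\\]]|\\.)*)"', re.DOTALL)


def parse_structured_data(text):
    """Parse STRUCTURED-DATA at the start of text.

    Returns (elements, rest): elements is a list of
    (sd_id, [(name, value), ...]) tuples, rest is whatever follows the
    field (normally SP MSG, or an empty string).
    """
    if text.startswith("-"):
        return [], text[1:]  # NILVALUE: no structured data
    elements, pos = [], 0
    while pos < len(text) and text[pos] == "[":
        m = ID_RE.match(text, pos)
        if m is None:
            raise ValueError("SD-ID must directly follow '['")
        sd_id, pos = m.group(1), m.end()
        params = []
        while True:
            p = PARAM_RE.match(text, pos)
            if p is None:
                break
            # Undo the escaping of '"', '\' and ']'.
            value = re.sub(r'\\(["\\\]])', r'\1', p.group(2))
            params.append((p.group(1), value))
            pos = p.end()
        if pos >= len(text) or text[pos] != "]":
            raise ValueError("malformed SD-ELEMENT")
        pos += 1
        elements.append((sd_id, params))
    if not elements:
        raise ValueError("expected '-' or at least one SD-ELEMENT")
    return elements, text[pos:]
```

Parsing stops at the first character after "]" that is not "[", which is why a space between two elements leaves the second one in the remainder, to be treated as MSG.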
6.4 MSG
The message body, with no format requirements. If the Syslog application encodes it in UTF-8, it must start with the BOM.
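The BOM rule can be sketched in Python as follows. The helper names encode_msg and decode_msg are hypothetical. Falling back to UTF-8 with replacement characters when no BOM is present is an assumption of this sketch, since without the BOM the encoding is simply unknown.

```python
import codecs


def encode_msg(text: str) -> bytes:
    """Encode a MSG body as UTF-8, prefixed with the BOM (EF BB BF)."""
    return codecs.BOM_UTF8 + text.encode("utf-8")


def decode_msg(raw: bytes):
    """Return (text, is_utf8) for a raw MSG body."""
    if raw.startswith(codecs.BOM_UTF8):
        # The BOM declares UTF-8; strip it before decoding.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), True
    # No BOM: the encoding is unknown. Decoding leniently as UTF-8 is a
    # guess made by this sketch, not something the protocol guarantees.
    return raw.decode("utf-8", errors="replace"), False
```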
6.5 Examples
Example 1 - with no STRUCTURED-DATA
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47
- BOM'su root' failed for lonvick on /dev/pts/8
In this example, the VERSION is 1 and the Facility has the value of
4. The Severity is 2. The message was created on 11 October 2003 at
10:14:15pm UTC, 3 milliseconds into the next second. The message
originated from a host that identifies itself as
"mymachine.example.com". The APP-NAME is "su" and the PROCID is
unknown. The MSGID is "ID47". The MSG is "'su root' failed for
lonvick...", encoded in UTF-8. The encoding is defined by the BOM.
There is no STRUCTURED-DATA present in the message; this is indicated
by "-" in the STRUCTURED-DATA field.
Example 2 - with no STRUCTURED-DATA
<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1
myproc 8710 - - %% It's time to make the do-nuts.
In this example, the VERSION is again 1. The Facility is 20, the
Severity 5. The message was created on 24 August 2003 at 5:14:15am,
with a -7 hour offset from UTC, 3 microseconds into the next second.
The HOSTNAME is "192.0.2.1", so the syslog application did not know
its FQDN and used one of its IPv4 addresses instead. The APP-NAME is
"myproc" and the PROCID is "8710" (for example, this could be the
UNIX PID). There is no STRUCTURED-DATA present in the message; this
is indicated by "-" in the STRUCTURED-DATA field. There is no
specific MSGID and this is indicated by the "-" in the MSGID field.
The message is "%% It's time to make the do-nuts.". As the Unicode
BOM is missing, the syslog application does not know the encoding of
the MSG part.
Example 3 - with STRUCTURED-DATA
<165>1 2003-10-11T22:14:15.003Z mymachine.example.com
evntslog - ID47 [exampleSDID@32473 iut="3" eventSource=
"Application" eventID="1011"] BOMAn application
event log entry...
This example is modeled after Example 1. However, this time it
contains STRUCTURED-DATA, a single element with the value
"[exampleSDID@32473 iut="3" eventSource="Application"
eventID="1011"]". The MSG itself is "An application event log
entry..." The BOM at the beginning of MSG indicates UTF-8 encoding.
Example 4 - STRUCTURED-DATA Only
<165>1 2003-10-11T22:14:15.003Z mymachine.example.com
evntslog - ID47 [exampleSDID@32473 iut="3" eventSource=
"Application" eventID="1011"][examplePriority@32473
class="high"]
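7 Structured Data IDs
All standard SD-IDs are registered with IANA.
7.1 timeQuality
Describes the quality of the originator's system time: whether it knows its time zone and whether its clock is synchronized.
7.1.1 tzKnown
Indicates whether the originator knows its time zone: tzKnown="1" if it does, tzKnown="0" otherwise.
7.1.2 isSynced
Indicates whether the originator's clock is synchronized to a reliable external time source, for example via NTP: "1" if it is, "0" otherwise.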
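As an illustration, a timeQuality element could be produced like this in Python. The helper names escape_param_value, format_sd_element and time_quality are made up for this sketch; the escaping follows the PARAM-VALUE rule from section 6.

```python
def escape_param_value(value: str) -> str:
    """Escape '"', '\\' and ']' as required inside a PARAM-VALUE."""
    # Escape the backslash first, so the backslashes added for '"' and ']'
    # are not escaped a second time.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")


def format_sd_element(sd_id, params):
    """Render one SD-ELEMENT from an SD-ID and a list of (name, value) pairs."""
    body = "".join(
        f' {name}="{escape_param_value(value)}"' for name, value in params
    )
    return f"[{sd_id}{body}]"


def time_quality(tz_known: bool, is_synced: bool) -> str:
    """Render a timeQuality element from two boolean flags."""
    return format_sd_element(
        "timeQuality",
        [
            ("tzKnown", "1" if tz_known else "0"),
            ("isSynced", "1" if is_synced else "0"),
        ],
    )
```

For a host that knows its time zone but is not synchronized, time_quality(True, False) yields [timeQuality tzKnown="1" isSynced="0"].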
7.2 origin
Describes the origin of the message.
….