15 changes: 13 additions & 2 deletions app/vlinsert/syslog/syslog.go
@@ -34,6 +34,9 @@ import (
var (
syslogTimezone = flag.String("syslog.timezone", "Local", "Timezone to use when parsing timestamps in RFC3164 syslog messages. Timezone must be a valid IANA Time Zone. "+
"For example: America/New_York, Europe/Berlin, Etc/GMT+3 . See https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/")
+syslogMsgField = flagutil.NewArrayString("syslog.msgField", "Fields to use as the _msg field. "+
+"Defaults to 'message' for plain syslog and 'cef.name' (name of the event) for CEF-formatted logs. "+
+"See https://docs.victoriametrics.com/victorialogs/keyconcepts/#message-field")

listenAddrTCP = flagutil.NewArrayString("syslog.listenAddr.tcp", "Comma-separated list of TCP addresses to listen to for Syslog messages. "+
"See https://docs.victoriametrics.com/victorialogs/data-ingestion/syslog/")
@@ -605,14 +608,22 @@ func processLine(line []byte, currentYear int, timezone *time.Location, useLocal
p.AddField("hostname", remoteIP)
}
}
-logstorage.RenameField(p.Fields, msgFields, "_msg")
+logstorage.RenameField(p.Fields, getMsgFields(), "_msg")
lmp.AddRow(ts, p.Fields, -1)

return nil
}

var timeFields = []string{"timestamp"}
-var msgFields = []string{"message"}

+var defaultMsgFields = []string{"message", "cef.name"}
Member Author

As I understand it, cef.extension.msg is an optional field with a more detailed description of the log entry, while cef.name is the event name and it's present in every CEF-formatted log entry.

Member

> while cef.name is the event name and it's present in every CEF-formatted log entry.

I think we shouldn't invent our own rule; using the CEF name as the default message field may be unexpected. AFAIK Elasticsearch, Logstash, NXLog, etc. map CEF msg to message.

Member Author

So, which field do you suggest using as the default?

As I mentioned above, cef.extension.msg is an optional field with details, while cef.name describes the same event without the details and is present in every log entry:

> name is a string representing a human-readable and understandable description of the event.

src: https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/cef-implementation-standard/Content/CEF/Chapter%201%20What%20is%20CEF.htm

> msg - An arbitrary message giving more details about the event

src: https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/cef-implementation-standard/Content/CEF/Chapter%202%20ArcSight%20Extension.htm#_Toc494359739

So, if we pick the cef.extension.msg field as the default, users will probably have to change that flag, which we want to avoid in order to keep VictoriaLogs simple to use.

I think relying on cef.name is a more universal solution for typical scenarios. If the user wants to change the _msg field, we provide the new flag for that.

Member

Yeah... I understand. My point is that using cef.name (kind of title) as the default is not a universal convention in other systems, so it may be unexpected.

> So, which field do you suggest using as the default?

Probably message,cef.extension.msg (since both are "message" fields).

Also, with cef.name as a default, a user may see the cef.name field on some logs (those where a message field is present) but not on others (those where cef.name was renamed to _msg). If they explicitly configure cef.name, then they understand their own logs well enough to expect this.

About adding the flag, I have no strong concerns. Either adding it now or waiting until users ask for it is fine, since the message field is already one of the most customizable fields across the project.

Member Author

> My point is that using cef.name (kind of title) as the default is not a universal convention in other systems, so it may be unexpected.

I see your point and agree. I just found that the ELK stack has a common schema convention: https://www.elastic.co/elasticsearch/common-schema

TL;DR: they rename similar fields to a universal format, e.g. src.ip, src_ip, source_ip -> source.ip, or CEF's msg -> message. I think this is not the same as our _msg, since Elasticsearch has no concept of a _msg field like the one in VictoriaLogs.
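
To make the trade-off discussed in this thread concrete, here is a minimal Go sketch that reports which field each candidate default list would select as `_msg`. `pickMsgField` is a hypothetical helper written for this illustration and is not part of VictoriaLogs; the `cef.extension.msg` value is invented, while the other field names and values follow the test data in this PR.

```go
package main

import "fmt"

// pickMsgField returns the first candidate name that is present in fields
// with a non-empty value. Hypothetical helper, for illustration only.
func pickMsgField(fields map[string]string, candidates []string) (string, bool) {
	for _, name := range candidates {
		if fields[name] != "" {
			return name, true
		}
	}
	return "", false
}

func main() {
	// CEF entry without the optional msg extension.
	bare := map[string]string{
		"cef.name":     "worm successfully stopped",
		"cef.severity": "10",
	}
	// Same entry with the optional msg extension (invented value).
	detailed := map[string]string{
		"cef.name":          "worm successfully stopped",
		"cef.severity":      "10",
		"cef.extension.msg": "quarantined file on host",
	}

	byName := []string{"message", "cef.name"}         // default proposed in the PR
	byMsg := []string{"message", "cef.extension.msg"} // default suggested in review

	for _, entry := range []map[string]string{bare, detailed} {
		n1, ok1 := pickMsgField(entry, byName)
		n2, ok2 := pickMsgField(entry, byMsg)
		fmt.Printf("byName: %q (found=%v), byMsg: %q (found=%v)\n", n1, ok1, n2, ok2)
	}
}
```

Under these assumptions, the `cef.name` list always finds a source for `_msg`, while the `cef.extension.msg` list finds none for the entry that lacks the optional extension.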


+func getMsgFields() []string {
+	if len(*syslogMsgField) > 0 {
+		return *syslogMsgField
+	}
+	return defaultMsgFields
+}
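
A self-contained sketch of how this fallback might interact with the rename step. The `field` type and `renameField` are hypothetical stand-ins, and the assumption that `logstorage.RenameField` renames only the first non-empty match (trying names in order) is mine, not confirmed by this diff. The `override` parameter replaces the `-syslog.msgField` flag read so the example runs on its own.

```go
package main

import "fmt"

// field is a hypothetical stand-in for a log field.
type field struct {
	Name, Value string
}

var defaultMsgFields = []string{"message", "cef.name"}

// getMsgFields mirrors the function in the diff, but takes the override
// explicitly instead of reading the -syslog.msgField flag.
func getMsgFields(override []string) []string {
	if len(override) > 0 {
		return override
	}
	return defaultMsgFields
}

// renameField renames the first field whose name is in oldNames (tried in
// order) and whose value is non-empty. This is an assumed behavior, not the
// actual logstorage.RenameField implementation.
func renameField(fields []field, oldNames []string, newName string) {
	for _, n := range oldNames {
		for i := range fields {
			if fields[i].Name == n && fields[i].Value != "" {
				fields[i].Name = newName
				return
			}
		}
	}
}

func main() {
	newEntry := func() []field {
		return []field{
			{"cef.name", "worm successfully stopped"},
			{"cef.extension.msg", "invented details"}, // invented value
		}
	}

	// No override: cef.name is renamed and no longer exists under that name.
	a := newEntry()
	renameField(a, getMsgFields(nil), "_msg")
	fmt.Println(a)

	// Override: cef.extension.msg becomes _msg and cef.name is preserved.
	b := newEntry()
	renameField(b, getMsgFields([]string{"cef.extension.msg"}), "_msg")
	fmt.Println(b)
}
```

The first case shows the reviewer's concern: once `cef.name` is chosen as the message source, it can no longer be queried under its own name for that entry.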

var (
errorsTotal = metrics.NewCounter(`vl_errors_total{type="syslog"}`)
2 changes: 1 addition & 1 deletion app/vlinsert/syslog/syslog_test.go
@@ -102,7 +102,7 @@ Sep 19 08:26:10 host CEF:0|Security|threatmanager|1.0|100|worm successfully stop
currentYear := 2023
timestampsExpected := []int64{1685794113000000000, 1695111970000000000, 1685880513000000000, 1685814132345000000}
resultExpected := `{"format":"rfc3164","hostname":"abcd","app_name":"systemd","_msg":"Starting Update the local ESM caches...","remote_ip":"1.2.3.4"}
-{"format":"rfc3164","hostname":"host","app_name":"CEF","cef.version":"0","cef.device_vendor":"Security","cef.device_product":"threatmanager","cef.device_version":"1.0","cef.device_event_class_id":"100","cef.name":"worm successfully stopped","cef.severity":"10","cef.extension.src":"10.0.0.1","cef.extension.dst":"2.1.2.2","cef.extension.spt":"1232","remote_ip":"1.2.3.4"}
+{"format":"rfc3164","hostname":"host","app_name":"CEF","cef.version":"0","cef.device_vendor":"Security","cef.device_product":"threatmanager","cef.device_version":"1.0","cef.device_event_class_id":"100","_msg":"worm successfully stopped","cef.severity":"10","cef.extension.src":"10.0.0.1","cef.extension.dst":"2.1.2.2","cef.extension.spt":"1232","remote_ip":"1.2.3.4"}
{"priority":"165","facility_keyword":"local4","level":"notice","facility":"20","severity":"5","format":"rfc3164","hostname":"abcd","app_name":"systemd","proc_id":"345","_msg":"abc defg","remote_ip":"1.2.3.4"}
{"priority":"123","facility_keyword":"solaris-cron","level":"error","facility":"15","severity":"3","format":"rfc5424","hostname":"mymachine.example.com","app_name":"appname","proc_id":"12345","msg_id":"ID47","exampleSDID@32473.iut":"3","exampleSDID@32473.eventSource":"Application 123 = ] 56","exampleSDID@32473.eventID":"11211","_msg":"This is a test message with structured data.","remote_ip":"1.2.3.4"}`
f(data, currentYear, timestampsExpected, resultExpected)
1 change: 1 addition & 0 deletions docs/victorialogs/CHANGELOG.md
@@ -25,6 +25,7 @@ according to the following docs:
* FEATURE: [querying API](https://docs.victoriametrics.com/victorialogs/querying/): allow using [`limit`](https://docs.victoriametrics.com/victorialogs/logsql/#limit-pipe) and [`offset`](https://docs.victoriametrics.com/victorialogs/logsql/#offset-pipe) pipes after the [`stats` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#stats-pipe) in queries to [`/select/logsql/stats_query`](https://docs.victoriametrics.com/victorialogs/querying/#querying-log-stats). This enables the usage for these pipes in [alerting and recording rules for VictoriaLogs](https://docs.victoriametrics.com/victorialogs/vmalert/). See [#1296](https://github.com/VictoriaMetrics/VictoriaLogs/issues/1296).
* FEATURE: [alerts](https://github.com/VictoriaMetrics/VictoriaLogs/blob/master/deployment/docker/rules): add new alerting rules `PersistentQueueRunsOutOfSpaceIn12Hours` and `PersistentQueueRunsOutOfSpaceIn4Hours` for `vlagent` persistent queue capacity. These alerts help users to take proactive actions before `vlagent` starts dropping logs due to insufficient persistent queue space. See [#10193](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10193)
* FEATURE: [web UI](https://docs.victoriametrics.com/victorialogs/querying/#web-ui): remove the `Date format` setting and always display timestamps with nanosecond precision. See [#1161](https://github.com/VictoriaMetrics/VictoriaLogs/issues/1161).
+* FEATURE: [data ingestion](https://docs.victoriametrics.com/victorialogs/data-ingestion/): add an ability to override the default list of `_msg` fields for syslog ingestion protocol. The default list of `_msg` fields has been updated to `message` (for plain syslog) and `cef.name` (name of the event for CEF-formatted logs). It is useful for CEF-formatted logs which may contain an arbitrary number of additional log fields. See [#1362](https://github.com/VictoriaMetrics/VictoriaLogs/issues/1362).

* BUGFIX: [vlagent](https://docs.victoriametrics.com/victorialogs/vlagent/): hide sensitive values passed via `-remoteWrite.proxyURL` in `/metrics`, `/flags`, and startup logs. Previously these values could be exposed in plain text. See [#1320](https://github.com/VictoriaMetrics/VictoriaLogs/pull/1320).
* BUGFIX: [web UI](https://docs.victoriametrics.com/victorialogs/querying/#web-ui): sanitize markdown URLs in logs rendered with `markdown parsing` enabled, allowing only `http`, `https`, `mailto`, and `tel` schemes for active links and images. See [#1313](https://github.com/VictoriaMetrics/VictoriaLogs/pull/1313).
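
The `-syslog.msgField` flag added in this PR accepts a list of field names. As a rough illustration of how such a list-valued flag could be parsed, here is a minimal sketch built on the standard `flag` package. The `arrayString` type and `parseMsgFields` helper are hypothetical; this is not the `flagutil.NewArrayString` implementation, and the comma-splitting and repeat-to-append behavior are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// arrayString is a hypothetical list-valued flag type. It is a stand-in for
// the value returned by flagutil.NewArrayString, not its implementation.
type arrayString []string

// String implements flag.Value.
func (a *arrayString) String() string { return strings.Join(*a, ",") }

// Set implements flag.Value. It splits the value on commas and appends the
// non-empty items, so the flag may also be passed more than once.
func (a *arrayString) Set(v string) error {
	for _, s := range strings.Split(v, ",") {
		if s = strings.TrimSpace(s); s != "" {
			*a = append(*a, s)
		}
	}
	return nil
}

// parseMsgFields parses args with a private FlagSet and returns the
// configured list of candidate _msg fields (nil when the flag is absent).
func parseMsgFields(args []string) ([]string, error) {
	var msgFields arrayString
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	fs.Var(&msgFields, "syslog.msgField", "Fields to use as the _msg field")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return msgFields, nil
}

func main() {
	fields, err := parseMsgFields([]string{"-syslog.msgField=cef.extension.msg,cef.name"})
	if err != nil {
		panic(err)
	}
	fmt.Println(fields)
}
```

With an empty result the caller would fall back to the built-in defaults, which matches the `len(*syslogMsgField) > 0` check in `getMsgFields`.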