In Logstash, let's build the equivalent of Fluentd's (td-agent's) forest and copy plugins combined. With this setup, even as log types and senders increase, the configuration stays simple because you don't have to add output settings every time.
- Goal - Configuration example with Fluentd
- Configuration example with Logstash
- Conclusion - Multiple outputs in Logstash, like Fluentd's forest + copy
Goal - Configuration example with Fluentd
In Fluentd (td-agent), the output is generalized with forest so that it branches on the tag as a variable. The Fluentd configuration example is as follows; optimizations such as chunk and buffer tuning are not included.
Base configuration that includes the individual configuration files. The version used for verification is td-agent 0.12.31.
/etc/td-agent/td-agent.conf
<source>
@type forward
port 24224
</source>
@include ./conf/*.conf
This simply tails a local log file. The tag set here is used later for the Elasticsearch index and so on.
/etc/td-agent/conf/local_messages.conf
<source>
@type tail
path /var/log/messages
pos_file /var/log/messages.pos
tag "sys.messages.#{Socket.gethostname}"
format syslog
</source>
Output configuration to Elasticsearch, with the same data also saved to files for backup / long-term storage. Thanks to forest + copy, the tag is available as a variable and can be used for the index name and the file path.
/etc/td-agent/conf/elasticsearch.conf
<match *.*.**>
  type forest
  subtype copy
  <template>
    <store>
      @type elasticsearch
      host localhost
      port 9200
      logstash_format true
      logstash_prefix ${tag_parts[0]}.${tag_parts[1]}
      type_name ${tag_parts[0]}
      flush_interval 20
    </store>
    <store>
      @type file
      path /var/log/td-agent/${tag_parts[0]}/${tag_parts[1]}.log
      compress gzip
    </store>
  </template>
</match>
The plugin used here is fluent-plugin-forest. With this, there is no need to add settings to the match directive even when new log types appear; it is efficient when collecting and aggregating many kinds of logs from multiple servers.
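For example, to start shipping another local log it should be enough to drop one more source file into ./conf/ and restart td-agent; the match directive above stays untouched. A minimal sketch, assuming a hypothetical secure log (the path, pos_file and tag are placeholders, not part of the original setup):
/etc/td-agent/conf/local_secure.conf
<source>
  # hypothetical additional source; only the path and tag differ
  @type tail
  path /var/log/secure
  pos_file /var/log/secure.pos
  tag "sys.secure.#{Socket.gethostname}"
  format syslog
</source>
With the forest template above, this should end up in an index prefixed sys.secure and in /var/log/td-agent/sys/secure.log, without any change to elasticsearch.conf.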
Configuration example with Logstash
The Logstash version used for testing is 5.3.0.
$ /usr/share/logstash/bin/logstash --version
logstash 5.3.0
Make sure that path.config: /etc/logstash/conf.d is set in /etc/logstash/logstash.yml (it is by default).
The forest + copy equivalent in Logstash is configured as follows. Each part is described separately, but the settings can be combined into a single file such as /etc/logstash/conf.d/messages.conf.
Add tags in input and branch on them in the same way as Fluentd's tag. This part corresponds to the source directive. Since the hostname is referenced via the %{host} variable, there is no need to change the configuration per host. Tags are used here to stay close to the Fluentd example, but other fields such as id and type could be used instead.
input {
  file {
    path => "/var/log/messages"
    tags => ["sys", "logstash_messages", "%{host}"]
    type => syslog
  }
}
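The same applies on the Logstash side: adding another log should only require one more input with its own tags, for example as an extra file in /etc/logstash/conf.d/. A minimal sketch, assuming a hypothetical secure log (the path and tags are placeholders):
/etc/logstash/conf.d/secure.conf
input {
  file {
    # hypothetical additional log; only path and tags differ,
    # the shared filter and output below apply to it unchanged
    path => "/var/log/secure"
    tags => ["sys", "logstash_secure", "%{host}"]
    type => syslog
  }
}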
Field extraction with regular expressions and date parsing are done in filter; in Fluentd this corresponds to format and time_format in the source directive. Because the filter runs as a single block, it is applied to every log captured by input, so its targets are narrowed with if.
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
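If a new log type needs different parsing, another if branch can be added without touching the existing one. A hedged sketch for a hypothetical nginx access log (the type name and patterns are assumptions, not part of the original setup):
filter {
  # hypothetical branch: applied only to events whose type is nginx_access;
  # syslog events continue to use the branch above
  if [type] == "nginx_access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}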
Output to Elasticsearch and to a file is done in output, which corresponds to Fluentd's match directive. Logstash does not need plugins like copy or forest; multiple outputs and variables can simply be written inside output. This sample assumes Elasticsearch is running on the same server, so change hosts to match your environment.
Since there is no equivalent of <match <tag1>.<tag2>.*>, the output format needs to be standardized across all logs (if one log really needs different handling, a conditional can be used instead, as sketched after the example below).
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{tags[0]}.%{tags[1]}-%{+YYYY.MM.dd}"
  }
  file {
    path => "/var/log/logstash/%{tags[0]}/%{tags[1]}.log"
  }
}
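If one log really does need a different destination, the output can still stay in a single block by branching with a conditional instead of a Fluentd-style match pattern. A minimal sketch (the tag check and the fallback path are assumptions):
output {
  if "sys" in [tags] {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "%{tags[0]}.%{tags[1]}-%{+YYYY.MM.dd}"
    }
    file {
      path => "/var/log/logstash/%{tags[0]}/%{tags[1]}.log"
    }
  } else {
    # hypothetical fallback for events that do not follow the tag convention
    file {
      path => "/var/log/logstash/other/%{type}.log"
    }
  }
}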
Output result
The tags variable is expanded and used as the index name.
$ curl http://localhost:9200/sys.logstash_messages-*
{"sys.logstash_messages-2017.04.09":{"aliases":{},"mappings":{"syslog":{"properties":{"@timestamp":{"type":"date"},"@version":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"host":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"message":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"path":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"received_at":{"type":"date"},"received_from":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"syslog_hostname":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"syslog_message":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"syslog_program":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"syslog_timestamp":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"tags":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"type":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}},"settings":{"index":{"creation_date":"1491720002393","number_of_shards":"5","number_of_replicas":"1","uuid":"AO_MMCiqQYqcNIIEG8XmQg","version":{"created":"5020299"},"provided_name":"sys.logstash_messages-2017.04.09"}}}}
For the file output as well, the tags variable was expanded, and a directory and log file were created based on the tag information.
$ tree /var/log/logstash/
/var/log/logstash/
├── logstash-plain.log
└── sys
    └── logstash_messages.log
Conclusion - Multiple outputs in Logstash, like Fluentd's forest + copy
In Logstash we built the same setup as Fluentd's forest + copy. As a result, even as log types and senders increase, the configuration stays simple because there is no need to add output settings every time.