This article shows how to:
- Collect and process web application logs across multiple servers.
- Ship the collected logs to a Fluentd aggregator in near real time.
- Store the collected logs in Elasticsearch.
- Visualize the data with Kibana.
Prerequisites
- A basic understanding of Fluentd, Elasticsearch, and Kibana
- Fluentd, Elasticsearch, and Kibana already installed
What do we want to do?
Imagine you have an application that exchanges data with external providers. Everything works fine, but sometimes things go wrong, and either you or they need to know exactly what data you sent and what they requested. You google around, realize you need an access log, and five minutes later you have wired in slf4j + logback/log4j2 and are writing to a file on the server. Then your application starts getting traffic: now you have a ten-node cluster and the logs are scattered across all ten nodes. Every time you need to find a request you have to search each node one by one. The moment you realize you need to centralize your logs, this article is here to help.
How do we do it?
There are plenty of tools you can use to centralize application logs: rsyslog, logstash, flume, scribe, fluentd... From the application side, I will use logback for logging and Fluency to ship the data to Fluentd. Elasticsearch will store the log data so it can later be queried by Kibana.
Sending logs to a local Fluentd
First, we need to be able to log requests and responses. This can be achieved in different ways; I will use the logback-access library, which works like a plugin for logback and integrates nicely with Jetty.
Include it in your application with your favorite dependency manager:
```xml
<dependency>
  <groupId>ch.qos.logback</groupId>
  <artifactId>logback-access</artifactId>
  <version>1.2.3</version>
</dependency>
```
This library provides several classes. We will use ch.qos.logback.access.servlet.TeeFilter to get access to the request and response payloads (bodies), and ch.qos.logback.access.jetty.RequestLogImpl to publish the request and response data to logback-access so we can use them in our log layout. Now we need to plug these classes into Jetty; there are two lines to highlight:
contextHandler.addFilter(new FilterHolder(new TeeFilter()), "/*", util.EnumSet.of(DispatcherType.INCLUDE, DispatcherType.REQUEST, DispatcherType.FORWARD))
We use the TeeFilter to intercept every request matching the pattern "/*" and duplicate the request and response payloads so we can log them.
requestLog.setResource("/logback-access.xml")
Logback-access uses its own configuration file (the default path is {jetty.home}/etc/logback-access.xml). It should look something like this:
```xml
<configuration>
  <appender name="FLUENCY" class="ch.qos.logback.more.appenders.FluencyLogbackAppender">
    <!-- Tag for Fluentd. Further information: http://docs.fluentd.org/articles/config-file -->
    <tag>accesslog</tag>
    <!-- Host name/address and port number on which Fluentd is listening -->
    <remoteHost>localhost</remoteHost>
    <port>20001</port>
    <!-- [Optional] Configurations to customize Fluency's behavior: https://github.com/komamitsu/fluency#usage -->
    <ackResponseMode>false</ackResponseMode>
    <fileBackupDir>/tmp</fileBackupDir>
    <!-- Initial chunk buffer size is 1MB (by default) -->
    <bufferChunkInitialSize>2097152</bufferChunkInitialSize>
    <!-- Threshold chunk buffer size to flush is 4MB (by default) -->
    <bufferChunkRetentionSize>16777216</bufferChunkRetentionSize>
    <!-- Max total buffer size is 512MB (by default) -->
    <maxBufferSize>268435456</maxBufferSize>
    <!-- Max wait until all buffers are flushed is 10 seconds (by default) -->
    <waitUntilBufferFlushed>30</waitUntilBufferFlushed>
    <!-- Max wait until the flusher is terminated is 10 seconds (by default) -->
    <waitUntilFlusherTerminated>40</waitUntilFlusherTerminated>
    <!-- Flush interval is 600ms (by default) -->
    <flushIntervalMillis>200</flushIntervalMillis>
    <!-- Max retry of sending events is 8 (by default) -->
    <senderMaxRetryCount>12</senderMaxRetryCount>
    <!-- [Optional] Enable/Disable use of EventTime to get sub-second resolution of log event date-time -->
    <useEventTime>true</useEventTime>
    <encoder>
      <pattern><![CDATA[REQUEST FROM %remoteIP ON %date{yyyy-MM-dd HH:mm:ss,UTC} UTC // %responseHeader{X-UOW} // %responseHeader{X-RequestId} %n>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
%fullRequest
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
%fullResponse
]]></pattern>
    </encoder>
  </appender>
  <appender-ref ref="FLUENCY"/>
</configuration>
```
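To see where the two highlighted lines fit, here is a minimal sketch of an embedded-Jetty setup. This is a sketch under assumptions: Jetty 9.4 with logback-access 1.2.x, and the class name and port are illustrative; only the two highlighted calls come from the article.

```java
import java.util.EnumSet;

import javax.servlet.DispatcherType;

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.FilterHolder;
import org.eclipse.jetty.servlet.ServletContextHandler;

import ch.qos.logback.access.jetty.RequestLogImpl;
import ch.qos.logback.access.servlet.TeeFilter;

public class AccessLoggedServer {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);

        ServletContextHandler contextHandler = new ServletContextHandler(server, "/");
        // Duplicate request/response payloads so the layout can print %fullRequest/%fullResponse.
        contextHandler.addFilter(new FilterHolder(new TeeFilter()), "/*",
                EnumSet.of(DispatcherType.INCLUDE, DispatcherType.REQUEST, DispatcherType.FORWARD));

        // Publish request/response data to logback-access using the configuration above.
        RequestLogImpl requestLog = new RequestLogImpl();
        requestLog.setResource("/logback-access.xml"); // loaded from the classpath
        requestLog.start();
        server.setRequestLog(requestLog);

        server.start();
        server.join();
    }
}
```

With this wiring, every request that passes through the context is duplicated by the TeeFilter and logged through the FLUENCY appender configured above.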
We use ch.qos.logback.more.appenders.FluencyLogbackAppender from logback-more-appenders to glue logback-access and Fluency together:
```xml
<dependency>
  <groupId>org.komamitsu</groupId>
  <artifactId>fluency</artifactId>
  <version>1.8.1</version>
</dependency>
<dependency>
  <groupId>com.sndyuk</groupId>
  <artifactId>logback-more-appenders</artifactId>
  <version>1.5.0</version>
</dependency>
```
Fluency has a number of buffering-related settings you may need to tune; they are well explained in the Fluency documentation (https://github.com/komamitsu/fluency#usage). For this article we will focus on tag, remoteHost, and port:
- tag labels the events. We will use it to match the events in Fluentd so we can parse, filter, and forward them to Elasticsearch.
- remoteHost is where the events will be sent; in this case we run a local Fluentd, so we use 'localhost'.
- port is the port Fluentd listens on.
- encoder.pattern defines the layout of the events. It works just like your usual log pattern and you can use placeholders, but MDC data is not available until this commit is released. Here is an example of how our events will look:
```
REQUEST FROM 69.28.94.231 ON 2018-10-30 00:00:00 UTC // myapp-node-00-1540857599992 // h5hSUaVHvr
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
POST /my/app/path HTTP/1.1
X-Forwarded-Proto: https
X-Forwarded-For: 69.28.94.231
Host: my.company.com
Content-Length: 30
Content-Type: application/json

{"message": "This is the body of the request" }
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
HTTP/1.1 200 OK
X-RequestId: h5hSUaVHvr
X-UOW: myapp-node-00-1540857599992
Date: Mon, 29 Oct 2018 23:59:59 GMT
Content-Type: application/json; charset=UTF-8

{"message": "This is the body of the response", "status": "Okey!"}
```
Forwarding logs from the local Fluentd to a remote Fluentd
We configure the local Fluentd to process our events and forward them to the Fluentd aggregator. The configuration file (by default /etc/td-agent/td-agent.conf):
```
<source>
  @type forward
  port 20001
</source>

<filter accesslog>
  @type parser
  key_name msg
  reserve_data false
  <parse>
    @type multiline
    format_firstline /^REQUEST FROM/
    format1 /REQUEST FROM (?<request.ip>[^ ]*) ON (?<time>\d{4}-\d{2}-\d{2} \d{2}\:\d{2}\:\d{2} [^ ]+) // (?<request.uow>[^ ]*) // (?<request.id>[^ ]*)\n/
    format2 />{49}\n/
    format3 /(?<request.method>[^ ]*) (?<request.path>[^ ]*) (?<request.protocol>[^ ]*)\n/
    format4 /(?<request.headers>(?:.|\n)*?)\n\n/
    format5 /(?<request.body>(?:.|\n)*?)\n/
    format6 /<{49}\n/
    format7 /(?<response.protocol>[^ ]*) (?<response.status.code>[^ ]*) (?<response.status.description>[^\n]*)\n/
    format8 /(?<response.headers>(?:.|\n)*?)\n\n/
    format9 /(?<response.body>(?:.|\n)*?)\n\Z/
  </parse>
</filter>

# Parse request.headers="Header: Value\nHeader: Value\n" into an object:
# request.headers={"Header": "Value", "Header": "Value"}
<filter accesslog>
  @type record_transformer
  enable_ruby true
  renew_record false
  auto_typecast true
  <record>
    hostname "#{Socket.gethostname}"
    request.headers ${Hash[record["request.headers"].each_line.map { |l| l.chomp.split(': ', 2) }]}
    response.headers ${Hash[record["response.headers"].each_line.map { |l| l.chomp.split(': ', 2) }]}
  </record>
</filter>

<match accesslog>
  @type forward
  send_timeout 5s
  recover_wait 10s
  hard_timeout 30s
  flush_interval 5s
  <server>
    name elastic-node-00
    host elastic-node-00
    port 24224
    weight 100
  </server>
</match>

<match **>
  @type file
  path /tmp/fluentd/output/messages
</match>
```
Highlights:
- source.port is the same port we configured in logback-access.xml to send the access-log events.
- The filter and match tags carry the 'accesslog' keyword. This is the tag I mentioned earlier. We are using an exact match here, but it could be a regular expression.
- The match tag forwards our events to the Fluentd aggregator located on the host 'elastic-node-00' and listening on port 24224.
- Filters are applied in order.
- filter.parse holds the regular expressions that parse our events. The named capture groups (such as response.body or request.method) become JSON properties after filtering. For example, this is how our sample event looks after each filter:
First filter:

```
{
  ...
  "time": "2018-10-30 00:00:00 UTC",
  "request.ip": "192.168.0.1",
  "request.uow": "myapp-node-00-1540857599992",
  "request.id": "h5hSUaVHvr",
  "request.method": "POST",
  "request.path": "/my/app/path",
  "request.protocol": "HTTP/1.1",
  "request.headers": "X-Forwarded-Proto: https\nX-Forwarded-For: 69.28.94.231\nHost: my.company.com\nContent-Length: 30\nContent-Type: application/json",
  "request.body": "{\"message\": \"This is the body of the request\" }",
  "response.protocol": "HTTP/1.1",
  "response.status.code": "200",
  "response.status.description": "OK",
  "response.headers": "X-RequestId: h5hSUaVHvr\nX-UOW: myapp-node-00-1540857599992\nDate: Mon, 29 Oct 2018 23:59:59 GMT\nContent-Type: application/json; charset=UTF-8",
  "response.body": "{\"message\": \"This is the body of the response\", \"status\": \"Okey!\"}"
  ...
}
```

Second filter:

```
{
  ...
  "time": "2018-10-30 00:00:00 UTC",
  "request.ip": "192.168.0.1",
  "request.uow": "myapp-node-00-1540857599992",
  "request.id": "h5hSUaVHvr",
  "request.method": "POST",
  "request.path": "/my/app/path",
  "request.protocol": "HTTP/1.1",
  "request.headers": {
    "X-Forwarded-Proto": "https",
    "X-Forwarded-For": "69.28.94.231",
    "Host": "my.company.com",
    "Content-Length": "30",
    "Content-Type": "application/json"
  },
  "request.body": "{\"message\": \"This is the body of the request\" }",
  "response.protocol": "HTTP/1.1",
  "response.status.code": "200",
  "response.status.description": "OK",
  "response.headers": {
    "X-RequestId": "h5hSUaVHvr",
    "X-UOW": "myapp-node-00-1540857599992",
    "Date": "Mon, 29 Oct 2018 23:59:59 GMT",
    "Content-Type": "application/json; charset=UTF-8"
  },
  "response.body": "{\"message\": \"This is the body of the response\", \"status\": \"Okey!\"}"
  ...
}
```
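If the inline Ruby in the second filter is hard to read, this Java sketch performs the equivalent header-flattening step; it is purely illustrative, since Fluentd itself runs the Ruby expression shown in the configuration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HeaderParser {
    // Turns "Header: Value\nHeader: Value\n" into an ordered map, mirroring
    // the inline Ruby in the record_transformer filter above.
    static Map<String, String> parseHeaders(String raw) {
        Map<String, String> headers = new LinkedHashMap<>();
        for (String line : raw.split("\n")) {
            String[] parts = line.split(": ", 2); // split on the first ": " only
            if (parts.length == 2) {
                headers.put(parts[0], parts[1]);
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String raw = "X-Forwarded-Proto: https\nHost: my.company.com\nContent-Type: application/json";
        System.out.println(parseHeaders(raw));
        // {X-Forwarded-Proto=https, Host=my.company.com, Content-Type=application/json}
    }
}
```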
Storing the collected logs in Elasticsearch
This part is quite simple: we receive the events and forward them to Elasticsearch. The Fluentd configuration file should look like this:
```
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match accesslog>
  @type elasticsearch
  scheme http
  host localhost
  port 9200
  logstash_format true
  validate_client_version true
</match>
```
Highlights:
- source.port is the same port we configured in match.server.port on the forwarding side.
- match.logstash_format makes the plugin write to Elasticsearch indices named logstash-YYYY.MM.DD.
- match.port is the port the Elasticsearch API listens on.
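Before opening Kibana you can sanity-check that events are reaching Elasticsearch. A minimal query sketch, assuming Java 11+ and Elasticsearch on localhost:9200 as configured above; the request.id value is taken from our sample event, and the exact field name depends on how your mapping stores the dotted keys:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CheckAccessLogs {
    public static void main(String[] args) throws Exception {
        // Search the daily logstash-* indices that logstash_format creates.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/logstash-*/_search?q=request.id:h5hSUaVHvr"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // matching events as JSON hits
    }
}
```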
Viewing the data in Kibana
Now you only need to open Kibana, and all the access logs will be available there.