11. Tuple Sampling and Debugging
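11.1. Overview
While debugging a topology, many Storm users add "debug" bolts or Trident functions to log the data flowing through the topology, then remove or disable them for production deployment. The Storm UI now includes this capability, letting you sample the tuples flowing through a whole topology or a single component directly from the UI. Sampled events can be viewed directly in the Storm UI and are also saved to disk.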
11.2. Usage
1. Because it carries a slight performance overhead, this feature is disabled by default. To enable it, set the following parameter in conf/storm.yaml.
Parameter | Meaning | When to use |
---|---|---|
topology.eventlogger.executors: 0 | No event logger tasks are created (default). | If you don’t intend to inspect tuples and don’t want the slight performance hit. |
topology.eventlogger.executors: 1 | One event logger task for the topology. | If you want to sample a low percentage of tuples from a specific spout or a bolt. This could be the most common use case. |
topology.eventlogger.executors: nil | One event logger task per worker. | If you want to sample the entire topology (all spouts and bolts) at a very high sampling percentage and the tuple rate is very high. |
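2. After setting the parameter, proceed as follows:
To sample data from the entire topology:
To sample data from a specific component:
Clicking the "event" button shown in the figure above redirects you to the page shown below. Each line of event.log represents one tuple emitted by a component, in the format: Timestamp, Component name, Component task-id, MessageId (in case of anchoring), List of emitted values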
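The five-field line format described above can be split mechanically when the log needs post-processing outside the UI. The following is only a minimal sketch, not part of Storm: the names `SampledEvent` and `parse_event_line` are hypothetical, the sample line is made up for illustration, and the code assumes that the fields are comma-separated and that the first four fields contain no commas themselves.

```python
from typing import NamedTuple


class SampledEvent(NamedTuple):
    """One sampled tuple, holding the five fields listed in the format above."""
    timestamp: str
    component: str
    task_id: int
    message_id: str
    values: str


def parse_event_line(line: str) -> SampledEvent:
    """Split one log line into its five fields (hypothetical helper)."""
    # Assumption: the first four fields contain no commas, so everything
    # after the fourth comma is kept intact as the list of emitted values.
    parts = [p.strip() for p in line.rstrip("\n").split(",", 4)]
    if len(parts) != 5:
        raise ValueError(f"expected 5 comma-separated fields, got {len(parts)}")
    timestamp, component, task_id, message_id, values = parts
    return SampledEvent(timestamp, component, int(task_id), message_id, values)


# Hypothetical sample line, invented for illustration only.
sample = "2016-01-01 12:00:00,spout,3,,[hello world]"
event = parse_event_line(sample)
print(event.component, event.task_id, event.values)
```

Limiting the split to four separators keeps commas inside the emitted values from breaking the record apart. If the real timestamp or MessageId format contains commas, the split rule would need to be adjusted accordingly.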