pipeline

package

v0.0.0-...-5c7ffcf Latest Latest Go to latest Published: Jul 2, 2024 License: Apache-2.0 Imports: 28 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

github.com/traas-stack/holoinsight-agent

Links

Open Source Insights

README ¶

流水线

日志流水线

目前, 每个采集任务自己一条流水线.
这和vessel里不太一样, 这里做一些解释.
最主要的原因还是为了支持多文件采集.
原本vessel上的都是单文件采集, 而现在为了支持多文件采集, 一个pipeline将只执行一个采集任务, 这个采集任务会有多个日志源和一个消费者(这跟原来是相反的, 原来是有一个日志源和多个消费者).

为了防止日志从磁盘重复读取, 我们在FileLogStream里做了cache, 并结合pipeline做了一个巧妙的类似sls cursor的消费流程. 总之它可以保证相同文件是按流式消费, 并且只会从磁盘读一次.

组成

日志源探测器: 用于检测匹配哪些日志源(可能会变化)
多个日志源: 实际匹配的日志源, pipeline会串行消费他们
一个消费者: 从日志源获取到的结果传输给消费者

配置更新

配置更新的几种类型:

引起日志源增删
只改变过滤条件, 对最终产出的数据格式无影响, schema不变
改变了产出的字段名, 分组名, 分组数等, 会对最终产出的数据格式有影响, schema都变了
以上几种的组合

如何保证配置更新是平滑的

平滑是指在"合理"的情况下不会导致数据丢失.

如果引起日志源增删的, 那么要尽快生效, 这个过程不存在"平滑"的说法. 删掉一个日志源那就从那之后就不采了, 但之前已经采的还是会记录在内.
改变过滤条件, 要尽快生效, 不用特殊的操作去保持平滑.
改变了schema, 由于schema变了, 最终在产品展示上肯定是新的开始(这是时序数据的特征), 维度都不一样了, 因此它也没有保持"平滑"的必要.
以上几种的组合: 只要能正确处理每一项, 对于几种的组合就自然而然是正确的.

综上, 想保证平滑, 基本上就是该处理的日志都处理了, 别丢失日志处理, 处理规则立即生效就行.

因为那些不能保证平滑的, 本身也不需要保证平滑.

配置更新前后pipeline实例是同一个.

暂停 pipeline (加锁)
创建新的 consumer(简称新c)
停止旧的 consumer(简称旧c)
新c从旧c继承一些状态(比如消费进度, 定时器, 时间之类的)
启动新c
恢复 pipeline (解锁)

系统指标流水线

TODO 系统指标流水线会更新吗?

Documentation ¶

Index ¶

type Manager
- func NewManager(ctm collecttask.IManager, lsm *logstream.Manager) *Manager

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Manager ¶

type Manager struct {
	// contains filtered or unexported fields
}

需要管理2种类型的pipelines

func NewManager ¶

func NewManager(ctm collecttask.IManager, lsm *logstream.Manager) *Manager

func (*Manager) LoadAll ¶

func (m *Manager) LoadAll()

func (*Manager) Start ¶

func (m *Manager) Start()

func (*Manager) Stop ¶

func (m *Manager) Stop()

func (*Manager) Update ¶

func (m *Manager) Update(f func(map[string]api.Pipeline))

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
integration
alibabacloud
base
standard

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL