Documentation ¶
Overview ¶
Package profiler provides functionality for collecting the following information: USE Metrics for various components of the Linux system (CPU, Memory, Disk I/O), kernel trace output, and the output of arbitrary shell commands provided by the on-call engineer.
Index ¶
- func CollectUSEMetrics(component Component, outputs map[string]utils.ParsedOutput) error
- func NewDF(name string, flags string, titles []string) *df
- func NewFree(name string, titles []string) *free
- func NewIOStat(name string, flags string, delay int, count int, titles []string) *iostat
- func NewLscpu(name string, titles []string) *lscpu
- func NewVMStat(name string, delay int, count int, titles []string) *vmstat
- type CPU
- func (c *CPU) AdditionalInformation() string
- func (c *CPU) CollectErrors(outputs map[string]utils.ParsedOutput) error
- func (c *CPU) CollectSaturation(outputs map[string]utils.ParsedOutput) error
- func (c *CPU) CollectUtilization(outputs map[string]utils.ParsedOutput) error
- func (c *CPU) Name() string
- func (c *CPU) USEMetrics() *USEMetrics
- type Command
- type Component
- type MemCap
- func (m *MemCap) AdditionalInformation() string
- func (m *MemCap) CollectErrors(outputs map[string]utils.ParsedOutput) error
- func (m *MemCap) CollectSaturation(outputs map[string]utils.ParsedOutput) error
- func (m *MemCap) CollectUtilization(outputs map[string]utils.ParsedOutput) error
- func (m *MemCap) Name() string
- func (m *MemCap) USEMetrics() *USEMetrics
- type ProfilerConfig
- type ProfilerReport
- type StorageCap
- func (s *StorageCap) AdditionalInformation() string
- func (s *StorageCap) CollectErrors(outputs map[string]utils.ParsedOutput) error
- func (s *StorageCap) CollectSaturation(outputs map[string]utils.ParsedOutput) error
- func (s *StorageCap) CollectUtilization(outputs map[string]utils.ParsedOutput) error
- func (s *StorageCap) Name() string
- func (s *StorageCap) USEMetrics() *USEMetrics
- type StorageDevIO
- func (d *StorageDevIO) AdditionalInformation() string
- func (d *StorageDevIO) CollectErrors(outputs map[string]utils.ParsedOutput) error
- func (d *StorageDevIO) CollectSaturation(outputs map[string]utils.ParsedOutput) error
- func (d *StorageDevIO) CollectUtilization(outputs map[string]utils.ParsedOutput) error
- func (d *StorageDevIO) Name() string
- func (d *StorageDevIO) USEMetrics() *USEMetrics
- type USEMetrics
- type USEReport
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CollectUSEMetrics ¶
func CollectUSEMetrics(component Component, outputs map[string]utils.ParsedOutput) error
CollectUSEMetrics collects USE Metrics for the component specified. It does this by calling the necessary methods to collect utilization, saturation and errors.
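The call sequence described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `parsedOutput` is a stand-in for `utils.ParsedOutput` (whose definition is not shown on this page), `collector` is a trimmed-down copy of the Component interface, and `collectUSE` and `fakeComponent` are hypothetical names. Stopping at the first error is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// parsedOutput is a stand-in for utils.ParsedOutput (assumption).
type parsedOutput map[string][]string

// collector is a reduced, hypothetical version of the Component interface.
type collector interface {
	CollectUtilization(outputs map[string]parsedOutput) error
	CollectSaturation(outputs map[string]parsedOutput) error
	CollectErrors(outputs map[string]parsedOutput) error
}

// collectUSE calls the three collection methods in order and returns the
// first error it sees, wrapped with the step that failed.
func collectUSE(c collector, outputs map[string]parsedOutput) error {
	if err := c.CollectUtilization(outputs); err != nil {
		return fmt.Errorf("collecting utilization: %w", err)
	}
	if err := c.CollectSaturation(outputs); err != nil {
		return fmt.Errorf("collecting saturation: %w", err)
	}
	if err := c.CollectErrors(outputs); err != nil {
		return fmt.Errorf("collecting errors: %w", err)
	}
	return nil
}

// fakeComponent records which methods were called, for demonstration only.
type fakeComponent struct {
	calls          []string
	failSaturation bool
}

func (f *fakeComponent) CollectUtilization(map[string]parsedOutput) error {
	f.calls = append(f.calls, "utilization")
	return nil
}

func (f *fakeComponent) CollectSaturation(map[string]parsedOutput) error {
	f.calls = append(f.calls, "saturation")
	if f.failSaturation {
		return errors.New("vmstat output missing")
	}
	return nil
}

func (f *fakeComponent) CollectErrors(map[string]parsedOutput) error {
	f.calls = append(f.calls, "errors")
	return nil
}

func main() {
	c := &fakeComponent{}
	if err := collectUSE(c, nil); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("methods called:", c.calls)
}
```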
Types ¶
type CPU ¶
type CPU struct {
// contains filtered or unexported fields
}
CPU holds information about the CPU component: name and USE Metrics collected.
func NewCPU ¶
NewCPU creates a CPU component; it can be used to initialize CPU outside of the profiler package.
func (*CPU) AdditionalInformation ¶
AdditionalInformation returns additional information unique to the CPU component.
func (*CPU) CollectErrors ¶
func (c *CPU) CollectErrors(outputs map[string]utils.ParsedOutput) error
CollectErrors collects errors for the CPU component.
func (*CPU) CollectSaturation ¶
func (c *CPU) CollectSaturation(outputs map[string]utils.ParsedOutput) error
CollectSaturation calculates the saturation value for the CPU component. It does this by comparing the number of runnable processes with the number of CPUs in the system. If the number of processes (running or waiting) is greater than the CPU count, the CPU component is saturated. The value of runnable processes is found on vmstat's 'r' column and CPU count from lscpu's "CPU(s)" row.
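The comparison described above might look like the sketch below. `cpuSaturated` is a hypothetical helper, not part of this package. Averaging the `r` samples before comparing is an assumption, since the page does not say how multiple vmstat samples are combined.

```go
package main

import "fmt"

// cpuSaturated reports whether the average number of runnable processes
// (vmstat's 'r' column) exceeds the CPU count (lscpu's "CPU(s)" row).
// Averaging across samples is an assumption made for this sketch.
func cpuSaturated(runnable []int, cpuCount int) (bool, error) {
	if len(runnable) == 0 {
		return false, fmt.Errorf("need at least one vmstat sample, got %d", len(runnable))
	}
	if cpuCount <= 0 {
		return false, fmt.Errorf("invalid CPU count %d", cpuCount)
	}
	sum := 0
	for _, r := range runnable {
		sum += r
	}
	avg := float64(sum) / float64(len(runnable))
	return avg > float64(cpuCount), nil
}

func main() {
	// Three samples of vmstat's 'r' column on a 4-CPU machine.
	saturated, err := cpuSaturated([]int{6, 9, 7}, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("CPU saturated:", saturated)
}
```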
func (*CPU) CollectUtilization ¶
func (c *CPU) CollectUtilization(outputs map[string]utils.ParsedOutput) error
CollectUtilization calculates the utilization score for the CPU component. It does this by summing the time spent running non-kernel code (user time), the time spent running kernel code (system time), and the time stolen from a virtual machine (steal) to get the total CPU time spent servicing work. These values can be found in vmstat's 'us' (user), 'sy' (system), and 'st' (steal) columns.
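As a rough illustration of that sum, the sketch below adds the three columns per sample. `vmstatSample` and `cpuUtilization` are hypothetical names, and averaging over samples is an assumption rather than documented behavior.

```go
package main

import "fmt"

// vmstatSample holds the three vmstat columns used here, as percentages.
type vmstatSample struct {
	User   float64 // 'us' column
	System float64 // 'sy' column
	Steal  float64 // 'st' column
}

// cpuUtilization sums us + sy + st for each sample and averages the result
// across samples (the averaging step is an assumption for this sketch).
func cpuUtilization(samples []vmstatSample) (float64, error) {
	if len(samples) == 0 {
		return 0, fmt.Errorf("need at least one vmstat sample, got %d", len(samples))
	}
	total := 0.0
	for _, s := range samples {
		total += s.User + s.System + s.Steal
	}
	return total / float64(len(samples)), nil
}

func main() {
	samples := []vmstatSample{
		{User: 20, System: 5, Steal: 0},
		{User: 30, System: 10, Steal: 2},
	}
	u, err := cpuUtilization(samples)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("CPU utilization: %.1f%%\n", u)
}
```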
func (*CPU) USEMetrics ¶
func (c *CPU) USEMetrics() *USEMetrics
USEMetrics returns USEMetrics for the CPU component.
type Command ¶
Command interface defines functions that can be implemented by structs to execute shell commands.
type Component ¶
type Component interface {
	// CollectUtilization calculates the utilization score of a component.
	// It takes in a map of commands and uses it to get the parsed output
	// for the commands it will specify.
	CollectUtilization(cmdOutputs map[string]utils.ParsedOutput) error
	// CollectSaturation calculates the saturation value of a component.
	// It takes in a map of commands and specifies the commands it
	// needs to calculate saturation.
	CollectSaturation(cmdOutputs map[string]utils.ParsedOutput) error
	// CollectErrors finds the errors in a component.
	// It takes in a map of commands to their parsed output and uses that
	// to specify which commands (and therefore output) it needs.
	CollectErrors(cmdOutputs map[string]utils.ParsedOutput) error
	// USEMetrics returns the USEMetrics of the component.
	USEMetrics() *USEMetrics
	// Name returns the name of the component.
	Name() string
	// AdditionalInformation returns additional information unique to each
	// component.
	AdditionalInformation() string
}
Component interface defines functions that can be implemented by the system components to be used when collecting USE Metrics.
type MemCap ¶
type MemCap struct {
// contains filtered or unexported fields
}
MemCap holds information about the Memory capacity component: name and USE Metrics collected.
func NewMemCap ¶
NewMemCap creates a Memory capacity component; it can be used to initialize MemCap outside of the profiler package.
func (*MemCap) AdditionalInformation ¶
AdditionalInformation returns additional information unique to the MemCap component.
func (*MemCap) CollectErrors ¶
func (m *MemCap) CollectErrors(outputs map[string]utils.ParsedOutput) error
CollectErrors collects errors for the MemCap component.
func (*MemCap) CollectSaturation ¶
func (m *MemCap) CollectSaturation(outputs map[string]utils.ParsedOutput) error
CollectSaturation calculates the saturation value for Memory Capacity. It does this by checking whether the amount of memory being swapped in and out of disk is significant, which indicates that the system is low on memory and the kernel is relying heavily on pages from the swap space on disk. Here, "significant" means that the amount of swapped memory is roughly 10% of total memory. The values for memory swapped in and out can be found in vmstat's 'si' (swapped in) and 'so' (swapped out) columns.
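One way to express the 10% rule is sketched below. `memSaturated` is a hypothetical helper. Treating all three values as KiB and using "greater than or equal to 10%" as the cutoff are assumptions, since the text only says "roughly 10%".

```go
package main

import "fmt"

// memSaturated reports whether swapped-in plus swapped-out memory reaches
// 10% of total memory. All values are assumed to be in KiB, and the exact
// ">= 10%" cutoff is an assumption for this sketch.
func memSaturated(siKiB, soKiB, totalKiB int64) (bool, error) {
	if totalKiB <= 0 {
		return false, fmt.Errorf("invalid total memory %d KiB", totalKiB)
	}
	swapped := siKiB + soKiB
	// Integer form of swapped/total >= 0.10, which avoids rounding error.
	return swapped*10 >= totalKiB, nil
}

func main() {
	// 900,000 KiB swapped against 8,000,000 KiB total is 11.25%.
	saturated, err := memSaturated(500000, 400000, 8000000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("memory saturated:", saturated)
}
```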
func (*MemCap) CollectUtilization ¶
func (m *MemCap) CollectUtilization(outputs map[string]utils.ParsedOutput) error
CollectUtilization calculates the utilization score for Memory Capacity. It does this by getting the quotient of used memory (main and virtual) and total memory (main and virtual). The values for main memory can be found on free's "Mem" row while virtual memory stats can be found on the "Swap" row. To get the used and total values for each row, free's "used" and "total" columns are used.
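The quotient described above can be sketched as follows. `freeRow` and `memUtilization` are hypothetical names, and returning the result as a percentage is an assumption based on how USEMetrics describes Utilization.

```go
package main

import "fmt"

// freeRow holds the "used" and "total" columns of one row of free's output.
type freeRow struct {
	Used  int64
	Total int64
}

// memUtilization divides used memory (Mem + Swap) by total memory
// (Mem + Swap) and returns the result as a percentage.
func memUtilization(mem, swap freeRow) (float64, error) {
	total := mem.Total + swap.Total
	if total <= 0 {
		return 0, fmt.Errorf("invalid total memory %d", total)
	}
	used := mem.Used + swap.Used
	return 100 * float64(used) / float64(total), nil
}

func main() {
	mem := freeRow{Used: 6000, Total: 8000}  // free's "Mem" row
	swap := freeRow{Used: 0, Total: 2000}    // free's "Swap" row
	u, err := memUtilization(mem, swap)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("memory utilization: %.1f%%\n", u)
}
```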
func (*MemCap) USEMetrics ¶
func (m *MemCap) USEMetrics() *USEMetrics
USEMetrics returns USEMetrics for the Memory capacity component.
type ProfilerConfig ¶
type ProfilerConfig struct {
	// KernelTracePoints are the trace points we should insert and capture.
	KernelTracePoints []string
	// RawCommands are the shell commands that we should run and capture
	// output.
	RawCommands []string
}
ProfilerConfig tells the profiler which dynamic reports it should generate and capture.
type ProfilerReport ¶
type ProfilerReport struct {
	// Static reports
	// USEMetrics provides Utilization, Saturation and Errors for different
	// components on the system
	USEInfo USEReport
	// RawCommands captures the output of arbitrary shell commands provided
	// by the user. Example usage: count # of systemd units; count # of
	// cgroups
	RawCommands map[string][]byte
	// Dynamic tracing reports
	// KernelTraces captures the output of the ftrace command. The key is the
	// kernel trace point and the value is the output
	KernelTraces map[string][]byte
}
ProfilerReport contains debugging information provided by the profiler tool. Currently, it provides only USEMetrics (Utilization, Saturation, Errors), kernel trace outputs, and the outputs of arbitrary shell commands provided by the user. In the future, the following types of dynamic reports could be added:
type PerfReport - Captures perf command output
type STraceReport - Captures strace output
type BPFTraceReport - Allows users to add eBPF hooks and capture their output
func GenerateProfilerReport ¶
func GenerateProfilerReport(config ProfilerConfig) (*ProfilerReport, error)
GenerateProfilerReport generates a report that consists of multiple reports of different types. ProfilerConfig is passed as an input parameter that tells the profiler tool which reports to capture.
type StorageCap ¶
type StorageCap struct {
// contains filtered or unexported fields
}
StorageCap holds information about the Storage Capacity component: name and USE Metrics collected.
func NewStorageCap ¶
func NewStorageCap(name string) *StorageCap
NewStorageCap creates a StorageCap component; it can be used to initialize StorageCap outside of the profiler package.
func (*StorageCap) AdditionalInformation ¶
func (s *StorageCap) AdditionalInformation() string
AdditionalInformation returns additional information unique to the StorageCap component.
func (*StorageCap) CollectErrors ¶
func (s *StorageCap) CollectErrors(outputs map[string]utils.ParsedOutput) error
CollectErrors collects errors for the Storage Capacity component.
func (*StorageCap) CollectSaturation ¶
func (s *StorageCap) CollectSaturation(outputs map[string]utils.ParsedOutput) error
CollectSaturation collects the saturation value for Storage Capacity.
func (*StorageCap) CollectUtilization ¶
func (s *StorageCap) CollectUtilization(outputs map[string]utils.ParsedOutput) error
CollectUtilization calculates the utilization value for Storage Capacity. It does this by getting disk usage of particular devices on the file system. Disk usage on a particular device can be found using the 'df' command by getting the 'Used' value of that device divided by its total size, found on the column specifying metrics of block size. In this case, this column is "1K-blocks", since "-k" was passed as a flag to 'df'. The devices to collect disk usage for are found on StorageCap's devices field. If this field is not set, "/dev/sda", i.e. the boot disk, is used as default.
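To make the df calculation concrete, the sketch below reads 'Used' and '1K-blocks' from a `df -k` style table. `dfUtilization` is a hypothetical helper, and `sampleDF` is made-up output for illustration only. The package itself works from `utils.ParsedOutput` rather than raw text.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sampleDF is invented `df -k` style output used only for this example.
const sampleDF = `Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sda         1000000  250000    750000  25% /
tmpfs             400000       0    400000   0% /dev/shm`

// dfUtilization finds the row whose first column equals device and returns
// Used / 1K-blocks as a percentage.
func dfUtilization(dfOutput, device string) (float64, error) {
	for _, line := range strings.Split(dfOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 || fields[0] != device {
			continue
		}
		total, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, fmt.Errorf("parsing 1K-blocks for %s: %w", device, err)
		}
		used, err := strconv.ParseFloat(fields[2], 64)
		if err != nil {
			return 0, fmt.Errorf("parsing Used for %s: %w", device, err)
		}
		if total <= 0 {
			return 0, fmt.Errorf("device %s reports no blocks", device)
		}
		return 100 * used / total, nil
	}
	return 0, fmt.Errorf("device %s not found in df output", device)
}

func main() {
	u, err := dfUtilization(sampleDF, "/dev/sda")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("/dev/sda utilization: %.1f%%\n", u)
}
```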
func (*StorageCap) Name ¶
func (s *StorageCap) Name() string
func (*StorageCap) USEMetrics ¶
func (s *StorageCap) USEMetrics() *USEMetrics
type StorageDevIO ¶
type StorageDevIO struct {
// contains filtered or unexported fields
}
StorageDevIO holds information about the Storage device I/O component: name and USE Metrics collected.
func NewStorageDevIO ¶
func NewStorageDevIO(name string) *StorageDevIO
NewStorageDevIO creates a Storage device I/O component; it can be used to initialize StorageDevIO outside of the profiler package.
func (*StorageDevIO) AdditionalInformation ¶
func (d *StorageDevIO) AdditionalInformation() string
AdditionalInformation returns additional information unique to the StorageDevIO component.
func (*StorageDevIO) CollectErrors ¶
func (d *StorageDevIO) CollectErrors(outputs map[string]utils.ParsedOutput) error
CollectErrors collects errors for the Storage Device I/O component.
func (*StorageDevIO) CollectSaturation ¶
func (d *StorageDevIO) CollectSaturation(outputs map[string]utils.ParsedOutput) error
CollectSaturation collects the saturation value for the StorageDevIO component. It does this by comparing the average queue length of requests that were issued to the device with 1. If the queue length is greater than 1, then the Storage Device component is saturated. The value for the average queue length can be found on iostat's 'aqu-sz' column.
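The queue-length check can be illustrated with the short sketch below. `saturatedDevices` is a hypothetical helper. Returning the list of saturated devices is a choice made for this example, since USEMetrics itself stores saturation as a single bool.

```go
package main

import (
	"fmt"
	"sort"
)

// saturatedDevices returns, in sorted order, the devices whose average
// queue length (iostat's 'aqu-sz' column) is greater than 1.
func saturatedDevices(queueLen map[string]float64) ([]string, error) {
	var out []string
	for dev, q := range queueLen {
		if q < 0 {
			return nil, fmt.Errorf("negative queue length %v for %s", q, dev)
		}
		if q > 1 {
			out = append(out, dev)
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	devs, err := saturatedDevices(map[string]float64{"sda": 0.5, "sdb": 2.25})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("saturated devices:", devs)
}
```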
func (*StorageDevIO) CollectUtilization ¶
func (d *StorageDevIO) CollectUtilization(outputs map[string]utils.ParsedOutput) error
CollectUtilization collects the utilization score for the StorageDevIO component. It does this by getting the percentage of elapsed time during which I/O requests were issued to the devices. This value can be found on iostat's '%util' column.
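A sketch of reading a named iostat column for one device follows. `iostatColumn` is a hypothetical helper, and `sampleIOStat` is a simplified, invented table with far fewer columns than real `iostat -x` output. The column is looked up by its header title rather than by position, because the column layout differs between sysstat versions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sampleIOStat is a simplified, made-up table in the style of `iostat -x`.
const sampleIOStat = `Device   r/s   w/s  aqu-sz  %util
sda     1.00  2.00    0.50  12.50
sdb     0.00  9.00    2.25  97.00`

// iostatColumn returns the value in the named column for the given device.
// It assumes the header row starts with "Device".
func iostatColumn(output, device, column string) (float64, error) {
	col := -1
	for _, line := range strings.Split(output, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		if fields[0] == "Device" {
			// Locate the requested column by its title.
			for i, title := range fields {
				if title == column {
					col = i
				}
			}
			continue
		}
		if col >= 0 && col < len(fields) && fields[0] == device {
			v, err := strconv.ParseFloat(fields[col], 64)
			if err != nil {
				return 0, fmt.Errorf("parsing %s for %s: %w", column, device, err)
			}
			return v, nil
		}
	}
	return 0, fmt.Errorf("no %q value found for device %q", column, device)
}

func main() {
	util, err := iostatColumn(sampleIOStat, "sdb", "%util")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("sdb utilization: %.1f%%\n", util)
}
```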
func (*StorageDevIO) Name ¶
func (d *StorageDevIO) Name() string
Name returns the name of the Storage device I/O component.
func (*StorageDevIO) USEMetrics ¶
func (d *StorageDevIO) USEMetrics() *USEMetrics
USEMetrics returns USEMetrics for the Storage Device I/O Component.
type USEMetrics ¶
type USEMetrics struct {
	// Timestamp refers to the point in time when the USE metrics for this
	// component were collected.
	Timestamp time.Time
	// Interval refers to the time interval for which the USE metrics for this
	// component were collected.
	Interval time.Duration
	// Utilization is the percent over a time interval for which the resource
	// was busy servicing work.
	Utilization float64
	// Saturation is the degree to which the resource has extra work which it
	// can’t service. The value for Saturation has different meanings
	// depending on the component being analyzed. But for simplicity's sake
	// Saturation here is just a bool which tells us whether this specific
	// component is saturated or not.
	Saturation bool
	// Errors is the number of errors seen in the component over a given
	// time interval.
	Errors int64
}
USEMetrics contains the USE metrics (utilization, saturation, errors) for a particular component of the system.
type USEReport ¶
type USEReport struct {
	// Components contains the USE Metrics for each component of the system.
	// Such components include CPU, memory, network, storage, etc.
	Components []Component
	// Analysis provides insights into the USE metrics collected, including
	// a guess as to which component may be causing performance issues.
	Analysis string
}
USEReport contains the USE Report from a single run of the node profiler. The USE Report contains helpful information to help diagnose performance issues seen by customers on their k8s clusters.