dremio-diagnostic-collector

v3.2.4
Published: Sep 9, 2024 License: Apache-2.0 Imports: 5 Imported by: 0

README

Automated log and analytics collection for Dremio clusters

  • Read the FAQ for common questions on setting up DDC
  • Read the ddc.yaml for a full, detailed list of customizable collection parameters (optional)
  • Read the official Dremio Support page for more details on the DDC architecture
  • Read the ddc help output (run ./ddc help)

Install DDC on your local machine

Download the latest release binary, then:

  1. Unzip the binary
  2. Open a terminal and change to the directory where you unzipped your binary
  3. Run the command ./ddc help. If you see the DDC command help, you are good to go.
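
The check in step 3 can also be scripted. The following is a minimal sketch, assuming the binary was unzipped into a directory you pass as the first argument; the function name verify_ddc is hypothetical and not part of DDC.

```shell
# Hypothetical helper: confirm that the unzipped ddc binary is present and
# that "ddc help" runs, as described in the install steps.
verify_ddc() {
    ddc_dir="$1"   # directory where the release was unzipped (assumption)
    if [ ! -x "$ddc_dir/ddc" ]; then
        printf 'no executable ddc found in %s\n' "$ddc_dir" >&2
        return 1
    fi
    # Only the exit status matters here, so the help text is discarded
    if ! "$ddc_dir/ddc" help >/dev/null; then
        printf 'ddc help failed in %s\n' "$ddc_dir" >&2
        return 1
    fi
    printf 'ddc is ready in %s\n' "$ddc_dir"
}
```

For example, verify_ddc "$PWD" checks the current directory.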

Guided Collection

Run ddc with no arguments and follow the prompts:

  1. Select the transport
  2. Select the namespace (Kubernetes only)
  3. Select the collection type
  4. Watch the collection progress

Scripting - Dremio on Kubernetes

DDC connects via SSH or the Kubernetes API, collects a series of logs and files for Dremio, and puts the collected files in an archive.

For Kubernetes deployments (relies on a Kubernetes configuration file at $HOME/.kube/config or at the path set in $KUBECONFIG):

default collection
ddc --namespace mynamespace
to collect job profiles, system tables, kv reports and wlm (via REST API)

Requires Dremio admin privileges. Dremio personal access tokens (PATs) can be enabled with the support key auth.personal-access-tokens.enabled.

ddc -n mynamespace --collect health-check
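
When several namespaces need collecting, the per-namespace command above can be wrapped in a loop. This is a rough sketch rather than a supported workflow: the function name collect_namespaces, the DDC_BIN variable, and the diag-<namespace>.tgz file naming are assumptions made for illustration. The flags themselves (--namespace, --collect, --output-file) are the ones listed in the ddc help.

```shell
# Hypothetical helper: run one DDC collection per Kubernetes namespace.
# DDC_BIN lets you point at a specific binary; it defaults to "ddc" on PATH.
DDC_BIN="${DDC_BIN:-ddc}"

collect_namespaces() {
    collect_type="$1"   # light, standard, standard+jstack or health-check
    shift
    for ns in "$@"; do
        # One tarball per namespace so later runs do not overwrite earlier ones
        "$DDC_BIN" --namespace "$ns" --collect "$collect_type" \
            --output-file "diag-${ns}.tgz" || return 1
    done
}
```

For example, collect_namespaces light dev prod would run a light collection against the dev and prod namespaces in turn and stop at the first failure.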

Scripting - Dremio on-prem

Specify the executors you want to include in the diagnostic collection with the -e flag and the coordinators with the -c flag. Also specify the SSH user and the SSH key to use.

For SSH based communication to VMs or Bare Metal hardware:

coordinator only
ddc --coordinator 10.0.0.19 --ssh-user myuser
coordinator and executors
ddc --coordinator 10.0.0.19 --executors 10.0.0.20,10.0.0.21,10.0.0.22 --ssh-user myuser
to collect job profiles, system tables, kv reports and wlm (via REST API)

Requires Dremio admin privileges. Dremio personal access tokens (PATs) can be enabled with the support key auth.personal-access-tokens.enabled.

ddc --coordinator 10.0.0.19 --executors 10.0.0.20,10.0.0.21,10.0.0.22 --sudo-user dremio --ssh-user myuser --collect health-check
to avoid using the /tmp folder on nodes
ddc --coordinator 10.0.0.19 --executors 10.0.0.20,10.0.0.21,10.0.0.22 --sudo-user dremio --ssh-user myuser --transfer-dir /mnt/lots_of_storage/
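
For larger clusters, typing the comma-separated --executors list by hand gets error-prone. A minimal sketch, assuming you keep the executor addresses in a text file with one address per line; the function name join_hosts and the file name executors.txt are hypothetical.

```shell
# Hypothetical helper: turn a file with one executor address per line into
# the comma-separated list that --executors expects.
# Blank lines and lines starting with '#' are skipped.
join_hosts() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1" | paste -sd, -
}

# Usage (executors.txt is an assumed file name):
#   ddc --coordinator 10.0.0.19 --executors "$(join_hosts executors.txt)" --ssh-user myuser
```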

Dremio AWSE

Log-only collection from a Dremio AWSE coordinator is possible via the following command. This will produce a tarball with logs from all nodes.

./ddc awselogs

Dremio Cloud

To collect job profiles, system tables, and wlm via REST API, specify the following parameters in ddc.yaml:

is-dremio-cloud: true
dremio-endpoint: "[eu.]dremio.cloud"    # Specify whether EU Dremio Cloud or not
dremio-cloud-project-id: "<PROJECT_ID>"
dremio-pat-token: "<DREMIO_PAT>"
tmp-output-dir: /full/path/to/dir       # Specify local target directory

Then run ./ddc local-collect from your local machine.
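
If you would rather not type the token into the file by hand, the Dremio Cloud settings can be generated from a script. The sketch below is one possible approach, not part of DDC: the function name write_cloud_ddc_yaml, its argument order, and the DREMIO_PAT environment variable are assumptions. Only the YAML keys come from the parameters listed above.

```shell
# Hypothetical helper: write the Dremio Cloud parameters into a ddc.yaml file.
# Arguments: target file, endpoint (dremio.cloud or eu.dremio.cloud),
# project ID, local output directory. The PAT is read from DREMIO_PAT.
write_cloud_ddc_yaml() {
    yaml_path="$1"; endpoint="$2"; project_id="$3"; out_dir="$4"
    if [ -z "$DREMIO_PAT" ]; then
        echo "DREMIO_PAT is not set" >&2
        return 1
    fi
    (
        # Newly created files are readable by the owner only,
        # since the file holds a token
        umask 077
        cat > "$yaml_path" <<EOF
is-dremio-cloud: true
dremio-endpoint: "$endpoint"
dremio-cloud-project-id: "$project_id"
dremio-pat-token: "$DREMIO_PAT"
tmp-output-dir: $out_dir
EOF
    )
}
```

Writing the file next to the DDC binary, for example with write_cloud_ddc_yaml ./ddc.yaml eu.dremio.cloud "<PROJECT_ID>" /full/path/to/dir, matches where DDC expects to find ddc.yaml. Note that this overwrites any existing file at that path.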

Windows Users

If you are running DDC from Windows, always run it in a shell from the C: drive prompt. This is because of a limitation of kubectl (see https://github.com/kubernetes/kubernetes/issues/77310).

ddc.yaml

The ddc.yaml file is located next to your DDC binary and can be edited to fit your environment. The default-ddc.yaml documents the full list of available parameters.

ddc usage

ddc v3.2.3-79ea60d
 ddc connects via ssh or kubectl and collects a series of logs and files for dremio, then puts those collected files in an archive
examples:

for a ui prompt just run:
	ddc 

for ssh based communication to VMs or Bare metal hardware:

	ddc --coordinator 10.0.0.19 --executors 10.0.0.20,10.0.0.21,10.0.0.22 --ssh-user myuser --ssh-key ~/.ssh/mykey --sudo-user dremio 

for kubernetes deployments:

	# run against a specific namespace and retrieve 2 days of logs
	ddc --namespace mynamespace

	# run against a specific namespace with a standard collection (includes jfr, top and 30 days of queries.json logs)
	ddc --namespace mynamespace	--collect standard

	# run against a specific namespace with a Health Check (runs 2 threads and includes everything in a standard collection plus collect 25,000 job profiles, system tables, kv reports and Work Load Manager (WLM) reports)
	ddc --namespace mynamespace	--collect health-check

Usage:
  ddc [flags]
  ddc [command]

Available Commands:
  awselogs      Log only collect of AWSE from the coordinator node
  completion    Generate the autocompletion script for the specified shell
  help          Help about any command
  local-collect retrieves all the dremio logs and diagnostics for the local node and saves the results in a compatible format for Dremio support
  version       Print the version number of DDC

Flags:
      --collect string             type of collection: 'light'- 2 days of logs (no top or jfr). 'standard' - includes jfr, top, 7 days of logs and 30 days of queries.json logs. 'standard+jstack' - all of 'standard' plus jstack. 'health-check' - all of 'standard' + WLM, KV Store Report, 25,000 Job Profiles (default "light")
  -x, --context string             K8S ONLY: context to use for kubernetes pods
  -c, --coordinator string         SSH ONLY: set a list of ip addresses separated by commas
      --ddc-yaml string            location of ddc.yaml that will be transferred to remote nodes for collection configuration (default "/opt/homebrew/Cellar/ddc/3.2.3/libexec/ddc.yaml")
      --detect-namespace           detect namespace feature to pass the namespace automatically
      --disable-free-space-check   disables the free space check for the --transfer-dir
  -d, --disable-kubectl            uses the embedded k8s api client and skips the use of kubectl for transfers and copying
      --disable-prompt             disables the prompt ui
  -e, --executors string           SSH ONLY: set a list of ip addresses separated by commas
  -h, --help                       help for ddc
  -l, --label-selector string      K8S ONLY: select which pods to collect: follows kubernetes label syntax see https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors (default "role=dremio-cluster-pod")
      --min-free-space-gb int      min free space needed in GB for the process to run (default 40)
  -n, --namespace string           K8S ONLY: namespace to use for kubernetes pods
      --output-file string         name and location of diagnostic tarball (default "diag.tgz")
  -t, --pat-prompt                 prompt for the pat, which will enable collection of kv report, system tables, job profiles and the workload manager report
  -s, --ssh-key string             SSH ONLY: of ssh key to use to login
  -u, --ssh-user string            SSH ONLY: user to use during ssh operations to login
  -b, --sudo-user string           SSH ONLY: if any diagnostics commands need a sudo user (i.e. for jcmd)
      --transfer-dir string        directory to use for communication between the local-collect command and this one (default "/tmp/ddc-20240906174311")
      --transfer-threads int       number of threads to transfer tarballs (default 2)

Use "ddc [command] --help" for more information about a command.

Documentation

Overview

main is the standard Go entry point for the application

Directories

Path: Synopsis

cmd: cmd package contains all the command line flag and initialization logic for commands
  • local: cmd package contains all the command line flag and initialization logic for commands
  • local/apicollect: apicollect provides all the methods that collect via the API; this is a substantial part of the activities of DDC, so it gets its own package
  • local/conf: package conf provides configuration for the local-collect command
  • local/conf/autodetect: package autodetect looks at the system configuration and file names and tries to guess at the correct configuration
  • local/ddcio: ddcio includes helper code for io operations common to ddc
  • local/jvmcollect: package jvmcollect handles parsing of the jvm information
  • local/logcollect: package logcollect contains the logic for log collection in the local-collect sub command
  • local/nodeinfocollect: package nodeinfocollect has all the methods for collecting the information for nodeinfo
  • local/queriesjson: queriesjson package contains the logic for collecting queries.json information
  • local/threading: threading package provides support for simple concurrency and threading
  • root/cli: package cli provides wrapper support for executing commands, so that the rest of the implementations can be tested quickly
  • root/collection: collection package provides the interface for collection implementation and the actual collection execution
  • root/fallback: package fallback is only used when we are unable to collect with --detect-namespace
  • root/helpers: helpers package provides some general functions that do not have a good home
  • root/kubectl: kubectl package provides access to log collections on k8s
  • root/kubernetes: kubernetes package provides access to log collections on k8s
  • root/ssh: ssh package uses ssh and scp binaries to execute commands remotely and translate the results back to the calling node
  • version: cmd package contains all the command line flag and initialization logic for commands

pkg
  • clusterstats: package clusterstats provides a placeholder for summary information found inside of a tarball, used by local and remote collect
  • consoleprint: package consoleprint contains the logic to update the console UI
  • dirs: dirs provides helpers for working with directories on the filesystem
  • jps: jps package provides logic for extracting values from jps
  • masking: masking hides secrets in files and replaces them with redacted text
  • output: output provides functions around capturing output
  • simplelog: simplelog package provides a simple logger
  • tests: package tests provides helper functions and mocks for running tests
  • validation: package validation concerns itself with validating configuration
