blackdagger

command module
v1.0.1
Published: Feb 27, 2024 License: GPL-3.0 Imports: 2 Imported by: 0

README

Blackdagger

Blackdagger is a potent alternative to Cron, enhanced with a Web UI, designed for DevOps, DevSecOps, MLOps, MLSecOps, and Continuous Red Teaming (CART) environments. It enables the definition of command dependencies using a Directed Acyclic Graph (DAG) in a declarative YAML format. Furthermore, Blackdagger natively supports Docker container management, making HTTP requests, and executing commands over SSH, offering a versatile toolset for complex automation workflows.

Highlights

  • Single binary file installation
  • Declarative YAML format for defining DAGs
  • Web UI for visually managing, rerunning, and monitoring pipelines
  • Use existing programs without any modification
  • Self-contained, with no need for a DBMS
  • Suitable for Continuous Red Teaming (CART)
  • Suitable for DevOps and DevSecOps
  • Suitable for MLOps and MLSecOps

Features

  • Web User Interface
  • Command Line Interface (CLI) with several commands for running and managing DAGs
  • YAML format for defining DAGs, with support for various features including:
    • Execution of custom code snippets
    • Parameters
    • Command substitution
    • Conditional logic
    • Redirection of stdout and stderr
    • Lifecycle hooks
    • Repeating tasks
    • Automatic retry
  • Executors for running different types of tasks:
    • Running arbitrary Docker containers
    • Making HTTP requests
    • Sending emails
    • Running jq commands
    • Executing remote commands via SSH
  • Email notification
  • Scheduling with Cron expressions
  • REST API Interface
  • Basic Authentication over HTTPS

Use Cases

  • Data Pipeline Automation: Schedule ETL tasks for data processing and centralization.
  • Infrastructure Monitoring: Periodically check infrastructure components with HTTP requests or SSH commands.
  • Automated Reporting: Generate and send periodic reports via email.
  • Batch Processing: Schedule batch jobs for tasks like data cleansing or model training.
  • Task Dependency Management: Manage complex workflows with interdependent tasks.
  • Microservices Orchestration: Define and manage dependencies between microservices.
  • CI/CD Integration: Automate code deployment, testing, and environment updates.
  • Alerting System: Create notifications based on specific triggers or conditions.
  • Custom Task Automation: Define and schedule custom tasks using code snippets.
  • Model Training Automation: Automate the training of machine learning models by scheduling jobs that run on new data sets. Use Blackdagger to manage dependencies between data preprocessing, training, evaluation, and deployment tasks.
  • Model Deployment Pipeline: Create a DAG to automate the deployment of trained models to production environments, including steps for model validation, containerization with Docker, and deployment using SSH commands.
  • Security Scans Integration: Schedule regular security scans and static code analysis as part of the CI/CD pipeline. Use Blackdagger to orchestrate these tasks, ensuring that deployments are halted if vulnerabilities are detected.
  • Automated Compliance Checks: Set up workflows to automatically run compliance checks against infrastructure and codebase, reporting results via HTTP requests to compliance monitoring tools.
  • Automated Penetration Testing: Schedule and manage continuous penetration testing activities. Define dependencies in Blackdagger to ensure that penetration tests are conducted after deployment but before wide release, using Docker containers to isolate testing environments.
  • Threat Simulation and Response: Automate the execution of threat simulations to test the effectiveness of security measures. Use Blackdagger to orchestrate complex scenarios involving multiple steps, such as breaching a system, escalating privileges, and exfiltrating data, followed by automated rollback and alerting.

Web UI

DAG Details

It shows the real-time status, logs, and DAG configurations. You can edit DAG configurations in the browser.

You can switch to the vertical graph with the button in the top-right corner.

DAGs List

It shows all DAGs and the real-time status.

Search DAGs

It searches all DAGs for the given text.

Execution History

It shows past execution results and logs.

DAG Execution Log

It shows the detailed log and standard output of each execution and step.

Installation

Via Bash script

curl -L https://raw.githubusercontent.com/ErdemOzgen/blackdagger/main/scripts/downloader.sh | bash

Via Docker

docker run \
--rm \
-p 8080:8080 \
-v $HOME/.blackdagger/dags:/home/blackdagger/.blackdagger/dags \
-v $HOME/.blackdagger/data:/home/blackdagger/.blackdagger/data \
-v $HOME/.blackdagger/logs:/home/blackdagger/.blackdagger/logs \
erdemozgen/blackdagger:latest
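
With the default root-run Docker daemon, a host directory named in a `-v` bind mount is created as root if it does not exist yet, which can leave the DAG, data, and log directories unwritable for your own user. A minimal sketch, assuming you want the three host paths from the command above to exist under your account before the container starts (the helper name `prepare_blackdagger_dirs` is illustrative, not part of Blackdagger):

```shell
#!/bin/bash
# Create the host directories used by the bind mounts in the docker run command.
# prepare_blackdagger_dirs is a hypothetical helper name, not a Blackdagger command.
prepare_blackdagger_dirs() {
  # Base directory defaults to the path used in the docker run example.
  local base="${1:-$HOME/.blackdagger}"
  mkdir -p "$base/dags" "$base/data" "$base/logs"
}
```

Call `prepare_blackdagger_dirs` once before the first `docker run`; pass a different base directory as the first argument if your mounts point elsewhere.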

Via GitHub Release Page

Download the latest binary from the Releases page and place it in your $PATH (e.g. /usr/local/bin).
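
A minimal sketch of that step, assuming the downloaded release has already been unpacked to a file named `blackdagger` (the asset name and archive layout depend on the release and are not shown here; `install_blackdagger` is an illustrative helper, not part of the project):

```shell
#!/bin/bash
# Copy a downloaded blackdagger binary into a directory on $PATH.
# install_blackdagger is a hypothetical helper; the source path is an assumption.
install_blackdagger() {
  local src="$1"
  local dest_dir="${2:-/usr/local/bin}"
  if [ ! -f "$src" ]; then
    echo "binary not found: $src" >&2
    return 1
  fi
  # install copies the file and sets the executable mode in one step.
  install -m 0755 "$src" "$dest_dir/blackdagger"
}
```

For example, run `install_blackdagger ./blackdagger` with sufficient privileges to write to /usr/local/bin, then run `blackdagger version` to confirm the binary is found.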

Quick Start Guide

1. Launch the Web UI

Start the server and scheduler with the command blackdagger start-all and browse to http://127.0.0.1:8080 to explore the Web UI.

2. Create a New DAG

Navigate to the DAG List page by clicking the menu in the left panel of the Web UI. Then create a DAG by clicking the New DAG button at the top of the page. Enter example in the dialog.

Note: DAG (YAML) files will be placed in ~/.blackdagger/dags by default. See Configuration Options for more details.

3. Edit the DAG

Go to the SPEC tab and click the Edit button. Copy and paste the following example, then click the Save button.

Example:

schedule: "* * * * *" # Run the DAG every minute
steps:
  - name: s1
    command: echo Hello blackdagger
  - name: s2
    command: echo done!
    depends:
      - s1

4. Execute the DAG

You can execute the example by pressing the Start button. You can see "Hello blackdagger" in the log page in the Web UI.
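
The same run can also be driven from a terminal with the commands listed in the CLI section. A hedged sketch, assuming the DAG created in the Web UI was saved as `example.yaml` in the default DAG directory (the file name and the `run` wrapper are assumptions for illustration); when `blackdagger` is not installed, the wrapper only prints what it would run:

```shell
#!/bin/bash
# Assumed location of the DAG created in step 2; adjust if your DAG directory differs.
dag_file="${DAG_FILE:-$HOME/.blackdagger/dags/example.yaml}"

# Hypothetical wrapper: execute the command if blackdagger is on $PATH, otherwise print it.
run() {
  if command -v blackdagger > /dev/null 2>&1; then
    "$@"
  else
    echo "would run: $*"
  fi
}

run blackdagger dry "$dag_file"     # dry-run the DAG
run blackdagger start "$dag_file"   # run the DAG
run blackdagger status "$dag_file"  # display the current status of the DAG
```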

CLI

# Runs the DAG
blackdagger start [--params=<params>] <file>

# Displays the current status of the DAG
blackdagger status <file>

# Re-runs the specified DAG run
blackdagger retry --req=<request-id> <file>

# Stops the DAG execution
blackdagger stop <file>

# Restarts the currently running DAG
blackdagger restart <file>

# Dry-runs the DAG
blackdagger dry [--params=<params>] <file>

# Launches both the web UI server and scheduler process
blackdagger start-all [--host=<host>] [--port=<port>] [--dags=<path to directory>]

# Launches the blackdagger web UI server
blackdagger server [--host=<host>] [--port=<port>] [--dags=<path to directory>]

# Starts the scheduler process
blackdagger scheduler [--dags=<path to directory>]

# Shows the current binary version
blackdagger version

Documentation

Running as a daemon

The easiest way to keep the process running on your system is to create the script below and execute it every minute using cron. This approach does not require a root account:

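#!/bin/bash
# Start blackdagger if it is not already running.
process="blackdagger start-all"
# Adjust this path to wherever the binary is installed.
command="/usr/bin/blackdagger start-all"

# pgrep -f matches against the full command line of running processes.
if ! pgrep -f "$process" > /dev/null
then
    $command &
fi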
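
To run the script every minute, add a crontab entry for your own user. A sketch, assuming the script above was saved as `$HOME/bin/blackdagger-keepalive.sh` and made executable (the path and file name are hypothetical):

```shell
#!/bin/bash
# Hypothetical location of the script shown above.
script_path="${SCRIPT_PATH:-$HOME/bin/blackdagger-keepalive.sh}"

# Print a crontab line that runs the given script once a minute.
cron_line() {
  printf '* * * * * %s\n' "$1"
}

cron_line "$script_path"

# To install it for the current user, append the line to the existing crontab:
#   ( crontab -l 2>/dev/null; cron_line "$script_path" ) | crontab -
```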

Example Workflow

This example workflow showcases a data pipeline typically implemented in DevOps and Data Engineering scenarios. It demonstrates an end-to-end data processing cycle starting from data acquisition and cleansing to transformation, loading, analysis, reporting, and ultimately, cleanup.

The YAML code below represents this workflow:

# Environment variables used throughout the pipeline
env:
  - DATA_DIR: /data
  - SCRIPT_DIR: /scripts
  - LOG_DIR: /log
  # ... other variables can be added here

# Handlers to manage errors and cleanup after execution
handlerOn:
  failure:
    command: "echo error"
  exit:
    command: "echo clean up"

# The schedule for the workflow execution in cron format
# This schedule runs the workflow daily at 12:00 AM
schedule: "0 0 * * *"

steps:
  # Step 1: Pull the latest data from a data source
  - name: pull_data
    command: "bash"
    script: |
      echo `date '+%Y-%m-%d'`
    output: DATE

  # Step 2: Cleanse and prepare the data
  - name: cleanse_data
    command: echo cleansing ${DATA_DIR}/${DATE}.csv
    depends:
      - pull_data

  # Step 3: Transform the data
  - name: transform_data
    command: echo transforming ${DATA_DIR}/${DATE}_clean.csv
    depends:
      - cleanse_data

  # Parallel Step 1: Load the data into a database
  - name: load_data
    command: echo loading ${DATA_DIR}/${DATE}_transformed.csv
    depends:
      - transform_data

  # Parallel Step 2: Generate a statistical report
  - name: generate_report
    command: echo generating report ${DATA_DIR}/${DATE}_transformed.csv
    depends:
      - transform_data

  # Step 4: Run some analytics
  - name: run_analytics
    command: echo running analytics ${DATA_DIR}/${DATE}_transformed.csv
    depends:
      - load_data

  # Step 5: Send an email report
  - name: send_report
    command: echo sending email ${DATA_DIR}/${DATE}_analytics.csv
    depends:
      - run_analytics
      - generate_report

  # Step 6: Cleanup temporary files
  - name: cleanup
    command: echo removing ${DATE}*.csv
    depends:
      - send_report
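
The `depends` fields above define the graph that determines execution order. As an illustration only (this is not a description of how Blackdagger schedules steps internally), the sketch below lists the same dependency edges and uses the standard `tsort` utility to print one order that respects them. `load_data` and `generate_report` do not depend on each other, so their relative position can vary:

```shell
#!/bin/bash
# Dependency edges from the example workflow, one "prerequisite dependent" pair per line.
edges() {
  cat <<'EOF'
pull_data cleanse_data
cleanse_data transform_data
transform_data load_data
transform_data generate_report
load_data run_analytics
run_analytics send_report
generate_report send_report
send_report cleanup
EOF
}

# Print one valid execution order (a topological sort of the graph).
step_order() {
  edges | tsort
}

step_order
```

Whatever order `tsort` picks, `pull_data` comes first and `cleanup` comes last, because they are the only step without prerequisites and the only step nothing else depends on.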

Motivation

Legacy systems often have complex and implicit dependencies between jobs. When there are hundreds of cron jobs on a server, it can be difficult to keep track of these dependencies and to determine which job to rerun if one fails. It can also be a hassle to SSH into a server to view logs and manually rerun shell scripts one by one. blackdagger aims to solve these problems by allowing you to explicitly visualize and manage pipeline dependencies as a DAG, and by providing a web UI for checking dependencies, execution status, and logs and for rerunning or stopping jobs with a simple mouse click.

Why Not Use an Existing Workflow Scheduler Like Airflow?

There are many existing tools such as Airflow, but many of them require you to define your DAG in a programming language like Python. Systems that have been in operation for a long time may already contain complex jobs with hundreds of thousands of lines of code written in languages like Perl or shell script. Adding another layer of complexity on top of this code can reduce maintainability. blackdagger was designed to be easy to use, self-contained, and to require no coding, making it ideal for small projects.

How It Works

blackdagger is a single command line tool that uses the local file system to store data, so no database management system or cloud service is required. DAGs are defined in a declarative YAML format, and existing programs can be used without modification.


Feel free to contribute in any way you want! Share ideas, questions, submit issues, and create pull requests. Check out our Contribution Guide for help getting started.

We welcome any and all contributions!

License

This project is licensed under the GNU GPLv3. It was forked from Dagu and has been adapted to serve a different purpose. While Dagu is an excellent project, its current objectives do not align with ours.

Documentation

Overview

Copyright © 2023 Dagu Yota Hamada

Directories

Path Synopsis
cmd
internal
dag
pb
Package restapi Blackdagger
service
core/scheduler/filenotify
Package filenotify provides a mechanism for watching file(s) for changes.
frontend/restapi
Package restapi Blackdagger
