wayback

v0.20.1

Published: Jul 2, 2024 License: GPL-3.0 Imports: 20 Imported by: 0

README

Wayback

Wayback is a web archiving and playback tool that allows users to capture and preserve web content. It provides an IM-style interface for receiving and presenting archived web content, and a search and playback service for retrieving previously archived pages. Wayback is designed to be used by web archivists, researchers, and anyone who wants to preserve web content and access it in the future.

Features

  • Free and open-source
  • Exposes Prometheus metrics
  • Cross-platform compatibility
  • Batch wayback URLs for faster archiving
  • Built-in CLI (wayback) for convenient use
  • Serve as a Tor Hidden Service or local web entry for added privacy and accessibility
  • Easy archiving to Internet Archive, archive.today, IPFS, and Telegraph
  • Runs as a daemon service that interacts through IRC, Matrix, Telegram bot, Discord bot, Mastodon, Twitter, and XMPP
  • Supports publishing wayback results to Telegram channel, Mastodon, and GitHub Issues for sharing
  • Supports storing archived files to disk for offline use
  • Downloads streaming media (requires FFmpeg) for convenient media archiving

Getting Started

For a comprehensive guide, please refer to the complete documentation.

Installation

The simplest, cross-platform way is to download from GitHub Releases and place the executable file in your PATH.

From source:

go install github.com/wabarc/wayback/cmd/wayback@latest

From GitHub Releases:

curl -fsSL https://get.wabarc.eu.org | sh

or via Bina:

curl -fsSL https://bina.egoist.dev/wabarc/wayback | sh

Using Snapcraft (on GNU/Linux):

sudo snap install wayback

Via APT:

curl -fsSL https://repo.wabarc.eu.org/apt/gpg.key | sudo gpg --dearmor -o /usr/share/keyrings/packages.wabarc.gpg
echo "deb [arch=amd64,arm64,armhf signed-by=/usr/share/keyrings/packages.wabarc.gpg] https://repo.wabarc.eu.org/apt/ /" | sudo tee /etc/apt/sources.list.d/wayback.list
sudo apt update
sudo apt install wayback

Via RPM:

sudo rpm --import https://repo.wabarc.eu.org/yum/gpg.key
sudo tee /etc/yum.repos.d/wayback.repo > /dev/null <<EOT
[wayback]
name=Wayback Archiver
baseurl=https://repo.wabarc.eu.org/yum/
enabled=1
gpgcheck=1
gpgkey=https://repo.wabarc.eu.org/yum/gpg.key
EOT

sudo dnf install -y wayback

Via Homebrew:

brew tap wabarc/wayback
brew install wayback

Usage

Command line

$ wayback -h

A command-line tool and daemon service for archiving webpages.

Usage:
  wayback [flags]

Examples:
  wayback https://www.wikipedia.org
  wayback https://www.fsf.org https://www.eff.org
  wayback --ia https://www.fsf.org
  wayback --ia --is -d telegram -t your-telegram-bot-token
  WAYBACK_SLOT=pinata WAYBACK_APIKEY=YOUR-PINATA-APIKEY \
    WAYBACK_SECRET=YOUR-PINATA-SECRET wayback --ip https://www.fsf.org

Flags:
      --chatid string      Telegram channel id
  -c, --config string      Configuration file path, defaults: ./wayback.conf, ~/wayback.conf, /etc/wayback.conf
  -d, --daemon strings     Run as daemon service, supported services are telegram, web, mastodon, twitter, discord, slack, irc, xmpp
      --debug              Enable debug mode (default mode is false)
  -h, --help               help for wayback
      --ia                 Wayback webpages to Internet Archive
      --info               Show application information
      --ip                 Wayback webpages to IPFS
      --ipfs-host string   IPFS daemon host, do not require, unless enable ipfs (default "127.0.0.1")
  -m, --ipfs-mode string   IPFS mode (default "pinner")
  -p, --ipfs-port uint     IPFS daemon port (default 5001)
      --is                 Wayback webpages to Archive Today
      --ph                 Wayback webpages to Telegraph
      --print              Show application configurations
  -t, --token string       Telegram Bot API Token
      --tor                Snapshot webpage via Tor anonymity network
      --tor-key string     The private key for Tor Hidden Service
  -v, --version            version for wayback

Examples

Wayback one or more URLs to Internet Archive and archive.today:

wayback https://www.wikipedia.org

wayback https://www.fsf.org https://www.eff.org

Wayback a URL to Internet Archive, archive.today, or IPFS:

# Internet Archive
$ wayback --ia https://www.fsf.org

# archive.today
$ wayback --is https://www.fsf.org

# IPFS
$ wayback --ip https://www.fsf.org

When using IPFS, you can also specify a pinning service:

$ export WAYBACK_SLOT=pinata
$ export WAYBACK_APIKEY=YOUR-PINATA-APIKEY
$ export WAYBACK_SECRET=YOUR-PINATA-SECRET
$ wayback --ip https://www.fsf.org

# or

$ WAYBACK_SLOT=pinata WAYBACK_APIKEY=YOUR-PINATA-APIKEY \
    WAYBACK_SECRET=YOUR-PINATA-SECRET wayback --ip https://www.fsf.org

More details about pinning service.

With a Telegram bot:

wayback --ia --is --ip -d telegram -t your-telegram-bot-token

Publish message to your Telegram channel at the same time:

wayback --ia --is --ip -d telegram -t your-telegram-bot-token --chatid your-telegram-channel-name

You can also run in debug mode:

wayback -d telegram -t YOUR-BOT-TOKEN --debug

Serve on both Telegram and a Tor hidden service:

wayback -d telegram -t YOUR-BOT-TOKEN -d web

URLs from file:

wayback url.txt
cat url.txt | wayback

Configuration Parameters

Look at the full list of configuration options.

Deployment

Docker/Podman

docker pull wabarc/wayback
docker run -d wabarc/wayback wayback -d telegram -t YOUR-BOT-TOKEN # without telegram channel
docker run -d wabarc/wayback wayback -d telegram -t YOUR-BOT-TOKEN --chatid YOUR-CHANNEL-USERNAME # with telegram channel

Contributing

We welcome all contributions to this repository: open an issue or a pull request.

If you're interested in contributing to wayback itself, read our contributing guide to get started.

Note: All interaction here should conform to the Code of Conduct.

License

This software is released under the terms of the GNU General Public License v3.0. See the LICENSE file for details.

Documentation

Overview

Package wayback is a toolkit for snapshotting webpages to Internet Archive, archive.today, IPFS and beyond.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Collect added in v0.8.0

type Collect struct {
	Arc string // Archive slot name, see config/config.go
	Dst string // Archived destination URL
	Src string // Source URL
	Ext string // Extra identifier
}

Collect holds the result of an archive operation: Arc is the name of the archive service, Src is the original URL, Dst is the archived destination URL, and Ext is an extra identifier.
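
Because each Collect pairs a source URL with a single destination, a caller that archives several URLs to several slots will often want to group the results by source. The sketch below shows one way to do that. It redeclares Collect locally so that it compiles on its own (real code would use wayback.Collect); groupBySource is a hypothetical helper, and the slot names and destination URLs are placeholders.

```go
package main

import "fmt"

// Collect mirrors the documented struct so this sketch is self-contained.
type Collect struct {
	Arc string // Archive slot name
	Dst string // Archived destination URL
	Src string // Source URL
	Ext string // Extra identifier
}

// groupBySource indexes results by source URL, preserving input order
// within each group. Hypothetical helper, not part of the package.
func groupBySource(cols []Collect) map[string][]Collect {
	out := make(map[string][]Collect)
	for _, c := range cols {
		out[c.Src] = append(out[c.Src], c)
	}
	return out
}

func main() {
	// Placeholder values for illustration only.
	cols := []Collect{
		{Arc: "slot-a", Src: "https://www.fsf.org", Dst: "https://archive.example/1"},
		{Arc: "slot-b", Src: "https://www.fsf.org", Dst: "https://archive.example/2"},
		{Arc: "slot-a", Src: "https://www.eff.org", Dst: "https://archive.example/3"},
	}
	grouped := groupBySource(cols)
	for _, c := range grouped["https://www.fsf.org"] {
		fmt.Printf("%s -> %s\n", c.Arc, c.Dst)
	}
}
```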

func Playback added in v0.10.0

func Playback(ctx context.Context, cfg *config.Options, urls ...*url.URL) (cols []Collect, err error)

Playback returns URLs archived from the time capsules.

func Wayback added in v0.11.0

func Wayback(ctx context.Context, rdx reduxer.Reduxer, cfg *config.Options, urls ...*url.URL) ([]Collect, error)

Wayback returns URLs archived to the time capsules of given URLs.

type GA added in v0.20.0

type GA struct {
	URL *url.URL
	// contains filtered or unexported fields
}

GA represents the Ghostarchive slot.

func (GA) Wayback added in v0.20.0

func (g GA) Wayback(_ reduxer.Reduxer) string

Wayback implements the standard Waybacker interface: it reads the URL from the GA and returns the archived URL as a string.

type IA added in v0.14.0

type IA struct {
	URL *url.URL
	// contains filtered or unexported fields
}

IA represents the Internet Archive slot.

func (IA) Wayback added in v0.14.0

func (i IA) Wayback(_ reduxer.Reduxer) string

Wayback implements the standard Waybacker interface: it reads the URL from the IA and returns the archived URL as a string.

type IP added in v0.14.0

type IP struct {
	URL *url.URL
	// contains filtered or unexported fields
}

IP represents the IPFS slot.

func (IP) Wayback added in v0.14.0

func (i IP) Wayback(rdx reduxer.Reduxer) string

Wayback implements the standard Waybacker interface: it reads the URL from the IP and returns the archived URL as a string.

type IS added in v0.14.0

type IS struct {
	URL *url.URL
	// contains filtered or unexported fields
}

IS represents the archive.today slot.

func (IS) Wayback added in v0.14.0

func (i IS) Wayback(_ reduxer.Reduxer) string

Wayback implements the standard Waybacker interface: it reads the URL from the IS and returns the archived URL as a string.

type PH added in v0.14.0

type PH struct {
	URL *url.URL
	// contains filtered or unexported fields
}

PH represents the Telegra.ph slot.

func (PH) Wayback added in v0.14.0

func (i PH) Wayback(rdx reduxer.Reduxer) string

Wayback implements the standard Waybacker interface: it reads the URL from the PH and returns the archived URL as a string.

type Waybacker added in v0.14.0

type Waybacker interface {
	Wayback(reduxer.Reduxer) string
}

Waybacker is the interface that wraps the basic Wayback method.

Wayback archives the *url.URL held by the implementing struct to the Wayback Machine and returns the result from the upstream service as a string.

Directories

Path Synopsis
cmd
Package config handles configuration management for the application.
Package entity contains all data structures used by the application.
Package errors handles errors.
Package ingress provides functionality for registering services.
Package metrics exposes wayback service status.
Package pooling implements the wayback workers pool.
The publish package provides a publishing service and requires initialization by the caller.
Package reduxer implements a set of functions to transform webpage to various formats.
Package service implements the common utils function for daemon services.
discord
Package discord implements the discord bot daemon service.
httpd
Package httpd implements the tor network service.
mastodon
Package mastodon implements the mastodon daemon service.
matrix
Package matrix implements the matrix daemon service.
relaychat
Package relaychat implements the internet relay chat daemon service.
slack
Package slack implements the slack bot daemon service.
telegram
Package telegram implements the telegram bot daemon service.
twitter
Package twitter implements the twitter daemon service.
xmpp
Package xmpp implements the xmpp daemon service.
Package storage implements a set of functions to interact with the database.
Package systemd provides a Go implementation of the sd_notify protocol.
Package template handles template parsing and execution.
render
Package render handles template parsing and execution for services.
Package version contains application and build information.
