webpalm

v1.1.1 · Published: Jun 19, 2023 · License: GPL-3.0

Repository

github.com/Malwarize/webpalm

What is webpalm?

WebPalm is a command-line tool that crawls a website and generates a tree of all its webpages and their links. It recursively follows each link found on a page, down to the depth set with -l/--level. Besides generating a site map, WebPalm can extract data from the body of each page using regular expressions and save the results to a file, which is useful for web scraping or for pulling out specific information.
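
The recursive traversal described above can be sketched in a few lines of Go. This is an illustration only, not webpalm's actual code: the `Node` type, `crawl`, `render`, and the in-memory `demoSite` map are all hypothetical, and links are found with a deliberately naive regular expression instead of a real HTML parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Node is one page in the tree (hypothetical type, not webpalm's own).
type Node struct {
	URL      string
	Children []*Node
}

// hrefRe is a naive link matcher; a real crawler would parse the HTML.
var hrefRe = regexp.MustCompile(`href="([^"]+)"`)

// fetchFunc returns the body of a page. In this sketch it is backed by
// a map rather than by HTTP requests.
type fetchFunc func(url string) string

// crawl follows every link found on url until depth reaches zero.
// visited records pages already seen, so link cycles terminate.
func crawl(url string, depth int, fetch fetchFunc, visited map[string]bool) *Node {
	node := &Node{URL: url}
	visited[url] = true
	if depth == 0 {
		return node
	}
	for _, m := range hrefRe.FindAllStringSubmatch(fetch(url), -1) {
		link := m[1]
		if visited[link] {
			continue
		}
		node.Children = append(node.Children, crawl(link, depth-1, fetch, visited))
	}
	return node
}

// render prints the tree with two spaces of indentation per level.
func render(n *Node, indent int) string {
	var b strings.Builder
	b.WriteString(strings.Repeat("  ", indent) + n.URL + "\n")
	for _, c := range n.Children {
		b.WriteString(render(c, indent+1))
	}
	return b.String()
}

// demoSite stands in for real pages.
var demoSite = map[string]string{
	"https://example.com/":  `<a href="https://example.com/a">a</a> <a href="https://example.com/b">b</a>`,
	"https://example.com/a": `<a href="https://example.com/">home</a> <a href="https://example.com/c">c</a>`,
}

func main() {
	fetch := func(u string) string { return demoSite[u] }
	root := crawl("https://example.com/", 2, fetch, map[string]bool{})
	fmt.Print(render(root, 0))
}
```

The depth argument plays the role of the level option: each recursive call decrements it, and a page reached at depth zero is recorded but not expanded.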

⚠️ DISCLAIMER ⚠️:

This tool is intended for legal purposes only; you are responsible for your actions.

Features

Generate a palm-tree structure of web URLs
Dump data from page bodies using regular expressions
Live output mode
Export the web tree to JSON, XML, or TXT
Fast and easy to use
Colorized output and error handling

Installation

From source

git clone https://github.com/XORbit01/webpalm.git
cd webpalm
go build -o webpalm && ./webpalm

From binary

You can download the binary from Releases:

wget https://github.com/XORbit01/webpalm/releases/download/v0.0.1/webpalm_x.x.x_os_arch.tar.gz
tar -xvf webpalm_x.x.x_os_arch.tar.gz
cd webpalm
./webpalm

If you have Go installed:

go install github.com/XORbit01/webpalm@latest

Usage

webpalm -h

Flags:
  -x, --exclude-code ints        status codes to exclude / ex : -x 404,500
  -h, --help                     help for webpalm
  -i, --include strings          include only domains / ex : -i google.com,facebook.com
  -l, --level int                level of palming / ex: -l 2
      --live                     live output mode (slow but live streaming) / ex: --live
  -o, --output string            file to export the result (f.json, f.xml, f.txt) / ex: -o result.json
      --regexes stringToString   regexes to match in each page / ex: --regexes comments="\<\!--.*?-->  (default [])
  -u, --url string               target url / ex: -u https://google.com

Examples

Get the palm tree of a website:

webpalm -u https://google.com -l1 --live

Get the palm tree of a website and exclude some status codes:

webpalm -u https://google.com -l1 -x 404,500
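
As a rough illustration of what excluding status codes amounts to, the sketch below parses a comma-separated list such as `404,500` and drops matching results. `Result`, `parseCodes`, and `excludeCodes` are hypothetical names, not part of webpalm.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Result is a hypothetical crawl record; webpalm's own types may differ.
type Result struct {
	URL    string
	Status int
}

// parseCodes turns a comma-separated list such as "404,500" into integers.
func parseCodes(s string) ([]int, error) {
	var codes []int
	for _, part := range strings.Split(s, ",") {
		code, err := strconv.Atoi(strings.TrimSpace(part))
		if err != nil {
			return nil, fmt.Errorf("invalid status code %q: %w", part, err)
		}
		codes = append(codes, code)
	}
	return codes, nil
}

// excludeCodes returns the results whose status is not in excluded.
func excludeCodes(results []Result, excluded []int) []Result {
	skip := make(map[int]bool, len(excluded))
	for _, c := range excluded {
		skip[c] = true
	}
	var kept []Result
	for _, r := range results {
		if !skip[r.Status] {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	codes, err := parseCodes("404,500")
	if err != nil {
		panic(err)
	}
	results := []Result{
		{"https://example.com/", 200},
		{"https://example.com/missing", 404},
		{"https://example.com/broken", 500},
	}
	for _, r := range excludeCodes(results, codes) {
		fmt.Println(r.Status, r.URL)
	}
}
```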

Get the palm tree of a website and dump data from the body of the pages:

webpalm -u https://google.com -l1 --regexes comments="\<\!--.*?-->" -o result.json

This dumps the HTML comments found in the body of each page.

webpalm -u https://google.com -l1 --regexes comments="\<\!--.*?-->",emails="([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)"

This dumps the comments and email addresses found in the body of each page.
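
To show what name=pattern pairs do conceptually, here is a small Go sketch that compiles a map of named patterns and collects every match from a page body. `compileAll` and `extract` are hypothetical helpers; only Go's standard `regexp` package is assumed. The email pattern in this sketch escapes the dot before the top-level domain.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAll compiles each named pattern and fails on the first invalid one.
func compileAll(patterns map[string]string) (map[string]*regexp.Regexp, error) {
	compiled := make(map[string]*regexp.Regexp, len(patterns))
	for name, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("pattern %q: %w", name, err)
		}
		compiled[name] = re
	}
	return compiled, nil
}

// extract returns every match in body, grouped by pattern name.
func extract(body string, compiled map[string]*regexp.Regexp) map[string][]string {
	found := make(map[string][]string)
	for name, re := range compiled {
		if matches := re.FindAllString(body, -1); len(matches) > 0 {
			found[name] = matches
		}
	}
	return found
}

func main() {
	compiled, err := compileAll(map[string]string{
		"comments": `\<\!--.*?-->`,
		"emails":   `([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)`,
	})
	if err != nil {
		panic(err)
	}
	body := `<html><!-- todo: remove --><p>mail admin@example.com</p><!-- v2 --></html>`
	for name, matches := range extract(body, compiled) {
		fmt.Println(name, matches)
	}
}
```

In Go's regexp syntax `.` does not match a newline by default, so a comment that spans several lines is only matched if the pattern starts with the `(?s)` flag.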

Get the palm tree of a website and export it to XML or TXT:

webpalm -u https://google.com -l3 -o result.xml

webpalm -u https://google.com -l2 -o result.txt

Get the palm tree of a website and include only some domains:

webpalm -u https://google.com -l2 -i google.com,facebook.com

This crawls only the URLs that contain google.com or facebook.com.
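
Read literally, the include filter is a substring check on each URL. The sketch below shows that reading; `included` is a hypothetical helper, and treating an empty list as "allow everything" is an assumption, not documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// included reports whether rawURL contains any of the given domains.
// An empty include list allows every URL (an assumption of this sketch).
func included(rawURL string, include []string) bool {
	if len(include) == 0 {
		return true
	}
	for _, domain := range include {
		if strings.Contains(rawURL, domain) {
			return true
		}
	}
	return false
}

func main() {
	include := []string{"google.com", "facebook.com"}
	for _, u := range []string{
		"https://maps.google.com/",
		"https://example.org/",
		"https://example.org/?ref=google.com",
	} {
		fmt.Println(included(u, include), u)
	}
}
```

Note that a plain substring test also accepts a URL that merely mentions the domain in its path or query string, as the third URL shows. A stricter filter would compare against the parsed hostname.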

Regex Examples

Name	Pattern
emails	([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)
comments	\<\!--.*?-->
tokens	[a-zA-Z0-9]{32}
password	\bpassword\b.{0,10}

Don't forget to escape the regexes if needed.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. You can also contact me on Discord: XORbit#5945

Powered By Malwarize
