gryffin

package module
Published: Jan 7, 2016 License: BSD-3-Clause Imports: 16 Imported by: 0

README

Gryffin (beta)

Gryffin is a large scale web security scanning platform. It is not yet another scanner. It was written to solve two specific problems with existing scanners: coverage and scale.

Better coverage translates to fewer false negatives. Inherent scalability translates to the capability of scanning, and supporting, a large elastic application infrastructure. Simply put, it is the ability to scale from scanning 1,000 applications today to 100,000 applications tomorrow through straightforward horizontal scaling.

Coverage

Coverage has two dimensions: one during crawl and the other during fuzzing. In the crawl phase, coverage means finding as much of the application footprint as possible. In the scan phase, while fuzzing, it means testing each part of the application in depth for the applied set of vulnerabilities.

Crawl Coverage

Today a large number of web applications are template-driven, meaning the same code or path generates millions of URLs. A security scanner needs just one of the millions of URLs generated by the same code or path, and Gryffin's crawler finds exactly that.

Page Deduplication

At the heart of Gryffin is a deduplication engine that compares a new page with already seen pages. If the HTML structure of the new page is similar to those already seen, it is classified as a duplicate and not crawled further.
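
As a rough sketch of that flow using this package's API (documented further down this page); the URL and the surrounding program are illustrative only:

package main

import "github.com/yahoo/gryffin"

func main() {
	// Back the deduplication engine with an in-process store.
	gryffin.SetMemoryStore(gryffin.NewGryffinStore())

	scan := gryffin.NewScan("GET", "http://example.com/item/123", "")
	// ... crawling would populate scan.Response and scan.ResponseBody ...
	scan.UpdateFingerprint()
	if scan.IsDuplicatedPage() {
		// HTML structure is similar to an already-seen page:
		// classify as a duplicate and stop crawling this path.
		return
	}
	// ... otherwise extract links and continue crawling ...
}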

DOM Rendering and Navigation

A large number of applications today are rich applications. They are heavily driven by client-side JavaScript. In order to discover links and code paths in such applications, Gryffin's crawler uses PhantomJS for DOM rendering and navigation.

Scan Coverage

As Gryffin is a scanning platform, not a scanner, it does not have its own fuzzer modules, even for fuzzing common web vulnerabilities like XSS and SQL Injection.

It's not wise to reinvent the wheel where you do not have to. At production scale at Yahoo, Gryffin uses open-source and custom fuzzers. Some of these custom fuzzers might be open sourced in the future, and might or might not be part of the Gryffin repository.

For demonstration purposes, Gryffin comes integrated with sqlmap and arachni. It does not endorse them or any other scanner in particular.

The philosophy is to improve scan coverage by being able to fuzz for just what you need.
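
For example, a custom fuzzer is just an implementation of the Fuzzer interface documented further down this page. The single-check fuzzer below is a hypothetical sketch, not part of Gryffin:

package main

import "github.com/yahoo/gryffin"

// headerFuzzer is a hypothetical single-purpose fuzzer: it flags only
// one class of issue, illustrating "fuzz for just what you need".
type headerFuzzer struct{}

// Fuzz satisfies the gryffin.Fuzzer interface.
func (headerFuzzer) Fuzz(s *gryffin.Scan) (int, error) {
	count := 0
	if s.Response != nil && s.Response.Header.Get("X-Frame-Options") == "" {
		count++ // for example, flag a missing clickjacking header
	}
	return count, nil
}

func main() {
	scan := gryffin.NewScan("GET", "http://example.com/", "")
	issues, err := scan.Fuzz(headerFuzzer{})
	_, _ = issues, err
}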

Scale

While Gryffin is available as a standalone package, it's primarily built for scale.

Gryffin is built on the publisher-subscriber model. Each component is either a publisher, or a subscriber, or both. This allows Gryffin to scale horizontally by simply adding more subscriber or publisher nodes.
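
As an illustration, a subscriber node could consume serialized scans from NSQ (see the prerequisites below). The topic and channel names here are assumptions for the sketch, not part of Gryffin's API:

package main

import (
	"log"

	"github.com/nsqio/go-nsq"
	"github.com/yahoo/gryffin"
)

func main() {
	// Subscribe to a hypothetical "seed" topic of serialized scans.
	consumer, err := nsq.NewConsumer("seed", "crawl", nsq.NewConfig())
	if err != nil {
		log.Fatal(err)
	}
	consumer.AddHandler(nsq.HandlerFunc(func(m *nsq.Message) error {
		scan := gryffin.NewScanFromJson(m.Body)
		// ... crawl or fuzz the scan, then publish results ...
		_ = scan
		return nil
	}))
	// lookupd's HTTP port is 4161, per the prerequisites below.
	if err := consumer.ConnectToNSQLookupd("127.0.0.1:4161"); err != nil {
		log.Fatal(err)
	}
	<-consumer.StopChan
}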

Operating Gryffin

Pre-requisites
  1. Go
  2. PhantomJS, v2
  3. Sqlmap (for fuzzing SQLi)
  4. Arachni (for fuzzing XSS and web vulnerabilities)
  5. NSQ,
    • running lookupd on ports 4160 and 4161
    • running nsqd on ports 4150 and 4151
    • with --max-msg-size=5000000
  6. Kibana and Elasticsearch, for dashboarding
Installation
go get -u github.com/yahoo/gryffin/...
Run

(WIP)

TODO

  1. Mobile browser user agent
  2. Preconfigured docker images
  3. Redis for sharing states across machines
  4. Instructions to run Gryffin (distributed or standalone)
  5. Documentation for html-distance
  6. Implement a JSON serializable cookiejar.
  7. Identify duplicate url patterns based on simhash result.

Talks and Slides

Credits

Licence

Code licensed under the BSD-style license. See LICENSE file for terms.

Documentation

Overview

Package gryffin is an application scanning infrastructure.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func GenRandomID

func GenRandomID() string

GenRandomID generates a random ID.

func SetLogWriter

func SetLogWriter(w io.Writer)

func SetMemoryStore

func SetMemoryStore(m *GryffinStore)

Types

type Fingerprint

type Fingerprint struct {
	Origin             uint64 // origin
	URL                uint64 // origin + path
	Request            uint64 // method, url, body
	RequestFull        uint64 // request + header
	ResponseSimilarity uint64
}

Fingerprint contains all the different types of hashes for the Scan (Request & Response).

type Fuzzer

type Fuzzer interface {
	Fuzz(*Scan) (int, error)
}

Fuzzer runs the fuzzing.

type GryffinStore

type GryffinStore struct {
	Oracles map[string]*distance.Oracle
	Hashes  map[string]bool
	Hits    map[string]int
	// contains filtered or unexported fields
}

func NewGryffinStore

func NewGryffinStore() *GryffinStore

func NewSharedGryffinStore

func NewSharedGryffinStore() *GryffinStore

func (*GryffinStore) GetRcvChan

func (s *GryffinStore) GetRcvChan() chan []byte

func (*GryffinStore) GetSndChan

func (s *GryffinStore) GetSndChan() chan []byte

func (*GryffinStore) Hit

func (s *GryffinStore) Hit(prefix string) bool

func (*GryffinStore) See

func (s *GryffinStore) See(prefix string, kind string, v uint64)

func (*GryffinStore) Seen

func (s *GryffinStore) Seen(prefix string, kind string, v uint64, r uint8) bool

type HTTPDoer

type HTTPDoer interface {
	Do(*http.Request) (*http.Response, error)
}

HTTPDoer is the interface implemented by *http.Client.
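
Since *http.Client satisfies this interface, the default client can, for example, drive Scan.Poke (documented below). A minimal sketch:

package main

import (
	"net/http"

	"github.com/yahoo/gryffin"
)

func main() {
	scan := gryffin.NewScan("GET", "http://example.com/", "")
	// http.DefaultClient is a *http.Client and therefore an HTTPDoer.
	if err := scan.Poke(http.DefaultClient); err != nil {
		return // target is not up
	}
}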

type Job

type Job struct {
	ID             string
	DomainsAllowed []string // Domains that we would crawl
}

Job stores the job id and config (if any).

type LogMessage

type LogMessage struct {
	Service string
	Msg     string
	Method  string
	Url     string
	JobID   string
}

LogMessage contains the data fields to be marshalled as JSON for forwarding to the log processor.

type PublishMessage

type PublishMessage struct {
	F string // function, i.e. See or Seen
	T string // type (kind), i.e. oracle or hash
	K string // key
	V string // value
}

type Renderer

type Renderer interface {
	Do(*Scan)
	GetRequestBody() <-chan *Scan
	GetLinks() <-chan *Scan
}

Renderer is an interface to be implemented by an HTML DOM renderer that obtains the response body and links. Since DOM construction is very likely to be asynchronous, it returns channels on which the response and links are received.
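
A sketch of how a crawler worker might drive a Renderer, assuming r is some implementation (such as the PhantomJS-based renderer mentioned in the README); CrawlAsync and ShouldCrawl are documented below, and the package name is illustrative:

package crawl

import "github.com/yahoo/gryffin"

// consumeLinks drives a renderer and filters the links it discovers.
func consumeLinks(s *gryffin.Scan, r gryffin.Renderer) {
	// Start rendering asynchronously; links arrive on a channel as
	// the DOM is constructed.
	s.CrawlAsync(r)
	for link := range r.GetLinks() {
		if link.ShouldCrawl() {
			// enqueue link for the next crawl
		}
	}
}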

type Scan

type Scan struct {
	// ID is a random ID to identify this particular scan.
	// If ID is empty, this scan should not be performed (but recorded for rate limiting).
	ID           string
	Job          *Job
	Request      *http.Request
	RequestBody  string
	Response     *http.Response
	ResponseBody string
	Cookies      []*http.Cookie
	Fingerprint  Fingerprint
	HitCount     int
}

A Scan consists of the job, target, request and response.

func NewScan

func NewScan(method, url, post string) *Scan

NewScan creates a scan.

func NewScanFromJson

func NewScanFromJson(b []byte) *Scan

func (*Scan) CrawlAsync

func (s *Scan) CrawlAsync(r Renderer)

CrawlAsync runs the crawl asynchronously.

func (*Scan) Error

func (s *Scan) Error(service string, err error)

TODO - LogFmt (fmt string) TODO - LogI (interface)

func (*Scan) Fuzz

func (s *Scan) Fuzz(fuzzer Fuzzer) (int, error)

Fuzz runs the vulnerability fuzzer and returns the issue count.

func (*Scan) IsDuplicatedPage

func (s *Scan) IsDuplicatedPage() bool

IsDuplicatedPage checks if we should proceed based on the Response.

func (*Scan) IsScanAllowed

func (s *Scan) IsScanAllowed() bool

IsScanAllowed checks if the request URL is allowed per Job.DomainsAllowed.

func (*Scan) Json

func (s *Scan) Json() []byte

func (*Scan) Log

func (s *Scan) Log(v interface{})

func (*Scan) Logf

func (s *Scan) Logf(format string, a ...interface{})

func (*Scan) Logm

func (s *Scan) Logm(service, msg string)

Logm sends a LogMessage to Log processor.

func (*Scan) Logmf

func (s *Scan) Logmf(service, format string, a ...interface{})

func (*Scan) MergeRequest

func (s *Scan) MergeRequest(req *http.Request)

MergeRequest merges the given request into the scan's existing request field.

func (*Scan) Poke

func (s *Scan) Poke(client HTTPDoer) (err error)

Poke checks if the target is up.

func (*Scan) RateLimit

func (s *Scan) RateLimit() int

RateLimit checks whether we are under the allowed rate for crawling the site. It returns a delay to wait before checking whether we are ready to crawl again.
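
A minimal wait loop built on this, assuming the returned delay is in seconds (the unit is not stated here); the package name is illustrative:

package crawl

import (
	"time"

	"github.com/yahoo/gryffin"
)

// waitForCrawl blocks until the scan is under the allowed crawl rate.
func waitForCrawl(s *gryffin.Scan) {
	// Assumes RateLimit's return value is a delay in seconds.
	for delay := s.RateLimit(); delay > 0; delay = s.RateLimit() {
		time.Sleep(time.Duration(delay) * time.Second)
	}
}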

func (*Scan) ReadResponseBody

func (s *Scan) ReadResponseBody()

ReadResponseBody reads Response.Body and fills ResponseBody with it. It also reconstructs the io.ReadCloser stream.

func (*Scan) ShouldCrawl

func (s *Scan) ShouldCrawl() bool

ShouldCrawl checks if the link should be queued for the next crawl.

func (*Scan) Spawn

func (s *Scan) Spawn() *Scan

Spawn spawns a new scan object with a different ID.

func (*Scan) UpdateFingerprint

func (s *Scan) UpdateFingerprint()

UpdateFingerprint updates the fingerprint field.

type SerializableRequest

type SerializableRequest struct {
	*http.Request
	Cancel string
}

type SerializableResponse

type SerializableResponse struct {
	*http.Response
	Request *SerializableRequest
}

type SerializableScan

type SerializableScan struct {
	*Scan
	Request  *SerializableRequest
	Response *SerializableResponse
}
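
These wrappers back the JSON round trip of a scan between nodes; a minimal sketch (the package name is illustrative):

package crawl

import "github.com/yahoo/gryffin"

// roundTrip serializes a scan for publishing and reconstructs it,
// as another node would on receipt.
func roundTrip(s *gryffin.Scan) *gryffin.Scan {
	b := s.Json() // uses the Serializable* wrappers above
	return gryffin.NewScanFromJson(b)
}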

Directories

Path            Synopsis
cmd
data            Package data provides an interface for common data store operations.
fuzzer
html-distance   Package html-distance is a go library for computing the proximity of the HTML pages.
