tldextract

package module
v0.0.0-...-d83daa6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 7, 2022 License: MIT Imports: 7 Imported by: 29

README

tldextract

Extract root domain, subdomain name, tld from a url, using the the Public Suffix List.

Installation

Install tldextract:

go get github.com/joeguo/tldextract

To run unit tests, run this command in tldextract's source directory($GOPATH/src/github.com/joeguo/tldextract):

go test

Example

package main

import (
"fmt"
"github.com/joeguo/tldextract"
)


func main() {
	urls := []string{"git+ssh://www.github.com:8443/", "http://media.forums.theregister.co.uk", "http://218.15.32.76", "http://google.com?q=cats"}
	cache := "/tmp/tld.cache"
	extract, _ := tldextract.New(cache,false)

	for _, u := range (urls) {
		result:=extract.Extract(u)
		fmt.Printf("%+v;%s\n",result,u)
	}
}

Output will look like:

  &{Flag:1 Sub:www Root:github Tld:com};git+ssh://www.github.com:8443/
  &{Flag:1 Sub:media.forums Root:theregister Tld:co.uk};http://media.forums.theregister.co.uk
  &{Flag:2 Sub: Root:218.15.32.76 Tld:};http://218.15.32.76
  &{Flag:1 Sub: Root:google Tld:com};http://google.com?q=cats

Flag value meaning

const (
	Malformed = iota
	Domain
	Ip4
	Ip6
)

========

Documentation

Overview

Package tldextract provides the ability to extract gTLD or ccTLD(generic or country code top-level domain), the registered domain and subdomain from a url according to the Public Suffix List.

A simple usage:

	package main

	import (
		"fmt"
		"github.com/joeguo/tldextract"
	)
	func main() {
		urls := []string{"git+ssh://www.github.com:8443/", "http://media.forums.theregister.co.uk", "http://218.15.32.76", "http://google.com?q=cats"}
		cache := "/tmp/tld.cache"
		extract := tldextract.New(cache,false)

		for _, u := range (urls) {
			result:=extract.Extract(u)
			fmt.Printf("%+v;%s\n",result,u)
		}
    }

Index

Constants

View Source
const (
	Malformed = iota
	Domain
	Ip4
	Ip6
)

used for Result.Flag

Variables

This section is empty.

Functions

This section is empty.

Types

type Result

type Result struct {
	Flag int
	Sub  string
	Root string
	Tld  string
}

type TLDExtract

type TLDExtract struct {
	CacheFile string
	// contains filtered or unexported fields
}

func New

func New(cacheFile string, debug bool) (*TLDExtract, error)

New creates a new *TLDExtract, it may be shared between goroutines, we usually need a single instance in an application.

func (*TLDExtract) Extract

func (extract *TLDExtract) Extract(u string) *Result

func (*TLDExtract) SetNoStrip

func (extract *TLDExtract) SetNoStrip()

SetNoStrip disables URL stripping in order to increase performance.

func (*TLDExtract) SetNoValidate

func (extract *TLDExtract) SetNoValidate()

SetNoValidate disables schema check in order to increase performance.

type Trie

type Trie struct {
	ExceptRule bool
	ValidTld   bool
	// contains filtered or unexported fields
}

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL