bayesian

v0.0.0-...-c6aa877
Published: Sep 28, 2018 License: MIT Imports: 4 Imported by: 2

README

Go-Bayesian

Go-Bayesian is a Go package for classification using the Naive Bayes algorithm. It implements two Naive Bayes models: Multinomial TF and Multinomial Boolean.

It is based on RadhiFadlillah's project and adds a function for loading a model from a byte slice, which is useful when building a project into a single binary file.

Usage Examples

For basic classification, you can do something like this:

package main

import (
	"fmt"

	"github.com/CyrusF/go-bayesian"
)

// Declare the classes
const (
	Good bayesian.Class = "good"
	Bad  bayesian.Class = "bad"
)

func main() {
	// New Multinomial TF classifier
	classifier := bayesian.NewClassifier(bayesian.MultinomialTf)

	// Learn from two documents, one per class
	classifier.Learn(
		bayesian.NewDocument(Good, "tall", "handsome", "rich"),
		bayesian.NewDocument(Bad, "bald", "poor", "ugly"),
	)

	// Classify the tokens of a new document
	allScores, class, certain := classifier.Classify("the", "tall", "man")
	fmt.Println(allScores, class, certain)
}

You can also save the classifier to a file for later use, which avoids repeating the learning process:

func main() {
	// New Multinomial TF classifier
	classifier := bayesian.NewClassifier(bayesian.MultinomialTf)
	classifier.Learn(
		bayesian.NewDocument(Good, "tall", "handsome", "rich"),
		bayesian.NewDocument(Bad, "bald", "poor", "ugly"),
	)

	// Save classifier to file
	err := classifier.SaveClassifierToFile("./my-classifier")
	if err != nil {
		panic(err)
	}
}

Later, you can create a new Classifier from that file:

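func main() {
	// New classifier from file
	classifier, err := bayesian.NewClassifierFromFile("./my-classifier")
	if err != nil {
		panic(err)
	}

	// Use the loaded classifier
	_, class, _ := classifier.Classify("the", "tall", "man")
	fmt.Println(class)
}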

Or from an io.Reader, for example one that wraps a byte slice:

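func main() {
	// New classifier from an io.Reader; the empty byte slice is a
	// placeholder for the bytes of a previously saved classifier
	byteReader := bytes.NewReader([]byte(""))
	classifier, err := bayesian.NewClassifierFromFileStream(byteReader)
	if err != nil {
		panic(err)
	}

	// Use the loaded classifier
	_, class, _ := classifier.Classify("the", "tall", "man")
	fmt.Println(class)
}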

Resources

  1. Raschka, S. 2014. Naive Bayes and Text Classification I - Introduction and Theory. (PDF and Website)
  2. Metsis, V., Androutsopoulos, I., and Paliouras, G. 2006. Spam Filtering with Naive Bayes – Which Naive Bayes? Proceedings of CEAS 2006 - Third Conference on Email and Anti-Spam. California, USA, July 27-28, 2006. (PDF)
  3. Lecture slides from the Stanford Coursera course by Dan Jurafsky and Christopher Manning.

License

Go-Bayesian is distributed under the MIT license.

Documentation

Types

type Class

type Class string

Class is a string type representing the class of a document.

type Classifier

type Classifier struct {
	Model              Model
	LearningResults    map[string]map[Class]int
	PriorProbabilities map[Class]float64
	NDocumentByClass   map[Class]int
	NFrequencyByClass  map[Class]int
	NAllDocument       int
}

Classifier is an object for classifying documents.
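
The field names suggest how the prior probabilities relate to the document counts. As a minimal sketch, assuming the textbook definition P(class) = documents in class / all documents, the relationship could look like the code below. The priors helper is hypothetical and the local Class type only mirrors the package's type so the example compiles on its own; the package may compute PriorProbabilities differently.

```go
package main

import "fmt"

// Class mirrors the package's Class type for this standalone sketch.
type Class string

// priors is a hypothetical helper. It computes the share of learned
// documents that belong to each class (the textbook prior).
func priors(nDocumentByClass map[Class]int, nAllDocument int) map[Class]float64 {
	out := make(map[Class]float64, len(nDocumentByClass))
	if nAllDocument == 0 {
		return out // nothing learned yet, so no priors
	}
	for class, n := range nDocumentByClass {
		out[class] = float64(n) / float64(nAllDocument)
	}
	return out
}

func main() {
	p := priors(map[Class]int{"good": 3, "bad": 1}, 4)
	fmt.Println(p["good"], p["bad"]) // 0.75 0.25
}
```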

func NewClassifier

func NewClassifier(model Model) Classifier

NewClassifier returns a new Classifier that uses the given model.

func NewClassifierFromFile

func NewClassifierFromFile(path string) (Classifier, error)

NewClassifierFromFile returns a new Classifier with its configuration loaded from the file at path.

func NewClassifierFromFileStream

func NewClassifierFromFileStream(fl io.Reader) (Classifier, error)

NewClassifierFromFileStream returns a new Classifier with its configuration loaded from a byte stream read from fl.

func (Classifier) Classify

func (classifier Classifier) Classify(tokens ...string) (map[Class]float64, Class, bool)

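Classify runs the classification process on the given tokens. It returns a map of scores per class, the selected class, and a bool (named certain in the usage example).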
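
To make the return values more concrete, here is a minimal sketch of textbook multinomial Naive Bayes scoring in log space with add-one (Laplace) smoothing. It is not taken from the package source: the score helper, its parameters, and the smoothing choice are assumptions, and the maps only imitate the shape of LearningResults, NFrequencyByClass, and PriorProbabilities.

```go
package main

import (
	"fmt"
	"math"
)

// Class mirrors the package's Class type for this standalone sketch.
type Class string

// score is a hypothetical helper. For each class it sums log(prior) and
// the log of the smoothed likelihood of every token, then picks the
// class with the highest total.
func score(tokens []string, counts map[string]map[Class]int,
	totalByClass map[Class]int, prior map[Class]float64) (map[Class]float64, Class) {

	vocab := float64(len(counts)) // number of distinct learned tokens
	scores := make(map[Class]float64, len(prior))
	best, bestScore := Class(""), math.Inf(-1)

	for class, p := range prior {
		s := math.Log(p)
		for _, tok := range tokens {
			// Add-one smoothing keeps unseen tokens from zeroing the score.
			num := float64(counts[tok][class]) + 1
			den := float64(totalByClass[class]) + vocab
			s += math.Log(num / den)
		}
		scores[class] = s
		if s > bestScore {
			best, bestScore = class, s
		}
	}
	return scores, best
}

func main() {
	counts := map[string]map[Class]int{
		"tall": {"good": 1}, "handsome": {"good": 1}, "rich": {"good": 1},
		"bald": {"bad": 1}, "poor": {"bad": 1}, "ugly": {"bad": 1},
	}
	total := map[Class]int{"good": 3, "bad": 3}
	prior := map[Class]float64{"good": 0.5, "bad": 0.5}

	_, class := score([]string{"the", "tall", "man"}, counts, total, prior)
	fmt.Println(class) // good
}
```

Working in log space avoids floating-point underflow when many small probabilities are multiplied together.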

func (*Classifier) Learn

func (classifier *Classifier) Learn(docs ...Document)

Learn runs the learning process for this classifier on the given documents.

func (Classifier) SaveClassifierToFile

func (classifier Classifier) SaveClassifierToFile(path string) error

SaveClassifierToFile saves the Classifier configuration to the file at path.

type Document

type Document struct {
	Class  Class
	Tokens []string
}

Document is a group of tokens that belongs to a given class.

func NewDocument

func NewDocument(class Class, tokens ...string) Document

NewDocument returns a new Document.

type Model

type Model int

Model is an int type representing the Naive Bayes model used by a classifier.

const (
	// MultinomialTf is the model in which the frequency of a token affects the posterior probability
	MultinomialTf Model = 1

	// MultinomialBoolean is like TF, but each token is counted only once per document
	MultinomialBoolean Model = 2
)
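
The difference between the two models comes down to how tokens are counted within a document. The following sketch illustrates that idea with a hypothetical countTokens helper; it is not the package's code, only an illustration of the two counting rules described in the comments above.

```go
package main

import "fmt"

// countTokens is a hypothetical helper, not part of go-bayesian.
// With boolean=false every occurrence of a token is counted (TF style).
// With boolean=true each distinct token is counted once per document.
func countTokens(tokens []string, boolean bool) map[string]int {
	counts := make(map[string]int)
	for _, tok := range tokens {
		if boolean && counts[tok] > 0 {
			continue // already counted for this document
		}
		counts[tok]++
	}
	return counts
}

func main() {
	doc := []string{"rich", "rich", "tall"}
	fmt.Println(countTokens(doc, false)) // map[rich:2 tall:1]
	fmt.Println(countTokens(doc, true))  // map[rich:1 tall:1]
}
```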
