models

package
v1.0.0-alpha.2
Published: Oct 1, 2021 License: BSD-2-Clause Imports: 9 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func AutoMigrate

func AutoMigrate(db *gorm.DB) error

AutoMigrate performs the automatic migration of all GORM models.

Types

type ExtractedInfo

type ExtractedInfo struct {
	Model

	// Association to the WebArticle.
	WebArticleID uint `gorm:"not null;index;index:idx_web_article_id_info_extraction_rule_id,unique"`

	// Association to the InfoExtractionRule.
	InfoExtractionRuleID uint `gorm:"not null;index;index:idx_web_article_id_info_extraction_rule_id,unique"`

	Text       string  `gorm:"not null"`
	Confidence float32 `gorm:"not null"`
}

ExtractedInfo is a single result from the information extraction task performed on a WebArticle.

type Feed

type Feed struct {
	Model

	DeletedAt gorm.DeletedAt `gorm:"index"`

	// The unique URL of the feed.
	URL string `gorm:"not null;uniqueIndex"`

	// The system will look for new feed items from this feed only when it is
	// Enabled. Otherwise, the feed is simply ignored.
	Enabled bool `gorm:"not null;index"`

	// The date and time when this feed was last visited to successfully
	// retrieve its content (feed items), store it, and schedule further
	// processing jobs.
	LastRetrievedAt sql.NullTime `gorm:"index"`

	// Counter of consecutive fetching failures.
	FailuresCount int `gorm:"not null;default:0"`

	// When FailuresCount is not 0, this field should contain the error message
	// that caused the last failure. It is mostly useful for manual inspection.
	LastError sql.NullString

	// A Feed has many models.FeedItem models.
	FeedItems []FeedItem `gorm:"constraint:OnDelete:CASCADE"`
}

Feed is a model representing an RSS or Atom feed.

type FeedItem

type FeedItem struct {
	Model

	// Association to the Feed this item belongs to.
	FeedID uint `gorm:"not null;index"`

	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	Title       string `gorm:"not null"`
	Description string `gorm:"not null"`
	Content     string `gorm:"not null"`
	Language    string `gorm:"not null"`
	PublishedAt sql.NullTime
}

FeedItem extends a WebResource and represents an item of a Feed.

type GDELTEvent

type GDELTEvent struct {
	Model

	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	// GlobalEventID is the globally unique identifier in GDELT master dataset.
	GlobalEventID uint `gorm:"not null;uniqueIndex"`

	// DateAdded is the date the event was added to the master database.
	DateAdded time.Time `gorm:"not null"`

	// LocationType specifies the geographic resolution of the match type.
	LocationType sql.NullString

	// LocationName is the full human-readable name of the matched location.
	LocationName sql.NullString

	// CountryCode is the ISO 3166-1 alpha2 country code for the location.
	CountryCode sql.NullString

	// Coordinates provides the centroid Longitude (X) and Latitude (Y) of
	// the landmark for mapping.
	Coordinates pgtype.Point `gorm:"type:point"`

	// EventCategories provides one or more CAMEO event codes at different
	// levels.
	EventCategories pgtype.TextArray `gorm:"type:text[];not null"`
}

GDELTEvent represents a GDELT Event, and it extends a WebResource which contains the URL of the first recognized news report for this event.

type InfoExtractionRule

type InfoExtractionRule struct {
	Model

	DeletedAt gorm.DeletedAt `gorm:"index"`

	Label        string       `gorm:"not null;uniqueIndex"`
	Question     string       `gorm:"not null"`
	AnswerRegexp types.Regexp `gorm:"not null"`
	Threshold    float32      `gorm:"not null"`
	Enabled      bool         `gorm:"not null;index"`
}

InfoExtractionRule is a single configuration item for the information extraction task.

type Model

type Model struct {
	ID        uint      `gorm:"primaryKey"`
	CreatedAt time.Time `gorm:"not null;default:now()"`
	UpdatedAt time.Time `gorm:"not null;default:now()"`
}

Model is the basic struct embedded into all GORM models.

type PendingJob

type PendingJob struct {
	// ID corresponds to the Job ID.
	ID string `gorm:"not null;primaryKey"`

	// The creation time can be useful for implementing a recovery process,
	// which could look for the existence of PendingJobs older than
	// a certain leeway timespan.
	CreatedAt time.Time `gorm:"not null;index"`

	// JSON-encoded Faktory Job.
	Data datatypes.JSON `gorm:"not null"`
}

A PendingJob represents a Faktory job which is not yet guaranteed to be scheduled (i.e. pushed to the server).

Please refer to the jobscheduler package documentation to understand the benefits and use cases of this model.

func NewPendingJob

func NewPendingJob(job *faktory.Job) (*PendingJob, error)

NewPendingJob builds a new PendingJob, setting ID to the job.Jid and Data to the JSON serialization of the job.

It returns an error if json.Marshal fails.

type SimilarityInfo

type SimilarityInfo struct {
	Model

	// Association to the WebArticle this info belongs to.
	WebArticleID uint `gorm:"not null;uniqueIndex"`

	ParentID *uint `gorm:"index"`
	Parent   *WebArticle

	Distance *float32
}

SimilarityInfo provides information about similarity between WebArticles.

If a WebArticle "B" is considered to be similar to (or a duplicate of) WebArticle "A", then "A" is considered the "parent" of "B".

If a WebArticle has no related SimilarityInfo, it means that the similarity detection task was not (yet) performed, or an error occurred.

If a WebArticle has a SimilarityInfo attached with no Parent (and no Distance), it means that the similarity detection task was successfully performed and no prior similar entities were found.
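
The three cases described above can be told apart with two nil checks. The sketch below uses reduced, hypothetical stand-ins (article, similarityInfo) rather than the real GORM models, and the status names are invented for illustration.

```go
package main

import "fmt"

// similarityInfo is a reduced stand-in for SimilarityInfo.
type similarityInfo struct {
	ParentID *uint
	Distance *float32
}

// article is a reduced stand-in for WebArticle.
type article struct {
	ID             uint
	SimilarityInfo *similarityInfo
}

type similarityStatus int

const (
	// statusUnknown: detection not (yet) performed, or it failed.
	statusUnknown similarityStatus = iota
	// statusOriginal: detection ran and found no prior similar article.
	statusOriginal
	// statusDuplicate: detection ran and found a parent article.
	statusDuplicate
)

// status classifies an article according to its SimilarityInfo.
func status(a article) similarityStatus {
	switch {
	case a.SimilarityInfo == nil:
		return statusUnknown
	case a.SimilarityInfo.ParentID == nil:
		return statusOriginal
	default:
		return statusDuplicate
	}
}

func main() {
	parent := uint(1)
	dist := float32(0.12)

	articles := []article{
		{ID: 1, SimilarityInfo: &similarityInfo{}},
		{ID: 2, SimilarityInfo: &similarityInfo{ParentID: &parent, Distance: &dist}},
		{ID: 3},
	}
	for _, a := range articles {
		fmt.Println(a.ID, status(a))
	}
}
```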

type TextClass

type TextClass struct {
	Model

	// Association to the WebArticle this class belongs to.
	WebArticleID uint `gorm:"not null;index"`

	Type       string  `gorm:"not null;index"`
	Label      string  `gorm:"not null;index"`
	Confidence float32 `gorm:"not null"`
}

TextClass is a single classification result class for a WebArticle, predicted with a generic text classifier.

type Tweet added in v0.3.0

type Tweet struct {
	Model

	// TwitterSourceID is the association to the TwitterSource this item belongs to.
	TwitterSourceID uint `gorm:"not null;index"`

	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	UpstreamID  string    `gorm:"not null;uniqueIndex"`
	Text        string    `gorm:"not null"`
	PublishedAt time.Time `gorm:"not null"`
	Username    string    `gorm:"not null;index"`
	UserID      string    `gorm:"not null;index"`
}

Tweet extends a WebResource which is the item of a TwitterSource.

type TwitterSource added in v0.3.0

type TwitterSource struct {
	Model

	DeletedAt gorm.DeletedAt `gorm:"index"`

	Type TwitterSourceType `gorm:"not null;index:idx_twitter_source_type_text,unique"`

	// Text is either a username or a search term, depending on the Type.
	Text string `gorm:"not null;index:idx_twitter_source_type_text,unique"`

	// The system will look for new tweets from this source only when it is
	// Enabled. Otherwise, the twitter source is simply ignored.
	Enabled bool `gorm:"not null;index"`

	// The date and time when this source was last visited to successfully
	// retrieve its content (tweets), store it, and schedule further
	// processing jobs.
	LastRetrievedAt sql.NullTime `gorm:"index"`

	// Counter of consecutive fetching failures.
	FailuresCount int `gorm:"not null;default:0"`

	// When FailuresCount is not 0, this field should contain the error message
	// that caused the last failure. It is mostly useful for manual inspection.
	LastError sql.NullString

	// Tweets is the has-many relation with Tweet models.
	Tweets []Tweet `gorm:"constraint:OnDelete:CASCADE"`
}

TwitterSource represents the source of Tweets / WebResources.

type TwitterSourceType

type TwitterSourceType string

TwitterSourceType acts as an enumeration type to identify different kinds of Twitter sources.

const (
	// UserTwitterSource identifies a Twitter source linked to a user profile.
	UserTwitterSource TwitterSourceType = "user"

	// SearchTwitterSource identifies a Twitter source linked to a terms search.
	SearchTwitterSource TwitterSourceType = "search"
)
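
Because TwitterSourceType is a plain string type, any string can be converted to it, so callers may want to validate external input. The sketch below reproduces the type and its two constants so that it compiles on its own; parseTwitterSourceType is a hypothetical helper and is not part of the package.

```go
package main

import "fmt"

// Reproduced from the declarations above so the example is self-contained.
type TwitterSourceType string

const (
	UserTwitterSource   TwitterSourceType = "user"
	SearchTwitterSource TwitterSourceType = "search"
)

// parseTwitterSourceType accepts only the two known values.
func parseTwitterSourceType(s string) (TwitterSourceType, error) {
	switch t := TwitterSourceType(s); t {
	case UserTwitterSource, SearchTwitterSource:
		return t, nil
	default:
		return "", fmt.Errorf("unknown twitter source type %q", s)
	}
}

func main() {
	for _, s := range []string{"user", "search", "hashtag"} {
		t, err := parseTwitterSourceType(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", t)
	}
}
```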

type Vector

type Vector struct {
	Model

	// Association to the WebArticle this vector belongs to.
	WebArticleID uint `gorm:"not null;uniqueIndex"`

	Data *pgtype.Float4Array `gorm:"type:float4[];not null"`
}

Vector is a vector representation of a WebArticle.

func (Vector) DataAsFloat32Slice

func (v Vector) DataAsFloat32Slice() ([]float32, error)

DataAsFloat32Slice converts Vector.Data to []float32.

type WebArticle

type WebArticle struct {
	Model

	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	Title              string `gorm:"not null;index"`
	TopImage           sql.NullString
	ScrapedPublishDate sql.NullTime
	Language           string    `gorm:"not null"`
	PublishDate        time.Time `gorm:"not null"`

	TranslatedTitle     sql.NullString
	TranslationLanguage sql.NullString

	CountryCode sql.NullString

	// A WebArticle has many models.ZeroShotClass models.
	ZeroShotClasses []ZeroShotClass `gorm:"constraint:OnDelete:CASCADE"`

	// A WebArticle has many models.TextClass models.
	TextClasses []TextClass `gorm:"constraint:OnDelete:CASCADE"`

	// A WebArticle has many models.ExtractedInfo models.
	ExtractedInfos []ExtractedInfo `gorm:"constraint:OnDelete:CASCADE"`

	// A WebArticle has one Vector.
	Vector *Vector `gorm:"constraint:OnDelete:CASCADE"`

	// A WebArticle has one SimilarityInfo.
	SimilarityInfo *SimilarityInfo `gorm:"constraint:OnDelete:CASCADE"`
}

WebArticle represents the scraped content of a WebResource.

type WebResource

type WebResource struct {
	Model

	// The unique URL of the web resource.
	URL string `gorm:"not null;uniqueIndex"`

	// A WebArticle extends the WebResource with the scraped content.
	WebArticle *WebArticle `gorm:"constraint:OnDelete:CASCADE"`

	// FeedItem allows the has-one relation with a models.FeedItem.
	FeedItem *FeedItem `gorm:"constraint:OnDelete:CASCADE"`

	// GDELTEvent allows the has-one relation with a models.GDELTEvent.
	GDELTEvent *GDELTEvent `gorm:"constraint:OnDelete:CASCADE"`

	// Tweet allows the has-one relation with a models.Tweet.
	Tweet *Tweet `gorm:"constraint:OnDelete:CASCADE"`
}

WebResource represents a web resource, usually a web page, accessible via a URL.

type ZeroShotClass

type ZeroShotClass struct {
	Model

	// Association to the WebArticle this class belongs to.
	WebArticleID uint `gorm:"not null;index;index:idx_web_article_id_template_id_best,unique,where:best;index:idx_web_article_id_label_id,unique"`

	// Association to the ZeroShotHypothesisLabel.
	ZeroShotHypothesisLabelID uint `gorm:"not null;index;index:idx_web_article_id_label_id,unique"`

	// Association to the ZeroShotHypothesisTemplate.
	ZeroShotHypothesisTemplateID uint `gorm:"not null;index;index:idx_web_article_id_template_id_best,unique,where:best"`

	// Reports whether this prediction is the best item among the labels of
	// the associated template.
	Best bool `gorm:"not null;index:idx_web_article_id_template_id_best,unique,where:best"`

	Confidence float32 `gorm:"not null"`
}

ZeroShotClass is a single classification result class for a WebArticle, predicted with the spaGO BART zero-shot classification service.
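
The partial unique index idx_web_article_id_template_id_best (with the condition "where:best") allows at most one row with Best set for each pair of WebArticle and template. The sketch below shows one way the flag could be computed for the predictions of a single article. It assumes that "best" means the highest Confidence, which the documentation does not state; prediction and markBest are hypothetical names.

```go
package main

import "fmt"

// prediction is a reduced stand-in for ZeroShotClass, limited to the
// predictions of one WebArticle.
type prediction struct {
	TemplateID uint
	LabelID    uint
	Confidence float32
	Best       bool
}

// markBest sets Best on exactly one prediction per template: the one with
// the highest Confidence (assumption). Ties keep the earliest entry.
func markBest(preds []prediction) {
	top := make(map[uint]int) // template ID -> index of the top prediction
	for i := range preds {
		preds[i].Best = false
		j, seen := top[preds[i].TemplateID]
		if !seen || preds[i].Confidence > preds[j].Confidence {
			top[preds[i].TemplateID] = i
		}
	}
	for _, i := range top {
		preds[i].Best = true
	}
}

func main() {
	preds := []prediction{
		{TemplateID: 1, LabelID: 10, Confidence: 0.2},
		{TemplateID: 1, LabelID: 11, Confidence: 0.9},
		{TemplateID: 2, LabelID: 20, Confidence: 0.5},
	}
	markBest(preds)
	for _, p := range preds {
		fmt.Println(p.TemplateID, p.LabelID, p.Best)
	}
}
```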

type ZeroShotHypothesisLabel

type ZeroShotHypothesisLabel struct {
	Model

	DeletedAt gorm.DeletedAt `gorm:"index"`

	// Association to the ZeroShotHypothesisTemplate.
	ZeroShotHypothesisTemplateID uint `gorm:"not null;index;index:idx_hypothesis_id_text,unique"`

	// The system will ignore the labels which are not Enabled.
	Enabled bool `gorm:"not null;index"`

	// Text is the label to be replaced in the hypothesis text.
	Text string `gorm:"not null;index:idx_hypothesis_id_text,unique"`
}

ZeroShotHypothesisLabel is one possible label to be replaced in the text of a ZeroShotHypothesisTemplate.

type ZeroShotHypothesisTemplate

type ZeroShotHypothesisTemplate struct {
	Model

	DeletedAt gorm.DeletedAt `gorm:"index"`

	// The system will ignore the templates which are not Enabled.
	Enabled bool `gorm:"not null;index"`

	// Text is the hypothesis. It MUST contain one character sequence "{}" to
	// indicate the point where each related label will be placed.
	Text string `gorm:"not null"`

	// MultiClass indicates whether the zero-shot classification is multi-class
	// (true) or single-class (false).
	MultiClass bool `gorm:"not null"`

	// Labels are the possible items to be replaced in the Text.
	Labels []ZeroShotHypothesisLabel `gorm:"constraint:OnDelete:CASCADE"`
}

ZeroShotHypothesisTemplate represents the template for a hypothesis used for BART zero-shot classification of WebArticles.
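
To illustrate how a template Text and its Labels combine, the sketch below substitutes each label for the "{}" placeholder. The hypotheses function is a hypothetical helper, and rejecting templates that do not contain exactly one placeholder is one interpretation of the MUST in the Text comment, not confirmed package behavior.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// placeholder is the character sequence named in the Text comment.
const placeholder = "{}"

// hypotheses builds one hypothesis per label by replacing the placeholder
// in the template text. It rejects templates without exactly one
// placeholder (assumption).
func hypotheses(template string, labels []string) ([]string, error) {
	if strings.Count(template, placeholder) != 1 {
		return nil, errors.New("template must contain exactly one \"{}\"")
	}
	out := make([]string, 0, len(labels))
	for _, label := range labels {
		out = append(out, strings.Replace(template, placeholder, label, 1))
	}
	return out, nil
}

func main() {
	hs, err := hypotheses("This text is about {}.", []string{"sport", "politics"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, h := range hs {
		fmt.Println(h)
	}
}
```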
