Documentation ¶
Index ¶
- func AutoMigrate(db *gorm.DB) error
- type ExtractedInfo
- type Feed
- type FeedItem
- type GDELTEvent
- type InfoExtractionRule
- type Model
- type PendingJob
- type SimilarityInfo
- type TextClass
- type Tweet
- type TwitterSource
- type TwitterSourceType
- type Vector
- type WebArticle
- type WebResource
- type ZeroShotClass
- type ZeroShotHypothesisLabel
- type ZeroShotHypothesisTemplate
Functions ¶
func AutoMigrate ¶
AutoMigrate performs the automatic migration of all GORM models.
Types ¶
type ExtractedInfo ¶
type ExtractedInfo struct {
	Model

	// Association to the WebArticle.
	WebArticleID uint `gorm:"not null;index;index:idx_web_article_id_info_extraction_rule_id,unique"`
	// Association to the InfoExtractionRule.
	InfoExtractionRuleID uint `gorm:"not null;index;index:idx_web_article_id_info_extraction_rule_id,unique"`

	Text       string  `gorm:"not null"`
	Confidence float32 `gorm:"not null"`
}
ExtractedInfo is a single result from the information extraction task performed on a WebArticle.
type Feed ¶
type Feed struct {
	Model
	DeletedAt gorm.DeletedAt `gorm:"index"`

	// The unique URL of the feed.
	URL string `gorm:"not null;uniqueIndex"`
	// The system will look for new feed items from this feed only when it is
	// Enabled. Otherwise, the feed is simply ignored.
	Enabled bool `gorm:"not null;index"`
	// The date and time when this feed was last visited to successfully
	// retrieve its content (feed items), store it, and schedule further
	// processing jobs.
	LastRetrievedAt sql.NullTime `gorm:"index"`
	// Counter of consecutive fetching failures.
	FailuresCount int `gorm:"not null;default:0"`
	// When FailuresCount is not 0, this field should contain the error message
	// that caused the last failure. It is mostly useful for manual inspection.
	LastError sql.NullString

	// A Feed has many models.FeedItem models.
	FeedItems []FeedItem `gorm:"constraint:OnDelete:CASCADE"`
}
Feed is a model representing an RSS or Atom feed.
type FeedItem ¶
type FeedItem struct {
	Model

	// Association to the Feed this item belongs to.
	FeedID uint `gorm:"not null;index"`
	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	Title       string `gorm:"not null"`
	Description string `gorm:"not null"`
	Content     string `gorm:"not null"`
	Language    string `gorm:"not null"`
	PublishedAt sql.NullTime
}
FeedItem extends a WebResource representing the item of a Feed.
type GDELTEvent ¶
type GDELTEvent struct {
	Model

	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`
	// GlobalEventID is the globally unique identifier in GDELT master dataset.
	GlobalEventID uint `gorm:"not null;uniqueIndex"`
	// DateAdded is the date the event was added to the master database.
	DateAdded time.Time `gorm:"not null"`
	// LocationType specifies the geographic resolution of the match type.
	LocationType sql.NullString
	// LocationName is the full human-readable name of the matched location.
	LocationName sql.NullString
	// CountryCode is the ISO 3166-1 alpha2 country code for the location.
	CountryCode sql.NullString
	// Coordinates provides the centroid Longitude (X) and Latitude (Y) of
	// the landmark for mapping.
	Coordinates pgtype.Point `gorm:"type:point"`
	// EventCategories provides one or more CAMEO event codes at different
	// levels.
	EventCategories pgtype.TextArray `gorm:"type:text[];not null"`
}
GDELTEvent represents a GDELT Event, and it extends a WebResource which contains the URL of the first recognized news report for this event.
type InfoExtractionRule ¶
type InfoExtractionRule struct {
	Model
	DeletedAt gorm.DeletedAt `gorm:"index"`

	Label        string       `gorm:"not null;uniqueIndex"`
	Question     string       `gorm:"not null"`
	AnswerRegexp types.Regexp `gorm:"not null"`
	Threshold    float32      `gorm:"not null"`
	Enabled      bool         `gorm:"not null;index"`
}
InfoExtractionRule is a single configuration item for the information extraction task.
type Model ¶
type Model struct {
	ID        uint      `gorm:"primaryKey"`
	CreatedAt time.Time `gorm:"not null;default:now()"`
	UpdatedAt time.Time `gorm:"not null;default:now()"`
}
Model is the basic struct embedded into all GORM models.
type PendingJob ¶
type PendingJob struct {
	// ID corresponds to the Job ID.
	ID string `gorm:"not null;primaryKey"`
	// The creation time can be useful for implementing a recovery process,
	// which could look for the existence of PendingJobs older than
	// a certain leeway timespan.
	CreatedAt time.Time `gorm:"not null;index"`
	// JSON-encoded Faktory Job.
	Data datatypes.JSON `gorm:"not null"`
}
A PendingJob represents a Faktory job which is not yet guaranteed to be scheduled (i.e. pushed to the server).
Please refer to the jobscheduler package documentation to understand the benefits and use cases of this model.
func NewPendingJob ¶
func NewPendingJob(job *faktory.Job) (*PendingJob, error)
NewPendingJob builds a new PendingJob, setting ID to the job.Jid and Data to the JSON serialization of the job.
It returns an error if json.Marshal fails.
type SimilarityInfo ¶
type SimilarityInfo struct {
	Model

	// Association to the WebArticle this info belongs to.
	WebArticleID uint `gorm:"not null;uniqueIndex"`

	ParentID *uint `gorm:"index"`
	Parent   *WebArticle
	Distance *float32
}
SimilarityInfo provides information about similarity between WebArticles.
If a WebArticle "B" is considered to be similar to (or a duplicate of) WebArticle "A", then "A" is considered the "parent" of "B".
If a WebArticle has no related SimilarityInfo, it means that the similarity detection task was not (yet) performed, or an error occurred.
If a WebArticle has a SimilarityInfo attached with no Parent (and no Distance), it means that the similarity detection task was successfully performed and no prior similar entities were found.
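The three situations above can be told apart with two nil checks. The following is a minimal sketch: similarityInfo is a reduced stand-in for the model, and describe is a hypothetical helper introduced only for illustration.

```go
package main

import "fmt"

// similarityInfo is a reduced, hypothetical stand-in for SimilarityInfo.
type similarityInfo struct {
	ParentID *uint
	Distance *float32 // expected to be nil whenever ParentID is nil
}

// describe interprets the SimilarityInfo attached to an article. A nil
// argument means that the article has no SimilarityInfo at all.
func describe(info *similarityInfo) string {
	switch {
	case info == nil:
		return "not processed"
	case info.ParentID == nil:
		return "no similar article found"
	default:
		return fmt.Sprintf("similar to article %d", *info.ParentID)
	}
}

func main() {
	parent := uint(7)
	distance := float32(0.12)

	fmt.Println(describe(nil))
	fmt.Println(describe(&similarityInfo{}))
	fmt.Println(describe(&similarityInfo{ParentID: &parent, Distance: &distance}))
}
```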
type TextClass ¶
type TextClass struct {
	Model

	// Association to the WebArticle this class belongs to.
	WebArticleID uint `gorm:"not null;index"`

	Type       string  `gorm:"not null;index"`
	Label      string  `gorm:"not null;index"`
	Confidence float32 `gorm:"not null"`
}
TextClass is a single classification result class for a WebArticle, predicted with a generic text classifier.
type Tweet ¶ added in v0.3.0
type Tweet struct {
	Model

	// TwitterSourceID is the association to the TwitterSource this item belongs to.
	TwitterSourceID uint `gorm:"not null;index"`
	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	UpstreamID  string    `gorm:"not null;uniqueIndex"`
	Text        string    `gorm:"not null"`
	PublishedAt time.Time `gorm:"not null"`
	Username    string    `gorm:"not null;index"`
	UserID      string    `gorm:"not null;index"`
}
Tweet extends a WebResource which is the item of a TwitterSource.
type TwitterSource ¶ added in v0.3.0
type TwitterSource struct {
	Model
	DeletedAt gorm.DeletedAt `gorm:"index"`

	Type TwitterSourceType `gorm:"not null;index:idx_twitter_source_type_text,unique"`
	// Text is either a username or a search term, depending on the Type.
	Text string `gorm:"not null;index:idx_twitter_source_type_text,unique"`
	// The system will look for new tweets from this source only when it is
	// Enabled. Otherwise, the twitter source is simply ignored.
	Enabled bool `gorm:"not null;index"`
	// The date and time when this source was last visited to successfully
	// retrieve its content (tweets), store it, and schedule further
	// processing jobs.
	LastRetrievedAt sql.NullTime `gorm:"index"`
	// Counter of consecutive fetching failures.
	FailuresCount int `gorm:"not null;default:0"`
	// When FailuresCount is not 0, this field should contain the error message
	// that caused the last failure. It is mostly useful for manual inspection.
	LastError sql.NullString

	// Tweets is the has-many relation with Tweet models.
	Tweets []Tweet `gorm:"constraint:OnDelete:CASCADE"`
}
TwitterSource represents the source of Tweets / WebResources.
type TwitterSourceType ¶
type TwitterSourceType string
TwitterSourceType acts as an enumeration type to identify different kinds of Twitter sources.
const (
	// UserTwitterSource identifies a Twitter source linked to a user profile.
	UserTwitterSource TwitterSourceType = "user"
	// SearchTwitterSource identifies a Twitter source linked to a terms search.
	SearchTwitterSource TwitterSourceType = "search"
)
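Because TwitterSourceType is a plain string type, any string can be converted to it, so callers that accept external input may want to check values against the two constants. The sketch below redeclares the type and constants locally to stay self-contained; isValid is a hypothetical helper and is not part of the documented API.

```go
package main

import "fmt"

// TwitterSourceType and its constants are redeclared here only to keep the
// example self-contained.
type TwitterSourceType string

const (
	UserTwitterSource   TwitterSourceType = "user"
	SearchTwitterSource TwitterSourceType = "search"
)

// isValid reports whether t is one of the known source types.
func isValid(t TwitterSourceType) bool {
	switch t {
	case UserTwitterSource, SearchTwitterSource:
		return true
	}
	return false
}

func main() {
	for _, raw := range []string{"user", "search", "hashtag"} {
		fmt.Println(raw, isValid(TwitterSourceType(raw)))
	}
}
```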
type Vector ¶
type Vector struct {
	Model

	// Association to the WebArticle this vector belongs to.
	WebArticleID uint `gorm:"not null;uniqueIndex"`

	Data *pgtype.Float4Array `gorm:"type:float4[];not null"`
}
Vector is a vector representation of a WebArticle.
func (Vector) DataAsFloat32Slice ¶
DataAsFloat32Slice converts Vector.Data to []float32.
type WebArticle ¶
type WebArticle struct {
	Model

	// WebResourceID allows the has-one relation with a WebResource.
	WebResourceID uint `gorm:"not null;uniqueIndex"`

	Title               string `gorm:"not null;index"`
	TopImage            sql.NullString
	ScrapedPublishDate  sql.NullTime
	Language            string    `gorm:"not null"`
	PublishDate         time.Time `gorm:"not null"`
	TranslatedTitle     sql.NullString
	TranslationLanguage sql.NullString
	CountryCode         sql.NullString

	// A WebArticle has many models.ZeroShotClass models.
	ZeroShotClasses []ZeroShotClass `gorm:"constraint:OnDelete:CASCADE"`
	// A WebArticle has many models.TextClass models.
	TextClasses []TextClass `gorm:"constraint:OnDelete:CASCADE"`
	// A WebArticle has many models.ExtractedInfo models.
	ExtractedInfos []ExtractedInfo `gorm:"constraint:OnDelete:CASCADE"`
	// A WebArticle has one Vector.
	Vector *Vector `gorm:"constraint:OnDelete:CASCADE"`
	// A WebArticle has one SimilarityInfo.
	SimilarityInfo *SimilarityInfo `gorm:"constraint:OnDelete:CASCADE"`
}
WebArticle represents the scraped content of a WebResource.
type WebResource ¶
type WebResource struct {
	Model

	// The unique URL of the web resource.
	URL string `gorm:"not null;uniqueIndex"`

	// A WebArticle extends the WebResource with the scraped content.
	WebArticle *WebArticle `gorm:"constraint:OnDelete:CASCADE"`
	// FeedItem allows the has-one relation with a models.FeedItem.
	FeedItem *FeedItem `gorm:"constraint:OnDelete:CASCADE"`
	// GDELTEvent allows the has-one relation with a models.GDELTEvent.
	GDELTEvent *GDELTEvent `gorm:"constraint:OnDelete:CASCADE"`
	// Tweet allows the has-one relation with a models.Tweet.
	Tweet *Tweet `gorm:"constraint:OnDelete:CASCADE"`
}
WebResource represents a web resource, usually a web page, accessible via a URL.
type ZeroShotClass ¶
type ZeroShotClass struct {
	Model

	// Association to the WebArticle this class belongs to.
	WebArticleID uint `gorm:"not null;index;index:idx_web_article_id_template_id_best,unique,where:best;index:idx_web_article_id_label_id,unique"`
	// Association to the ZeroShotHypothesisLabel.
	ZeroShotHypothesisLabelID uint `gorm:"not null;index;index:idx_web_article_id_label_id,unique"`
	// Association to the ZeroShotHypothesisTemplate.
	ZeroShotHypothesisTemplateID uint `gorm:"not null;index;index:idx_web_article_id_template_id_best,unique,where:best"`

	// Reports whether this prediction is the best item among the labels of
	// the associated template.
	Best       bool    `gorm:"not null;index:idx_web_article_id_template_id_best,unique,where:best"`
	Confidence float32 `gorm:"not null"`
}
ZeroShotClass is a single classification result class for a WebArticle, predicted with the spaGO BART zero-shot classification service.
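The partial unique index on Best allows at most one best prediction per article and template. One way to satisfy that constraint before saving is sketched below for the predictions of a single article. It assumes that "best" means the highest Confidence among the labels of a template, which the documentation does not state explicitly; prediction and markBest are hypothetical names.

```go
package main

import "fmt"

// prediction is a reduced, hypothetical stand-in for ZeroShotClass, holding
// the predictions of a single article.
type prediction struct {
	TemplateID uint
	LabelID    uint
	Confidence float32
	Best       bool
}

// markBest sets Best on exactly one prediction per template (the one with
// the highest confidence, the first one on ties) and clears it elsewhere.
func markBest(preds []prediction) {
	best := map[uint]int{} // template ID -> index of the best prediction
	for i := range preds {
		preds[i].Best = false
		j, seen := best[preds[i].TemplateID]
		if !seen || preds[i].Confidence > preds[j].Confidence {
			best[preds[i].TemplateID] = i
		}
	}
	for _, i := range best {
		preds[i].Best = true
	}
}

func main() {
	preds := []prediction{
		{TemplateID: 1, LabelID: 10, Confidence: 0.2},
		{TemplateID: 1, LabelID: 11, Confidence: 0.9},
		{TemplateID: 2, LabelID: 20, Confidence: 0.5},
	}
	markBest(preds)
	for _, p := range preds {
		fmt.Println(p.TemplateID, p.LabelID, p.Best)
	}
}
```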
type ZeroShotHypothesisLabel ¶
type ZeroShotHypothesisLabel struct {
	Model
	DeletedAt gorm.DeletedAt `gorm:"index"`

	// Association to the ZeroShotHypothesisTemplate.
	ZeroShotHypothesisTemplateID uint `gorm:"not null;index;index:idx_hypothesis_id_text,unique"`
	// The system will ignore the labels which are not Enabled.
	Enabled bool `gorm:"not null;index"`
	// Text is the label to be replaced in the hypothesis text.
	Text string `gorm:"not null;index:idx_hypothesis_id_text,unique"`
}
ZeroShotHypothesisLabel is one possible label to be replaced in the text of a ZeroShotHypothesisTemplate.
type ZeroShotHypothesisTemplate ¶
type ZeroShotHypothesisTemplate struct {
	Model
	DeletedAt gorm.DeletedAt `gorm:"index"`

	// The system will ignore the templates which are not Enabled.
	Enabled bool `gorm:"not null;index"`
	// Text is the hypothesis. It MUST contain one character sequence "{}" to
	// indicate the point where each related label will be placed.
	Text string `gorm:"not null"`
	// MultiClass indicates whether the zero-shot classification is multi-class
	// (true) or single-class (false).
	MultiClass bool `gorm:"not null"`

	// Labels are the possible items to be replaced in the Text.
	Labels []ZeroShotHypothesisLabel `gorm:"constraint:OnDelete:CASCADE"`
}
ZeroShotHypothesisTemplate represents the template for one hypothesis used for BART zero-shot classification of WebArticles.
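The "{}" placeholder rule on Text can be illustrated with plain string substitution. In this sketch buildHypotheses is a hypothetical helper, and reading "one character sequence" as "exactly one occurrence" is an assumption; the actual classification service may validate or expand templates differently.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholder marks the point of the template where a label is inserted.
const placeholder = "{}"

// buildHypotheses expands a template into one hypothesis per label. It
// rejects templates that do not contain exactly one placeholder.
func buildHypotheses(template string, labels []string) ([]string, error) {
	if strings.Count(template, placeholder) != 1 {
		return nil, fmt.Errorf("template %q must contain exactly one %q", template, placeholder)
	}
	hypotheses := make([]string, len(labels))
	for i, label := range labels {
		hypotheses[i] = strings.Replace(template, placeholder, label, 1)
	}
	return hypotheses, nil
}

func main() {
	hypotheses, err := buildHypotheses("This text is about {}.", []string{"sports", "politics"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, h := range hypotheses {
		fmt.Println(h)
	}
}
```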