Documentation
Index
- Constants
- type ArticleItem
- type BaseThingItem
- type ImageItem
- type OrganizationItem
- type Parser
- func (ps *Parser) Article() *data.MarkupArticle
- func (ps *Parser) Author() string
- func (ps *Parser) Copyright() string
- func (ps *Parser) Description() string
- func (ps *Parser) Images() []data.MarkupImage
- func (ps *Parser) OptOut() bool
- func (ps *Parser) Publisher() string
- func (ps *Parser) Title() string
- func (ps *Parser) Type() string
- func (ps *Parser) URL() string
- type PersonItem
- type SchemaType
- type ThingItem
- type UnsupportedItem
Constants
const (
	NameProp            = "name"
	URLProp             = "url"
	DescriptionProp     = "description"
	ImageProp           = "image"
	HeadlineProp        = "headline"
	PublisherProp       = "publisher"
	CopyrightHolderProp = "copyrightHolder"
	CopyrightYearProp   = "copyrightYear"
	ContentURLProp      = "contentUrl"
	EncodingFormatProp  = "encodingFormat"
	CaptionProp         = "caption"
	RepresentativeProp  = "representativeOfPage"
	WidthProp           = "width"
	HeightProp          = "height"
	DatePublishedProp   = "datePublished"
	DateModifiedProp    = "dateModified"
	AuthorProp          = "author"
	CreatorProp         = "creator"
	SectionProp         = "articleSection"
	AssociatedMediaProp = "associatedMedia"
	EncodingProp        = "encoding"
	FamilyNameProp      = "familyName"
	GivenNameProp       = "givenName"
	LegalNameProp       = "legalName"
	AuthorRel           = "author"
)
Variables
This section is empty.
Functions
This section is empty.
Types
type ArticleItem
type ArticleItem struct {
BaseThingItem
}
func NewArticleItem
func NewArticleItem(element *html.Node) *ArticleItem
type BaseThingItem
type BaseThingItem struct {
// contains filtered or unexported fields
}
type OrganizationItem
type OrganizationItem struct {
BaseThingItem
}
func NewOrganizationItem
func NewOrganizationItem(element *html.Node) *OrganizationItem
type Parser
type Parser struct {
// contains filtered or unexported fields
}
Parser recognizes and parses schema.org markup tags, and returns the properties that matter to distilled content. Schema.org markup (http://schema.org) is based on the microdata format (http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html).
For the basic Schema.org Thing type, the basic properties are: name, url, description, image. In addition, for each type that we support, we also parse more specific properties:
- Article: headline (i.e. title), publisher, copyright year, copyright holder, date published, date modified, author, article section
- ImageObject: headline (i.e. title), publisher, copyright year, copyright holder, content url, encoding format, caption, representative of page, width, height
- Person: family name, given name
- Organization: legal name
The value of a schema.org property can itself be a schema.org type, i.e. an embedded item. For example, the author or publisher of an article, or the publisher of an image, could be a schema.org Person or Organization; this is the reason Person and Organization types are supported.
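To make the embedded-item case concrete, here is a minimal sketch of how microdata attributes (itemscope, itemtype, itemprop) could be walked so that an embedded Person ends up attached to its Article instead of being flattened to text. It is not this package's implementation: the item type and the attr, walk and parse helpers are hypothetical, the sample markup is invented, and encoding/xml is swapped in for an HTML parser so the sketch runs with the standard library only (which means it only handles well-formed input).

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// sample is a hypothetical page fragment: an Article whose author
// property is an embedded Person item.
const sample = `<div itemscope="" itemtype="http://schema.org/Article">
  <h1 itemprop="headline">Hello</h1>
  <div itemprop="author" itemscope="" itemtype="http://schema.org/Person">
    <span itemprop="givenName">Ada</span>
    <span itemprop="familyName">Lovelace</span>
  </div>
</div>`

// item is a simplified, hypothetical stand-in for a parsed microdata item.
type item struct {
	itemType string            // value of the itemtype attribute
	props    map[string]string // itemprop name -> text value
	children map[string]*item  // itemprop name -> embedded item
}

// attr returns the value of the named attribute and whether it is present.
func attr(se xml.StartElement, name string) (string, bool) {
	for _, a := range se.Attr {
		if a.Name.Local == name {
			return a.Value, true
		}
	}
	return "", false
}

// walk consumes tokens up to the end of element se. cur is the item that
// encloses se (nil at top level). It returns the element's trimmed text.
func walk(dec *xml.Decoder, se xml.StartElement, cur *item, roots *[]*item) (string, error) {
	prop, hasProp := attr(se, "itemprop")
	scope := cur
	if _, ok := attr(se, "itemscope"); ok {
		typ, _ := attr(se, "itemtype")
		scope = &item{itemType: typ, props: map[string]string{}, children: map[string]*item{}}
		if hasProp && cur != nil {
			cur.children[prop] = scope // embedded item is the property value
		} else {
			*roots = append(*roots, scope)
		}
		hasProp = false // do not also record the item as a text property
	}
	var text strings.Builder
	for {
		tok, err := dec.Token()
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			inner, err := walk(dec, t, scope, roots)
			if err != nil {
				return "", err
			}
			text.WriteString(inner)
		case xml.CharData:
			text.Write(t)
		case xml.EndElement:
			s := strings.TrimSpace(text.String())
			if hasProp && cur != nil {
				cur.props[prop] = s
			}
			return s, nil
		}
	}
}

// parse returns the top-level items found in src.
func parse(src string) ([]*item, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	var roots []*item
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return roots, nil
		}
		if err != nil {
			return nil, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			if _, err := walk(dec, se, nil, &roots); err != nil {
				return nil, err
			}
		}
	}
}

func main() {
	items, err := parse(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	article := items[0]
	author := article.children["author"]
	fmt.Println(article.itemType, article.props["headline"])
	fmt.Println(author.itemType, author.props["givenName"], author.props["familyName"])
}
```

Keeping embedded items in a separate map from text properties is only one possible design; it lets a caller decide later whether to render an author from a Person's given and family names or from a plain string.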
func (*Parser) Article
func (ps *Parser) Article() *data.MarkupArticle
func (*Parser) Description
func (ps *Parser) Description() string
func (*Parser) Images
func (ps *Parser) Images() []data.MarkupImage
type PersonItem
type PersonItem struct {
BaseThingItem
}
func NewPersonItem
func NewPersonItem(element *html.Node) *PersonItem
type SchemaType
type SchemaType uint
const (
	Unsupported SchemaType = iota
	Image
	Article
	Person
	Organization
)
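As an illustration of how this enumeration might be used, the sketch below maps an itemtype URL to a SchemaType. The schemaTypeOf helper and the byItemType table are hypothetical and not part of this package; the SchemaType declaration is repeated only so the example compiles on its own, and the table assumes the http://schema.org/ form of the type URLs.

```go
package main

import "fmt"

// SchemaType and its values are repeated from the declaration above
// so that this sketch is self-contained.
type SchemaType uint

const (
	Unsupported SchemaType = iota
	Image
	Article
	Person
	Organization
)

// byItemType is a hypothetical lookup table from itemtype URL to SchemaType.
var byItemType = map[string]SchemaType{
	"http://schema.org/ImageObject":  Image,
	"http://schema.org/Article":      Article,
	"http://schema.org/Person":       Person,
	"http://schema.org/Organization": Organization,
}

// schemaTypeOf is a hypothetical helper. A missing map key yields the
// zero value, which is Unsupported because it is declared first.
func schemaTypeOf(itemType string) SchemaType {
	return byItemType[itemType]
}

func main() {
	fmt.Println(schemaTypeOf("http://schema.org/Article") == Article)
	fmt.Println(schemaTypeOf("http://schema.org/Recipe") == Unsupported)
}
```

Because Unsupported is the zero value of SchemaType, an unrecognized itemtype needs no special case in this sketch.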
type UnsupportedItem
type UnsupportedItem struct {
BaseThingItem
}
func NewUnsupportedItem
func NewUnsupportedItem(element *html.Node) *UnsupportedItem