Documentation ¶
Index ¶
- func ItNextOrNil(it Iterator) (element.Element, error)
- func ItPreviousOrNil(it Iterator) (element.Element, error)
- func TraverseNode(visitor NodeVisitor, root *html.Node)
- type Direction
- type ElementInDirection
- type ElementWithDelta
- type HTMLContentIterator
- type HTMLConverter
- type IndexedIterator
- type Iterator
- type NodeVisitor
- type ParsedElements
- type PublicationContentIterator
- type ResourceContentIteratorFactory
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ItNextOrNil ¶
Moves to the next item and returns it, or nil if we reached the end.
func ItPreviousOrNil ¶
Moves to the previous item and returns it, or nil if we reached the beginning.
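The helper pattern described above can be sketched as follows. This is a minimal illustration, not the package's actual source: `Element`, `textElement`, `sliceIterator` and `nextOrNil` are hypothetical stand-ins, and only the forward half of the `Iterator` interface is modelled.

```go
package main

import "fmt"

// Element is a hypothetical stand-in for element.Element.
type Element interface{ Text() string }

// textElement is a hypothetical Element holding plain text.
type textElement string

func (t textElement) Text() string { return string(t) }

// Iterator mirrors only the forward half of the documented interface.
type Iterator interface {
	HasNext() (bool, error)
	Next() Element
}

// sliceIterator is a hypothetical in-memory iterator used for the demo.
type sliceIterator struct {
	items []Element
	pos   int
}

func (s *sliceIterator) HasNext() (bool, error) { return s.pos < len(s.items), nil }

func (s *sliceIterator) Next() Element {
	e := s.items[s.pos]
	s.pos++
	return e
}

// nextOrNil folds the HasNext/Next pair into one call:
// it returns the next element, or nil once the end is reached.
func nextOrNil(it Iterator) (Element, error) {
	ok, err := it.HasNext()
	if err != nil || !ok {
		return nil, err
	}
	return it.Next(), nil
}

func main() {
	it := &sliceIterator{items: []Element{textElement("a"), textElement("b")}}
	for {
		el, err := nextOrNil(it)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		if el == nil {
			break // reached the end
		}
		fmt.Println(el.Text())
	}
}
```

A backward helper would presumably follow the same shape with `HasPrevious` and `Previous`.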
func TraverseNode ¶
func TraverseNode(visitor NodeVisitor, root *html.Node)
Starts a depth-first traversal of the root and all of its descendants. This implementation does not use recursion, so a deep DOM does not risk overflowing the stack.
Adapted from JSoup: https://github.com/jhy/jsoup/blob/1762412a28fa7b08ccf71d93fc4c98dc73086e03/src/main/java/org/jsoup/select/NodeTraversor.java#L20
NOTE: Unlike the JSoup implementation, any implementor of NodeVisitor is expected to be read-only, which simplifies the implementation.
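To illustrate the non-recursive traversal, here is a rough sketch in the style of JSoup's NodeTraversor. It runs on a hypothetical `node` type with parent, first-child and next-sibling links instead of `*html.Node`, so that it stays self-contained. The `visitor` interface, its `Tail` callback, and the `recorder` type are assumptions made for the example; only `Head` appears in this package's documentation.

```go
package main

import "fmt"

// node is a hypothetical stand-in for *html.Node with the same link layout.
type node struct {
	name                            string
	parent, firstChild, nextSibling *node
}

// add links the given children under n and returns n.
func (n *node) add(children ...*node) *node {
	var last *node
	for _, c := range children {
		c.parent = n
		if last == nil {
			n.firstChild = c
		} else {
			last.nextSibling = c
		}
		last = c
	}
	return n
}

// visitor is an assumed shape for a read-only node visitor.
type visitor interface {
	Head(n *node, depth int) // called when a node is first reached
	Tail(n *node, depth int) // called after all of its descendants
}

// traverse walks root depth-first using only pointer links, no recursion.
func traverse(v visitor, root *node) {
	n, depth := root, 0
	for n != nil {
		v.Head(n, depth)
		if n.firstChild != nil {
			n = n.firstChild // descend
			depth++
			continue
		}
		// No children: climb until a node with a next sibling is found.
		for n.nextSibling == nil && depth > 0 {
			v.Tail(n, depth)
			n = n.parent
			depth--
		}
		v.Tail(n, depth)
		if n == root {
			break
		}
		n = n.nextSibling
	}
}

// recorder is a hypothetical visitor that logs the order of Head calls.
type recorder struct{ heads []string }

func (r *recorder) Head(n *node, depth int) {
	r.heads = append(r.heads, fmt.Sprintf("%s@%d", n.name, depth))
}

func (r *recorder) Tail(n *node, depth int) {}

func main() {
	root := (&node{name: "html"}).add(
		(&node{name: "body"}).add(&node{name: "p"}),
		&node{name: "footer"},
	)
	r := &recorder{}
	traverse(r, root)
	fmt.Println(r.heads)
}
```

Because the walk relies on parent and sibling links staying stable, a visitor that mutated the tree mid-traversal could derail it, which is one plausible reason for the read-only expectation noted above.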
Types ¶
type ElementInDirection ¶
[Element] loaded with [hasPrevious] or [hasNext], associated with the move direction.
type ElementWithDelta ¶
[Element] loaded with [hasPrevious] or [hasNext], associated with the move delta.
type HTMLContentIterator ¶
type HTMLContentIterator struct {
	BeforeMaxLength int // Locators will contain a `before` context of up to this many characters.
	// contains filtered or unexported fields
}
func NewHTML ¶
func NewHTML(resource fetcher.Resource, locator manifest.Locator) *HTMLContentIterator
Iterates an HTML [resource], starting from the given [locator]. If you want to start mid-resource, the [locator] must contain a `cssSelector` key in its [Locator.Locations] object. If you want to start from the end of the resource, the [locator] must have a `progression` of 1.0.
func (*HTMLContentIterator) HasNext ¶
func (it *HTMLContentIterator) HasNext() (bool, error)
func (*HTMLContentIterator) HasPrevious ¶
func (it *HTMLContentIterator) HasPrevious() (bool, error)
func (*HTMLContentIterator) Next ¶
func (it *HTMLContentIterator) Next() element.Element
func (*HTMLContentIterator) Previous ¶
func (it *HTMLContentIterator) Previous() element.Element
type HTMLConverter ¶
type HTMLConverter struct {
// contains filtered or unexported fields
}
Note that this type is based on JSoup's NodeVisitor and NodeTraversor classes:
https://jsoup.org/apidocs/org/jsoup/select/NodeVisitor.html
https://jsoup.org/apidocs/org/jsoup/select/NodeTraversor.html
func (*HTMLConverter) Head ¶
func (c *HTMLConverter) Head(n *html.Node, depth int)
Implements NodeVisitor
func (*HTMLConverter) Result ¶
func (c *HTMLConverter) Result() ParsedElements
type IndexedIterator ¶
type IndexedIterator struct {
// contains filtered or unexported fields
}
Iterator for a resource, associated with its [index] in the reading order.
func (*IndexedIterator) NextContentIn ¶
func (it *IndexedIterator) NextContentIn(direction Direction) (element.Element, error)
type Iterator ¶
type Iterator interface {
	HasNext() (bool, error)     // Returns true if the iterator has a next element
	Next() element.Element      // Retrieves the element computed by a preceding call to [hasNext]. Panics if [hasNext] was not invoked.
	HasPrevious() (bool, error) // Returns true if the iterator has a previous element
	Previous() element.Element  // Retrieves the element computed by a preceding call to [hasPrevious]. Panics if [hasPrevious] was not invoked.
}
Iterates through a list of [Element] items asynchronously. [hasNext] and [hasPrevious] refer to the last element computed by a previous call to either method.
TODO: This is based on a Kotlin iterator; maybe it can be made more idiomatic for Go.
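The contract above (a `Has*` call computes the element, the matching getter retrieves it, and a getter without its `Has*` call panics) can be modelled with a small in-memory iterator. This is a sketch under assumptions, not the package's implementation: `Element` is a plain string stand-in, `sliceIterator` is hypothetical, and advancing the cursor on every successful `HasNext` or `HasPrevious` is one possible reading of the contract.

```go
package main

import "fmt"

// Element is a hypothetical stand-in for element.Element.
type Element string

type direction int

const (
	none direction = iota
	forward
	backward
)

// sliceIterator is a hypothetical iterator over an in-memory slice.
type sliceIterator struct {
	items  []Element
	index  int       // index of the last computed element; -1 before the first
	loaded direction // which Has* call computed the pending element
}

func newSliceIterator(items ...Element) *sliceIterator {
	return &sliceIterator{items: items, index: -1}
}

// move tries to shift the cursor by delta and records the direction.
func (s *sliceIterator) move(delta int, dir direction) (bool, error) {
	i := s.index + delta
	if i < 0 || i >= len(s.items) {
		s.loaded = none
		return false, nil
	}
	s.index, s.loaded = i, dir
	return true, nil
}

func (s *sliceIterator) HasNext() (bool, error)     { return s.move(+1, forward) }
func (s *sliceIterator) HasPrevious() (bool, error) { return s.move(-1, backward) }

// Next returns the element computed by HasNext, panicking otherwise.
func (s *sliceIterator) Next() Element {
	if s.loaded != forward {
		panic("Next called without a successful HasNext")
	}
	s.loaded = none
	return s.items[s.index]
}

// Previous returns the element computed by HasPrevious, panicking otherwise.
func (s *sliceIterator) Previous() Element {
	if s.loaded != backward {
		panic("Previous called without a successful HasPrevious")
	}
	s.loaded = none
	return s.items[s.index]
}

func main() {
	it := newSliceIterator("a", "b", "c")
	for {
		ok, err := it.HasNext()
		if err != nil || !ok {
			break
		}
		fmt.Println(it.Next())
	}
}
```

In this model the cursor tracks the last computed element, so calling `HasPrevious` after reaching the end yields the element before the last one returned.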
type NodeVisitor ¶
type ParsedElements ¶
Holds the result of parsing the HTML resource into a list of element.Element. The [startIndex] will be calculated from the element matched by the base [locator], if possible. Defaults to 0.
type PublicationContentIterator ¶
type PublicationContentIterator struct {
// contains filtered or unexported fields
}
func NewPublicationContent ¶
func NewPublicationContent(manifest manifest.Manifest, fetcher fetcher.Fetcher, startLocator *manifest.Locator, resourceContentIteratorFactories []ResourceContentIteratorFactory) *PublicationContentIterator
TODO maybe wrap manifest/fetcher in something that doesn't depend on pub package
func (*PublicationContentIterator) HasNext ¶
func (it *PublicationContentIterator) HasNext() (bool, error)
func (*PublicationContentIterator) HasPrevious ¶
func (it *PublicationContentIterator) HasPrevious() (bool, error)
func (*PublicationContentIterator) Next ¶
func (it *PublicationContentIterator) Next() element.Element
func (*PublicationContentIterator) Previous ¶
func (it *PublicationContentIterator) Previous() element.Element
type ResourceContentIteratorFactory ¶
func HTMLFactory ¶
func HTMLFactory() ResourceContentIteratorFactory