04-link

command

Version: v0.0.0-...-56ad08b, Published: Feb 28, 2024, License: MIT, Imports: 4, Imported by: 0


Repository

github.com/parsiya/Parsia-Code


README

Gophercises - 4 - Link

Problem

Solutions

- link: the link package.
- main: uses the link package to extract links from HTML.

Lessons Learned

/x/net/html

Read the package example: https://godoc.org/golang.org/x/net/html

Token struct:

type Token struct {
    Type     TokenType
    DataAtom atom.Atom
    Data     string
    Attr     []Attribute
}

Type tells us what kind of token it is. The important ones for this exercise are:
- StartTagToken: <a href>
- EndTagToken: </a>
- TextToken: the text in between. Collecting only text tokens skips the tags of other elements inside the link but keeps their text.
Data holds the content of the token:
- Anchor tags: the tag name, a.
- Text nodes: the actual text of the node.

Attribute is of type:

type Attribute struct {
    Namespace, Key, Val string
}

Key is the name of the attribute and Val is its value.
- <a href="example.net">: key = href and value = example.net.
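
Looking up href is a linear scan over the Attr slice. The sketch below shows one way to do it. To run without the x/net/html dependency it declares a local Attribute type copied from the struct above, and attrValue is a hypothetical helper name, not part of the package.

```go
package main

import "fmt"

// Attribute is a local copy of the html.Attribute struct shown above,
// declared here only so the sketch compiles with the standard library.
type Attribute struct {
	Namespace, Key, Val string
}

// attrValue is a hypothetical helper. It returns the value of the first
// attribute whose Key matches key, and false if there is no such attribute.
func attrValue(attrs []Attribute, key string) (string, bool) {
	for _, a := range attrs {
		if a.Key == key {
			return a.Val, true
		}
	}
	return "", false
}

func main() {
	// Attributes of <a class="external" href="example.net">.
	attrs := []Attribute{
		{Key: "class", Val: "external"},
		{Key: "href", Val: "example.net"},
	}
	if href, ok := attrValue(attrs, "href"); ok {
		fmt.Println(href) // prints example.net
	}
}
```

Returning a second bool keeps a missing href distinguishable from an empty one (<a href="">).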

Parse

Parsing is a single pass with one flag.

Go through the nodes. When you reach a start anchor tag, set the capturing flag and store the href.
While capturing, add the text of every text node (trim all white space, but add a space between nodes).
When you reach the end anchor tag, stop capturing and build the link from the stored href and the collected text.
Add the link to the links slice.
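
The steps above can be sketched roughly as follows. This is not the code from the link package. It assumes the input is already a flat slice of tokens, and Token, TokenType, Attribute, Link and extractLinks are simplified stand-ins declared locally so the example runs with only the standard library. Nested anchors are not treated specially here.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenType, Attribute and Token are simplified local stand-ins for the
// x/net/html types described above; they are not the real definitions.
type TokenType int

const (
	TextToken TokenType = iota
	StartTagToken
	EndTagToken
)

type Attribute struct{ Namespace, Key, Val string }

type Token struct {
	Type TokenType
	Data string
	Attr []Attribute
}

// Link is a hypothetical result type: the href and the visible text.
type Link struct {
	Href string
	Text string
}

// extractLinks walks the tokens once, using a capturing flag to collect
// the text between a start anchor tag and the next end anchor tag.
func extractLinks(tokens []Token) []Link {
	var (
		links     []Link
		parts     []string
		href      string
		capturing bool
	)
	for _, t := range tokens {
		switch {
		case t.Type == StartTagToken && t.Data == "a" && !capturing:
			// Start capturing and remember the href, if there is one.
			capturing, href, parts = true, "", parts[:0]
			for _, a := range t.Attr {
				if a.Key == "href" {
					href = a.Val
					break
				}
			}
		case t.Type == TextToken && capturing:
			// Trim white space and skip text that is only white space.
			if s := strings.TrimSpace(t.Data); s != "" {
				parts = append(parts, s)
			}
		case t.Type == EndTagToken && t.Data == "a" && capturing:
			// Stop capturing and store the link.
			links = append(links, Link{Href: href, Text: strings.Join(parts, " ")})
			capturing = false
		}
	}
	return links
}

func main() {
	// Tokens for: <a href="/dog"><span>Something in a span</span> Text not in a span</a>
	tokens := []Token{
		{Type: StartTagToken, Data: "a", Attr: []Attribute{{Key: "href", Val: "/dog"}}},
		{Type: StartTagToken, Data: "span"},
		{Type: TextToken, Data: "Something in a span"},
		{Type: EndTagToken, Data: "span"},
		{Type: TextToken, Data: " Text not in a span"},
		{Type: EndTagToken, Data: "a"},
	}
	for _, l := range extractLinks(tokens) {
		fmt.Printf("%s -> %q\n", l.Href, l.Text)
	}
	// prints: /dog -> "Something in a span Text not in a span"
}
```

The span tags match none of the cases, so they are skipped while their text is still collected.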

Issues:

Nested links are ignored. Child links are not stored and their text is stored as part of the parent link.
- For an example, run go run main.go -f ex5.html.

strings.Builder

Example: https://golang.org/pkg/strings/#example_Builder

var sb strings.Builder  // Create the builder; the zero value is ready to use.
sb.WriteString("whatever")  // Append to it. The argument can also be the result of fmt.Sprintf.
return sb.String()  // Get the final string.
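
As a fuller, runnable illustration, here is a sketch of how a builder could join the text pieces of a link, trimming each piece and putting a single space between them. joinText is a hypothetical helper written for this example, not a function from the link package.

```go
package main

import (
	"fmt"
	"strings"
)

// joinText is a hypothetical helper. It trims every piece, drops pieces
// that are empty after trimming, and joins the rest with single spaces.
func joinText(pieces []string) string {
	var sb strings.Builder
	for _, p := range pieces {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		// Add the separator only between pieces, never at the start.
		if sb.Len() > 0 {
			sb.WriteByte(' ')
		}
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	fmt.Printf("%q\n", joinText([]string{"  Check me out ", "\n", "on GitHub  "}))
	// prints: "Check me out on GitHub"
}
```

Checking sb.Len() before writing the separator avoids a leading space without a final trim.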

Documentation

There is no documentation for this package.

Source Files

main.go

Directories

link
