browser

package module
v0.1.1
Warning

This package is not in the latest version of its module.

Published: Feb 4, 2025 License: MIT Imports: 6 Imported by: 0

README

Gost-DOM - A headless browser for Go

The Go-to headless browser for TDD workflows.

// Create a browser connected directly to the application's root http.Handler.
browser := NewFromHandler(pkg.RootHttpHandler)
window, err := browser.Open("http://example.com/example") // host is ignored
Expect(err).ToNot(HaveOccurred())
doc := window.Document()
button := doc.QuerySelector("button")
targetArea := doc.GetElementById("target-area")
// Click the button and verify that the page updated the counter each time.
button.Click()
Expect(targetArea).To(HaveTextContent("Click count: 1"))
button.Click()
Expect(targetArea).To(HaveTextContent("Click count: 2"))

Gost-DOM downloads and executes client-side scripts, making it an ideal choice for building applications on a Go/HTMX stack.

Because it is written in Go, you can connect directly to an http.Handler, bypassing the overhead of a TCP connection as well as the burden of managing ports and connections.

This greatly simplifies replacing dependencies during testing, as you can treat your HTTP server as a normal Go component.

[!NOTE]

This is still an early pre-release. Minimal functionality exists for some basic flows, but only just.

Feature list

[!IMPORTANT]

This package currently requires module replacement, check out the Installation section.

[!WARNING]

The API is not yet stable. Use at your own risk.

Looking for sponsors

If this tool could reach a minimum level of usability, this would be extremely valuable in testing Go web applications, particularly combined with HTMX, a tech combination which is becoming increasingly popular.

Progress so far is the result of too much spare time, but that will not last. If enough people would sponsor this project, it could mean the difference between continued development and death.

Looking for contributors

This is a massive undertaking, and I would love people to join in.

Particularly if you have experience building actual browsers (I'm not talking about skinning Chromium here, but about implementing an actual rendering engine).

Join the "community"

Installation

After go getting github.com/gost-dom/browser, you need to replace some modules:

go mod edit -replace="github.com/ericchiang/css=github.com/gost-dom/css@latest"
go mod edit -replace="github.com/tommie/v8go=github.com/stroiman/v8go@go-dom-support"
go mod tidy

The CSS replacement is just a simple fix that lets the CSS selector parser accept tag name patterns with upper-case tag names, which HTMX produces. I've filed a PR, so hopefully this will get merged into the source repo soon.

For the v8go project, I've added a lot of V8 features that were missing in v8go. I'm working with tommie, who runs the best maintained branch, but it may be a while before they are all merged.

[!NOTE]

New features will probably be added to my branch, requiring the replacement to be updated. If you get build errors that look v8-ish, try running the replacement again. Tip: Create a shell script for this.

Project background

Go and HTMX are gaining popularity as a stack.

Go has great tooling for verifying the requests and responses of HTTP applications, but for HTMX, or just client-side scripting with server-side rendering, you need browser automation to test the behaviour.

This introduces a significant overhead, not only from out-of-process communication with the browser, but also from the necessity of launching your server.

This overhead discourages a TDD loop.

The purpose of this project is to enable a fast TDD feedback loop for these types of projects, where verification depends on:

  • Behaviour of client-side scripts.
  • Browser behaviour when interacting with browser elements, e.g., clicking the submit button submits a form, and redirects are followed.

Unique features

Being written in Go, this library supports consuming an http.Handler directly. This removes the need to manage TCP ports and start a server on a real port. Your HTTP server is consumed by test code like any other Go component would be, which also allows you to replace dependencies for the test where applicable.

This also makes it easy to run parallel tests in isolation as each can create their own instance of the HTTP handler.
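
As a rough illustration of that isolation, the sketch below builds a fresh handler per test case, each with its own replaced dependency, and exercises them concurrently using only the standard library. The names Greeter, fakeGreeter, newRootHandler, get, and runIsolated are hypothetical and not part of this package; in a real test suite you would more likely use t.Parallel() than raw goroutines.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Greeter is a hypothetical dependency of the application under test.
type Greeter interface {
	Greet(name string) string
}

// fakeGreeter is a test double that replaces the real dependency.
type fakeGreeter struct{ prefix string }

func (g fakeGreeter) Greet(name string) string { return g.prefix + name }

// newRootHandler builds a fresh handler on every call, so instances share no state.
func newRootHandler(g Greeter) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/greet", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, g.Greet(r.URL.Query().Get("name")))
	})
	return mux
}

// get calls the handler in-process and returns the response body.
func get(h http.Handler, target string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Body.String()
}

// runIsolated exercises one independent handler instance per prefix, concurrently.
func runIsolated(prefixes []string) []string {
	results := make([]string, len(prefixes))
	var wg sync.WaitGroup
	for i, p := range prefixes {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			h := newRootHandler(fakeGreeter{prefix: p})
			results[i] = get(h, "/greet?name=gopher")
		}(i, p)
	}
	wg.Wait()
	return results
}

func main() {
	fmt.Println(runIsolated([]string{"Hello, ", "Hi, "}))
}
```

Because no port is bound and no global state is shared, the instances cannot interfere with each other.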

Drawbacks compared to browser automation

  • You cannot verify how the page looks; e.g., you cannot get a screenshot of a failing test, nor use such screenshots for snapshot tests.
  • The verification doesn't prove that the page works as intended in all browsers you want to support.

This isn't intended as a replacement for the cases where an end-to-end test is the right choice. It is intended as a tool to help when you want a smaller, isolated test, e.g., one that mocks out part of the behaviour.

Code structure

This is still in early development, and the structure may still change.

dom/ # Core DOM implementation
html/ # Window, HTMLDocument, HTMLElement
scripting/ # Client-side script support
v8host/ # v8 engine, and bindings
gojahost/ # goja javascript engine
browser.go # Main module

The folders dom and html correspond to the web APIs. The intention was to have a folder for each supported web API, but that may turn out to be impossible, as there are circular dependencies between some of the specs.

Modularisation

Although the code isn't modularised yet, the idea is that you should be able to include only the modules relevant to your app. E.g., if your app deals with location services, you can add a module implementing location services.

This helps keep the size of the dependencies down for client projects; keeping build times down for the TDD loop.

It also provides the option of alternate implementations. E.g., for location services, the simple implementation can provide a single function to set the current location / accuracy. The advanced implementation can replay a GPX track.
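
The idea of alternate implementations could look roughly like the sketch below: code depends on a small interface, and a test picks either a simple or an advanced implementation. Everything here is an assumption for illustration. Position, LocationProvider, StaticLocation, TrackReplay, and describe are hypothetical names that do not exist in this package, and parsing an actual GPX file is left out.

```go
package main

import "fmt"

// Position is a hypothetical value type for a location reading.
type Position struct {
	Lat, Lon, Accuracy float64
}

// LocationProvider is a hypothetical interface a location module could expose.
type LocationProvider interface {
	// Current reports the current position, or false if none is available.
	Current() (Position, bool)
}

// StaticLocation is the simple implementation: the test sets one position.
type StaticLocation struct {
	pos Position
	set bool
}

func (s *StaticLocation) Set(p Position)            { s.pos, s.set = p, true }
func (s *StaticLocation) Current() (Position, bool) { return s.pos, s.set }

// TrackReplay is the advanced implementation: it steps through recorded
// points (e.g., obtained from a GPX track) and stays on the last one.
type TrackReplay struct {
	points []Position
	next   int
}

func (t *TrackReplay) Current() (Position, bool) {
	if len(t.points) == 0 {
		return Position{}, false
	}
	p := t.points[t.next]
	if t.next < len(t.points)-1 {
		t.next++
	}
	return p, true
}

// describe depends only on the interface, not on a concrete implementation.
func describe(p LocationProvider) string {
	pos, ok := p.Current()
	if !ok {
		return "unknown"
	}
	return fmt.Sprintf("%.2f,%.2f", pos.Lat, pos.Lon)
}

func main() {
	static := &StaticLocation{}
	static.Set(Position{Lat: 1.5, Lon: 2.5, Accuracy: 10})
	fmt.Println(describe(static))

	replay := &TrackReplay{points: []Position{{Lat: 1, Lon: 2}, {Lat: 3, Lon: 4}}}
	fmt.Println(describe(replay), describe(replay), describe(replay))
}
```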

Project status

Currently, the most basic HTMX app is working: a simple click handler with swapping, boosted links, and forms (at least with text fields).

Memory Leaks

The current implementation is leaking memory for the scope of a browser Window. I.e., all DOM nodes created and deleted for the lifetime of the window will stay in memory until the window is actively disposed.

This is not a problem for the intended use case.

Why memory leaks

This codebase is a marriage between two garbage-collected runtimes, and what is conceptually one object is split into two: a Go object and a JavaScript wrapper. As long as one of them is reachable, so must the other be.

I could join them into one, but that would result in an undesired coupling: the DOM implementation would be coupled to the JavaScript execution engine. Eventually, a native Go JavaScript runtime will be supported.

A solution to this problem involves the use of weak references. These exist in Go as an internal package, and have been accepted as a feature.

For that reason, and because it's not a problem for the intended use case, I have postponed dealing with that issue.

Next up

The following are the main focus areas at the moment:

  • Handle redirect responses
  • Implement a proper event loop, with proper setTimeout, setInterval, and their clear-counterparts.
  • Implement fast-forwarding of time.
  • Replace early hand-written JS wrappers with auto-generated code, helping drive a more complete implementation.
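
To make the event loop and fast-forwarding items more concrete, here is a minimal sketch of a virtual clock: timeouts are queued against simulated time, and a test advances the clock instead of sleeping. This is only an illustration under assumptions, not the project's design. VirtualClock, SetTimeout, ClearTimeout, Advance, and demo are hypothetical Go names, and setInterval is omitted.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// task is a scheduled callback with the virtual time at which it is due.
type task struct {
	id  int
	due time.Duration
	fn  func()
}

// VirtualClock is a hypothetical clock whose time only moves when told to.
type VirtualClock struct {
	now    time.Duration
	nextID int
	tasks  []task
}

// SetTimeout schedules fn after delay and returns an id usable with ClearTimeout.
func (c *VirtualClock) SetTimeout(fn func(), delay time.Duration) int {
	c.nextID++
	c.tasks = append(c.tasks, task{id: c.nextID, due: c.now + delay, fn: fn})
	return c.nextID
}

// ClearTimeout removes a pending task; unknown ids are ignored.
func (c *VirtualClock) ClearTimeout(id int) {
	for i, t := range c.tasks {
		if t.id == id {
			c.tasks = append(c.tasks[:i], c.tasks[i+1:]...)
			return
		}
	}
}

// Advance fast-forwards virtual time by d, running due callbacks in order of
// due time. Callbacks may schedule further tasks, which also run if they fall
// inside the advanced window.
func (c *VirtualClock) Advance(d time.Duration) {
	target := c.now + d
	for {
		sort.SliceStable(c.tasks, func(i, j int) bool {
			return c.tasks[i].due < c.tasks[j].due
		})
		if len(c.tasks) == 0 || c.tasks[0].due > target {
			break
		}
		next := c.tasks[0]
		c.tasks = c.tasks[1:]
		c.now = next.due
		next.fn()
	}
	c.now = target
}

// demo schedules a few timeouts and returns the order in which they ran.
func demo() []string {
	var log []string
	clock := &VirtualClock{}
	clock.SetTimeout(func() { log = append(log, "b") }, 200*time.Millisecond)
	clock.SetTimeout(func() {
		log = append(log, "a")
		clock.SetTimeout(func() { log = append(log, "nested") }, 50*time.Millisecond)
	}, 100*time.Millisecond)
	cancelled := clock.SetTimeout(func() { log = append(log, "never") }, 150*time.Millisecond)
	clock.ClearTimeout(cancelled)
	clock.Advance(time.Second) // returns immediately; no real time passes
	return log
}

func main() {
	fmt.Println(demo()) // prints "[a nested b]"
}
```

With such a clock, a test that waits for a debounce or polling timer can complete without any real delay.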

A parallel project is adding support for Goja, to eventually replace V8 with Goja as the default script engine, resulting in a pure Go implementation. V8 support will not go away, so there's a fallback if important JS features are lacking from Goja.

Future goals

There is much to do, including (this is not a full list):

  • Support WebSockets and server-sent events.
  • Implement all standard JavaScript classes that a browser should support, but that are not part of the ECMAScript standard itself.
    • JavaScript polyfills would be a good starting point, which is how xpath is implemented at the moment.
      • Conversion to native Go implementations would be prioritised by usage, e.g., fetch would be high on the list of priorities.
  • Implement default browser behaviour for user interaction, e.g. pressing enter when an input field has focus should submit the form.

Long Term Goals

CSS Parsing

Parsing CSS would be nice, allowing test code to verify the resulting styles of an element, but having a working DOM with a JavaScript engine is a higher priority.

Mock external sites

The system may depend on external sites in the browser, most notably identity providers (IDPs), where your app redirects to the IDP, which redirects back on successful login; it could also be other services, such as map providers.

For testing purposes, replacing these with a dummy implementation would have some benefits:

  • The verification of your system doesn't depend on the availability of an external service, e.g., when working offline.
  • Avoid tests breaking because of changes to the external system.
  • For an identity provider
    • Avoid polluting the provider with dummy accounts just to run your test suite.
    • Avoid locking out test accounts due to "suspicious activity".
    • The IDP may use a Captcha or 2FA that can be impossible or difficult to control from tests, and would cause a significant slowdown of the test suite.
  • For applications like map providers
    • Avoid being billed for API use during testing.
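
One way such a replacement could work is sketched below with the standard library only: a handler dispatches on the request host, so the application and a fake identity provider are both served in-process. This is not an existing feature of the package (which currently ignores the host). hostRouter, fakeIDP, visit, and the example host names are all hypothetical, and the fake skips every real login step.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// hostRouter is a hypothetical handler that picks a backend by request host.
type hostRouter map[string]http.Handler

func (hr hostRouter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if h, ok := hr[r.Host]; ok {
		h.ServeHTTP(w, r)
		return
	}
	http.Error(w, "unknown host: "+r.Host, http.StatusBadGateway)
}

// fakeIDP stands in for an identity provider: it "logs in" immediately and
// redirects back to the given redirect_uri with a fixed code.
func fakeIDP() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		back := r.URL.Query().Get("redirect_uri")
		http.Redirect(w, r, back+"?code=test-code", http.StatusFound)
	})
}

// visit sends a GET request to the router in-process and returns the status
// code and the Location header of the response.
func visit(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	router := hostRouter{"idp.example": fakeIDP()}
	status, location := visit(router,
		"http://idp.example/authorize?redirect_uri=http://app.example/callback")
	fmt.Println(status, location)
}
```

No account, Captcha, or network access is involved, which is the point of the benefits listed above.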

Out of scope

Full Spec Compliance

"A goal is not always meant to be reached, it often serves simply as something to aim at."

Bruce Lee

While it is a goal to reach WHATWG spec compliance, the primary goal is to have a useful tool for testing modern web applications.

Some specs don't really have any use in modern web applications. For example, you generally wouldn't write an application that depends on quirks mode.

Another example is document.write. I've yet to work on any application that depends on this. However, implementing support for this feature requires a complete rewrite of the HTML parser. You would need a really good case (or sponsorship level) to have that prioritised.

Accessibility tree

It is not currently planned that this library should maintain the accessibility tree, nor provide higher-level testing capabilities like those Testing Library provides for JavaScript.

These problems should eventually be solved, but could easily be implemented in a different library with a dependency on the DOM alone.

Visual Rendering

It is not a goal to be able to provide a visual rendering of the DOM.

But just like the accessibility tree, this could be implemented in a new library depending only on the interface from here.

Documentation

Overview

Package browser is the main entry point for Gost, helping create a window initialized with a script engine, connected to a server.

Important!

This package depends on two 3rd party components that need some custom modifications to work.

go mod edit -replace="github.com/tommie/v8go=github.com/stroiman/v8go@go-dom-support"
go mod edit -replace="github.com/ericchiang/css=github.com/gost-dom/css@latest"
go mod tidy

I hope that all my changes will make it to the original repos, eliminating the need for replace (or maintaining a new set of forks).

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Browser

type Browser struct {
	Client     http.Client
	ScriptHost ScriptHost
	// contains filtered or unexported fields
}

Pretty stupid right now, but should _probably_ allow handling multiple windows/tabs. This used to be the case for _some_ identity providers, but I'm not sure if that even works anymore because of browser security.

func New

func New() *Browser

New initialises a new Browser with the default script engine.

func NewBrowser

func NewBrowser() *Browser

NewBrowser should not be called. Call New instead.

This method will self-destruct in 10 commits.

func NewBrowserFromHandler

func NewBrowserFromHandler(handler http.Handler) *Browser

NewBrowserFromHandler should not be called. Call NewFromHandler instead.

This method will self-destruct in 10 commits.

func NewFromHandler

func NewFromHandler(handler http.Handler) *Browser

NewFromHandler initialises a new Browser with the default script engine and sets up the internal http.Client used with an http.RoundTripper that bypasses the TCP stack and calls directly into the

Note: There is currently a limitation that no requests from the browser will be sent over the network when using this. So sites will not work if they

  • Depend on content from a CDN.
  • Depend on an external service, e.g., an identity provider.

That limitation is the result of prioritising more important and higher-risk features.
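
The general technique of an http.RoundTripper that calls a handler in-process can be sketched with the standard library as follows. This is not the package's actual implementation. handlerRoundTripper, newInProcessClient, helloHandler, and fetch are hypothetical names, and using httptest.ResponseRecorder is an assumption chosen for brevity.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// handlerRoundTripper is a hypothetical http.RoundTripper that serves each
// request by calling an http.Handler directly instead of opening a socket.
type handlerRoundTripper struct {
	handler http.Handler
}

// RoundTrip records the handler's response in memory and returns it.
func (rt handlerRoundTripper) RoundTrip(req *http.Request) (*http.Response, error) {
	rec := httptest.NewRecorder()
	rt.handler.ServeHTTP(rec, req)
	res := rec.Result()
	res.Request = req
	return res, nil
}

// newInProcessClient returns an http.Client whose transport is the handler.
func newInProcessClient(h http.Handler) *http.Client {
	return &http.Client{Transport: handlerRoundTripper{handler: h}}
}

// helloHandler is a stand-in for an application's root handler.
func helloHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "path=%s", r.URL.Path)
	})
}

// fetch performs a GET through the in-process client and returns the status
// code and body. The host part of the URL never reaches the network.
func fetch(h http.Handler, url string) (int, string, error) {
	res, err := newInProcessClient(h).Get(url)
	if err != nil {
		return 0, "", err
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	return res.StatusCode, string(body), err
}

func main() {
	status, body, err := fetch(helloHandler(), "http://example.com/example")
	fmt.Println(status, body, err)
}
```

Since every request ends up in the one handler, nothing is sent to a CDN or an external service, which matches the limitation described above.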

func (*Browser) Close

func (b *Browser) Close()

func (*Browser) Open

func (b *Browser) Open(location string) (window Window, err error)

Open will open a new html.Window, loading the specified location. If the server does not respond with a 200 status code, an error is returned.

See html.NewWindowReader about the return value, and when the window returns.

Directories

Path Synopsis
browser module
Package html works on top of dom to implement specific HTML elements
internal
constants
Package constants is a collection of values that are used many times in the implementation, but have no relevance to users of the library, e.g., a link to where you can file an issue when you encounter a not-implemented feature; or a feature that is not fully implemented, e.g.
dom
log
Package log contains functions used internally for logging to a default logger implementing slog.Logger.
test/script-test-suite
This suite package contains a specification of the behaviour of client-side scripting.
Package logger provides the basic functionality of supplying a custom logger.
gojahost
The gojahost package provides functionality to execute client-side scripts in gost-dom.
v8host
The v8host package provides functionality to execute client-side scripts in gost-dom.
testing
gomega-matchers
Package matchers contains custom matchers for use with the [Gomega] assertion library.