webconnectivity

package
v3.16.0-alpha.3
Warning

This package is not in the latest version of its module.
Published: Sep 5, 2022 License: GPL-3.0 Imports: 25 Imported by: 0

README

webconnectivity

This directory contains a new implementation of Web Connectivity.

As of 2022-08-26, this code is experimental and is not selected by default when you run the websites group. To select this implementation with miniooni, run miniooni web_connectivity@v0.5 from the command line.

Issue #2237 explains the rationale behind writing this new implementation.

Implementation overview

The experiment measures a single URL at a time. The OONI Engine invokes the Run method inside the measurer.go file.

This code starts a number of background tasks, waits for them to complete, and finally calls TestKeys.Finalize to finalize the content of the JSON measurement.
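The start, wait, and finalize sequence can be sketched with a sync.WaitGroup. This is a minimal illustration and not the package's code: the task type, its fields, and runAll are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// task is a hypothetical stand-in for a background task such as the
// DNS, control, or endpoint tasks described above.
type task struct {
	name      string
	waitGroup *sync.WaitGroup
	mu        *sync.Mutex
	results   *[]string
}

// Start registers the task with the wait group and runs it in a
// background goroutine.
func (t *task) Start() {
	t.waitGroup.Add(1)
	go func() {
		defer t.waitGroup.Done()
		t.Run()
	}()
}

// Run performs the work in the current goroutine. Here the "work" is
// just recording the task name under a mutex.
func (t *task) Run() {
	t.mu.Lock()
	defer t.mu.Unlock()
	*t.results = append(*t.results, t.name)
}

// runAll starts one task per name, waits for all of them, and returns
// how many results were collected. A finalize step would follow Wait.
func runAll(names []string) int {
	wg := &sync.WaitGroup{}
	mu := &sync.Mutex{}
	results := []string{}
	for _, name := range names {
		(&task{name: name, waitGroup: wg, mu: mu, results: &results}).Start()
	}
	wg.Wait()
	return len(results)
}

func main() {
	fmt.Println(runAll([]string{"dns", "control", "endpoint"}))
}
```

Calling Add before spawning the goroutine ensures that Wait cannot return before a task has been counted.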

The first task that is started deals with DNS and lives in the dnsresolvers.go file. This task is responsible for resolving the domain inside the URL into 0..N IP addresses.

The domain resolution includes the system resolver and a DNS-over-UDP resolver. The implementation may do more than that, but this is the bare minimum we are documenting right now. (We need to experiment a bit more to understand what else we can do there, hence the code is probably doing more than just that.)

Once we know the 0..N IP addresses for the domain we do the following:

  1. start a background task to communicate with the Web Connectivity test helper, using code inside control.go;

  2. start an endpoint measurement task for each IP address (which of course only happens when we know at least one addr).

Regarding starting endpoint measurements, we follow this policy:

  1. if the original URL is http://... then we start a cleartext task and an encrypted task for each address using ports 80 and 443 respectively.

  2. if it's https://..., then we only start encrypted tasks.
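The policy above can be sketched as a small planning function. The names endpointPlan and planEndpoints are hypothetical, and the sketch assumes the default ports 80 and 443 mentioned above; it does not consider URLs that carry an explicit port.

```go
package main

import (
	"fmt"
	"net"
)

// endpointPlan is a hypothetical description of one endpoint task.
type endpointPlan struct {
	Endpoint  string // IP address and port, e.g., "93.184.216.34:443"
	Encrypted bool   // true for tasks that perform a TLS handshake
}

// planEndpoints returns the tasks to start for each resolved address:
// cleartext and encrypted for http, encrypted only for https.
func planEndpoints(scheme string, addrs []string) []endpointPlan {
	var out []endpointPlan
	for _, addr := range addrs {
		if scheme == "http" {
			out = append(out, endpointPlan{net.JoinHostPort(addr, "80"), false})
		}
		if scheme == "http" || scheme == "https" {
			out = append(out, endpointPlan{net.JoinHostPort(addr, "443"), true})
		}
	}
	return out
}

func main() {
	for _, p := range planEndpoints("http", []string{"93.184.216.34", "2001:db8::1"}) {
		fmt.Println(p.Endpoint, p.Encrypted)
	}
}
```

Using net.JoinHostPort keeps IPv6 addresses correctly bracketed in the resulting endpoint string.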

Cleartext tasks are implemented by cleartextflow.go while the encrypted tasks live in secureflow.go.

A cleartext task does the following:

  1. TCP connect;

  2. additionally, the first task to establish a connection also performs a GET request to fetch a webpage (we cannot GET for all connections, because that would be websteps and would require a different data format).

An encrypted task does the following:

  1. TCP connect;

  2. TLS handshake;

  3. additionally, the first task to handshake also performs a GET request to fetch a webpage iff the input URL was https://... (we cannot GET for all connections, because that would be websteps and would require a different data format).
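One way to let only the first successful flow perform the GET is a channel holding a single token, similar in spirit to the Sema field documented for CleartextFlow and SecureFlow. The sketch below is an assumption about how such a semaphore could be used, not a copy of the package's logic; countFetches is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countFetches runs numFlows concurrent flows sharing one semaphore
// and returns how many of them performed the (simulated) GET.
func countFetches(numFlows int) int64 {
	sema := make(chan any, 1)
	sema <- true // a single token: only one flow may fetch the webpage

	var fetches int64
	wg := &sync.WaitGroup{}
	for i := 0; i < numFlows; i++ {
		wg.Add(1)
		go func(sema <-chan any) {
			defer wg.Done()
			// TCP connect (and TLS handshake) would happen here
			// for every flow, regardless of the semaphore.
			select {
			case <-sema:
				// We got the token: simulate the GET request.
				atomic.AddInt64(&fetches, 1)
			default:
				// Another flow already took the token: stop here.
			}
		}(sema)
	}
	wg.Wait()
	return atomic.LoadInt64(&fetches)
}

func main() {
	fmt.Println(countFetches(8))
}
```

The non-blocking select means flows that lose the race return immediately rather than waiting on the channel.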

If fetching the webpage returns a redirect, we start a new DNS task, passing it the redirect URL as the new URL to measure. We do not call the test helper again when this happens, though. The Web Connectivity test helper already follows the whole redirect chain, so we would need to change the test helper to get information on each flow. When this happens, this experiment will probably not be Web Connectivity anymore, but rather some form of websteps.

Additionally, when the test helper terminates, we run TCP connect and TLS handshake (when applicable) for new IP addresses discovered using the test helper that were previously unknown to the probe, thus collecting extra information. This logic lives inside the control.go file.
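Selecting the addresses that only the test helper knows about amounts to a set difference. The following sketch shows one possible way to compute it; unknownAddrs is a hypothetical helper and is not taken from control.go.

```go
package main

import "fmt"

// unknownAddrs returns the addresses discovered by the test helper
// that the probe did not already know, without duplicates and in the
// order in which the test helper reported them.
func unknownAddrs(probe, th []string) []string {
	known := make(map[string]bool, len(probe))
	for _, addr := range probe {
		known[addr] = true
	}
	var out []string
	for _, addr := range th {
		if !known[addr] {
			known[addr] = true // also avoids emitting duplicates
			out = append(out, addr)
		}
	}
	return out
}

func main() {
	probe := []string{"93.184.216.34"}
	th := []string{"93.184.216.34", "2001:db8::1", "2001:db8::1"}
	fmt.Println(unknownAddrs(probe, th))
}
```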

As previously mentioned, when all tasks complete, we call TestKeys.Finalize.

In turn, this function analyzes the collected data by calling code implemented inside the following files:

We emit the same blocking and accessible keys that we emitted before, as well as new keys prefixed by x_ to indicate that they're experimental.

Limitations and next steps

We need to extend the Web Connectivity test helper to return us information about TLS handshakes with IP addresses discovered by the probe. This information would allow us to make more precise TLS blocking statements.

Further changes are probably possible. Departing too radically from the Web Connectivity model, though, will lead us to have a websteps implementation (but then the data model would most likely be different).

Documentation

Overview

Package webconnectivity implements the web_connectivity experiment.

Spec: https://github.com/ooni/spec/blob/master/nettests/ts-017-web-connectivity.md.

This implementation, in particular, contains extensions over the original model, which we document at https://github.com/ooni/probe/issues/2237.

Constants

const (
	// AnalysisDNSBogon indicates we got any bogon reply
	AnalysisDNSBogon = 1 << iota

	// AnalysisDNSUnexpectedFailure indicates the TH could
	// resolve a domain while the probe couldn't
	AnalysisDNSUnexpectedFailure

	// AnalysisDNSUnexpectedAddrs indicates the TH resolved
	// different addresses from the probe
	AnalysisDNSUnexpectedAddrs
)
const (
	// DNSAddrFlagSystemResolver means we discovered this addr using the system resolver.
	DNSAddrFlagSystemResolver = 1 << iota

	// DNSAddrFlagUDP means we discovered this addr using the UDP resolver.
	DNSAddrFlagUDP

	// DNSAddrFlagHTTPS means we discovered this addr using the DNS-over-HTTPS resolver.
	DNSAddrFlagHTTPS
)
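Because each flag occupies its own bit, the same address found by several resolvers can be stored once with the flags OR-ed together. The sketch below illustrates the idea with locally declared constants and a hypothetical mergeEntries helper; whether the package merges entries in this way is an assumption.

```go
package main

import "fmt"

// Local copies of the flag layout shown above: 1, 2, and 4.
const (
	flagSystemResolver = 1 << iota
	flagUDP
	flagHTTPS
)

// entry mirrors the shape of a cached address with its flags.
type entry struct {
	Addr  string
	Flags int64
}

// mergeEntries collapses entries with the same address into one,
// combining their flags with a bitwise OR and preserving order.
func mergeEntries(entries []entry) []entry {
	index := map[string]int{}
	var out []entry
	for _, e := range entries {
		if i, ok := index[e.Addr]; ok {
			out[i].Flags |= e.Flags
			continue
		}
		index[e.Addr] = len(out)
		out = append(out, e)
	}
	return out
}

func main() {
	merged := mergeEntries([]entry{
		{Addr: "93.184.216.34", Flags: flagSystemResolver},
		{Addr: "93.184.216.34", Flags: flagUDP},
		{Addr: "2001:db8::1", Flags: flagHTTPS},
	})
	for _, e := range merged {
		fmt.Println(e.Addr, e.Flags, e.Flags&flagUDP != 0)
	}
}
```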

Variables

var DNSWhoamiSingleton = &DNSWhoamiService{
	mu:       &sync.Mutex{},
	systemv4: []DNSWhoamiInfoEntry{},
	udpv4:    map[string][]DNSWhoamiInfoEntry{},
}

DNSWhoamiSingleton is the DNSWhoamiService singleton.

var OpportunisticDNSOverHTTPSSingleton = &OpportunisticDNSOverHTTPS{
	interval: 0,
	mu:       &sync.Mutex{},
	rnd:      rand.New(rand.NewSource(time.Now().UnixNano())),
	t:        time.Time{},
	urls: []string{
		"https://mozilla.cloudflare-dns.com/dns-query",
		"https://dns.nextdns.io/dns-query",
		"https://dns.google/dns-query",
		"https://dns.quad9.net/dns-query",
	},
}

OpportunisticDNSOverHTTPSSingleton is the singleton used to keep track of the opportunistic DNS-over-HTTPS measurements state.

Functions

func NewExperimentMeasurer

func NewExperimentMeasurer(config *Config) model.ExperimentMeasurer

NewExperimentMeasurer creates a new model.ExperimentMeasurer.

Types

type CleartextFlow

type CleartextFlow struct {
	// Address is the MANDATORY address to connect to.
	Address string

	// DNSCache is the MANDATORY DNS cache.
	DNSCache *DNSCache

	// IDGenerator is the MANDATORY atomic int64 to generate task IDs.
	IDGenerator *atomicx.Int64

	// Logger is the MANDATORY logger to use.
	Logger model.Logger

	// Sema is the MANDATORY semaphore to allow just a single
	// connection to perform the HTTP transaction.
	Sema <-chan any

	// TestKeys is MANDATORY and contains the TestKeys.
	TestKeys *TestKeys

	// ZeroTime is the MANDATORY measurement's zero time.
	ZeroTime time.Time

	// WaitGroup is the MANDATORY wait group this task belongs to.
	WaitGroup *sync.WaitGroup

	// CookieJar contains the OPTIONAL cookie jar, used for redirects.
	CookieJar http.CookieJar

	// FollowRedirects is OPTIONAL and instructs this flow
	// to follow HTTP redirects (if any).
	FollowRedirects bool

	// HostHeader is the OPTIONAL host header to use.
	HostHeader string

	// Referer contains the OPTIONAL referer, used for redirects.
	Referer string

	// UDPAddress is the OPTIONAL address of the UDP resolver to use. If this
	// field is not set we use a default one (e.g., `8.8.8.8:53`).
	UDPAddress string

	// URLPath is the OPTIONAL URL path.
	URLPath string

	// URLRawQuery is the OPTIONAL URL raw query.
	URLRawQuery string
}

Measures HTTP endpoints.

The zero value of this structure IS NOT valid and you MUST initialize all the fields marked as MANDATORY before using this structure.

func (*CleartextFlow) Run

func (t *CleartextFlow) Run(parentCtx context.Context, index int64)

Run runs this task in the current goroutine.

func (*CleartextFlow) Start

func (t *CleartextFlow) Start(ctx context.Context)

Start starts this task in a background goroutine.

type Config

type Config struct{}

Config contains webconnectivity experiment configuration.

type Control

type Control struct {
	// Addresses contains the MANDATORY addresses we've looked up.
	Addresses []string

	// ExtraMeasurementsStarter is MANDATORY and allows this struct to
	// start additional measurements using new TH-discovered addrs.
	ExtraMeasurementsStarter EndpointMeasurementsStarter

	// Logger is the MANDATORY logger to use.
	Logger model.Logger

	// TestKeys is MANDATORY and contains the TestKeys.
	TestKeys *TestKeys

	// Session is the MANDATORY session to use.
	Session model.ExperimentSession

	// THAddr is the MANDATORY TH's URL.
	THAddr string

	// URL is the MANDATORY URL we are measuring.
	URL *url.URL

	// WaitGroup is the MANDATORY wait group this task belongs to.
	WaitGroup *sync.WaitGroup
}

Control issues a Control request and saves the results inside of the experiment's TestKeys.

The zero value of this structure IS NOT valid and you MUST initialize all the fields marked as MANDATORY before using this structure.

func (*Control) Run

func (c *Control) Run(parentCtx context.Context)

Run runs this task until completion.

func (*Control) Start

func (c *Control) Start(ctx context.Context)

Start starts this task in a background goroutine.

type DNSCache

type DNSCache struct {
	// contains filtered or unexported fields
}

DNSCache wraps a model.Resolver to provide DNS caching.

The zero value is invalid; please, use NewDNSCache to construct.

func NewDNSCache

func NewDNSCache() *DNSCache

NewDNSCache creates a new DNSCache instance.

func (*DNSCache) Get

func (c *DNSCache) Get(domain string) ([]DNSEntry, bool)

Get gets values from the cache

func (*DNSCache) Set

func (c *DNSCache) Set(domain string, values []DNSEntry)

Set inserts into the cache
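A cache with this Get and Set shape can be built from a map guarded by a mutex. The sketch below is a hypothetical reimplementation for illustration only; the unexported fields of the real DNSCache are not shown in this documentation.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheEntry mirrors the documented DNSEntry shape.
type cacheEntry struct {
	Addr  string
	Flags int64
}

// cache is a hypothetical, concurrency-safe map from domain to entries.
type cache struct {
	mu     sync.Mutex
	values map[string][]cacheEntry
}

// newCache returns an initialized cache; the zero value would have a
// nil map and Set would panic, hence the constructor.
func newCache() *cache {
	return &cache{values: map[string][]cacheEntry{}}
}

// Get returns the entries for the domain and whether they were found.
func (c *cache) Get(domain string) ([]cacheEntry, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	values, found := c.values[domain]
	return values, found
}

// Set stores the entries for the domain, replacing previous ones.
func (c *cache) Set(domain string, values []cacheEntry) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[domain] = values
}

func main() {
	c := newCache()
	c.Set("example.com", []cacheEntry{{Addr: "93.184.216.34", Flags: 1}})
	fmt.Println(c.Get("example.com"))
	fmt.Println(c.Get("example.org"))
}
```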

type DNSEntry

type DNSEntry struct {
	// Addr is the cached address
	Addr string

	// Flags contains flags
	Flags int64
}

DNSEntry is an entry in the DNS cache.

type DNSResolvers

type DNSResolvers struct {
	// DNSCache is the MANDATORY DNS cache.
	DNSCache *DNSCache

	// Domain is the MANDATORY domain to resolve.
	Domain string

	// IDGenerator is the MANDATORY atomic int64 to generate task IDs.
	IDGenerator *atomicx.Int64

	// Logger is the MANDATORY logger to use.
	Logger model.Logger

	// TestKeys is MANDATORY and contains the TestKeys.
	TestKeys *TestKeys

	// URL is the MANDATORY URL we're measuring.
	URL *url.URL

	// ZeroTime is the MANDATORY zero time of the measurement.
	ZeroTime time.Time

	// WaitGroup is the MANDATORY wait group this task belongs to.
	WaitGroup *sync.WaitGroup

	// CookieJar contains the OPTIONAL cookie jar, used for redirects.
	CookieJar http.CookieJar

	// Referer contains the OPTIONAL referer, used for redirects.
	Referer string

	// Session is the OPTIONAL session. If the session is set, we will use
	// it to start the task that issues the control request. This request must
	// only be sent during the first iteration. It would be pointless to
	// issue such a request for subsequent redirects, because the TH will
	// always follow the redirect chain caused by the provided URL.
	Session model.ExperimentSession

	// THAddr is the OPTIONAL test helper address.
	THAddr string

	// UDPAddress is the OPTIONAL address of the UDP resolver to use. If this
	// field is not set we use a default one (e.g., `8.8.8.8:53`).
	UDPAddress string
}

Resolves the URL's domain using several resolvers.

The zero value of this structure IS NOT valid and you MUST initialize all the fields marked as MANDATORY before using this structure.

func (*DNSResolvers) Run

func (t *DNSResolvers) Run(parentCtx context.Context)

Run runs this task in the current goroutine.

func (*DNSResolvers) Start

func (t *DNSResolvers) Start(ctx context.Context)

Start starts this task in a background goroutine.

type DNSWhoamiInfo

type DNSWhoamiInfo struct {
	// SystemV4 contains results related to the system resolver using IPv4.
	SystemV4 []DNSWhoamiInfoEntry `json:"system_v4"`

	// UDPv4 contains results related to a UDP resolver using IPv4.
	UDPv4 map[string][]DNSWhoamiInfoEntry `json:"udp_v4"`
}

DNSWhoamiInfo contains info about DNS whoami.

type DNSWhoamiInfoEntry

type DNSWhoamiInfoEntry struct {
	// Address is the IP address
	Address string `json:"address"`
}

DNSWhoamiInfoEntry contains an entry for DNSWhoamiInfo.

type DNSWhoamiService

type DNSWhoamiService struct {
	// contains filtered or unexported fields
}

DNSWhoamiService is a service that performs DNS whoami lookups.

func (*DNSWhoamiService) SystemV4

func (svc *DNSWhoamiService) SystemV4(ctx context.Context) ([]DNSWhoamiInfoEntry, bool)

SystemV4 returns the results of querying using the system resolver and IPv4.

func (*DNSWhoamiService) UDPv4

func (svc *DNSWhoamiService) UDPv4(ctx context.Context, address string) ([]DNSWhoamiInfoEntry, bool)

UDPv4 returns the results of querying a given UDP resolver and IPv4.

type EndpointMeasurementsStarter

type EndpointMeasurementsStarter interface {
	// contains filtered or unexported methods
}

EndpointMeasurementsStarter is used by Control to start extra measurements using new IP addrs discovered by the TH.

type InputParser

type InputParser struct {
	// List of accepted URL schemes.
	AcceptedSchemes []string

	// Whether to allow endpoints in input.
	AllowEndpoints bool

	// The default scheme to use if AllowEndpoints == true.
	DefaultScheme string
}

InputParser helps to parse the experiment's input.

func (*InputParser) Parse

func (ip *InputParser) Parse(input string) (*url.URL, error)

Parse parses the experiment input and returns the resulting URL.
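As a rough illustration of scheme-restricted parsing, the sketch below uses url.Parse and then checks the scheme against an accepted list. The parseInput function and errInvalidScheme error are hypothetical, and the sketch deliberately ignores the AllowEndpoints and DefaultScheme options, whose exact behavior is not described here.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// errInvalidScheme is a hypothetical error for rejected schemes.
var errInvalidScheme = errors.New("input: scheme not accepted")

// parseInput parses the input as a URL and accepts it only when its
// scheme is one of the accepted schemes.
func parseInput(input string, accepted []string) (*url.URL, error) {
	parsed, err := url.Parse(input)
	if err != nil {
		return nil, err
	}
	for _, scheme := range accepted {
		if parsed.Scheme == scheme {
			return parsed, nil
		}
	}
	return nil, errInvalidScheme
}

func main() {
	accepted := []string{"http", "https"}
	parsed, err := parseInput("https://example.com/path", accepted)
	fmt.Println(parsed, err)
	_, err = parseInput("ftp://example.com/", accepted)
	fmt.Println(err)
}
```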

type Measurer

type Measurer struct {
	// Contains the experiment's config.
	Config *Config
}

Measurer for the web_connectivity experiment.

func (*Measurer) ExperimentName

func (m *Measurer) ExperimentName() string

ExperimentName implements model.ExperimentMeasurer.

func (*Measurer) ExperimentVersion

func (m *Measurer) ExperimentVersion() string

ExperimentVersion implements model.ExperimentMeasurer.

func (*Measurer) GetSummaryKeys

func (m *Measurer) GetSummaryKeys(measurement *model.Measurement) (any, error)

GetSummaryKeys implements model.ExperimentMeasurer.GetSummaryKeys.

func (*Measurer) Run

func (m *Measurer) Run(ctx context.Context, sess model.ExperimentSession,
	measurement *model.Measurement, callbacks model.ExperimentCallbacks) error

Run implements model.ExperimentMeasurer.

type OpportunisticDNSOverHTTPS

type OpportunisticDNSOverHTTPS struct {
	// contains filtered or unexported fields
}

OpportunisticDNSOverHTTPS allows performing opportunistic DNS-over-HTTPS measurements as part of Web Connectivity.

func (*OpportunisticDNSOverHTTPS) MaybeNextURL

func (o *OpportunisticDNSOverHTTPS) MaybeNextURL() (string, bool)

MaybeNextURL returns the next URL to measure, if any. Our aim is to perform periodic, opportunistic DoH measurements as part of Web Connectivity.
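The "periodic, opportunistic" behavior could be implemented by returning a URL only when enough time has elapsed since the previous pick. The sketch below is an assumption about that mechanism and not the package's code: the picker type, the fixed interval, and the explicit now argument (used to keep the example deterministic) are all hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// picker is a hypothetical rate-limited URL selector.
type picker struct {
	mu       sync.Mutex
	rnd      *rand.Rand
	last     time.Time     // time of the previous successful pick
	interval time.Duration // minimum time between two picks
	urls     []string
}

// maybeNext returns a randomly chosen URL and true when at least
// interval has elapsed since the previous pick; otherwise "" and false.
func (p *picker) maybeNext(now time.Time) (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.urls) == 0 {
		return "", false
	}
	if !p.last.IsZero() && now.Sub(p.last) < p.interval {
		return "", false
	}
	p.last = now
	return p.urls[p.rnd.Intn(len(p.urls))], true
}

// demo exercises the picker at three instants and reports whether
// each call produced a URL.
func demo() []bool {
	p := &picker{
		rnd:      rand.New(rand.NewSource(1)),
		interval: time.Minute,
		urls:     []string{"https://dns.google/dns-query"},
	}
	start := time.Unix(1000, 0)
	_, first := p.maybeNext(start)
	_, second := p.maybeNext(start.Add(time.Second))
	_, third := p.maybeNext(start.Add(2 * time.Minute))
	return []bool{first, second, third}
}

func main() {
	fmt.Println(demo())
}
```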

type SecureFlow

type SecureFlow struct {
	// Address is the MANDATORY address to connect to.
	Address string

	// DNSCache is the MANDATORY DNS cache.
	DNSCache *DNSCache

	// IDGenerator is the MANDATORY atomic int64 to generate task IDs.
	IDGenerator *atomicx.Int64

	// Logger is the MANDATORY logger to use.
	Logger model.Logger

	// Sema is the MANDATORY semaphore to allow just a single
	// connection to perform the HTTP transaction.
	Sema <-chan any

	// TestKeys is MANDATORY and contains the TestKeys.
	TestKeys *TestKeys

	// ZeroTime is the MANDATORY measurement's zero time.
	ZeroTime time.Time

	// WaitGroup is the MANDATORY wait group this task belongs to.
	WaitGroup *sync.WaitGroup

	// ALPN is the OPTIONAL ALPN to use.
	ALPN []string

	// CookieJar contains the OPTIONAL cookie jar, used for redirects.
	CookieJar http.CookieJar

	// FollowRedirects is OPTIONAL and instructs this flow
	// to follow HTTP redirects (if any).
	FollowRedirects bool

	// HostHeader is the OPTIONAL host header to use.
	HostHeader string

	// Referer contains the OPTIONAL referer, used for redirects.
	Referer string

	// SNI is the OPTIONAL SNI to use.
	SNI string

	// UDPAddress is the OPTIONAL address of the UDP resolver to use. If this
	// field is not set we use a default one (e.g., `8.8.8.8:53`).
	UDPAddress string

	// URLPath is the OPTIONAL URL path.
	URLPath string

	// URLRawQuery is the OPTIONAL URL raw query.
	URLRawQuery string
}

Measures HTTPS endpoints.

The zero value of this structure IS NOT valid and you MUST initialize all the fields marked as MANDATORY before using this structure.

func (*SecureFlow) Run

func (t *SecureFlow) Run(parentCtx context.Context, index int64)

Run runs this task in the current goroutine.

func (*SecureFlow) Start

func (t *SecureFlow) Start(ctx context.Context)

Start starts this task in a background goroutine.

type SummaryKeys

type SummaryKeys struct {
	// contains filtered or unexported fields
}

SummaryKeys contains the summary results.

Note that this structure is part of the ABI contract with ooniprobe; therefore, we should be careful when changing it.

type TestKeys

type TestKeys struct {
	// NetworkEvents contains network events.
	NetworkEvents []*model.ArchivalNetworkEvent `json:"network_events"`

	// DNSWhoami contains results of using the DNS whoami functionality for the
	// possibly cleartext resolvers that we're using.
	DNSWoami *DNSWhoamiInfo `json:"x_dns_whoami"`

	// DoH contains ancillary observations collected by DoH resolvers.
	DoH *TestKeysDoH `json:"x_doh"`

	// Do53 contains ancillary observations collected by Do53 resolvers.
	Do53 *TestKeysDo53 `json:"x_do53"`

	// DNSLateReplies contains late replies we didn't expect to receive from
	// a resolver (which may raise eyebrows if they're different).
	DNSLateReplies []*model.ArchivalDNSLookupResult `json:"x_dns_late_replies"`

	// Queries contains DNS queries.
	Queries []*model.ArchivalDNSLookupResult `json:"queries"`

	// Requests contains HTTP results.
	Requests []*model.ArchivalHTTPRequestResult `json:"requests"`

	// TCPConnect contains TCP connect results.
	TCPConnect []*model.ArchivalTCPConnectResult `json:"tcp_connect"`

	// TLSHandshakes contains TLS handshakes results.
	TLSHandshakes []*model.ArchivalTLSOrQUICHandshakeResult `json:"tls_handshakes"`

	// ControlRequest is the control request we sent.
	ControlRequest *webconnectivity.ControlRequest `json:"x_control_request"`

	// Control contains the TH's response.
	Control *webconnectivity.ControlResponse `json:"control"`

	// ControlFailure contains the failure of the control experiment.
	ControlFailure *string `json:"control_failure"`

	// DNSFlags contains DNS analysis flags.
	DNSFlags int64 `json:"x_dns_flags"`

	// DNSExperimentFailure indicates whether there was a failure in any
	// of the DNS experiments we performed.
	DNSExperimentFailure *string `json:"dns_experiment_failure"`

	// DNSConsistency indicates whether there is consistency between
	// the TH's DNS results and the probe's DNS results.
	DNSConsistency string `json:"dns_consistency"`

	// HTTPExperimentFailure indicates whether there was a failure in
	// the final HTTP request that we recorded.
	HTTPExperimentFailure *string `json:"http_experiment_failure"`

	// BlockingFlags contains blocking flags.
	BlockingFlags int64 `json:"x_blocking_flags"`

	// BodyLengthMatch tells us whether the body length matches.
	BodyLengthMatch *bool `json:"body_length_match"`

	// HeadersMatch tells us whether the headers match.
	HeadersMatch *bool `json:"headers_match"`

	// StatusCodeMatch tells us whether the status code matches.
	StatusCodeMatch *bool `json:"status_code_match"`

	// TitleMatch tells us whether the title matches.
	TitleMatch *bool `json:"title_match"`

	// Blocking indicates the reason for blocking. This is notoriously a bad
	// type because it can be one of the following values:
	//
	// - "tcp_ip"
	// - "dns"
	// - "http-diff"
	// - "http-failure"
	// - false
	// - null
	//
	// In addition to having a ~bad type, this field has the issue that it
	// reduces the reason for blocking to an enum, whereas it's a set of flags,
	// hence we introduced the x_blocking_flags field.
	Blocking any `json:"blocking"`

	// Accessible indicates whether the resource is accessible. Possible
	// values for this field are: nil, true, and false.
	Accessible any `json:"accessible"`
	// contains filtered or unexported fields
}

TestKeys contains the results produced by web_connectivity.
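The comments on the Blocking and Accessible fields explain why they are typed as any: the JSON value may be a string, a boolean, or null. The following sketch shows how encoding/json serializes such fields; the summary struct and encode helper are hypothetical and only reuse the two JSON key names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// summary is a hypothetical struct with the same two loosely typed
// fields and JSON keys as the documented TestKeys.
type summary struct {
	Blocking   any `json:"blocking"`
	Accessible any `json:"accessible"`
}

// encode serializes the summary to JSON, returning an empty string
// if serialization fails.
func encode(s summary) string {
	data, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	fmt.Println(encode(summary{}))                                  // both null
	fmt.Println(encode(summary{Blocking: false, Accessible: true})) // booleans
	fmt.Println(encode(summary{Blocking: "dns", Accessible: false}))
}
```

A nil interface value encodes as null, which is how the "unknown" state is distinguished from an explicit false.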

func NewTestKeys

func NewTestKeys() *TestKeys

NewTestKeys creates a new instance of TestKeys.

func (*TestKeys) AppendDNSLateReplies

func (tk *TestKeys) AppendDNSLateReplies(v ...*model.ArchivalDNSLookupResult)

AppendDNSLateReplies appends to DNSLateReplies.

func (*TestKeys) AppendNetworkEvents

func (tk *TestKeys) AppendNetworkEvents(v ...*model.ArchivalNetworkEvent)

AppendNetworkEvents appends to NetworkEvents.

func (*TestKeys) AppendQueries

func (tk *TestKeys) AppendQueries(v ...*model.ArchivalDNSLookupResult)

AppendQueries appends to Queries.

func (*TestKeys) AppendRequests

func (tk *TestKeys) AppendRequests(v ...*model.ArchivalHTTPRequestResult)

AppendRequests appends to Requests.

func (*TestKeys) AppendTCPConnectResults

func (tk *TestKeys) AppendTCPConnectResults(v ...*model.ArchivalTCPConnectResult)

AppendTCPConnectResults appends to TCPConnect.

func (*TestKeys) AppendTLSHandshakes

func (tk *TestKeys) AppendTLSHandshakes(v ...*model.ArchivalTLSOrQUICHandshakeResult)

AppendTLSHandshakes appends to TLSHandshakes.

func (*TestKeys) Finalize

func (tk *TestKeys) Finalize(logger model.Logger)

Finalize performs any delayed computation on the test keys. This function must be called from the measurer after all the tasks have completed.

func (*TestKeys) SetControl

func (tk *TestKeys) SetControl(v *webconnectivity.ControlResponse)

SetControl sets the value of Control.

func (*TestKeys) SetControlFailure

func (tk *TestKeys) SetControlFailure(err error)

SetControlFailure sets the value of controlFailure.

func (*TestKeys) SetControlRequest

func (tk *TestKeys) SetControlRequest(v *webconnectivity.ControlRequest)

SetControlRequest sets the value of controlRequest.

func (*TestKeys) SetFundamentalFailure

func (tk *TestKeys) SetFundamentalFailure(err error)

SetFundamentalFailure sets the value of fundamentalFailure.

func (*TestKeys) WithDNSWhoami

func (tk *TestKeys) WithDNSWhoami(fun func(*DNSWhoamiInfo))

WithDNSWhoami calls the given function while holding the mutex, passing it a pointer to the DNSWhoami field.

func (*TestKeys) WithTestKeysDo53

func (tk *TestKeys) WithTestKeysDo53(f func(*TestKeysDo53))

WithTestKeysDo53 calls the given function while holding the mutex, passing it a pointer to the Do53 field.

func (*TestKeys) WithTestKeysDoH

func (tk *TestKeys) WithTestKeysDoH(f func(*TestKeysDoH))

WithTestKeysDoH calls the given function while holding the mutex, passing it a pointer to the DoH field.
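The With... methods follow a callback-under-lock pattern: the caller supplies a function and the method runs it while holding the mutex, so concurrent tasks can safely update a nested structure. A minimal sketch with hypothetical types (keys, dohKeys) and a hypothetical helper (appendConcurrently) follows.

```go
package main

import (
	"fmt"
	"sync"
)

// dohKeys is a hypothetical nested structure updated by many tasks.
type dohKeys struct {
	Queries []string
}

// keys owns the mutex protecting the nested structure.
type keys struct {
	mu  sync.Mutex
	doh *dohKeys
}

// withDoH calls f while holding the mutex, passing it the pointer to
// the nested structure. Callers must not retain the pointer after f
// returns, otherwise the locking guarantee is lost.
func (k *keys) withDoH(f func(*dohKeys)) {
	k.mu.Lock()
	defer k.mu.Unlock()
	f(k.doh)
}

// appendConcurrently appends n queries from n goroutines and returns
// the final number of queries.
func appendConcurrently(n int) int {
	k := &keys{doh: &dohKeys{}}
	wg := &sync.WaitGroup{}
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			k.withDoH(func(d *dohKeys) {
				d.Queries = append(d.Queries, fmt.Sprintf("query-%d", i))
			})
		}(i)
	}
	wg.Wait()
	count := 0
	k.withDoH(func(d *dohKeys) { count = len(d.Queries) })
	return count
}

func main() {
	fmt.Println(appendConcurrently(50))
}
```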

type TestKeysDo53

type TestKeysDo53 struct {
	// NetworkEvents contains network events.
	NetworkEvents []*model.ArchivalNetworkEvent `json:"network_events"`

	// Queries contains DNS queries.
	Queries []*model.ArchivalDNSLookupResult `json:"queries"`
}

TestKeysDo53 contains ancillary observations collected using Do53.

They are on a separate hierarchy to simplify processing.

type TestKeysDoH

type TestKeysDoH struct {
	// NetworkEvents contains network events.
	NetworkEvents []*model.ArchivalNetworkEvent `json:"network_events"`

	// Queries contains DNS queries.
	Queries []*model.ArchivalDNSLookupResult `json:"queries"`

	// Requests contains HTTP results.
	Requests []*model.ArchivalHTTPRequestResult `json:"requests"`

	// TCPConnect contains TCP connect results.
	TCPConnect []*model.ArchivalTCPConnectResult `json:"tcp_connect"`

	// TLSHandshakes contains TLS handshakes results.
	TLSHandshakes []*model.ArchivalTLSOrQUICHandshakeResult `json:"tls_handshakes"`
}

TestKeysDoH contains ancillary observations collected using DoH (e.g., the DNS lookups, TCP connects, TLS handshakes caused by given DoH lookups).

They are on a separate hierarchy to simplify processing.
