data-migration

Published: Nov 8, 2019 License: BSD-3-Clause

WPT Data Migration Scripts

This repository contains scripts that can be used or modified to correct mistakes in the datastore that backs wpt.fyi.

Running a script

First of all, run gcloud auth application-default login (you should already have access to wptdashboard and/or wptdashboard-staging projects).

This repo does NOT use Go modules yet, so it is recommended to check out the repo at $GOPATH/src/github.com/web-platform-tests/data-migration. Then run go get -u ./... to fetch all the dependencies.

Finally, you can run most scripts with go run, e.g. go run tagger/master.go --help.

Writing a script

We have a few different categories of scripts.

Datastore-only

This is the most common kind. These scripts do a pass of scan-check-modify over all TestRuns in Datastore in parallel. Check-and-modify is done atomically in a transaction.

The reusable logic is in processor/. New scripts only need to implement the Runs interface.

Examples can be found in tagger/.
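
The scan-check-modify pass described above can be sketched roughly as follows. This is an illustration only, not the code in processor/: the RunProcessor interface, the memStore type, and the masterTagger script are hypothetical names, the real Runs interface may have different methods, and an in-memory map guarded by a mutex stands in for Datastore and its transactions.

```go
package main

import (
	"fmt"
	"sync"
)

// TestRun is a simplified stand-in for the TestRun entity in Datastore.
type TestRun struct {
	ID     int64
	Labels []string
}

// RunProcessor is a hypothetical interface for illustration; the actual
// Runs interface in processor/ may differ.
type RunProcessor interface {
	ShouldProcess(run TestRun) bool // the "check" step
	Process(run TestRun) TestRun    // the "modify" step
}

// memStore simulates Datastore. Holding mu plays the role of a transaction.
type memStore struct {
	mu   sync.Mutex
	runs map[int64]TestRun
}

// keys is the "scan" step: it lists the IDs of all stored runs.
func (s *memStore) keys() []int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	ids := make([]int64, 0, len(s.runs))
	for id := range s.runs {
		ids = append(ids, id)
	}
	return ids
}

// update performs check-and-modify atomically for a single run and
// reports whether the run was changed.
func (s *memStore) update(id int64, p RunProcessor) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	run, ok := s.runs[id]
	if !ok || !p.ShouldProcess(run) {
		return false
	}
	s.runs[id] = p.Process(run)
	return true
}

// migrate fans the scanned IDs out to a pool of workers and returns the
// number of runs that were modified.
func migrate(s *memStore, p RunProcessor, workers int) int {
	ids := make(chan int64)
	var wg sync.WaitGroup
	var mu sync.Mutex
	modified := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				if s.update(id, p) {
					mu.Lock()
					modified++
					mu.Unlock()
				}
			}
		}()
	}
	for _, id := range s.keys() {
		ids <- id
	}
	close(ids)
	wg.Wait()
	return modified
}

// masterTagger is a hypothetical script that adds a "master" label to
// runs that do not have it yet.
type masterTagger struct{}

func (masterTagger) ShouldProcess(run TestRun) bool {
	for _, l := range run.Labels {
		if l == "master" {
			return false
		}
	}
	return true
}

func (masterTagger) Process(run TestRun) TestRun {
	// Copy the slice so the stored value is never aliased.
	run.Labels = append(append([]string{}, run.Labels...), "master")
	return run
}

func main() {
	store := &memStore{runs: map[int64]TestRun{
		1: {ID: 1, Labels: []string{"stable"}},
		2: {ID: 2, Labels: []string{"master"}},
		3: {ID: 3},
	}}
	fmt.Println("modified:", migrate(store, masterTagger{}, 4))
}
```

Because the check happens inside the same critical section as the write, running the pass a second time leaves the data untouched, which is the property that makes such a script safe to re-run.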

Storage

The following scripts also download results from GCS, so they are a lot slower.

add_run_info/ - used to backfill product and browser name metadata, as well as switch to a new URL schema.

add_time_start/ - used to backfill the TimeStart metadata for runs done before that information was added.

dedup_runs/ - used to deduplicate runs with the same raw_results_url from before results-processor was idempotent.
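
As a rough illustration of the idea behind deduplicating by raw_results_url, the sketch below groups runs by URL and lists the extra copies. It is not taken from dedup_runs/: the TestRun fields, the findDuplicates function, and the policy of keeping the run with the lowest ID are all assumptions made for this example, and the sample URLs are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// TestRun is a simplified stand-in holding only the fields this sketch needs.
type TestRun struct {
	ID            int64
	RawResultsURL string
}

// findDuplicates returns the IDs of runs that share a RawResultsURL with
// another run. Within each group the run with the lowest ID is kept
// (an assumed policy); runs with an empty URL are ignored.
func findDuplicates(runs []TestRun) []int64 {
	// First pass: remember the lowest ID seen for every URL.
	keep := make(map[string]int64)
	for _, r := range runs {
		if r.RawResultsURL == "" {
			continue
		}
		if id, ok := keep[r.RawResultsURL]; !ok || r.ID < id {
			keep[r.RawResultsURL] = r.ID
		}
	}
	// Second pass: every other run with the same URL is a duplicate.
	var dups []int64
	for _, r := range runs {
		if r.RawResultsURL == "" {
			continue
		}
		if keep[r.RawResultsURL] != r.ID {
			dups = append(dups, r.ID)
		}
	}
	sort.Slice(dups, func(i, j int) bool { return dups[i] < dups[j] })
	return dups
}

func main() {
	runs := []TestRun{
		{ID: 10, RawResultsURL: "https://example.com/a.json"},
		{ID: 11, RawResultsURL: "https://example.com/b.json"},
		{ID: 12, RawResultsURL: "https://example.com/a.json"},
		{ID: 13},
	}
	fmt.Println("duplicates:", findDuplicates(runs))
}
```

A real script would additionally have to delete the duplicate entities (and decide what to do with their stored results), which this sketch leaves out.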

Bigtable

grid/ - an experiment to load all results into Bigtable.
