pgmigrate

package module
v0.0.6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jul 6, 2023 License: MIT Imports: 13 Imported by: 9

README

🐽 pgmigrate

Latest Version Golang

pgmigrate is a modern Postgres migrations CLI and golang library. It is designed for use by high-velocity teams who practice continuous deployment. The goal is to make migrations as simple and reliable as possible.

Major features
  • Applies any previously-unapplied migrations, in ascending filename order — that's it.
  • Each migration is applied within a transaction.
  • Only "up" migrations, no "down" migrations.
  • Uses Postgres advisory locks so it's safe to run in parallel.
  • All functionality is available as a golang library, a docker container, and as a static cli binary
  • Can dump your database schema and data from arbitrary tables to a single migration file
    • This lets you squash migrations
    • This lets you prevent schema conflicts in CI
    • The dumped sql is human readable
    • The dumping process is roundtrip-stable (dumping > applying > dumping gives you the same result)
  • Supports a shared configuration file that you can commit to your git repo
  • CLI contains "ops" commands for manually modifying migration state in your database, for those rare occasions when something goes wrong in prod.
  • Compatible with pgtestdb so database-backed tests are very fast.

Documentation

Quickstart Example

Please visit the ./example directory for a working example of how to use pgmigrate. This example demonstrates:

  • Using the CLI
  • Creating and applying new migrations
  • Dumping your schema to a file
  • Using pgmigrate as an embedded library to run migrations on startup
  • Writing extremely fast database-backed tests

CLI

Install

Homebrew:
# install it
brew install peterldowns/tap/pgmigrate
Manually download binaries

Visit the latest Github release and pick the appropriate binary. Or, click one of the shortcuts here:

Nix (flakes):
# run it
nix run github:peterldowns/pgmigrate -- --help
# install it
nix profile install --refresh github:peterldowns/pgmigrate
Golang:

The golang-installed binaries will not be built with version information.

# run it
go run github.com/peterldowns/pgmigrate/cli@latest -- --help
# install it
go install github.com/peterldowns/pgmigrate/cli@latest
Docker:
# The default CMD is "pgmigrate" which just shows the help screen.
docker run -it --rm ghcr.io/peterldowns/pgmigrate:latest
# To actually run migrations, you'll want to make sure the container can access
# your database and migrations directory and specify a command. To access a
# database running on the host, use `host.docker.internal` instead of
# `localhost` in the connection string:
docker run -it --rm \
  --volume $(pwd)//migrations:/migrations \
  --env PGM_MIGRATIONS=/migrations \
  --env PGM_DATABASE='postgresql://postgres:password@host.docker.internal:5433/postgres' \
  ghcr.io/peterldowns/pgmigrate:latest \
  pgmigrate plan

Configuration

pgmigrate reads its configuration from cli flags, environment variables, and a configuration file, in that order.

pgmigrate will look in the following locations for a configuration file:

  • If you passed --configfile <aaa>, then it reads <aaa>
  • If you defined PGM_CONFIGFILE=<bbb>, then it reads <bbb>
  • If your current directory has a .pgmigrate.yaml file, it reads $(pwd)/.pgmigrate.yaml
  • If the root of your current git repo has a .pgmigrate.yaml file, it reads $(git_repo_root)/.pgmigrate.yaml

Here's an example configuration file. All keys are optional, an empty file is also a valid configuration.

# connection string to a database to manage
database: "postgres://postgres:password@localhost:5433/postgres"
# path to the folder of migration files. if this is relative,
# it is treated as relative to wherever the "pgmigrate" command
# is invoked, NOT as relative to this config file.
migrations: "./tmp/migrations"
# the name of the table to use for storing migration records.  you can give
# this in the form "table" to use your database's default schema, or you can
# give this in the form "schema.table" to explicitly set the schema.
table_name: "custom_schema.custom_table"
# this key configures the "dump" command.
schema:
  # the name of the schema to dump, defaults to "public"
  name: "public"
  # the file to which to write the dump, defaults to "-" (stdout)
  # if this is relative, it is treated as relative to wherever the
  # "pgmigrate" command is invoked, NOT as relative to this config file.
  file: "./schema.sql"
  # any explicit dependencies between database objects that are
  # necessary for the dumped schema to apply successfully.
  dependencies:
    some_view: # depends on
      - some_function
      - some_table
    some_table: # depends on
      - another_table
  # any tables for which the dump should contain INSERT statements to create
  # actual data/rows. this is useful for enums or other tables full of
  # ~constants.
  data:
    - name: "%_enum" # accepts wildcards using SQL query syntax
    - name: "my_example_table" # can also be a literal
      # if not specified, defaults to "*"
      columns:
        - "value"
        - "comment"
      # a valid SQL order clause to use to order the rows in the INSERT
      # statement.
      order_by: "value asc"

Usage

The CLI ships with documentation and examples built in, please see pgmigrate help and pgmigrate help <command> for more details.

# pgmigrate --help
Docs: https://github.com/peterldowns/pgmigrate

Usage:
  pgmigrate [flags]
  pgmigrate [command]

Examples:
  # Preview and then apply migrations
  pgmigrate plan     # Preview which migrations would be applied
  pgmigrate migrate  # Apply any previously-unapplied migrations
  pgmigrate verify   # Verify that migrations have been applied correctly
  pgmigrate applied  # Show all previously-applied migrations
  
  # Dump the current schema to a file
  pgmigrate dump --out schema.sql

Migrating:
  applied     Show all previously-applied migrations
  migrate     Apply any previously-unapplied migrations
  plan        Preview which migrations would be applied
  verify      Verify that migrations have been applied correctly

Operations:
  ops         Perform manual operations on migration records
  version     Print the version of this binary

Development:
  config      Print the current configuration / settings
  dump        Dump the database schema as a single migration file
  help        Help about any command
  new         generate the name of the next migration file based on the current sequence prefix

Flags:
      --configfile string   [PGM_CONFIGFILE] a path to a configuration file
  -d, --database string     [PGM_DATABASE] a 'postgres://...' connection string
  -h, --help                help for pgmigrate
      --log-format string   [PGM_LOGFORMAT] 'text' or 'json', the log line format (default 'text')
  -m, --migrations string   [PGM_MIGRATIONS] a path to a directory containing *.sql migrations
      --table-name string   [PGM_TABLENAME] the table name to use to store migration records (default 'pgmigrate_migrations')
  -v, --version             version for pgmigrate

Use "pgmigrate [command] --help" for more information about a command.

Library

Install

  • requires golang 1.18+ because it uses generics.
  • only depends on stdlib; all dependencies in the go.mod are for tests.
# library
go get github.com/peterldowns/pgmigrate@latest

Usage

All of the methods available in the CLI are equivalently named and available in the library. Please read the cli help with pgmigrate help <command> or read the the go.dev docs at pkg.go.dev/github.com/peterldowns/pgmigrate.

FAQ

How does it work?

pgmigrate has the following invariants, rules, and behavior:

  • A migration is a file whose name ends in .sql. The part before the extension is its unique ID.
  • All migrations are "up" migrations, there is no such thing as a "down" migration.
  • The migrations table is a table that pgmigrate uses to track which migrations have been applied. It has the following schema:
    • id (text not null): the ID of the migration
    • checksum (text not null): the MD5() hash of the contents of the migration when it was applied.
    • execution_time_in_millis (integer not null): how long it took to apply the migration, in milliseconds.
    • applied_at (timestamp with time zone not null): the time at which the migration was finished applying and this row was inserted.
  • A plan is an ordered list of previously-unapplied migrations. The migrations are sorted by their IDs, in ascending lexicographical/alphabetical order. This is the same order that you get when you use ls or sort.
  • Each time migrations are applied, pgmigrate calculates the plan, then attempts to apply each migration one at a time.
  • To apply a migration, pgmigrate:
    • Begins a transaction.
      • Runs the migration SQL.
      • Creates and inserts a new row in the migrations table.
    • Commits the transaction.
  • Because each migration is applied in an explicit transaction, you must not use BEGIN/COMMIT/ROLLBACK within your migration files.
  • Any error when applying a migration will result in an immediate failure. If there are other migrations later in the plan, they will not be applied.
  • If and only if a migration is applied successfully, there will be a row in the migrations table containing its ID.
  • pgmigrate uses Postgres advisory locks to ensure that only once instance is attempting to run migrations at any point in time.
  • It is safe to run migrations as part of an init container, when your binary starts, or any other parallel way.
  • After a migration has been applied you should not edit the file's contents.
    • Editing its contents will not cause it to be re-applied.
    • Editing its contents will cause pgmigrate to show a warning that the hash of the migration differs from the hash of the migration when it was applied.
  • After a migration has been applied you should never delete the migration. If you do, pgmigrate will warn you that a migration that had previously been applied is no longer present.

Why use pgmigrate instead of the alternatives?

pgmigrate has the following features and benefits:

  • your team can merge multiple migrations with the same sequence number (00123_create_a.sql, 00123_update_b.sql).
  • your team can merge multiple migrations "out of order" (merge 00123_create_a.sql, then merge 00121_some_other.sql).
  • your team can dump a human-readable version of your database schema to help with debugging and to prevent schema conflicts while merging PRs.
  • your team can squash migration files to speed up new database creation and reduce complexity.
  • you never need to think about down migrations ever again (you don't use them and they're not necessary).
  • you can see exactly when each migration was applied, and the hash of the file contents of that migration, which helps with auditability and debugging.
  • if a migration fails you can simply edit the file and then redeploy without having to perform any manual operations.
  • the full functionality of pgmigrate is available no matter how you choose to use it (cli, embedded library, docker container).

How should my team work with it?

the migrations directory

Your team repository should include a migrations/ directory containing all known migrations.

migrations
├── 0001_cats.sql
├── 0003_dogs.sql
├── 0003_empty.sql
├── 0004_rm_me.sql

Because your migrations are applied in ascending lexicographical order, you should use a consistent-length numerical prefix for your migration files. This will mean that when you ls the directory, you see the migrations in the same order that they will be applied. Some teams use unix timestamps, others use integers, it doesn't matter as long as you're consistent.

creating a new migration

Add a new migration by creating a new file in your migrations/ directory ending in .sql. The usual work flow is:

  • Create a new feature branch
  • Create a new migration with a sequence number one greater than the most recent migration
  • Edit the migration contents

It is OK for you and another coworker to use the same sequence number. If you both choose the exact same filename, git will prevent you from merging both PRs.

what's allowed in a migration

You can do anything you'd like in a migration except for the following limitations:

  • migrations must not use transactions (BEGIN/COMMIT/ROLLBACK) as pgmigrate will run each migration inside of a transaction.
  • migrations must not use CREATE INDEX CONCURRENTLY as this is guaranteed to fail inside of a transaction.
preventing conflicts

You may be wondering, how is running "any previously unapplied migration" safe? What if there are two PRs that contain conflicting migrations?

For instance let's say two new migrations get created,

  • 0006_aaa_delete_users.sql, which deletes the users table
  • 0006_bbb_create_houses.sql, which creates a new houses table with a foreign key to users.
├── ...
├── 0006_aaa_delete_users.sql
├── 0006_bbb_create_houses.sql

There's no way both of these migrations could be safely applied, and the resulting database state could be different depending on the order!

  • If 0006_aaa_delete_users.sql is merged and applied first, then 0006_bbb_create_houses.sql is guaranteed to fail because there is no longer a users table to reference in the foreign key.
  • If 0006_bbb_create_houses.sql is merged and applied first, then 0006_aaa_delete_users.sql will either fail (because it cannot delete the users table) or result in the deletion of the houses table as well (in the case of ON DELETE CASCADE on the foreign key).

You can prevent this conflict at CI-time by using pgmigrate to maintain an up-to-date dump of your database schema. This schema dump will cause a git merge conflict so that only one of the migrations can be merged, and the second will force the developer to update the PR and the migration:

# schema.sql should be checked in to your repository, and CI should enforce that
# it is up to date. The easiest way to do this is to spin up a database, apply
# the migrations, and run the dump command.  Then, error if there are any
# changes detected:
pgmigrate dump -o schema.sql

You should also make sure to run a CI check on your main/dev branch that creates a new database and applies all known migrations. This check should block deploying until it succeeds.

Returning to the example of two conflicting migrations being merged, we can see how these guards provide a good developer experience and prevent a broken migration from being deployed:

  1. One of the two migrations is merged. The second branch should not be able to be merged because the dumped schema.sql will contain a merge conflict.
  2. If for some reason both of the migrations are able to be merged, the check on the main/dev branch will fail to apply migrations and block the deploy. because the migrations cannot be applied. Breaking main is annoying, but...

Lastly, you should expect this situation to happen only rarely. Most teams, even with large numbers of developers working in parallel, coordinate changes to shared tables such that conflicting schema changes are a rare event.

deploying and applying migrations

You should run pgmigrate with the latest migrations directory each time you deploy. You can do this by:

  • using pgmigrate as a golang library, and calling pgmigrate.Migrate(...) when your application starts
  • using pgmigrate as a cli or as a docker init container and applying migrations before your application starts.

Your application should fail to start if migrations fail for any reason.

Your application should start successfully if there are verification errors or warnings, but you should treat those errors as a sign there is a difference between the expected database state and the schema as defined by your migration files.

Because pgmigrate uses advisory locks, you can roll out as many new instances of your application as you'd like. Even if multiple instance attempt to run the migrations at once, only one will acquire the lock and apply the migrations. The other instances will wait for it to succeed and then no-op.

backwards compatibility

Assuming you're running in a modern cloud environment, you're most likely doing rolling deployments where new instances of your application are brought up before old ones are terminated. Therefore, make sure any new migrations will result in a database state that the previous version of your application (which will still be running as migrations are applied) can handle.

squashing migrations

At some point, if you have hundreds or thousands of migration files, you may want to replace them with a single migration file that achieves the same thing. You may want this because:

  • creating a new dev or test database and applying migrations will be faster if there are fewer migrations to run.
  • having so many migration files makes it annoying to add new migrations
  • having so many migration files gives lots of out-of-date results when searching for sql tables/views/definitions.

This process will involve manually updating the migrations table of your staging/production databases. Your coworkers will need to recreate their development databases or manually update their migration state with the same commands used in staging/production. Make sure to coordinate carefully with your team and give plenty of heads up beforehand. This should be an infrequent procedure.

Start by replacing your migrations with the output of pgmigrate dump. This can be done in a pull request just like any other change.

  • Apply all current migrations to your dev/local database and verify that they were applied:
export PGM_MIGRATIONS="./migrations"
pgmigrate apply
pgmigrate verify
  • Remove all existing migration files:
rm migrations/*.sql
  • Dump the current schema as a new migration:
pgmigrate dump -o migrations/00001_squash_on_2023_07_02.sql

This "squash" migration does the exact same thing as all the migration files that it replaced, which is the goal! But before you can deploy and run migrations, you will need to manually mark this migration as having already been applied. Otherwise, pgmigrate would attempt to apply it, and that almost certainly wouldn't work. The commands below use $PROD to reference the connection string for the database you are manually modifying, but you will need to do this on every database for which you manage migrations.

  • Double-check that the schema dumped from production is the exact same as the squash migration file. If there are any differences in these two files, DO NOT continue with the rest of this process. You will need to figure out why your production database schema is different than that described by your migrations. If necessary, please report a bug or issue on Github if pgmigrate is the reason for the difference.
mkdir -p tmp
pgmigrate --database $PROD dump -o tmp/prod-schema.sql
# This should result in no differences being printed. If you see any
# differences, please abort this process.
diff migrations/00001_squash_on_2023_07_02.sql tmp/prod-schema.sql
rm tmp/prod-schema.sql
  • Remove the records of all previous migrations having been applied.
# DANGER: Removes all migration records from the database
pgmigrate --database $PROD ops mark-unapplied --all
  • Mark this migration as having been applied
# DANGER: marks all migrations in the directory (only our squash migration in
# this case) as having been applied without actually running the migrations.
pgmigrate --database $PROD ops mark-applied --all
  • Check that the migration plan is empty, the result should show no migrations need to be applied.
pgmigrate --database $PROD plan
  • Verify the migrations state, should show no errors or problems.
pgmigrate --database $PROD verify

Acknowledgements

I'd like to thank and acknowledge:

  • All existing migration libraries for inspiration.
  • djrobstep's schemainspect and migra projects, for the queries used to implement pgmigrate dump.
  • The backend team at Pipe for helping test and validate this project's assumptions, utility, and implementation.

Future Work / TODOs

Documentation

Index

Constants

View Source
const (
	// DefaultTableName is the default name of the migrations table that
	// pgmigrate will use to store a record of applied migrations.
	DefaultTableName string = "pgmigrate_migrations"
)

Variables

This section is empty.

Functions

func IDFromFilename

func IDFromFilename(filename string) string

IDFromFilename removes directory paths and extensions from the filename to return just the filename (no extension).

Examples:

"0001_initial" == IDFromFilename("0001_initial.sql")
"0002_whatever.up" == IDFromFilename("0002_whatever.up.sql")

func SortByID

func SortByID(migrations []Migration)

SortByID sorts a slice of Migration in ascending lexicographical order by their ID. This means that they should show up in the same order that they appear when you use `ls` or `sort`.

Types

type AppliedMigration

type AppliedMigration struct {
	Migration
	Checksum              string    // The MD5 hash of the SQL of this migration
	ExecutionTimeInMillis int64     // How long it took to run this migration
	AppliedAt             time.Time // When the migration was run
}

AppliedMigration represents a successfully-executed Migration. It embeds the Migration, and adds fields for execution results.

func Applied

func Applied(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]AppliedMigration, error)

Applied returns a list of [AppliedMigration]s in the order that they were applied in (applied_at ASC, id ASC).

If there are no applied migrations, or the specified table does not exist, this will return an empty list without an error.

func MarkAllApplied added in v0.0.2

func MarkAllApplied(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]AppliedMigration, error)

MarkAllApplied (⚠️ danger) is a manual operation that marks all known migrations as applied without running them.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as applied.

func MarkAllUnapplied added in v0.0.2

func MarkAllUnapplied(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]AppliedMigration, error)

MarkAllUnapplied (⚠️ danger) is a manual operation that marks all known migrations as unapplied (not having been run) by removing their records from the migrations table.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as unapplied.

func MarkApplied added in v0.0.2

func MarkApplied(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger, ids ...string) ([]AppliedMigration, error)

MarkApplied (⚠️ danger) is a manual operation that marks specific migrations as applied without running them.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as applied.

func MarkUnapplied added in v0.0.2

func MarkUnapplied(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger, ids ...string) ([]AppliedMigration, error)

MarkUnapplied (⚠️ danger) is a manual operation that marks specific migrations as unapplied (not having been run) by removing their records from the migrations table.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as unapplied.

func RecalculateAllChecksums added in v0.0.2

func RecalculateAllChecksums(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]AppliedMigration, error)

RecalculateChecksums (⚠️ danger) is a manual operation that explicitly recalculates the checksums of all known migrations and updates their records in the migrations table to have the calculated checksum.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s whose checksums have been recalculated.

func RecalculateChecksums added in v0.0.2

func RecalculateChecksums(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger, ids ...string) ([]AppliedMigration, error)

RecalculateChecksums (⚠️ danger) is a manual operation that explicitly recalculates the checksums of the specified migrations and updates their records in the migrations table to have the calculated checksum.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s whose checksums have been recalculated.

func SetChecksums added in v0.0.2

func SetChecksums(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger, updates ...ChecksumUpdate) ([]AppliedMigration, error)

SetChecksums (⚠️ danger) is a manual operation that explicitly sets the recorded checksum of applied migrations in the migrations table.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s whose checksums have been updated.

type ChecksumUpdate added in v0.0.2

type ChecksumUpdate struct {
	MigrationID string // The ID of the migration to update, `0001_initial`
	NewChecksum string // The checksum to set in the migrations table, `aaaabbbbccccdddd`
}

ChecksumUpdate represents an update to a specific migration. This struct is used instead of a `map[migrationID]checksum“ in order to apply multiple updates in a consistent order.

type Executor

type Executor interface {
	ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
	QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row
	QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error)
	BeginTx(ctx context.Context, opts *sql.TxOptions) (*sql.Tx, error)
}

Executor is satisfied by *sql.DB as well as *sql.Conn. Many of the Migrator's methods are designed to work inside of a session-scoped lock, which requires running queries on a *sql.Conn. These methods accept an Executor so that they can more easily be used by an external caller.

type Helper added in v0.0.2

type Helper interface {
	Helper()
}

Helper is an optional interface that your logger can implement to help make debugging and stacktraces easier to understand, primarily in tests. If a Logger implements this interface, pgmigrate will call Helper() in its own helper methods for writing to your logger, with the goal of omitting pgmigrate's helper methods from your stacktraces.

For instance, the TestLogger we provide embeds a testing.T, which implements Helper().

You do *not* need to implement this interface in order for pgmigrate to successfully use your logger.

type LogField added in v0.0.2

type LogField struct {
	Key   string
	Value any
}

LogField holds a key/value pair for structured logging.

type LogLevel added in v0.0.2

type LogLevel string

LogLevel represents the severity of the log message, and is one of

const (
	LogLevelDebug   LogLevel = "debug"
	LogLevelInfo    LogLevel = "info"
	LogLevelError   LogLevel = "error"
	LogLevelWarning LogLevel = "warning"
)

type Logger added in v0.0.2

type Logger interface {
	Log(context.Context, LogLevel, string, ...LogField)
}

Logger is a generic logging interface so that you can easily use pgmigrate with your existing structured ogging solution -- hopefully it is not difficult for you to write an adapter.

type Migration

type Migration struct {
	ID  string // the filename of the migration, without the .sql extension
	SQL string // the contents of the migration file
}

Migration represents a single SQL migration.

func Load

func Load(filesystem fs.FS) ([]Migration, error)

Load walks a filesystem from its root and extracts all files ending in `.sql` as Migrations, with the filename (without extension) being the ID and the file's contents being the SQL.

From disk:

	// the migration files will be read at run time
    fs := os.DirFS("./path/to/migrations/directory/*.sql")

From an embedded fs:

	// the migration files will be embedded at compile time
    //go:embed path/to/migrations/directory/*.sql
	var fs embed.FS

Load returns the migrations in sorted order.

func Plan

func Plan(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]Migration, error)

Plan shows which migrations, if any, would be applied, in the order that they would be applied in.

The plan will be a list of [Migration]s that are present in the migrations directory that have not yet been marked as applied in the migrations table.

The migrations in the plan will be ordered by their IDs, in ascending lexicographical order. This is the same order that you see if you use "ls". This is also the same order that they will be applied in.

The ID of a migration is its filename without the ".sql" suffix.

A migration will only ever be applied once. Editing the contents of the migration file will NOT result in it being re-applied. Instead, you will see a verification error warning that the contents of the migration differ from its contents when it was previously applied.

Migrations can be applied "out of order". For instance, if there were three migrations that had been applied:

  • 001_initial
  • 002_create_users
  • 003_create_viewers

And a new migration "002_create_companies" is merged:

  • 001_initial
  • 002_create_companies
  • 002_create_users
  • 003_create_viewers

Running "pgmigrate plan" will show:

  • 002_create_companies

Because the other migrations have already been applied. This is by design; most of the time, when you're working with your coworkers, you will not write migrations that conflict with each other. As long as you use a migration name/number higher than that of any dependencies, you will not have any problems.

func (*Migration) MD5

func (m *Migration) MD5() string

MD5 computes the MD5 hash of the SQL for this migration so that it can be uniquely identified. After a Migration is applied, the AppliedMigration will store this hash in the `Checksum` field.

type Migrator

type Migrator struct {
	// Migrations is the full set of migrations that describe the desired state
	// of the database.
	Migrations []Migration
	// Logger is used by the Migrator to log messages as it operates. It is
	// designed to be easy to adapt to whatever logging system you use.
	//
	// [NewMigrator] defaults it to `nil`, which will prevent any messages from
	// being logged.
	Logger Logger
	// TableName is the table that this migrator should use to keep track of
	// applied migrations.
	//
	// [NewMigrator] defaults it to [DefaultTableName].
	TableName string
}

Migrator should be instantiated with NewMigrator rather than used directly. It contains the state necessary to perform migrations-related operations.

func NewMigrator

func NewMigrator(
	migrations []Migration,
) *Migrator

NewMigrator creates a Migrator and sets appropriate default values for all configurable fields:

To configure these fields, just set the values on the struct.

func (*Migrator) Applied

func (m *Migrator) Applied(ctx context.Context, db Executor) ([]AppliedMigration, error)

Applied returns a list of [AppliedMigration]s in the order that they were applied in (applied_at ASC, id ASC).

If there are no applied migrations, or the specified table does not exist, this will return an empty list without an error.

func (*Migrator) MarkAllApplied added in v0.0.2

func (m *Migrator) MarkAllApplied(ctx context.Context, db Executor) ([]AppliedMigration, error)

MarkAllApplied (⚠️ danger) is a manual operation that marks all known migrations as applied without running them.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as applied.

func (*Migrator) MarkAllUnapplied added in v0.0.2

func (m *Migrator) MarkAllUnapplied(ctx context.Context, db Executor) ([]AppliedMigration, error)

MarkAllUnapplied (⚠️ danger) is a manual operation that marks all known migrations as unapplied (not having been run) by removing their records from the migrations table.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as unapplied.

func (*Migrator) MarkApplied added in v0.0.2

func (m *Migrator) MarkApplied(ctx context.Context, db Executor, ids ...string) ([]AppliedMigration, error)

MarkApplied (⚠️ danger) is a manual operation that marks specific migrations as applied without running them.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as applied.

func (*Migrator) MarkUnapplied added in v0.0.2

func (m *Migrator) MarkUnapplied(ctx context.Context, db Executor, ids ...string) ([]AppliedMigration, error)

MarkUnapplied (⚠️ danger) is a manual operation that marks specific migrations as unapplied (not having been run) by removing their records from the migrations table.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s that have been marked as unapplied.

func (*Migrator) Migrate

func (m *Migrator) Migrate(ctx context.Context, db *sql.DB) ([]VerificationError, error)

Migrate will apply any previously applied migrations. It stores metadata in the database with the following schema:

  • id: text not null
  • checksum: text not null
  • execution_time_in_millis: integer not null
  • applied_at: timestamp with time zone not null

It does the following things:

First, acquire an advisory lock to prevent conflicts with other instances that may be running in parallel. This way only one migrator will attempt to run the migrations at any point in time.

Then, calculate a plan of migrations to apply. The plan will be a list of migrations that have not yet been marked as applied in the migrations table. The migrations in the plan will be ordered by their IDs, in ascending lexicographical order.

For each migration in the plan,

  • Begin a transaction
  • Run the migration
  • Create a record in the migrations table saying that the migration was applied
  • Commit the transaction

If a migration fails at any point, the transaction will roll back. A failed migration results in NO record for that migration in the migrations table, which means that future attempts to run the migrations will include it in their plan.

Migrate() will immediately return the error related to a failed migration, and will NOT attempt to run any further migrations. Any migrations applied before the failure will remain applied. Any migrations not yet applied will not be attempted.

If all the migrations in the plan are applied successfully, then call Verify() to double-check that all known migrations have been marked as applied in the migrations table.

Finally, the advisory lock is released.

func (*Migrator) Plan

func (m *Migrator) Plan(ctx context.Context, db Executor) ([]Migration, error)

Plan shows which migrations, if any, would be applied, in the order that they would be applied in.

The plan will be a list of [Migration]s that are present in the migrations directory that have not yet been marked as applied in the migrations table.

The migrations in the plan will be ordered by their IDs, in ascending lexicographical order. This is the same order that you see if you use "ls". This is also the same order that they will be applied in.

The ID of a migration is its filename without the ".sql" suffix.

A migration will only ever be applied once. Editing the contents of the migration file will NOT result in it being re-applied. Instead, you will see a verification error warning that the contents of the migration differ from its contents when it was previously applied.

Migrations can be applied "out of order". For instance, if there were three migrations that had been applied:

  • 001_initial
  • 002_create_users
  • 003_create_viewers

And a new migration "002_create_companies" is merged:

  • 001_initial
  • 002_create_companies
  • 002_create_users
  • 003_create_viewers

Running "pgmigrate plan" will show:

  • 002_create_companies

Because the other migrations have already been applied. This is by design; most of the time, when you're working with your coworkers, you will not write migrations that conflict with each other. As long as you use a migration name/number higher than that of any dependencies, you will not have any problems.

func (*Migrator) RecalculateAllChecksums added in v0.0.2

func (m *Migrator) RecalculateAllChecksums(ctx context.Context, db Executor) ([]AppliedMigration, error)

func (*Migrator) RecalculateChecksums added in v0.0.2

func (m *Migrator) RecalculateChecksums(ctx context.Context, db Executor, ids ...string) ([]AppliedMigration, error)

RecalculateChecksums (⚠️ danger) is a manual operation that explicitly recalculates the checksums of the specified migrations and updates their records in the migrations table to have the calculated checksum.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s whose checksums have been recalculated.

func (*Migrator) SetChecksums added in v0.0.2

func (m *Migrator) SetChecksums(ctx context.Context, db Executor, updates ...ChecksumUpdate) ([]AppliedMigration, error)

SetChecksums (⚠️ danger) is a manual operation that explicitly sets the recorded checksum of applied migrations in the migrations table.

You should NOT use this as part of normal operations, it exists to help devops/db-admin/sres interact with migration state.

It returns a list of the [AppliedMigration]s whose checksums have been updated.

func (*Migrator) Verify

func (m *Migrator) Verify(ctx context.Context, db Executor) ([]VerificationError, error)

Verify returns a list of [VerificationError]s with warnings for any migrations that:

  • Are marked as applied in the database table but do not exist in the migrations directory.
  • Have a different checksum in the database than the current file hash.

These warnings usually signify that the schema described by the migrations no longer matches the schema in the database. Usually the cause is removing/editing a migration without realizing that it was already applied to a database.

The most common cause of a warning is in the case that a new release/deployment contains migrations, the migrations are applied successfully, but the release is then rolled back due to other issues. In this case the warning is just that, a warning, and should not be a long-term problem.

These warnings should not prevent your application from starting, but are worth showing to a human devops/db-admin/sre-type person for them to investigate.

type TestLogger added in v0.0.2

type TestLogger struct {
	*testing.T
}

TestLogger implements the Logger and Helper interface and writes all logs to a given test's output in such a way that stack traces are correctly preserved.

func NewTestLogger added in v0.0.2

func NewTestLogger(t *testing.T) TestLogger

NewTestLogger returns a TestLogger, which is a Logger and Helper (due to the embedded testing.T) that writes all logs to a given test's output in such a way that stack traces are correctly preserved.

func (TestLogger) Log added in v0.0.2

func (t TestLogger) Log(_ context.Context, level LogLevel, msg string, fields ...LogField)

Log writes a message to a given test's output in pseudo key=value form.

type VerificationError

type VerificationError struct {
	Message string
	Fields  map[string]any
}

A VerificationError represents a warning of either of two types:

  • a migration is marked as applied to the database but is not present in the directory of migrations: this can happen if a migration is applied, but the code containing that migration is later rolled back.
  • a migration whose hash (when applied) doesn't match its current hash (when calculated from its SQL contents): this can happen if someone edits a migration after it was previously applied.

These verification errors are worth looking into, but should not be treated the same as a failure to apply migrations. Typically these are warned or alerted on by the app using this migration library, and results in a human intervening in some way.

func Migrate

func Migrate(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]VerificationError, error)

Migrate will apply any previously applied migrations. It stores metadata in the DefaultTableName table, with the following schema: - id: text not null - checksum: text not null - execution_time_in_millis: integer not null - applied_at: timestamp with time zone not null

It does the following things:

First, acquire an advisory lock to prevent conflicts with other instances that may be running in parallel. This way only one migrator will attempt to run the migrations at any point in time.

Then, calculate a plan of migrations to apply. The plan will be a list of migrations that have not yet been marked as applied in the migrations table. The migrations in the plan will be ordered by their IDs, in ascending lexicographical order.

For each migration in the plan,

  • Begin a transaction
  • Run the migration
  • Create a record in the migrations table saying that the migration was applied
  • Commit the transaction

If a migration fails at any point, the transaction will roll back. A failed migration results in NO record for that migration in the migrations table, which means that future attempts to run the migrations will include it in their plan.

Migrate() will immediately return the error related to a failed migration, and will NOT attempt to run any further migrations. Any migrations applied before the failure will remain applied. Any migrations not yet applied will not be attempted.

If all the migrations in the plan are applied successfully, then call Verify() to double-check that all known migrations have been marked as applied in the migrations table.

Finally, the advisory lock is released.

func Verify

func Verify(ctx context.Context, db *sql.DB, dir fs.FS, logger Logger) ([]VerificationError, error)

Verify returns a list of [VerificationError]s with warnings for any migrations that:

  • Are marked as applied in the database table but do not exist in the migrations directory.
  • Have a different checksum in the database than the current file hash.

These warnings usually signify that the schema described by the migrations no longer matches the schema in the database. Usually the cause is removing/editing a migration without realizing that it was already applied to a database.

The most common cause of a warning is in the case that a new release/deployment contains migrations, the migrations are applied successfully, but the release is then rolled back due to other issues. In this case the warning is just that, a warning, and should not be a long-term problem.

These warnings should not prevent your application from starting, but are worth showing to a human devops/db-admin/sre-type person for them to investigate.

Directories

Path Synopsis
cli module
cmd
pgmigrate Module
example module
internal
migrations
migrations contains example migration data that is used in tests.
migrations contains example migration data that is used in tests.
multierr
multierr is a reimplementation of https://github.com/uber-go/multierr/ with a much simpler API and zero dependencies.
multierr is a reimplementation of https://github.com/uber-go/multierr/ with a much simpler API and zero dependencies.
sessionlock
sessionlock package provides support for application level distributed locks via advisory locks in PostgreSQL.
sessionlock package provides support for application level distributed locks via advisory locks in PostgreSQL.
withdb
withdb is a simplified way of creating test databases, used to test the internal packages that pgtestdb depends on.
withdb is a simplified way of creating test databases, used to test the internal packages that pgtestdb depends on.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL