gitaly

module
v14.10.0-rc1
Published: Apr 6, 2022 License: MIT

README

Gitaly

Quick Links: Roadmap | Want to Contribute? | GitLab Gitaly Issues | GitLab Gitaly Merge Requests


Gitaly is a Git RPC service that handles all the Git calls made by GitLab.

To see where it fits in, please look at GitLab's architecture.

Project Goals

Fault-tolerant horizontal scaling of Git storage in GitLab, and particularly on GitLab.com.

This will be achieved by focusing on two areas (in this order):

  1. Migrate from repository access via NFS to gitaly-proto, GitLab's new Git RPC protocol
  2. Evolve from large Gitaly servers managed as "pets" to smaller Gitaly servers that are "cattle"

Current Status

As of GitLab 11.5, almost all application code accesses Git repositories through Gitaly instead of direct disk access. GitLab.com production no longer uses direct disk access to touch Git repositories; the NFS mounts have been removed.

For performance reasons, some RPCs can still be performed through NFS. We are working to mitigate these performance issues by removing Gitaly N+1 call patterns. Once NFS access is no longer necessary, we can conclude the migration project by removing the Git repository storage paths from the gitlab-rails configuration.

In the meantime we are building features according to our roadmap.

If you're interested in seeing how well Gitaly is performing on GitLab.com, read about our observability story!

[Image: Overall]

[Image: By Feature]

Installation

Most users won't install Gitaly on its own. It is already included in your GitLab installation.

Gitaly requires Go 1.16 or Go 1.17 and Ruby 2.7. Run make to download and compile Ruby dependencies, and to compile the Gitaly Go executable.

Gitaly uses git. Versions 2.33.0 and newer are supported.
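
To illustrate the minimum Git version stated above, a version comparison could be sketched as follows. This is a hypothetical helper, not Gitaly's own check: the names minGitVersion, parseVersion, and atLeast are assumptions introduced here, and the sketch only handles plain dotted numeric versions such as 2.33.0 (suffixes like release-candidate tags would need extra handling).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minGitVersion mirrors the minimum stated in this README.
// The helpers below are hypothetical and not part of Gitaly.
const minGitVersion = "2.33.0"

// parseVersion turns a dotted version such as "2.33.0" into its numeric parts.
func parseVersion(v string) ([]int, error) {
	fields := strings.Split(v, ".")
	parts := make([]int, 0, len(fields))
	for _, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil {
			return nil, fmt.Errorf("invalid component %q in version %q: %w", f, v, err)
		}
		parts = append(parts, n)
	}
	return parts, nil
}

// atLeast reports whether version >= minimum, comparing component by
// component. Missing trailing components are treated as zero.
func atLeast(version, minimum string) (bool, error) {
	a, err := parseVersion(version)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(minimum)
	if err != nil {
		return false, err
	}
	for i := 0; i < len(a) || i < len(b); i++ {
		var x, y int
		if i < len(a) {
			x = a[i]
		}
		if i < len(b) {
			y = b[i]
		}
		if x != y {
			return x > y, nil
		}
	}
	return true, nil
}

func main() {
	for _, v := range []string{"2.32.0", "2.33.0", "2.35.1"} {
		ok, err := atLeast(v, minGitVersion)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("git %s supported: %t\n", v, ok)
	}
}
```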

Configuration

The administration and reference guide is documented in the GitLab project.

Contributing

See CONTRIBUTING.md.

Name

Gitaly is a tribute to Git and the town of Aly. Just as the town of Aly has zero inhabitants most of the year, we would like to reduce the number of disk operations to zero for most actions. It doesn't hurt that it sounds like Italy, the capital of which is the destination of all roads. All Git actions in GitLab end up in Gitaly.

Design

High-level architecture overview:

[Diagram: Gitaly Architecture]

Gitaly clients

As of Q4 2018, the following GitLab components act as Gitaly clients:

  • gitlab-rails: the main GitLab Rails application.
  • gitlab-shell: for git clone, git push etc. via SSH.
  • gitlab-workhorse: for git clone via HTTPS and for slow requests that serve raw Git data.
  • gitaly-ssh: for internal Git data transfers between Gitaly servers.
  • gitaly-ruby: for RPCs that interact with more than one repository, such as merging a branch.

The clients written in Go (gitlab-shell, gitlab-workhorse, gitaly-ssh) use library code from the gitlab.com/gitlab-org/gitaly/client package.

High Availability

We are working on a high-availability (HA) solution for Gitaly based on asynchronous replication. A Gitaly server would be made highly available by assigning one or more standby servers ("secondaries") to it, each of which contains a full copy of all the repository data on the primary Gitaly server.

To implement this we are building a new GitLab component called Praefect, which is hosted alongside the rest of Gitaly in this repository. As we currently envision it, Praefect will have four responsibilities:

  • route RPC traffic to the primary Gitaly server
  • inspect RPC traffic and mark repositories as dirty if the RPC is a "mutator"
  • ensure dirty repositories have their changes replicated to the secondary Gitaly servers
  • in the event of a failure on the primary, demote it to secondary and elect a new primary

Praefect has internal state: it needs to be able to "remember" which repositories are in need of replication, and which Gitaly server is the primary. We will use Postgres to store Praefect's internal state.
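
The four responsibilities and the internal state described above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not Praefect's implementation: the Router type, its methods, the server names, and the RPC names ExampleRead and ExampleWrite are all hypothetical, state is kept in memory rather than in Postgres, and "electing" a new primary simply promotes the first secondary.

```go
package main

import (
	"errors"
	"fmt"
)

// Router is a hypothetical in-memory stand-in for Praefect's internal state.
// Praefect itself is described as storing this state in Postgres.
type Router struct {
	primary     string
	secondaries []string
	mutators    map[string]bool // RPC names assumed to modify repositories
	dirty       map[string]bool // repositories that still need replication
}

// NewRouter copies its inputs so later failovers do not alter the caller's slices.
func NewRouter(primary string, secondaries, mutators []string) *Router {
	r := &Router{
		primary:     primary,
		secondaries: append([]string(nil), secondaries...),
		mutators:    make(map[string]bool),
		dirty:       make(map[string]bool),
	}
	for _, m := range mutators {
		r.mutators[m] = true
	}
	return r
}

// Route sends every RPC to the primary and marks the repository dirty
// when the RPC is a mutator.
func (r *Router) Route(rpc, repo string) string {
	if r.mutators[rpc] {
		r.dirty[repo] = true
	}
	return r.primary
}

// Replicate returns the replication jobs (repository -> target servers)
// for all dirty repositories and clears their dirty flags.
func (r *Router) Replicate() map[string][]string {
	jobs := make(map[string][]string)
	for repo := range r.dirty {
		jobs[repo] = append([]string(nil), r.secondaries...)
		delete(r.dirty, repo)
	}
	return jobs
}

// Failover demotes the current primary to a secondary and promotes the
// first secondary. A real election would be more involved.
func (r *Router) Failover() error {
	if len(r.secondaries) == 0 {
		return errors.New("no secondary available to promote")
	}
	old := r.primary
	r.primary = r.secondaries[0]
	r.secondaries = append(r.secondaries[1:], old)
	return nil
}

func main() {
	r := NewRouter("gitaly-1", []string{"gitaly-2", "gitaly-3"}, []string{"ExampleWrite"})
	fmt.Println("read routed to:", r.Route("ExampleRead", "repo-a"))
	fmt.Println("write routed to:", r.Route("ExampleWrite", "repo-a"))
	fmt.Println("replication jobs:", r.Replicate())
	if err := r.Failover(); err != nil {
		fmt.Println("failover failed:", err)
		return
	}
	fmt.Println("new primary:", r.primary)
}
```

Because replication here runs after the mutating RPC has already been routed, the sketch also shows why this design is asynchronous: a secondary can lag behind the primary until the dirty repository is replicated.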

As of December 2019 we are busy rolling out the Postgres integration in Praefect. The minimum supported Postgres version is 9.6, just like the rest of GitLab.

Further reading

More about the project and its processes is detailed in the docs.

Distributed Tracing

Gitaly supports distributed tracing through LabKit using OpenTracing APIs.

By default, no tracing implementation is linked into the binary, but different OpenTracing providers can be linked in using build tags/build constraints. This can be done by setting the BUILD_TAGS make variable.

For details of the supported providers, see LabKit. As an example, for Jaeger tracing support, include the tags: BUILD_TAGS="tracer_static tracer_static_jaeger".

$ make BUILD_TAGS="tracer_static tracer_static_jaeger"

Once Gitaly is compiled with an OpenTracing provider, tracing is configured via the GITLAB_TRACING environment variable.

For example, to configure Jaeger, you could use the following command:

GITLAB_TRACING=opentracing://jaeger ./gitaly config.toml
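
The value above has the shape of a URL, with opentracing as the scheme and the provider name in the host position. As a rough illustration of how such a value can be taken apart, the following sketch uses Go's standard net/url package. It is not LabKit's parser: the parseTracing function is hypothetical, and treating query parameters as provider options is an assumption made for this example.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// parseTracing is a hypothetical helper that splits a GITLAB_TRACING-style
// value into a provider name and a map of options taken from the query string.
func parseTracing(raw string) (string, map[string]string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", nil, fmt.Errorf("parse tracing value: %w", err)
	}
	if u.Scheme != "opentracing" {
		return "", nil, fmt.Errorf("unexpected scheme %q, want \"opentracing\"", u.Scheme)
	}
	if u.Host == "" {
		return "", nil, fmt.Errorf("no provider given in %q", raw)
	}
	options := make(map[string]string)
	for key, values := range u.Query() {
		if len(values) > 0 {
			options[key] = values[0] // keep the first value for each key
		}
	}
	return u.Host, options, nil
}

func main() {
	raw := os.Getenv("GITLAB_TRACING")
	if raw == "" {
		raw = "opentracing://jaeger" // the example value from this README
	}
	provider, options, err := parseTracing(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("provider=%s options=%v\n", provider, options)
}
```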

Continuous Profiling

Gitaly supports Continuous Profiling through LabKit using Stackdriver Profiler.

For more information on how to set it up, see the LabKit monitoring docs.

Presentations

Directories

Path Synopsis
_support
cmd
praefect
Command praefect provides a reverse-proxy server with high-availability specific features for Gitaly.
internal
backchannel
Package backchannel implements connection multiplexing that allows for invoking gRPC methods from the server to the client.
cache
Package cache supplies background workers for periodically cleaning the cache folder on all storages listed in the config file.
dontpanic
Package dontpanic provides function wrappers and supervisors to ensure that wrapped code does not panic and cause program crashes.
git
log
praefect
Package praefect is a Gitaly reverse proxy for transparently routing gRPC calls to a set of Gitaly services.
praefect/commonerr
Package commonerr contains common errors between different Praefect components.
praefect/datastore
Package datastore provides data models and datastore persistence abstractions for tracking the state of repository replicas.
praefect/datastore/advisorylock
Package advisorylock contains the lock IDs of all advisory locks used in Praefect.
praefect/datastore/glsql
Package glsql (Gitaly SQL) is a helper package to work with plain SQL queries.
praefect/grpc-proxy/proxy
Package proxy provides a reverse proxy handler for gRPC.
ps
streamcache
Package streamcache provides a cache for large blobs (in the order of gigabytes).
proto
go/internal/cmd/protoc-gen-gitaly
Command protoc-gen-gitaly is designed to be used as a protobuf compiler plugin to verify Gitaly processes are being followed when writing RPCs.
Package streamio contains wrappers intended for turning gRPC streams that send/receive messages with a []byte field into io.Writers and io.Readers.
