Quick Links: Roadmap | Want to Contribute? | GitLab Gitaly Issues | GitLab Gitaly Merge Requests
Gitaly is a Git RPC service for handling all the git calls made by GitLab.
To see where it fits in, please look at GitLab's architecture.
Project Goals
Fault-tolerant horizontal scaling of Git storage in GitLab, and particularly, on gitlab.com.
This will be achieved by focusing on two areas (in this order):
- Migrate from repository access via NFS to gitaly-proto, GitLab's new Git RPC protocol
- Evolve from large Gitaly servers managed as "pets" to smaller Gitaly servers that are "cattle"
Current Status
As of GitLab 11.5, almost all application code accesses Git repositories
through Gitaly instead of direct disk access. GitLab.com production no
longer uses direct disk access to touch Git repositories; the NFS
mounts have been removed.
For performance reasons, some RPCs can still be performed through NFS. We are
working to mitigate these performance issues by removing Gitaly N+1 calls.
Once NFS access is no longer necessary, we can conclude the migration project by
removing the Git repository storage paths from the gitlab-rails
configuration.
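To illustrate what an N+1 call pattern looks like, here is a minimal Go sketch. The fakeServer type and its recentIDs, findCommit and listCommits methods are hypothetical stand-ins invented for this example, not Gitaly RPCs; the only point is the number of round trips each approach makes.

```go
package main

import "fmt"

// fakeServer is a hypothetical stand-in for a Gitaly server. It is not part
// of Gitaly; it only counts how many round trips a caller makes.
type fakeServer struct {
	ids      []string          // commit IDs, newest first
	subjects map[string]string // commit ID -> subject line
	calls    int               // number of simulated RPCs
}

func newFakeServer() *fakeServer {
	return &fakeServer{
		ids: []string{"c3", "c2", "c1"},
		subjects: map[string]string{
			"c1": "Initial commit",
			"c2": "Add README",
			"c3": "Fix typo",
		},
	}
}

// recentIDs simulates one RPC that returns a list of commit IDs.
func (s *fakeServer) recentIDs() []string {
	s.calls++
	return s.ids
}

// findCommit simulates one RPC that returns a single commit subject.
func (s *fakeServer) findCommit(id string) string {
	s.calls++
	return s.subjects[id]
}

// listCommits simulates one batched RPC that returns many subjects at once.
func (s *fakeServer) listCommits(ids []string) []string {
	s.calls++
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		out = append(out, s.subjects[id])
	}
	return out
}

// subjectsNPlusOne makes 1 call for the list plus N calls, one per commit.
func subjectsNPlusOne(s *fakeServer) []string {
	var out []string
	for _, id := range s.recentIDs() {
		out = append(out, s.findCommit(id))
	}
	return out
}

// subjectsBatched makes 2 calls no matter how many commits there are.
func subjectsBatched(s *fakeServer) []string {
	return s.listCommits(s.recentIDs())
}

func main() {
	a, b := newFakeServer(), newFakeServer()
	subjectsNPlusOne(a)
	subjectsBatched(b)
	fmt.Printf("N+1: %d calls, batched: %d calls\n", a.calls, b.calls)
}
```

With the three commits in this sketch, the per-item version makes four simulated calls and the batched version makes two; the gap grows linearly with the number of commits.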
In the meantime we are building features according to our roadmap.
If you're interested in seeing how well Gitaly is performing on
GitLab.com, read about our observability story!
[Performance graphs: Overall, By Feature]
Installation
Most users won't install Gitaly on its own. It is already included in
your GitLab installation.
Gitaly requires Go 1.11 or newer and Ruby 2.5. Run make to download
and compile Ruby dependencies, and to compile the Gitaly Go executable.
Gitaly uses git. Version 2.21.0 or higher is required.
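As a quick way to check the Git requirement before building, something like the following Go sketch could be used. It is not part of Gitaly: parseGitVersion and atLeast are hypothetical helpers written for this example, and the code assumes that git version prints a line of the form "git version X.Y.Z", possibly followed by a vendor suffix.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// minGitVersion is the minimum stated in this README.
var minGitVersion = [3]int{2, 21, 0}

// parseGitVersion extracts up to three numeric components from output such
// as "git version 2.21.0". Components beyond the third (for example a vendor
// suffix like ".windows.1") are ignored; non-numeric components such as
// "0-rc1" are reported as errors.
func parseGitVersion(out string) ([3]int, error) {
	var v [3]int
	fields := strings.Fields(out)
	if len(fields) < 3 || fields[0] != "git" || fields[1] != "version" {
		return v, fmt.Errorf("unexpected output %q", out)
	}
	parts := strings.Split(fields[2], ".")
	for i := 0; i < len(v) && i < len(parts); i++ {
		n, err := strconv.Atoi(parts[i])
		if err != nil {
			return v, fmt.Errorf("bad version component %q: %v", parts[i], err)
		}
		v[i] = n
	}
	return v, nil
}

// atLeast reports whether v is greater than or equal to minimum, comparing
// major, then minor, then patch.
func atLeast(v, minimum [3]int) bool {
	for i := range v {
		if v[i] != minimum[i] {
			return v[i] > minimum[i]
		}
	}
	return true
}

func main() {
	out, err := exec.Command("git", "version").Output()
	if err != nil {
		fmt.Println("could not run git:", err)
		return
	}
	v, err := parseGitVersion(string(out))
	if err != nil {
		fmt.Println("could not parse git version:", err)
		return
	}
	fmt.Printf("git %d.%d.%d, meets minimum: %v\n", v[0], v[1], v[2], atLeast(v, minGitVersion))
}
```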
Configuration
See configuration documentation.
Contributing
See CONTRIBUTING.md.
Name
Gitaly is a tribute to git and the town of Aly. Just as the town of
Aly has zero inhabitants most of the year, we would like to reduce the number of
disk operations to zero for most actions. It doesn't hurt that it sounds like
Italy, the capital of which is the destination of all roads. All git actions in
GitLab end up in Gitaly.
Design
High-level architecture overview:
Gitaly clients
As of Q4 2018, the following GitLab components act as Gitaly clients:
- gitlab-rails: the main GitLab Rails application.
- gitlab-shell: for git clone, git push etc. via SSH.
- gitlab-workhorse: for git clone via HTTPS and for slow requests that serve raw Git data. (example)
- gitaly-ssh: for internal Git data transfers between Gitaly servers.
- gitaly-ruby: for RPCs that interact with more than one repository, such as merging a branch.
The clients written in Go (gitlab-shell, gitlab-workhorse, gitaly-ssh)
use library code from the gitlab.com/gitlab-org/gitaly/client package.
Further reading
More information about the project and its processes is collected in the docs.
Distributed Tracing
Gitaly supports distributed tracing through LabKit using OpenTracing APIs.
By default, no tracing implementation is linked into the binary, but different OpenTracing providers can be linked in using build tags/build constraints. This can be done by setting the BUILD_TAGS make variable.
For more details of the supported providers, see LabKit, but as an example, for Jaeger tracing support, include the tags: BUILD_TAGS="tracer_static tracer_static_jaeger".
$ make BUILD_TAGS="tracer_static tracer_static_jaeger"
Once Gitaly is compiled with an OpenTracing provider, tracing is configured via the GITLAB_TRACING environment variable.
For example, to configure Jaeger, you could use the following command:
GITLAB_TRACING=opentracing://jaeger ./gitaly config.toml
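The GITLAB_TRACING value in the command above is shaped like a URL. The Go sketch below shows one way such a value could be taken apart, assuming the scheme names the tracing API and the host names the provider. This is an illustration only and not how LabKit is implemented; tracingConfig and parseTracing are hypothetical names, and treating query parameters as provider options is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// tracingConfig is a hypothetical container for the parsed pieces of a
// GITLAB_TRACING value.
type tracingConfig struct {
	Provider string            // e.g. "jaeger"
	Options  map[string]string // assumed: query parameters, if any
}

// parseTracing splits a value such as "opentracing://jaeger" into a provider
// name and options. It rejects values whose scheme is not "opentracing" or
// that do not name a provider.
func parseTracing(raw string) (tracingConfig, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return tracingConfig{}, err
	}
	if u.Scheme != "opentracing" {
		return tracingConfig{}, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return tracingConfig{}, fmt.Errorf("no provider in %q", raw)
	}
	opts := make(map[string]string)
	for key, values := range u.Query() {
		if len(values) > 0 {
			opts[key] = values[0] // keep the first value for each key
		}
	}
	return tracingConfig{Provider: u.Host, Options: opts}, nil
}

func main() {
	raw := os.Getenv("GITLAB_TRACING")
	if raw == "" {
		raw = "opentracing://jaeger" // the example value from this README
	}
	cfg, err := parseTracing(raw)
	if err != nil {
		fmt.Println("invalid GITLAB_TRACING:", err)
		return
	}
	fmt.Printf("provider=%s options=%v\n", cfg.Provider, cfg.Options)
}
```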
Presentations
- How Gitaly fits into GitLab (YouTube) - a series of 1-hour training videos for contributors new to GitLab and Gitaly.
  - Part 1: the Gitaly client in gitlab-ce, 2019-02-21
    Overview of GitLab backend processes, gitlab-rails deep dive: Gitaly
    config in gitlab-rails, SQL data model, overview of how Gitaly calls get
    made via GitalyClient.call.
  - Part 2: Git SSH, 2019-02-28
    What is in a gitaly-proto Repository message, legacy vs
    hashed storage (repository directories), git clone via SSH,
    gitlab-shell, authorized_keys and forced commands, what happens
    during git push.
  - Part 3: Git push, 2019-03-07
    A closer look at the final stage of git push, where the git hooks run
    and the refs get updated. Interaction between the git hooks and the GitLab
    internal API. The Git object quarantine mechanism.
    Preview of Git HTTP (to be discussed next time).
  - Part 4: Git HTTP, 2019-03-14
    Intercepting Git HTTP traffic with mitmproxy, overview of
    Git HTTP clone steps, code walk in gitlab-workhorse and gitlab-ce,
    investigating internal workhorse API messages used for Git HTTP.
  - Part 5: Merge Requests across Forks, 2019-03-21
    Fixing a locally broken Ruby gem C extension by recompiling, demo of
    how creating an MR across forks causes new commits to suddenly appear
    in the fork parent repository, deep dive into the FetchSourceBranch RPC,
    adding debug code to see how address and authentication metadata is
    passed down to gitaly-ruby, failed attempt to log gitaly-ssh arguments,
    comparison of gitaly-ssh and gitlab-shell, how a Gitaly server can end up
    making RPC calls to itself.
  - Part 6: Creating Git commits on behalf of Git users, 2019-03-21
    Demonstration of how Git hooks are usually run by git-receive-pack,
    but sometimes by gitaly-ruby. Deep dive into UserCommitFiles: where do
    those hooks actually get run? A look at UserMerge: how does Gitaly make
    merge commits? A look at the implementation of the special feature
    where users are not allowed to push to a branch, but are allowed to
    merge into it.
- Infrastructure Team Update, 2017-05-11
- Gitaly Basics, 2017-05-01
- Git Paris meetup, 2017-02-22: a high-level overview of what our plans are and where we are.