Backend
Repository that holds best practices for structuring a backend micro-service.
"Best practices" are subjective to my personal experience and opinion.
At this point in time, the repository holds best practices for logging, testing, error handling, and using Twirp
(i.e. grpc, protobufs).
Dependencies
Table of Contents
Using protobuf clients and Twirp will shield the request header from the method that will handle the endpoint being called, and will force us to use context
instead.
This is not a huge issue, and allows for us to add an important piece of information to the context, an unique trace id.
With each request that comes through, we should be able to trace all the actions that the specific request entices.
To do this, let's take a look at some code inside of our server:
router := mux.NewRouter().StrictSlash(false)
...
err := http.ListenAndServe(":8080", request.WithRequestHeaders(router))
Our router is wrapped around a function that will add a unique id, which we can refer to as a trace id
, to the context so that each request can be identified.
This function serves as a hook on the router and will inject this before the request gets to the endpoint handler function
id, err := uuid.NewRandom()
if err == nil {
ctx = context.WithValue(ctx, "id", id.ID())
}
Let's take a look at a function that handles a request on our server, in particular the FillIn
endpoint.
func (p provider) FillIn(ctx context.Context, request *paintservice.FillInRequest) (*paintservice.FillInResponse, error) {
entry := p.logger.WithFields(logrus.Fields{
"id": ctx.Value("id"),
})
...
}
We can see that each request that gets kicked off will now have an unique trace id we'll be able to use in our logging and debugging purposes when an error is encountered.
This is incredibly helpful since Twirp
utilizes grpc
and each request is invoked in its individual go-routine.
Logs should help discern what the error was, where it occurred, what state the program was in such that it can be reproducible, and which object it effected (i.e. user, account, index in database, etc.).
A package that can be utilized to achieve all the aforementioned conditions is logrus.
More importantly, using instances of a logger
, and sharing and using those loggers
instances within the entire program.
The motivation behind using instances of a logger instead of just using the package level logrus
to log errors, is that each logger can can be customized for the needs of the specific portion of the program.
By adding hooks, for example, we can have our logs that are at the error
or warning
level automatically be pushed into datadog, where we already have settings such that N
number of errors on X
endpoint triggers pagerduty for the on-call engineer to investigate.
This flow of reporting can be crucial when dealing with time-sensitive events.
Moreover, if the method of logging is followed, the engineer tackling the error will easily be able to get to the root cause of the error.
Before jumping into how to structure logs, make sure to have reviewed requests on utilizing context
.
Let's take a look at the endpoint FillIn
, which serves as a parody to MS Paint Fill In
functionality.
picture
in this example is a new instance of the picture
struct.
func (p provider) FillIn(ctx context.Context, request *paintservice.FillInRequest) (*paintservice.FillInResponse, error) {
...
picture := NewPicture(matrix, p.logger)
err := picture.FillIn(ctx, request.Value, request.X, request.Y)
if err != nil {
entry.Error("FillIn")
return &paintservice.FillInResponse{}, twirp.NewError(twirp.Internal, "internal error")
}
...
We can see that when the initial endpoint is hit, a call to FillIn
is made on the picture
struct.
func (p picture) FillIn(ctx context.Context, value string, x, y int64) error {
entry := p.logger.WithFields(logrus.Fields{
"id": ctx.Value("id"),
"x": x,
"y": y,
"value": value,
})
if x < 0 || x >= int64(len(p.values)) {
entry.WithField("maxRows", len(p.values)).Error("x is out of range")
return errors.New("x is out of range")
}
...
When this particular error is encountered, we should log it, and we will messages of the following kind
{"file":".../backend/internal/paint/paint.go:32","func":"github.com/ykamo001/backend/internal/paint.picture.FillIn","id":704462431,"level":"error","maxRows":10,"msg":"x is out of range","time":"2019-11-03T21:40:47-08:00","value":"y","x":-2,"y":2}
{"file":".../backend/internal/paint/provider.go:41","func":"github.com/ykamo001/backend/internal/paint.provider.FillIn","id":704462431,"level":"error","msg":"FillIn","time":"2019-11-03T21:40:47-08:00"}
We can tell exactly where this error happened, what the state of the server and function was, what time it happened, and which request it can be tied back to.