tutorial

command
v0.0.0-...-1538fac Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Oct 28, 2024 License: Apache-2.0 Imports: 12 Imported by: 0

README

Getting Started with Go Cloud

The best way to understand Go Cloud is to write some code and use it. Let's build a command line application that uploads files to blob storage on both AWS and GCP. Blob storage stores binary data under a string key, and is one of the most frequently used cloud services. We'll call the tool upload.

Before we start with any code, it helps to think about possible designs. We could very well write a code path for Amazon's Simple Storage Service (S3) and another code path for Google Cloud Storage (GCS). That would work. However, it would also be tedious. We would have to learn the semantics of uploading files to both blob storage services. Even worse, we would have two code paths that effectively do the same thing, but would have to be maintained separately (ugh!). It would be much nicer if we could write the upload logic once and reuse it across providers. That's exactly the kind of separation of concerns that Go Cloud makes possible. So, let's write some code!

When we're done, our command line application will work like this:

# uploads gopher.png to GCS
$ ./upload -cloud gcp gopher.png

# uploads gopher.png to S3
$ ./upload -cloud aws gopher.png

Flags & setup

We start with a skeleton for our program with flags to configure the cloud provider and the bucket name.

// Command upload saves files to blob storage on GCP and AWS.
package main

import (
    "flag"
    "log"
)

func main() {
    // Define our input.
    cloud := flag.String("cloud", "", "Cloud storage to use")
    bucketName := flag.String("bucket", "go-cloud-bucket", "Name of bucket")
    flag.Parse()
    if flag.NArg() != 1 {
        log.Fatal("Failed to provide file to upload")
    }
    file := flag.Arg(0)
}

Now that we have a basic skeleton in place, let's start filling in the pieces. Opening a connection to the bucket will depend on which cloud we're using. Because we allow the user to specify a cloud, we will write connection logic for both clouds. This will be the only step that requires cloud-specific APIs.

GCP authentication & connection

As a first pass, let's write the code to connect to a GCS bucket. Then, we will write the code to connect to an S3 bucket. In both cases, we are going to create a pointer to a blob.Bucket, the cloud-agnostic blob storage type.

package main

import (
    "context"
    "flag"
    "log"

    "github.com/google/go-cloud/blob"
    "github.com/google/go-cloud/blob/gcsblob"
    "github.com/google/go-cloud/gcp"
)

func main() {
    // flag setup omitted
    // ...

    ctx := context.Background()
    // Open a connection to the bucket.
    var (
        b *blob.Bucket
        err error
    )
    switch *cloud {
    case "gcp":
        b, err = setupGCP(ctx, *bucketName)
    case "aws":
        // AWS is handled below in the next code sample.
        b, err = setupAWS(ctx, *bucketName)
    default:
        log.Fatalf("Failed to recognize cloud. Want gcp or aws, got: %s", *cloud)
    }
    if err != nil {
        log.Fatalf("Failed to setup bucket: %s", err)
    }
}

func setupGCP(ctx context.Context, bucket string) (*blob.Bucket, error) {
    // DefaultCredentials assumes a user has logged in with gcloud.
    // See here for more information:
    // https://cloud.google.com/docs/authentication/getting-started
    creds, err := gcp.DefaultCredentials(ctx)
    if err != nil {
        return nil, err
    }
    c, err := gcp.NewHTTPClient(gcp.DefaultTransport(), gcp.CredentialsTokenSource(creds))
    if err != nil {
        return nil, err
    }
    // The bucket name must be globally unique.
    return gcsblob.OpenBucket(ctx, bucket, c)
}

The cloud-specific setup is the tricky part as each cloud has its own series of invocations to create a bucket. It's worth thinking about this point for moment.

Without Go Cloud, both your setup code and your business logic are platform dependent. That's a bad thing because business logic has a tendency to grow increasingly complex over time and if that logic is coupled to a particular platform, your application will likely become tightly coupled to the plaform as well.

With Go Cloud, only your setup code is platform dependent. The business logic makes no demands on a particular platform, which means your application itself requires less work to move from one cloud to another. In any case, the setup code in an application grows far less than the business logic does. Anything we can do to separate the business logic from an underlying platform is well-worth doing.

AWS authentication & connection

With GCS setup code out of the way, let's handle the S3 connection logic next:

package main

import (
    // ...
    // listing only new import statements

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/google/go-cloud/blob/s3blob"
)

// ... main

func setupAWS(ctx context.Context, bucket string) (*blob.Bucket, error) {
    c := &aws.Config{
        // Either hard-code the region or use AWS_REGION.
        Region: aws.String("us-east-2"),
        // credentials.NewEnvCredentials assumes two environment variables are
        // present:
        // 1. AWS_ACCESS_KEY_ID, and
        // 2. AWS_SECRET_ACCESS_KEY.
        Credentials: credentials.NewEnvCredentials(),
    }
    s := session.Must(session.NewSession(c))
    return s3blob.OpenBucket(ctx, s, bucket)
}

The important point here is that in spite of the differences in setup for GCS and S3, the result is the same: we have a pointer to a blob.Bucket. Provided we go through the setup, once we have a blob.Bucket pointer, we may design our system around that type and prevent any cloud specific concerns from leaking throughout our application code. In other words, by using blob.Bucket we avoid being tightly coupled to one cloud provider.

With the setup done, we're ready to use the bucket connection. Note, as a design principle of Go Cloud, blob.Bucket does not support creating a bucket and instead focuses solely on interacting with it. This separates the concerns of provisioning infrastructure and using infrastructure.

Reading the file

We need to read our file into a slice of bytes before uploading it. The process is the usual one:

package main

import (
    // previous imports omitted
    "os"
    "io/ioutil"
)

func main() {
    // ... previous code omitted

    // Prepare the file for upload.
    data, err := ioutil.ReadFile(file)
    if err != nil {
        log.Fatalf("Failed to read file: %s", err)
    }
}

Writing the file to the bucket

Now, we have data, our file in a slice of bytes. Let's get to the fun part and write those bytes to the bucket!

package main

// No new imports.

func main() {
    // ...

    w, err := b.NewWriter(ctx, file, nil)
    if err != nil {
        log.Fatalf("Failed to obtain writer: %s", err)
    }
    _, err = w.Write(data)
    if err != nil {
        log.Fatalf("Failed to write to bucket: %s", err)
    }
    if err := w.Close(); err != nil {
        log.Fatalf("Failed to close: %s", err)
    }
}

First, we create a writer based on the bucket connection. In addition to a context.Context type, the method takes the key under which the data will be stored and the mime type of the data.

The call to NewWriter creates a blob.Writer, which implements io.Writer. With the writer, we call Write passing in the data. In response, we get the number of bytes written and any error that might have occurred.

Finally, we close the writer with Close and check the error.

Uploading an image

That's it! Let's try it out. As setup, we will to create an S3 bucket and a GCS bucket. In the code above, I called that bucket my-cool-bucket, but you can change that to match whatever your bucket is called. For GCP, you will need to login with gcloud. If you do not want to install gcloud, see here for alternatives. For AWS, you will need an access key ID and a secret access key. See here for details.

With our buckets created in S3 and GCS and our credentials for both set up, we'll build the program first:

$ go build -o upload

Then, we will send gopher.png (in the same directory as this README) to GCS:

$ ./upload -cloud gcp gopher.png

Then, we send that same gopher to S3:

$ ./upload -cloud aws gopher.png

If we check both buckets, we should see our gopher in both places! We're done!

Separating setup code from application code

Now that we have a working application, let's take one extra step to further separate setup code from application code. We will move all setup concerns that use any specific cloud-provider SDK out into a setup.go file and replace our switch statement with a single function:

// main.go

func main() {
    // ...

    cxt := context.Background()
    // Open a connection to the bucket.
    b, err := setupBucket(ctx, *cloud, *bucketName)
    if err != nil {
        log.Fatalf("Failed to setup bucket: %s", err
    }

    // Prepare the file for upload.
    // ...
}

After moving the setupGCP and setupAWS functions into setup.go, we add our new setupBucket function next:

func setupBucket(ctx context.Context, cloud, bucket string) (*blob.Bucket, error) {
    if cloud == "aws" {
        return setupAWS(ctx, bucket)
    }
    if cloud == "gcp" {
        return setupGCP(ctx, bucket)
    }
    return nil, fmt.Errorf("invalid cloud provider: %s", cloud)
}

Now, our application code consists of two files: main.go with the application logic and setup.go with the cloud-provider specific details. A benefit of this separation is that changing any part of blob storage is a matter of changing only the internals of setupBucket. The application logic does not need to be changed.

Wrapping Up

In conclusion, we have a program that can seamlessly switch between GCS and S3 using just one code path.

I hope this example demonstrates how having one type for multiple clouds is a huge win for simplicity and maintainability. By writing an application using a generic interface like *blob.Bucket, we retain the option of using infrastructure in whichever cloud that best fits our needs all without having to worry about a rewrite.

Documentation

Overview

Command upload saves files to blob storage on GCP and AWS.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL