etag

package
v0.2.2-dev11-06-2023 Latest
Warning

This package is not in the latest version of its module.

Published: Nov 6, 2023 License: AGPL-3.0 Imports: 17 Imported by: 0

Documentation

Overview

Package etag provides an implementation of S3 ETags.

Each S3 object has an associated ETag that can be used to e.g. quickly compare objects or check whether the content of an object has changed.

In general, an S3 ETag is an MD5 checksum of the object content. However, there are many exceptions to this rule.

Single-part Upload

In case of a basic single-part PUT operation - without server side encryption or object compression - the ETag of an object is its content MD5.

Multi-part Upload

The ETag of an object does not correspond to its content MD5 when the object is uploaded in multiple parts via the S3 multipart API. Instead, S3 first computes an MD5 of each part:

e1 := MD5(part-1)
e2 := MD5(part-2)
...
eN := MD5(part-N)

Then, the ETag of the object is computed as MD5 of all individual part checksums. S3 also encodes the number of parts into the ETag by appending a -<number-of-parts> at the end:

ETag := MD5(e1 || e2 || e3 ... || eN) || -N

For example: ceb8853ddc5086cc4ab9e149f8f09c88-5

However, this scheme is only used for multipart objects that are not encrypted.
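
The two plaintext schemes above can be sketched with the standard library alone. This is a minimal illustration, not this package's implementation: the helper names singlepartETag and multipartETag are hypothetical, and the sketch assumes the part checksums are concatenated in their binary (not hex) form before being hashed again.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// singlepartETag returns the hex-encoded MD5 of the content, which is the
// ETag of a plain single-part upload.
func singlepartETag(content []byte) string {
	sum := md5.Sum(content)
	return hex.EncodeToString(sum[:])
}

// multipartETag hashes each part, hashes the concatenated binary part
// checksums, and appends "-N", where N is the number of parts.
func multipartETag(parts ...[]byte) string {
	h := md5.New()
	for _, part := range parts {
		sum := md5.Sum(part) // eI := MD5(part-I)
		h.Write(sum[:])      // e1 || e2 || ... || eN
	}
	return fmt.Sprintf("%x-%d", h.Sum(nil), len(parts))
}

func main() {
	fmt.Println(singlepartETag([]byte("hello")))
	fmt.Println(multipartETag([]byte("hello"), []byte("world")))
}
```

Note that a multipart ETag depends on how the content was split into parts, so the same bytes uploaded with different part sizes yield different ETags.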

Server-side Encryption

S3 specifies three types of server-side encryption (SSE-C, SSE-S3 and SSE-KMS) with different semantics with respect to ETags. In case of SSE-S3, the ETag of an object is computed in the same way as for plaintext single-part and multipart objects, respectively. In particular, the ETag of a single-part SSE-S3 object is its content MD5.

In case of SSE-C and SSE-KMS, the ETag of an object is computed differently. For single-part uploads, the ETag is not the content MD5 of the object. For multipart uploads, the ETag is also not the MD5 of the individual part checksums, but it still contains the number of parts as a suffix.

Instead, the ETag is effectively unpredictable for S3 clients when an object is encrypted using SSE-C or SSE-KMS. AWS S3 may compute the ETag as the MD5 of the encrypted content, but there is no way to verify this assumption since the encryption happens inside AWS S3. Therefore, S3 clients must not make any assumptions about ETags in case of SSE-C or SSE-KMS, except that the ETag is well-formed.

To put all of this into a simple rule:

SSE-S3 : ETag == MD5
SSE-C  : ETag != MD5
SSE-KMS: ETag != MD5

Encrypted ETags

An S3 implementation has to remember the content MD5 of objects in case of SSE-S3. However, storing the ETag of an encrypted object in plaintext may reveal some information about the object. For example, two objects with the same ETag are identical with a very high probability.

Therefore, an S3 implementation may encrypt an ETag before storing it. In this case, the stored ETag may not be a well-formed S3 ETag. For example, it can be larger due to a checksum added by authenticated encryption schemes. Such an ETag must be decrypted before it is sent to an S3 client.
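
To illustrate why an encrypted ETag is larger than 16 bytes, the following sketch seals an MD5 digest with AES-256-GCM from the standard library. The choice of AES-GCM, the nonce || ciphertext || tag layout, and the helper names sealETag and openETag are assumptions made for this example only; the scheme this package actually uses is not described here.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/md5"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM AEAD from a 16, 24 or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealETag returns nonce || ciphertext || authentication tag.
func sealETag(key, etag []byte) ([]byte, error) {
	aead, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, aead.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return aead.Seal(nonce, nonce, etag, nil), nil
}

// openETag reverses sealETag and fails if the sealed value was modified.
func openETag(key, sealed []byte) ([]byte, error) {
	aead, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < aead.NonceSize() {
		return nil, errors.New("sealed ETag is too short")
	}
	nonce, ciphertext := sealed[:aead.NonceSize()], sealed[aead.NonceSize():]
	return aead.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // all-zero demo key; use a real secret key in practice
	sum := md5.Sum([]byte("hello"))
	sealed, err := sealETag(key, sum[:])
	if err != nil {
		fmt.Println("seal:", err)
		return
	}
	plain, err := openETag(key, sealed)
	if err != nil {
		fmt.Println("open:", err)
		return
	}
	fmt.Printf("plaintext: %d bytes, sealed: %d bytes, round trip: %x\n", len(sum), len(sealed), plain)
}
```

Because the nonce is random, sealing the same ETag twice gives different stored values, which hides the fact that two objects have identical content.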

S3 Clients

There are many different S3 client implementations. Most of them access the ETag by looking for the HTTP response header key "Etag". However, some of them assume that the header key has to be "ETag" (case-sensitive) and will fail otherwise. Further, some clients require that the ETag value is a double-quoted string. Therefore, this package provides dedicated functions for adding and extracting the ETag to/from HTTP headers.
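
The header-key problem can be reproduced with net/http alone: http.Header.Set canonicalizes the key, so "ETag" is stored as "Etag". The sketch below shows one way to keep the exact key and quote the value. The helper names setQuotedETag and canonicalKey are hypothetical, and this is not necessarily how this package's Set works internally.

```go
package main

import (
	"fmt"
	"net/http"
)

// setQuotedETag stores the ETag under the exact key "ETag" as a
// double-quoted string. Assigning to the map directly bypasses the key
// canonicalization performed by http.Header.Set.
func setQuotedETag(h http.Header, hexETag string) {
	h.Del("ETag") // removes any entry stored under the canonical key "Etag"
	h["ETag"] = []string{`"` + hexETag + `"`}
}

// canonicalKey returns the key under which http.Header.Set would store name.
func canonicalKey(name string) string {
	return http.CanonicalHeaderKey(name)
}

func main() {
	h := http.Header{}
	h.Set("ETag", "stale") // stored under the canonical key "Etag"
	setQuotedETag(h, "5d41402abc4b2a76b9719d911017c592")
	fmt.Println(canonicalKey("ETag"), h["ETag"], len(h))
}
```

One consequence of this approach is that h.Get("ETag") no longer finds the value, since Get also canonicalizes the key; code that reads the header back has to index the map with the exact key.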

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ContentMD5Requested

func ContentMD5Requested(h http.Header) bool

ContentMD5Requested reports whether the HTTP request headers h contain a Content-Md5 entry.

func Equal

func Equal(a, b ETag) bool

Equal returns true if and only if the two ETags are identical.

func Set

func Set(etag ETag, h http.Header)

Set adds the ETag to the HTTP headers. It overwrites any existing ETag entry.

Because some legacy S3 clients make incorrect assumptions about HTTP headers, Set should be used instead of http.Header.Set(...). Otherwise, some S3 clients will not be able to extract the ETag.

func Wrap

func Wrap(wrapped, content io.Reader) io.Reader

Wrap returns an io.Reader that reads from the wrapped io.Reader and implements the Tagger interface.

If content implements Tagger then the returned Reader returns the ETag of the content. Otherwise, it returns nil as ETag.

Wrap provides an adapter for io.Reader implementations that don't implement the Tagger interface. It is mainly used to provide a high-level io.Reader access to the ETag computed by a low-level io.Reader:

// r is assumed to be an *http.Request; nil is passed for etag and forceMD5.
content := etag.NewReader(r.Context(), r.Body, nil, nil)

compressedContent := Compress(content)
encryptedContent := Encrypt(compressedContent)

// Now, we need an io.Reader that can access
// the ETag computed over the content.
reader := etag.Wrap(encryptedContent, content)
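
The adapter pattern described above can be sketched without this package. All names below (tagger, taggedReader, wrapReader, wrap) are hypothetical stand-ins that use []byte instead of the ETag type, and io.LimitReader merely stands in for a compression or encryption layer that hides the inner reader's ETag method.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// tagger mirrors the idea of the Tagger interface with a plain []byte.
type tagger interface {
	ETag() []byte
}

// taggedReader is a toy low-level reader that already knows its ETag.
type taggedReader struct {
	io.Reader
	tag []byte
}

func (t taggedReader) ETag() []byte { return t.tag }

// wrapReader reads from the outer reader but reports the ETag of content.
type wrapReader struct {
	io.Reader           // data comes from the wrapped (outer) reader
	content   io.Reader // the ETag comes from the inner content reader
}

func (w wrapReader) ETag() []byte {
	if t, ok := w.content.(tagger); ok {
		return t.ETag()
	}
	return nil // content cannot provide an ETag
}

func wrap(wrapped, content io.Reader) io.Reader {
	return wrapReader{Reader: wrapped, content: content}
}

func main() {
	content := taggedReader{Reader: strings.NewReader("payload"), tag: []byte{0xde, 0xad}}
	transformed := io.LimitReader(content, 4) // hides content's ETag method
	reader := wrap(transformed, content)

	data, err := io.ReadAll(reader)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q %x\n", data, reader.(tagger).ETag())
}
```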

Types

type ETag

type ETag []byte

ETag is a single S3 ETag.

An S3 ETag sometimes corresponds to the MD5 of the S3 object content. However, when an object is encrypted, compressed or uploaded using the S3 multipart API then its ETag is not necessarily the MD5 of the object content.

For a more detailed description of S3 ETags take a look at the package documentation.

func Decrypt

func Decrypt(key []byte, etag ETag) (ETag, error)

Decrypt decrypts the ETag with the given key.

If the ETag is not encrypted, Decrypt returns the ETag unmodified.

func FromContentMD5

func FromContentMD5(h http.Header) (ETag, error)

FromContentMD5 decodes and returns the Content-MD5 as ETag, if set. If no Content-MD5 header is set it returns an empty ETag and no error.
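
For context, the Content-MD5 header carries the base64 encoding of the 16-byte MD5 digest. The following standard-library sketch shows the decoding step under that assumption; etagFromContentMD5 is a hypothetical helper that returns a plain []byte, and its error handling is illustrative rather than a copy of FromContentMD5.

```go
package main

import (
	"crypto/md5"
	"encoding/base64"
	"errors"
	"fmt"
	"net/http"
)

// etagFromContentMD5 decodes the Content-MD5 header into the raw digest.
// It returns nil and no error when the header is absent.
func etagFromContentMD5(h http.Header) ([]byte, error) {
	values := h.Values("Content-MD5")
	if len(values) == 0 {
		return nil, nil
	}
	sum, err := base64.StdEncoding.DecodeString(values[0])
	if err != nil {
		return nil, err
	}
	if len(sum) != md5.Size {
		return nil, errors.New("Content-MD5 does not decode to a 16-byte MD5 digest")
	}
	return sum, nil
}

func main() {
	sum := md5.Sum([]byte("hello"))
	h := http.Header{}
	h.Set("Content-MD5", base64.StdEncoding.EncodeToString(sum[:]))

	etag, err := etagFromContentMD5(h)
	fmt.Printf("%x %v\n", etag, err)
}
```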

func Get

func Get(h http.Header) (ETag, error)

Get extracts and parses an ETag from the given HTTP headers. It returns an error when the HTTP headers do not contain an ETag entry or when the ETag is malformed.

Get only accepts AWS S3 compatible ETags - i.e. no encrypted ETags - and therefore is stricter than Parse.

func Multipart

func Multipart(etags ...ETag) ETag

Multipart computes an S3 multipart ETag given a list of S3 singlepart ETags. It returns nil if the list of ETags is empty.

Any encrypted or multipart ETag will be ignored and not used to compute the returned ETag.

func Parse

func Parse(s string) (ETag, error)

Parse parses s as an S3 ETag, returning the result. The string can be an encrypted, singlepart or multipart S3 ETag. It returns an error if s is not a valid textual representation of an ETag.
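
As a rough illustration of the textual format, the sketch below parses the AWS S3 compatible forms only: 32 hex characters, optionally double-quoted, with an optional "-N" suffix. It is deliberately stricter than Parse because it rejects encrypted ETags, and parseETag is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseETag splits a textual ETag into its 16-byte digest and part count.
// A single-part ETag reports one part.
func parseETag(s string) (digest []byte, parts int, err error) {
	// Some clients and servers send the ETag as a double-quoted string.
	if len(s) >= 2 && strings.HasPrefix(s, `"`) && strings.HasSuffix(s, `"`) {
		s = s[1 : len(s)-1]
	}
	hexDigest, suffix, multipart := strings.Cut(s, "-")
	digest, err = hex.DecodeString(hexDigest)
	if err != nil {
		return nil, 0, err
	}
	if len(digest) != 16 {
		return nil, 0, errors.New("ETag digest is not 16 bytes long")
	}
	if !multipart {
		return digest, 1, nil
	}
	parts, err = strconv.Atoi(suffix)
	if err != nil {
		return nil, 0, err
	}
	if parts < 1 {
		return nil, 0, errors.New("ETag part count must be positive")
	}
	return digest, parts, nil
}

func main() {
	digest, parts, err := parseETag("ceb8853ddc5086cc4ab9e149f8f09c88-5")
	fmt.Printf("%x %d %v\n", digest, parts, err)
}
```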

func (ETag) ETag

func (e ETag) ETag() ETag

ETag returns the ETag itself.

By providing this method ETag implements the Tagger interface.

func (ETag) Format

func (e ETag) Format() ETag

Format returns an ETag that is formatted as specified by AWS S3.

An AWS S3 ETag is 16 bytes long and, in case of a multipart upload, has a `-N` suffix encoding the number of object parts. An ETag is not AWS S3 compatible when encrypted. When sending an ETag back to an S3 client it has to be formatted to be AWS S3 compatible.

Therefore, Format returns the last 16 bytes of an encrypted ETag.

In general, a caller has to distinguish the following cases:

  • The object is a multipart object. In this case, Format returns the ETag unmodified.
  • The object is an SSE-KMS or SSE-C encrypted single-part object. In this case, Format returns the last 16 bytes of the encrypted ETag, which will be a random value.
  • The object is an SSE-S3 encrypted single-part object. In this case, the caller has to decrypt the ETag first before calling Format. S3 clients expect that the ETag of an SSE-S3 encrypted single-part object is equal to the object's content MD5. Formatting the SSE-S3 ETag before decryption will result in a random-looking ETag which an S3 client will not accept.

Hence, a caller has to check:

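if method == "SSE-S3" { // method: the object's encryption type (illustrative)
   var err error
   etag, err = Decrypt(key, etag) // assign with =, so etag is not shadowed
   if err != nil {
      return err // or handle the error as appropriate
   }
}
etag = etag.Format()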

func (ETag) IsEncrypted

func (e ETag) IsEncrypted() bool

IsEncrypted reports whether the ETag is encrypted.

func (ETag) IsMultipart

func (e ETag) IsMultipart() bool

IsMultipart reports whether the ETag belongs to an object that has been uploaded using the S3 multipart API. An S3 multipart ETag has a -<number-of-parts> suffix.

func (ETag) Parts

func (e ETag) Parts() int

Parts returns the number of object parts that are referenced by this ETag. It returns 1 if the object has been uploaded using the S3 singlepart API.

Parts may panic if the ETag is an invalid multipart ETag.

func (ETag) String

func (e ETag) String() string

String returns the string representation of the ETag.

The returned string is a hex representation of the binary ETag with an optional '-<number-of-parts>' suffix.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

A Reader wraps an io.Reader and computes the MD5 checksum of the read content as ETag.

Optionally, a Reader can also verify that the computed ETag matches an expected value. To do so, it compares both ETags once the underlying io.Reader returns io.EOF. If the computed ETag does not match the expected ETag then Read returns a VerifyError.

Reader implements the Tagger interface.

func NewReader

func NewReader(ctx context.Context, r io.Reader, etag ETag, forceMD5 []byte) *Reader

NewReader returns a new Reader that computes the MD5 checksum of the content read from r as ETag.

If the provided etag is not nil, the returned Reader compares the etag with the computed MD5 sum once r returns io.EOF.
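
The verify-on-EOF behavior can be sketched with the standard library. The type md5Reader, the error errMismatch and the helper checksum are hypothetical; the sketch ignores the ctx and forceMD5 parameters of NewReader and returns a plain error value instead of a VerifyError.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/hex"
	"errors"
	"fmt"
	"hash"
	"io"
	"strings"
)

var errMismatch = errors.New("computed MD5 does not match the expected ETag")

// md5Reader hashes everything read through it and, if expected is not nil,
// verifies the digest once the source reports io.EOF.
type md5Reader struct {
	src      io.Reader
	hash     hash.Hash
	expected []byte // nil disables verification
}

func newMD5Reader(src io.Reader, expected []byte) *md5Reader {
	return &md5Reader{src: src, hash: md5.New(), expected: expected}
}

func (r *md5Reader) Read(p []byte) (int, error) {
	n, err := r.src.Read(p)
	r.hash.Write(p[:n]) // hash the bytes before looking at the error
	if err == io.EOF && r.expected != nil && !bytes.Equal(r.hash.Sum(nil), r.expected) {
		return n, errMismatch
	}
	return n, err
}

// ETag returns the MD5 of the content read so far.
func (r *md5Reader) ETag() []byte { return r.hash.Sum(nil) }

// checksum drains the content through an md5Reader and returns the hex ETag.
func checksum(content string, expected []byte) (string, error) {
	r := newMD5Reader(strings.NewReader(content), expected)
	if _, err := io.Copy(io.Discard, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(r.ETag()), nil
}

func main() {
	etag, err := checksum("hello", nil)
	fmt.Println(etag, err)

	_, err = checksum("hello", make([]byte, md5.Size)) // wrong expected value
	fmt.Println(err)
}
```

Verification can only happen at io.EOF, so a caller that stops reading early never learns whether the content matched.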

func (*Reader) ETag

func (r *Reader) ETag() ETag

ETag returns the ETag of all the content read so far. Reading more content changes the MD5 checksum. Therefore, calling ETag multiple times may return different results.

func (*Reader) Read

func (r *Reader) Read(p []byte) (int, error)

Read reads up to len(p) bytes from the underlying io.Reader as specified by the io.Reader interface.

type Tagger

type Tagger interface {
	ETag() ETag
}

Tagger is the interface that wraps the basic ETag method.

type UUIDHash

type UUIDHash struct {
	// contains filtered or unexported fields
}

UUIDHash uses a UUID to produce the MD5 sum.

func NewUUIDHash

func NewUUIDHash(uuid []byte) *UUIDHash

NewUUIDHash returns a new UUIDHash for the given uuid.

func (UUIDHash) BlockSize

func (u UUIDHash) BlockSize() int

BlockSize implements the BlockSize method of hash.Hash.

func (UUIDHash) Reset

func (u UUIDHash) Reset()

Reset implements the Reset method of hash.Hash.

func (UUIDHash) Size

func (u UUIDHash) Size() int

Size implements the Size method of hash.Hash.

func (UUIDHash) Sum

func (u UUIDHash) Sum(b []byte) []byte

Sum implements the Sum method of hash.Hash in place of an MD5 sum.

func (UUIDHash) Write

func (u UUIDHash) Write(p []byte) (n int, err error)

Write implements the Write method of hash.Hash.

type VerifyError

type VerifyError struct {
	Expected ETag
	Computed ETag
}

VerifyError is an error signaling that a computed ETag does not match an expected ETag.

func (VerifyError) Error

func (v VerifyError) Error() string
