aws

Version: v0.1.0 (this package is not in the latest version of its module)
Published: Jan 11, 2025 License: BSD-3-Clause-Clear Imports: 0 Imported by: 0

README

go-aws-msg


This is a fork of https://github.com/zerofox-oss/go-aws-msg with added message batching for cost control. This post claims a 90% reduction in SQS costs from batching messages: https://www.moengage.com/blog/reduce-sqs-cost/

AWS Pub/Sub Primitives for Go

This library contains basic primitives for building pub/sub systems using AWS's SNS (Simple Notification Service) and SQS (Simple Queue Service). It is inspired by go-msg.

SNS is specifically designed to be a fully managed Pub/Sub messaging service which supports a number of different subscription types, such as HTTP endpoints or SQS queues.

At ZeroFOX, we use SNS for publishing all of our data. Teams are able to tap into existing data streams for new features or data analysis. SQS is our primary subscription protocol because most of our data processing backend is queued (for obvious benefits), though Lambda and HTTP are not uncommon. For us, SNS and SQS form the backbone of our architecture, what we call our data pipelines.

How it works

Most of the basic theory behind the primitives can be found in go-msg. This library contains the Topics and Servers which interact with SNS and SQS.

Batching

A 'batching' package has been added. On the client side it spawns a goroutine that runs a 'batching engine'. A message is not sent to SNS/SQS immediately; instead, a batching.Append call puts it on a queue. The actual send occurs either when Append sees that the queued messages plus the new payload would exceed 250K in total, or when the batching engine detects that the topic timeout specified in the batching.NewTopic call has expired. The higher-level packages sqs and sns gained the functions BatchON and BatchOFF to turn batching on and off. In theory, toggling batching should be possible without a restart. On the server side, which reads SQS messages, the sqs.BatchServer() call switches the package to reading, parsing, and processing batches of messages instead of single ones. Client and server must agree on whether batching is in use.
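The two flush triggers described above (a size threshold checked in Append, and a timeout checked by a background goroutine) can be sketched roughly as follows. This is not the batching package's code: Batcher, NewBatcher, Close, and the send callback are hypothetical names, and a plain ticker stands in for the per-topic timeout.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Batcher is a hypothetical stand-in for the batching engine described above.
type Batcher struct {
	mu      sync.Mutex
	pending []string
	size    int            // total bytes currently queued
	limit   int            // flush before the queue would exceed this many bytes
	send    func([]string) // called with each completed batch
	stop    chan struct{}
	done    chan struct{}
}

// NewBatcher starts the engine goroutine.
func NewBatcher(limit int, timeout time.Duration, send func([]string)) *Batcher {
	b := &Batcher{limit: limit, send: send, stop: make(chan struct{}), done: make(chan struct{})}
	go b.run(timeout)
	return b
}

// run flushes on every tick, and one last time when the batcher is closed.
func (b *Batcher) run(timeout time.Duration) {
	defer close(b.done)
	ticker := time.NewTicker(timeout)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			b.flush()
		case <-b.stop:
			b.flush()
			return
		}
	}
}

func (b *Batcher) flush() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.flushLocked()
}

// flushLocked sends the queued messages; the caller must hold b.mu.
func (b *Batcher) flushLocked() {
	if len(b.pending) == 0 {
		return
	}
	b.send(b.pending)
	b.pending, b.size = nil, 0
}

// Append queues msg, first sending the queue if msg would push it past limit.
func (b *Batcher) Append(msg string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.size+len(msg) > b.limit {
		b.flushLocked()
	}
	b.pending = append(b.pending, msg)
	b.size += len(msg)
}

// Close stops the engine and sends whatever is still queued.
func (b *Batcher) Close() {
	close(b.stop)
	<-b.done
}

func main() {
	b := NewBatcher(10, 50*time.Millisecond, func(batch []string) {
		fmt.Println("sending", batch)
	})
	b.Append("aaaa")
	b.Append("bbbb")
	b.Append("cccc") // 12 bytes would exceed the limit, so the first two go out
	b.Close()        // flushes the remaining message
}
```

Calling send while holding the mutex keeps batches in order at the cost of blocking Append during a send; a real engine may well make a different trade-off.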

Packing Multiple Messages into a Batch

When multiple messages are batched into a single one, each of them is simply prefixed with 4 bytes that express the message length in bytes as a base-62 integer. Say we have these 3 messages to batch:

[]string{"ABC", strings.Repeat("杂志等中区别", 1000), strings.Repeat("志", 200000)}

The batch looks like this (the blanks are added only for readability and are not part of the batch):

0003ABC 04Gk杂志等...18000 bytes (6000 runes) 2w5q志志志志...600000 bytes (200K runes)
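A minimal sketch of this length-prefixed layout follows. The function names are hypothetical, and the digit alphabet (0-9, then a-z, then A-Z) is an assumption inferred from the prefixes in the example, where 18000 is written as 04Gk and 600000 as 2w5q; the package's own code may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// base62Digits is an assumed alphabet; it reproduces the prefixes in the
// example above (18000 -> "04Gk", 600000 -> "2w5q").
const base62Digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// lenPrefix renders n as exactly 4 base-62 digits, left-padded with '0'.
func lenPrefix(n int) (string, error) {
	if n < 0 || n >= 62*62*62*62 {
		return "", errors.New("length does not fit in 4 base-62 digits")
	}
	buf := []byte("0000")
	for i := 3; i >= 0; i-- {
		buf[i] = base62Digits[n%62]
		n /= 62
	}
	return string(buf), nil
}

// parseLen is the inverse of lenPrefix.
func parseLen(p string) (int, error) {
	if len(p) != 4 {
		return 0, errors.New("length prefix must be 4 bytes")
	}
	n := 0
	for i := 0; i < 4; i++ {
		d := strings.IndexByte(base62Digits, p[i])
		if d < 0 {
			return 0, errors.New("invalid base-62 digit")
		}
		n = n*62 + d
	}
	return n, nil
}

// Pack concatenates the messages, each preceded by its byte length.
func Pack(msgs []string) (string, error) {
	var b strings.Builder
	for _, m := range msgs {
		p, err := lenPrefix(len(m))
		if err != nil {
			return "", err
		}
		b.WriteString(p)
		b.WriteString(m)
	}
	return b.String(), nil
}

// Unpack splits a batch back into the original messages.
func Unpack(batch string) ([]string, error) {
	var msgs []string
	for len(batch) > 0 {
		if len(batch) < 4 {
			return nil, errors.New("truncated length prefix")
		}
		n, err := parseLen(batch[:4])
		if err != nil {
			return nil, err
		}
		if len(batch) < 4+n {
			return nil, errors.New("truncated message")
		}
		msgs = append(msgs, batch[4:4+n])
		batch = batch[4+n:]
	}
	return msgs, nil
}

func main() {
	batch, err := Pack([]string{"ABC", strings.Repeat("杂志等中区别", 1000)})
	if err != nil {
		panic(err)
	}
	fmt.Println(batch[:7], batch[7:11]) // first prefix and message, then the second prefix
	msgs, err := Unpack(batch)
	fmt.Println(len(msgs), err)
}
```

Because the prefix counts bytes rather than runes, the reader can slice the batch directly without decoding UTF-8.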

Batched messages carry the following message attribute:

"Content-Transfer-Encoding": "partially-base64-batch"

Base-64 Encoding of Messages and Optimizations Made in This Area

SQS has limited Unicode support: some Unicode ranges are not accepted. Because of that, the original library's primary option is to Base-64 encode whole messages. This is very wasteful, since it inflates even ASCII data by about a third. For example, the 11 bytes of 'ABCDEF12345' become the 16 bytes of 'QUJDREVGMTIzNDU=', and the 18 bytes of '杂志等中区别' (all SQS-supported characters) become the 24 bytes of '5p2C5b+X562J5Lit5Yy65Yir'. This is problematic because an SNS/SQS message has a size limit of 250K.
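The cost of whole-message Base-64 can be checked with the standard library alone. The sketch below measures the growth for two sample strings and computes how much raw payload still fits under a limit; overhead and maxRaw are hypothetical helper names, and 250,000 bytes is used only as a stand-in for the 250K figure.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// overhead returns the size of s in bytes before and after standard Base-64.
func overhead(s string) (raw, encoded int) {
	return len(s), base64.StdEncoding.EncodedLen(len(s))
}

// maxRaw returns the largest raw payload whose padded Base-64 form still fits
// in limit bytes: every 3 raw bytes cost 4 encoded bytes.
func maxRaw(limit int) int {
	return limit / 4 * 3
}

func main() {
	for _, s := range []string{"ABCDEF12345", "杂志等中区别"} {
		raw, enc := overhead(s)
		fmt.Printf("%q: %d bytes raw, %d bytes encoded\n", s, raw, enc)
	}
	fmt.Println("raw payload that fits in 250000 encoded bytes:", maxRaw(250000))
}
```

In other words, encoding everything gives up roughly a quarter of the usable message size even when the payload contains nothing SQS would reject.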

Batching uses an encoding that Base-64 encodes only what is necessary: the message subsequences that SQS does not support. This is the only encoding that batching supports. It is implemented in the sqsencode package. A partially Base-64 encoded message looks like this:

<sqs supported sequence>U+10FFFF<4 bytes of base64 encoded subsequence length><base64 encoded subsequence>

If the original message itself contains U+10FFFF, that character is doubled.
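A rough sketch of an encoder for this layout is shown below. It is not the sqsencode implementation, and several details are assumptions: the set of accepted characters follows the ranges listed in the SQS SendMessage documentation, invalid UTF-8 bytes are treated as unsupported, and the 4-byte length is written as the base-62 length of the Base-64 text. Encode, next, and lenPrefix are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
	"unicode/utf8"
)

const (
	marker = "\U0010FFFF" // escape character, 4 bytes in UTF-8
	digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
)

// next decodes the first rune of s and reports its byte width and whether SQS
// accepts it. An invalid UTF-8 byte counts as unsupported.
func next(s string) (int, bool) {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError && size == 1 {
		return 1, false
	}
	ok := r == 0x9 || r == 0xA || r == 0xD || (r >= 0x20 && r <= 0xD7FF) ||
		(r >= 0xE000 && r <= 0xFFFD) || r >= 0x10000
	return size, ok
}

// lenPrefix renders n as 4 base-62 digits (assumed length format). It assumes
// n < 62^4, which holds for any message within the SQS size limit.
func lenPrefix(n int) string {
	buf := []byte("0000")
	for i := 3; i >= 0; i-- {
		buf[i] = digits[n%62]
		n /= 62
	}
	return string(buf)
}

// Encode copies supported text through unchanged, doubles a literal marker,
// and replaces each unsupported run with marker + length + Base-64.
func Encode(s string) string {
	var out strings.Builder
	for i := 0; i < len(s); {
		size, ok := next(s[i:])
		if ok {
			if s[i:i+size] == marker {
				out.WriteString(marker) // a literal U+10FFFF is written twice
			}
			out.WriteString(s[i : i+size])
			i += size
			continue
		}
		j := i // extend j over the whole unsupported run
		for j < len(s) {
			n, ok := next(s[j:])
			if ok {
				break
			}
			j += n
		}
		enc := base64.StdEncoding.EncodeToString([]byte(s[i:j]))
		out.WriteString(marker + lenPrefix(len(enc)) + enc)
		i = j
	}
	return out.String()
}

func main() {
	fmt.Printf("%q\n", Encode("id=7\x00\x01 ok"))
}
```

Doubling the marker is what lets a reader tell a literal U+10FFFF (two markers in a row) from the start of an encoded run (one marker followed by a length).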

This encoding can also be used without batching, via the sns.NewPartialBASE64Topic call. If the topic is batched (sns.NewBatchedTopic), it is the only encoding used internally.

This encoding sets the following message attribute:

"Content-Transfer-Encoding": "partially-base64"

Documentation

Overview

Package aws (go-aws-msg) implements pub/sub primitives using AWS (specifically SNS and SQS). This library implements the interfaces outlined in package "github.com/zerofox-oss/go-msg". A typical use is for building data processing pipelines in AWS.

Directories

Path Synopsis
test
