The purpose of this code is to forward ESP (Email Service Provider) webhooks.
ESPs usually allow only one forwarding destination.
To add support for a new ESP, please look at the esp package.
The project is built for the GCP ecosystem.
It's licensed under the MIT license. tl;dr: You can use it in your closed source
project if you want. You don't have to publish any modifications, but we appreciate
bug reports.
Please note
Environment variables are specified per cloud function. Some environment variables
have different values for different cloud functions.
Components
mandatory environment variables
PROJECT_ID is your GCP project id.
DEV_OR_PROD is not strictly necessary to set. We use dev when a
developer is testing, devprod for a complete live instance that is only for
testing, and prod for the actual production instances.
SIMPLE_HASH_PASSWORD is used for the simple hash that is checked in
the fanout package.
optional environment variables
NBR_PUBLISH_WORKER is how many goroutines are busy sending data.
For the forwarder package you might want a constrained number, since
the receiving end might be upset about too many connections. The default is 32.
For the fanout package you can use a higher number if you like.
MAX_NBR_MESSAGES_POLLED specifies how many messages fanout and forward
try to receive in one batch.
NBR_ACK_WORKER is the number of goroutines that acknowledge (ACK)
received messages.
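As a concrete illustration, here is a minimal sketch of how these variables
could be read in Go, using the default of 32 publish workers mentioned above.
The package and helper names are made up for this example and are not the
actual code.

    package config // illustrative only

    import (
        "os"
        "strconv"
    )

    // intFromEnv returns the integer value of an environment variable,
    // falling back to def when the variable is unset or malformed.
    func intFromEnv(name string, def int) int {
        if v, ok := os.LookupEnv(name); ok {
            if n, err := strconv.Atoi(v); err == nil {
                return n
            }
        }
        return def
    }

    var (
        projectID        = os.Getenv("PROJECT_ID")              // mandatory
        devOrProd        = os.Getenv("DEV_OR_PROD")             // dev, devprod or prod
        nbrPublishWorker = intFromEnv("NBR_PUBLISH_WORKER", 32) // default 32
    )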
webhook package
Overview
The webhook package is the webhook web listener entry point.
The ESP (Email Service Provider, e.g. Sendgrid) posts webhook events to a URL
this package listens to.
The URL the ESP posts to ends in...
/v1/sg/1/aaa/bbb/
v1 is the protocol version.
sg says which ESP the webhook originates from. Here it's "sg" for "Sendgrid".
1 is the database user id for the client.
aaa is a hash calculated from the user id above and a site wide secret password.
This is just a quick filter of the most obvious stuff.
The password is specified in the environment variable SIMPLE_HASH_PASSWORD.
bbb is a hash calculated using user specific data.
Verifying it therefore requires user data to be read.
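As an illustration, splitting that path into its parts could look roughly like
this. The type and field names are made up for this sketch; the real webhook
package may structure it differently.

    package webhook // illustrative only

    import (
        "fmt"
        "strings"
    )

    // hookPath holds the parts of a path like /v1/sg/1/aaa/bbb/.
    type hookPath struct {
        Version    string // "v1":  protocol version
        ESP        string // "sg":  which ESP the webhook originates from
        UserID     string // "1":   database user id for the client
        SimpleHash string // "aaa": hash of user id + SIMPLE_HASH_PASSWORD
        UserHash   string // "bbb": hash based on user specific data
    }

    func parseWebhookPath(p string) (hookPath, error) {
        parts := strings.Split(strings.Trim(p, "/"), "/")
        if len(parts) != 5 {
            return hookPath{}, fmt.Errorf("unexpected path %q", p)
        }
        return hookPath{parts[0], parts[1], parts[2], parts[3], parts[4]}, nil
    }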
The webhook package does a quick check that the request is valid, and then
pushes the message to a pubsub queue. Once the message is on the queue, this
cloud function exits.
The topic id is specified in the environment variable OUT_QUEUE_TOPIC_ID.
Why is it constructed this way? We want to minimize the time spent on each
individual webhook event, because it's costly to process them one by one.
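A minimal sketch of that publish step, assuming the standard
cloud.google.com/go/pubsub client; the function name is made up for this
example.

    package webhook // illustrative only

    import (
        "context"
        "os"

        "cloud.google.com/go/pubsub"
    )

    // publishRaw pushes the raw webhook body to the topic named by
    // OUT_QUEUE_TOPIC_ID so this cloud function can return quickly.
    func publishRaw(ctx context.Context, body []byte) error {
        client, err := pubsub.NewClient(ctx, os.Getenv("PROJECT_ID"))
        if err != nil {
            return err
        }
        defer client.Close()

        topic := client.Topic(os.Getenv("OUT_QUEUE_TOPIC_ID"))
        defer topic.Stop()

        // Publish is asynchronous; Get blocks until the server has accepted it.
        _, err = topic.Publish(ctx, &pubsub.Message{Data: body}).Get(ctx)
        return err
    }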
fanout package
Overview
A batch of messages is loaded at once from the pubsub subscription that the
webhook package above posted to.
The batch size is specified in the environment variable MAX_NBR_MESSAGES_POLLED.
Both the fanout package and the forward package below have two subscriptions
related to the incoming side.
One pubsub subscription is simply the queue of webhook messages.
The second subscription is written to by the trigger package.
Messages written there are what wake up fanout and forwarder.
Thanks to this separation of the queue and trigger, each cloud function can
process many events at a time.
We load user configuration for the users that have received messages.
Currently we load user data from postgres only.
The next logical step is to load from redis instead, and allow parallel requests.
Using this configuration we can verify the hash called "bbb" above.
The hash is unique per user; you can, for example, use a randomly generated uuid.
If this hash is not ok, or the user is not active, we ignore hooks to that user.
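If the stored per-user value is a randomly generated uuid as suggested above,
the check could be as simple as the sketch below (names are illustrative; the
real comparison may differ).

    package fanout // illustrative only

    import "crypto/subtle"

    // userHashOK compares the "bbb" path segment against the value stored in
    // the user configuration, in constant time, and requires an active user.
    func userHashOK(pathHash, storedValue string, active bool) bool {
        return active &&
            subtle.ConstantTimeCompare([]byte(pathHash), []byte(storedValue)) == 1
    }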
For each message we have n endpoints specified in the user configuration.
We publish every message+endpoint combination to the first forward pubsub queue.
The topic id is specified in environment variable IN_SUBSCRIPTION_ID
The environment variable NBR_PUBLISH_WORKER controls how many parallel
workers we use, both in forwarder and fanout.
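A sketch of that fan-out with a fixed pool of NBR_PUBLISH_WORKER goroutines.
The job type and the "endpoint" attribute are assumptions made for this
example, not necessarily how the fanout package encodes its messages.

    package fanout // illustrative only

    import (
        "context"
        "sync"

        "cloud.google.com/go/pubsub"
    )

    // job is one message+endpoint combination to publish.
    type job struct {
        Payload  []byte
        Endpoint string
    }

    // publishAll publishes every job using a fixed number of workers.
    func publishAll(ctx context.Context, topic *pubsub.Topic, jobs []job, workers int) {
        ch := make(chan job)
        var wg sync.WaitGroup
        for i := 0; i < workers; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := range ch {
                    // One pubsub message per message+endpoint combination.
                    topic.Publish(ctx, &pubsub.Message{
                        Data:       j.Payload,
                        Attributes: map[string]string{"endpoint": j.Endpoint},
                    })
                }
            }()
        }
        for _, j := range jobs {
            ch <- j
        }
        close(ch)
        wg.Wait()
    }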
forward package
Overview
The forward package does the actual forwarding. When woken up by a trigger
message, it reads a number of messages (specified in the environment variable
MAX_NBR_MESSAGES_POLLED) from a subscription
(named in the environment variable IN_SUBSCRIPTION_ID).
It also optionally checks that each message is at least as many seconds old
as is specified in the environment variable MIN_AGE_SECS.
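A sketch of that optional age check, assuming the pubsub publish time is used
as the message's age reference.

    package forward // illustrative only

    import (
        "time"

        "cloud.google.com/go/pubsub"
    )

    // oldEnough reports whether the message is at least MIN_AGE_SECS seconds old.
    func oldEnough(m *pubsub.Message, minAgeSecs int) bool {
        return time.Since(m.PublishTime) >= time.Duration(minAgeSecs)*time.Second
    }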
It then tries to forward to the url in the message.
This has the potential to be the biggest time hog, which is why we want to
parallelize it.
The degree of parallelism is specified in the environment variable NBR_PUBLISH_WORKER.
If there is a problem with the forward, the problem is either intermittent
(like an HTTP 500 response) or permanent. If it's intermittent, and it's
not the last retry, the message is pushed forward to the next queue.
The next queue's topic is in the environment variable OUT_QUEUE_TOPIC_ID.
There are three forward attempts. The first is just after fanout has
validated the messages and attached forward urls to them. If the third
attempt fails, the forward message is lost.
You tell the forwarder package which attempt it is by using the
environment variable AT_QUEUE
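A rough sketch of that retry decision. The status code classification and the
names here are illustrative, not the exact rules used by the forward package.

    package forward // illustrative only

    import (
        "bytes"
        "context"
        "net/http"

        "cloud.google.com/go/pubsub"
    )

    // forwardOnce posts the payload to the endpoint and reports whether a
    // failure looks intermittent (worth retrying on the next queue).
    func forwardOnce(url string, payload []byte) (ok, intermittent bool) {
        resp, err := http.Post(url, "application/json", bytes.NewReader(payload))
        if err != nil {
            return false, true // network errors are treated as intermittent
        }
        defer resp.Body.Close()
        switch {
        case resp.StatusCode < 300:
            return true, false
        case resp.StatusCode >= 500 || resp.StatusCode == http.StatusTooManyRequests:
            return false, true // e.g. an HTTP 500: retry on a later attempt
        default:
            return false, false // permanent, e.g. 404: give up on this message
        }
    }

    // handle forwards one message and, on an intermittent failure that is not
    // the last of the three attempts, republishes it to the next queue.
    func handle(ctx context.Context, next *pubsub.Topic, url string, payload []byte, attempt int) {
        ok, intermittent := forwardOnce(url, payload)
        if !ok && intermittent && attempt < 3 {
            next.Publish(ctx, &pubsub.Message{Data: payload})
        }
    }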
trigger package
The trigger package wakes up the fanout package, and the forward package.
Overview
A Cloud Scheduler pubsub trigger wakes up the trigger package.
The trigger package checks how many messages the subscription it should
trigger currently has. The subscription is named by the
environment variable SUBSCRIPTION_TO_PROCESS.
The environment variable MAX_NBR_MESSAGES_POLLED says how many messages
each forwarder or fanout cloud function is expected to consume.
We check the size of the trigger queue using the subscription name
from environment variable TRIGGER_SUBSCRIPTION_ID
The trigger package will put
ceil(number of messages on queue / MAX_NBR_MESSAGES_POLLED) - number of items already on the trigger queue
...messages on the trigger queue, whose topic is named by the environment variable
TRIGGER_TOPIC.
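In code, the calculation could look like this. Fetching the two message
counts (e.g. via the monitoring metrics for the subscriptions) is left out of
the sketch.

    package trigger // illustrative only

    // triggersToPublish returns how many trigger messages to put on TRIGGER_TOPIC:
    // ceil(queued / maxPolled) minus the triggers already waiting, floored at zero.
    func triggersToPublish(queued, maxPolled, alreadyOnTriggerQueue int) int {
        if maxPolled <= 0 {
            return 0
        }
        n := (queued + maxPolled - 1) / maxPolled // integer ceil
        n -= alreadyOnTriggerQueue
        if n < 0 {
            return 0
        }
        return n
    }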