= Installation Guide - Crunchy Proxy
Crunchy Data Solutions, Inc.
v0.0.1, {docdate}
image::docs/crunchy_logo.png?raw=true[]
== Project
*crunchy-proxy* is a minimal PostgreSQL-aware proxy used to handle PostgreSQL application requests. The diagram below depicts a PostgreSQL client application connecting to the *crunchy-proxy*, which appears to the PostgreSQL application as any other PostgreSQL database connection. The *crunchy-proxy* accepts the inbound client requests, and routes them to an appropriate PostgreSQL cluster member (as described in more detail below).
image::docs/proxy-diagram.png?raw=true["scaledwidth="50%"]
To the PostgreSQL server, the *crunchy-proxy* is transparent and appears as any other
PostgreSQL client would appear. As described in more detail below, the *crunchy-proxy* currently provides the following capabilities:
* ability to route messages based on the SQL command type - writes are
sent to a master and reads are sent to replicas in a round-robin fashion
* configuration via JSON file
* PostgreSQL healthcheck
* ability to route inbound client messages based on the health of PostgreSQL
* REST administrative interface
* ability to publish healthcheck events to consumers outside of the proxy
* ability to load balance SQL statements across multiple replica backends
* connection pooling
* support for certificate and SSL based authentication
As described under "Approach" below, implementation of the *crunchy-proxy* requires that (1) a user has access to their application's SQL and (2) a user has the ability to add basic annotations to their SQL statements.
== Approach
One of the key features of the *crunchy-proxy* is the ability to route different SQL statements to a PostgreSQL cluster members based on certain pre-defined rules. This capability enables end-user applications to see a PostgreSQL cluster as a single connection. In order to support this capability, it is necessary that *crunchy-proxy* have knowledge of the SQL statements that a PostgreSQL application sends to PostgreSQL.
One approach to addressing the requirement that a proxy be "SQL aware" (e.g. link:http://www.pgpool.net/mediawiki/index.php/Main_Page[pgpool]) is to first parse each SQL statement looking for grammar that would imply a *write* or *update* and to then send those SQL statements to the cluster master. SQL statements that are determined to be *read-only* in nature are deemed safe to send to the cluster replica members. Remember, in PostgreSQL cluster replica members are only able to process SQL that is *read-only* in nature.
*crunchy-proxy* takes a different approach to SQL routing. Instead of attempting to parse the entire SQL grammer of each SQL statement, it looks for a simple annotation, supplied by the end-user within their SQL statement, to determine the routing destination.
For example, the following SQL statement is annotated to indicate it
is a *read-only* SQL statement:
....
/* read */ select now();
....
Examples of a write SQL statement include the following:
....
create table boofar (id int);
drop table boofar (id int);
....
If a SQL statement does not include an annotation, the the statement is deemed a *write* and thus sent to the master cluster member.
By parsing only the annotation, *crunchy-proxy* simplifies the complexity associated with determining whether a SQL statement is a *write* or *read* and thus to which member (master or replica) of a PostgreSQL cluser to send a SQL statement.
In taking this approach, *crunchy-proxy* has assumed (1) a user has access to their application's SQL and (2) a user has the ability to add the annotation in their SQL statements. If they do, then they can use the *crunchy-proxy* for SQL routing.
Of course these assumptions introduce certain limitations on the *crunchy-proxy*. Nonetheless, it was determined that these assumptions will not be unduly limiting in the usability of the *crunchy-proxy* and that the resulting limiations are justified by the benefits of (1) reduction in complexity associated with SQL parsing implementation, (2) increase in proxy throughput and (3) improved routing accuracy of the SQL parsing.
== PostgreSQL Wire Protocol
*crunchy-proxy* operates at the PostgreSQL wire protocol (network) layer to understand PostgreSQL client authentication requests and SQL statements passed by a client to a PostgreSQL backend.
As *crunchy-proxy* uses annotations to route messages to the backend, the proxy primarily examines SQL statements for proxy-specific annotations and does very little processing of the messages sent between a client and an actual backend.
Its important to note that the proxy does not implement all features of libpq or provide an application interface similar to a JDBC driver or other language driver.
The following resources are useful in understanding the PostgreSQL wire protocol:
* link:https://www.pgcon.org/2014/schedule/attachments/330_postgres-for-the-wire.pdf[https://www.pgcon.org/2014/schedule/attachments/330_postgres-for-the-wire.pdf]
* link:https://www.postgresql.org/docs/current/static/protocol.html[https://www.postgresql.org/docs/current/static/protocol.html]
* link:https://github.com/lib/pq[https://github.com/lib/pq]
In the future, by working at the wire protocol level, *crunchy-proxy* can implement a variety of features important for high speed proxy handling and for supporting PostgreSQL features.
== Execution
The proxy is a golang binary, you execute it as follows:
....
$> crunchy-proxy start --config=config.yaml
....
To run the proxy at different logging output levels:
....
$> crunchy-proxy start --config=config.yaml --log-level=<level>
....
Where _<level>_ is one of the following:
* *debug*
* *info*
* *error*
* *fatal*
Detailed documentation including configuration file format and
developer information is
found in the link:docs/crunchy-proxy-user-guide.asciidoc[User Guide]
For Docker users, you can run the proxy using the
link:bin/run-docker.sh[run-docker.sh] script.
== Feedback
If you find a bug, or want to provide feedback on the design and features
feel free to create a github issue.
== Legal Notices
Copyright © 2017 Crunchy Data Solutions, Inc.
CRUNCHY DATA SOLUTIONS, INC. PROVIDES THIS GUIDE "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Crunchy, Crunchy Data Solutions, Inc. and the Crunchy Hippo Logo are trademarks of Crunchy Data Solutions, Inc.