Gzip Middleware
This Go package wraps HTTP server handlers to transparently gzip the response body for clients that support it.
For HTTP clients we provide a transport wrapper that will do gzip decompression faster than what the standard library offers.
Both the client and server wrappers are fully compatible with other servers and clients.
This package is forked from the dead nytimes/gziphandler and extends its functionality.
Install
go get -u github.com/klauspost/compress
Documentation
Usage
There are 2 main parts, one for http servers and one for http clients.
Client
The standard library automatically adds gzip compression to most requests and handles decompression of the responses.
However, by wrapping the transport we are able to override this and provide our own (faster) decompressor.
Wrapping is done on the Transport of the http client:
func ExampleTransport() {
	// Get an HTTP client.
	client := http.Client{
		// Wrap the transport:
		Transport: gzhttp.Transport(http.DefaultTransport),
	}

	resp, err := client.Get("https://google.com")
	if err != nil {
		return
	}
	defer resp.Body.Close()

	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	body, _ := io.ReadAll(resp.Body)
	fmt.Println("body:", string(body))
}
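For comparison, the standard library behavior described above (gzip requested and decoded automatically) can be observed with a small stdlib-only sketch. It does not use gzhttp at all; the helper name `fetchViaDefaultTransport` and the throwaway test server are assumptions made for illustration.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// fetchViaDefaultTransport (hypothetical helper) starts a throwaway test
// server that gzips its response when the client asks for it, then fetches
// it through http.DefaultTransport. It returns the decoded body and whether
// the transport decompressed the response on its own.
func fetchViaDefaultTransport(payload string) (string, bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			io.WriteString(w, payload)
			return
		}
		w.Header().Set("Content-Encoding", "gzip")
		zw := gzip.NewWriter(w)
		defer zw.Close()
		io.WriteString(zw, payload)
	}))
	defer srv.Close()

	client := http.Client{Transport: http.DefaultTransport}
	resp, err := client.Get(srv.URL)
	if err != nil {
		return "", false, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	// resp.Uncompressed reports that the transport decoded gzip itself.
	return string(body), resp.Uncompressed, err
}

func main() {
	body, uncompressed, err := fetchViaDefaultTransport("hello, gzip")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body, uncompressed)
}
```

Wrapping the transport with gzhttp.Transport keeps this client code unchanged; only the decompressor behind it differs.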
Speed compared to the standard library DefaultTransport for an approximately 127KB JSON payload:
BenchmarkTransport
Single core:
BenchmarkTransport/gzhttp-32 1995 609791 ns/op 214.14 MB/s 10129 B/op 73 allocs/op
BenchmarkTransport/stdlib-32 1567 772161 ns/op 169.11 MB/s 53950 B/op 99 allocs/op
BenchmarkTransport/zstd-32 4579 238503 ns/op 547.51 MB/s 5775 B/op 69 allocs/op
Multi Core:
BenchmarkTransport/gzhttp-par-32 29113 36802 ns/op 3548.27 MB/s 11061 B/op 73 allocs/op
BenchmarkTransport/stdlib-par-32 16114 66442 ns/op 1965.38 MB/s 54971 B/op 99 allocs/op
BenchmarkTransport/zstd-par-32 90177 13110 ns/op 9960.83 MB/s 5361 B/op 67 allocs/op
This includes serving the HTTP request, parsing requests and decompressing.
Server
For the simplest usage call GzipHandler with any handler (an object which implements the http.Handler interface), and it'll return a new handler which gzips the response. For example:
package main

import (
	"io"
	"net/http"

	"github.com/Memexurer/compress/gzhttp"
)

func main() {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain")
		io.WriteString(w, "Hello, World")
	})

	http.Handle("/", gzhttp.GzipHandler(handler))
	http.ListenAndServe("0.0.0.0:8000", nil)
}
This will wrap a handler using the default options.
To specify custom options a reusable wrapper can be created that can be used to wrap any number of handlers.
package main

import (
	"io"
	"log"
	"net/http"

	"github.com/Memexurer/compress/gzhttp"
	"github.com/Memexurer/compress/gzip"
)

func main() {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain")
		io.WriteString(w, "Hello, World")
	})

	// Create a reusable wrapper with custom options.
	wrapper, err := gzhttp.NewWrapper(gzhttp.MinSize(2000), gzhttp.CompressionLevel(gzip.BestSpeed))
	if err != nil {
		log.Fatalln(err)
	}

	http.Handle("/", wrapper(handler))
	http.ListenAndServe("0.0.0.0:8000", nil)
}
Performance
Speed compared to nytimes/gziphandler with default settings, 2KB, 20KB and 100KB:
λ benchcmp before.txt after.txt
benchmark old ns/op new ns/op delta
BenchmarkGzipHandler_S2k-32 51302 23679 -53.84%
BenchmarkGzipHandler_S20k-32 301426 156331 -48.14%
BenchmarkGzipHandler_S100k-32 1546203 818981 -47.03%
BenchmarkGzipHandler_P2k-32 3973 1522 -61.69%
BenchmarkGzipHandler_P20k-32 20319 9397 -53.75%
BenchmarkGzipHandler_P100k-32 96079 46361 -51.75%
benchmark old MB/s new MB/s speedup
BenchmarkGzipHandler_S2k-32 39.92 86.49 2.17x
BenchmarkGzipHandler_S20k-32 67.94 131.00 1.93x
BenchmarkGzipHandler_S100k-32 66.23 125.03 1.89x
BenchmarkGzipHandler_P2k-32 515.44 1345.31 2.61x
BenchmarkGzipHandler_P20k-32 1007.92 2179.47 2.16x
BenchmarkGzipHandler_P100k-32 1065.79 2208.75 2.07x
benchmark old allocs new allocs delta
BenchmarkGzipHandler_S2k-32 22 16 -27.27%
BenchmarkGzipHandler_S20k-32 25 19 -24.00%
BenchmarkGzipHandler_S100k-32 28 21 -25.00%
BenchmarkGzipHandler_P2k-32 22 16 -27.27%
BenchmarkGzipHandler_P20k-32 25 19 -24.00%
BenchmarkGzipHandler_P100k-32 27 21 -22.22%
benchmark old bytes new bytes delta
BenchmarkGzipHandler_S2k-32 8836 2980 -66.27%
BenchmarkGzipHandler_S20k-32 69034 20562 -70.21%
BenchmarkGzipHandler_S100k-32 356582 86682 -75.69%
BenchmarkGzipHandler_P2k-32 9062 2971 -67.21%
BenchmarkGzipHandler_P20k-32 67799 20051 -70.43%
BenchmarkGzipHandler_P100k-32 300972 83077 -72.40%
Stateless compression
In cases where you expect to run many thousands of compressors concurrently, but with very little activity you can use stateless compression. This is not intended for regular web servers serving individual requests.
Use CompressionLevel(-3) or CompressionLevel(gzip.StatelessCompression) to enable.
Consider adding a bufio.Writer with a small buffer.
See more details on stateless compression.
Migrating from gziphandler
This package removes some of the extra constructors. When replacing, this can be used to find a replacement.
- GzipHandler(h) -> GzipHandler(h) (keep as-is)
- GzipHandlerWithOpts(opts...) -> NewWrapper(opts...)
- MustNewGzipLevelHandler(n) -> NewWrapper(CompressionLevel(n))
- NewGzipLevelAndMinSize(n, s) -> NewWrapper(CompressionLevel(n), MinSize(s))
By default, some mime types will now be excluded.
To re-enable compression of all types, use the ContentTypeFilter(gzhttp.CompressAllContentTypeFilter) option.
Range Requests
Ranged requests are not well supported with compression. Therefore any request with a "Content-Range" header is not compressed.
To signify that range requests are not supported any "Accept-Ranges" header set is removed when data is compressed.
If you do not want this behavior use the KeepAcceptRanges() option.
Flushing data
The wrapper supports the http.Flusher interface.
The only caveat is that the writer may not yet have received enough bytes to determine if MinSize has been reached. In this case it will assume that the minimum size has been reached.
If nothing has been written to the response writer, nothing will be flushed.
BREACH mitigation
BREACH is a specialized attack where attacker controlled data is injected alongside secret data in a response body. This can lead to sidechannel attacks, where observing the compressed response size can reveal if there are overlaps between the secret data and the injected data.
For more information see https://breachattack.com/
It can be hard to judge if you are vulnerable to BREACH. In general, if you do not include any user provided content in the response body you are safe, but if you do, or you are in doubt, you can apply mitigations.
gzhttp can apply Heal the Breach, or improved content aware padding.
// RandomJitter adds 1->n random bytes to output based on checksum of payload.
// Specify the amount of input to buffer before applying jitter.
// This should cover the sensitive part of your response.
// This can be used to obfuscate the exact compressed size.
// Specifying 0 will use a buffer size of 64KB.
// 'paranoid' will use a slower hashing function, that MAY provide more safety.
// If a negative buffer is given, the amount of jitter will not be content dependent.
// This provides *less* security than applying content based jitter.
func RandomJitter(n, buffer int, paranoid bool) option
...
The jitter is added as a "Comment" field. This field has a 1 byte overhead, so actual extra size will be 2 -> n+1 (inclusive).
A good option would be to apply 32 random bytes, with default 64KB buffer: gzhttp.RandomJitter(32, 0, false).
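The one-byte overhead of the "Comment" field can be checked with the standard library's compress/gzip, which exposes the same optional header field. This is a minimal sketch, not gzhttp's implementation; `gzipSize` and `commentOverhead` are hypothetical helper names, and `padLen` is assumed to be non-negative.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"strings"
)

// gzipSize compresses payload with the standard library and returns the size
// of the resulting stream. A non-empty comment is stored in the optional
// Comment field of the gzip header.
func gzipSize(payload []byte, comment string) (int, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Comment = comment
	if _, err := zw.Write(payload); err != nil {
		return 0, err
	}
	if err := zw.Close(); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

// commentOverhead returns how many bytes a padding comment of padLen bytes
// adds to the compressed stream for the given payload.
func commentOverhead(payload []byte, padLen int) (int, error) {
	plain, err := gzipSize(payload, "")
	if err != nil {
		return 0, err
	}
	// Build "Padding-Padding-..." and cut it to exactly padLen bytes.
	padding := strings.Repeat("Padding-", padLen/8+1)[:padLen]
	padded, err := gzipSize(payload, padding)
	if err != nil {
		return 0, err
	}
	return padded - plain, nil
}

func main() {
	payload := []byte(strings.Repeat("some response body ", 100))
	for _, n := range []int{1, 32} {
		extra, err := commentOverhead(payload, n)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("padding of %d bytes adds %d bytes\n", n, extra)
	}
}
```

The extra byte is the NUL terminator that ends the comment string in the gzip header, which is why 1 to n bytes of padding grow the output by 2 to n+1 bytes.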
Note that flushing the data forces the padding to be applied, which means that only data before the flush is considered for content aware padding.
The padding in the comment is the text Padding-Padding-Padding-Padding-Pad....
The length is 1 + crc32c(payload) MOD n or 1 + sha256(payload) MOD n (paranoid), or just random from crypto/rand if buffer < 0.
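The two content-based length formulas above can be sketched with the standard library as follows. This is an illustration only: the function name `paddingLen` is hypothetical, and the way the SHA-256 digest is reduced to an integer (first four bytes, little-endian) is an assumption, since the text does not say how gzhttp does it.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32C polynomial table.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// paddingLen returns a content-dependent padding length in the range 1..n,
// following "1 + checksum(payload) MOD n". It returns 0 when n <= 0.
func paddingLen(payload []byte, n int, paranoid bool) int {
	if n <= 0 {
		return 0
	}
	var sum uint32
	if paranoid {
		digest := sha256.Sum256(payload)
		// Assumption: reduce the digest to 32 bits using its first four bytes.
		sum = binary.LittleEndian.Uint32(digest[:4])
	} else {
		sum = crc32.Checksum(payload, castagnoli)
	}
	return 1 + int(sum%uint32(n))
}

func main() {
	payload := []byte(`{"user":"alice","token":"secret"}`)
	fmt.Println("crc32c: ", paddingLen(payload, 32, false))
	fmt.Println("sha256: ", paddingLen(payload, 32, true))
}
```

Because the length is a pure function of the payload, identical responses always receive identical padding, so repeating a request does not let an observer average the jitter away.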
Paranoid?
The padding size is determined by the remainder of a CRC32 of the content.
Since the payload contains elements unknown to the attacker, there is no reason to believe they can derive any information from this remainder, or predict it.
However, those who feel uncomfortable with a CRC32 being used for this can enable "paranoid" mode, which will use SHA256 for determining the padding.
The hashing itself is about 2 orders of magnitude slower, but in overall terms will maybe only reduce speed by 10%.
Paranoid mode has no effect if buffer is < 0 (non-content aware padding).
Examples
Adding the option gzhttp.RandomJitter(32, 50000, false) will apply from 1 up to 32 bytes of random data to the output.
The number of bytes added depends on the content of the first 50000 bytes, or all of them if the output was less than that.
Adding the option gzhttp.RandomJitter(32, -1, false) will apply from 1 up to 32 bytes of random data to the output.
Each call will apply a random amount of jitter. This should be considered less secure than content based jitter.
This can be used if responses are very big, deterministic and the buffer size would be too big to cover where the mutation occurs.
License
Documentation ¶
Index ¶
- Constants
- func CompressAllContentTypeFilter(ct string) bool
- func CompressionLevel(level int) option
- func ContentTypeFilter(compress func(ct string) bool) option
- func ContentTypes(types []string) option
- func DefaultContentTypeFilter(ct string) bool
- func DropETag() option
- func ExceptContentTypes(types []string) option
- func GzipHandler(h http.Handler) http.HandlerFunc
- func Implementation(writer writer.GzipWriterFactory) option
- func KeepAcceptRanges() option
- func MinSize(size int) option
- func NewWrapper(opts ...option) (func(http.Handler) http.HandlerFunc, error)
- func RandomJitter(n, buffer int, paranoid bool) option
- func SetContentType(b bool) option
- func SuffixETag(suffix string) option
- func Transport(parent http.RoundTripper, opts ...transportOption) http.RoundTripper
- func TransportCustomEval(fn func(header http.Header) bool) transportOption
- func TransportEnableGzip(b bool) transportOption
- func TransportEnableZstd(b bool) transportOption
- type GzipResponseWriter
- func (w *GzipResponseWriter) Close() error
- func (w *GzipResponseWriter) Flush()
- func (w *GzipResponseWriter) Hijack() (net.Conn, *bufio.ReadWriter, error)
- func (w *GzipResponseWriter) Unwrap() http.ResponseWriter
- func (w *GzipResponseWriter) Write(b []byte) (int, error)
- func (w *GzipResponseWriter) WriteHeader(code int)
- type GzipResponseWriterWithCloseNotify
- type NoGzipResponseWriter
- func (n *NoGzipResponseWriter) CloseNotify() <-chan bool
- func (n *NoGzipResponseWriter) Flush()
- func (n *NoGzipResponseWriter) Header() http.Header
- func (n *NoGzipResponseWriter) Unwrap() http.ResponseWriter
- func (n *NoGzipResponseWriter) Write(bytes []byte) (int, error)
- func (n *NoGzipResponseWriter) WriteHeader(statusCode int)
Constants ¶
const (
	// DefaultQValue is the default qvalue to assign to an encoding if no explicit qvalue is set.
	// This is actually kind of ambiguous in RFC 2616, so hopefully it's correct.
	// The examples seem to indicate that it is.
	DefaultQValue = 1.0

	// DefaultMinSize is the default minimum size until we enable gzip compression.
	// 1500 bytes is the MTU size for the internet since that is the largest size allowed at the network layer.
	// If you take a file that is 1300 bytes and compress it to 800 bytes, it’s still transmitted in that same 1500 byte packet regardless, so you’ve gained nothing.
	// That being the case, you should restrict the gzip compression to files with a size (plus header) greater than a single packet,
	// 1024 bytes (1KB) is therefore default.
	DefaultMinSize = 1024
)
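To illustrate what a default qvalue means in practice, here is a minimal sketch of parsing an Accept-Encoding header, where any coding listed without an explicit `q=` parameter gets the default of 1.0. The parser `parseAcceptEncoding` and the local constant `defaultQValue` are hypothetical and are not the package's own code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// defaultQValue mirrors the documented default of 1.0 (local to this sketch).
const defaultQValue = 1.0

// parseAcceptEncoding maps each content coding in an Accept-Encoding header
// to its qvalue. Codings without a valid explicit qvalue get defaultQValue.
func parseAcceptEncoding(header string) map[string]float64 {
	out := make(map[string]float64)
	for _, part := range strings.Split(header, ",") {
		coding, params, _ := strings.Cut(part, ";")
		coding = strings.ToLower(strings.TrimSpace(coding))
		if coding == "" {
			continue
		}
		q := defaultQValue
		for _, p := range strings.Split(params, ";") {
			key, val, ok := strings.Cut(strings.TrimSpace(p), "=")
			if !ok || strings.ToLower(strings.TrimSpace(key)) != "q" {
				continue
			}
			// Only accept qvalues in the valid range 0..1.
			if f, err := strconv.ParseFloat(strings.TrimSpace(val), 64); err == nil && f >= 0 && f <= 1 {
				q = f
			}
		}
		out[coding] = q
	}
	return out
}

func main() {
	for coding, q := range parseAcceptEncoding("gzip;q=0.5, deflate, br;q=0") {
		fmt.Printf("%s -> %.1f\n", coding, q)
	}
}
```

A qvalue of 0 means the client explicitly refuses that coding, so a server should only compress when the qvalue for gzip is greater than zero.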
const (
	// HeaderNoCompression can be used to disable compression.
	// Any header value will disable compression.
	// The Header is always removed from output.
	HeaderNoCompression = "No-Gzip-Compression"
)
Functions ¶
func CompressAllContentTypeFilter ¶
func CompressAllContentTypeFilter(ct string) bool
CompressAllContentTypeFilter will compress all mime types.
func CompressionLevel ¶
func CompressionLevel(level int) option
CompressionLevel sets the compression level
func ContentTypeFilter ¶
func ContentTypeFilter(compress func(ct string) bool) option
ContentTypeFilter allows adding a custom content type filter.
The supplied function must return true/false to indicate if content should be compressed.
When called no parsing of the content type 'ct' has been done. It may have been set or auto-detected.
Setting this will override default and any previous Content Type settings.
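Since the filter receives the raw, unparsed Content-Type value, a custom filter usually needs to parse it itself. The sketch below shows one possible filter with the `func(ct string) bool` shape; the name `compressTextAndJSON` and its policy (text types and JSON only) are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// compressTextAndJSON reports whether a response with the given raw
// Content-Type value should be compressed. It parses the value itself,
// because the filter is handed the header value unparsed.
func compressTextAndJSON(ct string) bool {
	mediaType, _, err := mime.ParseMediaType(ct)
	if err != nil {
		// Empty or malformed content types are left uncompressed here.
		return false
	}
	return strings.HasPrefix(mediaType, "text/") || mediaType == "application/json"
}

func main() {
	for _, ct := range []string{"text/html; charset=utf-8", "application/json", "image/png", ""} {
		fmt.Printf("%q -> %v\n", ct, compressTextAndJSON(ct))
	}
}
```

Such a function could then be passed as gzhttp.ContentTypeFilter(compressTextAndJSON) when building a wrapper with NewWrapper.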
func ContentTypes ¶
func ContentTypes(types []string) option
ContentTypes specifies a list of content types to compare the Content-Type header to before compressing. If none match, the response will be returned as-is.
Content types are compared in a case-insensitive, whitespace-ignored manner.
A MIME type without any other directive will match a content type that has the same MIME type, regardless of that content type's other directives. I.e., "text/html" will match both "text/html" and "text/html; charset=utf-8".
A MIME type with any other directive will only match a content type that has the same MIME type and other directives. I.e., "text/html; charset=utf-8" will only match "text/html; charset=utf-8".
By default, common compressed audio, video and archive formats are excluded; see DefaultContentTypeFilter.
Setting this will override default and any previous Content Type settings.
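The matching rules described above can be sketched with the standard library's mime package. This is only an approximation of the documented behavior, not the package's own matcher, and `matchesContentType` is a hypothetical name.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// matchesContentType reports whether contentType matches pattern:
//   - a pattern without directives matches on the MIME type alone;
//   - a pattern with directives also requires the same directives.
// Comparison is case-insensitive and ignores surrounding whitespace.
func matchesContentType(pattern, contentType string) bool {
	pType, pParams, err := mime.ParseMediaType(pattern)
	if err != nil {
		return false
	}
	cType, cParams, err := mime.ParseMediaType(contentType)
	if err != nil {
		return false
	}
	if pType != cType {
		return false
	}
	if len(pParams) == 0 {
		return true
	}
	if len(pParams) != len(cParams) {
		return false
	}
	for key, want := range pParams {
		if got, ok := cParams[key]; !ok || !strings.EqualFold(got, want) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(matchesContentType("text/html", "text/html; charset=utf-8"))
	fmt.Println(matchesContentType("text/html; charset=utf-8", "text/html"))
}
```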
func DefaultContentTypeFilter ¶
func DefaultContentTypeFilter(ct string) bool
DefaultContentTypeFilter excludes common compressed audio, video and archive formats.
func DropETag ¶
func DropETag() option
DropETag removes the ETag of responses which are compressed. If DropETag is specified in conjunction with SuffixETag, this option will take precedence and the ETag will be dropped.
Per [RFC 7232 Section 2.3.3](https://www.rfc-editor.org/rfc/rfc7232#section-2.3.3), the ETag of a compressed response must differ from its uncompressed version.
This workaround eliminates ETag conflicts between the compressed and uncompressed versions by removing the ETag from the compressed version.
func ExceptContentTypes ¶
func ExceptContentTypes(types []string) option
ExceptContentTypes specifies a list of content types to compare the Content-Type header to before compressing. If none match, the response will be compressed.
Content types are compared in a case-insensitive, whitespace-ignored manner.
A MIME type without any other directive will match a content type that has the same MIME type, regardless of that content type's other directives. I.e., "text/html" will match both "text/html" and "text/html; charset=utf-8".
A MIME type with any other directive will only match a content type that has the same MIME type and other directives. I.e., "text/html; charset=utf-8" will only match "text/html; charset=utf-8".
By default, common compressed audio, video and archive formats are excluded; see DefaultContentTypeFilter.
Setting this will override default and any previous Content Type settings.
func GzipHandler ¶
func GzipHandler(h http.Handler) http.HandlerFunc
GzipHandler makes it easy to wrap an HTTP handler with default settings.
func Implementation ¶
func Implementation(writer writer.GzipWriterFactory) option
Implementation changes the implementation of GzipWriter.
The default implementation is backed by github.com/klauspost/compress. To support RandomJitter, the GzipWriterExt must also be supported by the returned writers.
func KeepAcceptRanges ¶
func KeepAcceptRanges() option
KeepAcceptRanges will keep Accept-Ranges header on gzipped responses. This will likely break ranged requests since that cannot be transparently handled by the filter.
func NewWrapper ¶
func NewWrapper(opts ...option) (func(http.Handler) http.HandlerFunc, error)
NewWrapper returns a reusable wrapper with the supplied options.
func RandomJitter ¶
func RandomJitter(n, buffer int, paranoid bool) option
RandomJitter adds 1->n random bytes to output based on checksum of payload. Specify the amount of input to buffer before applying jitter. This should cover the sensitive part of your response. This can be used to obfuscate the exact compressed size. Specifying 0 will use a buffer size of 64KB. 'paranoid' will use a slower hashing function, that MAY provide more safety. See README.md for more information. If a negative buffer is given, the amount of jitter will not be content dependent. This provides *less* security than applying content based jitter.
func SetContentType ¶
func SetContentType(b bool) option
SetContentType sets the Content-Type header on the response if it was not already set and a content type was detected. Default: true.
func SuffixETag ¶
func SuffixETag(suffix string) option
SuffixETag adds the specified suffix to the ETag header (if it exists) of responses which are compressed.
Per [RFC 7232 Section 2.3.3](https://www.rfc-editor.org/rfc/rfc7232#section-2.3.3), the ETag of a compressed response must differ from its uncompressed version.
A suffix such as "-gzip" is sometimes used as a workaround for generating a unique new ETag (see https://bz.apache.org/bugzilla/show_bug.cgi?id=39727).
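As a rough illustration of what suffixing an ETag involves, the sketch below inserts the suffix before the closing quote so that the result is still a quoted entity tag. The helper `suffixETag` is hypothetical, and its handling of unquoted values is an assumption; the package's actual behavior may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// suffixETag returns etag with suffix inserted before its closing quote,
// so `"abc"` becomes `"abc-gzip"`. An empty ETag is returned unchanged.
// Values without any quote simply get the suffix appended (an assumption).
func suffixETag(etag, suffix string) string {
	if etag == "" {
		return ""
	}
	i := strings.LastIndex(etag, `"`)
	if i == -1 {
		return etag + suffix
	}
	return etag[:i] + suffix + etag[i:]
}

func main() {
	fmt.Println(suffixETag(`"abc"`, "-gzip"))
	fmt.Println(suffixETag(`W/"abc"`, "-gzip"))
}
```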
func Transport ¶
func Transport(parent http.RoundTripper, opts ...transportOption) http.RoundTripper
Transport will wrap an HTTP transport with a custom handler that will request gzip and automatically decompress it. Using this is significantly faster than using the default transport.
func TransportCustomEval ¶
func TransportCustomEval(fn func(header http.Header) bool) transportOption
TransportCustomEval will send the header of a response to a custom function. If the function returns false, the response will be returned as-is. Otherwise it will be decompressed based on the Content-Encoding field, regardless of whether the transport added the encoding.
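A function of the required `func(header http.Header) bool` shape might look like the sketch below, which only allows decompression for JSON responses. The name `decompressJSONOnly` and the policy itself are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
)

// decompressJSONOnly has the signature expected by a custom evaluation
// function: it inspects the response header and returns true only when the
// response is JSON, so other responses would be passed through as-is.
func decompressJSONOnly(header http.Header) bool {
	mediaType, _, err := mime.ParseMediaType(header.Get("Content-Type"))
	if err != nil {
		// Missing or malformed Content-Type: leave the response untouched.
		return false
	}
	return mediaType == "application/json"
}

func main() {
	h := http.Header{}
	h.Set("Content-Type", "application/json; charset=utf-8")
	fmt.Println(decompressJSONOnly(h))

	h.Set("Content-Type", "application/octet-stream")
	fmt.Println(decompressJSONOnly(h))
}
```

It could then be supplied as gzhttp.Transport(http.DefaultTransport, gzhttp.TransportCustomEval(decompressJSONOnly)).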
func TransportEnableGzip ¶
func TransportEnableGzip(b bool) transportOption
TransportEnableGzip will send Gzip as a compression option to the server. Enabled by default.
func TransportEnableZstd ¶
func TransportEnableZstd(b bool) transportOption
TransportEnableZstd will send Zstandard as a compression option to the server. Enabled by default, but may be disabled if future problems arise.
Types ¶
type GzipResponseWriter ¶
type GzipResponseWriter struct {
	http.ResponseWriter
	// contains filtered or unexported fields
}
GzipResponseWriter provides an http.ResponseWriter interface, which gzips bytes before writing them to the underlying response. This doesn't close the writers, so don't forget to do that. It can be configured to skip responses smaller than minSize.
func (*GzipResponseWriter) Close ¶
func (w *GzipResponseWriter) Close() error
Close will close the gzip.Writer and will put it back in the gzipWriterPool.
func (*GzipResponseWriter) Flush ¶
func (w *GzipResponseWriter) Flush()
Flush flushes the underlying *gzip.Writer and then the underlying http.ResponseWriter if it is an http.Flusher. This makes GzipResponseWriter an http.Flusher. If not enough bytes have been written to determine if we have reached minimum size, this will be ignored. If nothing has been written yet, nothing will be flushed.
func (*GzipResponseWriter) Hijack ¶
func (w *GzipResponseWriter) Hijack() (net.Conn, *bufio.ReadWriter, error)
Hijack implements http.Hijacker. If the underlying ResponseWriter is a Hijacker, its Hijack method is returned. Otherwise an error is returned.
func (*GzipResponseWriter) Unwrap ¶
func (w *GzipResponseWriter) Unwrap() http.ResponseWriter
func (*GzipResponseWriter) Write ¶
func (w *GzipResponseWriter) Write(b []byte) (int, error)
Write appends data to the gzip writer.
func (*GzipResponseWriter) WriteHeader ¶
func (w *GzipResponseWriter) WriteHeader(code int)
WriteHeader just saves the response code until close or GZIP effective writes. In the specific case of 1xx status codes, WriteHeader is directly calling the wrapped ResponseWriter.
type GzipResponseWriterWithCloseNotify ¶
type GzipResponseWriterWithCloseNotify struct {
*GzipResponseWriter
}
func (GzipResponseWriterWithCloseNotify) CloseNotify ¶
func (w GzipResponseWriterWithCloseNotify) CloseNotify() <-chan bool
type NoGzipResponseWriter ¶
type NoGzipResponseWriter struct {
	http.ResponseWriter
	// contains filtered or unexported fields
}
NoGzipResponseWriter filters out HeaderNoCompression.
func (*NoGzipResponseWriter) CloseNotify ¶
func (n *NoGzipResponseWriter) CloseNotify() <-chan bool
func (*NoGzipResponseWriter) Flush ¶
func (n *NoGzipResponseWriter) Flush()
func (*NoGzipResponseWriter) Header ¶
func (n *NoGzipResponseWriter) Header() http.Header
func (*NoGzipResponseWriter) Unwrap ¶
func (n *NoGzipResponseWriter) Unwrap() http.ResponseWriter
func (*NoGzipResponseWriter) Write ¶
func (n *NoGzipResponseWriter) Write(bytes []byte) (int, error)
func (*NoGzipResponseWriter) WriteHeader ¶
func (n *NoGzipResponseWriter) WriteHeader(statusCode int)