concurrentlimit

Version: v0.0.0-...-b9e6d22
Published: May 9, 2024 License: BSD-3-Clause Imports: 8 Imported by: 0

README

Concurrent request/connection limits for Go servers

Each connection and request that a server is processing takes memory. If you have too many, your server will run out of memory and crash. To make a server robust, you must limit the amount of concurrent work it accepts, so that it serves at least some requests during overload rather than none. This package provides some APIs to make this easier, and some tools I used to test this.

For a really robust server, you should do the following, in rough priority order:

  • Limit concurrent executing requests to limit memory.
  • Limit concurrent connections to limit memory. In particular, Go gRPC connections are very expensive: about 230 KiB each, versus about 40 KiB per HTTP server connection, particularly when connections are opening and closing rapidly.
  • Close connections/requests that are too slow or idle, since they are wasting resources.
  • Make clients well-behaved so they reduce their request rate on error, or stop entirely (exponential backoff, back pressure, circuit breakers). The gRPC MaxConcurrentStreams setting can help here.
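As an illustration of the last point, here is a minimal sketch of client-side exponential backoff. The names backoffDelay, retry, and errTemporary are hypothetical and are not part of this package; a real client would probably also add random jitter and stop retrying on non-retryable errors.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary is a hypothetical error used only for this demonstration.
var errTemporary = errors.New("temporary failure")

// backoffDelay returns base doubled once per attempt, capped at max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

// retry calls op until it succeeds or attempts run out, sleeping for a
// growing delay between failed attempts. It returns the last error.
func retry(attempts int, base, max time.Duration, op func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(backoffDelay(i, base, max))
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(4, 10*time.Millisecond, 100*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTemporary // fail the first two attempts
		}
		return nil
	})
	fmt.Println(calls, err)
}
```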

Possible future improvements to this code

  • gRPC streaming requests: The grpclimit package currently only limits unary requests.

  • Faster implementation: This uses a single sync.Mutex. It works well for ~10000 requests/second on 8 CPUs, but can be a bottleneck for extremely low-latency requests or high-CPU servers. Some sort of sharded counter, or something crazy like https://github.com/jonhoo/drwmutex would be more efficient.

  • Blocking/queuing: This package currently rejects requests when over the limit. It probably would be better to queue requests for some period of time. This can cause fewer retries when there are short overload bursts. It also means that poorly behaved clients that retry too quickly will retry less often, which may ultimately be better. There are also choices here about LIFO versus FIFO, drop head versus drop tail. See my previous investigation: https://www.evanjones.ca/prevent-server-overload.html

  • Multiple buckets of limits: Health checks, statistics, or other cheap requests should have much higher limits than expensive requests. It is possible this should be configurable.

  • Aggressively close idle connections on overload: This package sets idle timeouts on connections to attempt to avoid lots of idle clients starving busy clients. It would be nice if this policy triggered on overload. If we are at the connection limit, we should aggressively close idle connections. If we are not, then we should not care.
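To make the blocking/queuing idea above concrete, here is a rough sketch of a limiter that waits a bounded time for a free slot before giving up. It is not how this package works today (it rejects immediately). The names queueingLimiter, newQueueingLimiter, and errLimited are hypothetical, and the order in which waiters are served is whatever the Go runtime does for blocked channel senders, not an explicit FIFO or LIFO policy.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errLimited stands in for the package's ErrLimited.
var errLimited = errors.New("exceeded max concurrent operations limit")

// queueingLimiter is a hypothetical limiter that waits up to maxWait
// for a slot instead of rejecting immediately.
type queueingLimiter struct {
	slots   chan struct{} // buffered channel used as a counting semaphore
	maxWait time.Duration
}

func newQueueingLimiter(limit int, maxWait time.Duration) *queueingLimiter {
	return &queueingLimiter{slots: make(chan struct{}, limit), maxWait: maxWait}
}

// Start takes a slot, waiting up to maxWait if none is free.
func (q *queueingLimiter) Start() (func(), error) {
	// Fast path: take a free slot without allocating a timer.
	select {
	case q.slots <- struct{}{}:
		return q.end, nil
	default:
	}
	timer := time.NewTimer(q.maxWait)
	defer timer.Stop()
	select {
	case q.slots <- struct{}{}:
		return q.end, nil
	case <-timer.C:
		return nil, errLimited
	}
}

// end releases one slot.
func (q *queueingLimiter) end() { <-q.slots }

func main() {
	q := newQueueingLimiter(1, 50*time.Millisecond)
	end, _ := q.Start() // takes the only slot
	// Release the slot after 10ms; the second Start waits for it.
	go func() {
		time.Sleep(10 * time.Millisecond)
		end()
	}()
	end2, err := q.Start()
	fmt.Println("second start error:", err)
	if err == nil {
		end2()
	}
}
```

A short maxWait absorbs brief bursts without retries, at the cost of holding the waiting request's memory for that long.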

Running the server with limited memory and Docker

docker build . --tag=sleepyserver
docker run -p 127.0.0.1:8080:8080 -p 127.0.0.1:8081:8081 --rm -ti --memory=128m --memory-swap=128m sleepyserver

To monitor in another terminal:

  • docker stats
  • curl http://localhost:8080/stats

High memory per request

This client makes requests that use 1 MiB/request. Using 80 concurrent clients reliably blows up the server very quickly. Adding a concurrent request limit with --concurrentRequests=40 fixes it.

ulimit -n 10000
# HTTP
go run ./loadclient/main.go --httpTarget=http://localhost:8080/ --concurrent=80 --sleep=3s --waste=1048576 --duration=2m
# gRPC
go run ./loadclient/main.go --grpcTarget=localhost:8081 --concurrent=80 --sleep=3s --waste=1048576 --duration=2m

Low memory per request (lots of idle requests)

This client makes requests that basically do nothing except use idle connections.

ulimit -n 10000
# HTTP
go run ./loadclient/main.go --httpTarget=http://localhost:8080/ --concurrent=5000 --sleep=20s --duration=2m
# gRPC
go run ./loadclient/main.go --grpcTarget=localhost:8081 --concurrent=5000 --sleep=20s --duration=2m

With HTTP and a Docker memory limit of 128 MiB, 3000 concurrent connections seems to "work" on my machine but is dangerously close to the limit. Running the test a few times in a row seems to kill the server. It seems that closing and re-opening connections increases memory usage. Results by connection count:

  • 3000-3100: works, but unreliably
  • 3200: works for a while, then dies
  • 3500: dies after a few minutes
  • 3800: reliably dies

The gRPC test fails at a lower connection count (around 1000), so gRPC connections use much more memory than HTTP connections.

Using a concurrent request limit does NOT solve the problem, even with --concurrentRequests=40: There are simply too many connections and too much goroutine/connection overhead. To fix this, we need to reject new connections using --concurrentConnections=80.

gRPC MaxConcurrentStreams

This limits the number of concurrent streams per client connection, so it does not fix overload by itself. For example, setting it to 40 and using the "high memory" client above still blows through the limit. With the --shareGRPC client, this setting does protect the server. With MaxConcurrentStreams, the server communicates the limit back to the client, so the client blocks and slows its request rate (back pressure). The setting is still useful, but it does not protect the server's resources from "worst case" scenarios.

Documentation

Overview

Package concurrentlimit limits the number of concurrent requests to a Go HTTP or gRPC server.

Variables

var ErrLimited = errors.New("exceeded max concurrent operations limit")

ErrLimited is returned by Limiter when the concurrent operation limit is exceeded.

Functions

func Handler

func Handler(limiter Limiter, handler http.Handler) http.Handler

Handler returns an http.Handler that uses limiter to only permit a limited number of concurrent requests to be processed.
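A handler wrapper of this kind can be sketched roughly as follows. This is an illustration, not the package's source: limitHandler, chanLimiter, okHandler, and statusFor are hypothetical names, the local limiter interface only mirrors the documented Limiter, and responding with 429 Too Many Requests is an assumption about what a rejection should look like.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limiter mirrors the documented Limiter interface.
type limiter interface {
	Start() (func(), error)
}

// errLimited stands in for the package's ErrLimited.
var errLimited = errors.New("exceeded max concurrent operations limit")

// chanLimiter is a hypothetical stand-in limiter: a buffered channel
// used as a semaphore that rejects when full.
type chanLimiter chan struct{}

func (c chanLimiter) Start() (func(), error) {
	select {
	case c <- struct{}{}:
		return func() { <-c }, nil
	default:
		return nil, errLimited
	}
}

// limitHandler wraps next so that requests over the limit are rejected.
// The 429 status code is an assumption made for this sketch.
func limitHandler(l limiter, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		end, err := l.Start()
		if err != nil {
			http.Error(w, err.Error(), http.StatusTooManyRequests)
			return
		}
		defer end()
		next.ServeHTTP(w, r)
	})
}

// okHandler is a trivial handler used for the demonstration.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "ok")
})

// statusFor sends one in-memory request through h and returns the status code.
func statusFor(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	l := make(chanLimiter, 1)
	h := limitHandler(l, okHandler)
	fmt.Println(statusFor(h)) // a slot is free, so the request is served
	end, _ := l.Start()       // occupy the only slot
	fmt.Println(statusFor(h)) // the request is rejected
	end()
}
```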

func Listen

func Listen(network string, address string, connectionLimit int) (net.Listener, error)

Listen wraps net.Listen with netutil.LimitListener to limit concurrent connections.

func ListenAndServe

func ListenAndServe(srv *http.Server, requestLimit int, connectionLimit int) error

ListenAndServe listens for HTTP requests with a limited number of concurrent requests and connections. This helps avoid running out of memory during overload situations. Both requestLimit and connectionLimit must be > 0, and connectionLimit must be >= requestLimit. A reasonable default is to set connectionLimit to double requestLimit, which assumes that processing each request requires more memory than a raw connection, and that keeping some idle connections is useful. This replaces srv.Handler with another handler that implements the limit.

This also sets the server's ReadHeaderTimeout and IdleTimeout to a reasonable default if they are not set, which is an attempt to avoid idle or slow connections using all connections.
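The constraints and timeout defaults described above could be expressed roughly like this. The sketch is an assumption about the shape of the logic, not the package's code: checkLimits, applyTimeoutDefaults, and exampleServer are hypothetical helpers, and the one-minute defaultTimeout is a made-up value because the actual defaults are not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// defaultTimeout is a hypothetical value chosen for this sketch.
const defaultTimeout = time.Minute

// checkLimits enforces the documented constraints on the two limits.
func checkLimits(requestLimit, connectionLimit int) error {
	if requestLimit <= 0 || connectionLimit <= 0 {
		return errors.New("requestLimit and connectionLimit must be > 0")
	}
	if connectionLimit < requestLimit {
		return errors.New("connectionLimit must be >= requestLimit")
	}
	return nil
}

// applyTimeoutDefaults fills in ReadHeaderTimeout and IdleTimeout only
// when the caller left them unset (zero).
func applyTimeoutDefaults(srv *http.Server) {
	if srv.ReadHeaderTimeout == 0 {
		srv.ReadHeaderTimeout = defaultTimeout
	}
	if srv.IdleTimeout == 0 {
		srv.IdleTimeout = defaultTimeout
	}
}

// exampleServer returns a server with only IdleTimeout set explicitly.
func exampleServer() *http.Server {
	return &http.Server{Addr: "localhost:8080", IdleTimeout: 5 * time.Second}
}

func main() {
	srv := exampleServer()
	requestLimit := 40
	connectionLimit := 2 * requestLimit // the suggested default ratio
	if err := checkLimits(requestLimit, connectionLimit); err != nil {
		fmt.Println("invalid limits:", err)
		return
	}
	applyTimeoutDefaults(srv)
	fmt.Println(srv.ReadHeaderTimeout, srv.IdleTimeout)
}
```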

func ListenAndServeTLS

func ListenAndServeTLS(
	srv *http.Server, certFile string, keyFile string, requestLimit int, connectionLimit int,
) error

ListenAndServeTLS listens for HTTP requests with a limited number of concurrent requests and connections. See the documentation for ListenAndServe for details.

Types

type Limiter

type Limiter interface {
	// Start begins a new operation. It returns a completion function that must be called when the
	// operation completes, or it returns ErrLimited if no more concurrent operations are allowed.
	// This should be called as:
	//
	// end, err := limiter.Start()
	// if err != nil {
	//     // Handle ErrLimited and return
	// }
	// defer end()
	Start() (func(), error)
}

Limiter limits the number of concurrent operations that can be processed.
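For intuition, a Limiter can be implemented as a counter guarded by a single sync.Mutex, which is the approach the README says this package uses. The sketch below is an illustration under that assumption rather than the package's source; mutexLimiter, doWork, and errLimited are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errLimited stands in for the package's ErrLimited.
var errLimited = errors.New("exceeded max concurrent operations limit")

// mutexLimiter is a hypothetical Limiter: a counter guarded by one mutex.
type mutexLimiter struct {
	mu      sync.Mutex
	limit   int
	current int
}

// Start reserves a slot, or returns errLimited if all slots are in use.
func (l *mutexLimiter) Start() (func(), error) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.current >= l.limit {
		return nil, errLimited
	}
	l.current++
	return l.end, nil
}

// end releases one slot.
func (l *mutexLimiter) end() {
	l.mu.Lock()
	l.current--
	l.mu.Unlock()
}

// doWork follows the calling pattern from the interface comment:
// return on error, otherwise defer the completion function.
func doWork(l *mutexLimiter) error {
	end, err := l.Start()
	if err != nil {
		return err
	}
	defer end()
	// ... the limited operation would run here ...
	return nil
}

func main() {
	l := &mutexLimiter{limit: 1}
	end, _ := l.Start()    // takes the only slot
	fmt.Println(doWork(l)) // rejected while the slot is held
	end()                  // frees the slot
	fmt.Println(doWork(l)) // succeeds now
}
```

Returning before the defer matters: when Start fails, the completion function is nil and must not be called.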

func New

func New(limit int) Limiter

New returns a Limiter that will only permit limit concurrent operations. It will panic if limit is < 0.

func NoLimit

func NoLimit() Limiter

NoLimit returns a Limiter that permits an unlimited number of operations.

Directories

Path Synopsis
Package grpclimit limits the number of concurrent requests and concurrent connections to a gRPC server to ensure that it does not run out of memory during overload scenarios.
Package sleepymemory contains protobuf messages.
