swarmsvc

package
v0.9.4
Published: Feb 28, 2018 License: Apache-2.0 Imports: 20 Imported by: 10

README

FireCamp Swarm Internal

By default, the Docker Swarm manager listens on ip:2377 with TLS enabled, and the Swarm manager's TLS files are not exposed for external use.

One solution is to start the Docker daemon with our own CA certificate, so that the FireCamp manageserver can talk to the Docker daemon on the Swarm manager nodes. However, this does not work on an existing Swarm cluster created with the default TLS.

A better approach is to run the FireCamp manageserver container on the Swarm manager nodes and talk to the Swarm manager via the unix socket. This lets the customer easily add FireCamp to an existing or new Swarm cluster. The FireCamp manageserver container is very light, so it is fine to run it on the Swarm manager node.
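As a rough illustration of talking to the local Docker daemon over the unix socket, the sketch below uses only the Go standard library and the Docker Engine API ping endpoint. It is not the package's own code (the package returns a Docker client from NewClient); the helper names newUnixSocketClient and pingDocker are made up for this example, and the socket path is the one mounted into the manageserver container in the install steps.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// newUnixSocketClient returns an HTTP client whose connections all go to the
// given unix socket, regardless of the host named in the request URL.
// (Hypothetical helper for this sketch.)
func newUnixSocketClient(socketPath string) *http.Client {
	return &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// pingDocker calls the Docker Engine API ping endpoint over the socket and
// returns the response body.
func pingDocker(socketPath string) (string, error) {
	// The host part of the URL is a placeholder; the dialer ignores it.
	resp, err := newUnixSocketClient(socketPath).Get("http://docker/_ping")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s: %s", resp.Status, body)
	}
	return string(body), nil
}

func main() {
	body, err := pingDocker("/var/run/docker.sock")
	if err != nil {
		fmt.Println("docker daemon not reachable:", err)
		return
	}
	fmt.Println("docker daemon replied:", body)
}
```

Because the socket is a local file, no TLS material is needed, which is the reason for running the manageserver on the manager node.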

A Docker Swarm service can pass the task slot in the volume source name to the volume plugin, so the volume plugin knows directly which member the container is for. This works for a single availability zone cluster. In a multi-zone cluster it falls short: each EBS volume is bound to one availability zone, but the Docker Swarm task slot is not aware of the availability zone.
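To make the task slot idea concrete, here is a minimal sketch. Docker expands the {{.Task.Slot}} placeholder in a service's mount source for each task, so a plugin can recover the slot from the volume name it receives. The "<service>_<slot>" layout and both function names are assumptions chosen for this example, not necessarily the naming FireCamp uses.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// volumeSourceTemplate builds the mount source given to the swarm service.
// Docker replaces {{.Task.Slot}} with the slot of each task. The
// "<service>_<slot>" layout is an assumption made for this sketch.
func volumeSourceTemplate(service string) string {
	return service + "_{{.Task.Slot}}"
}

// parseVolumeSource splits an expanded source name, such as "mymongo_2",
// back into the service name and the task slot.
func parseVolumeSource(name string) (service string, slot int, err error) {
	i := strings.LastIndex(name, "_")
	if i <= 0 || i == len(name)-1 {
		return "", 0, fmt.Errorf("volume source %q is not <service>_<slot>", name)
	}
	slot, err = strconv.Atoi(name[i+1:])
	// Swarm numbers the slots of a replicated service starting from 1.
	if err != nil || slot < 1 {
		return "", 0, fmt.Errorf("volume source %q has an invalid slot", name)
	}
	return name[:i], slot, nil
}

func main() {
	fmt.Println(volumeSourceTemplate("mymongo")) // mymongo_{{.Task.Slot}}

	service, slot, err := parseVolumeSource("mymongo_2")
	fmt.Println(service, slot, err) // mymongo 2 <nil>
}
```

Note that the slot alone says nothing about which availability zone the task landed in, which is the limitation described above.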

Create a FireCamp Swarm cluster

Follow the installation guide to install the Swarm cluster using CloudFormation.

Install FireCamp on the existing Swarm cluster

If you already have a Swarm cluster and want to use FireCamp, follow the steps below.

  1. Create the FireCamp IAM profile and assign it to all Swarm nodes.

Use packaging/aws-cloudformation/firecamp-iamprofile.template to create the IAM profile, and add it to the IAM of every node.

  2. Decide the cluster name you want to assign to the Swarm cluster.

FireCamp assigns a unique DNS name to every service member. For example, if the cluster name is c1 and the service mymongo has 3 members, FireCamp assigns mymongo-0.c1-firecamp.com, mymongo-1.c1-firecamp.com and mymongo-2.c1-firecamp.com to these 3 members.

The cluster name has to be unique among the Swarm clusters in the same VPC. If 2 Swarm clusters have the same name and each creates a service with the same name, the members of these 2 services will have the same DNS names, which breaks the service membership. Different VPCs have different DNS servers and different HostedZones in AWS Route53, so Swarm clusters in different VPCs can share the same name.

  3. Add the availability zone labels to the Docker engine on every node.

The availability zone labels are required to run a service on only some of the availability zones in a cluster. If the service will span all availability zones, you can skip this step. For example, your Swarm cluster has 3 availability zones and you plan to create a stateful service with 3 or more replicas, such as ZooKeeper.

By contrast, on a cluster with 3 availability zones, Redis may be deployed to only 1 or 2 availability zones, and then the labels are needed. Deploying to 1 availability zone might be fine for applications that use Redis only as a cache. Redis cluster mode supports 1 slave for 1 master. The customer may deploy all masters in 1 availability zone and all slaves in a second availability zone to tolerate an availability zone failure.

If the cluster includes only 1 or 2 availability zones and Redis is deployed to all of them, you can also skip this step.

  4. Install the FireCamp plugins on every Swarm worker node.

Create the FireCamp directories: sudo mkdir -p /var/lib/firecamp and sudo mkdir -p /var/log/firecamp.

Install the plugins, for example at release 0.9.3: docker plugin install --grant-all-permissions cloudstax/firecamp-volume:0.9.3 PLATFORM=swarm CLUSTER=c1 and docker plugin install --grant-all-permissions cloudstax/firecamp-log:0.9.3 CLUSTER=c1.

  5. Install the FireCamp log plugin on every Swarm manager node.

Create the FireCamp directories: sudo mkdir -p /var/lib/firecamp and sudo mkdir -p /var/log/firecamp.

Install the plugin, for example at release 0.9.3: docker plugin install --grant-all-permissions cloudstax/firecamp-log:0.9.3 CLUSTER=c1.

  6. Create the FireCamp manageserver service on the Swarm manager node.

The example command for release 0.9.3: docker service create --name firecamp-manageserver --constraint node.role==manager --publish mode=host,target=27040,published=27040 --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock --replicas 1 --log-driver=cloudstax/firecamp-log:0.9.3 --log-opt ServiceUUID=firecamp --log-opt ServiceMember=manageserver -e CONTAINER_PLATFORM=swarm -e DB_TYPE=clouddb -e AVAILABILITY_ZONES=us-east-1a,us-east-1b,us-east-1c -e CLUSTER=c1 cloudstax/firecamp-manageserver:0.9.3

Update AVAILABILITY_ZONES, CLUSTER, the manageserver docker image tag and the firecamp log plugin tag according to your environment. Do NOT change the other options.

  7. Create the stateful service.

For example, to create a PostgreSQL service with 2 replicas, copy firecamp-service-cli to the manager node and run: firecamp-service-cli -cluster=c1 -op=create-service -service-name=pg1 -service-type=postgresql -replicas=2 -volume-size=1
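The per-member DNS names described under the cluster name step follow a fixed pattern, which also shows why two clusters with the same name in one VPC would collide. A minimal sketch of that pattern, with function names invented for this example:

```go
package main

import "fmt"

// memberDNSName returns the DNS name of one service member, following the
// <service>-<index>.<cluster>-firecamp.com pattern from the example above.
func memberDNSName(service string, index int, cluster string) string {
	return fmt.Sprintf("%s-%d.%s-firecamp.com", service, index, cluster)
}

// memberDNSNames returns the DNS names of all members of a service.
func memberDNSNames(service string, replicas int, cluster string) []string {
	names := make([]string, 0, replicas)
	for i := 0; i < replicas; i++ {
		names = append(names, memberDNSName(service, i, cluster))
	}
	return names
}

func main() {
	for _, name := range memberDNSNames("mymongo", 3, "c1") {
		fmt.Println(name)
	}
	// Output:
	// mymongo-0.c1-firecamp.com
	// mymongo-1.c1-firecamp.com
	// mymongo-2.c1-firecamp.com
}
```

The name depends only on the service name, the member index and the cluster name, so the cluster name is the only part that can keep two same-named services apart.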

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type SwarmClient

type SwarmClient struct {
	// contains filtered or unexported fields
}

func NewSwarmClient

func NewSwarmClient() (*SwarmClient, error)

NewSwarmClient creates a new SwarmClient instance

func (*SwarmClient) NewClient

func (s *SwarmClient) NewClient() (*client.Client, error)

NewClient returns a new docker swarm client. This function is only used by swarmservice, which is used by manageserver. The manageserver runs only on the Swarm manager nodes.

type SwarmInfo

type SwarmInfo struct {
	// contains filtered or unexported fields
}

func NewSwarmInfo

func NewSwarmInfo(clusterName string) (*SwarmInfo, error)

func (*SwarmInfo) GetContainerClusterID

func (s *SwarmInfo) GetContainerClusterID() string

func (*SwarmInfo) GetLocalContainerInstanceID

func (s *SwarmInfo) GetLocalContainerInstanceID() string

type SwarmSvc

type SwarmSvc struct {
	// contains filtered or unexported fields
}

SwarmSvc implements swarm service and task related functions.

TODO: task framework on Swarm. Swarm doesn't support task execution, so SwarmSvc will have to manage the task lifecycle. The Docker daemon on the Swarm worker node will have to listen on a host port, so that the Docker API can be accessed remotely. SwarmSvc will periodically collect the metrics, select one node, and store it in the controldb. If the node is full, it will select another node to run the task. At v1, the task simply runs on the Swarm manager node. This is not a big issue, as the task usually runs a simple job, such as setting up the MongoDB ReplicaSet.

func NewSwarmSvc

func NewSwarmSvc(azs []string) (*SwarmSvc, error)

NewSwarmSvc creates a new SwarmSvc instance

func NewSwarmSvcForVolumePlugin

func NewSwarmSvcForVolumePlugin(region string, cluster string) (*SwarmSvc, error)

NewSwarmSvcForVolumePlugin creates a new SwarmSvc instance for the volume plugin.

func (*SwarmSvc) CreateService

func (s *SwarmSvc) CreateService(ctx context.Context, opts *containersvc.CreateServiceOptions) error

CreateService creates a swarm service

func (*SwarmSvc) CreateServiceSpec

func (s *SwarmSvc) CreateServiceSpec(opts *containersvc.CreateServiceOptions) swarm.ServiceSpec

CreateServiceSpec creates the swarm ServiceSpec.

func (*SwarmSvc) CreateServiceVolume added in v0.9.3

func (s *SwarmSvc) CreateServiceVolume(ctx context.Context, service string, memberIndex int64, volumeID string, volumeSizeGB int64, journal bool) (existingVolumeID string, err error)

CreateServiceVolume is a no-op for swarm.

func (*SwarmSvc) CreateSwarmService

func (s *SwarmSvc) CreateSwarmService(ctx context.Context, serviceSpec swarm.ServiceSpec, opts *containersvc.CreateServiceOptions) error

CreateSwarmService creates the swarm service.

func (*SwarmSvc) DeleteService

func (s *SwarmSvc) DeleteService(ctx context.Context, cluster string, service string) error

DeleteService deletes a swarm service

func (*SwarmSvc) DeleteServiceVolume added in v0.9.3

func (s *SwarmSvc) DeleteServiceVolume(ctx context.Context, service string, memberIndex int64, journal bool) error

DeleteServiceVolume is a no-op for swarm.

func (*SwarmSvc) DeleteTask

func (s *SwarmSvc) DeleteTask(ctx context.Context, cluster string, service string, taskType string) error

DeleteTask deletes the task container

func (*SwarmSvc) GetContainerSvcType added in v0.9.3

func (s *SwarmSvc) GetContainerSvcType() string

GetContainerSvcType gets the containersvc type.

func (*SwarmSvc) GetJoinToken

func (s *SwarmSvc) GetJoinToken(ctx context.Context) (managerToken string, workerToken string, err error)

GetJoinToken gets the swarm manager and worker node join token.

func (*SwarmSvc) GetServiceStatus

func (s *SwarmSvc) GetServiceStatus(ctx context.Context, cluster string, service string) (*common.ServiceStatus, error)

GetServiceStatus gets the service's status.

func (*SwarmSvc) GetServiceTask

func (s *SwarmSvc) GetServiceTask(ctx context.Context, cluster string, service string, containerInstanceID string) (serviceTaskID string, err error)

GetServiceTask gets the task running on the containerInstanceID

func (*SwarmSvc) GetTaskContainerInstance

func (s *SwarmSvc) GetTaskContainerInstance(ctx context.Context, cluster string,
	serviceTaskID string) (containerInstanceID string, err error)

GetTaskContainerInstance returns the ContainerInstanceID the task runs on

func (*SwarmSvc) GetTaskStatus

func (s *SwarmSvc) GetTaskStatus(ctx context.Context, cluster string, taskID string) (*common.TaskStatus, error)

GetTaskStatus returns the task's status.

func (*SwarmSvc) IsServiceExist

func (s *SwarmSvc) IsServiceExist(ctx context.Context, cluster string, service string) (bool, error)

IsServiceExist checks whether the service exists

func (*SwarmSvc) IsSwarmInitialized

func (s *SwarmSvc) IsSwarmInitialized(ctx context.Context) (bool, error)

IsSwarmInitialized checks if the swarm cluster is initialized.

func (*SwarmSvc) ListActiveServiceTasks

func (s *SwarmSvc) ListActiveServiceTasks(ctx context.Context, cluster string, service string) (serviceTaskIDs map[string]bool, err error)

ListActiveServiceTasks lists the active (running and pending) tasks of the service

func (*SwarmSvc) ListSwarmManagerNodes added in v0.9.4

func (s *SwarmSvc) ListSwarmManagerNodes(ctx context.Context) (goodManagers []string, downManagerNodes []swarm.Node, downManagers []string, err error)

ListSwarmManagerNodes returns the good and down managers

func (*SwarmSvc) RemoveDownManagerNode added in v0.9.4

func (s *SwarmSvc) RemoveDownManagerNode(ctx context.Context, node swarm.Node) error

RemoveDownManagerNode removes the down manager node.

func (*SwarmSvc) RollingRestartService added in v0.9.4

func (s *SwarmSvc) RollingRestartService(ctx context.Context, cluster string, service string, opts *containersvc.RollingRestartOptions) error

RollingRestartService restarts the service tasks one after the other.

func (*SwarmSvc) RunTask

func (s *SwarmSvc) RunTask(ctx context.Context, opts *containersvc.RunTaskOptions) (taskID string, err error)

RunTask creates and runs the task once. It does 3 steps: 1) pull the image, 2) create the container, 3) start the container.
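The three steps can be sketched as follows. This is not the package's implementation: containerEngine is a hypothetical interface standing in for the Docker client, and fakeEngine, runTaskOnce and demoRunTask exist only so the ordering and the stop-at-first-error behaviour can be shown without a running daemon.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// containerEngine is a hypothetical interface for this sketch; it is not part
// of the swarmsvc package or the Docker client library.
type containerEngine interface {
	PullImage(ctx context.Context, image string) error
	CreateContainer(ctx context.Context, name, image string) (id string, err error)
	StartContainer(ctx context.Context, id string) error
}

// runTaskOnce performs the three steps in order and stops at the first failure.
func runTaskOnce(ctx context.Context, eng containerEngine, name, image string) (string, error) {
	if err := eng.PullImage(ctx, image); err != nil {
		return "", fmt.Errorf("pull image %s: %w", image, err)
	}
	id, err := eng.CreateContainer(ctx, name, image)
	if err != nil {
		return "", fmt.Errorf("create container %s: %w", name, err)
	}
	if err := eng.StartContainer(ctx, id); err != nil {
		return "", fmt.Errorf("start container %s: %w", id, err)
	}
	return id, nil
}

// fakeEngine records the calls it receives so the ordering can be inspected.
type fakeEngine struct {
	calls    []string
	failPull bool
}

func (f *fakeEngine) PullImage(_ context.Context, _ string) error {
	f.calls = append(f.calls, "pull")
	if f.failPull {
		return errors.New("registry unreachable")
	}
	return nil
}

func (f *fakeEngine) CreateContainer(_ context.Context, name, _ string) (string, error) {
	f.calls = append(f.calls, "create")
	return name + "-id", nil
}

func (f *fakeEngine) StartContainer(_ context.Context, _ string) error {
	f.calls = append(f.calls, "start")
	return nil
}

// demoRunTask runs the sketch against the fake engine. The task name and
// image are placeholders.
func demoRunTask(failPull bool) (string, []string, error) {
	eng := &fakeEngine{failPull: failPull}
	id, err := runTaskOnce(context.Background(), eng, "init-task", "example/init:latest")
	return id, eng.calls, err
}

func main() {
	id, calls, err := demoRunTask(false)
	fmt.Println(id, calls, err) // init-task-id [pull create start] <nil>
}
```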

func (*SwarmSvc) ScaleService added in v0.9.2

func (s *SwarmSvc) ScaleService(ctx context.Context, cluster string, service string, desiredCount int64) error

ScaleService scales the service containers up/down to the desiredCount.

func (*SwarmSvc) StopService

func (s *SwarmSvc) StopService(ctx context.Context, cluster string, service string) error

StopService stops all service containers

func (*SwarmSvc) SwarmInit

func (s *SwarmSvc) SwarmInit(ctx context.Context, addr string) error

SwarmInit initializes the swarm cluster.

func (*SwarmSvc) SwarmJoin

func (s *SwarmSvc) SwarmJoin(ctx context.Context, addr string, joinAddr string, token string) error

SwarmJoin joins the current node to the remote manager.

func (*SwarmSvc) WaitServiceRunning

func (s *SwarmSvc) WaitServiceRunning(ctx context.Context, cluster string, service string, replicas int64, maxWaitSeconds int64) error

WaitServiceRunning waits until all tasks are running or the wait times out.
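A wait of this kind is typically a polling loop bounded by a deadline. The sketch below shows one way to write it with the standard library; it is an assumption about the general shape, not the package's code. The countRunning callback, the polling interval, errWaitTimeout and demoWait are all invented for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errWaitTimeout is returned when the deadline passes first. (Hypothetical
// error value for this sketch.)
var errWaitTimeout = errors.New("timed out before all tasks were running")

// waitAllRunning polls countRunning until it reports at least replicas
// running tasks, or until maxWaitSeconds have elapsed.
func waitAllRunning(ctx context.Context, replicas, maxWaitSeconds int64,
	interval time.Duration, countRunning func(context.Context) (int64, error)) error {
	ctx, cancel := context.WithTimeout(ctx, time.Duration(maxWaitSeconds)*time.Second)
	defer cancel()
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		running, err := countRunning(ctx)
		if err != nil {
			return err
		}
		if running >= replicas {
			return nil
		}
		select {
		case <-ctx.Done():
			if errors.Is(ctx.Err(), context.DeadlineExceeded) {
				return errWaitTimeout
			}
			return ctx.Err() // the caller cancelled the wait
		case <-ticker.C:
			// poll again
		}
	}
}

// demoWait simulates a service whose tasks all become running on the
// readyAfterPolls-th poll.
func demoWait(readyAfterPolls int, maxWaitSeconds int64) error {
	const replicas = 3
	polls := 0
	count := func(context.Context) (int64, error) {
		polls++
		if polls >= readyAfterPolls {
			return replicas, nil
		}
		return 0, nil
	}
	return waitAllRunning(context.Background(), replicas, maxWaitSeconds, 10*time.Millisecond, count)
}

func main() {
	fmt.Println(demoWait(3, 5))     // <nil>
	fmt.Println(demoWait(1<<30, 1)) // timed out before all tasks were running
}
```

Checking before the first sleep means a service that is already running returns immediately.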

func (*SwarmSvc) WaitTaskComplete

func (s *SwarmSvc) WaitTaskComplete(ctx context.Context, cluster string, taskID string, maxWaitSeconds int64) error

WaitTaskComplete waits till the task container completes
