README ¶
FireCamp Swarm Internal
By default, the Docker Swarm manager listens on ip:2377 with TLS enabled, and the swarm manager's TLS files are not exposed for external use.
One solution is to start the Docker daemon with our own CA certificate, so the FireCamp manageserver could talk to the Docker daemon on the Swarm manager nodes. However, this would not work for an existing swarm cluster that was created with the default TLS.
It looks better to run the FireCamp manageserver container on the Swarm manager nodes and talk to the Swarm manager via the unix socket. The customer can then easily add FireCamp to an existing or new swarm cluster. The FireCamp manageserver container is very light, so it is fine to run it on the swarm manager node.
A Docker swarm service can pass the task slot in the volume source name to the volume plugin, so the plugin directly knows which member the container is for. This works for a single availability zone cluster. However, an EBS volume is bound to one availability zone, and in a multi-zone cluster the Docker Swarm task slot is not aware of availability zones.
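For example, the task slot can be embedded in the mount source with a swarm service template, so the volume plugin sees a per-member volume name such as mymongo-1. A minimal sketch of the mechanism (the service name, image and mount target below are hypothetical; in FireCamp the manageserver builds the real service spec):
docker service create --name mymongo --replicas 3 --mount type=volume,source=mymongo-{{.Task.Slot}},destination=/data,volume-driver=cloudstax/firecamp-volume:0.9.x mongo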
Create a FireCamp Swarm cluster
Follow the installation guide to install the Swarm cluster using CloudFormation.
Install FireCamp on the existing Swarm cluster
If you already have a Swarm cluster and want to use FireCamp, follow the steps below.
- Create the FireCamp IAM profile and assign it to all Swarm nodes.
Use packaging/aws-cloudformation/firecamp-iamprofile.template to create the IAM profile, and attach it to each node's IAM role.
- Decide the cluster name you want to assign to the Swarm cluster.
FireCamp assigns a unique DNS name to every service member. For example, if the cluster name is c1 and the service mymongo has 3 members, FireCamp will assign mymongo-0.c1-firecamp.com, mymongo-1.c1-firecamp.com and mymongo-2.c1-firecamp.com to these 3 members.
The cluster name has to be unique among the swarm clusters in the same VPC. If 2 swarm clusters have the same name and create a service with the same name as well, the members of these 2 services will get the same DNS names, which will mess up the service membership. Different VPCs have different DNS servers (different hosted zones in AWS Route53), so swarm clusters in different VPCs may use the same name.
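For example, from a node inside the VPC (assuming the mymongo service above exists and dig is installed), a member's DNS name could be resolved with:
dig +short mymongo-0.c1-firecamp.com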
- Add the FireCamp tag to each node, tag key: firecamp-worker, value: clustername.
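For example, assuming the tag is applied as an EC2 instance tag and the cluster name is c1 (the instance ID below is hypothetical):
aws ec2 create-tags --resources i-0123456789abcdef0 --tags Key=firecamp-worker,Value=c1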
- Add the availability zone labels to the Docker engine on every node.
The availability zone labels are required only when a service will run on a subset of the availability zones in the cluster. If the service will span all availability zones, you can skip this step. For example, if your swarm cluster has 3 availability zones and you plan to create a stateful service with 3 or more replicas, such as ZooKeeper, all zones are used and no labels are needed.
On a 3 availability zone cluster, however, Redis may be deployed to only 1 or 2 availability zones. Deploying to 1 availability zone might be fine for applications that use Redis purely as a cache. Redis cluster mode supports 1 slave per master, so the customer may place all masters in one availability zone and all slaves in a second availability zone to tolerate an availability zone failure.
If the cluster includes only 1 or 2 availability zones and Redis is deployed to all of them, this step can also be skipped.
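To add the label on a node in us-east-1a, one option is a Docker engine label in /etc/docker/daemon.json. The label key below (az) is an assumption; check the FireCamp documentation for your release for the exact key it expects:
{
  "labels": ["az=us-east-1a"]
}
Then restart the Docker daemon, for example: sudo systemctl restart docker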
- Install the FireCamp plugins on every swarm worker node.
Create the FireCamp directories:
sudo mkdir -p /var/lib/firecamp
sudo mkdir -p /var/log/firecamp
Install the plugins for the release, such as release 0.9.x:
docker plugin install --grant-all-permissions cloudstax/firecamp-volume:0.9.x PLATFORM=swarm CLUSTER=c1
docker plugin install --grant-all-permissions cloudstax/firecamp-log:0.9.x CLUSTER=c1
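You can verify that both plugins are installed and enabled with:
docker plugin ls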
- Install FireCamp log plugin on every swarm manager node.
Create the FireCamp directories:
sudo mkdir -p /var/lib/firecamp
sudo mkdir -p /var/log/firecamp
Install the plugin for the release, such as release 0.9.x:
docker plugin install --grant-all-permissions cloudstax/firecamp-log:0.9.x CLUSTER=c1
- Create the FireCamp manageserver service on the swarm manager node.
The example command for release 0.9.x:
docker service create --name firecamp-manageserver --constraint node.role==manager --publish mode=host,target=27040,published=27040 --mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock --replicas 1 --log-driver=cloudstax/firecamp-log:0.9.x --log-opt ServiceUUID=firecamp --log-opt ServiceMember=manageserver -e CONTAINER_PLATFORM=swarm -e DB_TYPE=clouddb -e AVAILABILITY_ZONES=us-east-1a,us-east-1b,us-east-1c -e CLUSTER=c1 cloudstax/firecamp-manageserver:0.9.x
Please update the AVAILABILITY_ZONES, CLUSTER, the manageserver docker image tag and the firecamp log plugin tag according to your environment. Please do NOT change the other options.
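After the service is created, you can check that the manageserver task is running on a manager node with:
docker service ps firecamp-manageserver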
- Create the stateful service.
For example, to create a PostgreSQL service with 2 replicas, copy firecamp-service-cli to the manager node and simply run:
firecamp-service-cli -cluster=c1 -op=create-service -service-name=pg1 -service-type=postgresql -replicas=2 -volume-size=1
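Following the DNS naming scheme described above, the 2 members will then be reachable inside the VPC as pg1-0.c1-firecamp.com and pg1-1.c1-firecamp.com (on PostgreSQL's default port 5432).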
Documentation ¶
Index ¶
- type SwarmClient
- type SwarmInfo
- type SwarmSvc
- func (s *SwarmSvc) CreateService(ctx context.Context, opts *containersvc.CreateServiceOptions) error
- func (s *SwarmSvc) CreateServiceSpec(opts *containersvc.CreateServiceOptions) swarm.ServiceSpec
- func (s *SwarmSvc) CreateServiceVolume(ctx context.Context, service string, memberName string, volumeID string, ...) (existingVolumeID string, err error)
- func (s *SwarmSvc) CreateSwarmService(ctx context.Context, serviceSpec swarm.ServiceSpec, ...) error
- func (s *SwarmSvc) DeleteService(ctx context.Context, cluster string, service string) error
- func (s *SwarmSvc) DeleteServiceVolume(ctx context.Context, service string, memberName string, journal bool) error
- func (s *SwarmSvc) DeleteTask(ctx context.Context, cluster string, service string, taskType string) error
- func (s *SwarmSvc) GetContainerSvcType() string
- func (s *SwarmSvc) GetJoinToken(ctx context.Context) (managerToken string, workerToken string, err error)
- func (s *SwarmSvc) GetServiceStatus(ctx context.Context, cluster string, service string) (*common.ServiceStatus, error)
- func (s *SwarmSvc) GetServiceTask(ctx context.Context, cluster string, service string, ...) (serviceTaskID string, err error)
- func (s *SwarmSvc) GetTaskContainerInstance(ctx context.Context, cluster string, serviceTaskID string) (containerInstanceID string, err error)
- func (s *SwarmSvc) GetTaskStatus(ctx context.Context, cluster string, taskID string) (*common.TaskStatus, error)
- func (s *SwarmSvc) IsServiceExist(ctx context.Context, cluster string, service string) (bool, error)
- func (s *SwarmSvc) IsSwarmInitialized(ctx context.Context) (bool, error)
- func (s *SwarmSvc) ListActiveServiceTasks(ctx context.Context, cluster string, service string) (serviceTaskIDs map[string]bool, err error)
- func (s *SwarmSvc) ListSwarmManagerNodes(ctx context.Context) (goodManagers []string, downManagerNodes []swarm.Node, downManagers []string, ...)
- func (s *SwarmSvc) RemoveDownManagerNode(ctx context.Context, node swarm.Node) error
- func (s *SwarmSvc) RollingRestartService(ctx context.Context, cluster string, service string, ...) error
- func (s *SwarmSvc) RunTask(ctx context.Context, opts *containersvc.RunTaskOptions) (taskID string, err error)
- func (s *SwarmSvc) ScaleService(ctx context.Context, cluster string, service string, desiredCount int64) error
- func (s *SwarmSvc) StopService(ctx context.Context, cluster string, service string) error
- func (s *SwarmSvc) SwarmInit(ctx context.Context, addr string) error
- func (s *SwarmSvc) SwarmJoin(ctx context.Context, addr string, joinAddr string, token string) error
- func (s *SwarmSvc) UpdateService(ctx context.Context, opts *containersvc.UpdateServiceOptions) error
- func (s *SwarmSvc) WaitServiceRunning(ctx context.Context, cluster string, service string, replicas int64, ...) error
- func (s *SwarmSvc) WaitTaskComplete(ctx context.Context, cluster string, taskID string, maxWaitSeconds int64) error
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type SwarmClient ¶
type SwarmClient struct {
// contains filtered or unexported fields
}
func NewSwarmClient ¶
func NewSwarmClient() (*SwarmClient, error)
NewSwarmClient creates a new SwarmClient instance
type SwarmInfo ¶
type SwarmInfo struct {
// contains filtered or unexported fields
}
func NewSwarmInfo ¶
func (*SwarmInfo) GetContainerClusterID ¶
func (*SwarmInfo) GetLocalContainerInstanceID ¶
type SwarmSvc ¶
type SwarmSvc struct {
// contains filtered or unexported fields
}
SwarmSvc implements swarm service and task related functions.
TODO task framework on Swarm. Swarm doesn't support one-off task execution, so the SwarmSvc has to manage the task lifecycle itself. The Docker daemon on the swarm worker node would have to listen on the host port, so the Docker API could be accessed remotely. The SwarmSvc would periodically collect the metrics, select one node, and store it in the db; if the node is full, select another node to run the task. At v1, the task simply runs on the swarm manager node. This is not a big issue, as the task usually runs a simple job, such as setting up the MongoDB ReplicaSet.
func NewSwarmSvcOnManagerNode ¶ added in v0.9.5
NewSwarmSvcOnManagerNode creates a new SwarmSvc instance on the manager node
func NewSwarmSvcOnWorkerNode ¶ added in v0.9.5
NewSwarmSvcOnWorkerNode creates a new SwarmSvc instance on the worker node, for components such as the volume plugin.
func (*SwarmSvc) CreateService ¶
func (s *SwarmSvc) CreateService(ctx context.Context, opts *containersvc.CreateServiceOptions) error
CreateService creates a swarm service
func (*SwarmSvc) CreateServiceSpec ¶
func (s *SwarmSvc) CreateServiceSpec(opts *containersvc.CreateServiceOptions) swarm.ServiceSpec
CreateServiceSpec creates the swarm ServiceSpec.
func (*SwarmSvc) CreateServiceVolume ¶ added in v0.9.3
func (s *SwarmSvc) CreateServiceVolume(ctx context.Context, service string, memberName string, volumeID string, volumeSizeGB int64, journal bool) (existingVolumeID string, err error)
CreateServiceVolume is a no-op for swarm.
func (*SwarmSvc) CreateSwarmService ¶
func (s *SwarmSvc) CreateSwarmService(ctx context.Context, serviceSpec swarm.ServiceSpec, opts *containersvc.CreateServiceOptions) error
CreateSwarmService creates the swarm service.
func (*SwarmSvc) DeleteService ¶
DeleteService deletes a swarm service.
func (*SwarmSvc) DeleteServiceVolume ¶ added in v0.9.3
func (s *SwarmSvc) DeleteServiceVolume(ctx context.Context, service string, memberName string, journal bool) error
DeleteServiceVolume is a no-op for swarm.
func (*SwarmSvc) DeleteTask ¶
func (s *SwarmSvc) DeleteTask(ctx context.Context, cluster string, service string, taskType string) error
DeleteTask deletes the task container
func (*SwarmSvc) GetContainerSvcType ¶ added in v0.9.3
GetContainerSvcType gets the containersvc type.
func (*SwarmSvc) GetJoinToken ¶
func (s *SwarmSvc) GetJoinToken(ctx context.Context) (managerToken string, workerToken string, err error)
GetJoinToken gets the swarm manager and worker node join token.
func (*SwarmSvc) GetServiceStatus ¶
func (s *SwarmSvc) GetServiceStatus(ctx context.Context, cluster string, service string) (*common.ServiceStatus, error)
GetServiceStatus gets the service's status.
func (*SwarmSvc) GetServiceTask ¶
func (s *SwarmSvc) GetServiceTask(ctx context.Context, cluster string, service string, containerInstanceID string) (serviceTaskID string, err error)
GetServiceTask gets the task running on the containerInstanceID
func (*SwarmSvc) GetTaskContainerInstance ¶
func (s *SwarmSvc) GetTaskContainerInstance(ctx context.Context, cluster string, serviceTaskID string) (containerInstanceID string, err error)
GetTaskContainerInstance returns the ContainerInstanceID the task runs on
func (*SwarmSvc) GetTaskStatus ¶
func (s *SwarmSvc) GetTaskStatus(ctx context.Context, cluster string, taskID string) (*common.TaskStatus, error)
GetTaskStatus returns the task's status.
func (*SwarmSvc) IsServiceExist ¶
func (s *SwarmSvc) IsServiceExist(ctx context.Context, cluster string, service string) (bool, error)
IsServiceExist checks whether the service exists
func (*SwarmSvc) IsSwarmInitialized ¶
IsSwarmInitialized checks if the swarm cluster is initialized.
func (*SwarmSvc) ListActiveServiceTasks ¶
func (s *SwarmSvc) ListActiveServiceTasks(ctx context.Context, cluster string, service string) (serviceTaskIDs map[string]bool, err error)
ListActiveServiceTasks lists the active (running and pending) tasks of the service
func (*SwarmSvc) ListSwarmManagerNodes ¶ added in v0.9.4
func (s *SwarmSvc) ListSwarmManagerNodes(ctx context.Context) (goodManagers []string, downManagerNodes []swarm.Node, downManagers []string, err error)
ListSwarmManagerNodes returns the good and down managers
func (*SwarmSvc) RemoveDownManagerNode ¶ added in v0.9.4
RemoveDownManagerNode removes the down manager node.
func (*SwarmSvc) RollingRestartService ¶ added in v0.9.4
func (s *SwarmSvc) RollingRestartService(ctx context.Context, cluster string, service string, opts *containersvc.RollingRestartOptions) error
RollingRestartService restarts the service tasks one after the other.
func (*SwarmSvc) RunTask ¶
func (s *SwarmSvc) RunTask(ctx context.Context, opts *containersvc.RunTaskOptions) (taskID string, err error)
RunTask creates and runs the task once. It does 3 steps: 1) pull the image, 2) create the container, 3) start the container.
func (*SwarmSvc) ScaleService ¶ added in v0.9.2
func (s *SwarmSvc) ScaleService(ctx context.Context, cluster string, service string, desiredCount int64) error
ScaleService scales the service containers up/down to the desiredCount.
func (*SwarmSvc) StopService ¶
StopService stops all service containers
func (*SwarmSvc) UpdateService ¶ added in v0.9.5
func (s *SwarmSvc) UpdateService(ctx context.Context, opts *containersvc.UpdateServiceOptions) error
UpdateService updates the service