GoToProduction

Using HTTPS

The GoDocker web component must be accessed via HTTPS for security. To do so, one needs to set up a web proxy acting as the HTTPS endpoint and proxying requests via HTTP to the web server.

The proxy should set the request header X-Forwarded-Proto to "https" so that redirections use the https scheme instead of http (the web server itself manages http only).

With an Apache server, this can be set with: RequestHeader set X-Forwarded-Proto "https"
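
To illustrate why the header matters, here is a minimal sketch of a generic WSGI middleware that reads X-Forwarded-Proto and uses it when building a redirect. It is not GoDocker's own code: the names forwarded_proto_middleware, demo_app and redirect_location, as well as the host godocker.example.org, are assumptions made for the example. Such a header should only be trusted when it is set by your own proxy.

```python
def forwarded_proto_middleware(app):
    """Wrap a WSGI app so that it honours X-Forwarded-Proto set by the proxy."""
    def wrapper(environ, start_response):
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "").strip().lower()
        if proto in ("http", "https"):
            # Redirects built from wsgi.url_scheme now use the external scheme.
            environ["wsgi.url_scheme"] = proto
        return app(environ, start_response)
    return wrapper


def demo_app(environ, start_response):
    """Hypothetical app that always redirects to /app/ on the same host."""
    location = "{}://{}/app/".format(
        environ["wsgi.url_scheme"], environ["HTTP_HOST"]
    )
    start_response("302 Found", [("Location", location)])
    return [b""]


def redirect_location(extra_headers):
    """Call the wrapped app once and return the Location header it sends."""
    environ = {"wsgi.url_scheme": "http", "HTTP_HOST": "godocker.example.org"}
    environ.update(extra_headers)
    captured = {}

    def start_response(status, response_headers):
        captured.update(dict(response_headers))

    forwarded_proto_middleware(demo_app)(environ, start_response)
    return captured["Location"]


if __name__ == "__main__":
    print(redirect_location({"HTTP_X_FORWARDED_PROTO": "https"}))
    print(redirect_location({}))
```

Without the header, the redirect keeps the http scheme, which is the behaviour the proxy setting above is meant to avoid.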

Scalability/High availability

GoDocker is scalable and supports high availability. Not all components are managed the same way, however.

Web servers

go-docker-web web servers can be scaled locally via gunicorn (number of workers) but it is also possible to launch multiple servers with load balancing. Web servers are stateless.

An example implementation of a dynamic load balancer for the GoDocker web servers can be found at https://bitbucket.org/osallou/go-docker-haproxy-consul

The web server can be executed with the environment variable GODOCKER_WEB_PREFIX (version >=1.2) to serve the web UI and API under the selected prefix (for example http://x.y.z:6543/my/prefix/app/#)

go-d-scheduler process

Only one scheduler can be active at a time. However, one can run 2 instances (or more, but 2 is enough) in active/stand-by mode. If the master scheduler fails, the other instance takes over as master after a timeout period.
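
The active/stand-by behaviour can be pictured with a small heartbeat sketch. This is only an illustration of the general idea and not GoDocker's actual mechanism: the SharedLock class, the try_become_master function and the timeout value are all hypothetical, and a real deployment would rely on a shared store rather than an in-memory object.

```python
class SharedLock:
    """In-memory stand-in for a shared store holding the master's heartbeat."""

    def __init__(self):
        self.owner = None      # name of the current master, if any
        self.last_beat = 0.0   # time of the master's last heartbeat


def try_become_master(lock, name, now, timeout):
    """Return True if `name` is (or becomes) master at time `now`.

    The current master refreshes its heartbeat. A stand-by instance takes
    over only when the heartbeat is older than `timeout` seconds.
    """
    expired = now - lock.last_beat > timeout
    if lock.owner is None or lock.owner == name or expired:
        lock.owner = name
        lock.last_beat = now
        return True
    return False


if __name__ == "__main__":
    lock = SharedLock()
    timeout = 10
    print(try_become_master(lock, "scheduler-a", 0, timeout))   # a starts as master
    print(try_become_master(lock, "scheduler-b", 5, timeout))   # b stays stand-by
    # scheduler-a stops sending heartbeats; b takes over once the timeout passes
    print(try_become_master(lock, "scheduler-b", 12, timeout))
```

The timeout is a trade-off: a short value gives faster failover, while a longer one avoids switching master on a brief pause.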

go-d-watcher process

This process can be scaled horizontally. One can add as many processes as needed to handle the load and dynamically add or stop processes (clean stop).

If the number of running tasks increases and jobs seem to remain in the running state too long after their real termination, the watcher processes need more resources to handle the job termination checks.

go-d-ftp

This process can be scaled horizontally. One can add as many processes as needed to handle the file uploads/downloads.

go-d-archive

This process can be scaled horizontally. One can add as many processes as needed to handle job archiving.

Cleanup

Containers are cleaned up by the software once the job is over. However, it may be necessary to clean images on the nodes at regular intervals if their number increases. docker-gc can be used on the nodes for this task: https://github.com/spotify/docker-gc

For each job, a job directory is created to hold the job outputs, etc. To avoid unbounded disk consumption, one can run the go-d-clean process from cron at regular intervals.

This process archives jobs, i.e. it changes their secondary status to archived and deletes the job directory. The job lifetime before archiving is set in the configuration file.
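
The lifetime rule can be sketched as a simple date comparison. This is not the go-d-clean implementation: the jobs_to_archive function, the job fields (id, status, ended) and the 30-day lifetime are hypothetical names and values chosen for the example.

```python
from datetime import datetime, timedelta


def jobs_to_archive(jobs, lifetime_days, now):
    """Return ids of finished jobs whose end date is older than the lifetime."""
    cutoff = now - timedelta(days=lifetime_days)
    return [
        job["id"]
        for job in jobs
        if job["status"] == "over" and job["ended"] <= cutoff
    ]


if __name__ == "__main__":
    now = datetime(2016, 3, 1)
    jobs = [
        {"id": 1, "status": "over", "ended": datetime(2016, 1, 10)},
        {"id": 2, "status": "over", "ended": datetime(2016, 2, 25)},
        {"id": 3, "status": "running", "ended": None},
    ]
    # Hypothetical lifetime of 30 days before archiving
    print(jobs_to_archive(jobs, 30, now))
```

Running jobs are skipped before their end date is examined, so only finished jobs past the cutoff are selected.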

Monitoring

Processes

In production, it is advised to set a status_policy in the configuration, using one of the available plugins (etcd, consul).

The policy registers processes with the specified plugin and helps ensure that processes are up and running (and how many of them).

Status can be queried via the go-d-status program or via the REST API/Web UI.

Usage

The GoDocker web server exposes a Prometheus endpoint with some statistics. Prometheus should also scrape the cAdvisor endpoints on the nodes to record container usage statistics (CPU/memory) for later queries.

Logging

Logging is important in production. The default logging can (and should) be modified in the go-d.ini file for each process.

Rotating files should be used to avoid filling the disk. It is also easy to send log notifications to Graylog or similar tools: just use the standard Python logging syntax to add loggers and handlers.
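
As a rough illustration of rotating files with the standard Python logging module, the sketch below builds a dictConfig-style configuration using RotatingFileHandler. The build_log_config helper, the logger name godocker, the size limits and the temporary log path are assumptions for the example; how this maps onto the logging section of go-d.ini should be checked against the shipped configuration file.

```python
import logging
import logging.config
import os
import tempfile


def build_log_config(log_path, max_bytes=10 * 1024 * 1024, backups=5):
    """Return a logging dictConfig that rotates the file at max_bytes."""
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "default": {
                "format": "%(asctime)s %(name)s %(levelname)s %(message)s"
            }
        },
        "handlers": {
            "rotating_file": {
                "class": "logging.handlers.RotatingFileHandler",
                "filename": log_path,
                "maxBytes": max_bytes,    # rotate once the file reaches this size
                "backupCount": backups,   # keep this many rotated files
                "formatter": "default",
            }
        },
        "loggers": {
            # "godocker" is an assumed logger name used for this example
            "godocker": {"handlers": ["rotating_file"], "level": "INFO"}
        },
    }


# Write to a temporary directory so the example runs anywhere
log_path = os.path.join(tempfile.mkdtemp(), "godocker.log")
logging.config.dictConfig(build_log_config(log_path))
logging.getLogger("godocker").info("scheduler started")
```

With these values, disk usage for this logger stays bounded at roughly (backups + 1) times max_bytes.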