GoToProduction
Using HTTPS
For security, the GoDocker web component must be accessed via HTTPS. To do so, set up a web proxy that acts as the HTTPS endpoint and forwards requests over HTTP to the web server.
The proxy should set the request header X-Forwarded-Proto to "https" so that redirections issued by the web server use the https scheme instead of http (the web server itself only handles http).
With Apache, the header can be set with the mod_headers directive: RequestHeader set X-Forwarded-Proto "https"
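To illustrate why the header matters, here is a minimal sketch of how a WSGI application behind a proxy can pick up X-Forwarded-Proto when it builds redirect URLs. This is not GoDocker's actual code: ForwardedProtoMiddleware, demo_app, redirect_location and the host name godocker.example.org are hypothetical names used for illustration only.

```python
class ForwardedProtoMiddleware:
    """Hypothetical WSGI middleware that trusts X-Forwarded-Proto from the proxy."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the X-Forwarded-Proto header as HTTP_X_FORWARDED_PROTO.
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "").strip().lower()
        if proto in ("http", "https"):
            # Override the scheme seen by the application.
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Toy application that redirects to an absolute URL."""
    location = "%s://%s/app/" % (environ["wsgi.url_scheme"], environ["HTTP_HOST"])
    start_response("302 Found", [("Location", location)])
    return [b""]


def redirect_location(app, environ):
    """Call a WSGI app once and return the Location header it sets."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    list(app(environ, start_response))
    return captured.get("Location")
```

Without the header, the toy application redirects to an http URL because it only sees the plain HTTP connection from the proxy. With the header set to "https", the redirect keeps the client on HTTPS. Such a header should only be trusted when the web server is reachable exclusively through the proxy.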
Scalability/High availability
GoDocker is scalable and supports high availability. However, not all components are managed the same way.
Web servers
go-docker-web web servers can be scaled locally via gunicorn (number of workers), but it is also possible to launch multiple servers behind a load balancer. Web servers are stateless.
An example implementation of a dynamic load balancer for the GoDocker web server can be found at https://bitbucket.org/osallou/go-docker-haproxy-consul
The web server can be started with the environment variable GODOCKER_WEB_PREFIX (version >=1.2) to serve the web UI and API under the selected prefix (for example http://x.y.z:6543/my/prefix/app/#)
go-d-scheduler process
Only one scheduler can be active at a time. However, one can run 2 instances (or more, but 2 is enough) in active/stand-by mode. If the master scheduler fails, the other instance takes over as master after a timeout period.
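The active/stand-by behaviour can be pictured with a lease that the master must renew before a timeout expires. The sketch below is only an illustration of that general technique, not GoDocker's implementation: LeaseStore is an in-memory stand-in for whatever shared store the schedulers use, and the class names and the 30 second timeout are assumptions.

```python
class LeaseStore:
    """In-memory stand-in for a shared store (hypothetical, illustration only)."""

    def __init__(self):
        self.holder = None
        self.expires_at = 0.0

    def acquire_or_renew(self, name, now, timeout):
        # The lease can be taken if nobody holds it, if the caller already
        # holds it (renewal), or if the holder stopped renewing in time.
        if self.holder is None or self.holder == name or now >= self.expires_at:
            self.holder = name
            self.expires_at = now + timeout
            return True
        return False


class Scheduler:
    """Hypothetical scheduler instance competing for the master role."""

    def __init__(self, name, store, timeout=30.0):
        self.name = name
        self.store = store
        self.timeout = timeout
        self.is_master = False

    def tick(self, now):
        """Called periodically; returns True if this instance may schedule."""
        self.is_master = self.store.acquire_or_renew(self.name, now, self.timeout)
        return self.is_master
```

With two instances sharing one store, the first to call tick becomes master and stays master as long as it keeps calling tick. If it stops, the stand-by instance gets the lease on its first tick after the timeout has elapsed.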
go-d-watcher process
This process can be scaled horizontally. One can add as many processes as needed to handle the load, and processes can be added or stopped dynamically (clean stop).
If the number of running tasks increases and jobs appear to remain in the running state long after they have actually terminated, the watcher processes need more resources to handle the job termination checks.
go-d-ftp
This process can be scaled horizontally. One can add as many processes as needed to handle file uploads and downloads.
go-d-archive
This process can be scaled horizontally. One can add as many processes as needed to handle job archiving.
Cleanup
Containers are cleaned up by the software once a job is over. However, it may be necessary to clean images on the nodes at regular intervals if their number grows. docker-gc can be used on the nodes for this task: https://github.com/spotify/docker-gc
For each job, a job directory is created to hold the job outputs and related files. To avoid unbounded disk consumption, one can run the go-d-clean process from cron at regular intervals.
This process archives jobs, i.e. it changes their secondary status to archived and deletes the job directory. The job lifetime before archiving is set in the configuration file.
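The selection step of such a cleanup can be sketched as follows. This is an assumption-laden illustration, not the go-d-clean code: the job records, the id and ended_at field names, and the jobs_to_archive function are hypothetical, and the lifetime is assumed here to be expressed in days.

```python
from datetime import datetime, timedelta


def jobs_to_archive(jobs, lifetime_days, now):
    """Return ids of finished jobs whose end date is older than the lifetime.

    jobs is a list of dicts with hypothetical keys "id" and "ended_at"
    (a datetime, or None while the job is still running).
    """
    limit = now - timedelta(days=lifetime_days)
    return [
        job["id"]
        for job in jobs
        # Running jobs have no end date and are never selected.
        if job["ended_at"] is not None and job["ended_at"] <= limit
    ]
```

Each selected job would then have its secondary status set to archived and its job directory deleted.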
Monitoring
Processes
In production, it is advised to set a status_policy in the configuration, using one of the available plugins (etcd, consul).
The policy registers processes with the specified plugin, which helps to check that processes are up and running (and how many of them).
Status can be queried via the go-d-status program or via the REST API/Web UI.
Usage
GoDocker exposes a Prometheus endpoint via the web server that provides some statistics. Prometheus should also scrape the cAdvisor endpoints on the nodes to record container resource usage (CPU/memory) for later queries.
Logging
Logging is important in production. The default logging can (and should) be modified in the go-d.ini file for the processes.
Rotating files should be used to avoid filling the disk, and it is easy to send log notifications to Graylog or similar systems. Just use the standard Python logging syntax to declare additional loggers.
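As an illustration of the standard Python logging syntax, here is a minimal sketch of a rotating file setup expressed as a logging.config.dictConfig dictionary. The logger name godocker, the file name godocker.log, the size limit and the backup count are assumptions chosen for the example; the exact layout expected in go-d.ini should be checked against the shipped default configuration.

```python
import logging
import logging.config

LOG_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "generic": {
            "format": "%(asctime)s %(levelname)-5.5s [%(name)s] %(message)s",
        },
    },
    "handlers": {
        "rotating_file": {
            "class": "logging.handlers.RotatingFileHandler",
            "formatter": "generic",
            "filename": "godocker.log",    # hypothetical path
            "maxBytes": 10 * 1024 * 1024,  # rotate after about 10 MB
            "backupCount": 5,              # keep 5 rotated files
            "delay": True,                 # open the file on first write only
        },
    },
    "loggers": {
        # Hypothetical logger name used for this example.
        "godocker": {"handlers": ["rotating_file"], "level": "INFO"},
    },
}


def configure_logging(config=LOG_CONFIG):
    """Apply the logging configuration and return the example logger."""
    logging.config.dictConfig(config)
    return logging.getLogger("godocker")
```

With these values, disk usage for this logger stays bounded at roughly 60 MB (the current file plus 5 backups). A third-party Graylog handler could be declared as another entry under handlers and added to the logger's handler list.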