Mesos

 

At scheduler startup, if enabled in the configuration (mesos/reconcile), the framework tries to reconcile GoDocker running jobs with Mesos tasks to synchronize task status.

If the framework has been deleted in the Mesos master, it is possible to force the scheduler to use a new framework ID.

...

Code Block
set god:mesos:over:TASK_ID 7  # TASK_ID = identifier of the job, 7: mark as failed, 2: mark as OK
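
When several jobs need the same override, the key can be written from a script instead of by hand. The following is only a minimal sketch, assuming the redis-py client and assuming the key follows the `god:mesos:over:<job id>` pattern of the command above. The function and constant names are hypothetical and are not part of GoDocker.

```python
# Status codes taken from the comment on the command above:
# 7 marks the job as failed, 2 marks it as OK.
STATUS_FAILED = 7
STATUS_OK = 2


def override_key(task_id):
    """Build the Redis key that overrides the status of one job."""
    return "god:mesos:over:%s" % task_id


def override_task_status(redis_client, task_id, ok):
    """Write the override key for task_id and return the value written.

    redis_client is any object exposing set(name, value),
    for example a redis-py client (assumption).
    """
    value = STATUS_OK if ok else STATUS_FAILED
    redis_client.set(override_key(task_id), value)
    return value
```

With redis-py installed, a call could look like `override_task_status(redis.StrictRedis(host="localhost", port=6379), 42, ok=False)`, where the host, port, and job identifier are placeholders to adapt to your installation.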

 

If Mesos and GoDocker are completely out of sync for any reason, and there are too many tasks to handle with the above trick, proceed as follows.

Switch GoDocker to maintenance and wait for any running job to complete. Once all jobs are over on the Mesos side, if some jobs are still marked as running in GoDocker, you can delete those jobs.

Stop the scheduler and watcher processes.

In MongoDB:

Code Block
db.jobs.remove({'status.primary': 'running'})  // Deletes all jobs in running status

Then in Redis:

Code Block
del god:jobs:running  # Clear the running jobs queue used by watchers/executors
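
The MongoDB and Redis cleanup commands above can also be run together from a script. This is a hedged sketch, assuming the pymongo and redis-py clients. The function name is hypothetical, and the caller must supply the `jobs` collection and the Redis connection, since the database name and hosts are not given here. `delete_many` is the pymongo counterpart of the shell `remove` call.

```python
def purge_running_jobs(jobs_collection, redis_client):
    """Delete jobs in running status, then clear the running jobs queue.

    jobs_collection: the GoDocker `jobs` collection (a pymongo Collection).
    redis_client: a redis-py client connected to the GoDocker Redis server.
    Returns the number of job documents deleted from MongoDB.
    """
    # Same filter as the MongoDB shell command above.
    result = jobs_collection.delete_many({"status.primary": "running"})
    # Same key as the Redis command above.
    redis_client.delete("god:jobs:running")
    return result.deleted_count
```

As with the manual commands, run this only after the scheduler and watcher processes have been stopped.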

 

If the Mesos executor fails to kill a job, go to the Mesos slave running it and stop the Docker container:

Code Block
docker stop XYZ