Tutorial — Troubleshooting a killed process

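One possible reason for a process being killed is the OOM Killer (Out-of-Memory Killer).

The OOM Killer is a mechanism that the Linux kernel invokes when the system is critically low on memory. This can happen because the kernel overcommits memory: it grants processes more memory than is physically available, on the assumption that they will not all use it at once. When that assumption fails, the OOM Killer terminates one or more processes to free memory.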
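
The OOM Killer terminates its target with SIGKILL (signal 9). Shells and container runtimes conventionally report a process ended by a signal as 128 plus the signal number. The following is a minimal sketch of that arithmetic, assuming a POSIX system; the helper names `exit_status_for_signal` and `signal_for_exit_status` are hypothetical and chosen only for illustration.

```python
import signal


def exit_status_for_signal(sig: int) -> int:
    """Conventional exit status for a process terminated by signal `sig`."""
    return 128 + int(sig)


def signal_for_exit_status(status: int):
    """Return the signal implied by an exit status above 128, else None."""
    if 128 < status < 256:
        try:
            return signal.Signals(status - 128)
        except ValueError:
            # The number does not correspond to a signal on this platform.
            return None
    return None


print(exit_status_for_signal(signal.SIGKILL))  # 137
print(signal_for_exit_status(137).name)        # SIGKILL
```

An exit status of 137 only shows that the process received SIGKILL. The OOM Killer is one common sender, but a manual `kill -9` produces the same status, so the accompanying error message still needs to be checked.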

If the primary container process was killed by the OOM Killer, the container exit status is 137 (128 plus 9, the signal number of SIGKILL). To check the exit status of a process, do the following:

  1. In the AWS Management Console, go to Services > Elastic Container Service.
  2. In the navigation pane, select Clusters.
  3. Select the cluster of the environment to which the unavailable service belongs.
  4. On the page of the cluster, select the unavailable service.
  5. On the page of the service, check if the Running count is equal to the Desired count. If the numbers are equal, the service is running correctly.
  6. Switch to the Tasks tab.
  7. Check if the Last status is Running.

service-tasks

  8. If the task is not running, switch to the Events tab and check the errors.

ecs-service-events

  9. Switch to the Tasks tab.
  10. For the Desired task status, select Stopped.

stopped-service-tasks

  11. Select the latest stopped task.

Multiple stopped tasks

If there are multiple stopped tasks, to identify the latest one, open the page of every task and compare the Stopped at dates and times.

  12. In the Containers section, select the arrow before the container name.

  13. In the Details section, check the exit code and the errors.

task-exit-code
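
The console steps for finding the latest stopped task and reading its exit code can also be scripted. The following is a rough sketch, not a definitive implementation. It assumes that boto3 is installed and AWS credentials are configured, and that the cluster and service names are supplied by the caller. The helper names (`latest_stopped_task`, `container_exit_report`, `fetch_stopped_tasks`) and the sample data are hypothetical. The field names `stoppedAt`, `containers`, `exitCode`, and `reason` follow the ECS `DescribeTasks` response.

```python
from datetime import datetime, timezone

OOM_EXIT_CODE = 137  # 128 + 9 (SIGKILL)


def latest_stopped_task(tasks):
    """Return the task with the most recent 'stoppedAt', or None if there is none."""
    stopped = [t for t in tasks if t.get("stoppedAt") is not None]
    return max(stopped, key=lambda t: t["stoppedAt"], default=None)


def container_exit_report(task):
    """Summarize the exit code and reason of every container in one task."""
    report = []
    for container in task.get("containers", []):
        code = container.get("exitCode")
        report.append({
            "name": container.get("name"),
            "exit_code": code,
            "reason": container.get("reason"),
            # 137 means SIGKILL; the OOM Killer is a likely but not the only sender.
            "possible_oom_kill": code == OOM_EXIT_CODE,
        })
    return report


def fetch_stopped_tasks(cluster, service):
    """Fetch the stopped tasks of a service (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    ecs = boto3.client("ecs")
    arns = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )["taskArns"]
    if not arns:
        return []
    return ecs.describe_tasks(cluster=cluster, tasks=arns)["tasks"]


# Hypothetical sample data shaped like a DescribeTasks response.
sample_tasks = [
    {
        "taskArn": "arn:example:task/a",
        "stoppedAt": datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc),
        "containers": [{"name": "app", "exitCode": 0}],
    },
    {
        "taskArn": "arn:example:task/b",
        "stoppedAt": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
        "containers": [{"name": "app", "exitCode": 137}],
    },
]

latest = latest_stopped_task(sample_tasks)
print(latest["taskArn"])  # arn:example:task/b
print(container_exit_report(latest))
```

Against a real environment, `fetch_stopped_tasks("my-cluster", "my-service")` would replace `sample_tasks`; both names are placeholders. Comparing `stoppedAt` values in code replaces the manual comparison of Stopped at dates across task pages.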