How to diagnose ECS Fargate task failing to start?

I’m trying to run a Docker image (built from my Dockerfile) on AWS using their ECS service. I can run the image locally just fine, but it’s failing on the Fargate launch type. I’ve uploaded my Docker image to ECR, and I’ve created a cluster/service/task from it.

However, my cluster’s task status simply reads “DEPROVISIONING (Task failed to start)”, but it provides no logs or details of the output of my running image, so I have no idea what’s wrong. How do I find more information and diagnose why ECS isn’t able to run my image?

Enquirer: Cerin


Solution #1:

Go to Clusters > Tasks > Details > Containers.

You should see an error message inside the red rectangle in the “error message” screenshot.

Task detail:

[Screenshot: task detail]

Error message:

[Screenshot: error message]
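If you prefer the command line, the same stopped reason is returned by aws ecs describe-tasks; a minimal sketch, where my-cluster and <task-id> are placeholders for your own values (note that stopped tasks only stay queryable for a short while, roughly an hour):

# Substitute your cluster name and the failed task's ID or ARN.
aws ecs describe-tasks \
    --cluster my-cluster \
    --tasks <task-id> \
    --query 'tasks[0].stoppedReason' \
    --output text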

Respondent: Yasu

Solution #2:

I may be late to the party, but you can check each container’s status reason instead of the task’s.

Go to the failed task -> Details -> Containers (at the bottom) and open the container. Right under the details you’ll see a Status reason.

[Screenshot: opening the container details]

[Screenshot: the status reason for the failure]

Note: if your task runs more than one container, check the ‘Status reason’ of each container as in the screenshot above, as it can differ between them.
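The same per-container detail is exposed by the CLI; a sketch, with my-cluster and <task-id> again as placeholders:

# List each container's name, exit code, and failure reason.
aws ecs describe-tasks \
    --cluster my-cluster \
    --tasks <task-id> \
    --query 'tasks[0].containers[*].{name: name, exitCode: exitCode, reason: reason}' \
    --output table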

Respondent: Radu Diță

Solution #3:

You can get some information about the task failure under the ‘Events’ tab of your service’s dashboard. Though the messages there aren’t very descriptive, they can give you a rough idea of where exactly things are going wrong.
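The same event list can be pulled from the CLI; a sketch, with my-cluster and my-service as placeholders:

# Show the messages of the five most recent service events.
aws ecs describe-services \
    --cluster my-cluster \
    --services my-service \
    --query 'services[0].events[:5].message' \
    --output text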


Respondent: Abhinav Khare

Solution #4:

As Abhinav says, the message isn’t very descriptive (and the CLI command aws ecs describe-tasks doesn’t add anything more). On the EC2 launch type, the remaining option is to log into the host EC2 instance and read the logs there, or to send those logs to CloudWatch: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/using_cloudwatch_logs.html#cwlogs_user_data
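If the task definition is configured with the awslogs log driver, you can tail the log group directly (AWS CLI v2; the group name /ecs/my-task is a placeholder):

# Follow the container's CloudWatch log group in real time.
aws logs tail /ecs/my-task --follow --since 1h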

The most likely cause (in ECS) is that the cluster doesn’t have enough resources to launch the new task. You can sometimes work out the cause from the Metrics tab, or, since mid-2019 (depending on your region, I guess), you can enable “CloudWatch Container Insights” from ECS Account Settings to get more detailed information about memory and CPU reservations.
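Container Insights can also be switched on from the CLI; note that the account-level setting only applies to clusters created after it is set, so existing clusters need the per-cluster command (my-cluster is a placeholder):

# Enable Container Insights by default for new clusters in this account/region.
aws ecs put-account-setting --name containerInsights --value enabled

# Or enable it on an existing cluster.
aws ecs update-cluster-settings --cluster my-cluster \
    --settings name=containerInsights,value=enabled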

Respondent: andrew lorien

Solution #5:

None of those methods worked for me.
What worked was marking just one of the services as essential (only the one you are sure is going to work), then looking at the CloudWatch logs, and eventually even the ECS logs inside the EC2 instance.

# ecs-params.yml

version: 1
task_definition:
  services:
    myservice1:
      essential: true
    myservice2:
      essential: false
    myservice3:
      essential: false
    myservice4:
      essential: false
    myservice5:
      essential: false
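For reference, an ecs-params.yml like this is consumed by ecs-cli compose; a minimal usage sketch, assuming your own docker-compose.yml and project name:

# Deploy the compose project with the ECS-specific parameters applied.
ecs-cli compose --project-name my-project \
    --file docker-compose.yml \
    --ecs-params ecs-params.yml \
    service up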

ECS’s black box is not very friendly after all.

Respondent: lowercase00

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
