1 year ago
#382913
Devios
Job with self-hosted gitlab runner (aws ec2 autoscale spot instance) sometimes stuck infinitely on cache downloading
I have a fairly specific problem.
Sometimes a job on our runner gets stuck indefinitely while downloading the cache.
The runner uses AWS EC2 spot instances (autoscaled through the docker+machine executor) to run jobs.
Here's the config.toml:
concurrent = 32
check_interval = 0
[session_server]
session_timeout = 1800
[[runners]]
name = "gitlab-runner-xxx-xx"
url = "https://gitlab.com/"
token = "xxxx"
executor = "docker+machine"
limit = 32
request_concurrency = 100
[runners.docker]
image = "alpine"
privileged = true
disable_cache = true
[runners.cache]
Type = "s3"
Shared = true
[runners.cache.s3]
ServerAddress = "s3.amazonaws.com"
AccessKey = "xxxx"
SecretKey = "xxxxx"
BucketName = "xxxxx"
BucketLocation = "eu-west-3"
[runners.machine]
IdleCount = 0
IdleTime = 500
MaxBuilds = 20
MachineDriver = "amazonec2"
MachineName = "gitlab-docker-machine-%s"
MachineOptions = [
"amazonec2-access-key=xxx",
"amazonec2-secret-key=xxxx",
"amazonec2-region=eu-west-3",
"amazonec2-vpc-id=vpc-xxxx",
"amazonec2-subnet-id=subnet-xxxx",
"amazonec2-tags=runner-manager-name,gitlab-aws-autoscaler,gitlab,true,gitlab-runner-autoscale,true",
"amazonec2-security-group=xxxxx",
"amazonec2-instance-type=m5.large",
"amazonec2-request-spot-instance=true",
"amazonec2-spot-price=0.07",
"amazonec2-use-private-address=true",
]
[[runners.machine.autoscaling]]
Periods = ["* * 8-18 * * mon-fri *"]
IdleCount = 1
IdleTime = 1000
Timezone = "UTC"
[[runners.machine.autoscaling]]
Periods = ["* * * * * sat,sun *"]
IdleCount = 0
IdleTime = 60
Timezone = "UTC"
And now the job logs:
The job gets stuck here indefinitely.
There isn't much in the gitlab-runner logs for the corresponding time period:
Apr 06 13:25:59 ip-172-31-24-13 gitlab-runner[155844]: Machine removed lifetime=53m13.857477797s name=runner-k9ee2cag-gitlab-docker-machine-1649248366-b8c70535 now=2022-04-06 13:25:59.996569815 +0000 UTC m=+1109081.553801666 reason=too many idle machines retries=0 used=1.295241679s usedCount=0
Apr 06 13:35:08 ip-172-31-24-13 gitlab-runner[155844]: Checking for jobs... received job=2299342280 repo_url=https://gitlab.com/ads-development/awa.git runner=k9ee2CAg
Apr 06 13:35:09 ip-172-31-24-13 gitlab-runner[155844]: Using existing docker-machine created=2022-04-06 07:39:37.533117067 +0000 UTC m=+1088299.090348926 docker=tcp://35.180.75.59:2376 job=2299342280 name=runner-k9ee2cag-gitlab-docker-machine-1649230777-c10facd5 now=2022-04-06 13:35:09.469844968 +0000 UTC m=+1109631.027076818 project=27672694 runner=k9ee2CAg usedcount=17
Apr 06 13:35:11 ip-172-31-24-13 gitlab-runner[155844]: Running pre-create checks... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:11 ip-172-31-24-13 gitlab-runner[155844]: Creating machine... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:11 ip-172-31-24-13 gitlab-runner[155844]: (runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a) Launching instance... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:13 ip-172-31-24-13 gitlab-runner[155844]: (runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a) Waiting for spot instance... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:29 ip-172-31-24-13 gitlab-runner[155844]: (runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a) Created spot instance request sir-9e86a6bn driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:30 ip-172-31-24-13 gitlab-runner[155844]: Waiting for machine to be running, this may take a few minutes... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:30 ip-172-31-24-13 gitlab-runner[155844]: Detecting operating system of created instance... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:30 ip-172-31-24-13 gitlab-runner[155844]: Waiting for SSH to be available... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:41 ip-172-31-24-13 gitlab-runner[155844]: Detecting the provisioner... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:42 ip-172-31-24-13 gitlab-runner[155844]: Provisioning with ubuntu(systemd)... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:35:53 ip-172-31-24-13 gitlab-runner[155844]: Installing Docker... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:29 ip-172-31-24-13 gitlab-runner[155844]: Copying certs to the local machine directory... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:30 ip-172-31-24-13 gitlab-runner[155844]: Copying certs to the remote machine... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:31 ip-172-31-24-13 gitlab-runner[155844]: Setting Docker configuration on the remote daemon... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:32 ip-172-31-24-13 gitlab-runner[155844]: Checking connection to Docker... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:33 ip-172-31-24-13 gitlab-runner[155844]: Docker is up and running! driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:33 ip-172-31-24-13 gitlab-runner[155844]: To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a operation=create
Apr 06 13:36:33 ip-172-31-24-13 gitlab-runner[155844]: Machine created duration=1m22.376991683s name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a now=2022-04-06 13:36:33.185950721 +0000 UTC m=+1109714.743182572 retries=0
Apr 06 13:39:33 ip-172-31-24-13 gitlab-runner[155844]: Starting docker-machine build... created=2022-04-06 07:39:37.533117067 +0000 UTC m=+1088299.090348926 docker=tcp://35.180.75.59:2376 job=2299342280 name=runner-k9ee2cag-gitlab-docker-machine-1649230777-c10facd5 now=2022-04-06 13:39:33.033567365 +0000 UTC m=+1109894.590799240 project=27672694 runner=k9ee2CAg usedcount=17
Apr 06 13:53:17 ip-172-31-24-13 gitlab-runner[155844]: Checking for jobs... received job=2299447035 repo_url=https://gitlab.com/ads-development/izanami-proxy.git runner=k9ee2CAg
Apr 06 13:53:18 ip-172-31-24-13 gitlab-runner[155844]: Using existing docker-machine created=2022-04-06 13:35:10.808885536 +0000 UTC m=+1109632.366117387 docker=tcp://13.36.172.133:2376 job=2299447035 name=runner-k9ee2cag-gitlab-docker-machine-1649252110-5691836a now=2022-04-06 13:53:18.579111892 +0000 UTC m=+1110720.136343742 project=24129043 runner=k9ee2CAg usedcount=1
Apr 06 13:53:20 ip-172-31-24-13 gitlab-runner[155844]: Running pre-create checks... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2 operation=create
Apr 06 13:53:20 ip-172-31-24-13 gitlab-runner[155844]: Creating machine... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2 operation=create
Apr 06 13:53:20 ip-172-31-24-13 gitlab-runner[155844]: (runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2) Launching instance... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2 operation=create
Apr 06 13:53:22 ip-172-31-24-13 gitlab-runner[155844]: (runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2) Waiting for spot instance... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2 operation=create
Apr 06 13:53:37 ip-172-31-24-13 gitlab-runner[155844]: (runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2) Created spot instance request sir-t6vp8pzm driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2 operation=create
Apr 06 13:53:39 ip-172-31-24-13 gitlab-runner[155844]: Waiting for machine to be running, this may take a few minutes... driver=amazonec2 name=runner-k9ee2cag-gitlab-docker-machine-1649253200-fcbdf5d2 operation=create
The only thing I can see is a 3-minute pause in both the job and gitlab-runner, between roughly 13:36 and 13:39.
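To make pauses like this easier to spot in longer logs, here is a minimal Python sketch that scans syslog-style lines and reports gaps between consecutive entries. The function names (`parse_ts`, `find_gaps`), the 2-minute threshold, and the hard-coded year are my own assumptions for illustration; the two sample lines are shortened copies of the "Machine created" and "Starting docker-machine build..." entries above.

```python
from datetime import datetime, timedelta


def parse_ts(line: str, year: int = 2022) -> datetime:
    """Parse the 'Apr 06 13:36:33' prefix of a syslog-style line.

    The prefix carries no year, so one is assumed (2022 here).
    """
    prefix = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {prefix}", "%Y %b %d %H:%M:%S")


def find_gaps(lines, threshold=timedelta(minutes=2)):
    """Return (previous, next, delta) for consecutive lines at least `threshold` apart."""
    stamps = [parse_ts(line) for line in lines if line.strip()]
    return [(a, b, b - a) for a, b in zip(stamps, stamps[1:]) if b - a >= threshold]


# Shortened copies of two log lines from the excerpt above.
sample = [
    "Apr 06 13:36:33 ip-172-31-24-13 gitlab-runner[155844]: Machine created",
    "Apr 06 13:39:33 ip-172-31-24-13 gitlab-runner[155844]: Starting docker-machine build...",
]

for prev, nxt, delta in find_gaps(sample):
    print(f"gap of {delta} between {prev.time()} and {nxt.time()}")
```

On the full excerpt this would also flag the longer quiet periods while the runner is simply waiting for a new job, so the threshold needs tuning.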
My guess is that the spot instance is destroyed while the cache is downloading, which is why the job gets stuck there. But it may be something else. Does anyone have an idea of what's happening?
Thank you very much, guys!
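One way I could test the spot-interruption guess is to ask the EC2 instance metadata service whether an interruption is scheduled. The sketch below is not something I have running yet. It assumes it is executed on the spot instance itself (not inside a job container) and that IMDSv2 session tokens are in use; the function names are hypothetical. The `spot/instance-action` path returns 404 when no interruption is pending.

```python
import json
import urllib.error
import urllib.request

# EC2 instance metadata service, only reachable from the instance itself.
IMDS = "http://169.254.169.254/latest"


def parse_instance_action(body: str):
    """Return (action, time) from the spot/instance-action JSON document."""
    doc = json.loads(body)
    return doc["action"], doc["time"]


def fetch_instance_action(timeout: float = 2.0):
    """Return (action, time) if an interruption is scheduled, else None."""
    # IMDSv2: obtain a short-lived session token first.
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    req = urllib.request.Request(
        f"{IMDS}/meta-data/spot/instance-action",
        headers={"X-aws-ec2-metadata-token": token},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return parse_instance_action(resp.read().decode())
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no interruption notice pending
            return None
        raise
```

Calling `fetch_instance_action()` every few seconds and logging any non-`None` result with a timestamp would show whether an interruption notice lines up with the moment the job hangs.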
amazon-web-services
gitlab-ci-runner
spot-instances
0 Answers