Realtime Server Fleet not Activating

I am trying to get a GameLift fleet up and running using the basic example shown in the GameLift documentation. Rather than just doing it through the console, I wanted to configure it with code. I set up a CDK project (effectively CloudFormation) that uploads a server JavaScript file to S3 as a zip file. It then adds a role that can access the S3 artifact and configures a GameLift script.

When I then try to create a fleet from that script, GameLift passes all of the check stages but gets stuck when activating. It finally goes to the Error state after an hour or two. I managed to get onto the instance, and the zip file appears to have been downloaded correctly, as the file exists in the folder /local/game/etag-0. But I see no log files, which I assume indicates the server process didn’t start correctly.

What is strange is that if I download the same zip file from S3 and then upload it while creating a script manually in the console I am able to create a fleet from that script with no issue at all.

From what I could tell, the /local/game directories on both the working and non-working instances looked pretty much the same. I could not find anything in any of the logs on the instances that indicated an issue.

It would be really helpful if anyone had any ideas about what could be causing the issue or if anyone had any advice on a good place to look for more information.

Thanks.

My initial guess is that CDK is somehow mangling the zip, causing it to fail when it’s unzipped. But as you can download the zip from S3, it’s probably OK. The other option here is that your launch path could be off with respect to the zip.
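Both guesses can be checked locally before creating a fleet. The following is only a minimal sketch using Python's standard library; `check_script_zip` is a made-up helper name, and it assumes the launch path is resolved relative to the root of the extracted zip (which is what the extraction path in the fleet events suggests).

```python
import io
import zipfile


def check_script_zip(zip_bytes: bytes, launch_path: str) -> list:
    """Return a list of problems found in a script zip (empty means none found)."""
    problems = []
    try:
        archive = zipfile.ZipFile(io.BytesIO(zip_bytes))
    except zipfile.BadZipFile as exc:
        return [f"not a readable zip: {exc}"]
    with archive:
        # testzip() re-reads every member and returns the first name whose CRC fails.
        bad_member = archive.testzip()
        if bad_member is not None:
            problems.append(f"corrupt member: {bad_member}")
        # Assumption: the launch path must match a member name exactly.
        if launch_path not in archive.namelist():
            problems.append(
                f"{launch_path} not found in zip; members are {archive.namelist()}"
            )
    return problems
```

Running it against the bytes of the zip downloaded from S3, with the launch path used for the fleet (for example `MegaFrogRaceServer.js`), should return an empty list if the archive itself is sound.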

Did you see anything in the fleet events for the failed fleet?

If you can provide your failed fleet id, I can get the GameLift service to take a quick look.

Hi Pip, thank you for the swift response. My initial thought was also that there was an issue with the zip, which is why I tried downloading it and uploading it through the console. In terms of the fleet events, it looks like everything is OK:

Extracting Build:

Extracted files from build: 
	/local/game/etag-0/MegaFrogRaceServer.js

Searching for runtime path:

Script extraction path: /local/game/etag-0
Searched for one valid path in the following list:
	/local/game/etag-0/MegaFrogRaceServer.js

Found a path: /local/game/etag-0/MegaFrogRaceServer.js

I had deleted my fleet but have brought up a new one this morning that is also failing in the same way. The fleet id is:

fleet-0891c2f7-15ca-4679-b829-8108e7ec1ed5

I am probably doing something stupid but appreciate your help. Thanks.

Thanks for the update, I’ve pinged the GameLift service team to have a look. Hopefully they will have an update for you shortly.

Hi DanM,

I’m taking a look into the issue from the GameLift side. I’ve done some investigation and I have a couple of questions:
- Was the script or its permissions changed at all during the ACTIVATING time of the fleet?
- Does the CDK project change the permissions on the zip file at all? E.g. Write/Read

Thanks!
Alex

I am not aware of a reason why the permission should be changed during the ACTIVATING time of the fleet although I am not sure how to check that.

In terms of CDK, as far as I can tell it doesn’t do anything with the file permissions. The file I tell CDK to bundle to S3 has the same permissions as when I download it again from S3.

It has read/write owner and group, read-only everyone and no execute flag.

I don’t know if this helps at all, but let me know if there is any more information I can give.

Thanks for your help.

Hi DanM,

Did you also try uploading it manually (either through console or aws cli) to S3 then using that s3 location to create a script & fleet? If not, could you try that as well?

It looks like we treat the S3 ETag as an MD5 digest to validate that we didn’t lose any data in transit. However, this doesn’t seem to be as fail-safe as expected: https://aws.amazon.com/premiumsupport/knowledge-center/data-integrity-s3/#:~:text=Follow%20these%20steps%20to%20verify,object%20was%20created%20and%20encrypted.

I would think that there’s something in CDK which impacts the ETag, which is why the fleet won’t go to ACTIVE: it thinks the S3 file is corrupted.
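To see whether a given object falls into that case, the ETag S3 reports can be compared with a locally computed MD5. This is a rough sketch only: `classify_etag` is a hypothetical helper, and it relies on the generally documented S3 behaviour that multipart uploads get an ETag of the form `<hex>-<part count>`, while objects encrypted with SSE-KMS can have an ETag that is not the MD5 of the content. It does not reproduce GameLift's actual validation, which isn't shown in this thread.

```python
import hashlib
import re


def classify_etag(etag: str, content: bytes) -> str:
    """Classify an S3 ETag against the bytes the object is expected to contain."""
    # S3 returns the ETag wrapped in double quotes, so strip them first.
    value = etag.strip('"').lower()
    # A "-N" suffix marks a multipart upload; such an ETag is not a whole-object MD5.
    if re.fullmatch(r"[0-9a-f]{32}-\d+", value):
        return "multipart"
    if value == hashlib.md5(content).hexdigest():
        return "md5-match"
    # Could be real corruption, or an encryption mode where the ETag is not an MD5.
    return "mismatch"
```

If the CDK-uploaded object comes back as "multipart" or "mismatch" while the manually uploaded copy of the same file is "md5-match", that would be consistent with the theory above.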

To unblock you for now, would it be possible to upload the script to s3 through aws cli or console?

Hi Kenneth,

I have just tried what you asked. I have uploaded the zip file to s3 using the aws cli. I then changed the cdk code to provision the script by referencing the manually uploaded zip file and I was then able to provision a fleet from that script. So sounds like it is something about the way CDK is uploading the zip file.

I can work with this as a workaround while I investigate GameLift. Is there anything I can do to help get to the bottom of the root cause?

Thanks,
Dan

On the CLI, zip files are typically uploaded via the fileb:// prefix so that the file is treated as unencoded binary.

I suspect you may need to set something here: https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-s3-assets.Asset.html#bundling-span-class-api-icon-api-icon-experimental-title-this-api-element-is-experimental-it-may-change-without-notice-span or via the isZipArchive property?

I have yet to find a clear example of what should be set, though; I believe it’s supposed to set these properties automatically if it’s a zip file, so you may want to ensure you’re on the latest CDK.
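For anyone scripting the manual upload instead of using the CLI, the same idea might look like the sketch below. `upload_zip_single_part` is a made-up helper; `s3_client` is assumed to be a boto3 S3 client (for example `boto3.client("s3")`), and the bucket and key are placeholders. The `put_object` call with `Bucket`, `Key` and `Body` is the standard boto3 one.

```python
import hashlib


def upload_zip_single_part(s3_client, zip_path: str, bucket: str, key: str) -> bool:
    """Upload a zip as raw bytes in one PutObject call.

    Returns True if the ETag S3 reports equals the local MD5 of the file.
    """
    # "rb" keeps the archive as unencoded binary, the same intent as fileb:// on the CLI.
    with open(zip_path, "rb") as handle:
        body = handle.read()
    # put_object sends the whole body in a single request, so no multipart-style ETag.
    response = s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return response["ETag"].strip('"') == hashlib.md5(body).hexdigest()
```

A False result does not necessarily mean the upload is corrupt; the bucket's encryption settings can also produce an ETag that is not an MD5.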

Good luck.