GameLift Terminating Healthy, Active Sessions

We’ve seen several cases where GameLift requested the shutdown of in-use, healthy servers, and we’re hoping for more information. Examples:

  • arn:aws:gamelift:us-west-2::gamesession/fleet-d631790e-def4-42bc-b8eb-f628dfc270c8/F9EED712-4A1F-A189-B21F-28841EB90FEE created at 2018-04-13 23:33:59 UTC-0400
  • arn:aws:gamelift:us-west-2::gamesession/fleet-d631790e-def4-42bc-b8eb-f628dfc270c8/C5811D01-4ED4-CD7C-B90B-4794CB44248B from 2018-04-13 16:44:40 UTC-0400

Hi @Philippe23, I’m sorry you’re running into this. I investigated both sessions and the hosts should not have been terminated.

We’re investigating the root cause and will get back to you once we have more information.

Thanks,

Ben

Thanks, @Ben. Just so I’m clear: you think that in these cases our policy triggered a scale-down and it chose the in-use instance instead of the idle instance? And that is essentially because the current “choose an instance to scale down” GameLift algorithm doesn’t take “in-use” into account unless “Full Protection” is enabled. Is that right? (We’re definitely on “No Protection” currently.)

For reference, here’s our current scaling policy:

Hi @Ben. Looking forward to hearing what you find.

Just to be clear, the game server processes received the FProcessParameters::OnTerminate event (UE4 GameLift Server SDK), which is referred to as onProcessTerminate by the AWS GameLift SDK docs.

The docs seem to imply that it’ll only be called if the app becomes unresponsive for more than 3 minutes or if it reports itself as unhealthy, which shouldn’t be the case for either of these sessions.

“The Amazon GameLift service invokes the server process’s onProcessTerminate callback. This call is used to shut down a server process that has reported unhealthy or not responded with health status for three consecutive minutes.”

Source: https://docs.aws.amazon.com/gamelift/latest/developerguide/gamelift-sdk-interactions.html#gamelift-sdk-interactions-shutdown-request

Hey @Philippe23, I think I see what’s going on here. Can you check your GameSession Protection Policy?

To find it, go to the console and select your fleet. Then click Actions and choose ‘Edit Fleet’.

Down at the very bottom you’ll see a section called Protection Policy.

Click on the drop-down and select ‘Full Protection’.

You can configure this through the console when creating a fleet.

If you’re using the API you can set it during CreateFleet:
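
A minimal sketch of setting the policy through the API, assuming Python with boto3 (the thread does not say which SDK is in use). `NewGameSessionProtectionPolicy` is the parameter accepted by both CreateFleet and UpdateFleetAttributes; the helper names `with_protection` and `protect_existing_fleet` are made up for illustration, and `client` is assumed to be a `boto3.client("gamelift")` object.

```python
FULL = "FullProtection"
NONE = "NoProtection"


def with_protection(create_fleet_kwargs, protected=True):
    """Return a copy of CreateFleet arguments with the protection policy set.

    Hypothetical helper: the other CreateFleet arguments (name, build,
    instance type, ...) are whatever the caller already passes.
    """
    kwargs = dict(create_fleet_kwargs)
    kwargs["NewGameSessionProtectionPolicy"] = FULL if protected else NONE
    return kwargs


def protect_existing_fleet(client, fleet_id, protected=True):
    """Change the policy on a fleet that already exists.

    `client` is assumed to be boto3.client("gamelift"); this calls its
    update_fleet_attributes method.
    """
    return client.update_fleet_attributes(
        FleetId=fleet_id,
        NewGameSessionProtectionPolicy=FULL if protected else NONE,
    )
```

With a real client this would look like `protect_existing_fleet(boto3.client("gamelift", region_name="us-west-2"), "fleet-d631790e-def4-42bc-b8eb-f628dfc270c8")`. Note that the fleet-level policy applies to game sessions created after the change; sessions that already exist keep the policy they started with.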

Ben

That’s exactly correct @Philippe23. If “No Protection” is selected then the scaling policy will randomly choose a host to scale down regardless of game session state.
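
To illustrate the difference, here is a toy model in Python of the behavior described above. It is not GameLift's actual selection algorithm; the function name, the instance names, and the session counts are all hypothetical.

```python
import random


def pick_instance_to_terminate(instances, protection, rng=random):
    """Pick one instance to remove during a scale-down.

    instances:  dict mapping instance name -> number of active game sessions
    protection: "NoProtection" or "FullProtection"
    Returns the chosen instance name, or None if nothing may be removed.
    """
    if protection == "FullProtection":
        # Only instances with no active sessions are eligible.
        candidates = [name for name, active in instances.items() if active == 0]
    else:
        # Session state is ignored: every instance is eligible.
        candidates = list(instances)
    return rng.choice(candidates) if candidates else None


fleet = {"busy-host": 1, "idle-host": 0}
print(pick_instance_to_terminate(fleet, "FullProtection"))  # always idle-host
print(pick_instance_to_terminate(fleet, "NoProtection"))    # either host
```

Under “No Protection” the in-use host is as likely to be picked as the idle one, which matches what we saw with the two sessions listed at the top.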

Ben

Thanks, @Ben!

To provide some context on why you might want GameLift to just “tear everything down” even if there are active games running: this is useful during development. For example, you might be testing a bunch of changes, creating games and player sessions, and then just want to scale your fleets down. With instance protection off, GameLift will terminate all instances cleanly.

During production, when you have live games and players, enabling instance protection is the best way to ensure a good player experience.