Queue priorities with Spot fleets

Hi.

According to documentation: Best practices for GameLift game session queues - Amazon GameLift

  • When configuring priorities for a queue with Spot fleets, place cost near the top of the list. This will ensure that locations on Spot fleets will always take precedence over locations on On-Demand fleets, when available.

Also: Tutorial: Set up a game session queue for Spot Instances - Amazon GameLift

Prioritize the fleets in your queue. Fleet prioritization determines where the queue looks first when searching for available resources to host a new game session. You might choose to prioritize by Region, instance types, fleet type, and so on. When working with Spot fleets, we recommend either of the following approaches:

  • If your infrastructure uses a primary Region with fleets in a second Region for back-up only, you want to prioritize fleets first by region, and then by fleet type. With this approach, all fleets in the primary Region are placed at the top of the list, with Spot fleets followed by On-Demand fleets.
  • If your infrastructure uses multiple Regions equally, you want to prioritize fleets by fleet type, placing Spot fleets at the top of the list.

I have a setup with 10 fleets (5 per region):
1 x On-Demand - c5.xlarge
4 x Spot - c5.xlarge, c5.2xlarge, r5.xlarge, m5.xlarge

I’ve placed the On-Demand fleets last in the queue, yet players are still routed to the two On-Demand fleets instead of the Spot ones. We make placements based on reported player latency.

What am I missing? What is the explanation for this behavior?
Thank you.

Hi @Lucian_Gutu

A GameLift queue prefers a SPOT fleet as long as it is “viable”. Viability is determined by the SPOT fleet’s EC2 instance type, OS, and Region. If that combination of attributes is at risk of SPOT interruption, we deem the fleet unviable.

SPOT interruption, in layman’s terms, is EC2 reclaiming the SPOT fleet and reallocating it to ON_DEMAND. This typically happens when the EC2 instance type and OS have high usage in the Region. 5th generation hosts (c5, m5, r5) that are larger than *.large typically have relatively low capacity but high usage, so it’s likely that all of your SPOT instances were unviable, which caused the GameLift queue to place sessions on ON_DEMAND instead.

Here are some graphs illustrating the viability in the last 30 days in IAD for Linux:


As you can see, *.large and 4th generation instance types are typically much more stable in viability. So I’d recommend using c5.large or c4.xlarge to replace one of your *5.xlarge instance types.

You can find out why your cheapest SPOT fleet wasn’t used for a placement by going to CloudWatch and searching for the “FirstChoiceNotViable” metric for your queue. See: Monitor GameLift with Amazon CloudWatch - Amazon GameLift.
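If you’d rather pull that metric from a script than from the console, here is a minimal sketch using boto3. It assumes the metric is published in the AWS/GameLift namespace with a QueueName dimension; the queue name "my-queue", the us-east-1 region, and the one-day window are placeholders, and both helper functions are hypothetical names made up for this example.

```python
from datetime import datetime, timedelta, timezone


def first_choice_not_viable_query(queue_name, days=1, period_seconds=3600, now=None):
    """Build keyword arguments for CloudWatch get_metric_statistics.

    Assumes queue metrics live in the AWS/GameLift namespace and are
    keyed by a QueueName dimension.
    """
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/GameLift",
        "MetricName": "FirstChoiceNotViable",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "StartTime": end - timedelta(days=days),
        "EndTime": end,
        "Period": period_seconds,   # one datapoint per hour by default
        "Statistics": ["Sum"],      # count of affected placements per period
    }


def fetch_counts(queue_name, region="us-east-1"):
    """Return (timestamp, count) pairs, oldest first. Needs AWS credentials."""
    import boto3  # imported here so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    response = cloudwatch.get_metric_statistics(
        **first_choice_not_viable_query(queue_name)
    )
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Sum"]) for p in points]
```

Calling fetch_counts("my-queue") with valid credentials should return the hourly counts. Non-zero values would suggest that the queue’s first-choice fleet was skipped for viability reasons during that hour.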
