Game Server Sharding: Blasphemess Clusters and the Portal
Vertically scaling servers can only get you so far; horizontal scaling and sharding are a great complement to support massive scale.
If you're not familiar with technical terms, you may be asking yourself, "what is horizontal scaling and vertical scaling?" Vertical scaling is where you upgrade a single machine's hardware to make the game server run better. For instance, upgrading the CPU from 12 cores to 64 cores, boosting RAM from 32 GB to 128 GB, and swapping from 100 Mbps networking to 10 Gbps.
As you add more performance, you get the ability to handle more players concurrently. More players in a game server means more opportunity for interactions.
However, modern computers have a hard limit on how far they can scale individually. CPUs can only get so fast and have so many cores, and RAM can only fit so much data, before you run into physical limitations and heat dissipation problems.
Enter horizontal scaling. What if, instead of making one server really performant, you wrote your application to use two servers instead? Or three or four or even one thousand...
Well, you can get interesting scale that way if your architecture allows for it. You also can benefit from "fault tolerance," or the ability to handle issues like hardware failures or network outages.
Both vertical and horizontal scaling work great, have their niches, and they synergise quite well together. With my work, I've spent a lot of time moving business services into clouds and containerizing them.
For my game, Blasphemess, I'm aiming for horizontal scaling right from the start. The idea is to make individual game server instances (shards, also called clusters in my setting) operate independently of each other.
This idea is not new, and can be traced back to the early days of internet gaming:
Sharding is one common technique for scaling databases. Instead of storing all data in one big instance of your database (MySQL or PostgreSQL, for instance), you split it across several instances, based on some mapping between a field and a database instance.
In the business world, an easy way to do that is to map a customer ID to a single database instance, which then has all the tables and rows necessary to function for that customer.
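As a minimal sketch of that customer-to-database mapping, assuming a fixed list of database instances: the connection strings and function names below are hypothetical, made up purely for illustration.

```python
import hashlib

# Hypothetical connection strings; a real deployment would load these from config.
SHARD_DSNS = [
    "postgresql://db-0.example.internal/app",
    "postgresql://db-1.example.internal/app",
    "postgresql://db-2.example.internal/app",
]


def shard_index(customer_id: str, shard_count: int) -> int:
    """Map a customer ID to a shard number in a stable, repeatable way."""
    # Use hashlib rather than the built-in hash(): string hashes are randomised
    # per process by default, so they would not be stable across restarts.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def dsn_for_customer(customer_id: str) -> str:
    """Return the one database instance that holds this customer's rows."""
    return SHARD_DSNS[shard_index(customer_id, len(SHARD_DSNS))]
```

One caveat with this approach: a plain modulo mapping moves most customers to a different shard whenever the number of instances changes. An explicit lookup table from customer ID to instance is a common alternative when shards need to be added later.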
In gaming, you might have a game shard that only supports maybe 50 players in the copy of a game's location at a time. Or famously, like EVE Online, you might have different nodes running at different timescales connected into a larger universe.
For Blasphemess, the sharding is based around game settings for the conflict. A single world, a collection of cities, or regions locked in war. These shards of the game support a fraction of the total players at a time, with some coming and going.
With such an infinite possibility space of the multiversal cosmic setting, these game shard settings can be anything that fits the mood. A Sci-Fi generation ship, an arcology, a demon-infested slave planet, a fey forest? They all can fit in snugly, and be run side by side in different corners of the cosmos.
I've elected to name these shards of the game world "Clusters."
What are the Blasphemess Clusters and Portal?
In short, clusters are the game setting where players engage in the game, and the portal is the overhead that allows players to sign up, create characters, and migrate them between clusters.
The portal is the source of truth for users and for which in-game characters they own. From the portal, players can send their characters into a cluster to play by "migrating" them, which is functionally the starting point for a character. If they later migrate off a cluster, it's like a soft reset.
Clusters, on the other hand, are self-contained game worlds. Once a character is migrated onto one, the cluster has all the data needed to play the game. If the portal or a different cluster goes offline, you can still play on the online cluster.
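To make that split of responsibilities concrete, here is a rough in-memory sketch. Every class, field, and method name in it is hypothetical and only illustrates the idea; it is not the actual Blasphemess data model.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Character:
    character_id: str
    owner_user_id: str
    name: str


@dataclass
class Cluster:
    """A self-contained game world that holds its own copy of character data."""
    name: str
    characters: dict[str, Character] = field(default_factory=dict)

    def can_play(self, character_id: str) -> bool:
        # Answered from local data only; no call back to the portal is needed.
        return character_id in self.characters


@dataclass
class Portal:
    """Source of truth for users and the characters they own."""
    characters: dict[str, Character] = field(default_factory=dict)
    locations: dict[str, str] = field(default_factory=dict)  # character_id -> cluster name

    def create_character(self, character_id: str, owner_user_id: str, name: str) -> Character:
        character = Character(character_id, owner_user_id, name)
        self.characters[character_id] = character
        return character

    def migrate_onto(self, character_id: str, cluster: Cluster) -> None:
        # Hand the cluster its own copy so it never shares state with the portal.
        cluster.characters[character_id] = replace(self.characters[character_id])
        self.locations[character_id] = cluster.name

    def migrate_off(self, character_id: str, cluster: Cluster) -> None:
        # The "soft reset": cluster-local state is dropped, portal ownership remains.
        cluster.characters.pop(character_id, None)
        self.locations.pop(character_id, None)
```

In this sketch the cluster answers `can_play` from its own dictionary, which is the property that lets play continue while the portal is unreachable.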
In the lore of the setting, clusters represent a region or locale in the multidimensional conflict. That might be a world like Vexavel's, ruled by a demonic sovereign and newly facing opposition from the holy alliance and the exoneration coalition. Players join and play in this corner of the Blasphemess cosmos, and decide the eventual fate of that world.
These game clusters are called such because they are a cluster of conflict regions, and also because they are a cluster of services intended to be run on, for example, a Kubernetes cluster.
Let's take a quick peek at the simplified architecture of the prototype so far:
My FastAPI and Celery backends are designed to be run concurrently, with multiple worker copies at a time. Those containers can be scaled up horizontally. If I ever need to do more than vertically scale the Postgres database, I could implement a read-only replica set to offload work from the primary DB.
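If read replicas ever do become necessary, the routing decision itself is small. The sketch below is library-agnostic and assumes connection strings are plain strings; `ReplicaRouter` and its method names are hypothetical, not part of FastAPI, Celery, or Postgres.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread read-only work across replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Round-robin over replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def dsn_for(self, readonly: bool) -> str:
        if readonly and self._replicas is not None:
            return next(self._replicas)
        # Writes, and all traffic when there are no replicas, go to the primary.
        return self.primary


router = ReplicaRouter("primary-dsn", ["replica-1-dsn", "replica-2-dsn"])
write_target = router.dsn_for(readonly=False)
read_target = router.dsn_for(readonly=True)
```

One design caveat: with asynchronous replication, a replica can lag slightly behind the primary, so a read that must see a write made just before it is safer sent to the primary.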
There are also a few secondary components, like Jaeger, which is purely for observability into the cluster, so that I can trace faults, debug, and profile performance. My Celery setup also includes Flower for observability.
In short, the architecture for clusters is solidly in place to accommodate much more complex game settings than I currently intend to try.
What's the Future of Scale in Blasphemess?
For these early days, I don't imagine that Blasphemess will get more than tens of players. 20 to 50 players, perhaps.
However, if I want to pursue the option, I wouldn't mind having the game set up to already support hundreds or even thousands of players.
Who knows where the months and years may lead? Perhaps I will keep this game a cloistered, friendly environment. Or perhaps some day it will take off, and it will become bigger than I'd hoped for.
Time will tell! Until then, let's keep building and crafting.