Multiple servers - high availability

Post here if you need help setting up your server, etc.
jonaswm
Posts: 2
Joined: Wed Aug 23, 2017 1:51 pm

Multiple servers - high availability

Post by jonaswm »

Good morning from Brazil.
I'm new here! :)
I would like to configure the server for high availability; this is a university assignment.
I looked at the settings in the master.srv file, but I don't know whether the configuration should only be done there or whether other .cfg files have to change as well.
I'm asking for help!

Thank you!
(By google translator)

sinewav
Graphic Artist
Posts: 6214
Joined: Wed Jan 23, 2008 3:37 am

Re: Multiple servers - high availability

Post by sinewav »

jonaswm wrote:I would like to configure the server for high availability; this is a university assignment.
Can you be more specific? What do you mean "high availability"? Usually a high-availability server is part of a redundant pair or a cluster and uses something to act as a load balancer. None of this is configurable in Armagetron AFAIK.

Some routers have a high-availability feature. You can create two Arma servers and use the router to direct traffic between them. Can I ask why you need a high-availability Armagetron server?

jonaswm
Posts: 2
Joined: Wed Aug 23, 2017 1:51 pm

Re: Multiple servers - high availability

Post by jonaswm »

sinewav wrote:
jonaswm wrote:I would like to configure the server for high availability; this is a university assignment.
Can you be more specific? What do you mean "high availability"? Usually a high-availability server is part of a redundant pair or a cluster and uses something to act as a load balancer. None of this is configurable in Armagetron AFAIK.

Some routers have a high-availability feature. You can create two Arma servers and use the router to direct traffic between them. Can I ask why you need a high-availability Armagetron server?

Thank you for answering me!
So, splitting traffic between two Armagetron servers is possible with just NAT rules in iptables, but how would I keep the game state in sync: the state of a player in a given match? From what I've seen, there's a directory where Armagetron keeps the state of each player. Should I just sync this folder between the servers?
My biggest problem is maintaining the player's state, because the task is to cause the least possible impact to the player.

I hope I was clear, thank you!

Lucifer
Project Developer & Local Moonshiner
Posts: 8610
Joined: Sun Aug 15, 2004 3:32 pm
Location: Republic of Texas

Re: Multiple servers - high availability

Post by Lucifer »

You can't split traffic between two arma servers. The closest to "high availability" you're going to get is to run it as a service, so that if it crashes or gets DDOS'd or something, it'll automatically restart.
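For the run-it-as-a-service approach, a supervisor like systemd is enough to get automatic restarts after a crash. This is only a sketch, not an official unit file shipped with the game; the binary path and user are assumptions you'd adjust for your install:

```ini
[Unit]
Description=Armagetron Advanced dedicated server
After=network.target

[Service]
# Path is an assumption; point ExecStart at your actual dedicated-server binary.
ExecStart=/usr/local/bin/armagetronad-dedicated
# Restart on any exit (crash, kill, clean exit), after a short pause.
Restart=always
RestartSec=2
# Run as an unprivileged user (create one for the server).
User=armagetron

[Install]
WantedBy=multi-user.target
```

With that installed, `systemctl enable --now` keeps the server up across crashes and reboots, which is as close to "high availability" as a single instance gets.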

I hope google translate can work with this next bit.

Making a game like Armagetron support clustering would increase perceived latency. You'll have multiple clients connected to multiple servers, and each server is running its own simulation. You then need a way for each server to agree on which simulation is the "correct" or canonical one and adjust their own simulations to match. This concept is similar to how the client runs its own simulation to predict what's happening on the server, except that the server, by definition, is the "correct" simulation, which is how you get jumps and bumps when playing online. Lost sync packets mean the client doesn't have all of the information to power its simulation and periodically has to re-sync everything.

Be the devil's own, Lucifer's my name.
- Iron Maiden

delinquent
Match Winner
Posts: 660
Joined: Sat Jul 07, 2012 3:07 am

Re: Multiple servers - high availability

Post by delinquent »

Actually, a bastardisation of a distributed service might be workable if a controller is placed in the network to determine priority. Assuming the idea operates on the same principle as cluster failover, the controller could hand over to a secondary Arma server that carries on with the game if the primary game server crashes.

There are, however, multiple caveats to this approach:

- Game state would need to be synced at every round to compensate for hardware capability - this could potentially lead to client timeouts.
- Failover wouldn't be seamless. Because the first server is a priority server, the controller would need to take a few milliseconds to catch up. You'd see noticeable lagspikes, and it might cause in-game deaths.
- The controller would certainly add a few milliseconds to the existing client-server latency.
- You'd have to make the controller invisible to both the clients and the server, otherwise you'd need to fork both projects. It would be a huge undertaking for one person to attempt to re-engineer Arma.

As a networking project, this is cool. If you can effectively build a custom controller, you can potentially move on to other games that might have need of such a piece of software. I know games like War Thunder, World of Tanks/Warplanes/Warships, Elite Dangerous etc would all appreciate something like this. Eve Online uses something slightly similar, but it's a proprietary piece of kit developed specifically for that platform.

Lucifer
Project Developer & Local Moonshiner
Posts: 8610
Joined: Sun Aug 15, 2004 3:32 pm
Location: Republic of Texas

Re: Multiple servers - high availability

Post by Lucifer »

So you're suggesting a controller that specifies which server is the correct simulation? Then, obviously, if that server crashes, the number two server becomes correct and each server promotes itself. I think you can minimize lag spikes pretty easily, though. You'd have a (configurable) number of servers specified to take over in the event the top server loses contact, as well as a (configurable) extremely short window in which the top server has to send packets to the other servers; I'm thinking something small, like 1 ms. You could even work in a heartbeat system, which would simply be a series of pings where the other servers in the cluster measure the "health" of a server; if the top server starts getting spotty (maybe due to high CPU load), the others could go ahead and bump that server down the priority order.
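The heartbeat-and-promotion idea is simple enough to sketch. This is a toy illustration, not Armagetron code: all the names here are hypothetical, and a real version would exchange UDP pings between machines rather than timestamps in one process.

```python
import time


class ClusterMonitor:
    """Tracks peer health from heartbeat timestamps and picks the
    highest-priority healthy server as the canonical simulation."""

    def __init__(self, priority_order, timeout=0.05):
        self.priority_order = list(priority_order)  # e.g. ["srv-a", "srv-b"]
        self.timeout = timeout  # seconds of silence before a peer is "down"
        self.last_seen = {name: float("-inf") for name in self.priority_order}

    def record_heartbeat(self, name, now=None):
        """Called whenever a ping arrives from a peer."""
        self.last_seen[name] = time.monotonic() if now is None else now

    def top_server(self, now=None):
        """First server in priority order that pinged recently, else None."""
        now = time.monotonic() if now is None else now
        for name in self.priority_order:
            if now - self.last_seen[name] <= self.timeout:
                return name
        return None  # whole cluster unreachable


# srv-a is healthy at first; once its pings go stale, srv-b is promoted.
monitor = ClusterMonitor(["srv-a", "srv-b"], timeout=0.05)
monitor.record_heartbeat("srv-a", now=1.00)
monitor.record_heartbeat("srv-b", now=1.04)
print(monitor.top_server(now=1.02))  # srv-a
print(monitor.top_server(now=1.08))  # srv-a is stale now -> srv-b
```

Every node in the cluster would run the same monitor, so they all independently agree on who the top server is without extra negotiation traffic.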

None of that is too difficult, but what I'm worried about is the perceived latency. As to how the other servers in the cluster function, I see three options.

Option 1: Each server receives input from a subset of the clients connected to the cluster. They pass those through to the top server with their own timestamps, and the top server trusts the timestamps on those packets (the Arma server currently doesn't trust timestamps from clients). Then the cluster node simulates the input and keeps talking to its own clients accordingly, while syncing its own simulation with the top server.

Perceived latency problem: Seeing syncs from the other nodes in the cluster connected to other clients. One player, connected to one node, sees a comparable level of accuracy for their own movements as they already get with one game server, but will see more latency from clients connected to other nodes due to the fact that syncs have to pass through the top server.

Option 2: Each node receives input from its clients and passes it to all other nodes. Each node simulates, as it already does, and syncs with the top server.

Perceived latency problem: All the extra processing of packets in the network layer leads to longer packet processing times. You're basically increasing internal traffic in the cluster quite dramatically.

Option 3: Your load balancer (which would have to be a new app that is aware of the cluster behind it) receives all input from clients and distributes the packets evenly to all the servers, so that all the servers get input packets at pretty damn close to the same time. Each server then simulates accordingly, and outgoing packets from the servers would need some special treatment by the load balancer: some algorithm that reduces each node's outgoing packet load so that no single node is sending many outgoing syncs, while the load balancer sends more total outgoing syncs by picking from each node. In this situation, each node syncs its simulation internally with the top server, but sends outgoing packets from its own simulation.

Perceived latency problem: This is probably the one that has the least perceived latency problem, but it's the same as option 1, just spread out amongst all the clients.
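The core packet flow of option 3 can be sketched in a few lines. Again a toy, with plain callables standing in for real game server processes and none of the fancy per-node sync selection; it only shows the fan-out of inputs and the failover of the canonical output:

```python
class LoadBalancer:
    """Sketch of option 3: fan every client input packet out to all
    cluster nodes, forward the canonical (top) node's output to clients."""

    def __init__(self, nodes):
        self.nodes = list(nodes)  # index 0 is the canonical "top" node

    def handle_input(self, packet):
        # Every node simulates the same input at (nearly) the same time...
        outputs = [node(packet) for node in self.nodes]
        # ...but only the canonical node's sync reaches the clients.
        return outputs[0]

    def fail_over(self):
        # Drop the dead top node; the next node in line becomes canonical.
        self.nodes.pop(0)


# Toy nodes: each tags its sync with its own name.
def node_a(pkt):
    return ("srv-a", pkt)


def node_b(pkt):
    return ("srv-b", pkt)


lb = LoadBalancer([node_a, node_b])
print(lb.handle_input("turn-left"))   # ('srv-a', 'turn-left')
lb.fail_over()
print(lb.handle_input("turn-right"))  # ('srv-b', 'turn-right')
```

Because every node has been simulating all along, the post-failover output is available immediately; the only gap is however long it takes to notice the top node died.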

With any of these options, you still need a load-balancing process that is aware of the game internals to some extent, because there's work you want to offload to it: chat, for example, and authentication as well. When mod powers are used by a server moderator, you'd want the load balancer to schedule a time for each node to implement the changes, for minimal disruption to the game. The load balancer needs to know about teams, and it'll manage the connections with the clients. Then, on top of all of this, you could even write a cloud controller for the whole thing, so that you could launch new server instances inside virtual machines to handle more clients.

It seems like all of this could be done maybe 90% in the network layer itself. I asked Z-man in chat one time how easy it would be to rip out the network layer and make it a standalone library, and he said it would be pretty easy, so bridging to a load balancer should be fairly straightforward. I'd be interested in seeing the network layer adapted to use shared memory for passing messages between processes on the same machine, so that the entire stack can run on one really powerful machine and reduce latency by using shared memory instead of the network. A basic implementation would simply sit between protobuf and the socket interaction and add a shared memory backend to replace the sockets.
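To make the shared-memory backend idea concrete, here's a minimal sketch (in Python rather than Arma's C++, and with a made-up segment name and message format): one process writes a length-prefixed message into a named shared-memory segment, and another attaches to the same segment by name and reads it back, with no sockets involved.

```python
import struct
from multiprocessing import shared_memory


def write_message(shm, payload: bytes):
    """Length-prefixed write, so the reader knows how much to consume."""
    struct.pack_into("I", shm.buf, 0, len(payload))
    shm.buf[4:4 + len(payload)] = payload


def read_message(shm) -> bytes:
    (length,) = struct.unpack_from("I", shm.buf, 0)
    return bytes(shm.buf[4:4 + length])


# One process creates the segment (name is arbitrary; fails if it exists)...
shm = shared_memory.SharedMemory(create=True, size=4096, name="arma_bus")
write_message(shm, b"sync:cycle 7 pos 12.5 3.0")

# ...and another process attaches to it by name and reads the message.
peer = shared_memory.SharedMemory(name="arma_bus")
print(read_message(peer))  # b'sync:cycle 7 pos 12.5 3.0'

peer.close()
shm.close()
shm.unlink()  # creator frees the segment when done
```

A real backend would of course need a ring buffer and some synchronization on top of this, but the point stands: same-machine message passing can skip the kernel's network stack entirely.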

If someone were to do this, I'd suggest going with option 1 and then later adapting to option 3, and not worry about using virtual machines and instead worry about launching new server processes on the same machine. I'd also suggest implementing the over-the-wire sharing between servers at the load balancer level rather than the game server level so that each node could run any arbitrary number of game server processes. And then, ultimately, the load balancer becomes the service that you run, even when you want to run only one game server. It would be a big change, but the more I think about it, the more I think it's probably a lot more doable than it appears at first glance.


aP|Nelg
Match Winner
Posts: 557
Joined: Wed Oct 22, 2014 10:22 pm

Re: Multiple servers - high availability

Post by aP|Nelg »

jonaswm wrote:(By google translator)
Some of these posts might end up being gibberish :P

Anyway, can you be more specific about what you mean by "high availability"? It's hard to tell what Google Translate has done.

delinquent
Match Winner
Posts: 660
Joined: Sat Jul 07, 2012 3:07 am

Re: Multiple servers - high availability

Post by delinquent »

Lucifer wrote:So you're suggesting ...
Your option 3 is pretty much what I was thinking, provided one game server is given priority, technically becoming a "master game server". Additional latency would just be handled by each individual game server, so if you were watching a secondary game server you'd think there was quite a bit of latency; however, should that game server become the priority server, the lag would disappear. I've yet to consider how ping rubber and ping charity would be handled, though.

Obviously, the controller would need to be aware of some of the game's mechanics, but building this as a standalone library seems relatively doable, in which case the controller would be easily portable to other games. The legwork required to translate each game's mechanics into common terms might be a bit of a long project, though.

I do think it's possible, though, and with relative ease too. If I had any real skill in C++ I would think about contributing... alas I'm a .NET bastard myself.
