Rollback Prediction Management

33 comments, last by NetworkDev19 3 years, 6 months ago

I have a simulation that ticks at 60 Hz, and I expect devices that may only update at 30 fps to be able to play too.

I've struggled for a while with the performance of rollbacks for prediction in this setup. Since I've been fighting the engine and low-level code with micro-optimizations, I figured maybe I need to re-evaluate the higher-level problem: the rollbacks themselves.

Does anyone have a particular resource that describes or demonstrates their rollback and reprediction technique for reference?

The way mine is set up, every frame runs some number of ticks at a fixed delta time to sample and predict the player's input, which gets sent at the end of the frame. So every frame at 30 fps has 2 prediction ticks. I also do a “fractional” prediction tick, an additional tick that uses the leftover frame time to smooth out the prediction. This fractional prediction tick always gets rolled back every frame; the others do not.
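A minimal sketch of the fixed-timestep-plus-fractional-tick scheme described above (the names `TickPlan` and `PlanTicks` are hypothetical, not from any engine):

```cpp
// Hypothetical fixed-timestep accumulator: each render frame consumes
// whole 1/60 s sim ticks, then the leftover time drives one throwaway
// "fractional" smoothing tick that is always rolled back next frame.
struct TickPlan {
    int fullTicks;      // committed prediction ticks this frame
    double fractional;  // leftover time for the smoothing tick, in seconds
};

TickPlan PlanTicks(double& accumulator, double frameDt, double tickDt) {
    accumulator += frameDt;
    TickPlan plan{0, 0.0};
    while (accumulator >= tickDt) {     // run as many whole ticks as fit
        accumulator -= tickDt;
        ++plan.fullTicks;
    }
    plan.fractional = accumulator;      // remainder feeds the partial tick
    return plan;
}
```

At a 33 ms frame time on a 60 Hz sim this yields the 2 full prediction ticks per frame the post describes, plus a small fractional remainder.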

The others get rolled back when the game mispredicts OR the client's sim falls behind (dropped packets or a device freeze/stall). If the hash of the predicted world does not match the hash of the server world, the client must roll back to the server world state. Then all pending input from that server state up to the last prediction tick is resimulated. On a 30 fps device running a 60 Hz sim with 40 ms RTT (assume a clean 20 ms one way), I have something like 20-40 inputs not yet confirmed, so I have to resimulate that many ticks in a single frame.
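The rollback path being described might look like this sketch, with stand-in `World`/`Input` types (the real state and step function are of course far bigger):

```cpp
#include <deque>

// Minimal stand-in types to show the shape of the loop.
struct Input { double move; };
struct World {
    double x = 0.0;
    void Step(const Input& in, double dt) { x += in.move * dt; }
};

// Hypothetical rollback: on a hash mismatch (or falling behind), snap
// back to the last authoritative server state, then re-run every input
// tick the server has not yet acknowledged -- 20-40 ticks in the worst
// case described above, all inside one render frame.
void RollbackAndResimulate(World& world, const World& serverState,
                           const std::deque<Input>& pendingInputs,
                           double tickDt) {
    world = serverState;                  // discard mispredicted state
    for (const Input& in : pendingInputs)
        world.Step(in, tickDt);           // one full sim tick per input
}
```

That inner loop running 20-40 full sim ticks in a single frame is exactly the cost being complained about.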

This is a killer for mobile devices in particular. Rollbacks destroy their performance. And often it's not even a misprediction, it's just lag: the client has dropped packets or a frame stalls, and then it has to roll back because it's behind the sim.

Trying to think of ways to soften this! Maybe there's some method of checking the input values, seeing if they haven't changed, and merging/skipping certain parts of the resimulation during a rollback?
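One way to sketch that merging idea: collapse runs of identical inputs into (input, tick count) pairs so the resim loop can special-case them. The caveat in the comment is important, and `CoalesceInputs` is just an illustrative name:

```cpp
#include <utility>
#include <vector>

struct Input {
    double move;
    bool operator==(const Input& o) const { return move == o.move; }
};

// Sketch of the "merge identical inputs" idea: turn a pending-input list
// into runs of equal inputs. A resim loop could then step a run once
// with count * dt -- but ONLY for parts of the sim that are linear in dt
// (no per-tick collision checks, cooldown timers, etc.), which is why
// this is a partial optimization at best.
std::vector<std::pair<Input, int>> CoalesceInputs(const std::vector<Input>& inputs) {
    std::vector<std::pair<Input, int>> runs;
    for (const Input& in : inputs) {
        if (!runs.empty() && runs.back().first == in)
            ++runs.back().second;       // extend the current run
        else
            runs.push_back({in, 1});    // start a new run
    }
    return runs;
}
```

Since players rarely change movement input every single tick, long runs are common, so the savings could be real where the linearity assumption holds.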


I'm guessing you're doing something like a fighting game, where frame data is based on a 60hz sim? If not, a first step would be to drop the sim to 30.

What all are you having to replay, state-wise? Are you in an engine where you're stuck recalculating a complex physics sim? Have you instrumented your rollback/replay code and figured out what's taking the most time?

Net_ said:

I'm guessing you're doing something like a fighting game, where frame data is based on a 60hz sim? If not, a first step would be to drop the sim to 30.

What all are you having to replay, state-wise? Are you in an engine where you're stuck recalculating a complex physics sim? Have you instrumented your rollback/replay code and figured out what's taking the most time?

It's an FPS.

Mostly movement, shooting, abilities.

Right now it's just expensive to run multiple update ticks in a frame. Even if there's no input, just executing so much logic multiple times a frame is heavy on device.

Does anyone have a particular resource that describes or demonstrates their rollback and reprediction technique for reference?

have u seen this SDK: https://www.ggpo.net/

he's got an open-source C++ GitHub project, might be worth having a look…

That's it… all the best!


max5 said:

60hz , i dont think so..)

Sorry I don't think I understand

max5 was a spammer.

Even if there's no input, just executing so much logic multiple times a frame is heavy on device.

Why? What are you doing that is expensive in your simulation? Where does the profiler say you're spending your CPU cycles (and thus your battery)?

enum Bool { True, False, FileNotFound };


hplus0603 said:

max5 was a spammer.

Even if there's no input, just executing so much logic multiple times a frame is heavy on device.

Why? What are you doing that is expensive in your simulation? Where does the profiler say you're spending your CPU cycles (and thus your battery)?

It's mobile, plus it uses some experimental tech/engine that isn't meant for operations like this, unfortunately. My devices end up running at less than 30 fps even with optimizations, and rendering is already as cheap as it can go. The prediction ticks just take up so much frame time, and a misprediction and rollback leads to 16-32 ticks of reprediction. It destroys the frame and subsequent frames because the device falls behind.

I feel the best next step until I can get over those hurdles is to reduce the tick rate to 30 Hz for now and push for 60 when the game goes public.

Right now, I have a hack in which clients simulate only 1 prediction tick regardless of performance. Obviously this isn't good for prediction: movement input with a frame time of 32 ms can/will differ from movement with a frame time of 16 ms x 2. It's fine for testing and demos for now, but it won't hold up under scrutiny. My slowest phone still ends up caught in a lot of rollbacks, but the performance is much better than doing multiple ticks.

I've also been raising this with the folks who make the engine and that experimental tech to see what we can do, because when I don't use that engine I can get much better times, thankfully.

ddlox said:

Does anyone have a particular resource that describes or demonstrates their rollback and reprediction technique for reference?

have u seen this SDK: https://www.ggpo.net/

he's got an open-source C++ GitHub project, might be worth having a look…

That's it… all the best!

I've tried looking into GGPO several times and didn't see it fitting my shooting game, but I gave it another shot. I guess the implementation is a bit fuzzy to me because of my high-level understanding of how it works. I kind of understand it, but I don't understand it in contexts outside of Street Fighter or, say, Mortal Kombat.

It predicts the input from other players, but a lot of games wouldn't want that, mine included. In something like a shooter, the local player doesn't care about other players' inputs, only about the results of their own inputs (i.e., the resulting world state). Other players are just “ghosts” to the local player.

Because my game isn't p2p, the server has lag compensation and such to help the local player's inputs against the ghost remote players. GGPO instead does something like syncing the players' input delays to the best compromise for an optimal experience. I don't think that really translates over? Not sure. I wouldn't want one person's bad connection to have a direct impact on the experience of the other players in a game of 5 up to 100+ people.

However, the rollback itself (whether between 2 p2p players or 1 player connected to a server) seems to be the same idea: if the predicted state mismatches, roll back and repredict the inputs. Here's the rub I'm not sure I follow:

When you repredict after a rollback, do you just roll back and let the simulation run forward as normal? Or do you resimulate multiple ticks of pending input in a single frame?

GGPO, according to an image on their site, does the former. The rollback occurs when verification fails, but that's only on frame 1. Frame 2 then predicts with the inputs for the later frames (2, 3, 4).

In the latter (resimulate in the same frame), which is what my model uses, we loop in frame 1 until all pending inputs sent to the server that have not been ack'd are simmed, before continuing to frame 2. That's where my performance problem lies. Maybe I should try not blocking in frame 1, just let the simulation continue as normal, and see how it behaves?
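That "don't block in frame 1" variant could be sketched as a budgeted resimulation: after a rollback, drain the pending-input queue a few ticks per frame instead of all at once. `ResimulateBudgeted` and the stand-in types are hypothetical; the trade-off is that the client renders slightly behind its prediction target until the queue empties.

```cpp
#include <deque>

struct Input { double move; };
struct World {
    double x = 0.0;
    void Step(const Input& in, double dt) { x += in.move * dt; }
};

// Amortized rollback recovery: instead of resimulating 20-40 pending
// ticks in one frame, spend at most maxTicksPerFrame on catch-up each
// frame and return how many were done. The caller keeps calling this on
// subsequent frames until the queue is empty.
int ResimulateBudgeted(World& world, std::deque<Input>& pending,
                       double tickDt, int maxTicksPerFrame) {
    int done = 0;
    while (!pending.empty() && done < maxTicksPerFrame) {
        world.Step(pending.front(), tickDt);
        pending.pop_front();
        ++done;
    }
    return done;
}
```

This caps the worst-case frame cost at a constant, at the price of the prediction lagging the newest local input for a few frames after a rollback.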

In my experience, you only need to rollback if there is a /mis/ prediction, i.e. you know that you've done something locally that cannot be correct because the assumptions you based it on were false. Normally this is a rare event. The fact that you seem to be doing one every single frame implies… well, it's a weird setup. My fix for that would be “don't do that”.

As for rollbacks when you get lag, that's a bit odd. The point of the ‘predicted’ input is that the system is accepting that lag exists. If nothing else interacts with the object being predicted, then you should be able to cope with any amount of lag - 100ms, 1 second, 1 hour, anything. It's only when the simulations diverge - usually due to interaction caused by some other object - that you need to rollback and replay. And that should happen rarely.

But does it only happen rarely? The implication here is that it happens a lot when you have lag, which implies something else is wrong. This is where relying on comparing state hashes is also a bit of a red flag. That sounds like a useful optimisation when the system is working perfectly, but at this stage it's an unnecessary barrier stopping you from easily seeing why the simulations have diverged. Maybe you just have a bug that is causing more rollbacks than necessary?
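One concrete way to act on that advice in a debug build: replace the single opaque hash with a field-by-field comparison, so a divergence reports *what* differs rather than just *that* something differs. The `PlayerState` slice and `FindDivergence` are illustrative only:

```cpp
#include <cmath>
#include <cstdio>

// Illustrative slice of predicted player state; the real one is bigger.
struct PlayerState { double x; double y; int ammo; };

// Debug-build alternative to a state hash: compare fields individually
// and log the first mismatch, making the *cause* of a rollback visible
// (position drift vs. gameplay state) instead of a bare hash failure.
bool FindDivergence(const PlayerState& predicted, const PlayerState& server) {
    if (std::fabs(predicted.x - server.x) > 1e-9 ||
        std::fabs(predicted.y - server.y) > 1e-9) {
        std::printf("position diverged: (%f, %f) vs (%f, %f)\n",
                    predicted.x, predicted.y, server.x, server.y);
        return true;
    }
    if (predicted.ammo != server.ammo) {
        std::printf("ammo diverged: %d vs %d\n", predicted.ammo, server.ammo);
        return true;
    }
    return false;
}
```

If the log shows, say, position drifting by a tiny epsilon every frame, that points at a determinism bug causing far more rollbacks than genuine mispredictions would.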

This topic is closed to new replies.
