Finally, some candid information from Gameservers. I wanted to pass this along to those who might still be interested in the status of this issue.
These are quotes from Gameservers' Zachary Williams ([gs] leviathan) on the Black Ops server stability problem.
- Quote:
-
Hello,
To reply to your question, here is what the problem is, the triggers (as we know them to date) and the work that has been done.
1: The Problem:
The core issue is in the physics engine of the game. The crash happens during some physics calculation. We have confirmed that the crash comes from within the game binary, and it is very consistent as to the specific module that it is hitting (physics).
2: The Trigger:
This issue seems to require multiple things to happen to trigger. One of the triggers is RCON use. We know this because of the GameTracker scanning issue that happened. We increased scanning frequency, and crashes went through the roof. We stopped that quickly, and now use a much more passive method of acquiring data (similar to what the server browser uses). Some other third-party companies have tried to scan us regularly, and those have been blocked to ensure as much stability as possible. This is also why we have made the suggestion to limit RCON use, as it may affect stability.
3: What has been done so far?
Every crash on every server we have is logged, analyzed, and recorded in a database. Treyarch has direct access to this information to assist them in finding and repairing the problem. This bug is elusive, and nearly impossible to replicate on demand (we have tried a lot to force a server to crash, but have not been successful). This is important because, if we can't replicate it on demand, 'testing' of new code becomes virtually impossible. New code requires client updates as well as server updates, so the only way to know for sure is to release the patch with the fixes that have been made.
This is not to say Treyarch has not tried to verify the fixes! They have test servers up, they populate them, and they keep them running for days while troubleshooting. But, as we can't replicate crashes on demand, their tests can show no problems, but when it gets out into the real world... then we see the results.
So, that's the real deal. The short of it is that we at least know where the crash is happening, we know its triggers, and have created a complete crash database and tracking system to assist in finding the solution. Treyarch is working continuously to try to repair this (and other issues in the game). For example, there was a crash exploit that could have allowed anyone to crash a server on demand (different thing than the crash bug). That was fixed in the latest release before it got into the wild, which is a very, very good thing.
If you have other questions, I will monitor this thread to try to answer them. Just note, that I do not speak directly to Treyarch developers, so I can primarily only answer questions that would be related to Gameservers.
- Quote:
-
On demand replication is really critical to bug solving. When I say can't replicate, I mean, we throw GT requests at a box, repeatedly, for an hour or more, and can't get it to crash.
Trying to solve a problem requires more instant ****. We know we can have it crash here or there, but if we can duplicate the exact conditions and 'make' it crash, now there is a way to really narrow in on the exact piece of code that is the culprit.
- Quote:
-
To address some speculation.
1: The Rcon tool has been shown to cause an 'increase' in crashes. Please don't mistake 'not using the tool' for 'no crashes'. Rcon use in general, be it the tool or scans, is a trigger. We know this, we have verified this.
2: We have tried spamming rcon packets. We have spent a lot of time trying to replicate this. We know full servers have a higher probability of crashing. We have taken the servers that crash most often (by our crash log database), and spammed them to no end. We have not been able to make them crash on demand.
What this means, most likely, is that rcon is not the sole cause. This means that something within the game must be present in a specific state before the crash happens. It probably ALSO means that it requires both game state 'and' some rcon command.
3: We are not saying you have to come to terms, and that it will never be fixed. That is very unlikely. What 'is' likely, is that it will take longer to fix.
I have done my absolute best to be as candid as possible. I'm not saying anyone should be happy with the information I've provided. We're not happy. However, we know that we are doing everything we can do. We're still working on reproducing the crash on demand. We are working on behalf of every server renter here to help get this solved.
Thank you for taking the time to read this.
Zachary Williams, Gameservers.com
- Quote:
-
There are no other ways we are aware of to lower the rate of crashes, other than keeping rcon use to a minimum. Unfortunately, that is all that can be done currently.