Server crashing randomly due to memory corruption

Information:
FXServer Artifacts: 6040
OneSync: Infinity
System: Windows Server 2022
Client: Production & Beta

How it all started:
Hey, my server is crashing randomly with a “network thread hitch”. The strange thing is that we didn’t change anything on our system. We just updated our artifacts on 11.11.22 and got the first crashes on 16.11.22. We even tried going back to the older artifact versions we had before. Our server has run without any issues for years, and we haven’t changed anything in the last few months. It even happens when the server is not that full: we had sessions with 180 players without any problems, and then it crashed with just 40 players on it. At first I thought it was some sort of attack, so we checked our DDoS protection, and there were no attacks. I reported my issue on the CFX Discord server and found a lot of other server owners with the exact same issue starting in the same timeframe. All of them had updated to a newer artifact and tried going back to an older one. So the only thing we all did before it started was an artifact update.

Information about the crash
The crash starts randomly, without any abnormalities in the server log beforehand, so there are no other network hitch warnings leading up to it. This is what it looks like:

[ citizen-server-impl] network thread hitch warning: timer interval of 231 milliseconds
[ citizen-server-impl] server thread hitch warning: timer interval of 247 milliseconds
[ citizen-server-impl] network thread hitch warning: timer interval of 2244 milliseconds
[ citizen-server-impl] server thread hitch warning: timer interval of 2282 milliseconds
txaEvent “serverShuttingDown” “{"delay":5000,"author":"txAdmin","message":"Server Neustart (Crash erkannt)."}”

The hitches climb to around 5000 ms, and then the server crashes with one of these exit codes:

FXServer Closed. (code 4294967294)
FXServer Closed. (code 3221226356)
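
For anyone reading along, these decimal codes are easier to interpret in hex. Below is a minimal sketch, assuming the numbers are the unsigned 32-bit Windows process exit codes. The helper name `describe_exit_code` and the small lookup table are my own; the table lists only the two NTSTATUS values relevant to this thread.

```python
# Known NTSTATUS values (from the Windows SDK headers); deliberately incomplete.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}

def describe_exit_code(code):
    """Show a decimal exit code as hex, as a signed 32-bit value, and by name."""
    unsigned = code & 0xFFFFFFFF
    # Reinterpret as signed 32-bit, which is how some tools print the same value.
    signed = unsigned - 0x100000000 if unsigned >= 0x80000000 else unsigned
    return {
        "hex": f"0x{unsigned:08X}",
        "signed": signed,
        "name": KNOWN_NTSTATUS.get(unsigned, "unknown"),
    }

for code in (4294967294, 3221226356):
    print(code, describe_exit_code(code))
```

Read this way, 3221226356 is 0xC0000374 (STATUS_HEAP_CORRUPTION), and 4294967294 is 0xFFFFFFFE, i.e. -2 as a signed value. The second is not an NTSTATUS code I recognize, so the sketch leaves it as "unknown".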

After a txAdmin restart it often crashes again within 2-5 minutes. After that, the server runs fine again for about 20 hours. Most of the time it happens between 15:00 and 20:00.
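
If you want to compare crashes across servers, the hitch warnings can be pulled out of a saved log automatically. This is only a rough sketch: the regex is based on the log lines quoted above, and the function name `parse_hitches` and the 1000 ms cutoff are my own choices, not anything from FXServer or txAdmin.

```python
import re

# Matches lines such as:
# [ citizen-server-impl] network thread hitch warning: timer interval of 231 milliseconds
HITCH_RE = re.compile(
    r"(?P<thread>network|server) thread hitch warning: "
    r"timer interval of (?P<ms>\d+) milliseconds"
)

def parse_hitches(lines):
    """Return a list of (thread, milliseconds) for every hitch warning found."""
    hitches = []
    for line in lines:
        match = HITCH_RE.search(line)
        if match:
            hitches.append((match.group("thread"), int(match.group("ms"))))
    return hitches

# The four warnings from the log excerpt in this post.
SAMPLE_LOG = [
    "[ citizen-server-impl] network thread hitch warning: timer interval of 231 milliseconds",
    "[ citizen-server-impl] server thread hitch warning: timer interval of 247 milliseconds",
    "[ citizen-server-impl] network thread hitch warning: timer interval of 2244 milliseconds",
    "[ citizen-server-impl] server thread hitch warning: timer interval of 2282 milliseconds",
]

hitches = parse_hitches(SAMPLE_LOG)
# Arbitrary cutoff (an assumption) to separate small hitches from severe ones.
severe = [(thread, ms) for thread, ms in hitches if ms >= 1000]
print(severe)
```

Running the same function over a full log file (one line per list entry) shows how quickly the intervals escalate before each crash.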

Analyzing the crash:
We checked the dmp files and found the following:

FAILURE_BUCKET_ID: INVALID_POINTER_READ_c0000005_citizen-scripting-lua.dll!Unknown

PROCESS_NAME: FXServer.exe
READ_ADDRESS: 000000000030303b
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.

So I saw that it has something to do with memory. I had a conversation with nta on Discord and sent him my full dmp. He said it’s some sort of generic memory corruption, and likely some attack indeed, given that one of the crashes is in tcpserver, but that it’s impossible to tell without having access to the attack method ‘on demand’ to run tests with.
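
One small observation on the READ_ADDRESS above: 000000000030303b is far too low to be a normal heap pointer, and its non-zero bytes are all printable ASCII. The sketch below is just a heuristic check (the helper name `address_as_ascii` is made up for this post). It reads the address as little-endian bytes and returns the text if every non-zero byte is printable.

```python
def address_as_ascii(address_hex):
    """Return the address bytes as ASCII text if they are all printable, else None.

    The address is interpreted as a little-endian 64-bit value, which is how a
    pointer is laid out in memory on x64 Windows.
    """
    value = int(address_hex, 16)
    raw = value.to_bytes(8, "little")
    stripped = raw.rstrip(b"\x00")  # drop the zero padding in the high bytes
    if stripped and all(0x20 <= byte < 0x7F for byte in stripped):
        return stripped.decode("ascii")
    return None

print(address_as_ascii("000000000030303b"))
```

For this dump the bytes decode to `;00`, which would fit string data having been written over a pointer. With only three bytes this could also be coincidence, so treat it as a hint that matches the "generic memory corruption" assessment, not as proof of anything.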

Here is a dmp file if you want to check it. Let me know if you need a full dmp file. I can send it via PM.
f275019d-94ee-4887-9710-8c4137a63995.dmp (3.5 MB)

Things I checked for abnormalities:

  • I checked the server resmon for any memory leaks
  • Checked network event traffic with neteventlog
  • Downgraded from 6040 to recommended artifacts
  • I did some server network monitoring


Please help us, it’s not only one server owner struggling with these errors…

Without info or access to a server experiencing these issues it’s a bit difficult (even impossible) to ‘help’.

I can provide you with everything: access, data, anything. Just tell me what you need and how we can solve this together. I can pay for help.
This is really a headache and we are losing players… And it’s not only the two of us; we know of at least 5 top servers that have had this problem since November.

Hey, I can give you access to my server that is experiencing this problem. I can also provide you with more data via PM (if you allow me).

OK, I have some new information about the crash: we blocked all users from joining, so the server was completely empty, and it still crashed randomly. So we can say it’s not a modder with a server crasher.

Outside of the choice of terminology (‘modder’? really? also, ‘server crasher’), there’s no requirement for a ‘server crash’ exploit to have the attacker be ‘joined’.

The only explanation here, again, is a novel exploit leading to memory corruption, especially since it ‘happens even on older server versions’ and started at a point in time where no changes were made, but also only happens for a specific set of server owners, and similarly, this matches past such race condition/use-after-free-based exploits.

The only explanation here, again, is a novel exploit leading to memory corruption

Is there any way to track an exploit like this?

We had 2 days without crashes, but today, after the scheduled txAdmin restart at 16:00, we had 4 crashes in a row.