Random heap corruption (regression?) on servers

So, after 5 days of testing and almost 6 months of constant crashes all day long, I think we can say we have a victory.
D-bubble - your 6318 fixed our crash problems. We have no issues at all on this build. 0 crashes.
I want to thank you from the bottom of my heart, and I want to apologize for my not-so-nice message on the older thread. I hope you are not mad at me; I'm very sorry.

Also, I want to thank Mr. Tabbara :slight_smile: Your constant help, fast responses, and willingness are out of this world. Thank you for everything you did for this very hard-to-resolve problem.

I hope this is the end of the story for these crashes, and it looks like it is :blush:


No crashes so far with 6318. It looks like you fixed it. Finally :slight_smile:

Finally, I also don't crash anymore with artifact 6318, thanks

Hello,

I've been experiencing these crashes for a month now. I always update to the newest artifact release, but nothing seems to solve this type of crash.

I've stopped the scripts I recently started, but it's still the same.

Do you have any tips for checking what is going wrong and crashing my server, whether it has 20 players or 80?

Turns out this fix might not have actually fixed anything… :sweat: Ever since this fix, HTTP/2 responses are just entirely broken, a state nobody noticed for a few months, but stuff like 'the connection was closed when deferring' and some other screwery has been happening since, as HTTP/2 just doesn't do anything anymore.

I've pushed a commit (fix(net/http-server): empty HTTP/2 responses · citizenfx/fivem@5d6322b · GitHub) changing this broken part back. Hopefully the other fixes still work there, but it'd be appreciated if you folks that had this issue back 'then' could test server version 6540 to see if it's still fixed there.

At least the server didn't crash anymore. And that was a huge fix for us :slight_smile: I will get back to you if the server starts crashing again or something like that.

… on builds that have the change (like 6540) of course, that is; otherwise it's a bit of a useless test :stuck_out_tongue:

I still have the server, but it's not running anymore. I could let it run with nobody in it, but I assume the issue won't happen, as it never happened with no players in it.


Yeah, that'd not really help. :frowning: A shame, heh, but maybe there's others left. :smiley:


For the past 2 weeks or so (we are currently running 6540), we have been having an issue almost exactly the same as the one these folks are having: sync and network thread hitches, followed by losing 90% of our players (the count is typically around 65-110, and it drops all the way down to 10-20 players). The server never "crashes" completely. I have been following almost every thread I can find on the forums. I have hyper-optimized our code, used neteventlog on the client to see what happens when the crashes occur, and captured profiler recordings from the console DURING the network thread hitches, and there is literally nothing hitching in any resource that I can see. I'm not sure if this is related to our issue; maybe you can help out? I know I don't really have any actual data or .dmp files to submit here, but I am a proficient coder and I have gone through everything with a fine-tooth comb, and I am coming up empty.

When I did catch the hitches happening on the client neteventlog, I did not have any massive tables coming over or anything crazy like that. The events just slowed down and completely stopped coming in within about 10-15 seconds, followed by all players timing out.

If this happened before 6540 as well, or the server doesn't crash, it's not the same issue as described here at all, and you should rather make a separate topic, ideally after you've managed to capture a .etl file on the server.

(it's likely one of a few common object recreation spam issues though, and those really need reproduction steps)

Last question, I don't want to take this too far off topic, but can you explain what you mean by object recreation? I figured that running an ETL capture wouldn't really help, since it was network thread hitches. I did, however, capture a Wireshark trace during the hitches, but again, I'll make my own topic once I have the proper stuff.

Does this apply to versions above 6540, or only that one?
We were running 6541 or 6542 (not sure which now) for about a week, and no issues like before occurred.

It wasn't reverted yet, so 6541/6542 would also be valid for this.

In that case, we haven't had any crashes for a LONG time, nor has the server had any issues.

Hey, just a little update: we are running on 6540 and up. Still no crashes so far. I even asked everyone who had the crashes before. So it looks like it's still fixed for us :slight_smile:

I just had a crash happen that is almost exactly the same as what the other folks here described. I noticed that one of my resources (sonorancad) was using 600MB of memory, which is odd; usually it uses a max of 30-40MB. I noticed this high usage in the svgui resource monitor, and I was getting frequent sync thread hitches, so I tried stopping the resource, and the server instantly crashed on me. Here's a video; I recorded it live as it happened. It generated a crash dump of about 5.5GB, which I debugged.


https://cdn.discordapp.com/attachments/1100261422216249364/1145569455817109654/2023-08-27_21-57-42_-_Trim.mp4

This issue was fixed; locking to prevent further bumps.