Thank you for the answer. I can provide anything. I write all of my resources myself and don’t use any frameworks, i.e. I have a 100% overview of the code. So if you point me in the right direction, I can figure out the problem. If the problem is in ‘GetClientData’, which server natives use it? I can do a code review of any suspect server native.
PS: From what you found out, is it related to ‘GetClientData’?
Looking at the dump, svMain looks to be executing GET_PLAYER_ROUTING_BUCKET.
Given that no registers/stack values look zeroed, I wonder if multiple threads are racing to create the initial m_syncData (given that the copy assignment operator of shared_ptr is not atomic, this seems possible). For example, three potentially competing paths (see the sketch after this list):
svMain: ProcessServerFrame/GetEntityLockdownMode
svSync: Tick/UpdateWorldGrid
svNetwork: ProcessPacket/SendObjectIds
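To illustrate the suspected pattern, here is a minimal self-contained sketch (ClientData/GetSyncData are stand-in names, not the actual fxserver code) of a lazily-created shared_ptr that is unsafe when hit from several threads:

#include <memory>

struct SyncData
{
	// per-client sync state
};

struct ClientData
{
	std::shared_ptr<SyncData> m_syncData;

	SyncData* GetSyncData()
	{
		// Not thread-safe: if svMain and svSync both observe an empty
		// m_syncData, both construct a SyncData and both assign it. The
		// copy/move assignment of shared_ptr is not atomic, so a concurrent
		// reader can see a half-updated control block and dereference junk.
		if (!m_syncData)
		{
			m_syncData = std::make_shared<SyncData>();
		}

		return m_syncData.get();
	}
};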
And comparing Windows/Linux artifacts, the Windows implementation looks way more durable and less likely to crash outright.
Perhaps. I don’t really like the lazy creation here at all; it’s weird and difficult/impossible to clean up properly, so I’ve moved it to be handled by OnClientCreated. The idea is roughly sketched below.
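A rough, self-contained sketch of that idea (the names are stand-ins; this is not the actual commit): create the sync data once on the single client-creation path, rather than lazily from whichever thread asks first.

#include <memory>

struct SyncData {};

struct Client
{
	std::shared_ptr<SyncData> m_syncData;
};

// Stand-in for the OnClientCreated path: sync data is created exactly once,
// before the client becomes visible to svMain/svSync/svNetwork, so no
// thread can race on first access.
Client MakeClient()
{
	Client client;
	client.m_syncData = std::make_shared<SyncData>();
	return client;
}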
There are some other fx::Client lifetime concerns elsewhere too (such as @AvarianKnight’s post), but I thought those were related to freeing and should have been mitigated by one of the recent changes in the ClientSharedPtr stuff.
Thanks for your advice, I appreciate it. In all resources, I put a print("…") before the native ‘GET_PLAYER_ROUTING_BUCKET’ to locate in which part of the code the crash occurs. I’ll get back to you as soon as I find out more details.
Update your server artifacts to 6419 to see if today’s changes have helped your case.
Also, that linked report is using 5848. That’s a bit dated, so given the recent changes it isn’t worthwhile to check it for anything new or actionable.
[ script:mychat] [13:17:41] JohnHonza(5): j
[ script:mychat] [13:17:43] JohnHonza(5): taky
[ script:acc] playerDropped 30 Server->client connection timed out. Last seen 28 msec ago.
[ script:acc] JohnHonza
[ citizen-server-impl] network thread hitch warning: timer interval of 158 milliseconds
=================================================================
FXServer crashed.
A dump can be found at /alpine/opt/cfx-server/crashes/3494acd5-8bc3-4ac8-8a63209a-af479696.dmp.
Crash report ID: d5eec8a1-a6b9-4b19-b29d-1a6443a9b8e0
=================================================================
And here is the code that printed this message before the crash:
AddEventHandler("playerDropped",function(reason)
local id = source
if serverids[id] then
playerids[serverids[id]] = nil
end
serverids[id] = nil
if acc[id] then
print("playerDropped",id,reason)
print(GetPlayerName(id))
local pos = GetPlayerRoutingBucket(id) == 0 and GetEntityCoords(GetPlayerPed(id)) or vector3(0.0,0.0,0.0)
MySQL.Async.execute("UPDATE user SET disc = CURRENT_TIMESTAMP, x = @x, y = @y, z = @z WHERE nick = @nick",
{
["@x"] = pos.x,
["@y"] = pos.y,
["@z"] = pos.z,
["@nick"] = GetPlayerName(id)
},
function()
end)
end
LogPlayer(id,"disconnect",GetPlayerEndpoint(id).." "..reason)
acc[id] = nil
SharedData[id] = nil
TriggerClientEvent("fivem:OnPlayerDisconnect",-1,GetPlayerName(id),id)
print(os.date("[%X] ").."Disconnect: "..GetPlayerName(id).." Reason: "..reason)
end)
Right, in your case this might be something specifically involving calling a function that depends on ‘sync data’ (like getting routing buckets) from the playerDropped event.
It turns out both crashes might be related to a client being dropped twice: the other dump is after ServerGameState::HandleClientDrop gets called while the client’s ‘sync data’ pointer is already null.
The other dump might instead be a case of a regression from the earlier fix, where dropping a deferral twice would now crash.
I don’t think I’m going to be able to actually look into or successfully fix this any further, sorry; I’ve no idea what is even going on at all.
[ script:acc] [10:43:49] Disconnect: _K_r_p_a_t_a_ Reason: Server->client connection timed out. Last seen 13 msec ago.
[ script:acc] playerDropped 128 Server->client connection timed out. Last seen 13 msec ago.
[ script:acc] _K_r_p_a_t_a_
[ script:acc] playerDropped2
[ script:acc] playerDropped3
[ script:acc] playerDropped4
[ script:acc] playerDropped5
[ citizen-server-impl] sync thread hitch warning: timer interval of 152 milliseconds
=================================================================
FXServer crashed.
A dump can be found at /30120/alpine/opt/cfx-server/crashes/f010b827-2819-451c-64ee9aa4-01872cda.dmp.
Crash report ID: 36631bd0-7358-40fe-88cb-3066226c0e01
=================================================================
And playerDropped:
AddEventHandler("playerDropped",function(reason)
local id = source
local name = GetPlayerName(id)
local ip = GetPlayerEndpoint(id)
print(os.date("[%X] ").."Disconnect: "..name.." Reason: "..reason)
if serverids[id] then
playerids[serverids[id]] = nil
end
serverids[id] = nil
if acc[id] then
print("playerDropped",id,reason)
print(name)
local ped = GetPlayerPed(id)
local pos = vector3(0.0,0.0,0.0)
if GetSharedData(id,"VW") == 0 and ped ~= 0 then
pos = GetEntityCoords(ped)
end
MySQL.Async.execute("UPDATE user SET disc = CURRENT_TIMESTAMP, x = @x, y = @y, z = @z WHERE nick = @nick",
{
["@x"] = pos.x,
["@y"] = pos.y,
["@z"] = pos.z,
["@nick"] = name
},
function()
end)
end
print("playerDropped2")
LogPlayer(id,"disconnect",ip.." "..reason)
print("playerDropped3")
acc[id] = nil
SharedData[id] = nil
print("playerDropped4")
TriggerClientEvent("fivem:OnPlayerDisconnect",-1,name,id)
print("playerDropped5")
end)
It is a bit late, so I don’t have this fully fleshed out…
Consider auth->RunAuthentication doing an HTTP request: its callback will be executed on another thread, causing done to be run on that thread. The contents of request->SetCancelHandler may then race with the subsequent execute_callback_on_main_thread in a few places, creating pathways for RemoveClient + client->OnDrop() to be called twice. A sketch of a guard against that is below.
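To make the double-drop concrete, here is a minimal sketch (DropGuard/DropOnce are assumed names, not actual fxserver code) of one way to make the drop path idempotent, so that racing callers can’t run it twice:

#include <atomic>
#include <utility>

struct DropGuard
{
	std::atomic<bool> m_dropped{ false };

	template<typename Fn>
	void DropOnce(Fn&& doDrop)
	{
		// exchange() flips the flag and returns the previous value
		// atomically: even if the HTTP-callback thread and the main thread
		// both reach this point, only one of them observes `false` and
		// actually performs the drop.
		if (!m_dropped.exchange(true))
		{
			std::forward<Fn>(doDrop)();
		}
	}
};

// usage (illustrative):
// guard.DropOnce([&] { RemoveClient(client); client->OnDrop(); });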
Side question: Should the m_keepAliveTimer + TimerEvent connection handle be disposed of after use? Seems a bit leaky.
Yeah, the temp client logic in deferrals looked extremely fishy, but I didn’t look into it too much. RemoveClient being invoked twice would make sense.
I also don’t really like the way RemoveClient calls are sprinkled around there so much, but refactoring it is more risky without the ability to verify changes in a realistic environment.
Right, as uvw::Handle objects don’t close on destruction, this probably is an oversight. No, huh, m_keepAliveTimer does get cleaned up in fx::ClientDeferral::~ClientDeferral.
I don’t think the dtor of ClientDeferral is ever being invoked:
The SetCardResponseHandler lambda captures self by value, and that creates a circular reference through the ClientDeferral shared_ptr (never allowing its use_count to hit zero). Making things weak is an easy fix; a sketch is below.
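For illustration, a self-contained sketch of the weak-capture fix (the struct shape here is a stand-in, not the actual fxserver ClientDeferral):

#include <functional>
#include <memory>
#include <string>

struct ClientDeferral : std::enable_shared_from_this<ClientDeferral>
{
	std::function<void(const std::string&)> m_cardResponseHandler;

	void SetupCardResponseHandler()
	{
		// Capturing a shared_ptr (`self`) by value would store an owning
		// reference inside a member of the same object: a cycle, so
		// use_count never reaches zero and ~ClientDeferral never runs.
		// A weak_ptr capture breaks the cycle:
		m_cardResponseHandler = [weakSelf = weak_from_this()](const std::string& response)
		{
			if (auto self = weakSelf.lock())
			{
				// deferral is still alive; safe to use `self` and `response`
			}
		};
	}
};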
Mega-unrelated side note: It would also be nice if this WriteColor could be wrapped in a g_allowVt check.