Try using something like NSSM (https://nssm.cc/) to run it decoupled from the actual interactive session.
Throwing this out there, but could this be caused by "SaveResourceFile"?
What makes you think this to be the case?
I have a resource that, once the server is started, creates 3 pretty big JSON files as a "backup", and it backs up once every 30 minutes.
As the server sometimes crashes right on startup, I figured this could be linked to the issue.
I also remember seeing an error that mentioned something about write access.
It might not be anywhere near my issue, but I'm still searching.
I just rewrote the whole thing to save this data with mysql to test it out.
Also, I'm not sure if you saw it, or maybe it wasn't conclusive, but I uploaded a crash dump that was logged by FXServer in my last post.
We don't use SaveResourceFile much. We handle everything through a MySQL database, but it still crashes sometimes.
@Plouffe Did you have any luck with this one? Did you fix the crashing?
I did pretty much everything, removed most of the StateBags, changed hosting, used older artifacts.
The only difference is that I now get:
[ citizen-server-impl] Server list query returned an error: System.Threading.Tasks.TaskCanceledException: A task was canceled. <- System.TimeoutException: A task was canceled. <- System.Threading.Tasks.TaskCanceledException: The request was canceled due to the configured HttpClient.Timeout of 30 seconds elapsing.
Unhandled Exception:
System.NullReferenceException: Object reference not set to an instance of an object
Unhandled exception in Mono script environment: System.NullReferenceException: Object reference not set to an instance of an object
(null)> txaEvent "serverShuttingDown" "{＂delay＂:5000,＂author＂:＂txAdmin＂,＂message＂:＂Server is shutting down: (Server stopped).＂}"
Also, @nta, does this line have any meaning to you?
Feb 10 18:44:13 5600x kernel: [97146.307251] traps: luv_tcp5[212245] general protection fault ip:7fd02ecf09aa sp:7fd008e7d360 error:0 in ld-musl-x86_64.so.1[7fd02ecdf000+4b000]
That is the message that appears in the kernel log right when the server crashes.
EDIT: Also, I noticed that the crash usually occurs RIGHT AFTER the heartbeat:
[ citizen-server-impl] Sending heartbeat to https://servers-ingress-live.fivem.net/ingress
[ citizen-server-impl] sync thread hitch warning: timer interval of 102 milliseconds
=================================================================
FXServer crashed.
A dump can be found at /root/FIVEM/MAIN/alpine/opt/cfx-server/crashes/7a3e30b6-6309-4a1e-12abd89a-da288941.dmp.
Crash report ID: bc54b898-55ee-417b-81d0-bfc57a5c0d20
=================================================================
> txaEvent "serverShuttingDown" "{＂delay＂:5000,＂author＂:＂txAdmin＂,＂message＂:＂Server se restartuje: (Server se zaustavio).＂}"
On our server, it happens when there are ± 190 players. People also say the server crashes when a blimp drops down or someone blows up the gas station.
Entirely unrelated to the crashes that have been discussed in this topic so far. If that is a thing you should provide dumps for that scenario separately.
Semi-related (but not really… was looking through the uses of HttpResponse).
Should ResourceHttpComponent be ending or closing the HttpResponse if there is no handler associated with a resource?
In theory, this would be fine once all references vanish… but this is indeed a little bit weird-looking.
There was another thread running about this in parallel; it has been closed and is redirected here now:
(yes, the other thread is technically older but this one had more recent activity and is the one that usually ends up found instead)
We found out that we are getting rate limited every time before the crashes. Also, I think it's related to the network: why else would RDP hang before a crash on one server, why do we have no logs before a crash, why is there always a network hitch, etc.? Here are some screenshots of the console before the crashes.
Since an earlier investigation as recent as 2023-03-08T17:39:00Z resulted in claims that the heap corruption exists "as low as 5914", and theoretically "5848" was still fine, there's a new theory about what might be going on here:
- in tweak(net/server): various resilience and limit tweaks · citizenfx/fivem@1c52f55 · GitHub (2020-08, build 2801), some HTTP server stuff was moved to use EASTL.
- in tweak(net): fixed_{multimap->vector} for Http2Server · citizenfx/fivem@114608b · GitHub (2022-04, build 5513), Http2Server's header list was changed to use EASTL vector instead of EASTL multimap.
  Of note is the commit message there saying it "mitigates a corruption crash", which matches what's going on now as well.
- in tweak(vendor): bump eastl to electronicarts/EASTL@5eb9b1ec09faaf59651… · citizenfx/fivem@168f92e · GitHub (2022-09, build 5903), EASTL was upgraded from a revision from 2020 to a revision from 2022.
This might have exposed a latent case of the corruption issue from before.
As another experiment, I've just pushed tweak(net/http-server): flag to remove EASTL usage · citizenfx/fivem@6fa9f9f · GitHub (build 6314), which should at least behave differently here (and might also finally show the original corruption in cases with a memory debugger attached, so if the issue still occurs there I'll probably throw an ASan build out there again).
(tl;dr: try again with 6314; if it still fails, do upload a full dump, and if it's the same failure and it still needs more info, there'll be an ASan build to try with again too that'll hopefully catch it, unlike last time)
This seems to be a print from this "SekulBanSyste" resource. What does this resource do that makes it send requests that "get rate limited", and when do these get tripped initially?
That behavior also seems to add up with some sort of deliberate attack again, by the way.
We are now on 5855 and have crashes all the time.
It is the Discord logging that fires when someone is joining.
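If the rate limits really do come from one log request per player join, one way to test that theory is to queue the log calls and only let a fixed number through per time window. The following is a minimal sketch in plain Lua, not something taken from any resource in this thread: `newThrottle`, its parameters, and the injected clock are all hypothetical names, and the limits you would pass in are an assumption you would have to tune.

```lua
-- Sketch only: newThrottle and its parameters are hypothetical names.
-- `now` is any function returning a time in seconds (os.time by default).
local function newThrottle(maxPerWindow, windowSeconds, now)
    now = now or os.time
    local windowStart = now()
    local used = 0      -- jobs already run in the current window
    local queue = {}    -- jobs still waiting for budget

    local self = {}

    -- Runs queued jobs while the current window still has budget.
    -- Returns the number of jobs left in the queue.
    function self.flush()
        local t = now()
        if t - windowStart >= windowSeconds then
            windowStart, used = t, 0
        end
        while used < maxPerWindow and #queue > 0 do
            used = used + 1
            table.remove(queue, 1)()
        end
        return #queue
    end

    -- Queues a job (for example a function that sends one log message)
    -- and runs it immediately if the window allows it.
    function self.push(job)
        queue[#queue + 1] = job
        return self.flush()
    end

    return self
end
```

Each log call would be wrapped in a function and handed to `push`; something would also have to call `flush` periodically so queued entries go out once the window resets. How that timer is driven on the server is left open here.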
After having something like 30 crashes in a row after starts, we have disabled every single HTTP request, making an exception only for txAdmin. We are still running.
I will keep it disabled for 2 days or more to confirm that the problem is gone (we have had at least 5 crashes daily).
function PerformHttpRequest(url, cb, method, data, headers, options)
    -- Only the 'monitor' resource (txAdmin) may still send HTTP requests.
    if GetCurrentResourceName() == 'monitor' then
        local followLocation = true

        if options and options.followLocation ~= nil then
            followLocation = options.followLocation
        end

        local t = {
            url = url,
            method = method or 'GET',
            data = data or '',
            headers = headers or {},
            followLocation = followLocation
        }

        local id = PerformHttpRequestInternalEx(t)

        if id ~= -1 then
            httpDispatch[id] = cb
        else
            cb(0, nil, {}, 'Failure handling HTTP request')
        end
    else
        -- Every other resource gets an immediate failure callback.
        cb(0, nil, {}, 'Failure handling HTTP request')
    end
end
Paste this into the scheduler to disable every HTTP request except those from txAdmin.
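For narrowing down which resource's requests matter, rather than blocking everything at once, the same idea can be written as a wrapper with an allowlist that also counts what it rejected. This is only a sketch under assumptions: `makeHttpGate`, `allowed`, and `getResource` are hypothetical names, and the wrapped function and resource lookup are passed in so the snippet does not depend on any server internals.

```lua
-- Sketch only: makeHttpGate, allowed, and getResource are hypothetical names.
-- `original` is the HTTP function being wrapped; `getResource` returns the
-- name of the calling resource.
local function makeHttpGate(original, allowed, getResource)
    local blocked = {}  -- resource name -> number of rejected requests

    local function gated(url, cb, ...)
        local name = getResource()
        if allowed[name] then
            return original(url, cb, ...)
        end

        blocked[name] = (blocked[name] or 0) + 1
        if cb then
            -- Same callback shape as the failure path in the snippet above.
            cb(0, nil, {}, 'HTTP request blocked for ' .. tostring(name))
        end
    end

    return gated, blocked
end
```

With `{ monitor = true }` as the allowlist this behaves like the snippet above; adding resources back one at a time and watching the `blocked` table would show which ones were sending requests before a crash.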
Third time's the charm, christ.
Since apparently the D word is a very bad word (even though it was already used in this thread) I am going to censor this url.
What else can we use for logs?
Although true, that is not relevant for this thread.
So let's keep the subject centered on the crashes and solutions, shall we?!