Train Carriages becoming detached

Hendreik · June 18, 2021, 12:39pm

Client

Using canary? Yes
Windows version: Windows 10

Server

Operating system: Windows 10
Artifact version: 4086
IP address: localhost
Resources: spawnmanager, vMenu, sessionmanager and the reproduction resource

Incident

Summary: When using OneSync, train carriages can become visibility detached if the player varies their distance around the train in a certain way
Expected behavior: Trains should stay together as one on the track
Actual behavior: Trains carriages become visibility detached or separated by a large distance. Though all carriages still move as one.
Steps to reproduce: Download the attached resource, and load up a server with 2 clients. Position them around (X: 2414 Y: 1095 Z: 79) and then run the /train command on one player. The players positions will automatically be adjusted a couple times to show the issue
Server/Client? Client
Files for repro (if any): Reproduction.zip (1.4 KB)
Error screenshot (if any): None
.dmp files/report IDs: None

Any additional info:
This can also happen to metro trains and ‘naturally spawned’ trains. Happens when using either OneSync on or legacy. Reference pictures:
https://i.imgur.com/t5RZdwS.png
https://i.imgur.com/OzmfRFf.png
https://i.imgur.com/h7atdcm.png

NadelieMunsell · September 22, 2022, 8:53pm

@nta Mentioning & bumping as it appears this report got buried. This is a set of working reproduction steps for trains breaking. Tested on latest artifacts.

Ekinoxx · March 5, 2023, 7:49am

After seeing dbub comment about this post in the recent community pulse discussion, I can confirm this bug is still present as of today, latest artifact being 6302.

I just wanted to do some investigation myself with the very little knowledge I have.

This repro seems to be working on 2612, 2699 and 2802 (didnt test with any version prior to this), testing done with OneSync Infinity.
The method seems to be related to ownership change breaking the link between the engine and the carriage.

Printing the carriage engine with GetTrainCarriageEngine server side does reveal something of importance, it seems like during the ownership change, the data is lost on all carriage (except the engine that returns itself), all carriage now returns 0.

Printing the owner of every carriage seems to be working fine on the server side, all wagons are changing ownership in a single frame all at once.
Printing the carriage index doesn’t reveal anything weird, all wagons exist with correct carriage index during the repro.

When leaving the repro going to the very end, when the train is exiting the scope of the player, the engine is deleted first, leaving all the carriage going without any engine during a few seconds.

That’s all I noted, it seems like this bug will be kinda hard to debug as is everything related to owership weirdness.
Also, doesn’t seems to be a priority at the moment as they are clearly not main gameplay, just wanted to add all this information somewhere at least

nta · March 5, 2023, 12:02pm

Right, I recall the intended fix actually being something like ‘parse the train attachment data and make sure the entire chain only gets migrated and created as a whole’, though I also believe there was something about ‘try to check what game code does here’.

At the time I don’t think this repro had been seen though so testing it was an issue, as is the other thing with cloning changes namely regression testing.

JnKTechstuff · March 5, 2023, 12:16pm

My server is available for testing again if you need it for this issue. I know you did some preliminary work jeez probably last year?

@smallo was also helping you a bit from memory

nta · March 6, 2023, 11:41pm

The repro seems to be fixed in fix(gamestate/server): migrate/cull trains as a singular 'entity' · citizenfx/fivem@a760c73 · GitHub though this needs regression testing (and tests with more players).

It shouldn’t affect cases where trains aren’t used (though another commit in the same batch might), but a look would be appreciated anyway.

JnKTechstuff · March 6, 2023, 11:52pm

Appreciate you taking the time to look into trains

gottfriedleibniz · March 7, 2023, 4:29pm

My notes on CTrainGameStateDataNode (verification required).

The names unk12 and unk14 are the same from X360 (the mission train bits look reworked in 2372; including the addition of unk199);
2372 inserted a bool (0x2310A8F9421EBF43) between isEngine and isCaboose; unk199 and unk224 (0xEC0C1D4922AF9754) are serialized after forceDoorsOpen.
I have unk199 labeled as _has_no_thread_id. Although, this could probably be improved.

gottfriedleibniz · May 9, 2023, 9:09pm

After this fix, and its followups, I noticed that ReassignEntityInner may sometimes be called twice on the same (link-)entity. While fairly innocuous, it puts LastOwner into an incorrect state.

nta · May 9, 2023, 9:10pm

… compared to the other random flukes people have reported since, indeed.

I guess there was indeed something to say about a proper ‘already called on these’ list of sorts for this.

gottfriedleibniz · May 14, 2023, 1:33pm

Something else to consider:

In the train migration logic, there are now some paths in which entity->clientMutex may not be locked; leading to races with the Get* and Get*Unsafe weak references.

Some double-checking here may be required.

nta · May 14, 2023, 6:48pm

… was this usually handled by the caller to ReassignEntity? If so, that would explain some of the crashes tied to it.

(yeah, there’s warning comments around the function, but - uh - whoops?!)

gottfriedleibniz · May 17, 2023, 1:05am

I suspected the previous migration fixes may have lead to deadlocking (no time to verify), so I’ll throw out this question for rubber-ducking:

Is the current ReassignEntity logic all-or-nothing, i.e., if the client owns the engine it is guaranteed to also own all its links? Or is this not a (hard) requirement?

LegacysMaps · March 29, 2024, 10:26am

This fix unfortunately did not work for us in regards to AI driven trains. We have a very simple script that uses poly zones to tell where to spawn the trains and the one near Sandy Shores specifically always glitches out around multiple players with onesync infinity.

Perhaps you guys may have some suggestions here to improve this old client sided code, if it is indeed dated:

local function Lerp(a, b, x)
	return a + ((b - a) * x)
end

function isWithinUrbanZone(train) --checks if train is in an urban zone
    local coord = GetEntityCoords(train)
    
    local isInZone, activeZone = false, nil

    for _, zone in pairs(Config.Zones) do 
        if zone.Zone:isPointInside(coord) == true then
            isInZone, activeZone = true, zone
        end
    end

    return isInZone, activeZone
end

-- Enable tracks and train/tram spawning with fixed frequency
SwitchTrainTrack(0, true)
SwitchTrainTrack(3, true)
SetTrainTrackSpawnFrequency(0, 120000)
SetTrainTrackSpawnFrequency(3, 160000)
SetRandomTrains(true)

local stopTrain = false

CreateThread(function()

    local freightHash = GetHashKey("freight")
    
    local Speed = 20

    while true do

        local vehiclePool = GetGamePool("CVehicle")

        for _, vehicle in pairs(vehiclePool) do
            local isTrain = GetVehicleClass(vehicle) == 21
            if isTrain and GetEntityModel(vehicle) == freightHash then
                local isInZone, activeZone = isWithinUrbanZone(vehicle)
                local desiredSpeed = isInZone and activeZone.Speed or Config.RuralSpeed
                if stopTrain then desiredSpeed = 0.0 end
                Speed = Lerp(Speed, desiredSpeed, 0.0125)
                
                SetTrainSpeed(vehicle, Speed)
            end
        end

        Wait(100)
    end
end)

FEEL_Mods · June 7, 2024, 8:48pm

Still not working