Now we're speaking my language
Imperial Good wrote: ↑Mon, 3. Jan 22, 06:05
I do not see where in your snippet serialization is meant to occur. A big limitation on save performance is probably due to serialization, since a complex entity component graph has to be squished into a flat form with well-defined ordering. When saving, iterating this graph is apparently what is taking most of the time.
I couldn't have said it better myself, and the writeup on the original sample has a section called
"This isn't actually saving anything to disk..."
Honestly, I hid the thread differentiation more than I should have in the original example in the name of getting it down to 100 lines (enough to fit comfortably on a vertical monitor) -- again, designed for general consumption.
In both samples, the last thread performs double duty as a standard processing thread until a save state is flagged, then switches to dedicated save mode... but no serialization / data marshalling occurs, simply because it wasn't the point of the exercise. It could spend 20 minutes serializing to XML if you wanted it to (with a little spinner in the corner of the screen, of course) and it would make no difference, other than the inability to quick-save again in the meantime -- which technically could be alleviated by using a stack instead of a hard pair, but again, that's outside the scope of a focused example.
Point being, serialization should be occurring on a temporarily dedicated thread from snapshot data -- it shouldn't matter how long serialization takes.
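To make the "double duty" idea concrete, here's a minimal sketch of a thread that does normal per-tick processing until a save is flagged, then flips into dedicated save mode and serializes a snapshot while the rest of the pool keeps simulating. All the names (`save_worker`, `world_snapshot`, and so on) are illustrative, not from the original sample.

```python
import json
import threading

save_requested = threading.Event()
world_snapshot = {}            # captured atomically when the save is flagged
results = []

def save_worker():
    # Double duty: behave as a standard processing thread until the
    # flag goes up.
    while not save_requested.wait(timeout=0.01):
        pass                   # ... one tick of ordinary processing ...
    # Dedicated save mode: however long this takes (XML, spinner and
    # all), it only reads the snapshot, never the live world state.
    results.append(json.dumps(world_snapshot, sort_keys=True))

t = threading.Thread(target=save_worker)
t.start()
world_snapshot.update({"entities": [{"id": 1, "hp": 100}]})
save_requested.set()
t.join()
print(results[0])
```

The point is the hand-off: once the flag is set, the main loop never touches the snapshot again, so serialization time is irrelevant to frame time.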
Imperial Good wrote: ↑Mon, 3. Jan 22, 06:05
The problem fundamentally stems from resolving relationships. An object, such as an entity, could be referenced by multiple other objects, such as component states. For performance these references are usually kept in a machine friendly way such as address pointers or offsets into a chunk of virtual memory. During serialisation such references to an object are all resolved to the same unique identifier which can then be used to find and rebuild the references to the object during deserialisation.
Standard entity model -- so now I'm assuming we're using a table/document model, as opposed to a simple owner hierarchy.
Imperial Good wrote: ↑Mon, 3. Jan 22, 06:05
Trying to multithread such a process efficiently is non-trivial, since the resulting object reference identifiers from all threads must be consistent and no object should be duplicated. Trying to join the results from multiple threads would likely have performance limited by the joining thread which ends up doing similar work to a less threaded implementation.
Ok, pause here for a second.
Either you're telling me the entire modeled dataset is managed in real time on a single thread, or you're at least saying it's problematic to access said machine-friendly structure from multiple threads.
I can't imagine the latter is true. It shouldn't be.
A standard entity is effectively just an ID or a pointer to the object's memory location -- if it's being used for serialization, I presume it's the former.
In either case, object references within offset data should be equally accessible to any thread so long as said access is synchronized.
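For the record, the reference-resolution step being described is mechanically simple: at save time, every in-memory object reference is mapped to a stable unique ID so the graph can be flattened and rebuilt on load. A hedged sketch, with entirely illustrative names:

```python
# Two-pass pointer-to-ID resolution: first assign each object a stable
# ID (its save-order index), then emit flat records with raw references
# replaced by those IDs.

class Entity:
    def __init__(self, name):
        self.name = name
        self.target = None     # raw in-memory reference to another Entity

def serialize(entities):
    # Pass 1: stable unique ID per object.
    ids = {id(e): i for i, e in enumerate(entities)}
    # Pass 2: flat records, references resolved to IDs.
    return [
        {"id": ids[id(e)], "name": e.name,
         "target": ids[id(e.target)] if e.target else None}
        for e in entities
    ]

a, b = Entity("station"), Entity("ship")
b.target = a                   # ship references station via a raw pointer
print(serialize([a, b]))
# [{'id': 0, 'name': 'station', 'target': None},
#  {'id': 1, 'name': 'ship', 'target': 0}]
```

Nothing about that mapping requires a single thread, so long as ID assignment (or access to the table) is synchronized.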
Entities have their own, let's say, "hierarchical chain of data processors".
I'd expect they're grouped into or processed by threads according to priority or locality (e.g. the distant world is less important "right now" than nearby entities), and should be managing their own data blocks with priority-level timing -- or at least sending messages to a dedicated state-write thread at said priority level (but god, that's messy).
Imperial Good wrote: ↑Mon, 3. Jan 22, 06:05
Trying to have the threads coordinate with each other likely results in such lock bottlenecking and overhead that less threaded solutions are faster.
On general principle, agree to disagree.
In my experience, so long as most of the world is read, not written, per cycle (be that state, physics, rendering, whatever), shared locking across as many available threads as possible is more efficient.
But my experience has no bearing on code I can't see, so I'll gladly cede that likelihood in the name of finding the right solution.
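Here's what I mean by read-mostly shared locking. Python's stdlib has no reader-writer lock, so this builds a minimal one from a `Condition`; a real engine would reach for something like `std::shared_mutex` instead. A sketch of the pattern, not anyone's actual code:

```python
import threading

class SharedLock:
    # Many concurrent readers, one exclusive writer.
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    def acquire_read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()

lock = SharedLock()
state = {"tick": 0}
seen = []

def reader():
    lock.acquire_read()        # any number of these can hold the lock at once
    seen.append(state["tick"])
    lock.release_read()

threads = [threading.Thread(target=reader) for _ in range(8)]
lock.acquire_write()           # the rare writer excludes everyone
state["tick"] = 1
lock.release_write()
for t in threads: t.start()
for t in threads: t.join()
print(seen)
```

If writes are rare per cycle, readers almost never block each other, which is the whole efficiency argument.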
Imperial Good wrote: ↑Mon, 3. Jan 22, 06:05
Not all data has such complex relationships. For example bulk data in Minecraft or Factorio representing a chunk of terrain. In such case it is certainly possible to multi thread the serialisation of such data ...
The only data in X4 I can think of which might benefit from this is the saving of current yield data, something which is likely so small that trivial time is spent serialising it anyway, especially if the data is organised efficiently.
Maybe our thought trains are traveling in opposite directions here.
I'm not suggesting multi-threading serialization.
I mean, if you wanted to break down entity groups and you're drawing off a thread pool and there are some free threads that you could divide them between, go for it.
But that's not what I'm suggesting -- quite the opposite.
I'm suggesting keeping the serialization on one (or more) predetermined background thread(s) that can read from a temporary snapshot state, and leave the rest to keep the game running.
Granted, you're down (or up) a thread, and the sooner the save completes, the sooner delta memory gets freed, but I get the feeling most of the game's memory consumption doesn't consist of low-level, numeric, compressed, offset, real-time data blocks anyway -- it's assets. That is to say, I presume the non-aggregate low level numeric delta over a 60 second period isn't 4+ GB.
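The snapshot-plus-background-serialization idea can be sketched like so: once a save is flagged, writes are redirected into an overlay (delta) layer while a background thread serializes the untouched base -- the overlayfs analogy. A toy sketch under assumed names (`SnapshotState`, `merge`, etc.), not X4's actual code:

```python
import json
import threading

class SnapshotState:
    def __init__(self, data):
        self.base = data       # frozen at save time
        self.overlay = {}      # all writes land here during the save

    def write(self, key, value):
        self.overlay[key] = value

    def read(self, key):
        # Current-state read: overlay wins, base is the fallback.
        return self.overlay.get(key, self.base.get(key))

    def merge(self):
        # Save finished: fold the delta back into its original slots.
        self.base.update(self.overlay)
        self.overlay.clear()

state = SnapshotState({"credits": 100, "hull": 80})
saved = []
t = threading.Thread(
    target=lambda: saved.append(json.dumps(state.base, sort_keys=True)))
state.write("credits", 150)    # game keeps running mid-save
t.start(); t.join()
state.merge()                  # delta memory freed once the save completes
print(saved[0], state.read("credits"))
```

The saved file sees the world as it was at the moment of the flag; the live game never stalls; and the overlay is exactly the "delta memory" that gets freed when the save completes.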
Imperial Good wrote: ↑Mon, 3. Jan 22, 06:05
This seems like an attempt to imitate Linux process forking...
Not what I was going for, but glad you pegged the Linux bit.
The copy-on-write bit aligns -- Overlayfs and AUFS came to mind when I styled this particular in-place snapshot example -- suppose it's also why I mentioned whiteouts and tombstones.
You trailed off a bit there from "like a fork" to "an actual fork", and I'm certainly not proposing a proc fork -- both because of page copy (the biggest problem with casper/tmpfs+overlayfs, and btrfs to a smaller degree) and because it's entirely unnecessary here.
The main process already has the data to be saved -- it's not on disk, it's in available memory, yes?
We know when we want to save it.
We have to keep writing world state changes.
All I'm saying is we don't have to write them to the same place if we can imitate the desired read.
If we're talking about block offsets, sure, that adds a layer of complexity.
It's like asking for the (i)th array element every time, and then, when a switch is flipped, suddenly saying -- you know what? Forget that array; instead, read (n) bytes from arbitrary location (j).
If you're stuck with twiddling inside pre-allocated blocks, it'd suck to guess at sizes and pre-allocate a scratch space...
And yet, that's precisely what I'd propose... and then merge those deltas to their original slots when the save's complete.
While it takes a bit more mental gymnastics, it's really no different than working with a multi-dimensional linked list.
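For the pre-allocated-block case specifically, the same trick works by offset: a scratch space captures writes keyed by block offset while the save reads the original slots, and the deltas get merged back afterwards. Entirely illustrative names, and trivially small sizes:

```python
block = bytearray(b"\x01\x02\x03\x04")   # the "real" pre-allocated slots
delta = {}                               # offset -> new byte value

def write(i, value, saving):
    if saving:
        delta[i] = value                 # redirect into scratch space
    else:
        block[i] = value

def read(i):
    return delta.get(i, block[i])        # delta wins over the original slot

def merge_deltas():
    for i, v in delta.items():           # save done: fold back into place
        block[i] = v
    delta.clear()

write(2, 0x99, saving=True)              # world keeps changing mid-save
assert bytes(block) == b"\x01\x02\x03\x04"   # snapshot slots untouched
assert read(2) == 0x99                   # live reads see the new value
merge_deltas()
assert block[2] == 0x99                  # original layout restored
```

The indirection costs one dictionary lookup per flagged read -- the linked-list-style mental gymnastics, with the original contiguous layout restored the moment the save completes.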