strask412 wrote: ↑Fri, 5. Jul 19, 20:47
... he was speaking about making a copy of the data in RAM and then save this copy. He didn't directly say (that I can find anywhere) if the actual write-it-to-disk part is during, or after the pause, but what is apparently happening during this "just make a copy in RAM" is that it's making the copy and incidentally inserting some xml markup, while converting some binary stuff to UTF-8 on the side probably. He said the addition of this formatting during the "copy in RAM" part isn't significant from a performance standpoint.
It's not "some" XML markup. It's actually quite a lot. Just to give you an idea, an average moderately-paced 20-hour save would contain:
- about 4 million tags, amounting to more than 34 MB (from a quick grep of opening tags only, i.e. it's underestimated).
- about 5 million quoted strings (i.e. "strings"), ~60 MB. Of those strings, only 35 thousand (0.7%) are unique, adding up to a puny 650 kilobytes (1%). That's outrageous overhead, but such is the nature of XML, it's stupidly redundant.
Most of the numeric values, represented as text, are also grossly over-inflated (that is, take up way more space than their binary representation). I haven't the energy to write a script to estimate that particular overhead, at least not right now. But just from the two points above, about a third of the uncompressed save is basically, well, junk, which is completely discarded by the game and is needed purely for upholding XML structure. Analysis of just a couple randomly-chosen chunks in saves shows that at least those chunks could be "compressed" by at least a factor of 4 "simply" by getting rid of XML (that's before employing actual compression such as zlib).
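To make the string redundancy concrete, here is a rough sketch of how one could measure it. This is not the grep I actually ran, and the tag and attribute names in the sample are made up for illustration, not taken from a real save. It just counts every quoted value and compares that with what the distinct values alone would take:

```python
import re
from collections import Counter

def quoted_string_stats(xml_text):
    """Count quoted values and the bytes needed for all of them vs. only the distinct ones."""
    values = re.findall(r'"([^"]*)"', xml_text)
    distinct = Counter(values)
    return {
        "total": len(values),
        "unique": len(distinct),
        "total_bytes": sum(len(v.encode("utf-8")) for v in values),
        "unique_bytes": sum(len(v.encode("utf-8")) for v in distinct),
    }

# Hypothetical fragment; real saves use their own tag and attribute names.
sample = (
    '<component class="ship_s" macro="ship_arg_s_fighter_01" owner="argon"/>'
    '<component class="ship_s" macro="ship_arg_s_fighter_01" owner="player"/>'
)
stats = quoted_string_stats(sample)
print(stats)
```

A binary format would typically store each distinct string once in a table and refer to it by a small integer everywhere else, which is where most of that overhead goes away.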
But size is only one side of the issue. The more your on-disk representation deviates from in-memory representation, the more processing power (and therefore, time) you have to spend on conversions. And not just raw conversions (i.e. numbers to text and back), but replicating (when saving) or recreating (when loading) the overall structure. Just a simple example: an array. In memory, that's typically 16 or 24 bytes of upkeep plus the actual contiguous storage, of known bounds. In XML though, no such thing: it's just a stream of tags, which you would have to count one way or another to get at the size of storage that you need for data. Or the aforementioned "strings", most of which are keys for some tables, which implies lookups that are, at worst, O(n) (as opposed to direct indexing by position, which is always O(1)).
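The array example can be sketched like this. It's a toy comparison under my own assumptions (the `list`/`item` element names and the count-prefixed layout are hypothetical, not the game's format): the XML reader has to walk every child and parse text to learn how many values there are, while the binary reader gets the count up front and unpacks the whole block in one go.

```python
import struct
import xml.etree.ElementTree as ET

values = [1.5, -2.25, 1000000.0, 0.125]

# XML-style: one element per value, numbers stored as text.
root = ET.Element("list")
for v in values:
    ET.SubElement(root, "item", value=repr(v))
xml_bytes = ET.tostring(root)

def load_xml(data):
    # The element count is only known after walking all children.
    return [float(item.get("value")) for item in ET.fromstring(data)]

# Binary-style: a 4-byte count followed by packed little-endian doubles.
def save_binary(vals):
    return struct.pack(f"<I{len(vals)}d", len(vals), *vals)

def load_binary(data):
    (count,) = struct.unpack_from("<I", data, 0)
    # With the count known, storage can be sized and filled in one step.
    return list(struct.unpack_from(f"<{count}d", data, 4))

binary_bytes = save_binary(values)
print(len(xml_bytes), len(binary_bytes))
```

Both loaders give back the same four numbers, but the binary blob is a fixed 4 + 8 bytes per value layout with no parsing of text at all.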
Reading even an uncompressed 300 MB save from disk into memory is blazing fast (varies, but would be under half a second on most common hardware, assuming the file is "cold"). Traversing it and building the universe from it is not, unless it's an (almost) carbon copy of the game's in-memory layout.
I just glanced at my paused copy of X4 and it's using almost exactly 4 GB of RAM right now. If we "just make a copy" that's doubling our RAM consumption instantly.
No, it's not. The number you're seeing is the amount of memory that allocator(s) which are used by the game (directly or indirectly) have reserved. It's not indicative of the size of actual game state (except to guess that it's no larger). That includes the overhead which allocators naturally create to avoid additional system calls and to create their own acceleration structures for efficient allocations. Plus there are no doubt various ancillary buffers that pertain to submitting and retrieving data to/from the GPU, and all the state associated with rendering, playing sound, etc. etc. In short, there's a lot of stuff that never gets into the save.
"Just make a copy" meant duplicating the structure and contents of the part of the game's state that needs to be saved. Depending on how that state is structured, it can be surprisingly easy, or frustratingly tedious and error-prone (and even slower than just directly translating the original into a different representation, i.e. XML). Judging by CBJ's response, it's the latter. Sadly, changing that at this point in time is just not going to fly.
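For anyone wondering what the "easy" case looks like, here's a minimal sketch of the idea, assuming the saveable state is a plain self-contained structure. This is emphatically not how X4 does it: `save_async` and the `state` layout are hypothetical, and JSON stands in for whatever the slow on-disk conversion is.

```python
import copy
import json
import threading

def save_async(state, out):
    # The copy is the only step that needs the simulation paused.
    snapshot = copy.deepcopy(state)

    def worker():
        # The slow conversion works on the snapshot, off the main thread.
        out.append(json.dumps(snapshot))

    thread = threading.Thread(target=worker)
    thread.start()
    return thread

state = {"ships": [{"id": 1, "hull": 100}]}
result = []
thread = save_async(state, result)
state["ships"][0]["hull"] = 50  # the game keeps running and mutating live state
thread.join()
saved = json.loads(result[0])
```

The saved data reflects the moment of the copy, no matter what the game does afterwards. The catch is that a cheap, correct deep copy is only possible when the state is organized for it from the start.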
Thanks, CBJ, for clarifying. @Tamina and strask, we understood each other just fine.
