Questions about optimization

Vertu · February 16, 2022, 12:45:22 PM

I have been looking at more and more ways to optimize my custom units now that they are mostly finished and unlikely to change much, and I am wondering how optimization works in CW4 for certain aspects of custom units.
In other words, I want some details on the performance impact of the following so I can make better decisions when optimizing my custom units. It would also help to know when an impact is negligible.
I will also include a quick question about vanilla units.

Many objects vs few objects in a unit. (Not all have to possess a mesh; an example is creating objects to act as reference points and then using GetObjPosition, rather than mathematically working out those reference points from a single location.)
Script communication using ScriptVars in multiple scripts shared by units in the same CPack vs a single massive script unique to each unit in the same CPack. I have a VAUs CPack whose units have similar @functions, but those functions need to pass information to the rest of the script with some key differences, so I didn't make a general-purpose script until I recently started using ScriptVars. Would using ScriptVars to reduce the total lines of code (probably from 8,000+ to ~4,000+ for the VAUs CPack) be better? Or does the cost of calling Get/SetScriptVar mean that reducing the total lines of code in a CPack isn't much more efficient?
I would basically be replacing the @function with a new script that exposes one or more ScriptVars for the 4 units to use. That function would then take up 1/4 as many lines in the CPack, since these 4 units would no longer call the @function and would use GetScriptVar instead.
Complex meshes vs many many simple meshes/objects. I am aware of how performance taxing a single complex mesh can be but I also wonder to what extent as I make "assembly meshes" which are exported meshes of unique colors of a model's part (due to .obj export technicalities). Would it be much better to split larger meshes into multiple meshes or stick to a single large mesh to heavily minimize object count?
Is a single complex mesh more performance taxing than multiple "jointed" objects that have simple meshes which are "jointed" to an object that will be rotating? So a rocket launcher has a ROT-POINT object to be rotated, then multiple simple meshes "attached" to that object that will rotate "with it". Would this be more performance heavy than a more complex mesh that will reduce the number of objects attached to the "ROT-POINT" from 8 objects with a mesh to 2 objects with a more complex mesh?
When it comes to mesh complexity, is it size or "depth"/detail that causes performance drops? Or is it vertex creation? By vertex creation I mean how multiple vertices are "rendered" (my knowledge here is a bit lacking, so I apologize if I get this very wrong). I wish I could illustrate this, but to attempt an explanation: typically, when a set of vertices isn't "aligned" with each other, it increases the complexity of rendering the object/mesh, as more vertices are created to "encase" the mesh. So one mesh containing a cube at 0,0,0 and another cube at 10,10,10 would result in much more complex vertex creation and a bigger performance impact than 2 independent cube meshes. Is this true, meaning I should do my best to create meshes that stay within a compact area? Or does this "vertex creation" not depend on that, and instead depend only on the mesh as it appears visually, with the visible mesh directly proportional to the complexity of its vertices? Sorry if this was unclear.

Now for vanilla units:
In a recent map I made, I noticed odd performance impacts from increasing the range, max health, ERN Simulated state (setting to 1), heal speed, and max ammo of vanilla units. The map has very few units, but many of them had "wireless" changes to the mentioned properties, and the overall performance of the map was similar to some of my maps that had many times more units (including custom units). So does altering these properties of vanilla units cause a large performance impact? I have seen this on 2 other unreleased maps of mine that used the same altering method; they showed a similar drop in performance compared to other maps I have made that didn't alter vanilla units at all.

Any additional optimization tips would be appreciated.
Again, I apologize if I am getting a lot of concepts completely wrong.
Also I do not plan on joining the Discord as a method of asking questions like these.

knucracker · February 18, 2022, 08:35:12 AM

- Many vs fewer objects in a custom unit: Generally, fewer objects is better. However, 1, 2, 5, 10, 20... won't usually impact performance. If you try to create hundreds of child objects, that might start to be noticeable. It of course also depends on how many of each custom unit can be on the map. If it is 1, then no big deal. If it is hundreds, that matters more. It also depends on whether the children are moved, rotated, etc. The game has to do all of the transformation matrix math when you move a child and its children. It's very efficient, but not free. If you need to have many moving 'child' parts, it is best to let the game do that transformation math. That will always be better than you doing it in 4rpl.
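
As a rough illustration of the transformation work described above, here is a small Python sketch (not the game's code, and not 4rpl). It assumes a simplified 2D case where a child's world position is its local offset rotated by the parent's angle and then shifted by the parent's position. The function names and the example numbers are hypothetical.

```python
import math


def child_world_position(parent_pos, parent_angle_deg, child_local):
    """World position of a child attached to a rotated, translated parent (2D only)."""
    angle = math.radians(parent_angle_deg)
    lx, ly = child_local
    # Rotate the local offset by the parent's angle, then add the parent's position.
    wx = parent_pos[0] + lx * math.cos(angle) - ly * math.sin(angle)
    wy = parent_pos[1] + lx * math.sin(angle) + ly * math.cos(angle)
    return (wx, wy)


def child_transforms_per_frame(unit_count, moving_children_per_unit):
    """How many child transforms must be recomputed if every parent moves each frame."""
    return unit_count * moving_children_per_unit


# A launcher at (10, 5) rotated 90 degrees, with a part one cell "in front" of it.
part = child_world_position((10.0, 5.0), 90.0, (1.0, 0.0))

# 200 built units with 8 attached parts each vs 2 attached parts each.
many_parts = child_transforms_per_frame(200, 8)
few_parts = child_transforms_per_frame(200, 2)
```

The point of the second function is only that the cost scales with units times moving children, which is why the unit count on the map matters as much as the child count per unit.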

- ScriptVars and total lines of code: A significant reduction in the number of lines of code (the number of 4rpl commands that run each frame) will be a performance win. Each 4rpl command (every keyword) is a lookup in a table followed by some function invocation. So when you put "2 2 + ->sum" in a 4rpl script, that is 4 'commands', 4 table lookups, 4 internal function invocations. Pushing a constant to the stack is a 'command', as is '+', as is popping the result from the stack and storing it in a var. So if you can cut from 8000 to 4000 lines that would execute, that would be a notable win. As for ScriptVars, they are pretty efficient. They involve a table lookup to get the unit, followed by an optional list search to find the script, followed by a table lookup to get the var. Notably, vars are always table lookups even when not using ScriptVars. If you have a unit and there is only 1 script, or the script you mean to reference is the first script, that is the most efficient. If you leave the script name as an empty string, that is the most efficient case. In that case the internal ScriptVar function will just take the first script on the unit and look for the variable in it. If you specify a script name, then the internal function has to search through the scripts and find the one with that name. This involves string comparisons and happens even if there is only one script attached.
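
The two costs described above can be modelled in a few lines of Python. This is only a sketch of the description, not the game's implementation: `count_commands` assumes every whitespace-separated token is one command (it ignores comments and strings containing spaces), and `Unit`, `get_script_var`, and the script names are hypothetical stand-ins for the internal lookup steps.

```python
def count_commands(source):
    """Approximate the 4rpl command count as the number of whitespace-separated tokens."""
    return len(source.split())


class Unit:
    def __init__(self, scripts):
        # scripts: list of (script_name, {var_name: value}) in attachment order.
        self.scripts = scripts


def get_script_var(units, unit_id, script_name, var_name):
    """Model of the lookup chain: unit table, optional script search, var table."""
    unit = units[unit_id]  # table lookup to get the unit
    if script_name == "":
        # Empty name: take the first script on the unit, no search needed.
        _, variables = unit.scripts[0]
    else:
        # Named script: walk the list and compare strings, even if only one script exists.
        for name, variables in unit.scripts:
            if name == script_name:
                break
        else:
            return None  # no script with that name on this unit
    return variables.get(var_name)  # table lookup to get the var


commands = count_commands("2 2 + ->sum")

units = {
    7: Unit([
        ("Shared.4rpl", {"target": 42}),
        ("Other.4rpl", {"target": 1}),
    ])
}
fast_path = get_script_var(units, 7, "", "target")
named_path = get_script_var(units, 7, "Other.4rpl", "target")
```

Under this model, a shared script read through the empty-name path costs two table lookups per access, so it only loses to a duplicated @function if the duplication saved fewer commands than the lookups add.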

- Meshes: Don't worry too much about vertex count. If you have a mesh with 10k vertices it will likely be fine, even if there are 100 units on the map. That's 1 million vertices, I know. But vertex count has come to mean less over the last N years as GPUs have gotten better at handling very large numbers. That said, don't just throw 100k vertex models in the game if it isn't necessary. They have to get saved, loaded and held in memory. So that all takes save game space and save/load time. As far as nested meshes, that doesn't matter so much for rendering performance. It won't matter if you have two cubes and if one is in the other or they overlap, or they are separate. The mesh geometry you give to an object will render if the object is in the frustum. So hidden and internal geometry will still render. Again, though, don't worry too much about it especially if you are talking about dozens, or hundreds of vertices. Everything really depends on how many units will be in the game. If it is one, then it can have a 250k vertex mesh (or meshes) and nobody will care that much. If it is a unit you can build, then assume there will be 200 of them and work back from that. In that case you might only want 1000 vertex meshes as a ceiling. Generally, you can get by with meshes in the dozens or hundreds. Since these are units often viewed from a distance, extra detail is really just wasted most of the time.
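
The "work back from the unit count" advice is simple arithmetic, sketched here in Python. The function names are made up for illustration, and the 200,000 total used in the last line is an assumption chosen so the result matches the 1000-vertex ceiling mentioned above; it is not a limit stated by the game.

```python
def total_vertices(vertices_per_unit, unit_count):
    """Vertices the renderer may have to handle if every unit is visible."""
    return vertices_per_unit * unit_count


def per_unit_ceiling(total_budget, expected_units):
    """Largest per-unit vertex count that keeps the whole map within a chosen budget."""
    return total_budget // expected_units


# Figures from the post: a 10k-vertex mesh on 100 units, and a one-off 250k-vertex unit.
hundred_units = total_vertices(10_000, 100)
single_unit = total_vertices(250_000, 1)

# Assumed budget of 200,000 vertices spread over 200 buildable units.
buildable_ceiling = per_unit_ceiling(200_000, 200)
```

Seen this way, 200 units at 1000 vertices each land in the same range as the single 250k-vertex unit, which is consistent with the two pieces of advice.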

- Changing properties on built-in units: I don't know of any specific problems with doing this. There might be some problems that do exist, though. I know that anything that involves range increases can increase the computational load. So if a cannon's range is the entire map and there are 100 cannons on the map... that's a lot of work. Most range calculations have optimizations in place, but some of them can only be so efficient. To find nearby creeper, a cannon has to search outwards in all of the cells around itself and find the closest creeper. It has to do that every frame. So if there is no creeper, and the range is huge, that's a lot of memory access. A mortar has to find the deepest creeper in range, so it always searches the entire search area out to the range. There are some spatial hashing levels in place, and other data structures to make it more efficient, but it only gets so good. Anyway, watch out for large 'ranges' on units. Other than that, I wouldn't think there would be any performance issues from changing things. I'd have to look at each case individually to see what was going on.
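
To see why range is the property to watch, the Python sketch below counts how many grid cells fall inside a circular range. It assumes a brute-force scan with no spatial hashing, so it overstates what the game actually does; the function names and the range values are hypothetical.

```python
def cells_in_range(radius):
    """Count grid cells whose offset from the unit's cell is within `radius` cells."""
    r = int(radius)
    return sum(
        1
        for dx in range(-r, r + 1)
        for dy in range(-r, r + 1)
        if dx * dx + dy * dy <= radius * radius
    )


def worst_case_cells_per_frame(radius, unit_count):
    """Cells touched per frame if every unit scans its whole range (e.g. no creeper found)."""
    return cells_in_range(radius) * unit_count


base = cells_in_range(10)
doubled = cells_in_range(20)
hundred_cannons = worst_case_cells_per_frame(10, 100)
```

Because the cell count grows with the square of the radius, doubling a range roughly quadruples the worst-case search area, while adding a fixed amount to a range grows it far more slowly than multiplying.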

Vertu · February 18, 2022, 09:19:12 AM

Thank you very much! I will put all of this information to good use.
The map that modified the ranges of the units was VPAC Incursion, if you wish to look there. Based on your info, I will reduce the range multiplication in my modifications, or maybe even swap it for addition, since it wasn't unusual for units to end up with a very large range.
Also nice to know vertices aren't as big a deal in CW4 as they are in some older games.

GoodMorning · February 21, 2022, 11:53:24 PM

Thanks for an interesting read, especially around the Get/SetScriptVar internal processes.
