You are not logged in.

Applications: [GameMaster: OPEN] | [Volunteer Testers: OPEN]


This forum will be permanently shut down on Friday 13.07.2018
Please copy or save all important information from old forum before they will be deactivated
We have moved to new board. https://forum.runesofmagic.gameforge.com/Come join us.

Peryl

Intermediate

  • "Peryl" started this thread

Posts: 313

Location: Elsewhere

  • Send private message

1

Thursday, February 2nd 2012, 12:32am

[repost] Lua Optimization Tricks

This is a repost since the previous got dumped in the forum restructuring. Some edits may be missing from this version.


This is a guide on some Lua optimization tricks that I had up on my website. As I can't seem to find the time to really update/maintain the site, I decided that it might serve a better purpose being posted here (at least it'll have a chance to be viewed by those interested in this stuff. If you don't care about writing add-ons, then this really isn't for you.


[SIZE=+2]Optimizing Lua Code[/SIZE]
Here we'll discuss some tricks and techniques for optimizing your add-on code for both size and speed.


Most of the techniques given here are taken from various Lua sites, including the Lua-Users wiki. Though many other tricks for general optimizations can be used as well as those presented here, these techniques are more specific to Lua and/or Runes of Magic.


[SIZE=+2]Put Globals into a Local Scope for Increased Speed.[/SIZE]
When accessing global variables, the Lua virtual machine must first access the variable, place it into a stack register then use the value. When writing back to the global variable, it then must access the global again and copy the value from the stack register back into it. This slows things down to a significant extent when dealing with variables accessed often such as in a loop or a RoM frame's OnUpdate callback.


However, note that the Lua VM copies the value of the global variable to/from a stack register. These stack registers are in fact the same place that the Lua VM places its local variables (there are some technical differences between the two, but we can ignore that for the purposes of this discussion). We can take advantage of this fact by explicitly copying global variables into local ones for faster access within our Lua file in effect doing part of the work that Lua would do, but as we can do this outside of any loop, Lua doesn't need to do the same copying to/from global space every single time it needs to access the variable (note that this only really works for tables, strings, and functions since they are stored by reference and not by value).


For example, lets say we had a function that did a fair bit of math using the sine and cosine functions from within a loop. We can speed things up fairly dramatically by assigning these functions to local variables first and using the local name within our function. All we need is to put the following at the top of our Lua file:

Source code

1
local sin, cos = math.sin, math.cos;

This idea can be taken further still. If we needed to access many global variables or functions from within our code, it can become impractical to create local versions of every single global we wish to access (Lua has a limit of 250 local variables after all). Instead, we can actually create a local instance of the global table _G. By doing this, we then have access to all the global variables from within a local scope. We can do this with the following line of code:

Source code

1
local _G = getfenv(0);

To benefit from this we would need to explicitly use _G in front of the globals we want to access or Lua will default back to the normal global table. Lua doesn't actually use _G at all, it is merely there as a place to put the global variables. So to benefit we do need to explicitly use the _G prefix when accessing global variables with this trick.


Side note: Currently, Runes of Magic uses a Lua 5.1 engine. However, there is a proposal for Lua 5.2 that will do away with code environments. Instead, a special _ENV table will be used in-lieu of the getfenv()/setfenv() functions. This would change how we perform the _G trick above, though not dramatically. It could also introduce new variations of this trick as well. This of course also assumes that a future Runes of Magic client would be upgraded to use a Lua 5.2 engine as well.


[SIZE=+2]Avoid String Concatenation (..)[/SIZE]
Lua is somewhat atypical in the way it handles strings. Lua will internalize all strings that it encounters when executing code, and treats each string as an immutable object. What this means is that all strings that the Lua VM encounters are kept in memory and if a duplicate is found, instead of creating a second copy of the string, it references the first string instead. This speeds up Lua's string handling under most cases, but forces the Lua VM to not allow a string to change since changing one can force a second reference to change when it shouldn't.


The side effect to this is that when you concatenate two strings, Lua will actually create a third string and copy the data from the first two into it. This can be rather wasteful of memory if these were only used temporarily for text output. Now imagine the scenario where you use multiple string concatenations. Each string concatenation will end up creating a brand new string! Each one will need to be created, have the data copied into it, then garbage collected after it has been used. This is rather wasteful of both CPU and memory.


The workaround for this is to use Lua's string.format function. This effectively bypasses the problem by calling the underlying C/C++ engine to create the string. As C/C++ doesn't have this immutable string problem, only the final string is ever actually created and given back to Lua. As a bonus, it is also faster since the underlying C/C++ code doesn't need to constantly copy the data from one string part to another.


[SIZE=+2]Table Rehashing is Bad[/SIZE]
The way Lua uses tables is to create a hash value for each entry, then uses these hash values to quickly find a given entry in a table. However, when you add an entry to a table, it needs to create a hash value for this entry and will typically also create a few extra entries so that the next time an entry is added, it doesn't need to create another hash value.


Under most uses this is perfectly fine, but imagine the scenario where you add many entries to a table from within a loop. Lua will hash one entry, then a little later will need to rehash the table to accommodate new entries, then rehash again as yet more entries are added, etc, etc. This rehashing process is not quick either (comparatively speaking, that is). This applies whether you are using a table with named entries, or just using the array part of the table.


One trick to avoid this table rehashing, if you know in advance how many entries you are going to have, is to create the table with dummy entries in them. This way, the Lua compiler will realize what the size of the table should be and allocate the table with all entries present. So in the loop that actually sets the proper values for the entries, no table rehashing is performed and thereby speeds things up.


As an example, if you knew that you will need a table with ten entries in the array part, the following will pre-allocate the table:

Source code

1
local SomeTable = { true, true, true, true, true, true, true, true, true, true };

This works even if you aren't using all entries as boolean values. What is important is to get the Lua compiler to pre-allocate the table with the correct size. If instead of the array part, we wanted to have named entries in the table, we can set the names in advance like this:

Source code

1
local SomeTable = { name = true, address = true, phone = true };

One caveat to this is that the array part of a table should not be allocated with named indices. For example like this:

Source code

1
local BadTableAllocation = { [1] = true, [2] = true, [3] = true };  -- You don't want to do this!

The reason this is bad is that the Lua compiler is not smart enough to realize that these are for the array part of the table and so will actually create three entries in the named part of the table which will then get ignored and the Lua VM will then create three more hash entries in the array part of the table. So doing this actually ends up wasting memory for no good reason. Avoid doing this.


[SIZE=+2]Rethink Your Table Structures to Save Memory[/SIZE]
Tables are the main data structure in Lua and due to how they are stored by the Lua VM can heavily impact the memory usage of the table. Typically, we create our tables with our own convenience in mind more than the effect it has on memory, but it can be beneficial to rethink how you want to store your data.


Here is an example of a typical table, declared with our convenience in mind:

Source code

1
2
3
4
5
6
7
8
9
10
11
12
local SomeTable = {
    {x = "blah", y = 1},
    {x = "bleh", y = 2},
    {x = "blih", y = 3},
    {x = "bloh", y = 4},
    {x = "bluh", y = 5},
    {x = "slah", y = 6},
    {x = "clah", y = 7},
    {x = "vlah", y = 8},
    {x = "dlah", y = 9},
    {x = "zlah", y = 10},
}

The above table, when created in the Runes of Magic game client version 3.0.9 took 1208 bytes of memory. Not really that huge, but it is only a ten entry table after all. However, we can do much better than this.


The above table is effectively an array of tables with named entries. Well, named entries take extra space in Lua so we could try ditching those. It would mean losing the convenience of doing SomeTable[1].x by replacing it with something like SomeTable[1][1], but sacrifices must be made:

Source code

1
2
3
4
5
6
7
8
9
10
11
12
local SomeTable = {
    {"blah", 1},
    {"bleh", 2},
    {"blih", 3},
    {"bloh", 4},
    {"bluh", 5},
    {"slah", 6},
    {"clah", 7},
    {"vlah", 8},
    {"dlah", 9},
    {"zlah", 10},
}

This version, took 888 bytes of memory, representing a 26.5% reduction in size from the previous version. Of course it isn't as convenient to use, but it has a smaller memory footprint for the same amount of information.


Surprisingly, we can do better still and gain back some of that convenience we lost by converting everything to arrays. Instead of having an array of tables with named entries as we had in the first version, what if we have a table of named entries containing arrays:

Source code

1
2
3
4
local SomeTable = {
    x = {"blah", "bleh", "blih", "bloh", "bluh", "slah", "clah", "vlah", "dlah", "zlah"},
    y = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
}

With this version, we can access the entries using SomeTable.x[1]. Not as convenient as the first version, though clearer than the second one. The real surprise is the memory footprint. This version takes only 536 bytes. A 55.6% reduction in memory use compared with the first version. We need less than half the memory, and yet we have the same data, and in a form almost as easy to access as the first version. Worth the bother to store things this way.


[SIZE=+2]Old School Programmer Tricks Work[/SIZE]
Back in the days when getting any kind of speed out of a computer involved breaking out Ye Olde Assembler and coding down to the bare metal, programmers came up with all kinds of little tricks to help speed things up. Now that machines are quite speedy and we can spend the time to have scripting languages running within other programs, some of these old methods are making a come back.


Time to teach an new dog some old tricks.


In Lua, multiplication is faster than division. This means that x * 0.5 is faster than x/2. So where is is possible, try to use multiplication instead of division.


Along the same vein, multiplication is also faster than exponentiation. Therefore x*x is faster than x^2. Again, use multiplication where possible.


Constant folding can speed things up. This little trick relies on the Lua compiler itself. In effect, constant folding is the pre-calculation of the value of constants in a given formula. This is performed by the compiler itself to help speed up some calculations. By knowing that the compiler does this, you can help it along by ensuring that constants in an equation are in a position to be folded properly. This means that 1 + 2 + x is just as fast as 3 + x or x + (1 + 2) but actually faster than x + 1 + 2. This last is because x + 1 + 2 is equivalent to (x + 1) + 2 and when taking into account overflow and/or non-associative types with a __add metamethod, this may not be the same as x + (1 + 2) and so the Lua compiler doesn't do constant folding for x + 1 + 2. So by re-writing that last as x + (1 + 2), you will get constant folding.


Note: There is a possibility that constant folding will be removed in Lua 5.2 since it is apparently a source of many bugs. Though that will only affect RoM add-on development if a) it actually does get removed and b) the Runes of Magic client gets upgraded to use Lua 5.2.


Factor out your expressions to simplify the math. This kind of goes hand in hand with the constant folding and the multiplication tricks above. You can speed your expressions up a bit by factoring them since Lua won't do it for you. For example, x*y + x*z + y*z becomes x*(y+z) + y*z. The factored version has fewer multiplications and therefore will be a little faster. The Lua compiler isn't all that smart and will not factor your expressions for you (in part because the compiler can't assume distributive or algebraic properties hold during numerical overflow). So you need to simplify the math yourself.


Short in-line expressions can be faster than calling a function. This isn't strictly an old trick, but I'm putting it in here anyway. The idea behind this is that the act of calling a function, and then returning from it incurs some execution overhead. So if you can do the same thing in a simple line of code instead of calling a function, it can be faster (especially in loops). A good example of this is the table.insert function. Typically, table.insert is used to add an entry to the end of an array and this can be replaced by a single line of code that does this same job. For example, to add the value 5 at the end of the array SomeArray, the table.insert version would be:

Source code

1
table.insert(SomeArray, 5);

However, the following line will do the same thing but will generally run faster:

Source code

1
SomeArray[#SomeArray+1] = 5;
2013... The year from hell....