[Bug] PRPL AcNot not removing from stack

Started by GoodMorning, November 06, 2016, 05:35:23 AM


GoodMorning

A short report.

In :awake, the following runs:

PrintStack
0 ->Self.UnitPixelCoordX
PrintStack
0 ->Self.UnitPixelCoordY
PrintStack


In PRPL.txt, the following is found: (Script name changed)

[0,0] Script.prpl: -- Stack Top --
-- Stack Bottom --

[0,0] Script.prpl: -- Stack Top --
int: 0
-- Stack Bottom --

[0,0] Script.prpl: -- Stack Top --
int: 0
int: 0
-- Stack Bottom --


The Core does move to the lower left corner.

Edit: Note that I am now using ClearStack to deal with this. I only noticed in the first place because I was using AppendStackToList.
A narrative is a lightly-marked path to another reality.

knucracker

Hmmm
Yeah... looks like.
A real good question is how in the world I got this far and never noticed.  I suppose it's a combination of the notation not being used too often and the bug only happening in simple cases, like assigning a constant.

So I'll fix that... however, it still won't work like you want in your specific example.  "Self" isn't a var, it's a function.  So "<-self" and "->self" aren't the same as just "self". 

So you'd have to do something like "self ->self" to get the accessor notation to work in your example.  Yes I know...
I could also fix this by making <-self be a reserved variable name and always return "self" (and do nothing when assigned). But that might actually break some scripts if I did that.

GoodMorning

I know that much, and had already used the Self ->Self workaround (I seem to recall that the other way will not even compile). I had just forgotten to include it in the snippet I uploaded.

I find myself steadily more curious about the PRPL compiler. It would seem as if it were a large switch statement to run it once compiled, but the compilation step...

knucracker

The compiler is a simple parser that tokenizes the source by whitespace.  So every "word" becomes a token.  Each token is then inspected and turned into a "Command" struct.  A Command struct consists (in a simplified form) of an integer that represents an 'opcode' and then an optional object for additional data.  The opcode is just an index into an enum of possible commands.

Ex:
1 2 add ->result

This translates into:
PUSH[1]
PUSH[2]
ADD[]
WRITE["result"]

Here, the upper case word is the opcode and the thing in brackets is the optional data.

The text parser (compiler) splits the source into tokens (here separated by spaces) and notices that the first two tokens are just numbers.  That means to do a 'PUSH' command with an argument.  The 3rd token is just a simple string with nothing special.  That means to check whether it is a legit command that exists in the command enum and, if so, write out the Command for ADD.  The final token is inspected and it starts with "->".  That means a "WRITE" command with an argument equal to the string following "->" in that token.
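The parse described above can be sketched in a few lines of Python. This is only an illustration, not the actual PRPL compiler: the `Command` class, `compile_source`, `parse_number`, the `KNOWN_COMMANDS` table, the use of strings in place of integer opcodes, and the case-insensitive command lookup are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

# Assumed mapping from source words to opcodes; only ADD appears in the thread.
KNOWN_COMMANDS = {"add": "ADD"}


@dataclass
class Command:
    opcode: str                 # stands in for the integer index into the enum
    data: Optional[Any] = None  # optional additional data


def parse_number(token: str):
    """Return the token as an int or float, or None if it is not a number."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return None


def compile_source(source: str) -> List[Command]:
    """Split on whitespace and turn each token into one Command."""
    commands = []
    for token in source.split():
        number = parse_number(token)
        if number is not None:
            commands.append(Command("PUSH", number))
        elif token.startswith("->"):
            commands.append(Command("WRITE", token[2:]))
        elif token.startswith("<-"):
            commands.append(Command("READ", token[2:]))
        elif token.lower() in KNOWN_COMMANDS:
            commands.append(Command(KNOWN_COMMANDS[token.lower()]))
        else:
            raise SyntaxError(f"unknown token: {token!r}")
    return commands
```

Calling `compile_source("1 2 add ->result")` yields the same four commands listed in the post: PUSH[1], PUSH[2], ADD[], WRITE["result"].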

When the compile is finished the output is a List of Command objects.  That List<Command> is held by a core and evaluated every frame.  The core of course makes available a 'heap' (that is what READ and WRITE use) as well as a stack (that is what PUSH and various GetXXXXFromStack calls use).  So in some ways, the architecture is closer to a CPU than to a compiled language.  The Core is a CPU that makes available the heap (registers) and memory and also provides various CISC-like instructions (commands like CreateParticle).
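As a rough picture of the evaluation side, here is a minimal Python sketch of a core running a command list against a stack and a heap. It assumes commands are plain `(opcode, data)` tuples and that the heap is a dictionary; the function name `run` and the error handling are made up for the example.

```python
def run(commands):
    """Evaluate (opcode, data) tuples in order; return the final stack and heap."""
    stack = []  # what PUSH and the arithmetic commands work on
    heap = {}   # what READ and WRITE work on
    for opcode, data in commands:
        if opcode == "PUSH":
            stack.append(data)
        elif opcode == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif opcode == "WRITE":
            heap[data] = stack.pop()   # pop the value and store it under the name
        elif opcode == "READ":
            stack.append(heap[data])   # push the stored value
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack, heap


program = [("PUSH", 1), ("PUSH", 2), ("ADD", None), ("WRITE", "result")]
final_stack, final_heap = run(program)
```

After running the example program the stack is empty and the heap holds 3 under "result", which is the behaviour the bug report at the top of the thread expected from an assignment.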

Anyway, that's the high level description.  There are a million little details that make it all actually work. Things like caching compilation results, transmuting data types, checking syntax, implementing things like timers, Delay, logic blocks, function calls, etc.  Hmmmm..  Now that I say all of that I question again why in the world I ever spent the many months I did creating it :)

GoodMorning

Interesting.

Some of that I had guessed from the snippets that are gradually turning to doc, others follow logically.

Another thing to add to my "list of things to do as an exercise". (The slight problem with that is that the preceding item is already "build a JS game engine" with the completion test "can I make a game using it?".)

For logic blocks and loops are you looking at a GOTO[index]? Or is there some clever trick?



Quote from: virgilw on November 07, 2016, 08:12:13 AM
[...] I question again why in the world I ever spent the many months I did creating it :)

Because it added so very many options?

knucracker

Yeah, it is close to that.  The list of Command objects is held by a core.  The core has an "instruction pointer" which is just the index into that list that points to the instruction it is currently evaluating.  That instruction pointer can be moved.  By default it just advances by one.  But a given instruction could cause it to jump to some other location (forward or back).
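A loop built on a movable instruction pointer might look like the following Python sketch. The opcode names `JUMP`, `JUMP_IF_ZERO`, and `SUB`, and the `max_steps` guard, are hypothetical; the thread does not say what the real commands are called.

```python
def run(commands, max_steps=1000):
    """Run (opcode, data) tuples with an instruction pointer that jumps can move."""
    stack, heap = [], {}
    ip = 0      # index of the command being evaluated
    steps = 0
    while ip < len(commands):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        opcode, data = commands[ip]
        ip += 1  # default: advance by one
        if opcode == "PUSH":
            stack.append(data)
        elif opcode == "READ":
            stack.append(heap[data])
        elif opcode == "WRITE":
            heap[data] = stack.pop()
        elif opcode == "SUB":
            b = stack.pop()
            a = stack.pop()
            stack.append(a - b)
        elif opcode == "JUMP":
            ip = data                  # unconditional jump, forward or back
        elif opcode == "JUMP_IF_ZERO":
            if stack.pop() == 0:
                ip = data              # conditional jump
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return heap


# Count the variable n down from 3 to 0.
countdown = [
    ("PUSH", 3), ("WRITE", "n"),
    ("READ", "n"), ("JUMP_IF_ZERO", 9),   # leave the loop when n is 0
    ("READ", "n"), ("PUSH", 1), ("SUB", None), ("WRITE", "n"),
    ("JUMP", 2),                          # back to the loop test
]
result = run(countdown)
```

The backward `JUMP` to index 2 is the loop, and the forward `JUMP_IF_ZERO` to index 9 (one past the last command) is the exit.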

If any of this sounds confusing (as it should) to anyone who has bothered to read this far, I might suggest a fun way to have it all make sense.
https://en.wikipedia.org/wiki/Core_War


planetfall

I'm curious to know how you implemented functions internally. If they're just further positions past the end of the main script, then there must be an invisible "return" added to the end of scripts on compile. Or are they each compiled as "mini scripts" with references to the main script's stack and variable hash?

Fwiw, in (REDACTED) I used an approach similar to the latter, though a bit odd because of the specific program and my limited skill.
Pretty sure I'm supposed to be banned, someone might want to get on that.

Quote from: GoodMorning on December 01, 2016, 05:58:30 PM"Build a ladder to the moon" is simple as a sentence, but actually doing it is not.

knucracker

They are just further positions in the list of commands.  When a token is encountered that starts with a ":" that translates to a "FUNC" command.  The core that holds a script also has a "jump table".  That table just maps a string name (the function name) to a location in the command list.

I use a "call stack" trick to avoid having to put returns at the end of functions (or the main script body).  Whenever a core jumps to a function it pushes the current instruction pointer to a call stack.  Execution proceeds after the jump to the function location.  If another function is encountered, the FUNC command notices the call stack has something on it and pops the value and returns to that location.  If the main body of the script is executing it will come to a FUNC.  It will notice that there is nothing on the call stack.  In that case, it jumps to the end of the list (which is the end of execution).

So long as all functions come after the main body of the script, a simple call stack can be used to determine where to return to.  That works for the main body of the script as well.  Note that a stack is necessary because a function can call other functions.  The call stack can also be useful in case you want to limit the maximum call depth (to prevent runaway recursion).  I don't do that... but were this a 'real' general purpose language/VM I probably would have.
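The call stack trick can be illustrated with a small Python sketch. Several details here are guesses rather than anything stated in the thread: the `CALL` opcode name, jumping to the command just after the FUNC marker, and treating the end of the command list as an implicit FUNC so that the last function also returns.

```python
def run(commands):
    """Run (opcode, data) tuples where FUNC markers double as implicit returns."""
    # Jump table: function name -> index of its FUNC marker.
    jump_table = {data: i for i, (op, data) in enumerate(commands) if op == "FUNC"}
    stack, call_stack = [], []
    ip = 0
    while True:
        # Hitting a FUNC marker (or, by assumption, the end of the list)
        # means "return if inside a call, otherwise stop".
        if ip >= len(commands) or commands[ip][0] == "FUNC":
            if not call_stack:
                break
            ip = call_stack.pop()
            continue
        opcode, data = commands[ip]
        ip += 1
        if opcode == "PUSH":
            stack.append(data)
        elif opcode == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif opcode == "CALL":
            call_stack.append(ip)        # where to resume after the function
            ip = jump_table[data] + 1    # first command after the FUNC marker
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


program = [
    ("PUSH", 1), ("CALL", "addTen"),                       # main body
    ("FUNC", "addTen"), ("CALL", "addFive"), ("CALL", "addFive"),
    ("FUNC", "addFive"), ("PUSH", 5), ("ADD", None),
]
final_stack = run(program)
```

Here the main body ends when it reaches the first FUNC with an empty call stack, and `addTen` returns when it runs into the FUNC marker for `addFive`. No explicit return command is needed anywhere, as long as every function comes after the main body.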


GoodMorning

That also answers a question I had considered some time ago about recursion in scripts.

I suppose that PRPL globals are just a different heap.

Given the bug above, your AcNot seems to be something more than parser notation-warping. Is it an optimised form?


knucracker

Yeah, variables are just a hashmap (Dictionary in c#).  ->foobar means to pop from the stack and store it in the heap dictionary at location "foobar".

Accessor notation  "<-thing.method" for instance, is all handled at compile time.  The parser tokenizes and determines that the variable name is "thing.method".  It sees there is a dot in that name and then does its magic.  It creates output commands, in this case, as if this had been the original source: "<-thing GetMethod".

Warp notation (using parens) works in a similar way.  During the compile parse, any token before an opening paren gets moved to the closing paren.
Add (1 2) becomes 1 2 Add.
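The warp rewrite can be sketched in Python as a single pass over the tokens. This is an assumption about how such a pass could work, not the actual parser: the function name `unwarp` is invented, and the sketch pads parens with spaces first so that "(1" and "2)" split cleanly.

```python
def unwarp(source: str) -> str:
    """Move the token before each '(' to the position of the matching ')'."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()
    out = []      # rewritten token stream
    pending = []  # tokens waiting for their closing paren
    for token in tokens:
        if token == "(":
            if not out:
                raise SyntaxError("nothing before opening paren")
            pending.append(out.pop())   # lift the preceding token out
        elif token == ")":
            if not pending:
                raise SyntaxError("unbalanced closing paren")
            out.append(pending.pop())   # drop it back in at the close
        else:
            out.append(token)
    if pending:
        raise SyntaxError("unbalanced opening paren")
    return " ".join(out)
```

With this sketch, `unwarp("Add (1 2)")` gives "1 2 Add", and a nested form such as "Add (1 Mul (2 3))" gives "1 2 3 Mul Add" because the pending tokens are kept on a stack.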

GoodMorning

That's how I thought it would work, but then how did the bug that started the thread come about?

I.e. If I understand correctly, these would be converted to the same opcodes, if the preprocessor sweep converts AcNot...

0 ->UID.xyz
<-UID 0 SetXYZ #Some things
<-UID CONST_XYZ 0 SetUnitAttribute #Most others

knucracker

Assignment is a much harder case.  Take this example
1 2 add ->obj.Val
This is the same as:
<-obj 1 2 add SetVal

The problem here is what comes before the assignment isn't a simple constant or a var read.  It is whatever happens to be on the stack.
So, long story short is I had two ways of permuting the code.  One that handles the generic case (and uses a swap command) and the other that recognizes there is a constant and can avoid the swap.  In that constant case, I had messed up and wrote out the new commands to a position in the command array that was off by one.  So the original constant was still getting pushed to the stack, then the new commands were happening.
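To make the two permutations concrete, here is a token-level Python sketch. It is a guess at the shape of the rewrite, not the real code: `expand_accessors`, the Get/Set naming rule (capitalise the first letter of the member), and the exact position of the `swap` in the generic case are all assumptions.

```python
import re

NUMBER = re.compile(r"-?\d+(\.\d+)?")


def expand_accessors(source: str) -> str:
    """Rewrite '<-obj.member' and '->obj.member' tokens into plain commands."""
    out = []
    for token in source.split():
        is_read = token.startswith("<-")
        is_write = token.startswith("->")
        if not (is_read or is_write) or "." not in token:
            out.append(token)
            continue
        var, member = token[2:].split(".", 1)
        member = member[:1].upper() + member[1:]
        if is_read:
            out += ["<-" + var, "Get" + member]
        elif out and NUMBER.fullmatch(out[-1]):
            # Constant case: take the constant back out of the output and
            # re-emit it after the object read, so no swap is needed.
            # Leaving the original push in place would strand a value on
            # the stack, which is the symptom reported in this thread.
            constant = out.pop()
            out += ["<-" + var, constant, "Set" + member]
        else:
            # Generic case: the value is whatever is already on the stack,
            # so push the object and swap it underneath the value.
            out += ["<-" + var, "swap", "Set" + member]
    return " ".join(out)
```

For example, "0 ->obj.Val" becomes "<-obj 0 SetVal", while "1 2 add ->obj.Val" becomes "1 2 add <-obj swap SetVal", which leaves the stack in the same order as the hand-written "<-obj 1 2 add SetVal".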

kwinse

Ah, off by one errors. They get us all in the end.

GoodMorning

I see. Perhaps I should explore this myself before taking more of your time.