1 minute ago, DarthParametric said:

Is there a game-specific identifier in the NCS header? I thought both were flagged as NCS V1.0B.

No, there isn't. That's why I had to write a detection algorithm.


@JCarter426

Thanks! I'll look at them later just to be sure I haven't missed anything. I was supposed to do it earlier, I just forgot about it until I was Force Persuaded to. 😋

12 hours ago, AmanoJyaku said:

Version Detection

 

The algorithm was fine, the problem was the input. I was using K2 function definitions, some of which had changed from K1. Now, NCS files are identified as K1 or K2!

 

It's probably easier and safer to just ask the user which game the script is for (via a command-line option, for example) than to depend on some brittle heuristic to detect that.
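
A minimal sketch of that suggestion, assuming a Python front end. The `--game` flag, the `auto` fallback, and the function names are hypothetical and not options of any existing tool; the idea is only that an explicit user choice wins and the heuristic runs only when nothing was specified:

```python
import argparse


def parse_args(argv=None):
    """Parse the command line of a hypothetical NCS analyzer."""
    parser = argparse.ArgumentParser(description="NCS analyzer (illustrative only)")
    parser.add_argument("ncs_file", help="path to a compiled .ncs script")
    parser.add_argument(
        "--game",
        choices=["k1", "k2", "auto"],
        default="auto",
        help="game the script was compiled for; 'auto' falls back to detection",
    )
    return parser.parse_args(argv)


def resolve_game(requested, detect):
    """Return 'k1' or 'k2'; call the heuristic only if the user did not choose."""
    if requested != "auto":
        return requested
    return detect()


args = parse_args(["k_inc_npckill.ncs", "--game", "k2"])
game = resolve_game(args.game, detect=lambda: "k1")  # the heuristic is never called here
```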

 

12 hours ago, AmanoJyaku said:
  1. Evaluation of for and while loops
    1. They look identical, with the only potential difference being an incremented or decremented value at the end of a for loop
    2. Now that the algorithm described above has been completed, it should be easy to determine the rest of the code that makes a for loop unique
    3. I've temporarily given up on do-while loops because only one file has one, and I'm not entirely certain it's in the compiled NCS file

If you lack files to test this on, you could get Neverwinter Nights and install its toolset, because that came with an official BioWare NWScript compiler. Yes, it's for NWN, but the control structure stuff should be identical.

 

12 hours ago, AmanoJyaku said:
  1. Figuring out the DESTRUCT op code
    1. I've seen this used to destroy variables on the stack that aren't necessary, but prevent the top-of-stack behavior used by NCS
    2. Although this shouldn't be necessary, since the desired variable could just be copied to the top of the stack using the CPTOPSP op code...

Dunno if I already wrote that here or just in IRC, on a GitHub issue or wherever, but: the interesting thing about the DESTRUCT opcode is that it's used to single out individual struct members. I.e. whenever the nss uses a struct member, the ncs copies the whole struct to the top of the stack and uses DESTRUCT on this new block in the stack to get rid of everything but the single struct member it's interested in.

Which means you can use the existence of DESTRUCT to identify structs.
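
To make the effect concrete, here is a rough Python model of what DESTRUCT does to the stack. It assumes the operands are the size of the block to destroy, the offset of the element to keep, and the size of that element, all in bytes, with the offset measured from the bottom of the block and 4-byte stack cells. The function name and that offset orientation are assumptions for illustration, not taken from the thread:

```python
CELL = 4  # assumed size of one stack cell in bytes


def destruct(stack, size, keep_offset, keep_size):
    """Drop the top `size` bytes of the stack except one element.

    `stack` is a list of cells with the top of the stack at the end.
    `keep_offset` is assumed to count from the bottom of the destroyed block.
    """
    n_cells = size // CELL
    first = keep_offset // CELL
    count = keep_size // CELL
    block = stack[len(stack) - n_cells:]
    kept = block[first:first + count]
    return stack[:len(stack) - n_cells] + kept


# A three-member struct copied to the top of the stack, above one local.
before = ["local", "s.x", "s.y", "s.z"]
after = destruct(before, size=12, keep_offset=4, keep_size=4)
# after == ["local", "s.y"]
```

A decompiler that sees this pattern can infer that the 12-byte block was one aggregate and that the expression in the source referred to a single member of it.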

9 hours ago, DrMcCoy said:

It's probably easier and safer to just ask the user which game the script is for (via a command-line option, for example) than to depend on some brittle heuristic to detect that.

I considered that, but the differences are obvious and the detection simple. Are you aware of any pitfalls, or are you just being cautious? This is a beta release, so there's plenty of time to change functionality.

9 hours ago, DrMcCoy said:

If you lack files to test this on, you could get Neverwinter Nights and install its toolset, because that came with an official BioWare NWScript compiler. Yes, it's for NWN, but the control structure stuff should be identical.

Given the fact that one file out of 2,500 has a do-while loop, it's low priority. I'll get around to it after more pressing matters are dealt with.

9 hours ago, DrMcCoy said:

Dunno if I already wrote that here or just in IRC, on a GitHub issue or wherever, but: the interesting thing about the DESTRUCT opcode is that it's used to single out individual struct members. I.e. whenever the nss uses a struct member, the ncs copies the whole struct to the top of the stack and uses DESTRUCT on this new block in the stack to get rid of everything but the single struct member it's interested in.

Which means you can use the existence of DESTRUCT to identify structs.

Thanks, I thought as much. I first saw it in use destroying two elements of a vector, so I figured its purpose was for destroying unused elements of aggregates. I just need to identify which elements are preserved.


It's time for a monthly update, and I said I might have a beta ready by now. Well, there won't be a beta.

Spoiler

I'm skipping the beta to complete the "final" release!

The to-do list is nearly complete:

  1. Evaluating subroutines - Complete
    1. Input parameters and return values have been discovered
  2. Merging the control structure algorithm with the virtual stack - Complete
    1. Subroutine scopes and variables are correctly identified
  3. Evaluation of for and while loops - Nearly complete
    1. While loops - Complete
    2. For loops - In progress
  4. Evaluation of switch statements - Nearly complete
    1. Case labels and statements - Complete
    2. Default label and statement - In progress
  5. Figuring out the DESTRUCT op code - In progress
    1. Not started, but not expecting it to take much work

Will have more news next week.


Sorry that this is taking longer than I wanted. I was lucky enough to get some contracts, but that also means very little free time.

While a decompiler is an all-or-nothing program (hence the reason there won't be a beta), I can show off some sample output:

Spoiler

'.\NCS Analyzer.exe' '..\..\Compiled Scripts\Kotor2\k_inc_npckill.ncs'
This is a K2 file

void Sub1()

void Sub2()
Integer
Integer
Integer

void Sub3(???, ???, ???)
Integer
Integer
Location
Float
Float
Float
Float
Location
Object

int Sub4(???)
Integer
Integer
Integer

void Sub5(???, ???)

void Sub6(???, ???)
Effect

 

It's not much, but you can see the following:

  1. It identifies the NCS as a Kotor2 file, which is reliant on the code including K2 engine functions
  2. There are six subroutines
    1. _start(), which is automatically included by the compiler
    2. void main()
    3. void DamagingExplosion( object oCreature, int nDelay, int nDamage )
    4. int GR_GetGrenadeDC(object oTarget)
    5. void NonDamagingExplosion(object oCreature, int nDelay)
    6. void KillCreature(object oCreature, int nDelay )
  3. The types of return values are found by examining the called subroutine
  4. The types of parameters must be found by examining caller subroutines (strange as it may seem, there's no guarantee a parameter is used)
  5. The local variables are listed in the order in which they are created

As simple as this may seem, it's been hell trying to figure out how NCS works due to limited documentation, time, and mental capacity (🤪). That said, I think I now know everything there is to know about NCS. Even how to deal with recursion (it took less than an hour), identifying vectors and structs (they do disappear in the bytecode, but there are code patterns to look out for), and handling certain errors in the game's code.

So, the two major tasks left are:

  1. Including block scopes, e.g. if-else and while statements (I can identify them, I just haven't put them in among the locals)
  2. Handling expressions, commonly known as operator precedence and associativity, e.g. int d = (a + b) * c
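
One common way to approach the expression task, sketched here in Python under stated assumptions: the stack operations are first rebuilt into an expression tree, and the printer then adds parentheses only where precedence or left associativity requires them. The postfix token list, the precedence table, and the function names are illustrative and are not the decompiler's actual data structures:

```python
# Assumed precedence for four left-associative binary operators.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def build_tree(postfix):
    """Turn a postfix token list (the order a stack machine evaluates in) into a tree."""
    stack = []
    for token in postfix:
        if token in PRECEDENCE:
            right = stack.pop()
            left = stack.pop()
            stack.append((token, left, right))
        else:
            stack.append(token)
    return stack.pop()


def render(node, parent_prec=0, is_right=False):
    """Print a tree as infix text with only the parentheses it needs."""
    if isinstance(node, str):
        return node
    op, left, right = node
    prec = PRECEDENCE[op]
    text = f"{render(left, prec)} {op} {render(right, prec, True)}"
    # Parenthesize when binding looser than the parent, or when sitting on the
    # right of an equal-precedence operator (keeps the evaluation order).
    needs_parens = prec < parent_prec or (is_right and prec == parent_prec)
    return f"({text})" if needs_parens else text


print("int d = " + render(build_tree(["a", "b", "+", "c", "*"])) + ";")
# prints: int d = (a + b) * c;
```

Parenthesizing every equal-precedence right operand is conservative: it yields `a + (b + c)` even though the parentheses are not needed for the arithmetic, but it preserves the order in which the bytecode evaluated the operands.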
1 hour ago, AmanoJyaku said:

there's no guarantee a parameter is used

Interesting. I had assumed that the compiler substituted in default values when compiling the NCS, but are you saying that it's the engine that does that at runtime? Which would also mean DeNCS adds missing defaults back in when decompiling then.

11 hours ago, DarthParametric said:

Interesting. I had assumed that the compiler substituted in default values when compiling the NCS, but are you saying that it's the engine that does that at runtime? Which would also mean DeNCS adds missing defaults back in when decompiling then.

Sorry, I wasn't clear. You are correct that the compiler substitutes defaults into the bytecode; the engine does not substitute them at runtime.

(I'm preparing a post to better explain NCS in case someone wants to port this to a different language, or make a better decompiler.)

Spoiler

 Let's say you have the call Sub0(1, 2, 3):


void main()
{
	Sub0(1, 2, 3);
	return;
}

void Sub0(int a, int b, int c)
{
	b + c;
	return;
}

The NCS call in void main() looks like this:


CONSTI 3
CONSTI 2
CONSTI 1
JSR Sub0

The inside of Sub0 looks like this:


//Top of the stack is always 0
//Upon entry to the subroutine -4 is a, -8 is b, -12 is c

CPTOPSP -12 4
//Now -4 is the copy of c, -8 is a, -12 is b, -16 is c

CPTOPSP -12 4
//Now -4 is the copy of b, -8 is the copy of c, -12 is a, -16 is b, -20 is c

ADDII
//Now -4 is the result of b + c, -8 is a, -12 is b, -16 is c

MOVSP -4
//Now -4 is a, -8 is b, -12 is c

MOVSP -12
//All parameters destroyed

RETN

 

This is a silly example. But, it demonstrates the impossibility of identifying the type and value of an unused parameter from inside a function.
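
The walkthrough above can be replayed with a toy stack simulator. This is only a Python sketch that assumes 4-byte cells and models just the three opcodes used in Sub0; the `run` function and the label strings are made up for illustration. It records which stack slots are ever read, which shows that nothing inside Sub0 touches `a`:

```python
def run(ops, stack):
    """Execute a tiny subset of NCS ops and return the labels that were read.

    `stack` is a list of labels with the top of the stack at the end.
    Offsets and sizes are in bytes, one cell being 4 bytes.
    """
    read = set()
    for op, *args in ops:
        if op == "CPTOPSP":
            offset, size = args
            start = len(stack) + offset // 4
            cells = stack[start:start + size // 4]
            read.update(cells)   # copying a slot counts as reading it
            stack.extend(cells)
        elif op == "ADDII":
            stack.pop()
            stack.pop()
            stack.append("tmp")  # result of the addition
        elif op == "MOVSP":
            (offset,) = args
            del stack[len(stack) + offset // 4:]
    return read


# Stack on entry to Sub0: c was pushed first, so a is on top at -4.
stack = ["c", "b", "a"]
sub0 = [
    ("CPTOPSP", -12, 4),
    ("CPTOPSP", -12, 4),
    ("ADDII",),
    ("MOVSP", -4),
    ("MOVSP", -12),
]
used = run(sub0, stack)
unused = {"a", "b", "c"} - used
# used == {"b", "c"}, unused == {"a"}, and the stack is empty afterwards
```

Because `a` is only ever popped by the final MOVSP, its type has to come from a call site, where the CONSTI that pushed it is visible.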

I can't say I've seen this in practice. There are 2,500 files, I'm not looking at them all that closely. But I have to code for the possibility, or risk corrupting the stack.


Ah, you were talking about the function not using all the stated parameters. I thought you were talking about calling a function without specifying all the parameters. I suppose that sort of sloppiness is not out of the question when dealing with mod-generated scripts.


I forgot to do the monthly update. 🤯 I was all set to finish the decompiler, and of course I ran into trouble.

First, I had a family problem that has since been resolved.

The second is a mundane issue: my laptop has been suffering from BSODs for the past two months, and it's happening with increasing frequency. Sometimes when I wake the machine from sleep or hibernation it BSODs, and when I cold boot it doesn't see any storage. This only happens when it's on battery, and it seems to be when the battery is below 50%. Now that I have an idea of what triggers this I can work around it, but it was affecting my productivity.

The third is an issue with NCS files. My decompiler relied on files being written "correctly", so of course it bombed when fed a poorly written file. For example, k_inc_npckill:

Spoiler

void main()
{
    int nKillMode = GetScriptParameter(1);
    int nDelay = GetScriptParameter(2);
    int nDamage = GetScriptParameter(3);

    if ( nKillMode == 0 )
    {
        DamagingExplosion(OBJECT_SELF, nDelay, nDamage);
        return;
    }

    if ( nKillMode == 1 )
    {
        NonDamagingExplosion(OBJECT_SELF, nDelay);
        return;
    }

    if ( nKillMode == 2 )
    {
        KillCreature(OBJECT_SELF, nDelay);
        return;
    }
}

 

Nothing wrong with the source. Let's look at the NCS:

Spoiler

void main()
21      RSADDx
23      CONSTI 1
29      GetScriptParameter
34      CPDOWNSP -2 1
42      MOVSP -1
48      RSADDx
50      CONSTI 2
56      GetScriptParameter
61      CPDOWNSP -2 1
69      MOVSP -1
75      RSADDx
77      CONSTI 3
83      GetScriptParameter
88      CPDOWNSP -2 1
96      MOVSP -1
102     CPTOPSP -3 1
110     CONSTI 0
116     EQUALxx
118     JZ 170

124     CPTOPSP -1 1
132     CPTOPSP -3 1
140     CONSTO object
146     JSR 298
152     MOVSP -3
158     JMP 296

*164     JMP 170

170     CPTOPSP -3 1
178     CONSTI 1
184     EQUALxx
186     JZ 230

192     CPTOPSP -2 1
200     CONSTO object
206     JSR 1601
212     MOVSP -3
218     JMP 296

*224     JMP 230

230     CPTOPSP -3 1
238     CONSTI 2
244     EQUALxx
246     JZ 290

252     CPTOPSP -2 1
260     CONSTO object
266     JSR 1637
272     MOVSP -3
278     JMP 296

*284     JMP 290

290     MOVSP -3

296     RETN

 

The lines with asterisks are dead code: they never get executed, and the game is fine with that. The problem is that my decompiler was expecting the last operation of the true branch of an if statement to be part of the control path. The last op of the true branch is what tells you whether you're looking at a regular if, an if-else, or an if that exits the script, as seen above. The decompiler was looking at the wrong thing and returning the wrong results. I've now fixed that with an additional set of evaluations.
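
Dead instructions like these can be found mechanically with a reachability walk over the listing. The following Python sketch is not the decompiler's actual fix, only an illustration of the idea; it uses the offsets from the dump above starting at 102, and assumes that JSR returns to the next instruction, JZ can go both ways, JMP goes only to its target, and RETN ends the path:

```python
from collections import deque

# (offset, opcode, jump target or None), taken from the main() dump above.
LISTING = [
    (102, "CPTOPSP", None), (110, "CONSTI", None), (116, "EQUALxx", None),
    (118, "JZ", 170),
    (124, "CPTOPSP", None), (132, "CPTOPSP", None), (140, "CONSTO", None),
    (146, "JSR", None), (152, "MOVSP", None), (158, "JMP", 296),
    (164, "JMP", 170),
    (170, "CPTOPSP", None), (178, "CONSTI", None), (184, "EQUALxx", None),
    (186, "JZ", 230),
    (192, "CPTOPSP", None), (200, "CONSTO", None), (206, "JSR", None),
    (212, "MOVSP", None), (218, "JMP", 296),
    (224, "JMP", 230),
    (230, "CPTOPSP", None), (238, "CONSTI", None), (244, "EQUALxx", None),
    (246, "JZ", 290),
    (252, "CPTOPSP", None), (260, "CONSTO", None), (266, "JSR", None),
    (272, "MOVSP", None), (278, "JMP", 296),
    (284, "JMP", 290),
    (290, "MOVSP", None),
    (296, "RETN", None),
]


def reachable(listing, entry):
    """Return the set of offsets that some control path from `entry` can reach."""
    by_offset = {off: (op, target) for off, op, target in listing}
    following = {a[0]: b[0] for a, b in zip(listing, listing[1:])}
    seen = set()
    work = deque([entry])
    while work:
        off = work.popleft()
        if off in seen or off not in by_offset:
            continue
        seen.add(off)
        op, target = by_offset[off]
        if op == "RETN":
            continue
        if op in ("JMP", "JZ"):
            work.append(target)
        if op != "JMP" and off in following:
            work.append(following[off])  # fall through to the next instruction
    return seen


live = reachable(LISTING, entry=102)
dead = sorted(off for off, _, _ in LISTING if off not in live)
# dead == [164, 224, 284], the three lines marked with asterisks
```

With the dead jumps filtered out first, the last live op of each true branch is the JMP 296, which is what marks these as ifs that exit the script.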

Now I'm back to working on producing output. This is tedious as it requires keeping track of data, individual operations, and the context in which the data and operations are being used. Believe it or not, a block of code will decompile differently based on code that comes before and after it. (For example, RSADDx does not mean "create a named variable" as I incorrectly assumed months ago. It could create a named variable or a temporary variable, based on the context.)

This morning I had an idea as to the overall rule for determining the beginning and end of statements, so I'll be working on that this weekend. I hope to have a status update in a few days.
