AmanoJyaku 184 Posted February 5, 2020

I put my original mod on hold since someone else is making it. And, wow, does it look good! So, on to the most neglected project: a new DeNCS. It's currently able to interpret op codes and arguments, so the next step is the higher-level control flow.

Here's where you can help:

- Byte Code - I obtained a list of op codes from here. I need to make sure it's complete and correct.
- NWScript - I am compiling a list of operators, punctuators, and keywords from here, and here. I need to make sure it's complete and correct.
- Mapping - This is the big one, making sure script converts to byte code, and vice versa. 100% identical conversion, every time. No pressure.
- DOT Diagram - I need a program that can make diagrams from DOT.
- DOT Generator - I can generate the DOT from my code, but it would be helpful if something could do this for me, as well.
- Name - Right now, it's called DeNCS 2020. Narrowly edging out NewNCS. Please, give it a good name.

A couple of things I want to address.

DeNCS sometimes reports a partial-byte mismatch. At first, I ignored this just like every other modder. However, while testing this tool I compared the output of an NCS file straight from the game to the output DeNCS generated... Well, it looks like DeNCS attempts to convert the NCS to source code, then attempts to convert the source code back to an NCS file. Then, it compares the original NCS to the new one. And, if they don't match? Partial-byte mismatch. I don't know why DeNCS doesn't always perform a perfect conversion, but it's something to investigate. Particularly, since there are reports that even NCS files generated by Bioware and Obsidian had bugs in them. Troubleshooting the NCS files will make this longer since I have no way of knowing what the original files should have looked like. (But, I can guess just like DeNCS seems to do.)

This is command-line only. There was never any intention to make this a stand-alone tool. However, development will take longer than I expected, which means a GUI is the least of my priorities. At some point, I want to merge this into a toolset, maybe even the one up above. So, no GUI. (That also means the stand-alone tool won't be around for very long.)

Got any feedback? Thanks!

Edit: I just downloaded Graphviz, so I am covered with DOT diagrams. I think the included library will allow me to generate DOT, too.
DarthParametric 3,795 Posted February 6, 2020

7 hours ago, AmanoJyaku said:
Byte Code - I obtained a list of op codes from here. I need to make sure it's complete and correct.

Presumably you want to ask @DrMcCoy. Xoreos has done the most recent and thorough (publicly available) work on it. I gather you have perused their repo?

7 hours ago, AmanoJyaku said:
Name - Right now, it's called DeNCS 2020. Narrowly edging out NewNCS. Please, give it a good name.

I'd advise against any incorporation of "DeNCS" in the name. Xoreos has already taken "ncsdis" and "ncsdecomp". You could perhaps go with something like "NCS2NSS".

7 hours ago, AmanoJyaku said:
This is command-line only.

That's fine. The Xoreos tools are all commandline, and real men compile directly with nwnnsscomp which is commandline as well. Batch scripts are easy enough to write. And as with nwnnsscomp, if someone was keen enough they could always write a GUI front-end for it (ask @JCarter426 - he'll need a GUI project for his course later in the year presumably).

7 hours ago, AmanoJyaku said:
DeNCS sometimes reports a partial-byte mismatch.

Yes. It's presumably due to the way that DeNCS interprets the bytecode, since there's obviously not a 1:1 match between it and the source NSS. I would suggest you try decompiling some of the global scripts that have Bioware source available and compare DeNCS's output with the original NSS. There are certain quirks in the way DeNCS likes to format things, which, while functionally the same, may introduce the subtle differences it complains about with partial matches.
DrMcCoy 40 Posted February 6, 2020

Yeah, do have a look at the nwscript directory: https://github.com/xoreos/xoreos-tools/tree/master/src/nwscript

Specifically, the instruction.h: https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/instruction.h, which contains a list of opcodes (including 2 each that were introduced by Dragon Age: Origins and Dragon Age II).

You might also want to look at https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/game_kotor.h and https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/game_kotor2.h for a list of engine functions and their signatures (return type + types of parameters). Alternatively, you can parse the nwscript.nss from the games yourself.

If you haven't already, have a read over my blog post about disassembling the bytecode: https://xoreos.org/blog/2016/01/12/disassembling-nwscript-bytecode/. It also shows a bug in BioWare's compiler (though that particular bug is not present in the KotOR scripts, it was fixed before KotOR development started it seems).

If you want to decompile all scripts, you'll find that you're going to have problems with recursion, i.e. functions that call themselves. You need to analyze how a function leaves the stack to be able to continue with the code after the call, so you need to branch into a callee first before continuing with the caller. If the callee is the caller itself (or A calls B calls C calls A again)... that's a problem. This is essentially the halting problem, and there's no general solution, unfortunately.
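To make the recursion concern concrete, here is a minimal sketch of how a tool might at least flag the problem cases before attempting stack analysis. It is not taken from xoreos or DeNCS; the call-graph shape (a dict from subroutine name to the names it calls) and all subroutine names are assumptions made up for illustration. It only detects direct and indirect recursion (A calls B calls C calls A); it does not solve the stack-analysis problem described above.

```python
from typing import Dict, List, Set


def find_recursive_subroutines(call_graph: Dict[str, List[str]]) -> Set[str]:
    """Return every subroutine that can reach itself through one or more calls."""
    recursive: Set[str] = set()
    for start in call_graph:
        # Iterative depth-first search over everything `start` can call.
        stack = list(call_graph.get(start, []))
        seen: Set[str] = set()
        while stack:
            node = stack.pop()
            if node == start:
                # We got back to where we began: `start` is on a call cycle.
                recursive.add(start)
                break
            if node in seen:
                continue
            seen.add(node)
            stack.extend(call_graph.get(node, []))
    return recursive


# Hypothetical call graph: A -> B -> C -> A is indirect recursion, D calls itself.
example = {
    "main": ["A", "D", "E"],
    "A": ["B"],
    "B": ["C"],
    "C": ["A"],
    "D": ["D"],
    "E": [],
}
print(sorted(find_recursive_subroutines(example)))  # prints ['A', 'B', 'C', 'D']
```

Subroutines outside the returned set can be analyzed callee-first; the ones inside it need some other strategy.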
DarthParametric 3,795 Posted February 6, 2020

Interestingly though, most scripts that DeNCS chokes on do not appear to be due to recursion, at least judging by running said scripts through ncsdis, which reports the presence of recursion. In my experience, it's the use of per-planet includes that trips it up the most. The module OnEnter and OnHeartbeat scripts for Tatooine in K1 are a particularly good example of this.
DrMcCoy 40 Posted February 6, 2020

As for operators, you should probably also make sure you know the precedence rules. I'm not sure NWScript follows the C rules there exactly. I haven't yet looked at that, since it's unambiguous in the bytecode.

Oh, and also, there's another bug I've found, but I'm not sure in which games it was present. It had to do with parameter shadowing. I.e.

    void foobar(int blah) { int blah = barfoo(blah); }

(excuse the quirky formatting, this editor is weirder than I remember)

I've seen the produced bytecode use the uninitialized value of the just-created local variable blah as an input for barfoo(), instead of the parameter of foobar(). Maybe something to keep an eye on.

6 minutes ago, DarthParametric said:
In my experience, it's the use of per-planet includes that trips it up the most. The module OnEnter and OnHeartbeat scripts for Tatooine in K1 are a particularly good example of this.

Hmm, depends on how DeNCS operates. Does it also try to decompile functions that are not called? I.e. if you have an include with foobar1(), foobar2() and foobar3(), and the script itself just calls foobar1() (and that doesn't call the other functions), does DeNCS also decompile foobar2() and foobar3()? Because IIRC ncsdis completely ignores foobar2() and foobar3() in that case and wouldn't see any recursion there.
DarthParametric 3,795 Posted February 6, 2020

No, because those wouldn't be pulled into the compiled script if it didn't call them.
AmanoJyaku 184 Posted March 6, 2020

It's been a month, so here's a progress report:

- Completed decoding of opcodes, data types, header fields
  - Validating basic fields and generating (helpful?) error messages
- Embedded NWScript engine calls into program code
  - No need for modders to provide nwscript.nss
- Created call graphs of subroutines
  - Generating graph description files for Graphviz renderer
- Created control-flow graphs of basic blocks
  - Generating graph description files for Graphviz renderer
- Created call stack simulator
  - Keeping track of which variables are modified

To do:

- Identifying iteration, selection and jump statements
- Operator associativity and precedence
- Type conversions
- Byte code conversion to source code
- Source code conversion to byte code
- GUI
- Set up new dev laptop (dropped current laptop last night, awaiting delivery of new one) 😢
- Probably more stuff, but I don't know what I don't know, you know?
AmanoJyaku 184 Posted March 19, 2020

It's been quite interesting these last few weeks, eh? I hope you are all doing well.

There's been progress in analyzing the NCS byte code. Graphviz has been invaluable in producing the maps necessary to visualize the program flow. At the moment, one script has been reverse engineered. By no means is a decompiler close to being ready, but it is a significant step in analyzing the program flow.

At the moment, several NCS code patterns have been identified that map back to NWScript:

- Dead code (e.g. perfectly valid code that never gets called, and theoretically could be removed)
- Assignment (e.g. i = 5)
- Named variable declaration (e.g. int i, float f, string s, object o)
  - initialized from constant (e.g. int i = 5, float f = 0.0, string s = "Jello, World!")
  - initialized from engine routine (object o = GetFirstPC())
  - initialized from subroutine (bool b = IsItTrue())
  - initialized from named variable (int i = integer_i)
- Selection
  - If (GetLocalBoolean(55))
  - If (!GetLocalBoolean(55))
  - If (IsItTrue())
  - If (!IsItTrue())
  - Switch (i) {case 0: break; default: break;}

Attached is a control flow diagram of the script k_sup_galaxymap.ncs. Red blocks are dead code. There are three switch statements in the code. Can you find them?

Expect more updates in the next few weeks!
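For readers who want to reproduce this kind of diagram, the following is a rough sketch of emitting Graphviz DOT for one subroutine's control-flow graph. It is not the tool's actual code: the function name, the block names, and the edge dictionary are hypothetical, and the only thing borrowed from the thread is the convention that dead blocks are filled red and live blocks green.

```python
from typing import Dict, Iterable, List


def cfg_to_dot(name: str, edges: Dict[str, List[str]], unreachable: Iterable[str] = ()) -> str:
    """Render one subroutine's basic blocks as a Graphviz digraph."""
    dead = set(unreachable)
    lines = [f'digraph "{name}" {{', "    node [shape=box, style=filled];"]
    # One node statement per basic block, colored by reachability.
    for block in edges:
        color = "red" if block in dead else "green"
        lines.append(f'    "{block}" [fillcolor={color}];')
    # One edge statement per jump or fall-through between blocks.
    for block, successors in edges.items():
        for target in successors:
            lines.append(f'    "{block}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical blocks: "orphan" has no incoming edge, so it is passed in as unreachable.
example_edges = {
    "entry": ["case_0", "default"],
    "case_0": ["exit"],
    "default": ["exit"],
    "orphan": ["exit"],
    "exit": [],
}
print(cfg_to_dot("sub_example", example_edges, unreachable=["orphan"]))
```

Saving the output to a file such as `sub_example.dot` and running `dot -Tpng sub_example.dot -o sub_example.png` should produce an image with Graphviz installed.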
DarthParametric 3,795 Posted March 20, 2020

7 hours ago, AmanoJyaku said:
Dead code (e.g. perfectly valid code that never gets called, and theoretically could be removed)

I'm assuming these are all the debug functions? PrintString, AurPostString and the like? I was playing with those only yesterday and noticed that despite removing if (!ShipBuild()) checks, they still didn't appear to function. I assume they gutted them from the retail release altogether?

While they are unnecessary for the code to run, I would argue they should remain in any decompiled scripts since, in the absence of commented source for the module scripts, the debug functions are the only thing remaining that can indicate developer intent. Additionally, the user can manually change them to SendMessageToPC for their own debugging purposes if needed.
bead-v 251 Posted March 20, 2020

3 hours ago, DarthParametric said:
I'm assuming these are all the debug functions? PrintString, AurPostString and the like? I was playing with those only yesterday and noticed that despite removing if (!ShipBuild()) checks, they still didn't appear to function. I assume they gutted them from the retail release altogether? While they are unnecessary for the code to run, I would argue they should remain in any decompiled scripts since, in the absence of commented source for the module scripts, the debug functions are the only thing remaining that can indicate developer intent. Additionally, the user can manually change them to SendMessageToPC for their own debugging purposes if needed.

It may be that those weren't intended to print anything in the game itself but a console window, which would be present in a working/debug version of the game. If that's the case, it's not so much that they don't function as that they have no effect.

In any case, anything that can be preserved, should be!
DarthParametric 3,795 Posted March 20, 2020

As when any such debates arise in game development - add a toggle for it!
AmanoJyaku 184 Posted March 20, 2020

9 hours ago, DarthParametric said:
I'm assuming these are all the debug functions? PrintString, AurPostString and the like? I was playing with those only yesterday and noticed that despite removing if (!ShipBuild()) checks, they still didn't appear to function. I assume they gutted them from the retail release altogether? While they are unnecessary for the code to run, I would argue they should remain in any decompiled scripts since, in the absence of commented source for the module scripts, the debug functions are the only thing remaining that can indicate developer intent. Additionally, the user can manually change them to SendMessageToPC for their own debugging purposes if needed.

My apologies. As with all things STEM, terminology must be accurate. I am learning compiler design in order to create this compiler/decompiler, so I am still unfamiliar with communicating this topic effectively.

"Dead code" has two meanings, one more accurate than the other:

- Instructions that compute a value that is never used
- Unreachable code

The red blocks do not represent "instructions that compute a value that is never used". They may fit the first definition, but I am not looking for that in these diagrams. The red blocks are "unreachable code"; they never get called, which is why they aren't preceded by another green block. Only the function entry point (the very first block in a function/subroutine) should be without a preceding block.

The image above has 10 functions/subroutines, each represented by a box filled with one or more colored blocks. (Accuracy alert!!! Functions and subroutines are similar, but not identical. The difference only matters to academics and compiler designers, but this entire post is about pedantry...) Thus, there should only be 10 blocks that aren't preceded by another green block: the function entry points. If a block is in red, that's because no other block explicitly jumps to it or implicitly proceeds to it.

These blocks are therefore useless, and could be removed if they serve no other purpose. x86/x86-64 uses the NOP op code to allow for instruction padding to improve memory alignment, and resulting access times. To my knowledge, such a technique has no effect in the NWScript runtime. If I am correct, then the code could be stripped. My concern is that perhaps there IS a reason for these blocks, or that the existing script compilers are riddled with bugs...

Finally, these blocks are not the result of debug code. One of the switch statements in the diagram has a debug function under a case label in its source.

    switch(nPlanet)
    {
        case PLANET_PERAGUS:
        {
            AurPostString("ERROR: We should not be able to travel back to peragus.",0,10,5.0);
        }
        break;
        //Other case statements removed
    }

The compiled case label:

    case 8:
    0x00000ffb CPTOPSP -4 4
    0x00001003 CONSTI 8 //PLANET_PERAGUS defined as 8 in nwscript.nss
    0x00001009 EQUALII
    0x0000100b JNZ 0x000010dd

The target block in the compiled file:

    case 8 block 0x000010dd
    0x000010dd CONSTF 0.000000
    0x000010e3 CONSTI 10
    0x000010e9 CONSTI 0
    0x000010ef CONSTS ERROR: We should not be able to travel back to peragus.
    0x0000112a ACTION 582 4 //Engine functions are zero-indexed, AurPostString() is #582, and it takes 4 parameters
    0x0000112f JMP 0x0000158e

Case label 8 jumps to the Case 8 block, and calls the debug function. No unreachable code here!
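The "red block" test described above can be sketched as a reachability search from the subroutine's entry block. This is only an illustration under assumptions: the edge dictionary is hand-written, and apart from 0x0ffb, 0x10dd and 0x158e (which come from the listing above) the block offsets are hypothetical. Searching from the entry is slightly stricter than checking for "no preceding block", because it also catches groups of blocks that only jump to each other.

```python
from collections import deque
from typing import Dict, List, Set


def unreachable_blocks(edges: Dict[int, List[int]], entry: int) -> Set[int]:
    """Return the blocks that cannot be reached from `entry` by any jump or fall-through."""
    reachable = {entry}
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        for succ in edges.get(block, []):
            if succ not in reachable:
                reachable.add(succ)
                queue.append(succ)
    # Anything listed in the graph but never visited is a "red" block.
    return set(edges) - reachable


# Block start offsets mapped to successor offsets.
# 0x1011 and 0x1135 are made-up offsets used only for this example.
example_cfg = {
    0x0FFB: [0x10DD, 0x1011],  # case label: JNZ taken, or fall through
    0x1011: [0x158E],          # hypothetical next block
    0x10DD: [0x158E],          # case 8 body ends with JMP 0x158e
    0x1135: [0x158E],          # hypothetical block that nothing jumps to
    0x158E: [],
}
print(sorted(hex(b) for b in unreachable_blocks(example_cfg, 0x0FFB)))  # prints ['0x1135']
```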
DarthParametric 3,795 Posted March 20, 2020

56 minutes ago, AmanoJyaku said:
or that the existing script compilers are riddled with bugs...

Probably not an unsafe assumption, even if it is not the case in this specific example. Perhaps @DrMcCoy could shed some further light.
DrMcCoy 40 Posted March 20, 2020

I don't remember seeing this exact case. Then again, it's been a while since I touched that code and I'm not known to have a good memory. :P

Also, my disassembler just follows the starting segment and IIRC I don't even check for unreachable blocks without an edge leading into them. That's maybe something that could use improvement. I take GitHub pull requests! ;)

However, I can say that I have found a few bugs in BioWare's script compiler (I explained two in my replies above, I think), so I wouldn't particularly rule that out.

And while I can't name any specifics right now, I also never got the feeling that their compiler was all that great in optimizing, fusing instructions, or removing outright unnecessary instructions. I don't think they focussed on that at all, rather keeping it simple and working without any surprises (bugs notwithstanding).
AmanoJyaku 184 Posted March 20, 2020

46 minutes ago, DrMcCoy said:
I don't remember seeing this exact case. Then again, it's been a while since I touched that code and I'm not known to have a good memory. Also, my disassembler just follows the starting segment and IIRC I don't even check for unreachable blocks without an edge leading into them. That's maybe something that could use improvement. I take GitHub pull requests! However, I can say that I have found a few bugs in BioWare's script compiler (I explained two in my replies above, I think), so I wouldn't particularly rule that out.

Unreachable code, by definition, should have no effect on script execution. Analysis of unreachable code is simply to facilitate analysis of reachable code. For example, this sequence:

    0xNNNNNNNN CPTOPSP -4 4
    0xNNNNNNNN CONSTI N
    0xNNNNNNNN EQUALII
    0xNNNNNNNN JNZ 0xNNNNNNNN

is analyzed to see if it is a switch case label. Every case label I've seen fits this pattern, and for all but the first case label the above code is the entirety of a basic block. However, the first case label has instructions prior to CPTOPSP. It's necessary to see if such code is unreachable, and therefore part of a separate basic block, or actually a functional part of the first case's basic block. And, that's assuming it's actually a switch! It could just be an if (false) {} statement, but we have to analyze the flow to find out!

46 minutes ago, DrMcCoy said:
And while I can't name any specifics right now, I also never got the feeling that their compiler was all that great in optimizing, fusing instructions, or removing outright unnecessary instructions. I don't think they focussed on that at all, rather keeping it simple and working without any surprises (bugs notwithstanding).

I won't rule out optimizations, but I doubt they exist. They would be meaningless since NWScript isn't meant for high performance. We aren't using NWScript to create databases, calculate protein folds, or process bank transactions. Slow code won't be noticeable, but such optimizations are error-prone and require a lot of development effort that clearly would have been better spent elsewhere.
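The four-instruction case-label pattern above lends itself to a simple matcher. The sketch below is an assumption-laden illustration, not DeNCS 2020's code: the `Instruction` and `CaseLabel` types and the `match_case_label` function are invented here, and a successful match is only a candidate, since, as noted above, the surrounding control flow still has to confirm that it really is a switch.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instruction(NamedTuple):
    """Hypothetical decoded instruction: byte offset, mnemonic, integer arguments."""
    offset: int
    mnemonic: str
    args: Tuple[int, ...] = ()


class CaseLabel(NamedTuple):
    value: int   # constant compared against the switch expression
    target: int  # offset of the block that JNZ jumps to on a match


def match_case_label(code: List[Instruction], i: int) -> Optional[CaseLabel]:
    """Check whether the four instructions starting at index i look like a case label."""
    window = code[i:i + 4]
    if len(window) < 4:
        return None
    copy, const, equal, jump = window
    if copy.mnemonic != "CPTOPSP" or copy.args != (-4, 4):
        return None
    if const.mnemonic != "CONSTI" or len(const.args) != 1:
        return None
    if equal.mnemonic != "EQUALII":
        return None
    if jump.mnemonic != "JNZ" or len(jump.args) != 1:
        return None
    return CaseLabel(value=const.args[0], target=jump.args[0])


# The "case 8" label from the k_sup_galaxymap listing earlier in the thread.
code = [
    Instruction(0x0FFB, "CPTOPSP", (-4, 4)),
    Instruction(0x1003, "CONSTI", (8,)),
    Instruction(0x1009, "EQUALII"),
    Instruction(0x100B, "JNZ", (0x10DD,)),
]
label = match_case_label(code, 0)
print(label.value, hex(label.target))  # prints 8 0x10dd
```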
JCarter426 1,220 Posted March 20, 2020

9 hours ago, AmanoJyaku said:
I am learning compiler design in order to create this compiler/decompiler, so I am still unfamiliar with communicating this topic effectively.

It's like I'm looking at my future, and my future is written in assembly.

I whipped up some tests and attached them. I don't know how much help they'll be, though, since if anything they only confirm what's already been said. The compiler we have, at least, does

On 2/6/2020 at 1:08 AM, DrMcCoy said:
Does it also try to decompile functions that are not called?

On 2/6/2020 at 1:16 AM, DarthParametric said:
No, because those wouldn't be pulled into the compiled script if it didn't call them.

The preprocessor that we're using, at least, will only include code that is actually used. Which is good. My tests confirm that. Based on the garbage I've seen in decompiled game scripts, though, I think it's possible BioWare's might not have been as optimized. I often see tons of integers declared in decompiled scripts that are never used. I can't remember if I've ever seen any functions I was sure weren't used, though.

9 hours ago, AmanoJyaku said:
"Dead code" has two meanings, one more accurate than the other: Instructions that compute a value that is never used Unreachable code

I tried various things that should lead to dead code but I think it's all of the first variety. I tried to make an unreachable block in the 6th one, but I still see jump points in the disassembled code.

ncstest.zip
AmanoJyaku 184 Posted March 22, 2020

On 3/20/2020 at 7:59 PM, JCarter426 said:
I tried various things that should lead to dead code but I think it's all of the first variety. I tried to make an unreachable block in the 6th one, but I still see jump points in the disassembled code.

Scripts 1-5 have unreachable code (remember, I'm ignoring dead/unused code for now). Script 6 is fine, all basic blocks are reachable. How did you compile this?

On 3/20/2020 at 7:59 PM, JCarter426 said:
It's like I'm looking at my future, and my future is written in assembly.

*triggered*
JCarter426 1,220 Posted March 22, 2020

I compiled with the version of NWNSSCOMP found here. In KOTOR 2 mode, if that matters.
AmanoJyaku 184 Posted March 22, 2020

54 minutes ago, JCarter426 said:
I compiled with the version of NWNSSCOMP found here. In KOTOR 2 mode, if that matters.

Ooooooooooooh!!! There are issues in almost all of the tools on that site. I'm sure it's the case for NWNSSCOMP, as well, although I haven't used it myself. Don't need to, either, seeing as how the scripts you gave me are compiled incorrectly. I'm particularly concerned about this:

Quote
'Star Wars: Knights of the Old Republic' Script Compiler/Decompiler
based on 'NeverWinter Nights' Script Compiler/Decompiler
Copyright 2002-2003, Edward T. Smith
Modified by Hazard (hazard_x@gmx.net)
Modified further tk102 for stoffe -mkb- (v0.03b)

Modified? Modified how? Where's the source? Was this reverse engineered? Etc... Given that this was worked on by three different people/groups, my guess is someone screwed up while making updates. Ah, well... All the more reason to produce a new compiler.

Edit: So, I just read Torlak's old site. Here's the full description of NWNSSCOMP and its potential problems. tl;dr, there most likely are bugs in NWNSSCOMP:

Quote

What is NwnNssComp?

NwnNssComp is a standalone NWScript compiler and decompiler for Bioware's Neverwinter Nights game. It allows module developers to quickly recompile all their scripts.

Roboius' New NWNNSSCOMP
Click here to download the Original NWNNSSCOMP

Reasons to Use NwnNssComp

Like the model compiler, there is little need for a standalone script compiler. However, there are a few instances where this compiler has some advantages.

First, NwnNssComp is fast. Depending on your system memory, it can be any where from 12 to 20 times faster than Bioware's toolkit compiler.

Second, for groups creating large modules stored in source management systems such as CVS, NwnNssComp provides an all important stand alone compiler needed to merge script source into a module.

DO NOT recompile Bioware's scripts contained in their BIF files or in the override directly.
You will probably break NWN or at least prevent it from being patched in the future.

Does NwnNssComp Work?

Although it is impossible to prove that any program is bug free, NwnNssComp has built-in regression tests that allow it to be validated against Bioware's script compiler. All of the NWM files were copied into the modules directory and had their extension renamed to MOD. Then each was loaded in Bioware's toolkit and built. Then NwnNssComp was run using the "-t2" option. This compiled all the scripts in the selected module and compared the results against Bioware's compiled files. Excluding include files and scripts that would not even build with Bioware's compiler, 7857 script files verified without error. However, it is important to note that around ten scripts failed to match 100% due to slight differences in floating point conversions between the two compilers. Future versions of the "-t2" option will detect and ignore this insignificant difference.

Running NwnNssComp

NwnNssComp is a command line program. Thus, it doesn't have any fancy window interface. If you wish, you can copy "nwnnsscomp.exe" into your window's directory.

    Usage: nwnnsscomp [-cdeox] [-t#] infile [outfile]
    infile - name of the input file.
    outfile - name of the output file.
    -c - Compile the script (default)
    -d - Decompile the script (can't be used with -c)
    -e - Enable non-Bioware extensions
    -o - Optimize the compiled source
    -x - Extract script from NWN data files
    -t1 - Perform a compilation test with BIF scripts
    -t2 - Perform a compilation test with the given module
    -t3 - Optimization space saving report with the given module

NwnNssComp supports two basic modes, compilation and decompilation. To decompile a script, use the following command:

    NWNNSSCOMP -d nw_all_feedback6

This assumes that there exists a file called "nw_all_feedback6.ncs" in the current directory. To decompile a script stored in the Bioware data files, add the "-x" option.
    NWNNSSCOMP -dx nw_all_feedback6

In both these cases, a file called "nw_all_feedback6.pcode" will be created in the current directory. To compile a script, use the following command:

    NWNNSSCOMP -c nw_all_feedback6

In this case, the file called "nw_all_feedback6.nss" will be compiled and the results saved in the file "nw_all_feedback6.ncs". To compile a script with optimizations, use the following command:

    NWNNSSCOMP -co nw_all_feedback6

Using NwnNssComp with Wildcards

NwnNssComp supports wildcards in the input file name. Standard wildcards such as "*" and "?" are fully supported. Wildcards can be used when compiling or decompiling scripts stored on disk or stored in the Bioware data files.

    NWNNSSCOMP -dx *

This command will decompile all the scripts in the Bioware data files and store the output in the current directory.

    NWNNSSCOMP -dx * pcode\

This command will decompile all the scripts in the Bioware data files and store them in the "pcode" subdirectory. Make sure this directory has been created.

    NWNNSSCOMP -c ascii\* binary\

This command will compile all the scripts in the subdirectory "ascii" and store the output in the "binary" subdirectory.

If you wish to recompile all the scripts in your module, follow these steps:

- Open the module using the Bioware's toolkit.
- Start a command prompt.
- Change your directory to the module's temp directory. This is usually something like "C:\NeverwinterNights\Nwn\Modules\temp0\". However, if you installed NWN in another directory or if Bioware's toolkit aborted leaving old temp directories, then your path might be different.
- Make sure no scripts are currently open in Bioware's toolkit script editor.
- Invoke NwnNssComp with the following command "nwnnsscomp -c *.nss".
- If the "save" option isn't available in Bioware's toolkit, then bring up the module properties (under Edit) and then press OK. That will enable the save option.
NwnNssComp Regression Tests

To help ensure that NwnNssComp actually works as advertised, it has a series of three regression tests.

The "-t1" regression test compiles all the scripts located in the Bioware BIF files and compares the results with any found compiled version of the script. This test is known to generate many errors due to scripts that can't be compiled by even Bioware's toolkit or compiled scripts that no longer match the source.

The "-t2" regression test compiles all the scripts located in the module specified by "infile". The results are compared against an existing compiled version of the script. This test is known to be reliable as long as the module has been recently compiled with Bioware's toolkit.

The "-t3" test isn't really a regression test. Given a module specified by the "infile" parameter, the scripts contained within the module are compiled with optimizations and then a final report of any savings in compiled script file size is reported.

Optimizations (-o)

NwnNssComp supports a wide range of optimizations that improve reliability, file size, and execution speed. With all optimizations enabled, the size of the compiled script files shipped in the Bioware modules reduced in size by an average of 11%. Following is a list of all the optimizations.

1. Correction of Bioware's compiler bugs

All known Bioware script compiler bugs are corrected when optimizations are enabled. In the case of the "break/continue" stack bug, this fix improves reliability. In the case of the logical OR (||) bug, it improves program speed while saving two instructions. Overall, this optimization improves file size, runtime performance, and reliability.

2. Optimize global variables

There are two minor problems with globals and Bioware's script compiler. To understand the problems, you first must understand how the script engine supports globals. In traditional languages, globals are stored in a segment of memory.
However, with Bioware's script engine, globals are treated as variants of normal local variables. If a global exists in a script, then a dummy global routine is created that will at runtime create the global variables. This is done every time the script is invoked. Also, when actions are used with routines such as DelayCommand, the whole global variable stack and the current routine's local variable stack must be saved. So, it is important that when possible, global variables are eliminated.

The first optimization with global variables is the correction of a problem in Bioware's script compiler that would cause this dummy global routine to be generated if a structure has been defined but no global variables actually exist. Elimination of this extra routine removes five or eight needless instructions that would get executed every time. This improves file size and run time performance.

The second optimization is the identification of global variables that are unreferenced or never modified. In the case of unreferenced variables, they are eliminated. This saves one instruction if the global variable didn't include any initialization or four instructions if it did. Also, if a global variable is never modified, then it is effectively a constant. By eliminating it as a global and treating it as a constant in any referencing routines, we save four instructions and a global variable. This optimization is not done for strings or global variables that do not include any initialization. Overall, this optimization improves file size and runtime performance.

4. Structure copy for data access

When elements need to be accessed in structures, Bioware's compiler invokes two instructions. The first instruction makes a copy of the whole structure onto the top of the stack. The second instruction then destroys all elements of the structure excluding the structure element the script needs to access.
When optimized, this behavior is changed to just copy the element of interest directly to the top of the stack. Not only does this save one instruction, but it saves significant amounts of stack processing. This optimization improves file size and runtime performance.

5. Lingering stack adjustment after a return

In Bioware's script compiler, when a value is to be returned, the expression is computed. Then the top of the stack is copied to the return value. Next, the computed return value still on the top of the stack is removed. Finally a JMP instruction is invoked to branch to the routine's final cleanup code and return. Unfortunately, after the JMP statement, another instruction exists to remove the computed return value from the top of the stack. This code is never executed so it doesn't affect run time performance. However, it does waste six bytes of data for the instruction. NwnNssComp doesn't generate this extra instruction with optimizations enabled. This optimization improves file size only.

6. Extra JMP instruction in an "if" statement with no "else"

When generating code, the Bioware compiler treats "if" statements without "else" clauses the same as "if" statements with "else" clauses. This means that at the end of the "if" clause, there is a JMP instruction to skip over the "else" clause. Optimization removes this extra JMP. This optimization improves file size and runtime performance.

7. Negative test and extra JMP instruction in "do" loops

"do" loops have their looping test at the end of the loop. Thus, it is natural to test the conditional and jump to the start of the loop's body when the conditional is TRUE. This is actually counter to how "for" and "while" loops work where they jump out of the loop when the conditional is FALSE. In Bioware's script compiler, instead of using a simple jump if non-zero (JNZ), they jump if zero (JZ) to break out of the loop. When the conditional is TRUE, this jump is not taken.
Following the test is a JMP instruction pointing back to the start of the loop. This means that it takes two instructions instead of one for each loop iteration. When optimizations are enabled, the JZ is replaced with a JNZ and the following JMP is not generated. Thus, if the conditional is TRUE, then it jumps to the start of the loop's body. Otherwise it just continues normal execution. This optimization improves file size and runtime performance.

8. "continue" branches to end of "while" loop

When using a "continue" inside of a "while" loop, the Bioware compiler branches to the end of the loop where there exists a JMP back to the conditional test at the start of the loop. When optimization is enabled, a "continue" will branch directly to the conditional test and thus avoid the extra JMP. This optimization improves file size and runtime performance.

9. Empty "for" loop generates a dummy conditional

When a script contains a "for" loop with an empty conditional (i.e. for (;;)), the Bioware compiler generates a dummy conditional of "1". When optimization is enabled, this dummy conditional isn't generated. This optimization improves file size and runtime performance.

10. Declarations with initializations

The Bioware compiler doesn't treat "int i = 12" any differently from "int i; i = 12". This results in four instructions. When optimizations are enabled, a variable with an initial value will use the initializer to create the variable, thus saving three instructions.

Example: int i = 12;

Old way
RSADDI
CONSTI 12
CPDOWNSP -8, 4
MOVSP -4

New way
CONSTI 12

This optimization improves file size and runtime performance.

11. Constant elements in expressions

When an operator is used with a constant value or a pair of constant values, the results are computed and used in place of the original expression. However, there are some limits to the optimizations. Optimizations occur from left to right.
Thus, if there are non-constant values prior to the constant values, then the optimization might not happen. Using "()" to group constants can get around this limitation.

NOTE: Currently, the optimization of ">>" and ">>>" is disabled due to some uncertainty with how Bioware implemented these operators when the value is negative.

Examples of expressions that will optimize:
1 + 2
1 + 2 + nValue
nValue + (1 + 2)

Examples of expressions that will not optimize:
nValue + 1 + 2
1 + d6 ()

This optimization improves file size and runtime performance.

12. Constant conditional

If the conditional in a statement such as "if" is constant, then code is generated only for the required elements. For example, "if (0)" would only result in the "else" clause being generated if it exists. This optimization applies to "if", "do", "while", "for" and "?:". Also please note that if the conditional of a "for" is a constant 0, then the initialization clause will still be compiled even though all other elements of the "for" will be ignored. This optimization improves file size and runtime performance.

Extensions (-e)

Extensions are enhancements to NWScript that ARE NOT supported by Bioware's compiler. Thus, they are more academic in nature and using them would mean Bioware's compiler would no longer compile the script.

1. "const" Keyword (Added to BioWare's compiler with 1.30/SoU)

The "const" keyword is used in front of variable declarations to mark them as constant. This means they are treated much like the variables declared in "nwscript.nss". One of the benefits of using "const" is that it enables the expression optimizer to treat these variables as if they were no different than a constant value. Thus, expressions using "const" variables have a much greater chance of being optimized. Also, "const" variables can be used in "case" statements.
Example:

const int HELPER_EVENT = 1024;
void main ()
{
    int i = HELPER_EVENT * 12;
    switch (i)
    {
        case HELPER_EVENT:
            break;
    }
}

This example would compile as if every instance of "HELPER_EVENT" was replaced with 1024. Also, if the script tried to assign a new value to "HELPER_EVENT", an error would be generated.

NOTE: Constant value must be simple and not an expression if optimizations are not enabled.

Differences Between Bioware's Script Compiler and NwnNssComp

Even though I have tried to make sure the compilers work exactly the same, there will be some minor differences.

1. NULL statements are not allowed in some constructs

Even though it is a valid construct in C/C++, the Bioware compiler does not allow a script to contain an empty statement for constructs such as "if". This is probably a good check to make since it usually means the script writer has a bug. However, NwnNssComp will allow this construct. Future versions will issue a warning.

Example: M1Q2_VASCO2 (Chapter1)
if(!GetIsObjectValid(oBloodsailor));

Bugs In Bioware's Script Compiler

While creating NwnNssComp, I found 2 bugs in Bioware's script compiler. Both bugs are currently emulated by NwnNssComp. However, future versions of NwnNssComp will not support these bugs.

1. "break" and "continue" fail to account for local scope variables (Fixed in BioWare's compiler with 1.30/SoU)

In a loop with local variables, if a "break" or "continue" is used after a local variable is defined, the internal stack is not adjusted, causing unpredictable results.

Example:

void main ()
{
    int i = 5;
    while (TRUE)
    {
        int j = 6;
        break;
    }
    SpeakString ("The value of i is = " + IntToString (i));
}

When this script is run, the resulting spoken string is "The value of i is = 6". This is because "j" is not removed from the stack. To work around this problem, declare the variables outside the while loop.

2.
Logical OR (||) fails to ignore the right hand side when left hand side is TRUE (Fixed in BioWare's compiler with 1.30/SoU)

It is part of the standard in C/C++ that the right hand side of a logical operator is not evaluated if the left hand side result implies the logical operator must be TRUE or FALSE regardless of the value of the right hand side. In the case of a logical AND (&&), if the left hand side is FALSE, then the right hand side is not evaluated. In the case of a logical OR (||), if the left hand side is TRUE, then the right hand side is not evaluated. Bioware's script compiler supports this for logical AND (&&). However, it doesn't work properly for logical OR (||). Bioware has confirmed this to be a bug in their script compiler, so when it is fixed, some scripts might be broken.
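To make the logical OR difference concrete, here is a minimal Python sketch of the two evaluation orders described above. It only models the behaviour, not the byte code either compiler emits, and all function names are made up for illustration.

```python
def make_operand(value, log, name):
    """Return a zero-argument callable that records when it is evaluated."""
    def operand():
        log.append(name)
        return value
    return operand


def logical_or_short_circuit(lhs, rhs):
    """C/C++ behaviour: rhs is evaluated only when lhs is FALSE."""
    if lhs():
        return True
    return bool(rhs())


def logical_or_eager(lhs, rhs):
    """Behaviour attributed to the buggy compiler: both sides always run."""
    left = lhs()
    right = rhs()
    return bool(left or right)


def run(or_impl):
    """Evaluate TRUE || FALSE and report the result plus evaluation order."""
    log = []
    result = or_impl(make_operand(True, log, "lhs"),
                     make_operand(False, log, "rhs"))
    return result, log
```

Both versions return TRUE, but only the eager one touches the right hand side. A script whose right hand side has a side effect (an engine routine call, for example) would therefore behave differently once the bug is fixed, which is presumably why some scripts might break.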
JCarter426 1,220 Posted March 22, 2020 That's the most up to date compiler we have for KOTOR, unfortunately.
DarthParametric 3,795 Posted March 29, 2020 Once your work is far enough along and you have a spare moment @AmanoJyaku, I'm curious if you can finally resolve the problem I mentioned above regarding DeNCS choking on some scripts for reasons that don't seem related to recursion (or at least ncsdis doesn't report any). I just found another one (attached if you are curious) as I was poking through various things, again from Tatooine, which seems to be heavily affected by the issue, so it reminded me. I'm extremely curious to know the whys and wherefores.

Non-decompilable_script.7z
AmanoJyaku 184 Posted March 29, 2020 Interesting. It's definitely not recursion. There are only four functions:

1) _start()
2) _global()
3) main()
4) sub1()

The main() function only calls sub1() once, and sub1() never calls a user-defined function. So, only an engine routine could recur. More interestingly, the file is very short and very simple. I suspect I know what the problem is, because I see something in the code I've never seen anywhere else. But, I would like to withhold assumptions until I can analyze another file that doesn't decompile. Can you supply another?
DarthParametric 3,795 Posted March 29, 2020 Sure. Here's another short one from the very same conversation: Non-decompiling_same_convo.7z And here are some other ones that we came across developing K1CP v1.8. Mostly module OnEnters and OnHeartbeats that have a bit more going on, but also some smaller trigger scripts and the like. Mostly from Tatooine, but there are also a couple of Manaan ones: Other_non-decompiling.7z
AmanoJyaku 184 Posted March 29, 2020 Veeeery interesting. I'm seeing the same thing in all files, so I hope it's the culprit. Basically, every NCS file I've worked on uses global variables initialized from constants. However, the files you've given me all include a global that's initialized from an engine routine! I'm wondering if DeNCS is choking on that.

0x00000c67 RSADDS
0x00000c69 CONSTI 32289
0x00000c6f ACTION 239 1
0x00000c74 CPDOWNSP -8 4
0x00000c7c MOVSP -4

// 239: Get a string from the talk table using nStrRef.
string GetStringByStrRef(int nStrRef);
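A rough Python sketch of how one might scan a disassembly listing for this pattern follows. It assumes the text format shown in the listing above (hex offset, op code, integer arguments) and treats the first ACTION argument as the engine routine number, as the comment for routine 239 suggests. The function names and the matching heuristic are hypothetical, not taken from DeNCS or ncsdis.

```python
LISTING = """\
0x00000c67 RSADDS
0x00000c69 CONSTI 32289
0x00000c6f ACTION 239 1
0x00000c74 CPDOWNSP -8 4
0x00000c7c MOVSP -4
"""


def parse_listing(text):
    """Turn 'offset OPCODE args...' lines into (offset, opcode, args) tuples."""
    instructions = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        offset = int(parts[0], 16)
        opcode = parts[1]
        args = [int(a) for a in parts[2:]]
        instructions.append((offset, opcode, args))
    return instructions


def find_action_initialized_globals(instructions):
    """Find RSADD* slots that are filled from an ACTION result.

    Hypothetical heuristic: RSADD*, then zero or more CONST* pushes,
    then ACTION immediately followed by CPDOWNSP.
    Returns (offset of the RSADD*, engine routine number) pairs.
    """
    hits = []
    for i, (offset, opcode, _) in enumerate(instructions):
        if not opcode.startswith("RSADD"):
            continue
        j = i + 1
        # Skip the constant arguments pushed for the engine routine.
        while j < len(instructions) and instructions[j][1].startswith("CONST"):
            j += 1
        if (j + 1 < len(instructions)
                and instructions[j][1] == "ACTION"
                and instructions[j + 1][1] == "CPDOWNSP"):
            hits.append((offset, instructions[j][2][0]))
    return hits
```

Running find_action_initialized_globals(parse_listing(LISTING)) flags the slot reserved at 0xc67 as being initialized by routine 239. Restricting the scan to the global routine would be the obvious refinement, since the same sequence is ordinary inside a normal function.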
DarthParametric 3,795 Posted March 29, 2020 Presumably not coincidentally, both the Tatooine and Manaan module includes contain the following in their constants list: string RACE_DEFAULT = GetStringByStrRef(32289); And, as discussed previously, we know Bioware's compiler pulled in all constants from any listed includes, regardless of whether they were required or not. @JCarter426 also determined that the KOTOR nwnnsscomp also does this, despite the original NWN version being fixed to prevent that.