AmanoJyaku 184 Posted December 26, 2020 7 hours ago, DarthParametric said: That seems fine to me. I was more concerned that certain files might not able to be decompiled at all. This is the case for DeNCS as well, based on my own experiments. My concern with nwnnsscomp is that we now know its output differs from that of the original compiler. That leads to three possibilities: The output files are fully reversible, even if they differ slightly from the source. This is acceptable. The output files are fully reversible, but require additional code to detect and handle the differences from the original compiler. This will be very annoying. The output files are not reversible. This is unacceptable. At the moment, I have no idea which of these is the case. With over 2,500 files, there is no reasonable way I can develop a reverse compiler while inspecting each script in a timely fashion. I feel it's something to take note of, and look into once the reverse compiler is complete. 1 hour ago, JCarter426 said: This is still common practice, at least as far as I've been taught. Yeah, there are security issues if you then try to read one, but modern compilers either won't let you do that or will initialize it with a default value, if the language itself doesn't. With that in mind, I don't see the issue. I don't know who taught you this, but it's a practice that's frowned upon even with modern compilers. We try to learn good habits rather than relying on the compiler to handle things we could easily do ourselves. It's not uncommon for bugs to creep into compilers, and you can track these down more readily with proper code. This advice comes from the best programmers at places like Apple, Google, and Microsoft, and even the ISO C and C++ committees. 1 hour ago, JCarter426 said: The issues you've mentioned before don't seem like they should cause problems to me. Like with the nested else/if vs a simple else if, that'll only be functionally different if you declare something local to the else branch's scope. And, obviously, don't do that. And I doubt the compiler would've optimized for it that way if it would produce functionally different code. It usually knows better than we do. Why not create variables local to the else branch's scope? That's the whole point of having scopes, to create variables only where they're needed. Variables placed in too high a scope, e.g. the global scope, are at risk of being overwritten erroneously. As for the compiler knowing better than we do, that's true of professionally written compilers. Which nwnnsscomp is not. I've already listed some of the basic things it fails to optimize, e.g. failing to remove empty functions. To be fair, the original compiler from BioWare has an even bigger problem in that logical AND and logical OR are incorrect. Which nwnnsscomp doesn't fix... Share this post Link to post Share on other sites
JCarter426 1,214 Posted December 26, 2020 2 hours ago, AmanoJyaku said: I don't know who taught you this, but it's a practice that's frowned upon even with modern compilers. We try to learn good habits rather than relying on the compiler to handle things we could easily do ourselves. It's not uncommon for bugs to creep into compilers, and you can track these down more readily with proper code. This advice comes from the best programmers at places like Apple, Google, and Microsoft, and even the ISO C and C++ committees. I've seen numerous examples in the documentation for C# and Java, including the very page on variable initialization in C#. The C++ textbooks I've read (Starting Out with C++, Professional C++, C++ Notes for Professionals) obviously warned about the dangers of using uninitialized variables, but still used the extensively when it was safe. I was taught that a) it is theoretically more efficient in languages that don't require variables to be initialized (e.g. C++ and C#) as minimal as that efficiency gain may be, and b) it's more readable—if you don't know what the value will be, don't say that's the value, as that's potentially misleading. 3 hours ago, AmanoJyaku said: Why not create variables local to the else branch's scope? That's the whole point of having scopes, to create variables only where they're needed. Variables placed in too high a scope, e.g. the global scope, are at risk of being overwritten erroneously. This, on the other hand, I was taught was bad practice because it increases the chance of hiding variables by accident and creating conflicts. So I was told that it's better to avoid this except for temp variables in a for loop or for a swap. This was just within a function, though—obviously global variables have their own problems. I was also told that, in terms of memory, it's better to allocate it at the start of a function and have it crash there if it's going to crash rather than in the middle of execution. But I question whether that's a realistic problem anymore. 3 hours ago, AmanoJyaku said: As for the compiler knowing better than we do, that's true of professionally written compilers. Which nwnnsscomp is not. I've already listed some of the basic things it fails to optimize, e.g. failing to remove empty functions. To be fair, the original compiler from BioWare has an even bigger problem in that logical AND and logical OR are incorrect. Which nwnnsscomp doesn't fix... Well... yes. I believe you. 😧 Share this post Link to post Share on other sites
DrMcCoy 40 Posted December 26, 2020 4 hours ago, AmanoJyaku said: At the moment, I have no idea which of these is the case. With over 2,500 files, there is no reasonable way I can develop a reverse compiler while inspecting each script in a timely fashion. I feel it's something to take note of, and look into once the reverse compiler is complete. You might want to see if the round-trip holds, at least using nwnnsscomp and existing source files. Like, source file -> nwnnsscomp (ncs #1) -> decompiler -> nwnnsscomp (ncs #2), then compare whether ncs #1 and ncs #2 identical. There shouldn't be any differences in the bytecode, probably. Possibly also run it through the decompiler again to check that the output is stable. You can automate that steps and just focus on the examples where it fails. That's possible also good as a sort of test suite between big changes or releases. There never was a public release of a KotOR-era BioWare compiler, though, right? Has anybody ever tried rigging the compiler in the Neverwinter Nights toolset (does it read the nwscript.nss or is it more hardcoded)? If it can be rigged, that would give you another angle to test on, though it's not necessarily guaranteed any of the NWN releases matches the version (versions?) that was used during KotOR development. Share this post Link to post Share on other sites
AmanoJyaku 184 Posted December 26, 2020 2 hours ago, JCarter426 said: I've seen numerous examples in the documentation for C# and Java, including the very page on variable initialization in C#. The C++ textbooks I've read (Starting Out with C++, Professional C++, C++ Notes for Professionals) obviously warned about the dangers of using uninitialized variables, but still used the extensively when it was safe. Often, documentation is written with the assumption that the reader has some familiarity with the topic and is simply looking for a reference as a refresher on minutae. For example, the C# link shows a class definition A with an unassigned static scalar variable x, and unassigned instance scalar variable y. This is valid because it states: Quote Static variables The initial value of a static variable is the default value (Default values) of the variable's type. Instance variables The initial value of an instance variable of a class is the default value (Default values) of the variable's type. The default value of an int is 0. This is achieved by the compiler writing the assignment code for you, so the initialization occurs. In contrast: Quote Local variables A local variable introduced by a local_variable_declaration is not automatically initialized and thus has no default value. With the exception of the section "Try-catch-finally statements", the remainder of the document specifically uses uninitialized variables to point out their dangers. The "Try-catch-finally statements" section demonstrates a contrived (meaning, unrealistic) example of "safe" use of uninitialized variables. Documentation and tutorials use contrived examples to demonstrate how something works, but often caution such examples are not acceptable in production code. I should point out Microsoft's documentation is mostly filled contrived examples. Post one of their examples on StackOverflow and prepare to be burned. tl;dr: never assume uninitialized variables are safe. 2 hours ago, JCarter426 said: I was taught that a) it is theoretically more efficient in languages that don't require variables to be initialized (e.g. C++ and C#) as minimal as that efficiency gain may be, and b) it's more readable—if you don't know what the value will be, don't say that's the value, as that's potentially misleading. There's a mantra you should repeat to yourself: code for correctness, worry about performance later. Modern CPUs are fast, and compilers are good at optimizing your code for those CPUs. A CPU that operates at 3 billion cycles per second does not benefit from saving a few dozen cycles by omitting a single variable initialization. If the compiler thinks you can benefit, it will often make the change for you. For example, creating variables inside of loops is considered a bad practice because you're just going to create and destroy that variable repeatedly. Which is why compilers perform variable hoisting, meaning the variable is taken out of the loop and placed before it. So: int Count = 10; while (Count > 0) { int i = 0; //Do something with i --Count; } Becomes: int Count = 10; int i = 0; while (Count > 0) { //Do something with i --Count; } But, you should probably just use a for loop: for (int i = 0, Count = 10; Count > 0; --Count) { //Do something with i } Readability is a concern, yes. But, it's more readable when you give a variable a descriptive name, initialize it with an appropriate starting value, and place it where you intend to use it. In the example, I can guess Count is the number of times to execute the loop. But, I should give i a better name. Maybe it should be Meters, or Years, or Euros? Both variables are created once, and destroyed once the loop completes. See how much we can learn from this simple code fragment? 2 hours ago, JCarter426 said: This, on the other hand, I was taught was bad practice because it increases the chance of hiding variables by accident and creating conflicts. So I was told that it's better to avoid this except for temp variables in a for loop or for a swap. This was just within a function, though—obviously global variables have their own problems. I was also told that, in terms of memory, it's better to allocate it at the start of a function and have it crash there if it's going to crash rather than in the middle of execution. But I question whether that's a realistic problem anymore. I'm not sure I know what you mean by hiding variables, but if you place them where you use them then they aren't hidden. Conflicts can easily be avoided by using descriptive names. And a function should only be as large as the single task it performs. And, please, no global variables. I don't think there is ever a reason for global variables. As for memory, that's only a problem with embedded systems and some 32-bit platforms. No matter how little RAM a computer has it still has access to virtual memory. It won't crash, but it will be slow. 1 hour ago, DrMcCoy said: You might want to see if the round-trip holds, at least using nwnnsscomp and existing source files. Like, source file -> nwnnsscomp (ncs #1) -> decompiler -> nwnnsscomp (ncs #2), then compare whether ncs #1 and ncs #2 identical. There shouldn't be any differences in the bytecode, probably. Possibly also run it through the decompiler again to check that the output is stable. You can automate that steps and just focus on the examples where it fails. That's possible also good as a sort of test suite between big changes or releases. That's the plan. 1 hour ago, DrMcCoy said: There never was a public release of a KotOR-era BioWare compiler, though, right? Has anybody ever tried rigging the compiler in the Neverwinter Nights toolset (does it read the nwscript.nss or is it more hardcoded)? If it can be rigged, that would give you another angle to test on, though it's not necessarily guaranteed any of the NWN releases matches the version (versions?) that was used during KotOR development. I have no idea, I only learned about these tools when I answered your request for help. It's embarrassing, because I have two copies of NWN, I just need to find them... Share this post Link to post Share on other sites
JCarter426 1,214 Posted December 26, 2020 1 hour ago, AmanoJyaku said: I should point out Microsoft's documentation is mostly filled contrived examples. Post one of their examples on StackOverflow and prepare to be burned. Yeah, I know. And with a managed language, they have the luxury of being more willy-nilly than if it were C++ or C. Still, I don't think they would be doing it all over the place if it were so universally unacceptable. It's safe as long as you don't read before assigning at some point, and modern IDEs and compilers go out of their way to prevent you from doing that by accident. Again, if you have a compiler you can trust. I know the Visual Studio C++ compiler has some silly bugs, and Clang lets you do things it really shouldn't do because that code won't compile with anything else, to say nothing of NWNNSSComp. 1 hour ago, AmanoJyaku said: There's a mantra you should repeat to yourself: code for correctness, worry about performance later. Modern CPUs are fast, and compilers are good at optimizing your code for those CPUs. A CPU that operates at 3 billion cycles per second does not benefit from saving a few dozen cycles by omitting a single variable initialization. I know in most use cases it's minimal, but it can make a difference if you're processing huge amounts of data in a database, which C# and .NET are often used for. The reason C# allows uninitialized local variables, unlike, say, Java, is specifically for the performance gain. It mattered to somebody somewhere at some point. 1 hour ago, AmanoJyaku said: I'm not sure I know what you mean by hiding variables, but if you place them where you use them then they aren't hidden. Conflicts can easily be avoided by using descriptive names. I mean if you give a variable in an inner scope the same identifier as one in the outer scope. The variable in the outer scope is hidden. Which you can also file under don't do that, but people make mistakes. 1 hour ago, AmanoJyaku said: And a function should only be as large as the single task it performs. Well yes, ideally you shouldn't have too much nested code in the first place. Another reason why I don't mind NWNNSSComp replacing nested else/if with else if. Share this post Link to post Share on other sites
AmanoJyaku 184 Posted January 25, 2021 Life has been crazy, and I've seen things I never thought would happen in my country. But, coding never stops. Time for an overdue update! An earlier update outlined the structure of a subroutine: return_type subA(Argument_1, Argument_2...) { //Part 1 - The code that does stuff //Part 2 - Values returned //Part 3 - Destruction of local variables //Part 4 - Destruction of arguments } Part 2 returns values from a subroutine using the CPDOWNSP opcode. Parts 3 and 4 destroy local variables and arguments using the MOVSP opcode. These opcodes can appear in any part of a subroutine. So, how do we know if we're returning a value, and destroying variables and/or arguments? We analyze part 1 to determine what, if anything, is on the stack. Computers execute operations in sequence unless directed otherwise. In high-level languages like NWScript, control statements alter the sequence. An if statement conditionally executes code. If the test condition fails, this code is jumped over. So, what does an if statement look like in the compiled NCS? 102 CPTOPSP -3 1 110 CONSTI 0 116 EQUALxx 118 JZ 170 124 CPTOPSP -1 1 132 CPTOPSP -3 1 140 CONSTO object 146 JSR 298 152 MOVSP -3 158 JMP 296 164 JMP 170 In the example, the opcode at offset 102 of the NCS file copies the value of a variable to the top of the stack. Offset 110 places the value 0 onto the stack. Offset 116 performs a test of equality of the two values, removes them from the stack, then places the result of the test onto the stack. Offset 118 performs a conditional jump based on the result, removing the result in the process. Offsets 124-164 are the body of the if statement, which consists of a call to a subroutine followed by a return statement: if ( nKillMode == 0 ) { DamagingExplosion(OBJECT_SELF, nDelay, nDamage); return; } So many ops, so few statements. That's why we write in high-level languages instead of assembly, because the compiler allows us to focus on what the program should do rather than creating the proper sequence of opcodes to keep the program from breaking. Notice in the body of the if statement there is a MOVSP. That is the destruction of local variables, part 3. There is no part 2, because the subroutine returns void (nothing). But, how do we know the MOVSP is part 3 and not part 4? And why are there two unconditional jumps??? The challenge of writing a reverse compiler is knowing the rules used to compile. I've spent the past six months observing, analyzing, and documenting those rules. For example, all control statements other than the switch statement are created using the jump-if-zero (JZ) opcode. (Switch statements are created using the jump-if-not-zero [JNZ] opcode.) However, the existence of a JZ opcode does not mean you've encountered a control statement! JZ is used to create the following: Logical AND Logical OR if statements else if statements while statements for statements do-while statements (I finally found one!!!) This means we need to some way of determining what the range of addresses created by the JZ opcode is. In the example above, the range created by the if statement is [124, 170). In this range notation a bracket "[" means the offset is included in the range, and a parenthesis ")" means the offset is outside of the range. The 170 is obvious because it's specified by the JZ opcode; if nKillMode == 0 returns false we jump to 170 and resume execution from there. Where does the 124 come from? Remember what was stated above: "computers execute operations in sequence unless directed otherwise." A JZ opcode is 6 bytes in length and the JZ is at offset 118, so 118 + 6 means the next op in the sequence is at offset 124. If nKillMode == 0 returns true we enter the body of the if statement at offset 124. Offset 124 is the first address in the range, offset 170 is the first address outside of the range. All control statements are ranges, but not all ranges are control statements. The NCS file is itself a range: [0, file size). Subroutines are ranges. However, our focus is on the ranges created by JZ. How do we determine what they are? The unconditional jump, JMP. Notice the example ends with two JMPs. This caused me lots of stress because I didn't know why they were there. I'm now convinced this is one of the rules of compiling NCS files: control statements end with a JMP. The if statement is easily identified as such because the second JMP goes to the same offset as the JZ. The true block of an if-else statement ends in a JMP to an offset greater than that of the JZ. The if and else-if statements all end in JMPs to the same offset. Loops all end in JMPs to an offset less than the offset of the JZ. Logical AND and Logical Or are not control statements, so they do not end in a JMP. Our example has two JMPs. Why? Most likely the original compiler was dumb. The NWScript return statement is translated into a JMP to the end of the subroutine, offset 296. The opcode at this offset is RETN, the last opcode in a subroutine. Anytime you see a return in NWScript that isn't the last statement in the subroutine, you know it will be implemented as a JMP: if ( nKillMode == 0 ) { return; } if ( nKillMode == 1 ) { return; } if ( nKillMode == 2 ) { return; } is 118 JZ 170 158 JMP 296 164 JMP 170 186 JZ 230 218 JMP 296 224 JMP 230 246 JZ 290 278 JMP 296 284 JMP 290 296 RETN The original compiler just stuck the body of the if statement in between the JZ and JMP. A modern compiler would have removed the second JMP since it's unreachable code. OK, this is great and all. But what about the reverse compiler? Well, now that I have the rules down I'm close to the finish line. The reverse compiler accurately finds ranges and determines what types they are. The output looks like this: This is definitely a K2 file Subroutine 13, 21 13, 21 //Subroutine Subroutine 21, 298 21, 298 //Subroutine 124, 170 //if 192, 230 //if 252, 290 //if Subroutine 298, 1345 298, 1345 //Subroutine 320, 396 //if 704, 722 //logical AND or OR 728, 1216 //loop 787, 805 //logical AND or OR 811, 1135 //if 914, 970 //if-else 1017, 1043 //if-else Subroutine 1345, 1601 1345, 1601 //Subroutine 1459, 1553 //if 1511, 1547 //if Subroutine 1601, 1637 1601, 1637 //Subroutine Subroutine 1637, 1813 1637, 1813 //Subroutine 1659, 1727 //if There's more to be done, particularly for loops because they support the break and continue statements (also implemented with JMP). But, hard part seems to be behind me. Should have an update in two weeks. 3 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted January 28, 2021 I promised an update in two weeks... How about two days? The approach of using ranges has borne fruit. Consider the following code: int subroutine(int I, float F, vector V) { int i = 1; while(true) { if(i > 10) { //do stuff } else { return -1; } } return 0; } The reverse compiler needs to be able to find the return type and parameter types. To do this, it: Counts the number of local variables in Part 1 Counts the number of variables returned in Part 2 Destroys the local and returned variables in Part 3 Verifies the stack is empty, and assumes the count being destroyed is the number of input arguments in Part 4 Using the above code as an example, the method outlined in the post from two days ago effectively ignores everything inside the while loop. So: 1 local variable is counted, created by the RSADDI opcode 1 return variable is counted, created by the CPTOPSP opcode and copied by the CPDOWNSP opcode 1 local variable and 1 returned variable are destroyed by the MOVSP opcode 5 input arguments are destroyed by the MOVSP opcode (vectors are three floats) Here's the output of the KoTOR2 file k_contain_unlock.ncs: This is definitely a K2 file Subroutine 13, 21 JSR Subroutine 21, 995 JSR Subroutine 995, 1122 Return variables: 0 Local variables: 0 Parameters: 0 Subroutine 1122, 1766 Return variables: 0 Local variables: 0 Parameters: 3 Subroutine 1766, 2948 Return variables: 1 Local variables: 0 Parameters: 2 Subroutine 2948, 6915 JSR Subroutine 6915, 7116 Return variables: 1 Local variables: 0 Parameters: 1 A subroutine with the status "JSR" means it depends upon another subroutine to be reverse compiled first: Sub13 calls Sub21 Sub21 calls Sub995 Sub2948 calls Sub6915 Here's what the other status codes mean: Return variables: 0 - returns void; 1 - returns a single value, e.g. int, float, string or object; a value of 2 or more returns a struct or vector Local variables: 0 - should always be 0, because it's calculated after the stack is cleared; non-zero value means the analysis is flawed Parameters: 0 - there are no input parameters; 1 or more means there are input parameters So, what are the subroutines? Sub13 - void _start() - the hidden routine Sub21 - void _global() - the routine that creates global variables Sub995 - int main() Sub1122 - void PlaceTreasureDisposable(object oContainer = OBJECT_SELF, int numberOfItems = 1, int nItemType = 900) Sub1766 - string GetTreasureBundle (int nItemLevel, int nItemType = 0) Sub2948 - string GetBundlePrefix (int nItemLevel, int nItemType) Sub6915 - string SWTR_GetQuantity(int iCount) Still targeting an update in two weeks. Or earlier. 2 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted February 3, 2021 The Phantom Bug Spoiler Given the progress in the last few updates, I figured I'd make an attempt at reverse compiling to see what issues might be lurking. It's easy enough to reverse this: 18344 CONSTI 15 18350 CONSTO object 18356 int GetHitDice 18361 ADDxx That's 15 + GetHitDice(OBJECT_SELF). Similarly, the JSR issue is easily dealt with: RSADDx JSR JSR If we determine the signature of the first JSR returns 1 variable and has 0 parameters, then we know the code is JSR(JSR()). It makes sense to use the previously discussed strategy of skipping ranges to reverse compile. And the reverse compiler plowed through all of the files... Then suddenly got stuck on one. Attack of the Cases Spoiler It was a file from the M478 mod, c_m478_globl_1.ncs. It wasn't a problem before, so it was clearly related to the output of the range finding algorithm. But what? Let's look at a switch statement: switch (var) { case 0: //more code break; case 1: //more code break; case 2: //more code break; default: //more code break; } The case labels in NCS look like this: CPTOPSP -1 1 CONSTI 0 EQUALxx JNZ offset_1 CPTOPSP -1 1 CONSTI 1 EQUALxx JNZ offset_2 CPTOPSP -1 1 CONSTI 1 EQUALxx JNZ offset_3 JMP offset_4 offset_1 //more code offset_2 //more code offset_3 //more code offset_4 //more code The difference between JZ and JNZ is more than just jumping based on 0 or not 0. Since JNZ is only used for switch case labels, the offsets are the first opcodes in the case statements. What's important to note is that, unlike JZ, JNZ does not tell you the end of the range. To find the end, you need to look at the next JNZ. The next JNZ's start is the previous JNZ's end. Which begs the question: how do you find the end of the last JNZ? It's the offset specified by the JMP. That's the default case. And as you can see, we don't know where it ends, either. In fact, it's not even a guarantee that the switch has a default case at all! Revenge of the Switch Spoiler The JMP is always present, so we need another way to determine if there is a default case. You can't do it easily be examining the code starting from the JMP offset, so I tried another way. Each case label that ends with a break statement is implemented as a JMP to the end of the switch. So, I read through each of the case ranges for the first one to end with a JMP. Why the first one? Because all JMPs go to the end, so there's no need to check all of them. However, case statements aren't required to end with a break statement (this is known as "fall through"), so those that don't should lack a JMP. switch (var) { case 0: //falls through to case 1, case 2, and default case case 1: //falls through to case 2 and default case case 2: //falls through to the default case default: break; } And there was the flaw in my logic. What I never considered was a case statement that falls through, but also includes code that ends in a JMP. What does that? switch (var1) { case 0: if (var2){/*more code*/} case 1: if (var3){/*more code*/} case 2: if (var3){/*more code*/} default: if (var4){/*more code*/} break; } Remember, if, else-if, while, for, and do-while all end with a JMP. The loops jump backward, if and else-if jump to the next op. The next op is the start of the next case offset. And that offset was set as the end of the default case's range!!! As it turns out, I was testing a more efficient method of traversing the inside of a subroutine that relied on input data being perfect. My earlier traversal code read all ops, and skipped them if they were in the ranges of if, else-if, else, while, for, and do-while statements. It was slow, but it always moved forward. The new code specifically looks for child ranges, reads up to the start of those ranges, then jumps to the end of those ranges: Sub() { //Read this if() { //Skip this while() { //Which means this is skipped, too } } //Resume reading from here } The new traversal code is correct, but the default case range was incorrectly defined as ending with a jump backward because of the combination fall through, an if statement, and the crappy method of using the switch cases to find the end of the switch statement. The potential solution? Remove identification of switch statement ranges until a later point in analysis that includes the stack. Can't complain, I suppose, since stack analysis is part of the phase that produces text output. I'm not adding anything in, I'm just moving things around. It's just pure luck that I caught this bug due to the most unlikely combination of things that are rarely done. 2 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted February 7, 2021 I fixed the switch bug and decided to try the first step of reverse compiling: finding statements. For example, the following has six statements: //c_003handfight int StartingConditional() { string tString = GetScriptStringParameter(); int tInt = GetScriptParameter( 1 ); int LevelInt = GetScriptParameter( 2 ); if ( ( GetGlobalNumber(tString) == tInt ) && ( GetGlobalNumber ("G_PC_LEVEL") < LevelInt ) ) { return TRUE; } return FALSE; } The first three are declaration statements, they introduce entities into the StartingConditional subroutine scope. In C, variables and subroutines (functions) are entities. Entities have names and types. In an NCS file, subroutines are found by the JSR opcode and variables are created by the RSADDx opcode. The return statement is one of the jump statements, which includes the break and continue statements. All three are implemented by the JMP opcode. The if statement is one of the selection statements, which includes the switch statement. The if statement is also a compound statement or block, meaning it can contain multiple statements. The previous updates detailed my efforts to correctly identify the various types of statements. Here's what the reverse compiler produces: This is definitely a K2 file Sub13, 23 13 adds 1 15 Sub23 removes 0 returns 1 21 Sub23, 250 23 adds 1 25 GetScriptStringParameter adds 1 30 38 removes -1 44 adds 1 46 adds 1 52 GetScriptParameter removes -1 adds 1 57 65 removes -1 71 adds 1 73 adds 1 79 GetScriptParameter removes -1 adds 1 84 92 removes -1 98 adds 1 106 GetGlobalNumber removes -1 adds 1 111 adds 1 119 removes -1 121 adds 1 129 removes -1 135 adds 1 149 GetGlobalNumber removes -1 adds 1 154 adds 1 162 removes -1 164 removes -1 166 if() removes -1 210 adds 1 216 224 removes -4 230 248 Sub13 is the hidden _start() function. This file is a conditional script used in dialogs, so it returns a value. This is the file you use to practice fighting with Handmaiden, and she won't fight you if you aren't good enough. The value returned is obtained from Sub23 and stored in the space reserved by the RSADDx opcode at offset 13 (written above as "add 1"). Sub23 is int StartingConditional(), which is what is required for conditional scripts. The alternative is void main() for non-dialog scripts. If this was a non-dialog script offset 13 would not be a reservation. It would be Sub21, because the JSR opcode is six bytes and the next opcode is the two-byte RETN (13 + 6 + 2 = 21). Offsets 23-38 are the a declaration statement, string tString = GetScriptStringParameter(). You see three elements added to the stack, and two removed. Offset 30 is the CPDOWNSP opcode. Blank lines mean elements have not been added to or removed from the stack. Instead, CPDOWNSP copies the value returned by GetScriptStringParameter() to the element reserved by offset 23, string tString. CPDOWNSP is the assignment operator, "=". Offset 38 then destroys the value returned by GetScriptStringParameter() since it is no longer needed. In C this is known as discarding. The key to writing a reverse compiler is finding these discarded values. The MOVSP opcode discards values, so it's a safe bet that a MOVSP opcode is the end of a statement. In the above example, offsets 38, 65, and 92 are discarding values. You can think of MOVSP as the semicolon at the end of the statement. Offset 164 is the logical AND, 98-162 are the operands. Since we have already found the ranges created by JZ and JNZ, we already know the boundaries of those block statements. Offset 166 is the if () statement, which ends at offset 210. And blocks are surrounded by curly braces. Offset 210 places the return value onto the stack. This is after the if () statement, so it's value is FALSE. It's then returned at offset 216 using CPDOWNSP to the reservation created at offset 13. Offset 224 then destroys the three local variables and the return value using MOVSP. Finally, the RETN opcode should be at offset 230, but for some reason it's a JMP to 248. (I swear, I hate the NCS compilers.) Offset 248 has the RETN. Neither the JMP or RETN would be printed. TL;DR I just have to remove diagnostic info, group expressions into statements, and ensure the branching statements and nested statements are handled correctly. Look forward to another update soon. 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted February 8, 2021 Stripped out some of the diagnostic output and added some source code output. Here's the original Handmaiden conditional script again: //c_003handfight int StartingConditional() { string tString = GetScriptStringParameter(); int tInt = GetScriptParameter( 1 ); int LevelInt = GetScriptParameter( 2 ); if ( ( GetGlobalNumber(tString) == tInt ) && ( GetGlobalNumber ("G_PC_LEVEL") < LevelInt ) ) { return TRUE; } return FALSE; } Here's the output from the reverse compiler: //Kotor2\c_003handfight.ncs This is definitely a K2 file Sub13, 23 } Sub23, 250 38 string = GetScriptStringParameter(); 65 int = GetScriptParameter(1, ); 92 int = GetScriptParameter(2, ); 166 if (GetGlobalNumber(CPTOPSP, ) == CPTOPSP && GetGlobalNumber("G_PC_LEVEL", ) < CPTOPSP) { } 224 ; } Recognizable! Here's the output of k_inc_npckill.ncs: Spoiler //Kotor2\k_inc_npckill.ncs This is definitely a K2 file Sub13, 21 13 Sub21() } Sub21, 298 42 int = GetScriptParameter(1, ); 69 int = GetScriptParameter(2, ); 96 int = GetScriptParameter(3, ); 118 if (CPTOPSP == 0) { } 186 if (CPTOPSP == 1) { } 246 if (CPTOPSP == 2) { } 290 ; } Sub298, 1345 314 if (CPTOPSP > 0) { } 412 int = 15; 434 int = 0; 463 location = GetLocation(CPTOPSP, ); 492 float = GetFacing(CPTOPSP, ); 525 vector_z = GetPositionFromLocation(CPTOPSP, ); 563 vector_z = GetPositionFromLocation(CPTOPSP, ) = CPTOPSP + 1.000000; 600 location = Location(CPTOPSP, CPTOPSP, CPTOPSP, CPTOPSP, ); 671 object = GetFirstObjectInShape(4, 4.000000, CPTOPSP, 0, 65, 0.000000, 0.000000, 0.000000, ); 722 while(GetIsObjectValid(CPTOPSP, ) && CPTOPSP > 0) { } 1253 ApplyEffectAtLocation(0, EffectVisualEffect(3003, 0, ), CPTOPSP, 0.000000, ) 1289 AssignCommand(CPTOPSP, ) 1326 DestroyObject(CPTOPSP, 0.000000, 1, 0.000000, 0, ) 1331 ; 1337 ; } Sub1345, 1601 1374 int = GetHasFeat(56, CPTOPSP, ); 1409 int = GetHasFeat(57, CPTOPSP, ); 1431 int = 0; 1453 if (CPTOPSP == 1) { } 1569 ; 1593 ; } Sub1601, 1637 1623 Sub298(CPTOPSP, CPTOPSP, 0, ) 1629 ; } Sub1637, 1813 1653 if (CPTOPSP > 0) { } 1760 effect = EffectDeath(0, 0, 1, ); 1794 ApplyEffectToObject(0, CPTOPSP, CPTOPSP, 0.000000, ) 1799 ; 1805 ; } And k_contain_unlock: Spoiler //Kotor2\k_contain_unlock.ncs This is definitely a K2 file Sub13, 21 13 Sub21() } Sub21, 995 37 int = 0; 59 int = 1; 81 int = 2; 103 int = 3; 125 int = 4; 147 int = 5; 169 int = 6; 191 int = 7; 213 int = 8; 235 int = 9; 257 int = 10; 279 int = 11; 301 int = 12; 323 int = 13; 345 int = 14; 367 int = 15; 389 int = 16; 411 int = 17; 433 int = 18; 455 int = 19; 477 int = 1100; 501 int = -6; 525 int = -5; 549 int = -4; 573 int = -2; 597 int = -1; 619 int = 0; 641 int = 1; 663 int = 1; 685 int = 2; 707 int = 3; 729 int = 4; 751 int = 5; 773 int = 6; 795 int = 7; 817 int = 8; 839 int = 9; 861 int = 10; 883 int = 11; 905 int = 12; 927 int = 13; 949 int = 14; 971 int = 15; 979 Sub995() 987 ; } Sub995, 1122 1011 object = 0; 1038 if (!GetLocalBoolean(CPTOPSP, 55, )) { } 1114 ; } Sub1122, 1766 1143 if (!GetLocalBoolean(CPTOPSP, 57, )) { } 1758 ; } Sub1766, 2948 1782 int = 0; 1819 int = GetGlobalNumber("G_PC_LEVEL", ); 1839 string = ""; 1890 if (CPTOPSP == 990 && !IsAvailableCreature(7, )) { } 1938 else if(CPTOPSP == 0) { } 2053 else if(CPTOPSP % 100 == 0) { } 2347 else if(CPTOPSP % 10 == 0) { } 2916 ; 2940 ; } Sub2948, 6915 2964 int = 1; 2987 string = GetModuleName(); 3363 if (CPTOPSP == "851NIH" || CPTOPSP == "852NIH" || CPTOPSP == "901MAL" || CPTOPSP == "902MAL" || CPTOPSP == "903MAL" || CPTOPSP == "904MAL" || CPTOPSP == "905MAL" || CPTOPSP == "906MAL") { } 3636 case label 3658 case label 3680 case label 3702 case label 3724 case label 3746 case label 3768 case label 3790 case label 3812 case label 3834 case label 3856 case label 3878 case label 3900 case label 3922 case label 3944 case label 3966 case label 3988 case label 4010 case label 4032 case label 4054 case label 4076 case label 4098 case label 4120 case label 4142 case label 4164 case label 4186 case label 4208 case label 4230 case label 6843 string = GetModuleName(); 6883 ; 6907 ; } Sub6915, 7116 6938 string = IntToString(CPTOPSP, ); 6959 string = "0"; 6981 int = 4; 7010 while(GetStringLength(CPTOPSP, ) < CPTOPSP) { } 7084 ; 7108 ; } Rough, but getting there! 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted March 5, 2021 I wanted to give an update a few weeks ago, but I had to tackle a problem I ignored for months. The reverse compiler is now more than 5,000 lines in length, and that makes it difficult to maintain. So, almost 4,000 lines of source code have been moved into a library. Additionally, many functions did too much and have since been split into smaller functions or replaced entirely. It's not sexy, but coding never is. Or is it? Cue: Salt-N-Peppa's "Let's talk about sex" Let's talk about Hex, baby! Let's talk about Binary! Let's talk about Octal numbers, Signed and unsigned, number theory! OK, OK, I'm back to "normal". Anyway, after three weeks of maintenance I got back to the real work and decided to rewrite the remaining 1,000+ lines entirely. Everything from reading an NCS file from disk to producing NWScript is new. And so much better than before, since I eliminated some bugs in the code. I'll provide a more detailed post later, but here's some sample output from TSL's k_inc_npckill.ncs: Spoiler 21, 298() { int = GetScriptParameter(1); int = GetScriptParameter(2); int = GetScriptParameter(3); } 298, 1345() { int = 15; int = 0; location = GetLocation(Argument); float = GetFacing(Argument); float = GetPositionFromLocation(location = GetLocation(Argument)); float = GetPositionFromLocation(location = GetLocation(Argument)); float = GetPositionFromLocation(location = GetLocation(Argument)) = float = GetPositionFromLocation(location = GetLocation(Argument)) + 1.000000; location = Location(vector, float = GetFacing(Argument)); object = GetFirstObjectInShape(4, 4.000000, location = GetLocation(Argument), 0, 65, vector); Destroying 3 arguments } 1345, 1601() { int = GetHasFeat(56, Argument); int = GetHasFeat(57, Argument); int = 0; return int = 0; Destroying 1 arguments } 1601, 1637() { Destroying 2 arguments } 1637, 1813() { effect = EffectDeath(0, 0, 1); Destroying 2 arguments } I purposely left out nested scopes because I wanted to focus local variables. This script is particularly interesting because it contains a vector. See the really long line that starts with float = GetPositionFromLocation? That's one of the vector's members. The source code looks like this: Spoiler vector vPos = GetPositionFromLocation( oLoc ); vPos.z = vPos.z + 1.0f ; The really long line in the sample output essentially merges the two lines above, but if you look closely you'll see it's the combination of creating vPos.z, adding 1.0f to vPos.z, then storing the result back to vPos.z. Here's a cut-and-paste from the NSS file to compare to the sample output: Spoiler int nDC = 15; int nDCCheck = 0; location oLoc = GetLocation(oCreature); float oOri = GetFacing(oCreature); vector vPos = GetPositionFromLocation( oLoc ); vPos.z = vPos.z + 1.0f ; location oExplosionLoc = Location( vPos, oOri ); object oTarget = GetFirstObjectInShape(SHAPE_SPHERE, 4.0, oLoc, FALSE, 65); Now that I've confirmed the stack is being created correctly, the next tasks are to: Add variable declarations, e.g. int nParam, object oParam Properly display modification of variable values, e.g. vPos.z = vPoz.z + 1.0f Insert nested scopes (if-else, while, etc...) Probably other stuff I can't think of right now 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted March 17, 2021 Just a quick update. Here's output from TSL's k_contain_unlock.ncs: Spoiler //void main() 0 995, 1122(0) { object Object0; Object0 = 0 if (!GetLocalBoolean(Object0, 55)) { SetLocalBoolean(Object0, 55, 1); Sub1122(Object0, Random(8) + 1, 920); } } //void PlaceTreasureDisposable(object oContainer = OBJECT_SELF, int numberOfItems = 1, int nItemType = 900) 0 1122, 1766(3) { if (!GetLocalBoolean(CPTOPSP, 57)) { SetLocalBoolean(CPTOPSP, 57, 1); int Int0; int Int1; Int1 = GetGlobalNumber("G_PC_LEVEL") if (CPTOPSP < 900) { } int Int2; Int2 = Int1 + Random(6) - 4 if (Int2 < 1) { Int2 = 1 } if (Int2 > 30) { Int2 = 30 } Int0 = 1 while (Int0 <= CPTOPSP) { string String3; string String4; int Int5; Int5 = 1 int Int6; String3 = Sub1766(Int2, CPTOPSP) Int6 = FindSubString(String3, "[") if (FindSubString(String3, "[") >= 0) { Int5 = StringToInt(GetSubString(String3, Int6 + 1, 4)) String4 = GetSubString(String3, 0, Int6) } else { String4 = String3 } } } Removing -3 arguments } Overall, the results look good! Though, there are things that need to be dealt with: Subroutines only list the number of return values and arguments; type information needs to be substituted, e.g. void Sub1122(object, int, int) Arguments in expressions are listed as CPTOPSP; this needs to be replaced with a variable name and argument position, e.g. if (!GetLocalBoolean(Arg0, 57)) Variable declarators with initializers are written as two lines; while correct, these should be merged, e.g. object Object0 = 0; Arguments that are assigned to need to be printed, e.g. if (Arg2 < 900) Arg2 = 0; For loops are presented as while loops; while correct, for (Int0 = 1 ; Int0 <= Arg1; ++Int0) is more compact than Int0 = 0; while (Int0 <= Arg1) ++Int0 Increment and decrement are not yet printed (this is trivial, I just didn't do it) Printing expressions needs to take place at a different point in the process; the current method prints some expressions twice, and omits others Switch statements need to be printed Vectors and structs need to be detected, which will require multiple methods Return statements need to be printed Break and continue statements in loops need to be detected Arguments of type "action" (AssignCommand, DelayCommand, ActionDoCommand) need to be printed These are largely minor issues, but they're taking a backseat to a bug I introduced along the way: Spoiler //string GetTreasureBundle (int nItemLevel, int nItemType = 0) 1 1766, 2948(2) { int Int0 = 0; int Int1; int Int2 = GetGlobalNumber("G_PC_LEVEL") string String3 = "" if (Arg1 == 990 && !IsAvailableCreature(7)) { } if (Arg1 == 0) { Int1 = Random(0) + 9 String3 = Sub1766(CPTOPSP, Int1 * 100) } else if (String3) { Int0 = Random(CPTOPSP) + 1 Int2 = Sub1766(CPTOPSP, CPTOPSP + 10 * Int0) } else if (Int2) { Int1 = Sub2948(CPTOPSP, CPTOPSP + CPTOPSP) } else { Int1 = Sub2948(CPTOPSP, CPTOPSP) } Removing -5 arguments } The else-if statements have the wrong test condition. They should all be pointing at the same variable as the preceding if statement, Arg1. This error is propagated further as reflected in the last line. Removing 5 arguments? There are only 2!!! Printing code was added yesterday, so I expect this to be a quick fix. Share this post Link to post Share on other sites
AmanoJyaku 184 Posted April 21, 2021 Sorry for the delay. I wanted to put out an update, but I've been addressing bugs. More important, I'm dealing with real life and needed to take some time for myself. This project has consumed most of my free time, I haven't played a game since February of last year. I'll pick this up next week, but for now I'm playing some games! 2 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted May 18, 2021 Sometimes, I learn more from mistakes than I do from getting it right the first time... My last post mentioned bug fixes. In an earlier post I described the structure of an NCS subroutine: Step 1 - The code that does the stuff you want Step 2 - Placing a return value on the stack Step 3 - Destroying local variables and temporaries on the stack Step 4 - Destroying arguments on the stack None of these steps are required, I'm almost certain I've seen an empty subroutine that simply returned. That means a missing step cannot impede the analysis of successive steps. So, how do we analyze the following? Subroutine 1601 1601 CONSTI 0 1607 CPTOPSP -3 1 1615 CPTOPSP -3 1 1623 JSR 298 1629 MOVSP -2 1635 RETN There is a single path through the code, that is there are no jumps so the ops execute in sequence. (Technically, the JSR is a jump. However, subroutine calls return to the op from which they were called and proceed to the next op in the sequence.) So, how does the stack change? The first three ops each add a value to the stack. The fourth op, a subroutine call, removes those three values from the stack. The subroutine returns void, i.e. nothing, so no return value is placed on the stack. And yet, the fifth op removes values from the empty stack... This tells us a few things: The fourth op is the last in Step 1 Subroutine 1601 does not place a value onto the stack after Step 1, so there is no Step 2 There are no local variables and no temporary values on the stack, so there is no Step 3 Therefore, the fifth op must be Step 4 Subroutine 1601 takes two arguments and returns void, i.e. void Sub1601(Arg0, Arg1) Let's look at a more complicated example: Subroutine 87299 87299 CPTOPSP -1 1 87307 CPTOPBP -171 1 87315 EQUALxx 87317 JZ 89610 CONSTx "NO COMBO SELECTED" 89631 CPDOWNSP -3 1 89639 MOVSP -1 89657 MOVSP -1 The first two ops each place a value onto the stack. The third removes those two values, and replaces them with a boolean (actually, an integer) result. The fourth removes the result, then performs a conditional jump. To simplify things, I've removed the conditional code branches. At this point, the stack is empty. The fifth op places a string onto the stack. The sixth copies the string down three positions from the top of the stack. However, there is only one value on the stack so this may be a return... The seventh op clears the stack. The eighth tries to remove a value from the empty stack. So, what does this tell us? The fourth op is the last in Part 1 Ops five and six may be Part 2 The seventh op is Part 3 The eighth and final op is Part 4 You'll notice the repeated use of "may". We should not assume all writes outside of the subroutine's stack are returns. Subroutine arguments also exist outside of the subroutine's stack, and this turned out to be the source of the bug. One of the subroutines analyzed wrote to its argument, and this was mistakenly marked as a return value. Thus, a subroutine that returned void, i.e. nothing, but also modified an argument messed up the analysis of any other subroutine that called it. The fix was simple enough: record the last write outside of the stack, along with the difference between the negation of destination and size of the stack. In the example above, the destination is -3 and the stack size is 1: -(-3) - 1 yields 2. When analyzing Part 4, the destruction of subroutine arguments, we compare the result to the size of the negation of number of arguments being destroyed: 2 vs -(-1), or 2 > 1. This tells us the final write is to a position greater than the furthest argument, thus it must be a return value. Subroutine 87299 takes one argument and returns one value. In short, we have to analyze Part 4 before we can analyze Part 2. Fixing bugs wasn't the only benefit. It's removed from the example, but there is an unconditional jump after the first MOVSP that transfers execution to the second MOVSP. This jump bypasses an op, creating dead code. I seem to have finally discovered the source of dead code: return statements. This also solves a different problem: I now know how to identify return statements in different parts of the subroutine. More in a future update! 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted May 30, 2021 "Might as well jump!" - Van Halen The last update mentioned the mystery of the unconditional jump at the end of a subroutine, and explained it's a return statement. Let's look at it in a bit more detail. 7054 CONSTS [ 7059 CPTOPSP -4 1 7067 ADDxx 7069 CONSTS ] 7074 ADDxx 7076 CPDOWNSP -6 1 7084 MOVSP -4 7090 JMP 7108 *7096 MOVSP -1* //Dead code *7102 MOVSP -3* //Dead code 7108 MOVSP -1 7114 RETN Why is this happening? It's the result of the compiler building the subroutine in stages. Let's separate the op codes into subroutine Parts: Part 2 7054 CONSTS [ 7059 CPTOPSP -4 1 7067 ADDxx 7069 CONSTS ] 7074 ADDxx 7076 CPDOWNSP -6 1 Part 3 (New) 7084 MOVSP -4 7090 JMP 7108 7096 MOVSP -1 Part 3 (Original) 7102 MOVSP -3 Part 4 7108 MOVSP -1 7114 RETN Oho! This subroutine returns a value (Part 2), clears local variables (Part 3), and clears arguments (Part 4). But, why are there two Part 3's? Top to bottom, the stack looks like this: Temporary return values Local variables Arguments Return values The compiler first writes the code for clearing the arguments (Part 4), which is easy because there is a fixed number. Then, it writes code for clearing the local variables (Part 3 - Original). In the example, the subroutine has 3 local variables. Next, the compiler calculates the number of values returned in Part 2 and adds that to the number of local variables. The result is 3 + 1. The compiler does not remove the original Part 3, so it's necessary to jump over it (Part 3 - New). So, why is there a MOVSP after the jump? As far as I can tell, it's a quirk of the compiler. Part 2 returns a value by evaluating a set of expressions. The result is then copied down the stack past the arguments. The result is then cleared in the new Part 3. However, the compiler inserts a MOVSP after the jump that would have cleared the result of the last expression. That leaves us with two MOVSPs that are jumped over: one to clear the local variables, and one to clear the last expression (in this case, the return value). I'm rewriting my code to take advantage of this. Instead of calculating all of the changes made to the stack in order to find the subroutine signature, it could be easier to look for a jump and evaluate the code around it. The question is how to handle subroutines that don't return values, and therefore lack the jump... This newfound knowledge is useful for another reason: I now know how to properly analyze switch statements. More in a future post. 2 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted June 1, 2021 I wrote my first NWScript in order to test the findings from the previous post. The script: Spoiler vector test() { int I = 0; return GetPositionFromLocation(GetLocation(OBJECT_SELF)); } void main() { vector V = test(); } I specifically chose to add a single local variable and return a vector (struct of three floats) to prove the results. Let's break down the test subroutine: Spoiler Part 1 61 RSADDx 63 CONSTx 69 CPDOWNSP -2 1 77 MOVSP -1 Part 2 83 CONSTx 89 1 GetLocation(1) 94 3 GetPositionFromLocation(1) 99 CPDOWNSP -7 3 Part 3 (New) 107 MOVSP -4 113 JMP 131 119 MOVSP -3 Part 3 (Original) 125 MOVSP -1 131 RETN Part 1 is where the local variable I is declared and initialized. Part 2 is where the vector is returned. Part 3 is where things get interesting. Part 3 (Original) clears the local variable. Part 3 (New) clears the return values and local variables. Since Part 3 (Original) is not removed there is a jump over it in Part 3 (New). That jump is followed by the compiler quirk, the MOVSP that clears the last expression from the stack. That expression is the result of GetPositionFromLocation(), a vector. It's good to finally understand this weird JMP/MOVSP business, but this is awesome for two other reasons. The first is code analysis can be simplified to look for this JMP/MOVSP combination, it's no longer necessary to calculate stack changes for each op code in order to determine if a subroutine returns values. This eliminates the majority of bugs. It's also faster, but that's not as much of a concern. The second reason is it fundamentally changes the reverse compiler construction. I'll explain more in an upcoming post, but the tl;dr is it's now possible to find the beginning and end of statements! Spoiler void _start() //Hidden start function //Essentially, main(); 13 JSR 21() 19 RETN void main() //vector v = test(); 21 RSADDx 23 RSADDx 25 RSADDx 27 RSADDx 29 RSADDx 31 RSADDx 33 JSR 61() 39 CPDOWNSP -6 3 47 MOVSP -3 //Not printed 53 MOVSP -3 59 RETN vector test() //int i = 0; 61 RSADDx 63 CONSTx 69 CPDOWNSP -2 1 77 MOVSP -1 //return GetPositionFromLocation(GetLocation(OBJECT_SELF)); 83 CONSTx 89 1 GetLocation(1) 94 3 GetPositionFromLocation(1) 99 CPDOWNSP -7 3 107 MOVSP -4 //Not printed 113 JMP 131 119 MOVSP -3 125 MOVSP -1 131 RETN Notice a pattern? 🤔 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted June 6, 2021 So, I finally discovered a bug in nwnnsscomp... I think. The purpose of a reverse compiler is to recreate the syntax of the high-level language. This can only be accomplished if we understand statements, expressions, and values. Statement: HK-47 is ready to serve, master. - HK-47 Spoiler There are seven categories of statements in NWScript: Declaration statements int I; float F; string S; etc... Expression statements I = 0; ++I; PrintString("Print me!!!"); etc... Selection statements if (condition) switch (condition) Iteration statements while (condition) for (condition) do while(condition) Labeled statements case 0: default: Jump statements break; continue; return; Compound statements { //statement 0 //statement 1 //... } Each statement is identified by locating an initiating op code, terminating op code, or combination of both. For example, variable declarations are a single op code, RSADDx. However, RSADDx is also used for return values of subroutines. If the Random engine function were written as a subroutine call it might look like this: RSADDx CONSTI 10 JSR <offset to subroutine> Which means we identify RSADDx as a code of interest, but need to process the op codes that follow to be certain. As it turns out, finding the terminator is usually easier than finding the initiator. But, why? If you care for others, then dispense with pity and sacrifice and recognize the value in letting them fight their own battles. - Kreia Spoiler The majority of statements in a program are expression statements, so it's necessary to understand expressions. In general, expressions produce values: 1 + 1; What happens to the value 2? Before we get to that, consider that not all expressions produce values. Function/subroutine calls are expressions that can produce void, or no value: //PrintString doesn't produce a value void PrintString(string); It's also important to understand the NCS runtime places variables and values onto the same stack. Thus: int I = 0; float F = 0.0; 1 + 1; //more statements... looks like this: //Top of stack 2 0.0 0 //Bottom of stack At least, it looks that way before proceeding to the statement after 1 + 1. A variable is essentially a value that persists until the end of its enclosing scope (the curly braces). All other values are guaranteed to be destroyed by the end of their statement. Thus, another term for an expression statement is discarded-value expression. Battle is a pure form of expression. - Handmaiden Spoiler When we reach the end of a statement any value that is not a variable is guaranteed to be destroyed. The MOVSP op code is what destroys the values of expression statements. This was illustrated in the previous update. It can also be seen in declarations with initializers: RSADDx //Place a variable on the stack CONSTx 0 //Place a value on the stack CPDOWNSP -2 1 //Assign the value to the variable (initialization) MOVSP -1 //Destroy the value, we've reached the end of a statement But, what about expressions that return void? Expressions can produce void, but they cannot accept void. Thus, a void expression ends a statement. What about selection, iteration, and labeled statements? Their conditions are expressions. What happens to their resulting values? int I = 0; //The result of the comparison is destroyed by the if statement if (I > 0) { } Thus, by the time the program proceeds to a branch of a selection or iteration statement all values are destroyed and only variables are left on the stack. This isn't true for labeled statements due to the way switch statements work, but once you exit a switch entirely all values are destroyed. tl;dr We find statements in many ways, including detecting the destruction of expressions. So, what's all this got to do with nwnnsscomp? Spoiler That thing I mentioned about labeled statements and the way switch statements work. Usually, destroying the condition of a switch statement is the very last thing to occur. The exception to this is when a switch contains a return statement: switch (condition) { case 0: //CPDOWNSP here, e.g. Part 2 of subroutine //MOVSP -N here, e.g. Part 3 of subroutine return 0; case 1: //Jumps to the MOVSP -1 break; } //MOVSP -1 here to destroy condition We've seen the analysis of returns, so this should be familiar. So, what's the bug? Consider the following: while (condition) { switch (condition) { case 0: break; case 1: continue; } } Case 1 exits the switch and jumps to the end of the while statement, so the switch condition should be destroyed before the jump. It's not. Nwnnsscomp leaves the condition on the stack, which means any reference to the stack is off by the number of times the continue is executed. There are very few continue statements in the game files, and none that I could find inside of a switch statement. But, if someone were to write perfectly valid code like this, well... kaboom!!! //Top of stack Condition Condition Condition ... Variables //Bottom of stack Why am I not certain this isn't a bug? It clearly is, I'm just not certain which version of nwnnsscomp is the latest, and whether or not it's been fixed. I have three versions of the program, and I've only tested one. 1 Share this post Link to post Share on other sites
DarthParametric 3,782 Posted June 7, 2021 9 hours ago, AmanoJyaku said: There are very few continue statements in the game files Interesting. I can't claim to have gone through all the K1 scripts (not least because some still can't be decompiled), but in the course of working on K1CP have browsed one or two and I have never come across this one. I did a quick search through the global scripts and found a use of it in TSL's k_inc_force. Is it a TSL-only thing? Share this post Link to post Share on other sites
AmanoJyaku 184 Posted June 7, 2021 8 hours ago, DarthParametric said: Interesting. I can't claim to have gone through all the K1 scripts (not least because some still can't be decompiled), but in the course of working on K1CP have browsed one or two and I have never come across this one. I did a quick search through the global scripts and found a use of it in TSL's k_inc_force. Is it a TSL-only thing? The continue statement is as old as the C language NWScript is derived from. It exists in K1, and probably NWN, it simply was never used. Another example is do-while; there is only one script out of 2,600+ that has a do-while statement. I haven't seen the ternary operator, the const keyword, or the bitwise operators, either. Share this post Link to post Share on other sites
AmanoJyaku 184 Posted November 6, 2021 How is everyone? It's been a while... Several changes have been made to the reverse compiler, resulting in a rewrite from the ground up: All Opcodes are now properly decoded, including: RSADDx - Now able to differentiate variables from subroutine return values JSR - Number of subroutine arguments and return values fully deduced JZ - Now able to differentiate logical AND|OR expressions from if|while|for statements Subroutine signatures are now partially deduced: Number, but not types, of arguments and return values To do: Collection of type information is dependent upon evaluation of statements Proto-statements (I made up this term) are discovered: Input for recreating complete statements (I made up this term, too) int i = 0; is a complete statement int i; is a complete statement, and a proto-statement of int i = 0; i = 0; is a complete statement, and a proto-statement of int i = 0; To do: Proto-statements must be combined to recreate complete statements Error handling is slowly being added: Has helped discover programming and algorithm bugs Is meant to discover incorrect NCS files, which compilers can generate under certain conditions To do: "Full" error handling won't happen in the first release If anyone has questions I'll be happy to explain further. For now, I'm working to complete this before the two-year anniversary of starting this project. 😬 2 Share this post Link to post Share on other sites
DarthParametric 3,782 Posted November 6, 2021 4 minutes ago, AmanoJyaku said: I'm working to complete this before the two-year anniversary of starting this project. Cool. I'm looking forward to finally being able to decode some of those holdout scripts that still remain impervious to DeNCS's charms (like some of the Star Forge level 4 ones for example). If you manage to release something before the end of the year we can clear out the last of the remaining hijacked scripts in K1CP before v1.9 releases. 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted November 10, 2021 On 11/6/2021 at 10:48 AM, DarthParametric said: Cool. I'm looking forward to finally being able to decode some of those holdout scripts that still remain impervious to DeNCS's charms (like some of the Star Forge level 4 ones for example). If you manage to release something before the end of the year we can clear out the last of the remaining hijacked scripts in K1CP before v1.9 releases. I'm not promising anything, but we'll see. I don't know how many people are coders, but I came across a series of articles on creating your own compiler. The articles are largely useless for writing a reverse compiler like this WIP, but they give some insight into how early design choices impact development later on. In part 4 the author wanted to make one change to a feature added in part 2, yet it led to multiple changes because so much was dependent on that one change. This has been my experience from the start: numerous rewrites from the ground up as I learned more about the NCS byte code. Finally, I'm past interpreting individual codes and on to interpreting statements and data structures. Statements are easy (except switch statements), but data structures will require some thought. Two different data structures could have the same member definitions (e.g. struct S1 {int i; float f;} and struct S2 {int i; float f;}), so I'll need logic to detect this. Finding structures is surprisingly easy, though. tl;dr The two obstacles left are switch statements and user-defined structs, and I'm nearly finished with switch statements. http://visualstudiomagazine.com/articles/2014/05/01/how-to-write-your-own-compiler-part-1.aspx http://visualstudiomagazine.com/articles/2014/06/01/compiler-basics-part-2.aspx http://visualstudiomagazine.com/articles/2014/07/01/syntax-analysis.aspx https://visualstudiomagazine.com/articles/2014/09/01/compiler-basics-how-to-part-4.aspx Share this post Link to post Share on other sites
AmanoJyaku 184 Posted November 17, 2021 The last few updates were light on samples, so let's address that. TSL's k_inc_treasure.nss: Spoiler //turn a given quantity into appropriate format for a treasure blob (string) string SWTR_GetQuantity(int iCount) { string str = IntToString(iCount); string pad = "0"; int length = 4; while(GetStringLength(str) < length) { str = pad + str; } return("[" + str + "]"); } And the reverse compiler output: Spoiler string S; <expression statement>; string S; <expression statement>; int I; <expression statement>; <iteration statement> <expression statement>; <jump statement>; <expression statement>; <jump statement>; <MOVSP> <MOVSP> <MOVSP> <RETN> And the two overlapped for comparison, with the reverse compiled script labeled as "RCScript": Spoiler string SWTR_GetQuantity(int iCount) { [NWScript] string str = IntToString(iCount); [RCScript] string S; [RCScript] <expression statement>; [NWScript] string pad = "0"; [RCScript] string S; [RCScript] <expression statement>; [NWScript] int length = 4; [RCScript] int I; [RCScript] <expression statement>; [NWScript] while(GetStringLength(str) < length) [RCScript] <iteration statement> { [NWScript] str = pad + str; [RCScript] <expression statement>; } [RCScript] <jump statement>; [NWScript] return("[" + str + "]"); [RCScript] <expression statement>; [RCScript] <jump statement>; [RCScript] <MOVSP> } [RCScript] <MOVSP> [RCScript] <MOVSP> [RCScript] <RETN> In some cases, the number of RCScript statements matches that of the original NWScript statements, e.g. the assignment expression statement "str = pad + str;". [An earlier post referred to RCScript statements as proto-statements, and NWScript statements as complete statements. They're all terms I made up.] In others, multiple RCScripts statements combine to form the NWScript statement. The simplest example is a variable declaration with an initializer: the initializer is an expression statement that immediately follows the declaration. A more complicated example is the iteration statement, which is followed by a jump statement. Then, the body of the iteration statement (in the example, the assignment expression statement) is placed between them. The return statement is similar, however it has a slight quirk in that it starts off as regular expression statement that gets split in two. Then a jump statement is placed in between the two pieces, with the second being the original MOVSP. The first piece is modified with a new MOVSP that removes the result of the expression and all other values on the stack in order to clear the stack before exiting the subroutine. However, the ultimate is the switch statement. It's so complicated it needs its own post. It has optional parts... The final three RCScript statements are unprintable. The first clears all variables from the stack. However, this was already dealt with in the return statement. The jump statement inserted into the return statement jumps past this MOVSP and lands on the final MOVSP. Since the stack has already been cleared, this must be clearing arguments. And then the subroutine returns. To Do: Write code to generate expressions, e.g. ADDxx becomes <operand 1> + <operand 2> Regroup RCScript statements back into NWScript statements Integrate the value stack so variables can be referenced by unique names 1 Share this post Link to post Share on other sites
AmanoJyaku 184 Posted January 20, 2022 Been a while, time for an update! Writing code isn't as "simple" as using a modeling tool like 3DS Max or Maya (is Lightwave 3D still a thing?) to create 3D characters. A modeling tool uses physics; imagine being the coder who has to create the physics. Sir Amano Newton has spent the past two years unraveling NCS "physics", because NCS "god" (BioWare) didn't leave us precise notes. Let's look at we have to do to get started writing a reverse compiler: Binary, Decimal, and Hexadecimal Output Spoiler Usually, input is a file from disk. It doesn't really matter where it comes from, all that matters is the NCS data is a contiguous sequence of bytes. What does that even look like? 1E00000000082000 The code above is readable, clearly a series of numbers and letters. However, NCS data files are stored in binary format. The binary bytes of a file have 256 possible values, but only 96 of those can be printed: A-Z, a-z, 0-9, space, return, tab, punctuation, etc... So, what about the other 160 values? They either show up as weird characters or blank spaces, or they do unexpected things unique to the reader software. You can't use a text editor to read binary files, and that means no printing raw binary data. We need to convert the values to a printable representation. We could try a printable version of binary, but that looks like this: //42 in binary 00101010 That's a single byte. Each 4-byte integer, most of the numbers in NCS, would be four times longer! So, what about decimal representation? 42 is perfectly fine, but: //Maximum signed 32-bit value 2147483647 //Minimum signed 32-bit value -2147483648 Including the minus sign, that's 11 characters! Hexadecimal is shorter: //Maximum signed 8-bit value 7F, vs 127 //Maximum signed 16-bit value 7FFF, vs 32767 //Maximum signed 32-bit value 7FFFFFFF, vs 2147483647 Eliminating 1 or 2 characters from every 3-11 doesn't seem significant until you're reading thousands of characters... Converting From Binary to Hex Spoiler OK, so we want to print in hex. How? I had to write my own code for this, I couldn't figure out the C++ formatting and conversion library. //hex_str(uint8_t Quotient) std::string Converted; do { switch (Quotient % 16u) { case 0u: Converted += '0'; break; ... case 15u: Converted += 'F'; break; } } while (0 < (Quotient /= 16u)); if (2u == std::size(Converted)) { std::reverse(std::begin(Converted), std::end(Converted)); } else { Converted.insert(std::begin(Converted), '0'); } return Converted; Each byte is fed to the conversion function as the variable named Quotient. The value is repeatedly divided by 16, storing the result backwards into a string. A single byte is represented by two hexadecimal characters (0x7F is (7 * A) + (F * 1), or (7 * 16) + (15 * 1)) . If the conversion yields a string with two characters, that string is reversed to place the values in the same order as the original file. If the conversion yields a string with just one character, a '0' character is placed in front, (e.g. F becomes 0F). Printing Spoiler Printing is now as simple as looping through each byte, passing it to the conversion function, and adding the result to a string to be printed: for (auto CurPos{ BeginFile + 13 }; _at_least(CurPos, EndFile, 1u); ++CurPos) { Output += hex_str(static_cast<uint8_t>(*CurPos)); } CurPos is the current position in the file, which advances to EndFile. The very first op code is always at offset 13. The _at_least function guarantees there are at least N bytes left, otherwise there's not enough to process the file. For C++ reasons, the value retrieved is converted from a std::byte to a uint8_t, then passed to the conversion function hex_str. Which contains the output seen in our first snippet: 1E0000000008200002030403000000000101FFFFFFF800041B00FFFFFFFC02030403000000010101FFFFFFF800041B00FFFFFFFC02030403000000020101FFFFFFF800041B00FFFFFFFC02030403000000030101FFFFFFF800041B00FFFFFFFC02030403000000040101FFFFFFF800041B00FFFFFFFC02030403000000050101FFFFFFF800041B00FFFFFFFC02030403000000060101FFFFFFF800041B00FFFFFFFC02030403000000070101FFFFFFF800041B00FFFFFFFC02030403000000080101FFFFFFF800041B00FFFFFFFC02030403000000090101FFFFFFF800041B00FFFFFFFC020304030000000A0101FFFFFFF800041B00FFFFFFFC020304030000000B0101FFFFFFF800041B00FFFFFFFC020304030000000C0101FFFFFFF800041B00FFFFFFFC020304030000000D0101FFFFFFF800041B00FFFFFFFC020304030000000E0101FFFFFFF800041B00FFFFFFFC020304030000000F0101FFFFFFF800041B00FFFFFFFC02030403000000100101FFFFFFF800041B00FFFFFFFC02030403000000110101FFFFFFF800041B00FFFFFFFC02030403000000120101FFFFFFF800041B00FFFFFFFC02030403000000130101FFFFFFF800041B00FFFFFFFC020304030000044C0101FFFFFFF800041B00FFFFFFFC020304030000000619030101FFFFFFF800041B00FFFFFFFC020304030000000519030101FFFFFFF800041B00FFFFFFFC020304030000000419030101FFFFFFF800041B00FFFFFFFC020304030000000219030101FFFFFFF800041B00FFFFFFFC020304030000000119030101FFFFFFF800041B00FFFFFFFC02030403000000000101FFFFFFF800041B00FFFFFFFC02030403000000010101FFFFFFF800041B00FFFFFFFC02030403000000010101FFFFFFF800041B00FFFFFFFC02030403000000020101FFFFFFF800041B00FFFFFFFC02030403000000030101FFFFFFF800041B00FFFFFFFC02030403000000040101FFFFFFF800041B00FFFFFFFC02030403000000050101FFFFFFF800041B00FFFFFFFC02030403000000060101FFFFFFF800041B00FFFFFFFC02030403000000070101FFFFFFF800041B00FFFFFFFC02030403000000080101FFFFFFF800041B00FFFFFFFC02030403000000090101FFFFFFF800041B00FFFFFFFC020304030000000A0101FFFFFFF800041B00FFFFFFFC020304030000000B0101FFFFFFF800041B00FFFFFFFC020304030000000C0101FFFFFFF800041B00FFFFFFFC020304030000000D0101FFFFFFF800041B00FFFFFFFC020304030000000E0101FFFFFFF800041B00FFFFFFFC020304030000000F0101FFFFFFF800041B00FFFFFFFC2A001E00000000102B001B00FFFFFF54200002060406000000000101FFFFFFF800041B00FFFFFFFC0403000000370301FFFFFFF80004050002A70222031F000000004C0403000000010403000000370301FFFFFFF40004050002A803040300000398040300000008050000000104030000000114200301FFFFFFF400041E00000000141D00000000061B00FFFFFFFC20000403000000390301FFFFFFF80004050002A70222031F00000002670403000000010403000000390301FFFFFFF40004050002A803020302030405000A475F50435F4C4556454C05000244010101FFFFFFF800041B00FFFFFFFC0301FFFFFFEC00040403000003840F201F00000000200403000000000101FFFFFFE800041B00FFFFFFFC1D000000000602030301FFFFFFF800040403000000060500000001142004030000000415200101FFFFFFF800041B00FFFFFFFC0301FFFFFFFC00040403000000010F201F00000000200403000000010101FFFFFFF800041B00FFFFFFFC1D00000000060301FFFFFFFC000404030000001E0E201F000000002004030000001E0101FFFFFFF800041B00FFFFFFFC1D00000000060403000000010101FFFFFFF000041B00FFFFFFFC0301FFFFFFF400040301FFFFFFE8000410201F00000001340205020502030403000000010101FFFFFFF800041B00FFFFFFFC020302050301FFFFFFD400040301FFFFFFE400041E00000001140101FFFFFFEC00041B00FFFFFFFC040500015B0301FFFFFFEC000405000042020101FFFFFFF800040403000000000D201F000000006B0403000000040301FFFFFFF8000404030000000114200301FFFFFFE800040500004103050000E8010101FFFFFFF400041B00FFFFFFFC0301FFFFFFFC00040403000000000301FFFFFFE8000405000041030101FFFFFFF000041B00FFFFFFFC1D000000001C0301FFFFFFF000040101FFFFFFF000041B00FFFFFFFC0403000000000301FFFFFFF400040301FFFFFFD800040301FFFFFFE800040500001F041B00FFFFFFFC1B00FFFFFFF00301FFFFFFF400042403FFFFFFF01B00FFFFFFFC1D00FFFFFEC01B00FFFFFFF41D00000000061B00FFFFFFF4200002030403000000000101FFFFFFF800041B00FFFFFFFC020302030405000A475F50435F4C4556454C05000244010101FFFFFFF800041B00FFFFFFFC0205040500000101FFFFFFF800041B00FFFFFFFC0301FFFFFFE800040403000003DE0B200301FFFFFFFC00041F0000000015040300000007050002B801220306201F00000000200403000003D40101FFFFFFE400041B00FFFFFFFC1D00000000060301FFFFFFE800040403000000000B201F000000005B040300000000050000000104030000000914200101FFFFFFF000041B00FFFFFFFC02050301FFFFFFF0000404030000006416200301FFFFFFE400041E00FFFFFF130101FFFFFFF800041B00FFFFFFFC1D000000036D0301FFFFFFE8000404030000006418200403000000000B201F000000010E0301FFFFFFE8000404030000006417200301FFFFFFFC00040403000000090B2025000000000C1D000000007B040300000007050002B8011F00000000200403000000090101FFFFFFE800041B00FFFFFFFC1D000000004A0301FFFFFFF400040403000000060F201F00000000200403000000060101FFFFFFE800041B00FFFFFFFC1D000000001A0403000000080101FFFFFFE800041B00FFFFFFFC1D00000000061B00FFFFFFFC0301FFFFFFF00004050000000104030000000114200101FFFFFFF000041B00FFFFFFFC02050301FFFFFFE4000404030000000A0301FFFFFFE80004162014200301FFFFFFE400041E00FFFFFDED0101FFFFFFF800041B00FFFFFFFC1D00000002470301FFFFFFE8000404030000000A18200403000000000B201F00000002030301FFFFFFE8000404030000000A17200301FFFFFFFC000404030000005C0B2025000000007A0301FFFFFFFC000404030000005E0B2025000000007E0301FFFFFFFC000404030000005F0B202500000000820301FFFFFFFC00040403000000610B202500000000860301FFFFFFFC00040403000000620B2025000000008A0301FFFFFFFC00040403000000630B2025000000008E1D000000010A0403000000020101FFFFFFE800041B00FFFFFFFC1D00000000F00403000000020101FFFFFFE800041B00FFFFFFFC1D00000000D60403000000040101FFFFFFE800041B00FFFFFFFC1D00000000BC0403000000090101FFFFFFE800041B00FFFFFFFC1D00000000A20403000000030101FFFFFFE800041B00FFFFFFFC1D00000000880301FFFFFFE8000404030000000317200101FFFFFFE800041B00FFFFFFFC0301FFFFFFEC00040403000000050E201F00000000200403000000050101FFFFFFE800041B00FFFFFFFC1D00000000060403000000000101FFFFFFE800041F00000000200403000000010101FFFFFFE800041B00FFFFFFFC1D00000000061D00000000061B00FFFFFFFC0301FFFFFFF00004050000000104030000000114200101FFFFFFF000041B00FFFFFFFC02050301FFFFFFE400040301FFFFFFEC000414200301FFFFFFE400041E00000000700101FFFFFFF800041B00FFFFFFFC1D000000002C02050301FFFFFFE400040301FFFFFFE400041E00000000440101FFFFFFF800041B00FFFFFFFC0301FFFFFFFC00040101FFFFFFE000041B00FFFFFFEC1D00000000121B00FFFFFFFC1B00FFFFFFF01B00FFFFFFF8200002030403000000010101FFFFFFF800041B00FFFFFFFC0205020505000231000101FFFFFFF800041B00FFFFFFFC0301FFFFFFFC0004040500063835314E49480B230301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063835324E49480B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930314D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930324D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930334D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930344D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930354D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930364D414C0B2307201F00000000F904030000000305000000010403000000000B201F00000000200403000003AD0101FFFFFFE800041B00FFFFFFFC1D00000000C004030000000305000000010403000000000B201F000000002D0403000003CB040300000009050000000114200101FFFFFFE800041B00FFFFFFFC1D000000008004030000000305000000010403000000000B201F00000000200403000003A30101FFFFFFE800041B00FFFFFFFC1D000000004D04030000000305000000010403000000000B201F00000000200403000003990101FFFFFFE800041B00FFFFFFFC1D000000001A04030000039A0101FFFFFFE800041B00FFFFFFFC1D00000000060301FFFFFFEC00040301FFFFFFFC000404030000038F0B2025000000025E0301FFFFFFFC00040403000003990B202500000002A60301FFFFFFFC000404030000039A0B202500000002E70301FFFFFFFC00040403000003A30B202500000003250301FFFFFFFC00040403000003AD0B2025000000039C0301FFFFFFFC00040403000003AE0B202500000004270301FFFFFFFC00040403000003B70B202500000004B50301FFFFFFFC00040403000003B80B202500000005190301FFFFFFFC00040403000003B90B2025000000057D0301FFFFFFFC00040403000003BA0B202500000005E10301FFFFFFFC00040403000003C10B2025000000066C0301FFFFFFFC00040403000003CB0B202500000006B40301FFFFFFFC00040403000003CC0B202500000006E70301FFFFFFFC00040403000003CD0B202500000007330301FFFFFFFC00040403000003CE0B202500000007A20301FFFFFFFC00040403000003CF0B202500000008250301FFFFFFFC00040403000003D00B202500000008930301FFFFFFFC00040403000003D10B202500000008A20301FFFFFFFC00040403000003D20B202500000009100301FFFFFFFC00040403000003D30B2025000000097F0301FFFFFFFC00040403000003D50B202500000009B00301FFFFFFFC00040403000003D60B202500000009BD0301FFFFFFFC00040403000003D70B202500000009CE0301FFFFFFFC00040403000003DF0B202500000009DE0301FFFFFFFC00040403000003E00B202500000009EB0301FFFFFFFC00040403000003E10B202500000009F80301FFFFFFFC00040403000003E20B20250000000A050301FFFFFFFC00040403000003E30B20250000000A121D0000000A2F04030000000A0301FFFFFFE80004162005000000010403000000320500000001142004030000001414200101FFFFFFEC00041B00FFFFFFFC0405000E675F695F637265646974733031350101FFFFFFF000041B00FFFFFFFC1D00000009D10301FFFFFFEC000405000000010301FFFFFFE800040500000001142004030000000114200101FFFFFFEC00041B00FFFFFFFC0405000D636F6D706F6E745F30303030310101FFFFFFF000041B00FFFFFFFC1D000000097A0301FFFFFFEC000405000000010301FFFFFFE800040500000001142004030000000114200101FFFFFFEC00041B00FFFFFFFC0405000A6368656D5F30303030310101FFFFFFF000041B00FFFFFFFC1D00000009260301FFFFFFEC0004040300000004172004030000000114200101FFFFFFE800041B00FFFFFFFC0301FFFFFFEC00040403000000070E201F00000000200403000000070101FFFFFFE800041B00FFFFFFFC1D000000000604050009615F736869656C645F040500013014230301FFFFFFE800040500005C0114230101FFFFFFF000041B00FFFFFFFC1D00000008990301FFFFFFEC00040403000000090F201F000000002D0405000F675F695F6D65646571706D6E7430310101FFFFFFF000041B00FFFFFFFC1D00000000640301FFFFFFEC00040403000000100F201F000000002D0405000F675F695F6D65646571706D6E7430320101FFFFFFF000041B00FFFFFFFC1D00000000270405000F675F695F6D65646571706D6E7430330101FFFFFFF000041B00FFFFFFFC1D00000007F80301FFFFFFEC00040403000000090F201F000000002E04050010675F695F6472647265706571703030310101FFFFFFF000041B00FFFFFFFC1D00000000660301FFFFFFEC00040403000000100F201F000000002E04050010675F695F6472647265706571703030320101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6472647265706571703030330101FFFFFFF000041B00FFFFFFFC1D00000007540403000000010101FFFFFFEC00041B00FFFFFFFC0301FFFFFFEC000404030000000C0F201F000000002E04050010675F695F6164726E616C696E653030310101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6164726E616C696E653030340101FFFFFFF000041B00FFFFFFFC1D00000006DA0403000000010101FFFFFFEC00041B00FFFFFFFC0301FFFFFFEC000404030000000C0F201F000000002E04050010675F695F6164726E616C696E653030320101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6164726E616C696E653030350101FFFFFFF000041B00FFFFFFFC1D00000006600403000000010101FFFFFFEC00041B00FFFFFFFC0301FFFFFFEC000404030000000C0F201F000000002E04050010675F695F6164726E616C696E653030330101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6164726E616C696E653030360101FFFFFFF000041B00FFFFFFFC1D00000005E60301FFFFFFEC00040403000000090F201F000000002D0405000F675F695F6D65646571706D6E7430310101FFFFFFF000041B00FFFFFFFC1D00000000640301FFFFFFEC00040403000000100F201F000000002D0405000F675F695F6D65646571706D6E7430320101FFFFFFF000041B00FFFFFFFC1D00000000270405000F675F695F6D65646571706D6E7430330101FFFFFFF000041B00FFFFFFFC1D00000005450403000000020301FFFFFFE80004162005000000010403000000320500000001142004030000000A14200101FFFFFFEC00041B00FFFFFFFC0405000E675F695F637265646974733031340101FFFFFFF000041B00FFFFFFFC1D00000004E7040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC04050010675F775F61646873766772656E3030310101FFFFFFF000041B00FFFFFFFC1D000000049E0301FFFFFFEC00040403000000040F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D00000000260405000E675F775F7374756E6772656E30310101FFFFFFF000041B00FFFFFFFC1D000000043C0301FFFFFFEC000404030000000B0F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000049040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC04050010675F775F6372796F626772656E3030310101FFFFFFF000041B00FFFFFFFC1D00000003B70301FFFFFFEC000404030000000F0F201F000000004E0405000F675F775F666972656772656E3030310101FFFFFFF000041B00FFFFFFFC040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC1D000000003B0403000000010101FFFFFFEC00041B00FFFFFFFC0405000F675F775F746865726D6C64657430310101FFFFFFF000041B00FFFFFFFC1D000000031E0301FFFFFFEC00040403000000090F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000048040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000F675F775F666972656772656E3030310101FFFFFFF000041B00FFFFFFFC1D000000029A0405000D675F775F696F6E6772656E30310101FFFFFFF000041B00FFFFFFFC1D00000002750301FFFFFFEC00040403000000070F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000048040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000F675F775F706F69736E6772656E30310101FFFFFFF000041B00FFFFFFFC1D00000001F10301FFFFFFEC00040403000000060F201F000000002D0405000F675F775F736F6E69636772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000048040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000F675F775F736F6E69636772656E30310101FFFFFFF000041B00FFFFFFFC1D000000016C040300000004050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D00000001250405000B675F695F706172747330310101FFFFFFF000041B00FFFFFFFC1D00000001020405000F675F695F70726F677370696B6530310101FFFFFFF000041B00FFFFFFFC1D00000000DB0405000E675F695F7365637370696B6530310101FFFFFFF000041B00FFFFFFFC1D00000000B50405000B775F726F636B65745F30310101FFFFFFF000041B00FFFFFFFC1D00000000920405000B775F726F636B65745F30320101FFFFFFF000041B00FFFFFFFC1D000000006F0405000B775F726F636B65745F30330101FFFFFFF000041B00FFFFFFFC1D000000004C0405000B775F726F636B65745F30340101FFFFFFF000041B00FFFFFFFC1D00000000290405000B775F726F636B65745F30350101FFFFFFF000041B00FFFFFFFC1D00000000061B00FFFFFFFC0301FFFFFFF8000402050301FFFFFFEC00041E000000003014230101FFFFFFE400041B00FFFFFFF01D00000000121B00FFFFFFFC1B00FFFFFFF41B00FFFFFFF8200002050301FFFFFFF800040500005C010101FFFFFFF800041B00FFFFFFFC020504050001300101FFFFFFF800041B00FFFFFFFC02030403000000040101FFFFFFF800041B00FFFFFFFC0301FFFFFFF400040500003B010301FFFFFFF800040F201F000000002C0301FFFFFFF800040301FFFFFFF0000414230101FFFFFFF000041B00FFFFFFFC1D00FFFFFFC3040500015B0301FFFFFFF000041423040500015D14230101FFFFFFE800041B00FFFFFFF01D00000000121B00FFFFFFFC1B00FFFFFFF41B00FFFFFFFC2000 This... is not good. Formatting to the Rescue Spoiler All we see is a big block of text. We want to break this up into NCS operations, meaning op codes, op types, and optional operands. Our first snippet should look like this: 1E 00 00000008 20 00 And the big block of text should have a similar format: for (auto CurPos{ BeginFile + 13 }; _at_least(CurPos, EndFile, 2u);) { auto OpCode{ static_cast<ncs_op>(*CurPos) }; ++CurPos; auto OpType{ static_cast<ncs_type>(*CurPos) }; ++CurPos; Output += hex_str(static_cast<uint8_t>(OpCode)); Output += ' '; Output += hex_str(static_cast<uint8_t>(OpType)); switch (OpCode) { case ncs_op::CPDOWNSP: [[fallthrough]]; case ncs_op::CPTOPSP: [[fallthrough]]; case ncs_op::CPDOWNBP: [[fallthrough]]; case ncs_op::CPTOPBP: Output += hex_str(CurPos, EndFile, 4u); CurPos += 4; Output += hex_str(CurPos, EndFile, 2u); CurPos += 2; break; ... } Output += '\n'; } The loop checks for two bytes now, the op code and op type. Both are inserted into the output string, with a space in between them for printing columns. The switch statement reads the op codes and determines if there should be operands. If so, it converts each operand according to its size. Each operand is preceded by a space, again to produce columns. The converted output is then terminated with a new-line character. The loop starts over again, if there are more operations. What I Start To Work With Spoiler The above code produces the following output for an actual file: 1E 00 00000008 20 00 02 03 04 03 00000000 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000002 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000003 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000004 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000005 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000006 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000007 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000008 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000009 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000A 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000B 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000C 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000D 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000E 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000F 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000010 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000011 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000012 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000013 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000044C 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000006 19 03 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000005 19 03 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000004 19 03 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000002 19 03 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000001 19 03 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000000 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000002 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000003 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000004 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000005 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000006 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000007 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000008 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000009 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000A 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000B 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000C 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000D 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000E 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 0000000F 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 2A 00 1E 00 00000010 2B 00 1B 00 FFFFFF54 20 00 02 06 04 06 00000000 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 04 03 00000037 03 01 FFFFFFF8 0004 05 00 02A7 02 22 03 1F 00 0000004C 04 03 00000001 04 03 00000037 03 01 FFFFFFF4 0004 05 00 02A8 03 04 03 00000398 04 03 00000008 05 00 0000 01 04 03 00000001 14 20 03 01 FFFFFFF4 0004 1E 00 00000014 1D 00 00000006 1B 00 FFFFFFFC 20 00 04 03 00000039 03 01 FFFFFFF8 0004 05 00 02A7 02 22 03 1F 00 00000267 04 03 00000001 04 03 00000039 03 01 FFFFFFF4 0004 05 00 02A8 03 02 03 02 03 04 05 000A 05 00 0244 01 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 03 01 FFFFFFEC 0004 04 03 00000384 0F 20 1F 00 00000020 04 03 00000000 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000006 02 03 03 01 FFFFFFF8 0004 04 03 00000006 05 00 0000 01 14 20 04 03 00000004 15 20 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 03 01 FFFFFFFC 0004 04 03 00000001 0F 20 1F 00 00000020 04 03 00000001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 1D 00 00000006 03 01 FFFFFFFC 0004 04 03 0000001E 0E 20 1F 00 00000020 04 03 0000001E 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 1D 00 00000006 04 03 00000001 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 03 01 FFFFFFF4 0004 03 01 FFFFFFE8 0004 10 20 1F 00 00000134 02 05 02 05 02 03 04 03 00000001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 02 05 03 01 FFFFFFD4 0004 03 01 FFFFFFE4 0004 1E 00 00000114 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 0001 03 01 FFFFFFEC 0004 05 00 0042 02 01 01 FFFFFFF8 0004 04 03 00000000 0D 20 1F 00 0000006B 04 03 00000004 03 01 FFFFFFF8 0004 04 03 00000001 14 20 03 01 FFFFFFE8 0004 05 00 0041 03 05 00 00E8 01 01 01 FFFFFFF4 0004 1B 00 FFFFFFFC 03 01 FFFFFFFC 0004 04 03 00000000 03 01 FFFFFFE8 0004 05 00 0041 03 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000001C 03 01 FFFFFFF0 0004 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 04 03 00000000 03 01 FFFFFFF4 0004 03 01 FFFFFFD8 0004 03 01 FFFFFFE8 0004 05 00 001F 04 1B 00 FFFFFFFC 1B 00 FFFFFFF0 03 01 FFFFFFF4 0004 24 03 FFFFFFF0 1B 00 FFFFFFFC 1D 00 FFFFFEC0 1B 00 FFFFFFF4 1D 00 00000006 1B 00 FFFFFFF4 20 00 02 03 04 03 00000000 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 02 03 04 05 000A 05 00 0244 01 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 05 04 05 0000 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 03 01 FFFFFFE8 0004 04 03 000003DE 0B 20 03 01 FFFFFFFC 0004 1F 00 00000015 04 03 00000007 05 00 02B8 01 22 03 06 20 1F 00 00000020 04 03 000003D4 01 01 FFFFFFE4 0004 1B 00 FFFFFFFC 1D 00 00000006 03 01 FFFFFFE8 0004 04 03 00000000 0B 20 1F 00 0000005B 04 03 00000000 05 00 0000 01 04 03 00000009 14 20 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 02 05 03 01 FFFFFFF0 0004 04 03 00000064 16 20 03 01 FFFFFFE4 0004 1E 00 FFFFFF13 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 1D 00 0000036D 03 01 FFFFFFE8 0004 04 03 00000064 18 20 04 03 00000000 0B 20 1F 00 0000010E 03 01 FFFFFFE8 0004 04 03 00000064 17 20 03 01 FFFFFFFC 0004 04 03 00000009 0B 20 25 00 0000000C 1D 00 0000007B 04 03 00000007 05 00 02B8 01 1F 00 00000020 04 03 00000009 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 0000004A 03 01 FFFFFFF4 0004 04 03 00000006 0F 20 1F 00 00000020 04 03 00000006 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 0000001A 04 03 00000008 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000006 1B 00 FFFFFFFC 03 01 FFFFFFF0 0004 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 02 05 03 01 FFFFFFE4 0004 04 03 0000000A 03 01 FFFFFFE8 0004 16 20 14 20 03 01 FFFFFFE4 0004 1E 00 FFFFFDED 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 1D 00 00000247 03 01 FFFFFFE8 0004 04 03 0000000A 18 20 04 03 00000000 0B 20 1F 00 00000203 03 01 FFFFFFE8 0004 04 03 0000000A 17 20 03 01 FFFFFFFC 0004 04 03 0000005C 0B 20 25 00 0000007A 03 01 FFFFFFFC 0004 04 03 0000005E 0B 20 25 00 0000007E 03 01 FFFFFFFC 0004 04 03 0000005F 0B 20 25 00 00000082 03 01 FFFFFFFC 0004 04 03 00000061 0B 20 25 00 00000086 03 01 FFFFFFFC 0004 04 03 00000062 0B 20 25 00 0000008A 03 01 FFFFFFFC 0004 04 03 00000063 0B 20 25 00 0000008E 1D 00 0000010A 04 03 00000002 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 000000F0 04 03 00000002 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 000000D6 04 03 00000004 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 000000BC 04 03 00000009 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 000000A2 04 03 00000003 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000088 03 01 FFFFFFE8 0004 04 03 00000003 17 20 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 03 01 FFFFFFEC 0004 04 03 00000005 0E 20 1F 00 00000020 04 03 00000005 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000006 04 03 00000000 01 01 FFFFFFE8 0004 1F 00 00000020 04 03 00000001 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000006 1D 00 00000006 1B 00 FFFFFFFC 03 01 FFFFFFF0 0004 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 02 05 03 01 FFFFFFE4 0004 03 01 FFFFFFEC 0004 14 20 03 01 FFFFFFE4 0004 1E 00 00000070 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 1D 00 0000002C 02 05 03 01 FFFFFFE4 0004 03 01 FFFFFFE4 0004 1E 00 00000044 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 03 01 FFFFFFFC 0004 01 01 FFFFFFE0 0004 1B 00 FFFFFFEC 1D 00 00000012 1B 00 FFFFFFFC 1B 00 FFFFFFF0 1B 00 FFFFFFF8 20 00 02 03 04 03 00000001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 05 02 05 05 00 0231 00 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 03 01 FFFFFFFC 0004 04 05 0006 0B 23 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 03 01 FFFFFFFC 0004 1F 00 00000014 03 01 FFFFFFFC 0004 1F 00 0000001A 03 01 FFFFFFF8 0004 04 05 0006 0B 23 07 20 1F 00 000000F9 04 03 00000003 05 00 0000 01 04 03 00000000 0B 20 1F 00 00000020 04 03 000003AD 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 000000C0 04 03 00000003 05 00 0000 01 04 03 00000000 0B 20 1F 00 0000002D 04 03 000003CB 04 03 00000009 05 00 0000 01 14 20 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000080 04 03 00000003 05 00 0000 01 04 03 00000000 0B 20 1F 00 00000020 04 03 000003A3 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 0000004D 04 03 00000003 05 00 0000 01 04 03 00000000 0B 20 1F 00 00000020 04 03 00000399 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 0000001A 04 03 0000039A 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000006 03 01 FFFFFFEC 0004 03 01 FFFFFFFC 0004 04 03 0000038F 0B 20 25 00 0000025E 03 01 FFFFFFFC 0004 04 03 00000399 0B 20 25 00 000002A6 03 01 FFFFFFFC 0004 04 03 0000039A 0B 20 25 00 000002E7 03 01 FFFFFFFC 0004 04 03 000003A3 0B 20 25 00 00000325 03 01 FFFFFFFC 0004 04 03 000003AD 0B 20 25 00 0000039C 03 01 FFFFFFFC 0004 04 03 000003AE 0B 20 25 00 00000427 03 01 FFFFFFFC 0004 04 03 000003B7 0B 20 25 00 000004B5 03 01 FFFFFFFC 0004 04 03 000003B8 0B 20 25 00 00000519 03 01 FFFFFFFC 0004 04 03 000003B9 0B 20 25 00 0000057D 03 01 FFFFFFFC 0004 04 03 000003BA 0B 20 25 00 000005E1 03 01 FFFFFFFC 0004 04 03 000003C1 0B 20 25 00 0000066C 03 01 FFFFFFFC 0004 04 03 000003CB 0B 20 25 00 000006B4 03 01 FFFFFFFC 0004 04 03 000003CC 0B 20 25 00 000006E7 03 01 FFFFFFFC 0004 04 03 000003CD 0B 20 25 00 00000733 03 01 FFFFFFFC 0004 04 03 000003CE 0B 20 25 00 000007A2 03 01 FFFFFFFC 0004 04 03 000003CF 0B 20 25 00 00000825 03 01 FFFFFFFC 0004 04 03 000003D0 0B 20 25 00 00000893 03 01 FFFFFFFC 0004 04 03 000003D1 0B 20 25 00 000008A2 03 01 FFFFFFFC 0004 04 03 000003D2 0B 20 25 00 00000910 03 01 FFFFFFFC 0004 04 03 000003D3 0B 20 25 00 0000097F 03 01 FFFFFFFC 0004 04 03 000003D5 0B 20 25 00 000009B0 03 01 FFFFFFFC 0004 04 03 000003D6 0B 20 25 00 000009BD 03 01 FFFFFFFC 0004 04 03 000003D7 0B 20 25 00 000009CE 03 01 FFFFFFFC 0004 04 03 000003DF 0B 20 25 00 000009DE 03 01 FFFFFFFC 0004 04 03 000003E0 0B 20 25 00 000009EB 03 01 FFFFFFFC 0004 04 03 000003E1 0B 20 25 00 000009F8 03 01 FFFFFFFC 0004 04 03 000003E2 0B 20 25 00 00000A05 03 01 FFFFFFFC 0004 04 03 000003E3 0B 20 25 00 00000A12 1D 00 00000A2F 04 03 0000000A 03 01 FFFFFFE8 0004 16 20 05 00 0000 01 04 03 00000032 05 00 0000 01 14 20 04 03 00000014 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000009D1 03 01 FFFFFFEC 0004 05 00 0000 01 03 01 FFFFFFE8 0004 05 00 0000 01 14 20 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000D 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000097A 03 01 FFFFFFEC 0004 05 00 0000 01 03 01 FFFFFFE8 0004 05 00 0000 01 14 20 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000A 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000926 03 01 FFFFFFEC 0004 04 03 00000004 17 20 04 03 00000001 14 20 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 03 01 FFFFFFEC 0004 04 03 00000007 0E 20 1F 00 00000020 04 03 00000007 01 01 FFFFFFE8 0004 1B 00 FFFFFFFC 1D 00 00000006 04 05 0009 04 05 0001 14 23 03 01 FFFFFFE8 0004 05 00 005C 01 14 23 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000899 03 01 FFFFFFEC 0004 04 03 00000009 0F 20 1F 00 0000002D 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000064 03 01 FFFFFFEC 0004 04 03 00000010 0F 20 1F 00 0000002D 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000027 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000007F8 03 01 FFFFFFEC 0004 04 03 00000009 0F 20 1F 00 0000002E 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000066 03 01 FFFFFFEC 0004 04 03 00000010 0F 20 1F 00 0000002E 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000028 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000754 04 03 00000001 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 03 01 FFFFFFEC 0004 04 03 0000000C 0F 20 1F 00 0000002E 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000028 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000006DA 04 03 00000001 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 03 01 FFFFFFEC 0004 04 03 0000000C 0F 20 1F 00 0000002E 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000028 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000660 04 03 00000001 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 03 01 FFFFFFEC 0004 04 03 0000000C 0F 20 1F 00 0000002E 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000028 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000005E6 03 01 FFFFFFEC 0004 04 03 00000009 0F 20 1F 00 0000002D 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000064 03 01 FFFFFFEC 0004 04 03 00000010 0F 20 1F 00 0000002D 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000027 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000545 04 03 00000002 03 01 FFFFFFE8 0004 16 20 05 00 0000 01 04 03 00000032 05 00 0000 01 14 20 04 03 0000000A 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000004E7 04 03 00000002 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000049E 03 01 FFFFFFEC 0004 04 03 00000004 0F 20 1F 00 0000002C 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000026 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000043C 03 01 FFFFFFEC 0004 04 03 0000000B 0F 20 1F 00 0000002C 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000049 04 03 00000002 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 0010 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000003B7 03 01 FFFFFFEC 0004 04 03 0000000F 0F 20 1F 00 0000004E 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 04 03 00000002 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 1D 00 0000003B 04 03 00000001 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000031E 03 01 FFFFFFEC 0004 04 03 00000009 0F 20 1F 00 0000002C 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000048 04 03 00000002 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000029A 04 05 000D 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000275 03 01 FFFFFFEC 0004 04 03 00000007 0F 20 1F 00 0000002C 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000048 04 03 00000002 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000001F1 03 01 FFFFFFEC 0004 04 03 00000006 0F 20 1F 00 0000002D 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000048 04 03 00000002 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000016C 04 03 00000004 05 00 0000 01 04 03 00000001 14 20 01 01 FFFFFFEC 0004 1B 00 FFFFFFFC 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000125 04 05 000B 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000102 04 05 000F 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000000DB 04 05 000E 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 000000B5 04 05 000B 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000092 04 05 000B 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000006F 04 05 000B 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 0000004C 04 05 000B 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000029 04 05 000B 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 00000006 1B 00 FFFFFFFC 03 01 FFFFFFF8 0004 02 05 03 01 FFFFFFEC 0004 1E 00 00000030 14 23 01 01 FFFFFFE4 0004 1B 00 FFFFFFF0 1D 00 00000012 1B 00 FFFFFFFC 1B 00 FFFFFFF4 1B 00 FFFFFFF8 20 00 02 05 03 01 FFFFFFF8 0004 05 00 005C 01 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 05 04 05 0001 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 02 03 04 03 00000004 01 01 FFFFFFF8 0004 1B 00 FFFFFFFC 03 01 FFFFFFF4 0004 05 00 003B 01 03 01 FFFFFFF8 0004 0F 20 1F 00 0000002C 03 01 FFFFFFF8 0004 03 01 FFFFFFF0 0004 14 23 01 01 FFFFFFF0 0004 1B 00 FFFFFFFC 1D 00 FFFFFFC3 04 05 0001 03 01 FFFFFFF0 0004 14 23 04 05 0001 14 23 01 01 FFFFFFE8 0004 1B 00 FFFFFFF0 1D 00 00000012 1B 00 FFFFFFFC 1B 00 FFFFFFF4 1B 00 FFFFFFFC 20 00 Yes, I'm reading that and write code to reverse it. Looks like fun, eh? 1 Share this post Link to post Share on other sites
DrMcCoy 40 Posted January 21, 2022 12 hours ago, AmanoJyaku said: OK, so we want to print in hex. How? I had to write my own code for this, I couldn't figure out the C++ formatting and conversion library. The C++98-era method would be to use stringstream. It's a bit terrible, though. You could the C function snprintf(): #include <cstdint> #include <cstdio> #include <string> std::string hex_string(uint8_t data) { char tmp[3]; std::snprintf(tmp, 3, "%02X", data); return tmp; } In C++20, you'll be able to use std::format() instead, which provides a syntax similar to Python's string formating. I.e. you could do std::format("{:02X}", data), which already returns a std::string. Compiler support for that is still sparse though. Alternatively, you can pull in the (optionally header-only) library fmt, which is actually what became C++20's std::format(). Share this post Link to post Share on other sites