A New DeNCS

AmanoJyaku · December 26, 2020

7 hours ago, DarthParametric said:

That seems fine to me. I was more concerned that certain files might not able to be decompiled at all. This is the case for DeNCS as well, based on my own experiments.

My concern with nwnnsscomp is that we now know its output differs from that of the original compiler. That leads to three possibilities:

The output files are fully reversible, even if they differ slightly from the source. This is acceptable.
The output files are fully reversible, but require additional code to detect and handle the differences from the original compiler. This will be very annoying.
The output files are not reversible. This is unacceptable.

At the moment, I have no idea which of these is the case. With over 2,500 files, there is no reasonable way I can develop a reverse compiler while inspecting each script in a timely fashion. I feel it's something to take note of, and look into once the reverse compiler is complete.

1 hour ago, JCarter426 said:

This is still common practice, at least as far as I've been taught. Yeah, there are security issues if you then try to read one, but modern compilers either won't let you do that or will initialize it with a default value, if the language itself doesn't. With that in mind, I don't see the issue.

I don't know who taught you this, but it's a practice that's frowned upon even with modern compilers. We try to learn good habits rather than relying on the compiler to handle things we could easily do ourselves. It's not uncommon for bugs to creep into compilers, and you can track these down more readily with proper code. This advice comes from the best programmers at places like Apple, Google, and Microsoft, and even the ISO C and C++ committees.

1 hour ago, JCarter426 said:

The issues you've mentioned before don't seem like they should cause problems to me. Like with the nested else/if vs a simple else if, that'll only be functionally different if you declare something local to the else branch's scope. And, obviously, don't do that. And I doubt the compiler would've optimized for it that way if it would produce functionally different code. It usually knows better than we do.

Why not create variables local to the else branch's scope? That's the whole point of having scopes, to create variables only where they're needed. Variables placed in too high a scope, e.g. the global scope, are at risk of being overwritten erroneously.

As for the compiler knowing better than we do, that's true of professionally written compilers. Which nwnnsscomp is not. I've already listed some of the basic things it fails to optimize, e.g. failing to remove empty functions. To be fair, the original compiler from BioWare has an even bigger problem in that logical AND and logical OR are incorrect. Which nwnnsscomp doesn't fix...

**JCarter426** · December 26, 2020

2 hours ago, AmanoJyaku said:

I don't know who taught you this, but it's a practice that's frowned upon even with modern compilers. We try to learn good habits rather than relying on the compiler to handle things we could easily do ourselves. It's not uncommon for bugs to creep into compilers, and you can track these down more readily with proper code. This advice comes from the best programmers at places like Apple, Google, and Microsoft, and even the ISO C and C++ committees.

I've seen numerous examples in the documentation for C# and Java, including the very page on variable initialization in C#. The C++ textbooks I've read (Starting Out with C++, Professional C++, C++ Notes for Professionals) obviously warned about the dangers of using uninitialized variables, but still used the extensively when it was safe.

I was taught that a) it is theoretically more efficient in languages that don't require variables to be initialized (e.g. C++ and C#) as minimal as that efficiency gain may be, and b) it's more readable—if you don't know what the value will be, don't say that's the value, as that's potentially misleading.

3 hours ago, AmanoJyaku said:

Why not create variables local to the else branch's scope? That's the whole point of having scopes, to create variables only where they're needed. Variables placed in too high a scope, e.g. the global scope, are at risk of being overwritten erroneously.

This, on the other hand, I was taught was bad practice because it increases the chance of hiding variables by accident and creating conflicts. So I was told that it's better to avoid this except for temp variables in a for loop or for a swap. This was just within a function, though—obviously global variables have their own problems.

I was also told that, in terms of memory, it's better to allocate it at the start of a function and have it crash there if it's going to crash rather than in the middle of execution. But I question whether that's a realistic problem anymore.

3 hours ago, AmanoJyaku said:

As for the compiler knowing better than we do, that's true of professionally written compilers. Which nwnnsscomp is not. I've already listed some of the basic things it fails to optimize, e.g. failing to remove empty functions. To be fair, the original compiler from BioWare has an even bigger problem in that logical AND and logical OR are incorrect. Which nwnnsscomp doesn't fix...

Well... yes. I believe you. 😧

DrMcCoy · December 26, 2020

4 hours ago, AmanoJyaku said:

At the moment, I have no idea which of these is the case. With over 2,500 files, there is no reasonable way I can develop a reverse compiler while inspecting each script in a timely fashion. I feel it's something to take note of, and look into once the reverse compiler is complete.

You might want to see if the round-trip holds, at least using nwnnsscomp and existing source files.

Like, source file -> nwnnsscomp (ncs #1) -> decompiler -> nwnnsscomp (ncs #2), then compare whether ncs #1 and ncs #2 identical. There shouldn't be any differences in the bytecode, probably. Possibly also run it through the decompiler again to check that the output is stable.

You can automate that steps and just focus on the examples where it fails. That's possible also good as a sort of test suite between big changes or releases.

There never was a public release of a KotOR-era BioWare compiler, though, right? Has anybody ever tried rigging the compiler in the Neverwinter Nights toolset (does it read the nwscript.nss or is it more hardcoded)? If it can be rigged, that would give you another angle to test on, though it's not necessarily guaranteed any of the NWN releases matches the version (versions?) that was used during KotOR development.

AmanoJyaku · December 26, 2020

2 hours ago, JCarter426 said:

I've seen numerous examples in the documentation for C# and Java, including the very page on variable initialization in C#. The C++ textbooks I've read (Starting Out with C++, Professional C++, C++ Notes for Professionals) obviously warned about the dangers of using uninitialized variables, but still used the extensively when it was safe.

Often, documentation is written with the assumption that the reader has some familiarity with the topic and is simply looking for a reference as a refresher on minutae. For example, the C# link shows a class definition A with an unassigned static scalar variable x, and unassigned instance scalar variable y. This is valid because it states:

Quote

Static variables

The initial value of a static variable is the default value (Default values) of the variable's type.

Instance variables

The initial value of an instance variable of a class is the default value (Default values) of the variable's type.

The default value of an int is 0. This is achieved by the compiler writing the assignment code for you, so the initialization occurs. In contrast:

Quote

Local variables

A local variable introduced by a local_variable_declaration is not automatically initialized and thus has no default value.

With the exception of the section "Try-catch-finally statements", the remainder of the document specifically uses uninitialized variables to point out their dangers. The "Try-catch-finally statements" section demonstrates a contrived (meaning, unrealistic) example of "safe" use of uninitialized variables. Documentation and tutorials use contrived examples to demonstrate how something works, but often caution such examples are not acceptable in production code.

I should point out Microsoft's documentation is mostly filled contrived examples. Post one of their examples on StackOverflow and prepare to be burned.

tl;dr: never assume uninitialized variables are safe.

2 hours ago, JCarter426 said:

I was taught that a) it is theoretically more efficient in languages that don't require variables to be initialized (e.g. C++ and C#) as minimal as that efficiency gain may be, and b) it's more readable—if you don't know what the value will be, don't say that's the value, as that's potentially misleading.

There's a mantra you should repeat to yourself: code for correctness, worry about performance later. Modern CPUs are fast, and compilers are good at optimizing your code for those CPUs. A CPU that operates at 3 billion cycles per second does not benefit from saving a few dozen cycles by omitting a single variable initialization. If the compiler thinks you can benefit, it will often make the change for you. For example, creating variables inside of loops is considered a bad practice because you're just going to create and destroy that variable repeatedly. Which is why compilers perform variable hoisting, meaning the variable is taken out of the loop and placed before it. So:

int Count = 10;
while (Count > 0)
{
  int i = 0;
  //Do something with i
  --Count;
}

Becomes:

int Count = 10;
int i = 0;
while (Count > 0)
{
  //Do something with i
  --Count;
}

But, you should probably just use a for loop:

for (int i = 0, Count = 10; Count > 0; --Count)
{
  //Do something with i
}

Readability is a concern, yes. But, it's more readable when you give a variable a descriptive name, initialize it with an appropriate starting value, and place it where you intend to use it. In the example, I can guess Count is the number of times to execute the loop. But, I should give i a better name. Maybe it should be Meters, or Years, or Euros? Both variables are created once, and destroyed once the loop completes. See how much we can learn from this simple code fragment?

2 hours ago, JCarter426 said:

This, on the other hand, I was taught was bad practice because it increases the chance of hiding variables by accident and creating conflicts. So I was told that it's better to avoid this except for temp variables in a for loop or for a swap. This was just within a function, though—obviously global variables have their own problems.

I was also told that, in terms of memory, it's better to allocate it at the start of a function and have it crash there if it's going to crash rather than in the middle of execution. But I question whether that's a realistic problem anymore.

I'm not sure I know what you mean by hiding variables, but if you place them where you use them then they aren't hidden. Conflicts can easily be avoided by using descriptive names. And a function should only be as large as the single task it performs.

And, please, no global variables. I don't think there is ever a reason for global variables.

As for memory, that's only a problem with embedded systems and some 32-bit platforms. No matter how little RAM a computer has it still has access to virtual memory. It won't crash, but it will be slow.

1 hour ago, DrMcCoy said:

You might want to see if the round-trip holds, at least using nwnnsscomp and existing source files.

Like, source file -> nwnnsscomp (ncs #1) -> decompiler -> nwnnsscomp (ncs #2), then compare whether ncs #1 and ncs #2 identical. There shouldn't be any differences in the bytecode, probably. Possibly also run it through the decompiler again to check that the output is stable.

You can automate that steps and just focus on the examples where it fails. That's possible also good as a sort of test suite between big changes or releases.

That's the plan.

1 hour ago, DrMcCoy said:

There never was a public release of a KotOR-era BioWare compiler, though, right? Has anybody ever tried rigging the compiler in the Neverwinter Nights toolset (does it read the nwscript.nss or is it more hardcoded)? If it can be rigged, that would give you another angle to test on, though it's not necessarily guaranteed any of the NWN releases matches the version (versions?) that was used during KotOR development.

I have no idea, I only learned about these tools when I answered your request for help. It's embarrassing, because I have two copies of NWN, I just need to find them...

**JCarter426** · December 26, 2020

1 hour ago, AmanoJyaku said:

I should point out Microsoft's documentation is mostly filled contrived examples. Post one of their examples on StackOverflow and prepare to be burned.

Yeah, I know. And with a managed language, they have the luxury of being more willy-nilly than if it were C++ or C. Still, I don't think they would be doing it all over the place if it were so universally unacceptable. It's safe as long as you don't read before assigning at some point, and modern IDEs and compilers go out of their way to prevent you from doing that by accident.

Again, if you have a compiler you can trust. I know the Visual Studio C++ compiler has some silly bugs, and Clang lets you do things it really shouldn't do because that code won't compile with anything else, to say nothing of NWNNSSComp.

1 hour ago, AmanoJyaku said:

There's a mantra you should repeat to yourself: code for correctness, worry about performance later. Modern CPUs are fast, and compilers are good at optimizing your code for those CPUs. A CPU that operates at 3 billion cycles per second does not benefit from saving a few dozen cycles by omitting a single variable initialization.

I know in most use cases it's minimal, but it can make a difference if you're processing huge amounts of data in a database, which C# and .NET are often used for. The reason C# allows uninitialized local variables, unlike, say, Java, is specifically for the performance gain. It mattered to somebody somewhere at some point.

1 hour ago, AmanoJyaku said:

I'm not sure I know what you mean by hiding variables, but if you place them where you use them then they aren't hidden. Conflicts can easily be avoided by using descriptive names.

I mean if you give a variable in an inner scope the same identifier as one in the outer scope. The variable in the outer scope is hidden.

Which you can also file under don't do that, but people make mistakes.

1 hour ago, AmanoJyaku said:

And a function should only be as large as the single task it performs.

Well yes, ideally you shouldn't have too much nested code in the first place. Another reason why I don't mind NWNNSSComp replacing nested else/if with else if.

AmanoJyaku · January 25, 2021

Life has been crazy, and I've seen things I never thought would happen in my country. But, coding never stops. Time for an overdue update!

An earlier update outlined the structure of a subroutine:

return_type subA(Argument_1, Argument_2...)
{
  //Part 1 - The code that does stuff
  //Part 2 - Values returned
  //Part 3 - Destruction of local variables
  //Part 4 - Destruction of arguments
}

Part 2 returns values from a subroutine using the CPDOWNSP opcode. Parts 3 and 4 destroy local variables and arguments using the MOVSP opcode. These opcodes can appear in any part of a subroutine. So, how do we know if we're returning a value, and destroying variables and/or arguments?

We analyze part 1 to determine what, if anything, is on the stack.

Computers execute operations in sequence unless directed otherwise. In high-level languages like NWScript, control statements alter the sequence. An if statement conditionally executes code. If the test condition fails, this code is jumped over. So, what does an if statement look like in the compiled NCS?

102	CPTOPSP -3 1
110	CONSTI 0
116	EQUALxx
118	JZ 170

124	CPTOPSP -1 1
132	CPTOPSP -3 1
140	CONSTO object
146	JSR 298
152	MOVSP -3
158	JMP 296
164	JMP 170

In the example, the opcode at offset 102 of the NCS file copies the value of a variable to the top of the stack. Offset 110 places the value 0 onto the stack. Offset 116 performs a test of equality of the two values, removes them from the stack, then places the result of the test onto the stack. Offset 118 performs a conditional jump based on the result, removing the result in the process. Offsets 124-164 are the body of the if statement, which consists of a call to a subroutine followed by a return statement:

    if ( nKillMode == 0 )
    {
        DamagingExplosion(OBJECT_SELF, nDelay, nDamage);
        return;
    }

So many ops, so few statements. That's why we write in high-level languages instead of assembly, because the compiler allows us to focus on what the program should do rather than creating the proper sequence of opcodes to keep the program from breaking. Notice in the body of the if statement there is a MOVSP. That is the destruction of local variables, part 3. There is no part 2, because the subroutine returns void (nothing). But, how do we know the MOVSP is part 3 and not part 4? And why are there two unconditional jumps???

The challenge of writing a reverse compiler is knowing the rules used to compile. I've spent the past six months observing, analyzing, and documenting those rules. For example, all control statements other than the switch statement are created using the jump-if-zero (JZ) opcode. (Switch statements are created using the jump-if-not-zero [JNZ] opcode.) However, the existence of a JZ opcode does not mean you've encountered a control statement! JZ is used to create the following:

Logical AND
Logical OR
if statements
else if statements
while statements
for statements
do-while statements (I finally found one!!!)

This means we need to some way of determining what the range of addresses created by the JZ opcode is. In the example above, the range created by the if statement is [124, 170). In this range notation a bracket "[" means the offset is included in the range, and a parenthesis ")" means the offset is outside of the range. The 170 is obvious because it's specified by the JZ opcode; if nKillMode == 0 returns false we jump to 170 and resume execution from there. Where does the 124 come from? Remember what was stated above: "computers execute operations in sequence unless directed otherwise." A JZ opcode is 6 bytes in length and the JZ is at offset 118, so 118 + 6 means the next op in the sequence is at offset 124. If nKillMode == 0 returns true we enter the body of the if statement at offset 124. Offset 124 is the first address in the range, offset 170 is the first address outside of the range.

All control statements are ranges, but not all ranges are control statements. The NCS file is itself a range: [0, file size). Subroutines are ranges. However, our focus is on the ranges created by JZ. How do we determine what they are? The unconditional jump, JMP.

Notice the example ends with two JMPs. This caused me lots of stress because I didn't know why they were there. I'm now convinced this is one of the rules of compiling NCS files: control statements end with a JMP. The if statement is easily identified as such because the second JMP goes to the same offset as the JZ. The true block of an if-else statement ends in a JMP to an offset greater than that of the JZ. The if and else-if statements all end in JMPs to the same offset. Loops all end in JMPs to an offset less than the offset of the JZ.

Logical AND and Logical Or are not control statements, so they do not end in a JMP.

Our example has two JMPs. Why? Most likely the original compiler was dumb. The NWScript return statement is translated into a JMP to the end of the subroutine, offset 296. The opcode at this offset is RETN, the last opcode in a subroutine. Anytime you see a return in NWScript that isn't the last statement in the subroutine, you know it will be implemented as a JMP:

    if ( nKillMode == 0 )
    {
        return;
    }

    if ( nKillMode == 1 )
    {
        return;
    }

    if ( nKillMode == 2 )
    {
       return;
    }

is

118	JZ 170
158	JMP 296
164	JMP 170

186	JZ 230
218	JMP 296
224	JMP 230

246	JZ 290
278	JMP 296
284	JMP 290

296	RETN

The original compiler just stuck the body of the if statement in between the JZ and JMP. A modern compiler would have removed the second JMP since it's unreachable code.

OK, this is great and all. But what about the reverse compiler? Well, now that I have the rules down I'm close to the finish line. The reverse compiler accurately finds ranges and determines what types they are. The output looks like this:

This is definitely a K2 file

Subroutine 13, 21
13, 21		//Subroutine

Subroutine 21, 298
21, 298		//Subroutine
124, 170	//if
192, 230	//if
252, 290	//if

Subroutine 298, 1345
298, 1345	//Subroutine
320, 396	//if
704, 722	//logical AND or OR
728, 1216	//loop
787, 805	//logical AND or OR
811, 1135	//if
914, 970	//if-else
1017, 1043	//if-else

Subroutine 1345, 1601
1345, 1601	//Subroutine
1459, 1553	//if
1511, 1547	//if

Subroutine 1601, 1637
1601, 1637	//Subroutine

Subroutine 1637, 1813
1637, 1813	//Subroutine
1659, 1727	//if

There's more to be done, particularly for loops because they support the break and continue statements (also implemented with JMP). But, hard part seems to be behind me. Should have an update in two weeks.

AmanoJyaku · January 28, 2021

I promised an update in two weeks... How about two days?

The approach of using ranges has borne fruit. Consider the following code:

int subroutine(int I, float F, vector V)
{
    int i = 1;
    while(true)
    {
      	if(i > 10)
        {
		//do stuff
        }
        else
        {
		return -1;
        }
    }
    return 0;
}

The reverse compiler needs to be able to find the return type and parameter types. To do this, it:

Counts the number of local variables in Part 1
Counts the number of variables returned in Part 2
Destroys the local and returned variables in Part 3
Verifies the stack is empty, and assumes the count being destroyed is the number of input arguments in Part 4

Using the above code as an example, the method outlined in the post from two days ago effectively ignores everything inside the while loop. So:

1 local variable is counted, created by the RSADDI opcode
1 return variable is counted, created by the CPTOPSP opcode and copied by the CPDOWNSP opcode
1 local variable and 1 returned variable are destroyed by the MOVSP opcode
5 input arguments are destroyed by the MOVSP opcode (vectors are three floats)

Here's the output of the KoTOR2 file k_contain_unlock.ncs:

This is definitely a K2 file

Subroutine 13, 21
JSR

Subroutine 21, 995
JSR

Subroutine 995, 1122
Return variables: 0     Local variables: 0      Parameters: 0

Subroutine 1122, 1766
Return variables: 0     Local variables: 0      Parameters: 3

Subroutine 1766, 2948
Return variables: 1     Local variables: 0      Parameters: 2

Subroutine 2948, 6915
JSR

Subroutine 6915, 7116
Return variables: 1     Local variables: 0      Parameters: 1

A subroutine with the status "JSR" means it depends upon another subroutine to be reverse compiled first:

Sub13 calls Sub21
Sub21 calls Sub995
Sub2948 calls Sub6915

Here's what the other status codes mean:

Return variables: 0 - returns void; 1 - returns a single value, e.g. int, float, string or object; a value of 2 or more returns a struct or vector
Local variables: 0 - should always be 0, because it's calculated after the stack is cleared; non-zero value means the analysis is flawed
Parameters: 0 - there are no input parameters; 1 or more means there are input parameters

So, what are the subroutines?

Sub13 - void _start() - the hidden routine
Sub21 - void _global() - the routine that creates global variables
Sub995 - int main()
Sub1122 - void PlaceTreasureDisposable(object oContainer = OBJECT_SELF, int numberOfItems = 1, int nItemType = 900)
Sub1766 - string GetTreasureBundle (int nItemLevel, int nItemType = 0)
Sub2948 - string GetBundlePrefix (int nItemLevel, int nItemType)
Sub6915 - string SWTR_GetQuantity(int iCount)

Still targeting an update in two weeks. Or earlier.

AmanoJyaku · February 3, 2021

The Phantom Bug

Spoiler

Given the progress in the last few updates, I figured I'd make an attempt at reverse compiling to see what issues might be lurking. It's easy enough to reverse this:


18344	CONSTI 15
18350	CONSTO object
18356	int GetHitDice
18361	ADDxx

That's 15 + GetHitDice(OBJECT_SELF).

Similarly, the JSR issue is easily dealt with:


RSADDx
JSR
JSR

If we determine the signature of the first JSR returns 1 variable and has 0 parameters, then we know the code is JSR(JSR()).

It makes sense to use the previously discussed strategy of skipping ranges to reverse compile. And the reverse compiler plowed through all of the files... Then suddenly got stuck on one.

Attack of the Cases

Spoiler

It was a file from the M478 mod, c_m478_globl_1.ncs. It wasn't a problem before, so it was clearly related to the output of the range finding algorithm. But what? Let's look at a switch statement:


switch (var)
{
  case 0:
    //more code
    break;
  case 1:
    //more code
    break;
  case 2:
    //more code
   break;
  default:
    //more code
    break;
}

The case labels in NCS look like this:


CPTOPSP -1 1
CONSTI 0
EQUALxx
JNZ offset_1

CPTOPSP -1 1
CONSTI 1
EQUALxx
JNZ offset_2

CPTOPSP -1 1
CONSTI 1
EQUALxx
JNZ offset_3

JMP offset_4

offset_1
//more code
offset_2
//more code
offset_3
//more code
offset_4
//more code

The difference between JZ and JNZ is more than just jumping based on 0 or not 0. Since JNZ is only used for switch case labels, the offsets are the first opcodes in the case statements. What's important to note is that, unlike JZ, JNZ does not tell you the end of the range. To find the end, you need to look at the next JNZ. The next JNZ's start is the previous JNZ's end.

Which begs the question: how do you find the end of the last JNZ? It's the offset specified by the JMP. That's the default case. And as you can see, we don't know where it ends, either. In fact, it's not even a guarantee that the switch has a default case at all!

Revenge of the Switch

Spoiler

The JMP is always present, so we need another way to determine if there is a default case. You can't do it easily be examining the code starting from the JMP offset, so I tried another way. Each case label that ends with a break statement is implemented as a JMP to the end of the switch. So, I read through each of the case ranges for the first one to end with a JMP. Why the first one? Because all JMPs go to the end, so there's no need to check all of them. However, case statements aren't required to end with a break statement (this is known as "fall through"), so those that don't should lack a JMP.


switch (var)
{
  case 0:	//falls through to case 1, case 2, and default case
  case 1:	//falls through to case 2 and default case
  case 2:	//falls through to the default case 
  default:
    break;
}

And there was the flaw in my logic. What I never considered was a case statement that falls through, but also includes code that ends in a JMP. What does that?


switch (var1)
{
  case 0:
    if (var2){/*more code*/}
  case 1:
    if (var3){/*more code*/}
  case 2:
    if (var3){/*more code*/}
  default:
    if (var4){/*more code*/}
    break;
}

Remember, if, else-if, while, for, and do-while all end with a JMP. The loops jump backward, if and else-if jump to the next op. The next op is the start of the next case offset. And that offset was set as the end of the default case's range!!!

As it turns out, I was testing a more efficient method of traversing the inside of a subroutine that relied on input data being perfect. My earlier traversal code read all ops, and skipped them if they were in the ranges of if, else-if, else, while, for, and do-while statements. It was slow, but it always moved forward. The new code specifically looks for child ranges, reads up to the start of those ranges, then jumps to the end of those ranges:


Sub()
{
	//Read this
	if()
	{
		//Skip this
		while()
		{
			//Which means this is skipped, too
		}
	}
	//Resume reading from here
}

The new traversal code is correct, but the default case range was incorrectly defined as ending with a jump backward because of the combination fall through, an if statement, and the crappy method of using the switch cases to find the end of the switch statement.

The potential solution? Remove identification of switch statement ranges until a later point in analysis that includes the stack. Can't complain, I suppose, since stack analysis is part of the phase that produces text output. I'm not adding anything in, I'm just moving things around. It's just pure luck that I caught this bug due to the most unlikely combination of things that are rarely done.

AmanoJyaku · February 7, 2021

I fixed the switch bug and decided to try the first step of reverse compiling: finding statements. For example, the following has six statements:

//c_003handfight

int StartingConditional()
{
    string tString = GetScriptStringParameter();
    int tInt = GetScriptParameter( 1 );
    int LevelInt = GetScriptParameter( 2 );

    if ( ( GetGlobalNumber(tString) == tInt ) && ( GetGlobalNumber ("G_PC_LEVEL") < LevelInt ) )
    {
        return TRUE;
    }
    return FALSE;
}

The first three are declaration statements, they introduce entities into the StartingConditional subroutine scope. In C, variables and subroutines (functions) are entities. Entities have names and types. In an NCS file, subroutines are found by the JSR opcode and variables are created by the RSADDx opcode.

The return statement is one of the jump statements, which includes the break and continue statements. All three are implemented by the JMP opcode.

The if statement is one of the selection statements, which includes the switch statement. The if statement is also a compound statement or block, meaning it can contain multiple statements.

The previous updates detailed my efforts to correctly identify the various types of statements. Here's what the reverse compiler produces:

This is definitely a K2 file

Sub13, 23
13 adds 1
15 Sub23 removes 0 returns 1
21


Sub23, 250
23 adds 1
25 GetScriptStringParameter adds 1
30
38 removes -1

44 adds 1
46 adds 1
52 GetScriptParameter removes -1 adds 1
57
65 removes -1

71 adds 1
73 adds 1
79 GetScriptParameter removes -1 adds 1
84
92 removes -1

98 adds 1
106 GetGlobalNumber removes -1 adds 1
111 adds 1
119 removes -1
121 adds 1
129 removes -1
135 adds 1
149 GetGlobalNumber removes -1 adds 1
154 adds 1
162 removes -1
164 removes -1
166 if()  removes -1

210 adds 1
216
224 removes -4

230
248

Sub13 is the hidden _start() function. This file is a conditional script used in dialogs, so it returns a value. This is the file you use to practice fighting with Handmaiden, and she won't fight you if you aren't good enough. The value returned is obtained from Sub23 and stored in the space reserved by the RSADDx opcode at offset 13 (written above as "add 1").

Sub23 is int StartingConditional(), which is what is required for conditional scripts. The alternative is void main() for non-dialog scripts. If this was a non-dialog script offset 13 would not be a reservation. It would be Sub21, because the JSR opcode is six bytes and the next opcode is the two-byte RETN (13 + 6 + 2 = 21).

Offsets 23-38 are the a declaration statement, string tString = GetScriptStringParameter(). You see three elements added to the stack, and two removed. Offset 30 is the CPDOWNSP opcode. Blank lines mean elements have not been added to or removed from the stack. Instead, CPDOWNSP copies the value returned by GetScriptStringParameter() to the element reserved by offset 23, string tString. CPDOWNSP is the assignment operator, "=". Offset 38 then destroys the value returned by GetScriptStringParameter() since it is no longer needed. In C this is known as discarding.

The key to writing a reverse compiler is finding these discarded values. The MOVSP opcode discards values, so it's a safe bet that a MOVSP opcode is the end of a statement. In the above example, offsets 38, 65, and 92 are discarding values. You can think of MOVSP as the semicolon at the end of the statement. Offset 164 is the logical AND, 98-162 are the operands. Since we have already found the ranges created by JZ and JNZ, we already know the boundaries of those block statements. Offset 166 is the if () statement, which ends at offset 210. And blocks are surrounded by curly braces.

Offset 210 places the return value onto the stack. This is after the if () statement, so it's value is FALSE. It's then returned at offset 216 using CPDOWNSP to the reservation created at offset 13. Offset 224 then destroys the three local variables and the return value using MOVSP. Finally, the RETN opcode should be at offset 230, but for some reason it's a JMP to 248. (I swear, I hate the NCS compilers.) Offset 248 has the RETN. Neither the JMP or RETN would be printed.

TL;DR

I just have to remove diagnostic info, group expressions into statements, and ensure the branching statements and nested statements are handled correctly. Look forward to another update soon.

AmanoJyaku · February 8, 2021

Stripped out some of the diagnostic output and added some source code output. Here's the original Handmaiden conditional script again:

//c_003handfight

int StartingConditional()
{
    string tString = GetScriptStringParameter();
    int tInt = GetScriptParameter( 1 );
    int LevelInt = GetScriptParameter( 2 );

    if ( ( GetGlobalNumber(tString) == tInt ) && ( GetGlobalNumber ("G_PC_LEVEL") < LevelInt ) )
    {
        return TRUE;
    }
    return FALSE;
}

Here's the output from the reverse compiler:

//Kotor2\c_003handfight.ncs
This is definitely a K2 file

Sub13, 23
}

Sub23, 250
38 string = GetScriptStringParameter();
65 int = GetScriptParameter(1, );
92 int = GetScriptParameter(2, );
166 if (GetGlobalNumber(CPTOPSP, ) == CPTOPSP && GetGlobalNumber("G_PC_LEVEL", ) < CPTOPSP)
{
}
224 ;
}

Recognizable!

Here's the output of k_inc_npckill.ncs:

Spoiler


//Kotor2\k_inc_npckill.ncs
This is definitely a K2 file

Sub13, 21
13 Sub21()
}

Sub21, 298
42 int = GetScriptParameter(1, );
69 int = GetScriptParameter(2, );
96 int = GetScriptParameter(3, );
118 if (CPTOPSP == 0)
{
}
186 if (CPTOPSP == 1)
{
}
246 if (CPTOPSP == 2)
{
}
290 ;
}

Sub298, 1345
314 if (CPTOPSP > 0)
{
}
412 int = 15;
434 int = 0;
463 location = GetLocation(CPTOPSP, );
492 float = GetFacing(CPTOPSP, );
525 vector_z = GetPositionFromLocation(CPTOPSP, );
563 vector_z = GetPositionFromLocation(CPTOPSP, ) = CPTOPSP + 1.000000;
600 location = Location(CPTOPSP, CPTOPSP, CPTOPSP, CPTOPSP, );
671 object = GetFirstObjectInShape(4, 4.000000, CPTOPSP, 0, 65, 0.000000, 0.000000, 0.000000, );
722 while(GetIsObjectValid(CPTOPSP, ) && CPTOPSP > 0)
{
}
1253 ApplyEffectAtLocation(0, EffectVisualEffect(3003, 0, ), CPTOPSP, 0.000000, )
1289 AssignCommand(CPTOPSP, )
1326 DestroyObject(CPTOPSP, 0.000000, 1, 0.000000, 0, )
1331 ;
1337 ;
}

Sub1345, 1601
1374 int = GetHasFeat(56, CPTOPSP, );
1409 int = GetHasFeat(57, CPTOPSP, );
1431 int = 0;
1453 if (CPTOPSP == 1)
{
}
1569 ;
1593 ;
}

Sub1601, 1637
1623 Sub298(CPTOPSP, CPTOPSP, 0, )
1629 ;
}

Sub1637, 1813
1653 if (CPTOPSP > 0)
{
}
1760 effect = EffectDeath(0, 0, 1, );
1794 ApplyEffectToObject(0, CPTOPSP, CPTOPSP, 0.000000, )
1799 ;
1805 ;
}

And k_contain_unlock:

Spoiler


//Kotor2\k_contain_unlock.ncs
This is definitely a K2 file

Sub13, 21
13 Sub21()
}

Sub21, 995
37 int = 0;
59 int = 1;
81 int = 2;
103 int = 3;
125 int = 4;
147 int = 5;
169 int = 6;
191 int = 7;
213 int = 8;
235 int = 9;
257 int = 10;
279 int = 11;
301 int = 12;
323 int = 13;
345 int = 14;
367 int = 15;
389 int = 16;
411 int = 17;
433 int = 18;
455 int = 19;
477 int = 1100;
501 int = -6;
525 int = -5;
549 int = -4;
573 int = -2;
597 int = -1;
619 int = 0;
641 int = 1;
663 int = 1;
685 int = 2;
707 int = 3;
729 int = 4;
751 int = 5;
773 int = 6;
795 int = 7;
817 int = 8;
839 int = 9;
861 int = 10;
883 int = 11;
905 int = 12;
927 int = 13;
949 int = 14;
971 int = 15;
979 Sub995()
987 ;
}

Sub995, 1122
1011 object = 0;
1038 if (!GetLocalBoolean(CPTOPSP, 55, ))
{
}
1114 ;
}

Sub1122, 1766
1143 if (!GetLocalBoolean(CPTOPSP, 57, ))
{
}
1758 ;
}

Sub1766, 2948
1782 int = 0;
1819 int = GetGlobalNumber("G_PC_LEVEL", );
1839 string = "";
1890 if (CPTOPSP == 990 && !IsAvailableCreature(7, ))
{
}
1938 else if(CPTOPSP == 0)
{
}
2053 else if(CPTOPSP % 100 == 0)
{
}
2347 else if(CPTOPSP % 10 == 0)
{
}
2916 ;
2940 ;
}

Sub2948, 6915
2964 int = 1;
2987 string = GetModuleName();
3363 if (CPTOPSP == "851NIH" || CPTOPSP == "852NIH" || CPTOPSP == "901MAL" || CPTOPSP == "902MAL" || CPTOPSP == "903MAL" || CPTOPSP == "904MAL" || CPTOPSP == "905MAL" || CPTOPSP == "906MAL")
{
}
3636    case label
3658    case label
3680    case label
3702    case label
3724    case label
3746    case label
3768    case label
3790    case label
3812    case label
3834    case label
3856    case label
3878    case label
3900    case label
3922    case label
3944    case label
3966    case label
3988    case label
4010    case label
4032    case label
4054    case label
4076    case label
4098    case label
4120    case label
4142    case label
4164    case label
4186    case label
4208    case label
4230    case label
6843 string = GetModuleName();
6883 ;
6907 ;
}

Sub6915, 7116
6938 string = IntToString(CPTOPSP, );
6959 string = "0";
6981 int = 4;
7010 while(GetStringLength(CPTOPSP, ) < CPTOPSP)
{
}
7084 ;
7108 ;
}

Rough, but getting there!

AmanoJyaku · March 5, 2021

I wanted to give an update a few weeks ago, but I had to tackle a problem I ignored for months. The reverse compiler is now more than 5,000 lines in length, and that makes it difficult to maintain. So, almost 4,000 lines of source code have been moved into a library. Additionally, many functions did too much and have since been split into smaller functions or replaced entirely. It's not sexy, but coding never is.

Or is it?

Cue: Salt-N-Peppa's "Let's talk about sex"

Let's talk about Hex, baby!
Let's talk about Binary!
Let's talk about Octal numbers,
Signed and unsigned, number theory!

OK, OK, I'm back to "normal". Anyway, after three weeks of maintenance I got back to the real work and decided to rewrite the remaining 1,000+ lines entirely. Everything from reading an NCS file from disk to producing NWScript is new. And so much better than before, since I eliminated some bugs in the code. I'll provide a more detailed post later, but here's some sample output from TSL's k_inc_npckill.ncs:

Spoiler


21, 298()
{
        int = GetScriptParameter(1);
        int = GetScriptParameter(2);
        int = GetScriptParameter(3);
}

298, 1345()
{
        int = 15;
        int = 0;
        location = GetLocation(Argument);
        float = GetFacing(Argument);
        float = GetPositionFromLocation(location = GetLocation(Argument));
        float = GetPositionFromLocation(location = GetLocation(Argument));
        float = GetPositionFromLocation(location = GetLocation(Argument)) = float = GetPositionFromLocation(location = GetLocation(Argument)) + 1.000000;
        location = Location(vector, float = GetFacing(Argument));
        object = GetFirstObjectInShape(4, 4.000000, location = GetLocation(Argument), 0, 65, vector);
        Destroying 3 arguments
}

1345, 1601()
{
        int = GetHasFeat(56, Argument);
        int = GetHasFeat(57, Argument);
        int = 0;
        return int = 0;
        Destroying 1 arguments
}

1601, 1637()
{
        Destroying 2 arguments
}

1637, 1813()
{
        effect = EffectDeath(0, 0, 1);
        Destroying 2 arguments
}

I purposely left out nested scopes because I wanted to focus local variables. This script is particularly interesting because it contains a vector. See the really long line that starts with float = GetPositionFromLocation? That's one of the vector's members. The source code looks like this:

Spoiler


vector vPos = GetPositionFromLocation( oLoc );

vPos.z = vPos.z + 1.0f ;

The really long line in the sample output essentially merges the two lines above, but if you look closely you'll see it's the combination of creating vPos.z, adding 1.0f to vPos.z, then storing the result back to vPos.z. Here's a cut-and-paste from the NSS file to compare to the sample output:

Spoiler


int nDC = 15;
int nDCCheck = 0;

location oLoc = GetLocation(oCreature);
float oOri = GetFacing(oCreature);
vector vPos = GetPositionFromLocation( oLoc );

vPos.z = vPos.z + 1.0f ;
location oExplosionLoc = Location( vPos, oOri );
object oTarget = GetFirstObjectInShape(SHAPE_SPHERE, 4.0, oLoc, FALSE, 65);

Now that I've confirmed the stack is being created correctly, the next tasks are to:

Add variable declarations, e.g. int nParam, object oParam
Properly display modification of variable values, e.g. vPos.z = vPoz.z + 1.0f
Insert nested scopes (if-else, while, etc...)
Probably other stuff I can't think of right now

AmanoJyaku · March 17, 2021

Just a quick update. Here's output from TSL's k_contain_unlock.ncs:

Spoiler


//void main()
0 995, 1122(0)
{
        object Object0;
        Object0 = 0
        if (!GetLocalBoolean(Object0, 55))
        {
                SetLocalBoolean(Object0, 55, 1);
                Sub1122(Object0, Random(8) + 1, 920);
        }
}

//void PlaceTreasureDisposable(object oContainer = OBJECT_SELF, int numberOfItems = 1, int nItemType = 900)
0 1122, 1766(3)
{
        if (!GetLocalBoolean(CPTOPSP, 57))
        {
                SetLocalBoolean(CPTOPSP, 57, 1);
                int Int0;
                int Int1;
                Int1 = GetGlobalNumber("G_PC_LEVEL")
                if (CPTOPSP < 900)
                {
                }
                int Int2;
                Int2 = Int1 + Random(6) - 4
                if (Int2 < 1)
                {
                        Int2 = 1
                }
                if (Int2 > 30)
                {
                        Int2 = 30
                }
                Int0 = 1
                while (Int0 <= CPTOPSP)
                {
                        string String3;
                        string String4;
                        int Int5;
                        Int5 = 1
                        int Int6;
                        String3 = Sub1766(Int2, CPTOPSP)
                        Int6 = FindSubString(String3, "[")
                        if (FindSubString(String3, "[") >= 0)
                        {
                                Int5 = StringToInt(GetSubString(String3, Int6 + 1, 4))
                                String4 = GetSubString(String3, 0, Int6)
                        }
                        else
                        {
                                String4 = String3
                        }
                }
        }
        Removing -3 arguments
}

Overall, the results look good! Though, there are things that need to be dealt with:

Subroutines only list the number of return values and arguments; type information needs to be substituted, e.g. void Sub1122(object, int, int)
Arguments in expressions are listed as CPTOPSP; this needs to be replaced with a variable name and argument position, e.g. if (!GetLocalBoolean(Arg0, 57))
Variable declarators with initializers are written as two lines; while correct, these should be merged, e.g. object Object0 = 0;
Arguments that are assigned to need to be printed, e.g. if (Arg2 < 900) Arg2 = 0;
For loops are presented as while loops; while correct, for (Int0 = 1 ; Int0 <= Arg1; ++Int0) is more compact than
1. Int0 = 0;
2. while (Int0 <= Arg1)
3. ++Int0
Increment and decrement are not yet printed (this is trivial, I just didn't do it)
Printing expressions needs to take place at a different point in the process; the current method prints some expressions twice, and omits others
Switch statements need to be printed
Vectors and structs need to be detected, which will require multiple methods
Return statements need to be printed
Break and continue statements in loops need to be detected
Arguments of type "action" (AssignCommand, DelayCommand, ActionDoCommand) need to be printed

These are largely minor issues, but they're taking a backseat to a bug I introduced along the way:

Spoiler


//string GetTreasureBundle (int nItemLevel, int nItemType = 0)
1 1766, 2948(2)
{
        int Int0 = 0;
        int Int1;
        int Int2 = GetGlobalNumber("G_PC_LEVEL")
        string String3 = ""
        if (Arg1 == 990 && !IsAvailableCreature(7))
        {
        }
        if (Arg1 == 0)
        {
                Int1 = Random(0) + 9
                String3 = Sub1766(CPTOPSP, Int1 * 100)
        }
        else if (String3)
        {
                Int0 = Random(CPTOPSP) + 1
                Int2 = Sub1766(CPTOPSP, CPTOPSP + 10 * Int0)
        }
        else if (Int2)
        {
                Int1 = Sub2948(CPTOPSP, CPTOPSP + CPTOPSP)
        }
        else
        {
                Int1 = Sub2948(CPTOPSP, CPTOPSP)
        }
        Removing -5 arguments
}

The else-if statements have the wrong test condition. They should all be pointing at the same variable as the preceding if statement, Arg1. This error is propagated further as reflected in the last line. Removing 5 arguments? There are only 2!!! Printing code was added yesterday, so I expect this to be a quick fix.

AmanoJyaku · April 21, 2021

Sorry for the delay. I wanted to put out an update, but I've been addressing bugs. More important, I'm dealing with real life and needed to take some time for myself. This project has consumed most of my free time, I haven't played a game since February of last year. I'll pick this up next week, but for now I'm playing some games!

AmanoJyaku · May 18, 2021

Sometimes, I learn more from mistakes than I do from getting it right the first time...

My last post mentioned bug fixes. In an earlier post I described the structure of an NCS subroutine:

Step 1 - The code that does the stuff you want
Step 2 - Placing a return value on the stack
Step 3 - Destroying local variables and temporaries on the stack
Step 4 - Destroying arguments on the stack

None of these steps are required, I'm almost certain I've seen an empty subroutine that simply returned. That means a missing step cannot impede the analysis of successive steps. So, how do we analyze the following?

Subroutine 1601
1601	CONSTI 0
1607	CPTOPSP -3 1
1615	CPTOPSP -3 1
1623	JSR 298
1629	MOVSP -2
1635	RETN

There is a single path through the code, that is there are no jumps so the ops execute in sequence. (Technically, the JSR is a jump. However, subroutine calls return to the op from which they were called and proceed to the next op in the sequence.) So, how does the stack change? The first three ops each add a value to the stack. The fourth op, a subroutine call, removes those three values from the stack. The subroutine returns void, i.e. nothing, so no return value is placed on the stack. And yet, the fifth op removes values from the empty stack...

This tells us a few things:

The fourth op is the last in Step 1
Subroutine 1601 does not place a value onto the stack after Step 1, so there is no Step 2
There are no local variables and no temporary values on the stack, so there is no Step 3
Therefore, the fifth op must be Step 4
Subroutine 1601 takes two arguments and returns void, i.e. void Sub1601(Arg0, Arg1)

Let's look at a more complicated example:

Subroutine 87299
87299   CPTOPSP -1 1
87307   CPTOPBP -171 1
87315   EQUALxx
87317   JZ
89610   CONSTx "NO COMBO SELECTED"
89631   CPDOWNSP -3 1
89639   MOVSP -1
89657   MOVSP -1

The first two ops each place a value onto the stack. The third removes those two values, and replaces them with a boolean (actually, an integer) result. The fourth removes the result, then performs a conditional jump. To simplify things, I've removed the conditional code branches. At this point, the stack is empty.

The fifth op places a string onto the stack. The sixth copies the string down three positions from the top of the stack. However, there is only one value on the stack so this may be a return... The seventh op clears the stack. The eighth tries to remove a value from the empty stack. So, what does this tell us?

The fourth op is the last in Part 1
Ops five and six may be Part 2
The seventh op is Part 3
The eighth and final op is Part 4

You'll notice the repeated use of "may". We should not assume all writes outside of the subroutine's stack are returns. Subroutine arguments also exist outside of the subroutine's stack, and this turned out to be the source of the bug. One of the subroutines analyzed wrote to its argument, and this was mistakenly marked as a return value. Thus, a subroutine that returned void, i.e. nothing, but also modified an argument messed up the analysis of any other subroutine that called it.

The fix was simple enough: record the last write outside of the stack, along with the difference between the negation of destination and size of the stack. In the example above, the destination is -3 and the stack size is 1: -(-3) - 1 yields 2. When analyzing Part 4, the destruction of subroutine arguments, we compare the result to the size of the negation of number of arguments being destroyed: 2 vs -(-1), or 2 > 1. This tells us the final write is to a position greater than the furthest argument, thus it must be a return value. Subroutine 87299 takes one argument and returns one value. In short, we have to analyze Part 4 before we can analyze Part 2.

Fixing bugs wasn't the only benefit. It's removed from the example, but there is an unconditional jump after the first MOVSP that transfers execution to the second MOVSP. This jump bypasses an op, creating dead code. I seem to have finally discovered the source of dead code: return statements. This also solves a different problem: I now know how to identify return statements in different parts of the subroutine. More in a future update!

AmanoJyaku · May 30, 2021

"Might as well jump!" - Van Halen

The last update mentioned the mystery of the unconditional jump at the end of a subroutine, and explained it's a return statement. Let's look at it in a bit more detail.

7054	CONSTS [
7059	CPTOPSP -4 1
7067	ADDxx
7069	CONSTS ]
7074	ADDxx
7076	CPDOWNSP -6 1
7084	MOVSP -4
7090	JMP 7108
*7096	MOVSP -1*	//Dead code
*7102	MOVSP -3*	//Dead code
7108	MOVSP -1
7114	RETN

Why is this happening? It's the result of the compiler building the subroutine in stages. Let's separate the op codes into subroutine Parts:

Part 2
7054	CONSTS [
7059	CPTOPSP -4 1
7067	ADDxx
7069	CONSTS ]
7074	ADDxx
7076	CPDOWNSP -6 1

Part 3 (New)
7084	MOVSP -4
7090	JMP 7108
7096	MOVSP -1

Part 3 (Original)
7102	MOVSP -3

Part 4
7108	MOVSP -1
7114	RETN

Oho! This subroutine returns a value (Part 2), clears local variables (Part 3), and clears arguments (Part 4). But, why are there two Part 3's?

Top to bottom, the stack looks like this:

Temporary return values
Local variables
Arguments
Return values

The compiler first writes the code for clearing the arguments (Part 4), which is easy because there is a fixed number. Then, it writes code for clearing the local variables (Part 3 - Original). In the example, the subroutine has 3 local variables. Next, the compiler calculates the number of values returned in Part 2 and adds that to the number of local variables. The result is 3 + 1. The compiler does not remove the original Part 3, so it's necessary to jump over it (Part 3 - New).

So, why is there a MOVSP after the jump? As far as I can tell, it's a quirk of the compiler. Part 2 returns a value by evaluating a set of expressions. The result is then copied down the stack past the arguments. The result is then cleared in the new Part 3. However, the compiler inserts a MOVSP after the jump that would have cleared the result of the last expression.

That leaves us with two MOVSPs that are jumped over: one to clear the local variables, and one to clear the last expression (in this case, the return value). I'm rewriting my code to take advantage of this. Instead of calculating all of the changes made to the stack in order to find the subroutine signature, it could be easier to look for a jump and evaluate the code around it. The question is how to handle subroutines that don't return values, and therefore lack the jump...

This newfound knowledge is useful for another reason: I now know how to properly analyze switch statements. More in a future post.

AmanoJyaku · June 1, 2021

I wrote my first NWScript in order to test the findings from the previous post. The script:

Spoiler


vector test()
{
	int I = 0;
	return GetPositionFromLocation(GetLocation(OBJECT_SELF));
}

void main()
{
	vector V = test();
}

I specifically chose to add a single local variable and return a vector (struct of three floats) to prove the results. Let's break down the test subroutine:

Spoiler


Part 1
61      RSADDx
63      CONSTx
69      CPDOWNSP -2 1
77      MOVSP -1
Part 2
83      CONSTx
89      1 GetLocation(1)
94      3 GetPositionFromLocation(1)
99      CPDOWNSP -7 3
Part 3 (New)
107     MOVSP -4
113     JMP 131
119     MOVSP -3
Part 3 (Original)
125     MOVSP -1
131     RETN

Part 1 is where the local variable I is declared and initialized. Part 2 is where the vector is returned. Part 3 is where things get interesting.

Part 3 (Original) clears the local variable. Part 3 (New) clears the return values and local variables. Since Part 3 (Original) is not removed there is a jump over it in Part 3 (New). That jump is followed by the compiler quirk, the MOVSP that clears the last expression from the stack. That expression is the result of GetPositionFromLocation(), a vector.

It's good to finally understand this weird JMP/MOVSP business, but this is awesome for two other reasons. The first is code analysis can be simplified to look for this JMP/MOVSP combination, it's no longer necessary to calculate stack changes for each op code in order to determine if a subroutine returns values. This eliminates the majority of bugs. It's also faster, but that's not as much of a concern.

The second reason is it fundamentally changes the reverse compiler construction. I'll explain more in an upcoming post, but the tl;dr is it's now possible to find the beginning and end of statements!

Spoiler


void _start()
//Hidden start function
//Essentially, main();
13      JSR 21()
19      RETN

void main()
//vector v = test();
21      RSADDx
23      RSADDx
25      RSADDx
27      RSADDx
29      RSADDx
31      RSADDx
33      JSR 61()
39      CPDOWNSP -6 3
47      MOVSP -3
//Not printed
53      MOVSP -3
59      RETN

vector test()
//int i = 0;
61      RSADDx
63      CONSTx
69      CPDOWNSP -2 1
77      MOVSP -1
//return GetPositionFromLocation(GetLocation(OBJECT_SELF));
83      CONSTx
89      1 GetLocation(1)
94      3 GetPositionFromLocation(1)
99      CPDOWNSP -7 3
107     MOVSP -4
//Not printed
113     JMP 131
119     MOVSP -3
125     MOVSP -1
131     RETN

Notice a pattern? 🤔

AmanoJyaku · June 6, 2021

So, I finally discovered a bug in nwnnsscomp... I think.

The purpose of a reverse compiler is to recreate the syntax of the high-level language. This can only be accomplished if we understand statements, expressions, and values.

Statement: HK-47 is ready to serve, master.

- HK-47

Spoiler

There are seven categories of statements in NWScript:
Declaration statements
   int I;
   float F;
   string S;
   etc...
Expression statements
   I = 0;
   ++I;
   PrintString("Print me!!!");
   etc...
Selection statements
   if (condition)
   switch (condition)
Iteration statements
   while (condition)
   for (condition)
   do while(condition)
Labeled statements
   case 0:
   default:
Jump statements
   break;
   continue;
   return;
Compound statements
   {
       //statement 0
       //statement 1
       //...
   }

Each statement is identified by locating an initiating op code, terminating op code, or combination of both. For example, variable declarations are a single op code, RSADDx. However, RSADDx is also used for return values of subroutines. If the Random engine function were written as a subroutine call it might look like this:

RSADDx
CONSTI 10
JSR <offset to subroutine>

Which means we identify RSADDx as a code of interest, but need to process the op codes that follow to be certain.

As it turns out, finding the terminator is usually easier than finding the initiator. But, why?

If you care for others, then dispense with pity and sacrifice and recognize the value in letting them fight their own battles.

- Kreia

Spoiler

The majority of statements in a program are expression statements, so it's necessary to understand expressions. In general, expressions produce values:

1 + 1;

What happens to the value 2? Before we get to that, consider that not all expressions produce values. Function/subroutine calls are expressions that can produce void, or no value:

//PrintString doesn't produce a value
void PrintString(string);

It's also important to understand the NCS runtime places variables and values onto the same stack. Thus:

int I = 0;
float F = 0.0;
1 + 1;
//more statements...

looks like this:
//Top of stack

2
0.0
0

//Bottom of stack

At least, it looks that way before proceeding to the statement after 1 + 1. A variable is essentially a value that persists until the end of its enclosing scope (the curly braces). All other values are guaranteed to be destroyed by the end of their statement. Thus, another term for an expression statement is discarded-value expression.

Battle is a pure form of expression.

- Handmaiden

Spoiler

When we reach the end of a statement any value that is not a variable is guaranteed to be destroyed. The MOVSP op code is what destroys the values of expression statements. This was illustrated in the previous update. It can also be seen in declarations with initializers:

RSADDx           //Place a variable on the stack
CONSTx   0       //Place a value on the stack
CPDOWNSP -2 1   //Assign the value to the variable (initialization)
MOVSP -1       //Destroy the value, we've reached the end of a statement

But, what about expressions that return void? Expressions can produce void, but they cannot accept void. Thus, a void expression ends a statement.

What about selection, iteration, and labeled statements? Their conditions are expressions. What happens to their resulting values?

int I = 0;
//The result of the comparison is destroyed by the if statement
if (I > 0)
{
}

Thus, by the time the program proceeds to a branch of a selection or iteration statement all values are destroyed and only variables are left on the stack. This isn't true for labeled statements due to the way switch statements work, but once you exit a switch entirely all values are destroyed.

tl;dr
We find statements in many ways, including detecting the destruction of expressions.

So, what's all this got to do with nwnnsscomp?

Spoiler

That thing I mentioned about labeled statements and the way switch statements work. Usually, destroying the condition of a switch statement is the very last thing to occur. The exception to this is when a switch contains a return statement:

switch (condition)
{
case 0:
   //CPDOWNSP here, e.g. Part 2 of subroutine
   //MOVSP -N here, e.g. Part 3 of subroutine
   return 0;
case 1:
   //Jumps to the MOVSP -1
   break;
}
//MOVSP -1 here to destroy condition

We've seen the analysis of returns, so this should be familiar. So, what's the bug? Consider the following:

while (condition)
{
   switch (condition)
   {
   case 0:
       break;
   case 1:
       continue;
   }
}

Case 1 exits the switch and jumps to the end of the while statement, so the switch condition should be destroyed before the jump. It's not. Nwnnsscomp leaves the condition on the stack, which means any reference to the stack is off by the number of times the continue is executed. There are very few continue statements in the game files, and none that I could find inside of a switch statement. But, if someone were to write perfectly valid code like this, well... kaboom!!!

//Top of stack

Condition
Condition
Condition
...
Variables

//Bottom of stack

Why am I not certain this isn't a bug? It clearly is, I'm just not certain which version of nwnnsscomp is the latest, and whether or not it's been fixed. I have three versions of the program, and I've only tested one.

DarthParametric · June 7, 2021

9 hours ago, AmanoJyaku said:

There are very few continue statements in the game files

Interesting. I can't claim to have gone through all the K1 scripts (not least because some still can't be decompiled), but in the course of working on K1CP have browsed one or two and I have never come across this one. I did a quick search through the global scripts and found a use of it in TSL's k_inc_force. Is it a TSL-only thing?

AmanoJyaku · June 7, 2021

8 hours ago, DarthParametric said:

Interesting. I can't claim to have gone through all the K1 scripts (not least because some still can't be decompiled), but in the course of working on K1CP have browsed one or two and I have never come across this one. I did a quick search through the global scripts and found a use of it in TSL's k_inc_force. Is it a TSL-only thing?

The continue statement is as old as the C language NWScript is derived from. It exists in K1, and probably NWN, it simply was never used. Another example is do-while; there is only one script out of 2,600+ that has a do-while statement. I haven't seen the ternary operator, the const keyword, or the bitwise operators, either.

AmanoJyaku · November 6, 2021

How is everyone? It's been a while... Several changes have been made to the reverse compiler, resulting in a rewrite from the ground up:

All Opcodes are now properly decoded, including:
1. RSADDx - Now able to differentiate variables from subroutine return values
2. JSR - Number of subroutine arguments and return values fully deduced
3. JZ - Now able to differentiate logical AND|OR expressions from if|while|for statements
Subroutine signatures are now partially deduced:
1. Number, but not types, of arguments and return values
2. To do: Collection of type information is dependent upon evaluation of statements
Proto-statements (I made up this term) are discovered:
1. Input for recreating complete statements (I made up this term, too)
2. int i = 0; is a complete statement
3. int i; is a complete statement, and a proto-statement of int i = 0;
4. i = 0; is a complete statement, and a proto-statement of int i = 0;
5. To do: Proto-statements must be combined to recreate complete statements
Error handling is slowly being added:
1. Has helped discover programming and algorithm bugs
2. Is meant to discover incorrect NCS files, which compilers can generate under certain conditions
3. To do: "Full" error handling won't happen in the first release

If anyone has questions I'll be happy to explain further. For now, I'm working to complete this before the two-year anniversary of starting this project. 😬

DarthParametric · November 6, 2021

4 minutes ago, AmanoJyaku said:

I'm working to complete this before the two-year anniversary of starting this project.

Cool. I'm looking forward to finally being able to decode some of those holdout scripts that still remain impervious to DeNCS's charms (like some of the Star Forge level 4 ones for example). If you manage to release something before the end of the year we can clear out the last of the remaining hijacked scripts in K1CP before v1.9 releases.

AmanoJyaku · November 10, 2021

On 11/6/2021 at 10:48 AM, DarthParametric said:

Cool. I'm looking forward to finally being able to decode some of those holdout scripts that still remain impervious to DeNCS's charms (like some of the Star Forge level 4 ones for example). If you manage to release something before the end of the year we can clear out the last of the remaining hijacked scripts in K1CP before v1.9 releases.

I'm not promising anything, but we'll see.

I don't know how many people are coders, but I came across a series of articles on creating your own compiler. The articles are largely useless for writing a reverse compiler like this WIP, but they give some insight into how early design choices impact development later on. In part 4 the author wanted to make one change to a feature added in part 2, yet it led to multiple changes because so much was dependent on that one change.

This has been my experience from the start: numerous rewrites from the ground up as I learned more about the NCS byte code. Finally, I'm past interpreting individual codes and on to interpreting statements and data structures. Statements are easy (except switch statements), but data structures will require some thought. Two different data structures could have the same member definitions (e.g. struct S1 {int i; float f;} and struct S2 {int i; float f;}), so I'll need logic to detect this. Finding structures is surprisingly easy, though.

tl;dr

The two obstacles left are switch statements and user-defined structs, and I'm nearly finished with switch statements.

http://visualstudiomagazine.com/articles/2014/05/01/how-to-write-your-own-compiler-part-1.aspx

http://visualstudiomagazine.com/articles/2014/06/01/compiler-basics-part-2.aspx

http://visualstudiomagazine.com/articles/2014/07/01/syntax-analysis.aspx

https://visualstudiomagazine.com/articles/2014/09/01/compiler-basics-how-to-part-4.aspx

AmanoJyaku · November 17, 2021

The last few updates were light on samples, so let's address that. TSL's k_inc_treasure.nss:

Spoiler


//turn a given quantity into appropriate format for a treasure blob (string)
string SWTR_GetQuantity(int iCount)
{
  string str = IntToString(iCount);
  string pad = "0";
  int length = 4;
  
  while(GetStringLength(str) < length)
  {
    str = pad + str;
  }
  return("[" + str + "]");
}

And the reverse compiler output:

Spoiler


string S;
<expression statement>;
string S;
<expression statement>;
int I;
<expression statement>;
<iteration statement>
<expression statement>;
<jump statement>;
<expression statement>;
<jump statement>;
<MOVSP>
<MOVSP>
<MOVSP>
<RETN>

And the two overlapped for comparison, with the reverse compiled script labeled as "RCScript":

Spoiler


string SWTR_GetQuantity(int iCount)
{
  [NWScript] string str = IntToString(iCount);
  [RCScript] string S;
  [RCScript] <expression statement>;
  
  [NWScript] string pad = "0";
  [RCScript] string S;
  [RCScript] <expression statement>;
  
  [NWScript] int length = 4;
  [RCScript] int I;
  [RCScript] <expression statement>;
  
  [NWScript] while(GetStringLength(str) < length)
  [RCScript] <iteration statement>
  {
    [NWScript] str = pad + str;
    [RCScript] <expression statement>;
  }
  [RCScript] <jump statement>;
  
  [NWScript] return("[" + str + "]");
  [RCScript] <expression statement>;
  [RCScript] <jump statement>;
  [RCScript] <MOVSP>
}
[RCScript] <MOVSP>
[RCScript] <MOVSP>
[RCScript] <RETN>

In some cases, the number of RCScript statements matches that of the original NWScript statements, e.g. the assignment expression statement "str = pad + str;".

[An earlier post referred to RCScript statements as proto-statements, and NWScript statements as complete statements. They're all terms I made up.]

In others, multiple RCScripts statements combine to form the NWScript statement. The simplest example is a variable declaration with an initializer: the initializer is an expression statement that immediately follows the declaration.

A more complicated example is the iteration statement, which is followed by a jump statement. Then, the body of the iteration statement (in the example, the assignment expression statement) is placed between them. The return statement is similar, however it has a slight quirk in that it starts off as regular expression statement that gets split in two. Then a jump statement is placed in between the two pieces, with the second being the original MOVSP. The first piece is modified with a new MOVSP that removes the result of the expression and all other values on the stack in order to clear the stack before exiting the subroutine.

However, the ultimate is the switch statement. It's so complicated it needs its own post. It has optional parts...

The final three RCScript statements are unprintable. The first clears all variables from the stack. However, this was already dealt with in the return statement. The jump statement inserted into the return statement jumps past this MOVSP and lands on the final MOVSP. Since the stack has already been cleared, this must be clearing arguments. And then the subroutine returns.

To Do:

Write code to generate expressions, e.g. ADDxx becomes <operand 1> + <operand 2>
Regroup RCScript statements back into NWScript statements
Integrate the value stack so variables can be referenced by unique names

AmanoJyaku · January 20, 2022

Been a while, time for an update!

Writing code isn't as "simple" as using a modeling tool like 3DS Max or Maya (is Lightwave 3D still a thing?) to create 3D characters. A modeling tool uses physics; imagine being the coder who has to create the physics. Sir Amano Newton has spent the past two years unraveling NCS "physics", because NCS "god" (BioWare) didn't leave us precise notes.

Let's look at we have to do to get started writing a reverse compiler:

Binary, Decimal, and Hexadecimal Output

Spoiler

Usually, input is a file from disk. It doesn't really matter where it comes from, all that matters is the NCS data is a contiguous sequence of bytes. What does that even look like?


1E00000000082000

The code above is readable, clearly a series of numbers and letters. However, NCS data files are stored in binary format. The binary bytes of a file have 256 possible values, but only 96 of those can be printed: A-Z, a-z, 0-9, space, return, tab, punctuation, etc... So, what about the other 160 values? They either show up as weird characters or blank spaces, or they do unexpected things unique to the reader software. You can't use a text editor to read binary files, and that means no printing raw binary data.

We need to convert the values to a printable representation. We could try a printable version of binary, but that looks like this:


//42 in binary
00101010

That's a single byte. Each 4-byte integer, most of the numbers in NCS, would be four times longer!

So, what about decimal representation? 42 is perfectly fine, but:


//Maximum signed 32-bit value
2147483647

//Minimum signed 32-bit value
-2147483648

Including the minus sign, that's 11 characters!

Hexadecimal is shorter:


//Maximum signed 8-bit value
7F, vs
127
//Maximum signed 16-bit value
7FFF, vs
32767
//Maximum signed 32-bit value
7FFFFFFF, vs
2147483647

Eliminating 1 or 2 characters from every 3-11 doesn't seem significant until you're reading thousands of characters...

Converting From Binary to Hex

Spoiler

OK, so we want to print in hex. How? I had to write my own code for this, I couldn't figure out the C++ formatting and conversion library.


//hex_str(uint8_t Quotient)

std::string Converted;

do
{
	switch (Quotient % 16u)
	{
	case 0u:
		Converted += '0';
		break;
	...
	case 15u:
		Converted += 'F';
		break;
	}
} while (0 < (Quotient /= 16u));

if (2u == std::size(Converted))
{
	std::reverse(std::begin(Converted), std::end(Converted));
}      
else
{
	Converted.insert(std::begin(Converted), '0');
}

return Converted;

Each byte is fed to the conversion function as the variable named Quotient. The value is repeatedly divided by 16, storing the result backwards into a string.

A single byte is represented by two hexadecimal characters (0x7F is (7 * A) + (F * 1), or (7 * 16) + (15 * 1)) . If the conversion yields a string with two characters, that string is reversed to place the values in the same order as the original file. If the conversion yields a string with just one character, a '0' character is placed in front, (e.g. F becomes 0F).

Printing

Spoiler

Printing is now as simple as looping through each byte, passing it to the conversion function, and adding the result to a string to be printed:


for (auto CurPos{ BeginFile + 13 }; _at_least(CurPos, EndFile, 1u); ++CurPos)
{
	Output += hex_str(static_cast<uint8_t>(*CurPos));
}

CurPos is the current position in the file, which advances to EndFile. The very first op code is always at offset 13. The _at_least function guarantees there are at least N bytes left, otherwise there's not enough to process the file. For C++ reasons, the value retrieved is converted from a std::byte to a uint8_t, then passed to the conversion function hex_str.

Which contains the output seen in our first snippet:


1E0000000008200002030403000000000101FFFFFFF800041B00FFFFFFFC02030403000000010101FFFFFFF800041B00FFFFFFFC02030403000000020101FFFFFFF800041B00FFFFFFFC02030403000000030101FFFFFFF800041B00FFFFFFFC02030403000000040101FFFFFFF800041B00FFFFFFFC02030403000000050101FFFFFFF800041B00FFFFFFFC02030403000000060101FFFFFFF800041B00FFFFFFFC02030403000000070101FFFFFFF800041B00FFFFFFFC02030403000000080101FFFFFFF800041B00FFFFFFFC02030403000000090101FFFFFFF800041B00FFFFFFFC020304030000000A0101FFFFFFF800041B00FFFFFFFC020304030000000B0101FFFFFFF800041B00FFFFFFFC020304030000000C0101FFFFFFF800041B00FFFFFFFC020304030000000D0101FFFFFFF800041B00FFFFFFFC020304030000000E0101FFFFFFF800041B00FFFFFFFC020304030000000F0101FFFFFFF800041B00FFFFFFFC02030403000000100101FFFFFFF800041B00FFFFFFFC02030403000000110101FFFFFFF800041B00FFFFFFFC02030403000000120101FFFFFFF800041B00FFFFFFFC02030403000000130101FFFFFFF800041B00FFFFFFFC020304030000044C0101FFFFFFF800041B00FFFFFFFC020304030000000619030101FFFFFFF800041B00FFFFFFFC020304030000000519030101FFFFFFF800041B00FFFFFFFC020304030000000419030101FFFFFFF800041B00FFFFFFFC020304030000000219030101FFFFFFF800041B00FFFFFFFC020304030000000119030101FFFFFFF800041B00FFFFFFFC02030403000000000101FFFFFFF800041B00FFFFFFFC02030403000000010101FFFFFFF800041B00FFFFFFFC02030403000000010101FFFFFFF800041B00FFFFFFFC02030403000000020101FFFFFFF800041B00FFFFFFFC02030403000000030101FFFFFFF800041B00FFFFFFFC02030403000000040101FFFFFFF800041B00FFFFFFFC02030403000000050101FFFFFFF800041B00FFFFFFFC02030403000000060101FFFFFFF800041B00FFFFFFFC02030403000000070101FFFFFFF800041B00FFFFFFFC02030403000000080101FFFFFFF800041B00FFFFFFFC02030403000000090101FFFFFFF800041B00FFFFFFFC020304030000000A0101FFFFFFF800041B00FFFFFFFC020304030000000B0101FFFFFFF800041B00FFFFFFFC020304030000000C0101FFFFFFF800041B00FFFFFFFC020304030000000D0101FFFFFFF800041B00FFFFFFFC020304030000000E0101FFFFFFF800041B00FFFFFFFC020304030000000F0101FFFFFFF800041B00FFFFFFFC2A001E00000000102B001B00FFFFFF54200002060406000000000101FFFFFFF800041B00FFFFFFFC0403000000370301FFFFFFF80004050002A70222031F000000004C0403000000010403000000370301FFFFFFF40004050002A803040300000398040300000008050000000104030000000114200301FFFFFFF400041E00000000141D00000000061B00FFFFFFFC20000403000000390301FFFFFFF80004050002A70222031F00000002670403000000010403000000390301FFFFFFF40004050002A803020302030405000A475F50435F4C4556454C05000244010101FFFFFFF800041B00FFFFFFFC0301FFFFFFEC00040403000003840F201F00000000200403000000000101FFFFFFE800041B00FFFFFFFC1D000000000602030301FFFFFFF800040403000000060500000001142004030000000415200101FFFFFFF800041B00FFFFFFFC0301FFFFFFFC00040403000000010F201F00000000200403000000010101FFFFFFF800041B00FFFFFFFC1D00000000060301FFFFFFFC000404030000001E0E201F000000002004030000001E0101FFFFFFF800041B00FFFFFFFC1D00000000060403000000010101FFFFFFF000041B00FFFFFFFC0301FFFFFFF400040301FFFFFFE8000410201F00000001340205020502030403000000010101FFFFFFF800041B00FFFFFFFC020302050301FFFFFFD400040301FFFFFFE400041E00000001140101FFFFFFEC00041B00FFFFFFFC040500015B0301FFFFFFEC000405000042020101FFFFFFF800040403000000000D201F000000006B0403000000040301FFFFFFF8000404030000000114200301FFFFFFE800040500004103050000E8010101FFFFFFF400041B00FFFFFFFC0301FFFFFFFC00040403000000000301FFFFFFE8000405000041030101FFFFFFF000041B00FFFFFFFC1D000000001C0301FFFFFFF000040101FFFFFFF000041B00FFFFFFFC0403000000000301FFFFFFF400040301FFFFFFD800040301FFFFFFE800040500001F041B00FFFFFFFC1B00FFFFFFF00301FFFFFFF400042403FFFFFFF01B00FFFFFFFC1D00FFFFFEC01B00FFFFFFF41D00000000061B00FFFFFFF4200002030403000000000101FFFFFFF800041B00FFFFFFFC020302030405000A475F50435F4C4556454C05000244010101FFFFFFF800041B00FFFFFFFC0205040500000101FFFFFFF800041B00FFFFFFFC0301FFFFFFE800040403000003DE0B200301FFFFFFFC00041F0000000015040300000007050002B801220306201F00000000200403000003D40101FFFFFFE400041B00FFFFFFFC1D00000000060301FFFFFFE800040403000000000B201F000000005B040300000000050000000104030000000914200101FFFFFFF000041B00FFFFFFFC02050301FFFFFFF0000404030000006416200301FFFFFFE400041E00FFFFFF130101FFFFFFF800041B00FFFFFFFC1D000000036D0301FFFFFFE8000404030000006418200403000000000B201F000000010E0301FFFFFFE8000404030000006417200301FFFFFFFC00040403000000090B2025000000000C1D000000007B040300000007050002B8011F00000000200403000000090101FFFFFFE800041B00FFFFFFFC1D000000004A0301FFFFFFF400040403000000060F201F00000000200403000000060101FFFFFFE800041B00FFFFFFFC1D000000001A0403000000080101FFFFFFE800041B00FFFFFFFC1D00000000061B00FFFFFFFC0301FFFFFFF00004050000000104030000000114200101FFFFFFF000041B00FFFFFFFC02050301FFFFFFE4000404030000000A0301FFFFFFE80004162014200301FFFFFFE400041E00FFFFFDED0101FFFFFFF800041B00FFFFFFFC1D00000002470301FFFFFFE8000404030000000A18200403000000000B201F00000002030301FFFFFFE8000404030000000A17200301FFFFFFFC000404030000005C0B2025000000007A0301FFFFFFFC000404030000005E0B2025000000007E0301FFFFFFFC000404030000005F0B202500000000820301FFFFFFFC00040403000000610B202500000000860301FFFFFFFC00040403000000620B2025000000008A0301FFFFFFFC00040403000000630B2025000000008E1D000000010A0403000000020101FFFFFFE800041B00FFFFFFFC1D00000000F00403000000020101FFFFFFE800041B00FFFFFFFC1D00000000D60403000000040101FFFFFFE800041B00FFFFFFFC1D00000000BC0403000000090101FFFFFFE800041B00FFFFFFFC1D00000000A20403000000030101FFFFFFE800041B00FFFFFFFC1D00000000880301FFFFFFE8000404030000000317200101FFFFFFE800041B00FFFFFFFC0301FFFFFFEC00040403000000050E201F00000000200403000000050101FFFFFFE800041B00FFFFFFFC1D00000000060403000000000101FFFFFFE800041F00000000200403000000010101FFFFFFE800041B00FFFFFFFC1D00000000061D00000000061B00FFFFFFFC0301FFFFFFF00004050000000104030000000114200101FFFFFFF000041B00FFFFFFFC02050301FFFFFFE400040301FFFFFFEC000414200301FFFFFFE400041E00000000700101FFFFFFF800041B00FFFFFFFC1D000000002C02050301FFFFFFE400040301FFFFFFE400041E00000000440101FFFFFFF800041B00FFFFFFFC0301FFFFFFFC00040101FFFFFFE000041B00FFFFFFEC1D00000000121B00FFFFFFFC1B00FFFFFFF01B00FFFFFFF8200002030403000000010101FFFFFFF800041B00FFFFFFFC0205020505000231000101FFFFFFF800041B00FFFFFFFC0301FFFFFFFC0004040500063835314E49480B230301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063835324E49480B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930314D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930324D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930334D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930344D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930354D414C0B2307200301FFFFFFFC00041F00000000140301FFFFFFFC00041F000000001A0301FFFFFFF80004040500063930364D414C0B2307201F00000000F904030000000305000000010403000000000B201F00000000200403000003AD0101FFFFFFE800041B00FFFFFFFC1D00000000C004030000000305000000010403000000000B201F000000002D0403000003CB040300000009050000000114200101FFFFFFE800041B00FFFFFFFC1D000000008004030000000305000000010403000000000B201F00000000200403000003A30101FFFFFFE800041B00FFFFFFFC1D000000004D04030000000305000000010403000000000B201F00000000200403000003990101FFFFFFE800041B00FFFFFFFC1D000000001A04030000039A0101FFFFFFE800041B00FFFFFFFC1D00000000060301FFFFFFEC00040301FFFFFFFC000404030000038F0B2025000000025E0301FFFFFFFC00040403000003990B202500000002A60301FFFFFFFC000404030000039A0B202500000002E70301FFFFFFFC00040403000003A30B202500000003250301FFFFFFFC00040403000003AD0B2025000000039C0301FFFFFFFC00040403000003AE0B202500000004270301FFFFFFFC00040403000003B70B202500000004B50301FFFFFFFC00040403000003B80B202500000005190301FFFFFFFC00040403000003B90B2025000000057D0301FFFFFFFC00040403000003BA0B202500000005E10301FFFFFFFC00040403000003C10B2025000000066C0301FFFFFFFC00040403000003CB0B202500000006B40301FFFFFFFC00040403000003CC0B202500000006E70301FFFFFFFC00040403000003CD0B202500000007330301FFFFFFFC00040403000003CE0B202500000007A20301FFFFFFFC00040403000003CF0B202500000008250301FFFFFFFC00040403000003D00B202500000008930301FFFFFFFC00040403000003D10B202500000008A20301FFFFFFFC00040403000003D20B202500000009100301FFFFFFFC00040403000003D30B2025000000097F0301FFFFFFFC00040403000003D50B202500000009B00301FFFFFFFC00040403000003D60B202500000009BD0301FFFFFFFC00040403000003D70B202500000009CE0301FFFFFFFC00040403000003DF0B202500000009DE0301FFFFFFFC00040403000003E00B202500000009EB0301FFFFFFFC00040403000003E10B202500000009F80301FFFFFFFC00040403000003E20B20250000000A050301FFFFFFFC00040403000003E30B20250000000A121D0000000A2F04030000000A0301FFFFFFE80004162005000000010403000000320500000001142004030000001414200101FFFFFFEC00041B00FFFFFFFC0405000E675F695F637265646974733031350101FFFFFFF000041B00FFFFFFFC1D00000009D10301FFFFFFEC000405000000010301FFFFFFE800040500000001142004030000000114200101FFFFFFEC00041B00FFFFFFFC0405000D636F6D706F6E745F30303030310101FFFFFFF000041B00FFFFFFFC1D000000097A0301FFFFFFEC000405000000010301FFFFFFE800040500000001142004030000000114200101FFFFFFEC00041B00FFFFFFFC0405000A6368656D5F30303030310101FFFFFFF000041B00FFFFFFFC1D00000009260301FFFFFFEC0004040300000004172004030000000114200101FFFFFFE800041B00FFFFFFFC0301FFFFFFEC00040403000000070E201F00000000200403000000070101FFFFFFE800041B00FFFFFFFC1D000000000604050009615F736869656C645F040500013014230301FFFFFFE800040500005C0114230101FFFFFFF000041B00FFFFFFFC1D00000008990301FFFFFFEC00040403000000090F201F000000002D0405000F675F695F6D65646571706D6E7430310101FFFFFFF000041B00FFFFFFFC1D00000000640301FFFFFFEC00040403000000100F201F000000002D0405000F675F695F6D65646571706D6E7430320101FFFFFFF000041B00FFFFFFFC1D00000000270405000F675F695F6D65646571706D6E7430330101FFFFFFF000041B00FFFFFFFC1D00000007F80301FFFFFFEC00040403000000090F201F000000002E04050010675F695F6472647265706571703030310101FFFFFFF000041B00FFFFFFFC1D00000000660301FFFFFFEC00040403000000100F201F000000002E04050010675F695F6472647265706571703030320101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6472647265706571703030330101FFFFFFF000041B00FFFFFFFC1D00000007540403000000010101FFFFFFEC00041B00FFFFFFFC0301FFFFFFEC000404030000000C0F201F000000002E04050010675F695F6164726E616C696E653030310101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6164726E616C696E653030340101FFFFFFF000041B00FFFFFFFC1D00000006DA0403000000010101FFFFFFEC00041B00FFFFFFFC0301FFFFFFEC000404030000000C0F201F000000002E04050010675F695F6164726E616C696E653030320101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6164726E616C696E653030350101FFFFFFF000041B00FFFFFFFC1D00000006600403000000010101FFFFFFEC00041B00FFFFFFFC0301FFFFFFEC000404030000000C0F201F000000002E04050010675F695F6164726E616C696E653030330101FFFFFFF000041B00FFFFFFFC1D000000002804050010675F695F6164726E616C696E653030360101FFFFFFF000041B00FFFFFFFC1D00000005E60301FFFFFFEC00040403000000090F201F000000002D0405000F675F695F6D65646571706D6E7430310101FFFFFFF000041B00FFFFFFFC1D00000000640301FFFFFFEC00040403000000100F201F000000002D0405000F675F695F6D65646571706D6E7430320101FFFFFFF000041B00FFFFFFFC1D00000000270405000F675F695F6D65646571706D6E7430330101FFFFFFF000041B00FFFFFFFC1D00000005450403000000020301FFFFFFE80004162005000000010403000000320500000001142004030000000A14200101FFFFFFEC00041B00FFFFFFFC0405000E675F695F637265646974733031340101FFFFFFF000041B00FFFFFFFC1D00000004E7040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC04050010675F775F61646873766772656E3030310101FFFFFFF000041B00FFFFFFFC1D000000049E0301FFFFFFEC00040403000000040F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D00000000260405000E675F775F7374756E6772656E30310101FFFFFFF000041B00FFFFFFFC1D000000043C0301FFFFFFEC000404030000000B0F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000049040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC04050010675F775F6372796F626772656E3030310101FFFFFFF000041B00FFFFFFFC1D00000003B70301FFFFFFEC000404030000000F0F201F000000004E0405000F675F775F666972656772656E3030310101FFFFFFF000041B00FFFFFFFC040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC1D000000003B0403000000010101FFFFFFEC00041B00FFFFFFFC0405000F675F775F746865726D6C64657430310101FFFFFFF000041B00FFFFFFFC1D000000031E0301FFFFFFEC00040403000000090F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000048040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000F675F775F666972656772656E3030310101FFFFFFF000041B00FFFFFFFC1D000000029A0405000D675F775F696F6E6772656E30310101FFFFFFF000041B00FFFFFFFC1D00000002750301FFFFFFEC00040403000000070F201F000000002C0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000048040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000F675F775F706F69736E6772656E30310101FFFFFFF000041B00FFFFFFFC1D00000001F10301FFFFFFEC00040403000000060F201F000000002D0405000F675F775F736F6E69636772656E30310101FFFFFFF000041B00FFFFFFFC1D0000000048040300000002050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000F675F775F736F6E69636772656E30310101FFFFFFF000041B00FFFFFFFC1D000000016C040300000004050000000104030000000114200101FFFFFFEC00041B00FFFFFFFC0405000E675F775F667261676772656E30310101FFFFFFF000041B00FFFFFFFC1D00000001250405000B675F695F706172747330310101FFFFFFF000041B00FFFFFFFC1D00000001020405000F675F695F70726F677370696B6530310101FFFFFFF000041B00FFFFFFFC1D00000000DB0405000E675F695F7365637370696B6530310101FFFFFFF000041B00FFFFFFFC1D00000000B50405000B775F726F636B65745F30310101FFFFFFF000041B00FFFFFFFC1D00000000920405000B775F726F636B65745F30320101FFFFFFF000041B00FFFFFFFC1D000000006F0405000B775F726F636B65745F30330101FFFFFFF000041B00FFFFFFFC1D000000004C0405000B775F726F636B65745F30340101FFFFFFF000041B00FFFFFFFC1D00000000290405000B775F726F636B65745F30350101FFFFFFF000041B00FFFFFFFC1D00000000061B00FFFFFFFC0301FFFFFFF8000402050301FFFFFFEC00041E000000003014230101FFFFFFE400041B00FFFFFFF01D00000000121B00FFFFFFFC1B00FFFFFFF41B00FFFFFFF8200002050301FFFFFFF800040500005C010101FFFFFFF800041B00FFFFFFFC020504050001300101FFFFFFF800041B00FFFFFFFC02030403000000040101FFFFFFF800041B00FFFFFFFC0301FFFFFFF400040500003B010301FFFFFFF800040F201F000000002C0301FFFFFFF800040301FFFFFFF0000414230101FFFFFFF000041B00FFFFFFFC1D00FFFFFFC3040500015B0301FFFFFFF000041423040500015D14230101FFFFFFE800041B00FFFFFFF01D00000000121B00FFFFFFFC1B00FFFFFFF41B00FFFFFFFC2000

This... is not good.

Formatting to the Rescue

Spoiler

All we see is a big block of text. We want to break this up into NCS operations, meaning op codes, op types, and optional operands. Our first snippet should look like this:


1E 00 00000008
20 00

And the big block of text should have a similar format:


for (auto CurPos{ BeginFile + 13 }; _at_least(CurPos, EndFile, 2u);)
{
	auto OpCode{ static_cast<ncs_op>(*CurPos) };
	++CurPos;

	auto OpType{ static_cast<ncs_type>(*CurPos) };
	++CurPos;

	Output += hex_str(static_cast<uint8_t>(OpCode));
	Output += ' ';
	Output += hex_str(static_cast<uint8_t>(OpType));

	switch (OpCode)
	{
	case ncs_op::CPDOWNSP: [[fallthrough]];
	case ncs_op::CPTOPSP: [[fallthrough]];
	case ncs_op::CPDOWNBP: [[fallthrough]];
	case ncs_op::CPTOPBP:
		Output += hex_str(CurPos, EndFile, 4u);
		CurPos += 4;
		Output += hex_str(CurPos, EndFile, 2u);
		CurPos += 2;
		break;
      ...
	}

	Output += '\n';
}

The loop checks for two bytes now, the op code and op type. Both are inserted into the output string, with a space in between them for printing columns.

The switch statement reads the op codes and determines if there should be operands. If so, it converts each operand according to its size. Each operand is preceded by a space, again to produce columns. The converted output is then terminated with a new-line character.

The loop starts over again, if there are more operations.

What I Start To Work With

Spoiler

The above code produces the following output for an actual file:


1E 00 00000008
20 00
02 03
04 03 00000000
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000002
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000003
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000004
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000005
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000006
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000007
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000008
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000009
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000A
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000B
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000C
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000D
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000E
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000F
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000010
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000011
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000012
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000013
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000044C
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000006
19 03
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000005
19 03
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000004
19 03
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000002
19 03
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000001
19 03
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000000
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000002
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000003
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000004
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000005
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000006
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000007
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000008
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000009
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000A
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000B
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000C
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000D
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000E
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 0000000F
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
2A 00
1E 00 00000010
2B 00
1B 00 FFFFFF54
20 00
02 06
04 06 00000000
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
04 03 00000037
03 01 FFFFFFF8 0004
05 00 02A7 02
22 03
1F 00 0000004C
04 03 00000001
04 03 00000037
03 01 FFFFFFF4 0004
05 00 02A8 03
04 03 00000398
04 03 00000008
05 00 0000 01
04 03 00000001
14 20
03 01 FFFFFFF4 0004
1E 00 00000014
1D 00 00000006
1B 00 FFFFFFFC
20 00
04 03 00000039
03 01 FFFFFFF8 0004
05 00 02A7 02
22 03
1F 00 00000267
04 03 00000001
04 03 00000039
03 01 FFFFFFF4 0004
05 00 02A8 03
02 03
02 03
04 05 000A
05 00 0244 01
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
03 01 FFFFFFEC 0004
04 03 00000384
0F 20
1F 00 00000020
04 03 00000000
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000006
02 03
03 01 FFFFFFF8 0004
04 03 00000006
05 00 0000 01
14 20
04 03 00000004
15 20
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
03 01 FFFFFFFC 0004
04 03 00000001
0F 20
1F 00 00000020
04 03 00000001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
1D 00 00000006
03 01 FFFFFFFC 0004
04 03 0000001E
0E 20
1F 00 00000020
04 03 0000001E
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
1D 00 00000006
04 03 00000001
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
03 01 FFFFFFF4 0004
03 01 FFFFFFE8 0004
10 20
1F 00 00000134
02 05
02 05
02 03
04 03 00000001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
02 05
03 01 FFFFFFD4 0004
03 01 FFFFFFE4 0004
1E 00 00000114
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 0001
03 01 FFFFFFEC 0004
05 00 0042 02
01 01 FFFFFFF8 0004
04 03 00000000
0D 20
1F 00 0000006B
04 03 00000004
03 01 FFFFFFF8 0004
04 03 00000001
14 20
03 01 FFFFFFE8 0004
05 00 0041 03
05 00 00E8 01
01 01 FFFFFFF4 0004
1B 00 FFFFFFFC
03 01 FFFFFFFC 0004
04 03 00000000
03 01 FFFFFFE8 0004
05 00 0041 03
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000001C
03 01 FFFFFFF0 0004
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
04 03 00000000
03 01 FFFFFFF4 0004
03 01 FFFFFFD8 0004
03 01 FFFFFFE8 0004
05 00 001F 04
1B 00 FFFFFFFC
1B 00 FFFFFFF0
03 01 FFFFFFF4 0004
24 03 FFFFFFF0
1B 00 FFFFFFFC
1D 00 FFFFFEC0
1B 00 FFFFFFF4
1D 00 00000006
1B 00 FFFFFFF4
20 00
02 03
04 03 00000000
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
02 03
04 05 000A
05 00 0244 01
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 05
04 05 0000
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
03 01 FFFFFFE8 0004
04 03 000003DE
0B 20
03 01 FFFFFFFC 0004
1F 00 00000015
04 03 00000007
05 00 02B8 01
22 03
06 20
1F 00 00000020
04 03 000003D4
01 01 FFFFFFE4 0004
1B 00 FFFFFFFC
1D 00 00000006
03 01 FFFFFFE8 0004
04 03 00000000
0B 20
1F 00 0000005B
04 03 00000000
05 00 0000 01
04 03 00000009
14 20
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
02 05
03 01 FFFFFFF0 0004
04 03 00000064
16 20
03 01 FFFFFFE4 0004
1E 00 FFFFFF13
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
1D 00 0000036D
03 01 FFFFFFE8 0004
04 03 00000064
18 20
04 03 00000000
0B 20
1F 00 0000010E
03 01 FFFFFFE8 0004
04 03 00000064
17 20
03 01 FFFFFFFC 0004
04 03 00000009
0B 20
25 00 0000000C
1D 00 0000007B
04 03 00000007
05 00 02B8 01
1F 00 00000020
04 03 00000009
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 0000004A
03 01 FFFFFFF4 0004
04 03 00000006
0F 20
1F 00 00000020
04 03 00000006
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 0000001A
04 03 00000008
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000006
1B 00 FFFFFFFC
03 01 FFFFFFF0 0004
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
02 05
03 01 FFFFFFE4 0004
04 03 0000000A
03 01 FFFFFFE8 0004
16 20
14 20
03 01 FFFFFFE4 0004
1E 00 FFFFFDED
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
1D 00 00000247
03 01 FFFFFFE8 0004
04 03 0000000A
18 20
04 03 00000000
0B 20
1F 00 00000203
03 01 FFFFFFE8 0004
04 03 0000000A
17 20
03 01 FFFFFFFC 0004
04 03 0000005C
0B 20
25 00 0000007A
03 01 FFFFFFFC 0004
04 03 0000005E
0B 20
25 00 0000007E
03 01 FFFFFFFC 0004
04 03 0000005F
0B 20
25 00 00000082
03 01 FFFFFFFC 0004
04 03 00000061
0B 20
25 00 00000086
03 01 FFFFFFFC 0004
04 03 00000062
0B 20
25 00 0000008A
03 01 FFFFFFFC 0004
04 03 00000063
0B 20
25 00 0000008E
1D 00 0000010A
04 03 00000002
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 000000F0
04 03 00000002
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 000000D6
04 03 00000004
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 000000BC
04 03 00000009
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 000000A2
04 03 00000003
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000088
03 01 FFFFFFE8 0004
04 03 00000003
17 20
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
03 01 FFFFFFEC 0004
04 03 00000005
0E 20
1F 00 00000020
04 03 00000005
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000006
04 03 00000000
01 01 FFFFFFE8 0004
1F 00 00000020
04 03 00000001
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000006
1D 00 00000006
1B 00 FFFFFFFC
03 01 FFFFFFF0 0004
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
02 05
03 01 FFFFFFE4 0004
03 01 FFFFFFEC 0004
14 20
03 01 FFFFFFE4 0004
1E 00 00000070
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
1D 00 0000002C
02 05
03 01 FFFFFFE4 0004
03 01 FFFFFFE4 0004
1E 00 00000044
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
03 01 FFFFFFFC 0004
01 01 FFFFFFE0 0004
1B 00 FFFFFFEC
1D 00 00000012
1B 00 FFFFFFFC
1B 00 FFFFFFF0
1B 00 FFFFFFF8
20 00
02 03
04 03 00000001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 05
02 05
05 00 0231 00
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
03 01 FFFFFFFC 0004
04 05 0006
0B 23
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
03 01 FFFFFFFC 0004
1F 00 00000014
03 01 FFFFFFFC 0004
1F 00 0000001A
03 01 FFFFFFF8 0004
04 05 0006
0B 23
07 20
1F 00 000000F9
04 03 00000003
05 00 0000 01
04 03 00000000
0B 20
1F 00 00000020
04 03 000003AD
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 000000C0
04 03 00000003
05 00 0000 01
04 03 00000000
0B 20
1F 00 0000002D
04 03 000003CB
04 03 00000009
05 00 0000 01
14 20
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000080
04 03 00000003
05 00 0000 01
04 03 00000000
0B 20
1F 00 00000020
04 03 000003A3
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 0000004D
04 03 00000003
05 00 0000 01
04 03 00000000
0B 20
1F 00 00000020
04 03 00000399
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 0000001A
04 03 0000039A
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000006
03 01 FFFFFFEC 0004
03 01 FFFFFFFC 0004
04 03 0000038F
0B 20
25 00 0000025E
03 01 FFFFFFFC 0004
04 03 00000399
0B 20
25 00 000002A6
03 01 FFFFFFFC 0004
04 03 0000039A
0B 20
25 00 000002E7
03 01 FFFFFFFC 0004
04 03 000003A3
0B 20
25 00 00000325
03 01 FFFFFFFC 0004
04 03 000003AD
0B 20
25 00 0000039C
03 01 FFFFFFFC 0004
04 03 000003AE
0B 20
25 00 00000427
03 01 FFFFFFFC 0004
04 03 000003B7
0B 20
25 00 000004B5
03 01 FFFFFFFC 0004
04 03 000003B8
0B 20
25 00 00000519
03 01 FFFFFFFC 0004
04 03 000003B9
0B 20
25 00 0000057D
03 01 FFFFFFFC 0004
04 03 000003BA
0B 20
25 00 000005E1
03 01 FFFFFFFC 0004
04 03 000003C1
0B 20
25 00 0000066C
03 01 FFFFFFFC 0004
04 03 000003CB
0B 20
25 00 000006B4
03 01 FFFFFFFC 0004
04 03 000003CC
0B 20
25 00 000006E7
03 01 FFFFFFFC 0004
04 03 000003CD
0B 20
25 00 00000733
03 01 FFFFFFFC 0004
04 03 000003CE
0B 20
25 00 000007A2
03 01 FFFFFFFC 0004
04 03 000003CF
0B 20
25 00 00000825
03 01 FFFFFFFC 0004
04 03 000003D0
0B 20
25 00 00000893
03 01 FFFFFFFC 0004
04 03 000003D1
0B 20
25 00 000008A2
03 01 FFFFFFFC 0004
04 03 000003D2
0B 20
25 00 00000910
03 01 FFFFFFFC 0004
04 03 000003D3
0B 20
25 00 0000097F
03 01 FFFFFFFC 0004
04 03 000003D5
0B 20
25 00 000009B0
03 01 FFFFFFFC 0004
04 03 000003D6
0B 20
25 00 000009BD
03 01 FFFFFFFC 0004
04 03 000003D7
0B 20
25 00 000009CE
03 01 FFFFFFFC 0004
04 03 000003DF
0B 20
25 00 000009DE
03 01 FFFFFFFC 0004
04 03 000003E0
0B 20
25 00 000009EB
03 01 FFFFFFFC 0004
04 03 000003E1
0B 20
25 00 000009F8
03 01 FFFFFFFC 0004
04 03 000003E2
0B 20
25 00 00000A05
03 01 FFFFFFFC 0004
04 03 000003E3
0B 20
25 00 00000A12
1D 00 00000A2F
04 03 0000000A
03 01 FFFFFFE8 0004
16 20
05 00 0000 01
04 03 00000032
05 00 0000 01
14 20
04 03 00000014
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000009D1
03 01 FFFFFFEC 0004
05 00 0000 01
03 01 FFFFFFE8 0004
05 00 0000 01
14 20
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000D
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000097A
03 01 FFFFFFEC 0004
05 00 0000 01
03 01 FFFFFFE8 0004
05 00 0000 01
14 20
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000A
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000926
03 01 FFFFFFEC 0004
04 03 00000004
17 20
04 03 00000001
14 20
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
03 01 FFFFFFEC 0004
04 03 00000007
0E 20
1F 00 00000020
04 03 00000007
01 01 FFFFFFE8 0004
1B 00 FFFFFFFC
1D 00 00000006
04 05 0009
04 05 0001
14 23
03 01 FFFFFFE8 0004
05 00 005C 01
14 23
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000899
03 01 FFFFFFEC 0004
04 03 00000009
0F 20
1F 00 0000002D
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000064
03 01 FFFFFFEC 0004
04 03 00000010
0F 20
1F 00 0000002D
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000027
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000007F8
03 01 FFFFFFEC 0004
04 03 00000009
0F 20
1F 00 0000002E
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000066
03 01 FFFFFFEC 0004
04 03 00000010
0F 20
1F 00 0000002E
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000028
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000754
04 03 00000001
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
03 01 FFFFFFEC 0004
04 03 0000000C
0F 20
1F 00 0000002E
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000028
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000006DA
04 03 00000001
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
03 01 FFFFFFEC 0004
04 03 0000000C
0F 20
1F 00 0000002E
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000028
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000660
04 03 00000001
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
03 01 FFFFFFEC 0004
04 03 0000000C
0F 20
1F 00 0000002E
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000028
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000005E6
03 01 FFFFFFEC 0004
04 03 00000009
0F 20
1F 00 0000002D
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000064
03 01 FFFFFFEC 0004
04 03 00000010
0F 20
1F 00 0000002D
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000027
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000545
04 03 00000002
03 01 FFFFFFE8 0004
16 20
05 00 0000 01
04 03 00000032
05 00 0000 01
14 20
04 03 0000000A
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000004E7
04 03 00000002
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000049E
03 01 FFFFFFEC 0004
04 03 00000004
0F 20
1F 00 0000002C
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000026
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000043C
03 01 FFFFFFEC 0004
04 03 0000000B
0F 20
1F 00 0000002C
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000049
04 03 00000002
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 0010
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000003B7
03 01 FFFFFFEC 0004
04 03 0000000F
0F 20
1F 00 0000004E
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
04 03 00000002
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
1D 00 0000003B
04 03 00000001
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000031E
03 01 FFFFFFEC 0004
04 03 00000009
0F 20
1F 00 0000002C
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000048
04 03 00000002
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000029A
04 05 000D
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000275
03 01 FFFFFFEC 0004
04 03 00000007
0F 20
1F 00 0000002C
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000048
04 03 00000002
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000001F1
03 01 FFFFFFEC 0004
04 03 00000006
0F 20
1F 00 0000002D
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000048
04 03 00000002
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000016C
04 03 00000004
05 00 0000 01
04 03 00000001
14 20
01 01 FFFFFFEC 0004
1B 00 FFFFFFFC
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000125
04 05 000B
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000102
04 05 000F
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000000DB
04 05 000E
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 000000B5
04 05 000B
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000092
04 05 000B
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000006F
04 05 000B
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 0000004C
04 05 000B
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000029
04 05 000B
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 00000006
1B 00 FFFFFFFC
03 01 FFFFFFF8 0004
02 05
03 01 FFFFFFEC 0004
1E 00 00000030
14 23
01 01 FFFFFFE4 0004
1B 00 FFFFFFF0
1D 00 00000012
1B 00 FFFFFFFC
1B 00 FFFFFFF4
1B 00 FFFFFFF8
20 00
02 05
03 01 FFFFFFF8 0004
05 00 005C 01
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 05
04 05 0001
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
02 03
04 03 00000004
01 01 FFFFFFF8 0004
1B 00 FFFFFFFC
03 01 FFFFFFF4 0004
05 00 003B 01
03 01 FFFFFFF8 0004
0F 20
1F 00 0000002C
03 01 FFFFFFF8 0004
03 01 FFFFFFF0 0004
14 23
01 01 FFFFFFF0 0004
1B 00 FFFFFFFC
1D 00 FFFFFFC3
04 05 0001
03 01 FFFFFFF0 0004
14 23
04 05 0001
14 23
01 01 FFFFFFE8 0004
1B 00 FFFFFFF0
1D 00 00000012
1B 00 FFFFFFFC
1B 00 FFFFFFF4
1B 00 FFFFFFFC
20 00

Yes, I'm reading that and write code to reverse it. Looks like fun, eh?

DrMcCoy · January 21, 2022

12 hours ago, AmanoJyaku said:

OK, so we want to print in hex. How? I had to write my own code for this, I couldn't figure out the C++ formatting and conversion library.

The C++98-era method would be to use stringstream. It's a bit terrible, though.

You could the C function snprintf():

#include <cstdint>
#include <cstdio>
#include <string>

std::string hex_string(uint8_t data) {
	char tmp[3];

	std::snprintf(tmp, 3, "%02X", data);

	return tmp;
}

In C++20, you'll be able to use std::format() instead, which provides a syntax similar to Python's string formating. I.e. you could do std::format("{:02X}", data), which already returns a std::string. Compiler support for that is still sparse though.

Alternatively, you can pull in the (optionally header-only) library fmt, which is actually what became C++20's std::format().

Sign In

A New DeNCS

Recommended Posts

Top Posters In This Topic

Popular Days

Top Posters In This Topic

Popular Days

Popular Posts

AmanoJyaku

AmanoJyaku

DarthParametric

Posted Images

Static variables

Instance variables

Local variables

Important Information