LewsTherinTelescope 18 Posted January 9, 2018 I know they're 16 bit integers, but are they the number of bytes into the data section, or what? And is there any delimiting character, or since they're all the same size, do I just put one right after the other? And lastly, I would use a short to represent one in Java, correct?
ndix UR 218 Posted January 9, 2018 Looking at my code ... I think they are offsets into the data section: 16-bit unsigned little-endian integers, with no delimiter. Double null bytes (or a 0 short, if you prefer) end the offsets section. I'm not sure how you would represent them or deal with binary data in Java, but they are 'short' in most architectures. In my application I just treat them as plain integers/numbers, because they are only used to read the values; once I've read the values I don't retain the offsets in the object, since it's a lot easier to recalculate them all when you write the file back out. And when writing them out, it generally takes a proper binary writer that knows how to encode a general "integer" into a uint16LE for output. The weirdest part, I thought, was encoding integers and floating-point numbers as null-terminated strings to put into the data section.
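For reference, a minimal Java sketch of the uint16LE round trip described above (the class and method names are just placeholders, not from any existing tool):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Uint16LE {
    // Encode an int in the range 0..65535 as two little-endian bytes
    static byte[] encode(int value) {
        return ByteBuffer.allocate(2)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putShort((short) value)
                .array();
    }

    // Decode two little-endian bytes back into an int in the range 0..65535
    static int decode(byte[] twoBytes) {
        return ByteBuffer.wrap(twoBytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getShort() & 0xFFFF;      // mask so the value is treated as unsigned
    }
}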
LewsTherinTelescope 18 Posted January 9, 2018 Thanks! So would I just call getBytes(), add that length to the offset for the next one, and start at 0? I think DataOutputStream has an option to write a short.
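One caveat with the DataOutputStream route: writeShort() always writes big-endian (network order), so the bytes have to be flipped first - for example with Short.reverseBytes() - or you can stick with a little-endian ByteBuffer. A rough illustration, with a made-up file name:

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class WriteOffsetDemo {
    public static void main(String[] args) throws IOException {
        short offset = 260;                                   // example value, 0x0104
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream("offsets.bin"))) {       // placeholder output file
            out.writeShort(Short.reverseBytes(offset));       // flips the bytes, so the file gets 04 01
        }
    }
}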
peedeeboy 23 Posted January 9, 2018 Here is my scrappy Java code:

/* Get offsets */
cellCount = rowCount * columnCount;                       // Calculate number of offsets
offsets = new int[cellCount];                             // Create array to hold offsets
for(int i = 0; i < cellCount; i++) {                      // Iterate once for each offset
    byte[] offsetByteArray = new byte[2];                 // Create new two byte array to store 16 bit number
    for(int j = 0; j < 2; j++) {                          // Get next two bytes
        offsetByteArray[j] = ds.readByte();               // and put them in the array
    }
    ByteBuffer offBuf = ByteBuffer.wrap(offsetByteArray); // Wrap the byte array in a new ByteBuffer
    offBuf.order(ByteOrder.LITTLE_ENDIAN);                // Set the byte order to LITTLE_ENDIAN
    int offset = offBuf.getChar();                        // Get the 16 bit number (Pop fact: char is a 16 bit unsigned datatype!)
    offsets[i] = offset;                                  // Put offset in array
}

* You are correct - there is one offset for each data item in the table. You will have already parsed the row and column headers section, so by multiplying rows * columns you know how many offsets there are.
* There is no delimiter between offsets. There is, however, a section break of 2 null bytes between the offset section and the data section.
* This code essentially reads two bytes into an array of type byte, then wraps that array with a ByteBuffer. This is to use the order() method of ByteBuffer to tell Java these bytes are in Little Endian byte order (Java's primitive integers are Big Endian and signed).
* I then use getChar() to convert to the char datatype. Why? Because char is the only unsigned datatype in Java! And conveniently, it's also 16 bits / 2 bytes long, so it's perfect for storing these offsets. So you can cast those chars to int and it all works... it seems hacky, but it gets around Java's datatypes. Side note: you can do things like this with chars:

int i = 10;
char c1 = 'a';                      // integer value of 97
char c2 = 'b';                      // integer value of 98
System.out.println(i + c1 + c2);    // prints 205

* And yes, you are correct again that the offsets are how far into the data section each data item is - starting from the point after those two null bytes mentioned above. (So multiple data items can point to one piece of data - e.g. if you have multiple instances of ****, all of their offsets point to the same place.) I am pulling the bytes off a stream, so I use the mark() method of the DataInputStream class to mark that point, then for each offset I read the data at that offset, then reset() back to the mark again, ready for the next offset:

/* Get Data */
this.data = new Object[rowCount][columnCount];            // Create two dimensional array to hold data
ds.mark(FILESIZE);                                        // Mark start of data area
int sent = 0;
for(int row = 0; row < rowCount; row++) {                 // Outer loop: Iterate through data rows
    for(int col = 0; col < columnCount; col++) {          // Inner loop: Iterate through columns
        String value = "";                                // Variable to 'gather' data value
        ds.skipBytes(offsets[sent]);                      // Skip ahead to the offset
        int r = ds.read();                                // Read next byte
        while(r != NULL && r != -1) {                     // Read until we hit a null (or end of stream)
            char ch = (char) r;                           // Convert decimal to ASCII char
            value += ch;                                  // Add char to value string
            r = ds.readByte();                            // Read next byte
        }
        sent++;
        this.data[row][col] = value;                      // Add value to 2DA
        ds.reset();                                       // Jump back to the mark, ready for the next offset
    }
}

Of course, you could just read the whole file into an array of bytes in memory and use variables to hold the mark position etc. - I just chose to pull the bytes off the InputStream rather than load the whole lot into memory and pull it apart there, in case of large 2da files. I might change my approach in future now that most PCs have loads of RAM! I hope some of that helps
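For what it's worth, if you did take the read-everything-into-memory route mentioned above, the offset pass can be done against a single ByteBuffer. A rough sketch, where the arguments stand in for whatever your own header parsing produces (the file name and method name are made up):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadOffsets {
    // Read cellCount uint16LE offsets starting at offsetSectionStart in the file's bytes
    static int[] readOffsets(byte[] fileBytes, int offsetSectionStart, int cellCount) {
        ByteBuffer buf = ByteBuffer.wrap(fileBytes).order(ByteOrder.LITTLE_ENDIAN);
        buf.position(offsetSectionStart);     // wherever the header/label parsing ended
        int[] offsets = new int[cellCount];
        for (int i = 0; i < cellCount; i++) {
            offsets[i] = buf.getChar();       // getChar() reads 2 bytes as an unsigned 16-bit value
        }
        // the next 2 bytes should be the 00 00 section break, then the data section begins
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get("appearance.2da"));   // example file name
        // int[] offsets = readOffsets(bytes, headerLength, rowCount * columnCount);
    }
}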
LewsTherinTelescope 18 Posted January 9, 2018 Thanks! I saw that code already when looking at your TwoDABroker.java, but the explanation helps. I'm using your code to read them; what I'm trying to do is calculate and write them when saving changes. Do you know if I need to convert everything to little endian, or just the numbers?
peedeeboy 23 Posted January 9, 2018 Looking at my code for the data section (I wrote this ages ago!), I'm pulling off one byte at a time there: r = ds.readByte(); So that section is ASCII (one byte = one character) - I believe the header and row/column names are the same? As it is only one byte, byte order is irrelevant here. What you will have to be careful of is that, as mentioned above, the Java char datatype is 2 bytes long (so it can store Unicode / UTF-16), and String also stores two bytes (or more) per character - Java is all about Unicode. So you will need to convert each character from your application to the equivalent one-byte ASCII representation and then write that byte[] array when you write to the file. (E.g. don't just pass a String to a BufferedWriter or similar, because the Writer will encode it with its own character encoding rather than writing the raw ASCII bytes you want.) Hope that makes sense?
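A small sketch of that ASCII conversion, with made-up values (StandardCharsets.US_ASCII gives the one-byte-per-character encoding):

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class AsciiWrite {
    public static void main(String[] args) throws IOException {
        String label = "appearance_label";                         // hypothetical cell value
        byte[] ascii = label.getBytes(StandardCharsets.US_ASCII);  // one byte per character
        try (FileOutputStream fos = new FileOutputStream("out.2da")) {  // placeholder file
            fos.write(ascii);                                       // raw single-byte characters
            fos.write(0x00);                                        // null terminator for the string
        }
    }
}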
LewsTherinTelescope 18 Posted January 9, 2018 Got it. So I need to convert the header, data, and labels to one-byte ASCII? And I need to make the row count little endian as well? EDIT: Is it plain ASCII, or one of the extensions? My text editor recognizes it as UTF-8, but obviously it's not a plain text file, so it might be getting confused.
peedeeboy 23 Posted January 9, 2018 Exactly! TBH, I'm not 100% sure whether the text in the 2da is ASCII or UTF-8 encoded, but they are the same for the basic Latin characters used in English (the first 128 code points, I think). Perhaps somebody with more knowledge of the Aurora Engine than me can confirm?
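If you want to be defensive about it, you can check whether a value survives ASCII encoding before writing it; a quick sketch with a made-up value:

import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class AsciiCheck {
    public static void main(String[] args) {
        CharsetEncoder ascii = StandardCharsets.US_ASCII.newEncoder();
        String value = "Füzzy";                                   // hypothetical cell value with a non-ASCII char
        if (!ascii.canEncode(value)) {
            System.out.println("Not plain ASCII: " + value);      // decide how to handle or replace it
        }
    }
}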
LewsTherinTelescope 18 Posted January 9, 2018 OK, thanks! I'd use a ByteBuffer with a length of two, correct? And then four for the row count, since it is 32 bit? Looking at the Jade Empire Modding wiki, the column names are ASCII, so I'm assuming the rest of the text is too. And I need to replace empty cells with "****", correct? Or does the binary 2DA not use that? My current code for offsets:

short offset = 0;                                                 // Stores the offset
ByteBuffer offsetLE = ByteBuffer.allocate(2).order(LITTLE_ENDIAN).putShort(offset);  // Must be little endian
for (int i = 0; i < this.data.length; i++) {                      // Loop through rows
    for (int j = 0; j < this.data[i].length; j++) {               // Loop through columns
        fos.write(offsetLE.array());                              // Write offset
        String cell = (String) data[i][j];                        // Cast to String
        offset += cell.getBytes().length;                         // Add length to offset
        offsetLE.putShort(0, offset);                             // Update ByteBuffer
    }
}
fos.write(new byte[]{NUL, NUL});                                  // Terminate offsets

For some reason, writing the offsets is adding a bunch of newlines to the file. Any idea what could cause that?
peedeeboy 23 Posted January 9, 2018 Your code is basically fine! Using the following test array of 2da data:

String[][] data = {
    {"r1c1", "r1c2", "r1c3"},
    {"r2c1", "r2c2", "r2c3"},
    {"r3c1", "r3c2", "r3c3"}
};

we should get a file with nine offsets - 0 4 8 12 16 20 24 28 32 - followed by two null bytes at the end. I edited your code just a wee bit so I could run it:

public static void main(String[] args) {
    /* Variables */
    String[][] data = {
        {"r1c1", "r1c2", "r1c3"},
        {"r2c1", "r2c2", "r2c3"},
        {"r3c1", "r3c2", "r3c3"}
    };                                                        // 2da of test data
    final byte NUL = 0x0;                                     // Constant for NUL byte
    File file = new File("D://Temp//test.2da");               // File to write to
    short offset = 0;                                         // Stores the offset

    /* Do the work */
    try (FileOutputStream fos = new FileOutputStream(file)) { // "Try with resources"
        ByteBuffer offsetLE = ByteBuffer.allocate(2)
                .order(LITTLE_ENDIAN)
                .putShort(offset);                            // Must be little endian
        for (int i = 0; i < data.length; i++) {               // Loop through rows
            for (int j = 0; j < data[i].length; j++) {        // Loop through columns
                fos.write(offsetLE.array());                  // Write offset
                String cell = (String) data[i][j];            // Cast to String
                offset += cell.getBytes().length;             // Add length to offset
                offsetLE.putShort(0, offset);                 // Update ByteBuffer
            }
        }
        fos.write(new byte[]{NUL, NUL});                      // Terminate offsets
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}

Opening the generated file in a hex editor, we get:

00 00 04 00 08 00 0C 00 10 00 14 00 18 00 1C 00 20 00 00 00

So let's check it, by converting each pair of bytes to a 16 bit unsigned int in Little Endian byte order:

00 00 = 0
04 00 = 4
08 00 = 8
0C 00 = 12
10 00 = 16
14 00 = 20
18 00 = 24
1C 00 = 28
20 00 = 32

Followed by 00 00 at the end! So apparently, you rock dood! (10 (0x0A) is a newline in ASCII / UTF-8 and 13 (0x0D) is a carriage return - so if you open the file in a text editor and it tries to convert the bytes to text, you'll get new lines wherever offset bytes happen to have those values.)
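As an aside, if a hex editor isn't handy, the same check can be done from Java by dumping the file's bytes as hex (the path just matches the example above):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class HexDump {
    public static void main(String[] args) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get("D://Temp//test.2da"));  // file from the example above
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));    // two hex digits per byte, e.g. "00 00 04 00 ..."
        }
        System.out.println(sb.toString().trim());
    }
}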
ndix UR 218 Posted January 9, 2018 Don't use '****' in a binary 2DA. Those cells become an empty string (a single null character). When I read 2DA values I convert empty strings to '****', and when I write a 2DA I convert '****' back to empty strings, but whether you want to do that depends on your use case (mine is a general purpose editor). Here is some javascript that I use to compute all the offsets. I do it early so that I can create a single Buffer with the exact file size before any writes are done. This is the part that makes it so that you only have one offset per unique value.

let value_pos = 0;
let value_hash = {};
for (let row of twoDA.rows) {
    for (let index of twoDA.labels) {
        let val = row[index] === '****' ? '' : row[index];
        if (value_hash[val] !== undefined) {
            // already seen this value, will use existing offset, skip
            continue;
        }
        // record offset to value in data section
        value_hash[val] = value_pos;
        // advance offset past this value
        value_pos += val.length;
        value_pos += 1; // null pad
    }
}

I can check later, but I'm 99+% sure that they are not UTF-8 and probably are straight ASCII. Windows of that era was pretty exclusively UTF-16LE, which 2DAs definitely are not. UTF-8 is a superset of ASCII, so most text editors report plain ASCII files as UTF-8 nowadays.
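For reference, a fairly direct Java translation of that offset pass might look like the sketch below (purely illustrative - the class and method names are made up, and it assumes the same '****'-to-empty-string convention):

import java.util.LinkedHashMap;
import java.util.Map;

public class OffsetPass {
    // Compute one offset per unique cell value; '****' is stored as the empty string
    static Map<String, Integer> computeOffsets(String[][] data) {
        Map<String, Integer> valueOffsets = new LinkedHashMap<>();  // keeps first-seen order
        int pos = 0;                                                // running position in the data section
        for (String[] row : data) {
            for (String cell : row) {
                String val = "****".equals(cell) ? "" : cell;
                if (valueOffsets.containsKey(val)) {
                    continue;                                       // reuse the existing offset
                }
                valueOffsets.put(val, pos);
                pos += val.length() + 1;                            // value bytes plus the null terminator
            }
        }
        return valueOffsets;
    }
}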
LewsTherinTelescope 18 Posted January 9, 2018 Except I have 32 extra lines, and that would only allow for one (also, my editor uses Unix newlines, not Windows), because the offset increases when writing each next number. I haven't tested it with dummy data, only with an actual mod, so I'll see if maybe it is different on Android, or maybe my other code messes with it.
peedeeboy 23 Posted January 9, 2018 Don't use '****' in a binary 2DA. Those cells become an empty string (a single null character). This is correct! Apologies for confusing things by erroneously mentioning **** earlier...
peedeeboy 23 Posted January 9, 2018 Except I have 32 extra lines, and that would only allow for one (also, my editor uses Unix newlines, not Windows), because the offset increases when writing each next number. I haven't tested it with dummy data, only with an actual mod, so I'll see if maybe it is different on Android, or maybe my other code messes with it. Newline in Unix is 10 (0x0A), no? So, if you are opening a file in a text editor looking for Unix line endings and using a UTF-8/ASCII encoding, every time it finds a 0x0A byte it will interpret that as a new line. Have you tried using a simple / small file (as per my example) and checking manually in a hex editor?
LewsTherinTelescope 18 Posted January 9, 2018 OK, thanks. Is the NUL separate from the cell terminator ([other cell] NUL NUL NUL [other cell]), or do I just leave the data blank ([other cell] NUL NUL [other cell])? And is the data section null-terminated or delimited (is there a NUL ending the file)? Thanks for the code, I'll try to translate that to regular Java and test it. My editor has separate encodings for ASCII and UTF-8, so the binary data is probably just confusing it.
LewsTherinTelescope 18 Posted January 9, 2018 Newline in Unix is 10 (0x0A), no? So, if you are opening a file in a text editor looking for Unix line endings and using a UTF-8/ASCII encoding, every time it finds a 0x0A byte it will interpret that as a new line. Have you tried using a simple / small file (as per my example) and checking manually in a hex editor? I have not yet, about to do so.
ndix UR 218 Posted January 9, 2018 OK, thanks. Is the NUL separate from the cell terminator ([other cell] NUL NUL NUL [other cell]), or do I just leave the data blank ([other cell] NUL NUL [other cell])? And is the data section null-terminated or delimited (is there a NUL ending the file)? It is just the single NUL as terminator for representing the empty string, AFAIK, so [other cell] NUL NUL [other cell] NUL - the second NUL there is itself a separate 'cell value' (the empty string). I just think of the data section as a concatenation of null-terminated strings (this is how strings work in C), not of the NUL as a delimiter. So what I see in that is [other cell][''][other cell]. There is a trailing NUL on the last element of the data section, but not an *extra* one. In my terminology, the last data item is a standard null-terminated string; in what you have above it would be [last cell] NUL EOF, where EOF is not an actual byte, just a human-readable marker for the end of the file for purposes of discussion. Yeah, I would definitely recommend using a hex editor to view your binary 2DA results, not a text editor.
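As a tiny made-up illustration of that layout, the sketch below builds a data section for the cell values "abc", "" and "de"; the resulting bytes are 61 62 63 00 00 64 65 00 - "abc" plus its terminator, a lone NUL for the empty string, then "de" plus its terminator, with nothing extra after the last one:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class DataSectionDemo {
    public static void main(String[] args) throws IOException {
        String[] uniqueValues = {"abc", "", "de"};            // made-up cell values, already de-duplicated
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String v : uniqueValues) {
            out.write(v.getBytes(StandardCharsets.US_ASCII)); // the characters (nothing at all for "")
            out.write(0x00);                                  // null terminator for every value
        }
        // out.toByteArray() is now: 61 62 63 00 00 64 65 00
    }
}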
LewsTherinTelescope 18 Posted January 9, 2018 OK, thanks - that makes sense. I haven't used C[++/#] before. By 'extra' I meant trailing; thanks for clearing that up. At the time, I was looking at appearance.2da, which was a lot easier to look at in a text editor than in hex. I didn't think about the fact that a hex editor would work better - will do from now on.
LewsTherinTelescope 18 Posted January 10, 2018 The offset code works - the issue was with another section. Thanks for the help!
peedeeboy 23 Posted January 10, 2018 Here's a big, dirty, monolithic bit of Java to write a binary 2da file that I just bashed together. It is based on the first three rows of Actions.2da and uses a Map to ensure there is one unique entry for each string in the data section, as ndix UR suggested. Obviously you would want to split each part of the process out into a separate function... There's no real error checking or data validation either (I think spaces/whitespace would need to be stripped from user input?), and I'm using a List of Lists of Strings for the 2da model - I don't know much about Android UI components, but for most desktop Java GUI components you need to use Collections rather than arrays for the TableModel if you want to add / remove rows etc.

public static void main(String[] args) {
    /* Variables */
    // Data
    List<String> colHeaders = new ArrayList<>(Arrays.asList(
            "label", "string_ref", "iconresref"));             // 2da column headers
    List<String> rowHeaders = new ArrayList<>(Arrays.asList(
            "0", "1", "2"));                                    // 2da row headers
    List<List<String>> data = new ArrayList<>();                // 2da data
    List<String> row1 = new ArrayList<>(Arrays.asList(
            "MoveToPoint", "", "IR_MOVETO"));                   // Create row 1
    List<String> row2 = new ArrayList<>(Arrays.asList(
            "PickUpItem", "", "ir_pickup"));                    // Create row 2
    List<String> row3 = new ArrayList<>(Arrays.asList(
            "DropItem", "", "ir_drop"));                        // Create row 3
    data.add(row1);                                             // Add row1 to 2da
    data.add(row2);                                             // Add row2 to 2da
    data.add(row3);                                             // Add row3 to 2da
    File file = new File("D://Temp//MiniAction.2da");           // File we are writing to

    // Constants
    final String CHAR_ENCODING = "US-ASCII";                    // Use either "US-ASCII" or "UTF-8"
    final byte[] FILE_HEADER = new byte[] {0x32, 0x44, 0x41, 0x20, 0x56, 0x32, 0x2E, 0x62, 0x0A}; // "2DA V2.b" followed by LF
    final byte NUL = 0x00;                                      // NULL byte
    final byte TAB = 0x09;                                      // TAB byte

    /* Do the work! */
    try(FileOutputStream fos = new FileOutputStream(file)) {    // "Try with resources" (stream closes automatically)

        // Write header
        fos.write(FILE_HEADER);                                 // Write "2DA V2.b" followed by LF

        // Write column headers
        for(String header : colHeaders) {                       // Loop through column headers
            fos.write(header.getBytes(CHAR_ENCODING));          // Convert String to ASCII bytes and write to file
            fos.write(TAB);                                     // Write TAB delimiter
        }

        // Write NULL delimiter to end the column headers section
        fos.write(NUL);

        // Write number of rows (32 bit unsigned integer)
        ByteBuffer rowBuffer = ByteBuffer.allocate(4)           // Create a byte buffer of 4 bytes
                .order(LITTLE_ENDIAN)                           // Little Endian byte order
                .putInt(rowHeaders.size());                     // containing the number of rows
        fos.write(rowBuffer.array());                           // Write number of rows to file
        rowBuffer.clear();                                      // Empty the byte buffer

        // Write row headers
        for(String header : rowHeaders) {                       // Loop through row headers
            fos.write(header.getBytes(CHAR_ENCODING));          // Convert String to ASCII bytes and write to file
            fos.write(TAB);                                     // Write TAB delimiter
        }

        // Write offsets
        short offset = 0;                                       // Keep a running total of how far into the data we are
        Map<String, Short> map = new LinkedHashMap<>();         // Keep an ordered map of Data/Offset
        ByteBuffer offsetBuffer = ByteBuffer.allocate(2)        // Create a byte buffer of 2 bytes
                .order(LITTLE_ENDIAN);                          // Little Endian byte order
        for(List<String> row : data) {                          // Loop through rows
            for(String element : row) {                         // Loop through each cell in the row
                offsetBuffer.clear();                           // Empty buffer
                if(map.containsKey(element)) {                  // If data has been encountered before
                    offsetBuffer.putShort(map.get(element));    // get the existing offset for it
                } else {                                        // otherwise,
                    offsetBuffer.putShort(offset);              // use the current running offset
                    map.put(element, offset);                   // store the data and offset in the map
                    if(element.length() > 0) {                  // If data is not an empty string
                        offset += element.length();             // update the offset for the next unique piece of data
                    }
                    offset++;                                   // add +1 to offset to account for NULL byte delimiter
                }
                fos.write(offsetBuffer.array());                // Write offset to file
            }
        }

        // Write two NULL bytes to mark end of offsets and start of data
        fos.write(new byte[]{NUL, NUL});                        // Write two null bytes

        // Write data
        for(String key : map.keySet()) {
            if(key.length() == 0) {                             // If data is an empty string
                fos.write(NUL);                                 // write a single NULL byte
            } else {                                            // If data is not empty,
                fos.write(key.getBytes(CHAR_ENCODING));         // convert String to ASCII bytes and write to file
                fos.write(NUL);                                 // Write NUL delimiter
            }
        }

    } catch (IOException ex) {                                  // Something went wrong
        ex.printStackTrace();                                   // so print the stack trace
    }

    System.out.println("Boo-yah!");                             // We're done! Crack open a beer!
}

Here is the resulting file opened in KOTORTool. Hopefully you find this useful - feel free to pinch any bits of code anywhere you get stuck
LewsTherinTelescope 18 Posted January 10, 2018 Thanks, I'll test that. I solved the issue I mentioned, though it still acts up: my code can successfully edit difficultyopt.2da (Impossible difficulty in K1R), but appearance.2da messes up (the 2da opens fine, but characters are invisible, so I probably just need to check for issues in the data section). That looks really similar to my code, except that I use arrays. I wrote my method to use your TwoDA class, so I'll have to change that. At least for now there is no visual editor - it just parses changes.ini - so the GUI doesn't matter yet. Why would I need to split them? A lot of changes affect other parts, so I'd think it'd be better to rewrite it all, just in case? EDIT: Also, any reason you used hex codes over plain numbers (0x00 instead of 0, 0x0A instead of 10, etc.)?
peedeeboy 23 Posted January 11, 2018 That looks really similar to my code, except that I use arrays. I wrote my method to use your TwoDA class, so I'll have to change that. When I wrote those classes, it was a quick and dirty solution whilst I was still trying to figure the .2da file type out properly. I didn't really think about the GUI - conceptually, I needed a two-dimensional array, so I used Object[][]. Also, one can construct a JTable by passing it a two-dimensional array; using Collections (such as ArrayList), one has to extend AbstractTableModel and implement some of the methods (such as how to set data in a particular cell) oneself. And as arrays take up consecutive space in heap memory, it's a pain to add rows in the middle... Of course, you might not want to add that functionality: for editing the existing .2da files, I believe it's generally considered best to leave the existing order of rows as is and add any new rows to the end of the file. But if you want people to be able to create .2da files from scratch, Collections will probably be the way to go. EDIT: I also stored row headings in a separate list, as that is conceptually how they are stored in a 2da file. But in the Netbeans plugin I'm working on at the moment, I will probably add them into the first column of the table model, so the user can edit them as if they were part of the data (a la the way KOTORTool does it). Again - a stylistic choice based on the GUI! Why would I need to split them? A lot of changes affect other parts, so I'd think it'd be better to rewrite it all, just in case? It's a stylistic choice on your part - but most programmers think that functions/methods should have one simple purpose. It makes code easier to read and maintain in the long run than monolithic scripts. E.g. in this instance, you might have a public write() method that accepts the table model, headers etc., but that method then calls several private methods for each stage - writeHeader(), writeOffsets(), writeData() and so on (something like the skeleton sketched below). This means your write() method reads more like a recipe, and you can skip to the other methods as you need. Code that reads like a recipe is easy to maintain - so the theory goes. EDIT: Also, any reason you used hex codes over plain numbers (0x00 instead of 0, 0x0A instead of 10, etc.)? It's a stylistic choice again, for consistency in this script. I copied the hex for "2DA V2.b" from opening actions.2da in a hex editor (because I couldn't be bothered to look the characters up in an ASCII table), so everywhere else I referred to a byte I used the hex code rather than an integer too. Again, I could have stored "2DA V2.b" as a String and converted it to bytes and written it out, but it was quick and easy to store the bytes as a constant, because they will always be the same when writing any binary 2da... There's no right or wrong answer there, only preference! Good luck with the project!
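For illustration only, the 'recipe' shape described above might look something like this skeleton - the class name and the exact method split are made up, and the bodies are left as stubs:

import java.io.FileOutputStream;
import java.io.IOException;

public class TwoDAWriter {                                    // illustrative skeleton only
    public void write(FileOutputStream fos) throws IOException {
        writeFileHeader(fos);                                 // "2DA V2.b" + LF
        writeColumnHeaders(fos);                              // tab-separated labels, then NUL
        writeRowHeaders(fos);                                 // row count (uint32LE), tab-separated row labels
        writeOffsets(fos);                                    // uint16LE offsets, then the 00 00 break
        writeData(fos);                                       // null-terminated ASCII cell values
    }

    private void writeFileHeader(FileOutputStream fos) throws IOException { /* as in the script above */ }
    private void writeColumnHeaders(FileOutputStream fos) throws IOException { /* ... */ }
    private void writeRowHeaders(FileOutputStream fos) throws IOException { /* ... */ }
    private void writeOffsets(FileOutputStream fos) throws IOException { /* ... */ }
    private void writeData(FileOutputStream fos) throws IOException { /* ... */ }
}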
LewsTherinTelescope 18 Posted January 11, 2018 OK, thanks. Your code works 100% fine; I wonder what was wrong with mine, since difficultyopt.2da worked. So I should change the data array in your TwoDA class to a List of Lists of Strings? I do actually currently have a test option that allows creating 2das, but it uses newline and comma delimiters rather than a table, at least for now.
LewsTherinTelescope 18 Posted February 4, 2018 EDIT: I also stored row headings in a separate list, as that is conceptually how they are stored in a 2da file. But in the Netbeans plugin I'm working on at the moment, I will probably add them into the first column of the table model, so the user can edit them as if they were part of the data (a la the way KOTORTool does it). Again - a stylistic choice based on the GUI! Something I just thought about: the columns should definitely be kept separate, because if they are part of the List/array, then if you ever need to add one for some reason, iterating through the entire thing is way slower than just adding it to one List of names.