Guest cornel

batch editing the dialog.tlk file


Guest cornel

I'm currently editing - line by line - the dialog texts in the KOTOR game, and after only one day's work, I'm already getting desperate (lol). I have looked, but there doesn't seem to be a way to "search and replace" certain words in TalkEd. Is there maybe another utility that can batch modify text in a tlk file?

 

For example: I'm changing the "Republic" references into "New Republic". (I'm working my way, little mod by little mod, to changing the time frame of the game to after the original trilogy movies.) It would be nice to be able to do that with one command for all the entries in the dialog.tlk file ;) instead of changing text line by line...


Read this thread - http://deadlystream.com/forum/topic/3435-tlk-merging - in particular, using Xoreos-Tools to decompile the TLK into XML, as I detail in post #12. As DrMcCoy posted at the end of the thread, another tool is now also included to recompile an XML back into a TLK.

 

That thread links to an older version of the tools, though. The latest are available here - https://xoreos.org/downloads/index.html - which at the time of writing this post is 0.0.4 “Chodo”. Make sure to grab the tools link from the right-hand side, not the base engine link on the left.
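Once the TLK has been dumped to text or XML, the batch replace itself only takes a few lines of script. The following is just a sketch in Python: the function names and file paths are made up for illustration, and the encoding is an assumption (CP-1252 for a raw text dump; an XML export may well be UTF-8 instead). The lookbehind keeps an existing "New Republic" from becoming "New New Republic".

```python
import re

# Matches "Republic" as a whole word, unless it is already preceded by "New ".
REPUBLIC = re.compile(r"(?<!New )\bRepublic\b")


def rename_republic(text):
    """Replace every standalone "Republic" with "New Republic"."""
    return REPUBLIC.sub("New Republic", text)


def rewrite_dump(src_path, dst_path, encoding="cp1252"):
    """Hypothetical helper: apply the replacement to a whole dump file.

    The default encoding is an assumption; pass "utf-8" if the dump is UTF-8.
    newline="" keeps the original line endings untouched.
    """
    with open(src_path, "r", encoding=encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=encoding, newline="") as dst:
        dst.write(rename_republic(text))
```

Note that this pattern would also turn "Old Republic" into "Old New Republic", so it is worth reviewing the result before converting it back to a TLK.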

Guest cornel

Thank you both. I actually had already read that thread, but did not think it relevant (it is about merging) at the time. I did not grasp that converting tlk into xml was already the answer to my batch edit problem...

 

Is there any advantage (or disadvantage) to TLK->XML or TLK->TXT? What I mean is: is there a preferred (recommended) method?


If you do use the xoreos-tools, I'd be happy to hear any feedback you can offer. :)

Guest cornel

Sorry DrMcCoy... I'm afraid I went with what seemed to me the easiest way: TLK2TXT and TXT2TLK...

;)

Guest cornel

Please forgive my ignorance. I've tried to find an exact answer to my question, but was unable to.

 

I was under the impression that the DLG files were merely indexed "links" to the actual dialog text, which I thought was located in the dialog.tlk file. The fact that the DLG editor needed to know the location of the TLK file gave me this idea. But now I see that I was mistaken... After several days' worth of dialog.tlk editing, I find that the lines in the DLG files have remained the same. Do I need to edit these as well?

Guest cornel

But why is the same "talking heads" dialog also present in the dialog.tlk file, then? Which version takes precedence? In other words: which version am I supposed to edit?


Basically, the way the DLG files work is if the node has a String Reference of -1, it's a custom text in the file itself. If the String Reference is 0 or higher, then it references the dialog.tlk file.
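As a rough illustration of that rule, here is a small Python sketch. The function name, the dictionary standing in for dialog.tlk, and the handling of values below -1 are all assumptions made for the example; this is not how the game or any editor actually implements the lookup.

```python
def resolve_node_text(str_ref, embedded_text, tlk_strings):
    """Pick the text a dialog node would show, following the rule above.

    str_ref       -- the node's String Reference
    embedded_text -- the custom text stored in the DLG file itself
    tlk_strings   -- hypothetical stand-in for dialog.tlk: {string number: text}
    """
    if str_ref == -1:
        # Custom text: comes from the DLG file, dialog.tlk is not consulted.
        return embedded_text
    if str_ref >= 0:
        # Regular case: the text comes from dialog.tlk.
        return tlk_strings[str_ref]
    # The thread only describes -1 and 0 or higher; anything else is treated
    # as an error here (an assumption for this sketch).
    raise ValueError("unexpected String Reference: %d" % str_ref)


tlk = {38542: "Get Items"}
print(resolve_node_text(38542, "", tlk))             # text taken from the TLK
print(resolve_node_text(-1, "Custom line", tlk))     # text taken from the DLG
```

So editing dialog.tlk changes every node with a String Reference of 0 or higher, while nodes with -1 have to be edited in the DLG file itself.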

 

I should also note that the game will only recognize a dialog.tlk file if it's in the main game folder, not the override folder.

 

Lastly, were there any issues with the conversion back to TLK from TXT? I ask because as the black window will tell you, I laid out a pretty strict formatting.

Guest cornel

This forum is very helpful for (beginning) modders. I appreciate the help I am getting here!

 

@Fair Strides:

 

So if I understand correctly, I only need to modify those DLG files with nodes that have a StrRef of -1 (haven't seen one yet) and everything else can be edited in the dialog.tlk? Then I didn't do all that work for nothing... ;)

 

I cannot place my modified dialog.tlk in the override folder? Hmmm, that's bad news. Makes it difficult to publish a dialog mod.

 

I had no issues with either conversion, but then again, I never changed the layout, just the text strings...

Guest cornel

I'm a little embarrassed, but here goes: I've changed several dialog texts (but not any menu texts) in the dialog.tlk file, and now I see that, for some reason, certain menu items have changed.

 

For example, the "EXIT GAME" now reads "sEXIT GAM", the "SWITCH TO GET ITEMS" now is "rSWITCH TO GET ITEM" and the "SOUND" now says "sSOUN"... 

 

Does anyone recognize (and know the source of) these strange changes?


Just throwing my two cents in the ring, while knowing absolutely nothing about how Fair Strides' tool works:

 

That looks like the offset to the strings is off by one, starting with an entry somewhere in the middle of the file? So I guess the string length counting messed up somewhere? Did you maybe modify a string to contain a non-ASCII character, like an umlaut or a smart-quote, and it chokes there?

 

At least that's what I'd guess, that it counts something by characters instead of by bytes, leading to it writing two bytes but only counting one character.
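To make that guess concrete, here is a toy reproduction in Python. It is only a sketch of the suspected failure mode, not Fair Strides' tool and not the real TLK layout: the writer and reader below are invented for the example, and the strings are simply packed back to back.

```python
def pack_counting_characters(strings, encoding="utf-8"):
    """Toy writer with the suspected bug: sizes and offsets are counted in
    characters, while the data is written as encoded bytes."""
    entries, blob, offset = [], b"", 0
    for text in strings:
        entries.append((offset, len(text)))  # character count, not byte count
        blob += text.encode(encoding)
        offset += len(text)
    return entries, blob


def read_entry(entries, blob, index, encoding="utf-8"):
    """Toy reader: trust the stored offset and size, as a game would."""
    offset, size = entries[index]
    return blob[offset:offset + size].decode(encoding, errors="replace")


# "\u00e9" is an e with acute accent: one character, but two bytes in UTF-8.
entries, blob = pack_counting_characters(["Caf\u00e9 menus", "EXIT GAME"])
print(read_entry(entries, blob, 1))  # prints "sEXIT GAM"
```

Every entry after the one with the two-byte character starts one byte too early and loses its last character, which is the same pattern as "sEXIT GAM".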


As far as how the tool works, the TXT->TLK does this as far as the text goes:

 

Example entry:

String 38542:
  Flags:
    Sound: Yes
    SoundLength: 0.0
    Text Present: Yes
  Audio:                 
  Text: Get Items
----------

 

And the code:

    # The checking is done backwards. If the following line matches, we're at the end of the TLK entry's data
    if($line =~ /----------/)
    {
        #print "Found 31\n";
        $number = 30;
    }

    # If the line has any text, numbers, newlines, or dashes at the start of the line, assume it's part of the TLK entry
    if($line =~ /^[A-Z]|[0-9]|-|\n/)
    {
        # debugging check, will never be equal to 2
        if($number != 2)
        {
            # If we aren't at the end of the entry, see if we're in a TLK entry phase
            if($line !~ /----------/)
            {
                if($text_check == 1)
                {
                    #print "Found\n";
                    # Okay, we are, so set the phase to 31.
                    $number = 31;
                }
            }
        }
    }
    if($line =~ /Text: (.*)/s)
    {
        # begin recording the TLK entry
        $number = 32; $test_line = $1; $text_check = 1;
    }

# later in the file:

    # We're done with the entry, so record the data.
    if($number == 30)
    {
        $text_check = 0;
#        print "String $current_string\n Text: $text";

        # Actual text
        $Strings{$current_string}{Text} = $text;

        # Get the actual length of the text string
        $Strings{$current_string}{Size} = length($text);
#        print "Length: " . length($text) . "Text: $text\n";
        $text = "";
    }

    # Append string to TLK text
    if($number == 31)
    {
        $text .= $line;
    }

    # Duplicate to above, used to start TLK text
    if($number == 32)
    {
        $text .= $1;
    }


Going by the code, I'm 99% sure I'm counting characters and not bytes.

Guest cornel

I'm afraid you both lost me... What is "offset"?

 

String length? Are you saying that the number of characters needs to stay the same in every (dialog) text string?

 

Please forgive my ignorance, and thank you for helping me.

 

P.S.

I compared the original dialog.tlk with the modified one in Notepad++ with the compare plugin, and it gives the exact same number of strings, in the same places. The only differences were in the text (dialog altered by me)...

 

When using the original, unaltered dialog.txt, the menu typos disappear! So I must have done something wrong. Can't imagine what, though...

 

Maybe something went wrong with the batch search and replace I did a couple of times... although I checked the file several times manually, edit for edit. Oh man, if I have to do this again... this is almost a whole week's worth of work (from morning till late at night!)

 

:(

 

EDIT:

 

So, I've started again, taking the original dialog.tlk and adding, little by little, the modified parts, and then checking the game's menu if everything is still okay... Don't see any other way of going about this.


I'm afraid you both lost me... What is "offset"?

 

The place in the TLK file where the game looks for the text.

 

String length? Are you saying that the number of characters needs to stay the same in every (dialog) text string?

 

No. But the tool that writes the new TLK needs to count how many characters there are. And it's possible that Fair Strides' tool might count wrong.

 

 

Going by the code, I'm 99% sure I'm counting characters and not bytes.

 

That would be wrong, then. The length field in the file needs to hold the length of the string in *bytes*. And the offset is, of course, also byte-based. So it's vital to count how many actual bytes you write to the file, and calculate the running offsets with that.
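A minimal sketch of that bookkeeping in Python, for illustration only: the function names are hypothetical, and the real TLK format has a header and a fixed-size entry table that are not modelled here. The point is simply that both the size and the running offset come from the encoded bytes.

```python
def build_string_table(strings, encoding="cp1252"):
    """Return ([(offset, size), ...], data) with offset and size in bytes."""
    entries, chunks, offset = [], [], 0
    for text in strings:
        raw = text.encode(encoding)         # encode first ...
        entries.append((offset, len(raw)))  # ... then measure the bytes
        chunks.append(raw)
        offset += len(raw)                  # running offset is byte-based too
    return entries, b"".join(chunks)


def read_string(entries, data, index, encoding="cp1252"):
    """Read one string back using its stored byte offset and byte size."""
    offset, size = entries[index]
    return data[offset:offset + size].decode(encoding)


# Even with a multi-byte encoding, later entries stay intact.
entries, data = build_string_table(["Caf\u00e9 menus", "EXIT GAME"], "utf-8")
print(entries)                                 # [(0, 11), (11, 9)]
print(read_string(entries, data, 1, "utf-8"))  # EXIT GAME
```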

 

Still, the strings should be CP-1252 encoded for US and western European versions of the game, no? So it should map 1:1 bytes to characters. How are you reading the string, i.e. in which encoding and how it is stored in the variables, and how do you write them?

 

I'm afraid I have no clue at all how Perl (that is Perl, right?) handles strings, and different encodings.

 

Of course, I might be barking up the completely wrong tree here as well.


I compared the original dialog.tlk with the modified one in Notepad++ with the compare plugin, and it gives the exact same number of strings, in the same places. The only differences were in the text (dialog altered by me)...

 

Can the compare plugin spit out a file of only the differences? Can you post that (if it's not too long, I guess)? Best not even copy-pasted, but straight-up what that spits out, so we know it's not getting mangled. Do you see any "weird" character there?
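If the plugin cannot export the differences, a short script can produce the same thing from the two text dumps. This is only a sketch using Python's standard difflib module; the function names are invented here, and the dumps are assumed to have already been read into lists of lines.

```python
import difflib


def changed_lines(old_lines, new_lines):
    """Return only the removed ('-') and added ('+') lines between two dumps."""
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="", n=0)
    result = []
    for index, line in enumerate(diff):
        # The first two lines are the file headers; '@@' lines are hunk headers.
        if index < 2 or line.startswith("@@"):
            continue
        result.append(line)
    return result


def non_ascii_chars(line):
    """List the characters outside plain ASCII, i.e. the "weird" ones."""
    return [ch for ch in line if ord(ch) > 127]


old = ["Text: Exit Game", "----------"]
new = ["Text: Exit Gam\u00e9", "----------"]
for line in changed_lines(old, new):
    print(line, non_ascii_chars(line))
```

The lists could come from something like `open(path, encoding="cp1252").read().splitlines()` for each dump, with the encoding again being an assumption about the files.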


Okay, experimenting with your tools, Fair Strides, I fail to get them to break the file. :)

 

You basically read and write the strings verbatim, without converting. So you write CP-1252 to the txt file, and read it back as CP-1252, all the while strictly assuming that one character is one byte (as would be correct for CP-1252). So if the text is CP-1252, everything is okay. If someone modifies it and writes, say, UTF-8 into it, the text for that entry is wrong in the game, but further entries are not broken.
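The difference between "wrong text" and "broken file" can be shown with a toy example in Python. This is a sketch under assumptions, not the actual tool: the packing function is invented, and the strings are handled as raw bytes the whole way through, which is what verbatim reading and writing amounts to.

```python
def pack_verbatim(raw_strings):
    """Toy writer that never decodes: sizes and offsets come from the bytes."""
    entries, blob, offset = [], b"", 0
    for raw in raw_strings:
        entries.append((offset, len(raw)))
        blob += raw
        offset += len(raw)
    return entries, blob


# First entry was saved by an editor as UTF-8, second is plain CP-1252 text.
raw_strings = ["Caf\u00e9 menus".encode("utf-8"), "EXIT GAME".encode("cp1252")]
entries, blob = pack_verbatim(raw_strings)

# The game is assumed to interpret every entry as CP-1252.
shown = [blob[o:o + n].decode("cp1252") for o, n in entries]
print(shown[0])  # garbled: the two UTF-8 bytes show up as two characters
print(shown[1])  # EXIT GAME, untouched
```

Only the entry that was saved in the wrong encoding is displayed incorrectly; the offsets of everything after it remain correct because they were counted in bytes.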

 

While that's not the way I chose to go about it [*], I see nothing wrong.

 

I can't see how that would break the file in the way cornel experienced, so I guess I am barking up the wrong tree with the encoding thing.

 

[*] I am explicitly converting each TLK string I read to UTF-8, and always write UTF-8 to the XML file. I always read UTF-8 from the XML file and explicitly write in an encoding the game wants. This is way more work and headaches, but something I needed anyway for xoreos. It can also break in other interesting ways. :)
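In Python terms, the explicit-conversion approach might look roughly like the sketch below. The function names are hypothetical and this is not the xoreos-tools code, just an illustration of the round trip, assuming CP-1252 as the game encoding. It also shows one way such a conversion can fail: a character the game encoding cannot represent.

```python
def tlk_bytes_to_xml_bytes(raw, game_encoding="cp1252"):
    """Decode a string from the game's encoding and re-encode it as UTF-8."""
    return raw.decode(game_encoding).encode("utf-8")


def xml_bytes_to_tlk_bytes(xml_raw, game_encoding="cp1252"):
    """Decode UTF-8 from the XML and encode it for the game.

    Raises UnicodeEncodeError if a character has no representation in the
    game's encoding.
    """
    return xml_raw.decode("utf-8").encode(game_encoding)


original = b"Caf\xe9"                      # "Cafe" with an accent, in CP-1252
as_xml = tlk_bytes_to_xml_bytes(original)  # two bytes for the accented letter
assert xml_bytes_to_tlk_bytes(as_xml) == original

try:
    xml_bytes_to_tlk_bytes("\u4e2d".encode("utf-8"))  # a CJK character
except UnicodeEncodeError as error:
    print("cannot be written for the game:", error.reason)
```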

Guest cornel

I've googled a bit and it seems that Notepad++ can compare (with a plugin) but cannot summarize the differences and export them to a new file.

 

I am writing in English and am not using any strange characters.

 

Currently, I am in the process of copying and pasting my altered dialog lines into an unaltered dialog.tlk file (have been at it since this morning and it is now evening and I'm not even a third of the way...), testing it with the game every half hour or so (checking for strange menu entries), but so far no problems yet.

 

P.S. My wife is not all too happy with my new hobby... ;)

(nor am I at the moment, lol)


P.S. My wife is not all too happy with my new hobby... ;)

Heh, I know that all too well (just replace wife with girlfriend).
