DrMcCoy

Everything posted by DrMcCoy

  1. Except that this document is older than Skywing. It was created by Torlack, together with code, his compiler, which was released under the OpenKnights banner. Skywing built upon the documentation and code, creating, among other things, an NWScript JIT environment. The docs are also not without errors. SAVEBP, for example, does modify the SP, because that's where the BP is pushed onto. In German, we have a saying: "Papier ist geduldig" (literally "paper is patient"), meaning you can write down a lot of things, none of which have to be true. Cicero knew it as "epistula non erubescit", a letter doesn't blush. The rest of your post is more hot air, because you have nothing to show for yourself except unsupported claims. Even if your documentation materialized some day, which I find doubtful, I don't expect it to contain more than those sorts of half-wrong claims with no proof. We're done here. Good day, Sir.
  2. I mean, you can talk all you want about it, but it's all hot air, because you're not showing your work. Anyone can make grandiose claims like that. I never said that structs "disappear", I have no idea where you get that from. Structs don't technically exist in the bytecode, but you can detect them by seeing stack elements that are copied around together, and when the DESTRUCT opcode is used to select the element that's to be operated on (which is essentially the bytecode view of saying "foo.x"). Tracing that is a bit of a hassle. And technically, a smarter compiler could implement structs differently: instead of copying the whole struct and then calling DESTRUCT, it could just copy the one single element. And the DESTRUCT opcode could be used in other places. That's the thing, I'm not only concerned with the OpenKnights compiler and the BioWare compiler from the KotOR era, I'm also concerned about later revisions of the latter, used by Dragon Age II. I have already added 4 new opcodes to the xoreos disassembler (and the xoreos interpreter) used by Dragon Age: Origins and Dragon Age II, that, as far as I am aware, haven't been documented by anybody else prior to my work. I called them WRITEARRAY, READARRAY, GETREF and GETREFARRAY. If you think that shows I "don't understand NCS", sure. In either case, my work is out here in the open, for anybody to judge. The general case of stack-analysing recursive functions is impossible to handle, that's just a fact of the halting problem. If you can dispute that, you shouldn't waste your time on this, you should write a computer science or mathematics paper, whatever makes you more likely to be eligible for a Fields Medal or Abel Prize. If you've found a pattern that holds for both the BioWare compiler and the OpenKnights compiler, that is distinctive enough to not be triggered for other constructs, that's great too, but, like I said, anybody can make claims like that. Let me be blunt here: I don't believe most of your claims.
There, I said it. I don't believe your swagger that this is just all easy-peasy and you have it all working and figured out. You might well have found a pattern for some things, they might even hold up for the majority of cases, but that's the rub: there's always weird corner cases where it doesn't work, where there's ambiguity between two different control structures. This is why this is hard. For x86, there's only a few tools out there of varying quality and prices (HexRays is the old top dog in the infosec space, Ghidra is the newcomer, for example). NCS is no x86 (it has more in common with the JavaVM or AVM structurally), but a lot of the theoretical underpinnings are still relevant. I'm assuming you haven't looked at what BioWare, if anything, changed up to Dragon Age II. In XHTML, it should. But all that is irrelevant, because HTML (nowadays) is a set of standards, regulated by a standards body, reasoned about by people, interoperability focused. We have specs written by that standards body. For NWScript we don't have that, never mind for its bytecode. In NWScript, everything is legal that the original compiler eats. And in NCS, everything is legal that the original game interprets as intended. That's it, that's all the securities we have. NWScript is literally "the proof is in the code". Except in, for example, the codecs space. They release something called a "reference implementation". In short, what that implementation eats and produces, that's the file format. Some might also release written specs. But have a look at what funny things the MPEG4 specs proclaim, including 3D rendering of arbitrary polygon data, which no one implements, not even they themselves, and tell me that's the format. And then have a look at the code in ffmpeg, where you have decades of person-hours poured into getting all the files that actually exist out there in the world read.
Even your HTML example violates it, because look at the difference between what the specs say is legal and what browsers accept.
  3. Yes, you can actually inspect my git tree, see what I did, retrace the changes. That's exactly why it's public, so that other people can benefit from it. This is how this works. I mean, no, not really? I haven't exactly tackled the decompiler stage yet, I've so far concentrated on the disassembler. The problem with xoreos is that there are so many different subsystems to have a go at, and I can only go at one thing at a time. And if you look at the git of all the xoreos repos, I haven't done much work on any of them. My struggle is rather, quite frankly, with trying to balance the requirements in our current nightmare capitalistic society and my mental health in the pandemic that's still ongoing despite everyone behaving like it's over, i.e. RL issues outside of xoreos. If you look at what FLOSS work I've done at all in the past 2 years, it's virtually all related to DM'ing tabletop RPG games via Foundry VTT, because I'm DM'ing 4 groups, I've crammed all my free time full with doing that, because that keeps me stable at the moment. I also reject your claims that a decompiler is such a unique undertaking. Especially since I already have written one in the past, for the script language used in Coktel Vision's line of adventure games (Gobliiins, Woodruff and the Schnibble, but also the Adi/Adibou/Addy edutainment games) for my work with the ScummVM project. You'll find the same dependency on previous output when, for example, REing codecs. And heck, if you look at what complex beasts modern browsers are, those command the utmost respect from me. All the things you and I brought up here as an example can be done by one person alone, but a modern browser? Even compiling Firefox takes like multiple hours (and I do that regularly, since I run Gentoo Linux).
I've filed multiple bug reports with Firefox, some I managed to trace, one I even managed to fix and send a patch upstream (which then wasn't used because someone else beat me to the punch), but most of them are too complex for me to understand. For example this here: https://bugs.gentoo.org/821898#c4 , where compiling Firefox with a specific version of Rust breaks it in a completely weird way.
  4. Agreed. I come from the FLOSS side of things, for me it's natural to release sources. And while in the FLOSS space, we disagree on different licenses, we agree that source releases, with a proper license attached to allow for re-use, is important. And "release early, release often" is also a common mantra: things might happen, so while we wait for a drop of perfectly finished and cleaned up sources, maybe you get hit by a bus. Or, less crass, your HDD dies and you don't have any backups. Or something else catches your interest or your RL gets bogged down or a global panini hits and your headspace is just weird as all f. It happens, that's okay. Don't get me wrong, good documentation and detailed documented specs of file format is also important. Ideally, there's both docs and code. With code, though, if you can get it to run, you can immediately see if it's correct. This isn't meant as an attack or rant or anything, just for an explanation of where I'm coming from here.
  5. I'd disagree. The proof is in the code. But sure, do also release proper, structured documentation, please.
  6. Why? Why should that have any say on whether an NCS decompiler is useful?
  7. The C++98-era method would be to use stringstream. It's a bit terrible, though. You could use the C function snprintf():

         #include <cstdint>
         #include <cstdio>
         #include <string>

         std::string hex_string(std::uint8_t data) {
             char tmp[3]; // two hex digits plus the terminating NUL
             std::snprintf(tmp, sizeof(tmp), "%02X", data);
             return tmp;
         }

     In C++20, you'll be able to use std::format() instead, which provides a syntax similar to Python's string formatting. I.e. you could do std::format("{:02X}", data), which already returns a std::string. Compiler support for that is still sparse though. Alternatively, you can pull in the (optionally header-only) library fmt, which is actually what became C++20's std::format().
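For comparison, here is a minimal sketch of the stringstream route mentioned above. The function name hex_string_stream is made up for this example. The cast to int matters because a stream would otherwise print a uint8_t as a character rather than a number.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper name: formats one byte as two uppercase hex digits.
std::string hex_string_stream(std::uint8_t data) {
    std::ostringstream out;
    // Without the cast, the stream would treat the byte as a character.
    out << std::uppercase << std::hex << std::setw(2) << std::setfill('0')
        << static_cast<int>(data);
    return out.str();
}
```

It works, but the manipulator chain is a good illustration of why the snprintf() or std::format() variants tend to be nicer to read.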
  8. Yeah, xoreos-tools lacks support for animated textures, for example. It also doesn't export the TXI information. And it can't convert textures back to TPC. That's all in the TODO still. We do take patches and pull requests, though.
  9. I'm not sure about the race condition theory, considering that the game isn't even, AFAIK, multi-threaded. I don't see how the checks could happen out of order. And how that would then persist for the next checks. Likewise, a memory leak also sounds implausible. Not that I have any better idea, mind. I'm pretty sure there's even a lot of legitimate entries that have delay 0 that should be skipped: ones with only developer commentary. TSL has commentary in basically all of its dialogue files, mostly talking about how the camera is moved, the mental state of a character, etc. I think that's meant for the animators and voice actors? The commentary is enclosed in curly brackets and the game filters those out. Likewise, there's entries that exist solely for camera movement. Can't say anything about your proposed changes, though. However, 0xFFFFFFFF is often used as a sentinel value of sorts, so it might be there's some other functional difference between delay 0 and delay 0xFFFFFFFF in the game. In either case, the solution should of course be properly tested. Maybe even cross-checked with a disassembly of the game, or even attaching a debugger like OllyDbg while the game is running (and/or even already in the buggy state). That is, however, time-consuming and complex.
  10. You might want to see if the round-trip holds, at least using nwnnsscomp and existing source files. Like, source file -> nwnnsscomp (ncs #1) -> decompiler -> nwnnsscomp (ncs #2), then compare whether ncs #1 and ncs #2 are identical. There shouldn't be any differences in the bytecode, probably. Possibly also run it through the decompiler again to check that the output is stable. You can automate those steps and just focus on the examples where it fails. That's possibly also good as a sort of test suite between big changes or releases. There never was a public release of a KotOR-era BioWare compiler, though, right? Has anybody ever tried rigging the compiler in the Neverwinter Nights toolset (does it read the nwscript.nss or is it more hardcoded)? If it can be rigged, that would give you another angle to test on, though it's not necessarily guaranteed any of the NWN releases matches the version (versions?) that was used during KotOR development.
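The comparison step of that round-trip could be sketched roughly like this. It only covers the "are ncs #1 and ncs #2 identical" part; actually invoking nwnnsscomp and the decompiler is left to whatever script drives the test. The names read_file and first_difference are invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Read a whole file into memory; yields an empty buffer if it can't be opened.
Bytes read_file(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    return Bytes(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}

// Offset of the first differing byte, or -1 if both buffers are identical.
long first_difference(const Bytes &a, const Bytes &b) {
    const std::size_t common = std::min(a.size(), b.size());
    const auto diff = std::mismatch(a.begin(), a.begin() + common, b.begin());
    if (diff.first != a.begin() + common)
        return static_cast<long>(diff.first - a.begin());
    // One buffer is a prefix of the other, or they are equal.
    return (a.size() == b.size()) ? -1 : static_cast<long>(common);
}
```

Reporting the offset rather than a plain yes/no makes it easier to jump to the right spot in a disassembly when a script fails the round-trip.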
  11. For correctness' sake, the OBB file, at least in the Android version (I haven't looked at the iOS version at all so far), isn't exactly encrypted. In fact, the files themselves aren't even compressed, but the resource list is zlib-compressed. I've looked at the files a few days ago, planning to extend the unobb tool in xoreos-tools to make it read them as well: https://github.com/xoreos/xoreos-tools/issues/69 TL;DR: The KotOR2 Android OBBs, just like the Jade Empire Android OBBs, are a "virtual filesystem" (their terminology, they called their obb reading library libobbvfs.so). Essentially, the Jade Empire files were stored in 4096 byte blocks, zlib-compressed, with a zlib-compressed resource list at the end of the file. The KotOR2 ones are similar in principle, just that the files themselves are uncompressed for some reason. The resource list is still zlib-compressed though and has the exact same layout. There are some differences in how I have to go about locating the resource list, though, and I'm currently a bit unsure how to go about it. Potentially, the solution is in finding out how one thing I so far ignored in the file works, a list of blocks. If that exists in the KotOR2 files as well, that is, I haven't checked. If everything fails, I'll just add a simple auto-detection heuristic with a command line flag to override that, but that feels a bit dirty. I'll play around with it a bit more over the coming days. If anybody else has already looked at the OBB files, feel free to ping me, maybe we can collaborate a bit there. If nobody even cares about reading those, feel free to ignore me. EDIT: See https://github.com/xoreos/xoreos-tools/blob/master/src/aurora/obbfile.h and https://github.com/xoreos/xoreos-tools/blob/master/src/aurora/obbfile.cpp for more details on the current code that can read the Jade Empire files. Still unextended with my KotOR2 stuff, because that's still WIP and hacky.
  12. Yeah, nwnnsscomp is not optimizing. But my point is that lots of roads lead to Rome, and denoting one as more correct than others is the wrong way to think about this. It is normal that different compilers written by different people produce different bytecode that still accurately performs the same visible actions, and that should be dealt with. Similar for dead code. Dead code is entirely fine and should be handled properly without throwing errors there. And I feel that saying "nwnnsscomp is wrong" does it a huge disservice, because it's technically not wrong at all. So I'm more hung up about your phrasing here. As long as the decompiler still works as such (and doesn't just outright refuse to work) for cases where different idiosyncrasies slip through, be it nwnnsscomp or maybe the BioWare compiler used for Dragon Age, everything is entirely fine with me (for what that's worth, which is maybe not a lot). Guaranteeing a 100% match with the original source in all cases is pretty much impossible anyway, and not something I expected in the first place.
  13. Frankly, saying this is "incorrect" compilation is a misunderstanding of the compilation process. If you look at modern C compilers, they're doing far more transformations. That's all legal and often even wanted (to keep pipelines from stalling, etc). Don't stoop down to clickbait. And in general, for many targets, a nested if like that is indistinguishable from an unnested if, else if, else in the final machine code anyway. This is one reason compilation-decompilation is never a lossless process: different things in the source can compile down to the same machine code. What you have found is a "tell" in the original BioWare compiler, that lets you distinguish certain things. That nwnnsscomp doesn't have the same tell is not a fault of nwnnsscomp, it's the nature of the thing. Ask me about gcc and clang differences some day, or different styles of mangling C++ symbols in gcc and msvc.
  14. It's probably easier and safer to just query the user on what game this script is for (via a command line option, for example), than to depend on some brittle heuristic to detect that. If you lack files to test this on, you could get Neverwinter Nights and install its toolset, because that came with an official BioWare NWScript compiler. Yes, it's for NWN, but the control structure stuff should be identical. Dunno if I already wrote that here or just in IRC, on a GitHub issue or wherever, but: the interesting thing about the DESTRUCT opcode is that it's used to single out individual struct members. I.e. whenever the nss uses a struct member, the ncs copies the whole struct to the top of the stack and uses DESTRUCT on this new block in the stack to get rid of everything but the single struct member it's interested in. Which means you can use the existence of DESTRUCT to identify structs.
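To make the DESTRUCT behaviour described above a bit more concrete, here is a toy model of it. It is a sketch under simplifying assumptions, not how any real interpreter does it: the stack holds one int per slot, and the offset is counted in slots from the lowest slot of the copied block. The units and the offset direction of the real opcode are not taken from any spec here, and the function name destruct is invented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Simplified model: one stack slot per value, top of the stack at the back.
using Stack = std::vector<int>;

// Remove the top `blockSize` slots, except for `keepSize` slots starting
// `keepOffset` slots into that block (assumed: counted from the block's
// lowest slot).
void destruct(Stack &stack, std::size_t blockSize, std::size_t keepOffset, std::size_t keepSize) {
    if (blockSize > stack.size() || keepOffset + keepSize > blockSize)
        throw std::out_of_range("DESTRUCT arguments exceed the stack block");

    const std::size_t blockStart = stack.size() - blockSize;
    const Stack kept(stack.begin() + blockStart + keepOffset,
                     stack.begin() + blockStart + keepOffset + keepSize);

    stack.resize(blockStart);                            // drop the whole copied block
    stack.insert(stack.end(), kept.begin(), kept.end()); // put the one member back
}
```

With a stack of {7, 10, 20, 30}, where the top three slots are a copied struct, destruct(stack, 3, 1, 1) leaves {7, 20}: the struct copy is gone and only the middle member survives, which is the "foo.x" pattern a decompiler would look for.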
  15. Yeah, the way the original game creates the UI is a bit...meh. It's mostly hardcoded and the GUI files are only read to apply some styling afterwards. Though I can't really tell you outright which things are changeable and which aren't.
  16. Interesting that you're encountering the same short-circuiting bug I've seen in NWN (I wrote about it here, though everyone reading is probably aware: https://xoreos.org/blog/2016/01/12/disassembling-nwscript-bytecode/). Up until now, I had assumed they fixed that bug by the time they moved to other games. I know that the bug is in some scripts in NWN, but not in all. And I know that the compiler in the last released toolset does not have that bug anymore. I'd be interested to know when exactly it was fixed...
  17. Technically, this is not quite correct. xoreos-tools is GPLv3+, with the additional "or later" term. I.e. "licensed under the terms of the GNU General Public License version 3 or (at your option) any later version", as it's stated in the README.md and the comment-header of all source files. The reason this exists is so that code can be upgraded from one GPL to the next version without having to ask every contributor for approval. You don't specify that in ncs2nss, so with just the GPLv3 license text, this code is by default GPLv3 exclusive. There's no separate full license text for the variant, it's part of §14, you only have to state this intent (preferably with that exact phrasing) where you say that the code is GPLv3. (Best practice is also that you add a small blurb to the top of all source files stating the license, just to make sure that it's absolutely clear. You can see an approach in all xoreos project source files, for example. Then again, it's also best practice to explicitly state the copyright holder in all files, but I've outsourced this into the AUTHORS file and only reference that file in the comment header.)
  18. xoreos and xoreos-tools are C++11 now, Phaethon is C++14. Though there are still vestiges of old stuff in there, like our own ScopedPtr which isn't necessary anymore now with std::unique_ptr. Gradually replacing that is on the TODO. As for newer standards, these are also gradually on my purview. Moving over xoreos and xoreos-tools to C++14 shouldn't be that much trouble. C++17 needs some more evaluations. Probably still a bit off, just to make sure. And I need to check if our CI systems (Travis CI and AppVeyor) and my personal cross-compilers can be upgraded. And probably also need to wait on Coverity Scan, that has a few issues with some C++14 stuff in Phaethon already, from verdigris we pull in (which lets you write Qt code without moc). Oh, and about the library request, I've written down my thoughts about it here: https://github.com/xoreos/xoreos-tools/issues/58#issuecomment-662419079 TL;DR: That's planned, but I don't think the time for that is now.
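As a rough illustration of why a hand-rolled scoped pointer isn't needed anymore with C++11, here is a small sketch using std::unique_ptr. The Resource type and the function name are invented for this example and have nothing to do with the actual xoreos ScopedPtr code.

```cpp
#include <memory>

// Hypothetical resource that tracks how many instances are alive.
struct Resource {
    explicit Resource(int &counter) : _counter(counter) { ++_counter; }
    ~Resource() { --_counter; }

    Resource(const Resource &) = delete;
    Resource &operator=(const Resource &) = delete;

private:
    int &_counter;
};

// Returns the number of live Resource objects while the pointer is in scope.
int live_inside_scope(int &counter) {
    // Plain C++11: std::make_unique only arrived with C++14.
    std::unique_ptr<Resource> res(new Resource(counter));
    return counter;
} // res goes out of scope here and deletes the Resource
```

After the call returns, the counter is back to where it started, without any explicit delete.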
  19. There's also these discussions on the xoreos-tools issue tracker: https://github.com/xoreos/xoreos-tools/issues/56 , https://github.com/xoreos/xoreos-tools/issues/57 , that might give you some pointers and ideas. Also shows a case where the current control flow analysis in xoreos-tools' ncsdis fails. lachjames also wrote me a few thoughts and ideas on Discord, but since this is non-public 1:1 talk, I can't really paste this without permission. (Also a reason I generally prefer public conversations, to be frank. Information wants to be free and all that jazz.)
  20. See also https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/game_kotor.h vs https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/game_kotor2.h . Btw, @lachjames is also currently working on a dencs. You might want to talk to each other to see if you can maybe combine your efforts. As for me, just to lay my cards on the table, I'm also quite interested in this. With some caveats: I'd want to integrate it into the xoreos-tools package, written in C++ and targeting not just KotOR/KotOR2, but all the Aurora-derived games that xoreos supports and of course based on the NCS disassembly code that already exists in xoreos-tools (which includes a few more opcodes for the two Dragon Age games, since those can do references and arrays). So what I'd like to do, once there is a working dencs that supports KotOR/KotOR2, is take that and port it to C++ and add it to the xoreos-tools. Provided there's sources, licensed compatibly with the GPLv3, and of course giving proper credits in the source file and AUTHORS text file. Probably not directly 1:1 either, but more seeing how you do things and redo them within the current framework in xoreos-tools.
  21. To explain what this warning means: ncsdis tries to run through the whole disassembly to analyze the stack. Essentially, how the stack looks after each operation: how many stack slots are allocated and what types (int, string, float, ...) are in there. When there is a fork, for example due to branching in an if-clause, ncsdis follows both paths. And when the paths merge again, for example after the if when the execution continues on linearly, the stack has to look the same for both paths. Otherwise, I wouldn't know what to make out of this situation. (ncsdis only checks that the size of the stack matches, and ignores the types at the moment.) In this case, for some reason ncsdis found that the stack sizes of two paths don't match. It's 2 slots for one path and 5 slots for another path. This might be a bug in ncsdis. This could also be a bug in the compiler that produced the NCS in question. For example, IIRC I found the same message for one of the bugs I wrote about in my blog post, the one with the short-circuiting boolean OR. In that case, it didn't really matter that the stack is broken for one path, because that path was logically dead anyway, it would never have been called. So to work around this bug in the original compiler, I added a dead branch check, and the stack analysis won't follow dead branches. It might be that this here is a similar problem and my dead branch check does not catch it and needs to be expanded. Or this script is broken after all. (Or it could still be a bug in ncsdis.) I'd need to check that, I'll add it to my TODO list. Or if anybody else would like to tackle this, feel free.
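The check described above can be sketched in a few lines. This is a toy version under simplifying assumptions, not the actual ncsdis code: each block only carries its net stack change and its successors, the analysis starts at block 0 with an empty stack, types are ignored, and there is no dead branch check. Block and stack_sizes_consistent are names invented for this example.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical, simplified block: net stack change plus successor indices.
struct Block {
    int stackDelta;
    std::vector<std::size_t> successors;
};

// Follows every path from block 0 and records the stack size at each block's
// entry. Returns false as soon as two paths reach the same block with
// different stack sizes.
bool stack_sizes_consistent(const std::vector<Block> &blocks) {
    if (blocks.empty())
        return true;

    std::map<std::size_t, int> entrySize;
    std::vector<std::pair<std::size_t, int>> work;
    work.emplace_back(0, 0);

    while (!work.empty()) {
        const std::pair<std::size_t, int> item = work.back();
        work.pop_back();

        const auto seen = entrySize.find(item.first);
        if (seen != entrySize.end()) {
            if (seen->second != item.second)
                return false; // paths merge with mismatching stack sizes
            continue;         // already analyzed with the same size
        }

        entrySize[item.first] = item.second;
        const int exitSize = item.second + blocks[item.first].stackDelta;
        for (std::size_t next : blocks[item.first].successors)
            work.emplace_back(next, exitSize);
    }
    return true;
}
```

An if/else where one branch leaves 2 slots and the other leaves 5 before they merge makes this return false, which is the situation the warning reports.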
  22. I don't remember seeing this exact case. Then again, it's been a while since I touched that code and I'm not known to have a good memory. :P Also, my disassembler just follows the starting segment and IIRC I don't even check for unreachable blocks without an edge leading into them. That's maybe something that could use improvement. I take GitHub pull requests! ;) However, I can say that I have found a few bugs in BioWare's script compiler (I explained two in my replies above, I think), so I wouldn't particularly rule that out. And while I can't name any specifics right now, I also never got the feeling that their compiler was all that great in optimizing, fusing instructions, or removing outright unnecessary instructions. I don't think they focussed on that at all, rather keeping it simple and working without any surprises (bugs notwithstanding).
  23. As for operators, you should probably also make sure you know the precedence rules. I'm not sure NWScript follows the C rules there exactly. I haven't yet looked at that, since it's unambiguous in the bytecode. Oh, and also, there's another bug I've found, but I'm not sure in which games it was present. It had to do with parameter shadowing. I.e. void foobar(int blah) { int blah = barfoo(blah); } (excuse the quirky formatting, this editor is weirder than I remember) I've seen the produced bytecode use the uninitialized value of the just created local variable blah as an input for barfoo(), instead of the parameter of foobar(). Maybe something to have an eye on. Hmm, depends on how DeNCS operates. Does it also try to decompile functions that are not called? I.e. if you have an include with foobar1(), foobar2() and foobar3(), and the script itself just calls foobar1() (and that doesn't call the other functions), does DeNCS also decompile foobar2() and foobar3()? Because IIRC ncsdis completely ignores foobar2() and foobar3() in that case and wouldn't see any recursion there.
  24. Yeah, do have a look at the nwscript directory: https://github.com/xoreos/xoreos-tools/tree/master/src/nwscript Specifically, the instruction.h: https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/instruction.h , which contains a list of opcodes (including 2 each that were introduced by Dragon Age: Origins and Dragon Age II). You might also want to look at https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/game_kotor.h and https://github.com/xoreos/xoreos-tools/blob/master/src/nwscript/game_kotor2.h for a list of engine functions and their signatures (return type + types of parameters). Alternatively, you can parse the nwscript.nss from the games yourself. If you haven't already, have a read over my blog post about disassembling the bytecode: https://xoreos.org/blog/2016/01/12/disassembling-nwscript-bytecode/ . It also shows a bug in BioWare's compiler (though that particular bug is not present in the KotOR scripts, it was fixed before KotOR development started it seems). If you want to decompile all scripts, you'll find that you're going to have problems with recursion, i.e. functions that call themselves. You need to analyze how a function leaves the stack to be able to continue with the code after the call, so you need to branch into a callee first before continuing with the caller. If the callee is the caller itself (or A calls B calls C calls A again)... that's a problem. This is essentially the halting problem, and there's no general solution, unfortunately.
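Spotting the problematic case up front is at least straightforward, even if analyzing it in general isn't. Here is a minimal sketch that flags recursive call chains (direct, or A calls B calls C calls A) with a depth-first walk. The call graph representation and all names are assumptions made for this example; a real tool would first have to build the graph from the JSR targets in the disassembly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical call graph: calls[f] lists the functions that f calls.
using CallGraph = std::vector<std::vector<std::size_t>>;

enum class Mark { Unvisited, InProgress, Done };

static bool reaches_cycle(const CallGraph &calls, std::size_t func, std::vector<Mark> &marks) {
    if (marks[func] == Mark::InProgress)
        return true;  // we came back to a function that is still being analyzed
    if (marks[func] == Mark::Done)
        return false; // already fully analyzed, no recursion through here

    marks[func] = Mark::InProgress;
    for (std::size_t callee : calls[func])
        if (reaches_cycle(calls, callee, marks))
            return true;

    marks[func] = Mark::Done;
    return false;
}

// True if any call chain starting at `entry` is recursive.
bool has_recursion(const CallGraph &calls, std::size_t entry) {
    std::vector<Mark> marks(calls.size(), Mark::Unvisited);
    return reaches_cycle(calls, entry, marks);
}
```

This only tells you where the "branch into the callee first" strategy would loop forever. What to do with those functions afterwards is the actual hard part.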
  25. (Btw, please feel free to tag me in when talking about xoreos or something related. Google takes a while until it picks up new posts :P) In either case, I'd be happy to have any help willing to work on xoreos and its sub-projects (xoreos-tools and Phaethon). I'd very much like xoreos-tools and Phaethon to be a collection of tools helpful for modders as well. My general idea was that xoreos-tools was to be a CLI collection and Phaethon a tool incorporating complementary GUI functionality. Both, I feel, are a good base to add more things that can edit and create, as opposed to just read, the files for the targeted BioWare games. I'd also be willing to have similar sub-projects under the same xoreos banner as well [1][2][3]. Basically, what I'm saying: I for one think xoreos would be the ideal grounds for modernized KotOR modding tools [4]. We're not quite dependency-free, because there are libraries we depend on. Especially Boost is somewhat of a beast, I'm aware. In the future, I do want to phase it out as stuff gets approved into the new C++ standards and those become mature enough to be used in userland code.
      [1] Though it might maybe be a good idea to migrate some of the code into a library first. Right now, xoreos-tools and Phaethon contain (mostly) unmodified copies of the xoreos files and that's a bit eeeeeeh
      [2] I do want them to be portable, though, and of course license-compatible with the GPLv3'd xoreos code. I can understand if people disagree with me on these points :P
      [3] And yes, I am sometimes a stickler for form and following a consistent coding style. Again, I can understand if people feel I'm too strict there and wouldn't want me near any of their code
      [4] And also the other targeted BioWare games, to be honest