DrMcCoy

Members
  • Content Count

    78
  • Days Won

    3

DrMcCoy last won the day on April 8 2018

DrMcCoy had the most liked content!

Community Reputation

40 Jedi Knight

1 Follower

About DrMcCoy

  • Rank
    Jedi Padawan
  • Birthday 08/05/1985

Contact Methods

  • Website URL
    https://drmccoy.de/

Profile Information

  • Gender
    Male
  • Location
    Braunschweig, Germany
  • Interests
    Geek. Atheist+. Leftist. Metal-Head. Discordian. Lefty.
    ScummVM dev, xoreos lead.
    Software freedom zealot.

  1. Except that this document is older than Skywing. It was created by Torlack, together with code, his compiler, which was released under the OpenKnights banner. Skywing built upon the documentation and code, creating, among other things, an NWScript JIT environment. The docs are also not without errors. SAVEBP, for example, does modify the SP, because that's where the BP is pushed onto. In German, we have a saying: "Papier ist geduldig", meaning you can write down a lot of things, none of which have to be true. Cicero knew it as "epistula non erubescit", a letter doesn't blush. The rest of your post is more hot air, because you have nothing to show for yourself except unsupported claims. Even if your documentation materialized some day, which I find doubtful, I don't expect it to contain more than those sorts of half-wrong claims with no proof. We're done here. Good day, Sir.
  2. I mean, you talk all you want about it, but it's all hot air, because you're not showing your work. Anyone can make grandiose claims like that. I never said that structs "disappear", I have no idea where you get that from. Structs don't technically exist in the bytecode, but you can detect them by seeing stack elements that are copied around together, and when the DESTRUCT opcode is used to select the element that's to be operated on (which is essentially the bytecode view of saying "foo.x"). Tracing that is a bit of a hassle. And technically, a smarter compiler could implement structs differently: instead of copying the whole struct and then calling DESTRUCT, it could just copy the one single element. And the DESTRUCT opcode could be used in other places. That's the thing, I'm not only concerned with the OpenKnights compiler and the BioWare compiler from the KotOR era, I'm also concerned about later revisions of the latter, used by Dragon Age II. I have already added 4 new opcodes to the xoreos disassembler (and the xoreos interpreter) used by Dragon Age: Origins and Dragon Age II, that, as far as I am aware, haven't been documented by anybody else prior to my work. I called them WRITEARRAY, READARRAY, GETREF and GETREFARRAY. If you think that shows I "don't understand NCS", sure. In either case, my work is out here in the open, for anybody to judge. The general case of stack-analysing recursive functions is impossible to handle, that's just a fact of the halting problem. If you can dispute that, you shouldn't waste your time on this, you should write a computer science or mathematics paper, whatever makes you more likely to be eligible for a Fields Medal or Abel Prize. If you've found a pattern that holds for both the BioWare compiler and the OpenKnights compiler, that is distinctive enough to not be triggered for other constructs, that's great too, but, like I said, anybody can make claims like that. Let me be blunt here: I don't believe most of your claims. 
There, I said it. I don't believe your swagger that this is just all easy-peasy and you have it all working and figured out. You might well have found a pattern for some things, they might even hold up for the majority of cases, but that's the rub: there's always weird corner cases where it doesn't work, where there's ambiguity between two different control structures. This is why this is hard. For x86, there's only a few tools out there of varying quality and prices (HexRays is the old top dog in the infosec space, Ghidra is the newcomer, for example). NCS is no x86 (it has more in common with the JavaVM or AVM structurally), but a lot of the theoretical underpinnings are still relevant. I'm assuming you haven't looked at what BioWare, if anything, changed up to Dragon Age II. In XHTML, it should. But all that is irrelevant, because HTML (nowadays) is a set of standards, regulated by a standards body, reasoned about by people, interoperability focused. We have specs written by that standards body. For NWScript we don't have that, never mind for its bytecode. In NWScript, everything is legal that the original compiler eats. And in NCS, everything is legal that the original game interprets as intended. That's it, that's all the guarantees we have. NWScript is literally "the proof is in the code". Except in, for example, the codecs space. They release something called a "reference implementation". In short, what that implementation eats and produces, that's the file format. Some might also release written specs. But have a look at what funny things the MPEG4 specs proclaim, including 3D rendering of arbitrary polygon data, which no one implements, not even they themselves, and tell me that's the format. And then have a look at the code in ffmpeg, where you have decades of person-hours poured into getting all the files that actually exist out there in the world read. 
Even your HTML example violates it, because look at the difference between what the specs say is legal and what browsers accept.
  3. Yes, you can actually inspect my git tree, see what I did, retrace the changes. That's exactly why it's public, so that other people can benefit from it. This is how this works. I mean, no, not really? I haven't exactly tackled the decompiler stage yet, I've so far concentrated on the disassembler. The problem with xoreos is that there are so many different subsystems to have a go at, and I can only go at one thing at a time. And if you look at the git of all the xoreos repos, I haven't done much work on any of them. My struggle is rather, quite frankly, with trying to balance the requirements in our current nightmare capitalistic society and my mental health in the pandemic that's still ongoing despite everyone behaving like it's over, i.e. RL issues outside of xoreos. If you look at what FLOSS work I've done at all in the past 2 years, it's virtually all related to DM'ing tabletop RPG games via Foundry VTT, because I'm DM'ing 4 groups, I've crammed all my free time full with doing that, because that keeps me stable at the moment. I also reject your claims that a decompiler is such a unique undertaking. Especially since I already have written one in the past, for the script language used in Coktel Vision's line of adventure games (Gobliiins, Woodruff and the Schnibble, but also the Adi/Adibou/Addy edutainment games) for my work with the ScummVM project. You'll find the same dependency on previous output when, for example, REing codecs. And heck, if you look at what complex beasts modern browsers are, those command the utmost respect from me. All the things you and I brought up here as an example can be done by one person alone, but a modern browser? Even compiling Firefox takes like multiple hours (and I do that regularly, since I run Gentoo Linux). 
I've filed multiple bug reports with Firefox, some I managed to trace, one I even managed to fix and send a patch upstream (which then wasn't used because someone else beat me to the punch), but most of them are too complex for me to understand. For example this here: https://bugs.gentoo.org/821898#c4 , where compiling Firefox with a specific version of Rust breaks it in a completely weird way.
  4. Agreed. I come from the FLOSS side of things, for me it's natural to release sources. And while in the FLOSS space, we disagree on different licenses, we agree that source releases, with a proper license attached to allow for re-use, are important. And "release early, release often" is also a common mantra: things might happen, so while we wait for a drop of perfectly finished and cleaned up sources, maybe you get hit by a bus. Or, less crass, your HDD dies and you don't have any backups. Or something else catches your interest or your RL gets bogged down or a global panini hits and your headspace is just weird as all f. It happens, that's okay. Don't get me wrong, good documentation and detailed, documented specs of file formats are also important. Ideally, there's both docs and code. With code, though, if you can get it to run, you can immediately see if it's correct. This isn't meant as an attack or rant or anything, just as an explanation of where I'm coming from here.
  5. I'd disagree. The proof is in the code. But sure, do also release proper, structured documentation, please.
  6. Why? Why should that have any say on whether an NCS decompiler is useful?
  7. The C++98-era method would be to use stringstream. It's a bit terrible, though. You could use the C function snprintf():

         #include <cstdint>
         #include <cstdio>
         #include <string>

         std::string hex_string(uint8_t data) {
             // Two hex digits plus the terminating NUL.
             char tmp[3];
             std::snprintf(tmp, sizeof(tmp), "%02X", data);
             return tmp;
         }

     In C++20, you'll be able to use std::format() instead, which provides a syntax similar to Python's string formatting. I.e. you could do std::format("{:02X}", data), which already returns a std::string. Compiler support for that is still sparse though. Alternatively, you can pull in the (optionally header-only) library fmt, which is actually what became C++20's std::format().
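
For comparison, the stringstream route named at the start of that post could look roughly like the sketch below. The function name hex_string_stream is made up for illustration; the manipulators are the standard ones from <iomanip>.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper name: formats one byte as two uppercase hex digits
// using the stream-based approach.
std::string hex_string_stream(std::uint8_t data) {
    std::ostringstream out;
    // Cast to int first: streaming a uint8_t directly would print a character.
    out << std::uppercase << std::hex << std::setw(2) << std::setfill('0')
        << static_cast<int>(data);
    return out.str();
}
```

The cast and the chain of manipulators are probably what makes this variant feel heavier than the snprintf() one for a single byte.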
  8. Yeah, xoreos-tools lacks support for animated textures, for example. It also doesn't export the TXI information. And it can't convert textures back to TPC. That's all in the TODO still. We do take patches and pull requests, though
  9. I'm not sure about the race condition theory, considering that the game isn't even, AFAIK, multi-threaded. I don't see how the checks could happen out of order. And how that would then persist for the next checks. Likewise, a memory leak also sounds implausible. Not that I have any better idea, mind. I'm pretty sure there's even a lot of legitimate entries that have delay 0 that should be skipped: ones with only developer commentary. TSL has commentary in basically all of its dialogue files, mostly talking about how the camera is moved, the mental state of a character, etc. I think that's meant for the animators and voice actors? The commentary is enclosed in curly brackets and the game filters those out. Likewise, there's entries that exist solely for camera movement. Can't say anything about your proposed changes, though. However, 0xFFFFFFFF is often used as a sentinel value of sorts, so it might be there's some other functional difference between delay 0 and delay 0xFFFFFFFF in the game. In either case, the solution should of course be properly tested. Maybe even cross-checked with a disassembly of the game, or even attaching a debugger like OllyDbg while the game is running (and/or even already in the buggy state). That is, however, time-consuming and complex.
  10. You might want to see if the round-trip holds, at least using nwnnsscomp and existing source files. Like, source file -> nwnnsscomp (ncs #1) -> decompiler -> nwnnsscomp (ncs #2), then compare whether ncs #1 and ncs #2 are identical. There shouldn't be any differences in the bytecode, probably. Possibly also run it through the decompiler again to check that the output is stable. You can automate those steps and just focus on the examples where it fails. That's possibly also good as a sort of test suite between big changes or releases. There never was a public release of a KotOR-era BioWare compiler, though, right? Has anybody ever tried rigging the compiler in the Neverwinter Nights toolset (does it read the nwscript.nss or is it more hardcoded)? If it can be rigged, that would give you another angle to test on, though it's not necessarily guaranteed any of the NWN releases matches the version (versions?) that was used during KotOR development.
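
The comparison step of such a round-trip check could be sketched as below, assuming both NCS files have already been read into byte buffers. The names first_mismatch and kIdentical are hypothetical, and invoking the compiler and decompiler is deliberately left out, since their command lines are not given here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker meaning "no difference found".
static const std::size_t kIdentical = static_cast<std::size_t>(-1);

// Returns the offset of the first differing byte between two buffers,
// or kIdentical if they match exactly. If one buffer is a strict prefix
// of the other, the length of the shorter one is returned.
std::size_t first_mismatch(const std::vector<std::uint8_t>& a,
                           const std::vector<std::uint8_t>& b) {
    const std::size_t common = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < common; ++i) {
        if (a[i] != b[i]) {
            return i;
        }
    }
    return (a.size() == b.size()) ? kIdentical : common;
}
```

Reporting the offset rather than a plain yes/no makes it easier to jump straight to the failing instruction in a disassembly of either file.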
  11. For correctness' sake, the OBB file, at least in the Android version (I haven't looked at the iOS version at all so far), isn't exactly encrypted. In fact, the files themselves aren't even compressed, but the resource list is zlib-compressed. I've looked at the files a few days ago, planning to extend the unobb tool in xoreos-tools to make it read them as well: https://github.com/xoreos/xoreos-tools/issues/69 TL;DR: The KotOR2 Android OBBs, just like the Jade Empire Android OBBs, are a "virtual filesystem" (their terminology, they called their obb reading library libobbvfs.so). Essentially, the Jade Empire files were stored in 4096 byte blocks, zlib-compressed, with a zlib-compressed resource list at the end of the file. The KotOR2 ones are similar in principle, just that the files themselves are uncompressed for some reason. The resource list is still zlib-compressed though and has the exact same layout. There are some differences in how I have to go about locating the resource list, though, and I'm currently a bit unsure how to go about it. Potentially, the solution is in finding out how one thing I so far ignored in the file works, a list of blocks. If that exists in the KotOR2 files as well, that is, I haven't checked. If everything fails, I'll just add a simple auto-detection heuristic with a command line flag to override that, but that feels a bit dirty. I'll play around with it a bit more over the coming days. If anybody else has already looked at the OBB files, feel free to ping me, maybe we can collaborate a bit there. If nobody even cares about reading those, feel free to ignore me. EDIT: See https://github.com/xoreos/xoreos-tools/blob/master/src/aurora/obbfile.h and https://github.com/xoreos/xoreos-tools/blob/master/src/aurora/obbfile.cpp for more details on the current code that can read the Jade Empire files. Still unextended with my KotOR2 stuff, because that's still WIP and hacky.
  12. Yeah, nwnnsscomp is not optimizing. But my point is that lots of roads lead to Rome, and denoting one as more correct than others is the wrong way to think about this. That different compilers written by different people lead to different bytecode that still accurately performs the same visible actions is normal, that should be dealt with. Similar for dead code. Dead code is entirely fine and should be handled properly without throwing errors there. And I feel that saying "nwnnsscomp is wrong" does it a huge disservice, because it's technically not wrong at all. So I'm more hung up about your phrasing here. As long as the decompiler still works as such (and doesn't just outright refuse to work) for cases where different idiosyncrasies slip through, be it nwnnsscomp or maybe the BioWare compiler used for Dragon Age, everything is entirely fine with me (for what that's worth, which is maybe not a lot). Guaranteeing a 100% match with the original source in all cases is pretty much impossible anyway, and not something I expected in the first place.
  13. Frankly, saying this is "incorrect" compilation is a misunderstanding of the compilation process. If you look at modern C compilers, they're doing far more transformations. That's all legal and often even wanted (to keep pipelines from stalling, etc). Don't stoop down to clickbait. And in general, for many targets, a nested if like that is indistinguishable from an unnested if, else if, else in the final machine code anyway. This is one reason compilation-decompilation is never a lossless process: different things in the source can compile down to the same machine code. What you have found is a "tell" in the original BioWare compiler, that lets you distinguish certain things. That nwnnsscomp doesn't have the same tell is not a fault of nwnnsscomp, it's the nature of the thing. Ask me about gcc and clang differences some day, or different styles of mangling C++ symbols in gcc and msvc.
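
To make the nested-versus-chained point concrete, here is a small illustration in C++ rather than NWScript. Both function names are made up; the two bodies express the same control flow, so a compiler is free to emit the same code for both and a decompiler cannot in general tell which form the source used.

```cpp
// Variant A: the second test is written as a nested if inside the else branch.
int classify_nested(int x) {
    if (x < 0) {
        return -1;
    } else {
        if (x == 0) {
            return 0;
        } else {
            return 1;
        }
    }
}

// Variant B: the same logic written as a flat if / else if / else chain.
int classify_chain(int x) {
    if (x < 0) {
        return -1;
    } else if (x == 0) {
        return 0;
    } else {
        return 1;
    }
}
```

In C++ the second form is only a layout convention: "else if" is an else whose statement happens to be another if.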
  14. It's probably easier and safer to just query the user on what game this script is for (via a command line option, for example), than to depend on some brittle heuristic to detect that. If you lack files to test this on, you could get Neverwinter Nights and install its toolset, because that came with an official BioWare NWScript compiler. Yes, it's for NWN, but the control structure stuff should be identical. Dunno if I already wrote that here or just in IRC, on a GitHub issue or wherever, but: the interesting thing about the DESTRUCT opcode is that it's used to single out individual struct members. I.e. whenever the nss uses a struct member, the ncs copies the whole struct to the top of the stack and uses DESTRUCT on this new block in the stack to get rid of everything but the single struct member it's interested in. Which means you can use the existence of DESTRUCT to identify structs.
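
The effect of DESTRUCT described there can be modelled with a toy stack, as in the sketch below. This is a simplified illustration only: the function name, the parameter names and the counting in whole stack cells are assumptions, not the actual operand encoding of the opcode.

```cpp
#include <cstddef>
#include <vector>

// Toy model: the top `block_size` cells of `stack` are a copied struct.
// Keep only `keep_size` cells starting `keep_offset` cells above the bottom
// of that block and drop the rest. Returns false if the arguments do not fit.
bool destruct(std::vector<int>& stack, std::size_t block_size,
              std::size_t keep_offset, std::size_t keep_size) {
    if (block_size > stack.size() || keep_offset + keep_size > block_size) {
        return false;
    }
    const std::size_t base = stack.size() - block_size;
    // Save the member we are interested in before discarding the block.
    const std::vector<int> kept(stack.begin() + base + keep_offset,
                                stack.begin() + base + keep_offset + keep_size);
    stack.resize(base);
    stack.insert(stack.end(), kept.begin(), kept.end());
    return true;
}
```

Under this model, a three-cell struct copy followed by destruct(stack, 3, 1, 1) leaves just the middle member on top, which is the "copy everything, keep one member" pattern a struct detector would look for.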
  15. Yeah, the way the original game creates the UI is a bit... meh. It's mostly hardcoded and the GUI files are only read to apply some styling afterwards. Though I can't really tell you outright which things are changeable and which aren't.