Stage Once – a Visual Novel hacking tutorial

This is a PDF version of my tutorial located at my blog (http://proger.i-forge.net).
It might be slightly out of date compared to the original article (http://proger.i-forge.net/Stage_Once).
Necessary files and documentation can be downloaded from there.
I became interested in visual novels (and anime) much earlier than I became interested in hacking them for
translation purposes. This happened near September of 2008 (I still remember the month because that's the time
school year starts lol).
When this happened I suddenly found that I was able to understand those messy lines of what had seemed like
totally unmanageable assembler code coming out of OllyDbg, which used to scare me half to death. I've set up a
page with my visual novel (VN for short) tools which is still located here, although I'm planning to
significantly improve it some day.
I wrote this tutorial for one of my Internet friends with whom I had an intensive chat for several months (which
resulted in more than 200 forum posts, some of which were 60-90 KiB in size - pure ANSI). This guide is
intended to give an all-round view of how reverse-engineering (RCE) is performed. It requires no
prior knowledge - except maybe for the more or less common mechanics of how a computer works and
what WinAPI is. You don't even have to be able to write assembly code - you'll learn this and more as
you go through the pages.
It's ironic that none of the people (two at the time of this writing) whom I've sent this guide to actually
completed it - even though they had asked me themselves whether I could teach them some hacking stuff. In fact, I don't know if
they started at all, haha... Well, nobody is to be blamed for this, of course.
But, still, I would highly appreciate any feedback that you might drop in the comments! Or if you prefer
forums – welcome to ours :)
Now without further ado let's dive into the world of hacking...
VN hacking. That’s a very interesting and broad theme, much because of its puzzles that actually make
our mind work very hard :) I’ll try to show you the basics of RCE and hacking overall.
What we need now is a hex editor. If you don’t have one, unpack 010 editor from the rar – some time ago I
used WinHex but it sucks when working in Japanese locale – it almost becomes unusable. I tried this packed-with-features tool and it turned out to be quite good.
After registering 010 editor you can get rid of its splash screen in Tools | Options | Hide splash screen on
startup.
So let’s begin.
See runme.dat in the Scenario Runner directory? That’s a “scenario” it runs. Run the EXE. The scenario
is simple, as well as the interpreter itself – it outputs a string, then asks you a question and depending on your
answer it will either ask you the same question again or output a message and exit.
Your job is to change the strings it outputs – that’s what we do when we need to translate a game, for
example.
We’ll go brute force for a start. Open runme.dat in a hex editor (“hexed” from now on). You can clearly see some strings like
“Make me laugh!” and others. What if we change one of them?
Edit the first string – I made it “Do not cry…” – in this case we have an extra “!” left from the old string –
delete it. The file will become 1 byte shorter.
When you edit in a hexed there are two areas where you can do this – on the left you should enter
hex values (0-9, a-f) while on the right you can enter normal letters. Since we need to enter a string
it’s more convenient to do that on the right side.
Btw, you can switch between two sides by using Tab.
Also, note that there are two modes – overwriting and insertion. You can switch between them as
you usually do in normal text editors – by Insert.
Run the exe. Oh huh, we get some exception and also we see that it output some “☺” – while it’s good to
stay positive it’s not exactly what we wanted to see :)
We probably forgot to update something, for example, a string length. Naturally, the app should know how
long the string is. How would it learn that?
Remember the two general types of strings – C-style and Pascal-style. As you know, C-style strings don’t
have any length field; the length is determined at run time by finding a char with code #0: 32 33 34
00. Pascal-style strings don’t use null chars; instead they include the string length before the actual string, which
makes string processing much faster (at the cost of an additional length field, but that’s a very
low cost):
03 00 32 33 34. Usually that field is 2 or 4 bytes long.
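To make the difference concrete, here is a small Python sketch that reads both kinds of strings from the example bytes above. The function names are made up for this illustration, and the 2-byte little-endian length field is an assumption that happens to match the example; a real format may use a different field size.

```python
import struct

def read_c_string(data, offset=0):
    """Read a C-style string: scan forward until the terminating 00 byte."""
    end = data.index(b"\x00", offset)
    # The next item starts right after the terminator.
    return data[offset:end], end + 1

def read_pascal_string(data, offset=0):
    """Read a length-prefixed string (assumed: 2-byte little-endian length)."""
    (length,) = struct.unpack_from("<H", data, offset)
    start = offset + 2
    return data[start:start + length], start + length

print(read_c_string(bytes.fromhex("32 33 34 00")))          # (b'234', 4)
print(read_pascal_string(bytes.fromhex("03 00 32 33 34")))  # (b'234', 5)
```

Note how the C-style reader has to look at every byte, while the Pascal-style reader knows where the string ends right after reading the length.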
We need to look around to find where the problem is. Can you do that on your own? :)
Here’s what we have: 0E 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E 2E.
Those 0E 00 bytes look exactly like the string’s length, right? We need to do a quick conversion to check
this supposition: hit F11 (Tools | Base Converter) and enter 0E in the hex field – it’s 14. Actually, you could
also do this by setting a pointer before 0E and looking at the Inspector under Unsigned Short.
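The same check can be sketched in Python: read the 2-byte field, compare it with the text that follows and work out what the field should be. This is only an illustration under the assumption that the record is nothing more than a 2-byte little-endian length plus the text; `patch_pascal_string` and the trailing AA byte are invented for the example, and the real script may contain other fields that depend on the string size.

```python
import struct

def patch_pascal_string(data, offset, new_text):
    """Replace the length-prefixed string at `offset` and rewrite its length field."""
    (old_len,) = struct.unpack_from("<H", data, offset)
    old_end = offset + 2 + old_len
    new_record = struct.pack("<H", len(new_text)) + new_text
    return data[:offset] + new_record + data[old_end:]

# Stand-in for a piece of the script: length 14, the text, then one unrelated byte.
original = struct.pack("<H", 14) + b"Make me laugh!" + b"\xAA"
patched = patch_pascal_string(original, 0, b"Do not cry...")

print(len(b"Do not cry..."))  # 13, so the field must become 0D 00
print(patched[:2].hex())      # 0d00
```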
If you're using a different hexed or something else prevents you from using a desktop base converter tool
you can use an online base converter such as this one at i-Tools.org.
A quick intro about how numbers are stored in memory and similar places. One thing I found
confusing about them in the beginning was that they are stored in reversed byte order, e.g. if we have a 2-byte number and it’s 255 it will look like FF 00 in machine representation, not like 00FF as we’d
write it.
In fact, this is called little-endian or Intel byte order, and there’s also big-endian byte order which
looks like our normal notation, that is, 00 FF.
However, Intel byte order is the one you will meet most often on PCs (one place I know of that uses big-endian is network
transfers).
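If you have Python at hand you can play with both byte orders directly. This is just a quick illustration using the built-in integer conversion methods; the variable names are arbitrary.

```python
value = 255

little = value.to_bytes(2, "little")  # Intel byte order
big = value.to_bytes(2, "big")        # "network" byte order

print(little.hex(), big.hex())        # ff00 00ff

# Going the other way: the two bytes 0E 00 read as a little-endian number.
length = int.from_bytes(bytes.fromhex("0e00"), "little")
print(length)                         # 14
```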
By the way, before doing anything to your subject make sure to back up the original files in case you
mess them up too much - this is always a good habit to acquire.
Update that length and run the script. Wow, that’s cool, we now got a working script! Yahoo!
…well, not exactly, as it turns out after a bit of investigation – the script works normally when the second
choice is made on the branch, otherwise it will crash. It says that the bytecode is corrupted…
Now I’ll leave you to explore the script with the hexed on your own so you can try to find where the trouble
is. When you’ve used up all your deductions carry on to the most fascinating part – debugging :D
Alright, here we are. A debugger is an incredible thing that allows us to read people’s minds… Ahem, yes,
sorry for being a little offtopic :)
Let’s unpack IDA Pro somewhere and load ScenarioRunner.exe into it. It will show a dialog box
about its file type, hit Enter.
Most of the time I use IDA because it has much better capabilities than OllyDbg when it comes to
giving names to memory locations, functions, etc. – Olly doesn’t have any of this.
However, Olly has very good breakpoint logging, plus patching functions and many other features
that IDA lacks (or is limited in) so sometimes I use Olly as well – we'll use it too in later chapters.
However, at the beginning I use IDA.
After a few sec IDA will disassemble the exe, you’ll notice it has finished when a bulb on the right of the
third panel row from the top changes color from yellow to green.
IDA has a terrible interface when you look at it at the beginning, even compared to Olly. It looks
unintuitive and, frankly, it is, unless you gather some experience. Luckily, it took me only a few days to
understand its basic concepts and remember keyboard shortcuts so I'm sure you'll do fine soon.
Btw, be ready (1) to use the keyboard and (2) to remember many shortcuts – you simply can’t operate
IDA any other way, e.g. through context menus, because items in those menus change very often and even
I don’t always understand the logic behind the changes :)
When you load an exe into IDA it will create 4 temp files in the folder of that exe. After you close IDA, it will glue them
to one *.idb file.
So, we should be at public start right now. If for some reason you’re not, hit “G” (most IDA
shortcuts are just one letter) and type “start”.
We’ll jump straight into the pool now – i.e. run the program from within IDA. Unlike Olly, IDA has
several debuggers (which all are external) but I always used only one so far – select Local Win32 Debugger and
press F9.
You can terminate the debugged process by Ctrl+F2.
Nothing extraordinary will happen right now. The program runs as it does without a debugger – and that’s
exactly what we need since we can peek at its inner mechanics while it sings its song peacefully… muhaha
Let’s finish debugging and look around some more.
A short intro into the asm language.
You will find out the rest on your own quite easily with just this knowledge anyway.
I think you already know that the CPU has registers, which are like memory slots but many times faster. A
register can hold a 32-bit value. They are (not including specialized registers like the FPU’s):
 EAX, EBX, EDX, ESI, EDI, EBP - general-purpose registers for any use (which depends on a compiler).
 ECX – normally used as a counter in cycles: for (ECX = 0; ECX < length; ECX++) { … }
 ESP – always holds Stack Pointer, the top of thread stack. Values on addresses < ESP are unused (free).
 EIP – always holds Instruction Pointer, i.e. the next instruction to be executed by the CPU.
Most registers can be changed by any instruction like MOV, except the EIP register – it’s changed by J*,
CALL, RET, etc.
Registers are prefixed with “E” (“Extended”) for a reason. Each “E*” register is 32-bit. In fact, you can
address registers ending in “X” (it probably also meant “eXtended” in old times, lol) as 16-bit and 8-bit: EAX
=> AX (16-bit) => AL & AH (8-bit). If AL is “L” and AH is “H” then the coverage of AX’s bits (one letter – one bit)
is like this: HHHHHHHH LLLLLLLL. Changing the Low/High part of a register doesn’t affect its
other parts. Also note that you can’t access the top 16 bits of an “E*X” register in this fashion.
Constructions like EDX:EAX create 64-bit “super-registers” (EDX holds the high half) but we will rarely face them (especially on 32-bit systems).
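If the bit layout is hard to picture, the following Python sketch models EAX as a plain 32-bit number and its sub-registers as masks over it. The helper names are invented for this illustration; a real CPU of course does this in hardware.

```python
def ax(eax):
    return eax & 0xFFFF           # low 16 bits

def al(eax):
    return eax & 0xFF             # low byte of AX

def ah(eax):
    return (eax >> 8) & 0xFF      # high byte of AX

def set_al(eax, value):
    # Only the lowest 8 bits change; AH and the top 16 bits stay as they were.
    return (eax & 0xFFFFFF00) | (value & 0xFF)

def set_ah(eax, value):
    return (eax & 0xFFFF00FF) | ((value & 0xFF) << 8)

eax = 0x12345678
print(hex(ax(eax)), hex(ah(eax)), hex(al(eax)))  # 0x5678 0x56 0x78
print(hex(set_al(eax, 0xFF)))                    # 0x123456ff
print(hex(set_ah(eax, 0x00)))                    # 0x12340078
```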
Basic instructions are:
MOV dst, src – copies src into dst. MOV EAX, 2 is the same as EAX := 2. Btw, a confusing fact is
that a few assemblers have the arguments in reverse order: MOV src, dst. However, this shouldn’t
concern us currently.
Note that most operations that have the form of OP reg, otherReg put the result back into reg, thus
modifying its content: reg = reg [op] otherReg.
 PUSH src / POP dst – puts src into and gets dst from the stack. A stack is simply a memory area
which is aligned on 4-byte boundary (i.e. each item in a stack is a DWord). Address of current stack size
(stack top) is stored in ESP register. You can find more info in the web but basically now you don’t need
to know more.
 XOR reg, byKey – it’s the same as Boolean XOR: reg = reg ^ byKey. An interesting fact is that this
instruction is used to make a register zero, i.e. clear it (as you know anything XOR’ed by itself becomes 0)
– it’s faster than MOV and needs less bytecode. So you can think of XOR EAX, EAX as MOV EAX, 0.
 AND/OR reg, byReg – also Boolean (bitwise) operations.
 ADD/SUB/IMUL reg, byReg – arithmetic operations: reg := reg + byReg. IDIV is an exception: it takes a single operand and divides EDX:EAX by it.
 TEST/CMP reg1, reg2 – these two instructions are what make asm conditions tick. It’s hard to explain
exactly what they do in a few words and, to tell the truth, I can’t always predict myself what the result of these instructions
will be. However, usually it's enough to keep in mind that they only set bits in the flags register and change nothing else: CMP subtracts reg2 from reg1 and TEST ANDs them, both throwing the result away. E.g. if we CMP EAX, 0 (or TEST EAX, EAX) and EAX was 0 then ZF (the zero flag) will be set. And if the
next instruction is JZ addr it will jump; if ZF is not set, the jump is skipped and execution goes on with the instruction after it.
If you have questions about how exactly they work I suggest you google an easy explanation because
I probably won’t be able to explain it any better myself :)
 JMP/J* reg/addr – this simply turns execution to start at another place. You can think of this as MOV
EIP, reg. JMP is an unconditional jump while its forms (JNZ, JGE, etc.) are conditional jumps.
 CALL reg/addr / RET [bytesToPop] – as the names suggest, they have something to do with
function calls.
CALL can be thought of as a shortcut for PUSH EIP; JMP addr.
RET – as POP EIP.
Attention: RET xxx is not the function’s return value as you might think (functions usually return their
result in EAX – note this) – it’s the number of bytes to pop from the stack after the return address (“cleaning the stack”). So it’s like:
ADD ESP, bytesToPop (the stack grows towards lower addresses, so popping increases ESP). A loop might be more illustrative:
while (bytesToPop > 0) { POP tmp; bytesToPop -= 4; }
If you’re interested why this form exists and why it’s rarely used (particularly in C and C++) read about
calling conventions – stdcall and those of Pascal and C.
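To see how PUSH, POP, CALL and RET n move ESP around, here is a toy Python model of the stack. It is a simplified sketch, not how a CPU is implemented: the class and its starting address 0x1000 are made up, and the return address is simply the address of the instruction after the first CALL in our listing.

```python
class Stack:
    """Toy model of a 32-bit x86 stack: it grows towards lower addresses."""

    def __init__(self, top=0x1000):
        self.esp = top
        self.mem = {}  # address -> DWord

    def push(self, value):
        self.esp -= 4                       # make room first...
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4                       # ...and give it back on POP
        return value

    def ret(self, bytes_to_pop=0):
        """RET n: pop the return address, then drop n bytes of arguments."""
        return_address = self.pop()
        self.esp += bytes_to_pop
        return return_address

s = Stack()
s.push(2)            # caller pushes two arguments...
s.push(1)
s.push(0x00413F98)   # ...and CALL pushes the return address
print(hex(s.esp))    # 0xff4

eip = s.ret(8)       # callee ends with RET 8
print(hex(eip), hex(s.esp))  # 0x413f98 0x1000
```

After RET 8 the stack pointer is back where it was before the caller pushed anything, which is exactly what “cleaning the stack” means.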

I think this is all for the asm intro (just one page – compare it to loads of books written about “Assembler
basics”…). Now we can get back to the disassembly. I’ve uploaded some docs about Intel asm instructions so you
can always consult them – they have every existing instruction, literally everything (I often use them myself).
So now we’re on public start. We don’t yet understand anything because everything has meaningless
names like CALL sub_40276C – right, “sub_” is a function’s prefix but what does the name tell us?
Nothing, that’s why we should start giving things proper names!
I’ll give you a direction. Gray lines starting with “;” are comments – IDA (as well as Olly) puts some useful
info into those comments. For example, we see this:
MOV EDX, offset aScenariorunner ; "* * * ScenarioRunner demo... "
Can you guess what it is? It’s a string, “aScenariorunner” is a name that IDA has auto-chosen for it (if you
go to Options | General | Strings you will see the Prefix field which you can change; default is “a”). This time
the autogenerated name looks okay so we’ll leave it as it is.
Let’s now think about the purpose of this block (note that sub_XXX addresses will be different if we change
and recompile ScenarioRunner.exe since procedure offsets will shift – it’s ok):
CODE:00413F8E    MOV     EDX, offset aScenariorunner
CODE:00413F93    CALL    sub_4049E4
CODE:00413F98    CALL    sub_4032E4
CODE:00413F9D    CALL    sub_40276C
With a little deduction we can come to a thought that one of these functions outputs a line to the console.
Which one? Best guess is the first since it seems to accept EDX as argument (I think so simply because it’s
closer to MOV than other CALLs). However, this might not be the case – maybe sub_4049E4 is only a
preparation and its result is used in the actual write-console function.
So let’s find out. Breakpoints – this is what we need. A breakpoint is just a “point” at which normal
program execution will pause and the debugger will be passed control. Both in Olly and IDA breakpoints are
set by F2, in Delphi – by F5.
Note that to set a breakpoint in IDA you need to select a debugger in the debugger list (you should’ve already done
this).
Let’s put a BP on the first function call (sub_4049E4). After you press F2 the line will be highlighted.
Now let’s start the program by F9 – IDA will almost instantly pause on the BP we’ve just set. Look at the
Scenario Runner’s console – confirm that it’s still empty. Hit F8 now to step over the current instruction.
Now the cursor will be moved to the line with a call to sub_4032E4. Look at the console – huh, it’s still
empty! No problem. Press F8 again – yup, that worked, the line is there. We likely just found the output
function! Isn’t that great?
As in VS, Delphi and Olly, F7 steps into instruction, F8 – over and F9 continues execution.
Here comes the difference between Olly and IDA I was talking about – Olly can’t change function names AFAIK while
in IDA you can give names to many things – which helps a lot.
Put the cursor on the name of the second CALL sub_4032E4 (inside the word sub_4032E4) and press “N” –
this opens a rename dialog. Enter some meaningful name for the function, e.g. “WriteLn” and hit OK. Btw, I
suggest you also prepend the names you give with some symbol (I use “$”) so you can quickly distinguish the
names you gave from the autogenerated ones.
So, I named this function “$WriteLn”.
We should also take care of sub_4049E4 – although we don’t know what it does we still need to give it
some name so we’ll at least know what we’re dealing with when we encounter it the next time. Since we don’t
know the call’s purpose let’s call it something like “$IsCalledBeforeWriteLn” – we can always rename it later.
If we don’t do this we’ll most likely end up in a situation when we’re lost in a mess of unnamed functions,
although most of those functions might be the same – we just don’t recognize unintuitive names like
sub_40F6B8 as the same function (it’s hard to remember a lot of such names anyway).
Good, we now have one function recognized. This will sure help us navigate thru the assembly code.
Now it’s time to take care of the weird IDA Debug workspace. You can customize it as you like. Here’s mine for
example:
I’ve only closed unnecessary tabs and rearranged existing windows. At this point don’t worry about Graph
view and window – you’ll see them soon.
I almost don’t customize IDA workspace used out of debug mode because 95% of time I spend with IDA is
debugging.
So, we have used one way of determining a subroutine’s purpose – by the strings that it accepts. In fact, strings
are like beacons for us reversers in the ocean of asm code – strings are what we see and what connects us to the
original program source, which is somewhere within that disassembly…
Well, actually we’ve used another one too – setting a BP before a function, stepping over it and looking at what has
changed after its execution. This doesn’t always work, especially in GUI apps, but it’s the simplest way if it does.
You can explore the functions on your own now and when you’re done we’ll begin to search for the actual
interpreter’s loop.
You can see the list of all strings in an exe that IDA recognized – by Shift+F12 or View | Strings.
Once you double-click on a string IDA will navigate you to that string. To find where it is actually
used (you rarely want to just see that string's raw bytes) press Ctrl+X to open the Xrefs window.
In IDA you can also add your own comments to lines by pressing “;”. IDA has repeatable and non-repeatable comments – if a repeatable comment was put on an instruction that is a target of a JMP or
CALL anywhere in the code, IDA will show the comment for that line (the one being jumped to) near
the JMP or CALL instruction too – but not if the comment was marked as non-repeatable.
You’ll understand this when you face the situation, just keep it in mind.
Did you call us? We are – the Imported Ones!
Strings are beacons but there's also another thing – imported functions. Imported functions also connect us to
the program source code, although in a bit more subtle way than strings because we don't exactly see them on
the screen but rather feel them being used somewhere in the core hehe
The table of imported functions is the number one target for exe protectors – they implement some tricks so
debuggers and disasms like IDA and Olly won’t see those functions… without extra effort at least.
This table is simply an array of DWords – pointers to each function's first instruction – thus, targets for CALL
(in rare cases JMPs can be used in place of CALLs - this is usually the behaviour of Delphi’s compiler).
For example, a program draws something on screen – some text. And the text that this program outputs just doesn’t look
good when used in another language – particularly, this is often an issue with Japanese games which use monospaced
square fonts – for Western languages they look unnatural at best.
So we want to replace the standard font it uses. We know that there’s an API function CreateFont which among
other things accepts the name of the font to create. We search for it, fix the name – and voila! The game displays a neat font
for our language.
Another bit of info regarding functions. In Windows there are two versions of almost each system
function that deals with strings: ending in “A” and in “W” (e.g. TextOutA and TextOutW). “A” stands for ANSI while
“W” stands for Unicode (also called “Wide” because each symbol takes up 2 bytes instead of 1). In
NT 5.0+ all functions ending in “A” AFAIK are just wrappers for “W” since the OS core operates
solely on Unicode.
Names that don’t have any suffix are macros in the Windows header files which let you easily switch between
A/W versions by defining a directive like UNICODE. In system DLLs such functions don’t exist.
Let’s get more specific to our problem. We need to find a function that acts as the interpreter’s loop –
since the general idea of a script interpreter is to have a loop which reads instructions from a script, has some
switch..case block (or in Delphi this is just case) and… well, we’ll see what comes next once we find it.
At least we’ll know what functions the engine supports and what their opcodes are (in a few words, an
opcode is an index of a function which that interpreter recognizes and eventually “interprets”).
To give you an idea here's a sample interpreter's loop written in PHP (as a compromise between C++ and
Delphi :P):
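function RunScript($script) {
    $pos = 0;
    // Keep going while there are unread bytes left in the script.
    while ($pos < strlen($script)) {
        // The opcode is one byte; ord() turns the character into its numeric code.
        $opcode = ord($script[$pos++]);
        switch ($opcode) {
        case 0:
            // ReadStrFrom() is assumed to take $pos by reference and advance it.
            WriteConsole('A message: ', ReadStrFrom($script, $pos));
            break;
        case 1:
            $varName = ReadStrFrom($script, $pos);
            SetVar($varName, ReadConsole());
            break;
        case 2:
            $scriptName = ReadStrFrom($script, $pos);
            RunScriptNamed($scriptName);
            break;
        // ...
        default:
            throw new Exception('Unknown command opcode '.$opcode);
        }
    }
}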
Of course, better code would use named constants instead of magic numbers (opcodes), encapsulated script
traversing and so on but the above code is just an example anyway.
I suggest this approach: we find calls to ReadFile, set BPs on them and watch for something to happen.
Open the Imports tab (View | Open subviews | Imports) and type the first chars of the function name on the keyboard (if you
press F1 you’ll get some help on IDA’s lists, they have a few handy features like searching by Alt/Ctrl+T)…
Aha, gotcha.
Let’s press Enter – IDA has transferred us to that function’s record in the import table – though
we need the code that refers to it, not the record itself. Press Ctrl+X – great, here’s… only one place? That’s strange but it’s
Delphi after all and I mostly deal with C* programs.
Go to that place. Ah, we see some kind of a wrapper function: JMP DS:__imp_ReadFile. We need
some real code so let’s find what refers to this function, then – Ctrl+X again. We got another two matches, great. Go
to both and set BPs there.
For some reason there’s also a second entry in the imports table for ReadFile – I dunno why there
are both of them, maybe it’s some compiler trick but that’s not a problem – set a BP there too.
By the way, one wonderful feature in IDA is its Graph view. It doesn’t work for first-level “functions” (because
they are not really functions) but most of the code sits in subroutines so it will work most of the time. If you’re not in Graph
view yet try pressing Space – if IDA complains just keep on trying each time you enter a new function and eventually it’ll
show you a bunch of nice code blocks.
Now I have 3 BPs set. Let’s roll! F9. Catch! Let’s review what we need to find once again: it should
somehow hint at a connection with runme.dat. Let’s see which arguments ReadFile has… file
handle, buffer and bytes to read are those beacons.
The file handle is, sure, the best connection we can have since it’s a unique ID of an open file, but to match it to a
real file on disk we’ll need to know the ID of that real file – it’s returned by CreateFile and is different each
time. We can set BPs on CreateFile calls, note when it is called with lpFileName =
‘.../runme.dat’ and note down somewhere the file handle it returns (btw, do you remember that
functions usually return their result in EAX register?).
OllyDbg has a very handy Handles window for this kind of thing – it lists all opened handles
(numeric values) and their string representations, not only for CreateFile but for other functions
too. However, we’re in IDA now so let’s carry on with it.
However, this is complex and we’re either lazy (lazy programmers, duh) or we just want to avoid doing more
steps than necessary (since that’s something we can always make up for). At first we’ll examine other signs –
we have bytes to read argument left, and also buffer. Well, let’s try bytes to read.
Let’s see, our first catch has hFile = 0x4C (doesn’t tell much) and nNumberOfBytesToRead =
0x0142. Let’s hit Shift+/ and open IDA’s calculator, which can also be used for base conversion (although
010 editor’s tool is more convenient to use). Let’s enter 0x0142 = 322 in decimal, which I suspect… Eureka!
Check the size of runme.dat – it’s exactly 322 bytes. How handy, the program reads the whole file into the
buffer (so it seems).
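The “does this number match some file size?” check is easy to automate when a game has hundreds of data files. Below is a rough Python sketch of that idea; `files_with_size` is a made-up helper, and the temporary directory with a 322-byte runme.dat is only a stand-in created for the demo, not the real Scenario Runner folder.

```python
import os
import tempfile

def files_with_size(directory, size):
    """Return names of regular files in `directory` that are exactly `size` bytes."""
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == size:
            matches.append(name)
    return matches

bytes_to_read = int("0x0142", 16)  # the value seen in nNumberOfBytesToRead
print(bytes_to_read)               # 322

# Demo on stand-in files; point the function at the real directory instead.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "runme.dat"), "wb") as f:
        f.write(b"\x00" * 322)
    with open(os.path.join(demo_dir, "other.bin"), "wb") as f:
        f.write(b"\x00" * 10)
    print(files_with_size(demo_dir, bytes_to_read))  # ['runme.dat']
```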
Well, to tell the truth, you’ll rarely get such a coincidence – at least not on the first call to ReadFile,
because files usually have some kind of header and signature, which are a dozen bytes in size. But if you
keep on watching ReadFile calls you may eventually find that a program reads some large chunk of data –
and if you compare it with the size of some file you may see that it differs only by a few tens of bytes.
We got lucky, our test subject is naïve and goes straight into our arms reading that whole script into
memory.
Now it’s time to track down what it’s gonna do with all that data it has just read.
The exe is paused before ReadFile call – let it fly by F8, but before you do that open the register that’s
PUSHed as lpBuffer in a new window of IDA – right-click on the register name and use the context menu or
just set the cursor on it and hit Ctrl+Enter (if you press just Enter it will open it in the same code window).
You can go back any time by using Esc.
Now we’re looking at the memory area that will have the contents of the file read into it. Let’s finally press
F8 and we’ll instantly see how that area becomes filled with data – our precious scenario bytecode. It’s just
about time we use hardware breakpoints.
Hardware breakpoints allow us to set BPs without modifying the program’s code (even in memory). Normal BPs
are set by writing the asm instruction INT 3 (byte CC) over the first byte of the command to break on, so on data they can become real “break” points:
the program doesn’t execute those bytes, it only reads them (such as a string in memory – it doesn't run
the string's byte values as asm code, it only accesses them), so the BP never fires and the data gets corrupted.
Hardware BPs are the CPU’s prerogative and the number of BPs you can set at once depends on the CPU. On x86 there are
four debug address registers (DR0–DR3), so you can count on 4 hardBPs at a time.
Still, 4 are usually enough since you can put normal BPs on executable code without any limits.
Also note that unlike softBPs, hardBPs on memory access are triggered after the trapped instruction or inside it (if it’s a complex command
like REPE MOV* - in a simple MOV or CMP it triggers after the instruction).
Don’t forget to delete BPs that were set in memory locations after the debugged program
terminates (unless it's global app memory but that's not important in our simple case) – since memory
addresses usually change each time program starts you’ll need to re-set all memory BPs each time
you start it from the debugger.
Go to that buffer window you’ve previously opened and set the cursor somewhere inside that lpBuffer
(or on its first byte), press F2 – IDA will likely check the hardware BP flag for you already. Mode should be Read.
We don’t need anything anymore from the BPs set on ReadFile calls so you can remove them – although I
suggest simply disabling them (from their context menu) so you can get back to them quickly if necessary. You
can open Breakpoints tab by Ctrl+Alt+B or by Debugger | Breakpoints | Breakpoints list menu command.
Now press F9 and wait until something happens… Here we go – “Hardware breakpoint … has been
triggered”. That’s great, let’s see what we’ve got here…
REPE MOVSD. Well, that looks scary but it's simply an asm instruction that copies a block of memory from
one location to another. If you want some info, e.g. how much this instruction copies and where to, open up
the docs I’ve uploaded (N-Z.pdf – that’s Intel’s manual) and search for that instruction there (I use simple
FoxitReader’s Search since the links in PDF’s contents only work for A-M.pdf).
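For the impatient, here is what the manual boils down to, written as a Python model: the repeat prefix runs MOVSD ECX times, each time copying one DWord from the address in ESI to the address in EDI and moving both pointers forward. This is a sketch that assumes the direction flag is clear (the usual case); `rep_movsd` and the flat bytearray standing in for memory are inventions for the example.

```python
def rep_movsd(memory, esi, edi, ecx):
    """Model of a repeated MOVSD: copy ECX DWords from [ESI] to [EDI]."""
    while ecx != 0:
        memory[edi:edi + 4] = memory[esi:esi + 4]  # one MOVSD = one DWord
        esi += 4
        edi += 4
        ecx -= 1
    # The registers are left pointing just past the copied blocks.
    return esi, edi, ecx

mem = bytearray(b"ABCDEFGH" + bytes(8))
print(rep_movsd(mem, esi=0, edi=8, ecx=2))  # (8, 16, 0)
print(bytes(mem))                           # b'ABCDEFGHABCDEFGH'
```

So when the hardware BP fires inside such an instruction, ESI tells you where the data comes from, EDI where it goes and ECX how many DWords are still left.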
We could now undertake the challenge of setting BPs on every REPE instruction we get until we hit something
useful or run out of hardBPs but we’ll go another route – let’s just press (or better hold) F8 until we find
something of interest. This way (holding F8) we’ll go up the call tree towards the root because we won’t go
into new functions (we’re not holding F7) but will return from all subroutines gradually. On the road we need
to keep our eyes open so we can catch something of value.
…after nearly five functions I got tired of this so I decided to press F9 again – maybe we’ll find something
faster in another part. IDA has shown me the same function again, just another branch. Okay, let’s be more
patient this time…
After a dozen returns I stumble upon a Graph which looks like a case statement. See those boxes going from one root and then joining together at the bottom? If you think about it, that’s exactly how a case statement can be visualized.
IDA has even identified it for us (Olly can do this too, although not with such neat colorful blocks) by putting comments like “switch jump” all around the disassembled code.
Can it be an interpreter’s loop we’re looking for?
Let’s look at the comments and strings we have in this function. Hmm…
You can use Ctrl+Wheel Up/Down to zoom, “1” for 100% zoom and “W” for fit-window zoom.
Well, so far the code doesn’t tell me much about its purpose. One thing that looks interesting for me is a
referenced string that says “opcode %.2x” – btw, let’s take a note where it is used and why we don’t see it in
the console output.
I’ve got an idea, I’ll disable all BPs for now and set one in the beginning of this func… no, rather at the
case’s beginning – like IDA says, it’s here:
JMP off_41374A[EAX*4] ; switch jump
A word on case statement mechanics and why case can’t accept strings as keys in compiled languages.
You might think that for a computer a case statement is exactly the same as a series of IF statements –
and thus you might wonder why a compiler says that it can’t take a string as a case variable. I also
thought case was the same as if+if+if+… but for the compiler it can be totally different. When the case labels are close to each other, each label is
actually an index into a “jump table”. Such a table is simply an array of addresses. Since a case
statement accepts integer values, you can use those values as indexes into that array of addresses –
thus, apart from one range check, there’s no need to compare anything at all –
just do JMP caseArray[caseValue] – and you’re done!
In the above code fragment, for example, off_41374A is nothing else than the base address of that
array – and EAX is caseValue, which should be multiplied by 4 because every address on a 32-bit
CPU is a DWord – in other words, 4 bytes in size.
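The same trick can be imitated in Python with a list of functions instead of a list of addresses. Treat this as a sketch of the idea only: the handler names and what they return are invented, and only the base address 0x41374A comes from the listing above.

```python
def op_zero():
    return "handler for case 0"

def op_one():
    return "handler for case 1"

def op_two():
    return "handler for case 2"

def default():
    return "default branch"

# The "jump table": position in the list = case value.
JUMP_TABLE = [op_zero, op_one, op_two]

def dispatch(case_value):
    # One range check, then a direct indexed jump; no chain of comparisons.
    if case_value < 0 or case_value >= len(JUMP_TABLE):
        return default()
    return JUMP_TABLE[case_value]()

def table_entry_address(base, case_value):
    """Address of the table slot that JMP base[EAX*4] reads its target from."""
    return base + case_value * 4

print(dispatch(1))                             # handler for case 1
print(dispatch(9))                             # default branch
print(hex(table_entry_address(0x41374A, 1)))   # 0x41374e
```

A string can't be used as `case_value` here for the same reason as in a compiled language: there is no cheap way to turn it into a small array index.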
So we put a BP there and hit F9. Let’s look at EAX register (you can either look at it in General registers
window or wait until IDA shows you a hint when you put a mouse over EAX).
Once IDA has shown you that memory hint you can use Wheel Up/Down to show more/less lines.
You can also hover over many other places – like values in General registers tab.
Also, be sure not to hover on off_41374A[EAX*4] statement – only on the separate string
“EAX” (somewhere in the code above or below) – otherwise IDA will calculate address
(off_41374A + EAX * 4) and show hint for that location rather than the value of EAX.
IDA shows that EAX = 0x01. This doesn’t tell us anything yet. Let’s roll back a little and review
how EAX gets this value. What we see is (try to guess what it does before reading on):
CODE:00413735   CALL    sub_413AA0
CODE:0041373A   XOR     EAX, EAX
CODE:0041373C   MOV     AL, BL
CODE:0041373E   CMP     EAX, 6              ; switch 7 cases
CODE:00413741   JA      short loc_4137B5    ; default
Firstly, we clear EAX by XOR, then we set its lowest byte (AL) to the value of BL (as you remember, BL is the
low byte of the 16-bit register BX, which is itself the lower half of the 32-bit register EBX). We need to track
how BL is set.
But before we go further let’s rename the function we’re in now – I named it “$InterpretInstruction”.
…And now let’s take a break and breathe in deeply a few times.
What do we need to do? I mean, what’s our goal with this program? We want to change script lines and as
we do so it crashes. We need to find why.
Come to think of it, I almost rushed into finding which function actually puts that instruction byte into
place – but that’s not necessary to know. What we really need to know is what is done with that code, not
which one of those zillion disassembled functions picks it from the bytecode stream.
Interestingly, we might eventually find that function on our way but doing so now is not required and would be
a waste of time. That’s IMO.
So we’ll skip to the next part. Let’s assume that we have found the interpreter’s case statement. We can
verify it in a few ways but since it’s a tutorial I’ll show you how Olly can help us with its great BP logging
functions.
Let’s turn the page of our enlightenment to the new level now…
And then Olly the Mighty stood up and said: “I am the king of this mountain!”.
I assume that you already have OllyDbg. I am using v1.1, although there’s v2.0 already but it still doesn't
have all the features v1.1 has so I'm waiting.
Let’s open it up and load ScenarioRunner.exe in it.
Since we’ve already analyzed quite a lot of code with IDA this task will be a piece of cake for us. Copy the
address that IDA shows in the disassembly listing to the left of the JMP instruction (case statement start), for
me it’s 00413743. Hit Ctrl+G in Olly and paste it there.
We see something similar. Even more, since Olly doesn’t show the code as a graph we even see the case
table right under the JMP instruction – remember the sidenote on case and IF statements?
CMP     EAX, 6                              ; Switch (cases 0..6)
JA      SHORT Scenario.004137B5             ; <= jump for default case
JMP     DWORD PTR DS:[EAX*4+41374A]
DD      Scenario.00413766                   ; Switch table used at 00413743
DD      Scenario.00413787
DD      Scenario.00413790
DD      Scenario.00413799
DD      Scenario.004137A3
DD      Scenario.004137AC
DD      Scenario.004137C6
LEA     EDX, DWORD PTR SS:[EBP-10]          ; Case 0 of switch 0041373E
“DD” in asm means something like “Data DWord” – accompanied with DB (data-byte) and DW (2 bytes, data-word).
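If you want to double-check how the CPU walks this table, here is a small Python sketch that replays the XOR / MOV / CMP / JA / JMP sequence. It uses only the addresses from the listings in this tutorial; the function name switch_target is my own label, and the table is rebuilt by hand rather than read from the real process.

```python
import struct

TABLE_BASE = 0x0041374A      # address of the first DD entry
DEFAULT_TARGET = 0x004137B5  # where JA goes for values above 6

# The seven DD entries, packed as little-endian DWords the way they
# sit in memory right after the JMP instruction.
TABLE_BYTES = struct.pack(
    "<7I",
    0x00413766, 0x00413787, 0x00413790, 0x00413799,
    0x004137A3, 0x004137AC, 0x004137C6,
)

def switch_target(ebx):
    # XOR EAX, EAX / MOV AL, BL: EAX becomes the low byte of EBX.
    eax = ebx & 0xFF
    # CMP EAX, 6 / JA: anything above 6 goes to the default branch.
    if eax > 6:
        return DEFAULT_TARGET
    # JMP DWORD PTR [EAX*4 + 41374A]: read the DWord in slot number EAX.
    (target,) = struct.unpack_from("<I", TABLE_BYTES, eax * 4)
    return target
```

Note that the table ends at 0041374A + 7 * 4 = 00413766, which is exactly where the code for case 0 begins.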
Now we’re going to set a breakpoint that will log the, presumably, instruction code that this function is
passed. As we’ve already determined, it’s stored in EAX.
We select the line with JMP and press Shift+F4 (or right-click | Breakpoints | Conditional log). Olly has got
complex condition support which is described in detail in its help. For this reason I’ll only show the basics –
the interface is pretty much intuitive anyway (unlike IDA's lol).
There are 3 radio groups with 3 choices: when to Pause the program, when to Log expression result and
when to Log function arguments. Each one can have a value of Never, On condition or Always. By combining
these settings we can create very flexible breakpoints.
For example, we can pause a debugged process when a condition (entered in the Condition field) is satisfied.
At the same time we can log the value of some expression into the Olly’s log – if Explanation is entered then
log line will look like “Explanation = <value>”, otherwise it’ll be just “<value>”.
In our case we only need to log a value of an expression we
enter. So we need the setup shown on the picture on the left.
(It’s again scaled down so use Office’s Reset image properties
button.)
I hope you have already moved Olly’s windows in places
where they make most sense for you. Now you’ll need to see its
log (the first [L] button from the left on the toolbar).
Let the program run freely now (by default when a program is run from the debugger, Olly will pause it at
the entry point). The log will have lines similar to these:
00413743   COND: Instruction code = 00000006
00413743   COND: Instruction code = 00000000
00413743   COND: Instruction code = 00000001
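Once the log grows beyond a few lines it is handy to pull the values out automatically. A minimal sketch, assuming you have copied the log lines out of Olly as plain text; the function name parse_cond_log is hypothetical and the sample holds just the three lines shown above.

```python
import re

# Matches "COND: <explanation> = <hex value>" anywhere in a line.
LOG_LINE = re.compile(r"COND:\s*(?P<label>.+?)\s*=\s*(?P<value>[0-9A-Fa-f]+)\s*$")

def parse_cond_log(text):
    """Return the logged values as integers, in the order they appear."""
    values = []
    for line in text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            values.append(int(match.group("value"), 16))
    return values

sample = """\
00413743 COND: Instruction code = 00000006
00413743 COND: Instruction code = 00000000
00413743 COND: Instruction code = 00000001
"""
opcodes = parse_cond_log(sample)
```

Lines without a COND entry are simply skipped, so the rest of the log can stay in the pasted text.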
Then the program asks us its thoughtful question and crashes (it probably thinks we’re amateurs – ha!).
Let’s get back to our not-forgotten 010 editor, which has runme.dat opened. Let’s review what we
have here:
0000h: 06 00 0D 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E ....Do not cry..
0010h: 2E 01 29 00 59 6F 75 20 73 65 65 20 61 20 63 6C ..).You see a cl
0020h: 6F 75 64 2E 20 57 68 61 74 20 64 6F 20 79 6F 75 oud. What do you
0030h: 20 74 68 69 6E 6B 20 61 62 6F 75 74 3F 02 21 00  think about?.!.
0040h: 4E 6F 20 6A 6F 6B 65 2C 20 49 20 74 68 69 6E 6B No joke, I think
I have highlighted the codes whose purpose is unknown to us: 06 at 0000h, 00 at 0001h, 01 at 0011h and 02
at 003Dh. Everything else is just strings and their lengths. And, actually… don’t these codes remind you of
something? I think they do! Compare them with the log messages we got in Olly – direct hit!
Well, even though we still don't know what that last 02 byte is about, this shouldn’t prevent us from
celebrating our victory, should it?
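The "strings and their lengths" part can be checked with a few lines of Python. This is a sketch under an assumption read off the dump: each string is preceded by a 16-bit little-endian length. The helper name read_lstring is mine, and HEAD holds only the first 14h bytes shown above.

```python
import struct

# Bytes 0000h-0013h, copied from the hex dump.
HEAD = bytes.fromhex(
    "06 00 0D 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E "
    "2E 01 29 00"
)

def read_lstring(data, pos):
    """Read a 16-bit little-endian length at pos, then that many bytes.

    Returns the decoded text and the offset of the byte right after it.
    """
    (length,) = struct.unpack_from("<H", data, pos)
    start = pos + 2
    end = start + length
    return data[start:end].decode("ascii"), end

# The length word 0D 00 sits at offset 2, right after the 00 byte.
text, next_pos = read_lstring(HEAD, 2)
```

Here text comes out as the first script line and next_pos lands on 0011h, the byte holding 01, so the next code really does follow the string immediately.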
Note that while a real game script engine will have a similar-looking instruction execute function,
its case statement will be hundreds of times larger than those 7 blocks in our function. Most of the time
(unless you’re writing a total decompiler) you don’t have to analyze more than a dozen of them
though.
This means we’ve found the function that executes script instructions. If you try, you can even guess what
the 06 opcode does and why it points past the case statement (to the end of it) – the answer might look
strange to you but don’t worry, strange is what we’re working on :)
Think about it while I carry on speaking about ways to identify what causes the crash.
We have two options now. The first is going right into the decompilation of the script engine, understanding
what each of its 7 instructions does. This is the most appropriate way because all functions called from within
our case seem to be very small (very small) and it'd save us some effort because brute force doesn't involve much
creativity but routine instead :) I would normally choose this option for such a tiny case if this weren't a demo
project.
In real life, we often go bottom-up – that’s why the action is called Reverse Software Engineering after all –
so we’ll choose the second route. Specifically, we'll try to trap the function causing problems. I can say in
advance that it won't be hard in this project, although it'd be harder in real-life engines with a lot of code. Still,
it's easier than decompiling an entire real script engine because all its functions will have even more code – a
simply amazing amount of it, I'd say.
How do we trap things? Simple – we use hardBPs. Well, easy to say and easy to put but it’s not that easy to
trap – naturally, when you put a breakpoint on one location and the program copies that memory block into
another area (you can trust me that this happens almost always, even in our demo project) you’ll need to put a BP
on that area too. You need to take measures to avoid getting lost in all these copy operations… This is when our
brain comes into play with its supernatural power.
Now, enough lyrics! Let’s do the job. You can close Olly already, we’re getting back to IDA. Let’s bring back
our breakpoint on ReadFile. Let’s set a BP inside the buffer which it reads our runme.dat into… right,
now disable it (only by using the context menu – too bad IDA doesn’t have any shortcuts for it like Olly, where
you can disable a BP by pressing Space in the Breakpoints window).
Now we wait until the program asks the question but don’t choose anything when it does. Recall: when we
answer “2” it exits correctly while if we choose “1” it crashes. Here goes the question… okay, now we can
enable our breakpoint to see what the app is going to do when we choose “1”. Gotcha, something’s been
triggered. It’s REPE MOVSD again, we must be lucky on them today :)
Mm, let’s slide down a bit using F8… code, code, code… aah. Aah! What’s that? It says: "read str of
len %d". Interesting! Hm, let’s slide down… What? We’re already in $InterpretInstruction? So
this part seems ok. What does the console say? Aha, it output a message that we’ll need to take the
questionnaire again. Boring! We can change the flow of its thoughts but later, later…
$InterpretInstruction has returned successfully, nothing got broken so the error seems not to be in
this instruction. That’s not surprising, actually, because why should it break on a simple message output? We
probably broke some jump instruction or something more complicated than just that.
I have an idea. What if we set a BP on that case statement in $InterpretInstruction? At least we
should be able to see which instructions the program performs and which one breaks it. Let’s do this.
In case you’ve slid too far you can restart the debug process and reach this step again – you'll now do this
much faster :)
On my side, this function receives the following opcodes: 05, 00, 04 – I found this out by simply having
a BP on the case statement and quickly pressing F9. One thing worth our attention is that after it receives the 04
opcode IDA says the program has raised an exception. That’s our target.
Whatever you answer to IDA’s question on how to handle the exception the program will break so we’ll need
again to stop on 04 opcode – but this time we stop and think. Alright, this didn’t help so we press F9 – I
assume you still have that hardBP set inside the bytecode buffer. We’ve caught something.
A bit of tracing… uh, something strange, it looked to me like the program’s normal instruction execution
flow got somehow transferred to another place in one instant. Well, we don’t really care as long as it works… Uh?
Come on, how could we end up in the default case if we started from the 4th? And just a few instructions later
IDA will tell us about an exception. Interesting, so the program actually doesn’t run sequentially!
This only happens when an exceptional situation has been met. Exceptions change normal
execution flow so weird jumps and transitions might seem to be taking place but in a nutshell it’s
described by just one word: “unwinding”. This means that after an exception has been raised all
functions will be exited immediately, one after another, eventually reaching the entry point (although
this rarely happens as most compilers set up their own exception handling routine that takes over for
completely uncaught exceptions) – unless some of those functions handle the exception. In the latter case
the execution will continue normally from the place it was caught in. If an exception reaches the end,
“unwinding” of the entry point will effectively result in “unwinding” the whole program :) –
“This program has encountered an error and has to be closed. Would you like to send a report?”
sounds familiar, right? That’s why you should always handle exceptions, like I did in this demo
app. Don’t forget about try..catch blocks (and try..except in Delphi)!
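Here is the same behaviour in miniature, as a Python sketch. It is only an analogy to what the debugger shows: the function names are invented, and Python's try..except stands in for whatever handler the real program installs.

```python
def execute_opcode(opcode):
    # Pretend the engine only knows opcodes 0..6.
    if opcode > 6:
        raise ValueError(f"unknown opcode {opcode:#04x}")
    return f"executed {opcode:#04x}"

def interpret(stream):
    trace = []
    for opcode in stream:
        # If this call raises, the rest of interpret() is skipped entirely.
        trace.append(execute_opcode(opcode))
    return trace

def run(stream):
    try:
        return interpret(stream)
    except ValueError as exc:
        # Unwinding stops here: both inner functions were exited at once.
        return [f"caught: {exc}"]
```

With a bad opcode in the stream, run() returns only the "caught" line: everything interpret() had collected so far is gone, which is the same sudden transfer of control you see while tracing.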
Well, all this is sure fascinating to know but it doesn’t help us with the script crashing.
Let’s think, think, think… Alright, this again didn’t help us much so let’s look at the bytecode, maybe it
will reveal some secrets to us?
0100h: 61 67 61 69 6E 21 20 3A 50 0A 04 12 00 00 00 00 again! :P.......
0110h: 2F 00 57 6F 77 2C 20 74 68 61 74 27 73 20 73 75 /.Wow, that's su
I think we’ve got something here. Not taking strings into account we have some strange numbers right after
that message after which the program breaks. We can say that 04 is that “Crashing Opcode” ™ of ours, 00 in
the end is probably another opcode (maybe it’s an instruction code for outputting a message? Look, it has a
string length going right after it, and the message itself. That’s something you should investigate in your
spare time). In between we have 12 00 00 00 – I wonder if it’s just a coincidence that it looks like a
DWord? We can check this out.
We go back to IDA. Open the bytecode buffer and scroll down to where you can see the end of the “You'll need
to guess it again! :P” message. We see the same bytes that we saw in our hex editor above. Great, we’re on the
right path.
Let’s put a hardBP on each of the following bytes! Okay, maybe not on each, we just need those 12 00 00
00. And, in fact, we can do this with just one hardBP since a BP size can vary from 1 to 4 bytes – IDA even
supports more but I personally wouldn’t advise setting a breakpoint with a size different from the machine
standard sizes of 1, 2 and 4 bytes. And OllyDbg also supports just those sizes.
If you can’t set exactly a 4-byte hardBP don’t worry, set a smaller one. It depends on the address you’re
trying to trap – a 4-byte hardware breakpoint has to start at an address divisible by 4 (and a 2-byte
one at an even address), because 32-bit CPUs operate on DWords aligned on a 32-bit boundary.
A useful fact to know is that compilers are also aware of CPUs operating faster on DWord-aligned
instructions and data so they very often align memory by inserting instructions that do nothing (like
MOV EAX, EAX) – so you’ll be able to set hardBPs in more places than you probably thought :)
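The alignment rule is easy to turn into a small helper. A minimal Python sketch: the function names are hypothetical, and the address in the usage note is an invented example, not the real location of the bytecode buffer.

```python
def largest_hw_bp_size(address):
    """Largest of 4, 2, 1 bytes that is allowed to start at this address."""
    for size in (4, 2, 1):
        if address % size == 0:
            return size

def hw_bps_covering(address, length):
    """Split [address, address + length) into aligned 1/2/4-byte breakpoints."""
    parts = []
    end = address + length
    while address < end:
        size = largest_hw_bp_size(address)
        # Don't reach past the range; halving keeps the alignment valid.
        while size > end - address:
            size //= 2
        parts.append((address, size))
        address += size
    return parts
```

For a DWord that happens to start at an odd address such as 0x100B this yields three breakpoints (1 + 2 + 1 bytes), which still fits into the four hardware breakpoint slots an x86 CPU provides.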
… here I took a nap writing the tutorial …
Alright, we’re back on track and we’re kicking hard. Or we’re not yet? Doesn’t matter! You see, what we
need to do is this: set a hardBP in the bytecode buffer immediately after the ReadFile call, track all copies of
that buffer and in the end find a place where the code does something with that buffer. Easy, huh?
The key thing here is to set a hardBP on that DWord (12 00 00 00) so the first instruction that doesn’t
simply copy a buffer will be the instruction doing something with this number.
After a few traps I found that instruction so I now challenge you to do the same thing :)
What will definitely help you is to read the Intel docs on instructions you don’t know – like REPE MOVSD.
Don’t worry if you can’t find exactly REPE MOVSD, the entries for the REP prefix and for MOVS will do
since the only thing you need to know is the purpose of that instruction and, if it copies something, you will
also need to know the source and the destination.
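As a rough model of what such a copy does (per the Intel manual, MOVSD with a repeat prefix copies ECX DWords from [ESI] to [EDI]), here is a Python sketch. It assumes the direction flag is clear, so addresses grow; follow_copy is a hypothetical helper that tells you where to move your hardBP after the copy.

```python
def rep_movsd(memory, esi, edi, ecx):
    """Copy ecx DWords from memory[esi:] to memory[edi:], counting up."""
    while ecx:
        memory[edi:edi + 4] = memory[esi:esi + 4]
        esi += 4
        edi += 4
        ecx -= 1
    # The registers as they look after the instruction finishes.
    return esi, edi, ecx

def follow_copy(watch, src, dst, dword_count):
    """Address the watched source byte ends up at, or None if not copied."""
    if src <= watch < src + 4 * dword_count:
        return dst + (watch - src)
    return None
```

So when a breakpoint fires on a copy, note ESI, EDI and ECX before the instruction runs, compute the new address and put the next hardBP there.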
I have faith in you! Go-go-go!
Can you see the fire burning inside us?
And the last step to the win… to the top of Olympus!
I assume you did find that place where that number gets used in some obscure operations. It looks like this:
CODE:0041390F loc_41390F:
CODE:0041390F   XOR     ECX, ECX
CODE:00413911   MOV     EDX, [ESP+0Ch+var_C] ; <= here it got trapped.
CODE:00413914   MOV     EAX, [EBX+14h]
CODE:00413917   MOV     EBX, [EAX]
CODE:00413919   CALL    DWORD PTR [EBX+14h]
CODE:0041391C   JMP     SHORT loc_41392f
Now it’s time for some last tips before you go fly on your own. Let’s take a look at EDX – clearly, it’s 0x12,
exactly the number we were trapping for. I got curious and peeked at memory at [ESP+0Ch+var_C] – well,
nothing like our bytecode from runme.dat, that 0x12 is all alone there.
But it was copied from our bytecode, which we can confirm because we put hardware breakpoints on places
originating from runme.dat – so we don’t care why it ended up here separated from its pals. And now it got
copied to EDX and something is called – we need to see what that function does.
Luckily for us, the code seems well-written so the functions are small and this one looks particularly small –
just step inside (F7) to confirm this. Actually, it only runs 6 instructions including RET – how handy!
Here’s how it looks in our case:
sub_412278 proc near
    SUB     CX, 1
    JB      SHORT loc_412287
    ...
loc_412287:
    MOV     [EAX+0Ch], EDX
    JMP     SHORT loc_412297
    ...
loc_412297:
    MOV     EAX, [EAX+0Ch]
    RETN
sub_412278 endp
At this point I can congratulate you on finding a very important place, even more important than the
interpreter's loop or case statement – you’ve just found the program’s variable that holds its current position
in the script bytecode. This is a core thing, something that lets us see what exactly the program does, how it
interprets the bytecode, how it reads those codes – ultimately, it leads us to places where it uses those
values it’s just read (because to access something inside the bytecode you need to use a pointer like the one
you've just found - and that access operation must be quite close to the code that actually uses the accessed
value so you just put a hardBP on this variable and see what's gotten into the net).
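In higher-level terms, the six instructions we stepped through in sub_412278 behave roughly like the Python sketch below. This is my reading of the one path the listing shows, not the author's source: the names Stream, seek and origin are assumptions, and the branches hidden behind "..." are deliberately left unimplemented.

```python
class Stream:
    def __init__(self):
        self.position = 0  # stands in for the DWord at [EAX+0Ch]

def seek(stream, offset, origin):
    # SUB CX, 1 / JB: the jump is taken only when CX was 0 (borrow).
    if origin == 0:
        # MOV [EAX+0Ch], EDX: store the new position as-is.
        stream.position = offset
    else:
        # These branches are elided ("...") in the listing.
        raise NotImplementedError("branch not shown in the listing")
    # MOV EAX, [EAX+0Ch] / RETN: the new position is returned in EAX.
    return stream.position
```

The caller clears ECX and loads 0x12 into EDX before the CALL, which corresponds to seek(stream, 0x12, 0) here.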
This is great, we now can make last preparations before creating a decent localization tool.
You can open OllyDbg and put Conditional log BP on the location of $ScenarioPos (as I called that
location – simply set the pointer on memory (not the instruction) at [EAX+0Ch] and click “N”) and see how
program accesses the bytecode.
The last position of scenario after which the exception occurs will be 0x12. Well, we’ve just confirmed that
12 00 00 00 we’ve seen earlier is some kind of offset.
In fact, we could have guessed that it was an offset and we could have guessed of what kind (relative from
the current position or absolute from the beginning) right when we saw it in the bytecode without going through
all this hunting. Yes, we could, and logic is a great thing if you can use it right – and I’d only encourage you to
use it when you do some real hacking. However, in a real VN engine with a ton of code you won’t always be able
to distinguish what now looks like 12 00 00 00 from other code. One of the reasons might be that large
games’ scenario files are also large, counting in megabytes, and if it refers to something it will look like C0 C6
2D 00 – only one trailing zero which doesn’t tell us much because the preceding bytes can represent anything
and our guess that this number means 3 000 000 will be as good as guessing that it’s two Words – 0xC6C0 and
0x002D – or it’s simply a string, maybe encoded with some weird algorithm, and that 0x00 is a terminating
null-character, etc.
In other words, the possibilities are endless.
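You can watch the same four bytes turn into each of these readings with Python's struct module. A small sketch that uses only the bytes from the example above; the variable names are mine.

```python
import struct

raw = bytes.fromhex("C0 C6 2D 00")

# One little-endian DWord.
as_dword = struct.unpack("<I", raw)[0]

# Two little-endian Words.
as_words = struct.unpack("<HH", raw)

# A zero-terminated string: everything up to the first 00 byte.
as_cstring = raw.split(b"\x00", 1)[0]
```

All three are valid decodings of the same bytes; only the code that consumes them tells you which one the engine means.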
I must admit that in this guide I've cheated a little when searching for $ScenarioPos – I made a
supposition that 12 00 00 00 was indeed an offset of some kind and I was placing BPs on it. This is more
like brute force attacks - they are natural for humans since they go top-down but unnatural to RCE where
everything goes bottom-up. In a real-world game engine you’ll do exactly that – going from bottom to the top.
I’ll demonstrate how we could achieve the same goal (of finding $ScenarioPos) in real life.
Let’s roll back a little and see what happens when our ScenarioRunner.exe raises an exception.
-6. It happens in the default case branch of $InterpretInstruction.
-5. It’s because the case statement doesn’t have a branch for its key value (that’s why it’s called the default
branch).
-4. The key value is held in EAX; when we put a BP at the beginning of the default branch (on MOV DL, 1,
for example) we see that EAX = 0x29.
-3. Since we’re sure that this is an interpretation case statement we think that this engine only has opcodes
00-06 and the “opcode” that it gets this time – 0x29 – is not an instruction at all.
-2. Thus we guess that it’s read that opcode from a wrong position, which is the same as saying that we’ve
shifted something when we altered a string in the scenario.
-1. We continue to roll back up the code from that default case branch until we see code that MOVes
something from a large buffer based on some position pointer. Once we find this code we can say that
we’ve burnt enough midnight oil and then go to sleep because we have just found both the bytecode buffer
and the current position variables.
Keep in mind that you can always watch how the program works with its original scenario, which runs correctly, and
with the scenario that you’ve modified and that makes it crash. Compare what the program does – here Olly’s logging
BPs will help you a lot. Log scenario buffer positions of both scenarios, log instruction codes and compare them. What
position was the last when the exception was raised? What is in the original scenario at that point and what is in your
scenario file?
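Comparing two logs by eye gets tedious quickly, so here is a minimal Python sketch that finds the first place where two runs differ. The function name is hypothetical and the sample entries in the usage note are invented (position, opcode) pairs for illustration, not values logged from the real program.

```python
def first_divergence(original, modified):
    """Return (index, original_entry, modified_entry) at the first mismatch.

    An entry is None when that run has no log line at the index,
    i.e. one run stopped earlier than the other. Returns None when
    both logs are identical.
    """
    longest = max(len(original), len(modified))
    for index in range(longest):
        a = original[index] if index < len(original) else None
        b = modified[index] if index < len(modified) else None
        if a != b:
            return index, a, b
    return None
```

For example, first_divergence([(0x00, 0x06), (0x12, 0x01)], [(0x00, 0x06), (0x12, 0x29)]) points at index 1: same position, different byte read as the opcode.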
For example, regarding our demo scripter I found the code that’s described in step “-1” within just a few
minutes. Try to do the same thing. It’ll be easier than in large VN games because functions in this exe are very
small.
It looks like this:
MOV     EDI, [EBX+0Ch]  ; [EBX+0Ch] – location of $ScenarioPos in memory.
TEST    EDI, EDI        ; Is it < 0?
JL      SHORT loc_4     ; Yes – jump, it should always be >= 0.
MOV     EAX, [EBX+4]    ; great, it’s the pointer to our bytecode buffer!
ADD     EAX, EDI        ; base address is summed up with current position.
...
The last statement is like a joint point where everything comes together.
Okay, so what now? The next step is the last one – and we’ve already done everything necessary so it will
only require a little mental activity.
This… this is Olympus! But wait… I can see far grounds in the fog!
Let’s review what we have got. We have a sequence of bytes: 04 12 00 00 00 – we know that 04 is the
opcode (probably jump opcode) and 12 00 00 00 is an offset of some kind. We’ve also found the function
that sets the current scenario buffer position to 0x12. Does this tell us something?
It sure does. This instruction:
MOV [EAX+0Ch], EDX ; $ScenarioPos, 0x12
…is almost shouting into our ears: “Set $ScenarioPos to the value of just read DWord!”.
And applying some deduction we can say with confidence that 04 12 00 00 00 does nothing else than
setting $ScenarioPos to 0x12. And 0x12 is a position in the bytecode stream from which the execution
will continue. In other words, it's an absolute jump.
This is very handy to know because we’ve modified a string right before that position!
And the program now crashes because it doesn’t understand why it’s getting some weird 0x29 “opcode”.
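To see the crash in numbers, here is a small Python sketch that uses only bytes from the two hex dumps shown earlier (the first 14h bytes of runme.dat and the five bytes of the jump). The variable names are mine.

```python
import struct

# Bytes 0000h-0013h of runme.dat, copied from the first hex dump.
HEAD = bytes.fromhex(
    "06 00 0D 00 44 6F 20 6E 6F 74 20 63 72 79 2E 2E "
    "2E 01 29 00"
)
# The jump instruction found at 010Ah.
JUMP = bytes.fromhex("04 12 00 00 00")

opcode = JUMP[0]
# The operand is a little-endian DWord right after the opcode byte.
(target,) = struct.unpack_from("<I", JUMP, 1)

fetched = HEAD[target]       # what the interpreter reads as the next opcode
neighbour = HEAD[target - 1] # the byte one position earlier
```

Here fetched is 0x29, the low byte of the next string's length, while the real opcode 01 sits one byte earlier.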
And now we fix it.
Quickly, just one second to open a hex editor, Ctrl+G to 010Bh, then change 12 to 11 – run the program…
Ahahaha, it now works without a hitch, totally disregarding the modification of the first line! Victory!
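The same fix can be done programmatically, which is the first brick of a translation tool. A minimal Python sketch: shift_jump_target is a hypothetical helper, and excerpt holds just the 16 bytes at 0100h from the dump, so the operand at file offset 010Bh is index 0Bh inside it.

```python
import struct

def shift_jump_target(data, operand_offset, delta):
    """Add delta to the little-endian DWord at operand_offset, in place."""
    (old,) = struct.unpack_from("<I", data, operand_offset)
    new = old + delta
    struct.pack_into("<I", data, operand_offset, new)
    return old, new

# The 16 bytes at 0100h, copied from the hex dump.
excerpt = bytearray.fromhex("61 67 61 69 6E 21 20 3A 50 0A 04 12 00 00 00 00")

# The first line became one byte shorter, so the target moves back by one.
result = shift_jump_target(excerpt, 0x0B, -1)
```

A real tool would have to find every such operand and apply the right delta to each, but the patching step itself is no more than this.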
What can I say now? If you’ve reached this point and managed to make the program work – you’re a real
man :)
There are still a lot of places to apply your newly acquired knowledge and skills to. ScenarioRunner has a
few Easter eggs hidden inside. I strongly suggest you either keep on messing with it or, if you feel like you
need some more exciting stuff, get a visual novel and try to reverse its script formats.
You can pick something from the list of my tools at http://vn.i-forge.net/tools/ so that if you run into a problem
during reverse-engineering you can ask me for a hint (either via comments on my blog or by e-mail).
However, I’d advise dealing with ScenarioRunner first, while there are still things to discover.
Here’s what you can do on your own on ScenarioRunner.exe:

Remember that suspiciously-looking string “opcode %.2x”? Actually, if you look over the entries under
Names subview of IDA (click on Name column to sort the list) you’ll find more interesting strings (they
should have also been listed in Strings subview but IDA either doesn’t catch Delphi strings or it doesn’t
catch Unicode strings).
For example, aReadStrOfLenD – what’s this for? Why do none of them show up anywhere? You can
investigate this – maybe it will help you in tracking down crashes in ScenarioRunner or give some
insight into its inner workings, who knows?

You can take up the challenge of understanding every opcode function. We’ve already got the notion of the 04
opcode and I’m sure you’ve understood a few others but there are at least 3 of them left – and also that
strange case 06 branch – what is it for?

Try to translate every other string in runme.dat – there are 6 lines in total, including 2 question strings.
We’ve already translated the first line. There’s a surprise awaiting you when you modify one of the
remaining lines – you’ll need to dig into disassembly or put deduction in action to solve it :)

Just for fun, try quickly finding out how to make ScenarioRunner.exe execute an arbitrary scenario
file, with any name other than runme.dat.

Can you make your own custom scenario (or modify the existing one) so that it does the things you want it to
do? I made this runme.dat from scratch in a hex editor – can you do something similar? It will likely
require knowledge of most opcodes - and it’ll be a lot of fun :)

And an ultimate challenge – try writing a complete decompiler of ScenarioRunner’s scenario files and
even a compiler if you want. Or at least try to make a translation tool like those on my page – which will
extract texts from a script into a text file and update texts inside the script based on lines in a text file.
I think that’s it. On this point I would give you a certificate of Licensed Hackership ™ but certificates don’t
mean much, what matters are ourselves – that’s probably why the best hackers don’t wear suits, lol.
“…Thus they parted and went on in different directions, leaving Olympus behind…”
Lol, hope I haven’t overloaded this tutor with too many novel elements :)
Well, thanks for reading!
Proger_XP
_ 11.5.10 _