Report - Danny Matthews

This PDF has been downloaded from http://www.dmatthews.co.uk.
Please feel free to make use of any of the content of this document (including source code) as part
of any publication or piece of software provided that it is to be freely distributed.
I'd appreciate a reference (with a link back to the site) and an E-Mail if you do make use of
anything.
Cheers!
Danny
UNIVERSITY OF SUSSEX
Emulating the Nintendo
Entertainment System
Danny Matthews
42762
Computer Science BSc
Department of Informatics
Supervised by Dr Des Watson
2008
This report is submitted as part requirement for the degree of Computer Science BSc at
the University of Sussex. It is the product of my own labour except where indicated in the
text. The report may be freely copied and distributed provided the source is
acknowledged.
This project uses small amounts of code from the NESCafe (http://www.nescafeweb.com/)
and FC64 emulators (http://www.osflash.org/fc64).
Both are released under the GNU GENERAL PUBLIC LICENSE Version 2, permitting
use of “pieces of it (the software) in new free programs”.
Code use is documented within the report and source code.
_________________________
Danny Matthews
Summary
The purpose of this project was to research, design, implement and test a Nintendo
Entertainment System (or NES) emulator with software development facilities.
The system was developed with the goal of satisfying the needs of both game players
who wish to experience their collections in a portable and convenient form and those who
develop for the system. It is intended that the software be portable and feature-rich to
compete with the systems currently available.
The NES is a games console released worldwide in the mid-1980s by hardware and
software giant Nintendo. The system was supported up until the end of 2007 in Japan and
is heralded by many as the most successful games console released to date.
Emulators allow software to be executed on a system for which they were not written.
This is achieved by writing a piece of software which precisely simulates the original
machine.
The NES has four main units which must be emulated. These are the Central Processing
Unit (CPU), the Picture Processing Unit (PPU), the Audio Processing Unit (APU) and the
Input units (otherwise known as "controllers"). These four units, when combined, provide
a fully functional NES.
The software executed on the NES was distributed in the form of cartridges containing
program code and graphical data. Obviously, an emulator cannot execute software in this
form (without specialist equipment). Instead, a hardware tool is used to "dump" the
cartridge to hard disk (the resulting file is hereafter referred to as a ROM). In order for
an emulator to correctly execute ROMs, it must have access to certain information not
included in the "dump". The most common representation for this information, and the one
chosen for this project, is the iNES format, consisting of a sixteen-byte header placed at
the beginning of the ROM. The emulator parses this to provide the correct behaviour.
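The header parsing described above can be sketched as follows. This is a minimal illustration rather than the code used in the project: the class and field names are hypothetical, and the byte layout (a four-byte "NES" signature followed by 0x1A, bank counts in bytes 4 and 5, flags in bytes 6 and 7) is taken from the commonly documented iNES layout rather than from this report.

```java
/** Minimal iNES header reader (illustrative only; all names are hypothetical). */
final class INesHeader {
    static final int HEADER_SIZE = 16;

    final int prgRomBanks;           // byte 4: number of 16 KB program ROM banks
    final int chrRomBanks;           // byte 5: number of 8 KB graphics (CHR) ROM banks
    final int mapper;                // MMC number, split across bytes 6 and 7
    final boolean verticalMirroring; // byte 6, bit 0
    final boolean batteryBacked;     // byte 6, bit 1
    final boolean hasTrainer;        // byte 6, bit 2

    private INesHeader(int prg, int chr, int flags6, int flags7) {
        prgRomBanks = prg;
        chrRomBanks = chr;
        // Low nibble of the mapper number is the high nibble of byte 6,
        // high nibble of the mapper number is the high nibble of byte 7.
        mapper = (flags7 & 0xF0) | (flags6 >> 4);
        verticalMirroring = (flags6 & 0x01) != 0;
        batteryBacked = (flags6 & 0x02) != 0;
        hasTrainer = (flags6 & 0x04) != 0;
    }

    /** Parses the first sixteen bytes of a ROM image. */
    static INesHeader parse(byte[] rom) {
        if (rom.length < HEADER_SIZE
                || rom[0] != 'N' || rom[1] != 'E' || rom[2] != 'S' || rom[3] != 0x1A) {
            throw new IllegalArgumentException("Not an iNES file");
        }
        // Mask with 0xFF because Java bytes are signed.
        return new INesHeader(rom[4] & 0xFF, rom[5] & 0xFF, rom[6] & 0xFF, rom[7] & 0xFF);
    }
}
```

For example, a header whose flag bytes are 0x11 and 0x40 describes a cartridge using mapper 65 with vertical mirroring.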
The major development tool provided is a debugger with four major functions: a system
status viewer, providing an at-a-glance status summary, a breakpoint system
incorporating code stepping, a disassembler which transforms assembled ROM files into
a form close to their un-assembled form and a memory viewer to examine machine
memory during execution.
The other two tools provided are a name table and pattern table viewer. Essentially, these
tables are the means used by the NES to store and display graphics. These tools are
intended to simplify development by providing useful information to the user, for
example the memory locations of graphics and colour information.
The project was largely a success, meeting almost all objectives. The only requirement
lacking is a scanline based background rendering routine in the PPU (discussed within the
report). I feel that, given additional development time, this limitation could be overcome.
A number of extensions were added to the software including additional debugging
capabilities and a viewer for sprites (images capable of independent movement).
Table of Contents
1. Introduction
1.1. Aims and Objectives
1.2. Introduction to the problem area
1.2.1. Java Applications
1.2.2. Emulation
1.2.3. Debuggers
1.3. Report Overview
2. Requirements Analysis
2.1. Professional Considerations
2.1.1. Code of Conduct
2.1.2. Code of Practice
2.2. Needs of Intended Users
2.2.1. Current Solutions
2.3. Proposed Solution
2.3.1. Primary Objectives
2.3.2. Extensions
2.4. Requirements Specification
2.4.1. Overall
2.4.2. CPU
2.4.3. PPU
2.4.4. APU
2.4.5. Input/Output
2.4.6. Development Tools
2.4.7. Further Specifications
3. Design
3.1. Overall System Design
3.2. CPU Design
3.3. PPU Design
3.4. APU Design
3.5. Input/Output Design
3.6. Development Design
3.7. GUI Design
4. Implementation
4.1. Common Tactical Policies
4.2. Software Re-use
4.3. Threading
4.4. Outputting Sound
4.5. Testing
5. Conclusion
5.1. Finished Software Screenshots
5.2. Success of the finished product
5.3. Future Extensions
5.4. Alternative Methodologies
6. Works Cited
7. Appendices
Appendix A: Cartridge Specification
Appendix B: File Format Specification
Appendix C: Regional Differences Specification
Appendix D: Input Devices and Other Peripherals
Appendix E: Background Rendering in Detail
Appendix F: Low Level Designs
Appendix G: Test Specification
Appendix H: Project Logs
Appendix I: GNU GENERAL PUBLIC LICENSE Version 2
Appendix J: Source Code
A Nintendo Entertainment System (NES) Emulator
By Danny Matthews
Supervised by Des Watson
1. Introduction
1.1. Aims and Objectives
1.1.1. Purpose
The purpose of this project is twofold.
Firstly, it aims to provide an end user with a convenient and enjoyable means of playing
the games of yesteryear on their computer.
The second major objective is to provide development tools for those who write software
for the Nintendo Entertainment System. This functionality will include a debugger and
disassembler amongst other things.
A further, more academic reason is that the completed software could allow for the
analysis of program execution. For example, how fast the CPU is required to run for
games to be played successfully and the amount of time spent performing graphical
operations.
The system will provide many additional benefits to software developers (of which there
are still surprisingly many for the NES) other than those discussed above. These benefits
include:
• A substantially more accessible test platform
• Greatly accelerated development due to this increased accessibility
• A reduction in costs (e.g. limiting the need for NES flash carts and other peripherals)
1.1.2. Motivations
My motivations for undertaking the writing of a NES emulator are several. First and
foremost, they stem from a long-standing interest in the emulation of all types of
computer system. The NES is of particular interest to me for reasons of nostalgia, having
spent many an hour using the system in my youth.
Additionally, an interest in the implementation and logic of hardware drew me to the
area, as did the opportunity to research, design and implement a piece of software
significantly larger than anything previously required.
I decided to incorporate development tools into the system with the aim of making the
lives of Nintendo developers easier. This grew out of a respect for their work, being
constrained in the way they are in how they design and code their software (having to
work on such limited hardware).
Finally, the great possibilities for extending the functionality of the software appeal
(with several of these possible extensions discussed below).
1.1.3. Relevance
This project relates to several areas of the degree programme.
The main areas of relevance are those modules which focus on systems architecture
(Computer Systems Architecture), programming (Introduction to Programming, Further
Programming) and data structures (Data Structures, parts of Introduction to Operating
Systems).
Also, because it involves the writing of a substantial piece of software, Algorithmics will
be useful, as will Software Engineering and Software Design.
The software will require a GUI, meaning that Human Computer Interaction will prove
helpful in making said GUI as intuitive and user friendly as possible.
Computability and Complexity has proven helpful in that it has helped me to understand
why it is that emulation is theoretically possible.
Finally, Professional Issues in Computing will help ensure understanding of what is
expected of me in terms of conduct and practice and Technical Communication Skills
will undoubtedly help with the presentation aspect of the project.
1.1.4. Intention
I intend to write an emulator for the Nintendo Entertainment System (NES) games
console.
I intend to implement this system using the Java programming language. One of the
major benefits of using Java is its platform independent nature. The system will be
deployable on all the major platforms (PC, Macintosh, and Linux) as well as any other
system for which a Java Virtual Machine has been written (see "Introduction to the
problem area" for an explanation of the above).
The NES consists of three main units (all present on the motherboard). These are:
• CPU – a modified 6502 processor,
• PPU – a chip providing the graphical capabilities of the system,
• APU – a third chip providing all audio processing capabilities.
These three units, working in parallel, perform all the main tasks required of the machine.
It is additionally required to provide input support so that the software can be controlled
by the user.
Finally, I intend to implement a set of development tools to aid NES developers. This
will include:
• A debugger and disassembler,
• A Pattern table viewer,
• A Name table viewer.
Additional functionality can be added if time permits. This functionality is discussed in
the Objectives section of the report.
1.1.5. Resources Required For Development
• A computer with internet access
• A copy of the Java Runtime Environment (JRE) v1.5
• ROM files of games using varying MMCs and other features
All required resources are available.
1.2. Introduction to the problem area
1.2.1. Java Applications
Java is an object-oriented programming language developed to be multi-platform without
the need for separate compilation. It achieves this in two steps.
Firstly, Java programs are compiled into byte code as opposed to target architecture
compilation. This byte code acts as an intermediate language.
Secondly, virtual machines (VMs) are written for each target platform which interpret
this intermediate language into statements executable by the target machine.
It should be noted that it is possible to compile Java applications into native code via
third party compilation solutions. Also, VMs no longer simply interpret the intermediate
language, instead using the "Just in Time" paradigm, caching the code as it is compiled
into native form, thus allowing for much faster execution speeds.
1.2.2. Emulation
The British Computer Society (BCS) define emulation as follows:
"Emulation is a very precise form of simulation which should mimic exactly the
behaviour of the circumstances that it is simulating. An emulator may enable one
type of computer to operate as if it were a different type of computer." (1) (2)
It is possible to emulate any computer system on another. This is due to the following
(assuming the Church-Turing Thesis to be correct):
1. All current computer systems are Turing Complete. That is, it is possible to
completely emulate the universal Turing Machine (Lu) on all current systems.
Lu is a special multi-taped Turing machine which is capable of emulating the
behaviour of any other Turing Machine.
2. All current systems can be emulated by Lu.
Thus, in principle, it is possible to emulate any system on another (3).
Emulators are available for a vast number of systems. For example, there exist emulators
for the Amiga (4), Commodore 64 (5) and even pocket calculators (6).
1.2.3. Debuggers
Debuggers are software tools designed to help in the elimination of programming errors
from code.
Debuggers usually offer a large amount of functionality. This includes:
• "Breakpoints" – where program execution is paused whenever particular
conditions are met (for example, when a particular line in the code is reached or a
variable reaches a certain value)
• Memory viewers – allowing the viewing of memory during execution.
1.3. Report Overview
The remainder of this report is split into several major sections.
Firstly, the project is discussed in regard to the conditions which must be met by a
member of the British Computer Society (BCS) before, during and after development.
This is followed by a discussion of the tactical policies to be followed during
development and an analysis of project requirements. This includes a discussion of the
intended user group for the software and an evaluation of currently available solutions.
Project objectives are also detailed.
A requirements analysis follows, detailing all that must be implemented to achieve the
project objectives. This is divided into seven major sections, each describing one of the
major system objectives.
The design follows. Again, the sections largely mirror the system objectives,
documenting the design to be followed when implementing the system. The first section
of the design provides an overview of the system design in terms of package and class
structure.
Following the design is a discussion of the implementation. This discusses issues which
occurred during implementation as well as design decisions which could not be discussed
in an implementation free way within the design.
System testing follows. Most of the testing followed a unit testing methodology although
performance and portability testing were also discussed and tested.
The final sections critically analyse the project in terms of its design and implementation,
concluding with a discussion of the project's success, including ideas for future extension
and alternative design and implementation methodologies.
2. Requirements Analysis
2.1. Professional Considerations
In undertaking any project it is important to ensure that the software and development
process is ethically sound. To this end, the British Computer Society (BCS) provide
guidelines to be followed. These come in the form of two published documents. The
project will be discussed in relation to these documents below:
2.1.1. Code of Conduct
Arguably the most valid considerations are to "have regard to the legitimate rights of
third parties" and to have "knowledge and understanding of relevant legislation,
regulations and standards" (points 2 and 3).
Although I have not received permission from Nintendo to write an emulator for their
system, the law supports the development of emulators provided that no proprietary
information was used in the development of the system (all information has been
obtained from the internet, where it was ascertained by reverse engineering
the NES). Several court cases support this (7) (8).
Additionally, it should be stressed that at no time will pirated ROM files be used with the
system. Firstly, legal copies of many titles are already owned, therefore providing
sufficient testing material. Secondly, the emulator will not be released in any form (apart
from in the form of source code in the final report).
In short, this project is perfectly legal as:
1. The system is being built entirely without the use of proprietary documentation,
2. All ROM files used will either be dumped from a legally owned copy of the
software or will be non-proprietary.
No information will be "misrepresented or withheld" in this report (point 9).
All clauses deemed relevant from the BCS Code of Practice will be observed (point 16,
see below).
2.1.2. Code of Practice
I have attempted to keep my workload such that I will be able to successfully meet all
goals within the time available and have ensured that I have access to all necessary
resources ("Manage your workload efficiently").
An acceptance strategy will be devised that "will fairly demonstrate that the requirements of
the project have been met" ("When defining a new project").
When the project is completed, I will be sure to "honestly summarise the mistakes made,
good fortune encountered and lessons learned" and to "recommend changes that will be
of benefit to later projects" ("When Closing a project").
It will be ensured that the analysis and design specifications provide as accurate a
representation of the system as possible and every attempt will be made to ensure that all
programming practices used help to provide easy to maintain and efficient code.
It is intended that the documentation will be written to a level of detail "that others could
take over the work if need be" and will be kept up to date ("When writing technical
documentation").
2.2. Needs of Intended Users
It is intended that there will be two distinct groups of users for the finished software.
Firstly, there will be those who use the emulator simply to allow them to play their NES
games in a convenient and portable form.
To this end, the software should be written so as to provide maximum compatibility with
the game images.
Ideally, the software should support as many of the system's peripherals as possible and
should include additional features (possibly not present in the original system) which
would enhance the user's play (e.g. save states).
It should also ideally be sufficiently efficient to allow the games to be played at close to
original speed.
The second intended user group are those developing software for the NES.
At a minimum, the system should provide a number of development tools to aid those
developing for the system (detailed in the "Proposed Solution").
Ideally, a great number of additional tools will be provided to allow efficient
development (such as hex editors, trace loggers etc).
2.2.1. Current Solutions
2.2.1.1. Java Solutions
The quality of NES emulators written in Java varies greatly.
The Jamicom emulator (9) lacks sound support and executes so quickly that it is
rendered unusable. It also lacks GUI support, requiring command line operation.
Several emulators are incomplete (such as Animosity (10), currently consisting of little
more than a 6502 emulator).
However, one very complete emulator known as NESCafe (11) stands out above all
others in quality. It is available in both applet and application form, fully emulating the
graphics and sound of the NES. It additionally provides a rudimentary debugger.
2.2.1.2. Non-Java Solutions
The quality of non-Java emulators tends to be much better. These include Nestopia (12), a
Windows emulator of high quality (although it provides no software development aids)
and FCEUXD (13), which, again, is an excellent emulator. In contrast to Nestopia,
FCEUXD provides excellent development tools which include a debugger, a hex editor
and a RAM filter.
2.2.2. Ideal Solution
Whilst FCEUXD would seem to be the ideal choice, providing both a robust emulator
and many excellent development aids, it suffers from its lack of portability (with only a
Windows version released to date).
NESCafe does not suffer from this but lacks any kind of substantial development tool.
An ideal solution would be one that incorporates the strengths of both these programs.
An outline of the functionality of such a system can be seen below (Proposed Solution).
2.3. Proposed Solution
2.3.1. Primary Objectives
O 1. Emulating a fully functional 2A03 (NTSC) processor
O 1.1. Providing a means of pausing and continuing CPU execution and resetting
the CPU.
O 2. Emulating a fully functional NTSC Picture Processing Unit (PPU)
O 2.1. Allowing alterable Tint and Hue of system colours.
O 3. Emulating a fully functional NTSC Audio Processing Unit (APU)
O 3.1. Providing the ability for the user to modify the output volume of the APU
(including sound muting).
O 4. Emulating the standard NES control pad (allowing user input), using the
computer keyboard to control this interaction.
O 4.1. Providing a means of changing the keyboard keys which correspond to
control pad inputs.
O 5. Implementing a development environment to aid NES programmers.
O 5.1. Debugger
O 5.1.1. Present system information in an easily readable way.
O 5.1.2. Present a disassembled version of the loaded ROM.
O 5.1.3. Provide a breakpoint system to stop CPU execution on certain
register value conditions being met (with step based execution
capabilities).
O 5.1.4. Memory Viewer
O 5.2. Name Table Viewer
O 5.2.1. Display the name tables graphically (in 2 bit and 4 bit colour
modes).
O 5.2.2. Display attribute table values.
O 5.2.3. Display the scroll line values graphically.
O 5.2.4. Allow alterable name table refresh rates.
O 5.3. Pattern Table Viewer
O 5.3.1. Display the pattern tables graphically.
O 5.3.2. Display the palette values graphically.
O 5.3.3. Display details of individual pattern table tiles and palette entries.
O 5.3.4. Display the numeric contents of the name tables in a grid format.
This will provide the contents of the name table memory to the user
in a much more accessible and readable format.
O 6. Implementing a Graphical User Interface to facilitate easy use of the system.
2.3.2. Extensions
E 1. Implementing additional Memory Management Controllers (MMC).
E 2. Implementing a tool for easy analysis of program execution. For example,
providing statistics of instruction and addressing mode use.
E 3. Extending debugging facilities.
A variety of functionality could be provided here. Possibilities include:
E 3.1. A trace logger, which would allow for the user to view all (or a set
number) of the instructions as they are processed. Register values could
also be included. This could be a useful means of viewing the execution of
the program.
E 3.2. Palette altering facilities. One palette of colours exists for sprites and one
for the background. Allowing the altering of these palettes would give the
user a very efficient way of changing their games' colour schemes
significantly.
E 3.3. Cartridge details could be made available to the user. For example,
the type of MMC used.
E 3.4. Implementing visual system state, providing an at-a-glance view of the
system state (e.g. illustrating whether the CPU is executing via a symbol).
E 3.5. Implementing code highlighting. The user can highlight specified strings
within the disassembled code. For example, they may wish to highlight all
JMP instructions.
E 4. Implementing additional peripherals. A significant number of alternative input
devices exist for the NES, as do a number of other peripherals designed to
enhance the use of the system.
These include the "Zapper" (a gun shaped peripheral) and the Game Genie
(which allows the user to alter the execution of software).
Additional input devices and peripherals are detailed in Appendix D.
E 5. Implementing undocumented opcodes. There are many opcodes available for
use on the 6502 which were not documented by the manufacturers. These
include opcodes which implement several defined instructions at once. (14)
E 6. Implementing Saved Game functionality. Some games allow you to save your
progress through the game so that it is possible to return to that point at a later
date. This could be implemented allowing saving only in those games which
allowed it or it could be implemented allowing the user to save the game state at
any time and in any game they wish.
E 7. Implementing a Sprite Viewer. This would allow the user to view all the sprites
present in their project at once so as to ensure that they will be rendered as
intended.
E 8. Implementing a Frames Per Second counter so as to allow the user a means of
measuring the performance of their software.
2.4. Requirements Specification
2.4.1. Overall
2.4.1.1. Development Model
For the most part, the waterfall model will be employed in development. This will work
well because of the nature of the project and my lack of familiarity with the system to be
emulated. That is, it shall be necessary to carry out substantial research into the subject
area before being able to design the system (requirements analysis) and it would be
advisable to produce a design to guide through the implementation process.
Figure 1: The Waterfall Model (modified from (15))
However, I feel that it would be impractical to follow this model too closely due to the
large number of corner cases and detail involved in the system. Thus, once the main
analysis and design processes have taken place, a more ad-hoc method of analysis and
design will be used to fill in the inevitable holes in the main analysis and design
documentation.
Both formal and informal testing will be carried out throughout development.
2.4.2. CPU
2.4.2.1. General
The CPU consists of a modified 6502 processor distributed by Ricoh. This modified CPU
was known as the 2A03 in NTSC systems (1.79 MHz) and the 2A07 in PAL (1.66 MHz).
The 2A0X series was identical to the 6502 in every way except that it lacked a Binary
Coded Decimal mode and included 22 memory mapped registers to assist in a multitude
of tasks including sound generation and joypad reading. (16)
Figure 2: The 6502 CPU (17)
2.4.2.2. Memory Mapping
The 2A0X interacts with external devices via memory mapping. All memory locations
above 401F (hexadecimal) that are written to or read from by the CPU actually perform
operations on some external device. For example, reading from location C000 will read
the first byte from the upper bank of cartridge ROM (discussed later).
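As a rough illustration of memory mapping, the sketch below classifies a CPU address by the device that would service it. Only the 401F boundary and the C000 example come from the text above; the region boundaries below 401F follow the commonly documented NES memory map and should be treated as assumptions, as should all class and method names.

```java
/** Illustrative address decoder for the CPU memory map (names are hypothetical). */
final class CpuMemoryMap {
    enum Region { INTERNAL_RAM, PPU_REGISTERS, APU_AND_IO, CARTRIDGE }

    /** Returns the region that services a 16-bit CPU address. */
    static Region regionOf(int address) {
        address &= 0xFFFF;                       // the address bus is 16 bits wide
        if (address < 0x2000) return Region.INTERNAL_RAM;   // assumed boundary
        if (address < 0x4000) return Region.PPU_REGISTERS;  // assumed boundary
        if (address <= 0x401F) return Region.APU_AND_IO;    // memory mapped registers
        return Region.CARTRIDGE;                 // everything above 401F is external
    }
}
```

Under these assumptions a read from C000 is routed to the cartridge, matching the example given above. A real emulator would dispatch the read or write to the corresponding device object rather than return an enum.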
2.4.2.3. Memory Mirroring
A significant amount of memory mirroring also occurs. Memory mirroring is a technique
where multiple addresses map to the same location in memory. For example, if locations
6-10 mirrored locations 1-5, a write to 7 would be the same as writing to 2 (the same applies
to the reading of data).
Mirroring is used to cut down on the hardware required by the system. Using mirroring,
you need only decode part of the address in question. If all the available memory
locations were required, this would result in a problem. However, the CPU is able to
reference 64KB of memory with the NES needing only a fraction of this. Thus, mirroring
can be used without issue.
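Partial address decoding of this kind can be sketched with a simple bit mask. The example below assumes 2 KB of physical RAM that is visible repeatedly across a larger address range, which is the arrangement usually documented for the NES internal RAM; the size and the class name are assumptions for illustration only.

```java
/** Illustrative mirrored RAM: only the low 11 address bits are decoded. */
final class MirroredRam {
    private final byte[] ram = new byte[0x0800]; // 2 KB of physical memory (assumed size)

    /** Discards the upper address bits, so every mirror lands on the same cell. */
    private static int decode(int address) {
        return address & 0x07FF;
    }

    void write(int address, int value) {
        ram[decode(address)] = (byte) value;
    }

    int read(int address) {
        return ram[decode(address)] & 0xFF; // mask because Java bytes are signed
    }
}
```

A value written to address 0002 can then be read back from 0802, 1002 or 1802, since all four addresses share the same low eleven bits.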
2.4.2.4. Opcodes
The 2A0X has 56 official instructions, many of which support multiple addressing
modes. This results in the 6502 supporting a total of 151 official opcodes. (2) There also
exist 105 undocumented opcodes, many of which perform multiple official instructions at
once. (14)
The number of CPU cycles that each opcode takes varies and is largely dependent on the
addressing mode used.
2.4.2.5. Addressing Modes (17)
Addressing modes specify a format for how the addresses given to the opcode will be
interpreted. Thirteen such modes exist in the 2A0X.
The individual addressing modes will not be discussed here. However, a discussion of
these can be found in (2).
2.4.2.6. Memory (17)
The maximum amount of memory which can be present in the 2A0X is 64K. This is
because the processor has a 16 bit address bus and thus cannot reference any memory
beyond this boundary.
2.4.2.7. Page System (17)
Memory locations are divided into pages of 256 bytes. Whenever a page boundary is
crossed, it often introduces an extra cycle delay to the execution of the instruction.
Two pages of memory are used for specific purposes. These are:
Page 0
It is possible to read from and write to page 0 memory faster than any other
memory, as only one byte is needed to specify the address (the speed advantage
comes from needing fewer cycles for execution; the memory itself is no faster).
Thus, it is usually used as "working memory", storing data which will be
accessed regularly.
Page 1
Page 1 consists of the stack. This is predominantly used for storing data that
should be preserved whilst sub-procedure calls take place.
All other pages have no specific purpose.
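The page arithmetic described above can be sketched as follows. The helper names are hypothetical, and the sketch only detects a page crossing; it does not model which instructions actually incur the extra cycle.

```python
PAGE_SIZE = 256


def page_of(address):
    """Return the page number (the high byte) of a 16-bit address."""
    return (address & 0xFFFF) // PAGE_SIZE


def crosses_page(base, offset):
    """True if indexing 'base' by 'offset' lands in a different page."""
    return page_of(base) != page_of(base + offset)


def is_zero_page(address):
    return page_of(address) == 0


def is_stack_page(address):
    return page_of(address) == 1
```

For example, indexing 10F0 by 20 crosses from page 10 into page 11, whereas indexing 1000 by 20 stays within page 10.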
2.4.2.8. ALU
The Arithmetic-Logic Unit is used to perform the arithmetic operations of the processor
(addition, subtraction etc) as well as the logical operations (ANDs, ORs etc).
It takes two inputs and returns a single output.
Figure 3: The operation of the ALU unit (modified from (18))
2.4.2.9. Registers (17)
The processor has seven internal registers (all 8 bits wide). These are as follows:
1. X (index register)
This register can be used as a high-speed counter but is predominantly used as an
index into a block of memory. It can also be used to get and set the stack pointer.
2. Y (index register)
This register is used in the same way as the X register (the only difference being
that Y cannot be used to set or get the stack pointer).
3. S (Stack Pointer)
The stack pointer is used to point to the current top of the stack. This register is
actually nine bits wide with the ninth bit always set to "1". Thus, the stack pointer
is only able to access memory locations in the range 256-511 (page 1 of memory).
The stack pointer begins at location 511 and the stack grows "backwards" in
memory.
4. PCH (Program Counter High)
This register stores the higher 8 bits of the Program Counter (where the Program
Counter is used to point to the next address in memory to be executed by the
CPU).
5. PCL (Program Counter Low)
This register stores the lower 8 bits of the Program Counter.
6. A (Accumulator)
The Accumulator is tied to the left input of the ALU, with the right input typically
being a memory location. When the result is computed, it is then deposited into
the Accumulator. This is the reason why it is said that the 2A0X uses an
Accumulator based design (with the Accumulator being used both as an input and
as the output) and also explains its name (as it accumulates results).
7. P (Status Register)
Figure 4: The 6502's status register (17)
The status register consists of seven one-bit flags (and an unused bit always set to
one). They are defined as in the image above.
Note that the Decimal flag serves no purpose in the 2A0X series (although the
bit is still included in the status register and is available for programmer use).
2.4.2.10. Subroutines
When subroutine calls are made, there must exist a means of returning to the previous
point of execution once the subroutine has come to an end. This is achieved by pushing
the value of the PC (minus one) onto the stack before changing the PC value to the first
location of the subroutine.
When the subroutine comes to an end, the PC value previously pushed onto the stack is
popped, incremented by one and becomes the new value of the PC. This way execution
continues from the instruction following the subroutine call.
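The call and return mechanism can be sketched as below. This is an illustration only: the class and method names are hypothetical, the call instruction is assumed to be three bytes long with the PC pointing at its opcode, and the initial stack pointer follows the value given in the text (location 511).

```python
class MiniCpu:
    """A toy model of the PC, the stack pointer and page 1 of memory."""

    def __init__(self):
        self.memory = bytearray(0x10000)
        self.pc = 0x0000
        self.sp = 0xFF  # low 8 bits of the stack pointer; page 1 is implied

    def push(self, value):
        self.memory[0x0100 | self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF  # the stack grows downwards

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF
        return self.memory[0x0100 | self.sp]

    def jsr(self, target):
        # Push the address of the next instruction minus one, high byte first.
        return_address = (self.pc + 2) & 0xFFFF
        self.push(return_address >> 8)
        self.push(return_address & 0xFF)
        self.pc = target

    def rts(self):
        low = self.pull()
        high = self.pull()
        # The stored value is one less than the address to resume at.
        self.pc = (((high << 8) | low) + 1) & 0xFFFF
```

Calling jsr(0x9000) with the PC at 8000 stores 8002 on the stack; the matching rts() resumes at 8003.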
2.4.2.11. Interrupts (17)
There are three kinds of interrupt in the 2A0X:
1. IRQs (Interrupt Requests)
IRQs are the standard type of interrupt on the processor. This type of interrupt
may be masked (that is, ignored by the processor) depending on the state of the
interrupt flag in the status register.
When an IRQ is activated, the interrupt flag is set (so that all subsequent
IRQs are ignored) and the PC and status register are pushed onto the stack (but not
before setting the Break flag of the status register to "0").
Execution then branches to the address held in memory locations FFFE and FFFF.
These two locations contain the IRQ interrupt vector.
When the interrupt is completed, the PC and status register are popped off the stack
and execution continues as before.
2. NMIs (Non-Maskable Interrupts)
NMIs are of higher priority than IRQs and cannot be masked by the
interrupt flag.
NMIs are otherwise identical to IRQs except that they instead branch to the
address held at locations FFFA and FFFB.
3. BRK (Break)
The equivalent of a software interrupt, the BRK command behaves identically to
an IRQ except that the Break flag of the status register is set to "1" as opposed to
"0" for an interrupt (to allow the processor to differentiate between interrupt
types).
In the case that multiple interrupts occur at once, the following rules are used:
1. The type of the interrupt is checked. If it is not an NMI, it is ignored.
2. If the interrupt was an NMI, the PC and status register of the currently executing
interrupt are pushed onto the stack and the PC value for the new interrupt is loaded
into the PC.
Other than these special rules, the normal rules of operation for interrupts are followed.
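The interrupt rules above can be sketched as follows. The flag bit positions are those of the standard 6502 status register layout (an assumption here, since the figure is not reproduced in the text), all names are hypothetical, and timing and the exact return address pushed by BRK are glossed over.

```python
FLAG_INTERRUPT = 0x04  # assumed bit position of the interrupt flag
FLAG_BREAK = 0x10      # assumed bit position of the Break flag
FLAG_UNUSED = 0x20     # the unused bit, always set
VECTORS = {"nmi": 0xFFFA, "reset": 0xFFFC, "irq": 0xFFFE, "brk": 0xFFFE}


class Cpu:
    def __init__(self):
        self.memory = bytearray(0x10000)
        self.pc = 0
        self.sp = 0xFF
        self.status = FLAG_UNUSED

    def push(self, value):
        self.memory[0x0100 | self.sp] = value & 0xFF
        self.sp = (self.sp - 1) & 0xFF

    def interrupt(self, kind):
        """Service an 'irq', 'nmi' or 'brk'. Returns False if it was masked."""
        if kind == "irq" and self.status & FLAG_INTERRUPT:
            return False  # IRQs are ignored while the interrupt flag is set
        self.push(self.pc >> 8)
        self.push(self.pc & 0xFF)
        pushed = self.status | FLAG_UNUSED
        if kind == "brk":
            pushed |= FLAG_BREAK   # Break flag is 1 for BRK
        else:
            pushed &= ~FLAG_BREAK  # Break flag is 0 for IRQ and NMI
        self.push(pushed)
        self.status |= FLAG_INTERRUPT
        vector = VECTORS[kind]
        self.pc = self.memory[vector] | (self.memory[vector + 1] << 8)
        return True
```

An NMI is serviced even when the interrupt flag is set, which is what lets it pre-empt an IRQ handler that is already running.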
2.4.2.12. Reset
When the 2A0X is reset:
The interrupt flag in the status register is set
The PC is set to the contents of memory locations FFFC and FFFD
The registers should be reset to their default values. (19)
Figure 5: The three vectors used by the 6502 (17)
2.4.3. PPU (2) (20)
2.4.3.1. General
The PPU generates a 256x240 pixel colour image (cropped to 256x224 on NTSC
televisions).
The PPU is external to the CPU and has its own memory. This memory cannot be directly
accessed by the CPU, with access being restricted to manipulation of eight memory
mapped registers.
2.4.3.2. The Rendering Process
Graphics are rendered onto the screen line by line (scanline), beginning at the top left of
the display. Each line has a height of 1 pixel and a width of 256.
There are two types of delay which occur during the rendering process. These are HBlank
and VBlank and are discussed below:
2.4.3.2.1. HBlank
HBlank is the name given to the time that it takes to travel from the right hand side of the
display back to the left after the current line has been rendered. This adds a delay to the
rendering of scanlines.
2.4.3.2.2. VBlank
VBlank refers to the time that it takes to travel from the bottom right hand side of the
display back to the top left. VBlank occurs once per frame.
The VBlank period is instrumental in allowing interactive programs to be written for the
NES. It is during the VBlank period that most operations are performed (e.g. checking for
user input, updating the scrolling etc).
2.4.3.3. Sprites
Sprites consist of those images which are capable of being moved independently around
the screen without causing damage to the background. Examples include "characters"
(some of which are controllable by the user).
The hardware allows for sprites to be either 8x8 or 8x16 pixels in size and the
maximum number of sprites is 64.
8x16 sprites are dealt with slightly differently than 8x8 sprites (discussed later).
All sprites have a priority which determines in what order they are drawn by the
PPU, with higher priority sprites being drawn last to ensure that they are drawn
"on top".
The hardware allows for all sprites to be flipped horizontally and vertically as
well as being able to specify whether the sprite should appear in front of the
background or behind it.
The NES allows a maximum of 8 sprites per scanline.
Most "characters" are constructed from multiple sprites.
Figure 6: Mario (above) is made up of 8 sprites (2)
2.4.3.4. Memory
The PPU has access to 16KB of memory. There also exists a separate 256 byte area of
memory dedicated to storing sprite data known as Sprite RAM. Each area of memory has
a different purpose. These will now be explained.
Figure 7: Summary of PPU Memory (2)
2.4.3.4.1. Colour Palettes
The NES has three colour palettes. These are the master, sprite and image palettes.
The master palette contains all the colours that the NES is capable of displaying (52, with
space for 64).
The sprite and image palettes each contain 16 colours. The sprite palette contains those
colours that can be used for sprites and the image palette contains those colours usable
for the background tiles.
Both the sprite and image palettes are subdivided into groups of four colours (thus, each
palette consists of four sub-palettes of four colours each).
A further complication comes from the fact that the first element of each sub-palette is
always transparent. This is achieved by mirroring the first palette entry every four bytes.
If it is the case that for a pixel location on the screen, neither a sprite nor the background
points to a non-transparent colour, there is a "fallback" background colour stored at
location 3F00.
The reasons why the palettes are constructed this way will hopefully become clear soon.
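A rough sketch of how a pixel's palette entry might be chosen is given below. It assumes the image palette starts at 3F00 and the sprite palette at 3F10 (the usual NES layout, not stated in this section), ignores sprite priority, and uses hypothetical names throughout.

```python
IMAGE_PALETTE_BASE = 0x3F00   # assumed start of the image palette
SPRITE_PALETTE_BASE = 0x3F10  # assumed start of the sprite palette
FALLBACK_COLOUR = 0x3F00      # the "fallback" background colour


def palette_address(background, sprite=None):
    """Each argument is a (sub_palette, colour_bits) pair.

    colour_bits == 0 means the pixel is transparent for that layer.
    """
    if sprite is not None and sprite[1] != 0:
        return SPRITE_PALETTE_BASE + sprite[0] * 4 + sprite[1]
    if background[1] != 0:
        return IMAGE_PALETTE_BASE + background[0] * 4 + background[1]
    return FALLBACK_COLOUR
```

Each sub-palette occupies four consecutive entries, so the sub-palette number is multiplied by four before the two colour bits are added.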
Figure 8: All 8 sub-palettes begin with a transparency element (adapted from (20))
2.4.3.4.2. Pattern Tables
The PPU has two pattern tables.
The pattern tables are used to store the 8x8 pixel tiles which make up the graphics used
for both the sprites and the background.
NES graphics use 4-bit colour. However, only the two least significant bits of this colour
are stored in the pattern tables. The attribute tables contain the remaining two bits for
each pixel.
Each tile consists of two planes. The first contains the least significant bit of the colour
and the second contains the most significant bit. These values are combined to make up
the 2-bit colour (see below for a visual example).
Figure 9: An illustration of how tiles are made up (the remaining two bits are in the attribute tables) (2)
The two bits stored in the pattern tables actually determine which of the four colours
available in the currently active sub-palette should be used to represent the colour of the
pixel. For example, if the graphic was a sprite, the two bits were '11' and the currently
active sub-palette (which sub-palette is active is determined by the attribute table value –
see later) was the third, the pixel colour would be brown (using the above palettes).
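Combining the two planes of a tile can be sketched as follows. The sketch assumes each tile occupies 16 bytes, with the eight bytes of the first plane followed by the eight bytes of the second, and that the leftmost pixel is held in bit 7; these layout details and the function names are assumptions rather than something taken from this section.

```python
def decode_tile_row(low_plane, high_plane):
    """Combine one byte from each plane into eight 2-bit colour values."""
    pixels = []
    for bit in range(7, -1, -1):  # bit 7 is assumed to be the leftmost pixel
        low = (low_plane >> bit) & 1
        high = (high_plane >> bit) & 1
        pixels.append((high << 1) | low)
    return pixels


def decode_tile(tile):
    """Decode a 16-byte tile into eight rows of eight 2-bit values."""
    return [decode_tile_row(tile[row], tile[row + 8]) for row in range(8)]
```

Each resulting value (0 to 3) selects one of the four colours in the currently active sub-palette.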
2.4.3.4.3. Name Tables/Attribute Tables
Name tables and attribute tables are closely related.
Name tables consist of a 32x30 matrix where each element points to a tile in the pattern
tables. As each tile is 8x8, this results in the name table being 256x240 pixels in size (the
size of the image generated by the PPU).
Attribute tables provide the upper two bits of colour for the tiles in the name tables. Each
byte provides the colour information for a 4x4 block of tiles (32x32 pixels). Every two bits
of each byte represent a quarter of this block (a 2x2 block of tiles, or 16x16 pixels).
These two bits actually represent the sub-palette to be used for the block of 16x16 pixels.
Thus, it is the case that for every block of 16x16 pixels in the name tables, only four
colours are available.
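The attribute lookup for a given tile can be sketched as below. The order of the four 2-bit fields within a byte (top left in bits 0-1, top right in bits 2-3, bottom left in bits 4-5, bottom right in bits 6-7) is the commonly documented one and is an assumption here, as are the function names.

```python
def attribute_bits(attribute_table, tile_x, tile_y):
    """Return the upper two colour bits for the tile at (tile_x, tile_y)."""
    # One byte covers a 4x4 block of tiles; a 32-tile row needs 8 bytes.
    byte_index = (tile_y // 4) * 8 + (tile_x // 4)
    # Pick the 2-bit field for the 2x2 quarter the tile falls in.
    shift = ((tile_y % 4) // 2) * 4 + ((tile_x % 4) // 2) * 2
    return (attribute_table[byte_index] >> shift) & 0b11


def full_colour(attr_bits, pattern_bits):
    """Combine attribute and pattern table bits into the 4-bit colour."""
    return (attr_bits << 2) | pattern_bits
```

All four tiles of a 2x2 quarter share the same field, which is why only four colours are available per 16x16 pixel block.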
Figure 10: An illustration of the colour palettes in action (interacting with the pattern and attribute tables)
2.4.3.4.4. Sprite RAM (SPR-RAM) (21)
Sprite RAM is a separate 256 byte area of memory which has the same purpose as the
attribute tables except that it is used to construct the sprites rather than the background.
The way that the graphical elements obtain the colour information for each of their pixels
is summarised in the below image:
Figure 11: A summary of the interactions between the pattern tables and the Sprite RAM/Attribute Tables
(Pattern table image from (22))
2.4.3.4.5. Name Table Mapping
The NES is capable of handling up to four name tables/attribute tables but only has
enough memory to store two.
However, via the use of mirroring it is possible to use all four. This capability is
important when it comes to scrolling the screen.
The four name tables are arranged in a 2x2 pattern as illustrated below.
Figure 12: A visual representation of the name tables’ layout
In describing the mechanics of name tables, the concepts of physical and logical name
tables will be used.
Physical name tables are the actual memory used to represent the name table whereas
logical name tables are those which are addressable via PPU memory.
Each mirroring technique is described below and then is immediately followed by an
image illustrating said technique in use. The images used are real name tables taken from
Super Mario Brothers 2 (23).
Horizontal mirroring maps 0x2000 and 0x2400 to the first physical name table and
0x2800 and 0x2C00 to the second physical name table.
Figure 13: Horizontal Mirroring
Vertical mirroring maps 0x2000 and 0x2800 to the first physical name table and
0x2400 and 0x2C00 to the second physical name table.
Figure 14: Vertical Mirroring
Single-screen mirroring maps all four logical name tables to the same physical name
table.
Figure 15: Single-Screen Mirroring
Four-screen mirroring allows each logical name table to map to a separate physical name
table. An additional 2KB of memory (on the game cartridge) is required to store the two
extra physical name tables.
Figure 16: Four-Screen Mirroring
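The four mirroring techniques can be summarised as a lookup from logical to physical name table, as in the sketch below. It assumes each name table (with its attribute table) occupies 0x400 bytes starting at 0x2000; the mode names and function name are hypothetical.

```python
NAME_TABLE_SIZE = 0x400  # assumed: name table plus attribute table

# Physical table used by each of the four logical tables
# (0x2000, 0x2400, 0x2800, 0x2C00) under each mirroring technique.
PHYSICAL_TABLE = {
    "horizontal": (0, 0, 1, 1),
    "vertical": (0, 1, 0, 1),
    "single": (0, 0, 0, 0),
    "four": (0, 1, 2, 3),
}


def physical_offset(address, mode):
    """Translate a logical name table address to an offset in physical memory."""
    if not 0x2000 <= address <= 0x2FFF:
        raise ValueError("address is not within the logical name tables")
    logical = (address - 0x2000) // NAME_TABLE_SIZE
    within = (address - 0x2000) % NAME_TABLE_SIZE
    return PHYSICAL_TABLE[mode][logical] * NAME_TABLE_SIZE + within
```

Only the "four" mode yields offsets beyond the first 2KB, which corresponds to the extra memory supplied on the cartridge.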
2.4.3.5. Registers
2.4.3.5.1. Control Register
This register controls the operation of the PPU by allowing specification of various
parameters needed for operation, for example the sprite and background pattern table
addresses and the base name table address.
2.4.3.5.2. Mask Register
This register controls the behaviour of the graphics rendering process. It allows
enabling/disabling of background and sprite rendering, allows rendering in greyscale
mode and allows intensifying of certain colours amongst other things.
2.4.3.5.3. Status Register
Like the CPU and APU, the PPU also has a status register, which is used in much the same
way.
2.4.3.5.4. SPR-RAM Address and data Registers
These registers allow for reading and writing to Sprite RAM. The address to be accessed
is written into the address register and it is then possible to read from or write to this
address using the data register.
2.4.3.5.5. Scroll Register
This register is used for specifying horizontal and vertical scroll offsets, determining in
what direction and how much the screen should scroll.
2.4.3.5.6. Pattern Tables address and data Registers
These work in the same way as the SPR-RAM address and data registers except that they
allow access to pattern table addresses rather than those of Sprite RAM.
2.4.3.6. Direct Memory Access (DMA) (2) (24)
During the execution of a program, large quantities of data will need to be periodically
transferred between CPU memory and Sprite memory. To make this as efficient as
possible, a DMA controller exists.
Using DMA, large amounts of memory can be transferred from one place to
another without the need to go through the processor.
The CPU must be informed before DMA begins and again once all the data has been
transferred. This is to ensure that the DMA controller has exclusive access
to the memory bus whilst copying.
In the case of the NES, access to the DMA controller is achieved by writing to a special
register (4014). The value written is multiplied by 0x100 to give the starting address in
CPU memory, and 256 bytes are then transferred from this address onwards.
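A sketch of the sprite DMA transfer is shown below. It takes the value written to 4014 as the page number of the source (that is, the source address is the written value multiplied by 0x100, as on real hardware) and ignores the CPU cycles consumed by the transfer; the function name is hypothetical.

```python
def sprite_dma(cpu_memory, sprite_ram, page):
    """Copy one 256 byte page of CPU memory into Sprite RAM."""
    start = (page & 0xFF) * 0x100
    sprite_ram[0:256] = cpu_memory[start:start + 256]


cpu_memory = bytearray(0x10000)
cpu_memory[0x0200:0x0300] = bytes(range(256))  # sprite data prepared by the program
sprite_ram = bytearray(256)
sprite_dma(cpu_memory, sprite_ram, 0x02)       # models a write of 02 to 4014
```

A single register write therefore refreshes the data for all 64 sprites at once.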
2.4.3.7. Background Rendering
2.4.3.7.1. Overall
A high level look at rendering a scanline to the display follows:
1. Obtain the pattern table information relevant for the current line, split into two
layers.
2. Convert the information into 2 bit colour form by combining the two layers.
3. Retrieve the attribute table information for the current line and combine with the 2
bit colour form to make the complete 4 bit colour form.
4. Output the line to the display using the relevant palette entries.
2.4.3.7.2. Scrolling
Scrolling is achieved via the use of multiple name tables. When the user's "character"
moves sufficiently close to the edge of a name table, the name table next to the currently
displayed one (which one will vary depending on the mirroring technique chosen) will
begin to be used for screen rendering, producing a composite of the two name tables.
This continues until the first name table has been "scrolled" out of view, resulting in the
rendered image coming entirely from the second name table.
To maintain the ability to scroll, when a name table is not visible during rendering, the
PPU will often be filling the table with the graphics to be viewed when scrolling begins
again.
This process is illustrated below:
Figure 17: A summarisation of the NES scrolling mechanism (adapted from (2))
2.4.3.7.3. Rendering Modes
Rendering modes influence the image rendered to the display.
The possible modes follow:
greyscale
background/sprite clipping
background/sprite rendering turned on/off
Colour intensity (RGB).
Specifying rendering modes is a simple process of writing to certain bits of PPUMASK
which are then used by the PPU to decide exactly how the rendering should take place.
The background rendering process is explained in far greater
detail discussing the registers used along with how they are
interpreted and manipulated by the machine.
2.4.3.8. Sprite Rendering
2.4.3.8.1. Sprite Evaluation (21)
During each scanline (341 clocks), the PPU accesses the Sprite RAM in a particular
pattern. This procedure determines the sprites on the current scanline, making them
available for drawing to the screen.
In more detail:
There are two types of Sprite Memory: main and secondary. The sprites in
secondary after the sprite evaluation are the ones which are to be rendered to the
display.
Each sprite's Y co-ordinate is stored in secondary and evaluated. If the sprite is
present on the current scanline, the remainder of the sprite's attribute data is copied
into secondary memory.
This process continues for all sprites in memory until either:
o Eight sprites have been copied to secondary. At this point, the evaluation
logic becomes very erratic (having eight sprites on a scanline effectively
"breaks" the logic).
o All sprites have been checked.
Only the data used to determine where on the display the sprites should be rendered are
stored in Sprite Memory. The actual sprite data is stored in one of the pattern tables.
(2.9)
A flow chart representing this sprite evaluation logic.
2.4.3.8.2. Sprite Rendering
Each sprite takes up 4 bytes of memory.
Byte 0 – The Y position of the top of the sprite (used to determine whether the sprite is
present on the current scanline – see later).
Byte 1 – The tile index number used for this sprite.
Byte 2 – Sprite Attributes.
Byte 3 – The X position of the left side of the sprite.
Sprite evaluation essentially works by checking whether each sprite is present on the
current scanline, using the sprite's top Y co-ordinate (byte 0) to determine this.
Any sprites which should be rendered to the display are stored in secondary Sprite RAM.
Much of the above logic is unnecessary in achieving an accurate emulation. Also, much
of the more complicated logic occurs once more than 8 sprites have been found which
should be rendered on the scanline. This behaviour will not be emulated for two reasons:
1. Original NES titles did not use more than 8 sprites per scanline because of the
machine's inability to handle any larger number in a deterministic manner. Thus,
original titles will emulate correctly.
2. Removing the 8 sprite restriction will allow current day developers additional
freedom in writing their applications.
Removing this restriction makes the use of secondary Sprite RAM impractical as there is
no way of gauging how much memory needs to be set aside for sprite storage. Rendering
will instead be dealt with by rendering sprites to the display as they are found.
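The simplified evaluation can be sketched as follows. This is an illustration under stated assumptions rather than the emulator's code: the function name is hypothetical, the erratic hardware behaviour beyond eight sprites is not modelled, and the Y co-ordinate is compared directly with the scanline number.

```python
def sprites_on_scanline(sprite_ram, scanline, sprite_height=8, limit=8):
    """Return the indices of the sprites present on the given scanline.

    Pass limit=None to lift the eight sprite restriction.
    """
    found = []
    for index in range(64):
        top = sprite_ram[index * 4]  # byte 0 of each 4 byte sprite entry
        if top <= scanline < top + sprite_height:
            found.append(index)
            if limit is not None and len(found) == limit:
                break
    return found
```

With limit=None each matching sprite can be rendered as soon as it is found, so no fixed amount of secondary Sprite RAM has to be set aside.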
2.4.3.8.2.1. The Rendering Process
For each sprite in secondary Sprite RAM (provided that the sprite is set to be rendered
above the background):
1. Obtain the pattern table information relevant for the current sprite and line, split
into two layers.
2. Convert the information into 2 bit colour form by combining the two layers.
3. Use byte 2 of the current sprite's information to obtain the 2 high colour bits and
combine with the 2 bit colour form to make the complete 4 bit colour form.
4. Output the sprite to the display at the correct X position (using byte 3) using the
relevant palette entries.
2.4.3.8.2.2. Sprite Size
NES sprites can be either 8x8 or 8x16 pixels in size. This is handled by byte 1 of
the sprite data and PPUCTRL.
By manipulating bit 5 of PPUCTRL (counting from bit 0), the programmer can decide which
sprite size to use.
The PPU uses the value of this bit to determine how byte 1 should be interpreted.
8x8: Byte 1 specifies the tile number to be used.
8x16: Byte 1 specifies which tile bank should be used and the tile number of the top of
the sprite. The bottom half of the sprite uses the next tile in the bank.
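The two interpretations of byte 1 can be sketched as below. The sketch assumes 16 bytes per tile, that for 8x16 sprites the lowest bit of byte 1 selects the bank (pattern table at 0000 or 1000) and that the remaining bits give the top tile number; these details and the function name are assumptions not spelled out in this section.

```python
TILE_BYTES = 16  # assumed size of one tile in the pattern tables


def sprite_tile_addresses(tile_byte, tall_sprites, pattern_table=0x0000):
    """Return the pattern table address of each tile making up the sprite."""
    if not tall_sprites:
        # 8x8: byte 1 is simply the tile number within the selected table.
        return [pattern_table + tile_byte * TILE_BYTES]
    # 8x16: bit 0 selects the bank, the other bits give the top tile.
    bank = 0x1000 if tile_byte & 1 else 0x0000
    top = tile_byte & 0xFE
    return [bank + top * TILE_BYTES, bank + (top + 1) * TILE_BYTES]
```

For 8x16 sprites the pattern table selected in PPUCTRL is ignored, since the bank comes from the tile byte itself.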
2.4.4. APU (25) (26)
2.4.4.1. General
The NES APU (Audio Processing Unit) consists of five sound channels. These are as
follows:
Pulse 1 (Pulse wave)
Pulse 2 (Pulse wave)
Triangle (Triangle wave)
Noise (Pseudo-Random)
DMC (Delta Modulation)
With the exception of the DMC channel (which plays samples), all channels play
waveforms.
The unit is made up of many smaller units which interact with one another in order to
achieve the audio capabilities of the NES. This interaction is achieved largely via the use
of clocking; that is, when one unit wishes to interact with another, the first clocks the
second.
2.4.4.1. Component Building Blocks
Most of the components in the APU use the following "building blocks", used in
different ways so as to achieve varying effects. Descriptions of these base units follow:
2.4.4.1.1. Divider
A divider outputs a clock every n clocks, where n is the divider's period. It contains a
counter which is decremented on the arrival of each clock. When this counter reaches
zero, the divider is reloaded with its period and an output clock is generated.
It is possible to force a divider to reload its counter immediately. If this is done, an output
clock is not generated.
Changing the divider's period does not affect the current count.
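A divider following the description above might look like the sketch below. The class and method names are hypothetical, and the period is assumed to be at least 1.

```python
class Divider:
    """Outputs one clock for every 'period' input clocks."""

    def __init__(self, period):
        self.period = period
        self.counter = period

    def clock(self):
        """Consume one input clock. Returns True when an output clock is generated."""
        self.counter -= 1
        if self.counter == 0:
            self.counter = self.period
            return True
        return False

    def reload(self):
        """Force the counter to reload; no output clock is generated."""
        self.counter = self.period

    def set_period(self, period):
        """Change the period without touching the current count."""
        self.period = period
```

A divider with a period of 3 therefore produces an output on its third, sixth and ninth input clocks.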
2.4.4.1.2. Sequencer
Sequencers continuously loop over a sequence of values or events. When clocked, the
next item in the sequence is generated. For example, a sequencer could have a sequence
of events where each item causes specific units in the APU to be clocked.
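A sequencer can be sketched as a list with a wrapping position, as below. The names are hypothetical and the example sequence is illustrative only.

```python
class Sequencer:
    """Continuously loops over a fixed sequence of values or events."""

    def __init__(self, items):
        self.items = list(items)
        self.position = 0

    def clock(self):
        """Return the next item in the sequence, wrapping at the end."""
        item = self.items[self.position]
        self.position = (self.position + 1) % len(self.items)
        return item


wave = Sequencer([0, 1, 1, 0])
first_six = [wave.clock() for _ in range(6)]
```

The items could equally be callables which clock other units, matching the event-based use mentioned above.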
A table showing each channel and their use of these building blocks can be seen below:
2.4.4.2. Channel Units
There are several components used by the various channels in order to alter the output of
that channel (for example, increasing or decreasing the volume). A description of each
follows.
2.4.4.2.1. Building Block Usage
Several of the channel units use the building blocks defined above. A summary table
follows:
Building Block
Unit
Timer
Length Counter
Envelope
Sweep
Linear Counter
Shift Register w/ feedback
Memory Reader
Sample Buffer
Output Unit
Frame Counter
Status Register
Divider
X
Sequencer
X
X
X
X
Table 1 : A summarisation of the use of the "building blocks" in "channel units"
2.4.4.2.2. Timer
A timer is equivalent to a divider.
2.4.4.2.3. Length Counter
The length counter provides automatic duration control for the waveform channels of the
APU.
The channel can be set to continue playing until it is told to stop or it can be set to play
for a certain amount of time and then silence the channel.
2.4.4.2.4. Envelope
The purpose of the Envelope unit is to control the volume of the waveform channels.
The volume can be set to be constant or to decrease gradually over time.
It also provides the capability for sound looping.
2.4.4.2.5. Sweep
The sweep unit allows the frequency of the Pulse channels' output to be increased or
decreased periodically.
This allows for a variety of effects to be achieved.
2.4.4.2.6. Linear Counter
This is simply an additional duration timer which is similar to the Length Counter except
that it is of higher accuracy (being 7 bits as opposed to the Length Counter's 5 bits).
2.4.4.2.7. Shift Register w/ feedback
This unit provides a mechanism for generating pseudo-random bit sequences for use in
the noise channel.
The bit-sequences are generated using the current state of the register, the
Exclusive-ORing of bits and a right shift.
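A sketch of such a shift register is given below. It assumes the commonly documented arrangement for the NES noise channel: a 15 bit register whose feedback is bit 0 Exclusive-ORed with bit 1 (or with bit 6 in the shorter mode), shifted right with the feedback placed in bit 14. These specifics and the function names are assumptions, not taken from this report.

```python
def lfsr_step(state, short_mode=False):
    """Advance the 15 bit shift register by one clock."""
    tap = 6 if short_mode else 1
    feedback = (state ^ (state >> tap)) & 1
    return (state >> 1) | (feedback << 14)


def sequence_length(short_mode, start=1):
    """Count the clocks needed for the register to return to 'start'."""
    state = lfsr_step(start, short_mode)
    steps = 1
    while state != start:
        state = lfsr_step(state, short_mode)
        steps += 1
    return steps
```

Under these assumptions the long mode cycles through all 32767 non-zero states before repeating, while the short mode repeats far sooner.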
2.4.4.2.8. Memory Reader, Sample Buffer and Output Unit
These three units work together to help achieve the functionality of the DMC channel.
The sample buffer is used to store one byte samples of sound from the currently playing
sample (with the data retrieved from memory).
The sample buffer is populated via the memory reader whenever it is emptied.
The output unit continuously outputs complete sample bytes to the mixer.
Figure 18: A visual representation of the DMC channel's operation (Mixer image, see (27))
2.4.4.2.9. Frame Counter
The frame counter acts as a master unit, periodically clocking various channels and other
units. It has two sequences at its disposal, each of which clocks different units at different
steps within the sequence.
For example, if the first sequence is chosen and the frame counter is clocked, it will clock
all envelopes and the triangle's linear counter. When the frame counter is clocked again,
it will clock all length counters and sweep units. It continues in this fashion, looping back
to step 1 once the end of the sequence is met.
This is an extremely important unit, playing a major part in ensuring that the sound is
synchronised correctly.
2.4.4.2.10. Status Register
The status register provides a means of enabling/disabling audio channels and querying
the current playing status of the channels.
It also allows for various interrupt flags to be read.
2.4.4.3. Channels
2.4.4.3.1. Mixer
Output from all channels is sent to the mixer. The mixer then combines these inputs
so as to produce a single output value. This is the value that is used for
playing the sound.
2.4.4.3.2. Channel Unit Usage
Each channel uses several of the units defined above. A summary table follows:
Unit
Timer
Length Counter
Envelope
Sweep
Linear Counter
Shift Register w/ feedback
Memory Reader
Sample Buffer
Output Unit
Frame Counter
Status Register
Pulse
X
X
X
X
Channels
Triangle
Noise
X
X
X
X
X
DMC
X
X
X
X
X
X
X
X
X
Table 2 : A summarisation of the use of "channel units" in the sound channels
X
X
X
X
X
2.4.4.3.3. Pulse
The Pulse channel outputs a pulse wave and is capable of outputting one of four
waveform sequences at a time.
The Pulse channels have four registers at their disposal which allow you to alter the
outputted value in several ways:
The waveform sequence to be used for output can be specified
The frequency that the sound should be played at
The amount of time that the sound should be played for (or indefinitely)
The volume that the sound should be played at (or constant)
The rate that the channel's frequency should be increased/decreased over time (if
at all).
The two Pulse channels only differ in the way that periodic frequency shifting (sweep
unit) is calculated.
2.4.4.3.4. Triangle
The triangle channel outputs a pseudo-triangle wave.
The triangle channel has access to three registers which allow alteration of output in the
following ways:
The frequency that the sound should be played at
The amount of time that the sound should be played for (or indefinitely)
The triangle channel also contains an additional counter to allow for the output to play for
a longer period of time than the Pulse channel.
2.4.4.3.5. Noise
The noise channel outputs pseudo-random 1 bit noise at 16 different frequencies.
It has access to three registers used to achieve the following capabilities:
The amount of time that the sound should be played for (or indefinitely)
The volume that the sound should be played at (or constant)
The frequency that the sound should be played at
The mode to be used:
o Long mode - 32767 bit long sequences
o Short mode - 93 bit long sequences
2.4.4.3.6. Delta Modulation Channel (DMC)
The DMC channel outputs samples retrieved from memory.
It uses four registers as well as a memory reader, sample buffer and output unit
(described above) to achieve its goals:
The frequency that the sound should be played at
The ability to directly load the values to be output to the mixer
The ability to loop the sample
The ability to use the CPU's IRQ interrupt mechanism to make playback more
flexible.
as well as the basic ability to stream samples to the mixer.
Note
Complex details concerning how the APU channels work are not included in this report
(for example, the exact sequence information of the Frame Counter and the lookup tables
used by various channels). However, this information is freely available from (28) (26).
2.4.5. Input/Output (2) (27)
2.4.5.1. Ports
The NES has two input ports (both of which can be read simultaneously) and an
expansion port underneath the system.
The expansion port can be written to via the lowest three bits of register 4016.
The state of the input devices is accessible via the reading of the bottom five bits
of 4016 for the first device and 4017 for the second.
Reading from the expansion port is also possible from either 4016 or 4017.
2.4.5.2. Standard Control Pad
The most widely used input is the standard rectangular control pad. This is the input
device which will be emulated (emulation of further devices is a possibility if time
permits).
The discussed (annotated) control pad can be seen below:
Figure 19: An annotated Standard NES control pad (unannotated image from (2))
2.4.5.2.1. Determining Input State
In order for programs to be interactive, there needs to be a mechanism for determining
the actions the user has performed so that a response can be made. Inputs from the
standard control pads achieve this by requiring a read of the appropriate register (4016 for
pad 1, 4017 for pad 2) for each of the pad's buttons. That is:
Read #    Button
1         A
2         B
3         Select
4         Start
5         Up
6         Down
7         Left
8         Right
Table 3 : The state of each button is obtained by reading the controller registers a certain number of times
Before these reads can be made however, it is necessary to strobe the controllers.
2.4.5.2.2. Strobing
When the controllers are strobed, their current status is stored. It is then obtainable via the
method above. Thus, strobing is required before every read of the controller state.
Initiating a strobe is achieved by writing a '1' to the lowest bit of the 4016 register.
Writing a '0' to this bit then stores the states and allows them to be read.
If 4016 is read after '1' has been written, it will return the current state of the A button
(not the state stored upon a complete strobe).
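The strobe and read sequence can be sketched as below. The class and method names are hypothetical; returning 1 for reads beyond the eighth is an assumption based on the commonly documented behaviour of standard pads and is not stated in this section.

```python
BUTTONS = ("A", "B", "Select", "Start", "Up", "Down", "Left", "Right")


class ControlPad:
    def __init__(self):
        self.pressed = set()    # names of the buttons currently held down
        self.strobe = False
        self.latched = [0] * 8  # states stored by the last complete strobe
        self.index = 0

    def write(self, value):
        """Model a write to the lowest bit of the controller register."""
        new_strobe = bool(value & 1)
        if self.strobe and not new_strobe:
            # Writing 1 then 0 stores the button states for reading.
            self.latched = [int(name in self.pressed) for name in BUTTONS]
            self.index = 0
        self.strobe = new_strobe

    def read(self):
        """Model a read of the controller register (one button per read)."""
        if self.strobe:
            return int("A" in self.pressed)  # live state of the A button
        if self.index < 8:
            bit = self.latched[self.index]
            self.index += 1
            return bit
        return 1  # assumption: further reads return 1
```

A program would write 1 then 0 to the register and then read it eight times, obtaining the buttons in the order given in Table 3.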
Figure 20: Controller Strobe State Diagram
2.4.6. Development Tools
2.4.6.1. Debugger
The debugger should have the following features:
2.4.6.1.1. Register State Pane
2.4.6.1.1.1. Functionality
1. CPU Register values
2. PPU Register values
3. PPU State
a. Base Name Table
b. VRAM address increment amount
c. Sprite Pattern Table Address
d. Background Pattern Table Address
e. Sprite Size
f. NMI generation active status
4. Scroll Values
a. X Tile
b. Y Tile
c. X Fine Positioning (X position within the current tile)
d. Y Fine Positioning (Y position within the current tile)
The Register State Pane should display information about the various registers in the
NES as the program executes. This includes both displaying the register contents as held
in memory (1 and 2) and displaying state based on interpreting certain bits of said
registers (3 and 4).
It is hoped that this information will prove helpful in the debugging of applications. For
example, use of the scroll values to help correct any scrolling related issues in the users'
code.
2.4.6.1.2. Breakpoints Pane
2.4.6.1.2.1. Functionality
1. Break upon NMI
2. Break upon BRK
3. Register value
a. Equals
b. is Greater Than
c. is Greater Than or equal
d. is Less Than
e. is Less Than or Equal
4. Remove Breakpoints
5. Remove all Breakpoints
The most important function here is 3, which allows the user to break execution when a
particular register meets a condition. It is likely to be used most often to stop
execution at a particular line of code (PC equals …).
2.4.6.1.3. Tools Pane
2.4.6.1.3.1. Functionality
1. Step
Executes one instruction and then pauses execution.
2. Resume
Leaves Step mode and continues execution only breaking when the next
breakpoint is met.
3. Seek PC
Jumps to and highlights the instruction held at a given location in memory (see
code pane).
2.4.6.1.4. Code Pane
This should display the code of the currently loaded ROM. This will require a
disassembler to be written.
2.4.6.1.5. Memory Pane
This should display the contents of the CPU and PPU memory.
The memory should be divided into sections (Stack, Zero Page etc) and also be divided
by Unit (CPU and PPU).
2.4.6.2. Pattern Table Viewer
1. The tiles stored in the pattern tables should be visible.
2. The palette colours used by the sprite and image palettes should be visible.
3. Tile Information:
a. Pattern Table Number
b. Tile Number (in the order the tiles are stored in the tables).
4. Palette Information:
a. Which Palette (image or sprite).
b. Entry number within the palette.
c. Entry number within the master palette.
5. Display options.
a. Automatic Table refresh?
b. Table refresh rate (between 0 and 5 seconds).
2.4.6.3. Name Table Viewer
1. The images stored in the name tables should be visible.
2. The attribute table data should be visible.
3. Name Table numeric data display. This should allow the user to view the actual
data stored in the name tables in an accessible form. That is, the data should be
displayed in a grid to make it easier to identify particular parts of the name table
than if just a flat display of memory were provided.
4. Display Options
a. Scroll Lines should be visible on the name tables, representing the scroll
register visually.
b. Automatic Table refresh?
c. Table refresh rate (between 0 and 5 seconds).
d. Two bit colour display – displays the tables as they would appear if the attribute bits
were not added to the pattern table tiles which make up the name tables.
2.4.7. Further Specifications
In addition to the analysis above, several other aspects of the system have been specified:
Cartridge Specification
Documenting the workings of the NES cartridges (containing the
executed software)
File Format Specification
A header prepended to the ROM files is necessary
to execute them on foreign hardware.
Regional Differences Specification
NES hardware varies based on region. These differences are
discussed here.
3. Design
3.1. Overall System Design
The system can be broken down into several logical units.
This overall structure is detailed in the following sections.
3.1.1. Packages
Figure 21: Package Diagram
Note that the above diagram only indicates the important package dependencies.
3.1.2. Classes
3.1.2.1. CPU
The CPU uses the Memory package to provide objects to
represent the whole CPU memory system (one object per memory
type – e.g. Stack, CartRAM etc).
The Memory package is discussed in detail later in the report.
3.1.2.2. PPU
The PPU holds a reference to a Palette object. The Palette object
allows the easy manipulation and recalculation of Palette
colours. It also provides an easy means of updating the palette
entries for the image and sprite palettes.
3.1.2.3. Inputs
Input devices for the NES follow a common format.
Thus, an interface would seem desirable to allow the
generalization of all inputs (allowing simpler, better
structured code).
3.1.2.4. APU
Figure 22: APU Class Diagram
3.1.2.5. GUI
Figure 23: GUI Class Diagram
3.1.2.6. Development
Figure 24: Development Class Diagram
3.2. CPU Design
3.2.1. General (29)
The general method to be used for emulating the CPU core is a common one.
It will consist of a loop which continues to run until the user chooses to halt the
execution.
The body of the loop will have two main purposes:
1. Executing the 6502 instructions of the running program
2. Dealing with the cyclic tasks which must be performed.
Cyclic tasks are those tasks which need to be performed periodically. Examples
include the refreshing of the screen and input state.
The CPU will execute machine instructions for a pre-determined number of CPU cycles
before breaking to deal with the aforementioned cyclic tasks.
An integer interrupt period will be used to keep track of the number of cycles left to be
executed. This variable will be decremented by a number of cycles after each instruction
is executed. Once the cyclic tasks have been dealt with, the interrupt period will be
restored.
The interrupt period should be set to the greatest common divisor of the number of
cycles required for each task.
Booleans will be used to allow the CPU to be paused and stopped. If the CPU is paused,
it will simply not execute any further instructions (whilst remaining in the loop) until the
CPU is un-paused.
To aid efficiency, the CPU will only be able to be stopped (breaking out of the loop) once
the interrupt period is exhausted.
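The loop described above might be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the class name CpuLoopSketch, the stub executeInstruction() (which pretends every instruction takes 2 cycles) and the period of 114 cycles are all invented for the example.

```java
// Sketch of the interrupt-period loop. All names and numbers are hypothetical.
class CpuLoopSketch {
    static final int INTERRUPT_PERIOD = 114; // assumed cycles between cyclic tasks

    volatile boolean paused = false;
    volatile boolean stopRequested = false;

    int cyclesLeft = INTERRUPT_PERIOD;
    int instructionsExecuted = 0;
    int cyclicTasksRun = 0;

    // Stand-in for fetching and executing one 6502 instruction.
    // Returns the number of cycles consumed (fixed at 2 in this sketch).
    int executeInstruction() {
        instructionsExecuted++;
        return 2;
    }

    // Stand-in for the cyclic tasks (screen refresh, input state and so on).
    void handleCyclicTasks() {
        cyclicTasksRun++;
    }

    void run() {
        while (true) {
            if (paused) {
                Thread.yield(); // remain in the loop but execute nothing
                continue;
            }
            cyclesLeft -= executeInstruction();
            if (cyclesLeft <= 0) {
                handleCyclicTasks();
                cyclesLeft += INTERRUPT_PERIOD; // restore the interrupt period
                if (stopRequested) {
                    break; // stopping is only checked once per period
                }
            }
        }
    }
}
```

Because the stop flag is only examined when the interrupt period is exhausted, a paused CPU in this sketch never reaches the check, which is the quirk the design has to account for.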
It should be noted that this efficiency measure results in a quirk in the way the CPU runs.
It is necessary to ensure that the CPU is not paused before being able to stop it:
Figure 25: CPU State Transitions
appendix F (1.1)
3.2.1.1. Data Mirroring
Because the NES mirrors much of the data stored in memory, addresses sometimes need
to be converted into physical addresses before they are used.
This can be achieved by performing a logical AND between the address and the
maximum physical address for that type of memory. Pseudo code illustrating this
follows:
int: actualAddress;
if (address < 0x2000) { // Mirroring of zero page, stack and the CPU RAM
    actualAddress = address & 0x7FF;
}
else if (address < 0x4000) { // Mirroring of PPU I/O registers (0x2000 to 0x2007)
    actualAddress = address & 0x2007;
}
else { // No other memory types involve mirroring
    actualAddress = address;
}
3.2.2. Memory
Different ranges of memory addresses provide different effects and types of values when
written to/read from. For example, some memory cannot be written to and some change
the behaviour of other units (such as the PPU) when written to. In order to handle this,
several classes will be created, each representing a different range of memory. All classes
will inherit from a Memory super-class.
All memory will be represented as arrays of primitive integers inside Memory objects.
This scheme should also help to simplify the implementation of memory mappers
(MMCs).
The size of the memory in bytes will be specified to the Memory object's constructor.
The integer array representing that range of memory will then be created.
For several of the Memory classes, this will be all that is required as they inherit the
behaviour for reading from and writing to memory from their super-class "Memory".
Exceptions are where a range of memory should not be written to or some other
behaviour should be performed instead.
These exceptions are:
- Cart and Expansion ROM should not be written to (override writeToMemory with
  an empty method)
- Reading and writing to the IO should instead read and write to the PPU
- The Stack will require two additional methods: push and pop.
Figure 26: Memory Class Hierarchy
appendix F (1.2)
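A minimal Java sketch of such a hierarchy is given below. Only writeToMemory is named in the text; readFromMemory, CartROM, StackMemory and the downward-growing stack pointer starting at 0xFF are assumptions made for illustration.

```java
// Hypothetical sketch of the Memory hierarchy described above.
class Memory {
    protected final int[] data;

    Memory(int sizeInBytes) {
        data = new int[sizeInBytes];
    }

    int readFromMemory(int address) {
        return data[address];
    }

    void writeToMemory(int address, int value) {
        data[address] = value & 0xFF; // keep each cell to 8 bits
    }
}

// ROM ignores writes by overriding writeToMemory with an empty method.
class CartROM extends Memory {
    CartROM(int[] contents) {
        super(contents.length);
        System.arraycopy(contents, 0, data, 0, contents.length);
    }

    @Override
    void writeToMemory(int address, int value) {
        // intentionally empty: ROM cannot be written to
    }
}

// The stack adds push and pop on top of the inherited behaviour.
class StackMemory extends Memory {
    private int stackPointer = 0xFF; // the 6502 stack grows downwards

    StackMemory() {
        super(0x100);
    }

    void push(int value) {
        writeToMemory(stackPointer, value);
        stackPointer = (stackPointer - 1) & 0xFF;
    }

    int pop() {
        stackPointer = (stackPointer + 1) & 0xFF;
        return readFromMemory(stackPointer);
    }
}
```

Classes with no special behaviour need nothing beyond a constructor that passes their size to Memory.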
3.2.3. Registers
3.2.3.1. General
All registers shall be stored as primitive integers. The program counter will be stored as a
single 16-bit value instead of two 8-bit values for simplicity and efficiency.
3.2.3.2. Status Register
The status register must be kept up to date at all times to ensure that programs are
executed correctly.
The CPU itself manipulates the overflow, carry, zero and negative flags (although user
code can switch all other bits in the register).
Efficient ways of setting/clearing these flags have already been developed and are in
wide use. These methods are explained below (with appropriate referencing).
3.2.3.2.1. Zero and Negative Flags
A common approach to dealing with the setting/clearing of the Zero and Negative bits is
the use of a lookup table (30). This avoids testing the value with conditional branches
every time a register changes.
The lookup table is typically stored as an array with 256 elements. When a register's
value is changed, its new value is used as an index into the array. The Zero and Negative
bits of the status register are cleared and a logical OR is then performed between the
returned value and the status register. This ensures that the
negative and zero flags are always set to the correct values (as long as the register used is
8-bit using 2's complement format).
Status Register = (Status Register & 0x7D) | znTable[Register]
002, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000, 000,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128
Table 4: A lookup table allowing efficient setting of the Zero and Negative bits (znTable)
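Rather than typing the 256 entries by hand, the same table could be generated at start-up. The sketch below is illustrative only: the class and method names are assumptions, and it clears the two flags before the OR so that stale values cannot survive.

```java
// Hypothetical helper that builds and applies the Zero/Negative lookup table.
class ZnTableSketch {
    static final int ZERO_FLAG = 0x02;     // entry 0 of the table
    static final int NEGATIVE_FLAG = 0x80; // entries 128 to 255 of the table

    static int[] buildZnTable() {
        int[] table = new int[256];
        table[0] = ZERO_FLAG;
        for (int value = 128; value < 256; value++) {
            table[value] = NEGATIVE_FLAG;
        }
        return table; // entries 1 to 127 stay 0
    }

    // Returns the status register with Z and N updated for the new register value.
    static int updateZn(int status, int registerValue, int[] znTable) {
        int cleared = status & ~(ZERO_FLAG | NEGATIVE_FLAG);
        return cleared | znTable[registerValue & 0xFF];
    }
}
```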
3.2.3.2.2. Overflow Flag
An efficient way of determining whether the overflow flag should be set/cleared is by
performing an Exclusive OR between the carry-in and carry-out of bit 8 (the sign bit) of
the register whose value was just changed. (17)
This fact was used to create the efficient overflow code below, shown for addition (30):
if ((comp(accumulator EOR operand) & (accumulator EOR result) & 0x80) != 0) {
    Status register |= bit 7; // Set overflow bit
}
// comp X = The complement of X. I.e. all bits are inverted.
// &      = Logical AND
// EOR    = Exclusive OR
3.2.3.2.3. Carry Flag
The carry bit is set if either of two conditions is met.
1. If addition is taking place, carry has occurred if the result of the addition is more
than 255.
If result > 255 {
Status register |= bit 1; // Set carry bit
}
// LHS |= RHS = Shorthand for LHS = LHS (logical OR) RHS
2. If subtraction is taking place, carry has occurred if the result of the subtraction is
more than or equal to zero.
If result >= 0 {
Status register |= bit 1; // Set carry bit
}
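The carry and overflow rules can be seen working together in a small Java sketch of addition with carry. The names below are assumptions for illustration. The subtraction helper relies on the 6502 identity that, in binary mode, subtracting an operand is equivalent to adding its bitwise complement, so that carry ends up set exactly when the result is not negative.

```java
// Hypothetical sketch of ADC/SBC flag handling; not the project's code.
class AdcSketch {
    static final int CARRY = 0x01;    // lowest bit of the status register
    static final int OVERFLOW = 0x40; // second highest bit of the status register

    // Returns {result, newStatus}.
    static int[] adc(int accumulator, int operand, int status) {
        int sum = accumulator + operand + (status & CARRY);
        int result = sum & 0xFF;
        int newStatus = status & ~(CARRY | OVERFLOW);
        if (sum > 255) {
            newStatus |= CARRY; // unsigned result did not fit in 8 bits
        }
        // Overflow: both inputs share a sign and the result's sign differs.
        if ((~(accumulator ^ operand) & (accumulator ^ result) & 0x80) != 0) {
            newStatus |= OVERFLOW;
        }
        return new int[] { result, newStatus };
    }

    // Subtraction expressed as addition of the complemented operand.
    static int[] sbc(int accumulator, int operand, int status) {
        return adc(accumulator, operand ^ 0xFF, status);
    }
}
```

For example, adding 0x50 to 0x50 gives 0xA0 with overflow set and carry clear, since two positive signed values produced a negative one without exceeding 255.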
3.2.3.2.4. Other Flags
All other flags are simply set/cleared by use of certain instructions.
3.2.4. Addressing Modes (17) (2) (19)
This will involve the implementation of several methods, each one emulating the function
of one of the 6502's addressing modes.
Each of the addressing modes manipulates and uses the operands given in a different way
(for example, one takes two eight-bit values and forms a sixteen-bit address, another
simply uses the eight-bit address given as is).
appendix F (1.3)
All methods will be situated in the main CPU class.
Most will make use of the getMemory() utility function (discussed later) in order to easily
obtain the data from memory.
The methods will simply return an integer value of the address to be used.
3.2.5. Opcodes
All opcode execution will be dealt with via a simple switch/case setup.
switch (opcode) {
    case 0x00:
        // Code to emulate opcode execution
        break;
    case 0x01:
        // …
        break;
}
The number of cycles that each opcode takes to complete varies. A lookup table will be
implemented where the cycle count can be determined by using the opcode value as the
index.
This will be implemented as a 256 element primitive integer array (providing space for
information about all opcodes, documented or not).
3.2.6. Interrupts
Upon being triggered, the interrupts simply carry out a number of actions on the system's
memory and registers. These actions are best explained in the form of high level pseudo
code below. Only the IRQ can be masked by the interrupt flag; NMIs and BRKs are always
serviced. In each case the status register is pushed before the interrupt flag is set, so
that RTI restores the flag to its previous state.
3.2.6.1. IRQs
void : IRQ() { // Returning from the interrupt is achieved via the RTI opcode.
    if (interrupt flag is NOT set) {
        Clear Break Flag
        Push PC onto Stack
        Push Status Register onto Stack
        Set Interrupt Flag
        PC = loadWord(0xFFFE);
    }
}
3.2.6.2. NMIs
void : NMI() { // Returning from the interrupt is achieved via the RTI opcode.
    Clear Break Flag
    Push PC onto Stack
    Push Status Register onto Stack
    Set Interrupt Flag
    PC = loadWord(0xFFFA);
}
3.2.6.3. BRKs
void : BRK() { // Returning from the interrupt is achieved via the RTI opcode.
    Set Break Flag
    Push PC onto Stack
    Push Status Register onto Stack
    Set Interrupt Flag
    PC = loadWord(0xFFFE);
}
When the RTI instruction is encountered (the interrupt has come to an end), the status
register and PC are popped off the stack and execution continues as before.
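The interrupt sequence and its reversal by RTI can be sketched in Java as below. This is an illustration under assumptions rather than the project's implementation: memory is a flat array, the class and method names are invented, and the extra PC adjustment performed by a real BRK is omitted. The sketch follows the 6502's behaviour in which only the IRQ is masked by the interrupt flag.

```java
// Hypothetical sketch of interrupt entry and RTI on a flat 64 KB memory.
class InterruptSketch {
    static final int BREAK_FLAG = 0x10;
    static final int INTERRUPT_FLAG = 0x04;

    final int[] memory = new int[0x10000];
    int pc;
    int status;
    int sp = 0xFF;

    void push(int value) {
        memory[0x100 + sp] = value & 0xFF;
        sp = (sp - 1) & 0xFF;
    }

    int pop() {
        sp = (sp + 1) & 0xFF;
        return memory[0x100 + sp];
    }

    int loadWord(int address) {
        return memory[address] | (memory[(address + 1) & 0xFFFF] << 8);
    }

    private void enter(int vector, boolean breakFlag) {
        push(pc >> 8);   // high byte of PC first
        push(pc & 0xFF); // then the low byte
        push(breakFlag ? (status | BREAK_FLAG) : (status & ~BREAK_FLAG));
        status |= INTERRUPT_FLAG; // set only after the old status is saved
        pc = loadWord(vector);
    }

    void irq() {
        if ((status & INTERRUPT_FLAG) == 0) {
            enter(0xFFFE, false);
        }
    }

    void nmi() {
        enter(0xFFFA, false); // non-maskable: the flag is not consulted
    }

    void brk() {
        enter(0xFFFE, true);
    }

    void rti() {
        status = pop();
        int low = pop();
        int high = pop();
        pc = low | (high << 8);
    }
}
```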
3.2.6.4. System Reset
void : reset() {
    Reset all registers and memory
    Set the interrupt flag
    PC = loadWord(0xFFFC);
}
3.2.7. Direct Memory Access (DMA)
This will be implemented as a method within the CPU class.
It will take the page to begin reading from as a parameter (this value should be multiplied
by 0x100 to obtain the start address) and will fill the Sprite Memory by looping through
the CPU memory, copying 256 consecutive values across.
For the sake of efficiency, the data should be written to a temporary array within the CPU
class before passing this array across to the PPU. The PPU sprite memory object (an
integer array) will then be made to hold a reference to this new array. This will save on
the method invocation overhead of writing each byte to sprite memory individually.
appendix F (1.4)
3.2.8. Utility Methods
A number of utility methods should be written so as to simplify implementation and
eliminate the duplication of code:
1. branch(int : branchOpCode). This method will deal with determining whether or
not branching should occur from one memory location to another dependent on
the state of the CPU status register flags. It returns a Boolean.
2. checkPageBoundary(int : oldAddress, int : newAddress). It is often the case that if
memory operations cross the current page boundary (each page being 256 bytes)
an additional cycle is necessary for computation. This method will return a
Boolean indicating whether a page boundary has been crossed.
3. loadWord(int : lowestByte). This is a convenience method which returns a 16-bit
value when given the address of the lowest byte of the value.
4. getMemory(int : memory). This method will return the data stored at the memory
address given. In order to achieve this, it will need to take account of memory
mirroring and will need to determine which section of memory the address
corresponds to (which Memory object the data should be obtained from).
appendix F (1.5)
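Three of these utility methods are small enough to sketch directly. The Java below is a simplified illustration: it uses one flat array in place of the separate Memory objects and handles only the RAM mirroring case, both of which are assumptions made to keep the example short.

```java
// Hypothetical sketch of checkPageBoundary, getMemory and loadWord.
class CpuUtilSketch {
    final int[] memory = new int[0x10000];

    // True when the two addresses lie in different 256-byte pages.
    static boolean checkPageBoundary(int oldAddress, int newAddress) {
        return (oldAddress & 0xFF00) != (newAddress & 0xFF00);
    }

    // Reads one byte, resolving the mirrors of the 2 KB of CPU RAM.
    int getMemory(int address) {
        if (address < 0x2000) {
            return memory[address & 0x7FF];
        }
        return memory[address]; // other regions are not modelled here
    }

    // Builds a 16-bit little-endian value from two consecutive bytes.
    int loadWord(int lowByteAddress) {
        int low = getMemory(lowByteAddress);
        int high = getMemory((lowByteAddress + 1) & 0xFFFF);
        return low | (high << 8);
    }
}
```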
3.3. PPU Design
3.3.1. General
The execution of the PPU will be dealt with on a per scanline basis. That is, the graphical
data will be written to the display one line of pixels at a time.
Specifically, the process will take the following form:
1. When the CPU has executed a defined number of cycles, all cyclic tasks will be
dealt with (see CPU design).
2. If the number of cycles required for the output of a scanline has been met, the
PPU will generate and output the next scanline.
3.3.2. Registers
The first three registers of the PPU represent how various aspects of the PPU behave. In
general, each bit of each register determines the behaviour of one aspect. For example,
one bit in PPUMASK determines whether background rendering should occur and another
specifies whether the display should be rendered in greyscale or not.
These PPU behavioural aspects are to be stored as simple variables within the PPU class.
The updating of these values will be dealt with within a single method (updateState()).
updateState() should be called just prior to performing any rendering to ensure correct
rendering.
Details of the effects of manipulating the bits of these registers are provided below.
Figure 27 : Changing the values of certain sets of bits alters image output (21)
All the below registers are to be represented as integer variables.
The behaviours required for reading and writing to each register are to be handled via
individual methods within the PPU class.
3.3.2.1. OAMADDR ($2003)
As this register is simply used to specify the address to be read from or written to in
Sprite Memory when using the OAMDATA register, it only requires a simple 'setter'
method.
3.3.2.2. OAMDATA ($2004)
Represented by two methods: one for reading and one for writing.
Reading will simply return the data present at the given address (OAMADDR) in Sprite
Memory.
Writing will write the data given to Sprite Memory at the given address (OAMADDR)
and then increment OAMADDR.
3.3.2.3. PPUSCROLL ($2005) and PPUADDR ($2006)
These two registers share an address latch (in one of two states). What occurs when
writing to the registers depends on the state of this latch.
The latch will be represented as a Boolean.
A simple If construct will be used to determine what writing to these registers will do.
Writing to the registers will set either the top or bottom 8 bits of the register being written
to.
3.3.2.4. PPUDATA ($2007)
Reading and writing to this register will simply set or retrieve the memory location in
PPU memory specified by PPUADDR.
All Registers. Appendix F (2.1 – 2.6)
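The shared latch can be illustrated with a short Java sketch. The field and method names are assumptions, as is the detail that the first PPUADDR write supplies the upper bits of the address and the second the lower eight. The sketch also includes the address increment applied after each PPUDATA write (1 or 32, matching the "VRAM address increment amount" shown in the debugger), and leaves reads out.

```java
// Hypothetical sketch of the PPUSCROLL/PPUADDR latch and PPUDATA writes.
class PpuAddressSketch {
    final int[] vram = new int[0x4000];
    boolean firstWrite = true; // the latch shared by both registers
    int vramAddress;
    int scrollX;
    int scrollY;
    int addressIncrement = 1;  // 1 or 32, selected elsewhere by a PPUCTRL bit

    void writeScroll(int value) {
        if (firstWrite) {
            scrollX = value & 0xFF;
        } else {
            scrollY = value & 0xFF;
        }
        firstWrite = !firstWrite;
    }

    void writeAddress(int value) {
        if (firstWrite) {
            // first write: upper bits (the address space is 14 bits wide)
            vramAddress = (vramAddress & 0x00FF) | ((value & 0x3F) << 8);
        } else {
            // second write: lower eight bits
            vramAddress = (vramAddress & 0xFF00) | (value & 0xFF);
        }
        firstWrite = !firstWrite;
    }

    void writeData(int value) {
        vram[vramAddress] = value & 0xFF;
        vramAddress = (vramAddress + addressIncrement) & 0x3FFF;
    }
}
```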
3.3.3. Start-Up
Upon starting up the PPU (and on resets), both the PPUCTRL and PPUMASK registers
should be set to 0 (disabling NMIs and rendering).
PPUCTRL = 0x00; // disables NMIs.
PPUMASK = 0x00; // disables rendering.
3.3.4. Memory
3.3.4.1. General
The PPU memory will be represented as simple arrays for each part of memory. It does
not require the additional structure that is present for CPU memory as all reads and writes
are handled through registers. Thus, the registers are able to handle read and write issues
before accessing the memory.
3.3.4.2. Colour Palettes
A near perfect algorithm for determining the colours used in the NES master palette
which also supports alterable hue and tint was devised by Kevin Horton (31).
This mathematical algorithm has been implemented in Java by David de Niese (32) with
additional palette manipulation possibilities (colour emphasis and a black and white
display – both necessary for a complete emulation).
De Niese's code was distributed under the GNU General Public License allowing for the
code to be used in other projects.
His code will be reused in this project and comments will be put in place to indicate
ownership.
3.3.4.2.1. Master Palette
A master palette which is used by the image and sprite palettes will be maintained. The
palette will be stored in a separate Palette class, allowing for the palette to be passed
around the system by reference (allowing easy updating and data consistency).
Representing the palette like this will also allow very easy modification of palette Tint
and Hue.
3.3.4.2.2. Image and Sprite Palettes
The image and sprite palettes are to be represented using two arrays for each:
1. The first will store the location of the colour in the master palette,
2. The second will store the integer value representing the colour in RGB (using the
values in the first array).
When a pixel is to be rendered to the display, the RGB arrays will be updated. This will
allow for colour information to be maintained efficiently, only requiring updates to the
information when it is needed.
appendix F (2.7)
3.3.4.2.3. Scrolling and background rendering
Scrolling will be implemented using the method set out by Loopy in his document "The
skinny on NES Scrolling" (33) detailed in the analysis.
3.3.4.2.4. Sprite Rendering
The following algorithm details a methodology for dealing with the slightly altered sprite
rendering routine discussed in the analysis.
1. Using byte 0 of the next sprite's entry in sprite memory, determine whether the sprite
is on the current scanline.
2. If Yes:
a. If eight sprites have already been rendered on this scanline, set the sprite
overflow flag and finish.
b. Otherwise, copy the next three bytes and use them to render the sprite to the display.
3. In either case:
a. If all sprites have been evaluated, finish
b. If some sprites remain, return to 1.
appendix F (2.8)
3.3.4.2.5. Name Table Mapping
Depending on the value of a bit in one of the PPU status registers, different name table
mapping modes may be used (horizontal or vertical).
As mentioned above, the state of this bit will be stored in a variable (in this case, a
Boolean).
Where data is stored during writes to the name table memory depends on the state of this
variable.
This will be handled via simple nested Ifs.
appendix F (2.10)
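As an alternative to nested Ifs, the mapping can be written as a small calculation. The Java sketch below is illustrative and assumes 2 KB of physical name table memory: with vertical mirroring the tables at 0x2000 and 0x2800 share storage, and with horizontal mirroring the tables at 0x2000 and 0x2400 do. The class and method names are hypothetical.

```java
// Hypothetical sketch mapping a name table address to an offset in 2 KB of storage.
class NameTableMirrorSketch {
    static int physicalOffset(int address, boolean verticalMirroring) {
        int offset = address & 0x0FFF; // position within the four logical tables
        int table = offset / 0x400;    // logical table number, 0 to 3
        int within = offset & 0x3FF;   // position inside that table
        // Vertical: tables 0 and 2 share storage, as do 1 and 3.
        // Horizontal: tables 0 and 1 share storage, as do 2 and 3.
        int physicalTable = verticalMirroring ? (table & 1) : (table >> 1);
        return physicalTable * 0x400 + within;
    }
}
```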
3.4. APU Design
3.4.1. Common Features
Several aspects of the design will be common for most units. These will be discussed
here.
3.4.1.1. Clocking
The real NES APU uses clock signals passed between its units in order to handle the
outputting of sound.
It would seem good practice to mirror this approach when emulating the system.
All the APU units will include a clock() method which performs the relevant clocking
activities for the unit.
3.4.1.2. Resets
It must be possible to reset all units of the APU. This is in the case that a ROM is reset or
a new ROM is loaded.
3.4.2. Interfaces
Due to the common behaviour of the APU units, it would seem appropriate to define an
interface which all units will implement. This is added less for functional advantages than
to highlight the common behaviours of the units within the code.
interface APUUnit {
    public void clock();
    public void reset();
}
3.4.3. Divider
The divider's purpose is to output a clock every time a counter reaches zero, allowing
control of the duration of sounds among other things.
It would seem reasonable to have three methods and an integer counter to implement this
unit. The three methods would allow:
1. Clocking (decrementing the counter and outputting a clock whenever it reaches
zero)
2. Forcing a Divider Reload
3. Changing the Divider's period (the number of clocks the counter requires before it
reaches zero).
appendix F (3.1)
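The three operations listed above could look like this in Java. It is a minimal sketch: the class name, the boolean return value used to signal an output clock, and the choice to reload the counter automatically when it reaches zero are assumptions.

```java
// Hypothetical sketch of the Divider unit.
class DividerSketch {
    private int period;
    private int counter;

    DividerSketch(int period) {
        this.period = period;
        this.counter = period;
    }

    // Decrements the counter; returns true when an output clock is generated.
    boolean clock() {
        counter--;
        if (counter <= 0) {
            counter = period; // start counting the next period
            return true;
        }
        return false;
    }

    // Forces the counter back to the start of the period.
    void reload() {
        counter = period;
    }

    // Changes the period; takes effect at the next reload.
    void setPeriod(int newPeriod) {
        period = newPeriod;
    }
}
```

With a period of 3, every third call to clock() returns true.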
3.4.4. Sequencer
The sequencer simply runs through a sequence of values. When the sequencer is clocked,
it will move control to the next element of the sequence. Once the end of the sequence is
reached, control is returned to the head of the sequence.
This can be implemented straightforwardly using a variable to hold the sequence number
and an array structure to hold the sequence.
Continuous looping of the sequencer can be handled via modulo on the length of the
sequence.
The Sequencer will be implemented in a general way, allowing for the class to be
extended, substituting in alternative values for the Sequencer's sequence.
Figure 28: Sequencer Example Generalisations
This setup will be used to create the Sequencers for the Triangle and Square waveform
channels.
appendix F (3.2)
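A general Sequencer with a Triangle specialisation might be sketched as follows. The method names and the constructor-based way of supplying the sequence are assumptions. The Triangle sequence used is the 32-step ramp from 15 down to 0 and back up to 15.

```java
// Hypothetical sketch of a general Sequencer and one specialisation.
class Sequencer {
    private final int[] sequence;
    private int position = 0;

    Sequencer(int[] sequence) {
        this.sequence = sequence.clone();
    }

    // Moves to the next element, wrapping to the head via modulo.
    void clock() {
        position = (position + 1) % sequence.length;
    }

    int currentValue() {
        return sequence[position];
    }

    void reset() {
        position = 0;
    }
}

class TriangleSequencer extends Sequencer {
    TriangleSequencer() {
        super(buildTriangleSequence());
    }

    // 15, 14, ..., 0 followed by 0, 1, ..., 15.
    private static int[] buildTriangleSequence() {
        int[] steps = new int[32];
        for (int i = 0; i < 16; i++) {
            steps[i] = 15 - i;
            steps[16 + i] = i;
        }
        return steps;
    }
}
```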
3.4.5. Timer
The timer is nothing more than a divider under another name.
3.4.6. Envelope
There exists only one Envelope in the APU, its purpose being to control the volume of
the sounds output by the waveform channel.
The operation of the Envelope is summarised in the below Activity Diagram. The
implementation shall follow this closely.
Figure 29: Operation of the Envelope Unit
3.4.7. Sweep
Only one Sweep unit exists within the APU, being used to adjust the periods of the Pulse
channels.
This period adjustment occurs when the internal divider outputs a clock (provided certain
conditions are met).
The behaviour of the unit is described in the following two diagrams:
Figure 30: Operation of the Sweep Unit (1)
The reload flag is set every time the Sweep unit registers are written to. As a result,
writing to these registers has the effect of restarting the unit for another sweep.
Figure 31: Operation of the Sweep Unit (2)
The above essentially modifies the Pulse Channel's period in such a way as to produce
effects such as a gradually increasing pitch of the output, allowing for more interesting
sounds than the Pulse channel would allow alone.
3.4.8. Shift Register with Feedback
Generates the bit sequences output by the Noise channel.
Figure 32: Shift Register with Feedback
Each run through of the above generates one bit of the bit sequence used by the Noise
Channel.
appendix F (3.3)
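Since the figure itself is not reproduced in text form, the Java sketch below is based on the commonly documented behaviour of the NES noise generator rather than on the diagram: a 15-bit register, initially 1, whose feedback bit is the Exclusive OR of bit 0 with bit 1 (or with bit 6 in the short mode), shifted right with the feedback placed in bit 14. The class and method names are hypothetical.

```java
// Hypothetical sketch of the Noise channel's shift register with feedback.
class NoiseShiftRegisterSketch {
    private int register = 1;      // 15 bits wide, must never be all zeros
    private boolean shortMode = false;

    void setShortMode(boolean enabled) {
        shortMode = enabled;
    }

    // One run through the feedback loop: produces the next bit of the sequence.
    void clock() {
        int tap = shortMode ? 6 : 1;
        int feedback = (register ^ (register >> tap)) & 1;
        register = (register >> 1) | (feedback << 14);
    }

    // Bit 0 is treated as the current sequence value.
    int outputBit() {
        return register & 1;
    }

    int value() {
        return register;
    }
}
```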
3.4.9. Frame Counter
This unit will be implemented simply as a switch/case construct. The current step through
the appropriate sequence will be kept track of and units will be clocked based on the
value of this step counter.
switch (step) {
    case 1:
        Clock Envelopes and Linear Counters
        break;
    case 2:
        Clock Envelopes, Linear Counters, Length Counters and Sweeps
        break;
    …
}
appendix F (3.4)
3.4.10. Mixer
The mixer is to be implemented simply as two lookup tables (28) (which use the outputs
of the channels as the lookup values). This closely approximates the actual working of
the mixer. This method was chosen over the slightly more accurate purely mathematical
algorithm for efficiency:
Lookup Tables
pulse_table [n] = 95.52 / (8128.0 / n + 100)    // 31 entry table – Used by the Pulse channels
tnd_table [n] = 163.67 / (24329.0 / n + 100)    // 203 entry table – Used by other channels
Mixer Behaviour
pulse_out = pulse_table [pulse1 + pulse2] // Pulse Channels Output
tnd_out = tnd_table [3 * triangle + 2 * noise + dmc] // Other Channels Output
output = pulse_out + tnd_out // Mixer Output
Figure 33: Mixer Formula
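The formula above translates almost directly into Java. In the sketch below the class and method names are assumptions, and entry 0 of each table is left at zero to avoid dividing by zero when all channels are silent.

```java
// Hypothetical sketch of the lookup-table mixer.
class MixerSketch {
    static final double[] PULSE_TABLE = new double[31];
    static final double[] TND_TABLE = new double[203];

    static {
        // Index 0 stays 0.0 (silence); the formula is undefined for n = 0.
        for (int n = 1; n < PULSE_TABLE.length; n++) {
            PULSE_TABLE[n] = 95.52 / (8128.0 / n + 100);
        }
        for (int n = 1; n < TND_TABLE.length; n++) {
            TND_TABLE[n] = 163.67 / (24329.0 / n + 100);
        }
    }

    // pulse1, pulse2, triangle, noise: 0 to 15; dmc: 0 to 127.
    static double mix(int pulse1, int pulse2, int triangle, int noise, int dmc) {
        double pulseOut = PULSE_TABLE[pulse1 + pulse2];
        double tndOut = TND_TABLE[3 * triangle + 2 * noise + dmc];
        return pulseOut + tndOut;
    }
}
```

The table sizes follow from the largest possible indices: 15 + 15 = 30 for the pulse table and 3 * 15 + 2 * 15 + 127 = 202 for the other channels.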
3.4.11. Channels
Both Pulse and Triangle waveform channels require sequencers. As noted above, their
sequencers will extend a core Sequencer class.
Pulse
0 01000000
1 01100000
2 01111000
3 10011111
Triangle
15, 14, 13, 12, 11, 10,  9,  8,
 7,  6,  5,  4,  3,  2,  1,  0,
 0,  1,  2,  3,  4,  5,  6,  7,
 8,  9, 10, 11, 12, 13, 14, 15
Figure 34: Waveform Sequences
All three waveform channels share a common design, being tweaked in certain ways to
provide the different behaviour required.
Each channel contains a Timer. This holds the number of clocks from the Frame Counter
required before the sequencer held by the channel is clocked. Thus, whilst the Timer is
more than 0, the channel will be performing its operations on the same element of the
sequencers' sequence. When the channel is clocked, it will start receiving the next value
from its related sequencer.
When the Timer reaches zero, it is reset.
Each channel also contains a length counter which dictates how long the channel should
produce output. When the counter reaches zero, the channel will no longer output to the
Mixer.
The below diagram illustrates the common behaviour of the channels pictorially:
Figure 35: Waveform Channels (General Case)
The way in which the channels vary will now be discussed.
3.4.11.1. Pulse Channel
This channel's sequencer is more complicated than the others. It contains four sequences.
Which is used depends on the Duty value given (0-3). It is assumed that a Duty value has
been specified and the channel is using a particular sequence.
An additional condition for output applies in step 4 of the above diagram. The
sequence value currently used must not equal 0. If it does, nothing is output to the Mixer.
In step 5, the value output to the Mixer (if execution reaches this far) is the current envelope volume.
Figure 36: Pulse Channel
3.4.11.2. Triangle Channel
This channel contains an additional timer: the linear counter. This allows for the Triangle
channel to output sound data to the Mixer for longer than the other waveforms. Thus, an
additional condition must be added to step 4: that the linear counter > 0.
This channel does not contain an envelope. Instead, it outputs the actual sequence value.
Figure 37: Triangle Channel
3.4.11.3. Noise Channel
This channel produces random noise. To achieve this, it contains a shift register which is
manipulated in such a way as to produce apparently random activity. This behaves like
the sequencer where bit 0 of the register is taken to be the current sequence value.
In step 2, rather than clocking a sequencer, this shift register is clocked.
In step 4, the additional condition that bit 0 of the Shift Register must be > 0 is added.
When step 5 is reached, it is the current envelope volume which is output.
Figure 38: Noise Channel
Pulse, Triangle and Noise Channels. Appendix F (3.5 – 3.7)
3.4.11.4. DMC Channel
Most of the detail of how the DMC channel behaves internally (see analysis) can be
disregarded for implementation.
Loosely, the channel performs as follows:
Figure 39: DMC Channel Operation
The Sample Address should be incremented and a "Bytes remaining" variable decremented
every time sample data is returned from the Memory.
3.5. Input/Output Design
3.5.1. Structure
All inputs to the system follow a common pattern. Thus, they can be represented with a
common structure.
Because of this, using a simple interface would seem to be desirable as the handling code
can then be more generalized.
Interface InputDevice {
int : read();
void : write (int : data);
}
3.5.2. Determining Input State
Upon being strobed, the controllers on the real NES store the state of each button in an
internal 8-bit shift register (one bit per button state). Each read of the controller returns
the lowest bit of this shift register. This register is then shifted to the right by one each
time.
It shall be implemented here in a near identical fashion using a simple integer variable as
the shift register.
appendix F (4.1)
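A minimal Java sketch of a standard pad built on the InputDevice interface follows. The class name, the setButtons helper and the bit layout (bit 0 = A through bit 7 = Right, matching the read order given in the analysis) are assumptions made for illustration.

```java
// Hypothetical sketch of a standard control pad using an integer shift register.
interface InputDevice {
    int read();
    void write(int data);
}

class StandardPadSketch implements InputDevice {
    // bit 0 = A, 1 = B, 2 = Select, 3 = Start, 4 = Up, 5 = Down, 6 = Left, 7 = Right
    private int buttonsHeld;
    private int shiftRegister;
    private boolean strobeHigh;

    // Called by the host's input handling to record which buttons are down.
    void setButtons(int mask) {
        buttonsHeld = mask & 0xFF;
    }

    @Override
    public void write(int data) {
        boolean newStrobe = (data & 1) != 0;
        if (strobeHigh && !newStrobe) {
            shiftRegister = buttonsHeld; // '1' then '0': store the button states
        }
        strobeHigh = newStrobe;
    }

    @Override
    public int read() {
        if (strobeHigh) {
            return buttonsHeld & 1; // strobe still high: live state of A
        }
        int bit = shiftRegister & 1;
        shiftRegister >>= 1;        // next read returns the next button
        return bit;
    }
}
```

With A and Start held, strobing and then reading eight times yields 1, 0, 0, 1, 0, 0, 0, 0.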
3.6. Development Design
3.6.1. Debugger
3.6.1.1. Register State
This can be handled via the use of simple text fields and labels along with accessor
methods in the CPU and PPU classes.
It would seem to be a good idea to have an update method which updates the state of all
the text fields when called. The method could then be called periodically to keep all fields
up to date.
3.6.1.2. Breakpoints
The breakpoint system is to be handled by allowing the CPU to be in one of two modes:
Normal Mode
Step Mode
Figure 40: CPU Modes
Modes will be set using a simple Boolean in the CPU.
When the CPU is in Normal mode, it will behave as detailed in the CPU sections of this
report.
When in Step mode, it will execute one instruction and then pause the CPU. The user
then chooses to either 'step' through execution or 'continue' execution. Continuing
execution will return the CPU to Normal mode. Stepping through execution will leave the
CPU in Step mode but will un-pause the CPU. The CPU will execute one instruction and
then pause again.
Figure 41: CPU Step Mode Execution
Each breakpoint is to be represented as a Breakpoint object. This object will encapsulate
all the information required (register, condition and address).
All breakpoints will be stored in the BreakpointHandler (in a suitable Collection). This
class will provide the "add", "remove" and "remove all" functionality as well as being
used to determine whether execution should break at a given point (by iterating over the
Breakpoints).
The Register and Conditional information is to be stored in the form of enumerations to
ensure code is as readable as possible and to allow the use of a simple Switch/case
construct when checking breakpoint state in the BreakpointHandler.
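The Breakpoint and BreakpointHandler classes described above might be sketched as follows. The enumeration constants, the method names and the shouldBreak signature (which takes the current register values directly) are assumptions made for this example.

```java
// Hypothetical sketch of the breakpoint system.
enum Register { PC, A, X, Y, SP }

enum Condition { EQUALS, GREATER_THAN, GREATER_THAN_OR_EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL }

class Breakpoint {
    final Register register;
    final Condition condition;
    final int value;

    Breakpoint(Register register, Condition condition, int value) {
        this.register = register;
        this.condition = condition;
        this.value = value;
    }

    boolean matches(int actual) {
        switch (condition) {
            case EQUALS:                return actual == value;
            case GREATER_THAN:          return actual > value;
            case GREATER_THAN_OR_EQUAL: return actual >= value;
            case LESS_THAN:             return actual < value;
            case LESS_THAN_OR_EQUAL:    return actual <= value;
            default:                    return false;
        }
    }
}

class BreakpointHandler {
    private final java.util.List<Breakpoint> breakpoints = new java.util.ArrayList<>();

    void add(Breakpoint breakpoint) { breakpoints.add(breakpoint); }
    void remove(Breakpoint breakpoint) { breakpoints.remove(breakpoint); }
    void removeAll() { breakpoints.clear(); }

    // Iterates over the breakpoints and reports whether any condition is met.
    boolean shouldBreak(int pc, int a, int x, int y, int sp) {
        for (Breakpoint breakpoint : breakpoints) {
            int actual;
            switch (breakpoint.register) {
                case PC: actual = pc; break;
                case A:  actual = a;  break;
                case X:  actual = x;  break;
                case Y:  actual = y;  break;
                default: actual = sp; break;
            }
            if (breakpoint.matches(actual)) {
                return true;
            }
        }
        return false;
    }
}
```

Adding a further register or condition would mean extending an enumeration and adding one case to the matching switch.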
Figure 42: Breakpoint System Design
The above setup should be reasonably extensible, allowing simplified further
development.
3.6.1.3. Tools
'Step' and 'Continue' functionality will be simple to implement. It only requires
setting the Step Boolean in the CPU for Step and clearing it for Continue. The Breakpoint
system in place above will take care of the rest.
The 'Seek PC' functionality will be handled by the focus handling methods of the JList
object in Java (see GUI Design). Users will be permitted to enter the desired address with
or without the '0x' hex prefix.
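Accepting the address with or without the prefix could be handled along the following lines. The helper name parseAddress is hypothetical; Integer.parseInt with a radix of 16 is the standard library call.

```java
class AddressParser {
    /** Parses a hex address such as "0xC000" or "C000"; hypothetical helper. */
    static int parseAddress(String text) {
        String s = text.trim();
        if (s.startsWith("0x") || s.startsWith("0X")) {
            s = s.substring(2);           // drop the optional prefix
        }
        return Integer.parseInt(s, 16);   // throws NumberFormatException on bad input
    }
}
```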
3.6.1.4. Code
This will require the implementation of a disassembler.
When assembled, each opcode is represented in the ROM by an integer value (as are all
the operands). It is possible to determine the opcode used by this value.
The disassembler will make use of many OpCodeInfo objects. One will exist for each
Opcode of the CPU. These objects are to hold information (pertinent to the disassembly
process) about the opcode they represent. They are to be stored in an array in ascending
order of opcode number.
This will allow for easy retrieval of opcode information for each assembled opcode by
using the assembled opcode's numeric value as an index into the array. This retrieved
information can then be used to rebuild the original code (as closely as possible). The
addressing mode will be used to determine how the data should be interpreted and how the
rebuilt code should be formatted.
Figure 43: Disassembler Design
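The table lookup can be illustrated with a cut-down sketch. The field names, the three addressing modes and the three table entries are assumptions chosen for brevity; the opcode values themselves (0xA9, 0x4C, 0xEA) are standard 6502 encodings.

```java
/** Illustrative only: field names and the reduced mode set are assumptions. */
class OpCodeInfo {
    enum Mode { IMPLIED, IMMEDIATE, ABSOLUTE }

    final String mnemonic;
    final Mode mode;

    OpCodeInfo(String mnemonic, Mode mode) {
        this.mnemonic = mnemonic;
        this.mode = mode;
    }
}

class MiniDisassembler {
    // One slot per possible opcode value, indexed by that value.
    private static final OpCodeInfo[] TABLE = new OpCodeInfo[256];
    static {
        TABLE[0xA9] = new OpCodeInfo("LDA", OpCodeInfo.Mode.IMMEDIATE);
        TABLE[0x4C] = new OpCodeInfo("JMP", OpCodeInfo.Mode.ABSOLUTE);
        TABLE[0xEA] = new OpCodeInfo("NOP", OpCodeInfo.Mode.IMPLIED);
    }

    /** Rebuilds the instruction at pc, using the addressing mode to format operands. */
    static String disassemble(int[] memory, int pc) {
        OpCodeInfo info = TABLE[memory[pc] & 0xFF];
        if (info == null) {
            return "???";                 // opcode not present in this reduced table
        }
        switch (info.mode) {
            case IMMEDIATE:
                return String.format("%s #$%02X", info.mnemonic, memory[pc + 1] & 0xFF);
            case ABSOLUTE:
                // 6502 operands are stored low byte first
                int addr = (memory[pc + 1] & 0xFF) | ((memory[pc + 2] & 0xFF) << 8);
                return String.format("%s $%04X", info.mnemonic, addr);
            default:
                return info.mnemonic;
        }
    }
}
```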
3.6.1.5. Memory
This will be simple to implement, requiring accessor methods in the CPU and PPU to
retrieve the memory.
For efficiency, memory should only be updated in the debugger when it is visible to the
user (see GUI Design).
3.6.2. Pattern Table Viewer
3.6.2.1. Tiles Display
The technique that must be employed to form the pattern table tiles is discussed in the
analysis.
The algorithm to be employed is detailed below:
Figure 44: Pattern Table Bit Combination Algorithm
Essentially, it runs through each pair of bytes in the pattern table memory, combining
their bits to make up the pattern table tiles.
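The bit combination can be sketched as below. This is not a reproduction of Figure 44; it assumes the standard NES layout in which each tile occupies 16 bytes, the first eight holding the low bit of each pixel and the next eight the high bit. The class and method names are hypothetical.

```java
class PatternTableDecoder {
    /** Decodes one 16-byte tile into an 8x8 grid of two bit colour values (0 to 3). */
    static int[][] decodeTile(int[] patternTable, int tileIndex) {
        int[][] pixels = new int[8][8];
        int base = tileIndex * 16;
        for (int row = 0; row < 8; row++) {
            int low = patternTable[base + row] & 0xFF;        // bit plane 0
            int high = patternTable[base + row + 8] & 0xFF;   // bit plane 1
            for (int col = 0; col < 8; col++) {
                int shift = 7 - col;                          // leftmost pixel is bit 7
                pixels[row][col] = ((low >> shift) & 1) | (((high >> shift) & 1) << 1);
            }
        }
        return pixels;
    }
}
```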
3.6.2.2. Palettes Display
These are obtainable from the Palette object.
3.6.2.3. Tile and Palette Information
The tile and palette entry numbers can be derived by performing some simple maths on
the images on display (see GUI Design). For example, say that each tile is 8x8 pixels.
Divisions by eight can be employed to determine the number of the tile currently hovered
over.
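As a sketch of the arithmetic, assuming the tiles are drawn left to right and top to bottom with a fixed number per row (the helper name and parameters are hypothetical):

```java
class TileLocator {
    /** Returns the number of the tile under pixel (x, y) of the displayed image. */
    static int tileAt(int x, int y, int tileSize, int tilesPerRow) {
        return (y / tileSize) * tilesPerRow + (x / tileSize);
    }
}
```

With 8x8 tiles laid out 16 to a row, the pixel (9, 0) falls in tile 1 and the pixel (0, 8) in tile 16.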
Whether the display should be refreshed or not can be handled simply by use of a
Boolean.
The refresh rate can be implemented by requiring that the thread executing the viewer
sleep for the determined time before updating the display.
3.6.3. Name Table Viewer
Screen refresh controls can be handled in the same way as for the Pattern Table Viewer.
3.6.3.1. Name Table Rendering
Rendering should be handled using the following abstracted model:
Figure 45: Name Table Rendering Model
Once both the tile data and the attribute bits for the current name table element have been
retrieved, it is possible to display the tile on screen in full four bit colour.
Rendering the table in two bit colour is a simple case of not applying the attribute bits
when displaying on screen.
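A compact way of forming the four bit value is sketched below. It assumes the standard NES layout, in which the last 64 bytes of each 1024-byte name table form the attribute table and each attribute byte covers a 4x4 block of tiles, two bits per 2x2 quadrant. The names are hypothetical and this is not the routine used in the project.

```java
class AttributeLookup {
    /** Returns the two attribute bits for the tile at (tileX, tileY) of one name table. */
    static int attributeBits(int[] nameTable, int tileX, int tileY) {
        int attrByte = nameTable[0x3C0 + (tileY / 4) * 8 + (tileX / 4)] & 0xFF;
        int shift = ((tileY & 2) << 1) | (tileX & 2);   // 0, 2, 4 or 6 depending on quadrant
        return (attrByte >> shift) & 0x03;
    }

    /** Attribute bits form the upper two bits, the pattern table pixel the lower two. */
    static int fourBitColour(int attributeBits, int twoBitPixel) {
        return (attributeBits << 2) | twoBitPixel;
    }
}
```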
3.6.3.2. Scroll Line Display
This can be achieved by obtaining the X Tile, X Fine, Y Tile and Y Fine values from the
PPU and simply adding the X values together and the Y values together:
Scroll X = X Tile + X Fine
Scroll Y = Y Tile + Y Fine
This will display the scroll lines at the exact pixel locations of the scroll.
3.6.3.3. Attribute Table Information
Obtainable via access to the PPU.
3.6.3.4. Name Table numeric data display
A mock up of this display is available in the GUI design.
Again, the name table data is easily accessible from the PPU. The data can be displayed
in grid formation by applying some simple maths to Java's graphics system.
drawLineX draws a solid line from the top to the bottom of the display given an X
co-ordinate. drawLineY draws a solid line from the left to the right of the display given a Y
co-ordinate. Neither method exists in Java. They are provided here to simplify the code.
int tileSize = 24;
for (int i = 0; i <= 32; i++) { // 32 tiles across: vertical grid lines
    drawLineX(i * tileSize);
}
for (int i = 0; i <= 30; i++) { // 30 tiles down: horizontal grid lines
    drawLineY(i * tileSize);
}
The Name Table values can be drawn in the appropriate places using a similar method.
3.7. GUI Design
3.7.1. The main window
Figure 46: The main window GUI Designs
The Set Keys menu item and Sound menu will result in the same tabbed window being
brought up. Depending on the item selected, the appropriate tab will be displayed. The
Graphics menu will not initially do anything. Graphics options may be incorporated if
time permits.
3.7.2. The Settings Menu
Figure 47: The Key Mapping Tab
Figure 48: The Sound Tab
Figure 49: The Graphics Tab
The settings menu consists of all input and sound options split into appropriate tabs.
3.7.2.1. The Key Mapping Tab
Changing the input device in the drop down box will change the panel below it to display
the input options available for that device. Unless time permits, the only device available
here will be the standard controller.
The implementation of the key mapping will be based on a solution suggested in the book
"Developing Games in Java" by David Brackeen (34).
3.7.2.2. The Sound Tab
The sound tab allows you to change the sound volume. This can be changed either for all
sound channels at once or for each individually. It is also possible to mute channels or all
sound.
3.7.3. The Debugger
Figure 50: The Debugger
The debugger works as described in the analysis and design.
3.7.4. The Name Table Viewer
Figure 51: The Name Table Viewer
Clicking the buttons in the "Name Table View" will result in the appropriate name table
data being displayed on screen using the GUI shown below.
Figure 52: The Name Table Number View
3.7.5. The Pattern Table Viewer
Figure 53: The Pattern Table Viewer
Tile and Palette information should be displayed for the tile or palette entry currently
hovered over by the mouse.
Note: The Pattern Table Viewer's GUI design is based heavily on the Pattern Table Viewer
of FCEUXD.
4. Implementation
4.1. Common Tactical Policies
4.1.1. Package Naming
This project will use the package naming conventions documented in the Java Language
Specification (Third Edition) (35) in an attempt to eliminate package name conflicts.
The package hierarchy used will be as follows:
uk.ac.sussex.drm24
4.1.2. Commenting Policy
In an effort to make the code clear, two types of comments will be used.
Comments of the following form will be used to describe each major block within the
code. For example, if a class contains several lookup tables, they will be labelled as such
above the first table within the code:
//=====================================================================
// LOOKUP TABLES
//=====================================================================
For code within these blocks, normal commenting will be used. For example:
// This variable represents the Accumulator.
or
/*
* This variable represents the Accumulator.
*/
4.1.3. Debugging Conventions
Any software project invariably requires a great deal of debugging.
In an effort to allow for the insertion of debugging code without it becoming intrusive or
resource hungry, a simple debugging system will be put in place.
This will involve the creation of a Debug class containing many Boolean class variables.
Each of these variables will correspond to a certain class of debug code. This will allow
for large chunks of debug code to be switched on or off with ease.
This class will also contain any methods which carry out a function helpful to the
debugging process (also class level).
For example:
class Debug {
    // When true, any general CPU debug code is executed.
    public static final boolean CPU_DEBUG = false;
    // When set, any PPU debugging information will be shown.
    public static final boolean PPU_DEBUG = false;
    // Debugging specific to display rendering.
    public static final boolean PPU_RENDERING_DEBUG = false;

    // Generic debug method given as example.
    public static void debugHelp() {}
}
When debug code is added, the appropriate flag must be checked before carrying
out the actions contained within. For example, if it is desired to print a message when the
PPU starts, this would be achieved as follows:
if (Debug.PPU_DEBUG) {
    System.out.println("PPU Started");
}
It should be noted that implementing the debugging in this way will not result in any
additional overhead to the execution time. This is because the Java compiler will exclude
any debug code whose flag is clear, and will exclude the conditional around the debug code
should the flag be set (the omitted code will not be present in the compiled byte code). It can do
this safely because the flags are declared "final", so their values cannot be reassigned.
4.1.4. GUI Structure
The project GUIs are to contain a great many listeners (to listen for user interaction with
the GUI). While there are several possible ways of writing listeners, the method to be
used in this project is to make each listener an inner class of the GUI for which it is
listening. For the most part, each component will have its own listener rather than
several components sharing one.
This should help to keep the class hierarchy manageable and the code clear,
respectively.
4.1.5. JavaDoc Documentation
The important methods and fields in the project will be documented in such a way as to
allow automatic JavaDoc documentation generation.
The benefits of this will be two-fold.
Firstly, the documentation will make development easier by allowing just the important
information from each class to be viewed. This will make it quicker and easier to
find desired information.
Secondly, the documentation will prove helpful if other developers wish to expand or
modify the code at a later date.
4.2. Software Re-use
Small pieces of code have been incorporated into the software from other projects. In
most cases, this was in an effort to not "re-invent the wheel". All code used was released
under the GNU General Public License Version 2.
Palette colour generation routine – Documented within the PPU design.
Code to carry out low pass filtering and note smoothing on the audio data as it
passed through the mixer was used from David de Niese's NESCafe emulator. (11)
Two lookup tables within the CPU – the first used to set/clear the values of the
zero and negative flags given a value, the second to determine the number of
cycles a given instruction takes to execute. (11)
An opcode information table was modified from Darron Schall and Claus
Wahlers' CPU emulator. It was used in the Disassembler. (30)
4.3. Threading
The software is multi-threaded. This implementation decision was made mainly out of
necessity.
4.3.1. Thread Creation
Because of the way the CPU was designed (executing within a never ending loop, except
when stopped), it was necessary to thread it so as to allow the rest of the program to
continue executing at the same time as the CPU.
It was decided that whilst the CPU would be created at start-up, it would only begin
instruction execution upon being started in a new thread. This new thread is created upon
a ROM being loaded:
Figure 54: Loading a ROM creates a new CPU Thread
When a Java application is loaded, all execution occurs in the "Main" thread. If a GUI is
created, an additional "AWT" thread is created which is used to listen for user interaction
with the GUI. Thus, when a ROM is loaded, there will be three executing threads.
New threads are also created upon opening the name table and pattern table viewers. This
is a necessity here also, using a similar means of execution as the CPU.
Figure 55: Opening the Name Table and Pattern Table Viewers (excluding threads managed by Java). All
created threads are members of the same Thread Group.
Note that the threads are all assigned to the "NESThreads" thread group upon creation.
This is to simplify the process of destroying the threads later (see below).
4.3.2. Thread Destruction
When a ROM is closed, all members of the NESThreads thread group are destroyed. This
is handled by setting a "stop" Boolean which is checked for periodically in each thread
(built-in Java methods for destroying threads are inherently unsafe and their use was
avoided).
It is ensured that all threads are destroyed before execution continues.
Figure 56: All three created Threads are destroyed when a ROM is closed.
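The stop-flag approach might look like the following sketch. The class names are hypothetical and the loop body is a placeholder for real CPU or viewer work; the flag is declared volatile here so that a write from another thread is guaranteed to be seen by the worker, and join() is one way of ensuring the thread has finished before execution continues.

```java
/** Hypothetical worker; the loop body stands in for CPU or viewer work. */
class StoppableWorker implements Runnable {
    // volatile so that a write from another thread is seen by the worker loop
    private volatile boolean stop = false;

    @Override
    public void run() {
        while (!stop) {
            Thread.yield();   // placeholder for one unit of real work
        }
    }

    void requestStop() { stop = true; }
}

class ThreadShutdownDemo {
    /** Starts a worker in the "NESThreads" group, stops it and waits for it to finish. */
    static boolean startAndStop() {
        ThreadGroup group = new ThreadGroup("NESThreads");
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(group, worker, "CPU");
        thread.start();
        worker.requestStop();
        try {
            thread.join();    // block until the worker has actually finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !thread.isAlive();
    }
}
```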
4.3.3. Thread Priorities
In order to make the system as efficient as possible, thread priorities change dynamically
to best suit the current situation. The Pattern table viewer, name table viewer, sprite
viewer and CPU are all threaded.
When the main GUI window has the focus, all threads except the CPU are given
the lowest possible priority so as to allow the emulator to run at the best possible
speed.
All threaded elements of the system have an accompanying GUI (except the
CPU). When one of these GUIs is hidden from view, the appropriate thread is set
to minimum priority.
4.4. Outputting Sound (36)
There is a huge discrepancy between the rates at which the NES APU and modern PC
hardware output sound samples: the APU outputs approximately 1,789,772 samples a
second, as opposed to sample rates of between 11,025 and 192,000 samples a second on a PC.
A sampling rate of 44,100 samples a second is used in this project.
In order to bridge this gap, one PC sample is output for every 41 APU outputs, its value being
the average of those 41 APU outputs.
This figure was decided upon via the following calculation:
1,789,772 / 44,100 = 40.58 (approximately 41)
Thus, there are approximately 41 APU samples output per PC sample output.
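The averaging can be sketched as follows. The class name and the use of -1 as a "no sample yet" marker are assumptions for illustration, and the sketch assumes non-negative APU sample values.

```java
class ApuDownsampler {
    static final int APU_SAMPLES_PER_OUTPUT = 41;   // 1789772 / 44100, rounded

    private int sum = 0;
    private int count = 0;

    /** Feeds one APU sample; returns the averaged sample on every 41st call, else -1. */
    int feed(int apuSample) {
        sum += apuSample;
        count++;
        if (count < APU_SAMPLES_PER_OUTPUT) {
            return -1;                // still accumulating
        }
        int average = sum / count;    // integer average of the last 41 samples
        sum = 0;
        count = 0;
        return average;
    }
}
```

Because 41 is a rounded figure, an implementation along these lines outputs slightly fewer than 44,100 samples for each second of emulated time.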
4.5. Testing
The tests below use this test specification.
Each test is labelled using the references documented within this
specification.
4.5.1. Pre-Written Test Files
These files are ROM images written with the intent of testing specific aspects of the
system. These are documented and referenced within the test specification.
CPU
    CPU Timing             Pass
    Branch Timing          Pass
    CPU Operation          Pass
    CLI Latency            Pass
    CLI and Related        Pass
    Overflow               Pass
PPU
    PPU General            Partial
    Scanline Rendering     Fail
    Sprite Overflow        Partial
    Sprite Hit             Fail
    PPU Miscellaneous      Partial
APU
    APU Miscellaneous      Partial
    Sound Test             Pass
I/O
    Joypad                 Pass
4.5.2. Project Specific Testing
4.5.2.1. Performance Testing
Partial Success.
The frame rate is reasonable, running at speeds between about 47-53 frames per second.
This does not quite meet the desired run-time speed.
4.5.2.2. Portability Testing
Windows    Pass
Ubuntu     Pass
Mac        Pass
4.5.3. Unit Testing
Overall
    Load                                 Pass
    Exit                                 Pass
CPU
    Pause                                Pass
    Stop                                 Pass
    Reset                                Pass
PPU
    Graphical Effects                    4 of 6
    Name Table Mappings                  Pass
    Pattern Table Use                    Pass
APU
    Volume Control                       Pass
I/O
    Key Mapping and input recognition    Pass
Debugger
    Register State                       Pass
    Breaks                               Pass
    Breakpoint Manager                   Pass
    Step and Resume                      Pass
    Seek PC                              Pass
    Disassembled                         Pass
    Memory Display                       Pass
Pattern Table Viewer
    Many                                 Pass
    Refresh                              Pass
Name Table Viewer
    Tiles and Scroll Lines               Pass
    Attribute Information                Pass
    Refresh                              Pass
    Two Bit Colour                       Pass
    Numeric Name Tables                  Pass
4.5.4. Integration Testing
The need for integration testing is minimal: if all units work correctly independently,
they will work correctly when combined. It is of utmost importance that the CPU
operates correctly as all other major units rely heavily upon it (APU, PPU, I/O).
5. Conclusion
5.1. Finished Software Screenshots
Figure 57: Illustrating the Background and Sprite Rendering
Figure 58: Space Invaders in Action
Figure 59: Examples of software being emulated (TV Mode)
Figure 60: Examples of software being emulated (Non-TV Mode)
Figure 61: Pattern Table and Name Table Viewers in action
Figure 62: Input and Graphic Settings Dialog
Figure 63: Two GUI Modes
Figure 64: Frame per Second Indicator and Sprite Viewer
Figure 65: About (provides cartridge information) and Help menus
Figure 66: Debugger
5.2. Success of the finished product
5.2.1. Objectives Completion Summary
CPU Emulation (O 1)                 Done
PPU Emulation (O 2)                 Partial
    Palette Handling                Done
    Register Handling               Done
    Name Table Handling             Done
    Attribute Table Handling        Done
    Sprite Rendering                Done
    Background Rendering            Partial
    Scrolling                       Partial
    Alterable Palettes (O 2.1)      Done
APU Emulation (O 3)                 Done
Control Pad Emulation (O 4)         Done
Development Environment (O 5)       Done
GUI Implementation (O 6)            Done
5.2.2. Project Evaluation
5.2.2.1 Objective Completion
All requirements of the CPU, APU, control pad emulation, development environment and GUI
are completed.
Most capabilities of the PPU are complete but the unit lacks:
5.2.2.1.1. Scanline Based Background Rendering
A routine which paints the tiles of the name tables to the display has been provided in lieu of the
scanline based renderer, and for some ROMs, this works without fault.
5.2.2.1.2. Scrolling
The scrolling of the playfield has been partially implemented. Whilst the scrolling is not visible
during the rendering of the display, the logic which lies behind the scrolling is fully
implemented.
Other minor issues also exist within the PPU which occasionally result in incorrect rendering.
These issues are predominately a result of unimplemented corner cases of PPU behaviour.
Obviously, these issues restrict the use of the emulator. However, I believe these limitations
could be overcome if more time were permitted for development.
5.2.2.2. Comparison Against Existing Software
Emulator               Graphical Support   Audio Support   Usability            Development Aids
Jamicom (Java)         Limited             None            Command Line only    None
Animosity (Java)       Very Little         None            N/A                  None
NESCafe (Java)         Full                Full            Reasonable           A Rudimentary Debugger
Nestopia (Non-Java)    Full                Full            Reasonable           None
FCEUXD (Non-Java)      Full                Full            Good                 Extensive
In a comparison against existing software on the market, I feel that my emulator fares well.
The current Java emulators available lack any real development aids whereas this project offers a
number. Also, although it is not as feature rich as FCEUXD, it has the benefit of being
cross-platform (FCEUXD executes on Windows machines only).
Admittedly, the majority of the emulators above offer superior graphical support but in my
opinion are largely inferior in regard to usability.
I feel that the developed software certainly has its place within the market, offering good
emulation and development capabilities and availability on all major operating systems.
5.2.2.3. Design Issues
I feel that the design is flawed in several areas.
Firstly, I made the false assumption at an early stage that the three major units (CPU, PPU, and
APU) were largely independent with minimal interaction between them. Although the units are
largely self contained, the interactions are significant enough that they should have been
considered in greater depth at the design stage so as to allow these interactions to be defined
more cleanly.
Secondly, I feel that although the CPU memory structure documented provides some advantages,
its complexity makes its use undesirable. This is discussed further in Alternative Methodologies
below.
The original design choice of emulating the APU's Mixer unit via the use of two lookup tables
was abandoned after implementation because their use produced an inferior sound
quality. Instead, the values output from the channels were simply fed directly to the mixer.
Although less efficient, the sound quality tends to be significantly higher.
5.2.2.4. Implementation Issues
The system suffers from issues of efficiency. This lack of efficiency stems largely from the
background rendering implementation in place. Whilst background rendering is enabled, the
sound will occasionally output from the system at an incorrect rate. This issue would almost
certainly disappear if a correct scanline based background renderer was put in place.
One aspect of the implementation which would do well to be improved is the methodology used
to determine four bit colour for the name tables. The approach is long winded and inefficient.
This should ideally be re-written to make it more compact.
5.2.3. Extensions Achieved
All the cartridge details have been made accessible to the user via the 'About' option
on the GUI. These are grouped into three sections: details about the file and its condition, details
about the memory held on the cartridge and all other details such as the mirroring mode and
region (E 3.3).
A panel was added to the debugger to display the state of execution visually to the user. The
CPU state (whether it is executing or not) is represented by either a green or red circle and an
exclamation mark with explanatory text is used to alert the user when a breakpoint is hit (and
which one) (E 3.4).
The ability to highlight specified text within the code panel (which displays the disassembled
program) has been added. This allows the user to select text to highlight from several predetermined options or to specify the text themselves. It is possible to add and remove any
number of strings to match at a time (E 3.5).
A sprite viewer was added which allows you to view all the sprites currently in memory on
screen at once (E 7).
A Frames per Second counter was added to allow the user to measure the performance of their
software (E 8).
5.2.4. Conclusion
In my opinion, the software fully satisfies the requirement of an integrated development
environment, meeting the needs of the target user group.
Because of the limitations in the PPU, the needs of the second user group (those who intend to
use the software to play games) have not been fully met. However, the system is fully usable for
many titles.
I believe the only major limitation of the system to be the lack of an efficient background
rendering routine. Whilst it is unfortunate that the PPU remains incomplete, I feel confident that
its issues could and will be remedied with additional time spent.
In conclusion, I believe that this project has been largely a success. It has resulted in a piece of
software that I view as useful to both target user groups and has been a very rewarding learning
experience.
5.3. Future Extensions
5.3.1. Additional Debugger Functionality
I feel that if the system were to be extended, most effort should be concentrated on the
development of additional debugger capabilities. For example, in the current debugger, the APU
is largely excluded. Much useful APU information could be added to the application, such as
the values of length counters and duty cycles. This would help to
make the debugging process more user friendly.
A trace logger to log the execution of the CPU could also be invaluable to users attempting to
debug their code.
On the simplest level, a trace logger could be implemented by simply writing register and
instruction values to a buffer and then outputting them to a file once the logger is stopped.
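At this simplest level, the logger might be sketched as below. The class name, the choice of logged registers and the line format are all assumptions; a PrintWriter is used so that the same code could write to a file or, as in testing, to a string.

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical trace logger: buffers one formatted line per executed instruction. */
class TraceLogger {
    private final List<String> buffer = new ArrayList<>();

    /** Records the program counter, opcode and the A, X and Y registers. */
    void log(int pc, int opcode, int a, int x, int y) {
        buffer.add(String.format("%04X  %02X  A:%02X X:%02X Y:%02X", pc, opcode, a, x, y));
    }

    /** Writes the buffered lines to the given writer and empties the buffer. */
    void flush(PrintWriter out) {
        for (String line : buffer) {
            out.print(line);
            out.print('\n');
        }
        out.flush();
        buffer.clear();
    }

    int size() { return buffer.size(); }
}
```

To write to disk, the caller could pass a PrintWriter constructed on a file name once the logger is stopped.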
5.3.2. Sound Quality
Whilst the sound outputted by the system is of a reasonable quality, it could be much improved
through the use of more complicated sampling techniques. One such technique which would
result in greatly improved sound is band limited synthesis. This technique is too complicated to
discuss here but is outlined at (37) and would be a worthwhile extension to the system.
5.3.3. Save State Functionality
This is a feature which could significantly improve the user experience. State saving could be
employed in several areas of the system:
Saving user preferences (palette alterations, volume levels, key configurations etc),
Saving execution state, allowing the user to return to the running title later,
Saving debugger state (breakpoints, text highlights, PPU manipulations).
Java provides several methods for implementing state save functionality. Three possibilities
follow:
1. The simplest method conceptually is simply to write the values which you wish to save to
a text file with names to identify the values by. You would then need to write a parser to
retrieve the values.
2. A second approach is to use the Serialisation capabilities of Java which allow objects to
be flattened and written to disk and then reconstructed.
3. Finally, the Properties object (in the java.util package) could be used. This object has
been written specifically for the purpose of saving state to a text file.
The third option is perhaps the easiest but storing the data in an object may be the more flexible
option as you then have all the capabilities of Java at your disposal for storing and retrieval.
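As an illustration of the third option, user preferences could be saved and restored along these lines. The key names and the class PreferencesDemo are hypothetical; Properties.store and Properties.load are the standard java.util methods, used here with in-memory strings rather than a file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

class PreferencesDemo {
    /** Serialises two example preferences to the Properties text format. */
    static String save(int volume, String upKey) {
        Properties props = new Properties();
        props.setProperty("sound.volume", Integer.toString(volume));
        props.setProperty("keys.up", upKey);
        StringWriter out = new StringWriter();
        try {
            props.store(out, "Emulator preferences");   // a FileWriter would work equally well
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    /** Parses text previously produced by save(). */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }
}
```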
5.4. Alternative Methodologies
In retrospect, I believe that the system design could be improved upon in several areas.
5.4.1. CPU Memory Structure
Figure 67: The CPU Memory Structure Used
The initial intention behind this structure was to provide a clear and simple way of modelling the
various types of memory within the CPU. For example, if a type of memory was read only, it
was possible to emulate this by simply overriding the 'writeToMemory' method with an empty
method.
Although this system has its advantages, it was too complex overall and required a fair amount
of "housekeeping" to keep track of the objects to be used for particular reads and writes.
Another limitation is the overhead produced. This overhead includes the method invocations
required when reading or writing to memory and the overhead produced by the above-mentioned
"housekeeping" methods.
If I were to redesign the system, I would instead use a single array to represent the memory. What
it would lose in code conciseness, it would more than make up for in efficiency and ease of
access. This would also be a much more flexible setup, allowing for a dynamic memory system
by the simple inclusion of a number of pointers into the array (making the emulation of memory
mapper hardware significantly simpler).
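A rough sketch of the single-array idea follows. It is an illustration under stated assumptions rather than a design taken from the project: the class name, the choice to place 2 KB of RAM at the start of the array followed by the PRG-ROM banks, and the two 16 KB windows at 0x8000 and 0xC000 are all assumed for the example, and PPU/APU registers are left out.

```java
class FlatMemory {
    private static final int RAM_SIZE = 0x800;    // 2 KB of internal RAM
    private static final int BANK_SIZE = 0x4000;  // 16 KB PRG-ROM bank

    private final int[] memory;   // RAM first, then every PRG-ROM bank
    private int lowBankBase;      // index in memory mapped at 0x8000
    private int highBankBase;     // index in memory mapped at 0xC000

    FlatMemory(int[] prgRom) {
        memory = new int[RAM_SIZE + prgRom.length];
        System.arraycopy(prgRom, 0, memory, RAM_SIZE, prgRom.length);
        lowBankBase = RAM_SIZE;
        highBankBase = RAM_SIZE + prgRom.length - BANK_SIZE;   // last bank
    }

    int read(int address) {
        if (address < 0x2000) return memory[address & 0x7FF];   // RAM and its mirrors
        if (address >= 0xC000) return memory[highBankBase + address - 0xC000];
        if (address >= 0x8000) return memory[lowBankBase + address - 0x8000];
        return 0;   // registers and cartridge RAM are omitted in this sketch
    }

    void write(int address, int value) {
        if (address < 0x2000) memory[address & 0x7FF] = value & 0xFF;
        // writes at 0x8000 and above are ignored, which models read-only ROM
    }

    /** Re-points the 0x8000 window at another bank, as a mapper might. */
    void selectLowBank(int bank) { lowBankBase = RAM_SIZE + bank * BANK_SIZE; }
}
```

Switching banks then only changes an index, with no objects to swap or track.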
5.4.2. Palette Implementation
The NES colour palette was implemented using a mathematical algorithm devised by Kevin
Horton and implemented by David de Niese. Whilst this provides the advantage of very accurate
colours and a simplistic means of changing hue and tint, it is inflexible in that providing users
with the ability to change the colour palettes to their liking would be difficult.
Thus, I have concluded that the preferable methodology would be to implement the palette
entries using simple RGB values. This would likely result in a less accurate colour scheme but
would be much simpler to allow user manipulation.
6. Works Cited
1. Burdett, A, et al. A Glossary of Computing Terms. A Glossary of Computing Terms. 9th
Edition. s.l. : Longman, 1998, pp. 30-31.
2. Diskin, Patrick. Nintendo Entertainment System Documentation. NesDev. [Online] [Cited: 12
October 2007.] http://nesdev.parodius.com/NESDoc.pdf.
3. Various. Turing Complete. Wikipedia. [Online] [Cited: 2008 February 23.]
http://nostalgia.wikipedia.org/wiki/Turing-complete.
4. Wilen, Toni. WinUAE. WinUAE. [Online] [Cited: 16 October 2007.] http://www.winuae.net.
5. Sundell, Per Hakan. CCS64 - A Commodore 64 Emulator - By Per Håkan Sundell.
Computerbrains. [Online] [Cited: 16 October 2007.] http://www.computerbrains.com/ccs64.
6. Rechlin, Eric. HP Calculator Emulators for the PC. hpcalc.org. [Online] [Cited: 16 October
2007.] http://www.hpcalc.org/hp49/pc/emulators.
7. Smith, Tony. Bleem beats Sony. The Register. [Online] [Cited: 15 October 2007.]
http://www.theregister.co.uk/1999/04/12/bleem_beats_sony.
8. —. Playstation emulator wins first round against Sony. The Register. [Online] [Cited: 15
October 2007.] http://www.theregister.co.uk/1999/02/05/playstation_emulator_wins_first_round.
9. Ninn. Emulators. Patent Pending. [Online] [Cited: 12 November 2007.]
http://patpend.net/emulators/NES/OS/.
10. Ani. Java NES Emulators. Zophar's Domain. [Online] [Cited: 12 November 2007.]
http://www.zophar.net/java/nes.html.
11. De Niese, David. Download NESCafe. NESCafe Web. [Online] [Cited: 12 November 2007.]
http://www.nescafeweb.com/main.download.php.
12. Freij, Martin. Nestopia Index. Nestopia. [Online] [Cited: 12 November 2007.]
http://nestopia.sourceforge.net/.
13. Porst, Sebastian. Release of FCEUXD SP 1.07. Programming Stuff. [Online] [Cited: 12
November 2007.] http://www.the-interweb.com/serendipity/.
14. Vardy, Adam. Extra Instructions of the 65XX Series CPU. FC64 Wiki. [Online] [Cited: 12
October 2007.] https://mirror1.cvsdude.com/trac/osflash/fc64/wiki/6502Extras.
15. Berthouze, Luc. Software Engineering: Product and Processes. Software Engineering.
[Online] [Cited: 25 March 2007.]
http://www.sussex.ac.uk/informatics/syllabus/current/15626.html.
16. University of Illinois. The Nintendo Entertainment System. UIUC Computer Science
Department. [Online] [Cited: 12 October 2007.]
http://www.cs.uiuc.edu/homes/luddy/PROCESSORS/Nintendo.pdf.
17. Zaks, Rodney. Programming the 6502. Programming the 6502. Berkeley CA : Sybex Inc,
1983.
18. Burnett, Colin. ALU Symbol. Arithmetic Logic Unit. [Online] [Cited: 25 October 2007.]
http://en.wikipedia.org/wiki/Image:ALU_symbol.svg.
19. Bluechip. Rockwell 6502 Programmers Reference. Cyborg Systems. [Online] [Cited: 16
October 2007.] http://homepage.ntlworld.com/cyborgsystems/CS_Main/6502/6502.htm.
20. Rost, Bob. Nintendo. Game Development for the 8-bit NES - A class by Bob Rost. [Online]
[Cited: 28 October 2007.] http://bobrost.com/nes/lectures/NES_January_21.pdf.
21. Green, Shay and Disch. NES PPU. NES Dev Knowledge Base. [Online] [Cited: 26
November 2007.] http://nesdevwiki.org/wiki/index.php/NES_PPU.
22. Fry, Ben. deconstructulator. ben fry. [Online] [Cited: 30 October 2007.]
http://acg.media.mit.edu/people/fry/deconstructulator/.
23. Moby Games. Super Mario Bros. 2. MobyGames. [Online] [Cited: 13 November 2007.]
http://www.mobygames.com/game/super-mario-bros-2.
24. Fayzullin, Marat. Nintendo Entertainment System Architecture. Computer Emulation
Resources. [Online] [Cited: 30 October 2007.] http://fms.komkon.org/EMUL8/NES.html.
25. Green, Shay. NES APU Sound Hardware Reference. Blargg's Home. [Online] [Cited: 19
October 2007.] http://www.slack.net/~ant/nes-emu/apu_ref.txt.
26. Taylor, Brad. 2A03 Sound Channel Hardware Documentation. Game Development for the
8-bit NES - A class by Bob Rost. [Online] [Cited: 19 October 2007.]
http://bobrost.com/nes/files/nessound.txt.
27. Kemp, Kevin. Mixer Image. Introduction to Digital Home Recording. [Online] [Cited: 1
November 2007.] http://www.kevinkemp.com/homerecordingtutorial/images/softwaremixer.JPG.
28. Green, Shay and Disch. APU Mixer. NES Dev Knowledge Base. [Online] [Cited: 27
November 2007.] http://nesdevwiki.org/wiki/index.php/APU_Mixer.
29. Fayzullin, Marat. How To Write a Computer Emulator. Computer Emulation Resources.
[Online] [Cited: 14 November 2007.] http://fms.komkon.org/EMUL8/HOWTO.html.
30. Schall, Darron and Wahlers, Claus. CPU Core. fc64 C64 Emulator Source Code. [Online]
[Cited: 10 November 2007.]
http://svn1.cvsdude.com/osflash/fc64/trunk/projects/fc64/core/cpu/CPU6502.as.
31. Horton, Kevin. NES Palette Generator. Bluetech. [Online] [Cited: 14 November 2007.]
http://nesdev.parodius.com/kevin_palette.txt.
32. Niese, David de. NESCafe Nintendo Emulator for Java. The David de Niese Homepage.
[Online] [Cited: 14 November 2007.] http://www.daviddn.com/nescafe/index.asp.
33. Loopy. The Skinny on Scrolling. NES DEV. [Online] [Cited: 27 November 2007.]
http://nesdev.parodius.com/loopyppu.zip.
34. Brackeen, David. Developing Games in Java. s.l. : New Rider Games, 2003.
35. Sun Microsystems. Packages. The Java Language Specification Third Edition. [Online]
[Cited: 26 September 2007.]
http://java.sun.com/docs/books/jls/third_edition/html/packages.html#7.7.
36. Disch. APU Sound Frequencies. NesDev. [Online] [Cited: 15 February 2008.]
http://nesdev.parodius.com/bbs/viewtopic.php?t=4011.
37. Blargg. Band-Limited Sound Synthesis. Blargg's Home. [Online] [Cited: 17 February 2008.]
http://slack.net/~ant/bl-synth/.
38. Disch. NES Memory Mapping Version 1.0. Romhacking dot net. [Online] [Cited: 12 October
2007.] http://www.romhacking.net/docs/353/.
39. Firebug. Comprehensive NES Mapper Document v0.80. TuxNES. [Online] [Cited: 12
October 2007.] http://tuxnes.sourceforge.net/mappers-0.80.txt.
40. Hunsinger, Ed. How does the Nintendo Light Gun work? Geeked.info. [Online] [Cited: 12
October 2007.] http://www.geeked.info/how-does-the-nintendo-light-gun-work/.
41. Gamers Graveyard. NES Four Score. Gamers Graveyard. [Online] [Cited: 12 October
2007.] http://www.gamersgraveyard.com/repository/nes/peripherals/fourscore.html.
42. —. Power Pad/Family Fun and Fitness/Family Trainer. Gamers Graveyard. [Online] [Cited:
12 October 2007.] http://www.gamersgraveyard.com/repository/nes/peripherals/powerpad.html.
43. The Mighty Mike Master. NES Game Genie Technical Notes. TuxNES. [Online] [Cited: 12
October 2007.] http://tuxnes.sourceforge.net/gamegenie.html.
44. Nick M. Introducing the Miracle System. The Warp Zone. [Online] [Cited: 12 October
2007.] http://thewarpzone.classicgaming.gamespy.com/piano.html.
45. Green, Shay. NES Tests/CPU. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-tests/.
46. —. NES Tests/Branch. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-tests/branch_timing_tests.zip.
47. Horton, Kevin. NES. BlueTech. [Online] [Cited: 12 November 2007.]
http://tripoint.org/kevtris.
48. Green, Shay. NES Tests/CLI. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-emu/cli_latency_tests.zip.
49. Bridgewater, Alastair. NES Programs. Nes Dev. [Online] [Cited: 27 November 2007.]
http://nesdev.parodius.com/overtest.zip.
50. Green, Shay. NES Tests/PPU. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-tests/blargg_ppu_tests.zip.
51. Green, Shay. NES Tests/PPU Overflow. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-emu/sprite_overflow_tests.zip.
52. —. NES Tests/PPU Hit. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-tests/sprite_hit_timing.zip/http://www.slack.net/~ant/nestests/sprite_hit_tests.zip.
53. —. NES Tests/PPU More. Blargg's Home. [Online] [Cited: 13 November 2007.]
http://www.slack.net/~ant/nes-emu/vbl_nmi_timing.zip.
54. —. NES Tests/APU 2005. Blargg's Home. [Online] [Cited: 12 November 2007.]
http://www.slack.net/~ant/nes-tests/blargg_apu_2005.07.30.
55. Fry, Ben. Programs. Moogle Charm. [Online] [Cited: 21 November 2007.]
www.morganleahrecords.com/mooglecharm/programs.html.
56. Gough, Paul. Computer Systems Architecture Notes. G6015 Computer Architectures.
[Online] [Cited: 30 October 2007.]
http://www.informatics.sussex.ac.uk/users/michaelg/computerarchitectures/course_notes.pdf.
57. fluBBa. FluBBas TechDocs. GBARetro.com. [Online] [Cited: 12 November 2007.]
http://www.ndsretro.com/download/NEStress.zip.
58. Green, Shay. NES Tests/CLI More. Blargg's Home. [Online] [Cited: 13 November 2007.]
http://www.slack.net/~ant/nes-tests/cli_latency_tests.zip.
59. Firebug. NES ROMS - Starting with j. Rom Hustler. [Online] [Cited: 31 November 2007.]
60. Oorni, Lasse. Roms NES. Consolemul. [Online] [Cited: 21 November 2007.]
http://roms.consolemul.com/index.php?machine=20.
7. Appendices
Appendix A: Cartridge Specification
Appendix B: File Format Specification
Appendix C: Regional Differences Specification
Appendix D: Input Devices and Other Peripherals
Appendix E: Background Rendering in Detail
Appendix F: Low Level Designs
Appendix G: Test Specification
Appendix H: Project Logs
Appendix I: GNU GENERAL PUBLIC LICENSE Version 2
Appendix J: Source Code
Appendix A: Cartridge Specification
General
All software for the NES was supplied as ROM encased in a plastic cartridge external to the
system. It was executed by slotting the cartridge into a 72-pin connector on the NES and
turning on the power.
Figure 68: A NES game cartridge. Used by slotting into the cartridge slot in the NES hardware (adapted from (2))
Basic cartridges contain two types of ROM, CHR-ROM and PRG-ROM.
The CHR-ROM contains the pattern tables of the game whilst the PRG-ROM contains the
actual program code.
Basic cartridges contain either 16KB or 32KB of PRG-ROM depending on the size of the program.
Figure 69: The annotated insides of a standard NES cartridge (un-annotated image from (2))
Additional Hardware
Cartridges could (and very often did) contain additional hardware which provided extra
functionality. These enhancements are briefly summarised below:
WRAM
WRAM allowed information to be saved so that the user could return to a previous state in the
execution of the program. For example, returning to the beginning of a level in a game when
the player "dies", with the same statistics they had when they first entered the level.
This RAM may retain data even when the console is switched off via the use of a small battery
maintaining the current through the memory.
Memory Management Chips (MMCs) (2) (38) (39)
Memory management chips (also commonly known as memory mappers) were used to
counteract the limitations in the NES hardware.
They allowed the use of a larger number of both PRG and CHR-ROM banks thus allowing larger
programs with superior graphics.
This was achieved by the executing program indicating the need for data from a ROM bank not
currently loaded into memory. The MMC would then swap the required data into a defined page
of memory for use by the program.
Some mappers also provided additional functionality such as the ability to trigger IRQs and
enhanced graphical manipulations (such as the ability to only scroll certain areas of the screen).
Appendix B: File Format Specification
In order to parse the ROM files that make up the NES software, a file format has to be decided
upon.
There exist two main formats in use by NES emulators today. These are the iNes format and
UNIF. The iNes format was the first to be proposed but has been criticised for being ambiguous
and for storing little data about the software title, making correct emulation more difficult. UNIF
attempts to fix these issues and more.
Whilst UNIF would seem the natural choice, practically all ROM files use the iNes format.
For this reason, iNes has been chosen to provide maximum compatibility.
iNes (24)
The iNes format consists of a 16 byte header situated at the beginning of ROM files.
It identifies which MMC is used (if any) via an 8-bit number. The numbers for each mapper were
decided upon by Marat Fayzullin (the creator of the iNes format).
After the header, the ROM banks should be stored in the file in ascending order. If a trainer
exists, its 512 bytes will precede the ROM banks in the file.
The format makes reference to both trainers and the VS System. These are briefly discussed
here:
Trainers
Trainers are 512 bytes of code which were used to allow the copying of cartridges. They allowed
the bypassing of the normal MMC used by the cartridge, instead using the MMC defined by the
copier.
The use of trainers will not be emulated as cartridge copiers no longer need to resort to using
them. Thus, the only possible purpose for emulating this functionality would be to allow the
proper handling of illegally obtained ROM files.
VS System
The VS System series is a collection of arcade games, based on many NES titles, which were
designed for competitive play between two people.
This functionality will not be implemented as it has very little scope for use and would leave less
time available for more useful implementations.
The iNes format is summarised below:
Byte    Contents
0-3     This should contain the string 'NES' followed by the MS-DOS end-of-file
        character (in hex: $4E $45 $53 $1A)
4       The number of 16KB PRG-ROM banks (program code)
5       The number of 8KB CHR-ROM banks (pattern tables)
6       Bit 0 - Indicates the name table mirroring scheme used.
            0 - Horizontal mirroring
            1 - Vertical mirroring
        Bit 1 - Presence of battery backed RAM
            0 - No battery backed RAM at $6000-$7FFF
            1 - Battery backed RAM at $6000-$7FFF
        Bit 2 - Presence of a trainer (see Trainers above)
            0 - No 512 byte trainer present at $7000-$71FF
            1 - A 512 byte trainer present at $7000-$71FF
        Bit 3 - Presence of four screen name table mirroring
            0 - The name table mirroring scheme indicated in bit 0 is used
            1 - Four screen name table mirroring is used
        Bits 4-7 - Four lower bits of the ROM Mapper type
7       Bit 0 - VS System cartridge (see VS System above)
            0 - The cartridge is not of the VS-System type
            1 - The cartridge is of the VS-System type
        Bits 1-3 - Reserved for later use. They should all be set to 0.
        Bits 4-7 - Four higher bits of the ROM Mapper type
8       The number of 8KB RAM banks present (when zero, this should be read to mean
        that one bank exists. This is for compatibility reasons).
9       Bit 0 - Indicates the region of the original cartridge
            0 - NTSC
            1 - PAL
        Bits 1-7 - Reserved for later use. They should all be set to 0.
10-15   These bytes are reserved for later use and should all be set to 0.
Table 5: A summarisation of the iNes header format (24)
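As an illustration of the header layout summarised in Table 5, the following Java sketch decodes the fields from the first 16 bytes of a ROM file. The class name InesHeader and all of its field and method names are hypothetical and are not taken from the project source; the sketch assumes the 16 header bytes have already been read into an array.

```java
/** Hypothetical decoder for the 16-byte iNes header (names are illustrative only). */
final class InesHeader {
    final int prgBanks;              // byte 4: number of 16KB PRG-ROM banks
    final int chrBanks;              // byte 5: number of 8KB CHR-ROM banks
    final boolean verticalMirroring; // byte 6, bit 0
    final boolean batteryRam;        // byte 6, bit 1
    final boolean trainer;           // byte 6, bit 2
    final boolean fourScreen;        // byte 6, bit 3
    final boolean vsSystem;          // byte 7, bit 0
    final int mapper;                // byte 7 bits 4-7 (high) and byte 6 bits 4-7 (low)
    final int ramBanks;              // byte 8, with zero read as one bank
    final boolean pal;               // byte 9, bit 0

    InesHeader(byte[] h) {
        // Bytes 0-3 must hold 'N', 'E', 'S' and the MS-DOS end-of-file character.
        if (h.length < 16 || h[0] != 0x4E || h[1] != 0x45 || h[2] != 0x53 || h[3] != 0x1A) {
            throw new IllegalArgumentException("Missing 'NES' + $1A signature");
        }
        int flags6 = h[6] & 0xFF;
        int flags7 = h[7] & 0xFF;
        prgBanks = h[4] & 0xFF;
        chrBanks = h[5] & 0xFF;
        verticalMirroring = (flags6 & 0x01) != 0;
        batteryRam = (flags6 & 0x02) != 0;
        trainer = (flags6 & 0x04) != 0;
        fourScreen = (flags6 & 0x08) != 0;
        vsSystem = (flags7 & 0x01) != 0;
        mapper = (flags7 & 0xF0) | (flags6 >> 4);
        int banks = h[8] & 0xFF;
        ramBanks = (banks == 0) ? 1 : banks;
        pal = (h[9] & 0x01) != 0;
    }

    /** File offset of the first PRG-ROM bank: the header plus an optional 512-byte trainer. */
    int prgOffset() {
        return 16 + (trainer ? 512 : 0);
    }
}
```

For example, a header whose byte 6 is $15 and byte 7 is $40 would decode to mapper 65 with vertical mirroring and a trainer present, so the PRG-ROM banks would begin at file offset 528.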
Appendix C: Regional Differences Specification
It is important to note that all three of the main units to be implemented vary in minor ways
depending on the region that the NES is manufactured for (NTSC or PAL). These differences are
noted below (2) (25):
                            NTSC                PAL
CPU Clock Speed             1.79 MHz            1.66 MHz
PPU Clock Speed             21.477272MHz / 4    26.601712MHz / 5
Frames Per Second           60                  50
Visible Screen Resolution   256x224             256x240
Table 6: Differences between the NTSC and PAL NES
Additionally, the sound frequencies emitted by the noise and DMC sound channels tend to be
higher for the NTSC APU.
I intend to write the emulator to conform to the NTSC NES specification as this will allow for a
wider array of software to be used on the system than if PAL was chosen.
PAL ROMs will still be useable but will execute in a way not intended by the developers (due to
the discrepancies in machine specification).
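The clock speeds in Table 6 can be related to one another with a little arithmetic. The Java sketch below derives the PPU clocks from the master clock figures given in the table. The CPU divisors of 12 (NTSC) and 16 (PAL) are not stated in this report; they are added here as an assumption, chosen because they reproduce the 1.79 MHz and 1.66 MHz figures above. The class and method names are hypothetical.

```java
/** Hypothetical helper relating the master, PPU and CPU clock rates of each region. */
final class RegionTiming {
    static final double NTSC_MASTER_HZ = 21_477_272.0;
    static final double PAL_MASTER_HZ = 26_601_712.0;

    /** PPU clock: master / 4 for NTSC, master / 5 for PAL (as in Table 6). */
    static double ppuHz(boolean pal) {
        return pal ? PAL_MASTER_HZ / 5 : NTSC_MASTER_HZ / 4;
    }

    /** CPU clock: master / 12 for NTSC, master / 16 for PAL (assumed divisors). */
    static double cpuHz(boolean pal) {
        return pal ? PAL_MASTER_HZ / 16 : NTSC_MASTER_HZ / 12;
    }
}
```

Under these assumptions cpuHz(false) is roughly 1.79 MHz and cpuHz(true) roughly 1.66 MHz, which illustrates why a PAL ROM run against NTSC timing executes faster than its developers intended.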
Appendix D: Input Devices and Other Peripherals
Input Devices
E 4.1. The NES Zapper. This is a peripheral shaped like a gun and used as such.
The user points it at the screen and presses the "fire" button, with the aim to
"shoot" targets on screen. (40)
E 4.2. The NES Four Score. This peripheral allows for up to four people to play
the same game simultaneously. This is achieved via inputs on the device
allowing four control pads to be inserted. (41)
E 4.3. The Power Pad. This peripheral consisted of a mat with inputs which the
users were supposed to press with their feet. It was designed as a way of
getting fit whilst playing games. (42)
Other Peripherals
E 4.4. The NES Game Genie. This peripheral allows the user to alter the way that
the software used in the NES is executed via codes input into the Game Genie. The NES cartridge
is inserted into the Game Genie which is then, in turn, inserted into the NES cartridge slot.
The codes input by the user translate into addresses and data in a game‘s program space which
the Game Genie tricks the CPU into using instead of the data that should be there. (43)
E 4.5. The Miracle Piano Teaching System is a peripheral which can be used to
learn basic skills on the piano. A keyboard with pedals is provided, as is software for teaching.
The software‘s AI alters lessons based on how the user plays the keyboard. (44)
Appendix E: Background Rendering in Detail
Display Rendering
The rendering process relies on the values in three internal registers:
PPUADDR
XFine
loopyT (named after the person who first identified the scrolling behaviour).
PPUADDR is used and manipulated throughout the rendering process in order to render the
image in the name tables to the display. Thus, it should not be altered by the programmer during
rendering so as to avoid rendering issues.
It should, however, be possible to write to PPUADDR during rendering without altering the
rendering behaviour.
loopyT is used for this purpose and is written to via writes to PPUCTRL and PPUSCROLL.
PPUADDR is updated with the value in loopyT once every frame. This allows the updating of
PPUADDR without interfering with the rendering.
The bits of loopyT and XFine are interpreted in such a way as to always point to a particular
pixel within a particular name table.
The meaning behind the bits in these registers is illustrated in the diagram below:
Figure 70: Display Rendering Register Interpretations
X Tile Position – The first tile from the X axis of the current name table which should be
rendered for the current scanline.
Y Tile Position – The first tile from the Y axis of the current name table which should be
rendered for the current scanline.
X Fine Position – The first column of the selected tile which should be rendered for the current
scanline.
Y Fine Position – The first row of the selected tile which should be rendered for the current
scanline.
Name Table Base Address - Dictates the name table that rendering should begin at.
Name Table Address (in red) – Contains the actual address of the next name table element to be
rendered.
The writes to PPUADDR are carried out by the user. They are included to show that loopyT is
copied into PPUADDR upon the second write, and that bit 15 is always set to zero. This is to
prevent the user from attempting to reference memory locations not present (14 bits gives a
maximum address of 0x3FFF, the highest referenceable memory location in the PPU).
The use of X tile, Y tile, X fine and Y fine is illustrated below:
Figure 71: Applying the Register Interpretations to the Name Tables
Rendering Behaviour
After each tile of a scanline is rendered to the display, the value of X Tile is incremented,
allowing the PPU to render the next tile on the name table's X axis.
The X Tile's value should wrap from 31 to 0 (the end of the name table has been reached).
This will result in bit 11 of loopyT being inverted, switching the horizontal name table
which will be used for rendering from now on.
After every complete scanline has been rendered to the display, Y Tile is incremented, allowing
rendering from the next row of tiles in the name table. Y Tile wraps from 29 to 0 (the end of the
name table has been reached). This will result in bit 12 of loopyT being inverted, switching the
vertical name table which will be used for rendering from now on.
Figure 72: Name Table Switching
As can be seen from the above diagram, this bit inverting process ensures that the PPU will never
"run out" of name table to render, simply alternating between tables each time the current name
table comes to an end.
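The wrapping and name table switching described above can be sketched in Java as follows. This is a minimal illustration rather than the project's implementation. It assumes the conventional register layout documented by Loopy (33), in which X Tile occupies the lowest five bits, Y Tile the next five, and the horizontal and vertical name table selects are the masks 0x0400 and 0x0800 (the bits this appendix calls 11 and 12 when counting from 1). The fine Y counter is omitted, and the class and method names are hypothetical.

```java
/** Hypothetical sketch of the X Tile / Y Tile counters and name table switching. */
final class ScrollCounters {
    static final int X_TILE_MASK = 0x001F;  // five bits: tile columns 0-31
    static final int Y_TILE_MASK = 0x03E0;  // five bits: tile rows 0-29 in normal use
    static final int H_NAME_TABLE = 0x0400; // horizontal name table select
    static final int V_NAME_TABLE = 0x0800; // vertical name table select

    /** Moves to the next tile column, switching horizontal name table after column 31. */
    static int incrementXTile(int addr) {
        if ((addr & X_TILE_MASK) == 31) {
            return (addr & ~X_TILE_MASK) ^ H_NAME_TABLE; // wrap to 0 and invert the select bit
        }
        return addr + 1;
    }

    /** Moves to the next tile row, switching vertical name table after row 29. */
    static int incrementYTile(int addr) {
        int y = (addr & Y_TILE_MASK) >> 5;
        if (y == 29) {
            return (addr & ~Y_TILE_MASK) ^ V_NAME_TABLE; // wrap to 0 and invert the select bit
        }
        y = (y + 1) & 31;
        return (addr & ~Y_TILE_MASK) | (y << 5);
    }
}
```

Because the select bits are inverted rather than cleared, calling incrementXTile 64 times from any starting column returns to the starting value, which mirrors the alternation between tables described above.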
Ordinarily, the X Fine value is not changed during the rendering of a frame. This is so that each
scanline begins rendering at the same point, maintaining the image stored in the name table
during scrolling. However, some programs manipulate this value to achieve various effects (such
as split screen).
At the beginning of each scanline, certain bits of loopyT are copied into PPUADDR to achieve
two purposes:
"Resets" the X Tile value. This will ordinarily copy across the X Tile value present before the
scanline began (resulting in the next scanline beginning its horizontal rendering at the same point
as the one before it).
Sets bit 11 of PPUADDR to bit 11 of loopyT. This will return to the horizontal name table being
used before the scanline began in case a horizontal name table boundary was crossed during the
previous scanline (resulting in the alternate horizontal table being set for use).
Appendix F: Low Level Designs
1. CPU
1.1 General CPU Design
Field : stopCPU : boolean
Field : pauseCPU : boolean
Field : counter : int
Field : interruptPeriod : int
counter = interruptPeriod;
stopCPU = false;
pauseCPU = false;
for (;;) {
    if (!pauseCPU) { // To ensure the CPU can be stopped whilst paused, the CPU
                     // must be un-paused before attempting to stop it.
        Deal with the next instruction
        counter = counter - number of cycles for current instruction
    }
    if (counter <= 0) {
        if (stopCPU) {
            break;
        }
        else {
            Deal with all cyclic tasks
            counter = counter + interruptPeriod;
        }
    }
}
1.2. Memory
Class Memory {
    Field : memory : int Array
    Constructor Memory(int : memorySize) { // in bytes
        memory = new Array[memorySize];
    }
    void : writeToMemory(int : address, int : data) {
        memory[address] = data;
    }
    int : readFromMemory(int : address) {
        return memory[address];
    }
}
Class ZeroPage extends Memory { // Requires no overriding.
    Constructor ZeroPage() {
        Super(256);
    }
}
Class Stack extends Memory { // The stack grows "backwards" in memory.
    Constructor Stack(int : memorySize) {
        Super(256);
    }
    void : writeToMemory(int : stackPointer, int : data) {
        memory[stackPointer] = data;
        decrement stackPointer;
    }
    int : readFromMemory(int : stackPointer) {
        increment stackPointer;
        return memory[stackPointer];
    }
}
Class CartRAM extends Memory {
    Constructor CartRAM(int : memorySize) {
        Super(8192); // 8KB
    }
}
Class CartROM extends Memory {
    /* NOTE: Even though writing to ROM does nothing, a method for writing should
       still be provided as some programs include instructions which write to ROM as an
       anti-piracy measure (if the write is successful, the program ends execution because
       it knows it is not executing on official hardware). */
    Constructor CartROM(int : memorySize) {
        Super(16384); // One 16KB bank
    }
    void : writeToMemory(int : address, int : data) {
        // Writing to ROM is not allowed.
    }
}
Class CPURAM extends Memory {
    Constructor CPURAM(int : memorySize) {
        Super(1536); // 2KB less the zero page and the stack
    }
}
Class ExpansionROM extends Memory {
    Constructor ExpansionROM(int : memorySize) {
        Super(8160); // $4020-$5FFF
    }
    void : writeToMemory(int : address, int : data) {
        // Writing to ROM is not allowed.
    }
}
Class IO extends Memory {
    /* No actual data is stored. Just provides a means of interacting with external devices. */
    Constructor IO(int : memorySize) {
        Super(0);
    }
    void : writeToMemory(int : address, int : data) {
        switch (address) {
            case (0x2000) :
                // PPU Control Register 1
            case (0x2001) :
                // PPU Control Register 2
            …
        }
    }
    int : readFromMemory(int : address) {
        switch (address) {
            case (0x2000) :
                // return PPU Control Register 1
            case (0x2001) :
                // return PPU Control Register 2
            …
        }
    }
}
1.3. Addressing Modes
// In the methods below, getMemory(PC++) fetches the next byte of the instruction
// and then increments the program counter.
int : absolute() { // Returns the operand found at a 16-bit address.
    int : lowEightBits = getMemory(PC++);
    int : highEightBits = getMemory(PC++);
    int : address = (highEightBits << 8) | lowEightBits; // full 16-bit address
    return getMemory(address);
}
// absoluteY also present.
int : absoluteX() { // Returns the operand found at a 16-bit address plus X
    int : lowEightBits = getMemory(PC++);
    int : highEightBits = getMemory(PC++);
    // full 16-bit address + X (with wrapping to ensure a valid address)
    int : address = (((highEightBits << 8) | lowEightBits) + X) & 0xFFFF;
    return getMemory(address);
}
int : zeroPage() {
    return getMemory(getMemory(PC++));
}
int : zeroPageX() { // zeroPageY also exists. Identical to zeroPageX
    int : address = getMemory(PC++);
    address = (address + X) & 0xFF; // Logical AND to keep the address in zero
                                    // page (wraparound)
    return getMemory(address);
}
int : indirect() {
    int : address = getMemory(PC++);
    int : lowEightBits = getMemory(address);
    int : highEightBits = getMemory((address+1) & 0xFF); // Possible wraparound.
    return getMemory((highEightBits << 8) | lowEightBits);
}
int : indexedIndirect() {
    int : address = getMemory(PC++);
    address = (address + X) & 0xFF; // Wraparound possible.
    int : lowEightBits = getMemory(address);
    int : highEightBits = getMemory((address+1) & 0xFF); // Possible wraparound.
    return getMemory((highEightBits << 8) | lowEightBits);
}
int : indirectIndexed() {
    int : address = getMemory(PC++);
    int : lowEightBits = getMemory(address);
    int : highEightBits = getMemory((address+1) & 0xFF); // Possible wraparound.
    int : baseAddress = ((highEightBits << 8) | lowEightBits);
    baseAddress = (baseAddress + Y) & 0xFFFF; // Wraparound possible.
    return getMemory(baseAddress);
}
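The two kinds of wraparound used by the addressing modes above (within the zero page, and within the full 16-bit address space) can be shown with a small runnable Java sketch. It is an illustration only: the class AddressingSketch, its flat 64KB memory array and its fetch() helper are hypothetical stand-ins for the Memory classes of section 1.2, and it returns effective addresses rather than operands to keep the example short.

```java
/** Hypothetical sketch of effective address calculation for two indexed modes. */
final class AddressingSketch {
    final int[] memory = new int[0x10000]; // flat 64KB address space for illustration
    int pc;                                // program counter
    int x;                                 // X index register

    /** Reads the byte at PC and then advances PC, wrapping at 16 bits. */
    int fetch() {
        int value = memory[pc];
        pc = (pc + 1) & 0xFFFF;
        return value;
    }

    /** Zero page,X: the sum is masked to 8 bits so it never leaves the zero page. */
    int zeroPageXAddress() {
        return (fetch() + x) & 0xFF;
    }

    /** Absolute,X: low byte first, then high byte; the sum is masked to 16 bits. */
    int absoluteXAddress() {
        int low = fetch();
        int high = fetch();
        return (((high << 8) | low) + x) & 0xFFFF;
    }
}
```

With X holding $20, a zero page operand of $F0 yields the effective address $10 rather than $110, and an absolute operand of $FFFF yields $001F.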
1.4. DMA Access
// The DMA controller is used to write 256 bytes from CPU memory to Sprite
// memory.
// The transfer takes a total of 512 cycles to complete.
// The CPU is unable to access memory while this process takes place.
// The DMA controller is started by a write to register $4014. The operand given
// specifies the page of CPU memory to copy from (the start address is the
// operand multiplied by 0x100).
void : DMA(int : start) {
    start = start * 0x100;
    for (i = 0; i < 256; i++) {
        spriteMem.writeToMemory(i, CPUMem.readFromMemory(start + i));
    }
}
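A runnable Java version of the sprite DMA copy might look like the sketch below. It assumes that the value written to $4014 selects a 256-byte page of CPU memory, so that the transfer begins at that value multiplied by 0x100. Plain int arrays stand in for the Memory classes, the 512-cycle cost of the transfer is not modelled, and the class and method names are hypothetical.

```java
/** Hypothetical sketch of the sprite DMA transfer triggered by a write to $4014. */
final class SpriteDma {
    /**
     * Copies one 256-byte page of CPU memory into sprite memory.
     * 'page' is the value written to $4014 (an assumption: it selects $XX00-$XXFF).
     */
    static void transfer(int[] cpuMemory, int[] spriteMemory, int page) {
        int start = (page & 0xFF) << 8; // page number to start address
        for (int i = 0; i < 256; i++) {
            spriteMemory[i] = cpuMemory[start + i];
        }
    }
}
```

Writing $02 to $4014 would therefore copy CPU addresses $0200-$02FF into sprite memory locations 0-255.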
1.5. Utility Methods
Branch
switch (branchOpCode) {
Case (0xB0) : // BCS – Branch on carry set
return Carry Set?
Case (…
}
CheckPageBoundary
return (newAddress EOR oldAddress) & 256; // (30)
LoadWord
int : loadWord(int : lowestByte) {
    int : lowEightBits = getMemory(lowestByte);
    int : highEightBits = getMemory((lowestByte + 1) & 0xFFFF); // wraparound.
    int : address = (highEightBits << 8) | lowEightBits; // 16-bit address
    return getMemory(address);
}
2. PPU
2.1. PPUCTRL ($2000)
// Certain bits of the registers will cause different effects in the rendering of the
// image to the screen.
// Bits 0 and 1. Give the base name table address to be used.
int : nameTableToUse = PPUCTRL & 0x03; // binary 00000011
switch (nameTableToUse) {
    Case (0) {
        baseNameTable = 0x2000;
    }
    Case (1) {
        baseNameTable = 0x2400;
    }
    Case (2) {
        baseNameTable = 0x2800;
    }
    Case (3) {
        baseNameTable = 0x2C00;
    }
}
// Bit 2. Internal PPU RAM Address increment per CPU read/write of
// PPUDATA. 0: Increment by 1 (going across), 1: Increment by 32 (going down).
int : VRAMIncrement = PPUCTRL & 0x04; // binary 00000100
if (VRAMIncrement == 0) {
    addressIncrement = 1;
}
else {
    addressIncrement = 32;
}
The above code is not representative of how it will be written in the finished software, as the
code will need to be spread throughout the PPU class and it would be impractical to show that
here. Additionally, pseudo code has only been provided for the first three bits of register
PPUCTRL. Any further code would be superfluous as the remaining bits are handled in much the
same way.
2.2 OAMADDR ($2003)
// The value written to this register specifies the location of sprite memory you wish to
// access (read from or write to). The address written can then be accessed via the
// OAMDATA ($2004) register.
void : writeToOAMADDR(int : address) {
OAMADDR = address;
}
2.3 OAMDATA ($2004)
// Behaviour of this register depends on whether it is being written to or read from.
// Reading from this register simply returns the data at the address in Sprite RAM
// specified by OAMADDR.
// Writing to this register writes the data and then increments OAMADDR.
void : writeToOAMDATA (int : data) {
SpriteRAM[OAMADDR] = data;
OAMADDR++;
}
int : readFromOAMDATA() {
return SpriteRAM[OAMADDR];
}
2.4 PPUSCROLL ($2005)
// The first write to PPUSCROLL sets the horizontal scroll offset.
// The second write sets the vertical scroll offset.
boolean : horizontalScrollNext = false;
writeToPPUSCROLL (int : data) {
horizontalScrollNext = !horizontalScrollNext;
if (horizontalScrollNext) {
hScroll = data; // offsets range from 0 to 255.
}
else {
vScroll = data; // offsets range from -16 to 239.
}
}
2.5 PPUADDR ($2006)
// The first write to PPUADDR sets the upper byte of the address in PPU internal
// memory that you wish to access.
// The second write sets the lower byte of the address.
int : PPUADDRWord;
boolean : PPUADDRfirstByte = false;
writeToPPUADDR(int : address) {
PPUADDRfirstByte = !PPUADDRfirstByte;
if (PPUADDRfirstByte) {
PPUADDRWord &= 0x00FF;
PPUADDRWord |= (address << 8); // Highest byte written.
}
else {
PPUADDRWord &= 0xFF00;
PPUADDRWord |= address; // Lowest byte written.
}
}
2.6 PPUDATA ($2007)
// Allows you to read from or write to PPU internal memory at the address specified
// by PPUADDR
// NOTE: Reads are delayed by one cycle.
void : writeToPPUDATA (int : data) {
PPUInternal[PPUADDRWord] = data;
}
int : readFromPPUDATA() {
return PPUInternal [PPUADDRWord];
}
2.7 Image and Sprite Palette Representation
The code includes Java specific objects (Color). Note that every fourth element in both palettes
contains the same colour. The object created by the use of "Color(0,0,0,0)" represents
transparency.
Color[16] imagePalette;
Color[16] spritePalette;
IMAGEPALETTE = 0x3F01; // first non-transparent element of image palette
SPRITEPALETTE = 0x3F11; // first non-transparent element of sprite palette
// Set the transparent elements of imagePalette.
imagePalette[0] = new Color(0,0,0,0);
imagePalette[4] = new Color(0,0,0,0);
imagePalette[8] = new Color(0,0,0,0);
imagePalette[12] = new Color(0,0,0,0);
// backgroundColour used if both sprite and image colours are transparent.
Color : backgroundColour = new Color(masterPalette.get(memory[0x3F00]));
// sub-palette 1
imagePalette[1] = new Color(masterPalette.get(memory[IMAGEPALETTE]));
imagePalette[2] = new Color(masterPalette.get(memory[IMAGEPALETTE+1]));
imagePalette[3] = new Color(masterPalette.get(memory[IMAGEPALETTE+2]));
// sub-palette 2
imagePalette[5] = new Color(masterPalette.get(memory[IMAGEPALETTE+4]));
imagePalette[6] = new Color(masterPalette.get(memory[IMAGEPALETTE+5]));
imagePalette[7] = new Color(masterPalette.get(memory[IMAGEPALETTE+6]));
// sub-palette 3
imagePalette[9] = new Color(masterPalette.get(memory[IMAGEPALETTE+8]));
imagePalette[10] = new Color(masterPalette.get(memory[IMAGEPALETTE+9]));
imagePalette[11] = new Color(masterPalette.get(memory[IMAGEPALETTE+10]));
//sub-palette 4
imagePalette[13] = new Color(masterPalette.get(memory[IMAGEPALETTE+12]));
imagePalette[14] = new Color(masterPalette.get(memory[IMAGEPALETTE+13]));
imagePalette[15] = new Color(masterPalette.get(memory[IMAGEPALETTE+14]));
// An identical process is followed for the sprite palette (using the SPRITEPALETTE
// constant).
2.8 Sprite Evaluation Routine
int : numSprites = 0; // Still need to raise the Sprite Overflow flag.
int : currentSecondary = 0;
for (i = 0; i < 64; i++) {
    int : yPos = spriteMem[i*4];
    int : difference = yPos - scanline; // scanline == current scanline
    int : ySize = -8;
    if (8x16Sprites) { // Used to determine if Y pos in range later.
        ySize = -16;
    }
    // Is the sprite in range of the current scanline?
    if (difference <= 0 && difference > ySize) {
        numSprites++;
        if (numSprites > 8) {
            Set Sprite Overflow Flag // Not actually adhered to.
        }
        // Sprite found to be in range. Render it.
        int : byte1 = spriteMem[(i*4)+1];
        int : byte2 = spriteMem[(i*4)+2];
        int : byte3 = spriteMem[(i*4)+3];
        // Using the three bytes retrieved above, render the sprite to the display.
    }
}
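The in-range test at the heart of the routine above can be written as a small runnable Java method. This is a sketch under stated assumptions rather than the emulator's code: sprite memory is modelled as a 256-element int array with each sprite's Y position in the first of its four bytes, the search simply stops after eight sprites instead of modelling the overflow flag, and the names are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch of selecting the sprites that fall on a given scanline. */
final class SpriteEvaluation {
    /**
     * Returns the indices (0-63) of the first eight sprites whose rows cover 'scanline'.
     * 'tall' selects 8x16 sprites; otherwise sprites are eight rows high.
     */
    static int[] spritesOnScanline(int[] spriteMem, int scanline, boolean tall) {
        int height = tall ? 16 : 8;
        int[] found = new int[8];
        int count = 0;
        for (int i = 0; i < 64 && count < 8; i++) {
            int row = scanline - spriteMem[i * 4]; // row of the sprite hit by this scanline
            if (row >= 0 && row < height) {
                found[count++] = i;
            }
        }
        return Arrays.copyOf(found, count);
    }
}
```

The condition row >= 0 && row < height is the same test as difference <= 0 && difference > ySize in the pseudocode, with the sign reversed.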
2.9 Sprite Evaluation Flowchart
2.10 Name Table Mapping
// Depending on the name table mirroring scheme being used, writes to
// addresses in the range of the name table memory (0x2000 – 0x3000) will
// be treated differently.
if (horizontalMirroring) {
    if (address >= 0x2000 && address < 0x2800) {
        physicalTable1[address & 0x03FF]; // offset within a 1KB name table
    }
    else {
        physicalTable2[address & 0x03FF];
    }
}
else if (verticalMirroring) {
    if ((address >= 0x2000 && address < 0x2400) || (address >= 0x2800
            && address < 0x2C00)) {
        physicalTable1[address & 0x03FF];
    }
    else {
        physicalTable2[address & 0x03FF];
    }
}
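The mapping above can be condensed into a single runnable Java method. The sketch assumes the two physical tables are stored back to back in one 2KB array, so that the result is an offset from 0 to 0x7FF, and that each logical name table occupies 0x400 bytes. The class and method names are hypothetical.

```java
/** Hypothetical sketch of resolving a name table address under each mirroring scheme. */
final class NameTableMirroring {
    /**
     * Maps a PPU address in 0x2000-0x2FFF to an offset into 2KB of name table RAM.
     * 'vertical' selects vertical mirroring; otherwise horizontal mirroring is used.
     */
    static int physicalOffset(int address, boolean vertical) {
        int table = (address >> 10) & 3; // logical name table 0-3
        int offset = address & 0x03FF;   // position within that table
        // Horizontal: tables 0 and 1 share one physical table, 2 and 3 the other.
        // Vertical: tables 0 and 2 share one physical table, 1 and 3 the other.
        int physical = vertical ? (table & 1) : (table >> 1);
        return physical * 0x400 + offset;
    }
}
```

For instance, address 0x2405 resolves to the first physical table under horizontal mirroring but to the second under vertical mirroring.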
3. APU
3.1 Divider
int : period = n; // The period will be specified by writing to a sound register.
int : counter = period;
void : clock() { // Called each time the CPU clocks.
if (--counter <= 0) {
Output a clock.
counter = period;
}
}
void : forceReload() { // Reload the clocks counter with the period.
counter = period;
}
void : changePeriod(int : newPeriod) {
period = newPeriod;
}
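The divider design above translates almost directly into Java. In the sketch below the "Output a clock" step is represented by a boolean return value so that the caller can decide what to clock next; that choice, and the class name, are assumptions made for illustration.

```java
/** Hypothetical Java rendering of the divider design. */
final class Divider {
    private int period;
    private int counter;

    Divider(int period) {
        this.period = period;
        this.counter = period;
    }

    /** Called each time the divider is clocked; returns true when it outputs a clock. */
    boolean clock() {
        if (--counter <= 0) {
            counter = period;
            return true;
        }
        return false;
    }

    /** Reloads the counter with the current period. */
    void forceReload() {
        counter = period;
    }

    /** Changes the period; the new value takes effect at the next reload. */
    void changePeriod(int newPeriod) {
        period = newPeriod;
    }
}
```

A divider created with a period of 3 outputs a clock on every third call to clock().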
3.2 Sequencer
// The below code represents the sequencer present in the Triangle sound channel
// (simplified slightly so as to make the code as general as possible for illustration
// purposes).
field : sequence : int array =
[15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15];
field : current = 0;
int : clock() {
    int : value = sequence[current];
    current = (current + 1) MOD sequence.length; // Keep looping.
    return value;
}
3.3 Shift Register with Feedback
int : shiftReg
// Clock() results in a pseudo-random bit sequence.
// Exclusive OR of bit 0 and either bit 6 or bit 1 (depending on status of loop bit).
// The result of this EOR replaces bit 15 of the shift register.
// Finally, shift the shift register 1 bit to the right.
void : clock() {
    int : firstBit = shiftReg & 1; // Get bit 0.
    int : secondBit;
    if (loopSet) { // If loop has been set.
        secondBit = (shiftReg >>> 6) & 1; // Get bit 6.
    }
    else {
        secondBit = (shiftReg >>> 1) & 1; // Get bit 1.
    }
    int : eorBits = firstBit EOR secondBit;
    shiftReg = (shiftReg & 0x7FFF) | (eorBits << 15); // Replace bit 15 with the EOR result.
    shiftReg = shiftReg >>> 1; // Shift 1 right w/o sign extension.
}
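A runnable Java sketch of the shift register with feedback is given below. It follows the steps listed in the comments above: bit 0 is exclusive-ORed with bit 6 or bit 1, the result is placed in bit 15, and the register is shifted right by one. The initial value of 1 is an assumption (any non-zero seed keeps the sequence from sticking at zero), and the class name is hypothetical.

```java
/** Hypothetical sketch of the noise channel's shift register with feedback. */
final class NoiseShiftRegister {
    private int shiftReg = 1;       // assumed non-zero starting value
    private final boolean loopSet;  // selects bit 6 (true) or bit 1 (false) as the tap

    NoiseShiftRegister(boolean loopSet) {
        this.loopSet = loopSet;
    }

    /** Advances the register by one step and returns its new value. */
    int clock() {
        int firstBit = shiftReg & 1;
        int secondBit = (shiftReg >> (loopSet ? 6 : 1)) & 1;
        int feedback = firstBit ^ secondBit;
        shiftReg = (shiftReg & 0x7FFF) | (feedback << 15); // place the result in bit 15
        shiftReg >>>= 1;                                   // shift right without sign extension
        return shiftReg;
    }
}
```

Starting from 1 with the loop bit clear, the first two calls to clock() return 0x4000 and 0x2000.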
3.4 Frame Counter
int : mode = 0;
int : steps = 4; // number of steps for the sequence. Depends on the mode.
int : current = 0; // Current position in the sequence.
void : setMode(int : mode) {
this.mode = mode;
current = 0;
if (mode == 1) {
steps = 5; // A 5 step sequence.
clock(); // clock immediately if mode is 1.
}
else {
steps = 4;
}
divider.forceReload(); // The divider is what clocks the frame counter. This
// is left out for simplicity.
}
void : clock() {
current = (current + 1) MOD steps;
if (mode == 0) { // 4 step sequence.
if (current == 1) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
}
else if (current == 2) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
pulse1.lengthCounter.clock();
pulse2.lengthCounter.clock();
noise.lengthCounter.clock();
triangle.lengthCounter.clock();
pulse1.sweep.clock();
pulse2.sweep.clock();
}
else if (current == 3) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
}
else {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
pulse1.lengthCounter.clock();
pulse2.lengthCounter.clock();
noise.lengthCounter.clock();
triangle.lengthCounter.clock();
pulse1.sweep.clock();
pulse2.sweep.clock();
if (interrupt inhibit clear) {
frame interrupt flag = true;
}
}
}
else { // 5 step sequence.
if (current == 1) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
pulse1.lengthCounter.clock();
pulse2.lengthCounter.clock();
noise.lengthCounter.clock();
triangle.lengthCounter.clock();
pulse1.sweep.clock();
pulse2.sweep.clock();
}
else if (current == 2) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
}
else if (current == 3) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
pulse1.lengthCounter.clock();
pulse2.lengthCounter.clock();
noise.lengthCounter.clock();
triangle.lengthCounter.clock();
pulse1.sweep.clock();
pulse2.sweep.clock();
}
else if (current == 4) {
pulse1.envelope.clock();
pulse2.envelope.clock();
noise.envelope.clock();
triangle.linearCounter.clock();
}
}
}
3.5 Pulse Channel
// adapted from C code written by Blargg.
// The timer variable in this code will be a Timer object in the implementation. It
// has been kept as an int here for simplicity.
field : $4000 : int;
field : $4002 : int;
field : $4003 : int;
field : timer : int = 0;
field : phase : int
field : waves[4][8] : int array = {
{0,1,0,0,0,0,0,0},
{0,1,1,0,0,0,0,0},
{0,1,1,1,1,0,0,0},
{1,0,0,1,1,1,1,1}};
// The below outputs waveform values based on the state of three APU registers.
int : clock() {
    if (--timer <= 0) {
        int : raw = (($4003 & 7) << 8) | $4002;
        timer = (raw + 1) * 2;
        phase = (phase + 1) & 7;
    }
    return waves[($4000 >> 6) & 3][phase];
}
3.6 Triangle Channel
field : sequence : int array ={15,14,13,12,11,10,9,8,7,6,5,4,3,
2,1,0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
field : timer : int = 0;
field : $400A : int;
field : $400B : int;
field : current : int; // Current sequencer value.
int : clock() {
    if (--timer <= 0) {
        int : raw = (($400B & 7) << 8) | $400A; // timer high and timer low.
        timer = (raw + 1); // plus one.
        current = (current + 1) MOD sequence.length; // Looping.
    }
    return sequence[current];
}
3.7 Noise Channel
field : timerPeriods : int array = {4, 8, 16, 32, 64, 96, 128, 160, 202, 254, 380, 508,
762, 1016, 2034, 4068};
field : timer : int = 0;
field : $400E : int;
field : SRWF : ShiftRegisterWithFeedback;
int : clock() {
    if (--timer <= 0) {
        int : timerPeriodIndex = ($400E & 15);
        timer = timerPeriods[timerPeriodIndex];
        SRWF.clock();
    }
    return current SRWF value;
}
4. Input/Output
4.1 Determining Input State
// This code assumes that the control pads have already been strobed via a write of
// 1 followed by 0 to the lowest bit of register $4016. It also only handles one pad.
// Strobing will set button back to 0 and fill the shiftRegister with the button states.
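int : shiftRegister;
int : button = 0; // The next button's state to check.
int : read() { // Read the next button state from pad.
if (button != 8) {
int : state = shiftRegister & 1; // The lowest bit holds the next button's state.
shiftRegister = shiftRegister >>> 1; // Shift right 1 so that the following
// button's state is read next time.
button++;
return state;
}
return 1; // All eight states have been read; official pads return 1 from here on.
}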
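The strobe and read sequence described in the comments above can be sketched in Python as follows. The class ControlPad and its method names are hypothetical. The button order (A, B, Select, Start, Up, Down, Left, Right) is the standard pad's report order, and returning 1 once all eight buttons have been read is an assumption based on the behaviour of official pads.

```python
class ControlPad:
    """Hypothetical model of a single standard control pad."""

    def __init__(self):
        # Report order: A, B, Select, Start, Up, Down, Left, Right.
        self.buttons = [False] * 8
        self.shift_register = 0
        self.button = 0   # index of the next button to report
        self.strobe = 0   # last value written to bit 0 of $4016

    def write(self, value):
        """Handle a write to $4016; a 1 followed by a 0 latches the pad."""
        bit = value & 1
        if self.strobe == 1 and bit == 0:
            self.shift_register = sum(
                1 << i for i, pressed in enumerate(self.buttons) if pressed
            )
            self.button = 0
        self.strobe = bit

    def read(self):
        """Return the next button state, one button per call."""
        if self.button < 8:
            state = self.shift_register & 1
            self.shift_register >>= 1
            self.button += 1
            return state
        return 1  # assumption: official pads report 1 after the eighth read
```

Pressing A and Start, strobing with write(1) then write(0), and reading eight times yields 1, 0, 0, 1, 0, 0, 0, 0.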
Appendix G: Test Specification
Overview
This test specification is referred to within the main report. To this end, each test or set of
tests discussed below is labelled to allow easy referencing.
These references are given after the test title in the form (‘reference’) or in their own
explicitly labelled column.
Pre-Written Test Files
In addition to project specific testing, there are a number of pre-written tests which will be used
to help verify the soundness of the software.
CPU
Cpu_timing_test (‘CPU Timing’)
Tests instruction timing for most 6502 instructions. This consists of four separate test files,
all available at: (45)
Branch_timing_tests (‘Branch Timing’)
Tests correct emulation of the 6502 branch instructions. (46)
NES Test (‘CPU Operation’)
Thoroughly tests the operation of the CPU. (47)
cli_latency_tests (‘CLI Latency’)
Tests for the correct operation of the CLI instruction. (48)
cli_tests (‘CLI and Related’)
Tests CLI and related instructions.
overtest (‘Overflow’)
Tests that the CPU’s overflow flag works correctly. (49)
PPU
blargg_ppu_tests_2005.09.15b (‘PPU General’)
Tests several aspects of the NTSC PPU. (50)
scanline.nes (‘Scanline Rendering’)
Tests for the correct operation of the scanline rendering process. (50)
sprite_overflow_tests (‘Sprite Overflow’)
Tests for the correct operation of the sprite overflow flag. (51)
sprite_hit_timing (‘Sprite Hit’)
Tests for the correct operation of sprites. (52)
vbl_nmi_timing (‘PPU Miscellaneous’)
Additional PPU tests. (53)
APU
blargg_apu_2005.07.30 (‘APU Miscellaneous’)
Tests the Frame Counter operation and the first square wave’s length counter. (54)
sndtest.nes (‘Sound Test’)
A NES ROM file which tests the APU by allowing the APU registers to be manipulated and
outputting the resulting sound. (55)
I/O
Sndtest.nes (‘Joypad’)
In order to test the sound unit using this ROM, values on screen must be changed using the input
device. If this works correctly, it can be concluded that the input mechanism works correctly.
Project Specific Testing
Performance Testing
Even on modern hardware, few (if any) NES emulators achieve the full 60 frames per second of
the original system. It therefore seems reasonable to accept any frame rate above 50 frames per
second as acceptable performance.
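The acceptance threshold above could be checked automatically along the following lines. This is only a sketch: measure_fps, meets_performance_target and the render_frame callback are hypothetical names, and the emulator is assumed to expose some way of rendering a single frame on demand.

```python
import time


def measure_fps(render_frame, frames=120, clock=time.perf_counter):
    """Render a fixed number of frames and return the average frame rate."""
    start = clock()
    for _ in range(frames):
        render_frame()
    elapsed = clock() - start
    return frames / elapsed if elapsed > 0 else float("inf")


def meets_performance_target(fps, minimum=50.0):
    """Accept only frame rates strictly above the 50 frames per second target."""
    return fps > minimum
```

Passing the clock as a parameter allows the measurement itself to be tested with a fake clock instead of real elapsed time.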
Portability Testing
To test portability, the software will be tested on the following systems:
Windows XP Pro (Service Pack 2) (‘Windows’)
Ubuntu Linux 7.10 (‘Ubuntu’)
Mac OS X (‘Mac’)
Unit Testing
Overall

Test Case: A ROM file can be loaded and executed
Acceptance Condition: The ROM chosen will begin executing and displaying
Reference: ‘load’

Test Case: The program can be exited
Acceptance Condition: Choice of the “Quit” option in the “File” menu closes the application
Reference: ‘exit’

CPU

Test Case: Can be paused and continued
Acceptance Condition: Paused instruction execution upon selection of the “Pause” option in the
“CPU” menu. Continued instruction execution upon selection of the “Continue” option in the
“CPU” menu.
Reference: ‘pause’

Test Case: Can be stopped
Acceptance Condition: The CPU stops execution upon selecting the “Close ROM” option in the
“File” menu.
Reference: ‘stop’

Test Case: Can be reset
Acceptance Condition: The display rendering starts from the beginning of the ROM file and
continues to render as if opened for the first time.
Reference: ‘reset’

Test Case: Correct working of the DMA
Acceptance Condition: Write 256 bytes of data from a location to the sprite RAM. Compare the
sprite RAM against the location copied from. If these two are the same, the copying has worked
correctly.
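The DMA test case above amounts to a copy followed by a comparison, which can be sketched as below. The function names and the use of bytearray objects for CPU memory and sprite RAM are assumptions for illustration; on the NES the transfer is started by writing the source page number to register $4014, which the page parameter stands in for here.

```python
def sprite_dma(cpu_memory, sprite_ram, page):
    """Copy the 256-byte page starting at page * 0x100 into sprite RAM."""
    start = page << 8
    sprite_ram[0:256] = cpu_memory[start:start + 256]


def dma_copy_is_correct(cpu_memory, sprite_ram, page):
    """Return True if sprite RAM matches the 256-byte source page exactly."""
    start = page << 8
    return bytes(sprite_ram) == bytes(cpu_memory[start:start + 256])
```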
PPU

Test Case: Graphical effects the emulator should be capable of:
- greyscale/colour modes
- enable/disable background clipping
- enable/disable sprite clipping
- enable/disable background rendering
- enable/disable sprite rendering
- set/unset colour emphasis
Acceptance Condition: These can be tested by flipping the appropriate bits responsible for these
effects and visually checking that the intended effect occurs.
Reference: ‘Graphical Effects’

Test Case: Name table mappings should work correctly.
Acceptance Condition: This can be tested entirely visually. If they are not working correctly,
the display will not render correctly.
Reference: ‘Name Table Mappings’

Test Case: Pattern table use:
- correct use in rendering sprites
- correct table used for displaying sprites and background
Acceptance Condition: This too can be tested visually.
Reference: ‘Pattern Table Use’

APU

Test Case: The volume and mute features of the system work correctly.
Acceptance Condition: This will be tested purely by inspecting the output at a variety of volume
levels (0%, 25%, 50%, 75%, 100%).
Reference: ‘Volume Control’

Correct APU operation will be verified largely through comparison testing against the output of
the FCEUXD SP emulator. The “sndtest.nes” ROM will be used for this purpose.
I/O

Test Case: The key mapping facility in the GUI works correctly.
Acceptance Condition: This will be tested by using the GUI to change the key mappings for the
standard control pad. The Joypad Test Cartridge will then be used to confirm that the keys just
mapped do indeed correspond to standard control pad buttons.
Reference: ‘Key Mapping and input recognition’

Test Case: Input from the user is picked up and interpreted correctly.
Acceptance Condition: The above method of using the test cartridge can be used here also.
Reference: ‘Key Mapping and input recognition’

Debugger

Test Case: The register state fields show the correct values.
Acceptance Condition: This is confirmable by tracing the execution of the CPU for a time,
ensuring that the register display matches the actual register values.
Reference: ‘Register State’

Test Case: The debugger breaks program execution when it is required to do so.
Acceptance Condition: Confirmable by entering breakpoints of all types and ensuring that the
system breaks at set points.
Reference: ‘Breaks’

Test Case: The breakpoint manager works as desired, i.e. it should allow:
- the removal of a particular breakpoint
- the removal of all breakpoints
- the addition of an additional breakpoint
Acceptance Condition:
- Remove a breakpoint. Check the program no longer halts at this point.
- Remove all breakpoints. Check the program no longer halts at all.
- Add a breakpoint, check it halts where desired.
Reference: ‘Breakpoint Manager’

Test Case: Upon pressing “Step”, the CPU should execute one instruction and then break. Upon
pressing “Resume”, execution should continue until the next breakpoint (if any).
Acceptance Condition: Both by observation.
Reference: ‘Step and Resume’

Test Case: “Seek PC” should highlight the line in the disassembled code with the same PC number.
Acceptance Condition: By observation.
Reference: ‘Seek PC’

Test Case: The “Code” pane should display the disassembled source code for the running ROM.
Acceptance Condition: Comparison with the disassembled code produced by FCEUXD SP.
Reference: ‘Disassembled’

Test Case: Memory Panes should display the correct values for all memory locations.
Acceptance Condition: Check the memory locations when just written to by the code. If they hold
the correct values, take as correct.
Reference: ‘Memory Display’
Pattern Table Viewer

Test Case: It should be capable of displaying:
- the image tiles in the pattern table
- the palettes used by the pattern table entries
- tile information (table num, tile num)
- palette information (palette type, entry num, master palette entry num)
Acceptance Condition: Most of these requirements can be tested via comparison with the output of
the FCEUXD SP emulator with a given ROM. The remainder can be tested via observation (table
num, palette type).
Reference: ‘Many’

Test Case: Display options:
- toggle automatic refresh
- alterable refresh rate
Acceptance Condition: By observation.
Reference: ‘Refresh’
Name Table Viewer

Test Case: It should be capable of displaying:
- the image made up of pattern table entries in the name tables
- scroll lines
Acceptance Condition: Accept if the visuals match those seen in FCEUXD SP for several ROMs.
Reference: ‘Tiles and Scroll Lines’

Test Case: Should display correct attribute table values
Acceptance Condition: Accept if the name table colours are displayed correctly. The attribute
values must be correct if these are displayed correctly.
Reference: ‘Attribute Information’

Test Case: Display options:
- toggle automatic refresh
- alterable refresh rate
Acceptance Condition: By observation.
Reference: ‘Refresh’

Test Case: Display the name tables in two bit colour.
Acceptance Condition: Compare the tile colours used in the name tables against the colours of
the appropriate tiles in the pattern table viewer. If these match, two bit colour display is
working correctly.
Reference: ‘Two Bit Colour’

Test Case: The capability to view the name table in numeric form.
Acceptance Condition: Compare the values shown in this display against the appropriate locations
in memory.
Reference: ‘Numeric Name Tables’
Appendix H: Project Logs
Before 1st October 2007 – Preliminary research of the NES and the 6502.
1st October 2007 – NES and 6502 processor information researched. Mainly “Programming the
6502” by Rodnay Zaks.
4th October 2007 – Project requested via the project database. Also, PERT charts developed.
8th October 2007 – First meeting with project supervisor.
12th October 2007 – Much online material consulted. Mainly related to project extension
possibilities.
18th October 2007 – Meeting with project supervisor. Project Proposal given to project
supervisor.
18th October 2007 – Alterations to report suggested by supervisor implemented. Continued
research on the NES APU and began documenting APU.
19th – 24th October 2007 – APU documentation continues
24th October 2007 – Restructured document to help readability.
26th October 2007 – Beginning to document PPU for analysis.
27th October 2007 – Continued documentation of PPU.
28th October 2007 – PPU documentation.
29th October 2007 – Documentation of PPU registers and name tables.
30th October 2007 – PPU documentation completed. Input Documentation completed.
2nd November 2007 - Continued Interim report.
7th November 2007 – Continued editing of interim document. Began work on design.
9th November 2007 – Began CPU design doc.
14th November 2007 – Completed CPU design. Beginning PPU design doc.
16th November 2007 – Continued work on PPU design.
22nd November 2007 – Completed PPU design doc.
25th November 2007 – Beginning APU design.
27th November 2007 – Continued APU design.
28th November 2007 – Completed I/O design spec.
29th November 2007 – GUI designs completed.
30th November 2007 – Testing section added.
31st November 2007 – Report Cleanup, project proposal added to appendix.
1st December 2007 – Interim report completed.
15th January 2008 – CPU mainly operational.
22nd January 2008 – PPU internal operation functional.
29th January 2008 – Pattern Table Viewer functional.
2nd February 2008 – Name Table Viewer functional.
8th February 2008 – Debugger partially functional.
12th February 2008 – CPU timing locked to 60 FPS.
10th March 2008 – APU mostly functional.
19th March 2008 – APU functional bar the DMC channel.
26th March 2008 – CPU fully operational.
5th April 2008 – Sprite rendering functional.
7th April 2008 – Sprite rendering at both priority levels working.
17th April 2008 – Programming Completed.
19th April 2008 – Report Finished.
Appendix I: GNU GENERAL PUBLIC LICENSE Version 2
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) 19yy <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.