Address maps define how the address space of a CPU is laid out. This article aims to explain how memory and address spaces are managed in MAME. See [https://docs.mamedev.org/techspecs/memory.html the current MAME memory documentation] for the up-to-date description of the memory system.

This article is a work in progress.
== Address Spaces ==

Currently, MAME supports CPUs with up to three distinct address spaces:
# Program space ('''ADDRESS_SPACE_PROGRAM''') is by definition the address space where all code lives. On [http://en.wikipedia.org/wiki/Von_neumann_architecture Von Neumann architecture] CPUs, it is also where data is stored. Most CPUs are Von Neumann designs, and thus commingle code and data in a single address space.
# Data space ('''ADDRESS_SPACE_DATA''') is a separate address space where data is stored for [http://en.wikipedia.org/wiki/Harvard_architecture Harvard architecture] CPUs. An example of a Harvard architecture CPU in MAME is the ADSP2100.
# I/O space ('''ADDRESS_SPACE_IO''') is a third address space for CPUs that have separate I/O operations. For example, the Intel x86 architecture has IN and OUT instructions which are effectively reads and writes to a separate address space. In MAME, these reads and writes are directed to the I/O space.
Note that a number of CPU architectures also have internal memory that lives in a separate space or domain from the other three. Memory referenced in this way is expected to be maintained internally by the CPU core and is not exposed through the memory system in MAME.
== CPUs and Bus Width ==

Everyone has probably heard about the "8-bit" Z80 CPU or the "32-bit" 80386 CPU. But where do these notions of "8-bit" and "32-bit" come from? When referring to CPUs, there are three metrics worth considering, any of which might be used to describe a CPU, depending on which sounds better to the marketing department (seriously).
The first possible metric is the size of the internal arithmetic units in the CPU. For example, the Motorola 68000 can perform arithmetic operations on 32-bit numbers internally. Does this make it a 32-bit CPU? It depends on who you ask, though most people would probably say "no", because this metric is at odds with the other two.
The second possible metric is the width of the address bus. When a CPU goes to fetch memory, it has to tell the outside world what address it wishes to access. To do this, it drives some of the pins on the chip to specify, in binary, the address it wants, and then signals a read or a write to cause the memory access to occur. The number of pins available for sending these addresses is referred to as the address bus width, and it ultimately controls how much memory the CPU can access. For example, the original Intel 8086 had 20 address pins and could access 2<sup>20</sup> bytes (1 MB) of memory; thus, we say it had a 20-bit address bus. When Intel created the 80286, it increased the address bus width to 24 bits (16 MB), and then to 32 bits (4 GB) with the introduction of the 80386, which is one reason why the 80386 is called a "32-bit" CPU.
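The relationship between address bus width and addressable memory is just a power of two. A minimal sketch of the arithmetic (not MAME code, purely illustrative):

```c
#include <stdint.h>

/* Number of bytes addressable with a given address bus width. */
uint64_t addressable_bytes(unsigned address_bits)
{
    return (uint64_t)1 << address_bits;
}
```

So a 20-bit bus (8086) gives 1 MB, a 24-bit bus (80286) gives 16 MB, and a 32-bit bus (80386) gives 4 GB.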
The third possible metric is the width of the data bus, which describes how many bits of data the CPU can fetch at one time. Again, this corresponds to pins on the chip: a CPU with an 8-bit data bus can fetch 8 bits (one byte) at a time and has 8 pins which either send or receive the data that goes out to memory. Almost all CPUs access memory in 8-bit, 16-bit, 32-bit, or 64-bit chunks (though there are a few oddballs that don't follow these rules). For example, the original Motorola 68000 accessed memory in 16-bit chunks, meaning it had 16 pins which sent/received data, and thus we say it had a 16-bit data bus. When Motorola introduced the 68020, it doubled the data bus width to 32 bits, meaning it could fetch twice the amount of data in a single memory access. This is why the 68020 is called a "32-bit" CPU.
So why do you need to know all of this to work with address maps? The first metric is irrelevant because it doesn't apply to memory accesses, but the latter two describe how CPUs deal with memory, and some of those details leak into the address maps.
MAME today supports any address bus width from 1 to 32 bits, and it supports data bus widths of 8, 16, 32, and 64 bits. For CPUs with oddball data bus widths, we generally round up to the next supported width and clean up the details in the CPU core.
Also note that each address space can have different properties, even within the same CPU. For example, the Intel 80386 has a program address space with a 32-bit address bus, but an I/O address space with a 16-bit address bus.
== Memory Layout in MAME ==

Hoo boy, this is a controversial topic. MAME was originally written with only 8-bit CPUs in mind. The nice thing about 8-bit CPUs is that, from the CPU's perspective, all memory accesses are a single byte wide, so it doesn't matter whether the system running the emulator is little-endian or big-endian. Eventually, however, MAME expanded and an emulator for the Motorola 68000 was added. This brought up a tough question: how do you model memory accesses for a CPU with a 16-bit data bus?
There are a few things to keep in mind here. First, because the concept of an 8-bit byte is so firmly grounded in microprocessors, almost all CPUs with larger data buses also support accessing individual bytes. So while a 16-bit CPU can access 16 bits at a time, it can also access either the upper 8 bits or the lower 8 bits individually, effectively permitting byte-level access to memory and peripherals. This is generally taken one step further with CPUs that have a 32-bit data bus: for the most part, they can access any of the four 8-bit bytes independently, or either of the two 16-bit words independently.
The second thing to understand is that when a 16-bit CPU performs a 16-bit memory access, at the hardware level it sends out a single read or write request. This becomes important because, when you write an emulator, each memory access generally translates into a function call which simulates the read/write behavior of the appropriate block of memory or I/O device. A naive approach to supporting 16-bit CPUs might be to keep all these function calls operating at the byte level only, and simply perform two function calls to read two neighboring bytes when a 16-bit request is issued. The problem with this is that it makes it difficult for the functions that simulate the hardware to know whether the original CPU request was for 8 or 16 bits, and in some cases that makes a difference.
The third thing to know is that at the hardware level, a CPU with a 16-bit data bus always performs "aligned" memory accesses. That is, when communicating with outside memory or peripherals, it will only ask for even addresses. In fact, 16-bit CPUs don't even have a pin for the low bit of the address bus! But what about accessing individual bytes, you ask? Well, if you read a byte from an even address (say $3000), the CPU will request a read from address $3000, but will also request that only the 8 bits corresponding to the first byte be returned. If you read from an odd address (say $3001), it will likewise issue a read from address $3000, but request that only the 8 bits corresponding to the second byte be returned.
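This byte-select behavior can be sketched in C. The model below is an illustration of the idea, not actual MAME code: the bus only ever sees a word-aligned address plus a lane mask, and the low address bit merely selects which half of the word to keep (big-endian convention, as on the 68000):

```c
#include <stdint.h>

/* Toy 16-bit bus: memory is stored as words, and the "bus" only ever
   sees word-aligned addresses plus a mask saying which byte lanes matter. */
static uint16_t rom[4] = { 0x1234, 0x5678, 0x9ABC, 0xDEF0 };

/* The single request the hardware issues: aligned address, lane mask. */
static uint16_t bus_read(uint32_t aligned_addr, uint16_t lane_mask)
{
    return rom[aligned_addr >> 1] & lane_mask;
}

/* Reading a byte: the low address bit never reaches the bus; it only
   selects which 8 bits of the fetched word to keep. */
uint8_t read_byte(uint32_t addr)
{
    uint32_t aligned = addr & ~1u;                    /* drop the low bit */
    uint16_t mask    = (addr & 1) ? 0x00FF : 0xFF00;  /* second or first byte */
    uint16_t word    = bus_read(aligned, mask);
    return (addr & 1) ? (uint8_t)word : (uint8_t)(word >> 8);
}
```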
With that in mind, the way MAME decided to support 16-bit CPUs was to define a new set of functions dedicated to reads and writes from 16-bit CPUs. These functions know that accesses may be either 8-bit or 16-bit, and they know how to call out to special 16-bit memory handlers: functions that simulate the behavior of peripherals and other devices mapped into the CPU's address space. Seems straightforward enough, so where's the controversy?
The first controversial bit has to do with the 'offset' parameter to the read/write handlers. As mentioned earlier, 16-bit CPUs don't actually have a low bit on their address bus; they just specify which bytes within the word they want to access. To enforce this, the memory system could either have masked off the low bit, or shifted the address one bit to the right to discard it. In MAME, the latter approach was taken, which can be a bit confusing. For example, an access to address $30 on a 16-bit CPU means that your read/write handler actually gets passed an address of $18 (and on a 32-bit CPU it would be passed $0C: shifted right 2 bits). But this actually makes sense once the second controversy is explained.
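The shift convention boils down to a one-liner, where the shift amount depends on the data bus width (1 for a 16-bit bus, 2 for a 32-bit bus). A sketch, not MAME's actual implementation:

```c
#include <stdint.h>

/* Convert a CPU address into the word offset a handler sees, for a CPU
   whose data bus is (8 << shift) bits wide:
   shift = 0 for 8-bit, 1 for 16-bit, 2 for 32-bit. */
uint32_t address_to_offset(uint32_t address, unsigned shift)
{
    return address >> shift;
}
```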
The second controversial bit is how RAM and ROM memory is laid out. Because all accesses to memory are effectively 16 bits wide (potentially with some masking to extract one byte or the other), the decision was made to store RAM and ROM natively in 16-bit chunks. What does this mean? Well, say you have a block of ROM on a Motorola M68000 (big-endian) with the following values:
 $1234 $5678 $9ABC $DEF0
And let's say you have a pointer to it in your code:

 UINT16 *memptr;
Then the intention is that, regardless of the endianness of the CPU running the emulator, reading memptr[2] gives you the value $9ABC. That's because for a CPU with a 16-bit data bus, all memory is organized and accessed by the core in 16-bit chunks. As long as you access memory strictly through 16-bit pointers, everything works fine. (Note also that because of the use of 16-bit pointers, having the 'offset' parameter shifted by 1 lets you use the offset directly as an array index into memory.)
To illustrate the mind-bending a bit more graphically, consider running the M68000 emulation on an Intel x86 CPU. The M68000 is big-endian while the x86 is little-endian. When accessing data as words, it looks like this:

 M68000: $1234 $5678 $9ABC $DEF0
 x86:    $1234 $5678 $9ABC $DEF0
But if you look at the byte level, you see that it is different:

 M68000: $12 $34 $56 $78 $9A $BC $DE $F0
 x86:    $34 $12 $78 $56 $BC $9A $F0 $DE
For this reason, with 16-bit CPUs, you generally cannot take a UINT16 * pointer to memory, cast it to a UINT8 *, and access the individual bytes without getting incorrect results when the endianness of the host CPU differs from that of the emulated CPU.
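If you do need byte-level access into word-organized memory, the safe approach is to extract the byte from the word arithmetically rather than casting pointers. A sketch of such a helper (an illustration of the idea, not MAME's actual byte-swizzling macros), which yields the big-endian byte order regardless of host endianness:

```c
#include <stdint.h>

typedef uint8_t  UINT8;
typedef uint16_t UINT16;

/* Read the byte at 'addr' from memory organized as big-endian 16-bit
   words, without ever casting the UINT16 * to a UINT8 *. Host endianness
   is irrelevant because the byte is extracted arithmetically. */
UINT8 read_byte_be(const UINT16 *mem, uint32_t addr)
{
    UINT16 word = mem[addr >> 1];
    return (addr & 1) ? (UINT8)(word & 0xFF) : (UINT8)(word >> 8);
}
```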
== Special Memory Types ==

RAM, ROM, banks, and unmapped space.
== Read/Write Handlers ==

Before diving into the details of the address map macros, let's talk about read/write handlers. The purpose of an address map is to describe what MAME should do when memory within a certain range is accessed. At its simplest, an address map specifies a set of functions which should be called in response to memory accesses. These are the read/write handlers.
A read handler is a function which accepts an address and perhaps a mask, and returns the value obtained by "reading" memory at that address. Here is a prototype for an 8-bit read handler:

 UINT8 my_read_handler(offs_t offset);
Notice a couple of things about this definition. First, it is specifically an "8-bit" read handler, and you can see that the function returns a UINT8. This means that, yes, there are 4 different handler function types, one each for 8, 16, 32, and 64-bit memory accesses. Regardless of the size of the data returned, however, all the functions take an offset of type "offs_t", which today is 32 bits (though in the future it may be expanded to 64). Also note that the parameter is called an "offset", not an "address". This is because the memory system in MAME always subtracts the beginning address of the matching memory range from the raw address before passing it into the read/write handlers. This means that the '''offset''' parameter is always relative to the starting address of the range.
Similarly, a write handler is a function which accepts an address, a value, and perhaps a mask, and "writes" memory at that address.
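To make the read/write pairing concrete, here is a sketch of a matched pair of 8-bit handlers implementing a trivial 4-byte latch. The names are invented and the prototypes are simplified for illustration (real MAME handlers take additional parameters depending on the version):

```c
#include <stdint.h>

typedef uint8_t  UINT8;
typedef uint32_t offs_t;

/* A hypothetical 4-byte peripheral: reads return what was last written. */
static UINT8 latch[4];

UINT8 my_read_handler(offs_t offset)
{
    /* offset is already relative to the start of the mapped range */
    return latch[offset & 3];
}

void my_write_handler(offs_t offset, UINT8 data)
{
    latch[offset & 3] = data;
}
```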
== Address Map Structure ==

A typical address map looks like this (this example is taken from the [http://mamedev.org/source/src/mame/drivers/qix.c.html qix.c driver]):
 static ADDRESS_MAP_START( main_map, ADDRESS_SPACE_PROGRAM, 8 )
     AM_RANGE(0x8000, 0x83ff) AM_RAM AM_SHARE(1)
     AM_RANGE(0x8400, 0x87ff) AM_RAM
     AM_RANGE(0x8800, 0x8bff) AM_READNOP   /* 6850 ACIA */
     AM_RANGE(0x8c00, 0x8c00) AM_MIRROR(0x3fe) AM_READWRITE(qix_video_firq_r, qix_video_firq_w)
     AM_RANGE(0x8c01, 0x8c01) AM_MIRROR(0x3fe) AM_READWRITE(qix_data_firq_ack_r, qix_data_firq_ack_w)
     AM_RANGE(0x9000, 0x93ff) AM_READWRITE(pia_3_r, pia_3_w)
     AM_RANGE(0x9400, 0x97ff) AM_READWRITE(pia_0_r, qix_pia_0_w)
     AM_RANGE(0x9800, 0x9bff) AM_READWRITE(pia_1_r, pia_1_w)
     AM_RANGE(0x9c00, 0x9fff) AM_READWRITE(pia_2_r, pia_2_w)
     AM_RANGE(0xa000, 0xffff) AM_ROM
 ADDRESS_MAP_END
As you can see, it relies heavily on macros to do the heavy lifting. In the current implementation (as of March 2008), the macros expand into a small "constructor" function. In the future, they may simply boil down to a data-driven tokenization. Regardless, don't worry about the actual behavior of the macros, just what they mean.
Each address map starts with an '''ADDRESS_MAP_START''' declaration, which takes 3 parameters. The first parameter (''main_map'') is the name of the variable you are defining. Each memory map is associated with a variable name so that you can reference it in your machine configuration. The second parameter (''ADDRESS_SPACE_PROGRAM'') specifies which address space the memory map is intended for. This helps MAME ensure that you don't mix memory maps inappropriately. The final parameter (''8'') is the data bus width, which is cross-checked against the CPU's defined data bus width for the address space you are working with.
Following the '''ADDRESS_MAP_START''' declaration is a list of address ranges. Each range starts with a begin/end address pair wrapped in an '''AM_RANGE''' macro, followed by a series of macros that describe how to handle memory accesses within that range. Each macro is described in detail below.

Finally, there is an '''ADDRESS_MAP_END''' macro which ties everything up.

A few general comments about the address map above:
* First, note that this address map has everything listed in ascending order. This is not required, though it is usually recommended for readability.

* Second, note that there are no overlapping ranges. This is also not a requirement. Entries in the address map are always processed in reverse order, starting from the bottom and working up to the top, so any overlapping ranges which appear earlier in the list take precedence over ranges which appear later.
== Address Map Macros ==

Below is a comprehensive list of the supported macros and what they mean.
=== AM_RANGE ===

 AM_RANGE(start, end)

The primary purpose of this macro is to declare a memory range. Any AM_* macros which follow implicitly apply to the most recently declared range. The AM_RANGE macro takes two parameters which specify an inclusive range of consecutive addresses beginning with ''start'' and ending with ''end'' (that is, an address hits in this bucket if address >= ''start'' and address <= ''end'').
=== AM_READ, AM_WRITE, AM_READWRITE ===

 AM_READ(readhandler)
 AM_WRITE(writehandler)
 AM_READWRITE(readhandler, writehandler)

These macros provide pointers to functions that will be called whenever a read or write within the current range is detected. The actual prototypes and behaviors of the ''readhandler'' and ''writehandler'' functions are described later. However, it is important to note that there is strict typechecking on the function pointers, especially in terms of data bus width, to prevent you from specifying, say, a 16-bit ''readhandler'' in an 8-bit address map (recall that the data bus width of the address map is specified in the '''ADDRESS_MAP_START''' macro).

Instead of passing the raw address to the read/write handlers, the memory system actually passes an offset relative to the ''start'' address provided in the '''AM_RANGE''' macro. This allows for common handlers regardless of where the component is actually mapped in the address space.

In addition to regular function pointers, a small number of static identifiers are also permitted. For example, in an 8-bit address map, you can specify a ''readhandler'' of MRA8_RAM to specify a dynamically allocated region of RAM, or a ''writehandler'' of MWA8_UNMAP to specify that the current address range is unmapped for writes. More information on the supported static handler types is provided later.

The '''AM_READWRITE''' macro is really just a shortcut for '''AM_READ''' followed by '''AM_WRITE'''.
=== AM_READ_PORT ===

 AM_READ_PORT(tag)

This macro is an alternate way of specifying a read handler for an input port. Since it is preferred that input ports be referenced by tag, you can use this macro to have MAME automatically look up the tagged port and substitute the correct input port handler.
=== AM_DEVREAD, AM_DEVWRITE, AM_DEVREADWRITE ===

 AM_DEVREAD(type, tag, readhandler)
 AM_DEVWRITE(type, tag, writehandler)
 AM_DEVREADWRITE(type, tag, readhandler, writehandler)

These three macros follow the same pattern as the previous set, except that they are used for device-specific read/write handlers. The only difference between a regular read/write handler and a device read/write handler is that the former is passed a pointer to the currently live running_machine, while the latter is passed a pointer to a specific device. The intention here is that if you have allocated a device in your machine's configuration (via MDRV_DEVICE_ADD), then the read/write handlers appropriate for that device should be invoked with a reference to that device rather than a global pointer to the machine.

To specify which device you wish to pass to the read/write handler, you provide the device's ''type'' and ''tag'', which are used to look up the device.
=== AM_MASK ===

 AM_MASK(mask)

Specifies a bitmask which applies to the offset that is passed to the read/write handlers. By default, there is no mask, and the read/write handlers are passed the raw address minus the start address of the current address range. If a ''mask'' is provided, this bitmask is applied with an AND operation after subtracting the start address. Thus, the value passed to the read/write handlers is really ((address - ''start'') & ''mask'').
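The offset computation can be written out directly. A sketch using a hypothetical helper (not a MAME function):

```c
#include <stdint.h>

/* The offset a handler sees for a range declared with AM_RANGE(start, ...)
   and AM_MASK(mask): subtract the start, then AND with the mask.
   With no AM_MASK, the mask is effectively all ones. */
uint32_t handler_offset(uint32_t address, uint32_t start, uint32_t mask)
{
    return (address - start) & mask;
}
```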
=== AM_MIRROR ===

 AM_MIRROR(mirror)

This macro specifies the "mirror mask" for the current address range. There are two ways to understand a mirror mask; hopefully at least one of them makes sense!
* A hardware-centric interpretation describes the mirror mask as a bitmask consisting of all address bits that are ignored when the address is decoded by the hardware. Most arcade hardware does not fully decode each address; rather, to save on chip counts, the hardware is set up to do the minimum work necessary to separate accesses to the different components in the system, and many bits are ignored. For example, in Pac-Man, bits 13 and 15 are not used at all when deciding whether an access should be directed to spriteram; thus, the mirror mask is $A000.

* A software-centric interpretation is that each bit in the mirror mask describes a "mirror" of the address range at a different address in the system. Looking again at the Pac-Man example, spriteram is traditionally thought of as living at address $4FF0, but it turns out you can also access it at $6FF0, $CFF0, and $EFF0, because the hardware does not care whether bits 13 and 15 are 0 or 1. So a mirror mask of $A000 means that the memory system will replicate this address range to create these mirrors automatically, by going through each bit of the mirror mask and mapping the range with that bit set to 0 and then to 1.
Note that the mirroring is by default completely hidden from the read/write handlers. This is done by making the default '''AM_MASK''' value for a mirrored range equal to the logical NOT of the mirror mask. In the Pac-Man case above, for example, the mask would be ~$A000 = $5FFF. Looking at an example access to $CFF7, we subtract the base address of $4FF0, giving an offset of $8007; we then apply the mask of $5FFF to get the final offset of $0007.
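The full set of mirror addresses can be enumerated by walking the subsets of the set bits in the mirror mask. A sketch of the idea (illustrative, not MAME's internal algorithm):

```c
#include <stdint.h>
#include <stddef.h>

/* Enumerate all mirrors of 'base' under 'mirror_mask' into out[],
   returning the count. Walks every subset of the mask's set bits in
   ascending order using the (sub - mask) & mask trick. */
size_t enumerate_mirrors(uint32_t base, uint32_t mirror_mask,
                         uint32_t *out, size_t max)
{
    size_t n = 0;
    uint32_t sub = 0;
    do {
        if (n < max)
            out[n] = base | sub;
        n++;
        sub = (sub - mirror_mask) & mirror_mask;  /* next subset of the mask */
    } while (sub != 0);
    return n;
}
```

For the Pac-Man example, enumerating base $4FF0 with mask $A000 yields $4FF0, $6FF0, $CFF0, and $EFF0.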
If you want your read/write handler to see the full address with no masking, you can provide an explicit '''AM_MASK''' which overrides the default value and lets you specify which bits you wish to see.
=== AM_REGION ===

 AM_REGION(region, offset)

This macro is only useful if you used '''AM_READ''' or '''AM_WRITE''' and specified a reference to RAM, ROM, or a BANK. By default, in these cases memory is either allocated (RAM and BANK) or assumed to point to the memory region corresponding to the relevant CPU (ROM). When you use the '''AM_REGION''' macro, you override this default behavior, specifying instead a particular memory ''region'' and an ''offset'' within that region which corresponds to the ''start'' address of the memory range.
=== AM_SHARE ===

 AM_SHARE(index)

This macro has limitations similar to '''AM_REGION''', in that it only makes sense when used with RAM, ROM, or a BANK. However, instead of specifying an explicit memory region and offset, you specify a non-zero ''index''. The first memory range encountered with a given '''AM_SHARE''' ''index'' allocates its memory in the default fashion. Subsequent ranges which also use '''AM_SHARE''' with the same ''index'' override this default behavior and point to the exact same memory that the first instance referenced.
This is primarily used to map memory shared between multiple CPUs. For example, if CPU #1 has RAM in the region $4000-$4fff, and CPU #2 has that same RAM mapped in the region $8000-$8fff, you can specify AM_SHARE(1) on each range. When building up the memory system, AM_SHARE(1) is seen first for CPU #1, and the memory is allocated as normal RAM. Shortly afterwards, AM_SHARE(1) is seen a second time for CPU #2, but instead of allocating memory, the system simply points back to the same RAM that was allocated for CPU #1.
Note that multiple independent shared regions can be managed this way by using different values for ''index''. Also note that this sharing technique only works between CPUs with the same data bus width (e.g., 8-bit to 8-bit, or 16-bit to 16-bit). If there is RAM shared between, say, an 8-bit CPU and a 16-bit CPU, then you need to write your own handlers to manage that RAM.
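The sharing behavior amounts to two address spaces holding the same backing pointer, each subtracting its own range start. A toy sketch of the idea with invented names (not MAME's implementation):

```c
#include <stdint.h>

typedef uint8_t UINT8;

/* Backing storage allocated for the first AM_SHARE(1) range. */
static UINT8 shared_backing[0x1000];

/* AM_SHARE(1) makes both "CPUs" reference the same allocation. */
static UINT8 *cpu1_ram = shared_backing;  /* CPU #1: $4000-$4fff */
static UINT8 *cpu2_ram = shared_backing;  /* CPU #2: $8000-$8fff */

/* Each CPU subtracts its own range start, so the same byte is reached
   through two different addresses. */
void  cpu1_write(uint32_t addr, UINT8 data) { cpu1_ram[addr - 0x4000] = data; }
UINT8 cpu2_read(uint32_t addr)              { return cpu2_ram[addr - 0x8000]; }
```

A write by CPU #1 to $4123 is immediately visible to CPU #2 as a read from $8123.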
=== AM_BASE, AM_SIZE ===

 AM_BASE(base)
 AM_SIZE(size)

These macros are a convenience for driver writers. Since the memory system will allocate memory automatically for certain types of address ranges, you need a way to get hold of a pointer to that memory so you can examine it. The '''AM_BASE''' macro takes a pointer to an appropriately-sized pointer (e.g., a pointer to a UINT8 * for an 8-bit data bus) and fills it in, after the memory system is initialized, with the address of the memory that was allocated. In a similar fashion, the '''AM_SIZE''' macro takes a pointer to a size_t and returns in it the size (''end'' + 1 - ''start'') of the range referenced.
=== AM_BASE_MEMBER, AM_SIZE_MEMBER ===

 AM_BASE_MEMBER(struct, member)
 AM_SIZE_MEMBER(struct, member)

These two macros are variants of the standard '''AM_BASE''' and '''AM_SIZE''' macros which are designed to work in a newer, more object-oriented style. Drivers now have a pointer in the running_machine object which contains driver-specific data. But since the memory for this data is allocated dynamically, you cannot use the regular '''AM_BASE''' and '''AM_SIZE''' macros to point to an address where the pointer or size should be stored. Instead, you use the '''AM_BASE_MEMBER''' macro to specify the type of ''struct'' that will be allocated and the name of the struct ''member'' where you want the data to be stored.
== Address Map Shortcuts ==

=== AM_UNMAP ===

=== AM_RAM ===

=== AM_ROM, AM_WRITEONLY ===

=== AM_RAMBANK, AM_ROMBANK ===

=== AM_NOP, AM_READNOP, AM_WRITENOP ===

== Address Map Flags ==

=== AMEF_ABITS ===

=== AMEF_UNMAP ===

== Runtime Modifications ==

== Debugging Helpers ==