Thursday, November 27, 2008

Address space

In computing, an address space defines a range of discrete addresses, each of which may correspond to a physical or virtual memory register, a network host, peripheral device, disk sector or other logical or physical entity.

A memory address identifies a physical location in computer memory, somewhat similar to a street address in a town. The address points to the location where data is stored, just like your address points to where you live. In the analogy of a person's address, the address space would be an area of locations, such as a neighborhood, town, city, or country. Two addresses may be numerically the same but refer to different locations, if they belong to different address spaces. This is similar to your address being, say, "32, Main Street", while another person may reside in "32, Main Street" in a different town from yours.

Example address spaces:

* House numbers in street addresses
* Street addresses in towns
* Main memory (physical memory)
* Virtual memory
* I/O port space
* Network
o IP addresses in particular
* The cylinder-head-sector scheme for hard drives

Specific examples for the Linux kernel:

* Kernel virtual address space
* User virtual address space, accessed by the kernel through copy_to_user(), copy_from_user() and similar functions
* I/O memory, accessed through readb(), writel(), memcpy_toio(), etc.

Address translation

In general, things in one address space are physically in a different location than things in another address space. For example, "house number 101 South" on one particular southward street is completely different from any house number (not just the 101st house) on a different southward street.

However, sometimes different address spaces overlap (some physical location exists in both address spaces). When overlapping address spaces are not aligned, translation is necessary. For example, virtual-to-physical address translation is necessary to translate addresses in the virtual memory address space to addresses in physical address space -- one physical address, and one or more numerically different virtual addresses, all refer to the same physical byte of RAM.

Memory models

Many programmers prefer to use a flat memory model, in which there is no distinction between code space, data space, and virtual memory -- in other words, numerically identical pointers refer to exactly the same byte of RAM in all three address spaces.

Unfortunately, many early computers did not support a flat memory model -- in particular, Harvard architecture machines force program storage to be completely separate from data storage. Many modern DSPs (such as the Motorola 56000) have 3 separate storage areas -- program storage, coefficient storage, and data storage. Some commonly-used instructions fetch from all three areas simultaneously -- fewer storage areas (even if there were the same or more total bytes of storage) would make those instructions run slower.

Memory models in x86 architecture

Early x86 computers used addresses based on a combination of two numbers: a memory segment, and an offset within that segment. Some segments were implicitly treated as code segments, dedicated for instructions, stack segments, or normal data segments. Although the usages were different, the segments did not have different memory protections reflecting this.

Now, many programmers prefer to use a flat memory model, in which all segments (segment registers) are generally set to zero, and only offsets are variable.

No comments: