A1.1.4 The Purposes of Different Types of Primary Memory

DEFINITION

Primary Memory

Primary memory is the only memory in a computer that the CPU can access directly by itself.

Primary memory is used to store the data and instructions of the programs that the CPU is actively processing.

When you use a computer, you usually have several different applications (programs) running at once. These applications have their own data and instructions.

The CPU needs to be able to quickly access this information so that these applications can run seamlessly.

For this reason, there needs to be a designated piece of storage that stores the data and instructions that are actively being processed by the CPU.

That storage is known as primary memory.

Primary memory is the only memory that the CPU is able to access directly.

The CPU cannot directly access any memory that is not part of primary memory.

Instead, data stored outside of primary memory needs to be transferred to primary memory before the CPU can access it.

KEY POINT

Primary memory consists of RAM, ROM, the Cache and the CPU's registers.

Primary memory is not a separate physical piece of storage in a computer. It's a term used to refer to a group of several different types of memory.

There are 4 memories which make up a computer’s primary memory. These are:

Random Access Memory (RAM)
Read Only Memory (ROM)
Cache
Registers

All of these memories have different physical properties and serve different purposes.

KEY POINT

Primary memory is volatile and temporary.

Primary memory is volatile (with the exception of ROM, which is described later). This means that it needs power to store data.

If there is no longer any power, all of the data stored in primary memory is lost.

Therefore, primary memory is considered to be temporary, as it is unable to store data permanently without power.

However, in a computer, there also needs to be a place to store data permanently.

This is known as secondary memory. Secondary memory is used to permanently store data when it is not in use.

Within a computer, all of the different memories are categorised into one of these two groups:

A tree which breaks down computer memory.

An image showing the “bigger picture” of how memory is organised in a computer. Image credit: Icons from Flaticon.

Sometimes, the data that the CPU currently needs is stored in secondary memory. However, the CPU cannot access secondary memory on its own.

If the CPU needs a certain piece of data from the secondary memory, this data is transferred from the secondary memory into the primary memory.

The CPU can then load this data from the primary memory.
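The transfer described above can be sketched in a few lines of Python. This is only an illustrative model: the two dictionaries and the `cpu_load` function are hypothetical stand-ins for secondary memory, primary memory and a CPU load, not a description of real hardware.

```python
# Hypothetical model: dictionaries stand in for the two kinds of memory.
secondary_memory = {"report.txt": "Quarterly results"}  # permanent storage
primary_memory = {}                                     # what the CPU can reach

def cpu_load(name):
    """Return data for the CPU, copying it into primary memory first if needed."""
    if name not in primary_memory:
        # The CPU cannot read secondary memory directly, so the data is
        # transferred into primary memory before it is used.
        primary_memory[name] = secondary_memory[name]
    return primary_memory[name]

data = cpu_load("report.txt")
print(data)                            # Quarterly results
print("report.txt" in primary_memory)  # True
```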

KEY POINT

Data can be written to and read from primary memory very quickly.

Primary memory provides data and instructions to processors.

These processors need to be able to load and store data quickly; otherwise, they waste time waiting for data to arrive.

In order to not slow down the processors, primary memory has the ability to load and store data very quickly.

KEY POINT

Primary memory is also known as main memory.

There are several terms used to refer to primary memory. It is also known as main memory or internal memory.

You will need to be able to compare primary and secondary memory. Below is a table of the main differences:

	Primary Memory	Secondary Memory
Data persistence	It is temporary	It is permanent
Volatility	It is volatile	It is non-volatile
Price	It is expensive	It is inexpensive
Access time	It has a fast access time	It has a slower access time
Capacity	It has a small capacity	It has a large capacity

A table showing the differences between primary and secondary memory.

DEFINITION

Random Access Memory (RAM)

Random Access Memory is a temporary and volatile storage used to store the data and instructions that the CPU is actively processing.

RAM is a part of a computer’s primary memory.

Primary memory is used to store the data and instructions of the programs that the CPU is actively processing.

However, primary memory is not a separate physical piece of memory. Instead, it is a term used to refer to several different pieces of memory found in a computer.

One of those pieces of memory is the Random Access Memory (RAM). RAM is the actual piece of memory that is used to store the data and instructions of the programs that the CPU is actively processing.

RAM is also volatile, which means that it needs power to store data.

RAM can only store data temporarily, as when there is no power, the contents of the RAM are lost.

Before a computer is turned off, all of the important data stored in RAM is transferred to secondary memory.

Here, the data is permanently stored until it is needed again.

Below is a photograph of what RAM looks like:

A close up photograph of two identical “sticks” of RAM. These particular sticks of RAM each have a capacity of 8 gigabytes and are designed by Transcend. Image credit: Andrey Matveev on Unsplash.

KEY POINT

Data can be read from and written to RAM very quickly.

It typically takes 50 to 100 nanoseconds to access data.

RAM provides data and instructions to the CPU.

The CPU needs to be able to quickly read (load) and write (store) data, otherwise it will waste time waiting for data to be transmitted.

In order to not slow down the CPU, RAM can load and store data very quickly.

In a modern computer, it typically takes the CPU 50-100 nanoseconds to access data stored in RAM.

KEY POINT

The typical capacity of RAM is 8 to 16 gigabytes.

Since the invention of computers, the average capacity (size) of RAM has been constantly increasing:

A graph showing the average RAM capacity from 1980 to 2000

A graph showing the increase in RAM capacity throughout the years 1980 to 2000. Source: Intel.

In the 1980s, the capacity of RAM was initially measured in kilobytes. By the 2000s, the capacity of RAM had increased by so much that it was impractical to continue to measure the capacity in kilobytes.

Instead, RAM capacity is now measured in gigabytes. Recall that there are roughly 1 million kilobytes in a gigabyte.

Nowadays, a modern computer has RAM with a capacity from 8 to 16 gigabytes.
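The "roughly 1 million" figure can be checked with a short calculation. The sketch below assumes binary units (1 kilobyte = 1024 bytes and 1 gigabyte = 1024³ bytes); the constant names are made up for this example.

```python
BYTES_PER_KILOBYTE = 1024       # assuming binary units
BYTES_PER_GIGABYTE = 1024 ** 3

kilobytes_per_gigabyte = BYTES_PER_GIGABYTE // BYTES_PER_KILOBYTE
print(kilobytes_per_gigabyte)   # 1048576

# The typical RAM capacities mentioned in these notes, expressed in kilobytes.
for gigabytes in (8, 16):
    print(gigabytes, "GB =", gigabytes * kilobytes_per_gigabyte, "KB")
# 8 GB = 8388608 KB
# 16 GB = 16777216 KB
```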

KEY POINT

Increasing the capacity of RAM can improve a computer’s performance.

RAM is used to store the data and instructions of the programs currently being executed by the CPU.

If there are too many programs open at once, there may not be enough capacity inside the RAM to store all of this data.

If this happens, a computer will have to swap any extra data between RAM and a piece of secondary storage. This storage is known as virtual memory.

However, using virtual memory takes much more time than using RAM. For this reason, a computer that has insufficient RAM will have a lower performance.

The performance of such a computer can be improved by increasing the capacity of the RAM. The performance will improve because the computer will not have to use virtual memory to slowly load data.
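As a rough illustration of why virtual memory lowers performance, the sketch below works out the average time per memory access. The 100 ns figure for RAM is the upper end of the range given in these notes. The 100,000 ns figure for secondary storage and the 1% share of accesses are assumptions chosen only for this example.

```python
RAM_ACCESS_NS = 100          # upper end of the range quoted in these notes
STORAGE_ACCESS_NS = 100_000  # assumed figure for secondary storage

def average_access_ns(total_accesses, accesses_from_storage):
    """Average time per access when some accesses fall back to virtual memory."""
    accesses_from_ram = total_accesses - accesses_from_storage
    total_time = (accesses_from_ram * RAM_ACCESS_NS
                  + accesses_from_storage * STORAGE_ACCESS_NS)
    return total_time / total_accesses

enough_ram = average_access_ns(1000, 0)       # everything fits in RAM
too_little_ram = average_access_ns(1000, 10)  # 1% of accesses use virtual memory
print(enough_ram, too_little_ram)             # 100.0 1099.0
```

Under these assumptions, sending just 1 in 100 accesses to virtual memory makes the average access about eleven times slower.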

DEFINITION

Read Only Memory (ROM)

Read Only Memory is a read-only, non-volatile storage used to store the Basic Input Output System (BIOS).

ROM is a part of a computer’s primary memory.

Nearly all of the components that are a part of primary memory are volatile and temporary. However, there is one memory that is an exception.

ROM is non-volatile, which means that it is able to permanently store data without power.

The contents of the ROM are also read-only. The CPU can read (load) the contents of the ROM, but cannot change them in any way.

Since ROM is permanent and non-volatile, it is used to store data and instructions that are important and not frequently updated.

The most important program stored in ROM is the Basic Input Output System (BIOS). BIOS contains the instructions used to boot (start up) the computer.

Here is a photograph of what ROM looks like:

A photograph of two ROM “chips”. These particular ROM chips are designed for the Amiga 500 computer. Image credit: eBay.

KEY POINT

The only way the contents of the ROM can be changed is by using a process called "flashing".

Originally, the contents of the ROM were permanent, and could not be changed at all.

The contents were placed inside the ROM at the time it was manufactured using a process called “burning”.

This is why the component was named “Read Only” Memory. A computer could only “read” the data, but not change it in any way.

Nowadays, the ROM can be updated using a process called flashing.

KEY POINT

The typical capacity of ROM is 4 to 8 megabytes.

ROM does not store a lot of data. It only stores the Basic Input Output System (BIOS) and some other small system programs.

The capacity of ROM in a modern computer is typically only 4 to 8 megabytes.

You will need to be able to compare RAM and ROM. Below is a table of the differences:

	RAM	ROM
Data persistence	It is temporary	It is permanent
Volatility	It is volatile	It is non-volatile
Access speed	It has a fast access time	It has a slower access time
Type of Access	It can be both read and written to	It is read-only

A table showing the differences between RAM and ROM.

DEFINITION

Cache

The cache is a small capacity but high speed memory used to store copies of frequently used data and instructions.

The CPU loads its data and instructions from RAM. Some of this information is used quite often by the CPU.

The CPU wastes time repeatedly loading the same data from RAM.

To save time, there needs to be a memory even faster than RAM to store frequently used data and instructions.

That memory is known as the cache.

The cache stores a copy of data that is frequently used by the CPU.

An original “version” of the data is still kept in RAM.

This is so that if a computer has several processors, all of them are still able to access that data.

KEY POINT

The cache is split into three different levels: L1, L2 and L3.

The cache acts as an intermediate memory between the CPU and RAM. In modern computers, the cache is physically a part of the CPU.

The cache is split into three parts, known as “levels”. These cache levels are located at different distances from the cores inside the CPU.

L1 cache is located inside each core of the CPU.

L1 cache has the quickest read/write speeds, with the CPU typically taking only 1 to 5 nanoseconds to access data. However, this technology makes L1 cache very expensive.

L1 cache also has the smallest capacity, typically storing just 32 to 64 kilobytes of data.

L2 cache can be implemented in one of two ways. L2 cache is either located inside each core (just like L1 cache) or shared by all of the cores in the CPU.

L2 cache typically stores 256 to 512 kilobytes of data. The CPU can access this data in 5 to 10 nanoseconds.

L2 cache is still quite expensive, but it is not as expensive as L1 cache.

L3 cache is shared by all of the cores in the CPU.

L3 cache has the slowest read/write speeds, with the CPU typically taking 10 to 20 nanoseconds to access data. However, this is still quicker than loading data from RAM.

L3 cache has the largest capacity, typically storing 8 to 32 megabytes. L3 cache is also the least expensive.

Here is a table which summarises all 3 cache levels:

	L1 Cache	L2 Cache	L3 Cache
Read/write time	1–5 nanoseconds	5–10 nanoseconds	10–20 nanoseconds
Capacity	32–64 kilobytes	256–512 kilobytes	8–32 megabytes
Location	Inside each core of the CPU	Inside each core of the CPU or shared between all cores	Shared between all cores of the CPU
Cost	Most expensive	Expensive	Least expensive

A table showing the differences between the three kinds of cache: L1, L2 and L3.

KEY POINT

The cache uses spatial and temporal locality to improve the performance of the CPU.

The cache implements the principles of spatial and temporal locality to improve its performance.

Spatial locality is the principle that data stored close to recently used data has a high chance of being used as well.

The cache stores neighbouring data in blocks. If a piece of data from a block was recently used, then the other data kept in the same block has a high chance of being used as well.

Temporal locality is the principle that recently used data has a high chance of being used again soon.

In computers, complex problems are broken down into many simple repetitive instructions. After a piece of data is used by the CPU, it has a high chance of being used again.

Using these principles of spatial and temporal locality, the cache stores data that has a high chance of being used by the CPU. This improves the performance of the CPU as it does not have to search for the data in RAM.
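The effect of both principles can be seen in a small simulation. This is a simplified sketch, not a model of a real cache: it assumes a hypothetical block size of 8 addresses and a cache that never runs out of space, so the only misses are the first access to each block.

```python
BLOCK_SIZE = 8  # hypothetical number of addresses stored in one block

def count_hits(addresses):
    """Count cache hits, assuming a miss copies the whole block into the cache."""
    cached_blocks = set()
    hits = 0
    for address in addresses:
        block = address // BLOCK_SIZE
        if block in cached_blocks:
            hits += 1
        else:
            cached_blocks.add(block)  # miss: the whole block is copied in
    return hits

sequential = list(range(64))                     # neighbouring addresses (spatial locality)
repeated = [3] * 64                              # the same address again and again (temporal locality)
scattered = [i * BLOCK_SIZE for i in range(64)]  # every address in a different block (no locality)

print(count_hits(sequential))  # 56
print(count_hits(repeated))    # 63
print(count_hits(scattered))   # 0
```

All three access patterns make 64 requests, but only the patterns with locality benefit from the cache.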

KEY POINT

A “cache hit” is when the CPU is able to find the data it needs in the cache.

A “cache miss” is when the CPU is unable to find the data it needs in the cache.

When looking for a certain piece of data, the CPU checks its own registers first. If it is unable to find this piece of data, it checks the cache, starting from L1 and ending with L3 cache.

A cache “hit” means that the CPU was able to find the data it needed in the cache.

A cache “miss” means that the CPU was unable to find the data it needed in the cache.

After a cache miss, the CPU will continue searching for the data in RAM. If the data is not in RAM either, it must be transferred there from secondary memory first. This slows down the CPU.
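The search order described above can be sketched as follows. The access times are the upper ends of the ranges given in these notes. The `find` function, the example addresses and the idea that the times simply add up are simplifying assumptions made for this illustration.

```python
# Access times in nanoseconds (upper ends of the ranges in these notes).
LEVELS = [("L1", 5), ("L2", 10), ("L3", 20), ("RAM", 100)]

def find(address, contents):
    """Search each level in order; return (level where found, total time in ns)."""
    elapsed = 0
    for name, access_time in LEVELS:
        elapsed += access_time          # simplification: times are added together
        if address in contents[name]:
            return name, elapsed
    raise KeyError(address)             # not in the cache or in RAM

# Hypothetical contents of each level.
contents = {
    "L1": {0x10},
    "L2": {0x10, 0x20},
    "L3": {0x10, 0x20, 0x30},
    "RAM": {0x10, 0x20, 0x30, 0x40},
}

print(find(0x10, contents))  # ('L1', 5)    cache hit in L1
print(find(0x30, contents))  # ('L3', 35)   missed L1 and L2, hit in L3
print(find(0x40, contents))  # ('RAM', 135) missed every cache level
```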

KEY POINT

There are 3 main types of cache misses:

Compulsory miss
Capacity miss
Conflict miss

A cache miss is when the CPU is unable to find the data it needs in the cache.

There are three main types of cache misses. These are: compulsory misses, capacity misses and conflict misses.

A compulsory miss is when the CPU cannot find data in the cache because the data is being loaded from RAM for the first time.

This means that the data has not been copied to the cache because it has not been used previously by the CPU.

A capacity miss is when the CPU cannot find data in the cache because there is not enough space in the cache.

The cache regularly removes old and unused data. However, if there is a large amount of frequently used data then there may not be enough capacity in the cache for it all.

A conflict miss is when the CPU cannot find data in the cache because several different pieces of data are mapped to the same cache location, forcefully evicting each other.

In the cache, every piece of data has a designated location. Sometimes, several different pieces of data will be allocated the same location in the cache.

These different pieces of data will remove each other as they are copied into the cache.

When the CPU tries to access this data in the cache, it may find the other piece of data that it does not need.
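A conflict miss can be demonstrated with a tiny simulation. The sketch below assumes one simple mapping scheme, in which the location is the address modulo the number of cache locations. These notes do not specify a mapping, so both the scheme and the cache size of 4 are assumptions.

```python
NUM_LOCATIONS = 4  # hypothetical cache with four locations

def count_misses(addresses):
    """Count misses when each address has exactly one designated location."""
    locations = [None] * NUM_LOCATIONS
    misses = 0
    for address in addresses:
        index = address % NUM_LOCATIONS  # assumed mapping from address to location
        if locations[index] != address:
            misses += 1
            locations[index] = address   # evicts whatever was stored there
    return misses

# Addresses 0 and 4 share location 0, so they keep evicting each other.
print(count_misses([0, 4, 0, 4, 0, 4]))  # 6

# Addresses 0 and 1 use different locations, so only the first two accesses miss.
print(count_misses([0, 1, 0, 1, 0, 1]))  # 2
```

In both runs the first two misses are compulsory misses. The four extra misses in the first run are conflict misses, even though three of the four cache locations are never used.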

KEY POINT

The CPU tries to avoid cache misses by using 3 techniques:

Data Prefetching
Optimal Memory Allocation
Cache Replacement Policies

A cache miss is when the CPU cannot find the data it needs in the cache. This means that the CPU must continue searching for the data in RAM. This slows down the CPU.

For this reason, the CPU tries to avoid cache misses as much as possible. The CPU avoids cache misses using three techniques: data prefetching, optimal memory allocation and cache replacement policies.

Data prefetching is when data is copied to the cache before it is actually needed by the CPU.

The CPU predicts what data it will need next based on patterns, and copies this data to the cache.
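One simple prediction is that the block after the current one will be needed next. The sketch below assumes this "next block" rule and a cache that never fills up; real processors use more sophisticated predictions, and the `count_misses` function is a name invented for this example.

```python
def count_misses(blocks, prefetch):
    """Count cache misses, optionally prefetching the block after each access."""
    cache = set()
    misses = 0
    for block in blocks:
        if block not in cache:
            misses += 1
            cache.add(block)
        if prefetch:
            cache.add(block + 1)  # guess: the next block will be needed soon
    return misses

print(count_misses(range(10), prefetch=False))  # 10
print(count_misses(range(10), prefetch=True))   # 1
```

With prefetching, only the very first block is missed because every later block has already been copied to the cache by the time it is requested.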

Optimal memory allocation is when the CPU strategically allocates data inside the cache based on the cache’s usage patterns.

The CPU will group similar data and store it in contiguous blocks within the cache, making it easier to find, update and evict the data.

Cache replacement policies are rules that dictate how the CPU evicts stale (old) and unnecessary data from the cache.

The CPU will remove unnecessary data from the cache when it is full. This data is then replaced by new data that the CPU frequently uses.
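These notes do not name a specific replacement policy. One common example is "least recently used" (LRU), which evicts the data that has gone unused for the longest time. The sketch below is a minimal illustration of that one policy; the `LRUCache` class and the capacity of 2 are assumptions made for this example.

```python
from collections import OrderedDict

class LRUCache:
    """A tiny cache that evicts the least recently used item when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest item first, newest item last

    def access(self, address):
        """Return True on a cache hit and False on a cache miss."""
        if address in self.items:
            self.items.move_to_end(address)  # mark as most recently used
            return True
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # evict the least recently used item
        self.items[address] = True
        return False

cache = LRUCache(capacity=2)
results = [cache.access(a) for a in ["A", "B", "A", "C", "B"]]
print(results)            # [False, False, True, False, False]
print(list(cache.items))  # ['C', 'B']
```

When "C" arrives the cache is full, so "B" is evicted rather than "A", because "A" was used more recently.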

All three of these techniques increase the odds of cache hits happening, which improves the performance of the CPU.

DEFINITION

Volatile and Non-volatile Memory

Volatile memory is a type of storage that loses its data when power is turned off.

Non-volatile memory is a type of storage that retains its data even when power is turned off.

When a computer is powered on, it uses a steady supply of electricity. When the computer has been completely shut down, it does not use any electricity at all.

Volatile memory is a type of storage that needs a constant supply of electricity to store data. This means that while the computer is powered on, it is able to use volatile memory to temporarily store data.

However, once the computer is powered off, and there is no longer a constant supply of electricity, all of the data stored in the volatile memory is lost.

This is why a computer needs another type of storage, called non-volatile memory. Non-volatile memory is a type of storage that is able to store data even when there is no supply of electricity.

KEY POINT

Non-volatile memory is cheaper but slower at reading and writing data than volatile memory.

In computers, all data is stored as 1s or 0s. These are known as binary digits, or in short, bits.

In non-volatile memory, it takes a relatively large amount of energy and time to change the state of a bit (as an example, from a 0 to a 1). However, this is what allows non-volatile memory to keep its data without power.

The slower read/write speeds also make non-volatile memory unsuitable for processors, which need to be able to quickly access and store data. Instead, processors use volatile memory.

In volatile memory, it takes a small amount of energy and time to change the state of a bit. This means that data in volatile memory can be accessed very quickly and efficiently.

However, this technology is expensive and data can only be stored temporarily while there is a supply of power.

These differences between volatile and non-volatile memory demonstrate why both kinds of memory are necessary in a computer.

KEY POINT

Non-volatile memory is used to store data permanently when it is not in use.

Volatile memory is used to temporarily store data while there is a supply of power.

Non-volatile memory is used to permanently store data that is not in use. This is so that the data remains once the computer is powered off. Later, this data can be loaded once it is needed again.

You will need to be able to compare volatile and non-volatile memory. Below is a table of the main differences:

	Volatile Memory	Non-volatile Memory
Data persistence	It is temporary	It is permanent
Access time	It has a fast access time	It has a slower access time
Capacity	Small	Large
Price	Expensive	Inexpensive

A table showing the main differences between volatile and non-volatile memory.

DEFINITION

Latency

Latency is the time delay in receiving a piece of data in response to sending a request for data.

Latency is a term that refers to the time it takes for a piece of data to arrive at its intended destination after being requested from a certain source.

As an example, RAM has a latency of 50 to 100 nanoseconds. This means it takes around 50 to 100 nanoseconds for data to reach the CPU from RAM.
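To get a feel for what this latency means for the CPU, it can be converted into clock cycles. The sketch below assumes a clock speed of 3 GHz, which is only an example figure; 1 GHz corresponds to 1 clock cycle per nanosecond.

```python
CLOCK_SPEED_GHZ = 3  # assumed clock speed; 1 GHz = 1 cycle per nanosecond

def cycles_spent_waiting(latency_ns):
    """Number of clock cycles that pass while the CPU waits for the data."""
    return latency_ns * CLOCK_SPEED_GHZ

print(cycles_spent_waiting(100))  # 300
```

Under this assumption, a single 100 nanosecond access to RAM costs the CPU 300 clock cycles of waiting.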
