Memory Access Controller

304-487A Computer Architecture Lab Assignment #2

Due: Friday, October 5 at 16:30
May be completed in groups of maximum two students.

Note: Important changes indicated in red.

Introduction

The purpose of this assignment is to construct a Memory Controller (MC) that interacts with Static Random Access Memory (SRAM) through read and write operations. The assignment is to complete a design and verify complete functionality through simulation. The VHDL code can be developed and simulated using the Altera MAX Plus II software.

The MC will be tested using a simple host module that repeatedly makes use of the MC. A second method, which you may undertake as an optional bonus component, uses an external module that sorts the entries stored in the memory unit. In the remainder of this document the combination of the memory controller and the SRAM will be referenced as the memory unit.

Background

Current methods of storing memory are in large grids, or arrays, of cells that are associated to an individual address value. The width of these cells are variable and usually match the word length of the system. Memory blocks are most notably used to accompany Central Processing Units (CPU), however they can be used for other applications where storage of large amounts of information is necessary.

For satellites and other systems used in space, radiation bombardment can cause latchup and bit flipping. In order to combat these problems, data can be encoded in the form of Huffman codes. Huffman coding allows the detection of two errors, and can detect and correct one error. Therefore, a data word can be stored with its Huffman code and then checked when the data is read back. If one of the bits is incorrect when the data is read back, the error can be corrected.

The Assignment

The requirements of this assignment is to construct an MC with Huffman coding to perform error correction. The SRAM is provided. During a write operation from the MC to the SRAM, the MC will have to compute the Huffman error code and load it into the SRAM along with the data word. When the data is to be read, the data will be retrieved along with the error code. The error bits will be used to determine if there was an error and if it can be corrected. Figure 1 shows a block diagram of the entire system including input and output signals. The external module that uses the memory unit is referred to as the host.

Figure 1: System Overview

The Static RAM

A memory block entitled SRAM is provided. The SRAM is available in two versions. The first SRAM should be used to test the system and demonstrate functional operation. The second SRAM includes random error generation. It should be used to demonstrate that your Huffman coding works properly. You may need to simulate over an extensive period to find an error in the memory block.

In order for data to be written to and read from the memory correctly, specific read and write procedures must be followed. Both the read (read_n) and write (write_n) lines are active low.

Read Procedure

During a read operation, the address value must be set and the read line put low. The address and read line can be changed at the same time or in any order. It is recommended that the address is set at the same time or before the read line is activated. There is a finite propagation delay (Td) after the read line is activated or after the address is changed. This delay is dependent on the characteristics of the chosen device. It can be determined through simulation. Once the read line is deactivated, the output data line will return to the “0” state.

Figure 2: Read Procedure

Write Procedure

For the write procedure, the address, input data and write line may be set at the same time. It is recommended that the address and data lines are set before the write line. The write line must be held low long enough for the data to be written properly to the correct address. Then, the write line may be returned to the high value, however the address and input data must be held for a short period of time after the write line is deactivated in order to ensure that the data is loaded correctly. These delays are dependent on the chosen device.

Figure 3: Write Procedure

Addressing

The data word for the SRAM memory block is 16 bits wide. Each cell is 8 bits wide and the memory is 32 cells deep. The memory is cell-addressable and the address bus is thus 5 bits wide.

The addressing scheme of the SRAM allows the cells to be addressable in 8 bit increments, even though the word size is 16 bits. The memory block is addressed as if the data bus is only 8 bits wide. However, 16 bits are actually being written and read during a memory access. Therefore, adjacent 16-bit cells differ by an address value of two instead of one.

Figure 4: SRAM Addressing

Example Operation of SRAM

The first example is a write operation to two adjacent 16-bit blocks. The address value would be 0 for the first 16-bit data word. The least significant 8 bits would be placed in cell 0 and the most significant 8 bits would be placed in cell 1. The next write access would be addressed to cell 2, thus writing to cell 2 and 3.

The next example is accesses to two adjacent 8-bit words. The first address value would be 4 and the next would be 5. The difficulty with writing with 8-bit words is the fact that 16-bits are always written. As a result, if the MC is to write 8-bits into cell 4 again to overwrite the existing data, the value in cell 5 would also be overwritten. If the MC was writing a 16-bit word, then this would not matter. The write procedure for 8-bits must include a protective measure to ensure that the most significant bits are not lost (i.e. the data in cell 5 is preserved) This can be done with a read procedure before writing an 8-bit word.

The Memory Controller

The memory controller has the purpose of accessing the memory. It must adhere to the timing specifications in order to read and write data properly. The common bus that is used to interact with the SRAM can be made private under the assumption that it is a direct connection between the components.

Since the data word is 16 bits and the Huffman error code is only 5 bits, only three 8-bit cells should be used. Since the data bus is only 16 bits, two write and read operations will be necessary to access all of the data and error bits. Also, the data should be packed as tightly as possible to save space (i.e. in 8-bit increments, not to pack every single bit).

From the point of view of a host connected to the MC, the MC appears to be a linear array of ten 16-bit memory cells. The host addresses the MC in word increments. The MC maps this address to its actual location in the SRAM. For example, if the host device wishes to store a word in address 1, the MC will be placing it with its error code in cells 3, 4, and 5.

Huffman Coding

Huffman error coding is an algorithm that allows the detection of two errors and the correction of one error. For a 16 bit word, there are 5 error bits. They are calculated as follows:

Once these check bits are computed using XOR gates, they are stored in RAM with the data. When the data is retrieved, the error check bits are computed again and then compared with the error check bits retrieved from the memory block.

If errors do not exist, C[4..0] should be “00000”. If there is one error, the numerical value of C corresponds to the location of the error in the following manner.

Table 1: Value Mapping

Numeric Value	Bit	Numeric Value	Bit	Numeric Value	Bit
1	P₀	8	P₃	15	B₁₀
2	P₁	9	B₄	16	P₄
3	B₀	10	B₅	17	B₁₁
4	P₂	11	B₆	18	B₁₂
5	B₁	12	B₇	19	B₁₃
6	B₂	13	B₈	20	B₁₄
7	B₃	14	B₉	21	B₁₅

Therefore, the value of the data bits can be corrected before they are transmitted by the MC.

Testing the SRAM and Memory Controller

The SRAM and memory controller will be tested by writing a simple host module that repeatedly reads and writes to the SRAM through the memory controller. The second method (optional as bonus) involves designing a circuit which performs the bubble sort algorithm on the contents of the memory unit. Both of these methods are explained in greater detail below.

Test Method #1: Host Block

The host block is a simple test circuit to demonstrate the functional correctness of the MC. The host will request the MC to load several consecutive data words. The host will then retrieve certain data words, use them for a computation and then rewrite them to memory. The computation performed is not specified and may be trivial. Several of these operations should be demonstrated. The memory block is constructed so that errors will occur randomly. Memory accesses simulations should include reads of correct data and corrupted data. Simulations should clearly show where errors occur in the SRAM and that they were corrected, if possible, by the MC. Note that the error correction scheme used will fail in the unlikely event that two bits are corrupted by the SRAM. It is not necessary to take this case into account.

Test Method #2:Bubble Sort (optional bonus)

This method involves sorting the contents of the SRAM using the bubble sort algorithm. At the end of the sorting procedure, the largest 16-bit unsigned binary number in the SRAM must be located at address 9 while the smallest number must be located at address 0. A sorting device which actually performs the bubble sort algorithm must be built and connected to the MC. Figure 5 shows a block diagram of the entire system with the bubble sorting device. The sorter module replicates the interface of the memory controller with the addition of two signals. The sort input is used to trigger the beginning of a sorting operation. The busy output is raised by the sorter while it is sorting the entries of the memory unit. The sorter gives direct access to the memory controller while the sort signal is low. When sort is raised, the sorter raises the busy signal and starts sorting the content of the memory unit. Once the memory unit is sorted, the busy signal is dropped and operations return to normal. The sorter may be tested by running a simulation that writes ten numbers to the memory unit, requests a sort and reads back the ten numbers.

Figure 5: System overview with bubble sorter attached to MC.

Bubble Sort Algorithm

In the bubble sort algorithm, two neighboring values are compared and then swapped if they are in the wrong order. Given an array a of numbers, with length n, the algorithm would be coded in C as follows:

    for (i=0; i<n-1; i++) {
          for (j=0; j<n-1-i; j++)
                if (a[j+1] < a[j]) { /* compare the two neighbours */
                      temp = a[j];         /* swap a[j] and a[j+1]      */
                      a[j] = a[j+1];
                      a[j+1] = temp;
              }
    }

As shown above, the index j in the inner loop travels up the array, comparing adjacent entries in the array while the outer loop causes the inner loop to make repeated passes through the array. After the first pass, the largest element is guaranteed to be at the end of the array, after the second pass, the second largest element is in position, and so on.

Note that the code shown above is a software implementation of the bubble sort algorithm and therefore cannot necessarily be implemented in hardware in the same way. To successfully implement this algorithm in hardware, you must think carefully about the states needed to perform the desired operations (see the section below).

Algorithm Implementation in the Bubble Sorter

The sorting device should use an FSM to perform the bubble sort algorithm. In order to compare and swap two neighboring values in the memory unit, the sorter must first instruct the MC to read the appropriate values. The sorter should then latch these values just read and compare them. Once this is done, the sorter must write back the values in their new locations in the SRAM (if it was decided that a swap should occur). The sorter must also perform enough comparisons to ensure that the data have been placed in the correct locations in the memory unit. Use as few states as possible when implementing the bubble sort algorithm in hardware.

The Report

The design should be fully tested, simulated, and assured to be functional. The report will be based on the design, simulation and analysis.

Mark Breakdown

The assignment will be graded as follows:

Memory Controller (85%)

design - 40%
simulation and testing (host) - 30%
analysis - 30%

Bubble Sort (bonus 15%)

design - 40%
simulation and testing (host) - 30%
analysis - 30%

Overall Presentation (15%)

Design

These marks will come from the functionality, simplicity, robustness and elegance of design. It should be expandable and generic. The VHDL code should be submitted with the report in small but legible font. (so that there is no annoying wrap around) The design should also be as compact as possible. The size of the design will be noted.

Proper documentation and commenting will play an important role in the marking. Each file header should include the filename, title, author, date, and a brief description. The brief description should explain how the functional block represented by that file works from a black box point of view. It should be written in such a way that someone could read the description and look at the ports in the entity declaration and understand what the functional block does and how it can be used. This does not need to be very long. There should also be a few descriptive words of comment next to each port in the entity declaration. Each process should have another brief description just before the process declaration. It should describe how the process works. If the process is a finite state machine, describe each state and its purpose along with the other states it would transition to and why.

Simulation

The simulations should be clear and demonstrate the functionality of the important blocks (a simulation for a couple of and gates is not necessary). There should be a simulation for the complete functioning design as well. There should be simulations testing for errors and demonstrating the robustness of the design.

The simulations should include the important signals in the external and internal operation of the blocks. There should be clear and concise indications on the simulations describing what is happening. Key points in the simulation should be pointed out. Simply write on the simulations with a pen (multicolored would be best) indicating what state the block is in and what transitions are important. However, there should not be any clutter. The simulations should be large and found in the appendix. Important simulations should be reproduced in the report body.

Report

The report is to be limited to 5 pages (single sided). Appendices are considered extra. Dual column format is recommended but not necessary. It should be clear and easy to read.

The report should include a short description of the design and any interesting or important elements encountered in the design process. Also, the interesting features of the design should be pointed out. The important diagrams should be shown in the body of the report, however there may be references to diagrams and simulations in the appendices. The exact organization of the report is left to your discretion.

Analysis

Also in the report, the analysis is a breakdown of the performance and abilities of the design. It should indicate the maximum operating frequency of the design and the corresponding speed at which the memory controller can perform a complete read and write operation (i.e. how long will the host have to wait for the MC to put or fetch data). If you have done the bonus component, it should also indicate how long a sorting operation takes. Also, it should show where the longest delay path is and how it limits the maximum operating frequency.

The size of the design should be indicated by the number and % of logic blocks that were used. Excepts from the .rpt file should be included in the appendix in order to demonstrate this value. Only the important parts of the .rpt file should be included such as the Device Summary and the Resource Usage. DO NOT INCLUDE THE ENTIRE REPORT FILE IN THE APPENDIX.

Discuss limitations of your design and possible improvements.

Enclosures

The hardcopy report must contain all VHDL code and simulations and excerpts from the .rpt file. Also, all the simulation files (.scf), VHDL files (.vhd), and the overall design report file (.rpt) (i.e. not one for each component) must be tared, uuencoded and emailed to the instructor. Refer to assignment #1 for the exact procedure. Replace 487-A1-LASTNAME by 487-A2-LASTNAME or by 487-A2-LASTNAME1-LASTNAME2 in the case that you worked on this assignment in groups of two.

Note that there is no mini-presentation component for this assignment.