CSCI 411

rmC VIRTUAL MACHINE
QUADRUPLES INTERPRETATION

Our compiler for rmC will translate source code to a low-level code where each instruction is a quadruple of integers of the form
         (OP, OPND1, OPND2, RESULT),
where
          OP is a numeric code for an operation;
          OPND1 is a number representing the first operand;
          OPND2 is a number representing the second operand; and
          RESULT is a number indicating the location for the result of the operation.

We shall consider these quadruples as instructions for a fictitious machine, the rmC virtual machine. You are to write a program which will simulate the action of this machine. In fact, as you write this program, you should structure it so that it clearly implements a simplistic instruction fetch cycle for the rmC virtual machine.

Consult the table of quadruples for rmC for a full listing of the operation codes and their corresponding operands. These quadruples will act as the instruction set for our virtual machine. Note that most operands of these instructions refer numerically not to addresses in our machine's memory, but to positions in an array of entries from portions of a compiler-generated symbol table and/or constant table.

Your first assignment, in preparation for the development of your rmC compiler, is to write a C program, rmcrun.c, which will act as an interpreter of a set of quadruples. That is, your program, on being given a set of quadruples along with appropriate portions of a symbol table and a constant table, should act as an executor of the quadruples. It should read each quadruple instruction in sequence, interpret the action to be taken, and take the action. Note that the quadruples should be interpreted in the sequence in which they appear, except when the current quadruple indicates a branch to some quadruple other than the next one in sequence.

In the course pickup directory you will find copies of the rmC sample programs stored in the files prog1.rmc, prog1a.rmc, prog2.rmc, and prog2a.rmc. Also you will find in the pickup directory two executable files, called irmcc and irmcrun. These are working forms of the compiler and virtual machine simulator that you are to create during this course. Copy the sample programs and these two executables to your own directory so that you may use them to generate test quadruple strings and to see how quadruple execution is to work.

In fact, once you get these files in your directory, execute the following and carefully look over the results.

                   % irmcc prog1.rmc -q

The output from this command will be a rather readable dumping of the list of quadruples, the symbol table entries, and the constant table entries generated in compiling the rmC source for Sample Program #1 into code executable by the virtual machine. Compare this output with the listing of quadruples for Sample Program 1 along with the full symbol and constant tables. 

You will note that a file called prog1.vm has also been produced by running irmcc. Study this file to see that it shows essentially the same information as was shown on standard output, except that its content is stripped down to just those entries that are needed by the virtual machine. This file prog1.vm is the one that is to be run on the virtual machine by using the command

                    % irmcrun prog1.vm

Noting first the action that Sample Program 1 is supposed to create, run prog1.vm on the virtual machine and observe that the action goes as expected.

Now study the format of the entries in prog1.vm. This is the format for any input file that your own quadruples interpreter (machine simulator) will be expected to "execute." As you will note in studying this, the layout is:

After designing and implementing your interpreter, test it on the quadruples generated by irmcc for each of the given sample programs. Be sure that each runs correctly.


Some Details

Given a file like prog1.vm, the first action of your interpreter is to read this file, storing the quadruples in one array, the partial symbol table in another, and the partial constant table in a third. In what follows we'll assume that all arrays are set up as in C, where the first member of an array has index 0. (For now just allocate these arrays statically using reasonably large upper bounds for the sizes of these arrays.)

Since rmC has no subroutine calls, static allocation of storage suffices. In fact, for the simple programming we will do with it, we can store all values of variables right in the runtime symbol table entry corresponding to the variable. Consequently, action on the assignment statement will be just that.

The compiler places values of constants in a separate runtime constant table. Whereas quadruple operands which refer to variables use the index value of the variable's entry in the runtime symbol table, references to constants are made similarly except they are shown as negative numbers and offset somewhat in order to avoid confusion with symbol table references. For example, an operand referring to the first entry in the runtime constant table (the one with index 0 in that array) would be shown with value -1; a reference to the second entry in the runtime constant table would be shown as -2; etc.

What symbol table and constant table fields are needed by the interpreter? Only the VAL field will be used during execution. Consequently, for now you need to worry about only that field. That is, as noted above, the input file for your interpreter should contain

Here, for the sake of introduction, we shall discuss briefly other fields that appear in the full versions of these tables as used by the compiler. These other fields are used during phases of compilation that precede quadruples interpretation.

The NAME field of the symbol table is a reference to the actual variable name used in the source program being compiled. As such, however, it does not contain the name itself, but rather a pointer to (subscript of) the place in a large character string where the name is stored. This large character string will be called the "name table" and will contain all of the names defined in the source program. Each such name will be null terminated, as is the C convention. That is, each will end with the null character. The NAME field of the constant table is similar except that it refers to the ASCII string which gives the value of the constant. For example, the ASCII string "500" is the name of the integer constant 500.

The TYPE field of the symbol and constant tables is used to distinguish integer, Boolean, and label values (the only three types admissible in rmC ). These are encoded numerically using numeric codes
       1 for INTEGER,
       2 for BOOLEAN, and
       3 for label.
With this and the convention for the NAME field mentioned above, the compiler can set up the symbol and constant tables as two-dimensional arrays of integers (assuming a numeric 0-1 code for the TEMP and DCL fields as well and assuming BOOLEAN values will be standardly encoded with 0 for FALSE and 1 for TRUE).


Sample Program #1: Data for Quads Interpreter

Index Name Type Value

0

"0"

Int

0

1

"10"

Int

10

2

"1"

Int

1

Constant Table for Sample Program #1

Identifier

Symbol table entry

Name

"X"

"SUM"

"I"

"LOOP"

"T1"

"T2"

"T3"

Index

Type

Temp

Dcl

Val

0

Int

F

T

0

1

Int

F

T

0

2

Int

F

T

0

3

Label

F

T

2

4

Bool

T

T

0

5

Int

T

T

0

6

Int

T

T

0

Final Symbol Table for Sample Program #1

Quadruple Listing

Meaning of Quadruple

Quad #

0

1

2

3

4

5

6

7

8

9

10

11

12

OP

OPND1

OPND2

RESULT

3

-1

0

1

3

-1

0

2

11

2

-2

4

1

11

4

0

4

0

0

0

5

0

0

0

6

1

0

5

3

5

0

1

6

2

-3

6

3

6

0

2

2

2

0

0

5

1

0

0

0

0

0

0

Meaning

SUM := 0

I := 0

T1 := I LT 10

if T1 = 0 then go to quad 11

READ X

WRITE X

T2 := SUM + X

SUM := T2

T3 := I + 1

I := T3

go to quad 2

WRITE SUM

NOP

Quadruples for Sample Program #1