Efficiently implementing objects

by Carl Burch, Hendrix College, December 2011

Efficiently implementing objects by Carl Burch is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.
Based on a work at www.cburch.com/books/oops/.

1. Basics
2. Polymorphism
3. Virtual methods
4. Efficiency of virtual methods

For a compiler handling procedural language like C, the job is relatively straightforward: C and many other procedural languages have been designed to approximate what the computer does. Object-oriented programming systems, however, involve several concepts that were invented without worrying about how they would be translated. As you might expect, then, translating these programs is more difficult. In this document, we'll look at how a compiler can address some of these challenges, using translations from Java to ARM assembly language for our examples.

1. Basics

An object-oriented programmer tends to think of an object as a conglomeration of variables and methods. In memory, however, each individual Java object of the same class shares the same methods, and so there is no reason to store methods with individual instances. The values associated with instance variables are specific to the object, however, and so the representation of an object in memory includes each of these values. The memory representation of an instance is called its class instance record, or CIR.

Consider, for example, the Container class definition below.

class Container { int capacity; int contents; Container(int cap) { capacity = cap; contents = 0; } int getRemaining() { return capacity - contents; } void add(int wt) { if (contents + wt <= capacity) { contents += wt; } } }

Given this definition, a Java compiler will decide that each Container needs to hold two integer values, for the instance variables capacity and contents. Thus, it will reason, it must allocate eight bytes for each Container instance. (Later, in Section 3, we'll see that the CIR requires an additional four bytes, but we'll imagine we only need 8 for now.)

Container CIR

0 capacity

4 contents

	`Container` CIR
0	`capacity`
4	`contents`

Compiling an instance method is similar to compiling a function from C. The method will be translated into a single subroutine placed at a fixed location in memory, and compiled code can call this subroutine when the method is to be executed. One difference is that the code for the instance method needs to know which object it is addressing (that is, what this is). This can be done by giving each instance method an implicit first parameter, the address of this, through R0; whatever is the first parameter given in the parentheses will actually be passed as the second subroutine parameter in R1, and subsequent parameters will follow it.

Thus, a compiler might produce the following in compiling Container's getRemaining instance method.

Container_getRemaining LDR R1, [R0] ; load capacity into R1 LDR R2, [R0, #4] ; load contents into R2 //@ getContents SUB R0, R1, R2 ; place return value in R0 MOV PC, LR ; and return

The subroutine's first line illustrates the compiler fetching an instance variable from an object: It loads the capacity variable, which we're supposing is the first thing in a Container CIR. The contents variable, loaded in the second line, would be four bytes beyond this.

Constructors can be compiled similarly.

Container_construct STR R1, [R0] ; store to this.capacity MOV R1, #0 STR R1, [R0, #4] ; store to this.contents MOV PC, LR

Constructors are slightly different when they are called, however: Before calling a constructor, the code must allocate space for the CIR. Suppose the compiler were to compile new Container(100). Since a Container occupies eight bytes, the first task is to allocate eight bytes of memory.

MOV R0, #8 ; first we call malloc to allocate 8 bytes BL malloc MOV R1, #100 ; now we call constructor subroutine with 100 as parameter BL Container_construct

Methods, then, aren't very different from functions. Class methods are even less different — they don't even require the implicit parameter.

Note that in our examples, we neglected to include any indication of protection, like private or public. Though such protection is a central feature of object-oriented programming, it has no implication for the compiled code. This information is used only during compilation, when the compiler should flag protection violations. Because the compiler ensures that there are no protection violations, the executable program it generates needn't bother with them.

(By the way, the CIR approach discussed here applies to Java but not to all object-oriented languages. Other languages allow an object to gain new instance variables after the object is created — and even the instance methods might change after the object is created. Python and JavaScript are two prominent examples that provide this flexibility. For these languages, the object is essentially represented as a dictionary mapping member names to their values. Java does not follow this technique due to the memory and time inefficiencies of doing a dictionary lookup with each object access.)

2. Polymorphism

Polymorphism refers to the ability of an object of one class to masquerade as an object of a superclass. For example, suppose we define a Suitcase class extending Container.

class Suitcase extends Container { boolean closed; Suitcase(int cap) { super(cap); closed = true; } void setClosed(boolean val) { this.closed = val; } void add(int wt) { if (!this.closed) { super.add(wt); } } }

Assuming this subclass definition, we can create a Container variable that actually would refer to a Suitcase object.

Container bag = new Suitcase(100);

This ability of a variable to reference a subclass instance in polymorphism, and it has some major implications for code generation.

Each Suitcase object has three instance variables: In addition to the closed instance variable, it inherits the capacity and contents instance variables from the Container class. In order to be able to masquerade as a Container object, the compiler needs to place Container's instance variables first in the CIR. Thus, at the address contained by the bag variable initialized as above, we would find its capacity variable's value. Four bytes beyond this, we would find its contents variable's value. And four bytes beyond this, we would find its closed variable's value. (The program wouldn't be able to access bag.closed directly, since the bag variable is declared as a Container, but it would still exist in memory.)

Suitcase CIR

0 capacity

4 contents

8 closed

	`Suitcase` CIR
0	`capacity`
4	`contents`
8	`closed`

The compiler cannot place the closed variable elsewhere in a Suitcase CIR. For example, it couldn't go between capacity and closed. Doing this would prevent the already-compiled Container methods from working; for example, the compiled getRemaining method we saw relies on contents being four bytes beyond the capacity value in memory.

If the compiler places instance variables of the superclass before those of the subclass, then all the inherited instance methods work well. Indeed, if the compiler needs to generate code to handle a cast into a superclass, it can simply copy the existing CIR address.

(By the way, this approach works only as long as each class has just one superclass. This is good enough for Java, which requires this. But C++ allows multiple inheritance (where a class can inherit from several superclasses), and so it must incorporate more complex techniques for supporting polymorphism.)

3. Virtual methods

An important complication arises when we consider the fact that a class can override methods of its superclass. For example, the Suitcase class overrides Container's add() method. In object-oriented language parlance, a method that can be overridden is a virtual method. In Java, virtually all methods are virtual.

Compiling a virtual method isn't a major problem. The problem is in how the computer calls the method. Suppose that bag is a variable declared as a Container, and that the compiler sees bag.add(25) in the program. Without considering virtual methods, the natural way to compile this is the following.

MOV R0, R4 ; bag (assumed to be in R4) is implicit first parameter MOV R1, #25 ; 25 is the parameter in parentheses. BL Container_add

This is erroneous, though, because while bag is declared as a Container, it may actually be a Suitcase due to polymorphism. This assembly code would call Container's add method, and the 25 pounds would go into the suitcase whether or not it is open. But overridden methods are to be in force even when the variable's type is its parent class. In this case, if the suitcase is closed, the 25 pounds should not go in.

To support this, object-oriented compilers use a technique called the virtual method table, or VMT. For each class in a program, the compiler generates a fixed table of the virtual methods defined by the class.

Container VMT

0 Container_getRemaining

4 Container_add

Suitcase VMT

0 Container_getRemaining

4 Suitcase_add

8 Suitcase_open

12 Suitcase_close

The Suitcase VMT must start by following the Container VMT for the methods that it inherits; but for methods that it overrides (here, the add method), it contains the address of the overriding method's first instruction instead.

We change the CIR format by always allocating the first four bytes to refer to the VMT of the instance's actual class. Thus, if bag were a Suitcase, its CIR contains first the Suitcase VMT's memory address, followed by its capacity instance variable value, then its contents value, then its closed value. The constructor would be responsible for initializing the VMT pointer for each class instance.

Container CIR

0 Container_VMT

4 capacity

8 contents

Suitcase CIR

0 Suitcase_VMT

4 capacity

8 contents

12 closed

The following translation of bag.add(25); would support overriding, where R4 contains the address corresponding to bag.

MOV R0, R4 ; set up parameters as before MOV R1, #25 LDR R3, [R4] ; load VMT address for bag into R3 ADD LR, PC, #4 ; set up link register to be instruction after next LDR PC, [R3, #4] ; load into PC (thus branching) the address of add method

The first line looks at the memory pointed to by bag; with our redefinition of the CIR format, the data here is the memory address of the VMT for bag's actual type. If bag is a Suitcase, then this line would place the address of the Suitcase VMT into R3. When the computer enters the routine from the last line, it determines the routine's address from the second entry in the Suitcase VMT, which would be Suitcase's add method. If bag were a plain Container, however, then the third line would load Container's VMT address, and so it would enter Container's add method in the final line since this is where the second entry of Container's VMT points.

4. Efficiency of virtual methods

Calling virtual methods is somewhat less efficient than calling non-virtual methods. The problem is that calling them requires two memory accesses: one to find the virtual table's address, and another to load the method's location in memory. In contrast, a non-virtual method requires no memory accesses. We call this the virtual method penalty.

This slight inefficiency of calling virtual methods led the C++ designers to require C++ programmers to explicitly designate which methods should be virtual. Because many C++ programmers do not make this effort, most C++ methods turn out to be non-virtual. This jars many object-oriented advocates, who feel that non-virtual methods are contrary to object-oriented ideals. The Java designers followed this sentiment by defining all instance methods to be virtual by default. They did, however, allow a programmer to designate a method as final, which prevents a subclass from overriding the method; methods marked as private would also be impossible to override. This permits the compiler to avoid the virtual method call penalty. (Generally, you should avoid such optimizations unless tests demonstrate that the change truly makes a difference for your particular program. It rarely does, since the two extra memory references aren't that catastrophic.)

In practice, a good optimizer can reduce the virtual method call penalty. A simple optimization for an object-oriented compiler is to determine the actual variable type at compile time. Consider the following.

Container bag = new Suitcase(100); bag.add(10);

A compiler might look at this code and determine that, when the add method is being called, bag is of necessity a Suitcase. Thus, it could generate code to call Suitcase's add method directly, without reference to the VMT to which bag refers.

When inference of the actual type doesn't eliminate the penalty, it might be alleviated through avoiding repeated loads from the VMT. Consider the following.

for (; n != 0; n--) bag.add(n);

This seems to require two memory references each time the program calls add. But the compiler could notice that the computation of add's location can be lifted from the loop.

LDR R11, [R4] ; load VMT address into R3 LDR R11, [R3, #4] ; load address of add method into R11 B test again MOV R0, R4 ; bag.add(n) MOV R1, R5 ADD LR, PC, #4 MOV PC, R11 SUB R5, R5, #1 ; n-- test TST R5, R5 ; repeat if n != 0 BNE again

Through these two optimizing techniques, a compiler can reduce the virtual method penalty to a fraction of what it would be otherwise. Many programmers erroneously exaggerate the virtual method penalty because they underestimate what a good optimizer can do.