Virtual method table

From Wikipedia, the free encyclopedia

A virtual method table, virtual function table, dispatch table, or vtable, is a mechanism used in programming languages to support dynamic polymorphism, i.e., run-time method binding.

Suppose a program contains several classes in an inheritance hierarchy: a superclass, Cat, and two subclasses, HouseCat and Lion. Class Cat defines a virtual function named "speak" but provides no implementation. Instead, its subclasses must provide an appropriate implementation (i.e., either meow or roar).

When the program calls the "speak" method on a Cat pointer (which can point to any subclass of Cat), the run-time environment must be able to determine which version to call.

There are a variety of different ways to implement such dynamic dispatch, but the vtable solution is especially common among C++ and related languages (such as D and C#).

Contents

[edit] Implementation

An object's dispatch table will contain the addresses of the object's dynamically bound methods. Method calls are performed by fetching the method's address from the object's dispatch table. The dispatch table is the same for all objects belonging to the same class, and is therefore typically shared between them. Objects belonging to type-compatible classes (for example siblings in an inheritance hierarchy) will have dispatch tables with the same layout: the address of a given method will appear at the same offset for all type-compatible classes. Thus, fetching the method's address from a given dispatch table offset will get the method corresponding to the object's actual class.
(Ellis & Stroustrup 1990, pp. 227–232)

The C++ standards do not mandate exactly how dynamic dispatch must be implemented, but compilers generally use minor variations on the same basic model.

Typically, the compiler creates a separate vtable for each class; adds a pointer to this vtable, called the virtual table pointer or vpointer, as a hidden member (often the first member) of each base class; and generates hidden code in the constructor for each class to initialize its vpointer to the address of the corresponding vtable.

[edit] Example

Consider the following class declarations in C++ syntax:

class B1 {
  void f0();
  virtual void f1();
  int int_in_b1;
};

class B2 {
  virtual void f2();
  int int_in_b2;
};

used to derive the following class:

class D : public B1, public B2 {
  virtual void d();
  virtual void f2();  // override B2::f2()
  int int_in_d;
}

and the following piece of C++ code:

  B2 * b2 = new B2();
  D * d = new D();

g++ 3.4.6 from the GNU Compiler Collection produces the following 32bit memory layout for the object b2:

(commands to get memory layout / tools etc. Please insert)

(g++'s -fdump-class-hierarchy argument can be used to dump virtual method tables for manual inspection)

b2:
  +0: pointer to virtual method table of B2
  +4: value of int_in_b2

virtual method table of B2:
  +0: B2::f2()   
  +4: B2::~B2()

and the following memory layout for the object d:

d:
  +0: pointer to virtual method table of D (for B1)
  +4: value of int_in_b1
  +8: pointer to virtual method table of D (for B2)
 +16: value of int_in_b2
 +20: value of int_in_d

virtual method table of D (for B1):
  +0: B1::f1()
  +4: D::~D()
  +8: D::d()
 +12: D::f2()

virtual method table of D (for B2):
  +0: D::f2()   // B2::f2() is overridden by D::f2()
  +4: D::~D()

Note that non-virtual functions (such as f0) do not generally appear in the vtable, but there are exceptions in some special cases (such as the default constructor).

Overriding of the method f2() in class D is implemented by duplicating the virtual method table of B2 and replacing the pointer to B2::f2() with a pointer to D::f2().

[edit] Multiple Inheritance

The g++ compiler implements the multiple inheritance of the classes B1 and B2 in class D using two virtual method tables, one for each base class. (There are other ways to implement multiple inheritance, but this is the most common.) This leads to the necessity for "pointer fixups" when casting.

Consider the following C++ code:

  D * d = new D();
  B1 * b1 = dynamic_cast<B1*>(d);
  B2 * b2 = dynamic_cast<B2*>(d);

While d and b1 will point to the same memory location after execution of this code, b2 will point to the location d+8 (eight bytes beyond the memory location of d). Thus, b2 points to the region within d which "looks like" an instance of B2, i.e., has the same memory layout as an instance of B2.

[edit] Invocation

A call to d->f1() is handled by dereferencing d's D::B1 vpointer, looking up the f1 entry in the vtable, and then dereferencing that pointer to call the code.

In the case of single inheritance (or in a language with only single inheritance), if the vpointer is always the first element in d (as it is with many compilers), this reduces to the following pseudo-C++:

  *((*d)[0])(d)

In the more general case, as above, calling f1(), B1::f1(), and B2::f2() on d are more complicated:

  *((d->"pointer to virtual method table of D (for B1)")[0])(d)
  *((d->"pointer to virtual method table of D (for B1)")[12])(d)
  *((d->"pointer to virtual method table of D (for B2)")[0])(d+8)

By comparison, a call to d->f0() is much simpler:

  *"B1::f0"(d)

[edit] Efficiency

Note that a virtual call requires at least an extra indexed dereference, and sometimes a "fixup" addition, compared to a non-virtual call, which is simply a jump to a compiled-in pointer. So, calling virtual functions is inherently slower than calling non-virtual functions. Although this is often not a factor in real-world efficiency, it can be.

In addition, in enviroments where JIT compilation is not in use, virtual function calls usually cannot be inlined. While a compiler could replace the lookup and indirect call with, e.g., a conditional execution of each inlined body, such optimizations are not common.

To avoid this overhead, compilers usually avoid using vtables whenever the call can be resolved at compile time.

Thus, the call to f1 above may not require a vtable lookup because the compiler may be able to tell that d can only hold a D at this point, and D does not override f1. Or the compiler (or optimizer) may be able to detect that there are no subclasses of B1 anywhere in the program that override f1. The call to B1::f1 or B2::f2 will probably not require a vtable lookup because the implementation is specified explicitly (although it does still require the this-pointer fixup).

[edit] Comparison with Alternatives

While the vtable is probably the fastest implementation possible for dynamic dispatch, it is not the most flexible.

Vtables only allow for single dispatch on the special "this" parameter, in contrast to multiple dispatch (as in CLOS or Dylan), where the types of all parameters can be taken into account in dispatching.

Vtables also only work if dispatching is constrained to a known set of methods, so they can be placed in a simple array built at compile time, in contrast to duck typing languages (such as Smalltalk, Python, or JavaScript, or C++'s own compile-time template facility).

Languages that provide either or both of these features often dispatch by looking up a string in a hash table, or some other equivalent method. There are a variety of tricks to speed this up (e.g., interning/tokenizing method names, caching lookups, just-in-time compilation), and dispatch time often does not have a significant effect on total processing time, but nonetheless, vtable lookup is obviously faster. Vtables are also simpler to implement and debug, and closer to the "C spirit" than hash tables of strings.

[edit] See also

[edit] References

In other languages