On the Use of Inheritance

March 7, 2023 · 7 min read

Herwig Mannaert

Founder

The technique of inheritance was introduced as an essential part of the object-oriented programming paradigm, aiming to deliver more anthropomorphic code by mimicking the concept of ontological refinement. Just as a jet fighter is a special type of airplane, a bird is a special type of animal, and a sparrow is a special type of bird, inheritance was created to define classes as refinements of other classes. Such a subclass would inherit the member variables and methods of a superclass, and extend it with more specific variables and methods for that particular refinement. The notion of ontological refinement is strongly related to taxonomy, i.e., the practice and science of the classification of things or concepts, including the underlying principles. Originally, taxonomy referred only to classifying – or classifications of – organisms. This probably explains why the technique of inheritance in software programming languages is often illustrated with examples of animals or plants.

However, the use of ontological refinement to define classes entails several issues, both on a conceptual and a technical level. First, taxonomies are essentially multi-dimensional, i.e., they can be based on multiple criteria. For example, one could refine or categorize a human person based on gender, nationality, or age. Constructing a taxonomy tree based on various criteria could lead to a combinatorial explosion, as nearly all combinations would be possible, e.g., BelgianMaleChild and GermanFemaleAdult. Another option would be to ignore the multi-dimensional nature of the taxonomy, and to use a single primary criterion at a certain level. This approach could limit the anthropomorphic meaning of the classes, as can be illustrated by the distinction between mammals and fish based on the procreation mechanism, e.g., whales and dolphins are not fish although their appearance and behavior are typical of fish. Therefore, it often seems preferable to use references to external classes, such as gender, nationality and age, instead of using inheritance to create subclasses based on taxonomy trees.
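To make the trade-off concrete, the following minimal sketch contrasts the two approaches. The names `Gender`, `Nationality`, `AgeGroup` and `Person` are illustrative assumptions, and only a handful of values are listed per criterion.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Gender(Enum):
    MALE = "male"
    FEMALE = "female"


class Nationality(Enum):
    BELGIAN = "Belgian"
    GERMAN = "German"
    FRENCH = "French"


class AgeGroup(Enum):
    CHILD = "child"
    ADULT = "adult"


# Inheritance approach: one subclass per combination of criteria.
# The number of class names grows as the product of the criteria sizes.
subclass_names = [
    f"{n.value}{g.value.capitalize()}{a.value.capitalize()}"
    for n, g, a in product(Nationality, Gender, AgeGroup)
]


# Reference approach: one class that refers to external taxonomy entities.
@dataclass(frozen=True)
class Person:
    name: str
    gender: Gender
    nationality: Nationality
    age_group: AgeGroup


person = Person("Alex", Gender.MALE, Nationality.BELGIAN, AgeGroup.CHILD)
print(len(subclass_names))  # 3 * 2 * 2 = 12 subclasses would be needed
print(subclass_names[0])    # BelgianMaleChild
```

Adding a fourth nationality to the reference approach means one extra enum value, whereas the inheritance approach would need four additional subclasses.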

Second, the introduction of additional attributes does not always correspond with ontological refinement. Though it is indeed true that a more specific concept in general exhibits more specific attributes, this is not the only mechanism. For instance, a point in 3-dimensional space has an additional attribute with respect to a 2-dimensional point, but it is a generalization instead of a refinement. It is not uncommon for attributes of a general concept to get default values in more specific concepts, so that they can be omitted there. In such cases, programmers are tempted to define the more general concept as a subclass, which is once again detrimental to the anthropomorphic meaning of the classes.
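The point example could be sketched as follows. The class names are hypothetical; the first pair shows the tempting design, the last class one possible alternative in which the 2-dimensional case is simply the general concept with a default value.

```python
from dataclasses import dataclass


# Tempting design: Point3D "extends" Point2D because it adds one attribute,
# although a 3D point is the more general concept, not a refinement.
@dataclass
class Point2D:
    x: float
    y: float


@dataclass
class Point3D(Point2D):
    z: float = 0.0


# Alternative without inheritance: the general concept carries all attributes,
# and the specific case is obtained by leaving z at its default value.
@dataclass(frozen=True)
class Point:
    x: float
    y: float
    z: float = 0.0

    def is_planar(self) -> bool:
        """A point with z == 0 lies in the xy-plane, i.e. acts as a 2D point."""
        return self.z == 0.0


flat = Point(1.0, 2.0)
# The subclass relation claims that every 3D point "is a" 2D point.
print(isinstance(Point3D(1.0, 2.0, 3.0), Point2D))  # True
print(flat.is_planar())                              # True
```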

Besides the underlying conceptual issues, the use of inheritance in a traditional object-oriented language creates several technical issues in the context of software development. These issues are mainly related to the introduction of technical coupling between the various classes. Every subclass in the inheritance tree has access to the non-private variables and methods of its superclass(es). This coupling is essentially hidden, i.e., the inherited attributes and methods cannot be seen in the class source code, and may be spread out over many classes across many inheritance levels. The fact that modern IDE tools visualize inherited attributes and methods does not really change that. A simple typo can remain invisible during compilation, as the compiler may find an actual attribute or method with that name somewhere in the inheritance hierarchy, resulting in the use of the wrong attribute or method. The fact that some languages do not allow multiple inheritance, and that authors often advise limiting the number of levels in multilevel inheritance, is a clear testimony to the dangers of this technical coupling.
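The typo scenario can be illustrated with a small sketch. Python resolves names at run time rather than at compile time, but the effect is analogous: the misspelled name silently resolves to an attribute inherited from further up the hierarchy. All class and attribute names here are hypothetical.

```python
class Connection:
    def __init__(self) -> None:
        self.timeout = 30  # seconds, defined two levels up the hierarchy


class SecureConnection(Connection):
    pass  # intermediate level, adds nothing relevant here


class RetryingConnection(SecureConnection):
    def __init__(self) -> None:
        super().__init__()
        self.timeout_ms = 500  # milliseconds, the attribute this class means to use

    def wait_budget_ms(self, attempts: int) -> int:
        # Typo: `timeout` instead of `timeout_ms`. No error is raised because
        # the name resolves to the attribute inherited from Connection.
        return attempts * self.timeout


conn = RetryingConnection()
print(conn.wait_budget_ms(3))  # 90, not the intended 1500
```

Nothing in the source of `RetryingConnection` reveals that `timeout` exists, which is the hidden coupling described above.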

We have often argued that the principle of separation of concerns implies that object-oriented classes cannot contain both data and behavior from a functional point of view. A class needs to either represent data with some auxiliary utility methods, or to represent a behavioral action with some auxiliary utility data attributes. Therefore, we need to distinguish between data and behavior to further elaborate the preferred way to deal with inheritance or ontological refinement.

For behavior functions or methods, we believe that the use of polymorphism is both appropriate and sufficient to provide the required functionality. The mechanism enables the software engineer to provide a variety of implementations that may represent ontological refinement, while realizing at the same time action version transparency for different versions and variants. For instance, an Encrypter interface can encapsulate various specific ways of encrypting a message, while a Publisher interface may be used as a common concept for different ways of message delivery. Every implementation class can be passed in a way that makes the presence of the behavior explicit and allows the programmer to invoke that behavior.
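A minimal sketch of this idea, assuming an `Encrypter` and a `Publisher` interface with a single method each. The method names and the toy implementations are assumptions made for illustration; the "encryption" shown is deliberately trivial and not secure.

```python
from typing import Protocol


class Encrypter(Protocol):
    def encrypt(self, message: str) -> str: ...


class Publisher(Protocol):
    def publish(self, message: str) -> None: ...


class ReverseEncrypter:
    """Toy implementation: reverses the message (not real encryption)."""

    def encrypt(self, message: str) -> str:
        return message[::-1]


class ShiftEncrypter:
    """Toy Caesar shift over the lowercase letters a-z (not real encryption)."""

    def __init__(self, shift: int) -> None:
        self.shift = shift

    def encrypt(self, message: str) -> str:
        return "".join(
            chr((ord(c) - ord("a") + self.shift) % 26 + ord("a"))
            if "a" <= c <= "z" else c
            for c in message
        )


class InMemoryPublisher:
    """Collects messages in a list instead of delivering them anywhere."""

    def __init__(self) -> None:
        self.sent = []

    def publish(self, message: str) -> None:
        self.sent.append(message)


def send(message: str, encrypter: Encrypter, publisher: Publisher) -> None:
    # The behavior is passed in explicitly; no class inherits from another.
    publisher.publish(encrypter.encrypt(message))


outbox = InMemoryPublisher()
send("hello", ReverseEncrypter(), outbox)
send("hello", ShiftEncrypter(1), outbox)
print(outbox.sent)  # ['olleh', 'ifmmp']
```

The signature of `send` makes the presence of both behaviors explicit, and a new variant can be added without touching existing classes.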

Concerning data attributes or references, we want to stress again that it is often preferable to use external taxonomy entities to avoid the combinatorial explosion of inheritance trees due to the multi-dimensional nature of the taxonomies. A person could for instance have separate taxonomy entities for gender, nationality, and age, or a car could have separate taxonomy entities for the car brand, vehicle type, and propulsion type. Nevertheless, in many cases ontological refinement could be preferable, as a number of common attributes may reflect an important common notion or concept that needs to be made explicit. For instance, a home could be an important concept with common attributes that needs to be refined for a house or an apartment. Another example is a legal person, which could be refined into a person or a legal entity. We argue here once again that the use of polymorphism is both appropriate and sufficient to support such a concept. By providing an interface to represent the common concept, such as a home, the programmer is able to identify, handle, pass, and access the various attributes of this common concept through all ontological refinements.
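The home example might look as follows. The attribute names (`address`, `living_area_m2`, `garden_area_m2`, `floor`) are assumptions chosen for illustration; the point is only that `House` and `Apartment` share an interface without sharing a superclass.

```python
from dataclasses import dataclass
from typing import List, Protocol


class Home(Protocol):
    """Common concept: attributes every refinement is expected to expose."""

    address: str
    living_area_m2: float


@dataclass
class House:
    address: str
    living_area_m2: float
    garden_area_m2: float  # specific to a house


@dataclass
class Apartment:
    address: str
    living_area_m2: float
    floor: int  # specific to an apartment


def total_living_area(homes: List[Home]) -> float:
    # Only the common attributes are accessed, through the Home interface.
    return sum(home.living_area_m2 for home in homes)


homes: List[Home] = [
    House("1 Main Street", 120.0, 300.0),
    Apartment("2 High Street", 80.0, 4),
]
print(total_living_area(homes))  # 200.0
```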

Information systems still rely to a large extent on relational databases. To support ontological refinement, we traditionally distinguish three possible options for the database tables, as reflected in the drawing below.

a. One super-table containing the union of all subclass attributes.
b. Dedicated tables for every subclass, duplicating the common attributes.
c. Main table with common attributes, and dedicated tables with specific attributes for subclasses.

Option a does not have an explicit representation for the various subclasses, and would lead to lots of empty entries in the various instance rows. Moreover, the introduction of new subclasses – and even new inheritance levels – would lead to an ever expanding structure of the single table.

Option c seems to be the most efficient in terms of database columns or attributes. However, in terms of instances of columns or attributes, it is not. Compared with option b, it requires an additional type attribute and link key for every subclass instance. In addition, it would not be straightforward to assign an anthropomorphic name to the various dedicated tables. And finally, the software dealing with the persistent data would have to perform some dereferencing logic to retrieve all the attributes, which would not scale well with the introduction of new subclasses or inheritance levels.
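The dereferencing logic required by option c can be sketched with an in-memory SQLite database. The table and column names are assumptions made for this illustration, including the somewhat awkward `house_details` and `apartment_details`.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE home (
        id INTEGER PRIMARY KEY,
        home_type TEXT NOT NULL,          -- additional type attribute
        address TEXT NOT NULL,
        living_area_m2 REAL NOT NULL
    );
    CREATE TABLE house_details (
        home_id INTEGER PRIMARY KEY REFERENCES home(id),  -- link key
        garden_area_m2 REAL NOT NULL
    );
    CREATE TABLE apartment_details (
        home_id INTEGER PRIMARY KEY REFERENCES home(id),  -- link key
        floor INTEGER NOT NULL
    );
    INSERT INTO home VALUES (1, 'house', '1 Main Street', 120.0);
    INSERT INTO house_details VALUES (1, 300.0);
    INSERT INTO home VALUES (2, 'apartment', '2 High Street', 80.0);
    INSERT INTO apartment_details VALUES (2, 4);
""")

# Dereferencing logic: the type attribute decides which table to join.
DETAIL_TABLES = {"house": "house_details", "apartment": "apartment_details"}


def load_home(home_id: int) -> tuple:
    (home_type,) = db.execute(
        "SELECT home_type FROM home WHERE id = ?", (home_id,)
    ).fetchone()
    table = DETAIL_TABLES[home_type]  # every new subclass needs a new entry
    return db.execute(
        f"SELECT h.address, h.living_area_m2, d.* FROM home h "
        f"JOIN {table} d ON d.home_id = h.id WHERE h.id = ?",
        (home_id,),
    ).fetchone()


print(load_home(1))  # ('1 Main Street', 120.0, 1, 300.0)
print(load_home(2))  # ('2 High Street', 80.0, 2, 4)
```

Each instance occupies two rows plus a type value and a link key, and `load_home` has to be extended whenever a subclass or an inheritance level is added.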

Option b seems to have redundant columns for the common attributes, but has less redundancy and more efficiency in terms of instances of database entries. It does not need dereferencing logic, and integrates easily with a polymorphism implementation, where the common concept and attributes are represented as an interface. The fact that the refinement relation is not explicit seems reasonable as the concept is only present in object-oriented classes and not in relational databases.
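As an illustration of option b, the sketch below uses an in-memory SQLite database with one table per subclass. The table and column names are assumptions for this example and do not represent the actual Normalized Systems implementation.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE house (
        id INTEGER PRIMARY KEY,
        address TEXT NOT NULL,            -- common attribute, duplicated
        living_area_m2 REAL NOT NULL,     -- common attribute, duplicated
        garden_area_m2 REAL NOT NULL
    );
    CREATE TABLE apartment (
        id INTEGER PRIMARY KEY,
        address TEXT NOT NULL,            -- common attribute, duplicated
        living_area_m2 REAL NOT NULL,     -- common attribute, duplicated
        floor INTEGER NOT NULL
    );
    INSERT INTO house VALUES (1, '1 Main Street', 120.0, 300.0);
    INSERT INTO apartment VALUES (1, '2 High Street', 80.0, 4);
""")

# Each subclass instance is a single row: no type attribute, no link key, no join.
house = db.execute(
    "SELECT address, living_area_m2, garden_area_m2 FROM house WHERE id = 1"
).fetchone()

# The common concept can still be queried across tables when needed.
homes = db.execute("""
    SELECT address, living_area_m2 FROM house
    UNION ALL
    SELECT address, living_area_m2 FROM apartment
    ORDER BY address
""").fetchall()

print(house)  # ('1 Main Street', 120.0, 300.0)
print(homes)  # [('1 Main Street', 120.0), ('2 High Street', 80.0)]
```

The duplication is confined to the column definitions; every instance is stored exactly once, and each table maps directly onto one class that implements the common interface.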

Based on this reasoning, we have initiated support in Normalized Systems tooling for inheritance or ontological refinement using an implementation that is based on polymorphism for the software classes, and utilizes option b for the organization of the database tables.