Schema Biographies

A Data(base) Schema is the formal representation of the structure, i.e., the elements and their relationships, of a structured collection of data (typically, a database). Schema evolution is the change of the schema from an older to a newer version.

Schema evolution can have a tremendous impact on the entire information system built around the evolving database, severely affecting both developers and end-users. Quite frequently, development waits until a "schema backbone" is stable before applications are built on top of it. This is due to the "dependency magnet" nature of databases: a change in the schema of a database may immediately cause surrounding applications to crash (in the case of deletions or renamings) or to become semantically defective or inaccurate (in the case of information addition or restructuring). Therefore, discovering laws, patterns and regularities in schema evolution can bring great benefits: we would be able to design databases with a view to their evolution and minimize the impact of evolution on the surrounding applications, (a) by avoiding "design anti-patterns" that lead to cumulative complexity for both the database and the surrounding applications, and (b) by planning administration and maintenance tasks and resources, instead of just responding to emergencies.

With these thoughts in mind, we embarked on the adventure of uncovering the internal mechanics of schema evolution. First, we chart the morphology of schema evolution: we study how schemata evolve in terms of their elements and their composition, what changes are performed within each transition from one version to the next, and any other characteristic that allows us to "see" what happens during schema evolution. Next, we want to study the etiology of change, that is, the reasons why schemata evolve. This includes understanding what external events cause schema evolution and whether there are mechanisms that impose order on change ("laws", "repeatable patterns").
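
To make the notion of "the changes performed within each transition" concrete, here is a minimal Python sketch of how two consecutive schema versions could be compared. It is an illustration under stated assumptions, not the tooling used in this work: a schema is assumed to be a mapping from table names to attribute/data-type pairs, and the names `TransitionDiff` and `diff_schemas` are hypothetical.

```python
from dataclasses import dataclass, field

# Assumed representation: table name -> {attribute name: data type}
Schema = dict[str, dict[str, str]]


@dataclass
class TransitionDiff:
    """Changes observed in one transition between two schema versions."""
    tables_added: set[str] = field(default_factory=set)
    tables_deleted: set[str] = field(default_factory=set)
    attrs_added: set[tuple[str, str]] = field(default_factory=set)
    attrs_deleted: set[tuple[str, str]] = field(default_factory=set)
    types_changed: set[tuple[str, str]] = field(default_factory=set)


def diff_schemas(old: Schema, new: Schema) -> TransitionDiff:
    """Compare two schema versions by table and attribute name (hypothetical helper)."""
    diff = TransitionDiff()
    diff.tables_added = set(new) - set(old)
    diff.tables_deleted = set(old) - set(new)
    # Attribute-level changes are only examined for tables present in both versions.
    for table in set(old) & set(new):
        before, after = old[table], new[table]
        diff.attrs_added |= {(table, a) for a in after.keys() - before.keys()}
        diff.attrs_deleted |= {(table, a) for a in before.keys() - after.keys()}
        diff.types_changed |= {
            (table, a) for a in before.keys() & after.keys() if before[a] != after[a]
        }
    return diff


# Two made-up versions of a toy schema.
v1: Schema = {
    "author": {"id": "INT", "name": "VARCHAR(50)"},
    "paper": {"id": "INT", "title": "TEXT"},
}
v2: Schema = {
    "author": {"id": "INT", "name": "VARCHAR(100)", "email": "TEXT"},
    "venue": {"id": "INT"},
}

transition = diff_schemas(v1, v2)
print(transition)
```

Because the comparison is purely name-based, a renamed table or attribute shows up as a deletion plus an addition; telling renamings apart from genuine deletions would need extra information that this sketch does not model.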