Panos Vassiliadis, Apostolos Zarras
29th International Conference on Advanced Information Systems Engineering (CAiSE 2017), 12-16 June 2017, Essen, Germany
How can we plan development over an evolving schema? In this paper, we study the history of the schema of eight open source software projects that include relational databases and extract patterns related to the survival or death of their tables. Our findings are mostly summarized by a pattern, which we call "electrolysis pattern" due to its diagrammatic representation, stating that dead and survivor tables live quite different lives: dead tables typically die shortly after birth, with short durations and mostly no updates, whereas survivors mostly live quiet lives with few updates -- except for a small group of tables with high update ratios that are characterized by high durations and survival.
Based on our findings, we recommend that development over newborn tables should be restrained and, wherever possible, encapsulated by views to buffer both the infant mortality and the high update rate of hyperactive tables. Once a table matures, developers can rely on a typical pattern of gravitation to rigidity, which causes fewer evolution-related disturbances to the surrounding code.
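As an illustration of the view-based encapsulation recommended above, the following minimal sketch uses Python's standard sqlite3 module. It is not taken from the paper or from Hecate; the table, column, and view names (user_prefs, user_settings, v_user_prefs) and the helper functions are hypothetical and chosen only for this example. Application code reads through a view, so when the newborn table is later replaced by a differently shaped one, only the view definition has to change.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a hypothetical newborn table and a view over it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical newborn table (illustrative names, not from the paper).
    conn.execute("CREATE TABLE user_prefs (user_id INTEGER, theme TEXT)")
    conn.execute("INSERT INTO user_prefs VALUES (1, 'dark')")
    # Application code is written against this view, never against the table itself.
    conn.execute(
        "CREATE VIEW v_user_prefs AS SELECT user_id, theme FROM user_prefs"
    )
    return conn


def read_prefs(conn):
    """Application-side query: depends only on the view's name and columns."""
    return conn.execute(
        "SELECT user_id, theme FROM v_user_prefs ORDER BY user_id"
    ).fetchall()


def evolve_schema(conn):
    """Simulate the early death of the newborn table: it is replaced by a new
    table with a renamed column and an extra attribute. Only the view is
    redefined; read_prefs() stays untouched."""
    conn.execute("DROP VIEW v_user_prefs")
    conn.execute(
        "CREATE TABLE user_settings (user_id INTEGER, ui_theme TEXT, locale TEXT)"
    )
    conn.execute(
        "INSERT INTO user_settings SELECT user_id, theme, 'en' FROM user_prefs"
    )
    conn.execute("DROP TABLE user_prefs")
    # The view maps the new column name back to the one the application expects.
    conn.execute(
        "CREATE VIEW v_user_prefs AS "
        "SELECT user_id, ui_theme AS theme FROM user_settings"
    )


if __name__ == "__main__":
    conn = build_demo()
    before = read_prefs(conn)
    evolve_schema(conn)
    after = read_prefs(conn)
    print(before == after)
```

The design choice here is that the view acts as a stable contract: the schema change is absorbed in one place, while queries in the surrounding code keep working unchanged.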
Panos Vassiliadis, Apostolos Zarras. Survival in schema evolution: putting the lives of survivor and dead tables in counterpoint. 29th International Conference on Advanced Information Systems Engineering (CAiSE 2017), 12-16 June 2017, Essen, Germany
[Local copy of the paper at CAiSE 2017 (PDF)]
Panos Vassiliadis, Apostolos V. Zarras. Schema Evolution Survival Guide for Tables: Avoid Rigid Childhood and You’re En Route to a Quiet Life. Journal on Data Semantics, Volume 6, Issue 4, December 2017, pp 221–241, ISSN: 1861-2040, DOI: 10.1007/s13740-017-0083-x.
Local copy of the paper at the Journal on Data Semantics
The following code and data are provided online to allow others to reproduce our results. We would like to state clearly that we cannot support any requests for maintenance of the code, clarifications, explanations, etc. Moreover, we do not assume any responsibility for any side effects of the code (although we cannot think of any, nor have we ever encountered any). You are free to reuse the following code and data for academic purposes, provided you give the appropriate citation:
Panos Vassiliadis, Apostolos Zarras. Survival in schema evolution: putting the lives of survivor and dead tables in counterpoint. 29th International Conference on Advanced Information Systems Engineering (CAiSE 2017), 12-16 June 2017, Essen, Germany. Source code, datasets, presentations available at http://www.cs.uoi.gr/~pvassil/publications/2017_CAiSE_Electrolysis
(and, yes, academic honesty rules impose that this includes student projects too ;) )
Input: Database history versions (Raw input for extracting transitions)
Code: Source code for Hecate. Requires Java 7 and Eclipse.