Software
Poikilo: Comparing and Visualizing Diversification Algorithms
Search result diversification has attracted considerable attention as a means of improving the quality of results retrieved by user queries. Poikilo is a tool designed to assist users in locating and evaluating diverse results. We provide implementations of a wide suite of models and algorithms to compute and compare diverse results. Users can tune various diversification parameters, combine diversity with relevance, and see how diverse results change over time in the case of streaming data.
Web site
Manual
Source Code
Related publications:
- Marina Drosou and Evaggelia Pitoura, POIKILO: A Tool for Evaluating the Results of Diversification Models and Algorithms, in Proc. of the 39th International Conference on Very Large Data Bases (VLDB 2013), August 26-30, 2013, Riva del Garda, Trento, Italy
DisC Diversity: Result Diversification based on Dissimilarity and Coverage
Result diversification has attracted a lot of attention as a means to improve the quality of results retrieved by user queries. In this paper, we propose a new, intuitive definition of diversity called DisC diversity. A DisC diverse subset of a query result contains objects such that each object in the result is represented by a similar object in the diverse subset and the objects in the diverse subset are dissimilar to each other. We have shown that locating a minimum DisC diverse subset is an NP-hard problem and provided heuristics for its approximation. We have also proposed adapting DisC diverse subsets to a different degree of diversification, an operation we call zooming. We have developed efficient implementations of our algorithms based on the M-tree, a spatial index structure, and experimentally evaluated their performance.
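The two conditions in the definition above (coverage and dissimilarity) can be illustrated with a small greedy sketch. This is not the released M-tree implementation. It assumes that two objects count as "similar" when their Euclidean distance is at most a hypothetical `radius` parameter, and it uses a quadratic scan instead of an index. The function names are illustrative only.

```python
from math import dist


def greedy_disc_subset(points, radius):
    """Greedily pick indices so that every point lies within `radius` of a
    picked point, and picked points are pairwise more than `radius` apart.

    Assumption: similarity means Euclidean distance <= radius.
    The result is a valid but not necessarily minimum subset.
    """
    uncovered = set(range(len(points)))
    selected = []
    while uncovered:
        # Pick the uncovered point that covers the most uncovered points;
        # iterating in sorted order makes ties resolve to the lowest index.
        best = max(
            sorted(uncovered),
            key=lambda i: sum(
                1 for j in uncovered if dist(points[i], points[j]) <= radius
            ),
        )
        selected.append(best)
        # Everything within `radius` of the pick is now represented by it.
        uncovered -= {j for j in uncovered if dist(points[best], points[j]) <= radius}
    return selected


def is_disc_diverse(points, subset, radius):
    """Check both conditions: coverage of all points and pairwise dissimilarity."""
    covered = all(
        any(dist(p, points[s]) <= radius for s in subset) for p in points
    )
    dissimilar = all(
        dist(points[a], points[b]) > radius
        for k, a in enumerate(subset)
        for b in subset[k + 1:]
    )
    return covered and dissimilar


points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0)]
subset = greedy_disc_subset(points, radius=1.5)
print(subset, is_disc_diverse(points, subset, radius=1.5))
```

Because each pick is taken from the still-uncovered objects, it is farther than `radius` from every earlier pick, which is what keeps the subset dissimilar. Under this reading of similarity, choosing a larger or smaller `radius` yields a coarser or finer subset.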
Description
Source Code and scripts
Related publications:
- Marina Drosou and Evaggelia Pitoura, DisC Diversity: Result Diversification based on Dissimilarity and Coverage, in Proc. of the 39th International Conference on Very Large Data Bases (VLDB 2013), August 26-30, 2013, Riva del Garda, Trento, Italy
Granted the Reproducible Label by the VLDB 2013 reproducibility committee (http://www.dbxr.org).
PrefSIENA: Preferential Publish/Subscribe
In publish-subscribe systems, subscribers express their interests in specific events and get notified about all published events that match their interests. Typically, in such systems, all subscriptions are considered equally important. However, as the amount of generated information grows rapidly, the volume of data delivered to users needs to be controlled. We propose enhancing publish-subscribe systems with a ranking mechanism, so that only the top-ranked matching events are delivered. Ranking is based on letting users express their preferences on events by ordering the associated subscriptions. To prevent top-ranked old notifications from blocking new ones, we associate an expiration time with each notification. Since top-ranked events are often similar to each other, we also propose increasing the diversity of delivered events. Furthermore, we examine a number of different timing policies for delivering ranked events to users. We have fully implemented our approach in SIENA, a popular publish-subscribe middleware system.
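The interplay between ranking and expiration described above can be sketched in a few lines. This is an illustration only and does not use the SIENA or PrefSIENA API. The `Notification` class, its `score` field (a numeric stand-in for the preference order of the matching subscription) and the `top_k_live` function are hypothetical names introduced for the example.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    event: str
    score: float       # assumed: higher means a more preferred subscription matched
    expires_at: float  # assumed: absolute time after which the notification is stale


def top_k_live(notifications, now, k):
    """Return the k highest-scored notifications that have not expired at `now`."""
    live = [n for n in notifications if n.expires_at > now]
    return heapq.nlargest(k, live, key=lambda n: n.score)


buffer = [
    Notification("a", score=0.9, expires_at=10),
    Notification("b", score=0.8, expires_at=50),
    Notification("c", score=0.5, expires_at=50),
    Notification("d", score=0.7, expires_at=5),
]

# Early on, the two best-ranked notifications win.
print([n.event for n in top_k_live(buffer, now=0, k=2)])
# Once "a" and "d" have expired, lower-ranked but fresh notifications get through.
print([n.event for n in top_k_live(buffer, now=20, k=2)])
```

Without the expiration filter, the notification "a" would occupy a top-2 slot indefinitely, which is the blocking effect the expiration time is meant to avoid.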
Description
Source Code
Acknowledgements: PrefSIENA is an extension to SIENA, which was written by Antonio Carzaniga. This project has been partially funded by AEOLUS (Algorithmic Principles for Building Efficient Overlay Computers - Integrated Project IST-15964).
Related publications:
- Marina Drosou, Kostas Stefanidis and Evaggelia Pitoura, Preference-Aware Publish/Subscribe Delivery with Diversity, in Proc. of the 3rd ACM International Conference on Distributed Event-Based Systems (DEBS 2009), July 6-9, 2009, Nashville, TN, USA
(Also presented at the 8th Hellenic Data Management Symposium (HDMS), September 1, 2009, Athens, Greece)
- Marina Drosou, Evaggelia Pitoura and Kostas Stefanidis, Preferential Publish/Subscribe, in Proc. of the 2nd International Workshop on Personalized Access, Profile Management and Context Awareness: Databases (PersDB 2008), in conjunction with the VLDB 2008 Conference, August 23, 2008, Auckland, New Zealand
PerK: Personalized Keyword Search in Relational Databases
Keyword-based search in relational databases allows users to discover relevant information without either knowing the database schema or using complicated queries. However, such searches may return an overwhelming number of results. We propose personalizing keyword database search by utilizing user preferences. Query results are ranked based on their degree of preference for the user, their relevance to the query and two new quality metrics that evaluate their goodness as a set, namely coverage and diversity. We introduce an algorithm for processing preference queries that uses the keyword appearances in the preferences to direct the joining of relevant tuples from multiple relations. We also show how to reduce the complexity of this algorithm by sharing computational steps. Finally, we report evaluation results of the efficiency and effectiveness of our approach.
Source Code
Related publication:
- Kostas Stefanidis, Marina Drosou and Evaggelia Pitoura, PerK: Personalized Keyword Search in Relational Databases through Preferences, in Proc. of the 13th International Conference on Extending Database Technology (EDBT 2010), March 22-26, 2010, Lausanne, Switzerland
(Also presented at the 9th Hellenic Data Management Symposium (HDMS), June 28 - July 3, 2010, Ayia Napa, Cyprus)