Computer Science & Engineering Department
University of Ioannina






Poikilo: Comparing and Visualizing Diversification Algorithms

Search result diversification has attracted considerable attention as a means of improving the quality of results retrieved by user queries. Poikilo is a tool designed to assist users in locating and evaluating diverse results. It provides implementations of a wide suite of models and algorithms for computing and comparing diverse results. Users can tune various diversification parameters, combine diversity with relevance, and see how diverse results change over time when the data is streaming.

pdf Web site
pdf Manual
zip Source Code

DisC Diversity: Result Diversification based on Dissimilarity and Coverage

Result diversification has attracted a lot of attention as a means to improve the quality of results retrieved by user queries. In this paper, we propose a new, intuitive definition of diversity called DisC diversity. A DisC diverse subset of a query result contains objects such that each object in the result is represented by a similar object in the diverse subset and the objects in the diverse subset are dissimilar to each other. We have shown that locating a minimum DisC diverse subset is an NP-hard problem and provided heuristics for its approximation. We have also proposed adapting DisC diverse subsets to a different degree of diversification. We call this operation zooming. We have developed efficient implementations of our algorithms based on the M-tree, a spatial index structure, and experimentally evaluated their performance.
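The two conditions in the definition above (every object is represented by a similar selected object, and selected objects are mutually dissimilar) can be illustrated with a small greedy sketch. This is not the project's M-tree implementation: the function name `greedy_disc_subset`, the use of Euclidean distance, and the single `radius` threshold that decides when two objects count as similar are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def greedy_disc_subset(points, radius):
    """Greedy heuristic sketch: returns indices of a subset that covers every
    point within `radius` and whose members are pairwise farther than `radius`.

    `radius` is an assumed similarity threshold; the result is not guaranteed
    to be a minimum-size subset.
    """
    uncovered = set(range(len(points)))
    selected = []
    while uncovered:
        # Choose the uncovered point that covers the most uncovered points.
        # Iterating in sorted order makes ties resolve to the smallest index.
        best = max(
            sorted(uncovered),
            key=lambda i: sum(
                1 for j in uncovered if dist(points[i], points[j]) <= radius
            ),
        )
        selected.append(best)
        # Everything within `radius` of the chosen point is now represented.
        uncovered -= {j for j in uncovered if dist(points[best], points[j]) <= radius}
    return selected


points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
subset = greedy_disc_subset(points, radius=1.5)
```

Because each new representative is drawn only from objects not yet covered, it is always farther than `radius` from every earlier representative, so the dissimilarity condition holds by construction. Rerunning with a different `radius` changes how many representatives are returned, which is the intuition behind adapting the degree of diversification.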

pdf Description
zip Source Code and scripts

PrefSIENA: Preferential Publish/Subscribe

In publish-subscribe systems, subscribers express their interests in specific events and get notified about all published events that match their interests. Typically, in such systems, all subscriptions are considered equally important. However, the amount of information generated is increasing rapidly. To control the amount of data delivered to users, we propose enhancing publish-subscribe systems with a ranking mechanism, so that only the top-ranked matching events are delivered. Ranking is based on letting users express their preferences on events by ordering the associated subscriptions. To prevent old top-ranked notifications from blocking new ones, we associate an expiration time with each notification. Because top-ranked events are often similar to each other, we also propose increasing the diversity of delivered events. Furthermore, we examine a number of different timing policies for delivering ranked events to users. We have fully implemented our approach in SIENA, a popular publish-subscribe middleware system.
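The interplay between ranking and expiration described above can be sketched in a few lines. This does not use the SIENA API and is not the PrefSIENA code: the `Notification` class, its numeric `score` field (standing in for the rank derived from the user's ordering of subscriptions), the `expires_at` field, and the `top_k_live` function are all hypothetical names chosen for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    event: str
    score: float       # assumed numeric rank of the matching subscription
    expires_at: float  # assumed absolute expiration time


def top_k_live(notifications, k, now):
    """Return the k highest-scored notifications that have not expired at `now`."""
    live = [n for n in notifications if n.expires_at > now]
    # nlargest returns the results sorted from highest to lowest score.
    return heapq.nlargest(k, live, key=lambda n: n.score)


pending = [
    Notification("old headline", score=0.9, expires_at=3.0),
    Notification("new comedy", score=0.6, expires_at=10.0),
    Notification("new drama", score=0.4, expires_at=10.0),
    Notification("new documentary", score=0.2, expires_at=10.0),
]
delivered = top_k_live(pending, k=2, now=5.0)
```

In this example the highest-scored notification has already expired at time 5.0, so it no longer occupies one of the two delivery slots and the newer events get through.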

pdf Description
zip Source Code

Acknowledgements: PrefSIENA is an extension to SIENA, which was written by Antonio Carzaniga. This project has been partially funded by AEOLUS (Algorithmic Principles for Building Efficient Overlay Computers - Integrated Project IST-15964).

Related publications:
  • Marina Drosou, Kostas Stefanidis and Evaggelia Pitoura, Preference-Aware Publish/Subscribe Delivery with Diversity, in Proc. of the 3rd ACM International Conference on Distributed Event-Based Systems (DEBS 2009), July 6-9, 2009, Nashville, TN, USA
    (Also presented at the 8th Hellenic Data Management Symposium (HDMS), September 1, 2009, Athens, Greece)
    pdf pptx
  • Marina Drosou, Evaggelia Pitoura and Kostas Stefanidis, Preferential Publish/Subscribe, in Proc. of the 2nd International Workshop on Personalized Access, Profile Management and Context Awareness: Databases (PersDB 2008), in conjunction with the VLDB 2008 Conference, August 23, 2008, Auckland, New Zealand
    pdf pptx

PerK: Personalized Keyword Search in Relational Databases

Keyword-based search in relational databases allows users to discover relevant information without either knowing the database schema or using complicated queries. However, such searches may return an overwhelming number of results. We propose personalizing keyword database search by utilizing user preferences. Query results are ranked based on their degree of preference for the user, their relevance to the query and two new quality metrics that evaluate their goodness as a set, namely coverage and diversity. We introduce an algorithm for processing preference queries that uses the keyword appearances in the preferences to direct the joining of relevant tuples from multiple relations. We also show how to reduce the complexity of this algorithm by sharing computational steps. Finally, we report evaluation results of the efficiency and effectiveness of our approach.

zip Source Code

Related publication: