- Aristeidis Mastoras and George Manis, “Ariadne – Directive-based parallelism extraction from recursive functions,” Journal of Parallel and Distributed Computing, Elsevier, vol. 86, pp. 16 – 28, Dec. 2015 [link]
Abstract: In this paper we present Ariadne, a compiler that extracts parallelism from recursive function calls. Ariadne takes as input C code enhanced with directives for recursive functions and automatically produces code for multi-core architectures. It produces code for the POSIX standard, the OpenMP model and the Cilk programming language, which run on a wide variety of computing systems. Ariadne also produces code for SL, a programming language proposed for the SVP processor and model. This is of special interest, since we can map certain function calls onto SVP, which contain inherent parallelism that cannot efficiently be expressed in other programming models. Ariadne is the only compiler that extracts parallelism from various forms of recursive functions using directives. It is also the only compiler that handles all forms of reduction operations for addition, subtraction, multiplication and division. The experimental results are very promising showing significant speedups in all benchmarks.
- Dimitris Saougkos and George Manis, “Self adaptive run time scheduling for the automatic parallelization of loops with the C2mTC/SL compiler,” Parallel Computing, Elsevier, vol. 39, pp. 603–614, Oct. 2013 [link]
Abstract: In this paper we suggest a new approach for solving the hyperplane problem, also known as “wavefront” computation. In direct contrast to most approaches that reduce the problem to an integer programming one or use several heuristic approaches, we gather information at compile time and delegate the solution to run time. We present an adaptive technique which intuitively calculates which new threads will be able to be executed in the next computation cycle based on which threads are executed in the current one. Moving the solution to the run time environment provides us with higher versatility alongside a perfect solution of the underlying hyperplane pattern being discovered without the need to perform any prior calculations. The main contribution of this paper is the presentation of the self adaptive algorithm, an algorithm which does not need to know the tile size (which controls the granularity of parallelism) beforehand. Instead, the algorithm itself adapts the tile size while the program is running in order to achieve optimal efficiency. Experimental results show that if we have a sufficient number of parallel processing elements to diffuse the scheduler’s workload, its overhead becomes low enough that it is overshadowed by the net gain in parallelism. For the implementation of the algorithm we suggest, and for our experimentations our parallelizing compiler C2μTC/SL is used, a C parallelizing compiler which maps sequential programs on the SVP processor and model.
- George Manis, “C2µTC/SL - C Paralellizing Compiler targeting SVP”, 2011[link]
Abstract: This document presents the C2µTC/SL parallelizing source to source compiler targeting the SVP model. It presents the transformations supported, its novel run-time scheduler, the task parallelism supported for recursive functions, implementation details and an evaluation section presenting both qualitative and quantitative results.
- Dimitris Saougkos, George Manis, Konstantinos Blekas, and Apostolos Zarras, “Revisiting Java bytecode compression for embedded and mobile computing environments,” IEEE Transactions on Software Engineering, vol. 33, no. 7, pp. 478–495, Jul. 2007 [link]
Abstract: Pattern-based Java bytecode compression techniques rely on the identification of identical instruction sequences that occur more than once. Each occurrence of such a sequence is substituted by a single instruction. The sequence defines a pattern that is used for extending the standard bytecode instruction set with the instruction that substitutes the pattern occurrences in the original bytecode. Alternatively, the pattern may be stored in a dictionary that serves for the bytecode decompression. In this case, the instruction that substitutes the pattern in the original bytecode serves as an index to the dictionary. In this paper, we investigate a bytecode compression technique that considers a more general case of patterns. Specifically, we employ the use of an advanced pattern discovery technique that allows locating patterns of an arbitrary length, which may contain a variable number of wildcards in place of certain instruction opcodes or operands. We evaluate the benefits and the limitations of this technique in various scenarios that aim at compressing the reference implementation of MIDP, a standard Java environment for the development of applications for mobile devices.