Publication Type : Journal Article
Publisher : ACM Transactions on Embedded Computing Systems (TECS)
Source : ACM Transactions on Embedded Computing Systems (TECS), ACM, Volume 14, Number 2, New York, NY, USA, p.33 (2015)
Url : http://dl.acm.org/citation.cfm?doid=2737797.2656207
Keywords : Design space exploration, FPGAs, GPUs, OpenCL, parallel computing, simulation tools
Campus : Bengaluru
School : School of Engineering
Department : Computer Science
Year : 2015
Abstract : The design cycle for complex special-purpose compute systems is extremely costly and time-consuming. It involves a multi-parametric design space exploration for optimization, followed by design verification. Designers of special-purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte-Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error-correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte-Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to graphics processing units (GPUs) and FPGAs. The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL, in order to introduce FPGAs as a potential platform to efficiently execute simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8000-bit) to large-length (e.g., 64800-bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated and on the dimension and phase of the design, the GPU or the FPGA may suit different purposes more conveniently, providing different acceleration factors over conventional multicore CPUs.
Cite this Research Publication : M. Owaida, Falcao, G., Andrade, J., Antonopoulos, C., Bellas, N., Dr. Madhura Purnaprajna, Novo, D., Karakonstantis, G., Burg, A., and Ienne, P., “Enhancing design space exploration by extending CPU/GPU specifications onto FPGAs”, ACM Transactions on Embedded Computing Systems (TECS), vol. 14, p. 33, 2015.