Abstract
Deep learning revolutionized the field of computer vision when convolutional neural networks (CNNs) solved complex vision problems, spurring promising developments in artificial intelligence (AI) research. This progress has drawn the hardware community toward accommodating the growing computational demands of state-of-the-art deep CNNs; coupled with the diminishing performance gains of general-purpose architectures, this has fueled the need for specialized, scalable hardware accelerator designs for deep CNNs. Moreover, Deep Separable Convolutional Neural Networks (DSCNNs) have become an emerging paradigm in computer vision, offering modular networks with structural sparsity that achieve higher accuracy with relatively fewer operations and parameters. However, there is a lack of customized architectures that provide flexible solutions fitting the sparsity of DSCNNs. Domain-specific accelerators must satisfy two requirements: (1) execution of DSCNN models with low latency, high throughput, and high efficiency, and (2) flexibility to accommodate evolving state-of-the-art models, such as the EfficientNet family, without costly silicon updates. On this front, state-of-the-art GPUs tend to be too power-hungry, and ASICs are too inflexible. This is where FPGAs shine: their architectural reconfigurability, support for custom datatypes, ability to exploit irregular parallelism, power efficiency, and low latency extend their usability to real-time applications. This work proposes DeepDive, a fully functional, vertical hardware-software co-design architecture for the power-efficient implementation of DSCNNs on both edge and cloud FPGA platforms.
DeepDive applies two different architectural principles to its edge and cloud implementations: the former follows a latency-oriented design, whereas the latter follows a throughput-oriented design, and each is built to fully support DSCNNs with various convolutional operators interconnected with structural sparsity. Both accelerator designs introduce parameterized, configurable, and scalable compute units that can be tuned to user-specific requirements: the target hardware, the degree of parallelism required, and the DSCNN family chosen for inference. The accelerators were implemented using the Xilinx Vitis HLS 2019.2 tool. Execution results for the DeepDive-Edge accelerator on the Xilinx ZCU102 edge FPGA demonstrate 233.3 FPS/Watt for a compact version of EfficientNet as the state-of-the-art DSCNN, improving FPS/Watt by 2.2x and 1.51x over the Jetson Nano high- and low-power modes, respectively. The DeepDive-Cloud accelerator achieves 87 FPS on the Xilinx Alveo U50 with a power efficiency of 7.25 FPS/Watt for the baseline version of EfficientNet.
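The reduction in operations and parameters that DSCNNs exploit can be sketched with a simple cost model: a depthwise separable layer factors a standard k x k convolution into a per-channel depthwise convolution followed by a 1x1 pointwise convolution. The minimal sketch below illustrates this arithmetic only; the function names and the example layer sizes are hypothetical and are not taken from DeepDive or EfficientNet.

```python
# Illustrative cost model: parameter and multiply-accumulate (MAC) counts
# for a standard convolution vs. a depthwise separable convolution.
# All names and sizes here are hypothetical examples.

def standard_conv_costs(k, c_in, c_out, h_out, w_out):
    """Costs of a standard k x k convolution over a h_out x w_out output."""
    params = k * k * c_in * c_out
    macs = params * h_out * w_out
    return params, macs

def separable_conv_costs(k, c_in, c_out, h_out, w_out):
    """Costs of a depthwise k x k conv followed by a 1x1 pointwise conv."""
    dw_params = k * k * c_in   # one k x k filter per input channel
    pw_params = c_in * c_out   # 1x1 convolution mixing channels
    macs = (dw_params + pw_params) * h_out * w_out
    return dw_params + pw_params, macs

# Example: a 3x3 layer mapping 32 -> 64 channels on a 56x56 feature map.
std_p, std_m = standard_conv_costs(3, 32, 64, 56, 56)
sep_p, sep_m = separable_conv_costs(3, 32, 64, 56, 56)
print(f"standard:  {std_p} params, {std_m} MACs")   # 18432 params
print(f"separable: {sep_p} params, {sep_m} MACs")   # 2336 params
print(f"parameter reduction: {std_p / sep_p:.1f}x") # about 7.9x
```

For this example layer, the separable form needs roughly 7.9x fewer parameters and MACs, which is the structural sparsity the DeepDive compute units are designed around.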