Hardware Acceleration of Deep Neural Network Models on FPGA (Part 1 of 2)

Artificial Intelligence has become pervasive, finding applications in areas that once seemed out of reach. Deep Learning, a subfield of Machine Learning, has become the state-of-the-art approach to many AI problems because of its high accuracy. It helps make real-time decisions in applications such as Advanced Driver Assistance Systems (ADAS), robots, autonomous vehicles, industrial automation, aerospace and defense. Accurate decisions and real-time behaviour require processing a massive amount of data. Deep Neural Network (DNN) models achieve this by using a large number of neural network layers.

Deep Neural Network

Want to know how Deep Learning works? Here's a quick guide for everyone. — Source: freecodecamp.org

Deep Neural Networks are the state-of-the-art solution for a variety of applications such as computer vision, speech recognition and natural language processing. An Artificial Neural Network is a mathematical construct that ties together a large number of simple elements, called neurons, each of which can make simple mathematical decisions. A shallow neural network has only three layers: an input layer, one hidden layer and an output layer. A neural network becomes a Deep Neural Network (DNN) as the number of hidden layers increases. Deep Learning can therefore be considered a class of Artificial Neural Networks composed of many processing layers. Deeper networks tend to be more accurate, and accuracy generally improves as more layers of neurons are added. Some important Deep Neural Network models are the Feed-Forward Neural Network, the Recurrent Neural Network (RNN) and the Convolutional Neural Network (CNN).
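To make the difference between a shallow and a deep network concrete, here is a minimal sketch of a feed-forward pass in plain Python. The layer sizes, the random weights and the helper names (dense, build_network, forward) are assumptions chosen for illustration and are not taken from any particular framework.

```python
import random

def relu(x):
    """Simple activation: pass positive values, clamp negatives to zero."""
    return max(0.0, x)

def dense(inputs, weights, biases):
    """One fully connected layer: each neuron computes relu(w . x + b)."""
    return [
        relu(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def build_network(sizes, seed=0):
    """Create random weights for consecutive layer sizes, e.g. [4, 8, 2]."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(x, layers):
    """Feed the input through every layer in order."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x

# Shallow: input layer, one hidden layer, output layer.
shallow = build_network([4, 8, 2])
# Deep: same input and output sizes, but three hidden layers.
deep = build_network([4, 8, 8, 8, 2])

sample = [0.5, -1.2, 3.0, 0.7]
print("shallow output:", forward(sample, shallow))
print("deep output:   ", forward(sample, deep))
```

Each added hidden layer adds another set of weights and another round of multiply-accumulate operations, which is where the demand for hardware acceleration comes from.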

Hardware Accelerators for Deep Neural Networks

Hardware acceleration is the process by which an application offloads a computationally intensive task to specialised hardware to achieve higher efficiency than a software implementation running on a CPU alone. Achieving accurate results in real time requires better models operating on larger datasets, and the time taken to reach a decision is also an important factor. As new Deep Learning models evolve, their structure becomes more complex, so they need a huge number of operations and parameters as well as more computing resources. Three options for hardware accelerators are GPUs, ASICs and FPGAs.

Source: https://ysu.edu/news/ysu-hosts-national-gpu-computing-workshop

GPUs were designed for processing images through massive parallelism, but they are now also used in big data analytics and to accelerate portions of an application that require high throughput and memory bandwidth. They excel at parallel processing and provide acceleration where the same operations are required many times in rapid succession. However, GPUs consume a large amount of power, which poses a challenge for DNN applications that must run on edge devices, especially battery-operated ones. GPUs achieve their throughput by processing large input batches, but latency is typically high as a result, so they are less suitable for latency-critical applications.
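The trade-off between batch size, throughput and latency can be illustrated with a toy cost model. This is only a sketch: the fixed overhead and per-item cost below are hypothetical numbers, not measurements from any GPU, and the function name batch_stats is made up for this example.

```python
def batch_stats(batch_size, overhead_ms=5.0, per_item_ms=0.5):
    """Toy model: time per batch = fixed launch overhead + per-item compute.

    overhead_ms and per_item_ms are hypothetical values for illustration.
    Returns (latency of the batch in ms, throughput in items per second).
    """
    batch_time_ms = overhead_ms + per_item_ms * batch_size
    throughput = batch_size / batch_time_ms * 1000.0
    return batch_time_ms, throughput

for batch_size in (1, 8, 64):
    latency_ms, throughput = batch_stats(batch_size)
    print(f"batch={batch_size:3d}  latency={latency_ms:6.1f} ms  "
          f"throughput={throughput:8.1f} items/s")
```

Under this model, larger batches amortise the fixed overhead and raise throughput, but every input in the batch waits for the whole batch to finish, so the latency seen by a single request grows with the batch size.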

Source: https://www.eebinc.org/post/the-global-semiconductor-crunch

ASICs are integrated circuits designed for one particular purpose or application, and they are highly optimized for power and performance in that application. Compared to GPUs, they have less I/O bandwidth and limited memory and computing resources. Although they can attain moderate performance at low power, the downside is that their development time and cost are high.

What Is FPGA and FPGA Applications - Latest open tech from seeed studio — Source: https://www.renesas.com/

FPGAs can be used to accelerate a portion of an algorithm by assigning the computationally intensive tasks to the programmable logic. They can attain high performance through extensive parallelism while being more energy efficient than GPUs, and they offer a shorter time to market and lower cost than ASICs. Another important feature of FPGAs is their reconfigurability, which is not possible with GPUs and ASICs. As deep learning structures are advancing day by day, reconfigurability is an added advantage.

The following section lists out the reasons for considering FPGAs as hardware accelerators.

FPGAs as Hardware Accelerators:

Compared to GPUs, ASICs and FPGAs have less I/O bandwidth and limited memory and other computing resources, but they can attain moderate performance at low power. ASICs are optimized for power and performance, but their cost and development time are higher, and they are not flexible. FPGA-based accelerators are currently used as an alternative to GPUs and ASICs because of the following advantages:

FPGAs offer high performance per watt compared to GPUs, making them strong candidates for DNN computations and inference.
The architecture is customizable and flexible, so only the resources an application requires need to be used.
FPGAs provide high throughput through massive parallelism at low latency.
FPGAs have on-chip block RAM, which allows faster data transfer than off-chip memory.
FPGAs are reconfigurable according to the application, which reduces time to market. As new machine learning algorithms evolve, shorter development time and reconfigurability make FPGAs a better option than ASICs.
Apart from power efficiency and throughput, the speed of a DNN deployed on an FPGA can be further increased when the inference algorithm uses low numeric precision in its calculations. For example, quantization converts a 32-bit or 64-bit floating-point network model to fixed point, which reduces computation while maintaining reasonable accuracy.
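The quantization step described above can be sketched as follows. This is a minimal example of symmetric linear quantization to 8-bit integers, which is one common scheme and not necessarily the one used by any specific FPGA toolchain; the weight values and function names are made up for illustration.

```python
def quantize(values, num_bits=8):
    """Symmetric linear quantization of floats to signed integers.

    Returns the integer codes and the scale needed to map them back.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]

# Hypothetical floating-point weights from one layer of a model.
weights = [0.823, -0.417, 0.052, -1.27, 0.336]

q_weights, scale = quantize(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("integer codes:", q_weights)
print("scale:", scale)
print("largest rounding error:", max_error)
```

The integer codes can be processed with narrow fixed-point arithmetic, while the rounding error per weight stays within half of the scale, which is why accuracy usually degrades only slightly.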

On the other hand, one of the main reasons engineers do not adopt FPGAs is the difficulty of programming them. An FPGA is programmed by describing its functionality in a Hardware Description Language (HDL) such as VHDL or Verilog, which is different from conventional programming in languages like C or C++.

To reduce this complexity, High-Level Synthesis (HLS) tools exist that synthesize high-level languages into HDL code. FPGA vendors and third-party companies have also developed different hardware frameworks to implement inference on FPGAs, and Xilinx and Intel each have their own frameworks aimed at improving performance. Some of these frameworks are OpenCL, Intel’s OpenVINO, Xilinx DNNDK and Xilinx Vitis AI, which we will cover in Part 2 of this blog.

Read Part 2 here…

©2024 Ignitarium Technology Solutions, All Rights Reserved
