Deep Neural Network Model on FPGA Part-2

Hardware Acceleration of Deep Neural Network Models on FPGA (Part 2 of 2)

While Part 1 of this 2-part blog series covered Deep Neural Networks and the different accelerators for implementing Deep Neural Network Models, Part 2 will talk about different Deep Learning Frameworks and hardware frameworks provided by FPGA Vendors.

Deep Learning Frameworks:

A deep learning framework is a tool or library that helps us build DNN models quickly and easily, without in-depth knowledge of the underlying algorithms. It provides a condensed way of defining models using pre-built and optimized components. Some of the important deep learning frameworks are Caffe, TensorFlow, PyTorch and Keras.
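To make concrete what such pre-built components hide, the following is a minimal sketch in plain Python of a two-layer forward pass. The layer sizes and the hand-picked weights (W1, B1, W2, B2) are assumptions chosen only for illustration; a real framework would also provide training, optimized kernels and hardware back ends.

```python
import math

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities (shifted for numerical stability)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical, hand-picked parameters: 2 inputs -> 3 hidden -> 2 outputs.
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
B1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.6, 0.2], [-0.4, 0.5, 0.7]]
B2 = [0.05, -0.05]

def forward(x):
    """Forward pass: dense -> ReLU -> dense -> softmax."""
    hidden = relu(dense(x, W1, B1))
    return softmax(dense(hidden, W2, B2))

probs = forward([1.0, 2.0])
print(probs)
```

In a framework such as Keras, each of these functions corresponds to a ready-made layer or activation, so the model definition shrinks to a few declarative lines.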

Caffe is a deep neural network framework designed with speed and modularity in mind. It was developed by Berkeley AI Research. Caffe mainly focuses on image processing applications involving convolutional neural networks (CNNs), but it also supports Region-based CNN, RNN, Long Short-Term Memory and fully connected neural network designs. It supports CPU and GPU acceleration libraries such as NVIDIA cuDNN and Intel MKL, and provides support for C, C++, Python and MATLAB.

TensorFlow is an open-source deep learning framework that ships with pre-written code for deep learning models such as RCNN and CNN. It was developed by researchers at Google and supports the R, C++ and Python languages. Its flexible architecture allows models to be deployed across different platforms such as CPUs and GPUs. TensorFlow works well on sequence-based data as well as on images. At the time of writing, the latest major version is TensorFlow 2.0, which brings significant GPU performance improvements.

Keras is an open-source framework that can run on top of TensorFlow. It is a high-level API that enables fast experimentation with neural network models. Keras supports both CNNs and RNNs. It was developed by Francois Chollet, a Google engineer. Keras is written in Python and runs on both CPUs and GPUs.

PyTorch is an open-source machine learning library. It was developed by Facebook’s AI research lab and is used for applications such as computer vision and natural language processing. It has both Python and C++ interfaces.

Hardware Frameworks for DNN:

Using an FPGA as a hardware accelerator for Deep Neural Networks has its own advantages and disadvantages. One of the main challenges is that an FPGA is programmed by describing functionality in a Hardware Description Language (HDL) such as VHDL or Verilog, which is quite different from regular programming in C or C++. To reduce this complexity, High-Level Synthesis (HLS) tools synthesize high-level languages into HDL code. Even so, implementing neural network models defined in Caffe or TensorFlow remains complex, as designers need in-depth knowledge of both the machine learning frameworks and the FPGA hardware. FPGA vendors and third-party companies have therefore developed hardware frameworks that significantly reduce this complexity.

The hardware frameworks that we cover here are OpenCL, Intel’s OpenVINO, Xilinx DNNDK, Xilinx Vitis AI and the Lattice sensAI stack.

Open Computing Language (OpenCL) is a heterogeneous framework for writing and executing programs on different computing platforms, including CPUs, GPUs, FPGAs, Digital Signal Processors (DSPs) and other hardware accelerators. It was initially developed by Apple to exploit the acceleration potential of on-board GPUs, was launched in 2009, and is now maintained by the Khronos Group. At the time of writing the newest version is 3.0, which brings more C++ features into the language.

The OpenCL framework officially supports C and C++, while unofficial support is available for Python, Java, Perl and .NET. An OpenCL implementation of a program is based around a host connected to one or more computing devices, such as a CPU and a GPU, each of which is further divided into multiple processing elements. A function that is executed using OpenCL is called a kernel and can run in parallel on all processing elements. A programmer can exploit the acceleration capabilities available on a system by querying the device information of the computer the program is running on.
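The host/kernel split described above can be sketched in plain Python. This is only an emulation of the execution model, not the OpenCL API: the `launch` helper and its `processing_elements` parameter are hypothetical names, and a thread pool stands in for the device's processing elements.

```python
from concurrent.futures import ThreadPoolExecutor

def vector_add_kernel(global_id, a, b, out):
    """Kernel body: each work-item handles exactly one element."""
    out[global_id] = a[global_id] + b[global_id]

def launch(kernel, global_size, *args, processing_elements=4):
    """Hypothetical host-side launcher: run one kernel instance per index.

    The thread pool plays the role of the processing elements; a real
    OpenCL runtime would dispatch the work-items to the device instead.
    """
    with ThreadPoolExecutor(max_workers=processing_elements) as pool:
        futures = [pool.submit(kernel, gid, *args)
                   for gid in range(global_size)]
        for future in futures:
            future.result()  # re-raise any exception from a work-item

# Host code: prepare buffers, launch the kernel, read back the result.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)
launch(vector_add_kernel, len(a), a, b, out)
print(out)
```

The important property is that the kernel only ever touches the element selected by its own index, which is what allows every instance to run independently and in parallel.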

While OpenCL offers good possibilities for acceleration and resource usage, it is limited by its low-level nature. Although it has functions for standard operations like FFT, neural networks have to be declared manually unless the framework used to generate the network has an OpenCL branch. At the time of writing, Caffe has such a branch, but it is still under development, and TensorFlow has an OpenCL branch on its roadmap. This lack of neural network framework support limits OpenCL’s adoption. A similar and better-supported framework is Nvidia’s CUDA, although it only runs on Nvidia GPUs.

The OpenVINO toolkit is provided by Intel for running neural networks on FPGAs and aims to simplify the process compared to existing solutions. Launched in 2018, it lets users build applications in which neural networks are accelerated on Intel processors, GPUs, FPGAs and Vision Processing Units (VPUs). The supported inference targets vary between platforms.

OpenVINO is mainly used for accelerating image recognition CNNs but can be used for other purposes such as speech recognition. It supports frameworks such as Caffe and TensorFlow and deep learning architectures such as AlexNet and GoogLeNet. It supports a fixed set of layers for each framework out of the box, with custom layer support available for developers.

In the OpenVINO toolkit, neural network models are optimised by the Model Optimizer, which takes the model files produced by the neural network framework, such as a caffemodel (from Caffe), together with the calculated weights. The default model precision is single-precision floating-point; the Optimizer can also quantise to half-precision floating-point, and 8-bit integer quantisation is available as well.
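The effect of these precision reductions can be illustrated with a small, generic sketch. It is not the Model Optimizer's actual algorithm: it assumes simple symmetric per-tensor scaling for the 8-bit case, and the example weights are made up.

```python
import struct

def to_fp16(value):
    """Round a float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]

def quantize_int8(weights):
    """Symmetric per-tensor quantisation onto the integer range [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point weights."""
    return [q * scale for q in quantized]

# Made-up example weights.
weights = [0.82, -1.27, 0.003, 0.5]

fp16_weights = [to_fp16(w) for w in weights]
int8_weights, scale = quantize_int8(weights)
restored = dequantize(int8_weights, scale)

print(fp16_weights)
print(int8_weights, scale)
print(restored)
```

The 8-bit version stores one small integer per weight plus a single scale factor, at the cost of a rounding error of at most half a quantisation step per weight; very small weights such as 0.003 collapse to zero.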

The Optimizer produces an optimised intermediate representation, which the application loads through the Inference Engine API. The API loads the network onto the target device and runs inference with the supplied input data. All pre-processing and post-processing is done in C++, so the only part that has to be replaced is the inference (prediction) step.
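The point that only the inference step changes can be shown with a small sketch. It is written in Python for brevity, although the text above describes C++ code, and it does not use the Inference Engine API: every function name here is hypothetical, and `accelerated_infer` merely stands in for a call into an accelerator runtime.

```python
from typing import Callable, List, Sequence

def preprocess(pixels: Sequence[int]) -> List[float]:
    """Scale 8-bit pixel values into the range [0, 1]."""
    return [p / 255.0 for p in pixels]

def postprocess(scores: Sequence[float], labels: Sequence[str]) -> str:
    """Return the label with the highest score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best]

def cpu_reference_infer(x: Sequence[float]) -> List[float]:
    """Placeholder 'model': scores an input as dark or bright by its mean."""
    mean = sum(x) / len(x)
    return [1.0 - mean, mean]

def accelerated_infer(x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for inference on an accelerator.

    It reuses the reference math here; in a real application this is
    the only function that would call into the device runtime.
    """
    return cpu_reference_infer(x)

def classify(pixels: Sequence[int],
             infer: Callable[[Sequence[float]], List[float]],
             labels: Sequence[str] = ("dark", "bright")) -> str:
    """Full pipeline: pre-process, infer with the chosen back end, post-process."""
    return postprocess(infer(preprocess(pixels)), labels)

print(classify([250, 240, 255], cpu_reference_infer))
print(classify([5, 10, 0], accelerated_infer))
```

Because the back end is passed in as a function, switching from the reference implementation to an accelerated one leaves the surrounding pre-processing and post-processing code untouched.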

On an FPGA, OpenVINO uses a pre-loaded bitstream programmed onto the FPGA to accelerate instructions. It does not utilise HLS, but uses the FPGA as a specialised processor for performing mathematical operations found in neural networks, such as convolutions and activations. The OpenVINO bitstreams are fixed for an FPGA and do not allow customizations like adding other IO functions.
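For readers unfamiliar with the operations mentioned above, here is a minimal software sketch of a convolution followed by a ReLU activation. It only illustrates the arithmetic; the input values and kernel are made up, and nothing here reflects how the FPGA bitstream implements these operations in hardware.

```python
def conv2d_valid(image, kernel):
    """2D convolution as used in CNNs (no kernel flip, no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu2d(feature_map):
    """Element-wise ReLU: negative values become zero."""
    return [[max(0, v) for v in row] for row in feature_map]

# Made-up 3x3 input and 2x2 kernel.
image = [[1, 2, 0],
         [0, 3, 1],
         [2, 0, 1]]
kernel = [[1, 0],
          [0, -1]]

feature_map = conv2d_valid(image, kernel)
activated = relu2d(feature_map)
print(feature_map)  # [[-2, 1], [0, 2]]
print(activated)    # [[0, 1], [0, 2]]
```

Each output value is an independent sum of products, which is why this kind of workload maps well onto the parallel multiply-accumulate resources of an FPGA.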

To compete with OpenVINO, Xilinx acquired the Chinese developer DeePhi in 2018, together with its neural network FPGA acceleration kit, the Deep Neural Network Development Kit (DNNDK). The DNNDK SDK performs model pruning, quantisation and deployment on Xilinx FPGA development kits such as the Xilinx ZCU102, ZCU104 and Avnet Ultra96, along with some of DeePhi’s own development kits.
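Model pruning, the first of the steps listed above, can be illustrated with a generic magnitude-based sketch. This is an assumption made for illustration and not DeePhi's actual pruning method; the weights and the 50 percent sparsity target are made up.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for index in order[:n_prune]:
        pruned[index] = 0.0
    return pruned

# Made-up example weights.
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
pruned = prune_by_magnitude(weights, 0.5)
print(pruned)  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Weights that are set to zero need neither storage nor multiplication, which reduces the memory and compute the accelerator has to provide; in practice a pruned model is usually fine-tuned afterwards to recover accuracy.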

Alongside the FPGA fabric, these systems include embedded MCUs. Xilinx calls such devices Multi-Processor Systems-on-Chip (MPSoCs), with the FPGA fabric forming the Programmable Logic (PL) and the MCU forming the Processing System (PS). DeePhi claims that the SDK can accelerate CNNs as well as RNNs, achieving speedups of 1.8x and 19x over Application Specific Integrated Circuit (ASIC) and HLS implementations of the same network respectively, while using 56x less power than the HLS implementation.

The DNNDK toolkit uses a soft-core processor, the Deep Learning Processor Unit (DPU), to accelerate the computationally intensive tasks of DNN algorithms. The DPU is designed to support and accelerate common neural network designs, such as VGG, ResNet, GoogLeNet, YOLO, AlexNet, SSD and SqueezeNet, as well as custom networks. In contrast to OpenVINO, the FPGA image does not occupy the whole FPGA, leaving space for custom HDL code to run alongside the SDK. Since September 2020, DNNDK is no longer available as a separate tool and will receive no further releases. Xilinx has replaced it with a new tool, Vitis AI, for the deployment of DNN models.

Vitis AI is Xilinx’s latest development platform for DNN inference on Xilinx hardware such as edge devices and Alveo cards. It includes tools, well-optimized IP, models, libraries and example designs, and follows the same development flow as DNNDK. It is developed with ease of use and efficiency in mind. Vitis AI also uses the Deep Learning Processor Unit (DPU) for AI acceleration. The DPU can be scaled to fit different Xilinx hardware, including Zynq®-7000 devices, Zynq UltraScale+ MPSoCs and Alveo boards, from edge to cloud, to meet the requirements of many diverse applications.

Lattice sensAI is a full-featured stack from Lattice Semiconductor that helps to evaluate, develop and deploy machine learning models on Lattice FPGAs. It supports popular frameworks like Caffe, TensorFlow and Keras, and includes IP cores specially designed to accelerate CNN models. The stack targets machine learning solutions that are easy to implement, highly flexible, small and low power.

FPGA Families Targeted for AI Acceleration:

FPGA vendors have optimized their FPGA families to specifically target AI Acceleration.

  • Intel® Stratix® 10 NX FPGA is Intel’s first AI-optimized FPGA. It embeds a new type of AI-optimized block, the AI Tensor Block, tuned for common matrix-matrix or vector-matrix multiplications.
  • According to Intel, Intel® Agilex™ FPGAs and SoCs deliver up to 40 percent higher performance or up to 40 percent lower power than prior-generation Intel FPGAs, for applications in the data centre, networking, and edge compute.
  • Xilinx SoCs are well suited to AI applications. They integrate a processor for software programmability and an FPGA for hardware programmability, providing scalability, flexibility and performance. They include the cost-effective Zynq 7000 SoC and the high-end Zynq UltraScale+ MPSoC and Zynq UltraScale+ RFSoC.
  • Lattice Semiconductor provides FPGAs for machine learning applications which are easy to implement, low power and highly flexible. Their hardware platforms include iCE40 UltraPlus FPGA, ECP5 FPGA and CrossLink-NX.
  • Microchip’s PolarFire SoC is suitable for reliable, secure and power-efficient computation in Artificial Intelligence/Machine Learning (AI/ML), industrial automation, imaging, Internet of Things (IoT) and similar applications.


FPGAs are now widely used in data centres to offload GPU-based and CPU-based inference engines. These are early days in the definition, expansion and deployment of such capabilities, which span targeted FPGAs, model development and optimization frameworks, and an ecosystem of supported libraries. Rapid growth in FPGA capabilities is envisaged over the next five years to tackle the many applications that could be deployed in the real world.

Read Part 1 here…

