Building a state-of-the-art Reconfigurable Dataflow and Scalable Deep Learning Accelerator (RDFS_DLA) IP and chip for AI, HPC, and edge applications.

This project aims to take India into the elite league of semiconductor and IP design.

Brief Description

The Reconfigurable Dataflow and Scalable Deep Learning Accelerator (RDFS_DLA) is a highly configurable hardware accelerator for deep learning inference. It provides dedicated hardware for the critical operations of deep learning workloads, so that inference runs efficiently and scales across deployment targets.

 

The RDFS_DLA can be deployed in multiple ways to meet the needs of different industries:

• Custom SoC ASIC: Integrating the RDFS_DLA with a processor in a custom application-specific integrated circuit (ASIC).

• SOM Module: A specialized system-on-module (SOM) for easy integration with existing carrier boards, enabling AI inference capabilities.

• PCI Cards: RDFS_DLA cards that can be plugged into PCs and servers to accelerate deep learning workloads.

 

Potential applications include:

• Smart Cameras: Camera manufacturers can integrate RDFS_DLA into camera devices powered by ARM processors and FPGAs, enabling low-power AI features such as object detection, with a target prediction rate of 10 fps.

• Drones: Drone startups and companies can build AI-powered drones for real-time flight control and AI processing, with a focus on low battery consumption.

• IP Design: IP design companies can integrate RDFS_DLA with their own RISC-V processor cores to offer advanced deep learning solutions to their customers.



Use Cases

• Smart Camera Object Detection (Camera Manufacturing): Real-time object detection in smart cameras for security or retail, optimized for low power consumption.

• Drone Image Classification (Drone Manufacturing): On-board image classification for drones, enabling real-time object recognition during flight for agricultural or delivery applications.

• Surveillance Video Analytics (Security & Surveillance): Accelerated video processing for real-time face recognition and anomaly detection in security and monitoring systems.

• Medical Imaging Diagnostics (Healthcare): Faster and more accurate image classification for medical imaging systems, aiding in early detection of conditions like cancer or fractures.

• Autonomous Vehicle Obstacle Detection (Automotive): Real-time pedestrian and obstacle detection in autonomous vehicles, enhancing safety and navigation.



Salient Features

• AI Framework Integration: Fully compatible with major AI frameworks (TensorFlow, PyTorch, Caffe, ONNX), making it easy to integrate into existing machine learning workflows.

• Built-In Image and Video Processing: Includes powerful image processing capabilities like resizing, color conversion, and non-maximum suppression for real-time inference applications.

• Optimized for Edge and Cloud Use: Offers flexible deployment for both cloud-based high-performance computing and local, low-power devices, ideal for IoT, drones, cameras, and more.



Technical Specifications

• Flexible Deployment Modes: Available as custom SoC ASIC, SOM module, or PCI-based accelerator card.

• AI Framework Support: Compatible with TensorFlow, PyTorch, Caffe, Keras, TFLite, and ONNX.

• Memory and Throughput: Supports external memory interfaces (32 to 256 bits) and up to 16 MB internal SRAM.

• Power Efficient: Achieves up to 7 TeraOps per watt, customizable for low-power or high-performance applications.
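
As a rough illustration of how the efficiency figure could be used when sizing a design, the sketch below turns the "up to 7 TeraOps per watt" ceiling and the 10 fps smart-camera target into upper bounds on throughput and on operations available per frame. Only those two numbers come from this page; the power budgets and the function names are hypothetical, and real sustained throughput will depend on the configuration and workload.

```python
# Back-of-envelope sizing sketch. Only the 7 TeraOps/W ceiling and the
# 10 fps target are taken from this page; everything else is assumed.
PEAK_TOPS_PER_WATT = 7.0   # "up to 7 TeraOps per watt" (an upper bound)
TARGET_FPS = 10            # smart-camera prediction target


def peak_tops(power_watts: float) -> float:
    """Upper bound on throughput (TeraOps/s) at a given power budget."""
    return PEAK_TOPS_PER_WATT * power_watts


def frame_budget_ms(fps: float = TARGET_FPS) -> float:
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def max_ops_per_frame(power_watts: float, fps: float = TARGET_FPS) -> float:
    """Upper bound on operations available for one frame."""
    return peak_tops(power_watts) * 1e12 / fps


if __name__ == "__main__":
    for watts in (0.5, 2.0, 10.0):  # hypothetical power budgets
        print(
            f"{watts:5.1f} W: at most {peak_tops(watts):.1f} TeraOps/s, "
            f"{max_ops_per_frame(watts):.2e} ops per frame "
            f"({frame_budget_ms():.0f} ms per frame at {TARGET_FPS} fps)"
        )
```

A model whose per-frame operation count exceeds the bound for a given power budget cannot meet the frame rate at that budget, even at peak efficiency.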



Platform Required

• Hardware: ZCU104, KCU105, Vega RISC-V processor.

• OS: Linux (CentOS 7.9 recommended), Windows 10.

• Process Node: TSMC 11nm.

• Software: Vivado, Vitis, Synopsys VCS, Synopsys Verdi, Synopsys DC, Catapult HLS, Cadence Xcelium, Cadence JasperGold, Synopsys SpyGlass.



Chief Investigator Details

 

Abhishek Tiwari, Scientist E

Email id: abhishek@cdac.in

Contact no.: +91 99718 11440

