Technical Name Hybrid CNN Accelerator System Design and the Associated Model Training/Analyzing Tools
Project Operator National Chiao Tung University
Project Host 郭峻因
Summary
Hybrid CNN DLA system
1. Use the SPE to handle CNN operations efficiently, supporting 1-, 2-, 4-, and 8-bit CNN operations. This allows the Hybrid CNN deep learning accelerator to run in a high-efficiency mode (one way such multi-precision support can be realized is sketched after this list).
2. Use an Input Ping-Pong Buffer to optimize DRAM access and scheduling for the Hybrid CNN DLA, and a 2-Stage Input Buffer to reduce data-transfer latency to the 2-D Systolic PE Arrays (the double-buffering idea is sketched after this list).
3. Use an independent Partial Sum Sorter to access on-chip memory, enabling 100% hardware utilization.
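The summary does not describe how the SPE implements multiple precisions, so the sketch below is only a minimal Python illustration of one well-known technique used in precision-scalable PEs, and of why throughput can rise as the bit width drops: two narrow operands are packed into separate bit lanes of a single wide multiplier, so one physical multiply yields two low-bit products. The 4-bit unsigned operands, 8-bit lanes, and the name packed_dual_mac are assumptions for illustration only; signed operands and the guard bits needed when accumulating many products are omitted.

```python
import numpy as np

def packed_dual_mac(a, b, c, lane_bits=8):
    """Compute a*b and a*c with ONE wide multiplication by packing b and c
    into separate bit lanes (unsigned 4-bit operands for simplicity)."""
    packed = b + (c << lane_bits)        # two operands share one multiplier input
    prod = a * packed                    # single hardware multiply
    lo = prod & ((1 << lane_bits) - 1)   # lane 0 -> a*b  (fits in 8 bits: 15*15=225)
    hi = prod >> lane_bits               # lane 1 -> a*c
    return lo, hi

rng = np.random.default_rng(0)
a, b, c = (int(v) for v in rng.integers(0, 16, size=3))   # 4-bit unsigned values
lo, hi = packed_dual_mac(a, b, c)
assert (lo, hi) == (a * b, a * c)
print(f"a={a}, b={b}, c={c}:  one multiply gives a*b={lo}, a*c={hi}")
```

Whether the SPE uses exactly this scheme is an assumption; the sketch is included only to show why halving the operand width can double the number of products delivered per multiplier per cycle.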
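No details of the buffer controller are given, so the following is only a schematic Python sketch of the ping-pong (double-buffering) pattern referred to above: while the PE array works on the tile held in one buffer, the next tile is fetched from DRAM into the other buffer, hiding memory latency behind computation. The functions dram_fetch and pe_array_compute are placeholders, and the thread-based prefetch merely imitates what a DMA engine would do in hardware.

```python
from concurrent.futures import ThreadPoolExecutor

def dram_fetch(tile_id):
    """Placeholder for a DMA transfer of one input tile from DRAM."""
    return [tile_id] * 1024                          # pretend this is the tile's data

def pe_array_compute(tile):
    """Placeholder for the 2-D systolic PE array consuming one tile."""
    return sum(tile)

def run(num_tiles):
    results = []
    with ThreadPoolExecutor(max_workers=1) as dma:
        buffers = [dram_fetch(0), None]              # pre-fill the "ping" buffer
        ping = 0
        for t in range(num_tiles):
            # Start filling the "pong" buffer while "ping" is being computed on.
            prefetch = dma.submit(dram_fetch, t + 1) if t + 1 < num_tiles else None
            results.append(pe_array_compute(buffers[ping]))   # compute current tile
            if prefetch is not None:
                buffers[1 - ping] = prefetch.result()          # next tile is ready
                ping = 1 - ping                                # swap ping/pong roles
    return results

print(run(4))
```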
Hybrid CNN Training/Analyzing Tools
1. Using knowledge transfer, dynamic quantization, and other techniques, the tool greatly reduces the complexity and size of the Hybrid CNN model (a distillation sketch follows this list).
2. Using SSTE to solve the gradient-deviation problem of back-propagation when training at the bit-accurate level: by recording each node's overflow rate, it blocks gradients from nodes with a high probability of overflow (sketched after this list).
3. Our team has developed an automatic analysis/training tool (ezQUANT) that automatically quantizes and fine-tunes deep learning models.
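The summary names knowledge transfer as one of the compression techniques; a common way to realize it is knowledge distillation, sketched below with NumPy. The temperature, loss weighting, and the assumption that a full-precision teacher guides a low-bit student are illustrative choices, not details taken from the tool itself.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Blend of (a) KL divergence between the full-precision teacher's soft
    targets and the low-bit student, and (b) cross-entropy with hard labels."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1).mean()
    ce = -np.log(softmax(student_logits)[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * (T * T) * kl + (1.0 - alpha) * ce

rng = np.random.default_rng(0)
teacher = rng.standard_normal((8, 10))                     # float teacher logits
student = teacher + 0.5 * rng.standard_normal((8, 10))     # noisier low-bit student
labels = rng.integers(0, 10, size=8)
print(f"distillation loss: {distillation_loss(student, teacher, labels):.4f}")
```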
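SSTE is not defined beyond the behavior described in item 2, so this NumPy sketch only mirrors that description under stated assumptions: the forward pass quantizes bit-accurately, the backward pass uses a straight-through estimator, and a running overflow (saturation) rate is kept per element so that gradients from elements that overflow too often are suppressed. The class name, decay factor, and overflow threshold are placeholders.

```python
import numpy as np

class OverflowAwareSTE:
    """Straight-through estimator whose backward pass masks gradients coming
    from values that saturate (overflow) the n-bit range too frequently."""

    def __init__(self, bits=4, decay=0.99, max_overflow_rate=0.05):
        self.qmax = 2 ** (bits - 1) - 1
        self.decay = decay
        self.max_rate = max_overflow_rate
        self.overflow_rate = None          # running per-element overflow rate

    def forward(self, x, scale):
        q = np.round(x / scale)
        overflow = (q < -self.qmax - 1) | (q > self.qmax)       # would saturate
        rate = overflow.astype(np.float64)
        if self.overflow_rate is None:
            self.overflow_rate = rate
        else:
            self.overflow_rate = self.decay * self.overflow_rate + (1 - self.decay) * rate
        return np.clip(q, -self.qmax - 1, self.qmax) * scale    # bit-accurate value

    def backward(self, grad_out):
        # STE: treat quantization as identity, but zero the gradient for
        # elements whose recorded overflow rate is too high.
        keep = self.overflow_rate <= self.max_rate
        return grad_out * keep

ste = OverflowAwareSTE(bits=4)
x = np.array([0.1, -0.4, 3.5, -2.9])       # last two overflow a 4-bit range at scale 0.25
y = ste.forward(x, scale=0.25)
print(y, ste.backward(np.ones_like(x)))    # gradients of the overflowing elements are zeroed
```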
Scientific Breakthrough
1. The Hybrid CNN DLA proposed in this work supports 1/1-, 2/2-, 4/4-, and 8/8-bit CNN operations.
2. On a Xilinx ZCU102 FPGA, it achieves a peak performance of 691.2 GOPS at 8/8-bit, 1382.4 GOPS at 4/4-bit, 2764.8 GOPS at 2/2-bit, and 5529.6 GOPS at 1/1-bit (the scaling is checked right after this list).
3. The first Hybrid CNN training methodology, which greatly compresses the model size while preserving accuracy.
4. The first bit-accurate-level analysis software, which solves the problem of analyzing the accuracy loss incurred when porting a model onto the DLAs of existing papers or industrial AI DLAs (a minimal example of such an analysis follows this list).
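The four figures in item 2 follow the expected pattern for a precision-scalable array: peak throughput doubles each time the operand width is halved. A quick arithmetic check against the quoted numbers (no extra hardware parameters assumed):

```python
# Peak throughput doubles each time the operand width is halved.
peak_8bit_gops = 691.2                       # quoted 8/8-bit peak on the ZCU102
for bits in (8, 4, 2, 1):
    print(f"{bits}/{bits}-bit: {peak_8bit_gops * (8 // bits):.1f} GOPS")
# -> 691.2, 1382.4, 2764.8, 5529.6  (matches the figures quoted above)
```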
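ezQUANT's internals are not described here, so the following is only a minimal sketch of what a bit-accurate analysis can look like under assumed quantization details: the same layer is evaluated once in floating point and once with all operands held to the target integer width, and the output deviation is reported, so accuracy loss can be estimated before the model is ported to the DLA.

```python
import numpy as np

def to_int(x, bits):
    """Quantize a float tensor to `bits`-bit signed integers plus a scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = (np.max(np.abs(x)) + 1e-12) / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int64), scale

def layer_float(x, w):
    return x @ w                                     # floating-point reference

def layer_bit_accurate(x, w, bits):
    x_q, sx = to_int(x, bits)
    w_q, sw = to_int(w, bits)
    acc = x_q @ w_q                                  # pure integer accumulation
    return acc * sx * sw                             # rescale only for comparison

rng = np.random.default_rng(0)
x, w = rng.standard_normal((1, 128)), rng.standard_normal((128, 10))
ref = layer_float(x, w)
for bits in (8, 4, 2):
    err = np.abs(layer_bit_accurate(x, w, bits) - ref).max()
    print(f"{bits}/{bits}-bit max output deviation: {err:.4f}")
```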
Industrial Applicability
For industrial applications, our Hybrid CNN model training tool can efficiently reduce model size, and the resulting model can then run inference at higher speed on our customized DLA. The software tool can be used for CNN model training and re-training, and we can provide the corresponding DLA to run inference in edge applications. This solution avoids the quantization error that arises when a floating-point model is ported to an edge device.
Keyword Training Tool for Hybrid Fixed-Point CNN Model, Dynamic Quantization, Lightweight Model, Binary Model, Hybrid Model, Bit-Accurate Analysis, Bit-Accurate Training, Deep Learning Quantization, Quantization Analysis