DRIVE PX2

In-Vehicle Deep Learning and Autonomous Driving Platform
NVIDIA DRIVE PX2
Toru Baji (馬路 徹)
Technical Advisor, GPU Evangelist
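Agenda
• NVIDIA's automotive business
• Advanced image recognition with deep learning
• GPU: the engine for deep learning and massively parallel processing
• DRIVE PX2: in-vehicle platform for deep learning and massively parallel processing
• DriveWorks: SW framework for ADAS and autonomous driving
• Visualizing autonomous driving operation status
• Recent autonomous driving application examples (public information)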
NVIDIA's Automotive Business
Automotive experience: 10 years
Car models: 80
Units shipped: 10+ M
NVIDIA SDK (SOFTWARE DEVELOPMENT KIT)
The Essential Resource for OEM, Tier1, Eco System Proliferation
developer.nvidia.com | Available Now
THE NEW REALIZATION
“Modules, modules and more modules. There's so many modules there. If we were to strip off this car, we'd probably have a basketful of modules -- little black boxes that do something. It's getting out of control. They're very expensive. They're tough to package. They're very complex.
“I’d like to see a monster module that controls the entire vehicle and that's easier to upgrade.”
Ralph Gilles, Fiat Chrysler Automobiles
Global Design Chief
Automotive News, February 28, 2016
NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
THE FUTURE OF CAR COMPUTERS
ONLY TWO MAIN INTEGRATED MODULES
DRIVE CX (Cockpit Computer), cockpit software: GPU Virt, AI - Speech, SurroundView, Smart Mirror
DRIVE PX (Self-Driving Computer), self-driving software: Perception, Localization, Planning, Visualization
Two computers replace many ECUs
Both have access to cameras/sensors
Multiple OSs, displays
Powered by artificial intelligence
Upgradeable SW replaces HW ECUs
One architecture
Higher performance
Lower total cost
Advanced Image Recognition with Deep Learning
DL REVOLUTIONIZES CAR COMPUTER VISION
CONVENTIONAL
Required separate algorithms/apps
- Pedestrian: HOG etc.
- Traffic sign: Hough transform + character recognition etc.
Only simple context recognition
- Pedestrian yes/no only (no additional info)
- Speed limit signs only
DEEP NEURAL NETWORK
One deep neural net app can detect various objects
- Pedestrians, cars, traffic signs, lanes
- Also with many attributes (Car: police car, van, sedan, truck, ambulance….)
(…)
VERY SHORT TIME TO GET TOP-CLASS SCORE
KITTI Dataset: Object Detection
[Chart: NVIDIA DRIVENet object-detection score on the KITTI dataset, 7/2015 to 12/2015. Vertical axis 30% to 100%; labeled data points: 39%, 55%, 72%, 88% (top score).]
EVERYBODY USING GPU! (Not the latest ranking)
Courtesy of Cityscape
Courtesy of Daimler
Courtesy of Audi
“Using NVIDIA DIGITS deep learning platform, in less than four hours we achieved over 96% accuracy using Ruhr University Bochum’s traffic sign database. While others invested years of development to achieve similar levels of perception with classical computer vision algorithms, we have been able to do it at the speed of light.”
Matthias Rudolph, Director of Architecture, Driver Assistance Systems, Audi
GPU: The Engine for Deep Learning and Massively Parallel Processing
NVIDIA GPU: BIG CONTRIBUTION TO SUPERCOMPUTERS
USING CUDA (GPU Massively Parallel Computing)
CUDA: Compute Unified Device Architecture
From SC TOP500 November 2015
LEAPS IN SUPERCOMPUTER GPU ADOPTION
[Chart: # accelerated systems (vertical axis 0 to 120) for Nov 2013, Nov 2014 and Nov 2015]
Accelerated Systems x2 from 2013 to 2015
96% of New Systems using NVIDIA GPU
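CUDA: Massively Parallel Programming Environment
CUDA (Compute Unified Device Architecture)
https://developer.nvidia.com/gpu-accelerated-libraries
Representative CUDA-accelerated libraries:
cuDNN: deep learning
cuBLAS: matrix operations (dense)
cuSPARSE: matrix operations (sparse)
cuFFT: Fourier transform
cuRAND: random number generation
NPP: image processing primitives
cuSOLVER: matrix solvers (y = Ax)
Thrust: C++ template library
…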
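As a point of reference for the dense-matrix entry in the library list, the SGEMM routine that cuBLAS accelerates computes C = alpha * A * B + beta * C in single precision. The sketch below is a plain-Python reference for that definition only; the function name `gemm_reference` is hypothetical and this is not the cuBLAS API. An m x k by k x n product costs about 2 * m * n * k floating-point operations, which is the kind of workload that per-watt GEMM figures measure.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Reference GEMM on nested lists: returns alpha * A @ B + beta * C.

    Hypothetical helper for illustration; a GPU library would run the
    independent output elements in parallel instead of looping.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][k] * b[k][j] for k in range(inner)) + beta * c[i][j]
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def gemm_flops(m, n, k):
    """Approximate operation count: one multiply and one add per inner term."""
    return 2 * m * n * k


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
result = gemm_reference(1.0, A, B, 0.0, C)
```

Every output element is independent of the others, which is why this operation maps well onto thousands of GPU threads.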
SOLID GPU ROADMAP
[Chart: SGEMM / W (vertical axis 0 to 72) versus year, 2008 to 2018. GPU generations in order: Tesla, Fermi, Kepler, Maxwell, Pascal, Volta. Annotations near Pascal: Mixed Precision, Double Precision, 3D Memory, NVLink.]
NVIDIA ONE-ARCHITECTURE: FROM SUPERCOMPUTER TO AUTOMOTIVE SOC
Tesla: in supercomputers
Quadro: in workstations
GeForce: in PCs
Mobile GPU: in Tegra
Automotive Tegra
PARALLEL PROCESSING AND AI/DL EVERYWHERE WITH ONE-ARCHITECTURE OVER ALL PRODUCTS/PLATFORMS
NVIDIA Tegra/DRIVE PX
NVIDIA Tesla/Supercomputer, HPC
NVIDIA Tegra/Jetson
TITAN X/Graphics Card
WHAT TRULY SCALABLE GPU ARCHITECTURE ENABLES
TIME-CONSUMING TRAINING ON SERVER & REAL-TIME RECOGNITION ON EMBEDDED SYSTEM
[Diagram: an NVIDIA GPU deep learning supercomputer produces a trained neural net model, which is deployed to the DRIVE PX auto-pilot car computer; DRIVE PX takes camera inputs and outputs classified objects.]
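The split between slow training on a server and fast recognition in the car can be sketched in a few lines. This is a toy illustration under stated assumptions, not NVIDIA's pipeline: the model is a two-input perceptron, the "trained neural net model" is a JSON string, and the names `train_on_server` and `classify_in_car` are hypothetical.

```python
import json


def train_on_server(samples, epochs=20):
    """Time-consuming step: fit a tiny perceptron and export it as JSON."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            error = label - (1 if activation > 0 else 0)
            # Perceptron rule: move the weights only when the prediction is wrong.
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
    return json.dumps({"weights": weights, "bias": bias})


def classify_in_car(model_json, features):
    """Real-time step: load the trained model and run one forward pass."""
    model = json.loads(model_json)
    activation = sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]
    return 1 if activation > 0 else 0


# Hypothetical toy data (logical AND), standing in for labeled camera frames.
TRAINING_SET = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
model_json = train_on_server(TRAINING_SET)
predictions = [classify_in_car(model_json, x) for x, _ in TRAINING_SET]
```

Only the exported model crosses from the training side to the embedded side; the inference function never needs the training data or the training loop.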
DRIVE PX2: In-Vehicle Platform for Deep Learning and Massively Parallel Processing
DRIVE PX2 ENGAGEMENTS >100
Passenger car OEMs: ~25
Commercial car OEMs: ~10
Tier 1s: ~20
TaaS (Transportation as a Service): ~10
Ecosystem partners (R&D, universities, OS, sensor, ISV etc.): ~50
DRIVE PX PLATFORM SOLUTION
• DRIVE PX is a computing platform for ADAS / autonomous driving
• End-to-end platform optimized for deep learning (supercomputer to DRIVE PX)
• Open and scalable SW stack: DriveWorks
• Scalable architecture from ADAS to autonomous driving (one Tegra up to 2 x Tegra + 2 x discrete GPU)
[Diagram: training on a workstation/supercomputer, deployment on DRIVE PX. DL: very fast development speed towards top DL score (1)]
DRIVE PX
• Dual Tegra X1
• 8 CPU cores
• Maxwell GPU
• 850 GFLOPS (FP32)
• 12 simultaneous LVDS camera inputs
• 2 LVDS display ports
[Figure labels: camera inputs, display ports, car connector]
Proprietary & Confidential
All Information Subject to Change
DRIVE PX HARNESS FROM CAR CONNECTOR
CAN, LIN, FlexRay and Ethernet supported
48-pin automotive-grade vehicle harness:
• CAN 2.0 (x6)
• LIN (x4)
• FlexRay (x2)
• Ethernet (x1)
• UART (x1)
• Power (x1)
DRIVE PX2
• Dual next-generation Tegra (dual Tegras on top)
• Dual discrete GPUs (on the bottom)
• 12 CPU cores
• Pascal GPU
• 8 TFLOPS (FP32)
• 24 DL TOPS
• 12 simultaneous LVDS camera inputs
Liquid cooled if all devices are used
DRIVE PX2 COMPUTATION ENGINES
GPU TOTAL PERFORMANCE
- 8 TFLOPS (FP32)
- 24 DL TOPS
HIGH PERFORMANCE: 12 CPUs
- 2 x quad ARM A57
- 2 x dual Denver (ARM 64-bit compatible)
SCALABLE
- Scalable platform. Max: 2 Tegras + 2 dGPUs. Min: 1 Tegra
REDUNDANCY
- For functional safety
DEDICATED MEMORY for each GPU
[Block diagram: TEGRA A and TEGRA B each contain 4 x A57 and 2 x Denver CPU cores plus a Pascal integrated GPU, with 8 GB LPDDR4 (128-bit, UMA). Each Tegra connects over PCIe x4 to a Pascal discrete GPU (PASCAL A, PASCAL B) with 4 GB GDDR5. A 1 Gb Ethernet link also appears in the diagram.]
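The core and memory totals for the maximum and minimum configurations follow directly from the per-device figures on this slide. The sketch below tallies them; the class and function names are hypothetical, and it assumes the minimum configuration keeps the same per-Tegra memory as the full board.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tegra:
    a57_cores: int = 4      # quad ARM A57
    denver_cores: int = 2   # dual Denver
    lpddr4_gb: int = 8      # unified memory shared with the integrated GPU


@dataclass(frozen=True)
class DiscreteGpu:
    gddr5_gb: int = 4       # dedicated memory per discrete Pascal GPU


def summarize(tegras, dgpus):
    """Total CPU cores and memory for one board configuration."""
    return {
        "cpu_cores": sum(t.a57_cores + t.denver_cores for t in tegras),
        "lpddr4_gb": sum(t.lpddr4_gb for t in tegras),
        "gddr5_gb": sum(g.gddr5_gb for g in dgpus),
    }


MAX_CONFIG = summarize([Tegra(), Tegra()], [DiscreteGpu(), DiscreteGpu()])
MIN_CONFIG = summarize([Tegra()], [])
```

With two Tegras the tally gives the 12 CPU cores quoted on the slide (2 x 4 A57 plus 2 x 2 Denver).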
DRIVE PX2 INTERFACES
70 Gigabits per second of I/O
• Sensor fusion interfaces: GMSL camera, CAN, GbE, BroadR-Reach, FlexRay, LIN, GPIO
• Displays / cockpit computer interfaces: HDMI, FPD-Link III and GMSL
• Development and debug interfaces: HDMI, GbE, 10GbE, USB3, USB 2 (UART/debug), JTAG
[Block diagram: TEGRA A, PASCAL A, TEGRA B, PASCAL B and an ASIL-D safety MCU, connected to Camera, CAN, LIN, FlexRay, BroadR-Reach, Gb Ethernet, 10Gb Ethernet, USB 3.0, USB 2.0, GPIOs, Display (HDMI) and JTAG; grouped into auto-grade connectors and debug/lab interfaces.]
DRIVE PX2 SOFTWARE
A full stack of rich software components
• NVIDIA Vibrante Linux & comprehensive BSP
• Rich autonomous driving DriveWorks SDK
• SDK, samples and more
DRIVE PX ANALYSIS AS AN SEOOC (SAFETY ELEMENT OUT OF CONTEXT)
• NVIDIA DRIVE PX as an SEooC is developed based on “assumptions on use in vehicles”, including external interfaces
• Safety manual, FMEDA: NVIDIA, as the developer of this SEooC, will provide the assumptions to the Tier 1s and OEMs
• To have a complete safety case, these “assumptions” are validated by the OEMs and Tier 1s in the context of the actual vehicle system
• If the NVIDIA SEooC does not fulfill the vehicle requirements, “a modification needs to be made” to either the vehicle or the SEooC
SEooC: Safety Element out of Context
HARA: Hazard Analysis and Risk Assessment
FMEDA: Failure Modes, Effects and Diagnostic Analysis
FTA: Fault Tree Analysis
[Diagram: quantitative analysis (FMEDA/FTA) leading to “SEooC done”]
DriveWorks: SW Framework for ADAS and Autonomous Driving
NVIDIA DRIVEWORKS
AI/DL is now used in detection (perception); other features are accelerated by CUDA (GPU massively parallel computing)
DriveWorks components: Sensor Fusion, Detection, Localization, HD Maps, and other technologies such as Driving, Planning
NVIDIA SDK family: COMPUTEWORKS, GAMEWORKS, VRWORKS, DESIGNWORKS, DRIVEWORKS, JETPACK
AND OTHER SUPPORTING SDKS: Deep Learning SDK, DIGITS Workflow, VisionWorks, and other technologies such as GIE (GPU Inference Engine), System Trace, Visual Profiler
The NVIDIA DriveWorks SDK gives developers a foundation to build applications across the self-driving pipeline: perception, localization, planning and visualization. And we can bring all of these technologies together into a beautiful cockpit visualization to give the driver confidence that the car is accurately seeing the world around him.
“As a leading provider of graphical hardware for gamers and researchers alike, NVIDIA has a lot of expertise in building systems that can make sense of video input and make it something understandable.”
Business Insider
DRIVEWORKS
Perception
Localization
Planning
Visualization
Visualizing Autonomous Driving Operation Status
NEW AI DRIVING
MAPPING, KALDI, LOCALIZATION, DRIVENET, DAVENET
Training on NVIDIA DGX-1
Driving with DriveWorks on NVIDIA DRIVE PX
Recent Autonomous Driving Application Examples (Public Information)
As part of Volvo's Drive Me project, 100 autonomous test cars will run in 2017. These cars will be equipped with NVIDIA’s deep learning car computer, DRIVE PX2.
WORLD’S FIRST AUTONOMOUS CAR RACE
• 10 teams, 20 identical cars
• DRIVE PX 2: the “brain” of every car
• 2016/17 Formula E season
HIGH-SPEED RACING ALGORITHM ALREADY THERE
Georgia Tech MPPI (Model Predictive Path Integral control) algorithm
• Computes the optimized trajectory as the weighted average of 2,560 different trajectories (each looking 2.5 s ahead), calculated in parallel on the monster NVIDIA GPU 60 times per second.
• Using just one sampled trajectory would be very jerky, so the 2,560 trajectories are combined by weighted averaging.
• The dynamics model is a linear function of 25 features based on an analytical vehicle model.
• The on-car GPU is an NVIDIA GTX 750 Ti (640 cores, 1,305 GFLOPS).
The car performs on its own: counter-steering, power slides….
Max speed: 100 km/h
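The sample, roll out, weight and average loop described on this slide can be sketched as follows. This is a minimal MPPI-style illustration under loud assumptions, not the Georgia Tech implementation: the vehicle is replaced by a 1-D point mass, the cost function, noise level `sigma` and temperature `lam` are made up, and 256 samples run serially on the CPU instead of 2,560 in parallel on a GPU. Only the 2.5 s look-ahead (25 steps of 0.1 s) is taken from the slide.

```python
import math
import random


def rollout_cost(x0, v0, controls, target, dt):
    """Simulate a 1-D point mass (toy stand-in model) and sum the cost."""
    x, v, cost = x0, v0, 0.0
    for u in controls:
        v += u * dt                      # acceleration command updates speed
        x += v * dt                      # speed updates position
        cost += (x - target) ** 2 + 0.01 * u ** 2
    return cost


def mppi_step(x0, v0, nominal, target, rng,
              num_samples=256, sigma=1.0, lam=1.0, dt=0.1):
    """One MPPI-style update: sample, roll out, weight, average."""
    # Perturb the nominal control sequence with Gaussian noise.
    noises = [[rng.gauss(0.0, sigma) for _ in nominal]
              for _ in range(num_samples)]
    costs = [
        rollout_cost(x0, v0, [u + e for u, e in zip(nominal, eps)], target, dt)
        for eps in noises
    ]
    # Low-cost rollouts receive exponentially larger weights.
    best = min(costs)
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    # The new plan is the weighted average of all sampled control sequences.
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        for t, u in enumerate(nominal)
    ]


HORIZON_STEPS = 25                       # 25 steps x 0.1 s = 2.5 s look-ahead
rng = random.Random(0)
plan = [0.0] * HORIZON_STEPS
new_plan = mppi_step(0.0, 0.0, plan, target=1.0, rng=rng)
```

In a controller, this step would be repeated at a fixed rate (60 times per second on the slide), applying the first control of each new plan and shifting the rest forward. Every rollout is independent of the others, which is what makes the method a natural fit for a GPU.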
THANK YOU