DRIVE PX2
NVIDIA DRIVE PX2: An In-Vehicle Platform for Deep Learning and Autonomous Driving
Toru Baji (馬路 徹), Technical Advisor and GPU Evangelist

AGENDA
• NVIDIA's automotive business
• Advanced image recognition with deep learning
• GPU: the engine for deep learning and massively parallel processing
• DRIVE PX2: in-vehicle platform for deep learning and massively parallel processing
• DRIVE WORKS: software framework for ADAS and autonomous driving
• Visualizing autonomous driving operation
• Recent autonomous driving application examples (public information)

NVIDIA'S AUTOMOTIVE BUSINESS
• Automotive Experience: 10 years
• Car Models: 80
• Units Shipped: 10+ M

NVIDIA SDK (SOFTWARE DEVELOPMENT KIT)
The Essential Resource for OEM, Tier 1, and Ecosystem Proliferation
developer.nvidia.com | Available Now

THE NEW REALIZATION
"Modules, modules and more modules. There's so many modules there. If we were to strip off this car, we'd probably have a basketful of modules -- little black boxes that do something. It's getting out of control. They're very expensive. They're tough to package. They're very complex."
"I'd like to see a monster module that controls the entire vehicle and that's easier to upgrade."
Ralph Gilles, Fiat Chrysler Automobiles Global Design Chief
Automotive News, February 28, 2016
NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

THE FUTURE OF CAR COMPUTERS: ONLY TWO MAIN INTEGRATED MODULES
DRIVE CX (Cockpit Computer), Cockpit Software: GPU Virt, AI - Speech, SurroundView, Smart Mirror
DRIVE PX (Self-Driving Computer), Self-Driving Software: Perception, Localization, Planning, Visualization
• Two computers replace many ECUs
• Both have access to cameras/sensors
• Multiple OSs, displays
• Powered by artificial intelligence
• Upgradeable SW replaces HW ECUs
• One architecture
• Higher performance
• Lower total cost

ADVANCED IMAGE RECOGNITION WITH DEEP LEARNING

DL REVOLUTIONIZES CAR COMPUTER VISION
CONVENTIONAL (…)
Requires separate algorithms/apps
- Pedestrian: HOG etc.
- Traffic sign: Hough transform + character recognition etc.
Only simple context recognition
- Pedestrian yes/no only (no additional info)
- Speed limit signs only
DEEP NEURAL NETWORK
One deep neural net app can detect various objects
- Pedestrians, cars, traffic signs, lanes
- Also with many attributes (Car: police car, van, sedan, truck, ambulance…)

VERY SHORT TIME TO GET TOP-CLASS SCORE
KITTI Dataset: Object Detection, NVIDIA DRIVENet
[Chart: KITTI object detection score (y-axis, 30% to 100%) by month, 7/2015 to 12/2015; data labels 39%, 55%, 72%, and 88% (Top Score)]

EVERYBODY USING GPU! (Not the latest ranking)
Image credits: Courtesy of Cityscape, Courtesy of Daimler, Courtesy of Audi
"Using NVIDIA DIGITS deep learning platform, in less than four hours we achieved over 96% accuracy using Ruhr University Bochum's traffic sign database. While others invested years of development to achieve similar levels of perception with classical computer vision algorithms, we have been able to do it at the speed of light."
Matthias Rudolph, Director of Architecture, Driver Assistance Systems, Audi

GPU: THE ENGINE FOR DEEP LEARNING AND MASSIVELY PARALLEL PROCESSING

NVIDIA GPUS' BIG CONTRIBUTION TO SUPERCOMPUTERS USING CUDA (GPU massively parallel computing)
CUDA: Compute Unified Device Architecture
Source: SC TOP500, November 2015

LEAPS IN SUPERCOMPUTER GPU ADOPTION
[Chart: number of accelerated systems (y-axis, 0 to 120) for Nov 2013, Nov 2014, Nov 2015]
• Accelerated systems doubled from 2013 to 2015
• 96% of new systems use NVIDIA GPUs

CUDA: MASSIVELY PARALLEL PROGRAMMING ENVIRONMENT
CUDA (Compute Unified Device Architecture)
https://developer.nvidia.com/gpu-accelerated-libraries
Representative CUDA-enabled libraries:
• cuDNN: deep learning
• cuBLAS: matrix operations (dense matrices)
• cuSPARSE: matrix operations (sparse matrices)
• cuFFT: Fourier transforms
• cuRAND: random number generation
• NPP: image processing primitives
• cuSOLVER: matrix solvers (y = Ax)
• Thrust: C++ template library
• …

SOLID GPU ROADMAP
[Chart: SGEMM / W (y-axis, 0 to 72) by year, 2008 to 2018; architectures in order: Tesla, Fermi, Kepler, Maxwell, Pascal, Volta; feature annotations: Mixed Precision, Double Precision, 3D Memory, NVLink]

NVIDIA ONE ARCHITECTURE FROM SUPERCOMPUTER TO AUTOMOTIVE SOC
• Tesla: in supercomputers
• Quadro: in workstations
• GeForce: in PCs
• Mobile GPU: in Tegra
• Automotive Tegra

PARALLEL PROCESSING AND AI/DL EVERYWHERE WITH ONE ARCHITECTURE ACROSS ALL PRODUCTS/PLATFORMS
• NVIDIA Tegra / DRIVE PX
• NVIDIA Tesla / Supercomputer, HPC
• NVIDIA Tegra / Jetson
• TITAN X / Graphics Card

WHAT A TRULY SCALABLE GPU ARCHITECTURE ENABLES
TIME-CONSUMING TRAINING ON SERVER & REAL-TIME RECOGNITION ON EMBEDDED SYSTEM
Diagram labels: Classified Object!
Trained Neural Net Model; NVIDIA GPU Deep Learning Supercomputer; DRIVE PX Auto-Pilot Car Computer; Camera Inputs

DRIVE PX2: IN-VEHICLE PLATFORM FOR DEEP LEARNING AND MASSIVELY PARALLEL PROCESSING

DRIVE PX2 ENGAGEMENTS: >100
• Passenger Car OEMs: ~25
• Commercial Car OEMs: ~10
• Tier 1s: ~20
• TaaS (Transportation as a Service): ~10
• Ecosystem Partners (R&D, universities, OS, sensor, ISV etc.): ~50

DRIVE PX PLATFORM SOLUTION
• DRIVE PX is a computing platform for ADAS / autonomous driving
• End-to-end platform optimized for deep learning (supercomputer to DRIVE PX)
• Open and scalable SW stack: DriveWorks
• Scalable architecture from ADAS to autonomous driving (one Tegra up to 2 x Tegra + 2 x discrete GPU)
Diagram labels: DL: very fast development speed towards top DL score (1); Training Workstation/Supercomputer; DRIVE PX

DRIVE PX
• Dual Tegra X1
• 8 CPU cores
• Maxwell GPU
• 850 GFLOPS (FP32)
• 12 simultaneous LVDS camera inputs
• 2 LVDS display ports
Board labels: Camera Inputs, Display Ports, Car Connector
Proprietary & Confidential. All Information Subject to Change.

DRIVE PX HARNESS FROM CAR CONNECTOR
CAN, LIN, FlexRay and Ethernet supported
48-pin automotive-grade vehicle harness:
• Ethernet (x1)
• Power (x1)
• UART (x1)
• FlexRay (x2)
• LIN (x4)
• CAN 2.0 (x6)

DRIVE PX2
• Dual next-generation Tegra
• Dual discrete GPUs
• 12 CPU cores
• Pascal GPU
• 8 TFLOPS (FP32)
• 24 DL TOPS
• 12 simultaneous LVDS camera inputs
• Liquid cooled if all devices are used
Board labels: dual Tegras on top; dual discrete GPUs on the bottom

DRIVE PX2 COMPUTATION ENGINES
TOTAL PERFORMANCE
- 8 TFLOPS (FP32)
- 24 DL TOPS
HIGH PERFORMANCE: 12 CPUs
- 2 x quad ARM A57
- 2 x dual Denver (ARM 64-bit compatible)
SCALABLE
- Scalable platform. Max: 2 Tegras + 2 dGPUs. Min: 1 Tegra
REDUNDANCY
- For functional safety
DEDICATED MEMORY for each GPU
Block diagram:
• TEGRA A: 2 x Denver, 4 x A57, Pascal integrated GPU, 8GB LPDDR4 128-bit UMA
• TEGRA B: 2 x Denver, 4 x A57, Pascal integrated GPU, 8GB LPDDR4 128-bit UMA
• PASCAL A: Pascal discrete GPU, 4GB GDDR5, PCIe x4
• PASCAL B: Pascal discrete GPU, 4GB GDDR5, PCIe x4
• Link label: 1Gb Ether

DRIVE PX2 INTERFACES
70 gigabits per second of I/O
• Sensor fusion interfaces: GMSL camera, CAN, GbE, BroadR-Reach, FlexRay, LIN, GPIO
• Display / cockpit computer interfaces: HDMI, FPDLink III and GMSL
• Development and debug interfaces: HDMI, GbE, 10GbE, USB3, USB 2 (UART/debug), JTAG
Diagram labels: Camera, CAN, LIN, Gb Ether, BroadR-Reach, FlexRay, USB3.0, USB2.0, GPIOs, Display, Display (HDMI), 10Gb Ether, JTAG; TEGRA A, TEGRA B, PASCAL A, PASCAL B, ASIL-D Safety MCU; auto-grade connectors; debug/lab interfaces

DRIVE PX2 SOFTWARE
A full stack of rich software components
• NVIDIA Vibrante Linux & comprehensive BSP
• Rich autonomous driving DriveWorks SDK
• SDK, samples and more

DRIVE PX ANALYSIS AS AN SEOOC (SAFETY ELEMENT OUT OF CONTEXT)
• NVIDIA DRIVE PX as an SEooC is developed based on "assumptions on use in vehicles," including external interfaces
• Safety Manual, FMEDA: NVIDIA, as the developer of this SEooC, will provide the assumptions to the Tier 1s and OEMs
• To have a complete safety case, these "assumptions" are validated by OEMs and Tier 1s in the context of the actual vehicle system
• If the NVIDIA SEooC does not fulfill the vehicle requirements, "a modification needs to be made" to either the vehicle or the SEooC
Abbreviations:
• SEooC: Safety Element out of Context
• HARA: Hazard Analysis and Risk Assessment
• FMEDA: Failure Modes, Effects and Diagnostic Analysis
• FTA: Fault Tree Analysis
Diagram labels: Quantitative Analysis; FMEDA/FTA; SEooC; Done

DRIVE WORKS: SOFTWARE FRAMEWORK FOR ADAS AND AUTONOMOUS DRIVING

NVIDIA DRIVEWORKS
• AI/DL is now used in detection (perception)
• Other features are accelerated by CUDA (GPU massively parallel computing)
SDK family: COMPUTEWORKS, GAMEWORKS, VRWORKS, DESIGNWORKS, DRIVEWORKS, JETPACK
DRIVEWORKS: Sensor Fusion, Detection, Localization, HD Maps, and other technologies such as Driving and Planning

AND OTHER SUPPORTING SDKS
Deep Learning SDK, DIGITS Workflow, VisionWorks, and other technologies such as GIE (GPU Inference Engine), System Trace, Visual Profiler

The NVIDIA DriveWorks SDK gives developers a foundation to build applications
across the self-driving pipeline: perception, localization, planning and visualization. And we can bring all of these technologies together into a beautiful cockpit visualization to give the driver confidence that the car is accurately seeing the world around him.
"As a leading provider of graphical hardware for gamers and researchers alike, NVIDIA has a lot of expertise in building systems that can make sense of video input and make it something understandable."
Business Insider
DRIVEWORKS: Perception, Localization, Planning, Visualization

VISUALIZING AUTONOMOUS DRIVING OPERATION

NEW AI DRIVING
Diagram labels: Mapping, KALDI, Localization, DRIVENET, DAVENET
• Training on DGX-1 (NVIDIA DGX-1)
• Driving with DriveWorks (NVIDIA DRIVE PX)

RECENT AUTONOMOUS DRIVING APPLICATION EXAMPLES (PUBLIC INFORMATION)

As part of Volvo's Drive Me project, 100 autonomous test cars will run in 2017. These cars will be equipped with NVIDIA's deep learning car computer, DRIVE PX2.

WORLD'S FIRST AUTONOMOUS CAR RACE
• 10 teams, 20 identical cars
• DRIVE PX 2: the "brain" of every car
• 2016/17 Formula E season

HIGH-SPEED RACING ALGORITHM ALREADY THERE
Georgia Tech MPPI (Model Predictive Path Integral control) algorithm
• Computes the optimized trajectory as the weighted average of 2,560 different trajectories (each looking 2.5 sec ahead), calculated in parallel on the monster NVIDIA GPU 60 times per second.
• Using just one sampled trajectory would be very jerky, so the 2,560 trajectories are averaged with weights.
• The dynamics model is a linear function of 25 features based on an analytical vehicle model.
• The on-car GPU is an NVIDIA GTX 750 Ti (640 cores, 1,305 GFLOPS).
The car performs by itself: counter steering, power slide, …
Max speed: 100 km/h

THANK YOU
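
SUPPLEMENT: MPPI WEIGHTED AVERAGING (ILLUSTRATIVE SKETCH)
The weighted-average step described on the MPPI slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Georgia Tech implementation. The 1-D point-mass model, the cost function, the noise level, the temperature, and the names `toy_rollout_cost` and `mppi_step` are all hypothetical. It runs sequentially on the CPU, whereas the slide describes the rollouts being computed in parallel on a GPU. A horizon of 150 steps corresponds to 2.5 sec only if each step is assumed to be 1/60 sec.

```python
import math
import random


def toy_rollout_cost(state, controls, dt=1.0 / 60.0, target=1.0):
    """Cost of one sampled trajectory for a toy 1-D point mass.

    Hypothetical stand-in for a vehicle model: the control is an
    acceleration, and the cost is the squared distance to a target
    position accumulated over the horizon.
    """
    position, velocity = state
    cost = 0.0
    for accel in controls:
        velocity += accel * dt
        position += velocity * dt
        cost += (position - target) ** 2 * dt
    return cost


def mppi_step(state, nominal, rollout_cost, num_samples=2560,
              noise_std=5.0, temperature=0.1, rng=None):
    """Return an updated control sequence as a cost-weighted average.

    Each sample perturbs the nominal controls with Gaussian noise and is
    scored by rollout_cost. Lower-cost samples get exponentially larger
    weights, and the weighted average of the perturbations is added to
    the nominal controls.
    """
    if rng is None:
        rng = random.Random(0)
    horizon = len(nominal)
    noises, costs = [], []
    for _ in range(num_samples):
        noise = [rng.gauss(0.0, noise_std) for _ in range(horizon)]
        controls = [u + e for u, e in zip(nominal, noise)]
        noises.append(noise)
        costs.append(rollout_cost(state, controls))
    # Subtract the lowest cost before exponentiating for numerical stability.
    lowest = min(costs)
    weights = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(weights)
    updated = list(nominal)
    for weight, noise in zip(weights, noises):
        share = weight / total
        for t in range(horizon):
            updated[t] += share * noise[t]
    return updated
```

Calling `mppi_step((0.0, 0.0), [0.0] * 150, toy_rollout_cost)` returns a 150-step control sequence. In a receding-horizon loop, only the first control would be applied before the whole sampling step is repeated, which is what running the update 60 times per second implies. Averaging over many samples rather than picking the single best one is what smooths the resulting controls.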