VSIPL Design Principles - Object Management Group Portals
VSIPL Design Principles
James Lebak
Massachusetts Institute of Technology Lincoln Laboratory
VSIPL Forum

Outline
VSIPL is an API that is…
• Portable
• Object-based
• For signal and image processing applications
• On embedded platforms

VSIPL Portability Goals
• Enable source-code portability between platforms
  – Require only a simple re-compile of application
• Achieve portability by use of a standard API
  – ANSI C API
  – Agreed to and maintained by the VSIPL forum
  – Forum membership consists of both users and implementors
  – Test suite to verify conformance to API
• Maintain performance
  – API must not restrict ability of implementations to optimize

Achieving Portability
• VSIPL gives good portability for the computation portion of applications
• Multiple implementations exist
  – Vendor-optimized for specific platforms
  – Third-party optimized for single-board G4 systems
  – TASP reference implementation for general workstation use
• Implementations exist with very little performance penalty

Object-Based Libraries
• Traditional libraries are functional
  – e.g. BLAS (Basic Linear Algebra Subroutines, FORTRAN scientific library)
    saxpy(n, alpha, x, incx, y, incy)    computes y = αx + y
    (n: length; x, y: data arrays; incx, incy: strides)
• VSIPL is object-based
  – Application uses VSIPL abstract data types
  – Implementation of these data types is hidden to allow vendor-private optimizations
  – Primary abstract data types are blocks and views
    vsip_vadd_f(x, y, y)
    (x, y: views of data, which include data arrays, lengths, and strides)

Blocks and Views
• Blocks and views are the primary VSIPL abstract data types
  – A block is a contiguous storage area
  – Data in a block may be viewed as a vector, matrix, or 3-tensor
  – Calculations are performed using views
• View characteristics:
  – Offset from start of block
  – Length (number of elements in view)
  – Stride (spacing between elements in view)
Example data in a block:
    1 2 3 4 5 6 7 8 9
View of entire block as a vector (offset 0, length 9, stride 1):
    1 2 3 4 5 6 7 8 9
View of even-numbered elements (offset 1, length 4, stride 2):
    2 4 6 8
View of block as 3 by 3 matrix (offset 0, row length 3, row stride 1, column length 3, column stride 3):
    1 2 3
    4 5 6
    7 8 9

VSIPL Functionality
VSIPL contains a wide range of functions including:
Vector/Matrix Elementwise Ops
• Arithmetic (+, -, *, /)
• Comparison operations (<, >, =)
• Selection operations
  – Min, max
• Boolean operations
  – AND, OR, NOT
• Data conversion
Signal Processing
• FFT
  – 1D, 2D, 3D
  – Multiple 1D
  – In-place and out-of-place
  – Real-to-complex and complex-to-real
• Convolution (1D, 2D)
• Correlation (1D, 2D)
• FIR/IIR Filters
• Window functions
Linear Algebra
• Inner, Outer, Kronecker product
• Matrix-Vector, Matrix-Matrix Multiply
• QR, LU, Cholesky, Singular Value Decompositions
• Solvers based on above

Supported Data Types
Boolean, integer, floating-point, and index types
• Implementations must support at least one integer and one floating-point type
Complex integer and floating-point data
• Input data may be stored split or interleaved
• Implementation allowed to choose preferred order for VSIPL space (must support import of data in either order)
[Figure: input data layout examples, split and interleaved, showing real and imaginary parts]
Standard defines portable precision specifiers for user-defined types
• Float: at least n decimal digits of accuracy
• Integer: exactly n bits; at least n bits; fastest type of at least n bits

Object-Based Linear Algebra
    qrdObject = vsip_qrd_create_f(M, N, VSIP_QRD_SAVEQ);    /* Allocate memory; set up to use Q later */
    ...
    vsip_qrd_f(qrdObject, A);    /* A = QR */
    vsip_qrd_prodq_f(qrdObject, VSIP_MAT_HERM, VSIP_MAT_LSIDE, w);    /* w = Q^H w */
• Algorithm setup is done before inner loop calls
• Implementation is free to choose algorithm
  – Gram-Schmidt or Householder method could be used
  – Choice can be made by create call

VSIPL Profiles
• VSIPL Profiles provide functionality for specific areas
  – Core Lite targeted at vector signal processing
  – Core targeted at adaptive signal processing
[Figure: functionality covered by the Core Lite and Core profiles: float, complex, signed int types; FFT, FIR filters; vector arithmetic; matrix arithmetic; random numbers; convolution; correlation; matrix decomposition and solvers]

Outline
VSIPL is an API that is…
• Portable
• Object-based
• For signal and image processing applications
• On embedded platforms
  – Early binding
  – Separate memory spaces
  – Separate development and performance modes

Early Binding
Principle of Early Binding: Allocate resources for an operation as early as possible for better performance
Examples:
• FFT
  – Setup phase: calculate coefficients and store in FFT object
  – Calculation phase: calculate the FFT using stored coefficients
• Object and data memory allocation
  – Setup phase: allocate block and views and bind to data
  – Calculation phase: operate on data

VSIPL Data Spaces
VSIPL has two logical memory spaces
User Data Space
• User manipulates data using
  – Direct access
  – I/O functions
  – Other math libraries
  – Communication libraries (e.g. MPI, MPI/RT)
• VSIPL will not operate on data in user space
VSIPL Data Space
• User manipulates data using VSIPL functions (only)
• Memory hierarchy details hidden
• Implementation may optimize memory use
  – Chaining
  – Deferred execution
  – Strip-mining
These logical spaces may be the same physical address space, depending on the implementation

Example VSIPL Implementations
[Figure: workstation, User Space = VSIPL Space; digital signal processor, User Space (DRAM) and VSIPL Space (SRAM); vector processor, User Space (interleaved complex) and VSIPL Space (split complex)]
• Workstation
  – No special memory management needed
• Digital Signal Processor, e.g. SHARC
  – User space is in DRAM
  – Library may manage movement of data through SRAM
• Vector Processor, e.g. Altivec
  – User can store data in either complex format
  – Library can store data internally in best format
• VSIPL code is portable to different platforms
• Vendor can optimize for each platform

VSIPL Error Checking
• VSIPL provides separate modes for debugging and for deployment
  – Vendors may provide either or both modes
  – May be one library or two
  – Operate the same except for error reporting and timing
• Development mode
  – Extensive error checking
  – All errors are fatal
• Production mode
  – Expected to be faster
  – Implies no error checking
  – Programming errors may have unpredictable results

Summary
• VSIPL was designed to be
  – Portable, without sacrificing performance
  – Object-based
  – Useful for signal and image processing
  – Targeted at embedded systems
• Other talks today will explore VSIPL in more detail
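
The offset, length, and stride arithmetic from the Blocks and Views slide can be sketched in plain C. This is only an illustration and does not use the VSIPL API: the types block_t, vview_t, and mview_t and the functions vview_at, vview_get, mview_get, and vview_add are hypothetical names invented here, and real VSIPL blocks and views are opaque objects whose layout is private to the implementation. The assert calls loosely mimic the development-mode behavior described in the talk, where every error is fatal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a VSIPL block: a contiguous storage area.
   Real VSIPL blocks are opaque; this layout is an assumption. */
typedef struct {
    float *data;
    size_t size;        /* number of elements in the block */
} block_t;

/* Hypothetical vector view: offset, length, and stride into a block. */
typedef struct {
    block_t *block;
    size_t offset;      /* offset from start of block */
    size_t length;      /* number of elements in the view */
    size_t stride;      /* spacing between elements in the view */
} vview_t;

/* Hypothetical matrix view, using the slide's terminology. */
typedef struct {
    block_t *block;
    size_t offset;
    size_t row_length;  /* elements per row */
    size_t row_stride;  /* spacing between elements within a row */
    size_t col_length;  /* elements per column */
    size_t col_stride;  /* spacing between elements within a column */
} mview_t;

/* Address of element i of a vector view; out-of-range access is fatal. */
float *vview_at(const vview_t *v, size_t i) {
    assert(i < v->length);
    size_t idx = v->offset + i * v->stride;
    assert(idx < v->block->size);
    return &v->block->data[idx];
}

float vview_get(const vview_t *v, size_t i) {
    return *vview_at(v, i);
}

/* Element (row, col) of a matrix view. */
float mview_get(const mview_t *m, size_t row, size_t col) {
    assert(row < m->col_length && col < m->row_length);
    size_t idx = m->offset + row * m->col_stride + col * m->row_stride;
    assert(idx < m->block->size);
    return m->block->data[idx];
}

/* Elementwise r = a + b through views, in the spirit of the slide's
   vsip_vadd_f(x, y, y); the output view may alias an input view. */
void vview_add(const vview_t *a, const vview_t *b, const vview_t *r) {
    assert(a->length == b->length && b->length == r->length);
    for (size_t i = 0; i < r->length; ++i) {
        *vview_at(r, i) = vview_get(a, i) + vview_get(b, i);
    }
}
```

With a nine-element block holding 1 through 9, a vview_t of {offset 1, length 4, stride 2} reads 2, 4, 6, 8, and an mview_t of {offset 0, row length 3, row stride 1, column length 3, column stride 3} reads the block as the 3 by 3 matrix on the slide. Both views share the same storage, so writing through one is visible through the other.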