Visualisation of Large Datasets with Houdini

Transcription

Visualisation of Large Datasets with Houdini
Ben Simons
Data Arena Lead Developer
University of Technology, Sydney
[email protected]
New UTS Broadway Building
UTS Data Arena
~ April 2014
Today's Outline - Big Data
1. Some strategies used in Film Visual FX
2. Visualisation Techniques in Houdini
3. VFX Data Formats & Disk Systems
Happy Feet 2
● 2 Petabytes (2,000,000 GB)
● 3D Stereo HD images
● Render: 18,000 CPU cores
● Parallel access to data
● HDF5 data on Bluearc & Isilon NAS Disk Systems
● Linux software: Maya, Houdini, Naiad, Nuke, 3Delight
● Entirely made at Carriageworks in Sydney at Dr D Studios
Resident Evil 3 Extinction
● The Desert Undead: 18-layer images (Rman AOVs)
● Each single image frame was split into 96 tiles
● Rendered on 96 machines, then each frame tile-joined
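The split-and-join idea on this slide can be sketched in a few lines of Python. This is only an illustration: the 12 x 8 grid is an assumption (the slide gives the tile count, not the layout), and `render_tile` is a hypothetical stand-in for a real renderer.

```python
# Sketch only: the 12 x 8 grid and render_tile() are assumptions for illustration.

def tile_bounds(width, height, cols, rows):
    """Yield (x0, y0, x1, y1) pixel rectangles that cover the frame exactly."""
    for r in range(rows):
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            y0 = r * height // rows
            y1 = (r + 1) * height // rows
            yield (x0, y0, x1, y1)

def render_tile(bounds):
    """Hypothetical renderer: returns rows of pixel values for one tile."""
    x0, y0, x1, y1 = bounds
    return [[(x, y) for x in range(x0, x1)] for y in range(y0, y1)]

def join_tiles(width, height, tiles):
    """Paste rendered tiles back into one full frame."""
    frame = [[None] * width for _ in range(height)]
    for (x0, y0, x1, y1), pixels in tiles:
        for dy, row in enumerate(pixels):
            frame[y0 + dy][x0:x1] = row
    return frame

WIDTH, HEIGHT = 192, 96          # small stand-in resolution
bounds = list(tile_bounds(WIDTH, HEIGHT, cols=12, rows=8))
# Each tile is independent, so each could be sent to a different machine.
tiles = [(b, render_tile(b)) for b in bounds]
frame = join_tiles(WIDTH, HEIGHT, tiles)
```

Because no tile depends on any other, the list comprehension that renders them is the part that would be farmed out across machines.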
Houdini
www.sidefx.com
Houdini across 2 screens
Houdini Object Nodes
Houdini Procedural Network
Houdini Parameters
Houdini Chops
● Channel is a column of data
● Plain textfiles ok – separate columns with tabs
● Interactive Channel graph (zoom in)
● Visual programming
● Filtering, Sampling, shading, instancing, and rendering
● Hands-on tomorrow will be Chops & Vops
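To make "a channel is a column of data" concrete, here is a small Python sketch that reads tab-separated text into named channels. It is not Houdini's API; the header row and the sample values are assumptions made up for the example.

```python
import csv
import io

# Assumption: the first row holds channel names; the sample data is invented.
SAMPLE = "time\ttemp\tpressure\n0\t20.5\t101.2\n1\t21.0\t101.0\n2\t21.4\t100.7\n"

def load_channels(text):
    """Return {channel_name: [float samples]} from tab-separated text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    names = next(reader)
    channels = {name: [] for name in names}
    for row in reader:
        for name, value in zip(names, row):
            channels[name].append(float(value))
    return channels

channels = load_channels(SAMPLE)
```

Each key of `channels` corresponds to one column of the text file, which is the shape of data a channel graph plots.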
Spitzer Glimpse Dataset
http://data.spitzer.caltech.edu/popular/glimpse/20070416_enhanced_v2/source_lists/south/
Spitzer Space Telescope
GLIMPSE Dataset
● South: ~300 files, 78 different Channels, 145K rows
● gzipped .tbl data loaded into Houdini
● Houdini Chops used to filter & calc 'colours'
  – Show difference of infra-red magnitude bands
● Point colours and scales calculated by VOPs SIMD Shaders
● Houdini Movie Rendered (Mantra PBR)
  – 36M points, filtered <12M
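The filter-then-difference step can be sketched with NumPy, which applies each operation to a whole column at once in the same spirit as a SIMD shader. The magnitudes below are invented, and using NaN to mark a missing measurement is an assumption of this sketch, not a statement about the GLIMPSE file format.

```python
import numpy as np

# Invented magnitudes for five sources in two infra-red bands.
mag_a = np.array([12.1, 10.4, np.nan, 13.0, 9.8])   # a shorter-wavelength band
mag_b = np.array([11.6, 10.5, 12.2, np.nan, 9.1])   # a longer-wavelength band

# Filter: keep only sources measured in both bands.
valid = ~np.isnan(mag_a) & ~np.isnan(mag_b)

# 'Colour' = difference of magnitudes, computed for all rows at once.
colour = mag_a[valid] - mag_b[valid]

# Map colour to a 0..1 value that could drive point colour or scale.
span = colour.max() - colour.min()
shade = (colour - colour.min()) / span
```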
Shading & VOPs
● A shader is a mini-program which makes data
● It can be better to generate data than load it.
● Shaders allow additional level of management
● Geom shaders on HF2 generated 1 billion snow particles per image frame (impossible to load).
● Houdini VOPs are SIMD
Houdini VOP Network
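A rough sketch of "generate rather than load", in Python with NumPy. The uniform random positions are only a placeholder for whatever a real geometry shader computes; the point is that seeding by frame number makes the data reproducible on demand, so it never has to be stored.

```python
import numpy as np

def generate_particles(frame, count):
    """Procedurally generate `count` particle positions for a frame.

    Seeding with the frame number makes the result repeatable, so nothing
    has to be stored on disk. (Uniform random positions are a stand-in for
    whatever a real geometry shader would compute.)
    """
    rng = np.random.default_rng(seed=frame)
    return rng.random((count, 3), dtype=np.float32)

# Storage that loading would need: 1 billion particles x 3 floats x 4 bytes.
bytes_per_frame = 1_000_000_000 * 3 * 4      # 12 GB per frame

a = generate_particles(frame=42, count=1000)
b = generate_particles(frame=42, count=1000)  # same seed, same particles
```

At 12 GB per frame for positions alone, regenerating the particles at render time is cheaper than moving them through disk and network.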
Instancing
● Saves Memory & I/O by re-using geometry
● Copies generated at render time
● Each Instance can be varied based on point attributes
● Referencing one “instance object” provides a massive data reduction
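Instancing can be sketched as one shared prototype plus a small record per point, with the copies produced lazily. The triangle prototype and the attribute names `P` and `scale` are chosen for illustration only; this is not how a renderer is implemented internally.

```python
# One shared prototype: a unit triangle (invented example geometry).
PROTOTYPE = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# Per-instance point attributes: position and uniform scale.
points = [
    {"P": (0.0, 0.0, 0.0), "scale": 1.0},
    {"P": (5.0, 0.0, 0.0), "scale": 2.0},
    {"P": (0.0, 3.0, 1.0), "scale": 0.5},
]

def expand_instances(prototype, points):
    """Yield transformed copies one at a time, as a renderer might."""
    for pt in points:
        px, py, pz = pt["P"]
        s = pt["scale"]
        yield [(px + s * x, py + s * y, pz + s * z) for x, y, z in prototype]

copies = list(expand_instances(PROTOTYPE, points))

# Values stored: one prototype plus 4 numbers per point, versus full copies.
stored_instanced = len(PROTOTYPE) * 3 + len(points) * 4
stored_copied = len(points) * len(PROTOTYPE) * 3
```

With a three-vertex prototype the saving is small (21 values against 27), but the instanced cost grows by only 4 numbers per extra point, however detailed the prototype is.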
Adaptive Meshes, LOD, Caching & Filtering
● Data reduction techniques
● Level of Detail (distance from camera)
● Adaptive Meshes
● Cache common files locally
● Filter texture (images) - Mipmapping
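Mipmapping, the last item above, can be sketched as repeatedly averaging 2x2 blocks of texels. This minimal version assumes a square greyscale image whose side is a power of two and uses a plain box filter; production renderers typically use better filters.

```python
import numpy as np

def build_mip_chain(image):
    """Return [level0, level1, ...], each half the size of the previous.

    Assumes a square greyscale image whose side is a power of two.
    """
    levels = [image]
    while levels[-1].shape[0] > 1:
        src = levels[-1]
        h, w = src.shape
        # Average each 2x2 block into one texel (box filter).
        smaller = src.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(smaller)
    return levels

texture = np.arange(64, dtype=np.float64).reshape(8, 8)
mips = build_mip_chain(texture)
```

A renderer then reads from a smaller level when the surface is far from the camera, which ties mipmapping to the level-of-detail item on the same slide.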
Other tricks: Baked Lighting & Shadows
● Pre-calculate lighting & shadows
● “bake” new textures & reapply onto geom
● Sydney Harbour Multi-Beam Sonar Survey, 30cm data.
● Interactive 3D Flythrough
Know ur Limits: Memory & I/O
● I/O will Bottleneck - Partition the problem & then scale it up
  – Split job across many independent machines (eg. render)
  – Segment data access for each machine (eg. HDF5)
● Alternate memory hardware
● Vector (array) processor - SIMD
  – as in Cray, now Intel SSE/MMX and Nvidia GPU
  – IBM Cell Processor has Vector Processor
● Content-Addressable Memory
  – “associative arrays” are used by Network Routers
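"Partition the problem" can be sketched as dividing a range of work items (frames, or rows of a dataset) into contiguous slices, one per machine. The item and machine counts below are arbitrary examples.

```python
def partition(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous, near-equal slices."""
    base, extra = divmod(n_items, n_workers)
    slices = []
    start = 0
    for w in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if w < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices

# e.g. 1000 frames (or dataset rows) spread over 96 machines
jobs = partition(1000, 96)
```

Each machine then reads only its own slice, so the slices can also serve as the per-machine segments of data access mentioned above.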
Types of System Memory
● Virtual Memory
● Swapping is good, thrashing is bad
SMP vs MPI
● SMP Symmetric Multiprocessing: Multiple CPUs with common/shared memory. Multi-threaded apps. eg. Intel Xeon, Core 2 Duo are SMP.
  – Cache coherency, snooping bus (on distributed SM)
  – ccNUMA
● MPI (Message Passing) PVM Clusters, Beowulf, etc (Memory not shared)
Data Formats
● HDF5 “Hierarchical Data Format”
● www.hdfgroup.org
● Browsable container of data (HDFView)
● Has “groups & datasets” like “dirs & files”
● Data stored in B-Trees
● Can also store Binary Data
● HDF5 for Python www.h5py.org
● Operate on HDF5 data via Python dictionaries & NumPy arrays - www.numpy.org
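A short h5py sketch of the "groups & datasets like dirs & files" idea. The group, dataset, and attribute names are invented, and the file is kept in memory (the `core` driver with `backing_store=False`) so nothing is written to disk.

```python
import h5py
import numpy as np

# In-memory HDF5 file (nothing written to disk); all names below are invented.
with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
    survey = f.create_group("survey")                      # like a directory
    survey.create_dataset("magnitudes",                    # like a file
                          data=np.arange(12.0).reshape(4, 3))
    survey.attrs["units"] = "mag"

    names = list(f["survey"].keys())          # dictionary-style browsing
    dset = f["survey/magnitudes"]             # path-style lookup
    shape = dset.shape
    row = dset[1, :]                          # reads only this slice, as NumPy
    units = f["survey"].attrs["units"]
```

Slicing the dataset reads only the requested part, which is what lets each machine in a render farm pull just its own segment of a large file.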
Disk Systems
● Network Attached Storage (NAS)
● Bluearc (now Hitachi) implemented via FPGA
● Isilon (now EMC) clustered filesystem, 100GB/s
  – Multiple SSD nodes & maintains global file coherency
● Lustre Filesystem
● Experimental Parallel distributed filesystem – can have multiple copies of a file, one master.
● Venti (Bell Labs Plan-9 & Inferno)
  – WORM Archive. Shares Blocks by secure SHA-1 Hash.
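The "shares blocks by SHA-1 hash" idea can be sketched as a toy content-addressed store in Python. This is not Venti's implementation or protocol; the class and function names are hypothetical and the 4-byte block size is deliberately tiny so the sharing is easy to see.

```python
import hashlib

class BlockStore:
    """Toy write-once store: blocks are addressed by the SHA-1 of their content."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self._blocks.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blocks[key]

    def __len__(self):
        return len(self._blocks)

def store_file(store, data, block_size=4):
    """Split data into fixed-size blocks; return the list of block keys."""
    return [store.put(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

store = BlockStore()
keys_a = store_file(store, b"AAAABBBBCCCC")
keys_b = store_file(store, b"AAAAXXXXCCCC")   # shares two blocks with file a
restored = b"".join(store.get(k) for k in keys_b)
```

The two files contain six blocks between them, but only four distinct blocks are stored.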
Data Formats 2
● OpenVDB www.openvdb.org
● Hierarchical structure for volumetric data (“clouds”)
● Good for sparse volumetric time-varying data
● Fast access (constant-time) to voxels
● Large set of operators (Level Set tools, filters, transforms & morphological operators)
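Why a hierarchical structure suits sparse volumes can be shown with a two-level toy grid in Python. This is not OpenVDB's data structure or API (VDB uses a deeper tree with more machinery); the class name and the 8-voxel block size are assumptions of the sketch.

```python
BLOCK = 8  # voxels per block side; an arbitrary choice for this sketch

class SparseVolume:
    """Two-level sparse grid: a dict of 8x8x8 blocks, allocated on first write."""

    def __init__(self, background=0.0):
        self.background = background
        self.blocks = {}

    def _split(self, x, y, z):
        # Block key (coarse level) and offset within the block (fine level).
        key = (x // BLOCK, y // BLOCK, z // BLOCK)
        offset = ((x % BLOCK) * BLOCK + (y % BLOCK)) * BLOCK + (z % BLOCK)
        return key, offset

    def set(self, x, y, z, value):
        key, offset = self._split(x, y, z)
        if key not in self.blocks:
            self.blocks[key] = [self.background] * BLOCK ** 3
        self.blocks[key][offset] = value

    def get(self, x, y, z):
        key, offset = self._split(x, y, z)
        block = self.blocks.get(key)
        return self.background if block is None else block[offset]

vol = SparseVolume()
vol.set(3, 4, 5, 1.0)
vol.set(1000, 2000, 3000, 0.5)   # far away: only one more block is allocated
```

Empty space costs nothing, and a lookup is one dictionary access plus one list index regardless of how large the volume's bounding box is.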
Data Formats 3
● Disney Ptex eliminates uv texture assignment
● http://ptex.us/
● no (u,v)'s required! no seams visible
● works on sub-d/poly faces
● Stores face adjacency data & filters
● Efficiently stores 10^6 mipmapped texture files
● Multi-channels, compressed separately
● Used in Disney's “Bolt”
“D3” Data-Driven Documents
● D3 – An amazing Data visualisation web framework (JavaScript)
● http://d3js.org
● See: https://github.com/mbostock/d3/wiki/Gallery
● Offers Parallel Coordinates
● Demo ? Nutrient Contents - An interactive visualization of the USDA Nutrient Database. http://exposedata.com/parallel/
Parallel Co-ordinates
protein, calcium, sodium, fibre, vitamin c, potassium, carbohydrate, sugar, fat, water, calories, saturated, ...
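The core of a parallel-coordinates plot is small: each column gets its own vertical axis, each value is normalised to that axis, and each row becomes one polyline across the axes. The Python sketch below shows only that mapping (not D3 and not drawing); the three rows of values are invented, with column names borrowed from the axis labels above.

```python
# Invented sample rows; the column names are taken from the axis labels above.
rows = [
    {"protein": 10.0, "calcium": 120.0, "sodium": 5.0},
    {"protein": 25.0, "calcium": 20.0, "sodium": 400.0},
    {"protein": 2.0, "calcium": 70.0, "sodium": 50.0},
]
axes = ["protein", "calcium", "sodium"]

def to_polylines(rows, axes):
    """Map each row to [(axis_index, height 0..1), ...] for plotting."""
    lo = {a: min(r[a] for r in rows) for a in axes}
    hi = {a: max(r[a] for r in rows) for a in axes}
    lines = []
    for r in rows:
        line = []
        for i, a in enumerate(axes):
            span = hi[a] - lo[a]
            # A constant column has no range; place it mid-axis.
            t = (r[a] - lo[a]) / span if span else 0.5
            line.append((i, t))
        lines.append(line)
    return lines

polylines = to_polylines(rows, axes)
```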