Using a Webcam as a Game Controller
Making the Pieces Fit Together
Jonathan Blow
Game Developers Conference Reception
October 21, 2002
Seoul, Korea

3D Techniques I Will Discuss
• Level-of-Detail Management (LOD)
• Triangle Strip Generation
• Vertex Cache Optimization
• Normal Map Generation
• Ordered Rendering (sorted output geometry)

How I will discuss them
• You can read about these techniques on the internet: hardware vendor sites, programmer hobbyist sites. There is a lot of hype.
• Most of this stuff is not written by people actually making ambitious games (they’re busy!).
• Most of it is ill-advised.
• I want to provide a hype-free, skeptical review.

Lecture in Three Parts
• Part 1: A Sense of Perspective
  – What is 3D rendering for games, today?
• Part 2: The Techniques
  – Explained by a Skeptic…
• Part 3: Making Games
  – How to use 3D techniques without going out of business or building a horrible game.

Part 1: A Sense of Perspective

3D rendering for games is a complicated subject
• Partially because we have accomplished a lot
  – Recent demos, and a few games, are graphically very impressive
  – (but the demos look much better than the games; why is that??)
• The games all draw worlds by projecting a bunch of triangles onto the screen.

Primary Rendering Paradigm
• Projecting triangles, but very fancy triangles
  – Texture maps, normal maps, complex lighting
• Alternative representations exist
  – NURBS, N-Patches, subdivision surfaces
  – These are used in preprocesses, translated into triangles for the realtime pipeline.

Why have triangles dominated?
They are simple and robust.

Suppose we’re inventing realtime rendering from scratch
• Project every point of a solid object to the screen, use depth buffer
  – We waste a lot of resources drawing everything inside the solid, which will inevitably be hidden!
• Cull out interior points (same result)
• Now we have a bunch of solid 2D shells to draw, but each still has a large number of points
• We want a more compressed way to represent 2D subsets of 3D

Introducing the Triangle
• The triangle is the simplest way to denote a closed region of 2D space.

Start with a point
[Figure: a single point P]
• We have a 0-dimensional space

Define one more point
[Figure: points P and Q]
• Suddenly we have a 1D space!
• That is a lot bigger than 0D.
  P + t(Q-P)

Add a third point
[Figure: points P, Q and R]
• Now we have a 2D space!
  P + t(Q-P) + s(R-P)
• In a way, the concept of a triangle is the same as the concept of two dimensions.

The linearity of the triangle is tremendously useful!
• Easy to:
  – Interpolate
  – Clip
  – Intersection test
  – Bounding volume
• Linear equations are the most basic and well-understood kind (see, for example, linearizing differential equations!)
• If you are doing something unconventional, the triangle probably won’t get in your way.

Higher-order surfaces cause more problems.
• Clipping a curved surface is annoying.
• Bounding volumes are also annoying.
• The offset of a Bezier surface is not a Bezier surface
  – So what happens if spline parameters are your base representation, and you need to offset?
[Figure: the green surface is a spline; the red one is not]

Among linear polygons, triangles are the simplest.
• Quads can be noncoplanar (vertex lighting will fail!)
• Pipeline must handle primitives with varying vertex counts
• Games had brief dalliances with quads / ngons around 1996, but nobody uses them any more to represent general geometry.

In Summary
• The impressiveness of our current graphics techniques depends on us being able to draw a lot of triangles.

Question: “So how do I draw a lot of triangles?”

Part 2: The Techniques
(Answer: “very carefully.”)

Rendering Techniques that people like to hear about… but first:
• There are two basic kinds of 3D techniques
  – #1: We would think about them if we had infinitely fast hardware (e.g. projective transform, BRDF)
  – #2: The kind we only care about because hardware is slow
• Type #2 usually introduces complications, and we need to manage those complications

Drawing a lot of triangles: Reduce Data Size
• Fancy Triangles = big vertices (60 bytes each)
  – XYZ position (12 bytes)
  – Texture UV coordinates (8 bytes)
  – RGBA color (4 bytes)
  – Tangent frame (36 bytes; maybe smaller)
• 180 bytes per triangle if you just list vertices! (5000 triangles = 900 kbytes)
• This makes the hardware run slowly

Indexed Triangle List
• A mesh has a lot of shared vertices
• Put the vertices into an array
• The triangles are described by indices into this array
• Shrinks total amount of data
  – F = 2V; S0 = 3kF; S1 = kV + 3iF; S0 – S1 = V(5k – 6i)
    (F triangles, V vertices, k bytes per vertex, i bytes per index; S0 is the unindexed size, S1 the indexed size)
Bonus: Separates topology from position data
[Figure: mesh with vertices 0 to 4; triangles 0, 1, 2 / 1, 2, 3 / 1, 3, 4]

Triangles in a mesh share not only vertices, but edges too
[Figure: the same mesh; triangles 0, 1, 2 / 1, 2, 3 / 3, 1, 4]

Triangle Strips
• We can compress a list of indexed triangles by forming “strips” that run along the shared edges.
[Figure: strip over vertices 0 to 6]
  As a list: 012, 123, 234, 345, 456
  As a strip: 012, 3, 4, 5, 6

Cost analysis of triangle strips is often somewhat wrong
• 3 indices for the 1st triangle, 1 for each thereafter
• Incomplete because there also needs to be a way to delimit strips
  3 strips: 01234, 567, 89241
  Index buffer: 0123456789241
  But where do they start and end?
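
The size arithmetic and the strip notation from the slides above can be checked with a few lines of Python. This is only a sketch: the 60-byte vertex comes from the slide, but the 2-byte index size and the function names are assumptions made for illustration, and the strip expansion ignores winding order, as the slide notation does.

```python
VERTEX_BYTES = 60  # k: vertex size quoted on the "Reduce Data Size" slide
INDEX_BYTES = 2    # i: assumed 16-bit indices (not stated in the talk)


def unindexed_bytes(num_triangles, k=VERTEX_BYTES):
    """S0 = 3kF: every triangle stores three full vertices."""
    return 3 * k * num_triangles


def indexed_bytes(num_vertices, num_triangles, k=VERTEX_BYTES, i=INDEX_BYTES):
    """S1 = kV + 3iF: one vertex array plus three indices per triangle."""
    return k * num_vertices + 3 * i * num_triangles


def strip_to_triangles(strip):
    """Expand one strip into index triples: each new index forms a
    triangle with the two indices before it (winding is ignored)."""
    return [tuple(strip[n:n + 3]) for n in range(len(strip) - 2)]


if __name__ == "__main__":
    F = 5000      # triangles, as in the slide's example
    V = F // 2    # the slide's approximation F = 2V
    print(unindexed_bytes(F))                        # 900000
    print(indexed_bytes(V, F))                       # 180000
    print(strip_to_triangles([0, 1, 2, 3, 4, 5, 6]))
```

With these assumed numbers the saving S0 – S1 is 720,000 bytes, which matches V(5k – 6i) for V = 2500. The strip expansion reproduces the slide's 012, 123, 234, 345, 456.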
Delimiting Triangle Strips
• Explicitly add numbers to describe strip length
  3 strips: 01234, 567, 89241
  Index buffer: 5012343567589241
• DirectX8-style separate API calls (impact on CPU usage, AND adds numbers behind the scenes)
  Index buffer: 0123456789241
  DrawIndexedPrimitive 0, 5
  DrawIndexedPrimitive 5, 3
  DrawIndexedPrimitive 8, 5
  Output stream: 5012343567589241
• Strips start out worse than lists, and have to catch up… the longer the strip, the better you catch up

Because triangle strips are limited, we need to add swaps
[Figure: strips over vertices 0 to 6, without and with a swap]
  Without swap: 012, 3, 4, 5, 6 = 012, 123, 234, 345, 456
  With swap: 012, 3, 2, 4, 5, 6 = 012, 123, 232, 324, 245, 456

Triangle Strip Efficiency
• Depends on strip length, which depends on your data
• It takes a complicated algorithm to make good strips.
[Figure captions: 4 strips, 40 indices; 10 strips, 52 indices (no swaps yet)]

Triangle Strip Skepticism
• In a full game, performance numbers don’t necessarily validate triangle strips… we’ll see why
• Strips complicate the implementation
• Even with perfect stripping, you only reduce index data (minority of total data) from 6iV to 2iV + 2i. You won’t have perfect stripping.
• Degenerate triangles can cost you.

If you want to make a strip algorithm…
• Most papers give you the basic idea, but are not very good in the end
  – Old SGI source code
  – STRIPE papers
• You really want a non-greedy algorithm
  – Heuristics based on strip length and cache
  – Tunneling operator

Vertex Cache and Vertex Shader
• We want to cache vertex memory for fast access…
• Vertex Shader is a small hardware program that runs for each vertex
  – Compute lighting, transform, skinning, etc
• Hardware caches the results of the evaluated vertex shader
  – A cache miss means running the shader again
  – (More expensive than traditional CPU cache miss!)
[Figure: memory, shader, vertex cache]

You want to order vertices by cache efficiency
• Mostly use vertices you just used recently
• But this conflicts with triangle strip efficiency!
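
To make "order vertices by cache efficiency" concrete, here is a minimal sketch that counts how many times the vertex shader would run for a given index stream. It assumes a simple FIFO post-transform cache. The replacement policy, the cache size and the function name are assumptions made for illustration, not details from the talk or from any particular GPU.

```python
from collections import deque


def count_cache_misses(indices, cache_size=16):
    """Count vertex shader runs for an index stream, assuming a FIFO
    cache of already-shaded vertices (a hit does not refresh an entry)."""
    cache = deque(maxlen=cache_size)  # appending to a full deque drops the oldest entry
    misses = 0
    for idx in indices:
        if idx not in cache:
            misses += 1        # cache miss: the shader runs again for this vertex
            cache.append(idx)
    return misses


if __name__ == "__main__":
    # The same three triangles submitted in two different orders,
    # with a deliberately tiny cache so the effect is visible.
    coherent = [0, 1, 2, 1, 2, 3, 1, 3, 4]
    scattered = [0, 1, 2, 1, 3, 4, 1, 2, 3]
    print(count_cache_misses(coherent, cache_size=3))   # 5
    print(count_cache_misses(scattered, cache_size=3))  # 8
```

Both streams draw the same geometry. Only the submission order differs, and with this toy cache the second order shades three more vertices than the first.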
[Figure caption: Can’t even do the red path in one triangle strip without inserting a teleport (very expensive!)]

Vertex cache effects can be dominant
• Multi-pass rendering: you skin the guy multiple times, so the shader is expensive!
• Or do you skin on the CPU?
• Now you begin to have a lot of optimization choices; these can determine who’s dominant
• The “right answer” depends on your game and target platform

How do we resolve the conflict between strips and cache?
• Maybe you write a triangle stripper that tries to deal with the vertex cache
  – Complicated to write, degraded results on both sides; Nvidia’s does this
• Maybe you ignore vertex caching
  – Might be okay if your shaders are cheap
• Maybe you ignore triangle strips, and just use triangle lists

Quirks of some architectures make strips better
• Nvidia triangle setup (Xbox, etc)
• Nvidia push buffer bottleneck also makes strips more effective.

Now… we need some kind of LOD
• Because even perfect triangle strips / cache hits still draw way too many triangles… we need to go from O(n) to O(log n).
• Several types of LOD available:
  – Dynamic (view-dependent): FORGET IT
  – Static mesh switching (simple)
  – Progressive mesh (best algorithm: VIPM)

View Independent Progressive Mesh
• Collapse vertices according to a base-plane error metric.
• Generate one sequence of collapses that takes us from high-res to low-res.
• Popping in VIPM is subtle, which is good.
• VIPM draws fewer triangles than static switching, since we usually push static switching away in Z to avoid popping.

Problem with VIPM
• VIPM slides a window across the index buffer, doing fix-ups.
[Figure: index buffer, fix-up record]
• Need to sort vertices by LOD collapse order
• This conflicts with strip / cache sorting
• You can’t do all three at once (though you can do stripped VIPM or cache-sorted VIPM)

Sorting Score Card (more items will be added here)
• Triangle strip efficiency order
• Vertex cache order
• LOD collapse order (if VIPM)

Normal Map Generation
• Approximate huge amounts of geometry by per-texel normals
• Generate the maps by crunching a high-res mesh down onto a low-res one…
• When rendering, transform texture normal by iterated tangent frame, and you get the normal of the high-res model (almost)
• Object or tangent space?

Normal Map Generation influences LOD choice
• With static switching, you just have an array of meshes
• With VIPM, you are forced to use object-space normal maps, which probably don’t compress as well as tangent-space maps.
• Normal mapping to a high-res model makes static mesh switching look better (much less popping… most popping was due to light)

More Sorting
• To render quickly, we want to sort by render state (multiple materials on the same object means we break that object into several passes, decreasing triangle strip and vertex cache effectiveness)
• To render quickly, we want to draw front-to-back (fast z-fail)
• To render transparent things correctly, we need to draw those back-to-front (break these into a separate pass, decrease stripping and cache effectiveness)
• We are robbing ourselves of the benefits we got earlier… so hopefully we didn’t pay very much for them (more on this later)

Sorting Score Card
• Triangle strip efficiency order
• Vertex cache order
• LOD collapse order (if PM)
• Sort by shader
• Front-to-back (opaque things)
• Back-to-front (translucent things)

How do you LOD a guy with multiple materials?
• Materials usually done by one pixel / vertex shader pair, per material
• Can only combine triangles so much (can’t cross material boundary)
• Can’t combine textures into one (lose lighting effects)
• Everybody just kind of punts… this is an important problem to solve for the future.

Part 3: Making Games

Trade-Offs
• As computer scientists and engineers we are accustomed to the idea of engineering tradeoffs (time for space, etc)
• Must consider code complexity to be a FINITE RESOURCE that can be traded with time, space, etc.

Complexity as Resource
• Every extra line of code or ‘if’ statement must be maintained through the life of the project and must interact with new features
  – IMPORTANT: most new features are not orthogonal; they will FIGHT with your existing code.
• You only have so much complexity to spend over the course of your project; too much and your project will fail.

Cultural Problem
• At least in America, many programmers try to prove themselves by doing complicated, impressive-sounding things.
• Try to make 3D engine that is the “next big cool thing”
• The successful paths of the past have been things that are NOT complicated (triangles are simple!)
• Successful paths of the future will probably also be the simpler ones.

So… A Thought
• If your engine / algorithms seem very complicated…
  – …they are unlikely to be on a path that history will make successful
  – They will NOT be the next big thing

Good Art and Levels are more important than a good engine
Max Payne
• If you are adding engine features that make it more difficult to create levels / content (without making the content a lot richer), this is probably a mistake

Cost-Benefit Analysis
• Don’t forget to account for opportunity cost… every minute you spend working on A is a minute not working on B
• You need to be an economist, deciding how to get the most net worth out of the resources you have to spend.
• YOU need to do it, not just the managers
  – It is a multiscale (fractal) phenomenon

So how is a game different from a demo?
• A demo is free to isolate itself to a small group of effects that “fit together”
  – Examples? (stencil w/out transparencies, high-poly without dynamic shadows, etc)
• Games have gameplay ramifications
  – You MUST support the game design, make all the disparate pieces connect
  – “Is this thing in shadow?”

My Personal Choices
• Triangle lists, vertex cache optimized
  – No triangle strips
• Static-switching LOD
  – No progressive mesh
I try to be as simple as possible in graphics, reserving complexity for things like AI or physics.

References
• VIPM, triangle strips, vertex cache:
  – www.cbloom.com/3d/techdocs/vipm_topics.txt

The End
Questions?