DATA VISUALISATION
PART 3 – RESOURCES & EXAMPLES
Gaj Vidmar
University Rehabilitation Institute, Republic of Slovenia
&
University of Ljubljana, Faculty of Medicine, Institute for Biostatistics and Medical Informatics
Ljubljana, 13 May 2013
GUIDELINES
Chart selection (ordered by decreasing preference)
► Graph Selection Matrix (S. Few – Perceptual Edge)
► Visualization Options (IBM – Many Eyes)
► Chart Chooser (Juice Analytics)
► Chart Suggestions (A. Abela – Extreme Presentations)
► Selecting the Best Graph Based on Data, Tasks, and User Roles (UPA)
► Selecting a Graph Type (LabWrite – NCSU)
General guides (in no particular order, of varied scope)
► HI-CHART RULES, SUCCESS, HI-NOTATION (R. Hichert – H+P)
► 14 Misconceptions about Charts and Graphs (J. Camoes – ExcelCharts)
► Information visualization: frequently asked questions (J. Camoes)
► Twenty rules for good graphics (R.J. Hyndman, of time series fame)
► 9 Steps to Simpler Chart Formatting (J. Peltier – Peltier Tech)
► Visualization Types, Chart Dos and Don'ts (A. Zoss – Duke U)
► 8 videos: Visualization Training Course (Juice Analytics)
DataVis 3 – Resources & Examples (G. Vidmar)
SOFTWARE: VISUAL ANALYTICS
Free (selection)
► Mondrian (from RoSuDa; accompanying book)
► GGobi (R packages rggobi and cranvas; accompanying book)
► XmdvTool
► Crystal Vision (manual)
► ViVA (author's lesson to high energy physicists re. colours)
► VisiCube
► NDVis
► DAVIS (includes some statistics/machine learning)
► iNZight (a simple modern data analysis system)
► VisuLab (Excel add-in; includes permutation matrices)
► TOPCAT (astronomy)
► ViDaExpert (small, but important author and complex ideas)
Commercial (a selection of the market leaders and major niche products)
► Tableau ("Visual analytics for everyone"; Public version free)
► TIBCO Spotfire (most powerful, integrates statistical analysis)
► QlikView
► Visokio Omniscope
► Panopticon (leader for financial data)
► Macrofocus (InfoScope, ProfilePlot, SurveyVisualizer, TreeMap)
► Grapheur
SOFTWARE: SCIVIS & TOOLKITS
Scientific visualisation (listed in alphabetic order)
► Amira (commercial)
► BioImageXD (free, multidimensional biomedical images)
► IDV (free)
► IGOR Pro (commercial)
► Makai Voyager (commercial)
► Mayavi (free)
► McIDAS-V (free)
► OpenDX (free, older)
► ParaView (free)
► VisIt (free)
Free toolkits (a selection of the most capable and widely used ones)
► D3.js (JavaScript library)
► Processing (programming language & development environment)
► prefuse (Java)
► flare (ActionScript for Flash)
► JFreeChart (Java)
► Weave (Java, requires Tomcat and MySQL)
► Improvise (Java)
► FusionCharts Free (Flash)
► Google Charts (HTML5 and SVG)
COLOUR
Guidelines
► Escaping RGBland: Selecting Colors for Statistical Graphics (A. Zeileis, K. Hornik, P. Murrell – Comput Stat Data An 2009, 53, 3259-70)
► Limitations of red-green colour scales (G. Aisch; Data Driven Journalism)
Software
► Color Oracle (color blindness simulator, free, cross-platform)
► Color Scheme Designer (free online tool)
► ColorBrewer (free online tool)
► ColorBrewer 2.0 (free online tool)
► Vischeck (simulates colorblind vision) & Daltonize (corrects images for colorblind viewers)
NETWORKS – SOFTWARE
Dedicated (listed in alphabetic order)
► Cytoscape (free, platform-independent)
► Gephi (free, cross-platform)
► Graphviz (free, cross-platform)
► GUESS (free, cross-platform)
► NetDraw (free, Windows)
► NetMiner (commercial, Windows)
► NodeXL (free, Excel add-in)
► Pajek (free, Windows)
► Pnet (free, Windows)
► SocNetV (free, cross-platform)
► StOCNET (free, Windows)
► Tulip (free, cross-platform)
► UCINET (commercial, Windows)
► visone (free, platform-independent)
Library
► igraph (free, cross-platform, for R, Python & C)
Statistical/DM with network visualisation & analysis modules
► Skytree Adviser (commercial, free beta)
► Orange (free, cross-platform)
NETWORKS & FLOW – APPLICATIONS
Research groups
Transportation networks (better than animation)
IDV – Illustrating Flow
IDV – Shipping Mix
Dynamic networks – flow visualisation (animation)
Research on Complex Systems – D. Brockmann
Wolfram|Alpha – Data Science of Facebook
Traffic Visualization – Istanbul
Traffic Visualization – Moscow
A century of ocean shipping animated
May be beautiful and produce the Wow effect but add little value (i.e., insight beyond the obvious)
Global flight paths: many people live by the coast & there are some extremely busy hubs (Frankfurt, Atlanta, São Paulo, Beijing) – but this is already known
SHOWCASE: CHARTJUNK
Metaphor:
vs.
Extreme examples:
Example:
vs.
SHOWCASE: LINE CHART
► Effective display of line graphs with 5+ lines
Source data
Typical
Better
The best (small multiples!)
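The small-multiples idea named above (one panel per line, all panels on the same scale) can be sketched without any plotting library. The following Python fragment is only a layout sketch under stated assumptions: the function names `panel_grid` and `small_multiples_layout`, the three-column default and the six made-up series are all hypothetical, and the slide's own source data are not reproduced in this transcription. It assumes at least one series is supplied.

```python
import math

def panel_grid(n_panels, max_cols=3):
    """Return (rows, cols) for a grid holding n_panels, at most max_cols wide."""
    cols = min(max_cols, n_panels)
    rows = math.ceil(n_panels / cols)
    return rows, cols

def small_multiples_layout(series, max_cols=3):
    """Assign each named series to a (row, col) panel and compute a shared y-range."""
    rows, cols = panel_grid(len(series), max_cols)
    all_values = [v for values in series.values() for v in values]
    # One common range for every panel keeps the panels directly comparable.
    shared_ylim = (min(all_values), max(all_values))
    # Panels are filled row by row, in the order the series were given.
    panels = {name: divmod(i, cols) for i, name in enumerate(series)}
    return rows, cols, shared_ylim, panels

# Hypothetical data: six lines that would tangle in a single chart.
series = {f"line {k}": [k + t * (k % 3) for t in range(5)] for k in range(1, 7)}
rows, cols, ylim, panels = small_multiples_layout(series)
print(rows, cols, ylim, panels["line 5"])
```

The shared range is the design choice that matters: if each panel chose its own y-axis, differences in level between the lines would no longer be visible at a glance.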
SHOWCASE: ANNUAL RAINFALL
► Two tales of one dataset
Bar chart:
► emphasises absolute values
► very busy despite the simplest data set
► all data are included (to too many decimals), so axis labels provide no added information
Dot plot:
► displays deviations from the average
► general trend compared to the historical pattern
► average, max and min on record are still shown
► axes switched to keep time on the horizontal axis (as usual)
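The dot plot's emphasis on deviations from the average rests on a simple transformation: subtract the long-run mean from each year's value. A minimal Python sketch, assuming made-up annual totals (the slide's rainfall data are not reproduced in this transcription) and hypothetical helper names `deviations_from_mean` and `text_dot_plot`:

```python
from statistics import mean

def deviations_from_mean(values_by_year):
    """Return (average, {year: value - average}) for a dict of yearly totals."""
    avg = mean(values_by_year.values())
    return avg, {year: v - avg for year, v in values_by_year.items()}

def text_dot_plot(devs, scale=10.0, half_width=20):
    """Render one text row per year: '|' marks the average, 'o' the deviation."""
    rows = []
    for year in sorted(devs):
        # Clamp the offset so extreme years stay inside the row.
        offset = max(-half_width, min(half_width, round(devs[year] / scale)))
        cells = [" "] * (2 * half_width + 1)
        cells[half_width] = "|"
        cells[half_width + offset] = "o"
        rows.append(f"{year} {''.join(cells)} {devs[year]:+.0f}")
    return rows

# Hypothetical annual totals in mm, for illustration only.
rainfall = {2008: 820, 2009: 760, 2010: 905, 2011: 690, 2012: 825}
average, devs = deviations_from_mean(rainfall)
for row in text_dot_plot(devs):
    print(row)
```

For brevity this text rendering runs time down the page, whereas the slide's dot plot keeps time on the horizontal axis; the underlying deviation values are the same either way.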
SHOWCASE: PIES INTO SPAGHETTI
► “The only worse design than a pie chart is several of them” (E. Tufte)
toe - tons oil equivalent
OECD - 30 developed economies, incl. USA and Europe
EME - Emerging Market Economy countries
FSU - Former Soviet Union countries
Other EMEs - EMEs less China and FSU countries
Four factors: year, region, share, toe
Three increasing, three decreasing
The user must scan back and forth between charts
China & other EMEs max. increase; FSU max. decrease
Key information is hidden rather than obvious
US & Europe decreasing; other OECD slight increase
SHOWCASE: BIRTHDAYS
► Graphs & tables, details & trends, raw data & statistical models
► part 1
► part 2
► part 3
SHOWCASE: R
► To See a World (=Circle) in Grains (=Pile) of Sand
1. Original scatter plot
2. Semi-transparent
3. Set axes limits
4. Smaller plot symbols
5. Random subset of data
6. Hexagons (greyscale)
7. 2D kernel density
8. Perspective (OpenGL)
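Two of the overplotting remedies listed for this showcase, plotting a random subset (5) and binning the points (6), reduce to a few lines of data preparation. The sketch below is in Python rather than R and uses only the standard library. It bins into square cells instead of hexagons to stay short, so it illustrates the same idea rather than reproducing the slide's hexagon plot. The function names and the simulated data are assumptions, not taken from the slides.

```python
import random
from collections import Counter

def random_subset(points, k, seed=0):
    """Remedy 5: keep only k randomly chosen points (seeded for reproducibility)."""
    rng = random.Random(seed)
    return rng.sample(points, min(k, len(points)))

def bin_counts(points, bin_size):
    """Square-cell stand-in for remedy 6: count the points falling in each cell."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // bin_size), int(y // bin_size))] += 1
    return counts

# Hypothetical data: 20,000 points from a 2D standard normal distribution.
rng = random.Random(42)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20_000)]

subset = random_subset(points, 1_000)
counts = bin_counts(points, bin_size=0.25)
densest_cell, densest_count = counts.most_common(1)[0]
print(len(subset), len(counts), densest_cell, densest_count)
```

The per-cell counts can then be mapped to grey levels, so that dense regions read as darker cells instead of an undifferentiated black blob.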
SHOWCASE: Iris data, ggplot2, chplot
► The most widely used dataset in statistics, ML, DM, DataVis
► More here in French
SUGGESTED READINGS