SlidesVidmar3
DATA VISUALISATION
PART 3 – RESOURCES & EXAMPLES
Gaj Vidmar
University Rehabilitation Institute, Republic of Slovenia & University of Ljubljana, Faculty of Medicine, Institute for Biostatistics and Medical Informatics
Ljubljana, 13 May 2013

GUIDELINES
► Chart selection (ordered by decreasing preference)
    ► Graph Selection Matrix (S. Few – Perceptual Edge)
    ► Visualization Options (IBM – Many Eyes)
    ► Chart Chooser (Juice Analytics)
    ► Chart Suggestions (A. Abela – Extreme Presentations)
    ► Selecting the Best Graph Based on Data, Tasks, and User Roles (UPA)
    ► Selecting a Graph Type (LabWrite – NCSU)
► General guides (in no particular order, of varied scope)
    ► HI-CHART RULES, SUCCESS, HI-NOTATION (R. Hichert – H+P)
    ► 14 Misconceptions about Charts and Graphs (J. Camoes – ExcelCharts)
    ► Information visualization: frequently asked questions (J. Camoes)
    ► Twenty rules for good graphics (R.J. Hyndman, of time series fame)
    ► 9 Steps to Simpler Chart Formatting (J. Peltier – Peltier Tech)
    ► Visualization Types, Chart Dos and Don'ts (A. Zoss – Duke U)
    ► 8 videos: Visualization Training Course (Juice Analytics)

DataVis 3 – Resources & Examples (G. Vidmar)

SOFTWARE: VISUAL ANALYTICS
► Free (selection)
    ► Mondrian (from RoSuDa; accompanying book)
    ► GGobi (R packages rggobi and cranvas; accompanying book)
    ► XmdvTool
    ► Crystal Vision (manual)
    ► ViVA (author's lesson to high energy physicists re. colours)
    ► VisiCube
    ► NDVis
    ► DAVIS (includes some statistics/machine learning)
    ► iNZight (a simple modern data analysis system)
    ► VisuLab (Excel add-in; includes permutation matrices)
    ► TOPCAT (astronomy)
    ► ViDaExpert (small, but important author and complex ideas)
► Commercial (a selection of the market leaders and major niche products)
    ► Tableau ("Visual analytics for everyone"; Public version free)
    ► TIBCO Spotfire (most powerful, integrates statistical analysis)
    ► QlikView
    ► Visokio Omniscope
    ► Panopticon (leader for financial data)
    ► Macrofocus (InfoScope, ProfilePlot, SurveyVisualizer, TreeMap)
    ► Grapheur

SOFTWARE: SCIVIS & TOOLKITS
► Scientific visualisation (listed in alphabetic order)
    ► Amira (commercial)
    ► BioImageXD (free, multidimensional biomedical images)
    ► IDV (free)
    ► IGOR Pro (commercial)
    ► Makai Voyager (commercial)
    ► Mayavi (free)
    ► McIDAS-V (free)
    ► OpenDX (free, older)
    ► ParaView (free)
    ► VisIt (free)
► Free toolkits (a selection of the most capable and widely used ones)
    ► D3.js (JavaScript library)
    ► Processing (programming language & development environment)
    ► prefuse (Java)
    ► flare (ActionScript for Flash)
    ► JFreeChart (Java)
    ► Weave (Java, requires Tomcat and MySQL)
    ► Improvise (Java)
    ► FusionCharts Free (Flash)
    ► Google Charts (HTML5 and SVG)

COLOUR
► Guidelines
    ► Escaping RGBland: Selecting Colors for Statistical Graphics (A. Zeileis, K. Hornik, P. Murrell – Comput Stat Data An 2009, 53, 3259-70)
    ► Limitations of red-green colour scales (G. Aisch; Data Driven Journalism)
► Software
    ► Color Oracle (color blindness simulator, free, cross-platform)
    ► Color Scheme Designer (free online tool)
    ► ColorBrewer (free online tool)
    ► ColorBrewer 2.0 (free online tool)
    ► Vischeck (simulates colorblind vision) & Daltonize (corrects images for colorblind viewers)

NETWORKS – SOFTWARE
► Dedicated (listed in alphabetic order)
    ► Cytoscape (free, platform-independent)
    ► Gephi (free, cross-platform)
    ► Graphviz (free, cross-platform)
    ► GUESS (free, cross-platform)
    ► NetDraw (free, Windows)
    ► NetMiner (commercial, Windows)
    ► NodeXL (free, Excel add-in)
    ► Pajek (free, Windows)
    ► Pnet (free, Windows)
    ► SocNetV (free, cross-platform)
    ► StOCNET (free, Windows)
    ► Tulip (free, cross-platform)
    ► UCINET (commercial, Windows)
    ► visone (free, platform-independent)
► Library
    ► igraph (free, cross-platform, for R, Python & C)
► Statistical/DM with network visualisation & analysis modules
    ► Skytree Adviser (commercial, free beta)
    ► Orange (free, cross-platform)

NETWORKS & FLOW – APPLICATIONS
► Research groups
    ► Research on Complex Systems – D. Brockmann
    ► Wolfram|Alpha – Data Science of Facebook
► Transportation networks (better than animation)
    ► IDV – Illustrating Flow
    ► IDV – Shipping Mix
► Dynamic networks – flow visualisation (animation)
    ► Traffic Visualization – Istanbul
    ► Traffic Visualization – Moscow
    ► A century of ocean shipping animated
► May be beautiful and produce the Wow effect but add little value (i.e., insight beyond the obvious)
    ► Global flight paths: many people live by the coast & there are some extremely busy hubs (Frankfurt, Atlanta, São Paulo, Beijing) – but this is already known

SHOWCASE: CHARTJUNK
► Metaphor: [figure] vs. [figure]
► Extreme examples: [figures]
► Example: [figure] vs. [figure]

SHOWCASE: LINE CHART
► Effective display of line graphs with 5+ lines
    ► Source data
    ► Typical
    ► Better
    ► The best (small multiples!)

SHOWCASE: ANNUAL RAINFALL
► Two tales of one dataset
    ► Bar chart: emphasises absolute values
        ► very busy despite the simplest data set
        ► all data are included (to too many decimals), so axis labels provide no added information
    ► Dot plot: displays deviations from the average
        ► general trend compared to the historical pattern
        ► average, max and min on record are still shown
        ► axes switched to keep time on the horizontal axis (as usual)

SHOWCASE: PIES INTO SPAGHETTI
► "The only worse design than a pie chart is several of them" (E. Tufte)
► Abbreviations
    ► toe – tons of oil equivalent
    ► OECD – 30 developed economies, incl. USA and Europe
    ► EME – Emerging Market Economy countries
    ► FSU – Former Soviet Union countries
    ► Other EMEs – EMEs less China and FSU countries
► Four factors: year, region, share, toe
► The user must scan back and forth between charts
► Key information is hidden rather than obvious
► Three increasing, three decreasing
► China & other EMEs max. increase; FSU max. decrease
► US & Europe decreasing; other OECD slight increase

SHOWCASE: BIRTHDAYS
► Graphs & tables, details & trends, raw data & statistical models
    ► part 1
    ► part 2
    ► part 3

SHOWCASE: R
► To See a World (=Circle) in Grains (=Pile) of Sand
    1. Original scatter plot
    2. Semi-transparent
    3. Set axes limits
    4. Smaller plot symbols
    5. Random subset of data
    6. Hexagons (greyscale)
    7. 2D kernel density
    8. Perspective (OpenGL)

SHOWCASE: Iris data, ggplot2, chplot
► The most widely used dataset in statistics, ML, DM, DataVis
► More here (in French)

SUGGESTED READINGS
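A note on the SHOWCASE: R slide: two of its eight remedies for an overplotted scatter plot (5. random subset of data, 6. binning) can be illustrated without any plotting library. The following is a minimal Python sketch under stated assumptions, not the R code behind the slide: the data are synthetic, the function names (`make_points`, `random_subset`, `bin_counts`) are hypothetical, and square cells stand in for the slide's hexagons.

```python
import random
from collections import Counter


def make_points(n, seed=0):
    """Hypothetical stand-in data: n points from a 2D standard normal cloud."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]


def random_subset(points, k, seed=0):
    """Remedy 5: keep only k randomly chosen points for plotting."""
    rng = random.Random(seed)
    return rng.sample(points, k)


def bin_counts(points, bin_width):
    """Remedy 6 in spirit: count points per square cell (not hexagons).

    Each point is assigned to the cell whose integer index is
    floor(coordinate / bin_width) along each axis.
    """
    counts = Counter()
    for x, y in points:
        cell = (int(x // bin_width), int(y // bin_width))
        counts[cell] += 1
    return counts


points = make_points(100_000)
subset = random_subset(points, 2_000)        # far fewer symbols to draw
counts = bin_counts(points, bin_width=0.5)   # one number per cell instead
densest_cell, densest_count = counts.most_common(1)[0]
print(len(subset), len(counts), densest_cell, densest_count)
```

The per-cell counts could then be mapped to a greyscale, so that a handful of shaded cells replaces a solid blob of overlapping symbols.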