Sami Pekkala
Usability evaluation of design solutions for tablet magazines
Department of Media Technology
Thesis submitted for examination of the degree of
Master of Science in Technology
Espoo, August 14th, 2012
Thesis supervisor: Professor Pirkko Oittinen
Thesis instructor: Mikko Kuhna, M.Sc.
“Note the written instructions on how to use the interface,
which are always a sign of trouble.”
Jakob Nielsen
Aalto University School of Science
ABSTRACT OF THE MASTER’S THESIS
Author
Sami Pekkala
Title of the thesis
Usability evaluation of design solutions for tablet magazines
Date
August 14th, 2012
Language
English
Number of pages
96
Degree Programme
Degree Programme in Automation and Systems Technology
Department
Department of Media Technology
Professorship
T-75
Supervisor
Professor Pirkko Oittinen
Instructor
Mikko Kuhna M.Sc
Abstract
The aim of this thesis is to evaluate the usability of tablet magazines. The content of the magazines is the same, but the design solutions (layout, structure and
interaction possibilities) vary. A formative usability evaluation was done to find usability problems and a summative evaluation was carried out to compare and rank
the magazines. The main emphasis of the usability evaluation is on user testing and
eye-tracking.
The field of research is digital publishing, especially magazines for tablet computers.
Print sales are declining and publishers are keen to find new means to approach the
consumer. The digital publishing processes and business models for mobile
devices have not yet been established. Web-based and image-based magazines are
both common. Semi-automatic computational layout is presented as a publishing
technique that can reduce the human effort of converting content for various screen sizes. Before deciding on how to publish a tablet magazine, it is important to evaluate
how the different design solutions affect the usability of a magazine.
A theoretical background for this study is presented in the beginning. Usability is
defined to consist of the effectiveness, efficiency and satisfaction of the user interface. Usability is found to be dependent on context, i.e. users, tasks and environment. The usability evaluation methods (UEMs) chosen for this study include think aloud, performance
measures, questionnaires and quantitative and qualitative eye-tracking analysis.
The summative evaluation results of this study show that a web-based version of a
magazine with dynamic, semi-automatic layout has better usability than the others.
An image-based version with static manual layout is second and another version with
dynamic layout is third in terms of usability. Design solution suggestions for an even
more usable tablet magazine are made as a result of the formative evaluation.
Keywords
Usability, tablet, tablet computer, magazine, eye-tracking
Aalto University School of Science
ABSTRACT OF THE MASTER’S THESIS (FINNISH VERSION)
Author
Sami Pekkala
Title of the thesis
Tablettilehtien designratkaisujen käytettävyysarviointi (Usability evaluation of design solutions for tablet magazines)
Date
14.8.2012
Language
English
Number of pages
96
Degree Programme
Degree Programme in Automation and Systems Technology
Department
Department of Media Technology
Professorship
T-75
Supervisor
Professor Pirkko Oittinen
Instructor
Mikko Kuhna, M.Sc. (DI)
Abstract
The aim of this thesis is to produce knowledge about the usability of four tablet magazines
when the content of the evaluated magazine versions is the same but the design solutions
(layout, structure and interaction possibilities) vary. A formative usability evaluation was
carried out to find usability problems, and a summative evaluation was used to compare the
magazine versions with each other. The evaluation emphasizes user tests and eye-tracking.
The field of research is defined as digital publishing, especially tablet magazines. As the
subscriber numbers of print magazines fall, publishers are trying to find new ways to reach
the consumer. The models for digital publishing processes and businesses have not yet become
established. Web-based and image-based magazines are both common. Semi-automatic
computational layout is presented as a publishing technique that reduces the amount of work
needed to adapt content to different screen sizes. The effect of different design solutions on
usability must be evaluated before deciding how a tablet magazine is published.
In the theoretical background presented at the beginning, usability is defined as consisting of
the effectiveness and efficiency of the user interface and of user satisfaction. Usability also
always depends on context: users, tasks and environment. The chosen usability evaluation
methods are think aloud, performance measurement, questionnaires and quantitative and
qualitative eye-tracking.
The summative evaluation shows that a web-based tablet magazine with dynamic,
semi-automatic layout has better usability than the others. An image-based magazine with
manual layout is second, and the worst in usability is another version of the same magazine
with dynamic layout. The best design solutions in terms of usability are defined with the help
of the formative evaluation.
Keywords
Usability, tablet, tablet computer, magazine, eye-tracking
Acknowledgements
This master’s thesis was done at the Department of Media Technology at Aalto University School of Science. My work was a part of the NextMedia program financed by
Tekes – the Finnish Funding Agency for Technology and Innovation.
My biggest acknowledgements go to my supervisor Professor Pirkko Oittinen, who gave me the
job in the first place and who re-revised this lengthy script (with mercifully few corrections).
I’m most grateful for the opportunity to do my final schoolwork here in your research
group. Also, big up to Mikko Kuhna (M.Sc.) for instructing me through this process.
Keep up the good work with future instructees, but beware betting on football against
them (you might end up losing).
To all 40+ participants in my user tests: Thank you for your time and your eyes.
Subject No. 6, a cute blonde girl, gets the biggest credit for my graduation by feeding
me after work and by gently kicking me to the goal. Finally, I want to thank my family.
Especially my sister, for piloting; and dad, for providing me with valuable insight into
how the elderly use an iPad magazine :).
Espoo, August 10th, 2012 Sami Pekkala
Contents
Abstract in English
Abstract in Finnish
Acknowledgements
List of Figures
List of Tables
Abbreviations
1 Introduction
1.1 Scope and methods
1.2 Aim
1.3 Structure
2 Usability evaluation methods
2.1 Definition of usability
2.2 A method for every need
2.2.1 Formative usability evaluation
2.2.2 Summative usability evaluation
2.3 An overview of common usability methods
2.3.1 Heuristic evaluation and other usability inspection methods
2.3.2 Think aloud (as a usability evaluation method)
2.3.3 Performance measures (as a usability evaluation method)
2.3.4 System Usability Scale and Single Usability Metric
2.3.5 Eye-tracking (as a usability evaluation method)
3 Tablet computers and tablet publishing
3.1 Touchscreen devices
3.2 Tablet computers
3.2.1 Apple iPad tablet computer
3.3 Usability of tablet computers
3.3.1 Direct manipulation in graphical user interfaces
3.3.2 Natural user interface
3.3.3 Previous research
3.3.4 iPad specific research
3.4 Definition of magazine
3.4.1 Definition of tablet magazine
3.5 Tablet magazine publishing
3.5.1 Tablet publishing in Finland
4 Magazine in the tests
4.1 Tietokone magazine
4.2 Retail version
4.3 AnyReader version
4.4 “Fancybox” web-based magazine
4.5 “Photoswipe” web-based magazine
4.6 Structural differences in magazines
5 Experiment setup
5.1 Chosen methods
5.2 Users
5.3 Test setup
5.3.1 Eye-tracking system setup
5.4 Test protocol
5.5 Tasks
6 Analysis
6.1 Task time
6.2 System Usability Scale and Single Usability Metric
6.3 Quantitative eye-tracking
6.4 Think aloud and qualitative eye-tracking
7 Results
7.1 Task time
7.1.1 Total task time
7.1.2 Task browsing time
7.2 System Usability Scale and Single Usability Metric
7.2.1 Satisfaction
7.3 Quantitative eye-tracking
7.3.1 Pupil diameter
7.3.2 Fixation duration
7.4 Think aloud and qualitative eye-tracking
7.4.1 Usability problems
7.5 Summary of results
8 Discussion
8.1 Summary of the summative usability evaluations
8.1.1 Usability implications of task time and satisfaction scores
8.1.2 Quantitative and qualitative eye-tracking result analysis
8.1.3 Low reliability of SUS and SUM scores
8.2 Summary of the formative usability evaluations
8.2.1 Findings from AnyReader version
8.2.2 Findings from retail version (Woodwing)
8.2.3 Findings from Fancybox and Photoswipe versions
8.3 Reliability and validity
8.3.1 Influence of user background
9 Conclusion
References
Appendices
List of Figures
2.1 A model of the ISO standard and the Single Usability Metric
2.2 Screen capture from iViewX-software shows corneal reflection (black crosshair) and pupil (white crosshair)
3.1 Apple iPad2 tablet computer from front, back and side
3.2 Different types of tablet magazine solutions, from left: application-based, web-based, and a compilation magazine
4.1 In WW, headlines in cover and TOC are hyperlinks to the corresponding articles, a ⊕ button opens a pop-up window with additional information inside an article
4.2 The four different image interaction possibilities in WW from top left: a pop-up window, enlarge image to full-screen, show image caption and enlarge image by little
4.3 Different scrollable portions of a page in WW from left: scrollable text column, scrollable image and scrollable article
4.4 Toolbar and functions of four toolbar buttons in WW, from top left: page browser (did not work in the tests), library, homepage and store
4.5 In AR, articles are accessed by tapping a hyperlink in top-level
4.6 Tapping an image in article opens an image carousel in AR, where images of the same article can be browsed
4.7 Tapping the “− button” in AR toolbar shrinks the text size and layout adjusts accordingly
4.8 Toolbar navigation shortcuts in AR: “TOC button” brings user to top-level and “home button” exits to library
4.9 Layout changes in AR after rotating the device 90°
4.10 In FB and PS, articles can be accessed by tapping a hyperlink in TOC
4.11 Image opens to a pop-up window in FB
4.12 The navigation bar in FB and PS
4.13 Layout changes in FB and PS after rotating the device 90°
4.14 After tapping an image in PS, image carousel opens and all images in the article can be browsed
4.15 An overview of the navigational structure of WW magazine
4.16 An overview of the navigational structure of AR magazine
4.17 An overview of the navigational structure of PS and FB magazines
5.1 Test setup
5.2 An Epiphan dvi2usb Frame Grabber window was placed directly under the iPad
5.3 Gesture instructions for novice users
6.1 Screen capture from a video combining think aloud, gestures and eye-tracking
7.1 Average total task times for each task
7.2 Average task browsing times for each task
7.3 Average task times grouped into usability aspects
7.4 Total task times averaged over magazines
7.5 Task browsing times averaged over magazines
7.6 SUS score averaged over magazines
7.7 SUM score averaged over magazines
7.8 Satisfaction score averaged over magazines
7.9 Average satisfaction scores given for each task
7.10 Average task satisfaction scores grouped into usability aspects
7.11 Pupil diameter averaged over magazines
7.12 Fixation duration averaged over magazines
7.13 Amount of negative and positive comments about each magazine
7.14 Total number and different usability problems found from observing the videos
8.1 Eye-tracking shows how correct headline is not “seen” even though quickly looked at, because of more demanding typography below, which was not a headline
List of Tables
2.1 Summary of usability methods
3.1 Technical specifications of Apple iPad2
4.1 The three biggest IT-magazines in Finland
4.2 An overview of the magazine user-interfaces
5.1 Test user statistics
5.2 Task overview
7.1 Key statistics for total task times
7.2 Key statistics for task browsing times
7.3 Results (P(T ≤ t)) of two-tailed t-tests for total task times
7.4 Results (P(T ≤ t)) of two-tailed t-tests for task browsing times
7.5 Results (P(T ≤ t)) of two-tailed t-tests for SUS score
7.6 Results (P(T ≤ t)) of two-tailed t-tests for SUM score
7.7 Key statistics for SUS score
7.8 Key statistics for SUM score
7.9 Key statistics for satisfaction score
7.10 Results (P(T ≤ t)) of two-tailed t-tests for satisfaction score
7.11 Results (P(T ≤ t)) of two-tailed t-tests for average pupil diameters
7.12 Results (P(T ≤ t)) of two-tailed t-tests for average fixation durations
7.13 Correlation coefficients between pupil diameter, task time and satisfaction
7.14 Key statistics for pupil diameter measures
7.15 Key statistics for fixation duration measures
7.16 Correlation coefficients between fixation duration, task time and satisfaction
7.17 The number of positive and negative comments from think aloud and the three most remarked aspects of usability (−/+)
7.18 Total number and different usability problems found from observing the videos
7.19 Most important individual usability problems by magazine
7.20 The order of magazines in all usability measurements
7.21 A “pros and cons” summary of each magazine based on the entire study
8.1 Correlation coefficients between user background and some usability measurements
Abbreviations
HCI   Human-computer interaction
GUI   Graphical user interface
NUI   Natural user interface
TOC   Table of contents
UEM   Usability evaluation method
ISO   International Organization for Standardization
CI    Confidence interval
HTML  Hypertext Markup Language
AR    AnyReader magazine
WW    Woodwing (retail) magazine
FB    Fancybox magazine (and image viewing system)
PS    Photoswipe magazine (and image viewing system)
SUM   Single Usability Metric
SUS   System Usability Scale
Chapter 1
Introduction
The ongoing digital revolution can be considered the greatest change in the print
media business since Gutenberg’s press brought the printing revolution. Production
processes of print media publishers have already been digital for a few decades. The
appearance and quality of a typical printed publication did not change when the production
processes switched from analog to digital. What has changed the game radically is
the advent of multiple new viewing platforms for digital media.
Desktop, laptop and notebook computers and mobile phones can all be used to view
the same digital content. A plethora of new devices implies that there must also be
new usage behaviors for print media. This has been a pitfall for some traditional media
companies; they have not realized that a digital carbon copy of a print publication is
not enough for “digital omnivores”.
The single new device that has changed the media experience most dramatically could
be the iPad, manufactured by Apple Inc. Apple is expected to sell its 100 millionth iPad
this year (2012).¹ The iPad is a tablet computer, defined here as a mobile computer
consisting only of a large, flat touchscreen surface. Tablet computers, or tablets, combine the strengths of the previously mentioned devices from the viewpoint of digital media
consumption. Computers have big enough screens for reading but they are not mobile.
Mobile phones, on the other hand, are always with you, but their small screen size hinders
digital media usage. The iPad’s ten-inch touchscreen, long battery life, instant power-up
from hibernation and wireless data connection make it suitable for media consumption.
Print media publishers have struggled to find a flexible publishing platform to suit the
various screen sizes of the new devices. A 1:1 digital copy of a broadsheet newspaper cannot
be read comfortably on a four-inch mobile phone screen. Then again, the same paper
can be read on a computer display. A computer display with HD resolution has
over three times more pixels than a typical mobile phone display.² Input devices and
usage situations also vary greatly between the devices. For maximum audience
and profit, the same digital publication should be viewable and usable on all devices.
¹ http://www.usatoday.com/tech/news/story/2012-03-03/apple-ipad-sales/53344970/1
² HD (1920 × 1080) / iPhone (960 × 640) ≈ 3.4
1.1 Scope and methods
The main concepts this master’s thesis deals with are tablet computer, magazine, layout,
usability, user studies and eye-tracking. As said before, print media is trying to find new
routes to the consumer as print circulations are slowly declining. In Finland, magazine
circulations went down by 1.2 % and newspaper circulations by 2.6 % during 2009–2010.³
When the same publication is presented on multiple platforms, it is usually preferred
that the content of the publication is the same. Instead, the form should be shaped
according to the device. Due to the relatively small portion of the readership that uses
mobile devices, Finnish publishing houses have incentives to do this alteration of form
with little manpower. In this thesis, two automatic layout systems have been
compared with a manually produced layout.
The viewpoint of the comparison research was that of usability: when the content is the
same, how do the distinct appearance and structure created by the layout systems affect the
usability of the publication? Four different versions of an issue of a Finnish computer
magazine were examined. The viewing device used was Apple’s iPad 2 and the methods
for usability evaluation included heuristic evaluation and user tests with observation,
think aloud, questionnaires, performance measures and eye-tracking.
Even though the focus was on tablet magazine usability, the same usability principles
also apply, to some extent, to touchscreen-equipped mobile phones. This should be
noted because the two tested dynamic layout systems are adaptable to all screen sizes.
In fact, many of the usability principles related to visual aspects are universal and could
be applied even to print magazine layouts, but this is out of the scope of this
study and is not discussed separately.
1.2 Aim
The aim of this thesis is to find out what makes a usable tablet magazine. Usability
problems and bottlenecks in the four magazine versions were identified. The results
can be used to further develop automatic and manual layout of tablet magazines
for a better user experience. In addition, it is important to evaluate and rank the
magazine versions from the perspective of usability. These results can guide publishers
in choosing between the different versions: should they use the effort of graphic designers and
programmers to convert the content for different devices or should they consider a more
automatic approach?
³ http://www.levikintarkastus.fi/uutisia/Levikkitiedote2011.pdf
Previous research about tablet magazine usability is scarce. General tablet usability
and e-reader⁴ studies can be found, but the author is not aware of any research specific to tablet magazines. Neither were studies found about eye-tracking used for tablet
usability evaluations. This thesis fills a gap in the HCI (human-computer
interaction) field by combining tablet (magazine) usability evaluation and eye-tracking.
The main research question of the thesis can be stated as What defines a tablet magazine’s usability? Also, a sub-question, How do the different magazine versions compare
in terms of usability?, is discussed in relation to the research material specifically. These
questions are answered as thoroughly as possible with the methods mentioned before.
The second sub-question, which concerns the rather explorative method, is:
How can eye-tracking be used to evaluate tablet (magazine) usability?
1.3 Structure
In the next chapter, Chapter 2, usability is defined and an overview of usability evaluation methods (UEMs) is given. The methods used in this thesis are discussed in more
detail. Chapters 3 and 4 present the material of this research. Tablet computers, especially the iPad, are introduced and an overview of tablet publishing is given in Chapter 3.
In Chapter 4, the magazine chosen for this research is presented. In addition, the four
different versions of the magazine and their differences are discussed.
Chapters 5 and 6 deal with the experimental research which was conducted. Chapter 5
shows the user test setup and Chapter 6 examines the data analysis methods. Finally,
the results are presented in Chapter 7, and in Chapter 8 the results are evaluated and compared
to previous research, and the research questions are answered. To conclude, Chapter 9
summarizes the research and the obtained results.
⁴ An electronic device for reading e-books. It usually has a black-and-white display and few buttons
(no touchscreen) and thus lacks the versatility of tablets.
Chapter 2
Usability evaluation methods
In this chapter, a definition of usability is given and different usability evaluation methods
are presented and evaluated. Some methods, which are relevant to this research, are
discussed in more detail.
2.1 Definition of usability
A formal and widely used definition for usability can be derived from an ISO standard,
which states:
[Usability is the] Extent to which a product can be used by specified users
to achieve specified goals with effectiveness, efficiency and satisfaction in a
specified context of use. [37]
The ISO standard further defines effectiveness (a task completion measure), efficiency
(a task time measure), satisfaction (a subjective measure of experience of a user) and
context (equipment, environment, tasks and users). Other quality attributes which can
be attached to usability are learnability (easy to learn for a beginner), memorability
(easy to remember for a casual user) and error rate (few and easily recoverable errors)
of the system [50].
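As an illustration of how the three components of the definition could be turned into numbers in a user test, the following minimal Python sketch computes a completion rate (effectiveness), a mean time for completed tasks (efficiency) and a mean rating (satisfaction). The record layout, the field names and the 1–5 rating scale are assumptions made for this example only; they are not taken from the standard or from the test setup of this study.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRecord:
    """One user's attempt at one task (hypothetical record layout)."""
    completed: bool     # did the user reach the goal?
    seconds: float      # time spent on the task
    satisfaction: int   # post-task rating, assumed scale 1 (worst) to 5 (best)


def usability_summary(records):
    """Summarise effectiveness, efficiency and satisfaction for a set of records."""
    if not records:
        raise ValueError("at least one record is required")
    completed = [r for r in records if r.completed]
    return {
        # effectiveness: share of attempts that were completed
        "effectiveness": len(completed) / len(records),
        # efficiency: mean time of completed attempts (None if nothing was completed)
        "efficiency_s": mean(r.seconds for r in completed) if completed else None,
        # satisfaction: mean rating over all attempts
        "satisfaction": mean(r.satisfaction for r in records),
    }
```

Restricting the efficiency figure to completed attempts is one possible design choice; including failed attempts would mix task time with task failure.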
Besides being a quality attribute, the word usability can also be used to mean the process
and methods for improving ease-of-use of a system during product development and
after. Usability, in the latter sense, is a synonym for usability engineering. The Usability
Engineering Lifecycle by Mayhew (1999) divides the usability engineering process into
three distinct phases: requirements
analysis, design/testing/development and installation [46].
In this study, usability is used as a quality attribute of a system or as a part of the HCI
discipline. Usability engineering, usability testing (with users) and usability inspection
(with experts) are used to describe the processes.
2.2 A method for every need
Usability evaluation methods, or UEMs, can be divided into two subsets: usability inspection and usability testing methods. In usability inspection methods, one or several
experts on user interface design and usability examine the system. Usability testing
methods have real users using the system and the usability practitioner’s role is to observe them. Examples of usability inspection methods are heuristic evaluation, cognitive
walkthrough and formal usability inspection. Examples of usability testing methods are
think aloud, performance measurements and eye-tracking.
A summary of usability evaluation methods is shown in Table 2.1 (adapted from Nielsen
(1993) and the last two methods from Pernice (2009) [50, 61]). The table is not exhaustive and some of the methods can be divided further (e.g. heuristic estimation,
retrospective think aloud). To summarize: there are dozens of usability evaluation methods available, each with its own pros and cons. The methods also come into play at
different stages of the usability engineering lifecycle [50].
Because the methods have different advantages and disadvantages, it is advisable to use a set of methods
which complement each other. To choose a set of methods, a usability practitioner has
to apply some criteria in the selection. The next two sections can be thought of as a
starting point for a decision: whether to improve a system’s usability or to compare a system’s
usability with others.
2.2.1 Formative usability evaluation
Besides the self-explanatory quantitative–qualitative categorization of usability evaluation
methods, a formative–summative division can also be used. This partition is based on
the goals of the usability study. Formative evaluation aims at improving the usability
of an interface when the system is developed or revised [50]. The methods described
as formative are typically fast, cheap and simple; due to the fast cycle of a product
development process, usability tests need to be conducted and analyzed quickly.
As a result, formative studies and usability testing can be thought of as unscientific. Usability, in this sense, is reduced to craft, not science [2]. For a usability professional,
this does not diminish the value of formative evaluations. Heuristic evaluation is one
example of a formative method with a high benefit–cost ratio [51]. Other
usability methods normally used for formative evaluation are think aloud, observation, interviews and qualitative eye-tracking.
Table 2.1: Summary of usability methods

| Method | Users | Main advantage(s) | Main disadvantage(s) |
|---|---|---|---|
| Heuristic evaluation | 0 | Finds individual usability problems. Can address expert user issues. | Does not involve real users, so does not find “surprises” relating to their needs. |
| Performance measures | 10 at least | Numerical data. Results easy to compare. | Does not find individual usability problems. |
| Think aloud | 3–5 | Pinpoints user misconceptions. Cheap and easy to conduct. | Unnatural for users. Hard for expert users to verbalize. |
| Observation | 3 or more | Ecological validity; reveals users’ real tasks. Suggests functions and features. | Appointments hard to set up. No experimenter control. |
| Questionnaires | 30 at least | Finds subjective user preferences. Easy to repeat. | Pilot work needed (to prevent misunderstandings). |
| Interviews | 5 | Flexible, in-depth attitude and experience probing. | Time consuming. Hard to analyze and compare. |
| Focus groups | 6–9 per group | Spontaneous reactions and group dynamics. | Hard to analyze. Low validity. |
| Logging actual use | 20 at least | Finds highly used (or unused) features. Can run continuously. | Analysis programs needed for huge mass of data. Violation of user’s privacy. |
| User feedback | Hundreds | Tracks changes in user requirements and views. | Special organization needed to handle replies. |
| Eye-tracking | 6 qual. / 39 quant. | Data about where users look (cannot be acquired by other methods). | Analyzed data does not directly translate to a usability measure. Unreliable and expensive equipment. |
| Card sorting | 15 | Easy, cheap and quick. Can be conducted without an interface. | One-sided data about concept grouping, which can be highly varied. |
2.2.2 Summative usability evaluation
After a product has been developed, it can be compared with competing products.
Rather than improving a product in progress, summative evaluation attempts to rank
the usability of a finished product against others [50]. More time and resources can be
allocated to this than to formative evaluation because summative evaluations are
(usually) done outside the product development cycle.
The results from summative evaluations have to be numeric, so that different systems
can be compared. This creates requirements for the data gathering methods. The number
of subjects in summative usability evaluations has to be relatively high to get significant
and reliable results (see Table 2.1). Also, the measurements and the analysis of the data have
to be standardized across systems to prevent biased results. ISO has
defined usability so that it can be measured summatively with performance measures
(task completion, task time) and a satisfaction questionnaire [38].
In conclusion, summative usability evaluations can be considered more scientific
than formative ones. Examples of usability methods used for summative evaluations include performance measures, questionnaires and quantitative eye-tracking. The formative–summative
categorization of usability methods presented here gives a usability practitioner a starting
point for choosing a method. Next, an overview of the more common
usability evaluation methods is laid out.
2.3 An overview of common usability methods
The most common usability evaluation methods are presented in this section, including
the methods that are relevant to this study. Subsection 2.3.1 is devoted to
usability inspection methods as a whole. Usability testing methods are presented at the
end of this chapter. More space is allocated to them, because usability testing methods are
the most influential and popular UEMs [63].
2.3.1 Heuristic evaluation and other usability inspection methods
Heuristic evaluation (also called expert evaluation) is the most widely used usability
inspection method. Usability inspection is a generic term for methods that involve
evaluation of the usability of a user-interface by experts, not users [53]. Other common
usability inspection methods are cognitive and pluralistic walkthroughs.
Heuristic evaluation is “done by looking at an interface and trying to come up with
an opinion about what is good and bad about the interface”. The name of the method
comes from the set of recognized usability guidelines, the heuristics. Ideally, the evaluator
compares the system against some heuristics or guidelines as the evaluation proceeds,
although the evaluation can also be conducted using intuition and common sense. [50]
The number of guidelines in a heuristic system can be as high as a thousand [71], which
can be intimidating and time consuming. To tackle these problems, Nielsen & Molich
developed the most widely used set of heuristics in 1990 (revised in 1994) [53, 54].
The set contains ten guidelines to be fulfilled by a usable user-interface, presented in the
following list [53].
Visibility of system status
The system should always keep users informed about what is going on, through
appropriate feedback within reasonable time.
Match between system and the real world
The system should speak the users’ language, with words, phrases and concepts
familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.
User control and freedom
Users often choose system functions by mistake and will need a clearly marked
“emergency exit” to leave the unwanted state without having to go through an
extended dialogue. Support undo and redo.
Consistency and standards
Users should not have to wonder whether different words, situations, or actions
mean the same thing. Follow platform conventions.
Error prevention
Even better than good error messages is a careful design, which prevents a problem
from occurring in the first place. Either eliminate error-prone conditions or check
for them and present users with a confirmation option before they commit to the
action.
Recognition rather than recall
Minimize the user’s memory load by making objects, actions, and options visible.
The user should not have to remember information from one part of the dialogue
to another. Instructions for use of the system should be visible or easily retrievable
whenever appropriate.
Flexibility and efficiency of use
Accelerators—unseen by the novice user—may often speed up the interaction for
the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.
Aesthetic and minimalist design
Dialogues should not contain information, which is irrelevant or rarely needed.
Every extra unit of information in a dialogue competes with the relevant units of
information and diminishes their relative visibility.
Help users recognize, diagnose, and recover from errors
Error messages should be expressed in plain language (no codes), precisely indicate
the problem, and constructively suggest a solution.
Help and documentation
Even though it is better if the system can be used without documentation, it may
be necessary to provide help and documentation. Any such information should be
easy to search, focused on the user’s task, list concrete steps to be carried out, and
not be too large.
The evaluator should be a usability specialist who has knowledge about usability in general or about the platform and the particular user-interface, preferably both (a
“double specialist”). Even novice evaluators can find some usability problems in a
user-interface, but usability specialists are more effective at finding them.
A study showed that, on average, five novice evaluators found half, five “regular” usability
specialists found 88 % and five “double” specialists found 97 % of the usability problems
in a user-interface [49]. Furthermore, averaged over six studies, five evaluators found
75 % of the usability problems [53].
According to Nielsen & Landauer (1993), the following equation can be used to predict
the number of usability problems found with a certain number of evaluators:
\[ \mathrm{ProblemsFound}(i) = N \left( 1 - (1 - \lambda)^{i} \right) \tag{2.1} \]
where ProblemsFound(i) is the number of (different) usability problems found by i
evaluators, N is the total number of usability problems, and λ is the proportion of N
found by a single evaluator. Averaged across six studies, the mean λ was 31 % and
the mean N was 41. With some further assumptions on the project size, the benefits of
corrected and the costs of uncorrected usability problems, the optimal (highest benefit–cost
ratio) number of heuristic evaluators can be derived to be 4.4. [51]
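Equation 2.1 can be illustrated with a short calculation. The following is a minimal sketch in Python that uses the mean values quoted above (N = 41, λ = 0.31) as defaults; the function name problems_found and its parameter names are chosen here for illustration only and do not come from the cited papers.

```python
def problems_found(evaluators: int, total_problems: float = 41.0,
                   single_evaluator_rate: float = 0.31) -> float:
    """Expected number of distinct usability problems found (Equation 2.1).

    The defaults are the mean N and lambda quoted in the text; a real
    project would need its own estimates for both.
    """
    if evaluators < 0:
        raise ValueError("evaluators must be non-negative")
    return total_problems * (1.0 - (1.0 - single_evaluator_rate) ** evaluators)


if __name__ == "__main__":
    # Print the predicted number and share of problems for 1 to 6 evaluators.
    for i in range(1, 7):
        found = problems_found(i)
        print(f"{i} evaluator(s): {found:.1f} problems ({found / 41.0:.0%})")
```

Under these assumptions each additional evaluator contributes fewer new problems than the previous one, which is why the benefit–cost ratio peaks at a small number of evaluators.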
The goal of a heuristic evaluation is to find usability problems. Due to its simplicity
and affordability, it is most often used during an iterative product development process
where it is crucial to constantly revise product prototypes [50]. If resources allocated
to a product development project do not allow for any formal usability methods to be
used, it is advised that at least a single heuristic evaluation should be conducted [46].
Besides heuristic evaluation, other widely used usability inspection methods are
walkthroughs, especially cognitive and pluralistic walkthroughs. The cognitive walkthrough concentrates on evaluating how easily a user-interface can be learned by exploring it. Lewis
et al. first introduced the method in a paper in 1990 [42]. The method is based on the
idea that users do not read manuals but rather discover the features they need by exploring
the system [53]. An overview of the cognitive walkthrough process is listed below (adapted
from [53]).
1. Define inputs to the walkthrough
2. Convene analysts
3. Walk through the action sequences for each task
4. Record critical information
5. Revise the interface to fix the problems
During the cognitive walkthrough, evaluators examine the user-interface in the context
of some tasks and use scenarios. The inputs required for an evaluation session are the
interface (usually a paper mock-up), a description of the assumed user population, a task
and use scenario, and a list of actions a user should execute to complete each task (hence
the name walkthrough) [53]. By these means, the evaluators assess whether or not the
sequence of required actions is suitable for the current task.
The pluralistic walkthrough was first developed at IBM in the 1980s and was introduced to the public in an article published in 1991 [5]. It differs from the cognitive walkthrough
by having several evaluators from three different groups: the users, the product developers and the usability specialists. Step by step, the evaluators collaboratively go through
the action sequences of each interface dialogue window as normal users would. Then
they decide whether the actions are appropriate for the task [53].
2.3.2 Think aloud (as a usability evaluation method)
The rest of this section deals with usability testing methods, and the first to be discussed
is think aloud. Think aloud, or thinking aloud, “may be the single most valuable usability engineering method” [50]. It started as a method for psychological research when
Ericsson & Simon claimed in the 1980s that verbal reports could be used as data [26, 27].
They claimed that when properly instructed to think aloud, users “verbalize information
that they are attending to in short-term memory” and that this does not necessarily affect
cognitive processes [26]. In a sense, properly executed think aloud can thus be described
as an indirect way of accessing the user’s mind, or at least the short-term memory.
Ericsson & Simon’s protocol describes strictly one-sided communication: the facilitator
is silent and the test subject speaks. Usability specialists have later adopted the method
to serve as a practical tool for evaluating human–computer interfaces. Think aloud as used
by usability specialists differs from Ericsson & Simon’s protocol by involving more two-way communication, although the user still does most of the talking. Instructions for
facilitating a think aloud session found in usability handbooks are vague; there are no
strict protocols for the think aloud method in usability, which makes the results of studies
harder to compare [7].
During a think aloud session, a participant is simply instructed to use an interface
while continuously thinking aloud. This is usually done while executing some pre-formatted tasks. User comments are usually recorded and transcribed, or they are observed live and noted down. Think aloud produces a great deal of qualitative data from
a small number of users. The drawback is that it might affect performance measurement.
It seems that think aloud slows down complex tasks but does not affect simple tasks,
such as finding information [20]. One handbook recommends not using think aloud and
performance measures at the same time at all, because thinking aloud slows users down considerably
[64]. On the other hand, think aloud has been found to even speed up problem solving in
search tasks, by allowing users to refine their thoughts as they verbalize them [4]. What
is clear is that users’ performance (task time, errors etc.) when thinking aloud cannot
be compared to natural performance, but can be compared to other users’ think aloud
performance.
2.3.3 Performance measures (as a usability evaluation method)
Multiple performance measures can be gathered from a user doing tasks. Nielsen and
Rubin both list 18 (not all the same) quantifiable performance measurements in their handbooks, such as task completion time, the ratio between successful interactions and errors,
the number of user errors, the number of commands or features used by the user and the number of
times the user contacts the help desk [50, 64].
Performance measures are usually easy to obtain with a stopwatch and user observation,
and they produce quantifiable data that is easy to analyze. However, they do not tell why something is difficult for a user. Usability problems cannot be identified by using performance
measurements alone, so they are usually used in summative evaluations (e.g. comparing
competing products) [50].
Usability, as defined by ISO, can be measured with performance measures and a satisfaction questionnaire alone. As noted in Section 2.2.2, usability consists of effectiveness,
efficiency and satisfaction, which can be measured by task completion rate, task time and
a satisfaction questionnaire, respectively [37]. Therefore, performance measures are used
in summative but not in formative evaluations.
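As an illustration of how the two performance measures named above could be summarized for one task, the following Python sketch computes the task completion rate and the mean task time from per-participant results. The function name summarize_task, the input format and the choice to average task time over successful attempts only are assumptions made for this example; they are not prescribed by the standard or by the handbooks cited.

```python
from statistics import mean
from typing import List, Optional, Tuple


def summarize_task(results: List[Tuple[bool, float]]) -> Tuple[float, Optional[float]]:
    """Summarize one task.

    results holds one (completed, task_time_in_seconds) pair per participant.
    Returns (completion_rate, mean_time_of_completed_attempts).
    """
    if not results:
        raise ValueError("at least one participant result is needed")
    completed_times = [time_s for completed, time_s in results if completed]
    completion_rate = len(completed_times) / len(results)
    # Assumption: failed attempts are excluded from the mean task time.
    mean_time = mean(completed_times) if completed_times else None
    return completion_rate, mean_time


if __name__ == "__main__":
    example = [(True, 30.0), (True, 50.0), (False, 90.0), (True, 40.0)]
    rate, time_s = summarize_task(example)
    print(f"completion rate: {rate:.0%}, mean task time: {time_s:.1f} s")
```

Because the same summary is computed for every system, results from different systems can be placed side by side, which is the point of a summative comparison.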
2.3.4 System Usability Scale and Single Usability Metric
System Usability Scale, SUS, is described as “a reliable, low-cost usability scale that
can be used for global assessments of systems usability”. It was developed at Digital
Equipment Corporation in 1986 and has since been widely used in different areas
for its robustness and ease of use.
A SUS questionnaire has ten statements about different aspects of usability. Users mark
their agreement with the statements on a five-point Likert scale. Agreement with half of
the statements implies poor usability, so for these five statements the score contribution is
5 − scale position, and for the other five it is scale position − 1. The sum of the ten score
contributions is then multiplied by 2.5. As a result, a SUS score ranges from 0 to 100, with 100 being
the maximum score for usability. [9]
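The scoring rule described above can be sketched in a few lines of Python. The text only states that five of the ten statements are negative; the sketch additionally assumes the ordering of the standard SUS questionnaire, in which the odd-numbered statements are positively worded and the even-numbered ones negatively worded. The function name sus_score is chosen for illustration.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Compute a SUS score from ten Likert responses (1-5) in questionnaire order."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for index, position in enumerate(responses):
        if index % 2 == 0:
            # Statements 1, 3, 5, 7, 9 (assumed positive): scale position - 1
            total += position - 1
        else:
            # Statements 2, 4, 6, 8, 10 (assumed negative): 5 - scale position
            total += 5 - position
    return total * 2.5


if __name__ == "__main__":
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))
```

A respondent who marks the middle of the scale for every statement scores 50, which shows that a SUS score is not a percentage of "usable" answers but a rescaled sum of contributions.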
Figure 2.1: A model of the ISO standard and the Single Usability Metric [66]
Single Usability Metric, SUM, tries to gather all aspects of usability (as defined by ISO)
into a single score. Task time and subjective satisfaction are the measures of efficiency and
satisfaction, respectively, as defined in the standard [37]. Effectiveness is measured by
two measurements: number of errors and task completion ratio. Previous research has
found that these four measurements of usability would produce the “maximum amount of
information in one score” [67]. Figure 2.1 encapsulates the idea behind SUM.
2.3.5 Eye-tracking (as a usability evaluation method)
Eye-tracking devices enable insight that no other UEM can give: seeing where
users look. The value of this data is based on the eye–mind hypothesis formulated by Just &
Carpenter in 1976, who stated that “the eye fixates the referent of the symbol currently
being processed if the referent is in view” [12]. They arrived at this conclusion when they
found that during task execution, subjects’ eye fixations lasted as long as mental
processes in working memory (50–800 milliseconds) and that subjects always looked at the
object in question when possible. It can be generalized that we look at what we think about,
if the object is in sight. Like think aloud, eye-tracking can in a sense be considered an indirect
mind-reading method.
Eye movements in humans can be divided into convergent, smooth pursuit and saccadic
movements. Convergence and smooth pursuit occur when the eyes have focused
on an object that moves towards or away from the viewer (convergence) or across the field of
view (smooth pursuit). Most eye movement, however, is not smooth at all but sporadic,
consisting of saccades and fixations. During fixations, which typically last 200–600
milliseconds, the eyes are stable. Saccades are quick movements between fixations, which
last 20–100 milliseconds and can reach velocities of 900° per second. During saccades,
the eyes are effectively blind, so humans are able to see only during fixations. [77]
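The large difference in velocity between fixations and saccades is what makes it possible to separate them in recorded gaze data. The Python sketch below illustrates one common approach, a velocity-threshold classification; it is not taken from this study or from the software used in it. The one-dimensional gaze angle, the function name classify_intervals and the threshold of 30° per second are simplifying assumptions made for the example.

```python
from typing import List, Sequence


def classify_intervals(gaze_angles_deg: Sequence[float],
                       sample_rate_hz: float,
                       velocity_threshold_deg_s: float = 30.0) -> List[str]:
    """Label each interval between consecutive gaze samples.

    gaze_angles_deg is a one-dimensional gaze angle per sample (a
    simplification; real data has horizontal and vertical components).
    The threshold is an assumed illustrative value, not a standard.
    """
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    labels = []
    for previous, current in zip(gaze_angles_deg, gaze_angles_deg[1:]):
        # Angular velocity in degrees per second between two samples.
        velocity = abs(current - previous) * sample_rate_hz
        labels.append("saccade" if velocity > velocity_threshold_deg_s else "fixation")
    return labels


if __name__ == "__main__":
    samples = [0.0, 0.01, 0.02, 3.0, 3.01]  # degrees, sampled at 250 Hz
    print(classify_intervals(samples, sample_rate_hz=250.0))
```

Consecutive intervals with the same label would then be merged into fixations and saccades, whose durations can be compared with the typical values given above.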
Eye-tracking devices record saccades and fixations on one part of the field of view. An eye-tracker essentially looks at the user’s eyes to see where they are directed. If the eye-tracker is head-mounted, it has, in addition to the cameras recording the eyes, a camera
pointed forward next to the eyes. As a result, the gaze direction relative to the field-of-view video can
be calculated and visualized when the eye and camera positions are known. Stand-alone
eye-trackers have a fixed position between the eye camera and the monitor on which the gaze
is tracked.
Most modern eye-tracking systems use infrared light and cameras to track the eyes.
Non-collimated infrared light from two light sources is projected towards the eyes, creating
distinct corneal reflections in both eyes beside the pupil¹. Infrared light also increases the
contrast between the pupil and the surrounding cornea and allows the camera to capture
the locations of the corneal reflection and the pupil. After a calibration, the relative position of
these two points (corneal reflection and pupil) can be used to calculate where the eye
is looking. Figure 2.2 is a screen capture from eye-tracking software showing the
locations of the points.
Figure 2.2: Screen capture from iViewX-software shows corneal reflection (black
crosshair) and pupil (white crosshair)
Eye-tracking is a novel UEM and it is still being explored how it can be used to make products usable. The more experimental methods include eye-tracking facilitated automatic
usability testing, remote usability testing with webcams as eye-trackers and retrospective
think aloud with eye-tracking [1, 16, 24]. Website usability testing has been a popular
target for eye-tracking research, partly because the stand-alone eye-tracker used in HCI
environments is cheaper and more accurate than the head-mounted one [25, 55].
¹ http://www.smivision.com/en/gaze-and-eye-tracking-systems/products/red-red250-red-500.html
The problem with using eye-tracking as a UEM is the uncertainty about how eye movements relate
to specific cognitive processes or to the usability of a system [19]. Cooke has published
several papers on the theme (2004–2006) and has arrived at the following conclusions about
eye-tracking: a) a bottom–up approach suits it best (not having any preconceived hypothesis about how eye-tracking relates to cognitive processes) [17, 20]; b) it
is most valuable in qualitative analysis and it should be used with other UEMs, such as
think aloud [18, 20, 60]; and c) some quantitative measures, such as fixation duration,
can be used to evaluate usability [19].
A recent paper suggests that by combining results from several eye-tracking measures,
the mental effort during an HCI task can be measured [14]. Blinks, pupil sizes, fixations
and saccades were measured from participants during tasks in which working memory load
was varied. Results from blink and pupil data as cognitive load indicators agreed with
previous studies. However, saccade and fixation data contradicted previous research by
exhibiting correlation with cognitive load.
All in all, using quantitative eye-tracking as a UEM is somewhat problematic due to
the various possible sources of noise. This is especially so when one tries to keep the
test setup and stimuli as natural as possible, which is usually vital for valid usability
results. For example, dryness of the eyes leads to an increased blink rate and changes in screen
brightness lead to changes in pupil diameter.
Even if these obstacles can be overcome, there remains a difficult question from
the standpoint of formative usability evaluation (see Section 2.2.1): how to fix a system
when it brings about, for example, long fixation durations? Therefore qualitative eye-tracking,
which simply enables usability practitioners to see where users look and what draws
their attention, has traditionally been used as a UEM instead of quantitative eye-tracking. In
this study, both methods were used; how the eye-tracking results were mapped to
usability issues is explained in Chapter 9: Discussion (Section 8.1.2).
Chapter 3
Tablet computers and tablet publishing
A theoretical and methodological base for this study was established in the previous two
chapters. This chapter further narrows the scope of HCI to tablet computers. The device
in question, Apple’s iPad, is examined more closely. At the end of the chapter, an
overview of tablet publishing is given.
3.1 Touchscreen devices
Touchscreens combine the input and output of a computer into a single device by enabling
direct interaction by touching the screen with a finger or stylus. Touchscreen technology
was first introduced in a short paper published in 1965 [39]. It was later used, as first
intended, in flight control.
The first touchscreens implemented a basic capacitive layer over a cathode-ray tube
monitor, which included a mesh of “touch wires” in the front part of the layer and
insulated wires in the back. In the abstract of his article, Johnson (1965) foresaw the
effect of touchscreens on people’s lives and on HCI forty-five years later by stating: “This
device, the ‘touch display’, provides a very efficient coupling between man and machine”
[39].
Along with capacitive, the more common current touchscreen technologies include resistive, surface acoustic wave and infrared. The most popular of these are resistive and
capacitive technologies¹. The development of these technologies has made touchscreens so
accurate and reliable that computers with touchscreens can be used without a mouse and
keyboard.
¹ http://whatistouchscreen.com/
3.2 Tablet computers
A tablet computer consists solely of a touchscreen with a built-in central unit.
Tablet computers, or tablets, differ from touchscreen-equipped mobile phones by having
larger screens and a thinner structure: a typical tablet screen
size ranges from seven to ten inches and the thickness of the device from 10 to 15 mm.
The first attempt to control a computer with a stylus instead of a keyboard was published
in 1957 [23]. In the 1990s, some companies released tablet computers as it became possible
to integrate a touchscreen and a central unit into a mobile device. In 2000, Microsoft released
the first version of the Microsoft Tablet PC, but heavy and faulty tablets were not a viable
alternative to a desktop or laptop computer before Apple’s iPad.
3.2.1 Apple iPad tablet computer
After its initial release in 2010, the iPad has become almost synonymous with the tablet computer. In the first year after its release, three out of four tablets shipped were iPads. For
2012, a recent market forecast predicts that half of the tablets shipped will
be iPads [36]. Other manufacturers are catching up slowly, but individually they still have, at the
highest, only 5 % of the tablet market [28].
Apple iPad is part of the new breed of tablet computers which lack the deficiencies of
older generations. It is lightweight and thin, and the screen is accurate to touch and clear
to look at. Long battery life and fast power-up make it truly mobile. The most important
technical specifications of Apple iPad2 are listed in Table 3.1. In March 2012, Apple
released the third-generation iPad, which has a display resolution of 2048 × 1536.
Apple iPad uses capacitive touchscreen technology. Capacitive touchscreens work by
using skin, which is a conductive material, to change the capacitance of the electric field
on the touchscreen surface. Capacitive touchscreens are thus dependent on skin contact
and, unlike resistive touchscreens, cannot be used with gloves or a stylus. The operating
system in all iPads is the proprietary iOS, which is also used in Apple iPhones. [44]
3.3 Usability of tablet computers
3.3.1 Direct manipulation in graphical user interfaces
The most fundamental difference between tablet computers and regular computers relating to usability is the input method. Directly manipulated graphical user interfaces (GUI)
were the next step after command-line based interaction when the term was introduced
in 1983 [69]. With direct manipulation, users can handle files as icons, dragging and
clicking them instead of writing commands on a command line [65]. Direct manipulation
of digital objects is the basis of a conventional GUI.
Furthermore, touchscreen devices and other gestural interfaces “take direct manipulation
to another level” by allowing users to touch the digital items directly on the screen itself
[65]. Tablet computers provide this, with direct touch and gestures being the main
means of interaction. This makes the interface of a tablet computer natural.
Table 3.1: Technical specifications of Apple iPad2
| Category | Property | Value |
|---|---|---|
| Size | Height | 24.1 cm |
| Size | Width | 18.6 cm |
| Size | Depth | 0.9 cm |
| Size | Weight | 601 g |
| Display | Size | 9.7 inches = 24.6 cm (diagonal) |
| Display | Resolution | 1024×768 (132 pixels per inch) |
| Display | Features | LED-backlit, multi-touch, widescreen |
| Connections | Wi-Fi | |
| Connections | Bluetooth | |
| Connections | 3G | Only in “Wi-Fi + 3G” model |
| Cameras | Back camera | HD (720p) 30 fps video recording; 5× digital zoom still camera |
| Cameras | Front camera | VGA video recording; VGA still camera |
| Other | Storage | 16/32/64 gigabytes |
| Other | Battery life | 10 hours |
Figure 3.1: Apple iPad2 tablet computer from front, back and side (figure and specifications from http://www.apple.com/ipad/specs/)
3.3.2 Natural user interface
Natural user interfaces, or NUIs, are computer interfaces which can be used by means
familiar from real life, such as speech, touch and gestures [68]. The hands-on, tactile
experience allows fast learning of the basic functions and of gestures such as tap and swipe
(gestures are illustrated later in Figure 5.3).
Nevertheless, there are some problems with using gestures as an input method. The lack
of established standards for gestures and their actions, and developers’ ignorance
of universal usability principles (which also apply to the new devices), are to
blame for these problems [58].
A recent study by Mauney et al. [45] found that the execution of symbolic gestures, such
as characters, had the highest variance between users from different cultures. Another
significant user background factor was previous experience with touchscreen devices.
Users who had already learned the logic of use swiped from right to left when they wanted
to scroll right. Users who had experience only of scroll bars and arrow keys swiped
erroneously from left to right to perform the same action. Even so, after a few mistakes
the basic navigation logic was quickly learned.
Another even more serious problem evident in a pure gestural system is the lack of cues
and feedback from gestures. Gestures are non-standard, imprecise and unrepeatable by
their nature as non-verbal communication. This can be illustrated by an example from
a fictional auction, where bidding is done by gestures:
One person sneezes and thereby purchases an unwanted painting. A couple argues, and as they wave their hands at one another, the waving gets
interpreted as ever-escalating bids. [57]
When a user makes a gesture and gets an incorrect response, he or she cannot know
why, or how to correct the gesture. A traditional GUI, with precise and repeatable input
methods, does not have these issues. Tablet computers have solved this problem, caused
by a lack of feedback, by integrating elements from the traditional GUI, such as icons, menus
and a help system. In conclusion, the “natural” in NUI, in the strictest sense of the term,
can be debated in the tablet computer domain. [57]
3.3.3 Previous research
The commercial success of tablet computers was started by the iPad in 2010, so not much
research has yet been published. A good amount of research can be found if the scope
is broadened from tablets to include e-reading devices, which have been around
longer. E-reading devices are specialized devices used solely for reading e-books, unlike
the more general-purpose tablet computers [70].
A summative usability study with e-reading and tablet devices can be approached from
two directions. The research compares either the usability of devices with the same
applications (e.g. [70]) or the usability of applications on the same device (e.g. [34]).
However, one study, which aimed to generate general e-book design guidelines for software and hardware, used both approaches, with several applications and devices in the tests [80].
Some design guidelines and usability problems are common to e-reading and tablet
devices, especially when a tablet computer is used for reading. A summarizing
study found four categories of “usability barriers” (i.e. usability problems) in e-books, which have hindered the acceptance of e-reading: screen readability, navigation,
portability/physical, and network connection [33]. Another study shows that “ease
of use is highly associated with ease of navigation” [15]. Navigation seems to be the
category in which users find the most difficulties with e-reading devices [33].
3.3.4 iPad specific research
Although the iPad is not solely an e-reading device, it has been found to fare well in comparison with them [31, 35]. However, being a new device on the market, it has its own
usability problems. Most of these problems are application dependent, meaning that
developers have not yet adapted to the iPad. The most comprehensive iPad usability studies
have been published by Nielsen & Budiu in 2010 and 2011 [10, 11]. The findings from
the studies are recapped below:
Read-tap asymmetry
Content that is large enough to read but too small to tap.
Too small touchable areas too close together
Leads to accidental activation.
Accidental activation
A particular problem in apps lacking a back button.
Low discoverability
Active areas that do not look touchable.
Poor typing
Users disliked the typing on the virtual keyboard on touchscreen.
Splash screen
A compulsory introduction screen irritates users.
Swipe ambiguity
If multiple items on the same screen can be swiped, navigation (e.g. swiping to
turn page) is impaired.
Information squeezed into too small areas
Making the content harder to perceive and manipulate.
Too much navigation
A large number of navigation options leaves less space for content.
E-reading devices already have legibility similar to print, so the biggest obstacle to popularizing e-reading has been poor usability [70, 76]. Tablet usage can be characterized as
mainly media consumption, with news, magazines and books having a large share
of it [47]. With slowly declining subscriber numbers, many publishers have realized
that the already widely distributed tablet computer could be a way to popularize
electronic reading and make digital publishing a viable addition to print.
3.4 Definition of magazine
The word magazine meant a storehouse when the first magazine-papers appeared in the
mid-18th century [40]. The contemporary meaning of magazine can also be considered
analogous to the former: a storing place for knowledge, ideas and opinions. A magazine
differs from a newspaper in many ways. Magazines are not as topical as newspapers,
but more in-depth and specialized: their stories are features, not news. Also, they do not
appear daily, which makes them more unique and long-lasting (print magazines are stitched or
glued like books).
A typical magazine business model varies by marketplace. An average US consumer
magazine gets 54 % of its income from advertisers and 46 % from magazine sales [40].
Finnish magazines, due to the realities of a smaller marketplace, get the majority of their
income from magazine sales. An average Finnish consumer magazine gets 70 % from
magazine sales and only 30 % from advertisers [74].
3.4.1 Definition of tablet magazine
Tablet magazine is simply a digital version of a print magazine, which can be read on
a tablet computer. Tablet magazines come in various forms. They can be categorized
21
either by the type of distribution or by the form of the digital magazine. The distribution
of tablet magazines to the devices and consumers can be handled with downloadable
viewing applications or by making magazines available online. The form of a tablet
magazine varies from carbon copy print replicas to magazines with rich multimedia
content and interaction possibilities.
Issues of application-based magazines or newspapers can be purchased or subscribed to
after the application has been acquired. These magazines are usually pdf-style print
replicas with varying amounts of multimedia content and interaction possibilities. The
new html5 standards have made it possible to lay out online magazines without
restrictions, which has paved the way for web-based digital magazines. They have also enabled
dynamic layout, which lets magazines adapt to different screen sizes
and orientations.
Figure 3.2 shows examples of different tablet magazines and newspapers. The biggest
daily newspaper in Finland, Helsingin Sanomat, is distributed through an iOS
application². Suomen Kuvalehti³, on the other hand, is a platform-independent html5
magazine and can be viewed on any device with a modern web browser. Finally,
compilation magazines, such as Flipboard⁴, are worth mentioning. They are software
that gathers news and stories from multiple online sources. Application- and web-based
magazines are discussed in more detail below.
Figure 3.2: Different types of tablet magazine solutions, from left: application-based,
web-based, and a compilation magazine
3.5 Tablet magazine publishing
Electronic publishing works the same way as the print publishing process until the last step:
distribution of the product is handled by making digital copies available, not by printing
physical copies. The form of electronic publication of a newspaper or magazine, for
instance, is still finding its shape. The easiest method has been to copy the
publication directly for the web, as the digital files used for printing are already available [40].
² http://asiakaspalvelu.hs.fi/tilaus/hsipad/
³ http://suomenkuvalehti.fi/jutut/kotimaa/digilehti
⁴ http://flipboard.com/
The launch of the iPad in 2010 introduced a new ecosystem for digital publishers, the App
Store, which is based on application, or “app”, sales. A publisher makes an application
for the App Store and, after a user has downloaded the app, it can be used to purchase and
read issues of the publication. Native iPad applications can allow flashier animations
and more interaction possibilities than web-based publications.
However, Stevens (2011) has predicted that in the near future users will have grown bored of
the insubstantial additional value offered by applications. As a result, the web will prevail
over applications as a publishing platform, being more flexible and widely available:
The publishing industry will quickly come to an understanding that there is
already a much more efficient and flexible means of publishing to the iPad
and it already exists. It is called a website. [72]
Whatever course the distribution and business models of digital publishing take,
a recent study shows that tablets and e-reading devices are already
encouraging users to consume more magazines. Of the 1009 US mobile magazine readers
surveyed by The Association of Magazine Media in late 2011: a) 90 % consume as
many or more magazines since they acquired a mobile device; b) 66 % plan to consume
more digital magazines; and c) 63 % want more digital magazine content [48].
So there is a demand amongst consumers, at least amongst those who already own
a tablet, for quality digital publications. Another US-based survey found that 67 %
of tablet owners would read a tablet magazine rather than a print one when both
were available. However, 65 % reported print to be more satisfying to read⁵. Once
the problems of successfully converting a print magazine to tablets have been solved,
increasing e-reading sales could replace decreasing print sales.
3.5.1 Tablet publishing in Finland
As mentioned before, the magazine publishing business in Finland deals with different
realities than in bigger markets. The global decline in print publication sales has affected
Finnish publishing houses as well. 74 Finnish magazine chief editors answered a survey by
Aikakausmedia in May 2011⁶ investigating attitudes towards digital publishing and
tablet magazines. Only one in three magazines believed that a digital version of their
publication for mobile reading devices would be made available in the next five years. 5 %
of Finnish magazines had already made the transition in 2011.
⁵ http://www.gfkmri.com/assets/PR/GfKMRI_020312PR_DigitalUpdate.htm
⁶ http://www.aikakauslehdet.fi/Etusivu/Ajankohtaista/Tiedotteet/default.asp?docId=31423
Moreover, another survey by Sanomalehtien Liitto from the beginning of 2012⁷ shows
that newspapers have adopted the electronic distribution channel more widely. According to
the survey, the majority of the daily newspapers have a tablet version and the rest are
planning or considering one.
The reason for a slow start towards tablet publishing amongst magazines is the unpredictable marketplace. The chief editor of MikroPC (see Table 4.1) has said:
It is not yet known, what could be a good distribution model, will the devices
be application or browser-based, and what could be the business model.
Pioneer users are already waiting for the new distribution channels, but first
we need to get the system running.⁶
Indeed, there have been several approaches towards tablet publishing in Finland. The
most recent examples are from Suomen Kuvalehti and Helsingin Sanomat. Suomen
Kuvalehti, published by Otavamedia and having 310 000 readers, has been available
as an iPad application since 2010. In April 2012, the publisher decided to change the digital
magazine from an app into an html5 version, which can be read on any device
with an internet browser: mobile phones, tablets and desktop computers alike⁸. At
the same time, Helsingin Sanomat (the leading daily newspaper in Finland, published by
Sanoma News and with 905 000 readers⁹) launched a subscription model where the reader
gets an iPad and a two-year subscription to the digital newspaper for a monthly fee¹⁰.
⁷ http://www.sanomalehdet.fi/index.phtml?s=2799
⁸ http://suomenkuvalehti.fi/jutut/kotimaa/digilehti
⁹ http://www.levikintarkastus.fi/levikintarkastus/tilastot/Levikkitilasto2011.pdf
¹⁰ http://asiakaspalvelu.hs.fi/tilaus/hsipad/
Chapter 4
Magazine in the tests
In this chapter, the magazine used in the study is presented. An issue of the magazine
and four different versions of it for tablet computers are examined more thoroughly.
4.1 Tietokone magazine
Tietokone is a monthly Finnish magazine concentrating on computers and information
technology in general. The circulation of the magazine in 2011 was 33 828, which was
11.9 % lower than the year before¹. The total audience in 2011 was 113 000, making the
readers-per-copy ratio 3.34². The biggest competitors of Tietokone in Finland are
MikroPC and Mikrobitti. The three magazines and their key figures are compared in
Table 4.1 below.
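The readers-per-copy ratios quoted for the three magazines in Table 4.1 can be reproduced by
dividing the total audience by the circulation. The following Python sketch only illustrates
that arithmetic; the names MAGAZINES and readers_per_copy are introduced here for the
example and are not part of the cited statistics.

```python
# Circulation and total audience for 2011, copied from Table 4.1.
MAGAZINES = {
    "Tietokone": (33_828, 113_000),
    "Mikrobitti": (71_429, 255_000),
    "MikroPC": (28_462, 90_000),
}


def readers_per_copy(circulation: int, audience: int) -> float:
    """Average number of readers for each circulated copy."""
    return audience / circulation


# Rounded to two decimals, as in the table.
ratios = {
    name: round(readers_per_copy(circulation, audience), 2)
    for name, (circulation, audience) in MAGAZINES.items()
}
```

With the table's figures, the rounded ratios agree with the readers-per-copy column.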
Table 4.1: The three biggest IT-magazines in Finland
Magazine     Circulation   Audience   Readers-per-copy   Publisher
Tietokone    33 828        113 000    3.34               Sanoma Magazines
Mikrobitti   71 429        255 000    3.57               Sanoma Magazines
MikroPC      28 462        90 000     3.16               Talentum
A typical reader of Tietokone magazine is an over-40-year-old (53 % of the readership)
male (86 %) office worker with a high income (48 %) living in a big city (71 %)³. As
mentioned before, the common phenomenon for all print magazines is that circulations
are declining, and this has happened to Tietokone magazine as well. But with a
computer-savvy reader profile such as this, the transition towards digital and e-reading magazines
could be easier. In 2011 Tietokone launched a version for the iPad, which includes the
same content as the print edition. It remains to be seen whether the iPad version can boost
circulation and income.
¹ http://www.levikintarkastus.fi/levikintarkastus/tilastot/Levikkitilasto2011.pdf
² http://www.levikintarkastus.fi/mediatutkimus/KMT_Lukija_2011_perustaustat.pdf
³ http://www.sanomamagazines.fi/mediabank/document/4042.pdf
The rest of this chapter is dedicated to examining the four versions (including the retail
version mentioned above) of the Tietokone magazine compared in this study. The most
distinctive and relevant differences from the usability point of view are discussed. The
publication in question is the June 2011 issue, and all the usability discussion in this
study is based on this issue alone. All the versions have the same content, text and
images; the only differences are in the form.
The different interaction possibilities in each magazine are presented below. After that,
more generic differences are discussed along with summarizing illustrations from each
magazine.
4.2 Retail version
The Tietokone magazine viewing application (“Tietokone for iPad”) is free to acquire from
Apple’s digital application marketplace, the App Store. After that, users can purchase single
digital issues of the magazine at prices ranging from 4.99 to 8.99 euros (a newsstand
copy costs 8.50 euros). Currently, issues of the magazine are manually laid out with
Adobe dps⁴ software. The issue used in this study was made with Adobe InDesign
together with Woodwing⁵ (hence the acronym ww), but it is no longer available for
purchase.
The iPad magazine is almost a direct copy of the print version, with some benefit from
the digital medium in the form of a few interactive features. The interactive features added
to this version (besides basic navigation from page to page) are hyperlinks, image interaction,
scrollable portions of a page and navigation shortcuts.
Hyperlinks in retail version
Hyperlinks are found on the cover page and on the table of contents page, or toc.
Tapping a headline (text and/or image) takes the user to the first page of the article in
question. The cover and toc pages are almost carbon copies of the print version, so the
hyperlinks are not visibly separated from regular text and images. A few hyperlink
buttons that open a pop-up window are also found inside articles. No hyperlinks
lead the user outside the magazine. Figure 4.1 shows the different hyperlinks found in
ww.
⁴ http://www.adobe.com/products/digital-publishing-suite-family.html
⁵ http://www.woodwing.com/en/tablet-publishing-overview
Figure 4.1: In ww, headlines in cover and toc are hyperlinks to the corresponding
articles; a ⊕ button opens a pop-up window with additional information inside an article
Image interaction in retail version
Some kind of image interaction is found in about every other image in the magazine.
Tapping an image does different things depending on the image. It can a) enlarge the
image to full-screen; b) enlarge the image slightly; c) show the caption of the image;
or d) open a pop-up window with some additional content inside. Possible actions,
if there are any, are not indicated prior to touch. Figure 4.2 shows examples of all
four.
Figure 4.2: The four different image interaction possibilities in ww, from top left: a
pop-up window, enlarge image to full-screen, show image caption and enlarge image
slightly
Scrollable portions of page in retail version
In some articles, portions of a page can be scrolled horizontally or vertically by
swiping. These can be text columns, images or complete articles, as shown in
Figure 4.3. The scrollable portions are indicated by a small “triple guillemet” sign
(>>>) in the corner of the scrollable area.
Figure 4.3: Different scrollable portions of a page in ww from left: scrollable text
column, scrollable image and scrollable article
Navigation shortcuts in retail version
All the navigation shortcuts are found in the toolbar. The toolbar is opened by tapping
the lower edge of a page, which reveals six buttons, as shown in Figure 4.4. The
functions of the buttons, starting from the left, are a) go to cover; b) go to toc;
c) open a page browser⁶; d) open the library, where previously purchased magazines
can be accessed; e) open a pop-up window showing the homepage of Tietokone magazine,
http://www.tietokone.fi/; and f) go to the store, where new magazines can be
purchased.
4.3 AnyReader version
This version of the magazine was automatically laid out with dynamic layout software
developed by the Finnish company Anygraaf. When all the text and images have sufficient
metadata, a dynamic layout that adjusts to any screen size can be compiled from the
content. The layout can be accessed with AnyReader⁷, which was in the prototype phase
at the time of this study but is now available in App Store, Nokia Store and Google
Play.
⁶ This did not work in the tests. Users who found this feature during the tasks (5/10) were
instructed to ignore it. Later, during the free browsing phase, the page browser feature was
shown to all users on another iPad and they were asked to rate the magazine in sus as if the
feature had worked.
⁷ http://www.anygraaf.fi/fin/eng_frontpage/anyreader__tablet_and_smartphone_publishing_system_397.html
Figure 4.4: Toolbar and functions of four toolbar buttons in ww, from top left: page
browser (did not work in the tests), library, homepage and store
The AnyReader, or ar, version of the Tietokone magazine differs radically from the
print version. Even though the content is the same, the differences in layout and navigation
are substantial. ar has two hierarchical levels for navigation: a top-level view, which is like
a toc spread across multiple pages horizontally, and a section view, where articles from
the same section are presented side by side horizontally on different pages. ar offers
the following interactive features: hyperlinks, image interaction, adjustable font size and
navigation shortcuts.
Hyperlinks in AnyReader version
Articles can be accessed via tapping hyperlinks on the top-level. The hyperlinks
are large square areas of a page consisting of a headline, image (sometimes) and
the beginning lines of the article body text. This is illustrated in Figure 4.5.
Figure 4.5: In ar, articles are accessed by tapping a hyperlink in top-level
Image interaction in AnyReader version
Image interaction inside an article is indicated by a symbol in the corner of the
top-level image. When an image within an article marked with this symbol is tapped,
an image carousel opens and all the images in the current article can be browsed.
This action is illustrated in Figure 4.6.
Figure 4.6: Tapping an image in article opens an image carousel in ar, where images
of the same article can be browsed
Adjustable font size in AnyReader version
ar was the only magazine version in this study with an adjustable font size. There
are two ways to change the font size: with a gesture or with a button in
the toolbar. A spread gesture performed on a page enlarges the font size of the
magazine and a pinch gesture shrinks the text. The + and − buttons in the toolbar
perform the same actions respectively when touched. Changing the text size
also changes the layout of the magazine, as illustrated in Figure 4.7.
Figure 4.7: Tapping the “−” button in the ar toolbar shrinks the text size and the layout
adjusts accordingly
Navigation shortcuts in AnyReader version
Navigation shortcuts in ar are found at the upper and lower edges of the screen.
An always-visible toolbar with four buttons is at the top of the screen. The + and − buttons
change the text size, as mentioned before. The first button on the left, the “home button”,
returns the user to a library view from which previously purchased papers can be
accessed. The second button from the left, the “back” button, returns the user to an
upper hierarchy level in the magazine.
Figure 4.8: Toolbar navigation shortcuts in ar: “toc button” brings user to top-level
and “home button” exits to library
Alternative orientation in AnyReader version
When the device is rotated 90°, the magazine layout adjusts automatically, making
it possible to use ar in portrait or landscape orientation.
Figure 4.9: Layout changes in ar after rotating the device 90°
4.4 “Fancybox” web-based magazine
The html5 web-based magazine was developed at the Department of Media Technology
of Aalto University. It implements features from the newest generation of html/css as
well as from JavaScript. Algorithms for automatic image alignment, cropping and
main color extraction are also used. The Baker⁸ framework is used to enable html5 magazine
viewing on the iPad, whereas Friar⁹ is used on Android devices.
fb and ps are two versions of the same html5 magazine with different image browsing
techniques. fb, short for Fancybox, is a pop-up image gallery, where each image opens
in full screen when tapped. ps, short for Photoswipe, is an image carousel (like in ar),
which opens when an image is tapped and from which images of the same article can be
browsed without returning to the article.
⁸ http://bakerframework.com/
⁹ http://www.friarframework.com/
Hyperlinks in Fancybox version
The first page of the magazine is the toc, where all articles can be accessed directly by
tapping the hyperlinks, instead of browsing through the magazine. The hyperlinks
are wide rectangular buttons consisting of a headline, image and lead text. This
is illustrated in Figure 4.10.
Figure 4.10: In fb and ps, articles can be accessed by tapping a hyperlink in toc
Image interaction in Fancybox version
As mentioned above, images (if not in full-screen width already) are enlarged to
a full-screen pop-up window when tapped. Whether an image opens or not is not
indicated. Figure 4.11 shows an image pop-up window.
Figure 4.11: Image opens to a pop-up window in fb
Navigation shortcuts in Fancybox version
fb and ps have a hidden navigation bar, similar to ww’s page browser and illustrated
in Figure 4.12, which is opened by a double-tap anywhere on the screen. The
navigation bar shows all articles side by side and can be scrolled horizontally. How
to open the navigation bar is not indicated.
Figure 4.12: The navigation bar in fb and ps
Alternative orientation in Fancybox version
When the device is rotated 90°, the magazine layout adjusts automatically. This
makes it possible to use the magazine in portrait or landscape orientation, as shown
in Figure 4.13.
Figure 4.13: Layout changes in fb and ps after rotating the device 90°
4.5 “Photoswipe” web-based magazine
Image interaction in Photoswipe version
As mentioned above, some images are enlarged to an image carousel pop-up window
when tapped, which is illustrated in Figure 4.14. From the carousel, all the
images of the same article section can be browsed by swiping or with the controls at the
bottom. This was the only magazine where images could also be zoomed with a spread
gesture in the carousel view. Whether a carousel opens or not when an image is tapped
is not indicated.
Figure 4.14: After tapping an image in ps, image carousel opens and all images in
the article can be browsed
Navigation shortcuts in Photoswipe version
See Section 4.4.
Alternative orientation in Photoswipe version
See Section 4.4.
4.6 Structural differences in magazines
Figures 4.15, 4.16 and 4.17 illustrate the differences between the magazines. Table 4.2
gives an overview of the differences between the four magazines and their user-interfaces.
ww has all articles available through swiping. Shorter articles at the beginning and at
the end of the magazine are stacked vertically on top of each other. Longer articles in
the mid-section are separated horizontally onto different pages and, if they do not fit on
one page, they are continued vertically below. Some sections of articles, like tables and
additional information, are not visible directly but need to be accessed by tapping a
hyperlink (see Section 4.2).
In ar, small bits (headline, picture, lead) from all articles are presented on the top level,
which is several pages wide horizontally. Articles can be accessed by tapping the
square presenting the article (see Section 4.3). This takes the user to the lower level, the
“article-level”, where all articles are stacked side by side horizontally. If an article does not fit
on the first page, it is continued below. Articles are grouped together based on the section
of the magazine they belong to, unlike in ww, fb and ps, where articles are mixed and in
the same order as in print.
Unlike in ww and ar, in fb and ps all of the content is available through swiping alone.
Tapping only enlarges images (see Sections 4.4 and 4.5) and brings out the navigation
bar (see Section 4.4). Like in ww, shorter articles at the beginning and at the end are
arranged on top of each other. All articles in the mid-section are separated horizontally and,
when an article is too long to fit on one page, it is continued below.
The toc is available on the first pages of ps, fb and ww. In ar, the whole top level can be
considered a toc of sorts. Horizontal transitions are paginated in every magazine,
i.e. browsing between articles is done in steps. In fb and ps, vertical transitions are
stepless, i.e. articles can be scrolled up and down continuously, like a web page. ww
and ar have paginated vertical scrolling inside articles.
Figure 4.15: An overview of the navigational structure of ww magazine (circles
and lines indicate tap and transition, three dots indicate omitted pages due to space
constraints)
Figure 4.16: An overview of the navigational structure of ar magazine (circles and
lines indicate tap and transition, three dots indicate omitted pages due to space constraints)
Figure 4.17: An overview of the navigational structure of ps and fb magazines (circles
and lines indicate tap and transition, three dots indicate omitted pages due to space
constraints)
Table 4.2: An overview of the magazine user-interfaces
                      WW                          AR                             PS & FB
Structure             Articles separated          Top-level (“toc”) separated    Articles separated
                      horizontally                from article-level, articles   horizontally
                                                  separated horizontally on
                                                  article-level
Pagination            Paginated                   Paginated                      Continuous
Columns               Mainly two                  Mainly two (changes with       One
                                                  font size)
Table of contents     Hyperlinked list of         Top-level consists of          Hyperlinked list of
                      headlines and leads         hyperlinked collection of      headlines, leads and images
                                                  headlines and images
Navigation bar        Page browser found inside   Visible on page change         Page browser visible on
                      the toolbar                                                double-tap
Toolbar               Visible on tap to bottom,   Always-visible, 4 buttons      No
                      6 buttons
Adjustable font size  No                          Yes, layout changes            No
                                                  accordingly
Image carousel        No, images opened           Yes                            No, images opened
                      separately                                                 separately (fb) / Yes (ps)
Image zoom            No                          No                             No (fb) / Yes (ps)
Chapter 5
Experiment setup
In the previous chapters, the basis of the research, the device and the research material
were introduced. This chapter describes the setup for the user tests. The setup is explained
in such a manner that a similar study could be conducted with these instructions.
The selection of users as test subjects is also discussed. All the areas of the context (users,
environment, tasks) in this usability research are considered.
5.1 Chosen methods
As mentioned before (see 2.1), different usability evaluation methods test different parts
of the hci. The uems chosen for this study were think aloud, performance measures,
questionnaires and eye-tracking. Also, a heuristic evaluation of the magazines was done
prior to the user tests to help in the task design.
One goal of this study was to evaluate and compare the usability of dynamic and manual
layouts. Summative evaluation was needed to rank the layouts. In order to do this type
of evaluation, performance measures and questionnaires were chosen as methods for
gathering quantitative data.
The dynamic layout systems used in three of the tested magazines (ar, fb and ps)
were still being developed, so a formative analysis was also done to find and address
any usability problems that could be corrected. Think aloud was chosen for the large
amount of qualitative data it produces.
Finally, eye-tracking was selected by default. The emphasis of this thesis was from
the very start on how to evaluate iPad magazine usability with eye-tracking. A
non-intrusive eye-tracking device was employed, so all uems could be used simultaneously
with minimal effect on each other. Both qualitative and quantitative data were gathered
through eye-tracking the users, and eye-tracking was used both as a summative and as a
formative evaluation method.
Table 5.1: Test user statistics
        Age     Males/Females   Proficiency
AR      27.3    8/2             3.2
WW      23      6/4             3.8
FB      26.8    8/2             3.1
PS      26.3    7/3             3.9
All     25.85   29/11           3.5
5.2 Users
A sufficient amount of qualitative data about the usability of a user interface can be gathered
from five users with think aloud or eye-tracking (see 2.1). Quantitative data, on the
other hand, needs as many test subjects as possible in order to get
statistically significant results. Four groups of ten users were used, keeping the amount
of resources spent in this study reasonable.
Because four systems were evaluated in the tests, a between-subjects design was selected
to keep the test time at approximately one hour. Within-subjects testing would have
reduced the time spent with each version to less than fifteen minutes in an
hour-long test, which would have been insufficient.
Most of the users were recruited through an Aalto University newsletter. The compensation
for participation was a movie ticket. No prerequisites for users were imposed other
than having Finnish as their mother tongue. The user statistics for the four magazines are
summarized in Table 5.1.
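As a consistency check, the “All” row of Table 5.1 can be recomputed from the four group
rows. The sketch below is an illustration only; it assumes that each group mean is weighted
by its group size (ten users per group according to the Males/Females column), and the
names GROUPS and pooled_mean are introduced here for the example.

```python
# Per-group values copied from Table 5.1:
# (mean age, males, females, mean proficiency).
GROUPS = {
    "AR": (27.3, 8, 2, 3.2),
    "WW": (23.0, 6, 4, 3.8),
    "FB": (26.8, 8, 2, 3.1),
    "PS": (26.3, 7, 3, 3.9),
}


def pooled_mean(pairs):
    """Size-weighted mean of (group mean, group size) pairs."""
    total_size = sum(size for _, size in pairs)
    return sum(mean * size for mean, size in pairs) / total_size


# Group size is the number of males plus the number of females.
sizes = {name: males + females for name, (_, males, females, _) in GROUPS.items()}

all_age = pooled_mean([(age, sizes[name]) for name, (age, _, _, _) in GROUPS.items()])
all_proficiency = pooled_mean([(prof, sizes[name]) for name, (_, _, _, prof) in GROUPS.items()])
all_males = sum(males for _, males, _, _ in GROUPS.values())
all_females = sum(females for _, _, females, _ in GROUPS.values())
```

Because the groups are of equal size, the pooled means equal the plain averages of the
group means.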
Along with basic information, prior tablet computer and magazine experience and knowledge
were surveyed with several “yes/no” questions before the tests, such as “Do you own
a tablet computer?”, “Have you used a tablet computer before?”, and “Have you read
Tietokone magazine before?”. For convenience, a proficiency level from 1–8 was calculated
to summarize the results of these questions, simply by giving each “yes” answer a value of 1
and each “no” answer a value of 0. The pre-test survey (in Finnish) can be found in the Appendices.
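The proficiency level described here is simply a count of “yes” answers. A minimal Python
sketch of that scoring follows; the function name proficiency_level and the example
respondent are hypothetical, and only the three questions quoted in the text are used (the
full question list is in the Finnish pre-test survey).

```python
from typing import Mapping


def proficiency_level(answers: Mapping[str, bool]) -> int:
    """Count the "yes" answers: each "yes" is worth 1 and each "no" is worth 0."""
    return sum(1 for answered_yes in answers.values() if answered_yes)


# Hypothetical respondent, for illustration only.
example_answers = {
    "Do you own a tablet computer?": False,
    "Have you used a tablet computer before?": True,
    "Have you read Tietokone magazine before?": True,
}
example_level = proficiency_level(example_answers)
```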
5.3 Test setup
An overview of the test setup used in this study is shown in Figure 5.1. Users sat in
an adjustable chair next to the iPad, which was attached to the eye-tracked monitor.
The test facilitator sat to the left of and behind the user so that any movement by him did not
distract the user. From this angle, the facilitator could also observe the user better than
from directly behind or from the side. A video camera was also placed behind and to the
side of the user for the same reasons: to avoid distraction and to get a better view.
The test instructions (found in the Appendices) and a pencil could be placed on the table
to the left of or in front of the user, depending on user preference. The monitors of the computers
running the experiment and eye-tracking software were placed so that they were visible
to the facilitator but not to the user. The experiment software was operated with a
keyboard (start/stop tasks) and a mouse (start experiment and calibration) in front of
the facilitator. After the tasks, the iPad was released from the frame and the user held
the device during the free browsing part.
Figure 5.1: Test setup, showing 1: user, 2: facilitator, 3: iPad and eye-tracking
system, 4: video camera, 5: computer recording eye-tracking data, 6: computer running
experiment software.
5.3.1 Eye-tracking system setup
An smi eye-tracking system¹ together with an Epiphan Frame Grabber² was used to enable
eye-tracking. A special setup was necessary to allow eye-tracking of the iPad with a stand-alone
eye-tracking device.
¹ http://www.smivision.com/en/gaze-and-eye-tracking-systems/products/red-red250-red-500.html
² http://www.epiphan.com/products/dvi-frame-grabbers/dvi2usb/
Epiphan Frame Grabber hardware and software were used to stream the iPad video-out
signal to the monitor being eye-tracked. As shown in Figure 5.2, the window streaming the
iPad video-out signal was resized and placed directly under the iPad. As a result, when
users looked at the iPad, their gaze was tracked to the iPad video-out signal, thus
allowing iPad eye-tracking.
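Since gaze is recorded in the coordinates of the tracked monitor, a gaze point has to be
related to the streamed window before it can be read as a position on the iPad screen. The
following Python sketch shows one way this mapping could be done. It is an assumption-laden
illustration, not the procedure used in the study: it assumes the window is an unrotated,
linearly scaled copy of the iPad screen, and Rect, gaze_to_ipad and all numeric values are
hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Rect(NamedTuple):
    """Position and size of the streamed video window on the tracked monitor, in pixels."""
    left: float
    top: float
    width: float
    height: float


def gaze_to_ipad(
    gaze_x: float, gaze_y: float, window: Rect, ipad_size: Tuple[int, int]
) -> Optional[Tuple[float, float]]:
    """Map a gaze point in monitor coordinates to iPad screen coordinates.

    Returns None when the gaze falls outside the streamed window.
    """
    # Relative position inside the window, 0.0-1.0 on both axes.
    rel_x = (gaze_x - window.left) / window.width
    rel_y = (gaze_y - window.top) / window.height
    if not (0.0 <= rel_x <= 1.0 and 0.0 <= rel_y <= 1.0):
        return None
    # Scale the relative position to the iPad resolution.
    return rel_x * ipad_size[0], rel_y * ipad_size[1]


# Hypothetical window placement and an assumed 1024 x 768 iPad screen.
example_window = Rect(left=400, top=300, width=512, height=384)
example_point = gaze_to_ipad(656, 492, example_window, (1024, 768))
```

In this example the gaze point lies at the centre of the window, so it maps to the centre of
the assumed iPad screen.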
The setup was not optimal, however. The eye-tracking device was fixed below the
monitor, between users’ feet. When users interacted with the iPad, especially with
horizontal swipes, the hand blocked the signal from eye-tracking device to the eyes.
This resulted in gaps in the eye-tracking data.
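Gaps of this kind can be located afterwards from the timestamps of the valid gaze samples.
The sketch below is a hypothetical illustration: it assumes the data is available as a
sorted list of timestamps in milliseconds, and the threshold value is chosen only for the
example, not taken from the study.

```python
def find_gaps(timestamps_ms, max_interval_ms):
    """Return (start, end) pairs where consecutive valid samples are
    further apart than max_interval_ms."""
    gaps = []
    for previous, current in zip(timestamps_ms, timestamps_ms[1:]):
        if current - previous > max_interval_ms:
            gaps.append((previous, current))
    return gaps


# Hypothetical sample times: tracking is lost between 32 ms and 400 ms.
example_gaps = find_gaps([0, 16, 32, 400, 416], max_interval_ms=50)
```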
smi Experiment Center was used to operate the eye-tracking of the tasks. More specifically,
the Screen recording feature was started for every task (except task 1, the practice
task), which recorded gaze on the desktop, including the Frame Grabber window. smi
iViewX, which captured the data from the eye-tracker device, ran on a laptop.
Figure 5.2: An Epiphan dvi2usb Frame Grabber window was placed directly under
the iPad
5.4 Test protocol
Every test was conducted with the following protocol:
1. Before the user arrived, the iPad was wiped clean, the magazine was set to the practice
article (“Nanokoossa kaikki on toisin”) and the Experiment Center was set up for
the user.
2. On arrival, the user was welcomed to the test session and asked to turn off any
mobile phones.
3. A short overview of the test was given to allow proper orientation.
4. The user was asked to read the first page of the test instructions.
5. Before the practice task, one eye-tracking “demo” calibration run was done
to introduce the user to the system. Users were not told that the first task and
calibration were for practice and not recorded, as instructed by Nielsen [55].
6. The user was reminded to note the instructions for gestures on the wall on their left,
shown in Figure 5.3. Video recording was started.
7. The user did the first task to practice think aloud and magazine interaction. Users
were asked to read the task descriptions aloud to make the subsequent think aloud more
effortless.
8. Calibration was run five times, and the most accurate result was selected.
9. The user did five tasks, answering the satisfaction questionnaire after every task.
10. Calibration was done again five times.
11. The rest of the tasks were done.
12. After eleven tasks, eye-tracking was stopped and the iPad was given to the user, who
was now instructed to browse the magazine freely for five minutes while continuing to think
aloud. During free browsing, if the user had missed some features of the magazine, he
or she was instructed how to find them. Free browsing was thus made to simulate
use of the magazine as if it were familiar to the user beforehand.
13. After free browsing, the user was asked to fill in the sus questionnaire based on the
tasks and free browsing.
14. At the end, the user was thanked for collaborating in the study and given a movie
ticket as compensation.
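Step 13 of the protocol collects a sus questionnaire. For reference, the sketch below shows
the standard scoring of the ten-item System Usability Scale, on the assumption that the
questionnaire used here follows the standard item order and 1–5 response scale; the function
name sus_score is introduced for this example.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten responses on a 1-5 scale,
    given in questionnaire order."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten responses, each an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded.
            total += response - 1
        else:
            # Even-numbered items are negatively worded.
            total += 5 - response
    return total * 2.5
```

A respondent who answers 3 to every item gets a score of 50, and the most favourable
pattern (5 on odd items, 1 on even items) gives 100.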
Test instructions, including task descriptions and the satisfaction and sus questionnaires, are
included in the Appendices. Figure 5.3 was presented to users and shows the six gestures
needed to fully operate each magazine. Pinch and spread gestures were used to zoom
out of and into images (ps) and to adjust text size (ar). Tap was used to operate hyperlinks
and open images in each magazine. Double-tap brought out the navigation bar in fb and
ps. Slide is a more accurate version of swipe; both were used to move and scroll within
the magazines.
Figure 5.3: Gesture instructions for novice users (from top left: tap, double-tap, slide,
pinch, spread and swipe). Adapted from: http://www.lukew.com/ff/entry.asp?1071
5.5
Tasks
Eleven tasks were presented to users. The first task was the same for all users, but the
remaining ten tasks were presented in random order so that a fixed task order would not
affect the magazines differently [64]. A heuristic evaluation of the magazines
was conducted to aid the task design. (Notes from the evaluation are attached in the Appendices.) Tasks were designed to steer users towards usability problems to see how
they would cope with them. The magazines tested had different usability problems in
different parts. The challenge was to address problems evenly between magazines while at the
same time emulating as natural and broad magazine usage as possible.
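The task ordering described above can be sketched in a few lines of Python. This is only an illustration: the thesis does not say how the random orders were generated, so the function name `task_order` and the use of a seeded generator are assumptions.

```python
import random


def task_order(rng=None):
    """Return the task numbers for one user: task 1 first, tasks 2-11 shuffled."""
    rng = rng or random.Random()
    remaining = list(range(2, 12))  # tasks 2..11
    rng.shuffle(remaining)          # random order for the ten real tasks
    return [1] + remaining          # the practice task always comes first


# Hypothetical usage: a seeded generator makes the order reproducible.
order = task_order(random.Random(0))
print(order)
```

Each user would get a fresh call, so that no single task order is tied to a particular magazine.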
The tasks were designed to be realistic in order to get relevant eye-tracking data [55].
Scenarios, such as “Task 6: Your friend has recommended that you read a column at the
end of the magazine...”, were used in task design but later dropped so that the
task descriptions could be easily remembered. Goal-oriented tasks such as these allow effective
qualitative eye-tracking analysis, because the researcher knows what the user is trying to find
[60].
The eleven tasks are presented below, each with the aspect of usability being tested and the usability problems found prior to user testing. The usability aspects listed for each task
are broader and more closely related to the tablet magazine context than the heuristics.
The task numbers given with the task titles are used to identify the tasks in later chapters.
For the complete task descriptions (in Finnish), see the Appendices.
Table 5.2: Task overview
Task 1: “Nanokoossa kaikki on toisin”
  Objective: The first task was designed as a practice especially for those who had not used a tablet computer before. All the gestures on how to interact with the magazine were shown (Figure 5.3) and user was instructed on the basic use of the magazine, i.e. how to swipe to proceed inside an article and from one article to another and how to tap to open an image.
Task 2: “Järkkäristä tuli videokamera”
  Objective: The task was to find the best camera model from a camera test article.
  Usability problem: In ar and ww, the results of the test were not visible directly in the article but hidden behind a hyperlink.
  Usability aspect: Visibility
Task 3: “Kriisi 2.0”
  Objective: The task was to find out what project was also called “the Wikipedia of maps”.
  Usability problem: This was a test for reading and information screening: how easy it was to find a keyword from a mass of text.
  Usability aspect: Readability
Task 4: “Tietoturvaa iPadiin ja iPhoneen”
  Objective: The task was to find the advice the magazine gives in case of a stolen iPhone.
  Usability problem: In ww, the image, which was required to be opened, was not placed besides the corresponding story.
  Usability aspect: Layout
Task 5: “10 vekkulia usb-lelua”
  Objective: The task was to choose the most interesting usb toy from the article.
  Usability problem: In ww, the items 8–10 were hidden behind a scrollable column (see Section 4.2). In ar, some of the pictures of the devices were hidden behind a link (Section 4.3). In fb and ps, only a part of the list was visible at a time.
  Usability aspect: Visibility
Task 6: “Pakina”
  Objective: The task was to find an article written by a pseudonym “Kiukkuinen ict-johtaja” which was hinted to be at the end of the magazine.
  Usability problem: The lack of page numbers (ww) and toc (ar) were predicted to affect user’s sense of orientation inside the magazine. Also, the article being short and belonging to a section with multiple short stories, it was not visible in any tocs.
  Usability aspect: Navigation
Task 7: “Tietokoneen tulevaisuus on täällä”
  Objective: The task was to choose one tablet computer from a comparison table and to find its picture.
  Usability problem: In ar, the review article was split into two, with technical specification comparison table situated in different article than the images of the devices. In ww, the comparison table was hidden behind a link.
  Usability aspect: Visibility
Task 8: “Sähköinen lukeminen maistuu jo”
  Objective: The task was to examine an information graphic and choose one interesting fact from it.
  Usability problem: In ar and fb, the image could not be zoomed. In ww, only a portion of the image was visible at a time.
  Usability aspect: Image interaction
Task 9: “Suljettujen ovien takana”
  Objective: The task was to browse through all the images of a photo feature.
  Usability problem: In ar, most of the images are not visible in the article, but are hidden inside an image carousel (see section 4.3). In ww, the images exhibit different actions when tapped (4.2). In fb, every image has to be opened separately (4.4). In ps, images were divided into several image carousels (4.5).
  Usability aspect: Image interaction
Task 10: “Suunnistuksen uudet tuulet”
  Objective: The task was to choose the best navigation software, according to the outlook of the user interface.
  Usability problem: In ar and ps, images were shown separately in an image carousel. In ww, the information related to an image was hidden behind a link.
  Usability aspect: Image interaction
Task 11: “Kolumni”
  Objective: The task was to find an article by Jyrki Kasvi.
  Usability problem: In fb and ps, the writer’s name is not visible in the toc.
  Usability aspect: Navigation
Chapter 6
Analysis
As a result of the user tests explained in the previous chapter, a large amount of data
was acquired. This chapter explains the various ways the data was analyzed to obtain
the results.
6.1 Task time
Task time measures the efficiency component of usability as defined in the iso standard
[37]. Navigation between articles showed the biggest differences between magazine versions
(see Section 4.6), so a task browsing time, i.e. how long it took a user to arrive at
the correct article (without yet finding the answer), was also calculated. This allows
browsing performance to be described more accurately.
Data from ten tasks (tasks 2–11) were acquired from each user, resulting in a total of four
hundred samples each for total task time and task browsing time. In tasks 6
and 11 (see Section 5.2 for task descriptions), where users only had to find the correct article,
task browsing time equaled total task time. In the other tasks, where answers were found
later inside the articles, the browsing time to reach the correct article was measured
from the video recordings.
All task attempts were included in the calculations. If task completion took longer than
the time limit allowed (12 cases out of 400), the attempt was recorded as taking 300 seconds.
Also, if the user aborted the task before the five-minute time limit (6/400), the attempt was
treated as incomplete. Wrong answers to task questions (4/400, found only in task 4
when users misread iPhone as iPad in the task description) were treated as correct in
this measurement. All task attempts had to be included in the analysis to get an adequate and
equal number of samples for each magazine. Task completion was addressed separately.
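The handling of task attempts described above can be illustrated with a short Python sketch. The function name `score_attempt` and its arguments are hypothetical, and the choice to keep the elapsed time of an aborted attempt as its recorded time is an assumption, since the text only states that such attempts were treated as incomplete. The completion flag follows the binary definition used for sum in Section 6.2.

```python
TIME_LIMIT_S = 300  # five-minute time limit per task


def score_attempt(seconds, aborted=False, wrong_answer=False):
    """Return (recorded_time, completed) for one task attempt.

    Attempts that exceed the time limit are recorded as 300 seconds.
    Aborted attempts keep their elapsed time (assumption) and count as
    incomplete. Wrong answers keep their time; only completion is affected.
    """
    timed_out = seconds > TIME_LIMIT_S
    recorded_time = TIME_LIMIT_S if timed_out else seconds
    completed = not (timed_out or aborted or wrong_answer)
    return recorded_time, completed


print(score_attempt(320))                # over the limit
print(score_attempt(45))                 # normal completion
print(score_attempt(120, aborted=True))  # user gave up
```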
Total task times and task browsing times were averaged over the four magazine versions
(10 users × 10 tasks = 100 samples per magazine). In addition, task-specific averages of
the times were calculated to reveal in which tasks the differences between
magazines were greatest (10 users × 1 task = 10 samples per task and magazine). Other key figures, such as median,
standard deviation and range, are presented as well, as instructed by Rubin [64]. The
margins of error were calculated using a 95 % confidence level and are presented with the
corresponding task times in the next chapter.
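As an illustration of the margin of error mentioned above, the following Python sketch computes it from a standard deviation and a sample count. It assumes a normal approximation (z ≈ 1.96) instead of the t-distribution, which is reasonable for 100 samples but understates the margin for 10 samples; the thesis does not state which was used. The function name is hypothetical, and the example values are those reported for ar in Table 7.1.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence=0.95):
    """Half-width of the confidence interval of a mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 %
    return z * sd / sqrt(n)


# Total task time of ar: SD 76.17 s over 100 samples (Table 7.1).
moe = margin_of_error(sd=76.17, n=100)
print(f"137.74 s +/- {moe:.1f} s")
```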
6.2
A System Usability Scale (sus) score was calculated for each user (as explained in Section
2.3.4) and averaged over the four magazines. The margins of error were calculated
using a 95 % confidence level and they are presented with the corresponding sus scores in
the next chapter. The sus questionnaire was filled in at the end of the test session, immediately after free
browsing. Therefore, it can be thought to represent normal use of the magazine as well,
not just task performance. A Finnish translation of the sus questionnaire can be found in
the Appendices [75].
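For reference, the standard sus scoring rule can be sketched as follows. This is the commonly published rule (ten items on a 1–5 scale, odd items scored as the response minus 1, even items as 5 minus the response, sum multiplied by 2.5), not code from the study; the function name `sus_score` is an assumption.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:          # odd items are positively worded
            total += response - 1
        else:                      # even items are negatively worded
            total += 5 - response
    return total * 2.5


print(sus_score([3] * 10))     # neutral answers throughout
print(sus_score([5, 1] * 5))   # best possible answers
```

Because every item contributes 0–4 points, scores are always multiples of 2.5.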
Before calculating the Single Usability Metric, sum (as explained in Section 2.3.4), a few
exceptions were made to the suggested specifications. A specification value, i.e. a
reference for distinguishing good usability from bad, was determined for all the measurements.
For number of errors and completion rate this is always “no errors” and “successful
completion”.
The number of opportunities for errors, i.e. the different situations where users could make an
error, was determined as 4 for almost all the tasks: 1) return from previous task (e.g.
user has problems closing an opened image); 2) navigate to upper level (e.g. user swipes
to toc instead of shortcuts); 3) navigate to article (e.g. user goes to wrong article); and
4) find an answer (e.g. user leaves the correct article). For tasks 6 and 11, where the
goal was to find the articles, the last opportunity in the list was excluded.
An average over completed tasks was used as the task time reference rather than the
average of only the most satisfied users. This was necessary because in some tasks
the number of satisfied users was too small to give a representative
reference. The median (3.65) was used as the reference for satisfaction score, as suggested
by Nielsen & Levy [52].
The input data for the calculations was acquired as follows: task time was measured as
explained before; task completion was a binary measure: “1” when the user gave the correct
answer before the time limit, otherwise “0”; the number of errors (range 0–4) was counted
from the videos; user satisfaction was the average of the three Likert scale questions
(see Section 5.4).
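The four inputs listed above could be collected per task attempt roughly as in the following Python sketch. The names `sum_inputs` and `ERROR_OPPORTUNITIES` are hypothetical, and expressing errors as a rate per opportunity is an illustrative assumption; the actual weighting and standardizing were done with an external tool and are not reproduced here.

```python
TIME_LIMIT_S = 300

# Four error opportunities per task, three for tasks 6 and 11 (see text).
ERROR_OPPORTUNITIES = {task: 4 for task in range(2, 12)}
ERROR_OPPORTUNITIES[6] = 3
ERROR_OPPORTUNITIES[11] = 3


def sum_inputs(task, seconds, correct, errors, likert):
    """Collect the four per-attempt inputs for one task attempt."""
    if len(likert) != 3:
        raise ValueError("expected three Likert answers")
    return {
        "task_time": seconds,
        # 1 only for a correct answer given before the time limit
        "completion": 1 if correct and seconds < TIME_LIMIT_S else 0,
        "error_rate": errors / ERROR_OPPORTUNITIES[task],
        "satisfaction": sum(likert) / 3,
    }


attempt = sum_inputs(task=6, seconds=80, correct=True, errors=1, likert=[4, 4, 5])
print(attempt)
```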
Weighting, standardizing and averaging were done with a tool made available by one of
the authors of sum (see footnote 1). A sum score is task-based by nature and can be averaged to get a
score for the whole system. Results from the sum analysis are presented in the next chapter
along with margins of error at the 95 % confidence level.
6.3 Quantitative eye-tracking
The smi eye-tracking device records pupil diameter along with pupil location. The pupil
automatically adjusts the amount of light arriving at the retina, but in previous research pupil
diameter has also been found to correlate with cognitive load during short-term memory
tasks [41]. For this study, the hypothesis was that pupil dilation is a measure of
mental effort in hci (as presented in a recent article [14]) and thus would correlate with
other task usability measures as well.
For every user, pupil diameter data was exported from the smi BeGaze eye-tracking
analysis software as ascii .txt files. An Excel macro was programmed to process the
data. First, all samples where one or both eyes were not measured were dropped.
Then, the average pupil diameter (averaged over both eyes as well) was calculated for each task.
A sample size below one hundred per task was considered insufficient to give a representative
average. Therefore, tasks with fewer than 100 samples (16/400) were excluded from the
final analysis.
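The processing steps above were implemented as an Excel macro in the study; the Python sketch below only mirrors the described logic. The function name, the tuple layout of the samples and the use of `None` for an unmeasured eye are assumptions, as is applying the 100-sample threshold after incomplete samples have been dropped.

```python
from statistics import mean

MIN_SAMPLES = 100  # tasks with fewer valid samples are excluded


def mean_pupil_diameter(samples, min_samples=MIN_SAMPLES):
    """Average pupil diameter for one task, or None if the task is excluded.

    samples: iterable of (left_mm, right_mm); None marks an unmeasured eye.
    """
    # Drop samples where one or both eyes were not measured,
    # then average the two eyes within each remaining sample.
    valid = [(left + right) / 2
             for left, right in samples
             if left is not None and right is not None]
    if len(valid) < min_samples:
        return None
    return mean(valid)


task_samples = [(4.0, 4.2)] * 120 + [(None, 4.1)] * 30
print(mean_pupil_diameter(task_samples))
print(mean_pupil_diameter([(4.0, 4.2)] * 50))
```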
Average fixation duration has been proposed as one possible quantitative usability measure derived from eye-tracking, indicating the complexity of a user interface [29]. Average fixation duration was calculated similarly to the pupil diameter. The only difference
was that tasks with fewer than 10 samples (48/400) were dropped. The eye-tracking device
failed to record any data from one ww user, which accounted for 10 zero-sample
tasks for both quantitative measures.
6.4 Think aloud and qualitative eye-tracking
Think aloud data was analyzed in two ways: textually and verbally together with eye-tracking. For textual analysis, all the think aloud videos were transcribed, including the task
and free browsing phases of every user. Then, the qualitative data analysis software
Atlas.ti (see footnote 2) was used to code users’ speech. For example, if a user had said, “First time I
saw that button, it didn’t occur me to press it”, the sentence would have been coded as
“affordance−”. Or, if a user had said “I like the text, it is clear to read.”, the sentence
would have been coded as “readability+”.
1 http://www.usabilityscorecard.com/
2 http://www.atlasti.com/
The codes that were used were taken from a previous master’s thesis investigating the same material. Data about the proportion of positive and negative comments
and the usability aspects the comments related to was extracted from Atlas.ti. The
results are presented for each magazine separately in the next chapter.
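The tallying of coded comments can be illustrated with a small Python sketch. In the study this data was extracted from Atlas.ti; the function `tally` and the sample codes below are hypothetical and only show how codes of the form “aspect+” and “aspect−” reduce to counts per aspect and to a ratio of positive to negative comments.

```python
from collections import Counter


def tally(codes):
    """Count positive and negative comments per usability aspect."""
    positive, negative = Counter(), Counter()
    for code in codes:
        aspect, sign = code[:-1], code[-1]
        if sign == "+":
            positive[aspect] += 1
        elif sign in ("-", "\u2212"):  # ASCII hyphen or the minus sign
            negative[aspect] += 1
        else:
            raise ValueError(f"code without polarity: {code!r}")
    return positive, negative


codes = ["affordance-", "readability+", "affordance-", "navigation+"]
pos, neg = tally(codes)
ratio = sum(pos.values()) / sum(neg.values())
print(pos, neg, ratio)
```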
In addition to textual analysis, the think aloud videos were analyzed to search for usability problems. Gaze replay videos from the five users who had the most eye-tracking samples
were combined and synchronized with the think aloud videos as shown in Figure 6.1. This
type of analysis was inspired by a similar method described in the article “Using eye
tracking to address limitations in think-aloud protocol” (2005) [20]. The other half of the
users, who had insufficient eye-tracking data, were analyzed from the think aloud videos
only.
The same analysis was done on all videos (think aloud and combined think aloud + eye-tracking): videos were examined to find usability problems. The usability problems found
were grouped according to Nielsen’s heuristics presented earlier (see Section
2.3.1). The number of usability problems per magazine and other results are presented in
the next chapter.
Figure 6.1: Screen capture from a video combining think aloud, gestures and eye-tracking
Chapter 7
Results
Now that the analyses of the measurements have been defined in the previous chapter, it is time to
present the results. All results presented here are plotted with 95 % confidence
intervals calculated from two-tailed t-tests.
7.1 Task time
7.1.1 Total task time
Total task time was measured from the start of task execution to its end (i.e. task completion, running out of time or abortion by the user). Figure 7.4 shows the average task time
for every magazine along with 95 % confidence intervals. The order from highest
average task time to lowest is ar, fb, ww and ps. Descriptive statistics related to Figure
7.4 are presented in Table 7.1.
Further analysis of the differences between the means shows that ar has statistically
significantly longer task times than the others. The differences between the three other
magazines were not statistically significant. This can be seen from Table 7.3, where only
comparisons between ar and the other magazines have p-values below the α = .05 threshold.
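A comparison of two means of this kind can be sketched from summary statistics alone. The Python example below is an unpaired, unequal-variance test with a normal approximation for the p-value; the thesis does not state which t-test variant was used, so the results need not reproduce Table 7.3 exactly. The function name is hypothetical and the input values are the means and standard deviations reported in Table 7.1.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-tailed p-value for a difference of two independent means.

    Uses the unequal-variance standard error and a normal approximation,
    which is close to the t-distribution for about 100 samples per group.
    """
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    t = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Total task times from Table 7.1 (mean, SD, count).
p_ar_ps = two_tailed_p(137.74, 76.17, 100, 104.78, 65.74, 100)
p_ww_fb = two_tailed_p(109.37, 66.83, 100, 116.23, 69.46, 100)
print(f"ar vs ps: p = {p_ar_ps:.3f}")
print(f"ww vs fb: p = {p_ww_fb:.3f}")
```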
Plotting the tasks separately allows a more detailed look into which tasks produce the biggest
differences between magazines in total task times. Each bar in Figure 7.1 shows the task time
averaged over ten users for each task and magazine (task 1 was practice and not
included in the calculations). As was expected from Figure 7.4, ar seems to have the longest
times in most of the tasks.
Figure 7.3 shows task times grouped according to the usability aspects they tested
(presented in Table 5.2). The most significant difference can be seen in visibility, where
ar has the longest task times. The other four usability aspects have more or less the
same task times between magazines.
Figure 7.1: Average total task times for each task with 95 % ci margins of error (the
black vertical lines) (see Table 5.2 for task descriptions)
Figure 7.2: Average task browsing times for each task with 95 % ci margins of error
(the black vertical lines) (see Table 5.2 for task descriptions)
Table 7.1: Key statistics for total task times
      Mean     Median   SD      Range   Min.   Max.   Count
AR    137.74   123.5    76.17   290     10     300    100
WW    109.37   96.5     66.83   294     6      300    100
FB    116.23   102.5    69.46   292     8      300    100
PS    104.78   84       65.74   286     14     300    100
Figure 7.3: Average task times grouped into usability aspects (Visibility: tasks 2, 5,
7; Readability: 3; Layout: 4, Navigation: 6, 11; Image interaction: 7, 8, 9) with 95 %
ci margins of error (the black vertical lines)
Figure 7.4: Total task times averaged
over magazines with 95 % ci margins of
error (the black vertical lines)
Figure 7.5: Task browsing times averaged over magazines with 95 % ci margins of error (the black vertical lines)
Table 7.2: Key statistics for task browsing times
      Mean    Median   SD      Range   Min.   Max.   Count
AR    65.93   49.5     59.92   298     2      300    100
WW    39.26   25       45.6    272     2      274    100
FB    41.21   21       41.21   298     2      300    100
PS    36.25   22       41.75   272     5      277    100
Table 7.3: Results (P (T ≤ t)) of two-tailed t-tests for total task times
      AR     WW     FB
WW    0.01
FB    0.04   0.47
PS    0.00   0.29   0.07
Table 7.4: Results (P (T ≤ t)) of two-tailed t-tests for task browsing times
      AR     WW     FB
WW    0.00
FB    0.00   0.79
PS    0.00   0.63   0.49
7.1.2 Task browsing time
Task browsing time was the time users spent browsing the magazine until they found and
accessed the correct article. Average task browsing times for each magazine are plotted
in Figure 7.5. This shows an even more radical difference between ar and the other
magazines. The order of the magazines is the same as in Figure 7.4 above. Detailed
statistics related to Figure 7.5 are presented in Table 7.2. Statistical analysis of the
differences between the means is presented in Table 7.4. It shows even more significant
differences between ar and the other magazines than in the case of total task times in
Table 7.3. As before, differences between the other three magazines are not statistically
significant, the p-values being higher than α = .05.
When task browsing times are plotted for each task separately, the results give a more
detailed view of the differences. Each bar in Figure 7.2 shows the task browsing time averaged over
ten users for each task and magazine. For instance, the article in task 6 was the hardest
to find in every magazine. Task browsing time had smaller variances than total task
time, as can be seen from the shorter error bars and smaller standard deviations in Table
7.2. This was due to the fact that once the correct article was found, some users spent
more time arriving at the answer (e.g. deciding the most interesting fact in task 4) than
others.
7.2
Figure 7.6 shows the average sus score for each magazine. Although there are no statistically significant differences here due to the small sample size (see Table 7.5), the same
trend continues. The order of the means is ar, ww, fb and ps, from lowest perceived
usability to highest. Table 7.7 shows the descriptive statistics behind the figure.
Single Usability Metric scores from the tasks, averaged over each magazine, are plotted
in Figure 7.7. sum was calculated from task time, completion, error and satisfaction
measurements and averaged over users for each task. As can be seen from the figure
and from Table 7.6, there are no statistically significant differences between magazines;
error margins are smaller here than in sus, but so are the differences between means.
This can be seen from Table 7.8, which shows the descriptive statistics.
Figure 7.6: sus score averaged over
magazines with 95 % ci margins of error (the black vertical lines)
Figure 7.7: sum score averaged over
magazines with 95 % ci margins of error (the black vertical lines)
Table 7.5: Results (P (T ≤ t)) of two-tailed t-tests for sus score (none are below α = .05 limit)
      AR     WW     FB
WW    0.89
FB    0.64   0.75
PS    0.10   0.14   0.25
Table 7.6: Results (P (T ≤ t)) of two-tailed t-tests for sum score (none are below α = .05 limit)
      AR     WW     FB
WW    0.89
FB    0.74   0.65
PS    0.27   0.23   0.43
Table 7.7: Key statistics for sus score
      Mean    Median   SD      Range   Min.   Max.   Count
AR    59      61.25    19.37   55      27.5   82.5   10
WW    60.25   58.75    20.36   65      27.5   92.5   10
FB    63.25   65       20.55   72.5    22.5   95     10
PS    73.5    75       17.53   52.5    42.5   95     10
Table 7.8: Key statistics for sum score
      Mean    Median   SD      Range   Min.   Max.   Count
AR    64.9    66.65    9.74    66.9    41.1   75.2   10
WW    64.29   64.65    10.23   62.6    53.5   81.7   10
FB    66.29   68.65    8.96    68      43.8   77.1   10
PS    69.2    69.35    6.89    57.9    57.2   79.9   10
7.2.1 Satisfaction
User satisfaction was asked after every task for the sum measurement, but it can also be
presented here separately. Besides task time, satisfaction was the only measurement in
sum that produced significant differences. Figure 7.8 shows that ps has the best average
satisfaction score, followed by fb, ww and ar. From Table 7.10 it can be seen that the
difference in average satisfaction score is statistically significant between ps and
ar.
Figure 7.9 shows how satisfaction scores are distributed among individual tasks. Most of
the tasks yield similar satisfaction scores, error margins considered, in all magazines,
but some differences can be found in tasks 5, 7, 8 and 11.
Figure 7.10 shows task satisfaction scores grouped according to the usability aspects they
tested (presented in Table 5.2). The most notable difference can be seen in visibility,
where ar has the lowest satisfaction scores. The other four usability aspects show only
minor differences in satisfaction between magazines.
Satisfaction and sus scores were the only subjective measures in the tests. Satisfaction
questionnaires were filled in immediately after each task, so they measure
task performance more than sus does, as sus was filled in after the free browsing phase.
Nevertheless, the order of the magazines is the same as in sus (see Figure 7.6).
Figure 7.8: Satisfaction score averaged over magazines with 95 % ci margins of error
(the black vertical lines) (see Table 5.2 for task descriptions)
Figure 7.9: Average satisfaction scores given for each task with 95 % ci margins of error
Figure 7.10: Average task satisfaction scores grouped into usability aspects (Visibility:
tasks 2, 5, 7; Readability: 3; Layout: 4, Navigation: 6, 11; Image interaction: 7, 8, 9)
with 95 % ci margins of error (the black vertical lines)
Table 7.9: Key statistics for satisfaction score
      Mean   Median   SD     Range   Min.   Max.   Count
AR    3.35   3.33     1.01   4       1      5      100
WW    3.55   3.67     0.95   4       1      5      100
FB    3.57   3.67     1.07   4       1      5      100
PS    3.80   4        1.01   4       1      5      100
Table 7.10: Results (P (T ≤ t)) of two-tailed t-tests for satisfaction score (ps–ar is
below α = .05 limit)
      AR     WW     FB
WW    0.16
FB    0.15   0.91
PS    0.00   0.07   0.11
7.3 Quantitative eye-tracking
Eye-tracking was used as a quantitative and qualitative uem. The quantitative measures
taken were pupil diameter and fixation duration. Qualitative examination, presented in
Section 7.4.1, was done on the combined eye-tracking–think aloud videos in order to
find usability problems in the magazines.
7.3.1 Pupil diameter
Figure 7.11 shows pupil diameter measures for each magazine. Measured pupil diameters
ranged from 2.7 to 5.5 millimeters, which is consistent with established results for a normal
adult pupil diameter in bright illumination [6, 41]. This measure seems to separate the
magazines into two groups: ar and ww have significantly larger measured pupil diameters
than fb and ps. Table 7.11 confirms this: the only differences between means
that are not significant (α = .05) are ar–ww and ps–fb. Detailed
statistics are presented in Table 7.14.
Table 7.13 shows pupil diameter correlations with other task-level measurements: task
time and satisfaction. As explained before, the hypothesis was that pupil diameter,
indicating cognitive load, would correlate with other usability measures: directly with
task time and inversely with satisfaction. There is a significant correlation between pupil
diameter and task time in ar (in the direction opposite to the hypothesis) and ww. The
correlation between pupil diameter and satisfaction is also significant in fb (in the direction
opposite to the hypothesis) and ps. However, when all measures are combined, the
correlations are negligible.
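The significance of a correlation coefficient can be checked from the coefficient and the sample count alone, as the following Python sketch illustrates. It uses the usual statistic t = r·√(n − 2)/√(1 − r²) and, as a simplifying assumption, a normal approximation for the p-value instead of the t-distribution, so the results are close to but not necessarily identical with those in Table 7.13. The function name is hypothetical; the example values are taken from Table 7.13.

```python
from math import sqrt
from statistics import NormalDist


def correlation_p(r, n):
    """Two-tailed p-value for a Pearson correlation r over n samples."""
    t = r * sqrt(n - 2) / sqrt(1 - r * r)
    return 2 * (1 - NormalDist().cdf(abs(t)))  # normal approximation


# Pupil diameter vs. task time (Table 7.13).
p_ar = correlation_p(-0.27, 100)    # ar only
p_all = correlation_p(-0.05, 389)   # all magazines combined
print(f"ar: p = {p_ar:.3f}, all: p = {p_all:.3f}")
```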
Figure 7.11: Pupil diameter averaged over magazines with 95 % ci margins of error
Table 7.11: Results (P (T ≤ t)) of two-tailed t-tests for average pupil diameters
      AR     WW     FB
WW    0.07
FB    0.00   0.00
PS    0.00   0.00   0.65
Table 7.13: Correlation coefficients between pupil diameter, task time and satisfaction
along with results (P (T ≤ t)) of two-tailed t-tests for correlation coefficients
               AR      WW      FB      PS      All
Task time      -0.27   0.23    -0.15   -0.05   -0.05
T-test         0.01    0.026   0.15    0.60    0.33
Satisfaction   0.09    0.07    0.39    -0.31   0.06
T-test         0.35    0.53    0.00    0.00    0.27
Count          100     90      100     99      389
Table 7.14: Key statistics for pupil diameter measures
      Mean   Median   SD     Range   Min.   Max.   Count
AR    4.06   4.17     0.50   1.96    2.74   4.70   100
WW    4.18   4.19     0.54   2.53    3.02   5.55   90
FB    3.79   3.72     0.67   2.81    2.60   5.40   96
PS    3.83   3.82     0.53   2.40    2.80   5.20   99
7.3.2 Fixation duration
In Figure 7.12, average fixation durations are plotted. fb and ps are in the middle, with ar
having higher and ww lower average fixation durations. The only statistically significant
differences are between ar and the other magazines, as seen from Table 7.12. Detailed numbers
behind the figure are presented in Table 7.15. As mentioned before, the hypothesis was
that average fixation duration during a task would correlate directly with task time and
inversely with satisfaction.
Figure 7.12: Fixation duration averaged over magazines with 95 % ci margins of error (the black vertical lines)
Table 7.12: Results (P (T ≤ t)) of two-tailed t-tests for average fixation durations
      AR     WW     FB
WW    0.00
FB    0.03   0.18
PS    0.01   0.09   0.95
Table 7.15: Key statistics for fixation duration measures
      Mean     Median   SD      Range    Min.     Max.     Count
AR    232.14   229.88   50.63   295.04   102.64   397.68   99
WW    202.89   202.53   42.43   226.78   113.17   339.95   76
FB    214.23   204.52   62.58   306.38   127.63   434.01   87
PS    214.78   211.73   46.12   241.48   125.89   367.38   90
Table 7.16: Correlation coefficients between fixation duration, task time and satisfaction along with results (P (T ≤ t)) of two-tailed t-tests for correlation coefficients
               AR     WW      FB      PS     All
Task time      0.10   -0.16   0.06    0.24   0.10
T-test         0.34   0.13    0.57    0.02   0.06
Satisfaction   0.06   -0.05   -0.11   0.08   -0.02
T-test         0.54   0.66    0.26    0.41   0.67
Count          99     76      87      90     352
Table 7.16 shows the correlations of fixation duration with task time and satisfaction. Unlike the pupil diameter correlations, the combined correlation coefficients have
the sign predicted by the hypothesis. The p-value of the correlation with task time (0.06) is only just above
the .05 limit, indicating a possible relation. In contrast, a correlation with satisfaction is
highly unlikely.
7.4 Think aloud and qualitative eye-tracking
Figure 7.13 and Table 7.17 show summarized results from the qualitative think aloud
data analysis done with Atlas.ti. ps had the best ratio of positive to negative comments,
followed closely by fb. Users were asked to find usability problems, so negative comments
were made more frequently than positive ones.
Table 7.17 also shows the three most commented aspects of the magazines. Image
zoom should be a default feature, and when it was not found (in ar, ww and fb), users
pointed this out. ar was the only magazine where the navigation bar was always visible, but
its design (it faded away too quickly, section divisions were not noticed) confused users.
In ww, fb and ps, when the navigation bar was found, it generated positive comments.
Until then, users showed frustration in their comments because they had to leaf manually
through the magazine to articles and back.
Table 7.17: The number of positive and negative comments from think aloud and the
three most remarked aspects of usability (−/+)
                 AR     WW     FB     PS     All
Positive         16     32     40     33     121
Negative         104    141    126    103    474
Ratio            0.15   0.23   0.32   0.32   0.26
Count            120    173    166    136    595
Image zoom       12/1   4/0    13/0   0/0    29/1
Navigation bar   15/1   3/5    13/4   14/4   45/14
toc              5/0    8/2    5/2    8/1    26/5
Table 7.18: Total number and different usability problems found from observing the
videos
            AR   WW   FB   PS
Total       95   65   60   51
Different   25   22   19   18
toc generated negative comments in ar, because it did not have one; in ww, because
it was difficult to access and the hyperlinks were not noticed; in fb and ps, because it
was too long, it lacked writer names and showed only the first article when several were
stacked vertically.
Figure 7.13: Amount of negative and positive comments about each magazine
Figure 7.14 shows the number of usability problems found during qualitative analysis of
the think aloud videos (20 with and 20 without eye-tracking). The exact numbers are
presented in Table 7.18.
Figure 7.14: Total number and different usability problems found from observing the
videos
7.4.1 Usability problems
Table 7.19 shows the most important usability problems found with qualitative video
analysis (explained in Section 6.4). The first column on the left is a categorizing summary
of the slightly different usability problems in each magazine. The second column shows
the heuristic class (see Section 2.3.1) to which the problems belong (1–8; the last two were
excluded because none of the magazines contained instructions). In the third column,
the severity of the usability problem is marked as L=low (leads to user complaints
and hinders task completion) or as H=high (prevents task completion) based on the
observations from the user tests.
The rest of the columns contain more detailed usability problem descriptions and the number
of different users who encountered and noticed such a problem. If the number is zero,
no users explicitly pointed out the problem, but the observer discovered it. This table
includes only the most important (frequent and/or severe) usability problems; some
minor problems have been left out or combined with others due to space constraints.
7.5 Summary of results
Results from all the measures are presented in Table 7.20. On the left side of the table,
the rank is derived from the means of the measurements only. 95 % confidence intervals are
taken into account on the right side of the table, for the three measures where there were
significant differences. sus, sum and think aloud comment ratio (positive to negative
comments) were not significantly different. Confidence intervals were not calculated for the
number of usability problems found. The individual results and the apparent trend are
discussed further in the next chapter.
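The average rank reported in Table 7.20 is simply the mean of a magazine's ranks over the seven measures. The short Python sketch below recomputes it from the mean-based ranks listed in that table; the dictionary layout and variable names are assumptions made for illustration.

```python
# Mean-based ranks from Table 7.20 (1 = best, 4 = worst).
ranks = {
    "AR": {"TT": 4, "SUS": 4, "SUM": 3, "PD": 3, "FD": 4, "TA": 4, "UP": 4},
    "WW": {"TT": 2, "SUS": 3, "SUM": 4, "PD": 4, "FD": 1, "TA": 3, "UP": 3},
    "FB": {"TT": 3, "SUS": 2, "SUM": 2, "PD": 1, "FD": 2, "TA": 2, "UP": 2},
    "PS": {"TT": 1, "SUS": 1, "SUM": 1, "PD": 2, "FD": 3, "TA": 1, "UP": 1},
}

# Average rank per magazine, rounded to one decimal as in the table.
average_rank = {
    magazine: round(sum(r.values()) / len(r), 1)
    for magazine, r in ranks.items()
}
print(average_rank)
```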
Table 7.21 shows pros and cons of each magazine. This is a summary of all the findings
from the pre-test heuristic evaluation, user test observation and video analysis. Only
those aspects of ps are presented which are different from fb (i.e. image gallery and
zooming). The implications of this table are considered thoroughly in the next chapter.
Table 7.19: Most important individual usability problems by magazine (H.–Heuristics:
1–Visibility of system status, 2–Match between system and the real world, 3–User
control and freedom, 4–Consistency and standards, 5–Error prevention, 6–Recognition
rather than recall, 7–Flexibility and efficiency of use, 8–Aesthetic and minimalist design.
S.–Severity: L–low, H–high) *Dossier refers to a tablet magazine “page”; it can be
vertically longer than screen size (horizontal swipe moves between dossiers, vertical
swipe moves within a dossier)
The ar, ww, fb and ps columns give the number of user occurrences in each magazine, followed by the problem description.
Usability issue | H. | S. | ar | ww | fb | ps
Latency | 1 | L | 3 text size and swipe | - | 7 article or image open | 1 same as previous
Untraditional | 2 | L | 5 no toc, Käynnistys≈Käynnistä | 4 no page numbers | - | -
Pagination | 3 | L | 4 paginated browsing | 2 paginated browsing | - | -
Undo | 3 | L | 0 no “undo” action (single entry for all magazines)
Zoom | 4 | L | 9 no image zoom | 2 no image or text zoom | 6 no image or text zoom | 2 no text zoom
Unpredictable action | 4 | H | 5 in toolbar buttons | 6 in image opening | - | -
Gestures | 5 | L | 2 small buttons in navigation bar | 8 swipe direction error | 7 same as previous | 7 same as previous
Hidden content | 5 | H | 8 inside image carousel | 3 poor affordance ⊕buttons | - | -
Navigation bar | 6 | L | 8 fades away, poor affordance | - | 1 poor affordance to swiping | -
Text–image association | 6 | L | 5 images not next to text | 6 same as previous | - | -
Bookmarking | 6 | L | - | 3 starts always from top page | 5 same as previous | 1 same as previous
Search | 7 | H | 4 no search | 2 no search | 6 no search | 5 no search
Hidden shortcuts | 7 | H | - | 5 bottom bar hard to find | 6 same as previous | 3 same as previous
Scrollable portions | 7 | L | - | 15 hide content, slows browsing | - | -
Article hierarchy | 8 | L | 5 cluttered top-level | - | 5 short articles poorly separated | 5 same as previous
Dossier* continuity | 8 | H | 0 poorly implied | 2 poorly implied | 5 not implied (see Section 8.2.3) | 6 not implied (see Section 8.2.3)
Table 7.20: The order of magazines in all usability measurements (TT: Task time, PD:
Pupil diameter, FD: Fixation duration, TA: Think aloud user comment, UP: Usability
problems from think aloud and eye-tracking videos) (*95 % ci taken into account)
      TT   SUS   SUM   PD   FD   TA   UP   AVG   TT*   PD*   FD*   AVG*
AR    4    4     3     3    4    4    4    3.7   2     2     2     2
WW    2    3     4     4    1    3    3    2.9   1     2     1     1.7
FB    3    2     2     1    2    2    2    2     1     1     1     1
PS    1    1     1     2    3    1    1    1.4   1     1     1     1
Items in the columns are mostly based on user observation and a few (e.g. single-column vs.
multi-column layouts) are based on previous research. Users were asked to find usability
problems, which was the point of this whole study, so many positive items are omitted
from the “Pros” column. For example, even though the magazines had different typography,
legibility did not cause problems in any of the magazines: it was considered to be
a “default” feature and not worth reporting. Only those positive aspects
that were not found in all the magazines are reported.
Table 7.21: A “pros and cons” summary of each magazine based on the entire study
AR
Pros:
- Toolbar was always visible
- Page number was indicated
- Image order was indicated in image carousel
- Articles were clearly separated to different dossiers
- Articles were found directly from top level
- Only magazine with an adjustable font size
- Layout rotated with device
- Image carousel
Cons:
- Toolbar symbols hard to decipher
- Navigation bar faded away, had poor affordance and divisions inside sections
- Relevant content was hidden inside image carousel
- Some articles are divided unnecessarily
- Top level was a jumble of images and bits of texts
- Adjusting text size changed layout drastically and disoriented users
- Paginated: did not allow stepless vertical scrolling
- Multi-column layout
- No image zoom, search feature, cover, toc
WW
Pros:
- When accessed, toolbar stayed visible
- Toolbar had shortcuts to cover and toc
- Cover and toc had hyperlinks
- Vertical pagination separated short articles
Cons:
- Toolbar was very hard to find
- Page number was not indicated
- Poor affordance in hyperlinks
- Allowed stepless vertical scrolling only by holding
- Relevant content was hidden behind hyperlinks (⊕buttons)
- Image opening action was not indicated
- Multi-column layout
- Layout did not rotate with device
- No image zoom, search feature, adjustable text size
FB & PS
Pros:
- When accessed, navigation bar stayed visible
- Page number was indicated
- No content was hidden behind hyperlinks
- Single column layout
- Layout rotated with device
- Stepless vertical scrolling
Cons:
- Navigation bar was very hard to find
- Dossier continuation downwards was not implied on first page
- No shortcut to toc
- No image zoom, search feature, adjustable text size, cover
PS
Pros:
- Image carousel with zoom
Cons:
- Buggy image gallery
Chapter 8
Discussion
In this chapter, the results presented in the previous chapter are discussed further.
The research questions are answered and problems that occurred during the research are
scrutinized. A summary of the results concerning each magazine is presented along with
the relation of these findings to previous research. Finally, the validity and reliability
of the research are discussed.
8.1 Summary of the summative usability evaluations
In the previous chapter, seven measures were suggested for magazine usability evaluation.
Four of them (task time, satisfaction, pupil diameter and fixation duration) showed
statistically significant differences between magazines. However, of these four, only task
time (efficiency) and satisfaction score (satisfaction) are established indicators of
usability according to iso [37]. Task completion rates (effectiveness)
did not show differences.
8.1.1 Usability implications of task time and satisfaction scores
Task time was straightforward to measure. The difficulty was deciding how to treat incomplete
tasks, for which no guidance was found in the literature. The number of incomplete
samples was so small compared to the total number of task time samples that, even if
they were handled incorrectly, the overall results were not affected. Task time and
satisfaction score could then be analyzed as indicators of the usability of the
magazine user interface.
From these two measures and their margins of error, it can be stated that ps had better
usability than ar in the context of this study. A more thorough task-level analysis
reveals the usability aspects from which the differences stem. Figures 7.1, 7.2 and 7.9
show how total task time, task browsing time and satisfaction scores are distributed
between tasks. Figures 7.3 and 7.10 group task-level measures from the task time and
satisfaction score graphs into five usability aspects. (Task descriptions and the usability
aspects they tested are presented in Table 5.2.)
Readability (task 3) and Layout (task 4) show no significant differences between magazines in the two figures.
In Navigation (tasks 6 and 11), differences in both time and satisfaction can be seen. fb
differs most from the others in task 6 (Figure 7.2), signifying poor navigation, although the ps
magazine is identical with respect to this particular task. The only explanation for this can
be found in Table 5.1, which shows that the proficiency level of the fb users happened
to be lower than that of the ps users. This implies that inexperienced users need more cues for
dossier¹ continuation (which was a severe usability problem in fb and ps) than more
experienced users.
ww was the only magazine that presented the author’s name in the toc, which could have
led to the lower task times and higher satisfaction scores in task 11 (Figures 7.1 and
7.9) and in Navigation (Figures 7.3 and 7.10).
Visibility (tasks 2, 5 and 7) in both figures makes the point that all relevant
content should be clearly visible, not hidden behind obscure hyperlinks. Time and satisfaction
levels in the figures show that this is especially a problem in ar, where even the correct
articles for tasks 5 (see Figure 8.1) and 7 were nearly impossible to find. ww’s
scrollable columns also worked poorly, which can be seen from the low satisfaction scores in
task 5 (Figure 7.9).
In Image interaction (tasks 8, 9 and 10), some minor differences can be seen from the figures.
The ability to zoom images results in a better user experience (satisfaction), even though it
does not necessarily translate directly into a more efficient user interface (task time), which
can be seen from ps task 8 (Figures 7.9 and 7.1). The low scores for ww in the same
task show that partially visible, scrollable images are a very poor choice for infographic
presentation.
An image carousel seems to be a quicker way to browse through many images than opening
them individually, as could be expected. Total task times in task 9 are low for ar, but
not for ps (the other magazine with an image carousel), which suggests that an image
carousel is only beneficial when it is not buggy and shows all of the article’s images in the
same carousel.
¹ See the caption of Table 7.19.
8.1.2 Quantitative and qualitative eye-tracking result analysis
Pupil diameter was the first of the two measures extracted from the eye-tracking data. From
Table 8.1, it can be seen that pupil diameter has a significant negative correlation of −.518
with the age of the user. This supports the validity of the measurements, because human pupil
size is known to decrease with age [6]. However, Table 7.13 shows that pupil diameter did not
correlate on a task basis with satisfaction or task time.
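As an illustration of the kind of check described above, a Pearson correlation coefficient between user age and mean pupil diameter could be computed as in the minimal Python sketch below. The sample values are invented purely for illustration and are not the study data; the function name pearson_r is likewise an assumption.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical illustration data (NOT from the study):
ages = [22, 27, 31, 38, 45, 52, 60]
pupil_mm = [4.1, 4.3, 3.9, 3.8, 3.6, 3.7, 3.2]

r = pearson_r(ages, pupil_mm)
print(f"r = {r:.3f}")  # negative here: pupil diameter falls as age rises
```

A negative r in such data would be in line with the age effect reported in [6], but the coefficient alone says nothing about cognitive load.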
Even though pupil diameter data is recorded in eye-tracking by default, the measurements are
not normally used in usability studies. This is because the effect of cognitive
load or arousal on pupil size is easily masked by changes in the amount of light arriving
at the eye [62]. Also, baseline measurements with different screen brightness settings would
have been necessary to deal with individual differences. They would have enabled calibration
to get reliable results from different magazine layouts (with different amounts of
whitespace). The decision to investigate pupil diameter was made retrospectively, so the tests
were not conducted with these constraints in mind.
Looking at the differences in average pupil sizes between magazines in Figure 7.11,
three categories can be seen: ww in the first, ar in the second, and fb
and ps (which look effectively the same) in the third, with the smallest pupil sizes. The
differences between pupil diameters could have been caused by the different layouts, which
had different amounts of whitespace, as can be seen from Figures 4.15, 4.16 and 4.17, but
no measurements were made to confirm this.
Fixation duration, on the other hand, has been used in several usability studies before
[13, 22, 30]. Fixation duration is usually thought of as a measure of cognitive processing
difficulty. The eye-mind hypothesis (see Section 2.3.5) states that people look at what they
are thinking about. From this, it follows that they also look at something for as long as
they think about it. Therefore, when a part of a user interface is difficult to process, it
will generate longer fixations. On the other hand, long fixations can also indicate an
interesting and intriguing user interface, which makes the fixation–usability relation a
U-shaped curve [22].
Nevertheless, some conclusions can be drawn from the fixation duration data. Cowen et al.
(2002) found that web sites with a high “clutter index”, i.e. a small amount of white
space and densely clustered items, had layouts that were more difficult to process and
generated longer fixations on average. This could be why ar had the longest fixations, as its
top-level layout is denser than in the other magazines. To enable more in-depth
analysis of the eye-tracking data, the data should have been divided into phases depending on
the stage of cognitive processing the user is going through [22].
Qualitative eye-tracking produced important findings about affordance. Affordance, in
the context of hci, was famously defined by Norman in his book (1988) to refer to “the
perceived and actual properties of the thing, primarily those fundamental properties
that determine just how the thing could possibly be used” [56]. Web sites have mostly
gotten rid of the problem, but today poor affordance design plagues iPad applications,
as mentioned before (see Section 3.3.4).
For instance, eye-tracking of five ww users revealed six cases in which a user looked at a
⊕button or a hyperlink in the toc containing the correct answer (see Section 4.2), but did
not tap it. This clearly indicates that the hyperlink design is faulty. Hyperlinks in ar
were not made to look touchable, but they were the only way to access articles, so users
quickly learned the interaction. In fb and ps, the toc was the only place where hyperlinks
were evident, and users had few problems thinking of them as buttons.
Eye-tracking is seldom used with mobile devices. In a meta-analysis of 100 mobile
device usability studies, only two had used eye-tracking [21]. The biggest challenges
are the movement of the mobile device and the user’s hands, as they can block the eye-tracking
signal. Some suggestions on how to use a standalone eye-tracker with mobile devices have
been proposed, but the only sure solution would be to use a head-mounted eye-tracker
[73]. In this study, the iPad was fixed in position, but hands blocked the signal about half of
the time. This was a conscious compromise between ergonomics (natural hand position)
and eye-tracking data quality.
8.1.3 Low reliability of SUS and SUM scores
sus and sum scores did not produce statistically significant differences due to the small
sample size. sum in particular could be used as a comprehensive uem, but it requires
a lot of resources: the number of users should be closer to a hundred than ten, and the
examination of error rates is time-consuming. In retrospect, the time-consuming sum could have
been dropped from the methods used. Had a pre-test simulation been run, it
could have shown that the differences would be too small to detect.
However, sus and sum together helped to reveal the trend in the results, even though
their differences were not statistically significant. To summarize the summative part of the
evaluations: ps and fb are consistently ahead of ar and ww in terms of usability.
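For reference, the standard scoring rule published with the System Usability Scale [9] can be sketched in Python as below. This is a minimal illustration of the published rule only; the function name sus_score is an assumption, and whether the study applied any variation of the rule is not stated in this chapter.

```python
def sus_score(responses):
    """Compute the System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings on a 1-5 scale, in questionnaire order.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")
    total = 0
    for index, rating in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution = rating - 1.
            total += rating - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution = 5 - rating.
            total += 5 - rating
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5


print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # best possible answers
print(sus_score([3] * 10))                          # all-neutral answers
```

Because each respondent yields a single score, ten users per magazine give only ten data points per group, which is one reason the differences are hard to establish statistically.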
8.2 Summary of the formative usability evaluations
This section presents the most important differences and similarities between the four
magazines. Findings are linked to previous research. All of the magazines had some good
and some bad qualities; none was perfect. A hypothetical model of a tablet magazine
with “perfect usability” is presented in the next chapter.
8.2.1 Findings from AnyReader version
In ar, the most severe problems were related to finding content. Relevant content (such
as technical specifications tables) was hidden inside the image carousel. Also, eye-tracking
revealed that headlines in the top level did not stand out as they are supposed to, because
of faint typography and competing elements (see Figure 8.1 for an eye-tracking example
and Figure 7.2: task 5 for notable differences in task time).
An eye-tracking study has shown that extra information in search results competes with
relevant information and impairs task times in navigational tasks [30]. The toolbar on top
was docked and thus visible all the time, whereas the bottom navigation bar faded away
too quickly for users to exploit it. Eye-tracking revealed that the toolbar buttons had good
affordance and saliency (users noticed and tapped them quickly), but the navigation bar
was mostly thought of only as an indicator of placement.
Figure 8.1: Eye-tracking shows how the correct headline is not “seen” even though it is quickly
looked at, because of the more demanding typography below it, which was not a headline
The article “Tietokoneen tulevaisuus on täällä” was divided erroneously into two dossiers
during the dynamic layout: the technical specifications table was in a different article than the
images of the devices. This was deliberately addressed in task 7, and 5/10 users failed
to complete the task. Content was paginated horizontally and vertically, and stepless
scrolling was not allowed. Most users in this study preferred to scroll steplessly (like in
a web browser), which is consistent with previous studies [8]. Finally, an eye-tracking study of
online newspapers has suggested that a single-column layout is read more effortlessly than
a multi-column one [59].
8.2.2 Findings from retail version (Woodwing)
ww had problems with navigation inside the magazine. The toolbar was hard to find (it was
opened by a tap at the bottom): 7/10 users did not find it at all during the tasks. The lack of
page numbers left users unsure of their location inside the magazine; information on the
current location is crucial for navigating effectively in any information space [3]. Hyperlinks
in the cover, toc and articles (⊕buttons) had poor or non-existent affordance, which is a
common problem in iPad applications [10].
The image opening action lacked consistency and affordance: it was not indicated what would
happen if an image was tapped, or if anything would happen at all. Scrollable portions of
a page frustrated many users, as has been the case in previous research (see Section 3.3.4).
In order for them to work, their scrollability should be properly indicated [10, 11].
Lastly, even though most users in this and in previous e-reading studies prefer portrait
orientation, landscape orientation should also have been supported [78, 79].
8.2.3 Findings from Fancybox and Photoswipe versions
fb and ps had two major problems: the navigation bar and the indication of article continuity.
7/20 users did not find the navigation bar during the tasks, and most of the rest found it
by accident. Some users requested a “shorter” shortcut to the toc to avoid scrolling the
navigation bar. Shortcuts allow more effective usage of any user interface and they are
mentioned in the heuristics (see Section 2.3.1).
Another severe usability problem was found in task 6, where users had to find a short
article situated at the end of the magazine. The dynamic layout system in fb and ps had
by accident made the first article of the “Vikatila” dossier exactly as long as the screen height.
The other articles, including the one searched for in task 6, were below the first one, but most
users had problems noticing the continuation. Finally, ps was the only magazine with an
image zoom, but the image gallery had some glitches (controls disappeared abruptly).
In conclusion, a good tablet magazine, at least from the usability point of view, combines
features from all the tested versions. A tablet magazine should allow free scrolling
and a free choice of device orientation. In addition, this and previous research have shown
that a good digital publication has the affordances of print along with the interaction
possibilities of a digital environment [43]. A model for a “perfect” magazine is envisioned in the last chapter.
8.3 Reliability and validity
The small sample size was the biggest obstacle to obtaining reliable results. This is a
common problem for every usability study involving real test users when statistically
significant quantitative results are needed. A 95 % confidence level was
used throughout this study. This showed that some of the results were unreliable due
to the natural variability between users. Ten users per magazine was too small a sample size
to level out these differences.
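To illustrate why ten users per magazine give wide margins of error, the sketch below computes a 95 % confidence interval for a mean from ten samples using Student's t distribution. It is a minimal Python illustration, not the procedure used in the study: the task times are invented, the names are assumptions, and the critical value 2.262 (two-sided, 9 degrees of freedom) is taken from a standard t-table.

```python
import math
import statistics

# Two-sided 95 % critical value of Student's t for 9 degrees of freedom
# (n = 10 users per magazine), taken from a standard t-table.
T_CRIT_DF9 = 2.262


def mean_ci95_n10(samples):
    """Return (mean, lower, upper) of a 95 % CI for the mean of ten samples."""
    if len(samples) != 10:
        raise ValueError("this sketch hardcodes the t value for n = 10")
    mean = statistics.mean(samples)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    margin = T_CRIT_DF9 * sem
    return mean, mean - margin, mean + margin


# Hypothetical task times in seconds for two magazines (NOT study data).
times_a = [41, 55, 38, 62, 47, 70, 44, 52, 58, 49]
times_b = [50, 66, 45, 80, 59, 73, 48, 61, 69, 57]

mean_a, lo_a, hi_a = mean_ci95_n10(times_a)
mean_b, lo_b, hi_b = mean_ci95_n10(times_b)
overlap = lo_a <= hi_b and lo_b <= hi_a
print(f"A: {mean_a:.1f} [{lo_a:.1f}, {hi_a:.1f}]")
print(f"B: {mean_b:.1f} [{lo_b:.1f}, {hi_b:.1f}]")
print("intervals overlap:", overlap)
```

Checking whether two intervals overlap is a conservative visual comparison rather than a formal test, but it shows how user variability at n = 10 can hide a real difference in means.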
Usability literature has contradictory information on how uems affect one another. For
example, think aloud has been found both to speed up and to slow down task times. Also,
eye-tracking is advised to be used both alone and together with think aloud. There is no
unanimous theory on how uems should be combined. In this study, it was decided to obtain as many
measurements as possible from the small number of users to at least reveal some trend
from the more or less unreliable results.
Finally, considering reliability, the evaluator effect has to be taken into account. A
meta-analysis of heuristic evaluation and think aloud studies found that different
evaluators found different usability problems [32]. This means that the usability problems
found are, to some degree, dependent on the evaluator. In this study, this concerns
only the qualitative analysis of the think aloud–eye-tracking videos, which was done by a
single person. However, the most severe and frequent usability problems are usually
found by all evaluators, which is also believed to be the case here [51].
Usability evaluations are always sensitive to the context they have been made in. The
context consists of users, tasks and the environment. In this study, it was shown that
fb and ps had the best usability, but the results are strictly speaking only applicable
in this context. The selection of users, tasks and the environment was done so that the
results would be valid in a common context. If this was achieved, then the results of the
summative usability evaluations are valid and generalizable.
As stated before, usability is strictly dependent on context: users, tasks and
environment. The following steps were taken in order to make the context of the user tests
suitable for generalization. Users were selected so that together they represented
the Tietokone tablet magazine readership. Tasks were designed to mimic common magazine
use. The test setup was constructed so that it allowed a
comfortable seating position for the user. The test environment was the most problematic,
being a (temporary) usability laboratory with a test facilitator and a video camera next
to the user. Nevertheless, a strong case can be made from the results of this study that
the fb and ps magazines have the best usability also in more common contexts.
8.3.1 Influence of user background
Table 8.1 shows how user background correlated with the various measures. There are ten cases
where the correlation is significant (marked by an asterisk). The most interesting finding
is that the sus score correlates negatively with the amount of prior experience
with tablets and e-reading. This would imply that experienced users have used better
tablet magazines than the tested ones. sus also correlates with satisfaction, which
could be expected.
Age affects pupil diameter, as can be seen from the table. This is explained in more
detail in Section 8.1.2. Age also correlates with tablet ownership and Tietokone reading,
which was expected. However, no explanation can be given to correlations between
eye-tracking measures and questions.
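The significance marking of a correlation coefficient can be illustrated with the usual t-test for a Pearson r. The Python sketch below is a minimal illustration under a loudly stated assumption: it takes n = 40 (four groups of ten users, all with complete data), which this section does not confirm, and uses the two-sided 5 % critical value 2.024 for 38 degrees of freedom from a standard t-table. The function names are illustrative.

```python
import math

# ASSUMPTION: n = 40 users with complete data (not confirmed by the text).
N_USERS = 40
# Two-sided 5 % critical value of Student's t for 38 degrees of freedom,
# taken from a standard t-table.
T_CRIT_DF38 = 2.024


def t_statistic(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 d.o.f.)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(r, n=N_USERS, t_crit=T_CRIT_DF38):
    """True if |t| exceeds the two-sided critical value."""
    return abs(t_statistic(r, n)) > t_crit


# Coefficients as printed in Table 8.1.
examples = [
    ("SUS vs Satisfaction", 0.674),
    ("PD vs Age", -0.518),
    ("FD vs Interest in tech.mags", 0.356),
    ("Age vs TT", 0.291),
]
for label, r in examples:
    t = t_statistic(r, N_USERS)
    print(f"{label}: r = {r:+.3f}, t = {t:+.2f}, significant = {is_significant(r)}")
```

Under this assumption the first three coefficients exceed the threshold and the last one does not, which agrees with the asterisks printed for these four entries of Table 8.1.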
Table 8.1: Correlation coefficients between user background and some usability measurements (TT: Task time, PD: Pupil diameter, FD: Fixation duration, Sat: Satisfaction; all averaged per user) (*Correlation is significant at the α = .05 level (2-tailed))
                                  SUS       PD       FD      Age       TT      Sat
SUS                                 -     .183    -.138    -.148    -.179    *.674
Pupil diameter                   .183        -    -.259   *-.518    -.036     .123
Fixation duration               -.138    -.259        -     .156     .133    -.162
Age                             -.148   *-.518     .156        -     .291    -.036
Task time                       -.179    -.036     .133     .291        -     .117
Satisfaction                    *.674     .123    -.162    -.036     .117        -
Q: Graphic design               -.187    -.131     .262    -.096    -.133    -.208
Q: Owns tablet                 *-.370     .251     .120    -.089     .140    -.197
Q: Used tablet                  -.274     .173     .033    -.289    -.097    -.213
Q: Read tablet                 *-.582     .115     .115    -.104    -.050    -.225
Q: Has Apple                    -.118     .275     .041   *-.369    -.109    -.124
Q: Will buy Apple               -.181     .237     .075    -.048    -.019    -.176
Q: Read Tietokone                .039   *-.433     .092    *.405    -.032    -.035
Q: Interest in tech.mags         .095    -.115    *.356    -.085    -.127     .008
Q: Reads print regularly        -.093     .099   *-.372    -.002    -.122     .020
Proficiency                    *-.367     .040     .160    -.165    -.162    -.248
Chapter 9
Conclusion
New mobile e-reading devices, tablet computers, have been proposed as a salvation for
publishing houses combating declining print sales. More and more book, newspaper
and magazine content is being made digitally available to consumers. The standard form
of the digital publication has not yet been established, and many publishers are hesitant to
publish digitally until it has. Application- and web-based magazines with varying
amounts of interaction are all available for mobile devices. This usability study has
compared four versions of the same magazine: two application-based and two web-based
solutions.
Four digital versions of the same Tietokone magazine issue were evaluated. The retail
version ww, short for the Woodwing publishing solution used to build it, was a traditional
image-based application with a static manual layout. ar, short for the AnyReader
e-reading solution, had a different layout and structure than ww. fb and ps were two
slightly different versions of the same html5 web-based magazine. fb is short for
Fancybox, a simple pop-up window image viewing system, and ps for Photoswipe, a
browsable and zoomable image gallery used in an otherwise similar magazine. The structure of
the latter two magazines was similar to ww, but the dynamic layout was different from the
layouts of ww and ar.
User tests were done with four groups of ten users, each individual testing one version of
the magazine for an hour. Eleven tasks were designed with a heuristic evaluation as a basis.
The first task was a practice task meant for those who had not used a tablet computer before.
Time, subjective satisfaction, think aloud and eye-tracking data were recorded from the other ten
tasks. The data was analyzed qualitatively (think aloud and eye-tracking videos) and
quantitatively (task time, pupil diameter, fixation duration, sus, sum).
Formative evaluation based on heuristic evaluation and think aloud–eye-tracking video
observation revealed many usability problems in each magazine, some easier to correct
than others. Each of the magazines had its own set of pros and cons. The following model
for a tablet magazine maximizes the “pros–cons” ratio and can be argued to have maximum
usability.
Users want more freedom of choice in a digital environment, so stepless scrolling (in fb and
ps), landscape orientation (ar, fb and ps), adjustable text size (ar) and image zoom
(ps) should be enabled. Navigation around the digital magazine has to be as effortless as
in print, so easily discovered shortcuts (ar), page numbers (ar, fb and ps), a toc (ww,
fb and ps) and a page browser (the ww page browser did not work in the tests; the fb and ps
navigation bar was similar to it) should be available. None of the magazines tested fully
exploited the benefits of the digital platform: a search feature and multimedia content (video,
sound) were most frequently missed. Finally, all content should be easy to find by
browsing, by making it either directly accessible (fb and ps) or available behind clearly marked
hyperlinks (hyperlinks in every magazine lacked affordance, i.e. did not look touchable).
Magazines were compared with summative evaluations based on task times, sus and sum
scores, fixation duration and pupil diameter, think aloud comments and the number of
usability problems found. Even though the sample size in this study was too small
for some measures to show statistically significant differences, a clear trend can be
seen in the measures. In this context (usability testing is always dependent
on the context = users + environment + tasks), the html5-based magazines fb and
ps with dynamic layout had the best usability. The retail version ww was second, and
ar, also with dynamic layout, fared worst according to the summative usability evaluation.
Eye-tracking proved to be a challenging usability evaluation method. Both qualitative
(gaze replay analysis together with think aloud videos) and quantitative (pupil diameter
and fixation duration) analyses were done on the eye-tracking data. Pupil diameter
measures did not correlate with other usability metrics, presumably because variability in user
interface brightness had a greater effect on them. However, average fixation duration seemed to
reflect user interface complexity and was greatest in ar. Qualitative
eye-tracking analysis gave valuable insight into the affordance of hyperlinks and the saliency of
layout elements.
This study dealt strictly with usability; visual qualities were not addressed at all. One
could argue that automatic layout systems make dull and homogeneous layouts¹, but this
has to be researched further. One solution could be a “semi-automatic” layout system,
where the basic layout is done automatically and final adjustments are left to graphic
design experts. However, whether the layout is automatic or manual, the latest developments²
suggest that html5, rather than applications, is the technology of the future for digital publishing.
¹ On the other hand, one user, who was acquainted with the print version of Tietokone, described
the html5 version as “brandlike” without knowing the truth.
² “The new iPad” released in March 2012 has four times as many pixels as iPad 2, which increases the
memory requirements of an image-based publication substantially.
References
[1] M. G. Albanesi, R. Gatti, M. Porta, and A. Ravarelli. Towards Semi-Automatic
Usability Analysis through Eye Tracking. In CompSysTech’11 Proceedings of the
12th International Conference on Computer Systems and Technologies, pages 135–
141, New York, NY, USA, 2011.
[2] J. Arnowitz and E. Dykstra-Erickson. Usability as Science. Interactions, 12(2):7–8,
2005.
[3] D. Benyon. Navigating Information Space: Web site design and lessons from the
built environment. PsychNology Journal, 4(1):7–24, 2006.
[4] D. C. Berry and D. E. Broadbent. The role of instruction and verbalization in
improving performance on complex search tasks. Behaviour & Information Technology, 9:175–190, 1990.
[5] R. Bias. Interface-Walkthroughs: Efficient Collaborative Testing. IEEE Software,
8(5):94–95, 1991.
[6] J. E. Birren, R. C. Casperson, and J. Botwinick. Age Changes in Pupil Size. Journal
of Gerontology, 5(3):216–221, 1950.
[7] T. Boren and J. Ramey. Thinking Aloud: Reconciling Theory and Practice. IEEE
Transactions on Professional Communication, 43(3):261–278, 2000.
[8] C. Braganza, K. Marriott, P. Moulder, M. Wybrow, and T. Dwyer. Scrolling behaviour with single- and multi-column layout. In Proceedings of the 18th international conference on World wide web - WWW ’09, pages 831–840, New York, New
York, USA, Apr. 2009.
[9] J. Brooke. SUS – A quick and dirty usability scale. Usability evaluation in industry,
page 7, 1996.
[10] R. Budiu and J. Nielsen. Usability of iPad Apps and Websites: 1st edition. Technical
report, 2010.
[11] R. Budiu and J. Nielsen. Usability of iPad Apps and Websites: 2nd edition. Technical report, 2011.
[12] P. A. Carpenter and M. A. Just. Eye fixations and cognitive processes. Cognitive
Psychology, 8(4):441–480, Oct. 1976.
[13] A. Çöltekin, B. Heil, S. Garlandini, and S. I. Fabrikant. Evaluating the Effectiveness
of Interactive Map Interface Designs: A Case Study Integrating Usability Metrics
with Eye-Movement Analysis. Cartography and Geographic Information Science,
36(1):5–17, Jan. 2009.
[14] S. Chen, J. Epps, N. Ruiz, and F. Chen. Eye activity as a measure of human mental
effort in HCI. In Proceedings of the 15th international conference on Intelligent user
interfaces - IUI ’11, pages 315–318, New York, New York, USA, 2011.
[15] P. F. Chong, Y. P. Lim, and S. W. Ling. On the Design Preferences for Ebooks.
IETE Technical Review, 26(3):213–222, 2009.
[16] P. Chynal, J. Szymański, P. Campos, N. Graham, J. Jorge, N. Nunes, P. Palanque,
and M. Winckler. Remote Usability Testing Using Eyetracking. INTERACT 2011
Human-Computer Interaction (Lecture Notes in Computer Science), 6946:356–361,
2011.
[17] L. Cooke. Improving usability through eye tracking research. In IPCC 2004 International Professional Communication Conference Proceedings, pages 195–198.
IEEE, 2004.
[18] L. Cooke. Eye Tracking: How It Works and How It Relates to Usability. Technical
Communication, 52(4):456–463, 2005.
[19] L. Cooke. Is Eye Tracking the Next Step in Usability Testing? 2006 IEEE International Professional Communication Conference, pages 236–242, Oct. 2006.
[20] L. Cooke and E. Cuddihy. Using eye tracking to address limitations in think-aloud
protocol. In IPCC 2005 International Professional Communication Conference Proceedings, pages 653–658. IEEE, 2005.
[21] C. K. Coursaris and D. J. Kim. A Meta-Analytical Review of Empirical Mobile
Usability Studies. Journal of Usability Studies, 6(3):117–171, May 2011.
[22] L. Cowen, L. J. Ball, and J. Delin. An Eye Movement Analysis of Webpage Usability.
In People and Computers XVI - Memorable yet Invisible: Proceedings of the HCI
2002, pages 1–14, 2002.
[23] T. Dimond. Devices for reading handwritten characters. In Proceedings of Eastern
Joint Computer Conference, pages 232–237, 1957.
[24] N. Eger, L. J. Ball, R. Stevens, and J. Dodd. Cueing Retrospective Verbal Reports
in Usability Testing Through Eye-Movement Replay. In BCS-HCI ’07 Proceedings of the 21st British HCI Group Annual Conference on People and Computers:
HCI...but not as we know it, pages 129–137, Swinton, UK, 2007.
[25] C. Ehmke and S. Wilson. Identifying Web Usability Problems from Eye-Tracking
Data. In BCS-HCI ’07 Proceedings of the 21st British HCI Group Annual Conference on People and Computers: HCI...but not as we know it, pages 119–128,
Swinton, UK, 2007. British Computer Society.
[26] A. K. Ericsson and H. A. Simon. Verbal Reports as Data. Psychological Review,
87(3):215–251, 1980.
[27] A. K. Ericsson and H. A. Simon. Protocol Analysis: Verbal Reports as Data. The
MIT Press, Cambridge, MA, USA, 1984.
[28] Forrester. US consumer tablet forecast update, 2011 to 2016. Technical report,
2012.
[29] J. Goldberg and X. Kotval. Computer Interface Evaluation Using Eye Movements: Methods and Constructs. International Journal of Industrial Ergonomics,
24(6):631–645, 1999.
[30] Y. Habuchi, M. Kitajima, and H. Takeuchi. Comparison of eye movements in
searching for easy-to-find and hard-to-find information in a hierarchically organized
information structure. In ETRA’08 Proceedings of the 2008 symposium on Eye
tracking research & applications, number 212, pages 131–134, New York, New York,
USA, 2008.
[31] H. Heikkilä. eReading User Experiences: eBook Devices, Reading Software & Contents. Technical Report 54, NextMedia, 2011.
[32] M. Hertzum and N. E. Jacobsen. The evaluator effect: a chilling fact about usability
evaluation methods. Int. Journal of Human-Computer Interaction, 15(1):183–204,
2003.
[33] A.-m. Horcher and M. Cohen. Ebook Readers: An iPod for Your Books in the
Cloud. Communications in Computer and Information Science Part I, 174:22–27,
2011.
[34] C.-h. Huang and C.-m. Wang. Usability Analysis in Gesture Operation of Interactive
E-Books on Mobile Devices. Design, User Experience, and Usability Lecture Notes
in Computer Science, 6769:573–582, 2011.
[35] A. Huthwaite, C. E. Cleary, B. Sinnamon, P. Sondergeld, and A. McClintock. Ebook
Readers: Separating the Hype from the Reality. In Proceedings of 2011 ALIA
Information Online Conference & Exhibition, page 12, Brisbane, Australia, 2011.
QUT Library.
[36] IDC. Media Tablet Shipments Outpace Fourth Quarter Targets. Worldwide Quarterly Media Tablet & e-Reader Tracker, 2012.
[37] ISO. ISO 9241-11:1998 Ergonomic requirements for office work with visual display
terminals (VDTs) - Part 11: Guidance on usability. Technical report, International
Organization for Standardization, 1998.
[38] ISO. ISO/IEC 25062: Software engineering - Software product Quality Requirements and Evaluation (SQuaRE) - Common Industry Format (CIF) for usability
test reports, 2006.
[39] E. Johnson. Touch display—a novel input/output device for computers. Electronics
Letters, 1(8):219–220, 1965.
[40] S. Johnson and P. Prijatel. The Magazine from Cover to Cover. Oxford University
Press, 2006.
[41] D. Kahneman and J. Beatty. Pupil Diameter and Load on Memory. Science,
154(3756):1583–1585, 1966.
[42] C. Lewis, P. Polson, C. Wharton, and J. Rieman. Testing a Walkthrough Methodology for Theory-Based Design of Walk-Up-and-Use Interfaces. In Chi ’90 Proceedings, pages 235–242, 1990.
[43] C. C. Marshall and S. Bly. Turning the page on navigation. In Proceedings of the
5th ACM/IEEE-CS joint conference on Digital libraries - JCDL ’05, page 225, New
York, New York, USA, June 2005.
[44] T. Masalin. iPad & iPhone käsikirja. Docendo, Jyväskylä, 2011.
[45] D. Mauney, J. Howarth, A. Wirtanen, and M. Capra. Cultural similarities and
differences in user-defined gestures for touchscreen user interfaces. In CHI EA ’10
Proceedings of the 28th of the international conference extended abstracts on Human
factors in computing systems, pages 4015–4020, New York, New York, USA, 2010.
[46] D. Mayhew. The Usability Engineering Lifecycle. Morgan Kaufmann Publishers,
San Francisco, CA, 1999.
[47] B. A. Mitchell, L. Christian, and T. Rosenstiel. The Tablet Revolution and What
it Means for the Future of News. Technical Report 202, Pew Research Center’s
Project for Excellence in Journalism, 2011.
[48] MPA. The Mobile Magazine Reader - A Custom Study of Magazine App Users.
Technical report, MPA–The Association of Magazine Media, 2011.
[49] J. Nielsen. Finding usability problems through heuristic evaluation. In CHI’92
Proceedings of the SIGCHI conference on Human factors in computing systems,
pages 373–380, New York, New York, USA, 1992.
[50] J. Nielsen. Usability engineering. Academic Press, Boston, 1993.
[51] J. Nielsen and T. K. Landauer. A mathematical model of the finding of usability
problems. In CHI’93 Proceedings of the SIGCHI conference on Human factors in
computing systems, pages 206–213, New York, USA, May 1993.
[52] J. Nielsen and J. Levy. Measuring Usability: Preference vs. Performance. Communications of the ACM, 37:66–76, 1994.
[53] J. Nielsen and R. L. Mack. Usability inspection methods. Wiley, New York, NY,
USA, 1994.
[54] J. Nielsen and R. Molich. Heuristic evaluation of user interfaces. In CHI ’90
Proceedings of the SIGCHI conference on Human factors in computing systems:
Empowering people, volume 17, pages 249–256, New York, NY, USA, 1990.
[55] J. Nielsen and K. Pernice. Eyetracking Web Usability. Voices That Matter. New
Riders, Berkeley, CA, USA, 2009.
[56] D. A. Norman. The Design of Everyday Things. Basic Books, 1988.
[57] D. A. Norman. Natural user interfaces are not natural. interactions, 17(3):6, May
2010.
[58] D. A. Norman and J. Nielsen. Gestural Interfaces: A Step Backward In Usability.
interactions, 17(5):46, Sept. 2010.
[59] S. Outing and L. Ruel. The Best of Eyetrack III: What We Saw When We Looked
Through Their Eyes, 2006.
[60] B. Pan, G. K. Gay, H. A. Hembrooke, L. A. Granka, M. K. Feusner, and J. K.
Newman. The Determinants of Web Page Viewing Behavior: An Eye-Tracking
Study. In ETRA ’04 Proceedings of the 2004 symposium on Eye tracking research
& applications, volume 1, pages 147–154, New York, NY, USA, 2004.
[61] K. Pernice and J. Nielsen. Eyetracking methodology: How to conduct and evaluate
usability studies using eyetracking. Technical Report August, 2009.
[62] M. Pomplun and S. Sunkara. Pupil Dilation as an Indicator of Cognitive Workload
in Human-Computer Interaction. In Proceedings of the 10th International Conference on Human-Computer Interaction, page 5, 2003.
[63] S. Rosenbaum, J. A. Rohn, and J. Humburg. A Toolkit for Strategic Usability: Results from Workshops, Panels, and Surveys. In CHI ’00 Proceedings of the SIGCHI
conference on Human factors in computing systems, volume 2, pages 337–344, 2000.
[64] J. Rubin and D. Chisnell. Handbook of usability testing: how to plan, design, and
conduct effective tests. Wiley, Indianapolis, IN, 2nd edition, 2008.
[65] D. Saffer. Designing Gestural Interfaces. O’Reilly Media, Sebastopol, CA, 2009.
[66] J. Sauro. Using a Single Usability Metric (SUM) to Compare the Usability of
Competing Products. In HCII 2005 Proceeding of the Human Computer Interaction
International Conference, page 9, 2005.
[67] J. Sauro and E. Kindlund. A method to standardize usability metrics into a single
score. In CHI ’05 Proceedings of the SIGCHI conference on Human factors in
computing systems, page 9, New York, New York, USA, 2005.
[68] S. C. Seow, D. Wixon, A. Morrison, and G. Jacucci. Natural user interfaces. In CHI
EA’10 Proceedings of the 28th of the international conference extended abstracts on
Human factors in computing systems, page 4453, New York, New York, USA, Apr.
2010.
[69] B. Shneiderman. Direct Manipulation: A Step Beyond Programming Languages.
IEEE Computer, 16(8):57–69, 1983.
[70] E. Siegenthaler, P. Wurtz, and R. Groner. Improving the Usability of E-Book
Readers. Journal of Usability Studies, 6(1):25–38, 2010.
[71] S. L. Smith and J. N. Mosier. Design Guidelines for User-System Interface Software.
Technical report, 1984.
[72] C. Stevens. Designing for the iPad: building applications that sell. Wiley, Hoboken,
N.J., 2011.
[73] Tobii. Using Eye Tracking to Test Mobile Devices. Technical report, Tobii Technology AB, 2010.
[74] M. Töyry, P. Räty, and K. Kuisma. Editointi aikakauslehdessä. Taideteollinen
korkeakoulu, Helsinki, Suomi, 2008.
[75] Työterveyslaitos. Käytettävyydellä potkua tuotekehitykseen. Technical report,
Työterveyslaitos, Oulu, 2009.
[76] M. Väisänen. E-lukulaitteen ensikäytön käytettävyysongelmat ja käyttäjäkokemuksen ajallinen kehittyminen. Diplomityö, Aalto-yliopisto, 2011.
[77] C. Ware. Information Visualization: Perception for design. Morgan Kaufmann,
San Francisco, CA, 2000.
[78] S. Wearden. Landscape vs. Portrait Formats: Assessing Consumer Preferences.
Technical report, 1998.
[79] S. T. Wearden, R. Fidler, A. B. Schierhorn, and C. Schierhorn. Portrait vs. landscape: Potential users’ preferences for screen orientation. Newspaper Research Journal, 20(4):50–61, 1999.
[80] R. Wilson and M. Landoni. EBONI Electronic Textbook Design Guidelines. Technical Report March, Joint Information Systems Committee (JISC), 2002.
Appendices
Below are the pre-test user background questions and the assignment paper as they were
presented to the users (tasks 2–11 in random order).
Notes from a preliminary heuristic expert evaluation (ww and ps) are attached at the
end.
The original appendices are in Finnish.
29.4.2012
Background questionnaire
The purpose of this questionnaire is to chart how the participants' background affects the study results. All information
is handled confidentially.
* Required
Basic information
Your name: *
E-mail address:
Give your e-mail address if you would like to keep receiving invitations to take part as a test subject at the Department of
Media Technology.
Your age: *
Your gender: *
Male
Female
Test time:
Date and time of the agreed test session
Handedness: *
Which hand do you write with?
Left
Right
Do your hobbies, studies or profession involve graphic design? *
For example, designing the look of a web page, laying out a magazine, etc.
Yes
No
Your profession or degree programme *
If you are a student, enter your degree programme here (for example, computer science)
Previous experience with tablet computers
Do you own a tablet computer? *
For example, Apple iPad or Samsung Galaxy Tab
Yes
No
Have you ever used a tablet computer? *
For example, Apple iPad or Samsung Galaxy Tab
Yes
No
If you have used a tablet computer, have you used it for reading newspapers or
magazines? *
Yes
No
If you answered yes to the previous question, which newspapers or magazines have you read on a tablet computer?
Do you own any Apple products? *
iPhone, iPod, Mac, iPad
Yes
No
Could you imagine buying an Apple product? *
Yes
No
Maybe
Tietokone magazine
Have you read Tietokone magazine before? *
Yes
No
Are you interested in magazines about electronics and information technology? *
Yes
No
Do you regularly read any magazine or newspaper? *
Yes
No
If you answered yes to the previous question, which ones do you read?
Test arrangements and thinking aloud
k0
Welcome to the test.
You will soon be given 11 tasks, which you should carry out using an iPad while thinking aloud.
By thinking aloud you can explain the methods you use to complete a task and comment on
any problems you run into. Thinking aloud should be as continuous as possible, and the test moderator
will remind you if you stay silent for too long.
Tell the moderator after each task when you think you are done, and fill in the three-item questionnaire. After the
tasks you may browse the magazine freely. Finally, you will be given a form to fill in. The whole test takes
45-60 min.
This test does not evaluate your performance but the usability of the magazine. Do not be afraid to criticise
while thinking aloud; the test moderator has not been involved in developing the magazine. All of your feedback is
valuable and will be handled anonymously. The test session is recorded on video so that the comments and the iPad control gestures
can be stored for later review.
Eye movement measurement
While you do the tasks, your gaze on the iPad is recorded. The eye-tracking camera is located below the screen and uses
infrared technology that is completely harmless to the eyes. If needed, the moderator will ask you to adjust your posture
during the tasks so that the eye movements can be recorded.
Tasks
Imagine that you have downloaded the June issue of Tietokone magazine to your iPad and are now opening it for the first
time. Read the task titles and instructions aloud and make sure you have understood each instruction
before you start. Try to complete all tasks as quickly as possible.
1. ”Nanokoossa kaikki on toisin” ("At the nanoscale everything is different")
Browse through the article ”Nanokoossa kaikki on toisin” with vertical swipes and find all of its
functions/interactions by tapping the elements of the article. Finally, move to the previous or next
view by swiping the screen sideways. Return to the article ”Nanokoossa kaikki on toisin”.
You can check the task instructions on this paper and the control gestures on the wall during all tasks.
Remember to say aloud why you do what you do, what you think and what you feel. Before the next task begins,
the moderator returns the application to the "front page" of the magazine: the view that opens when the application
is started for the first time.
Tell the moderator when you think you are done. Fill in the questionnaire below by circling the number on the scale that
best matches your experience. Every item must be answered. If for some reason you cannot answer an
item, circle "3".
In your opinion, was doing the task with this application
difficult or easy?      difficult 1 2 3 4 5 easy
annoying or pleasant?   annoying 1 2 3 4 5 pleasant
slow or fast?           slow 1 2 3 4 5 fast
2. ”Järkkäristä tuli videokamera” ("The SLR became a video camera")
Find the camera that received the best grade in the camera test ”Järkkäristä tuli videokamera”. Open the picture of the camera
enlarged on the screen, if possible.
Tell the moderator when you think you are done. Fill in the questionnaire below.
In my opinion, doing the task with this application was:
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
3. ”Kriisi 2.0” ("Crisis 2.0")
According to the article ”Kriisi 2.0”, what is the name of the project that is called the Wikipedia of maps?
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
4. ”Tietoturvaa iPadiin ja iPhoneen” ("Security for the iPad and iPhone")
What advice does the article ”Tietoturvaa iPadiin ja iPhoneen” offer in case an iPhone is stolen?
Find the name of the service/function and open the related picture enlarged on the screen, if possible.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
5. ”10 vekkulia USB-lelua” ("10 fun USB toys")
Choose the most interesting device in the article ”10 vekkulia USB-lelua”. If there is a picture of the device, open it
enlarged on the screen, if possible.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
6. Humorous column
Find the humorous column on the use of Facebook in workplaces, written under the pen name ”Kiukkuinen ICT-johtaja” ("Angry ICT manager"), at the very end of
the magazine.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
7. ”Tietokoneen tulevaisuus on täällä” ("The future of the computer is here")
In the tablet computer test ”Tietokoneen tulevaisuus on täällä”, find the comparison table of 10-inch or 7-inch devices
(not the Akkukesto [battery life] table) and choose one tablet computer by any criterion
you like. Then find the review of the device and open the picture of the device enlarged on the screen, if
possible.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
8. ”Sähköinen lukeminen maistuu jo” ("Electronic reading is already catching on")
Find the statistical graphics in the article ”Sähköinen lukeminen maistuu jo” and choose what you find the most surprising thing
the statistics show.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
9. ”Suljettujen ovien takana” ("Behind closed doors")
Browse through all the pictures of the reportage ”Suljettujen ovien takana” so that you open every picture
enlarged on the screen, if possible. Choose the most interesting picture.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
10. ”Suunnistuksen uudet tuulet” ("New winds in navigation")
Compare the pictures of the guidance views of the mobile phone navigation applications in the article ”Suunnistuksen uudet
tuulet”. Choose the navigation application you would most like to use.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
11. Column
Find the column written by Jyrki Kasvi.
difficult 1 2 3 4 5 easy
annoying 1 2 3 4 5 pleasant
slow 1 2 3 4 5 fast
Browsing
Browse the magazine freely for 5-10 minutes while commenting. After that, fill in the usability questionnaire.
Usability questionnaire
In the following statements, "the product" refers to the iPad application of Tietokone magazine that you used, not the iPad device itself. Try to answer all items quickly without thinking for long. Every item must be answered.
You may answer "3" if you feel you cannot answer a question.
1: I think that I could use this product regularly.
Strongly disagree 1 2 3 4 5 Strongly agree
2: I find the product too complicated.
Strongly disagree 1 2 3 4 5 Strongly agree
3: I find the product easy to use.
Strongly disagree 1 2 3 4 5 Strongly agree
4: I think learning to use the product requires guidance from an experienced user.
Strongly disagree 1 2 3 4 5 Strongly agree
5: I think the various functions of the product are well integrated with each other.
Strongly disagree 1 2 3 4 5 Strongly agree
6: I think there are too many inconsistencies in the product.
Strongly disagree 1 2 3 4 5 Strongly agree
7: I believe most people would learn to use the product very quickly.
Strongly disagree 1 2 3 4 5 Strongly agree
8: I find the product very clumsy to use.
Strongly disagree 1 2 3 4 5 Strongly agree
9: I felt very confident using the product.
Strongly disagree 1 2 3 4 5 Strongly agree
10: I think a lot of new things have to be learned before using the product.
Strongly disagree 1 2 3 4 5 Strongly agree
11. I think browsing pictures worked well in the product.
Strongly disagree 1 2 3 4 5 Strongly agree
12. I think the appearance of the product supported its use.
Strongly disagree 1 2 3 4 5 Strongly agree
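Statements 1-10 of the questionnaire above alternate between positive and negative wording in the same way as the System Usability Scale (SUS). As a minimal sketch, and assuming the standard SUS scoring rule is applied to them (this appendix does not itself state how the answers were scored), the ten ratings could be combined into a single 0-100 score as follows. The function name `sus_score` is hypothetical, and the additional statements 11 and 12 are left out of the score.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for ten ratings given on a 1-5 scale.

    Assumes the standard SUS rule: odd-numbered (positively worded)
    statements contribute rating - 1, even-numbered (negatively worded)
    statements contribute 5 - rating, and the sum is multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("ratings must be integers from 1 to 5")
    total = 0
    for index, rating in enumerate(responses, start=1):
        if index % 2 == 1:
            total += rating - 1   # statements 1, 3, 5, 7, 9
        else:
            total += 5 - rating   # statements 2, 4, 6, 8, 10
    return total * 2.5
```

Under this rule a participant who circles "3" for every statement, as the instructions allow when unable to answer, ends up with a mid-scale score of 50.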
● On the cover, the headlines lead to the articles
● Where is the table of contents/page map? It was found by tapping the bottom edge;
nothing indicates this.
● In the bottom bar:
○ “Kansi” (Cover) (unambiguous)
○ “Sisältö” (Contents) (unambiguous)
○ “Sivukartta” (Page map) (unambiguous)
■ otherwise good, but the current page is not highlighted
■ and it jumps oddly a page or two forward when
returning to the page map, so that the current page is left hidden
on the left
○ “Kirjasto” (Library) (fairly unambiguous; “Arkisto” (Archive) would be better; opens
the other purchased issues of the same magazine)
○ “Uutisvirta” (News feed) (opens the Tietokone magazine website, so “Kotisivu” (Home page)
would be better. It opens the website in a new window, so the
bottom bar cannot be brought back until you realise you have to close
the window from the X in the top corner.)
○ “Kauppa” (Store) (unambiguous; you can buy issues of Tietokone
magazine)
General:
● Does not support landscape rotation
● No pinch zoom
● Back button: none; after a mistaken tap in the table of contents you have to
go back in two steps via the bottom bar
● Links are not distinguished from ordinary text or pictures in any way.
Why can't they be underlined, as on the web?
● No page numbers -> makes navigation harder
● Text cannot be copied; the pages are image files
● When switching between applications, the app does not remember where reading stopped
but jumps to the first article.
WoodWing
Heuristic test “Tietokone 06/2011”, 11.10.2011
● No cover
● The page map is found quickly behind a double tap, even if you do not know about it
in advance
○ no highlighting of the current page
○ two-digit page numbers are hidden behind the thumbnails
○ in portrait orientation, 3 pages should be visible at a time as in
landscape: previous, current and next page -> easier
to navigate
○ when the page map is kept open and the page is changed, the bottom
edge of the page stays covered by the page map and does not appear even when
the page map is dismissed
General:
● Supports landscape rotation
○ sometimes, when turning from portrait to landscape, the page does rotate
but does not “spread out”, and empty space is left on the right
● No pinch zoom
● No cover, no “bottom bar”
● Back/Undo button: none
● Links are recognisable as links, at least in the table of contents
● Loading pages takes 0.5-1 s, which hampers fast browsing
● Page numbers are present, which helps navigation
● HTML5 -> text and pictures can be copied
Photoswipe
Table of contents
● No page numbers even though they are shown on the pages
● Nothing indicates that it continues downwards; in the articles a small arrow
indicates this
● Too many different fonts (5) in the section leads
○ 2-3 would be enough: quotes, leads and tested products
● You have to scroll quite a long way to see the whole table of contents
○ space would be saved by making the pictures smaller and grouping articles of the same type
together (columns one below/beside the other, tests one below/
beside the other)
● Articles under “Testit” (Tests) additionally carry the label “Testissä” (In test), which is unnecessary
● The image-cropping algorithm seems to work well
Editorial
● Sufficiently large pictures
Käynnistys
● The last word “jo” of the first headline wraps to a second line in portrait orientation
● In the map graphic it is hard to connect the country names to the map
● An Itella quote that is unrelated to the article appears with the OpenOffice article;
in WoodWing the quote is separated more clearly
● Captions are shown directly below the pictures, so the information in the
captions cannot be missed even by accident. Enlarging pictures
on tap is work in progress.
10 USB toys
● Rather loose layout and large pictures. Would the short items work in two or
more columns?
Table of contents
● Nothing indicates that it continues downwards; in the articles a small arrow
indicates additional pages
● “Joka numerossa” (In every issue) does not list the articles Amazonin pilvi repesi or
Tietoyhteiskunta 2.0; instead Kolumnit (Columns) appears as a generic entry, like Kytkentöjä
(this was in order in the February issue)
● The table-of-contents links have a page-turn animation, so you can see that you are jumping
forward in the magazine
● There should be a shortcut to the table of contents
Editorial
● Not found in the page map
● Small pictures
● Odd variation in point size in the first paragraph
Käynnistys
● The large graphic of the first article does not fit on the page, and scrolling loses the other
edge of the graph
● The map works well once you realise you should tap the countries
● The “Timo Valli” picture grows rather little when tapped; nothing happens when tapping the
3 adjacent pictures even though they have
the same kind of frame. With this solution the user has to try
every picture to see whether more information can be found. If users were asked “When does Timo Valli
start his job?”, how many would find the caption?
● In the HP article the picture has a “+” symbol indicating that a caption appears
on tap. Why is this not done for all pictures that have
a hidden caption?
10 USB toys
● Good, compact layout; almost all 10 fit in the same view
Social media
● Thanks to stepless scrolling, the body text can be brought fully into view on one
screen.
● The pull quote does not fit fully on the screen in portrait orientation
New products
● All information is on the same page, a scroll away
Future technology
●
Tietoyhteiskunta 2.0
In test: Tietokoneen tulevaisuus...
● The arrow indicating that the article continues looks like a button but is not one
● The device tests are visible at once in the same view, easy to compare
with each other.
Social media
● A page that scrolls one screen at a time works better for short articles;
here the change of screen downwards cuts off the body text.
New products
● Only the pictures of the products are shown; the information is a tap away, which
opens a pop-up window that can be closed only from the
cross in the top corner. Scrolling is a much more effortless way of moving between
product reviews.
Future technology
● The caption of the first picture cannot be found
Tietoyhteiskunta 2.0
● Here the body-text scrolling inside the page works well, because
the article headline thus stays visible the whole time.
In test: Tietokoneen tulevaisuus...
● Nothing indicates that the article continues downwards
● The pictures in this article have stylish functionality (enlargement of Angry Birds and
of the connectors)
● The picture gallery at the end works better this way, at least in this case,
with all parts visible at once
○ but these cannot be enlarged if desired
● The device tests are hard to compare because they open only in their own windows
one at a time
Kriisi 2.0
Järkkäristä tuli videokamera
● The cameras are easy to compare with each other; all are visible
a scroll away
Suljettujen ovien takana
● A more suitable place for the cover text box (in portrait orientation) would be
on the floor in the bottom right corner
● The picture carousel is otherwise good, but
○ the body text at the top needs side margins.
○ Some of the pictures have the same body text even though it appears
to change when the picture is changed
○ in landscape orientation the body text at the top does not fit in view
Automatisoi Windows 7 asennus
● At the end of the article there are links that should be reachable directly by tapping. Here
they can at least be copied to the clipboard.
Vältä sokki puhelinlaskussa
● Rather loose layout; at least in landscape orientation it could be in two
columns
Tietoturvaa iPadiin ja iPhoneen
● Clearly better layout than with WoodWing
Kriisi 2.0
Järkkäristä tuli videokamera
● The cameras are hard to compare because only the pictures are shown together; the review
is behind a tap
Suljettujen ovien takana
● The arrows for the captions (>>) look the same as those indicating that the text bar
can be scrolled (>>>)
● Here some pictures enlarge when tapped, and some of them have
a caption behind a tap. Some pictures do not enlarge
when tapped; these are not distinguished in any way
Automatisoi Windows 7 asennus
● At the end of the article there are links that should be reachable directly by tapping. Here
they cannot even be copied
Vältä sokki puhelinlaskussa
● It is hard to change pages quickly because the sideways-scrolling body-text area
takes up a large part of the screen
Tietoturvaa iPadiin ja iPhoneen
● Confusing layout. Why is the “Etsi iPhone” (Find iPhone) picture right at the beginning? Rather loose
cropping in the “Alarmomatic” picture
Väripoikkeama kuriin
● Working two-column layout
● Picture carousel
○ there is no time to read the caption before it is hidden automatically
○ when the pictures do not fill the whole screen in portrait orientation,
hiding the caption and the controls is unnecessary
○ caption permanently visible by pulling down from the top edge?
○ leaving the picture carousel throws you out of the article
○ pinch zoom works, but afterwards the controls/
caption cannot be brought up with a tap
Verkot ohjelmien ohjaukseen
● The first word of the pull quote does not fit on the screen in portrait orientation
Suunnistuksen uudet tuulet
● Stylish cover!
● The tests laid out in two columns run at different heights because of the differing number of lines
needed by the headings
○ use a baseline grid (and align heights, e.g. after a heading)
● Picture margins partly too large (e.g. Verbatim Mediashare)
Vikatila
● Unnecessarily loose margins (purchase order, pinball machine, book),
emphasised in landscape orientation
Väripoikkeama kuriin
● Captions not directly visible
● A black bar is left at the bottom edge of the last page
Verkot ohjelmien ohjaukseen
●
Suunnistuksen uudet tuulet
●
Vikatila
