Information Leakage through Mobile Analytics Services

Terence Chen, Imdad Ullah, Mohamed Ali Kaafar, and Roksana Boreli
National ICT Australia
University of New South Wales, Australia
INRIA, France
[email protected]
ABSTRACT
In this paper we investigate the risk of privacy leakage through mobile analytics services and demonstrate the ease with which an external adversary can extract an individual's profile and mobile application usage information through two major mobile analytics services, i.e. Google Mobile App Analytics and Flurry. We also demonstrate that it is possible to exploit the vulnerability of analytics services to influence the ads served to users' devices, by manipulating the profiles constructed by these services. Both attacks can be performed without the need for an attacker-controlled app on the user's mobile device. Finally, we discuss potential countermeasures (from the perspectives of different parties) that may be utilized to mitigate the risk of leakage of individuals' personal information.
1. INTRODUCTION
The mobile advertising ecosystem, comprising ad companies (with associated analytics services), app developers, the companies running ad campaigns and mobile users, has become a powerful economic force. Increasingly, targeted ads are being served based on user information collected by the apps.
Potential privacy threats resulting from data collection by third parties have been extensively studied in the research literature [7, 10, 9], and a number of countermeasures have been proposed [6, 11, 8]. However, the immediate impact of such data collection has so far been overlooked.
In this paper, we argue and show that, even if the genuine purposes of analytics services are legitimate (i.e. to provide analytics to developers and/or to serve targeted ads), users' privacy can be leaked by third-party tracking companies due to inadequate protection of the data collection phase and the information aggregation process. We further argue that inappropriate security measures in the mobile analytics services may threaten the ads eco-system.
We consider mobile analytics services as entities that can leak private data to external adversaries, and indeed show how user profiles and app usage statistics can be extracted from two major mobile analytics and ads delivery networks (Google Mobile App Analytics and Flurry Analytics1). We exploit, first, the use of globally unique identifiers (IDs) and the lack of mobile device authentication in the mobile analytics data collection, and second, the detailed (per-user) reporting available from the mobile analytics companies.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ACM HotMobile'14, February 26–27, 2014, Santa Barbara, CA, USA.
Copyright 2014 ACM 978-1-4503-2742-8 ...$15.00.
Considering the minimal effort required to obtain the device ID and other identifying information used by mobile analytics companies, which can be collected either by monitoring any accessible network (e.g. open wireless access points (APs) or a local area network (LAN)) or by any app (we note that no permission is required for an app to collect device IDs and related information, e.g. Android ID, device model, OS version, etc.), we show how this type of attack presents a serious threat to users' privacy. We validate the information leakage through experiments using a set of 44 volunteers' Android mobile devices. We successfully extract the volunteers' profiles from Flurry and Google Analytics and show that it is also possible to spoof a device by setting the identifying parameters in either a (different) mobile device or an emulator.
We further show how a malicious adversary can exploit
the vulnerability of analytics services to launch an attack
on the advertising ecosystem, by perturbing user’s profiles
and consequently disrupting the accuracy of targeted ads
placement, with potentially serious financial consequences
for the advertising ecosystem. Specifically, we demonstrate
how user profiles can be distorted by injecting artificial app
usage data (from spoofed devices), which will influence the
ads served to the original devices. We validate the success of
our technique by capturing and analyzing the ads received
on the targeted devices.
We note that both attacks can be performed without the need for the user to have an attacker-controlled app on their device, while exploiting the easy access to the unique identifiers of user profiles in the analytics services and the lack of authentication during the data collection and access processes.
With this work, we aim to highlight the vulnerabilities of current tracking mechanisms by demonstrating the potential privacy threats, rather than to design a large-scale attack on the privacy of mobile users. We hope that our work will inspire the analytics companies to implement more secure and privacy-friendly tracking mechanisms.
1 For convenience, we use Google and Flurry to refer to these two mobile analytics services in this paper.
Figure 1: Free app eco-system and information flow between different parties.

Figure 2: Snapshot of developer portal in Flurry Analytics
The paper is organized as follows. Section 2 describes the mobile applications ecosystem, including the analytics services. Section 3 presents the methodology used for the first attack, extracting user profile information, while Section 4 presents the methodology for influencing served ads by distorting user profiles. Section 5 discusses the potential countermeasures that may be used to mitigate the attacks. We conclude in Section 6.
2. BACKGROUND
In the following, we provide a brief overview of the mobile applications ecosystem, aiming to explain the relationships between users, apps, app developers, advertising networks and the corresponding analytics services.
2.1 The App Ecosystem and Mobile Tracking
In order to maximise the revenue from in-app ads, targeted ad networks are widely adopted by app developers. According to the study in [9], more than half (52.1%) of the top 100,000 apps downloaded from the Android Market contain at least one ad library (including analytics libraries). Figure 1 shows a typical free app eco-system and the information flow between different parties. The analytics library (in some cases the same as the ad library, e.g. Flurry), which is typically embedded into the mobile app, collects usage information and possibly user-related attributes, which are sent to an aggregation server. Globally unique identifiers (e.g. Android ID, iOS UUID, IMEI, MAC, etc.), available on mobile devices, allow analytics companies to link data collected from different apps (identified by an appID) to the same individuals. The analytics service derives user profiles from the aggregated data and provides the information to the ad network (e.g. Google AdMob and Flurry) for the purpose of serving targeted ads to mobile users.
2.2 Mobile Analytics and Tracking API
Another incentive for developers to use analytics services is the comprehensive measurement tools that help them to evaluate the performance of their apps. With the knowledge learned from a large number of other apps and users, analytics services can also provide audience information, if available, for example gender, age, location, language, interests, etc. Figure 2 illustrates the Flurry app performance dashboard, which shows the user interests of an app compared to a global benchmark. Developers are also able to view the demographic information of the app users via the tags on the left.

Figure 3: Snapshot of an app usage message (onStartSession message) sent by Flurry API
In order to use these services, developers are required to embed the tracking APIs into their apps during development. Once a user launches the app and agrees to the permissions for accessing resources on the device, the tracking APIs send user, device and app usage information to the aggregation server over the Internet. Typically the message is sent using either an HTTP GET or POST method; e.g. Figure 3 shows the content of an onStartSession message, reporting the usage of an app (identified by APPID) on a device (identified by AndroidID).
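As a rough illustration, an eavesdropper on the same network only needs to parse the query string of such an unencrypted report to recover the identifiers. The endpoint and field names below are illustrative stand-ins for the kind of message shown in Figure 3, not Flurry's actual API:

```python
from urllib.parse import urlparse, parse_qs

def extract_ids(request_line: str) -> dict:
    """Pull the app and device identifiers out of a captured
    analytics request line (field names are illustrative)."""
    path = request_line.split()[1]           # "GET /aap.do?... HTTP/1.1"
    params = parse_qs(urlparse(path).query)
    return {k: v[0] for k, v in params.items() if k in ("APPID", "AndroidID")}

# A captured onStartSession-style request (hypothetical format):
captured = "GET /aap.do?APPID=ABC123&AndroidID=9774d56d682e549c HTTP/1.1"
print(extract_ids(captured))  # {'APPID': 'ABC123', 'AndroidID': '9774d56d682e549c'}
```

In a public hotspot scenario, running this over every observed HTTP request is enough to harvest identifiers in bulk.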
3. USER PROFILE EXTRACTION
In this section we present the methodology to extract user profiles from mobile analytics services, relying solely on the device identifier of the target. In our study, we demonstrate the methods using both Google Analytics and Flurry in the Android environment. We refer to a user profile as the set of information collected or inferred by the analytics services. Different types of information may be available in different services; most include basic demographic information such as age, gender, geography and language. Analytics services also provide audience interest characteristics that are inferred from the usage of other apps. For instance, Flurry provides an attribute called "persona"2 in the user profile, which indicates the interests and behavior of the audience group. While some of these persona tags are quite general, others can be considered sensitive personal information, e.g. "Singles", "New Moms", "High Net-Worth Individuals", etc. We note that Flurry also allows developers to access detailed usage sessions on different app categories of the app's audience.

The key technique to extract user profiles from the analytics service is to first impersonate the victim's identity and perform actions on behalf of the victim, then (1) in the Google case, to fetch the user profile from a spoofed device, where the profile is simply shown by the Google service as an ads preference setting, or (2) in the Flurry case, to inject the target's identity into a controlled analytics app, which triggers changes in the Flurry audience analysis report, from which the adversary is able to extract the user profile. In the following, we first describe how to obtain and spoof a device's identity. Then, we detail the user profile extraction for both cases of Google and Flurry.

3.1 Availability and Spoofing of Device ID

Easy access to Device ID. An adversary can capture victims' Android IDs in at least two possible ways. First, the adversary can simply monitor the network, capture the usage reporting message sent by the third-party tracking APIs and extract the device ID to be utilised for further communication with the analytics service. An example of such a message is shown for the case of Flurry in Figure 3. In a public hotspot scenario, it is then very easy to monitor hundreds if not thousands of IDs. In a confined area, an adversary (e.g. an employer or a colleague) targeting a particular individual can even associate the collected device ID to his target (e.g. an employee). It is interesting to note that the Google analytics library prevents leakage of the device identity by hashing the Android ID; however, it cannot stop other ad libraries from transmitting such information in plain text (which can be easily mapped to Google's hashed device ID).

An alternative way, although it may be more challenging in practice, is to obtain the target's device identifier from any application (controlled by the adversary) that logs and exports the device's identity information.

Spoofing of Device ID. Flurry uniquely identifies Android users by a combination of device ID (Android ID in our case) and device information comprising the device name, model, brand, version and system build. Google, however, simply relies on the Android ID. These parameters, as described above, can be easily identified by observing the unprotected reporting messages. We note that they are also accessible to any Android app, without the user agreement given through the Android permissions. By modifying the values of the identifying parameters in a rooted Android device, we are able to spoof the identity of another Android device. The corresponding system properties files that contain the parameters in the Android file system are shown in Table 1.

Parameter                   File path in Android file system
Android ID                  /data/data/com.android.providers.settings/databases/settings.db
ro.build.id                 /system/build.prop
ro.build.version.release    /system/build.prop
ro.product.brand            /system/build.prop
ro.product.name             /system/build.prop
ro.product.device           /system/build.prop
ro.product.model            /system/build.prop

Table 1: The file path of identifying parameters in the Android device file system

3.2 Extracting User Profiles from Google

The Android system allows users to view and manage their in-app ads preferences3, e.g. to opt out or to delete interest categories. This feature retrieves the user profile from the Google server, where it is identified by the Android ID. As a consequence of the device identity spoofing described in Section 3.1, an adversary is able to access the victim's profile on a spoofed device.

Figure 4: Privacy leakage attack scenario

2 http://support.flurry.com/index.php?title=Analytics/Overview/Lexicon/Personas
3 Accessed from Google Settings -> Ads -> Learn more -> Adjust your Ads Settings
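To illustrate the spoofing step, the identifying parameters of Table 1 can be gathered into the property overrides a rooted device or emulator would present; all values below are made-up examples, not taken from any real device:

```python
# Identity tuple the Flurry service is described as using: Android ID
# plus build/product properties (Table 1). All values are illustrative.
TARGET_IDENTITY = {
    "android_id": "9774d56d682e549c",      # stored in settings.db on the victim
    "ro.build.id": "JZO54K",               # the rest live in /system/build.prop
    "ro.build.version.release": "4.1.2",
    "ro.product.brand": "samsung",
    "ro.product.name": "GT-I9300",
    "ro.product.device": "m0",
    "ro.product.model": "GT-I9300",
}

def build_prop_overrides(identity: dict) -> str:
    """Render the build.prop lines a rooted device or emulator would
    need so that its reported identity matches the target's."""
    return "\n".join(f"{k}={v}" for k, v in identity.items()
                     if k.startswith("ro."))

print(build_prop_overrides(TARGET_IDENTITY))
```

The Android ID itself is set separately (in the settings database), which is why it is excluded from the build.prop rendering above.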
3.3 Extracting User Information from Flurry
Extracting a user profile from Flurry is more challenging, as Flurry does not allow users to view or edit their Interests profiles. In fact, apart from the initial consent to the access of resources, many smartphone users may not even be aware of Flurry's tracking activity.
In Figure 4 we show the basic operations of our profile
extraction technique. An adversary spoofs the target device, identified by deviceIDa , using another Android device
or an emulator. He then uses a bespoke app with a (legitimate) appIDx , installed on the spoofed device, to trigger
a usage report message to Flurry. The analytics service is
thus manipulated into believing that deviceIDa is using a
new application tracked by the system. Consequently, all
user related information is made accessible to the adversary
through audience analysis of application appIDx .
When the audience report from Flurry targets a unique
user, an adversary can easily extract the corresponding
statistics and link them to that single user. Similarly, the adversary will be able to access all subsequent changes to this
user profile, reported at a later time. In our presented technique, since we impersonate a particular target's device ID, we can easily associate the target to a "blank" Flurry-monitored application.
Alternatively, an adversary can derive an individual profile from an aggregated audience analysis report by monitoring the difference in the report before and after a target ID has been spoofed (and as such has been added to the audience pool). Specifically, the adversary has to take a snapshot of the audience analysis report Pt at time t, impersonate the target's identity within his controlled Flurry-tracked application, and then take another snapshot of the audience analysis report Pt+1. The target's profile is obtained by extracting the difference between Pt and Pt+1, i.e. ∆(Pt, Pt+1). In practice, however, Flurry updates profile attributes on a weekly basis, which means it can take up to a week to extract a full profile per user.
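The snapshot-differencing step ∆(Pt, Pt+1) can be sketched as follows, treating each audience report as a mapping from attribute to user count (a simplified stand-in for Flurry's actual report format):

```python
# Snapshot P_t of an audience report as {attribute: count}; the target's
# profile is whatever grew between the two snapshots after the spoofed
# ID joined the audience pool.
def profile_delta(before: dict, after: dict) -> dict:
    """Attributes whose counts increased between the two snapshots."""
    return {attr: after[attr] - before.get(attr, 0)
            for attr in after if after[attr] > before.get(attr, 0)}

p_t  = {"Singles": 10, "Business Travelers": 4}
p_t1 = {"Singles": 11, "Business Travelers": 4, "New Moms": 1}
print(profile_delta(p_t, p_t1))  # {'Singles': 1, 'New Moms': 1}
```

With a single injected identity, every attribute that increments by one between the snapshots can be attributed to the target.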
Finally, it is important to note that Flurry provides a
feature called segment to further split the app audience by
applying filters according to e.g. gender, age group and/or
developer defined parameter values. This feature allows an
adversary to isolate and extract user profiles in a more efficient way. For instance, a possible segment filter can be
“only show users who have Android ID value of x” which
results in the audience profile containing only one user.
3.4 Validation Experiments
To demonstrate the private information leakage through
both Google and Flurry analytics services, we design and
conduct controlled experiments on a set of 44 volunteers
(Android users, mostly researchers and university students
from Australia, China, France and US). More specifically, we
aim to show that it is possible to not only spoof the identity
of an Android device but also to extract each user’s profile.
To aid our experiments, we have developed an Android
app, which has been installed by the volunteers. This app
collects the devices' identifiers and sends them back to our
server. These identifiers are later used for device spoofing.
After spoofing the volunteers' device identities, we can successfully access their Google ad preference settings on our
experimental device. We found that the majority of users in
our volunteer sample are not (yet) profiled by Google (80%),
7 of them have at least 1 interest category and 2 have demographic information, i.e. gender and age group. Interestingly, we observed that 2 of our volunteers have opted out of targeted ads.
To validate the profile extraction on Flurry, each volunteer installed a customized app with a unique appIDx (so that profiles subsequently built by Flurry can be differentiated). This application triggers a usage report message (by calling the onStartSession function of the Flurry SDK, using its specific appIDx). We are thus able to extract the profile of each volunteer from the corresponding Flurry audience report, as described in Section 3.3.
To validate the effectiveness of the device spoofing, we set
up a second experiment using the same set of collected device IDs. For each device, we initiate a new usage report
message to Flurry, using a different appIDx . Our assumption is that Flurry will associate each device identifier to the
app corresponding to appIDx . To verify this, we extract the
two versions of profiles from Flurry for each of the collected
device IDs, i.e. the profile corresponding to the volunteers’
real devices and to our spoofed devices. We have observed
that the two sets of profiles are identical, confirming that
Flurry cannot distinguish between the spoofed and actual
report messages.
For the Flurry profiles, we found that 84.1% of our volunteers were already tracked by Flurry (before users installed our application). In addition, 56.8% of the profiles have been assigned at least one persona tag, and 11.4% of them have an estimated gender. By following the Flurry audience reports, we are able to observe how users' application usage evolves over time. Over a period of 8 weeks of monitoring, we found that only 25% of the profiles are static, and that 50% of the volunteers use at least one Flurry-tracked application once a week, which triggers changes in their profiles.

Profile Category  Code  avg # unique    avg # unique    Jaccard index     Jaccard index
                        ads (Google)    ads (Flurry)    vs. BL (Google)   vs. BL (Flurry)
Blank             BL    42.5            212             0.645             0.92
Books & Refs      BO    106             260.5           0.32              0.704
Business          BU    148             219.5           0.275             0.8752
Games             GA    148             219.5           0.1825            0.608
Media             ME    166.5           220.5           0.235             0.8705
Productivity      PR    110             215             0.3325            0.8435
Social            SO    176             181.5           0.235             0.793

Table 2: Measuring ads received by different profiles
4. DISTORTING USER PROFILES AND INFLUENCING SERVED ADS
In this section, we demonstrate the second type of attack, which again uses the analytics service vulnerabilities to reduce the accuracy of analytics results and produce irrelevant targeted ads. The main idea is to first impersonate the target device, then "pollute" the profile in the mobile analytics services by generating artificial app usage reports. This attack has the potential to seriously disrupt the advertising ecosystem and to reduce the profit for both advertisers and ad companies.
4.1 Methodology
The effectiveness of the attack is validated in two steps. We first validate the premise that the user's profile is the basis for targeting, by showing that specific profiles consistently receive highly similar ads and, conversely, that a difference in a user's profile will result in a mobile device receiving dissimilar ads. We then perform the ad influence attack, i.e. we perturb selected profiles and demonstrate that the modified profiles indeed receive in-app ads in accordance with the profile modifications. Both sets of experiments comprise the following steps.
Profile training and perturbing: We first train new user profiles on a set of randomly generated Android IDs (Google and Flurry see these as new users with blank profiles), and then run a set of apps from selected interest categories, so that Google and Flurry receive app usage reports from these categories and build user profiles accordingly. To train a specific profile, we run apps from the targeted category, e.g. Business, for a period of 24 hours. Google responds to profile changes within approximately 6 hours, while Flurry updates audience category usage for the developer once a week. To perturb a profile, we run an app from a different category and tailor the number of app sessions so that the new category becomes dominant.
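The session tailoring reduces to simple arithmetic: assuming the analytics service weights categories by session counts (an assumption on our part; the actual weighting is not documented), the number of artificial sessions needed for the new category to dominate is:

```python
import math

def sessions_needed(existing_sessions: int, dominance: float) -> int:
    """Smallest number n of injected sessions in the new category such
    that n / (n + existing_sessions) >= dominance."""
    return math.ceil(dominance * existing_sessions / (1 - dominance))

# e.g. a Games profile with 20 recorded sessions: to make Business
# account for 98% of the reported usage, inject:
print(sessions_needed(20, 0.98))  # 980
```

This is why high dominance levels (such as the 98% reached in Section 4.3) require injecting far more artificial sessions than the target's genuine usage.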
Ad collection: In-app ads for both Google and Flurry are delivered via the HTTP protocol. Before launching an app that receives ads, we run tcpdump for Android in the background to monitor the ad traffic. The captured traffic is pulled from the device and ads are extracted from the TCP flows every 10 minutes. Unlike Google's, the captured URLs are obfuscated by the Flurry ad network; to identify these ads, we follow the redirections until reaching the landing page of the ads. Note that we turn on the "test mode" of the Flurry ads API to avoid ad impression and click fraud.
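The extraction step can be sketched as a scan of the reassembled payloads for URLs; the host name below is illustrative, and in the Flurry case the obfuscated URLs would still need their redirect chains followed to reach the landing page:

```python
import re

# Crude sketch: scan reassembled HTTP payloads (e.g. exported from a
# tcpdump capture) for candidate ad URLs. The host is a placeholder.
AD_URL = re.compile(r"https?://[^\s\"'>]+")

def extract_ad_urls(payload: str) -> set:
    """Unique URLs seen in a captured flow."""
    return set(AD_URL.findall(payload))

payload = 'click="http://example-ads.com/landing?id=1" ... http://example-ads.com/landing?id=1'
print(extract_ad_urls(payload))  # {'http://example-ads.com/landing?id=1'}
```

Deduplicating into a set directly yields the "unique ads" counts used in Table 2.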
4.2 Validation of the Targeting
Mobile targeted advertising is still at an early stage compared to browser-based targeting. To validate the effect of the user profile on the in-app ads, we compare the similarity of ads received using different user profiles in a controlled environment. We first select six app categories; for each category we train two identical profiles by installing and running apps from the selected category on two devices, denoted as profiles a and b. The selected categories are: Games (GA), Business (BU), Books & References (BO), Media & Video (ME), Productivity (PR) and Social (SO). We then collect ads from all the devices by running the ad collection apps for a period of 24 hours. We also collect ads from two devices with no/blank profile.

To measure the similarity between the sets of received ads in various profiles, we compute the Jaccard index J(A, B) = |A ∩ B| / |A ∪ B|, where A and B are two sets of unique ads. Table 2 shows an overview of the collected ads across all devices. We observe that even if fewer ads are received from the Google ad network, the ads are more diverse, as the Jaccard index of the ads between different categories and blank profiles suggests they are less similar compared to Flurry. Furthermore, we compare the similarity of ads between categories; the results are shown in Figure 5. We can observe strong evidence of targeting, with 1) profiles a and b in each category having higher (lighter colour in Figure 5) Jaccard index values and, 2) profiles from different categories having lower (darker colour) values.

Figure 5: Unique ads similarity between profiles, sorted by Jaccard index vs. blank profile. (H - high, M - moderate and L - low)

4.3 Influencing Served Ads

To demonstrate the effectiveness of the ad influencing attack, we use an example of polluting the Games profiles with Business applications4. As a starting point, we use four devices with identical profiles from the Games category: GAa, GAb, GAc and GAd. Then, from a fifth device, we pollute GAa and GAb by injecting artificial usage reports of Business applications with the spoofed device IDs. To make Business a dominant factor in the modified profile, we "run" significantly more sessions and over a longer period with the artificial Business app usage. We denote the perturbed profiles as GAaBU and GAbBU. We verify the results of the perturbation by extracting the profiles from Flurry and Google using the techniques described in Section 3, finding that the perturbed profiles comprise more than 98% Business category usage and less than 2% Games category usage. For Google, we find categories related to both Business and Games appeared in the perturbed profiles; this suggests converting a profile in Google requires a longer period.

To evaluate the difference in served ads, we again compute the Jaccard index between the sets of unique ads received by all the profiles (note BUa, BUb, GAa and GAb were measured in the validation experiment). The levels of similarity are shown in Figure 6. The distortion to the Games profiles is more obvious in the Flurry ads network: we observe that the ads received by GAaBU and GAbBU have much higher similarity to Business profiles than to Games profiles, which suggests that the Flurry ad network has altered the targeting to present a significantly higher number of ads appropriate to the Business profile to the Games user. In the Google case, the ads received by the perturbed profiles and the actual Business and Games profiles exhibit moderate similarity values. However, the similarity between Business and Games ads is rather low. This suggests that the perturbed profiles, GAaBU and GAbBU, receive both Business and Games related ads. This result is consistent with the observation that both Business and Games components are found in the Google ads preferences.

Figure 6: Unique ads similarity before and after profile perturbation. (H - high, M - moderate and L - low)

4 We successfully tested with other categories. For simplicity, we only show results based on Games and Business profiles.
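The similarity measurements in Sections 4.2 and 4.3 reduce to the Jaccard index over sets of unique ads, e.g.:

```python
def jaccard(a: set, b: set) -> float:
    """J(A, B) = |A intersect B| / |A union B| over two ad sets."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

# Two profiles that share two of their collected ads:
ads_profile_a = {"ad1", "ad2", "ad3", "ad4"}
ads_profile_b = {"ad3", "ad4", "ad5"}
print(round(jaccard(ads_profile_a, ads_profile_b), 3))  # 0.4
```

A value near 1 indicates near-identical ad sets (same targeting), while a value near 0 indicates largely disjoint sets.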
5. POTENTIAL COUNTERMEASURES
There are a number of user-based solutions to avoid analytics tracking, e.g. MockDroid [4] and PDroid [1], that block information access by mobile applications (including third-party libraries embedded into apps) or alter the information reported to the third parties. However, as observed in [4], these solutions may prevent specific applications from functioning properly. In addition, users may actually want to receive useful and meaningful in-app ads. User-side protection therefore may not be the appropriate solution to the problem we have presented here.
To address the privacy concerns raised by third-party tracking, both Android [2] and iOS [3] deprecate permanent unique identifiers (the Android ID for Android and the UUID for iOS) in their new software releases; instead, a user-specific and unique "advertising identifier" is provided for tracking purposes. This change allows users to reset their advertising identifier, in a similar way as browser cookies may be deleted on desktop computers. However, to the analytics company, every "reset" of the advertising ID creates an additional identity in its database; the user's previous history becomes useless, which also results in inaccurate audience estimation for both advertisers and developers. There is therefore a strong incentive for the analytics companies to find a replacement identifier, and in fact, in the mobile environment, there are a number of alternatives, for example the MAC address, the IMEI, or building a fingerprint of the device. We note that such changes will not affect the identity spoofing attacks if these identifiers are exposed to the attacker.
From an analytics service provider's perspective, it may be possible to prevent individual profile extraction by ensuring a minimum anonymity level for audience reports. For instance, analytics should produce reports only when the audience is composed of a minimum of k significantly different profiles, similar to what is done in the Facebook audience estimation platform [5]. On the other hand, addressing the identity spoofing vulnerability may be more challenging in practice. As a first step, any message containing the device identifier should be protected. Google does take this into consideration by hashing the device IDs. However, as mentioned earlier, Google cannot successfully protect the device ID in isolation, and all other ad and analytics libraries need to do likewise.
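A minimal sketch of such a k-anonymity-style release gate follows, with an assumed threshold of k = 5 (the appropriate value of k is a policy choice we do not prescribe):

```python
from collections import Counter

MIN_DISTINCT = 5  # the "k": minimum number of distinct profiles required

def audience_report(profiles: list):
    """Release per-attribute counts only when the audience contains at
    least k distinct profiles; otherwise suppress the report."""
    distinct = {frozenset(p) for p in profiles}
    if len(distinct) < MIN_DISTINCT:
        return None                      # report withheld
    return Counter(attr for p in profiles for attr in p)

# An audience segment isolated down to one spoofed user yields no report:
print(audience_report([{"Singles", "Gamers"}]))  # None
```

Such a gate would directly defeat the segment-filter isolation described in Section 3.3, since a filter matching a single Android ID would return no data.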
protocol   onStartSession           getAds                    total/hour
           latency     bandwidth    latency    bandwidth      latency          bandwidth
HTTP       160±1 ms    422 B        160±1 ms   340±2 B        4,400±380 ms     9,425±731 B
HTTPS      800±5 ms    3288 B       800±5 ms   2000±269 B     8,200±950 ms     390,645±36,611 B

Table 3: HTTP vs. HTTPS communication cost of analytics and ads traffic (Flurry)
As the next level of protection, let us assume that all communications between the user device and the analytics server are secured. In fact, the Flurry API allows the developer to use SSL for this purpose, although this feature is turned off by default. The API documentation5 suggests that additional loading time may be introduced if SSL authentication/encryption is used, due to extra handshake and communication costs. We take the Flurry API as an example to evaluate the communication cost, in terms of latency and bandwidth, for both tracking and ads traffic using the Flurry HTTP and HTTPS methods. The results are shown in Table 3. We found that the response time is on average 5 times longer using HTTPS than HTTP, and that the extra handshakes introduce on average 7 times more bandwidth per message. If we run the app for 1 hour with the default ad fetching interval, as defined by the Flurry API, HTTPS consumes 390.6 kB of bandwidth while HTTP consumes only 9.4 kB. These results suggest that enforcing SSL usage would lead to significantly higher processing and communication costs for the user device; this is certainly the case for Flurry, given that they claim to handle as many as 3.5 billion app sessions per day6. Nonetheless, even though protecting the communication through encryption would prevent the adversary from collecting the device ID, it would still not prevent identity spoofing if the ID were known through a different process, e.g. from malicious apps.
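As a quick check of the hourly figures in Table 3, the relative overheads of HTTPS over HTTP work out as follows:

```python
# Hourly costs taken from Table 3 (Flurry, default ad-fetch interval):
http  = {"latency_ms": 4400, "bandwidth_b": 9425}
https = {"latency_ms": 8200, "bandwidth_b": 390645}

latency_overhead   = https["latency_ms"] / http["latency_ms"]
bandwidth_overhead = https["bandwidth_b"] / http["bandwidth_b"]
print(round(latency_overhead, 1), round(bandwidth_overhead, 1))  # 1.9 41.4
```

The hourly bandwidth gap (about 41x) is much larger than the per-message gap because the TLS handshake cost is paid again on each of the frequent ad fetches.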
A conventional security solution to mitigate identity spoofing attacks is to rely on a public key infrastructure, and use certificates and digital signatures to authenticate the messages and devices. Regardless of the extra communication and processing cost, users may be reluctant to authenticate to services that they may not even be aware of, and such an infrastructure requires an industry-wide effort to be implemented.
5 accessible through http://goo.gl/sB9qs, June 13 2013
6 accessible through http://goo.gl/XnTyC, Oct 17 2013
How to efficiently authenticate mobile devices while protecting user privacy during the tracking process across applications remains an open question.
6. CONCLUSION
In this paper, we present and validate the methodology used to demonstrate the leakage of users' personal information through mobile analytics services. We show this leakage for two major analytics services, Google Mobile App Analytics and Flurry, and additionally demonstrate how a malicious attacker can use it to disrupt the advertising ecosystem. Although the recent modifications in the Android and iOS systems, which remove permanent unique identifiers, may result in a change in the way the analytics companies track users (necessitating the use of a different permanent ID, e.g. a device fingerprint), these changes would not affect the described attacks if such identifiers are exposed.
7. REFERENCES
[1] Pdroid – the better privacy protection, December
2011.
http://www.xda-developers.com/android/pdroid-thebetter-privacy-protection/.
[2] Android “kitkat” update – new privacy features,
November 2013.
http://www.futureofprivacy.org/2013/11/15/androidkitkat-update-new-privacy-features/.
[3] Using identifiers in your apps, March 2013.
https://developer.apple.com/news/?id=3212013a.
[4] A. R. Beresford, A. Rice, N. Skehin, and R. Sohan.
Mockdroid: trading privacy for application
functionality on smartphones. In HotMobile, 2011.
[5] T. Chen, A. Chaabane, P.-U. Tournoux, M. A. Kaafar,
and R. Boreli. How Much is too Much? Leveraging
Ads Audience Estimation to Evaluate Public Profile
Uniqueness. In PETS’13, 2013.
[6] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung,
P. McDaniel, and A. N. Sheth. Taintdroid: an
information-flow tracking system for realtime privacy
monitoring on smartphones. In Proc. of 9th USENIX
Symposium on OSDI, 2010.
[7] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri.
A study of Android Application Security. In
Proceedings of the 20th USENIX conference on
Security, SEC’11, 2011.
[8] A. P. Felt, H. J. Wang, A. Moshchuk, S. Hanna, and
E. Chin. Permission Re-Delegation: Attacks and
Defenses. In Proc. of 20th USENIX Security
Symposium, 2011.
[9] M. C. Grace, W. Zhou, X. Jiang, and A.-R. Sadeghi.
Unsafe Exposure Analysis of Mobile In-app
Advertisements. In WISEC, 2012.
[10] S. Han. A Study of Third-Party Tracking by Mobile Apps in the Wild. Technical report, University of Washington, UW-CSE-12-03-01, 2012.
[11] I. Leontiadis, C. Efstratiou, M. Picone, and
C. Mascolo. Don’t kill my ads!: balancing privacy in
an ad-supported mobile application market. In
HotMobile, 2012.