Critical Evaluation of Current Approaches to Grid Security

MSc Secure Electronic Commerce
Royal Holloway, University of London
Critical Evaluation of Current
Approaches to Grid Security
Submitted by
Ali Nasrat Haidar
Supervised by
Dr. Kenny Paterson
2002-2003
CONTENT
ACKNOWLEDGMENT
CHAPTER 1
INTRODUCTION TO GRID SECURITY
1.1 Introduction
1.2 Example of a Grid application
1.3 Globus
1.3.1 Globus Model
1.3.2 Globus and CORBA
1.4 Setting the scene
1.5 Security issues on the Grid
1.6 E-commerce security Vs. Grid Security
1.7 Aim of this research
1.8 Dissertation Organisation
CHAPTER 2
VIRTUAL ORGANISATION
2.1 Introduction
2.2 VO partner organisations trust all VO members
2.3 VO organizations trust all VO members and a Central Database
2.4 Public Key Infrastructure (PKI)
2.4.1 Overview of Public Key Cryptography
2.4.2 X.509 Certificate
2.4.3 Certificate Authority (CA)
2.4.4 Registration Authority (RA)
2.4.5 Certificate Revocation
2.4.6 Problems with PKI
2.5 VO organizations trust a third party
CHAPTER 3
GRID AUTHENTICATION
3.1 Introduction
3.2 Design issues in Grid Authentication protocol
3.3 Approaches to Authentication
3.4 VO sites trust all VO members
3.5 VO sites trust all VO members and a Central Database
3.6 Grid Authentication with PKI: VO sites trust a third party
3.6.1 Advantages of using PKI on the Grid
3.6.2 Vulnerabilities in PKI
3.7 Proxies and Delegation
3.8 Security issues in proxies
3.9 Other Alternatives
3.10 Globus Toolkit (GT2): GSI approach to Authentication
3.10.1 GT2 authentication with proxies
3.10.2 MyProxy
CHAPTER 4
GRID AUTHORISATION
4.1 Introduction
4.2 Fundamental model of access control
4.3 Resource centred with Access Control List (ACL)
4.4 Role Based Access Control (RBAC)
4.5 Distributed Authorisation
4.6 Globus approach to Authorisation
4.7 Community Authorisation Service (CAS)
4.8 Access Control with PKI
4.9 Firewalls and the Grid
4.9.1 Brief overview of firewalls
4.9.2 Accessing resources behind the firewall
4.9.3 Naming issue with Network Address Translation (NAT)
4.9.4 Globus and firewalls
4.10 Future network solutions
CHAPTER 5
CONFIDENTIALITY, INTEGRITY, AVAILABILITY AND ACCOUNTABILITY ON THE GRID
5.1 Introduction
5.2 Confidentiality on the Grid
5.2.1 Brief overview of Encryption
5.2.1 Communication security
5.2.2 Data resource privacy
5.2.3 Remote Data privacy
5.3 Integrity on the Grid
5.4 Grid Availability
5.5 Accountability
CHAPTER 6
TOWARD A TOP DOWN VIEW OF GRID SECURITY
6.1 Introduction
6.2 Risk management of Grid Assets
6.2.1 Overview of Risk and Risk Analysis
6.2.2 Risk Analysis of Grid Assets
6.2.3 Enhancing security of core Grid Assets
6.3 Security hierarchy of Grid Resources
6.4 Threats
7. CONCLUSION
REFERENCES
Acknowledgment
I am most grateful to Dr. Kenny Paterson for his supervision, guidance, illuminating
discussions and critical feedback on this project. In fact, I couldn't have wished for a
better supervisor! I am also very grateful to Prof. Peter Wild for his extremely
valuable comments on a draft of this report and for his supervision while Dr. Paterson
was away in August. Having said that, all inaccuracies and deficiencies in this report
are my responsibility alone. I would like to thank all the lecturers, research staff, and
MSc colleagues in the Information Security Group at RHUL for creating an exciting,
lively environment for learning, intellectual discussions, and professional activities
within a most friendly and enjoyable atmosphere!
I am also indebted to several people who gave wonderful lectures at the MENA
Advanced Summer School on parallel, distributed, and Internet computing, July 7-19,
2002. Most of all, I thank Prof. Dieter Gollmann for interesting me in security through
his brilliant lectures on this subject. I thank Prof. Mark Baker (Portsmouth) and Prof.
Salim Hariri (Arizona) for introducing me to Grid, parallel computing and middleware
concepts, and Prof. Peter Coveney (director of the Centre for Computational Science,
UCL) for his lectures on RealityGrid and for many enjoyable discussions this year,
which made me appreciate the big gap between what scientists require and what
current Grid Security solutions offer. Finally, I thank Dr. Ali Abdallah (London South
Bank University) for introducing me to formal modelling, abstraction, and precision.
Chapter 1
Introduction to Grid Security
1.1 Introduction
The vision of the computational Grid [1, 3] is to provide high performance computing
and data infrastructure supporting flexible, secure and coordinated resource sharing
among dynamic collections of individuals and institutions known as “virtual
organizations” (VO) [1, 3]. “Grid Computing” is rapidly spreading from the scientific
and academic arena into the industrial and commercial world. It is intended to offer
seamless and uniform access to substantial resources without having to consider their
geographical locations. Resources can be high performance supercomputers, massive
storage space, sensors, satellites, software applications, and data belonging to
different institutions and connected through the Internet [1, 3]. Grids can enable
collaboration between several organisations [1, 3]. The Grid provides the
infrastructure that enables dispersed institutions (commercial companies, universities,
government institutions, and laboratories) to form virtual organisations (VOs) that
share resources and collaborate for the sake of solving common problems.
1.2 Example of a Grid application
A typical example of a Grid application is “weather prediction”. This involves
collaboration between several partners: TV stations that produce regular weather news
reports, a satellite company that regularly provides space images of the earth, a
supercomputing centre that rapidly analyses the images, and a visualization centre that
produces visual interpretations of the weather analysis (Figure 1.1). The smooth
running of this project for the timely production of regular weather reports crucially
depends on appropriate schemes for securely sharing, exchanging, and coordinating
information between these partners.
[Figure 1.1: diagram of a VO linking a Satellite Company, a TV Station, an Analysis
Center (intensive computation) and a Visualization Center (visualization tool)]
Figure 1.1 Weather prediction using Grid
The power of the Grid is particularly useful in areas involving intensive processing
such as life science research [26], financial modelling [26], industrial design [26], and
graphics rendering [26]. Many governments have recently initiated special
programmes to support the Grid: the UK e-science programme [27] is funding projects
such as Reality Grid [28]; the EU is funding projects such as European Data Grid [29]
and EuroGrid [30]; and in the US, NASA is funding the Information Power Grid [31]
and the Department of Energy is funding the Globus project [32].
The benefits of having partnerships between institutions to achieve ambitious projects
have been well recognised and well documented in [1]. Currently these programmes
exist in concept and one of the biggest barriers to their realisation is security.
1.3 Globus
Globus [31] was the first attempt to coordinate resource sharing between many
research institutions in the US collaborating to enable the construction of a
computational Grid. It provides a software infrastructure, the Globus Toolkit [31],
which enables an application to pool geographically distributed instruments,
visualization tools, high-performance computing, and information resources. It also
introduces security access controls to resources, despite their geographical
distribution and their heterogeneous nature. The primary objective of Globus is to
integrate these heterogeneous resources into a single virtual machine [8].
Currently, the Globus Toolkit, GT2 [31], is considered the de-facto standard
middleware for building Grid applications because of its wide acceptance and
deployment worldwide [3, 7]. Many Globus concepts are adopted in IBM products
[26] and in current Grid projects such as NASA’s Information Power Grid and the
European Data Grid. Alternatives do exist, such as Legion [39] and Unicore [34]; the
latter is used by the Reality Grid project for building Grid applications.
1.3.1 Globus Model
Globus adopts the “hourglass model”. This model provides a set of core services as
basic infrastructure. A typical Grid environment incorporates heterogeneous
resources such as different machines, different operating systems, and possibly
different hardware architectures. It is therefore very difficult to design middleware
that meets every kind of application requirement. Each Grid project may specialise
in a different area that requires different types of resources. The hourglass model is
used to construct high-level specialised Grid applications on top of the core services.
[Figure 1.2: hourglass diagram with diverse applications at the top, core services
(GSI) at the neck, and various operating systems at the base]
Figure 1.2 The “hourglass” model of Globus [31]
Instead of offering a specialised solution for Grid applications, this model allows
developers to build dedicated applications on top of Globus, keeping the role of
Globus itself small.
Globus consists of several components, namely the Globus Resource Allocation
Manager (GRAM) [45], GridFTP, the Monitoring and Discovery Service (MDS) [46]
and the Grid Security Infrastructure (GSI) [8]. Here is a brief explanation of these
components.
GRAM: an HTTP-based protocol used for remote allocation of computational
resources and for monitoring and managing the status of the execution on those
resources [8].
GridFTP: based on the File Transfer Protocol, it provides high-performance, secure
and reliable data transfer and data access on the Grid.
MDS: provides access to static and dynamic information about resources. This
information includes the capability and availability of each resource.
GSI: provides authentication and related security services, discussed in detail in the
next section.
[Figure 1.3: layered diagram with GridFTP, MDS and GRAM sitting on top of the Grid
Security Infrastructure (GSI)]
Figure 1.3 Components hierarchy in GT2
The Grid Security Infrastructure (GSI) component is the most important part of the
Globus Toolkit. All other components are built on top of it (Figure 1.3). This helps
developers to define protocols and primitives that allow “…Secure negotiations,
initiations, monitoring, accounting, and payment of sharing operations on individual
resources” [3]. The functionalities of GSI are discussed in more detail in Chapters 3
and 4.
1.3.2 Globus and CORBA
Many large organizations such as financial institutions have built their IT systems
over the years by adding new hardware, software and applications to meet new user
requirements. An upgrade sometimes requires rewriting an application so that it
works on a new platform. Sharing data between those applications therefore requires
customised software solutions [2].
CORBA [2] is middleware designed to enable the construction of distributed
applications within an organisation. It provides a set of library primitives that allow
applications to be constructed to run on machines with different kinds of operating
systems and different classes of parallel hardware architectures, thus improving the
interoperability of applications. CORBA is built on a client-server model rather than
on the coordinated use of multiple resources [3]. CORBA does not allow:
• A user to grant access rights to an application running on a remote site.
• Pooling resources from multiple administrative domains when required to
solve a complex problem.
• Sharing between different organisations. Sharing in CORBA is restricted to
one organization with one security policy, rather than spanning many
organizations with different security policies.
Globus complements CORBA rather than replacing it. Many scenarios are
described in [3] showing how Globus could be used with CORBA within
enterprise computing.
We examine Globus security mechanisms in several sections throughout this project:
the Globus approach to authentication in Chapter 3, and to authorisation and firewalls
in Chapter 4.
1.4 Setting the scene
To set the scene, we first need to understand what a virtual organisation (VO) is. A VO
is a community of resource providers and users from multiple administrative domains,
collaborating in order to achieve common objectives [3].
The following VO example will be used in several sections throughout this work. We
have a virtual organisation (VO) comprising several institutions collaborating in a
Grid project. These institutions can be academic, government, industrial and
commercial institutions. We assume that these institutions are geographically
distributed. Figure 1.4 illustrates the Grid Infrastructure and the collaborating
partners’ characteristics.
[Figure 1.4: diagram of Company A, University B, Lab C and Company D joined in a VO
over the Grid infrastructure, with the shared Grid resources and the Grid users of
each site marked]
Figure 1.4 A typical Virtual Organisation and Grid Infrastructure
Each institution in the project has a local security policy that governs access to its
local resources. Although these institutions are partners in the VO, only some of their
users are members of the VO (denoted Grid users in Figure 1.4). Likewise, only some
of their resources are shared with the VO (denoted Grid resources in Figure 1.4); the
remaining resources are withheld from the VO and are accessible only by local users.
Each institution has a local intranet security solution such as Kerberos [2, 3] or Public
Key Infrastructure (PKI) [10, 12, 13, 14, 53].
The main roles involved in this VO are:
User: This role includes users from institutions A, B, C and D. Each user holds a job
position and has the permissions needed to perform the functions of that position. Let
U.A denote user U from institution A.
Administrator: This role includes administrators from institutions A, B, C and D. The
administrator is responsible for managing project users’ accounts and granting
permissions to those users/roles. In addition, this role involves administering firewalls,
intrusion detection systems and databases. Let Admin.A denote administrator
Admin from institution A.
Site Contact: Each VO site has a site contact. The main role of the site contact
is to confirm the identity of users from the site he represents. The site contact also
communicates with resource administrators at other sites in order to add users to, or
remove users from, the VO. Let S.A denote the site contact in A.
Project Leader: Each Grid project has a project leader. This role involves identifying
members of the project, accepting new users, and setting permissions for the project
members according to the project security policy.
Registration Authority (RA) [12, 13, 14]: This role is only relevant when a PKI solution
is adopted. The RA is responsible for authenticating project users’ identities and
submitting certificate requests on their behalf to the Certificate Authority (CA) [12, 14].
Resource: The role of a resource is to provide services to project users. On the Grid, a
resource can submit jobs on a user’s behalf to other remote resources at different sites.
Resources must therefore be authenticated, which is why they are considered a role.
The resources that are being shared can include [1]:
• Computer resources such as high performance supercomputers, large clusters,
massive storage devices and desktop machines.
• Data resources such as databases, archives and sophisticated simulation
software.
• Instruments such as telescopes, satellites, lab facilities and sensors.
These resources are dynamic and heterogeneous. For instance, computing power
can be added to or removed from the project. Data resources such as databases and
archives may be available to the project for a limited period only. Let R.A denote
resource R from site A.
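The notation introduced above (U.A, Admin.A, S.A, R.A) can be mirrored in a small
data model. The following Python sketch is illustrative only and is not part of Globus
or any Grid toolkit; the class names Principal and Site, their fields, and the local-only
user V.A are assumptions made for this example. It shows that each site keeps its own
members and exposes only a subset of them to the VO.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Principal:
    """A role holder written name.site in the text, e.g. U.A (hypothetical model)."""
    name: str
    site: str
    role: str  # "user", "admin", "site_contact" or "resource"

    def label(self) -> str:
        return f"{self.name}.{self.site}"


@dataclass
class Site:
    """One institution: all local principals, plus the subset shared with the VO."""
    name: str
    members: set = field(default_factory=set)
    grid_members: set = field(default_factory=set)

    def add(self, principal: Principal, in_vo: bool = False) -> None:
        self.members.add(principal)
        if in_vo:
            self.grid_members.add(principal)


site_a = Site("A")
u = Principal("U", "A", "user")            # Grid user U.A
local_only = Principal("V", "A", "user")   # local user, not a VO member (assumed)
r = Principal("R", "A", "resource")        # Grid resource R.A

site_a.add(u, in_vo=True)
site_a.add(local_only)
site_a.add(r, in_vo=True)

vo_labels = sorted(p.label() for p in site_a.grid_members)
print(vo_labels)  # ['R.A', 'U.A']
```

Resources are modelled as principals alongside users because, as noted above, a
resource may act on a user's behalf and must itself be authenticated.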
1.5 Security issues on the Grid
Grid applications are characterised by the coordinated use of resources from different
administrative domains. Figure 1.5 indicates this situation by showing the policies and
platforms in each domain. Each site in the VO is independently administered and has
its own local security solution, such as Kerberos or PKI. These solutions are built on
top of different platforms such as UNIX [35], Windows [36] and OS/2 [37].
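[Figure 1.5: diagram of Company A (Policy A), University B (Policy B), Lab C (Policy C)
and Company D (Policy D), each with its own platform and security solution (UNIX,
Kerberos, OS/2, PKI), joined in a VO with a Global Policy over the Grid infrastructure]
Figure 1.5 Reconcile local policies with Global policy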
When these institutions are brought together to collaborate on a common project in
this heterogeneous environment, many security problems arise:
Interoperability: Interoperability is a key issue on the Grid. It is impractical to change
the security mechanisms at each site in the VO due to technical, financial and political
reasons [7]. Thus, the security of the Grid project must be able to interoperate with the
local security solutions at different levels:
Policy level: Each partner in the VO has its own security policy (Figure 1.5), which is
carefully tailored to maximise the protection of its valuable resources. The main
issues to be addressed are:
• How to reconcile global security policy with local security policy.
• How conflicts between local and global policy can be solved, in other words
which policy will apply, local or global.
Authentication level: VO sites require mechanisms for identifying users from one
security domain to another. For example, the identity of a user from company A
(U.A) and his credentials as expressed in Policy A are meaningless at the other VO
sites. How, then, does U.A, who authenticates locally with (for example) a UNIX
login, authenticate to site B, which uses (for example) Kerberos, in order to access
resource R.B?
Authorisation level: The access control mechanisms used vary from one VO site to
another, depending on the type and value of the resources accommodated. For example,
site A may use an Access Control List (ACL) [2] or Role Based Access Control
(RBAC) [2] to control access to its resources. The first problem is how to determine
whether a user, U.A, authenticated at site B, is allowed access to resource R.B at B.
The second is who decides what the access rights of U.A are.
Scalability: The number of users and resources in the VO is dynamic. Users and
resources can be added to or removed from the project as required. For example, CERN
[38], a high energy and nuclear physics project, involves 1800 physicists from 150
institutions in 32 countries. Thus, a scalable way to manage users’ authentication and
their rights to access project resources is required.
Confidentiality and integrity issues: On the Grid, users transmit data over the Internet
and access remote data resources that may be very sensitive. Moreover, Grid users
can run programs on remote sites. Therefore, confidentiality and integrity are required
to:
• Protect data transmitted over a public network such as the Internet.
• Ensure the privacy and accuracy of the results of programs executed on remote
sites.
• Ensure the secrecy and correctness of the shared data resources.
Trust: Scientists and commercial companies want to know whom they are trusting
with their data and commodities. The question that arises is whom to trust:
individuals, sites or third parties?
Usability: Grid users are from different types of organisations such as academic,
government and financial institutions. Thus, they may not be security experts.
Therefore, usability is required so that access to the VO resources is as smooth and
seamless as access to local resources.
Firewall: A frequently encountered problem on the Grid is firewalls [7, 57]. VO
members want to share some resources with other partners but also want to keep their
other resources private. Collaborating partners on the Grid have to allow requests
from, and replies to, jobs initiated at other sites to pass through their firewall to reach
their resources. This requires opening a port in the firewall, which could introduce
another vulnerability into the local security of the VO partner’s organisation. For
commercial companies, compromising local security is unthinkable, so they may end
up not collaborating at all.
[Figure 1.6: diagram of Company A, University B, Lab C and Company D, each behind
its own firewall, joined in a VO over the Grid infrastructure]
Figure 1.6 VO with Firewalls
1.6 E-commerce security Vs. Grid Security
The major differences between e-commerce security and Grid security are:
Collaboration and sharing: In e-commerce, there is no concept of collaboration and
resource sharing. The notion of resources in e-commerce is restricted to files and
databases.
Remote program execution: Users in e-commerce cannot run programs because of the
security consequences for the e-commerce company. On the Grid, users can run
programs on remote sites.
Trust: There is no trust relationship between the e-commerce company and its
customers. This allows the company to install a firewall and define two zones: a
trusted zone inside the company and an untrusted zone, which is the Internet.
Authentication: Authentication is not as high a priority in e-commerce as it is on the
Grid. As long as the customer can provide valid credit card details, he can get access
to services and resources on the company’s site.
1.7 Aim of this research
The aim of this research is to investigate, compare and contrast several approaches to
Grid Security. The Grid is based on concepts of virtual organizations (VOs) whose
definitions, administrative capabilities and functionalities have rapidly evolved over
the last decade. One reason for this evolution, among several others, is the drive to
provide more satisfactory solutions to aspects of Grid security such as: integrity,
confidentiality, availability, accountability, authorisation, and authentication. The
Grid solutions to these aspects depend strongly on the definition of a VO, the trust
relationship between VO sites and third parties, and the adoption of new advances in
cryptography and PKI technology. The main aims of this research are to:
• Give a brief introduction to Grid computing and the reasons for using it, and show
why Grid computing is important in practice.
• Present several types of VOs, ranging from elementary schemes to the current Grid
concept, by precisely modelling the roles involved, the administrative capabilities,
and the trust relationships. Understand the concept and the mission of a virtual
organisation.
• Understand how to reconcile aspects of the security policies of local sites with the
global security policy of the VO.
• Understand mechanisms for achieving certain fundamental security aspects such
as:
    o Authentication mechanisms: user name and password, Single Sign-On
      [2], PKI, identity delegation [3, 4, 5].
    o Authorisation mechanisms: Access Control Lists [2], Role Based
      Access Control [2], Identity mapping to apply local security [3, 6],
      Community Authorisation Service [5] based on a Trusted Third Party.
    o Confidentiality, integrity, availability and accountability mechanisms.
• Give an overview of the de-facto standard Globus [1] project. The Globus
infrastructure introduces security access controls over resources despite their
geographical distribution and their heterogeneous nature.
• Give a critical discussion of the mechanisms and solutions to aspects of Grid
security in each type of VO and compare them with Globus (GT2, Globus Toolkit
version 2). Provide a serious attempt at giving a top down view of Grid security
based on combining classical security definitions with useful concepts taken from
risk analysis, threat modelling and resource ordering (hierarchy).
1.8 Dissertation Organisation
The dissertation is organised as follows. In Chapter 2, we discuss how the concept of
the VO has evolved in the last decade and attempt to clarify the mission of the VO via
several schemes. In Chapter 3, we present the authentication problem on the Grid, the
issues to be solved, and possible solutions; examples from Grid Security
Infrastructures and Globus are used throughout the chapter. In Chapter 4, we
present the authorisation problem, the issues to be solved, possible solutions and the
Globus approach. In Chapter 5, we explain how confidentiality, integrity and
availability are maintained on the Grid, and give our definition of the accountability
problem on the Grid. In Chapter 6, we present a top down view of Grid security. We
discuss the relationship between resources on the Grid and security considerations,
and conclude with the current status of the Grid and future work.
Chapter 2
Virtual Organisation
2.1 Introduction
Virtual organisation (VO) [1, 3] is a fundamental concept on the Grid. A VO is a set
of resource providers and users from multiple administrative domains, possibly
geographically distributed, collaborating in order to achieve common objectives [3].
The essential issue about the Grid is enabling sharing of resources. According to [3]
“…This sharing is, necessarily, highly controlled, with resource providers and
consumers defining clearly and carefully just what is shared, who is allowed to share,
and the conditions under which sharing occurs. A set of individuals and/or institutions
defined by such sharing rules form what we call a virtual organization (VO)”. The
difference between a VO and a classical organisation is that individuals and/or
institutions that have agreed to collaborate and share resources continue to belong to
real organisations. In addition, they may be members of several VOs at the same time
[3].
The characterization of the VO given above is very broad. Understanding what a
VO means is essential for the design of authentication protocols and authorisation
mechanisms on the Grid. Researchers and scientists want to know whom they are
trusting with their data. Commercial companies want to know whom they trust with
their commodities.
The aim of this chapter is to clarify the mission of the VO, the explicit assumptions
about trust relationships among roles in the VO (administrators, sites, users, certificate
authorities, and third parties), and how security policy conflicts between VO and local
sites are resolved. We will investigate several VO schemes, and see how these
schemes have evolved over the last decade. In addition, we will analyse their merits
and weaknesses.
2.2 VO partner organisations trust all VO members
Typically, a VO member will have access to a wide range of resources within a Grid
project. The question to be addressed is “how does a user from an institution become
a VO member?” In the first scheme, a VO is a set of ad hoc connections. Consider the
Grid project described in section 1.3.
[Figure 2.1: Company A, University B, Lab C and Company D, linked by ad hoc connections]
Figure 2.1 The first scheme: VO sites trust all VO members
The roles involved in this scheme are site contact and user. Each VO site will have
a site contact (e.g. S.A, S.B). Figure 2.1 illustrates how user U.A can be a VO
member. Each user who wishes to join the VO will need to have a separate account at
each VO site. The VO sites will treat project users as an extended part of their core
organisation. Therefore, if U.A has a valid username and password for each resource
at each site, then all VO sites will consider him trustworthy. Thus, the trust
relationship in this scheme is with “named users”. The policy of the VO will be
governed by local policies because each site will locally authenticate users and apply
its local security policy.
This scheme has several disadvantages:
1. The user is responsible for the details and security key of each account. Therefore,
the security of the VO resources depends on how U.A protects his account details.
2. It is unacceptable to security-sensitive organisations, in particular commercial
companies, because users will be treated as an extended part of the organisation
while they still belong to a different institution.
3. It is not scalable. An administrator at one VO site will have to manage members
from other institutions in addition to his local users.
2.3 VO organizations trust all VO members and a
Central Database
In this scheme, the VO is a central authority responsible for maintaining the account
information that users need to access VO resources. This information is stored in a database
maintained by the VO. All VO members and all VO sites trust the VO database. Thus,
the VO becomes a trusted third party that has its own policy, roles, procedures and
mechanisms. The VO is set up and funded by the partners and is managed by
employees from the participating institutions. The roles involved in this scheme
include:
Project Leader: This role involves identifying members of the project, accepting new
users, and setting permissions for the project members according to the project
security policy. An employee from institution C, emp.C, can also have a role as
project leader in the VO, Projectleader.VO.
Site contact: The site contact role includes liaising with resource administrators at
other sites in order to confirm the identity of a user from his institution who wishes to
join the VO.
Administrator: The administrator is responsible for managing project users’ accounts
information on the VO’s database server. For instance, an administrator from
institution A, Admin.A can also have a role as VO database administrator,
Admin.VO.
Users: This role includes users from the VO partner organisations. Each user will have
an account in the VO that allows him to gain access to VO resources.
To summarise, the VO is set up and funded by the collaborating partners. It is
managed by employees from the collaborating institutions (A, B, C and D). Those
employees may have a role in their organisation that is different from the role in the
VO. For instance a project manager in institution A, can be the database administrator
in the VO. A scientist from institution C can be the project leader of the VO.
For U.A to become a VO member (Figure 2.2):
1. U.A registers in the VO database.
2. The project leader in the VO contacts institution A to get assurance that U.A’s
registered information is true.
3. It is the job of Site contact, S.A, to confirm the validity of this information.
4. If the confirmation is positive, the project leader contacts the administrators of
the other VO sites to create a local account and permissions for U.A. [steps
4,5,6]
5. Each local site administrator in the VO will send the newly created username
and password to the VO. This credential will be stored in the U.A’s account on
the VO’s server. Thus, U.A does not need to store these credentials locally.
[steps 7,8,9]
[Figure 2.2: User.A at Company A registers with the VO database (DB): 1-Register,
2-Verify information (to Site contact.A), 3-Information valid, 4-Create account U.A.
The project leader initiates steps 2, 4, 5 and 6. Companies B, C and D return
7-(u1,pw1,B), 8-(u2,pw2,C) and 9-(u3,pw3,D) to the VO.]
Figure 2.2 All VO organizations trust all VO members and the VO server
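The registration steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names register_user, site_contact_confirms, make_admin and site_admins are hypothetical, the VO database is modelled as a plain dictionary, and the credentials are kept in clear text purely to mirror the scheme as described.

```python
# Illustrative sketch only: every name here is hypothetical, not part of any Grid toolkit.

def register_user(user, home_site, site_contact_confirms, site_admins, vo_db):
    """Register `user` with the VO; return True if the VO accounts were created."""
    # Steps 2-3: the project leader asks the user's site contact to confirm the details.
    if not site_contact_confirms(user, home_site):
        return False
    # Steps 4-6: each of the other VO sites creates a local account for the user.
    # Steps 7-9: each site returns (username, password), which is stored on the VO server.
    vo_db[user] = {
        site: create_account(user)
        for site, create_account in site_admins.items()
        if site != home_site
    }
    return True


def make_admin(site):
    """Return a toy site administrator that issues a local username and password."""
    def create_account(user):
        return (f"{user}@{site}", f"pw-{site}")  # placeholder password, not a real policy
    return create_account


vo_db = {}
admins = {site: make_admin(site) for site in ("A", "B", "C", "D")}
confirmed = register_user("U.A", "A", lambda user, site: site == "A", admins, vo_db)
```

After the call, vo_db["U.A"] holds one credential per remote site, which is exactly the information an attacker would obtain by compromising the VO server.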
This scheme provides significant improvement over the previous scheme. The
credentials needed to access VO resources are protected by the VO, not by the user.
This substantially reduces the risk that passwords to resources are compromised through
user mismanagement. In addition, it gives assurance to the VO organisations about the
identity of users accessing their resources.
However, this scheme also introduces new risks:
• The VO becomes a single point of attack. A successful denial of service attack
on the VO database server will leave users disconnected from their resources.
• The VO central database is vulnerable to failure, which could also prevent
users from accessing project resources. However, this problem can be mitigated
with replication techniques.
The third VO scheme is based on Public Key Infrastructure (PKI). It requires
understanding the main components of PKI and their functionalities, which are
described next.
2.4 Public Key Infrastructure (PKI)
PKI is recognized as an essential enabling technology for security in a large-scale
network. The core concept in PKI is that of a “certificate” [12, 13, 53]. A certificate is
a data structure that contains a public key and related details about the key owner and
is signed by a Certification Authority (CA), so that any tampering can be detected
[10, 12]. The role of the certificate is to bind the public key to a particular entity on
the Grid. Possession of the corresponding private key is what proves the identity of
each entity on the Grid: user, resource and process.
An important advantage of PKI is interoperability. PKI functionality is increasingly
included in standard products and protocols such as Windows [36], Web Services [22],
.NET [22], XML [22, 33], Secure Shell (SSH) [18, 47] and Secure Socket Layer (SSL)/
Transport Layer Security (TLS) [9, 18]. The standards related to PKIs are X.509 v3
[12], PKIX (RFC 2459) [12], and SPKI (RFC 2692 and RFC 2693) [12].
2.4.1 Overview of Public Key Cryptography
With Public key cryptography [10] (asymmetric encryption), each party has a pair of
related keys: one for encryption and one for decryption. The same key cannot be used
for both. The main assumption in public key cryptography is that one of the keys must
remain secret (private key) and the other is made public (public key).
If a message sender uses its private key to encrypt a message, then any recipient who
can obtain the public key can decrypt it. In contrast, if a message sender uses a public
key to encrypt a message, then only the owner of the private key can decrypt the message.
In this simplified view, signing is the process of encrypting with the private key and
verification is the process of decrypting with the public key. RSA [11, 42] and ElGamal
[11] are two well-known public key algorithms currently in use.
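The relationship between the two keys can be illustrated with textbook RSA on deliberately tiny numbers. The sketch below is an assumption-laden toy: the primes, the exponent and the function names are chosen only for illustration, the "message" is a small integer standing in for a message digest, and there is no padding or hashing, so it must never be used to protect real data.

```python
# Toy RSA with tiny primes: for illustration only, never for real use.
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    """Anyone can encrypt with the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c):
    """Only the holder of the private key d can decrypt."""
    return pow(c, d, n)


def sign(m):
    """Signing applies the private key to the message (here a small integer)."""
    return pow(m, d, n)


def verify(m, signature):
    """Verification applies the public key and compares the result with the message."""
    return pow(signature, e, n) == m


message = 65                   # stands in for a message digest; must be smaller than n
assert decrypt(encrypt(message)) == message
assert verify(message, sign(message))
assert not verify(message + 1, sign(message))   # a modified message is rejected
```

In practice a signature is computed over a padded hash of the message, and in schemes such as ElGamal signing is not simply "encryption with the private key"; the symmetry shown here is specific to textbook RSA.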
As noted above, public key cryptography relies on the private key remaining
secret, so the key needs to be adequately protected. One way to protect the private key
file is to encrypt it with a password. Then, if it is stolen from a machine or a storage
device, the user will have enough time to revoke the corresponding certificate. For
more effective protection of the private key, a smart card [6, 7] with a password can be
used. The number of attempts to enter the right password can be restricted to prevent
brute force [2] and dictionary [2] attacks on the smart card. The only problem with
this solution is that not all users have a smart card reader.
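The attempt limit mentioned above can be sketched as a simple failure counter. The class below is a hypothetical model, not the interface of any real smart card: the name SmartCard, the limit of three attempts and the example password are all assumptions made for illustration.

```python
import hmac


class SmartCard:
    """Hypothetical model of a password-protected card that blocks after repeated failures."""

    def __init__(self, pin, max_attempts=3):
        self._pin = pin
        self._max_attempts = max_attempts
        self._failures = 0

    @property
    def blocked(self):
        return self._failures >= self._max_attempts

    def unlock(self, pin):
        """Return True only if the card is not blocked and the password is right."""
        if self.blocked:
            return False
        # Constant-time comparison avoids leaking how much of the guess matched.
        if hmac.compare_digest(pin.encode(), self._pin.encode()):
            self._failures = 0
            return True
        self._failures += 1
        return False


card = SmartCard(pin="4821")
guesses = ["0000", "1111", "2222", "4821"]
results = [card.unlock(guess) for guess in guesses]
```

Because the card blocks after the third wrong guess, even the correct password supplied on the fourth attempt is refused, which is what defeats an exhaustive search.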
2.4.2 X.509 Certificate
X.509 version 3 is the most widely used data format for public key certificates today.
It provides a uniform way of expressing the identity of entities on the Grid. It has also
been profiled for Internet use by the Internet Engineering Task Force (IETF) [43].
For illustration, we give the X.509 v3 certificate format in Figure 2.3. There are several
types of certificate [13]:
• Identity certificate: contains the public key of a user and his identity together
with some other information, digitally signed with the private key of the CA. This
makes the certificate tamper resistant.
• Attribute certificate: contains a set of attributes of a user, such as his occupation
or role in a company, together with some other information, digitally signed
with the private key of the CA.
The extensions in the certificate are essential for the Grid as they allow standardising
policy with respect to the use of certificates. Globus uses X.509 as the main credential
for authenticating users of the Grid.
Version                   v3
Serial number             54321
Signature algorithm id    RSA with SHA1
Subject name              CN=Ali Haidar, OU= MSc. Student, O =RHUL
Issuer name               CA= Trust Me, OU= PKI, O = Trust Me
Subject public key info   12676576436434654366543…..
Validity period           Not before 20/7/02 , Not after 27/07/04
Issuer Public key info    657854566765756732522115….
Extensions                Key Usage, Policy restriction…..
CA Signature              56456454$$$&&666894#4…
Figure 2.3 X.509 version 3 certificate format (Adapted from [12])
2.4.3 Certificate Authority (CA)
A CA is an institution trusted by others to vouch for the authenticity of a public
key [12, 14]. The main role of the CA is to issue digital certificates that
cryptographically bind a public key to the user’s identity information [10]. This is
done by signing the information using the CA’s private key. The relying parties
require the CA’s public key so that they can verify the digital signature on the
certificates issued by the CA [10].
The CA has many other responsibilities in addition to issuing certificates. These
responsibilities include generating key pairs, revoking certificates and maintaining a
Certificate Revocation List (CRL) and other revocation forms [12, 13].
A CA can be any partner in the Grid project, such as a university or a government body,
or a third party operating for profit such as Verisign [40] and Entrust [41]. If the Grid
project is large enough, then it might establish its own CA. For example, in the UK, the
e-science programme has established its own CA called UK-CA [7], and in the Globus
project, the Department of Energy in the US has established the DOE-CA [3] to
authenticate all Globus users. A CA must be trusted only to issue valid certificates [2].
2.4.4 Registration Authority (RA)
The RA [14] is the trusted representative of the CA. The basic functionalities of the
RA include:
• Authenticating VO members’ claimed identity. The procedure for verifying a
user’s identity depends on the sensitivity of the Grid project and the resources
involved. The strictness of the procedure may range from a face-to-face interview to a
letter from the human resources department of the user’s organisation confirming
his identity.
• Providing the CA’s public key and certificate to VO members. The VO member
establishes trust with the CA by obtaining the CA’s public key and the CA’s
certificate from the RA in a secure way, for instance in person on a floppy disk
or over the Internet via a secure connection such as SSL.
• Sending the certificate creation request to the CA.
• Obtaining the public key from the subscriber (optional).
2.4.5 Certificate Revocation
Certificate revocation [12, 14] is a core component of PKI. It is used for checking
whether a certificate is still valid or not. There are a number of events that can
invalidate a certificate before it expires: for example, the private key corresponding to
the certificate is stolen or lost, the CA’s private key is compromised, or the holder of
the certificate is no longer authorised to use it [12, 46]. When any of these
events happens, the certificate must be revoked, as it no longer represents the owner.
(A certificate whose validity period has expired is also invalid, but it does not need
to be revoked.) The relying parties must check that a certificate is not revoked before
using the public key on the certificate. There are different forms of revocation
mechanism, such as Certificate Revocation Lists (CRLs), the Online Certificate Status
Protocol (OCSP) and the Simple Certificate Validation Protocol (SCVP) [12, 14]. Here is a
brief explanation of the CRL and OCSP mechanisms, as they are commonly used.
2.4.5.1 Certificate Revocation List (CRL)
The CRL is the earliest form of revocation mechanism. It is a highly controlled online
database that contains the serial number and the revocation time of certificates that
have become invalid [12]. Many CAs update and sign the CRL on a daily basis to
allow relying parties to verify the integrity and authenticity of the CRL [12]. This
allows the CRL to be transmitted over a public network without being tampered with.
The CRL is usually specific to a single CA. There are several types of CRL, namely
the full CRL, the delta CRL and the partitioned CRL [14].
The full CRL contains all revocation information for all certificates of a particular CA
[14]. The main disadvantages of this type of CRL are:
• Freshness: The CRL is issued periodically, on a daily basis for instance. So, if
a certificate is compromised and reported for revocation, it may take up to
24 hours before it appears on the CRL (the revocation delay should be reasonable).
In the meantime, the invalid certificate can still be used because it is not yet
listed on the CRL.
• Scalability: The size of the CRL may be large because it keeps growing as
certificates are revoked. Thus, it will take time to download it every time. This
is not convenient for clients.
In order to reduce the size of the full CRL, a delta CRL can be used [14]. The idea is
to publish changes to the revocation information since the last full CRL was issued.
The CRL that contains the changes is called delta CRL. This mechanism requires that
the user already have a full CRL. In order to create a fresh CRL, the delta CRL can be
applied to the full CRL.
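Applying a delta CRL to a full CRL amounts to merging two sets of revocation entries. The following sketch assumes a CRL can be modelled as a mapping from serial number to revocation date; the function names and the sample serial numbers and dates are made up for illustration, and entries that a real delta CRL may remove are not modelled.

```python
from datetime import date


def apply_delta(full_crl, delta_crl):
    """Return a fresh CRL: the full CRL plus the revocations published since it was issued."""
    fresh = dict(full_crl)      # copy, so the caller's full CRL is left untouched
    fresh.update(delta_crl)
    return fresh


def is_revoked(serial, crl):
    """A certificate is treated as revoked if its serial number is listed."""
    return serial in crl


# Serial number -> revocation date (values are invented for the example).
full_crl = {1001: date(2003, 6, 1), 1007: date(2003, 6, 15)}
delta_crl = {54321: date(2003, 7, 2)}

fresh_crl = apply_delta(full_crl, delta_crl)
```

A relying party that holds only the older full CRL would still accept certificate 54321, whereas one that has applied the delta would reject it; the delta itself is small enough to fetch frequently.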
2.4.5.2 Online Certificate Status Protocol (OCSP)
OCSP [12, 14] is an online protocol for requesting status information of certificates. It
is a server-based revocation mechanism that provides real time verification of
certificate status information [12]:
• The OCSP client requests status information for a specific certificate.
• The OCSP server replies with a signed response giving a status such as good, revoked
or unknown.
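The request and response exchange can be pictured as a lookup on the responder's side. The sketch below is a simplification under stated assumptions: the names CertStatus and ocsp_status are hypothetical, the three status values follow the OCSP terms good, revoked and unknown, and the signing of the response is left out entirely.

```python
from enum import Enum


class CertStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNKNOWN = "unknown"


def ocsp_status(serial, issued_serials, revoked_serials):
    """Answer a status request for a single certificate serial number."""
    if serial not in issued_serials:
        return CertStatus.UNKNOWN      # the responder has no record of this certificate
    if serial in revoked_serials:
        return CertStatus.REVOKED
    return CertStatus.GOOD


issued = {54321, 54322}                # serial numbers invented for the example
revoked = {54322}
status = ocsp_status(54321, issued, revoked)
```

A real responder would sign each answer so that the client can check its origin and integrity, and it is this per-response signature that drives the scaling concern discussed next.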
The OCSP protocol has several disadvantages [14]:
• Response time and scaling: The response signing process will limit the server's
scalability because generating a digital signature is computationally intensive.
Signing is slow, whereas verification is fast.
• Need for multiple queries to verify the entire certificate path. Thus, the time to
set up the connection with the server and perform the check can be long.
• According to [14] “The protocol provides pre-computed responses including
validity.”
2.4.6 Problems with PKI
PKI functionalities can only be reliable if the CA, the RA and the certificate revocation
mechanisms are operating with a high level of security. Here are some of the problems
faced by PKI:
• Registration process [14]: The major problem is entity registration with the RA.
The RA needs to verify the identity of an entity that will be issued a certificate
by the CA. The procedure of validating that the entity’s information is correct
before issuing and signing the certificate is vital. For example, in 2001, Verisign, a
commercial CA, issued code-signing certificates in Microsoft’s name to an individual
who falsely claimed to be a Microsoft employee [10]. When registering a host name,
the verification process can be done using, for instance, a WHOIS lookup for host
names on the Internet. This will give the name of the institution and the
corresponding Domain Name Server (DNS).
• Key quality [11]: Another important issue with public key encryption in
general is the generation of the key pair. A user who wishes to generate his
own key pair will need to use specialised software. The quality of the key pair
depends on the ability of the Pseudo Random Number Generator (PRNG) [11] in
the software to produce a strong key pair; the software must also be trusted not
to keep a copy of the private key.
• Communicating the private key to the end entity [10, 13]: The user may allow the CA
to generate the key pair. The problem that arises is how to communicate the private
key to the end entity in a secure way. Currently, a smart card is a reliable option
but it is not practical because not all users have a smart card reader.
• Revocation [13, 46]: The revocation mechanisms have been a continuing problem
for PKI systems. There is a trade-off between how often a CRL update is
released and the security of the relying party’s system. According to [13],
“Certificates are of most use in off-line scenarios, but revocation pushes to do
online checks for the revocation”. If the relying parties need to get a new CRL
each time they want to validate a certificate then the CA must be on-line, but the
primary advantage of PKI is that it is supposed to allow off-line verification.
• Bootstrapping (trust anchor) [13]: When a user gets a certificate from a CA, he
will need to get a valid copy of the CA’s public key and certificate from a trusted
source. Otherwise, how can he trust that the certificate is really valid, if there is no
infrastructure to validate it?
• Trusting other CAs [12, 46]: There are two practical problems with trusting other
CAs:
1. Interoperability between the two CAs. Each CA may be using a different
PKI product; thus, they may be using different cryptographic algorithms,
different revocation mechanisms and different revocation formats.
2. How the CA operates. Even if one trusts one's own CA to validate another CA’s
credentials (e.g. in cross-certification), should one trust that this other CA
is taking the proper precautions with the entities it certifies?
In order to promote trust in PKI, issues of security, liability and obligations are
contained in a Certificate Policy (CP) [12]. A CP “is a named set of rules that
indicates the applicability of a public key certificate to a particular community and/or
class of application with common security requirements...” [12]. Furthermore, some
CAs publish a detailed description of the practices followed in issuing and
maintaining certificates in a Certification Practice Statement (CPS) [12]. A CPS is “a
statement of the practices which a CA employs in issuing public key certificates.”
The CPS is also the basis for compliance audits, which ensure that PKI
components are operating in accordance with the specification contained in the CPS.
Two CAs can establish trust after reviewing each other’s CP and confirming that they
both offer an equivalent level of trust, verification processes and liability [14].
2.5 VO organizations trust a third party
Now that the main PKI components and their functionalities have been defined, the third
VO scheme can be described. The trust relationship in this scheme is with a third party,
the Certification Authority (CA). All VO sites will trust the CA. The role of the CA is
to issue digital certificates for project users. Before issuing a certificate, a background
check on the identity of the certificate requestor and his role in his local institution
must be done. A Registration Authority (RA) usually does this check. In this scheme,
the VO acts as an RA for the CA. The VO becomes the trusted representative of the
CA and is responsible for authenticating VO members’ identity. In addition to the
roles described in the previous scheme, the roles involved in this scheme include a
Registration Authority role.
[Figure 2.4: U.A and site contact S.A at Company A; the VO (Registration Authority,
project leader); the Certificate Authority; and the sites University B, Lab C and
Company D. Messages: 1-Register, 2-Verify information, 3-Verified, 4-Request
certificate, 5-Certificate distribution, 6-Create account U.A, 7-same as 6,
8-same as 6, 9-Confirmation.]
Figure 2.4 VO is a registration Authority
Figure 2.4 illustrates how U.A can be a VO member:
1. U.A registers with the VO to get a certificate. He would be required to submit
some form of identification (letter from HR or personal interview).
2. The VO will contact institution A in order to verify that the information
submitted by U.A is true.
3. The site contact in A confirms U.A’s identity and his role in the project.
4. The VO asks the CA to issue a digital certificate for U.A.
5. Once the certificate is created, the user can download it.
6. The project leader contacts all VO sites to create a local account for U.A based
on the Subject name in his certificate. [Steps 6, 7, 8]
7. The user is sent a confirmation that his account with the VO has been
established. [Step 9]
This scheme has several advantages:
• If implemented “properly” it can provide the Grid infrastructure with basic
security features such as confidentiality, integrity and non-repudiation.
• Uniform credentials. Users will have an X.509 v3 certificate that allows them to
express their identities to different security domains in a uniform way. Furthermore,
it is platform independent, which means that it does not require major changes to
the security solutions of local sites.
• Interoperability. PKI functionality is increasingly included in standard
products such as SSL/TLS.
• This scheme does not require major changes in the local security policy of
local sites.
• The digital certificate is tamper resistant because the CA signed it with its
private key. If an attacker attempts to modify a certificate, the modification
will be detected.
This scheme crucially relies on the secure operation of the PKI components, namely the
CA, the RA and the CRL. Here are some of the issues that might put the whole VO at risk
if it is based on PKI:
• The CA’s private key is the crucial part of the whole public key infrastructure.
The CA signs every digital certificate issued to Grid users. Therefore, it is crucial
that the CA protects its private key according to best security practices [12].
• The procedure of validating that the user’s information is correct before issuing
and signing the certificate is vital [12].
• A major way to jeopardize the trust of a PKI environment on the Grid is to
compromise the integrity of the CRL process. If it is not possible to assure the
validity of the certificate in use, then the whole Grid authentication system is at
risk [12].
How to trust a CA
The VO must decide which CA to trust. A Grid involves institutions from different
countries. One of these institutions may trust a local CA in the country where it
resides. When this institution joins a Grid project, other VO members may choose not
to accept its certificates, because the CA is not well known and they do not know
how this CA operates. Moreover, the security practice of the CA may be considered
unreliable. Therefore, VO members need to agree on which CA to trust and on the
criteria for trusting a CA.
This VO scheme is currently adopted in most Grid projects such as Globus and
Unicore. Both rely on the existence of a public key infrastructure to allow users to
access VO resources.
Chapter 3
Grid Authentication
3.1 Introduction
Authentication [2, 4] is paramount to Grid Security. Authentication is important to
authorization, confidentiality and auditing. Thus, if authentication fails, the whole
project security will fail as well. Authentication aims at verifying the identity of an
entity [2]. Users of a Grid project often require remote access to valuable project
resources and services over the Internet. Authentication is needed because the user
identity is a parameter in most access control (authorisation) solutions used on the
Grid [2]. Furthermore, access to confidential Grid information will only be authorised
if users are properly authenticated. Finally, authentication is crucial to accountability,
because the user’s identity is part of the security events logged in the audit trail [2].
In addition to authenticating users on the grid, resources and processes running on
user’s behalf may require authentication [4]. For example, users want to make sure
that they are communicating with a genuine resource before sending confidential data.
3.2 Design issues in Grid Authentication protocol
There are several issues that need to be taken into account when designing an
authentication protocol for the Grid:
• The parties in the Grid project do not trust each other. Thus, the protocol must
provide mutual authentication.
• Interoperability is fundamental. Institutions have different platforms and
security solutions (e.g. Kerberos and PKI). The authentication protocol must be
uniform, not platform dependent.
• Usability. For users, usability is extremely important. Access to the VO has to
be as seamless and transparent as accessing the local organisation’s resources.
• Dynamic administration. The number of users is dynamic. Users are
added and removed as required. Issuing and revoking credentials for project
users should be flexible.
• Mobility. The user should not be tied to one machine. Scientists and
researchers move between institutions to collaborate with each other. The
authentication mechanism should allow users to be mobile.
• Organizations have to make some changes to their security policy. The
authentication protocol should make these changes minimal.
• Scalability. Some VOs may be large, such as a CERN project with 1800 physicists,
while others are small. The authentication protocol should be able to scale to the
size of the virtual organisation. Currently only PKI seems to scale.
• Single sign-on [2, 3]. The project may involve many resources. It is
impractical for a user to authenticate to every resource during a session. The
user must be able to authenticate once per work session.
3.3 Approaches to Authentication
Consider the Grid project described in section 1.3. The first task of the VO is to
decide how to authenticate users of that particular project. So, if a user U from
institution A (U.A) is authenticated in A, would the VO admit him to access resource
R in site B (R.B)?
The answer to this question is deeply influenced by the nature of the trust
relationships between VO users, VO organisations, and third parties. Chapter 2 has
highlighted three VO schemes with different trust relationships:
1. VO organisations trust all VO users. In this scheme, trust is with “named
users”. Each VO user has a separate account with each VO organisation. VO
users are added to the core membership list of each organisation in the VO.
The details and security key of each account are stored on the user’s machine.
2. VO organisations trust all VO users as well as a Central Database. The
details and security key of each account are stored in a central database that
users and partners trust.
3. VO organisations trust a third party. All VO organisations and VO users will
trust a CA. The VO will act as a registration authority of that CA.
The authentication solution that can be applied depends on which VO scheme is
adopted; a solution designed for one scheme is not applicable to the others.
3.4 VO sites trust all VO members
Each VO user will have an account at each VO site. Therefore, any access to a resource
requires a username and a password. This means local sites will treat VO members as
an extended part of their core organisation. The details and password for each account
in the VO are stored with the user. In this case, U.A requires accounts on all VO sites
and B has to authenticate U.A locally. This type of authentication is considered “weak
authentication” [2].
[Figure 3.1: User.A at Company A holds (u1,pw1,B), (u2,pw2,C) and (u3,pw3,D) and
authenticates directly to each site: 1-(u1,pw1) to University B, 2-(u2,pw2) to Lab C,
3-(u3,pw3) to Company D.]
Figure 3.1 User Authentication
This solution has many disadvantages:
1) Unacceptable to “security-sensitive” organisations such as commercial,
pharmaceutical and financial companies, because adding non-staff members to
their security domain weakens their security defences. These accounts will be
behind the company’s firewall.
2) The solution does not scale because it involves a great deal of management
and administration overhead. The local site administrator has to manage individual
accounts. Each time a user is added to or removed from the project, the local
site administrator has to be informed in order to create or delete accounts.
3) Usability problems. The user has to remember numerous usernames and
passwords for each VO site. Even worse, a password may be required for each
resource on the VO.
4) High vulnerability to attacks. The security of local sites depends on the
strength of the passwords used by U.A. Faced with so many passwords, the user
might take shortcuts by using the same password or a shorter password on all
accounts, or by sending the whole authorisation/authentication list by email to
his/her mailbox. Passwords are vulnerable to guessing [2], dictionary attacks [2]
and password cracking.
5) Long term credentials. Forcing users to change passwords at regular intervals
is not practical because the user has too many passwords to change. Thus, it is
very likely that the user will use the same password for the whole period of the
project.
6) It does not allow coordinated use of resources. If a user submits a job to site B
and this job determines that it needs some data from site C, then it will not
continue because only the user knows the username and password for site C.
3.5 VO sites trust all VO members and a Central
Database
This approach is a variation of the previous solution. The security details of each
account are stored in a central database that both users and VO partners trust. A
central authority would be responsible for supplying the right account information for
a given resource request. U.A is authenticated as a VO member first, and then the
target resource site, site B, authenticates U.A locally.
In this scenario, the VO maintains a central database (DB). The DB contains a
credential shared between the user and the VO. For each VO user, a list of username/
password pairs, one for each resource in the VO, is stored in the database. Figure 3.2
shows how a VO user can use the VO database to authenticate to a specific resource. The
initial assumption is that only the VO user, the VO DB and the resource know the password.
[Figure 3.2: User.A at Company A and the VO's Central DB Server communicate over
SSL/TLS: 1- U.A requests R.B; 2- U.A is sent the R.B account credential c. The DB
holds the entries (U.A, (U1,PW1), R, B), (U.A, (U2,PW2), R, C) and
(U.A, (U3,PW3), R, D). 3- U.A uses c to authenticate to R.B at Company B, where c is
the credential to R.B.]
Figure 3.2 User authentication using VO DB
When a user from A (U.A) wants to access a resource at site B (R.B):
1. U.A sends his credentials and the target resource name to the VO over a
secure connection such as SSL/TLS [9]. [U.A (Credential, Resource-name)].
2. The VO compares U.A’s credential against the entries stored in the VO’s
database. U.A’s authentication will succeed if his credential exists in the
stored entries. The VO fetches the username and password (U1, Pw1) for the
requested resource R.B.
3. U.A can now submit a task and use the username and password supplied by
the VO to access R.B.
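The three steps above can be sketched as follows. The sketch rests on several assumptions that are not in the scheme as described: the class name VODatabase and its methods are hypothetical, the user's VO login password is stored as a salted PBKDF2 hash rather than compared in clear, and the SSL/TLS channel between the user and the VO is not modelled.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a salted hash so the VO need not store the login password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class VODatabase:
    """Hypothetical central database: VO logins plus per-resource credentials."""

    def __init__(self):
        self._logins = {}        # user -> (salt, password hash)
        self._credentials = {}   # (user, resource) -> (username, password)

    def add_user(self, user, password):
        salt = os.urandom(16)
        self._logins[user] = (salt, hash_password(password, salt))

    def store_credential(self, user, resource, username, password):
        self._credentials[(user, resource)] = (username, password)

    def fetch_credential(self, user, password, resource):
        """Steps 1-2: authenticate the user to the VO, then look up the resource credential."""
        if user not in self._logins:
            return None
        salt, expected = self._logins[user]
        if not hmac.compare_digest(hash_password(password, salt), expected):
            return None
        return self._credentials.get((user, resource))


db = VODatabase()
db.add_user("U.A", "vo-secret")
db.store_credential("U.A", "R.B", "u1", "pw1")
credential = db.fetch_credential("U.A", "vo-secret", "R.B")   # step 3 presents this to R.B
```

Note that the per-resource passwords must remain recoverable, since they are handed back to the user, so the database holds them in a usable form; this is part of why the central server is such an attractive target.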
This approach is similar to online banking in e-commerce, where the user authenticates
to the bank with a username and password to access his account details. This solution
solves several problems of the previous approach:
• Usability. The usability is significantly improved. U.A is
authenticated to the VO with one username and password, instead of
maintaining a list of passwords.
• Accountability. Each user has a “Username” and a “Password” associated with
his identity. Therefore, it is possible to determine the user responsible for
performing a job on a specific resource.
• Reduced vulnerability. The vulnerability is reduced because the list of
usernames and passwords is under the VO’s protection, not the user’s. The user
is only responsible for protecting one password to authenticate to the VO.
This solution still has most of the drawbacks discussed in the previous section. These
include: scalability, long-term passwords to resources, no coordinated use of
resources and acceptability. Also, this solution introduces new security risks. The VO
becomes a:
• Single point of Attack. Since all sites in the VO will depend on the VO’s
database server, a denial of service attack on that server will prevent users
from accessing the project resources.
• Single point of failure. A failure in the VO’s database server will leave all
users disconnected from the VO resources.
3.6 Grid Authentication with PKI: VO sites trust a third party
The previous two sections highlighted two authentication approaches based on what is
called weak authentication [2]. Furthermore, the solutions described above do not
satisfy many of the requirements mentioned in section 3.2. Presently, PKI is the most
widely adopted authentication solution in Grid projects such as Globus [32] and
Unicore [34]. The trust relationship is with the CA. The VO acts as a registration
authority (RA) for that CA.
Each Grid user would be required to register with the VO in order to get a digital
certificate. This certificate is issued by a CA trusted by the VO. Any access to
resources on the Grid would have to be accompanied with a digital certificate. The
authentication solution based on the third VO scheme is depicted in figure 3.3.
The authentication process is achieved using SSL/TLS implementation on both sides.
The protocol should be configured to provide mutual authentication. When a user U.A
wants to access resource R at site B (R.B):
1. U.A sends his digital certificate to R.B.
2. R.B checks whether the Certificate is issued by a trusted CA or not. Then, it
checks the integrity and validity (format, expiry date and CRL) of the
certificate. Finally, R.B verifies that the user can demonstrate proof of
possession of the certificate private key.
3. R.B sends its digital certificate to U.A.
4. U.A does the same checking as in step 2.
If these steps are successfully done, then the mutual authentication succeeds.
However, having a valid certificate trusted by the VO does not mean that U.A can
access resources in any VO’s site. U.A still needs to be registered as a VO user at
each VO site.
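The checks in step 2 can be pictured with a small model. This is a sketch under stated assumptions: the `Cert` record and `check_certificate` function are hypothetical, signature verification and proof of possession are assumed to be handled by the SSL/TLS handshake, and the CRL is modelled as a set of (issuer, serial) pairs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    """Hypothetical, simplified view of the fields R.B inspects."""
    subject: str
    issuer: str
    serial: int
    not_before: datetime
    not_after: datetime


def check_certificate(cert, trusted_cas, crl, now):
    """Return the first failed check, or "ok" if all checks pass."""
    if cert.issuer not in trusted_cas:
        return "untrusted CA"
    if not (cert.not_before <= now <= cert.not_after):
        return "expired or not yet valid"
    if (cert.issuer, cert.serial) in crl:
        return "revoked"
    return "ok"


cert_ua = Cert(
    subject="U.A",
    issuer="Project CA",
    serial=7,
    not_before=datetime(2003, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2004, 1, 1, tzinfo=timezone.utc),
)
```

Passing these checks only establishes who U.A is. As noted above, whether U.A may use a resource still depends on his registration at each site.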
[Figure: a Certificate Authority trusted by both Company A and Company B. 1- U.A
(Company A) uses an X.509v3 certificate to authenticate to R.B (Company B) over
SSL/TLS.]
Figure 3.3 User authentication using trusted third party
Members of a Grid project may trust different Certificate Authorities that are
considered trustworthy by the VO. Thus, they should be able to present credentials
obtained from any source such as their local CA, a third party CA or the Project CA.
Therefore, if U.A already has a certificate, this raises the following questions. Which
CA issued the certificate? Is it trustworthy? How does the CA operate?
If U.A’s certificate is not issued by a trusted CA, then U.A must get another
certificate issued by a CA trusted by the VO. Otherwise, he cannot be part of the
project. Consequently, U.A will need to use the new certificate to authenticate to the
target site.
However, U.A can be a member of another VO that trusts the CA that issued the first
certificate. As a result, U.A will have many credentials, which causes a new problem:
how to decide which one to use where.
In this scheme, authentication may not necessarily involve explicitly named users. By
using attribute certificates (see section 2.4.2 for details), VO Organisations might
enable access to any user who can demonstrate that he is a “scientist”, “student” or a
member of a specific institution.
The protocol needs to be configured to allow mutual authentication; otherwise, the
authentication fails. Also, the root certificates of different CAs trusted by the VO
should be installed. Only the administrator in each VO site can add/remove a root
certificate. It is the role of the VO (partner organisations) to decide what CA to trust.
3.6.1 Advantages of using PKI on the Grid
There are many reasons for PKI to be the best candidate for authentication on the Grid.
PKI solves most of the issues mentioned in section 3.2. In addition to the advantages
mentioned in section 2.5, PKI, if implemented properly [10], offers the following:
• It provides a one-to-many authentication mechanism when used with a
certificate.
• It provides strong authentication [2].
• It assumes no previous trust relationship between Grid entities.
• The usability problem is significantly reduced. The user is required to remember
only a pass phrase for his certificate.
• It provides for mobility, as the certificate can be stored on a floppy disk or smart
card device.
• Scalability. User authentication can be done offline.
• Uniform credentials through the use of X.509v3 certificates.
3.6.2 Vulnerabilities in PKI
Section 2.4.6 has highlighted some of the issues that put PKI and Grid authentication
at risk. The security of PKI crucially relies on the PKI components operating with
a high degree of security. For instance:
• The user’s private key must be kept confidential.
• The CA’s private key must be protected with a high degree of security.
• The procedure used by the VO to authenticate candidate users must be strong
in order to avoid identity theft at the point of certificate creation. [7] describes
a real scenario for identity theft.
• CA and RA computer systems and applications must be protected physically
and protected from tampering according to best security practices such as
BS7799 [52].
• The public key of the CA must be securely communicated to VO users and VO
sites.
3.7 Proxies and Delegation
A typical Grid project consists of a dynamic number of resources. A project user can
initiate a job request that involves the coordinated use of these different resources at
different sites. However, this raises two problems:
1. Convenience: it is impractical for the user to authenticate multiple times to
access different resources [3].
2. Vulnerability: the user will have to sign a challenge for each authentication
with his private key. This provides an opportunity for an attacker to collect
challenges and their corresponding cipher texts in order to recover the private
key [5].
These problems are solved using proxy credentials [3, 5, 7]. A proxy certificate is a
special X.509 v3 certificate that is signed by the private key of a Grid entity [4, 5]. It
allows processes running on user’s behalf to authenticate the user directly to Grid
resources. In addition, this certificate can be used for delegation [2, 5], that is, the
process of passing authority from one entity to another.
Proxy credentials grant the bearer all or a subset of the Grid entity’s access rights [5].
These credentials have a short lifetime, usually a matter of hours. There are two types
of proxy credentials [4, 5]:
1. “Restricted proxy” [4, 5]: this type of proxy can hold policy information
restricting its use. The policy restriction field is part of the extensions of the
X.509 certificate. For instance, the policy may state that the holder of this
proxy can run a SELECT query on a specific database resource. If the user
submits a request to run an UPDATE query, it will fail, as it is not consistent
with the policy on the proxy. This proxy type has been enhanced and submitted
by the Globus project [3, 5] as an Internet draft to the Internet Engineering
Task Force (IETF) PKIX working group [3, 4, 5].
2. “Impersonation certificate” [4, 5]: an unrestricted proxy certificate that
allows an entity to delegate all its authority to another entity.
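The SELECT/UPDATE example for a restricted proxy can be sketched as follows. The `RestrictedProxy` record is a hypothetical stand-in: it models the policy restriction as a plain set of allowed operations, which is an assumption for illustration and not the encoding used in the actual certificate extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RestrictedProxy:
    holder: str
    resource: str
    allowed_operations: frozenset  # hypothetical encoding of the policy field


def is_permitted(proxy, resource, query):
    """Allow a query only if its leading keyword is permitted on this resource."""
    words = query.split()
    operation = words[0].upper() if words else ""
    return resource == proxy.resource and operation in proxy.allowed_operations


# A proxy whose policy only allows SELECT on one database resource.
select_only = RestrictedProxy("U.A", "R.B/database", frozenset({"SELECT"}))

select_ok = is_permitted(select_only, "R.B/database", "SELECT * FROM results")
update_ok = is_permitted(select_only, "R.B/database", "UPDATE results SET x = 1")
```

Here `select_ok` is true and `update_ok` is false. An impersonation certificate corresponds to a proxy with no such restriction at all.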
3.8 Security issues in proxies
The main advantage of proxy credentials is significant speed [4, 5, 6]. Users can run
jobs that require coordinated access to resources without the user’s intervention. This
raises new security concerns such as:
• How are the proxy’s key pairs generated?
• How is the proxy’s private key protected on the user’s machine?
• When delegating proxy credentials to a remote Grid entity, who has control
over the proxy’s private key?
The first concern can be viewed from a cryptographic point of view, as the length and
quality of the public key pairs are crucial. The random number generator used to
generate primes for the public key should be reliable. Factorisation attacks have
significantly improved in the last decade [11]. If the user generates a 512-bit RSA
public key [25], it is vulnerable to factorisation [10, 11]. As a result, an attacker
will be able to recover the private key from the public key [28] and sign requests on
the user’s behalf. The minimum recommended key length is 760-bit [11]. Current
implementations of RSA use 1024-bit and 2048-bit keys [11].
Presently, there is no mechanism to detect proxy’s private key compromise [5].
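To see why a short modulus is fatal, consider a toy RSA key whose modulus can be factored by trial division. The numbers below (n = 3233, e = 17) are a textbook-sized illustration chosen for this sketch, not values from any Grid deployment. A real 512-bit modulus needs far stronger factoring methods, but the recovery of the private exponent afterwards is the same.

```python
def factor(n):
    """Trial division; feasible only because n is tiny in this example."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no non-trivial factor")


def recover_private_exponent(n, e):
    """Derive d from the public key (n, e) once n has been factored."""
    p, q = factor(n)
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse (Python 3.8+)


n, e = 3233, 17                      # public key only
d = recover_private_exponent(n, e)   # attacker now holds the private exponent

# The attacker can sign on the user's behalf; the forged signature verifies
# against the genuine public key.
message = 65
forged_signature = pow(message, d, n)
verifies = pow(forged_signature, e, n) == message
```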
Another problem with proxy credentials is clock synchronisation. A job may
determine that it needs to access resources at a different site located in a different
country. The proxy’s lifetime is short (10 hours in Globus [6]), whereas the clock at
the remote site may be 12 hours ahead. As a result, the remote site will consider the
proxy invalid because it appears to have expired.
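The failure described above arises when a site compares the proxy's expiry time against its own local wall-clock reading. The sketch below contrasts that comparison with one made on absolute (UTC-normalised) instants, which is how X.509 validity times are expressed. The dates, the 12-hour offset and the function names are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

PROXY_LIFETIME = timedelta(hours=10)

issued = datetime(2003, 9, 1, 8, 0, tzinfo=timezone.utc)
expires = issued + PROXY_LIFETIME

# One hour after issue, seen from a site whose local clock is 12 hours ahead.
remote_zone = timezone(timedelta(hours=12))
now_remote = (issued + timedelta(hours=1)).astimezone(remote_zone)


def expired_by_wall_clock(expiry, now):
    """Faulty check: compares clock readings and ignores the zone offsets."""
    return now.replace(tzinfo=None) > expiry.replace(tzinfo=None)


def expired_by_instant(expiry, now):
    """Correct check: aware datetimes are compared as absolute instants."""
    return now > expiry


wrongly_rejected = expired_by_wall_clock(expires, now_remote)
correctly_accepted = not expired_by_instant(expires, now_remote)
```

With the wall-clock comparison the proxy is rejected after one hour of its ten-hour life. Comparing instants removes the time-zone effect, although genuine clock drift between sites remains an issue.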
3.9 Other Alternatives
Currently, Kerberos and Secure Shell are commonly used authentication protocols.
This section will show why they are not suitable for Grid environment. Here is a brief
explanation of these protocols.
Kerberos: Kerberos is a TTP-aided authentication protocol based on symmetric key
cryptography [2]. The Grid user and a designated trustworthy server, the Key
Distribution Centre (KDC) [10], share a long-term secret key. The user authenticates
to the resource by getting a ticket from the KDC. Kerberos splits the role of the KDC
between two entities, the Authentication Server (AS) [2] and the Ticket Granting Server
(TGS) [2], in order to limit bottlenecks and the exposure of the long-term keys.
Kerberos achieves inter-organisational authentication by sharing key servers (AS or
KDC) with other organisations [6].
Kerberos meets many of the requirements mentioned in section 3.2, such as usability
and single sign-on, but when used for multi-domain authentication, several issues
arise:
• Acceptability: according to [6] “using Kerberos for intersite authentication
also means using it for intrasite authentication”. This is not acceptable for
commercial companies, because sharing key servers to allow this type of
authentication means giving up control over local policy. (PKI doesn’t require
major changes in the local security policy of a VO site.)
• Scalability: it is very hard to extend the protocol to multiple administrative
domains due to equipment and staffing costs [6]. Also, the performance
bottleneck of having a TTP seems to always limit the scalability of a system.
• Clock synchronization [2]: all clocks across VO sites will need to be
synchronized. Thus the AS, TGS, and all clients and all servers must somehow
have nearly the same time, or allow for the difference within the key periods
and decrease security. On the Grid this is very difficult to achieve because VO
resources may exist in different countries with different time zones.
• Key revocation [2]: Kerberos has no mechanism for key revocation, but relies
on the timestamps to expire. Using short timestamps gives more security, but
requires more tickets to be issued.
• Availability: Kerberos relies on the Authentication Server and TGS server to
be online. The availability becomes more important when using tickets with
short lifetimes [2]. (PKI allows offline authentication.)
• Delegation: Kerberos does not address the delegation of access rights (tickets)
[2]. (In PKI, proxy certificates can be used for delegation.)
Secure Shell: Secure Shell, SSH [18], is another alternative used for providing
remote login. SSH is based on public-key authentication and offers a secure channel
over assumed reliable transport, typically TCP [18]. It supports mutual authentication
and provides confidentiality of transmitted user credentials, and can be easily
deployed. SSH, however, is not suitable for Grid authentication because of:
• Usability: users have to copy the public keys for any VO site they want to
access. This is not practical because not all users are security experts [2].
• Limited functionality: according to [6], SSH supports limited capabilities
such as remote shell and file transfer, but not others that require authentication,
such as collaborative environments and web browsers.
That is why PKI is currently the most widely used solution in Grid projects such as
Globus and Unicore.
3.10 Globus Toolkit (GT2): GSI approach to Authentication
GSI deals with bridging between the local security solutions of different sites in the VO.
It is based on the IETF standard TLS protocol [9], public key encryption [10, 11] and
the X.509 version 3 certificate format. The infrastructure provides fundamental security
services [7]:
• Single and mutual authentication: Globus uses an implementation of TLS called
Secure Socket Library (SSL). The implementation requires a PKI and is configured
to provide mutual authentication using X.509 certificates.
• Single sign-on: in Globus, certificates and proxy credentials (details in section 3.7)
are used to allow the user to authenticate once to access all Grid resources. The
Globus team has submitted a draft to the IETF PKIX working group to standardise
the proxy certificate format.
• Confidential communication: data transmitted over the Internet is protected using
the SSL communication protocol (details on confidentiality in section 5.2).
• Authorisation: Globus supports identity mapping (details in section 4.6) and the
Community Authorisation Service, CAS (details in section 4.7), as access control
mechanisms.
• Delegation: Globus supports identity delegation and access rights delegation from
the user to processes running on his behalf via short-lived proxy certificates.
3.10.1 GT2 authentication with proxies
The primary objective of GSI is to provide authentication and message protection [4,
6]. This section will describe user proxy authentication in GSI and the delegation
process. These steps are usually implemented over a secure network using SSLv3 that
provides mutual authentication, confidentiality and integrity.
Consider the Grid project described in section 1.3. U.A wishes to authenticate to B in
order to access resource R.B. The assumptions on U.A are:
• U.A has a public key pair (PU.A, SU.A) where PU.A is the public key of U.A and
SU.A is the corresponding private key.
• SU.A is known only by U.A.
• U.A has a certificate Cert-(U.A) issued by a CA trusted by the VO.
U.A creates a proxy credential on the local machine in two steps:
1. Generates a new public key pair for the proxy (PPU.A, SPU.A) where PPU.A is the
public key of U.A’s proxy and SPU.A is the corresponding private key.
2. Creates a proxy certificate and signs it with his private key SU.A to
produce this proxy credential: {Proxy} SU.A
The proxy authentication process in GSI is depicted in Figure 3.4. Here are the
steps:
1. U.A sends to R.B his certificate and the proxy certificate.
2. R.B validates U.A’s certificate by using the CA public key, CRL and expiry
date (step 2).
3. R.B checks the validity of the proxy by using U.A’s public key recovered
from U.A’s certificate (step 3).
4. R.B generates and sends a challenge to U.A (steps 4-5).
5. U.A signs the challenge with the proxy’s private key (step 6).
6. R.B verifies that the response contains the genuine RAND (step 7).
Once these steps are successfully completed, the authentication succeeds and R.B can
consider that the user is associated with the identity on the certificate of U.A.
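The chain of checks in these steps can be sketched with textbook RSA. Everything here is a simplification assumed for illustration: keys are built from fixed Mersenne primes instead of proper key generation, signatures are unpadded, the proxy certificate is reduced to the proxy's public key, and validation of Cert-(U.A) against the CA (step 2) is taken as already done. It is not the GSI implementation.

```python
import hashlib
import secrets

E = 65537  # common public exponent


def make_key(p, q):
    """Textbook RSA key pair from two known primes: (public, private)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))
    return (n, E), (n, d)


def digest(data, n):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data, private):
    n, d = private
    return pow(digest(data, n), d, n)


def verify(data, signature, public):
    n, e = public
    return pow(signature, e, n) == digest(data, n)


# Mersenne primes stand in for real key generation (insecure, demo only).
user_pub, user_priv = make_key(2**31 - 1, 2**61 - 1)      # (PU.A, SU.A)
proxy_pub, proxy_priv = make_key(2**89 - 1, 2**107 - 1)   # (PPU.A, SPU.A)

# U.A signs the proxy's public key with SU.A, giving {Proxy} SU.A.
proxy_body = repr(proxy_pub).encode()
proxy_sig = sign(proxy_body, user_priv)


def authenticate(user_public, body, body_sig, proxy_public, respond):
    """R.B's side: validate the proxy, then run the RAND challenge."""
    if not verify(body, body_sig, user_public):      # proxy signed by U.A?
        return False
    if body != repr(proxy_public).encode():          # signed key is this key?
        return False
    rand = secrets.token_bytes(16)                   # generate and send RAND
    return verify(rand, respond(rand), proxy_public)  # check signed RAND


authenticated = authenticate(
    user_pub, proxy_body, proxy_sig, proxy_pub,
    lambda rand: sign(rand, proxy_priv),
)
```

Note that the user's long-term private key SU.A is used only once, to sign the proxy. Every later challenge is answered with the short-lived proxy key.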
[Figure: message flow between U.A and R.B. Initially U.A has Cert-(U.A), (PU.A, SU.A)
and (PPU.A, SPU.A); R.B has PCA. 1- Cert-(U.A) + {proxy} SU.A; 2- Validate Cert-(U.A)
with PCA; 3- Validate {proxy} SU.A with PU.A; 4- Generate RAND; 5- RAND;
6- {RAND} SPU.A; 7- recover RAND1 from the response. If RAND = RAND1 then U.A is
authenticated.]
Figure 3.4 User Proxy authentication
Proxy credentials can be delegated to a process on remote host [5]. For instance, U.A
may need to run a program at resource R.B. The program running on R.B determines
that it needs to access a resource at C, R.C. U.A can delegate a proxy credential to the
program running on B to act on his behalf. Assuming that U.A and R.B have already
established a secure channel using OpenSSL, the delegation process is
described as follows:
[Figure: message flow between U.A (holding Cert-(U.A) and (PU.A, SU.A)), R.B and R.C,
each resource holding PCA. 1- Run program at R.B; 2- Needs access to C; 3- Generate
proxy credential (PP.B, SP.B); 4- proxy; Sign {proxy} SU.A; 5- Cert-(U.A) +
{proxy} SU.A; 6- Cert-(U.A) + {proxy} SU.A; 7- Request data from C; 8- Authentication
as in the previous diagram.]
Figure 3.5 Delegation of proxy credential to a process on a remote resource
3.10.2 MyProxy
The Globus team has proposed an online trusted credential server, MyProxy, to store and
manage long-term user credentials, private keys and certificates. In addition, the server
can be used to perform delegation on the user’s behalf [6]. When the user logs in, he
creates a proxy certificate and sends it to the MyProxy server along with a tag and a
pass phrase. When the user initiates a job request (i.e. program on remote site), the
process running the job connects to the MyProxy server, presents the tag and the pass
phrase, and receives a proxy for that user [6]. The advantages of introducing this
server are:
• There is no need to generate the key pairs on the user’s machine.
• It protects the user’s private key.
• It provides mobility for users, so they don’t have to carry their private key on a
floppy or send it to their mailbox.
The disadvantage of this approach is that the server becomes a target for attacks because
it holds the tags and pass phrases for all VO site users who are using the Grid.
Chapter 4
Grid Authorisation
4.1 Introduction
Resources involved in a Grid project can be extremely valuable such as
supercomputers or extremely sensitive such as classified medical records. Thus
controlling access to these resources is crucial to maintain their confidentiality [2],
integrity [2] and availability [2]. On the Grid, authorisation aims at controlling and
restricting access to resources [2]. Users of a Grid project often require access to
valuable and sensitive resources that do not belong to their institutions. Authorisation
is needed to allow legitimate Grid users to access confidential Grid information and
resources. Furthermore, authorisation is vital to resources’ integrity because only
authorised Grid users can modify data resources and the configurations of equipment
resources. Finally, authorisation is essential to availability, because attackers who
manage to gain unauthorised access can destroy data resources.
4.2 Fundamental model of access control
“The very nature of access control suggests that there is an active subject and a
passive object with some specific access operations and a reference monitor that
grants or denies access” [2]. On the Grid, the subjects are users, processes running on
user’s behalf and resources. The objects are resources to be shared such as
supercomputers, databases and instruments. A resource on the Grid can be a subject in
one request and an object in another.
The access control can be either user centred or resources centred. The first focuses
on the user capabilities and the second on what can be done with an object (first
design principle [2]).
[Figure: User → Access request → Reference monitor → Resource]
Figure 4.1 The fundamental model of access control [2]
Consider the Grid project described in section 1.3. It consists of multiple trust domains A,
B, C and D. Figure 4.2 below indicates this situation by showing the local access policy
that applies in institutions A, B, C and D. In addition, it shows the heterogeneous
platforms that include Kerberos, OS2 and UNIX. The authorisation problem on the
Grid can be described as follows: if U.A is successfully authenticated to the VO to
access resource R.B in domain B, how does B decide what access rights U.A has on
R.B?
Figure 4.2 Multiple trust domains
The main obstacle is that the identity of U.A and his credentials as expressed in policy
A are meaningless in domain B. Therefore, the first issue to be solved is how to
enable interoperability between these heterogeneous domains. The second issue is to
decide where the decision on U.A’s access rights is made (on the target site or
centralised).
Typically, each organisation in the VO wants the access control decision to remain in
the control of its local site administrator. For commercial companies, it is unthinkable
to give up control over their resources. As a result, the decision on U.A’s access rights
has to remain under full control of the local site administrator. Several possibilities
have been considered:
• Resource centred using Access Control Lists (ACL) [2]
• User centred with Role Based Access Control (RBAC) [2]
• Distributed authorisation using identity mapping [4, 7]
• Community Authorisation Service (CAS) [3, 4, 5]
4.3 Resource centred with Access Control List (ACL)
ACLs [2] specify, for each resource, a list of authorised users, with their privileges on
that resource [2]. When U.A is authenticated by the VO to access R.B, and his name
is on the list of authorised users of R.B, then he can use the resource with his
associated privileges. The access rights of U.A to R.B in the form of an ACL can be
viewed as follows:
ACL (R.B) = [(U.A, [opB1, opB2…]), (U.C, [opB4, opB5…])…]
Where opB1 is an operation on resource R.B
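The ACL above maps directly onto a nested dictionary. This is a minimal sketch; the names `ACL` and `acl_allows` are chosen for the example only.

```python
# ACL(R.B) = [(U.A, [opB1, opB2...]), (U.C, [opB4, opB5...])...]
ACL = {
    "R.B": {
        "U.A": {"opB1", "opB2"},
        "U.C": {"opB4", "opB5"},
    },
}


def acl_allows(resource, user, operation):
    """True if the user is listed on the resource's ACL with this operation."""
    return operation in ACL.get(resource, {}).get(user, set())
```

For example, `acl_allows("R.B", "U.A", "opB1")` holds, while U.A is refused `opB4` and any user absent from the list is refused everything.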
This approach has many disadvantages:
1. Organisations in the VO must agree on access policy in advance that can be
enforced by each local site administrator. However, it would be difficult to agree
on a policy that operates at the level of individuals, particularly when they are
unknown to the organisation providing the resources.
2. It does not scale. ACLs are tied to resources. The local site administrator has to
manage individual users’ accounts on each resource. Each time a new user
joins/leaves or has new responsibility in the VO, the local site administrator in
each VO site will have to update the ACLs of each resource. For m resources and
n users, the administrator has to manage m*n associations. For instance, the CERN
project [38] has 1800 users. Suppose that they share 10 valuable resources with
one site. This means that the administrator of this site will deal with 18,000
associations, which is a huge number to manage.
3. Vulnerability to errors. Due to the dynamic aspect of the Grid, the management of
ACL on individual basis becomes cumbersome and error prone. When a user is
removed from the project, the administrator has to go through each resource’s
ACL to revoke that user’s permissions. As a result, it is very likely that he can
make a mistake by not removing someone’s access right when he is not authorised
to use the resource anymore.
4.4 Role Based Access Control (RBAC)
The main problem with the previous approach is administering individuals. To reduce
this problem, the RBAC [2] authorisation model could be used. RBAC focuses on users
and the jobs users perform within the organisation [2]. Thus, it is appropriate for a large
and dynamic environment such as the VO. RBAC carries numerous advantages
compared to the previous approach. Here are some of them:
1. Reduced administrative cost and complexity [2]. The local site administrator can
assign permissions to roles in the project instead of individuals, according to the
global and local policy.
RBAC (U.A) = [(Site1, Role1), (Site2, Role2)...]
Permission (Site1, Role1) = [op11, op12, op13….]
Operations can include execute program, read, write, and update. Every time there
are personnel changes, such as members being added, removed or given new
responsibilities, only role permissions are updated or added.
2. Consistency. Each role can be mapped to the user’s positions within the project
and can be granted a set of operations to perform his job. For example, two
students in the same role will have exactly the same permissions.
The current RBAC systems used are PERMIS [6] and AKENTI [6], which has been
used with Globus.
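The RBAC(U.A) and Permission(Site1, Role1) notation used in this section can be sketched with two tables. The second role's operation ("op21") and the new user "U.E" are assumptions added for the example.

```python
# RBAC(U.A) = [(Site1, Role1), (Site2, Role2)...]
USER_ROLES = {
    "U.A": {("Site1", "Role1"), ("Site2", "Role2")},
}

# Permission(Site1, Role1) = [op11, op12, op13...]
ROLE_PERMISSIONS = {
    ("Site1", "Role1"): {"op11", "op12", "op13"},
    ("Site2", "Role2"): {"op21"},  # illustrative value
}


def rbac_allows(user, site, operation):
    """True if any role the user holds at this site grants the operation."""
    return any(
        role_site == site
        and operation in ROLE_PERMISSIONS.get((role_site, role), set())
        for role_site, role in USER_ROLES.get(user, set())
    )


# A new member needs one role assignment, not one entry per resource.
USER_ROLES["U.E"] = {("Site1", "Role1")}
```

Both U.A and the hypothetical U.E now hold exactly the same permissions at Site1, which is the consistency property described in point 2.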
4.5 Distributed Authorisation
Distributed authorisation is a user centred authorisation model [3]. The main idea
behind this approach is that each site in the project has a proxy responsible for
converting global credential to a local credential. So, each Grid user would be
required to have an account in each administrative domain. Users’ access rights are
decided and managed at each VO site. Access to resources is achieved through
mapping of identities/credentials from the user’s domain to the resource domain. For
example, the decision on the access rights of U.A to access R.B is achieved through
identity mapping from domain A (e.g. UNIX) to a local account in domain B (e.g.
Windows). Therefore, U.A can access R.B as a local user in domain B. As a result, B
can apply its local security policy. This solution is currently adopted in many Grid
projects such as Globus [32] and Unicore [34].
This solution has several advantages:
1. It allows local site administrator to be in full control of local resources. It is
the administrator job to create user accounts and set permissions to resources
in his site.
2. It provides for accountability. Usernames and passwords are associated with
the subject name on the certificate. Thus, it is possible to track individual users
who performed actions on a specific resource.
3. It does not require changes in the security mechanism in the local site. It can
be implemented with ACLs and RBAC mechanisms.
Figures 4.3 and 4.4 illustrate the identity mapping process with ACL and RBAC
respectively. When U.A wants to access resource R.B in site B:
1. U.A sends his X.509 certificate to the resource in site B. Upon receipt, the
resource checks the certificate validity and that the sender has the private key
that corresponds to the public key on the certificate (done via SSL/TLS).
2. Once the authentication succeeds, the resource extracts the subject name
(SN.U.A) from the certificate.
3. The resource compares the subject name to entries stored in its mapping
database. If there is no match, then the user is not authorised to access the
resource.
4. Otherwise, the resource fetches the username and password corresponding to
the subject name from the database. Now, the user can access the resource as a
local user in domain B.
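[Figure: U.A in Company A contacts Company B (1). The subject name is extracted (2)
and compared (3) with the entries [(SN, Un, Pw)] of the mapping database, which
holds a username and password for each of U.A, U.B and U.C. If SN.U.A = U.A is
false, the user is not authorised; if true, he is authorised and (4) accesses the
UNIX resource with (username, password).]
Figure 4.3 Identity mapping process with ACL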
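[Figure: as Figure 4.3, with a role table at Company B (Student: U.B; Scientist:
U.D, U.C; Admin: U.B) alongside the mapping database [(SN, Un, Pw)]. U.A in
Company A contacts Company B (1); the subject name is extracted (2) and compared (3)
with the database entries. If SN.U.A = U.A is false, the user is not authorised; if
true, he is authorised and (4) accesses the UNIX resource.]
Figure 4.4 Identity mapping with role based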
However, it still has several drawbacks such as:
1) Scalability problem [7]. The user will have local accounts at all VO sites
where he has access to resources. This implies that the administrator has to
create/delete an account and set its access rights every time a new user is
added, removed or has new responsibilities.
2) Distribution problem (inconsistency). Removing user’s access rights to project
resources is cumbersome and error prone. The project leader has to contact all
local system administrators to inform them in case of personnel changes or
policy changes. If one site is not informed, then there will be inconsistency in
the policy.
4.6 Globus approach to Authorisation
The Globus toolkit uses distributed authorisation approach where the local site
administrator is in full control. This section will describe identity mapping in Globus
(figure 4.5).
Consider the Grid project described in section 1.3. Assuming that R.B successfully
authenticates U.A:
• The identity of U.A is extracted from the certificate (subject name).
• The Grid-mapfile [3, 6] on the resource maps the subject name to a local
identity such as a UNIX account or a Kerberos ticket, depending on the resource
platform. If the “Subject name” is not in the mapping file, the user
is not authorised to access the resource.
If the resource is running on the Windows operating system, a function in Globus called
SSLK5D [6], a modified Kerberos Key Distribution Centre [2, 6], takes a user
credential and returns a Kerberos ticket. (This assumes that each member of the project
already has an account on the site where he is authorised.)
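The lookup performed with the Grid-mapfile can be sketched as below. The entry layout assumed here (a quoted subject name followed by a local account name, with "#" starting a comment line) and the sample subject name are assumptions for illustration; the parser is not the Globus code.

```python
import shlex

SAMPLE_GRIDMAP = '''
# subject name                          local account
"/O=Grid/OU=Company A/CN=User A" ua_local
"/O=Grid/OU=Company C/CN=User C" uc_local
'''


def parse_gridmap(text):
    """Build a {subject name: local account} mapping from the file text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = shlex.split(line)  # honours the quotes around the subject name
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {line!r}")
        subject, local_account = parts
        mapping[subject] = local_account
    return mapping


def map_identity(mapping, subject):
    """Return the local account, or None if the user is not authorised."""
    return mapping.get(subject)


gridmap = parse_gridmap(SAMPLE_GRIDMAP)
local_account = map_identity(gridmap, "/O=Grid/OU=Company A/CN=User A")
```

A subject name that is missing from the file maps to `None`, which corresponds to the "not authorised" outcome described above.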
[Figure: U.A in Company A sends (1) Cert U.A to GSI at Company B. The subject name U.A
is looked up (3) in the Grid-mapfile, which holds a username and password for each
of U.A, U.B and U.C. If SN (Cert U.A) = U.A is true, the user is authorised and (4)
accesses the UNIX resource; if false, he is not authorised.]
Figure 4.5 Identity mapping in Globus
4.7 Community Authorisation Service (CAS)
The previous section has highlighted some of the main problems faced in approaches
to Grid authorisation. Scalability, heterogeneity and the distributed nature were
factors that could exclude ACL and RBAC from being the best choice for Grid
authorisation. To solve these problems, Globus proposed a centralised authorisation
model [3, 5] with the following features:
• The VO partner organisation allows the resource administrator to grant access
to blocks of resources to the project as a whole. Thus, resources are grouped
together and access to these blocks is granted to the VO instead of to
individual VO users.
• The project itself manages fine-grained access control mechanisms. Thus, the
access policy will be flexible enough to allow the project and the resource
administrator to specify the way resources should be allocated.
These features have been implemented in Community Authorisation Service (CAS)
framework [5]. The VO will require a CAS server that will be responsible for tracking
members and managing fine-grained access to resources [5]. The local site
administrator uses local access control mechanism to grant access to local resources
based on the subject name associated with the CAS server name, not the individual’s
name.
CAS security architecture is based on certificate-based PKI and delegation. The
delegation is done using “restricted proxies” (details in section 3.7) because this type
of proxies allows the CAS server to delegate a subset, not all, of its authority to users.
“Impersonation” proxies (details in section 3.7) are not suitable because the project
may involve different roles. Thus, it is inappropriate for the CAS server to delegate
all its authority to users. The CAS authorisation process is shown in figure 4.6.
When a user U.A wishes to access a VO resource:
1. U.A sends his X.509 certificate and the request to the CAS server. The latter
verifies U.A’s certificate and fetches U.A’s rights, granted by the project, from the
policy database.
[Figure: 1- U.A (Company A) sends a request and his certificate to the CAS server over
SSL/TLS. The CAS server consults the VO project policy database (what rights does
the project grant to this user?). 2- The CAS server returns a restricted proxy.
3- Resource request to resource R in Company B, authenticated with the CAS proxy.
The resource checks the local policy information against the project subject name
(is this request authorised for the project?) and the policy restriction (does the
proxy restriction authorise this request?). 4- reply.]
Figure 4.6 CAS authorisation model (Adapted from [5])
2. The CAS server creates and sends a restricted proxy certificate to the user in order
to access the requested resource. This certificate allows the CAS server to
delegate a subset of its access rights to the user in the form of capabilities. These
capabilities are based on the type of the request and the role of U.A in the project.
This certificate contains the name of the CAS server in the subject name field and
the restrictions in the policy restriction field.
3. The user sends both the proxy certificate and the request to the resource. The user
authenticates to the resource with the proxy certificate as a project user (not as an
individual). The resource checks whether the request is authorised by the local
policy of the organisation to the Grid project. In addition, it checks whether the
proxy restriction authorises the request. Once these checks are successful, the
request is processed on the remote resource.
4. The resource sends the result of the request.
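The four steps above can be sketched in a few lines of Python. This is only an illustrative model, not the CAS implementation: the policy tables, the names `PROJECT_POLICY`, `LOCAL_POLICY`, `cas_issue_proxy` and `resource_authorises`, and the representation of rights as (action, resource) pairs are all assumptions made for the example. No real certificates are involved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RestrictedProxy:
    subject: str             # name of the CAS server (the project), not of the user
    restrictions: frozenset  # rights delegated to the user, as (action, resource) pairs

# Hypothetical project policy database held by the CAS server (step 1).
PROJECT_POLICY = {
    "U.A": {("read", "R"), ("submit", "R")},
}

# Hypothetical local policy at the resource site: what the project as a whole may do.
LOCAL_POLICY = {
    "CAS-Project": {("read", "R")},
}

def cas_issue_proxy(user, action, resource):
    """Steps 1-2: look up the user's rights and issue a proxy restricted to the request."""
    rights = PROJECT_POLICY.get(user, set())
    if (action, resource) not in rights:
        return None  # the project grants this user no such right
    return RestrictedProxy("CAS-Project", frozenset({(action, resource)}))

def resource_authorises(proxy, action, resource):
    """Step 3: the request must pass both the local policy and the proxy restriction."""
    allowed_locally = (action, resource) in LOCAL_POLICY.get(proxy.subject, set())
    allowed_by_proxy = (action, resource) in proxy.restrictions
    return allowed_locally and allowed_by_proxy
```

In this sketch a "submit" request from U.A obtains a proxy from the CAS server but is still rejected by the resource, because the local policy of the site does not grant that right to the project. This illustrates that the local site keeps the final say.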
The CAS model has several advantages:
1. Scalability: users need to be known and trusted only by the CAS server in order to
get permissions to access resources. There is no direct relationship between users
and resources. In addition, resource providers need to be known and trusted only by
the CAS server, not by all users. This significantly reduces the number of associations
to be managed between resource providers and users. For instance, consider 1800
users at CERN accessing 10 resources. With the CAS model, there are 1800
associations between the CAS server and VO users and 10 associations between the
CAS server and resources, that is 1810 associations in total. This is much easier to
manage than the 18000 (1800 x 10) associations needed in identity mapping.
2. Policy support: CAS allows the local site administrator to enforce local policy, and
the VO to enforce global policy. With this approach, conflicts between local and
global policy are easily resolved: any request conflicting with the resource
provider’s local policy is rejected.
3. Support for ACL and RBAC: CAS allows both mechanisms to be implemented on
local sites, and supports both the global policy of the project and the local security
policy of each site.
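The scalability argument in item 1 is simple arithmetic, sketched below. The function names are hypothetical. The figures of 1800 users and 10 resources are the ones used in the text.

```python
def identity_mapping_associations(users: int, resources: int) -> int:
    """Every user must be known to every resource: users x resources."""
    return users * resources

def cas_associations(users: int, resources: int) -> int:
    """Users and resources each need one association with the CAS server."""
    return users + resources

if __name__ == "__main__":
    print(identity_mapping_associations(1800, 10))
    print(cas_associations(1800, 10))
```

The gap widens as the VO grows, since one count grows with the product of the two numbers and the other with their sum.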
CAS also has its drawbacks.
•
Single point of failure: If the CAS server fails, the whole Grid project will stop
working. The users will not be able to get proxy certificates from the CAS server,
thus the project resources will not be accessible.
•
Single point of attack. A denial of service attack on the CAS server will leave the
project users disconnected from the project resources.
•
It does not provide for accountability: In CAS, individual users are not
accountable for their actions, because users act on behalf of the project, not as
individuals. Therefore, in case of resource misuse, the resource administrator will
know only that a user from a specific project performed the misuse. He will not be
able to determine which user in that project performed the action.
•
Giving up control: The local site gives up some control to the VO. This approach
is not acceptable to security-paranoid companies such as pharmaceutical and
financial companies.
4.8 Access Control with PKI
It is possible to do both authentication and authorisation in one go. With X.509 v3
certificates, access rights can be linked to a public key. To get access to a resource,
the user has to prove knowledge of the private key corresponding to the public key on
the certificate [2] and that the access right on the certificate allows the holder to
perform the requested task. Simple PKI [2, 33] “specifies a standard form for X.509
digital certificates whose main purpose is authorisation”. But currently, there is a
debate about the extension that holds the access rights. The IETF has not standardised
the extensions in X.509 v3 certificate [2]. This solution is not scalable and doesn’t
provide for accountability, as it is not associated with an identity.
4.9 Firewalls and the Grid
A major problem for the Grid is firewalls [7, 15]. VO partners’ organisations want to
share specific resources while keeping their other resources private. Usually, most VO
organisations protect their resources on their Local Area Network (LAN) from the
Internet with firewalls. The shared resources and the private resources, within a VO
site, are probably connected for internal use. By enabling access to resources behind
the firewall, companies will be exposing their LAN, thus their private resources, to
the outside world. This defeats the purpose of having a firewall in the first place.
4.9.1 Brief overview of firewalls
A firewall is an access control mechanism that divides a network into, usually, at least
three domains:
•
“Inner” (green, safe) trusted network that is the LAN,
•
“Outer” (red, unsafe) un-trusted network that is the Internet,
•
Demilitarised zone (DMZ) (orange), more dangerous than inner but more
sheltered than outer.
For instance, the Information Security Group (ISG) [49] in Royal Holloway [50] has
four network domains: student network (inner), staff network (inner), DMZ, and outer
network.
The firewall checks each network packet as it arrives, possibly logs it, and allows it
to pass through if it satisfies the rules set by the firewall administrator. Many firewalls
use Network Address Translation (NAT) [15] and a router to control access to the inner
network. The inner network is most likely to use private IP addresses, which make
computers on the LAN invisible to the Internet at large. Thus, the inner network appears
to the outside world as one single machine: the firewall host. As a result, all outgoing
packets from the internal network appear to come from the firewall. Several issues
arise when using firewalls on the Grid:
•
How can a VO organisation allow VO members to access resources shared
with the VO, without compromising the security of the resources that are
not shared with the VO?
•
Because many institutions use NAT and private IP addresses, there is a
problem with the name to use on the resource certificate. (Should it be the
firewall host name or the resource machine name?)
4.9.2 Accessing resources behind the firewall
Each VO site would be required to open a dedicated port to allow access to its
resources. The European Data Grid project lists 10 ports [7] in order to open access on
resources. Once the user is behind the firewall, he can attempt to exploit
vulnerabilities in other machines in the local network. To reduce this risk, the local
site administrator can restrict access to these ports so that only requests from specific
IP addresses, for instance those of VO partners’ hosts, can go through.
However, this solution has drawbacks:
•
It will not work reliably with UDP [14] packets. UDP is connectionless, so the
source address of a packet is never confirmed by a handshake and can be easily
spoofed. Thus, rules that accept packets from a specific IP address cannot be
relied upon.
•
Due to the dynamic aspect of the Grid, the firewall administrator will have to
reconfigure the firewall each time a company joins or leaves the project. As a
result, the administrator may make a mistake during the reconfiguration.
If the firewall rules allow outsiders in by default, the firewall is not doing its job
properly. If the rules deny them by default (which they should!), the rules have to be
modified to let the partner companies in, and modified again to lock them out once
they no longer need the access. Also, using a revision control system (RCS) [14] for
firewall rules is highly recommended, so that the administrator can quickly roll back to
a previous configuration if something goes wrong with the new one.
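The default-deny approach with per-partner exceptions can be sketched as follows. This is a minimal model of the rule logic only, not a configuration for any real firewall product. The rule table `ALLOWED`, the helper names, the port numbers and the partner networks (taken from the address ranges reserved for documentation) are assumptions made for the example.

```python
import ipaddress

# Hypothetical rule table: destination port -> partner networks allowed to reach it.
# Anything not listed here is denied (default deny).
ALLOWED = {
    2811: [ipaddress.ip_network("192.0.2.0/24")],
    2119: [ipaddress.ip_network("192.0.2.0/24"),
           ipaddress.ip_network("198.51.100.0/24")],
}

def is_allowed(src_ip, dst_port, rules=ALLOWED):
    """Accept a packet only if its source address belongs to a network listed for the port."""
    src = ipaddress.ip_address(src_ip)
    return any(src in net for net in rules.get(dst_port, []))

def remove_partner(rules, network):
    """Return a new rule table without the given partner network (partner leaves the VO)."""
    net = ipaddress.ip_network(network)
    return {port: [n for n in nets if n != net] for port, nets in rules.items()}
```

Returning a new rule table instead of editing the old one in place keeps the previous configuration available, which mirrors the roll-back argument made above for keeping firewall rules under revision control.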
4.9.3 Naming issue with Network Address Translation (NAT)
On the Grid, mutual authentication between a user and a resource is required. Some
firewalls use NAT [15] so that remote users do not directly access resources behind
the firewall. Since the inner network appears as the firewall host, a major issue that
arises is the selection of the name to use on the resource certificate.
[Figure labels: Company A, F/W1, University B; add1: public IP address; add2: private IP address]
Figure 4.7 Firewall problem
A resource certificate usually holds a hostname. As in e-commerce systems, when
the resource is behind a NAT firewall, the VO user connects to the public IP
address of the firewall on a specific port. Thus, the firewall name is expected to
be on the certificate (figure 4.7). The firewall then tunnels the request to the target
resource. However, internal users will not be able to authenticate the resource because
the certificate does not hold the resource name [3].
Usually, when checking the resource certificate, the user’s host looks up the host name
for the IP address of the resource host presenting the certificate and compares that
host name to the one on the certificate. As long as the hostname it receives matches
the one on the certificate, and assuming everything else about the certificate is correct,
it will accept it. Some PKI implementations have unspecified behaviour if multiple
hostnames are returned for an IP address (as can be the case if the host has DNS
aliases, which is standard procedure). So, the problem can be addressed at the DNS
level. The technical details of
configuring firewalls are beyond the scope of this dissertation.
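The name check described above, and the way it fails for internal users, can be illustrated with a small sketch. Real name resolution is replaced by a hypothetical lookup table `REVERSE_DNS`. The addresses, host names and the function `certificate_name_matches` are assumptions made for the example and do not describe the behaviour of any particular PKI implementation.

```python
# Hypothetical reverse-lookup table: IP address -> host names returned by DNS.
REVERSE_DNS = {
    "192.0.2.1": ["fw1.company-a.example"],      # public address of the firewall
    "10.0.0.5": ["resource.company-a.example"],  # private address of the resource
}

def certificate_name_matches(peer_ip, cert_hostname, reverse_dns=REVERSE_DNS):
    """Compare the name on the certificate with the names resolved for the peer address."""
    names = reverse_dns.get(peer_ip, [])
    return cert_hostname.lower() in (name.lower() for name in names)

# External user: connects to the firewall's public address, certificate names the firewall.
external_ok = certificate_name_matches("192.0.2.1", "fw1.company-a.example")

# Internal user: connects to the private address, but the certificate still names the firewall.
internal_ok = certificate_name_matches("10.0.0.5", "fw1.company-a.example")

# One DNS-level remedy: add the firewall name as an alias for the internal address.
aliased = dict(REVERSE_DNS)
aliased["10.0.0.5"] = ["resource.company-a.example", "fw1.company-a.example"]
internal_ok_with_alias = certificate_name_matches("10.0.0.5", "fw1.company-a.example", aliased)
```

The sketch accepts a match on any of the returned names. As noted above, some implementations behave differently when several names are returned, so the alias remedy is not guaranteed to work everywhere.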
A possible solution to the previous problem is to have a second firewall (Fw2) for the
internal users (figure 4.8). Internal users then access the resource via the first firewall,
as the external users do. Thus the name on the certificate will match the name resulting
from resolving the IP address. This solution is not convenient, as the company will
have to manage two firewalls.
Figure 4.8 Solution with 2 firewalls.
Currently, most large organisations partition their local area network into domains
using firewalls. For instance in Royal Holloway, ISG has its own firewall, which
separates it from other network domains: the student network, staff network and
administration network. According to [7] Grid users should be able to access only the
domain that provides the Grid services. Therefore, the entire LAN should be properly
configured such that it is very difficult for an attacker to gain access to other domains
from the Grid domain.
4.9.4 Globus and firewalls
Globus allows access to resources behind the firewall via dedicated ports. According
to [7], these ports are 2119, 2135 and 2811. Globus suffers from the problem described
above when the firewall is using NAT. Currently there is no good solution for
firewalls with NAT.
4.10 Future network solutions
The IP Security protocol, IPSEC [18, 51], is designed to provide a point-to-point secure
channel, typically offering origin authentication and confidentiality at the network
layer. Point-to-point means: host-to-host, gateway-to-gateway or host-to-gateway.
IPSEC is widely used to implement Virtual Private Network (VPN) connections [18].
It operates in two modes [51]:
•
“Transport” mode: used between two IPSEC-enabled hosts (host-to-host). In
this mode the IP payload (for example a TCP segment) is protected, and the
original IP header, including the destination address, is left in clear text. For
instance, with AH this mode can be used when confidentiality isn’t an issue but
we want to know whom we are communicating with and that the data is
unchanged. (Useful for firewalls on the Grid)
•
“Tunnel” mode: used to establish a secure connection between gateways
(firewalls). In this mode the whole original IP packet, including its header and
destination address, is encrypted and encapsulated under a new IP header, so a
Virtual Private Network can be established between remote sites.
IPSEC provides origin authentication and data confidentiality by using [18]:
•
IP authentication header (AH)
•
IP Encapsulating Security Payload (ESP)
Both protocols require shared secret keys. The keys are negotiated between
communicating parties using the Internet Key Exchange (IKE) protocol.
In a Grid environment, origin authentication provided by AH will:
•
Allow firewalls on VO sites to accept only network packets originating from a
trusted IP address that belongs to a partner organisation, because with IPSEC
it is difficult to spoof the IP source address.
•
Provide some protection against denial of service attacks, because packets whose
source address cannot be authenticated can be discarded.
•
Work on both TCP and UDP.
IPSEC also provides data confidentiality by using the ESP protocol. This feature can
provide limited protection against traffic flow analysis [51]. AH and ESP protocols
can be used alone or combined to provide origin authentication, confidentiality or
both. According to [18], there are engineering and political reasons for separating
them (encryption export restrictions).
Chapter 5
Confidentiality, Integrity, Availability
and Accountability on the Grid
5.1 Introduction
The Grid is characterised by the flow of information between VO users, resources and
VO partners’ organisations. Information is stored electronically and transmitted by
electronic means. By using the Internet or public communications, VO members may
be exposing themselves to new risks. VO Organisations want to protect their data
resources while collaborating with other VO partners.
Traditionally, security has been concerned with maintaining the confidentiality,
integrity and availability of data resources [2]. Accountability [54] has been recently
added to these attributes. On the Grid, Confidentiality and Integrity are required
because data is transmitted over a public network. Furthermore, resources involved
may be extremely sensitive and valuable. This requires that only authorised project
users can gain access to read or modify these resources. Resources availability is vital
on the Grid because without it the purpose of sharing resources will not be
accomplished. Finally, accountability is required in order to determine project users
who are responsible for performing specific jobs on VO resources and for billing
purposes in the future.
The smooth running of the Grid project crucially relies on security mechanisms that
ensure these attributes. This chapter describes each of these attributes, why it is
needed and the mechanisms used to achieve it.
5.2 Confidentiality on the Grid
Confidentiality on the Grid is crucial. The existence of some institutions in the VO, in
particular commercial companies, depends on keeping their information secret.
Confidential information can include: proprietary databases, sophisticated software
and intellectual property such as software source code and drug formulas and
compounds. Confidentiality deals with preventing disclosure of information to
unauthorised entities [2]. On the Grid, Confidentiality is needed for:
•
Ensuring the network security. Users will remotely communicate with resources,
submit jobs and upload proprietary source code to remote supercomputers over
the Internet. This means traffic between users and resources is a point of attack.
•
Ensuring the privacy of data resources. The project may involve extremely
sensitive or valuable data such as patients’ medical records and drug experiments
databases. This data is shared on the basis that only authorised project users will
have access to it.
•
Maintaining the secrecy of data processed on remote resources. The Grid project
may allow its users to run programs on remote sites. The result of the program and,
possibly, the source code running on the remote machine should remain
confidential.
The usual security mechanism used to provide confidentiality is encryption [10, 11]
and is described next.
5.2.1 Brief overview of Encryption
Encryption is a means of transforming plaintext into cipher text under the control of a
secret key. This process is called encryption and we write C = E_K(M),
where M is the plaintext, E is the cipher algorithm, K is the key and C is the cipher text.
The reverse process is called decryption or decipherment and we write M = D_K(C),
where D is the corresponding decryption algorithm.
The current algorithms used are either symmetric [10, 11] or asymmetric (public key
cryptography) [10, 11].
5.2.1.1 Secret key cryptography
With Secret key cryptography (symmetric encryption) both parties use the same key
value to encrypt clear text into cipher and to decrypt a cipher text back into clear text.
The assumption is that the key K must remain secret.
The effectiveness of the cipher algorithm E depends largely on the length of the secret
key K: the longer the key, the more difficult it is to break the cipher text by exhaustive
key search.
The most widely used symmetric encryption algorithm is the Data Encryption Standard
(DES) [11]. It has a 56-bit key and a 64-bit block. Due to advances in computing
power and its short key length, DES is currently considered not secure for high value
transactions [10] or for long-term data encryption. The successor to DES is the
Advanced Encryption Standard (AES) [11]. It offers key lengths of 128, 192 and 256
bits and a 128-bit block size. It is considered secure for the next two decades [11].
A major problem with symmetric key is how to exchange the secret key between
parties who do not trust/know each other such as on the Internet and in our case the
Grid. Also the number of keys to be managed is huge. For instance, for a group of n
users, the number of keys to be managed is n*(n-1)/2. If n = 10 users, this means
number of keys to be managed is 10*9/2 = 45 keys.
5.2.1.2 Public key cryptography (PKC)
PKC has been described in section 2.4.1. Public key has several advantages over
secret key such as:
•
Private keys need not be distributed. Only the integrity and authenticity of the
public keys themselves need to be demonstrated.
•
For a group of n users, the number of keys to be managed is 2n, which is
easier to manage than the n*(n-1)/2 keys, of the order of n^2, needed with secret
keys. For n = 10, the number of keys to be managed in this case is 20 (imagine the
difference for 1000 users on the Grid).
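The two key counts can be compared directly. The sketch below only evaluates the formulas quoted in the text, n*(n-1)/2 for pairwise secret keys and 2n for public key pairs. The function names are hypothetical.

```python
def symmetric_key_count(n: int) -> int:
    """One shared secret key per pair of users: n*(n-1)/2."""
    return n * (n - 1) // 2

def public_key_count(n: int) -> int:
    """One private key and one public key per user: 2n."""
    return 2 * n

if __name__ == "__main__":
    for n in (10, 1000):
        print(n, symmetric_key_count(n), public_key_count(n))
```

For n = 10 the counts are 45 and 20, as in the text. For 1000 users the symmetric count grows to 499500 while the public key count is only 2000.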
5.2.1.3 Combining secret key and public key
A major problem with public key cryptography is performance, because of complex
mathematical operations that are computationally intensive. Public key cryptosystems
are very slow, about 1000 times slower than symmetric systems [11]. Therefore, a hybrid
approach is often used: public key algorithms are used to securely exchange the secret
keys for symmetric algorithms, which then encrypt the bulk data. This approach is
adopted in communication protocols such as SSL/TLS and SSH.
5.2.1.4 Key management
Key management is by far the most important area of security of cryptosystems.
Kerckhoffs’ principle states, “Security shouldn’t rest on the security of the algorithms,
but only on the secrecy of the key”. Encryption algorithms such as RSA and AES are
assumed secure, but how to manage the keys is the issue to worry about.
Key management is really about three things: keeping the keys for symmetric
algorithms secret, keeping the private keys for asymmetric systems secret, and ensuring
that the public keys in asymmetric systems are authentic (that is, that they belong to
their claimed owners) [10].
Key management generally covers six areas according to [10]: Key Generation,
distribution, storage, usage (and preventing misuse), changing, and destruction. On
the Grid we are interested in:
Key generation: it is important for proxy servers and users generating their own key
pair. Generally, Key generation involves generating random (or at least
pseudorandom) numbers to make up the key bits. Since this is usually a single unit
within a security system, it is often the focus of sophisticated attacks, which are
difficult to detect.
•
For symmetric ciphers, the key generation needs to exclude weak or semi-weak
keys from being generated. These are keys that have been shown to produce
certain effects in a cipher’s output. For instance, when using a weak key in DES,
the process of encryption is the same as the process of decryption [11].
•
For asymmetric ciphers, the key generator usually needs to be able to generate
strong primes, in addition to the values being unpredictable. This process is usually
very slow and complicated. If users generate their own key pairs, a
non-repudiation problem may arise: one user may hold a key pair certified by a CA
while another generates, by chance, the same pair, which raises an issue of liability.
The recommended key length to defeat factorisation attack is 760-bit [11].
Key storage and distribution has been described in section 2.4 using smart card and
certificates. Key usage, changing and destruction are beyond the scope of this
dissertation.
5.2.1 Communication security
SSL/TLS are communication protocols that are commonly used to provide
confidentiality on the Internet. Typically, encryption is used to keep information
secret from a third party. But if the third party manages to masquerade as the intended
recipient of the information, the encryption will fail to achieve its objective. SSL
depends on the existence of a PKI to provide confidentiality between two entities.
Thus, it needs to be configured to provide mutual authentication between those
entities. This requires:
•
Trusted Root Certificates: the VO site administrator according to the VO
policy must install the root certificates.
•
Validation and verification: When the certificates are validated, this means
the intended parties are communicating together. Otherwise the authentication
will fail. The client must ensure that the certificate strictly identifies the party
with whom it wants to communicate. CAs usually include the DNS name or IP
address of the resource in the certificates they issue.
•
CRL checking: Recent implementations of Konqueror and Internet Explorer
did not check CRLs. Without checking the CRL the
authentication would not be reliable.
Export restriction: SSL/TLS can be weakened because of encryption export
restriction. VO may comprise organisations from different countries. Some of these
organisations reside in countries where there are restrictions on the use of
cryptographic algorithms. For instance, some restrictions may allow only 40-bit DES,
which gives little protection to high-value data transmitted over the network
between Grid entities. This will not allow companies to benefit from the Grid.
Passive attacks: encrypted data transmitted over a public network between project
partners is vulnerable to passive attacks [11]. In such an attack, the attacker collects
network packets in order to decrypt them offline and infer some information. The
protection against this type of attack depends on the length of the encryption key.
The length of the key should be chosen such that the time it takes to exhaustively
search the key-space is prohibitively long (years/decades). In practice, 112-bit and
128-bit key lengths are used in Triple DES and AES respectively.
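The relationship between key length and exhaustive search time can be made concrete with a rough estimate. The search rate of 10^12 keys per second below is purely an assumption chosen for illustration, not a measured figure, and the function name is hypothetical. On average an attacker is expected to try half of the key space before finding the key.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_search(key_bits: int, keys_per_second: float) -> float:
    """Expected brute-force time in years, trying on average half of the 2^key_bits keys."""
    expected_trials = 2 ** (key_bits - 1)
    return expected_trials / keys_per_second / SECONDS_PER_YEAR

if __name__ == "__main__":
    ASSUMED_RATE = 1e12  # keys per second; an illustrative assumption only
    for bits in (56, 112, 128):
        print(bits, years_to_search(bits, ASSUMED_RATE))
```

Under this assumed rate a 56-bit key falls in roughly ten hours, while 112-bit and 128-bit keys remain far beyond reach. Each additional key bit doubles the expected search time.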
5.2.2 Data resource privacy
Data resources usually include databases, archives and file systems. These resources
may be extremely sensitive and valuable. For instance in the example in section 1.3,
the Laboratory may provide proprietary drug compound databases for project users.
Only authorised project users can access these databases. Therefore, a reliable access
control mechanism is required. The current security mechanisms used have already
been described in chapter 4.
Sometimes attackers manage to get a copy of database backup files. The
backup files should remain secret even if they are stolen. This can be achieved by
encrypting them with strong algorithms such as those mentioned in the previous section.
5.2.3 Remote Data privacy
A major confidentiality issue arises when sensitive data are processed on a remote
resource. The Grid project may allow users to run programs and upload their private
source code to a remote resource. The local site administrator, as well as other
privileged users on that site, can access the data processed on that resource. Therefore,
they are capable of reading or copying the result of the program execution. A level of
trust should exist between project users and the local site administrator that the latter
will not access their data. Otherwise, commercial companies such as pharmaceutical
companies that rely on keeping their experiments secret will not be able to benefit
from the Grid.
In order to reduce this trust assumption with the administrator, an asymmetric
encryption function can be integrated within the program. So, the program takes an
additional input that is a public key supplied by the user/proxy, in order to encrypt the
result of the program execution. In this case, only the user or the proxy running the
program has the private key that can decrypt the result of the execution. This measure
will make it harder for the administrator to access the result of the program, but it has
several limitations:
•
Performance will degrade because public key encryption is very slow
especially if the amount of data to be encrypted is large.
•
Each program will need to have an asymmetric encryption function for
encryption.
•
If another resource requires the result of the execution, it will not be able to
read it because it doesn’t have the private key to decrypt it. Thus resources
coordination will be very limited.
5.3 Integrity on the Grid
Integrity deals with the prevention of unauthorised modification of data resources and
resource configurations. Consider the Grid project described in section 1.3. Integrity
can be achieved at different levels:
Access control level: In institution A, only authorised users are allowed to change the
direction of the satellite such as Admin.A. In C, changing the attributes of a chemical
item on the database server should only be done by authorised users such as
Scientist.C. Access controls, described in chapters 3 and 4, are mechanisms used to
provide data integrity.
Administration level: Hackers can gain unauthorised access to data by exploiting
known weaknesses in the systems that run the Grid project resources. This includes
vulnerabilities in operating systems, database servers, firewalls, Intrusion Detection
Systems (IDSs) and routers. Vendors respond by issuing patches and upgrades. It is
therefore vital to apply all patches recommended by the vendor.
Network and configuration level: Hash and Message Authentication Code (MAC)
algorithms [11] are also used to maintain integrity of data. A hash algorithm is a
one-way function that takes as input a message of arbitrary length, and returns an output
of fixed length. A MAC takes a message and a key as inputs and returns a checksum of
fixed length. SHA-1 [11] and MD5 [11] are hash algorithms that produce 160-bit and
128-bit hash values respectively [11]. HMAC [11] is a MAC algorithm.
The hash function HASH has several properties [11]. Let h = HASH (M) be the hash
of message M:
•
Pre-image resistant: Given h = HASH (M), it is very difficult to find M.
•
Second Pre-image resistant: Given M and h = HASH (M), it is very difficult
to find N such that h= HASH (N).
•
Collision free: It is very difficult to find M and N such that HASH (M)= HASH
(N).
Hashing or MACing can be used to ensure the integrity of data resources by hashing backup
files, archives or file systems. Once the hash value is produced, it can be encrypted
and saved. If unauthorised modification of any of these files is suspected, the hash
function is applied again: if the file has been changed, the resulting hash value will not
match the saved one. Thus the modification can be detected.
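Detection of modification by recomputing a keyed checksum can be sketched with Python's standard `hashlib` and `hmac` modules. The key, the sample data and the helper names `file_mac` and `is_unmodified` are assumptions made for the example. SHA-1 is used only because the text names it. A current deployment would likely choose a newer hash such as SHA-256. Using a MAC instead of a plain hash means that an attacker who changes the file cannot simply recompute the checksum without the key.

```python
import hashlib
import hmac

def file_mac(key: bytes, data: bytes) -> str:
    """Keyed checksum (HMAC with SHA-1) of the data, as a hexadecimal string."""
    return hmac.new(key, data, hashlib.sha1).hexdigest()

def is_unmodified(key: bytes, data: bytes, stored_mac: str) -> bool:
    """Recompute the MAC and compare it with the stored value in constant time."""
    return hmac.compare_digest(file_mac(key, data), stored_mac)

# Hypothetical key and backup content, for illustration only.
KEY = b"example-integrity-key"
backup = b"compound 42: formula C8H10N4O2"
stored = file_mac(KEY, backup)              # computed and saved when the backup is made

tampered = backup.replace(b"42", b"43")     # an unauthorised change
```

Calling `is_unmodified(KEY, backup, stored)` succeeds, while the same check on `tampered` fails, so the change is detected.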
Digital signature: a fundamental mechanism for ensuring data integrity. A digital
signature is usually created by calculating the hash value of the data (document, network
packets) and encrypting the hash value with the private key of an entity; the hash,
rather than the data itself, is signed for performance reasons. For example, the digital
signature on the certificate gives
assurance that the certificate has not been modified since creation by the CA.
Modifications to the certificate content can be detected because the hash value of the
certificate will be different from the value signed by the CA. Digital signatures can
also be applied to data resources, so Grid users can ensure that the data they are using
is from the intended VO site. In communication over the public network, a digital
signature provides origin authentication that gives Grid users assurance of the identity
of the second party they are communicating with.
Concurrency level: Since the Grid is about coordinated resource sharing, concurrency
problems will occur. A job running on the Grid may involve a file or a
database update. Sub-jobs should not read a portion of a file while it is
being updated by another job; otherwise, an application might read
partially updated data and receive a combination of old and new data.
Locking and synchronizing primitives are required to maintain data integrity [19].
Typically, these primitives are built into the file system or database to automatically
prevent this.
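The effect of a locking primitive can be shown with a small sketch using Python threads in place of Grid sub-jobs. The class `SharedRecord` and the function `run` are hypothetical. The record holds two fields that must always agree. Because readers and writers take the same lock, a reader never observes a half-finished update.

```python
import threading

class SharedRecord:
    """A record with two fields that must always be updated together."""

    def __init__(self):
        self._lock = threading.Lock()
        self._version = 0
        self._copy = 0  # must equal _version once an update has completed

    def update(self):
        with self._lock:  # writers hold the lock for the whole update
            self._version += 1
            self._copy = self._version

    def read(self):
        with self._lock:  # readers wait until no update is in progress
            return self._version, self._copy

def run(updates_per_writer=1000, writers=4):
    record = SharedRecord()
    inconsistent = []

    def writer():
        for _ in range(updates_per_writer):
            record.update()

    def reader():
        for _ in range(updates_per_writer):
            version, copy = record.read()
            if version != copy:  # would indicate a mix of old and new data
                inconsistent.append((version, copy))

    threads = [threading.Thread(target=writer) for _ in range(writers)]
    threads.append(threading.Thread(target=reader))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return record.read(), inconsistent
```

File systems and databases provide the equivalent of this lock internally. The sketch only makes visible what those built-in primitives guarantee.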
5.4 Grid Availability
Most VO organisations are totally dependent on their computerised information
systems and the data that they store, process and transmit. Consequently, system
failure or information loss can have grave consequences on the smooth running of the
Grid project. VO Organisations are also being faced with an increasingly wide and
sophisticated range of threats including viruses, hacking, and denial of service attacks
because resources on the Grid are connected to each other via the Internet.
Availability aims at ensuring that authorised Grid users have timely and uninterrupted
access to resources on the Grid [2].
Availability can be viewed from many angles: threats to the Grid infrastructure
layer, threats to the underlying operating system and supporting applications and
finally threats to the network infrastructure. A major threat for the Grid is
“Denial-of-Service” (DoS) [27]. Since Grid servers are accessible through the Internet,
they are vulnerable to DoS and “Distributed-Denial-of-Service” attacks
[27]. These types of attacks have been widely experienced on the Internet. For
example, a DoS attack in early 2000 seriously interrupted the services of some
well-known Internet sites such as Amazon, eBay and Yahoo [27].
The Grid infrastructure itself has some critical components such as the CA, the CAS
server and the Grid map file. If any of these components is compromised the Grid
project will fail.
CAS server: When the Grid project relies on the CAS server for granting users access
to resources, the availability of the server becomes crucial. Since the server is
accessible via the Internet, it will be vulnerable to a wide range of threats with
different impacts on availability:
•
If the CAS server is hacked, and all its content is deleted, VO users will not be
able to access any resource in the VO.
•
If the CAS server is infected by a worm such as Code Red, the performance of
the server will degrade dramatically because worms consume bandwidth in
order to replicate. This will delay the response time to VO users’ requests.
•
The CAS server is also vulnerable to network attacks such as SYN flooding [14],
which exploits the handshake operation when establishing a TCP connection.
(Communication with the CAS server is done via SSL, thus TCP.)
Grid-map file: Each VO site has a Grid map file. One way of causing interruption of
services on the Grid would be by deleting the Grid-mapfile or the database used to
map global credential to local accounts on the resource. The impact of compromising
a Grid map file will affect one VO site instead of the whole project.
In Grid projects that involve data resources, replication of this data to different
machines at different sites can increase data availability. Thus, to stop the service the
attacker needs to disable all servers that host the data. A DoS attack could also consist
of flooding the network with a huge amount of fake requests. The success of the attack
depends on the type of scheduling algorithm used to manage requests. Its effect may vary
from delayed job execution to a complete stop.
5.5 Accountability
We have researched the Grid literature and have not found an explicit statement of the
accountability problem. We now present our definition of what the accountability
problem is.
Accountability tracks job execution with the intention of determining the principals
responsible for performing a job. Audit logging is a mechanism used to maintain
accountability. What makes accountability on the Grid more complex is the ability to
simultaneously access heterogeneous resources in geographically distributed locations.
Typically, an audit system will contain records about transactions such as logging on
to the system and initiating a process. According to [4], a job on the Grid “is
composed of dynamic group of processes running on different resources and sites.”
Consider the Grid project described in section 1.3. When U.A submits a job to a
remote resource, R.B, on site B, the resource authenticates U.A first, and then maps
his global credential to a local account, LocalU.A-B, on B. The audit trail on the
local resource records that a local user, LocalU.A-B, has successfully logged in. From
this point, all U.A’s activities (job processes) are associated with his local username
on that resource, LocalU.A-B, and recorded in the audit trail of that resource. These
activities may include:
• Initiating processes: audit trails are expected to show when the processes were
  initiated, who initiated them and when each process completed (or terminated for
  some reason).
• File access: the audit trail provides information about which files (or databases)
  U.A has opened and closed. Moreover, it provides information about the operations
  performed by U.A on these files, such as reading from, writing to or deleting from
  a file. Finally, it may record the files to which U.A made invalid access attempts.
• Resource utilisation: the audit trail will show the level of utilisation of each
  resource by U.A. This allows the administrator to determine which user is abusing
  the resource. Resources can be massive storage devices, CPU cycles, memory, and
  network bandwidth. The details of the use of a resource can help the administrator
  to determine whether there are unauthorised processes, such as worms or malicious
  code, abusing the resources, and to take appropriate action.
The job submitted by U.A to R.B may determine that it requires access to a resource
R.C on site C. Thus, R.B initiates a proxy on U.A’s behalf to access that new remote
resource R.C. The latter will map U.A’s identity on the certificate to another local
username, LocalU.A-C, which may be different from LocalU.A-B, and start
recording U.A’s activity locally in a similar way.
In order to determine U.A’s activities on the Grid, each local site administrator in the
VO will have to filter the audit trail of his resources using the username associated
with U.A’s Subject name on the certificate.
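The filtering step that each site administrator performs can be sketched as follows.
This is a minimal illustration only: the subject name string, the in-memory mapping
tables and the (username, event) record layout are hypothetical stand-ins for each
site's real Grid-map file and audit log format, which differ between resources.

```python
from collections import defaultdict

# Hypothetical per-site mappings from certificate subject name to local account.
GRID_MAP = {
    "siteB": {"/O=Grid/CN=U.A": "LocalU.A-B"},
    "siteC": {"/O=Grid/CN=U.A": "LocalU.A-C"},
}

# Hypothetical audit trails: one list of (local username, event) per site.
AUDIT_TRAILS = {
    "siteB": [
        ("LocalU.A-B", "login"),
        ("someone-else", "login"),
        ("LocalU.A-B", "start process"),
    ],
    "siteC": [("LocalU.A-C", "open file")],
}


def activities_for(subject, grid_map, audit_trails):
    """Collect, site by site, the events recorded under the local username
    that each site associates with the given certificate subject name."""
    result = defaultdict(list)
    for site, mapping in grid_map.items():
        local_user = mapping.get(subject)
        if local_user is None:
            continue  # this site has no account mapped to the subject
        for user, event in audit_trails.get(site, []):
            if user == local_user:
                result[site].append(event)
    return dict(result)


print(activities_for("/O=Grid/CN=U.A", GRID_MAP, AUDIT_TRAILS))
```

The sketch gathers all sites in one place purely for illustration; in practice each
inner loop would be run separately by the administrator of the corresponding site.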
Accountability on the Grid is distributed. Each site in the VO locally maintains
accountability. Currently, there is no method to track users’ activities on different
VO sites from a central audit trail. The reasons include:
• Access to audit trails is usually restricted to the local administrator of the
  resource. If the audit trails’ integrity is compromised, the company will lose
  accountability. Therefore, it is unthinkable to allow external processes to
  access the audit trails.
• Due to the heterogeneous nature of resources on the Grid, audit logs have
  different formats on different resources.
The accountability level in the VO schemes described in chapter 2 depends on how
the local administrator creates and manages users’ accounts on resources. The system
administrator may give all users from a particular Grid project the same account,
even with the same access rights, because:
• The number of users may be large
• The number of accounts to be created and managed is too great
Thus, it will be difficult to track individual users’ activities on the Grid. The
administrator will only be able to determine that some user from a particular project
has performed a particular transaction, not which user. The CAS authorisation model
has this weakness, as accounts on resources are associated with the CAS server name,
not with users’ identities.
Chapter 6
Toward a Top Down View of Grid Security
6.1 Introduction
The Grid provides control over huge resources consisting of enormous computational
power (parallel processing machines), storage (hard disks, memory), scientific devices
(telescopes, satellites) and safety-critical systems such as the Ferno at QMUL used by
the RealityGrid project. Because of these valuable assets, the Grid can be a very
attractive challenge to skilled hackers who want to cause intentional damage: cracking
codes, obtaining confidential information, storing illegal documents, sabotaging
equipment, infecting programs, and even causing physical harm. These are intentional
external threats.
So far we have introduced several types of VOs and discussed in depth various
security mechanisms associated with each type. These mechanisms provide the
infrastructure upon which Grid security solutions could be built in a bottom-up
approach. They have mainly been user-centred, focusing on authentication and
credentials. To form a better understanding of Grid security, I attempt in this chapter
to provide a top-down view which could be useful to users, site managers, and
security administrators. The approach is resource centred and is inspired by the
lectures of Prof. Dieter Gollmann on “security” and Dr Sherwood on “risk
management”. To do this it may be useful to step back a little and remind ourselves of
what “security is about”. According to Prof. Dieter Gollmann, security is about taking
appropriate measures to prevent the misuse of valuable assets (protection), identify
damages that have happened to them (detection) and recover the assets after the
damage is done (reaction).
In the Grid context the assets are the shared VO resources and the resources of the
individual sites. To protect these assets, it is important to have a threat model that
identifies the known threats to each asset and the vulnerabilities introduced by the
security mechanisms in use. As we have seen, when an organisation becomes a VO
site in a Grid project, some of its resources may be shared with other VO sites from
different security domains. In addition, these shared resources will be connected to
resources in other security domains through the Internet, which was not designed to
be secure. Thus, these resources (even the site resources which are not intended to be
shared by the VO!) are vulnerable to a range of new threats that might affect the
whole organisation’s mission. In order to develop a strategy to protect these resources,
they have to be assessed for their degree of importance. The impact resulting from the
breach of confidentiality, integrity and availability of the resource decides what level
of security is required. This could be achieved by conducting a risk analysis (RA) [20,
21] on the organisations’ resources. Risk analysis helps VO organisations to identify
and manage the risk of sharing with the aim of reducing it to an acceptable level. The
risk analysis is done from two different perspectives:
• VO: to protect the security interests of VO users
• Individual sites: to protect the security interests of the individual organizations
  within the VO.
Any workable solution has to be a compromise acceptable to both parties.
6.2 Risk management of Grid Assets
The purpose of risk management [20] is to determine the current security status and
the level of protection required for a resource. Also, it offers a cost-effective way of
providing the protection needed for the organisations’ resources. However, risk
management can only be effective if it is based on a sound risk analysis process. Here
is a brief explanation of risk analysis.
6.2.1 Overview of Risk and Risk Analysis
The key elements of a risk analysis are:
• Asset: a resource of value to a VO partner organisation, such as a
  supercomputer, databases, equipment and software.
• Threat: any potential danger to resources (equipment, information, or
  systems), for example viruses, personnel, sabotage, theft and system failure.
• Vulnerability: a weakness in the protection that allows a threat to materialise.
  For example, a misconfigured firewall would allow unauthorised access to
  resources.
Risk is the possibility of damage. It is dependent on the asset value, the threats, and
the vulnerabilities. Risk analysis analyses the relationship between these elements to
determine potential loss [20].
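The dependence of risk on asset value, threats and vulnerabilities can be made
concrete with a small qualitative scoring sketch. The three-level scale and the
multiplicative combination are assumptions made here for illustration; they are not
taken from [20], and a real risk analysis method may weight the elements quite
differently.

```python
# Assumed qualitative scale (not from the cited guide).
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(asset_value, threat_likelihood, vulnerability_severity):
    """Combine the three risk elements into one illustrative score.

    Assumption: the score is the product of the three levels, so it is
    lowest when every element is low and highest when every element is high.
    """
    return (
        LEVELS[asset_value]
        * LEVELS[threat_likelihood]
        * LEVELS[vulnerability_severity]
    )


# Hypothetical assessment of two resources at a VO site.
print(risk_score("high", "medium", "low"))   # e.g. a well-protected database
print(risk_score("high", "high", "high"))    # e.g. an exposed, valuable server
```

A score of this kind only ranks resources relative to each other; deciding what
level is acceptable remains a judgement for the VO and for each site.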
6.2.2 Risk Analysis of Grid Assets
Sherwood [21] defines RA within a business environment as, “the process of
identifying business assets, recognising the threats, assessing the level of business
impact that would be suffered if the threats were to materialize, and analysing the
vulnerabilities”. This variant of the RA definition is better suited to the Grid.
Any security solution needs to be viewed from the VO perspective as well as from the
individual VO sites. The VO needs risk analysis to ensure that the VO infrastructure
has an acceptable level of security. Individual VO sites need to do risk analysis in
order to:
• Recognise potential new threats that arise from joining the Grid.
• Examine the effectiveness of current security controls used by the VO.
• Understand vulnerabilities.
• Estimate the impact of loss of confidentiality, integrity and availability on
  resources.
[Figure 6.1: diagram with the labels User, Trust, Security mechanisms,
Vulnerabilities, Threats, Resource, Functionalities and CIA - Impact.]
Figure 6.1 Risk analysis diagram depicting relationship between functionalities,
security mechanisms, and vulnerabilities.
Typically, each resource in the VO has:
• A set of possible threats (threat model).
• An impact, which results from a threat that materialises (usually loss of
  confidentiality, integrity or availability).
• Security mechanisms for enforcing an access control policy (local and global).
The VO user gains access to VO resources via various security mechanisms such as
authentication and authorisation. These security mechanisms have vulnerabilities that
may allow threats to materialise. After a risk analysis has been carried out, the VO
site can decide whether or not the vulnerabilities of the security mechanisms are
acceptable. If the VO site finds that the level of risk is acceptable, then it will
join the VO. In some circumstances, the VO site may impose conditions on certain
procedures in the VO. For instance, a condition may state that only digital
certificates issued by a particular CA, X, are acceptable.
6.2.3 Enhancing security of core Grid Assets
As depicted in Figure 6.2 [2], there is an inverse relationship between
“security” and “functionality”. It is normally the case that adding more functionality
to a service will result in less security. Therefore, to minimise security risks to core
services, non-essential functionalities should be removed [2].
[Figure 6.2: diagram with the labels Security and Functionality.]
Figure 6.2 More security less functionality
6.3 Security hierarchy of Grid Resources
To ease the process of risk assessment, it is very useful to define a partial ordering
relation on resources that captures security dependency. Typically, r1 <= r2 means
any vulnerability to resource r2 is also a vulnerability to resource r1. In other words,
the security of r1 totally depends on the security of r2. For example, we have:
r.B <= map-file.B
map-file.B <= CA
This ordering allows the construction of a dependency hierarchy and greatly
facilitates the derivation of the risk assessment of an asset from the cumulative
impact on all its dependent assets. For example, compromising the CA implies
compromising all the resources in the VO. This hierarchy leads naturally to a
protection rings model [2] in which the most critical security services are placed in
ring 0 (which has the highest protection: access control, firewalls, less
functionality, etc.) and the least critical security services are placed in the
outer ring.
[Figure 6.3: protection rings labelled 0, 1, 2 and 3.]
Figure 6.3 Protection rings [2]
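The security dependency ordering can be explored with a short sketch. It encodes
the two example relations given above (r.B <= map-file.B and map-file.B <= CA) as a
dependency table and follows them transitively; the table layout and function names
are illustrative choices, not part of any Grid toolkit.

```python
# DEPENDS_ON[r1] holds every r2 with r1 <= r2 stated directly,
# i.e. the security of r1 depends on the security of r2.
DEPENDS_ON = {
    "r.B": {"map-file.B"},
    "map-file.B": {"CA"},
    "CA": set(),
}


def all_dependencies(resource, depends_on):
    """Return every resource that `resource` depends on, directly or
    through a chain of dependencies (the resource itself is excluded)."""
    seen = set()
    stack = list(depends_on.get(resource, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    return seen


def affected_by(compromised, depends_on):
    """Return the resources whose security is lost if `compromised` falls."""
    return {
        r for r in depends_on
        if compromised in all_dependencies(r, depends_on)
    }


print(all_dependencies("r.B", DEPENDS_ON))
print(affected_by("CA", DEPENDS_ON))
```

With the two example relations, compromising the CA affects both map-file.B and
r.B, which matches the observation that the CA sits at the top of the hierarchy.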
Grid infrastructure contains some critical resources that support the functions of the
VO. Consider the VO scheme where all VO organisations trust a third party, the CA
(described in section 2.5). The major resources of the VO can be classified into two
categories, as follows:
1. The first category comprises the Grid infrastructure resources, which include:
• Certificate Authority: The CA is the most critical resource on the Grid because
  it issues and signs certificates for project users and other resources. These
  certificates are used for authentication and authorisation on the Grid. If the
  CA’s private key is compromised, the digital certificates will no longer be
  reliable. As a result:
    1. The authentication process will fail because the attacker will be able to
       sign a false certificate, which enables him to gain access to resources
       for which he is not authorised.
    2. Authorisation will fail. Authentication failure will trigger
       authorisation failure (domino effect) because authorisation is based on
       the subject name in the certificate.
    3. The whole Grid project will fail.
• Registration Authority: The RA is the second most critical resource. VO
  partners will trust the certificates issued by a CA only if they are satisfied
  with the amount of checking done by the RA to confirm an identity before a
  public key is certified for that identity. The rigour of the verification
  procedure depends on the intended purpose of the certificate.
• CAS server: The CAS server provides authorisation data to Grid users. Grid
  users can only access the VO resources if they have a proxy certificate signed
  by the CAS server. If the latter is compromised, the attacker will be able to
  access any resource on the Grid to which the VO is authorised. The impact of
  this breach will depend on the type of resources shared and the authorisation
  level granted to the VO. For instance, if the resources involve databases or
  archives, unauthorised write access would destroy their value, while
  unauthorised read access would disclose confidential information and may
  result in financial losses; in the worst case, some companies may go out of
  business. Furthermore, access to resources by legitimate users will be
  interrupted completely if the data on the CAS server is deleted.
2. The second category includes the shared resources at each VO site, which
comprise:
• Grid map files: The map files are necessary for converting VO users’ global
  credentials to local accounts. Without the map file, it will not be possible for
  VO users to access resources at a specific VO site. The impact of
  compromising a Grid map file is less significant than that of compromising the
  CAS server because a map file operates on an individual VO site, or even an
  individual resource, not on all VO resources. Therefore, the impact of
  unavailability is restricted to one resource or at most one site, although this
  may still cause an availability problem for the entire VO. Also, the attacker may
  manage to modify the Grid map file so that his subject name is mapped to an
  existing account with high privileges. By masquerading as a legitimate user, the
  attacker may find confidential data on the resources, such as drug description
  data or future car design information.
[Figure 6.4: hierarchy diagram. VO components: CA, RA, CAS-Server, Grid-mapfile.
Resource categories: Computer resources, Instruments, Data resources, Network
resources. Other labels: Desktop machines, Telescopes, Databases, Large clusters,
Supercomputers, Sensors, Archives, Campus network, Massive storage, Satellites,
Software applications, Local resources.]
Figure 6.4 Resources Hierarchy
• Data resources: Data resources vary from file systems to very sensitive
  databases such as drug experiments and medical records. The value of the data
  depends on the impact caused by a breach of its confidentiality or integrity.
  For example, an attacker may manage to infer a drug formula from the data. Also,
  if an attacker manages to modify or overwrite existing data resources, VO users
  will be using unreliable data and will therefore end up with unreliable job
  results.
• Computer and equipment resources: Computer resources vary from CPU
  cycles and massive storage to multi-million high-performance supercomputers.
  Equipment includes lab facilities, telescopes and satellites that are extremely
  valuable. The value of the resource decides the level of protection required.
6.4 Threats
Here is a sample list of possible threats to VO resources that could bring severe
damage and loss of revenue:
• Loss or theft of the CA’s private key.
• Loss or theft of a VO user’s private key.
• Loss of CAS server availability: The availability of the CAS server is
  crucial. If the server is not available, due to a denial of service attack for
  instance, VO users will not be able to access VO resources, and the Grid
  project will stop completely.
• Unauthorised access: Unauthorised access to crucial files or services such as
  Grid map files or the CAS server.
• Unauthorised disclosure: Disclosure of information that is confidential to a
  VO member, such as databases, software code, and results of experiments.
• Accidental system failure: Computer, data and network resources are
  vulnerable to failures caused, for instance, by fluctuations in the electricity
  supply voltage.
• Deliberate sabotage: Computer, data and network resources are also
  vulnerable to viruses, worms and other types of deliberate sabotage.
7. Conclusion
Over the last decade the Grid concept has rapidly evolved from a rigid, non-scalable
set of ad hoc point-to-point connections with poor functionality toward a scalable,
flexible and dynamic virtual organisation that provides a much richer set of
functionalities. The potential benefits of the Grid (and Virtual Organizations) are
enormous, but the biggest barrier to their wider adoption is security. Currently a
limited concept of VO is realized by the Grid for scientific collaborations, but the
domain of potential applications is huge: VO possibilities range from e-government,
health and e-learning to the military and the coordination of multinational forces.
Grid security, in Globus 3, has made a strong leap forward in recent years in many
aspects (for example authentication and confidentiality), but it still has many
deficiencies and is not yet convincing for the commercial world. The innovative
applications of cryptography, PKI, and X.509 have been the source of many of the
radical advances in the evolution of security solutions to these aspects.
In this report, we have focused on understanding the nature of a VO and the way this
concept has evolved over the last dozen years. We attempted to clarify the mission of
the virtual organization, the explicit assumptions about the roles involved in the VO
(administrators, sites, users, certificate authorities, and third parties), the trust
relationships among these roles and how security policy conflicts between VO and the
local sites are resolved. Specific solutions corresponding to various types of VOs and
from Grid Security Infrastructures have been discussed in depth throughout this report.
For example, it has become clear, as depicted in Table 7.1, which kinds of tasks can
feasibly be performed in each type of VO. For instance, running a task on a
remote named resource is feasible in all the VOs, but the collective execution of a
distributed task on remote resources (in different sites) seems feasible only with a
VO trusting a CA.
Table 7.1: Task functionalities for each type of VOs
Column headings (VO): Ad hoc connections; Trusting Central Database; Trusting a CA;
Delegation; MyProxy
Row headings (Functionality):
1. U.A accesses named resources R.B
2. U.A is dynamically allocated a resource to perform a task
3. Distributed execution on named resources
4. Distributed execution on dynamically allocated resources
[Six cells are marked X; their positions are not preserved in this transcription.]
We have considered each of the main security aspects (authentication, authorisation,
confidentiality, integrity, availability, and accountability) in turn, and critically
discussed the main security mechanisms for achieving it within each type of VO. We
have also included desirable criteria such as scalability, flexibility, and usability
in the evaluation of each security aspect in each VO type. For the authentication
aspect, the major technical problems have been solved with a CA; however, serious
problems are still to be solved with confidentiality and authorization, and major
problems have not yet been addressed with accountability.
The major contributions of my own research to this report can be highlighted as
follows:
• Adopting a notation (from Hoare’s Communicating Processes) for a clearer
  description of VO entities (users, sites, resources, contact, etc.).
• Making the initial steps towards a more accurate modelling of VOs as
  mathematical structures (as a vector of relevant entities such as CA, sites,
  resources, etc.). This approach to modelling helps to identify clearly and
  concisely the main issue in reconciling security policies among the VO sites.
• Classifying types of VOs according to the trust relationships between users,
  sites, CAs, and third parties.
• Giving critical discussions of the major mechanisms for achieving security
  properties (authentication, authorization, confidentiality, etc.) in each type of
  VO. Some of the benefits and drawbacks of Globus are available in the
  literature, but they have mainly been presented for developers and users.
• Providing the initial groundwork for a seemingly new top-down approach to
  viewing and analysing Grid security based on a careful combination of
  modelling, classical characterisation of security, and concepts of risk analysis.
Finally, I have greatly enjoyed exploring this exciting topic and hope to have the
opportunity of exploring further some of its avenues and challenging problems in the
future.
References
[1] I. Foster, C. Kesselman (eds.). “The Grid: Blueprint for a New Computing
Infrastructure”. Morgan Kaufmann, 1999
[2] D. Gollmann. “Computer Security”. John Wiley, 1999.
[3] I. Foster, C. Kesselman, S. Tuecke “The Anatomy of the Grid: Enabling Scalable
Virtual Organizations”, International Journal of Supercomputer Applications and
High Performance Computing, 2001, 200-222.
[4] I. Foster, C. Kesselman, G. Tsudik and S. Tuecke “A Security Architecture for
Computational Grids”. ACM Conference on Computers and Security, 1998, 83-91.
[5] I. Foster, L. Pearlman, V. Welch, C. Kesselman, S. Tuecke “A Community
Authorisation Service for Group Collaboration”, www.Globus.org
[6] R. Butler, V. Welch, D. Engert, I. Foster, S. Tuecke, J. Volmer, C. Kesselman “A
National-Scale Authentication Infrastructure”, IEEE, Dec. 2000
[7] M. Surridge “A rough Guide to Grid Security”. Issue 1.1a, IT-Innovation centre,
2002.
[8] D. De Roure, M.A. Baker, N. Jennings, and N. Shadbolt. “The Evolution of the
Grid”. Concurrency and Computation: Practice and Experience, Wiley and Sons Ltd.,
June 2002, ISSN 1040-3108
[9] S. Thomas. “SSL and TLS Essentials, Securing the Web”, John Wiley, 2000
[10] F. Piper. “Introduction to cryptography”, Lecture Notes, RHUL, 2003.
[11] Matt Robshaw, Sean Murphy. “Advanced Cryptography”, Lecture Notes, RHUL
2003.
[12] H. Mack, “Public Key Infrastructure in E-Commerce Environments”, E-Commerce
Infrastructure, Lecture notes, Royal Holloway, University of London, 2003.
[13] G. Price, “Public Key Infrastructure: Challenges and Challengers”, Current
development in E-commerce, Lecture Notes, RHUL, 2003
[14] M. D. Harper, Herald information Systems. “Trust, Security and Confidence
Online: The verifier’s perspective”. Current development in e-commerce, Lecture
Notes, RHUL, 2003.
[15] A. Stone. “Network Security: Firewalls”, E-commerce infrastructure, Lecture
Notes, Royal Holloway, University of London, 2003.
[16] V. Welch, F. Siebenlist, I. Foster, J. Bresnahan, K. Czajkowski, J. Gawor, C.
Kesselman, S. Meder, L. Pearlman, S. Tuecke. “Security for Grid services”.
www.globus.org
[17] S. Zaba, “Web Security, SSL”, Network Security, Lecture notes, RHUL, 2003
http://www.isg.rhul.ac.uk/msc/teaching/sec3/sec3.shtml
[18] S. Zaba “Secure Protocols and VPNs (Part 1& 2)”, Network Security, Lecture
notes, RHUL, 2003.
[19] C. Ciechanowicz, “Database Security”, Lecture notes, Royal Holloway, 2003.
[20] I. E. Gilbert, “Guide For Selecting Automated Risk analysis Tools”, Computer
Security Division, NIST.
[21] J. Sherwood, “Security Issues in Internet E-Commerce”, Lecture notes, RHUL,
2003.
[22] B. LaMacchia, S. Lange, M. Lyons, R. Martin, K. Price. “.NET Framework
Security”. Addison-Wesley, 2002.
[23] CNN.com. “Cyber-attacks batter Web heavyweights”. Internet:
http://www.cnn.com/2000/TECH/computing/02/09/cyberattacks.01/index.html%1,
February 2000.
[24] CERT Coordination Center. “Denial of Service Attacks”.
http://www.cert.org/tech_tips/denial_of_service.html, June 2001.
[25] Global Grid Forum www.ggf.org
[26] IBM Grid solutions: http://www-1.ibm.com/grid/solutions/index.shtml
[27] UK E-Science programme: http://www.research-councils.ac.uk/escience/
[28] www.realitygrid.org
[29] www.eu-datagrid.org
[30] www.eurogrid.org
[31] www.ipg.nasa.gov
[32] www.Globus.org
[33] www.ietf.org/rfc/rfc2692.txt?number=2692;
www.ietf.org/rfc/rfc2693.txt?number=2693
[34] www.unicore.org
[35] www.unix.org
[36] www.microsoft.com/windows
[37] www.ibm.com/os2
[38] www.cern.org
Internet:
[39] www.legion.org
[40] www.verisign.com
[41] www.entrust.com
[42] www.rsa.com
[43] www.ietf.org
[44] www.xml.org
[45] www.globus.org/gram
[46] www.globus.org/mds
[47] www.openssl.org
[48] www.openssh.org
[49] www.isg.rhul.ac.uk
[50] www.rhul.ac.uk
[51] L. Coles-Kemp, “Virtual Private Network and IPSEC”, Current development in
E-commerce, Lecture Notes, RHUL, 2003.
[52] R. Sandhu, “Identification and Authentication”, Chapter 16, Computer Security
Handbook, Fourth Edition, Wiley, 2002.
[53] S. Chokhani, “Public Key Infrastructures and Certificate Authorities”, Chapter 23,
Computer Security Handbook, Fourth Edition, Wiley, 2002.
[54] D. Levine, “Auditing Computer Security”, Chapter 23, Computer Security
Handbook, Fourth Edition, Wiley, 2002.
[55] N. Smart, “Cryptography: An Introduction”, McGraw-Hill, 2003.
[56] B. Schneier, “Secrets and Lies: Digital Security in a Networked World”, Wiley,
2000
[57] L.D. Stein, “Web Security: A Step-by-Step reference Guide”, Addison Wesley,
1998.
[58] A. E. Abdallah, P. Ryan and S. Schneider, Formal Aspects of Security, LNCS,
2003.