Good Enterprise Mobility Server™
Deployment Planning Guide
Product Version: 1.1
Doc Rev 2.3
Last Updated: 6-Nov-14
© 2014 Good Technology, Inc.
All Rights Reserved.
Table of Contents

Purpose and Scope
Prerequisites
Pre-Deployment Considerations
Microsoft Windows Server Considerations
Database Server
Hardware
Good Proxy Connections
Scalability
High Availability
Disaster Recovery
Scaling Factors
RTO and RPO
Physical Deployment
Simplest Deployment
Typical Deployment
High Availability (HA)
GEMS-HA Design Principles
HA for Instant Messaging
Load Distribution
Referral
HA for Presence
Load Distribution
HA for Push Notifications
HA Failover Process/Behavior Summary
Additional HA Considerations
Disaster Recovery (DR)
DR Failback Process/Behavior
Phased Approach Recommendation
Deployment with Good Dynamics
Network Separation
Server Instance Configuration in Good Control
Server-Side Services
Conclusion
Appendix A – Upgrading from Good Connect Classic
Upgrade Scenario 1: Parallel Server (Recommended)
Pertinent Considerations in this Scenario
Upgrade Scenario 2: Repurpose Existing Server
Pertinent Considerations in this Scenario
Appendix B – Hardware Used for Testing GEMS
Purpose and Scope
Good Enterprise Mobility Server™ (GEMS) is the designated consolidation of the servers
currently supported by Good. The purpose of this document is to identify the key planning
factors that will influence the performance, reliability, and scalability of your deployed GEMS
configuration, as well as to offer guidance on high availability and disaster recovery options.
The guidance presented herein is intended to help ensure the highest possible levels of
reliability, sustainability, performance, predictability, and end-user availability.
The target audience for this guide includes enterprise IT managers and administrators
charged with evaluating technology and network infrastructure, as well as those responsible
for making corresponding business decisions.
This document does not discuss general GEMS and supporting network installation and
software configuration tasks. Rather, it focuses on infrastructure configuration topics that
require careful consideration when you are first planning your GEMS deployment. For both
general and specific installation and configuration guidance and best practices, see the GEMS
Installation and Configuration Guide.
First, however, a discussion centered on the basics of physical deployment will be helpful.
Prerequisites
The planning information in this document is predicated on the following software releases:
- Good Enterprise Mobility Server (GEMS) – v1.0
- Good Control (GC) – v1.7.38.19
- Good Proxy – v1.7.38.14
- Good Connect Client – v2.3 SR7
- Good Work Client – v1.0
General knowledge of GEMS and the Good Dynamics platform, along with Windows Server
environments employing Microsoft Lync, Exchange, and Active Directory, is likewise required
to effectively plan your GEMS deployment.
Pre-Deployment Considerations
Before attempting to deploy GEMS, you may also need to plan for upgrades to the
supporting environment. Is your existing change management process sufficient, and are all
the required tools handy? If not, you'll need to plan for these as well. In addition, your in-house support team may need to have aspects of its training upgraded.
Other key factors in the deployment of GEMS include the Microsoft Windows Server version
and the machine hosting GEMS, available RAM, number of CPUs, Microsoft Lync Server
version, Microsoft Exchange version, Microsoft SQL Server edition, and the roles and
responsibilities of the IT staff supporting these servers and other vital components of your
production network.
Microsoft Windows Server Considerations
Because GEMS uses Microsoft's Unified Communications Managed API (UCMA) to integrate
Microsoft Lync with the GEMS Connect and Presence services, the OS version required to
run GEMS Connect-Presence depends on the version of Microsoft Lync deployed.
Per guidance from Microsoft, use the following guidelines to determine the version of
MS Windows Server supported by GEMS Connect-Presence:
- For MS Lync 2010 deployments, use Windows Server in one of these 64-bit versions:
  - 2008 R2
  - 2008 R2 SP1
- For MS Lync 2013 deployments, use Windows Server in one of these 64-bit versions:
  - 2008 R2 SP1
  - 2012 R2
- To host the Push Notification Service (PNS) only, use Windows Server in one of these 64-bit versions:
  - 2012 R2
  - 2008 R2 SP1
Database Server
A relational database is required for the GEMS Connect and Push Notification services, but
not the Presence service. This database can be part of your existing environment or newly
installed. GEMS supports Microsoft SQL Server in the versions and editions listed below. In
all cases, the database must be installed and prepared before starting GEMS installation.
This means the necessary SQL scripts included in the GEMS installation zip file must be
executed before beginning GEMS installation proper.
The following versions of MS SQL Server are supported:
- SQL Server 2008 (Express/Standard/Enterprise)
- SQL Server 2008 R2 (Express/Standard/Enterprise)
- SQL Server 2012 (Standard)
- SQL Server 2012 SP1 (Enterprise)
Microsoft provides visual and command-line tools to assist with database and schema
creation, such as SQL Server Management Studio and sqlcmd.
Note that although SQL Server Express can be installed and set up with little effort, it has
limited resources. For most enterprises, the Microsoft SQL Server Standard or Enterprise
edition is recommended.
Hardware
The recommended hardware specifications for each GEMS machine running any
combination of the services offered are captured in the following table:
Component    Specification
CPU          4 vCPU
Memory       16 GB RAM
Storage      50 GB HDD
The specifications listed above are considered sufficient to handle the majority of use cases.
Your specific enterprise environment, combined with your particular traffic and use
requirements, is the key consideration in determining the actual hardware to implement.
Hardware configurations used in testing by Good are listed in Appendix B.
Use Profile Definitions (per server instance) for Push (Mail) Notification
The Mail Push Notification service uses Exchange Web Services (EWS) to watch for messages
sent and received. A user profile is characterized by the number of messages sent and
received by a user in a typical eight-hour day.
Profile   Messages sent/received per mailbox per day   Activated Devices supported per server
Light     50-100                                       40,000
Medium    100-200                                      20,000
Heavy     200-400                                      5,000
For details regarding the user profile used for scale testing, refer to the Microsoft LoadGen
profile to determine which profile best suits your needs. The results of testing
conducted by Good¹ reveal:
Metric                 Medium Profile   Heavy Profile
GEMS CPU Utilization   7%               29%
GEMS IOPS              5 iops           4 iops
SQL CPU Utilization    25%              25%
SQL IOPS               40 iops          45 iops
Use Profile Definitions (per server instance) for Presence
Since Presence is exposed as a Good Dynamics Server-Side Service, it can be used for many
applications and the load will vary depending on the characteristics of the application
invoking the Presence service. Refer to the following table to gauge the load you can place
on a server hosting the Presence service.
Profile   Active Devices (%)   Activated Devices subscribed per server
Medium    20%                  40,000
Heavy     50%                  20,000
The Good Work client also uses the GEMS Presence service. Planning for a larger profile is
recommended when sizing for a Good Work deployment, due to the higher activity inherent
in an email-centric application. The Heavy profile results reported here represent each active
device subscribing to 100 contacts.
¹Good lab test results are reported for the 90th percentile. The 90th percentile is a measure of statistical distribution. Whereas the median is the statistical value
for which 50% of the actual results were higher and 50% were lower, the 90th percentile reports the value for which 90% of the data points are smaller and 10%
are greater. 90th percentile performance metrics are obtained by sorting test result values in increasing order, then taking the first 90% of entries out of this set.
Metric                                  Heavy Contact Profile
GEMS Presence Service CPU Utilization   9.8%
GEMS Memory                             3.5 GB

The Presence service does not use SQL, so there is negligible disk I/O activity. Hence, only
CPU and Memory test results are reflected in the above use profile for Presence.
Use Profile Definitions (per server instance) for Connect
Here, a profile is characterized by the amount of activity generated by users against
enterprise Lync deployments.
Profile   Active Devices (%)   Activated Devices supported per server
Light     5%                   15,000
Medium    10%                  10,000
Heavy     15%                  5,000
The activity used for scale testing followed general guidelines published in the Microsoft Lync
2010 Capacity Planning for Mobility guidance, wherein a user has 60-80 contacts and
initiates ≈4 IM sessions, each lasting ≈6 minutes, with 1 message sent every 30 seconds
during a session. Once again, for a more detailed explanation of user profile and activity
testing, please see Microsoft Lync 2010 Capacity Planning for Mobility.
Resource Consumption During GEMS-Connect Load Tests
4-Core, 16 GB GEMS-Connect

Profile   CPU   Memory   Disk IOPS
Light     55%   8.4 GB   0.000218 MBps/read, 0.000398 MBps/write
Heavy     70%   9.2 GB   0.016115245 MBps/read, 0.000379 MBps/write
Note: For 10,000 activated devices (containers) at a medium (10% average) concurrency,
the DB size will be no more than 1 GB, and IOPS is negligible.
General Performance (for Connect, Presence, and Push (Mail) configured on the same machine)
Due to the modular design of GEMS, you can configure and run all or any of the GEMS
services on the same machine or on different machines. As with all distributed systems,
performance will suffer without adroit load balancing. One exception should always be
made for production environments—do not run SQL Server on the same machine with
other GEMS components.
For lighter loads, or a smaller number of users (under 10,000), Connect, Presence, and Mail
Push Notifications can be configured to run on the same physical machine with a Light or
Medium load as defined in the profiles above. Refer to the general performance outline
below to determine whether (a) all services on the same machine or (b) dedicated servers
for each service best optimizes performance for your particular traffic and load
requirements. Generally, the actual use profile for most enterprises per GEMS instance will
fall somewhere between Light and Heavy.
Light profile testing¹ conducted by Good on the recommended hardware configuration
running all three services reveals the following metrics.
Metric                 Light Profile
GEMS CPU Utilization   60%
GEMS IOPS              17 iops
SQL CPU Utilization    32%
SQL IOPS               55 iops
Good Proxy Connections
From the perspective of the Good Proxy (GP) server, GEMS is an application server. Any
traffic relayed from GEMS to the GP server will consume a concurrent connection session on
the GP server. Consequently, it's important to understand how the individual services in the
GEMS machine interact with the GP server.
- Connect – 1 active device requires 3 connections
- Presence – 1 active device requires 1 connection
- Push (Mail) Notification – 1 active device requires 1 connection for EWS
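To translate these per-device factors into a GP sizing figure, multiply and sum them across the services you plan to run. The sketch below is an illustrative planning calculation only, not a Good-supplied tool, and the example device counts are hypothetical.

```python
# Illustrative sketch: estimate concurrent GP connection sessions consumed by GEMS,
# using the per-active-device factors listed above. Device counts are hypothetical.

CONNECTIONS_PER_ACTIVE_DEVICE = {
    "connect": 3,   # Connect: 3 connections per active device
    "presence": 1,  # Presence: 1 connection per active device
    "push": 1,      # Push (Mail) Notification: 1 EWS-related connection per active device
}

def gp_connections(active_devices_by_service: dict[str, int]) -> int:
    """Sum the concurrent GP connections implied by active devices per service."""
    return sum(
        CONNECTIONS_PER_ACTIVE_DEVICE[service] * count
        for service, count in active_devices_by_service.items()
    )

# Example: 1,000 active Connect devices, 2,000 active Presence devices,
# and 10,000 devices registered for push notifications.
demand = gp_connections({"connect": 1_000, "presence": 2_000, "push": 10_000})
print(f"Estimated concurrent GP connections: {demand}")  # 3*1000 + 2000 + 10000 = 15000
```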
¹Again, Good lab test results are reported for the 90th percentile. The 90th percentile is a measure of statistical distribution. Whereas the median is the statistical
value for which 50% of the actual results were higher and 50% were lower, the 90th percentile reports the value for which 90% of the data points are smaller and
10% are greater.
Scalability
GEMS scales linearly. For this reason, and given the specifications cited, you can create
additional capacity by adding more GEMS machines. You will then need to scale out the
database and Good Proxy resources accordingly to account for the additional capacity.
See Scaling Factors below for best practices on utilization measurement.
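Because capacity grows linearly with the number of instances, a first-pass estimate of cluster size can be computed from the per-server capacities in the use-profile tables above. The following sketch is a planning aid only; the N+1 spare for failover and the example device count are assumptions to adjust for your own environment.

```python
import math

# Activated devices supported per GEMS server, taken from the use-profile tables above.
PER_SERVER_CAPACITY = {
    "push_medium": 20_000,      # Push (Mail) Notification, Medium profile
    "push_heavy": 5_000,        # Push (Mail) Notification, Heavy profile
    "connect_medium": 10_000,   # Connect, Medium profile
    "presence_medium": 40_000,  # Presence, Medium profile
}

def instances_needed(activated_devices: int, capacity_per_server: int,
                     spare_for_failover: int = 1) -> int:
    """Round up to whole servers, then add spares (N+1 by default) for HA."""
    return math.ceil(activated_devices / capacity_per_server) + spare_for_failover

# Example: 35,000 activated devices on the Push (Mail) Notification Medium profile.
print(instances_needed(35_000, PER_SERVER_CAPACITY["push_medium"]))  # 2 servers + 1 spare = 3
```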
High Availability
Hardware failure, data corruption, and physical site destruction all pose threats to GEMS
services availability. You improve availability by identifying the points at which these services
can fail. Increasing availability means reducing the probability of failure.
Ultimately, availability is a function of whether a particular service is functioning
properly. Think of availability as a continuum, ranging from 100 percent (a completely
fault-tolerant system or service that never goes offline) to 0 percent (never available, never
works).
Well-planned HA systems and networks typically have redundant hardware and software
that makes them available despite failures. Well-designed high availability systems avoid
single points-of-failure. Any hardware or software component that can fail has a redundant
component of the same type.
When failures occur, the failover process moves processing performed by the failed
component to the backup component. This process remasters system-wide resources,
recovers partial or failed transactions, and restores the system to normal, preferably within
a matter of microseconds. The more transparent failover is to users, the higher the
availability of the system.
In any event, you cannot manage what you cannot measure, so two planning elements are
vital before anything else. The first is determining the hardware required to manage and
deliver the IT services in question, the basis for which is outlined above. The second is
measuring, as accurately as possible and with adequate allowance for growth, the number of
devices, traffic, and load likely to be placed on GEMS and its services; this offers the best
indication of the server hardware and supporting infrastructure likely to be required.
Concentrating solely on GEMS with Connect and Presence and its supporting architecture,
the first objective in setting the goals of a high availability/disaster recovery (HA/DR)
investment strategy is to develop a cost justification model for the expense required to
protect each component. If the expense exceeds the value provided by the application and
data furnished to the business, plus the cost to recover it, then optimizing the protection
architecture to reduce this expense is an appropriate course of action.
See High Availability (HA) below for a general discussion of HA options and alternatives.
Disaster Recovery
Your data is your most valuable asset for ensuring ongoing operations and business
continuity. Disasters, unpredictable by nature, can strike anywhere at any time with little or
no warning. Recovering both data and applications from a disaster can be stressful,
expensive, and time consuming, particularly for those who have not taken the time to think
ahead and prepare for such possibilities. However, when disaster strikes, those who have
prepared and made recovery plans survive with comparatively minimal loss and/or
disruption of productivity. Establishing a recovery site for failover if your primary site is
struck by a disaster is crucial.
Good recommends mirroring your entire primary site configuration at the DR site, complete
with the provision for synchronous byte-level replication of your SQL databases. This is
because if the system does fail, the replicated copy is up to date. To avoid a “User Resync”
situation, the replica must also be highly protected.
See Disaster Recovery (DR) below for a discussion of Good's DR recommendations for GEMS.
Scaling Factors
The scale of your GEMS deployment is largely dependent on the size of your enterprise and
its IT logistics—number of sites, distance between sites, number and distribution of mobile
users, traffic levels, latency tolerance, high availability (HA) requirements, and disaster
recovery (DR) requirements.
With respect to HA/DR, two elements must be considered—applications and data. Most
commonly, though not exclusively, HA refers to applications; i.e., GEMS Connect and
Presence.
With clustering, there is a failover server for each primary server (2xN). DR focuses on both
applications and data availability. The primary driver of your DR solution is the recovery
time objective (RTO). RTO is the maximum time and minimum service level within which a
business process must be restored after a disaster to avert an unacceptable break in
business continuity.
Before contemplating the optimal number of servers to be deployed, however, it’s wise to
first determine the right size of an individual server to meet your enterprise’s “normal use”
profile. There are a number of methods for projecting a traffic and use profile. Actual, real-world measurement is recommended and made easy using built-in Windows Performance
Monitoring tools. Notwithstanding the method applied, it is important to remember that
GEMS performance is governed by two principal factors: CPU utilization and available
memory, the former being somewhat more critical than the latter.
RTO and RPO
For GEMS deployment planning purposes, the first step in defining your HA/DR planning
objective is to balance the value of GEMS and the services it provides against the cost
required to protect it. This is done by setting a recovery objective. This recovery objective
includes two principal measurements:
- Recovery Time Objective (RTO) – the duration of time and a service level within which
the business process must be restored after a disaster (or disruption) to avoid
unacceptable consequences associated with a break in business continuity. For instance,
the RTO for a payroll function may be two days, whereas the RTO for mobile
communications furnished by GEMS to close a sale could be a matter of minutes.
- Recovery Point Objective (RPO) – the place in time (relative to the disaster) at which
you plan to recover your data. Different business functions will and should have
different recovery point objectives. RPO is expressed backward in time from the point of
failure. Once defined, it specifies the minimum frequency with which backup copies must
be made.
Obviously, if resources were fully abundant and/or free, then everything could have the best
possible protection. Plainly, this is never the case. The intent of HA/DR planning is to ensure
that available resources are allocated in an optimum fashion.
Physical Deployment
A production deployment of GEMS requires a clustered configuration, plus consideration
given to integration with the Good Dynamics server infrastructure and with your existing
enterprise systems. Here, it's important to understand the definition of a "GEMS cluster"
and an "instance" within that cluster.
An "instance" is any individual deployment of GEMS, with any combination of services
provided by its Java tier and its .NET tier. An instance of GEMS usually runs on one physical
machine. However this is not mandatory. The same physical machine could be used to
deploy multiple instances of GEMS with service endpoints that listen in different ports.
A GEMS cluster is just a group of instances. Within a GEMS cluster, each instance is identical
in that they all expose the same services and share a common database. Instances in a
cluster can be considered "active / active" in that there is no concept of a "passive" instance
used for failover. Even so, instances in a cluster never communicate with each other or
synchronize data.
All GEMS instances in a cluster are homogeneous in that they all expose exactly the same
service(s). This means that when an application is configured in the GC with a list of server
endpoints, any of these server endpoints can be expected to provide the same service used
by the application. This strategy also promotes ease of horizontal scale/replication, as well as
ease of hardware failure correction by swapping in pre-built spares.
Simplest Deployment
The simplest production deployment of GEMS in a corporate network (depicted below)
comprises:
- One Microsoft Lync Server and one Microsoft Exchange server deployed in a corporate
network, and one database.
- A single GEMS cluster made up of two physical instances (for failover). This cluster
provides all services (Presence, Instant Messaging, Push Notifications, and Exchange
Integration) for all device clients.
- One Good Proxy (GP) server with affinity configured to both instances in the GEMS
cluster, along with only one Good Control (GC) server.
Typical Deployment
Expanding on the simplest configuration, a typical deployment, adhering to generally
accepted IT practices, offers high availability (HA) service access within data centers, rather
than geographically distributed disaster recovery (DR) sites between data centers.
Here, there are two geographical regions (UK and US) to which GEMS clusters are deployed,
furnishing device clients access to the services provided by GEMS.
Two Microsoft Lync Pools are deployed—one in each geographical region. Device clients in
each region are provided access to the Presence service and Connect (IM) service by a GEMS
cluster configured to use the Microsoft Lync Pool infrastructure in that region.
There is only one GEMS cluster in the UK region (Cluster #1), and it provides the Presence
and Connect services.
Two GEMS clusters (Clusters #2 and #3) are deployed in the US Region. Cluster #2 provides
the Presence and Connect services for device clients in the US Region, whereas Cluster #3
provides the Email (Push) Notification service for device clients in both regions. In this
example, only two physical instances are required for HA.
As seen above, there is a separate GP Cluster deployed in each region. GP servers in each
cluster are configured to have affinity to the GEMS cluster(s) used by device clients in their
region.
Only one GC cluster is necessary. It is deployed in the US Region and used by the proxy
servers in both GP clusters for both regions.
High Availability (HA)
Availability is measured in terms of outages, which are periods of time when the system is
not available to users. Your HA solution must provide as close to an immediate recovery
point as possible while ensuring that the time of recovery is faster than a non-HA solution.
Unlike with disaster recovery, where the entire system suffers an outage, your high
availability solution can be customized to individual GEMS resources and services.
HA for GEMS means that the runtime and Service APIs for Push Notifications, Presence, and
Connect are unaffected from the perspective of a device client whenever any instance of
GEMS goes down or any of its services stop working.
GEMS-HA Design Principles
Services provided by GEMS instances should not differ in their approach to:
i. even distribution of work over instances,
ii. detection of instance failure, and
iii. reallocation of work for existing users.
Hence, the following design principles are followed for all services:
- Shared Storage – Achieves HA/DR by adopting a shared storage model; where
possible, services provided by GEMS instances are stateless so that device clients can
select any GEMS instance regardless of where they may have been previously connected.
- Client-Side Load Balancing – Clients know the list of server endpoints in a GEMS cluster
(with affinity to their GP cluster), and service requests are evenly distributed to those
server endpoints via client-side load balancing.
- Heartbeat – Services on each instance are responsible for reporting their own health in
the shared database.
- Elected Health Watcher pattern – One instance in the cluster is chosen through an
election algorithm to watch the health of all the others, and then centrally coordinate
work load distribution in response to a failed instance. All instances can be watchers, and
the election algorithm provides failover for watchers. (A minimal sketch of this pattern
follows this list.)
- User tables in Shared Storage – To aid failover, the database can be used to determine
which instance in a GEMS cluster is currently being used to handle work for which end
users.
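To make the Heartbeat and Elected Health Watcher principles concrete, the sketch below illustrates the general pattern: instances write heartbeats to shared storage, a watcher is elected by a simple rule, and work owned by a stale instance is reassigned. This is an illustration only; the heartbeat window, the "lowest ID wins" election, and the data structures are assumptions, not the GEMS implementation.

```python
import time

HEARTBEAT_TIMEOUT_SECONDS = 30  # assumed window; the actual GEMS window may differ

# Stand-ins for the shared database: last heartbeat time per instance,
# and the instance currently assigned to each user's work (illustrative data).
heartbeats = {"gems-1": time.time(), "gems-2": time.time() - 120, "gems-3": time.time()}
work_assignments = {"alice@example.com": "gems-2", "bob@example.com": "gems-1"}

def healthy_instances(now: float) -> list[str]:
    """Instances whose heartbeat is fresh enough to be considered online."""
    return [i for i, ts in heartbeats.items() if now - ts <= HEARTBEAT_TIMEOUT_SECONDS]

def elected_watcher(now: float) -> str:
    """Trivial election: the healthy instance with the lowest ID acts as watcher."""
    return min(healthy_instances(now))

def watcher_pass(now: float) -> None:
    """Watcher reassigns work owned by instances whose heartbeat has gone stale."""
    alive = set(healthy_instances(now))
    for user, owner in work_assignments.items():
        if owner not in alive:
            new_owner = sorted(alive)[0]  # any healthy instance could be chosen
            work_assignments[user] = new_owner
            print(f"{elected_watcher(now)} reassigned {user}: {owner} -> {new_owner}")

watcher_pass(time.time())  # gems-2 is stale, so alice@example.com moves to gems-1
```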
HA for Instant Messaging
Instant Messaging (IM) is provided by the GEMS-Connect service to the Good Connect
client.
Load Distribution
Client devices are aware of a list of endpoints (server instances in the same GEMS cluster)
which they can contact for the Connect (IM) service. Each user session is kept up to date in
the database, including which server instance is currently handling the user session. If a
server instance receives a request for a user it has not yet served, it first looks for any
sessions the user may have in the database, and may respond with a 503 referral to a
different instance that is already holding a live session for that user. The Connect client
cooperates by obeying referral responses.
Referral
Server instances can be marked "offline" in the database due to a heartbeat failure or
because another instance in the GEMS cluster has determined that it is offline. If the server
of record for the user is offline, the newly contacted server can adopt the user session,
dynamically establish a session with Lync on behalf of the user, and then transition the user
to the new server. Marking a server offline does not block requests from being routed to it
directly, but it does prevent other servers from referring incoming requests to it.
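The load-distribution and referral behavior can be summarized as a small decision routine: serve the request locally, refer the client (503) to the instance already holding a live session, or adopt the session if that instance is marked offline. The sketch below illustrates that flow only; the session table layout and instance names are assumptions, not GEMS code.

```python
from dataclasses import dataclass

@dataclass
class Session:
    owner: str    # GEMS instance currently handling this user's IM session
    online: bool  # whether that instance is currently marked online in the DB

# Stand-in for the shared database of user sessions (illustrative data).
sessions = {"alice@example.com": Session(owner="gems-2", online=True),
            "bob@example.com": Session(owner="gems-3", online=False)}

def handle_request(user: str, this_instance: str) -> str:
    """Decide how a GEMS instance responds to a Connect request for a user."""
    session = sessions.get(user)
    if session is None or session.owner == this_instance:
        sessions[user] = Session(owner=this_instance, online=True)
        return "200 serve request locally"
    if session.online:
        # Another instance already holds a live session: refer the client there.
        return f"503 referral to {session.owner}"
    # Owner is offline: adopt the session and re-establish it with Lync here.
    sessions[user] = Session(owner=this_instance, online=True)
    return "200 adopted session from offline instance"

print(handle_request("alice@example.com", "gems-1"))  # 503 referral to gems-2
print(handle_request("bob@example.com", "gems-1"))    # adopted from offline gems-3
```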
HA for Presence
The Presence Service provided by GEMS consists of an HTTP service called by device clients
and a "Lync Presence Provider" that integrates with a Lync Pool deployment (.NET). GEMS
clusters used for the Presence and Connect Services will be specialized for this purpose, even
though they are capable of supporting Push Notifications and Exchange integration.
Put another way, the Presence service is deployed with Connect following the Connect
deployment pattern with the Lync infrastructure.
Load Distribution
Device clients can use any instance in the GEMS cluster to establish multiple different
Presence subscriptions; for example, matching a list of Contacts, Email participants, or a GAL
Search. Moreover, multiple instances in the GEMS cluster can all reuse the same Lync
Presence subscription. Presence subscriptions are not long lived and they are not suitable
for storage in a database. Instead, they are stored in a persistent cache shared by all
instances in the GEMS cluster, where they readily expire.
The persistent cache is used to maintain a timestamp for each subscription which is used by
the Presence service to determine what new presence information to provide to the client
on request.
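One way to picture the shared subscription cache is as a keyed store of short-lived subscriptions, each carrying an expiry and a last-delivery timestamp that any instance can consult. The sketch below is purely illustrative; the cache keys, TTL value, and update format are assumptions rather than the actual GEMS schema.

```python
import time

SUBSCRIPTION_TTL_SECONDS = 300  # assumed time-to-live; actual GEMS expiry may differ

# Stand-in for the persistent cache shared by all instances in the GEMS cluster.
# Key: (user, subscription id); value: contacts watched plus bookkeeping timestamps.
cache: dict[tuple[str, str], dict] = {}

def subscribe(user: str, sub_id: str, contacts: list[str]) -> None:
    """Create or refresh a short-lived presence subscription in the shared cache."""
    cache[(user, sub_id)] = {"contacts": contacts,
                             "last_delivered": 0.0,
                             "expires": time.time() + SUBSCRIPTION_TTL_SECONDS}

def poll(user: str, sub_id: str, updates: list[tuple[str, float]]) -> list[str]:
    """Return contacts whose presence changed since the subscription's timestamp."""
    entry = cache.get((user, sub_id))
    if entry is None or entry["expires"] < time.time():
        cache.pop((user, sub_id), None)  # expired: the client must resubscribe
        return []
    fresh = [c for c, changed_at in updates
             if c in entry["contacts"] and changed_at > entry["last_delivered"]]
    entry["last_delivered"] = time.time()
    return fresh

subscribe("alice@example.com", "contacts-list", ["bob@example.com", "carol@example.com"])
print(poll("alice@example.com", "contacts-list", [("bob@example.com", time.time())]))
```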
HA for Push Notifications
The High Availability objective for Push Notifications is that device clients should be able to
register once for Push Notifications, and not be impacted by servers that manage those
notifications going up and down. Even though device clients can be expected to eventually
resubscribe, the GEMS HA design does not depend on them doing so.
Incoming push registrations are directed at random to any instance of GEMS in a cluster.
There is no affinity to server instances for device clients based on mailbox. If the push
registration already exists in the shared database, then it is assumed that one instance in the
cluster is already managing an EWS Listener subscription for that user. No new action needs
to be taken, except to reset the watermark of the push registration in the database for aging
purposes.
The EWS Listener Service on each instance of GEMS is responsible for periodically updating
its own health status in the shared database. If the EWS Listener Service for any one
instance fails to refresh its own health status within an expected time window, then it is
considered down. One instance in the cluster is elected as a "Watcher" for this condition,
whereupon it is responsible for instructing another instance to take over (and recreate) EWS
subscriptions for user mailboxes that were currently attributed to the dead instance. This is
done by updating Push Registrations in the shared database to reflect the new instance
upon which the EWS Listener Service should manage those user mailboxes. When the dead
instance comes back, it is just another instance that is ready to manage new push
registrations.
HA Failover Process/Behavior Summary
GEMS can scale horizontally and offers N+1 redundancy. This offers the advantage of
failover transparency in the event of a single component failure. This level of resilience is
referred to as active/active (a.k.a. "hot") because backup components actively participate in
the system during normal operation. Failover is generally transparent to the user since the
backup components are already active within the system.
In adequately configuring your GEMS components for this redundancy, the following
measures must be taken:
1. Configure additional GEMS machines in a cluster to use the same underlying SQL Server
database. This is done through the GEMS Dashboard.
2. Configure the additional GEMS Hosts in Good Control. This configuration can happen in
two locations within Good Control, depending on your deployment model.
Once configured, each client receives a list of supported GEMS servers during startup of the
Good Connect or Good Work app. The client then chooses a server from the list at random
and continues to use that server for the life of the user's session.
A session constitutes an active login with the system and persists until the user manually
signs out or a 24-hour period (configurable) elapses, whichever comes first.
Should one server fail, the client will retry additional servers from the list until it can
successfully log in. Any existing active user session will be seamlessly transferred to the new
server.
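The client-side behavior described above (pick a server at random from the provisioned list, keep it for the life of the session, and fall back to the remaining servers on failure) follows a simple pattern, sketched below. This is an illustration of the pattern only, using a hypothetical connect() call; it is not the Good Connect or Good Work client code.

```python
import random

def choose_and_connect(gems_servers: list[str], connect) -> str:
    """Pick a GEMS server at random, retrying the rest of the list on failure."""
    candidates = gems_servers[:]
    random.shuffle(candidates)
    for server in candidates:
        try:
            connect(server)   # hypothetical login/connect call
            return server     # keep this server for the life of the session
        except ConnectionError:
            continue          # try the next server in the shuffled list
    raise ConnectionError("No GEMS server in the provisioned list is reachable")

# Example with a fake connect() that simulates one failed server.
def fake_connect(server: str) -> None:
    if server == "https://gems1.example.com":
        raise ConnectionError("server down")

session_server = choose_and_connect(
    ["https://gems1.example.com", "https://gems2.example.com"], fake_connect)
print(f"Session established with {session_server}")
```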
Detailed HA configuration steps are available in the GEMS Installation and Configuration
Guide.
Additional HA Considerations
After adding servers for HA, each client must update its policies in order for it to be aware of
the new systems. Policy updates are automatically performed each time the client is
launched or a new policy is detected. However, the update could be delayed if the Good
Control (GC) server is overburdened with update requests. As of the current release, each
GC server can process two policy updates per second. Thus, it is important to scale your GC
servers to match your policy update requirements.
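As a rough planning check, the stated throughput of two policy updates per second per GC server bounds how long a policy change takes to reach all clients. The sketch below is only an estimate based on that figure; real-world throughput will vary.

```python
UPDATES_PER_SECOND_PER_GC = 2  # per the current-release figure cited above

def policy_rollout_minutes(clients: int, gc_servers: int) -> float:
    """Estimate the time for every client to receive an updated policy."""
    return clients / (UPDATES_PER_SECOND_PER_GC * gc_servers) / 60

# Example: 20,000 clients against 2 GC servers -> roughly 83 minutes.
print(f"{policy_rollout_minutes(20_000, 2):.0f} minutes")
```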
If you are using server affinities, these settings will need to be adjusted to account for the
new servers.
Disaster Recovery (DR)
Disaster Recovery is different from High Availability among instances of a GEMS cluster in
that an entire cluster in one region has become unavailable and device clients need to be
redirected to a GEMS cluster in a different region that provides the same services.
The DR model for a GEMS cluster in a data center is to have another identically configured
GEMS cluster in a different data center (for failover) that shares the same storage through a
replication strategy provided by the vendor of the database and file system. This is the same
strategy prescribed by Good Dynamics for disaster recovery of a GC cluster.
Diagrammed below is a typical pattern for Disaster Recovery using a Primary and a Standby
data center. Note that although the GEMS cluster illustrated is used for Presence and
Connect, the pattern should be identical for a GEMS cluster used for Push Notifications and
Exchange integration.
Note: Virtual IP is commonly employed by IT for failover, but it is not mandated by Good
Dynamics in this case. GP Clusters already have a "primary", "secondary" and "tertiary"
configuration with respect to an application managed in the GC.
A Load Balancer with Virtual IP is used to route device traffic to a GP cluster in the primary
datacenter. This GP cluster has affinity to the GEMS instances in a GEMS cluster likewise
located in the primary datacenter.
The Load Balancer is responsible for periodic health checks of the GEMS cluster in the
primary datacenter. If the health check fails, then the Load Balancer initiates failover to the
GEMS cluster in the standby datacenter. Device clients are then routed to a GP cluster with
affinity to server instances in that GEMS cluster.
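Conceptually, the Load Balancer's role is a periodic health probe with a switch between the primary and standby clusters, as sketched below. The probe URLs, interval, and failure threshold are assumptions for illustration; in practice you would use the health-check mechanism of your load balancer product.

```python
import time
import urllib.request

PRIMARY = "https://gems-primary.example.com/health"   # hypothetical probe endpoints
STANDBY = "https://gems-standby.example.com/health"
FAILURE_THRESHOLD = 3        # assumed consecutive failures before failing over
CHECK_INTERVAL_SECONDS = 30  # assumed probe interval

def is_healthy(url: str) -> bool:
    """Simple HTTP probe: healthy if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except OSError:
        return False

def monitor() -> None:
    """Route to the primary cluster; fail over after repeated failed probes."""
    active, failures = PRIMARY, 0
    while True:
        if is_healthy(PRIMARY):
            active, failures = PRIMARY, 0  # fail back once the primary recovers
        else:
            failures += 1
            if failures >= FAILURE_THRESHOLD:
                active = STANDBY           # initiate failover to the standby cluster
        print(f"Routing device traffic via the cluster behind: {active}")
        time.sleep(CHECK_INTERVAL_SECONDS)
```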
The database in the standby datacenter is replicated from the production database in the
primary datacenter. However, any state—such as Presence subscriptions and active Lync
conversations—would be lost and must be recovered as clients submit subsequent
requests.
Good Control server instances in both the standby datacenter and the primary datacenter
are in the same GC cluster because they all use replicas of the same shared storage. The only
difference is that GC server instances in the standby datacenter have affinity with the GP
cluster in the same datacenter.
When the Health Check indicates that the primary datacenter is available once again, the
Load Balancer will initiate failover back to the GEMS cluster in the primary datacenter.
With respect to push notifications, when a DR failover happens, device clients must
resubscribe using the Push Notification Service (PNS) provided by the GEMS cluster in the
standby datacenter. There is no expectation that EWS Listener subscriptions for existing
users will be automatically recreated.
DR Failback Process/Behavior
Assuming the DR site is properly configured, failover should be transparent to the end user.
As noted earlier, the client is aware of multiple GC, GP, and GEMS servers to which it can
connect. In the event that the primary site goes offline, GEMS clients will try to connect to
the services in the secondary site.
Before failing back, you must make sure that the secondary database is synchronized with
the primary database. Update DNS accordingly to remap infrastructure resources. From a
client perspective, the user may need to quit and relaunch the app. In most cases, however,
the process will be transparent to the end user, and the app will reconnect to the primary
resources once the primary site comes back online.
Phased Approach Recommendation
Clearly, the key to a successful GEMS disaster recovery event is proper planning. To this end,
the following phased approach is recommended:
Phase 1 – Ensure and verify that all services are working properly in the primary site before
introducing DR.
Phase 2 – Independent of GEMS, test and verify that the infrastructure is set up properly in
the secondary site. This includes, but is not limited to, AD, SQL, and Lync.
Phase 3 – Add additional GC, GP and GEMS machines in the secondary site as appropriate.
Phase 4 – Update configuration to include new GC, GP and GEMS machines.
Phase 5 – Test a Failover/Failback.
Deployment with Good Dynamics
A number of factors bear consideration in appropriately deploying GEMS services with an
existing or newly established GD infrastructure.
Network Separation
Good Control instances in a GC cluster do not need to be reachable by GEMS instances in a
GEMS cluster. This may be desirable to an IT administrator since GC instances could be
installed with high privilege service accounts to perform Kerberos Constrained Delegation
(KCD) and may hold sensitive security tokens. In such cases, GC clusters and GEMS clusters
can be deployed in different network zones separated by a trust boundary.
Server Instance Configuration in Good Control
Device clients are able to access GEMS instances in a GEMS cluster because each individual
network endpoint for each instance in the cluster has been configured in a "Server List". This
is the list of endpoints provided to a device client identified by its application ID. For
example, a device client activated with a deployment of Good Control as configured below
would be presented with three network endpoints to use for access to Services in a GEMS
cluster.
Not shown here is the ability to associate user groups to each network endpoint. This
permits assignment of users to a GEMS cluster accessed via the GP cluster in their region, as
described earlier.
These network endpoints configured in the GC do not reflect any physical deployment
topology for the actual server instances. IT departments rely on separate infrastructure for
routing within the enterprise and across sites. In fact, an IT department may employ a VPN,
router, load balancer, or other infrastructure configuration behind each of these device-facing
network endpoints. Note also that network endpoints configured in this way are
implicitly whitelisted by the GC.
Server-Side Services
Service names for each service provided by GEMS are registered on the Good Dynamics
Network along with a service definition. An "application" is then created in Good Control and
has bound to it one or more Service Definitions. In the example below there is an
"application" called "com.g3.good.presence" and it has been bound to one server-side
service called, "G3 Presence Service". Note that the application concept here does not
represent an app on a device. Rather, it is a construct that can be used to entitle user and
group access to the service(s) that are bound to it.
Now, when a user who is entitled to this Application ID uses any GD application on their
device, the device client is informed of this server-side service, plus all the network
endpoints for it (via the "Application" entitlement in the GC), as illustrated above in Server
Instance Configuration in Good Control.
Conclusion
In the most optimistic scenario, practically speaking, a GEMS cluster exposes all GEMS
services and has two physical instances for failover: a simple system to manage. However,
in large enterprises, IT organizations typically choose to deploy GEMS in a manner consistent
with their existing enterprise systems, matching how Microsoft Lync and Exchange are
deployed.
The deployment architecture and HA design principles for GEMS are, in essence, identical to
those of Good Dynamics. This consistency becomes increasingly necessary as GEMS seeks
to provide the runtime environment for GD Server-Side Services, and ultimately to replace
the Application Server runtime environment for Good Control.
Appendix A – Upgrading from Good Connect Classic
Good Enterprise Mobility Server (GEMS) with Connect and Presence (CP) services is built on
a different platform than the classic Good Connect server. As a result, there is no direct
upgrade path from the classic Good Connect server to GEMS with Connect and Presence.
For existing classic Good Connect server environments, please review the guidance that
follows when upgrading to GEMS with Connect and Presence.
The guidance found here covers two of the most common upgrade scenarios. It is not
intended to be a step-by-step upgrade procedure, but rather a general overview of the
process as a whole. Knowledge of the classic Good Connect server is required. Where
appropriate, cross-references to more detailed instructions are indicated.
Upgrade Scenario 1: Parallel Server (Recommended)
In this scenario a new server is provisioned for GEMS with Connect and Presence to run in
parallel with the existing classic Good Connect Server. The benefit is that no service
interruption is required on the existing Good Connect system while GEMS is deployed.
The parallel server upgrade environment can be generally depicted as follows:
Pertinent Considerations in this Scenario
Good Dynamics
We recommend that you upgrade Good Control to v1.7.38.19 and Good Proxy to
v1.7.38.14 in preparation for the installation of GEMS.
Service Account
The service account used for the classic Good Connect server can also be used for GEMS.
Database
A new schema (Oracle) or database (MS SQL) will need to be created for use by the new
GEMS installation.
Microsoft Lync Configuration
Your existing classic Good Connect Lync application pool can be reused. However, the new
GEMS machine must be added as a Trusted Application computer. If you are planning to use
the Presence service as well, an additional Application ID will need to be created. Please see
the GEMS Installation and Configuration Guide for details.
GEMS Host Machine SSL/TLS Certificate
The new GEMS machine will need its own (unique) SSL/TLS certificate. Please see the GEMS
Installation and Configuration Guide for additional detail regarding setting up the SSL/TLS
certificate.
Good Control Configuration
The “Good Connect” application configuration in Good Control will need to be updated to
include the new GEMS-Connect service.
Caution: To minimize interruption to production users, Good Connect server affinities
should be set up prior to updating the Good Connect application configuration. It is
recommended that you set up two policies: one with user affinity to the classic Good
Connect server, and another with affinity to GEMS-Connect.
When you schedule your users to be switched over to the new server, make sure you ask
them to sign out of their Connect client prior to the maintenance window.
Verification/Testing
Verify that clients can connect to the GEMS-Connect service. This can be done by assigning a
user to a policy that contains the new GEMS-Connect service.
Moving Users
After testing is complete, all users can be moved to GEMS by updating the users' policy set.
Specifically, update the server affinity to point to the new GEMS machine. As mentioned
above under Good Control Configuration, it is also recommended that when you schedule
users to be switched over to the new server, you ask them to sign out of their Connect client
prior to the maintenance window.
Classic Good Connect Server
After all users have been moved to the new GEMS machine, the old classic Good Connect
server can be decommissioned or repurposed.
Upgrade Scenario 2: Repurpose Existing Server
In this scenario the existing classic Good Connect server will be repurposed for GEMS. As
pointed out previously, a direct upgrade on the same machine running classic Good Connect
is not possible. The existing classic Good Connect server software must be uninstalled
before the GEMS software is installed. The benefit of this approach is that a new server is
not needed. This means, however, that service on your production Good Connect server will
be interrupted.
The existing server upgrade environment can be generally depicted as follows:
Pertinent Considerations in this Scenario
Good Dynamics
We recommend that you upgrade Good Control to v1.7.38.19 and Good Proxy to
v1.7.38.14 in preparation for the installation of GEMS.
Service Account
The service account used for the classic Good Connect server can be used for GEMS.
Database
You will need to run the DDL/DML database scripts for Oracle or MS SQL to reset the
schema or database used by the GEMS product.
Microsoft Lync Configuration
The existing classic Good Connect Lync application pool and Trusted Application Computer
can be reused. Again, if you are planning to use the Presence service, an additional
Application ID will need to be created. See the GEMS Installation and Configuration Guide for
details.
GEMS Host Machine SSL Certificate
If the FQDN of the server did not change, the existing SSL certificate can be reused;
however, if you are planning to use the Presence service, the certificate will need to be
updated with a SAN to include the Presence service App ID. Consult the relevant section in
the GEMS Installation and Configuration Guide for additional instructions.
Good Control Configuration
If the FQDN of the server did not change, then the “Good Connect” application
configuration in Good Control can remain the same. Please ask users to sign out of Good
Connect prior to the upgrade since their temporary session information on the server will be
lost during the upgrade process.
Verification/Testing
Verify that both existing and newly provisioned clients can connect to the GEMS-Connect
service.
Appendix B – Hardware Used for Testing GEMS
The following computer hardware was used for PSR validation.
Component: EWS Push (Mail) Notification
Processor: AMD Opteron 6234 2.4 GHz – 4 vCPU
Memory: 16 GB
OS: Microsoft Windows Server 2008 R2 Enterprise 64-bit

Component: Connect
Processor: AMD Opteron 6234 2.4 GHz – 4 vCPU
Memory: 16 GB
OS: Microsoft Windows Server 2008 R2 Enterprise 64-bit

Component: Presence
Processor: AMD Opteron 6234 2.4 GHz – 4 vCPU
Memory: 16 GB
OS: Microsoft Windows Server 2008 R2 Enterprise 64-bit

Component: Connect, Presence, and EWS Push (Mail) configured on the same machine
Processor: AMD Opteron 6378 2.39 GHz – 4 cores Virt
Memory: 16 GB
OS: Microsoft Windows Server 2008 R2 Enterprise 64-bit

Component: SQL Server for GEMS
Processor: AMD Opteron 6234 2.4 GHz – 4 vCPU
Memory: 8 GB
OS: Microsoft Windows Server 2008 R2 Enterprise 64-bit / MS SQL Server 2008 R2
Note: This hardware profile was used for all GEMS PSR testing. All service configurations
were tested running SQL Server on a separate machine.
Legal Notice
This document, as well as all accompanying documents for this product, is published by Good
Technology Corporation (“Good”). Good may have patents or pending patent applications,
trademarks, copyrights, and other intellectual property rights covering the subject matter in these
documents. The furnishing of this, or any other document, does not in any way imply any license to
these or other intellectual properties, except as expressly provided in written license agreements
with Good. This document is for the use of licensed or authorized users only. No part of this
document may be used, sold, reproduced, stored in a database or retrieval system or transmitted in
any form or by any means, electronic or physical, for any purpose, other than the purchaser’s
authorized use without the express written permission of Good. Any unauthorized copying,
distribution or disclosure of information is a violation of copyright laws.
While every effort has been made to ensure technical accuracy, information in this document is
subject to change without notice and does not represent a commitment on the part of Good. The
software described in this document is furnished under a license agreement or nondisclosure
agreement. The software may be used or copied only in accordance with the terms of those written
agreements.
The documentation provided is subject to change at Good’s sole discretion without notice. It is your
responsibility to utilize the most current documentation available. Good assumes no duty to update
you, and therefore Good recommends that you check frequently for new versions. This
documentation is provided “as is” and Good assumes no liability for the accuracy or completeness of
the content. The content of this document may contain information regarding Good’s future plans,
including roadmaps and feature sets not yet available. It is stressed that this information is nonbinding and Good creates no contractual obligation to deliver the features and functionality
described herein, and expressly disclaims all theories of contract, detrimental reliance and/or
promissory estoppel or similar theories.
Legal Information
© Copyright 2014. All rights reserved. All use is subject to license terms posted at
www.good.com/legal. GOOD, GOOD TECHNOLOGY, the GOOD logo, GOOD FOR ENTERPRISE, GOOD
FOR GOVERNMENT, GOOD FOR YOU, GOOD APPCENTRAL, GOOD DYNAMICS, SECURED BY GOOD,
GOOD MOBILE MANAGER, GOOD CONNECT, GOOD SHARE, GOOD TRUST, GOOD VAULT, and GOOD
DYNAMICS APPKINETICS are trademarks of Good Technology Corporation and its related entities. All
third-party technology products are protected by issued and pending U.S. and foreign patents.