Recommendations for Sizing HP Vertica Nodes and Clusters

HP Vertica Analytic Database
HP Big Data
Document Release Date: April, 2015
Legal Notices
Warranty
The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing
herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.
The information contained herein is subject to change without notice.
Restricted Rights Legend
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial
Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under
vendor's standard commercial license.
Copyright Notice
© Copyright 2006 - 2015 Hewlett-Packard Development Company, L.P.
Trademark Notices
Adobe® is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.
Contents
Overview
Optimal Hardware Configurations for HP Vertica
Sizing Your Cluster
Server Configuration
For More Information
Overview
This hardware planning guide contains hardware recommendations for your HP Vertica nodes. It also provides information that helps you
size the hardware components appropriately for the needs of your environment.
Optimal Hardware Configurations for HP Vertica
The following hardware configurations provide excellent performance for your HP Vertica database.
Component | Recommendations

Processor | For optimal performance, Hewlett Packard recommends the following processors:
    • Two-socket servers with 8- to 14-core CPUs clocked at or above 2.6 GHz, for clusters over 10 TB
    • Single-socket servers with 8- to 12-core CPUs clocked at or above 2.6 GHz, for clusters under 10 TB

Memory | HP Vertica requires a minimum of 8 GB of memory per physical CPU core in each server. For high-performance applications, however, you should provide 12–16 GB of memory per physical core. The memory should be at least DDR3-1600 (preferably DDR4-2133) and should be appropriately distributed across all memory channels in the server.

Storage | HP Vertica requires a minimum read/write speed of 40 MB/s per physical CPU core. For best performance, however, you should have 60–80 MB/s per physical core. Each node should have 1–9 TB of storage post RAID. In a production setting, Hewlett Packard recommends RAID 10. RAID 50 can be a viable alternative.
    Because of the heavy compression and encoding that HP Vertica performs, you do not need to use solid-state drives (SSDs). To satisfy HP Vertica requirements, a RAID array with a larger number of less expensive hard disk drives (HDDs) works just as well as a RAID array with fewer SSDs.
    Note: If you intend to use RAID 50 for your data partition, keep a spare node in every rack. This allows for manual failover of an HP Vertica node in the case of a drive failure. (Recovering an HP Vertica node is faster than rebuilding a RAID 50 array.) To keep node recovery times acceptable, never put more than 10 TB of compressed data on any node.

Network | Hewlett Packard recommends 10G networking over 1G networking in almost every situation.
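The per-core memory and storage figures above scale linearly with the number of physical cores in a node. The following is a minimal sketch of that arithmetic, not an HP tool: the helper name node_targets and the NodeTargets structure are hypothetical, and the constants are simply the figures quoted in the recommendations above.

```python
from dataclasses import dataclass
from typing import Tuple

# Per-core figures quoted in the recommendations above.
MIN_MEM_GB_PER_CORE = 8
HIGH_PERF_MEM_GB_PER_CORE = (12, 16)
MIN_IO_MBPS_PER_CORE = 40
BEST_IO_MBPS_PER_CORE = (60, 80)


@dataclass
class NodeTargets:
    min_memory_gb: int
    high_perf_memory_gb: Tuple[int, int]
    min_io_mbps: int
    best_io_mbps: Tuple[int, int]


def node_targets(physical_cores: int) -> NodeTargets:
    """Scale the per-core guidance to a whole node (hypothetical helper)."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    low_mem, high_mem = HIGH_PERF_MEM_GB_PER_CORE
    low_io, high_io = BEST_IO_MBPS_PER_CORE
    return NodeTargets(
        min_memory_gb=physical_cores * MIN_MEM_GB_PER_CORE,
        high_perf_memory_gb=(physical_cores * low_mem, physical_cores * high_mem),
        min_io_mbps=physical_cores * MIN_IO_MBPS_PER_CORE,
        best_io_mbps=(physical_cores * low_io, physical_cores * high_io),
    )


# Example: a two-socket server with 12 physical cores per socket.
print(node_targets(24))
```

The result is a planning target only; the actual memory and disk layout still has to match what the chosen server can physically hold.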
Sizing Your Cluster
Consider these factors when sizing your cluster:

• Data volume (compression): First, look at the total raw data volume for the cluster, and then apply a reasonable compression ratio. In most cases, 2:1 compression with high availability is a good starting point. To calculate compression for your data, try one of these approaches:
    o Use a previously attained compression ratio.
    o Install HP Vertica on an existing system and run Database Designer (DBD). DBD calculates excellent compression for all columns.
    o Load several hundred gigabytes to 1 TB of data into the database. After you load the data, you can calculate the compression ratio by running an audit and summing the used bytes in projection storage.
  For more information, see Calculating the Database Size in the product documentation.

• Data growth: Once you have a good idea of the starting compressed data volume, consider:
    o The rate of data ingest (that is, the amount of data you load into the database each day)
    o Your organization’s retention policy (that is, how long you intend to keep the data)
  You can estimate the total data volume required for your cluster using the following formula:
  ingest_rate × retention_period

• Workload: Develop an understanding of your total workload. The following features are part of every HP Vertica database workload:
    o Concurrency: The number of concurrently running queries. The higher the concurrency, the more core memory you need.
    o The amount of data on which the average query operates
    o How you manage resources using runtime priority and resource pools
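The data volume and data growth factors above reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the sample numbers are made up rather than measured, and it assumes the ingest rate is expressed as raw (uncompressed) data, so the compression ratio is applied afterward.

```python
def compression_ratio(raw_bytes: float, used_bytes: float) -> float:
    """Audited raw size divided by the bytes used in projection storage."""
    if raw_bytes <= 0 or used_bytes <= 0:
        raise ValueError("sizes must be positive")
    return raw_bytes / used_bytes


def total_raw_volume(ingest_rate: float, retention_period: float) -> float:
    """ingest_rate x retention_period, as in the formula above.

    Both arguments must use matching units, for example GB per day and days.
    """
    return ingest_rate * retention_period


def compressed_volume(raw_volume: float, ratio: float = 2.0) -> float:
    """Apply a compression ratio; 2:1 is the starting point suggested above."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_volume / ratio


# Made-up inputs: a 1 TB sample load that occupies 400 GB after loading,
# 50 GB of raw data ingested per day, and a 365-day retention policy.
ratio = compression_ratio(raw_bytes=1000e9, used_bytes=400e9)
raw_gb = total_raw_volume(ingest_rate=50, retention_period=365)
print(ratio, raw_gb, compressed_volume(raw_gb, ratio))
```

Workload factors such as concurrency are not captured by this calculation and still need to be assessed separately.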
Server Configuration
For high availability, Hewlett Packard recommends a minimum of three nodes. If you are getting better than 2:1 compression, you may need fewer
nodes.
The following table lists the recommended server configurations. All configurations are based on raw data volumes and assume 2:1 compression. For
configurations where the raw data size exceeds 10 TB, the best practice is to engage HP presales or an HP Vertica technical representative to help
provide an exact sizing that best fits your needs.
Before purchasing hardware for a production cluster, always run your hardware Bill of Materials (BOM) past the HP Vertica technical representative for
review.
Note: In a high-concurrency or heavy-workload environment, Hewlett Packard may recommend oversizing your cluster to achieve better
performance.
Raw Data Size | Recommended Server Configuration and Comments

Up to 5 TB | 3 nodes, each with:
    • At least eight 15K rpm spinning disks for the data partition (total 1–2 TB per node, or equivalent SSD)
    • A single 8–12 core processor with a clock speed of 2.6 GHz or higher
    • 96–128 GB of RAM

5–10 TB | 6 nodes, each with:
    • At least eight 15K rpm spinning disks for the data partition (total 1–2 TB per node, or equivalent SSD)
    • A single 8–12 core processor with a clock speed of 2.6 GHz or higher
    • 128–256 GB of RAM

10–40 TB | 3–4 nodes, each with:
    • 22 10K rpm drives
    • Dual 12-core processors at 2.6 GHz
    • 256–512 GB of RAM

40 TB–1 PB | 4–100 nodes, each with:
    • 22 10K rpm drives
    • Dual 12-core processors at 2.6 GHz
    • 256–512 GB of RAM

Larger than 1 PB | In most cases, the same configuration as for 40 TB–1 PB will work for raw data sizes greater than 1 PB. But first, contact an HP Vertica technical representative for recommendations on sizing clusters over 1 PB.
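For quick planning, the tiers above can be expressed as a lookup keyed on raw data size. This is a rough sketch, not an official sizing tool: the names Tier, TIERS, and recommend are hypothetical, each tier's upper bound is assumed to be inclusive, and 1 PB is assumed to equal 1000 TB. Sizes above 1 PB return None to signal that an HP Vertica technical representative should be consulted.

```python
from typing import NamedTuple, Optional


class Tier(NamedTuple):
    max_raw_tb: float  # upper bound of the tier (assumed inclusive)
    nodes: str
    disks: str
    cpu: str
    ram_gb: str


SMALL_DISKS = "at least eight 15K rpm disks (1-2 TB per node, or equivalent SSD)"
SMALL_CPU = "single 8-12 core processor, 2.6 GHz or higher"
LARGE_DISKS = "22 10K rpm drives"
LARGE_CPU = "dual 12-core processors at 2.6 GHz"

# Rows copied from the table above; 1 PB is treated as 1000 TB (assumption).
TIERS = [
    Tier(5, "3", SMALL_DISKS, SMALL_CPU, "96-128"),
    Tier(10, "6", SMALL_DISKS, SMALL_CPU, "128-256"),
    Tier(40, "3-4", LARGE_DISKS, LARGE_CPU, "256-512"),
    Tier(1000, "4-100", LARGE_DISKS, LARGE_CPU, "256-512"),
]


def recommend(raw_tb: float) -> Optional[Tier]:
    """Return the matching tier, or None for raw data sizes above 1 PB."""
    if raw_tb <= 0:
        raise ValueError("raw_tb must be positive")
    for tier in TIERS:
        if raw_tb <= tier.max_raw_tb:
            return tier
    return None  # larger than 1 PB: ask for a custom sizing


print(recommend(7))
```

Because the table assumes 2:1 compression, a data set that compresses much better or worse than that may belong in a different tier than this lookup suggests.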
For More Information
In this document, Hewlett Packard engineers have provided you with sizing information for your HP Vertica cluster. The following resources provide
more detailed configurations, and they explain how to best deploy your HP Vertica database.
To… | Go to…

Configure the HP DL380 Gen9 24-SFF CTO server as an HP Vertica node | https://community.dev.hp.com/t5/Vertica-Wiki/Configuring-the-HPDL380-Gen9-24-SFF-CTO-Server-as-an-HP-Vertica/ta-p/227450

Read the latest HP Vertica product documentation | http://www.vertica.com/hp-vertica-documentation/hp-vertica-7-1-xdocumentation/

Learn how customers can deploy HP Vertica in a Cloud environment with the HP Vertica OnDemand documentation | http://www.vertica.com/hp-vertica-documentation/hp-verticaondemand-documentation/

Create an HP Vertica cluster on Amazon Web Services (AWS) | http://www.vertica.com/resources-for-technology-partner-integrations/