Recommendations for Sizing HP Vertica Nodes and Clusters
HP Vertica Analytic Database
HP Big Data
Document Release Date: April, 2015

Legal Notices

Warranty
The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. The information contained herein is subject to change without notice.

Restricted Rights Legend
Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.

Copyright Notice
© Copyright 2006 - 2015 Hewlett-Packard Development Company, L.P.

Trademark Notices
Adobe® is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.

Contents
Overview
Optimal Hardware Configurations for HP Vertica
Sizing Your Cluster
Server Configuration
For More Information

Overview

This hardware planning guide contains hardware recommendations for your HP Vertica nodes. It also provides information that helps you size the hardware components appropriately for your environment.

Optimal Hardware Configurations for HP Vertica

The following hardware configurations provide excellent performance for your HP Vertica database.

Component: Recommendations

Processor: For optimal performance, Hewlett Packard recommends running the following processors:
• Two-socket servers with 8- to 14-core CPUs, clocked at or above 2.6 GHz, for clusters over 10 TB
• Single-socket servers with 8 to 12 cores, clocked at or above 2.6 GHz, for clusters under 10 TB

Memory: HP Vertica requires a minimum of 8 GB of memory per physical CPU core in each server. However, in high-performance applications, you should run 12–16 GB of memory per physical core. The memory should be at least DDR3-1600 (preferably DDR4-2133) and should be appropriately distributed across all memory channels in the server.

Storage: HP Vertica requires a minimum read/write speed of 40 MB/s per physical core of the CPU. However, for best performance, you should have 60–80 MB/s per physical core. Each node should have 1–9 TB of storage after RAID. In a production setting, Hewlett Packard recommends RAID 10. RAID 50 can be a viable alternative. Because of the heavy compression and encoding that HP Vertica performs, you do not need to use solid-state drives (SSDs). To satisfy HP Vertica requirements, a RAID array built from a larger number of less expensive hard disk drives (HDDs) works just as well as a RAID array of fewer SSDs.

Note: If you intend to use RAID 50 for your data partition, keep a spare node in every rack. This allows for manual failover of an HP Vertica node in the case of a drive failure. (Recovering an HP Vertica node is faster than rebuilding a RAID 50 array.) To keep node recovery times acceptable, never put more than 10 TB of compressed data on any node.

Network: Hewlett Packard recommends 10G networking over 1G networking in almost every situation.

Sizing Your Cluster

Consider these factors when sizing your cluster:

• Data volume (compression): First, look at the total raw data volume for the cluster and then apply a reasonable compression number. In most cases, 2:1 compression with high availability is a good start. To calculate compression for your data, try one of these approaches:
  o Use a previously attained compression number.
  o Install HP Vertica on an existing system and run Database Designer (DBD). DBD calculates an excellent compression for all columns.
  o Load several hundred gigabytes to 1 TB of data into the database. After you load the data, you can calculate the compression ratio by running an audit and summing up the used bytes in projection storage. For more information, see Calculating the Database Size in the product documentation.

• Data growth: Once you have a good idea of the starting compressed data volume, consider:
  o The rate of data ingest (that is, the amount of data you load into the database each day)
  o Your organization’s retention policy (how long you intend to keep the data)
  You can estimate the total data volume required for your cluster using the following formula:
      ingest_rate × retention_period

• Workload: Develop an understanding of your total workload. The following features are part of every HP Vertica database workload:
  o Concurrency: The number of concurrently running queries. The higher the concurrency, the more cores and memory you need.
  o The amount of data on which the average query operates
  o How you manage resources using runtime priority and resource pools

Server Configuration

For high availability, Hewlett Packard recommends a minimum of three nodes. If you are getting better than 2:1 compression, you may need fewer nodes. The following table lists the recommended server configurations. All configurations are based on raw data volumes and assume 2:1 compression.
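
As a rough illustration of the arithmetic described so far, the following Python sketch applies the ingest_rate × retention_period formula, a compression ratio, and the per-core memory and I/O figures quoted in this guide. The function names and the usable_tb_per_node parameter are hypothetical, and dividing the compressed volume by per-node capacity is a simplifying assumption rather than a documented HP sizing method. Treat the result as a first estimate to compare against the recommended configurations, not as a replacement for them.

```python
import math

# Figures quoted in this guide.
DEFAULT_COMPRESSION_RATIO = 2.0    # 2:1 is suggested as a starting point
MIN_NODES_FOR_HA = 3               # minimum node count for high availability
MAX_COMPRESSED_TB_PER_NODE = 10.0  # upper bound on compressed data per node
RAM_GB_PER_CORE = (8, 12, 16)      # minimum, recommended low, recommended high
IO_MBPS_PER_CORE = (40, 60, 80)    # minimum, recommended low, recommended high


def compressed_volume_tb(ingest_tb_per_day, retention_days,
                         compression_ratio=DEFAULT_COMPRESSION_RATIO):
    """Raw volume (ingest_rate x retention_period) divided by the compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return ingest_tb_per_day * retention_days / compression_ratio


def nodes_for_volume(compressed_tb, usable_tb_per_node):
    """ASSUMPTION: node count = ceil(volume / per-node capacity), never below 3."""
    if not 0 < usable_tb_per_node <= MAX_COMPRESSED_TB_PER_NODE:
        raise ValueError("usable_tb_per_node must be in (0, 10] TB")
    by_capacity = math.ceil(compressed_tb / usable_tb_per_node)
    return max(MIN_NODES_FOR_HA, by_capacity)


def per_node_targets(physical_cores):
    """Scale the per-core memory and I/O figures to one server."""
    ram_min, ram_lo, ram_hi = (gb * physical_cores for gb in RAM_GB_PER_CORE)
    io_min, io_lo, io_hi = (mb * physical_cores for mb in IO_MBPS_PER_CORE)
    return {
        "ram_gb_min": ram_min,
        "ram_gb_recommended": (ram_lo, ram_hi),
        "io_mbps_min": io_min,
        "io_mbps_recommended": (io_lo, io_hi),
    }


if __name__ == "__main__":
    # Hypothetical workload: 0.5 TB/day kept for 90 days, 4 TB usable per node.
    volume = compressed_volume_tb(ingest_tb_per_day=0.5, retention_days=90)
    print(volume)                                          # 22.5
    print(nodes_for_volume(volume, usable_tb_per_node=4))  # 6
    print(per_node_targets(24)["ram_gb_min"])              # 192
```

For a dual 12-core server (24 physical cores), the per-core figures work out to at least 192 GB of RAM and 960 MB/s of read/write throughput, with 288–384 GB and 1440–1920 MB/s for high-performance use.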

For configurations where the raw data size exceeds 10 TB, the best practice is to engage HP presales or an HP Vertica technical representative to help provide an exact sizing that best fits your needs. Before purchasing hardware for a production cluster, always have an HP Vertica technical representative review your hardware bill of materials (BOM).

Note: In a high-concurrency or heavy-workload environment, Hewlett Packard may recommend oversizing your cluster to achieve better performance.

Raw Data Size: Recommended Server Configuration and Comments

Up to 5 TB: 3 nodes, each with:
• At least eight 15K rpm spinning disks for the data partition (total 1–2 TB per node, or equivalent SSD)
• A single 8–12 core processor with a clock speed of 2.6 GHz or higher
• 96–128 GB of RAM

5–10 TB: 6 nodes, each with:
• At least eight 15K rpm spinning disks for the data partition (total 1–2 TB per node, or equivalent SSD)
• A single 8–12 core processor with a clock speed of 2.6 GHz or higher
• 128–256 GB of RAM

10–40 TB: 3–4 nodes, each with:
• 22 10K rpm drives
• Dual 12-core processors at 2.6 GHz
• 256–512 GB of RAM

40 TB–1 PB: 4–100 nodes, each with:
• 22 10K rpm drives
• Dual 12-core processors at 2.6 GHz
• 256–512 GB of RAM

Larger than 1 PB: In most cases, the same configuration as for 40 TB–1 PB works for raw data sizes greater than 1 PB. However, first contact an HP Vertica technical representative for recommendations on sizing clusters over 1 PB.

For More Information

In this document, Hewlett Packard engineers have provided you with sizing information for your HP Vertica cluster. The following resources provide more detailed configurations and explain how best to deploy your HP Vertica database.

• To configure the HP DL380 Gen9 24-SFF CTO server as an HP Vertica node, go to:
  https://community.dev.hp.com/t5/Vertica-Wiki/Configuring-the-HPDL380-Gen9-24-SFF-CTO-Server-as-an-HP-Vertica/ta-p/227450
• To read the latest HP Vertica product documentation, go to:
  http://www.vertica.com/hp-vertica-documentation/hp-vertica-7-1-xdocumentation/
• To learn how customers can deploy HP Vertica in a cloud environment with the HP Vertica OnDemand documentation, go to:
  http://www.vertica.com/hp-vertica-documentation/hp-verticaondemand-documentation/
• To create an HP Vertica cluster on Amazon Web Services (AWS), go to:
  http://www.vertica.com/resources-for-technology-partner-integrations/