H3C CAS Cluster Management Technology White Paper
Contents
1 Overview
2 Introduction to the H3C CAS Cloud Computing Management Platform
3 H3C CAS Cluster Management Technology: HCCS
  3.1 Technical Advantages
  3.2 Technical Implementation
    3.2.1 Cluster Formation
    3.2.2 Cluster Maintenance
4 Technical Features of the H3C Implementation
5 Typical Networking Case
  5.1 Network Topology
  5.2 Precautions
    5.2.1 Server Hardware Requirements
    5.2.2 Factors That Determine the Consolidation Ratio (Number of Virtual Machines per Server)
6 References
1 Overview
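As virtualization and cloud computing sweep through the global IT industry, a growing number of enterprises, industry organizations, and carriers are migrating their IT architectures to virtualized environments. Virtualization consolidates the underutilized servers in a data center. This substantially lowers the customer's up-front investment, reduces the number of physical servers in the data center, and cuts operating costs for power, cooling, floor space, and operations staff.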
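However, virtualization also introduces a single point of failure for IT applications. Before virtualization, IT administrators typically followed the policy of "size every server for its worst-case workload", which meant that each high-performance physical server ran only one application. Under that policy, even if a physical server lost power or its operating system crashed, at most one application was affected. In a virtualized environment, each physical server usually hosts multiple virtualized applications, so a single server failure does far more damage. This matters most for critical business entry points and access points, such as an enterprise's production management system or a financial institution's database system, where a service interruption of even a few seconds can have disastrous consequences. Against this background, ensuring high reliability for business applications in a virtualized environment has become a pressing technical problem.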
Figure 1 A server failure interrupts all virtual machine services on that server
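Before virtualization, the most common way to improve the reliability of critical business applications in a physical environment was to deploy a traditional high-availability cluster solution, such as Microsoft Cluster Service (MSCS) or Veritas Cluster Server (VCS). These solutions aim to restore service immediately, with minimal application downtime, when a server host or virtual machine fails. To achieve this goal, the IT infrastructure must be set up as follows: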
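• Every physical server and virtual machine must have a mirror virtual machine (possibly on a different server host).
• Cluster software is used to configure the servers (or the virtual machines and their hosts) as mirrors of each other. Typically, the primary virtual machine sends heartbeat signals to its mirror, and the mirror takes over immediately if a failure occurs.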
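The heartbeat-and-takeover behavior described in the second item can be illustrated with a minimal Python sketch. It is not taken from MSCS, VCS, or H3C CAS. The class name `MirrorNode`, the 3-second timeout, and the rule "take over once the primary has been silent for longer than the timeout" are all assumptions made for illustration. Timestamps are passed in explicitly so that the logic can be followed without real timers or a network.

```python
from dataclasses import dataclass


@dataclass
class MirrorNode:
    """Standby side of a primary/mirror pair (hypothetical model)."""

    timeout_s: float             # assumed silence threshold before takeover
    last_heartbeat: float = 0.0  # time of the most recent heartbeat
    active: bool = False         # True once the mirror has taken over

    def on_heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the primary at time `now`."""
        self.last_heartbeat = now

    def poll(self, now: float) -> bool:
        """Take over if the primary has been silent longer than the timeout."""
        if not self.active and now - self.last_heartbeat > self.timeout_s:
            self.active = True
        return self.active


def simulate() -> list:
    """Primary sends heartbeats at t = 1, 2, 3 seconds and then fails."""
    mirror = MirrorNode(timeout_s=3.0)  # assumed value, for illustration only
    states = []
    for t in range(1, 8):
        if t <= 3:
            mirror.on_heartbeat(float(t))
        states.append(mirror.poll(float(t)))
    return states
```

In this sketch the mirror stays passive while heartbeats arrive and becomes active only after the silence exceeds the timeout. A real cluster would also have to tell a failed primary apart from a failed heartbeat link, which this model does not attempt.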
The following figure shows a typical virtual machine setup that uses the traditional clustering approach: