High Availability
An Active-Active Cluster Configuration.

High Availability (HA) describes the ability of the TCPWave DDI appliances to provide seamless service without intermittent disruption or total failure. This page describes how the redundant components of the TCPWave DDI ecosystem are designed and the benefits they bring. With this HA design, TCPWave delivers a service-level availability (SLA) score of 100%.

TCPWave's HA design is also known as T-Mesh technology, and the DDI management layer is built on it. The DDI management devices in a T-Mesh cluster process DDI transactions by performing database operations on a replicated database, which gives organizations a management layer without system downtime.

Data centers commonly experience ISP blackouts, power failures, network failures, and similar events, any of which could bring down the services local to that data center. Because a typical T-Mesh cluster is spread across different data centers and regions, an outage confined to a single site does not take the cluster down. The cluster is designed to handle increased load and high traffic levels. The DDI transactions processed by the T-Mesh HA cluster follow the ACID (Atomicity, Consistency, Isolation, and Durability) model, which ensures that a large volume of concurrent DDI transactions is processed with the highest reliability.

The T-Mesh Technology

The T-Mesh technology supports multiple TCPWave DDI Management appliances serving the global DDI remotes.

The underlying database uses synchronous write-set replication. The T-Mesh consists of a single floating HA master and multiple HA members. When the floating HA master fails, the next available member automatically assumes the role of master without any human intervention. The franchise-critical DDI transactions are not impacted when the floating HA master or an HA member goes down. When a temporary network interruption occurs, the T-Mesh technology auto-recovers, and the DDI management ecosystem resynchronizes so transparently that end users see no impact. The T-Mesh cluster maintains a cache to expedite the recovery of a member that has fallen out of sync.

The remote DDI appliances are designed to auto-sense the failure of their preferred DDI manager. The failure of a High Availability member automatically prompts an election on the connected remotes to choose their next best DDI manager. The DDI administrator can also use the UI to swing the DDI remotes from one management node to another without causing any service disruption. The T-Mesh cluster technology is designed to operate with minimal configuration changes by end users.
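The election a remote runs when its preferred DDI manager fails can be pictured with a minimal Python sketch. This is an illustration only, not TCPWave's implementation: the `Manager` class, the `priority` ranking, and the `elect_manager` function are hypothetical names, and the rule "pick the most preferred manager that is still reachable" is an assumption about what "next best" means.

```python
from dataclasses import dataclass


@dataclass
class Manager:
    """A DDI management node as seen from one remote (hypothetical model)."""
    name: str
    priority: int    # assumed ranking: lower value = more preferred
    reachable: bool  # result of the remote's own health probe


def elect_manager(managers):
    """Return the most preferred reachable manager, or None if none respond."""
    candidates = [m for m in managers if m.reachable]
    return min(candidates, key=lambda m: m.priority, default=None)


# Example: the preferred manager is down, so the remote moves to the next one.
cluster = [
    Manager("mgmt-east", priority=1, reachable=False),
    Manager("mgmt-west", priority=2, reachable=True),
    Manager("mgmt-eu", priority=3, reachable=True),
]
chosen = elect_manager(cluster)
```

In this sketch the remote would rerun `elect_manager` whenever its health probe changes, which mirrors the automatic, no-intervention behavior described above.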

The T-Mesh cluster technology self-tunes its configurations every thirty minutes to deliver maximum performance. The TCPWave monitoring engine periodically monitors the health of the T-Mesh cluster. TCPWave recommends having three nodes in a T-Mesh.

Business Advantages

While the T-Mesh technology eliminates the single points of failure in an organization, it also provides the DDI administrators with an easy-to-use user interface to maintain and monitor the cluster. Joining a member or removing a member can be done easily using the web interface.

From the T-Mesh DDI controller, you can perform configuration changes, apply updates and upgrades, view a remote's logs on the web interface, and restart the services on any DDI remote. Internally, the T-Mesh cluster uses a delegate method that hands over the management activity of a given remote from one management node to another. Even when an action is initiated from one management node, the node that actually performs the action on the remote is the HA node directly connected to that remote through the T-Message Secure Tunnel. The management traffic is encrypted using SSL over a unique TCP port, and the transport layer used in the T-Mesh ecosystem is strongly encrypted. Note that the nodes in the T-Mesh HA cluster need proper clock discipline: the clock offset between the nodes of a T-Mesh ecosystem is a significant factor in overall stability.
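A clock-discipline check of the kind implied above can be sketched in a few lines of Python. The function name, the input format (measured offset of each node from a reference clock, in milliseconds), and the 100 ms threshold are all assumptions made for illustration; the source does not state an acceptable offset.

```python
def nodes_out_of_sync(offsets_ms, max_offset_ms=100.0):
    """Return the names of nodes whose clock offset exceeds the threshold.

    offsets_ms maps a node name to its measured offset from a reference
    clock in milliseconds (positive or negative). The default threshold
    is a placeholder, not a TCPWave recommendation.
    """
    return sorted(
        name for name, offset in offsets_ms.items()
        if abs(offset) > max_offset_ms
    )


# Example with made-up measurements for a three-node cluster.
drifting = nodes_out_of_sync({"node-a": 5.0, "node-b": -250.0, "node-c": 99.9})
```

Taking the absolute value matters here because a node that runs behind the reference is as much a problem as one that runs ahead.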

Performance with Reliability

The T-Mesh technology comes with built-in conflict resolution logic. When two nodes have a dispute, the third node automatically acts as an arbitrator, and the dispute is resolved in milliseconds. TCPWave recommends that all the nodes in the cluster have the same network speed, hardware type, patch level, and so on. Because T-Mesh uses a replicated write-set to guarantee data consistency, a given write operation must be performed on all the nodes, so a slower node can hold back the whole cluster. Every DDI operation that uses the 1,490-plus REST API calls is processed within a few milliseconds on the performance-optimized database.
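The role of the third node as arbitrator can be illustrated with a generic majority vote. This is a sketch of the general idea under stated assumptions, not TCPWave's actual conflict resolution logic; the `resolve` function and the node names are hypothetical.

```python
from collections import Counter


def resolve(votes):
    """Return the value backed by a strict majority of nodes, else None.

    votes maps a node name to the value that node considers correct.
    """
    if not votes:
        return None
    value, count = Counter(votes.values()).most_common(1)[0]
    return value if count > len(votes) // 2 else None


# Two nodes disagree; the third node's vote breaks the tie.
winner = resolve({"n1": "A", "n2": "B", "n3": "A"})

# With only two nodes there is no majority, so the dispute stays open.
stalemate = resolve({"n1": "A", "n2": "B"})
```

The two-node case shows why an odd number of nodes helps: with three nodes, any two-way dispute always has a tie-breaker.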

Disaster Recovery

TCPWave's IPAM provides a sophisticated "Disaster Recovery" (DR) mechanism so that users continue to receive a stable service if the IPAM goes down for any reason. The setup has two IPAM appliances. One is active and manages all the DNS and DHCP appliances; it is known as the master IPAM appliance. The other is passive and is known as the slave IPAM appliance. All the data on the master IPAM appliance is replicated to the slave IPAM appliance, which forms an active-passive setup. If the master IPAM appliance goes down for any reason, one can bring up the slave IPAM appliance, and all the DNS and DHCP appliances automatically connect to it for management purposes. Switching between the master and slave IPAM appliances is seamless. The IPAM uses technologies such as Galera Cluster to replicate the data from the master to the slave appliance and provide organizations with a superior disaster recovery solution.
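Because the replication is built on Galera Cluster, the health of a node can in principle be judged from Galera's standard `wsrep_*` status variables. The Python sketch below interprets a dictionary of such values, for example one built from the output of `SHOW STATUS LIKE 'wsrep_%'`. Whether and how a TCPWave appliance exposes these variables is an assumption, and the function name `galera_node_is_healthy` is hypothetical.

```python
def galera_node_is_healthy(status):
    """Return True when the Galera status values indicate a usable node.

    status maps Galera status variable names to their string values.
    A missing variable is treated as unhealthy.
    """
    return (
        status.get("wsrep_cluster_status") == "Primary"        # part of the primary component
        and status.get("wsrep_local_state_comment") == "Synced"  # caught up with the cluster
        and status.get("wsrep_connected") == "ON"              # linked to other members
        and status.get("wsrep_ready") == "ON"                  # accepting queries
    )


# Example values for a node that is fully synchronized.
healthy = galera_node_is_healthy({
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
    "wsrep_connected": "ON",
    "wsrep_ready": "ON",
})
```

A check like this could run before bringing the passive appliance into service, to confirm that it had received the master's data.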