International Journal of Trend in Scientific Research and Development (IJTSRD)
Volume 5 Issue 4, May-June 2021 Available Online: www.ijtsrd.com e-ISSN: 2456-6470
A Study on Replication and Failover
Cluster to Maximize System Uptime
Miss Pratiksha Bhagawati1, Mrs. Priya N2
1MCA Scholar, 2Assistant Professor,
1,2School of CS & IT, Department of MCA, Jain (Deemed-to-be) University, Bangalore, Karnataka, India
Clients of many kinds around the globe use cloud services because cloud
computing offers features and advantages such as building cost-effective
business solutions and scaling resources up and down according to current
demand. From the cloud provider's point of view, however, many challenges
must be faced to ensure hassle-free service delivery to clients. One such
challenge is maintaining high availability of services. This project aims at
presenting a highly available (HA) solution for business continuity and
disaster recovery through the configuration of various other services such
as load balancing, elasticity and replication.
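How to cite this paper: Miss Pratiksha
Bhagawati | Mrs. Priya N "A Study on
Replication and Failover Cluster to
Maximize System Uptime" Published in
International Journal of Trend in Scientific
Research and Development (ijtsrd), ISSN:
2456-6470, Volume-5 | Issue-4, June 2021,
pp.377-379. Unique Paper ID: IJTSRD41249
Copyright © 2021 by author(s) and
International Journal of Trend in Scientific
Research and Development Journal. This
is an Open Access article distributed
under the terms of the Creative Commons
Attribution License (CC BY 4.0)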
KEYWORDS: Cloud services, cloud-provider, high availability, business continuity,
disaster recovery, load balancing, elasticity, replication
I. Introduction
Reliability and high availability of resources are major concerns in
cloud computing. From the cloud provider's point of view, it has always
been essential to provide customers with on-demand services that are
reliable, secure and available on time. Without these, customers tend
to face revenue losses that hamper their business continuity plans.
Service downtime not only harms the user experience but also
translates directly into monetary loss. To eliminate these kinds of
outages, cloud providers have been focusing on ways to enhance their
infrastructure and management strategies to achieve highly available
services. For this, a failover cluster alone is not enough: multiple
redundant energy sources and replication between multiple locations
are also needed in case of disasters. Previously, mostly multinational
companies could afford such a setup. With the help of IaaS and PaaS,
however, the cost of building such a service has decreased
dramatically.
This project aims at building a High Availability (HA)
architecture to host websites in a reliable manner. The
websites should be scalable and fault tolerant, should have a
disaster recovery plan, and at no point should the customer
face a problem or a connectivity issue.
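The effect of redundancy on uptime can be illustrated with a short calculation. The sketch below assumes that sites fail independently and that each site is available 99% of the time; both figures are illustrative assumptions, not measurements from this project.

```python
def combined_availability(single_site: float, sites: int) -> float:
    """Probability that at least one of `sites` independent sites is up."""
    return 1.0 - (1.0 - single_site) ** sites


def yearly_downtime_hours(availability: float) -> float:
    """Expected hours of downtime in a 365-day year."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    for n in (1, 2, 3):
        a = combined_availability(0.99, n)  # 0.99 is an assumed per-site figure
        print(f"{n} site(s): availability={a:.6f}, "
              f"downtime={yearly_downtime_hours(a):.2f} h/year")
```

Under these assumptions each additional replica site multiplies the probability of a total outage by the per-site failure probability, which is why replication across locations matters more than hardening a single site.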
II. Literature Review
1. In this paper, the author uses a Digital Media Distribution
platform to deliver multimedia content. The author here
presents a modern serverless platform for the distribution
of digital media content on Amazon Web Services (AWS).
This platform uses AWS services for storage, content
delivery optimization, Lambda execution, media
transcoding, authentication
2. This paper states that the user experience and the cost
of providing the same video streaming service can vary
when using different cloud CDNs. Video streaming users
across the internet are emulated on a platform known as
PlanetLab to find the best Content Delivery Network
among several, such as AWS CloudFront, Microsoft Azure
Verizon CDN and Google Cloud CDN. Quality of
Experience (QoE) is evaluated.
3. In this paper, the author demonstrates an approach to
expedite the auto-scaling strategy for their use case:
public transportation web sites using the AWS
application suite. If social network monitoring and
auto-scaling frameworks are combined, this approach can
greatly reduce the OpEx impact of over-provisioning
and under-provisioning, and can also reduce the business
4. In this paper, the author has introduced a measurement-
driven methodology for evaluating the impact of
replication on the QoS of relational DBaaS offerings. The
methodology builds upon an analytical model
representing the database cluster configurations,
combined with an environment model to represent the
transient replication stages. This model thus represents
how a cloud database can achieve High Availability (HA)
when data is automatically replicated on multiple
@IJTSRD | Unique Paper ID - IJTSRD41249 | Volume-5 | Issue-4 | May-June 2021 Page 377
5. In this paper, the author proposes a scheme called PDFE
that uses Parallel State Machine Replication, which
allows ordered commands to be executed in a flexible
manner. There are two kinds of threads, namely work
threads and ordered threads. The binding between these
threads can be dynamic, and hence any thread that faces
a high workload can be executed first. Such flexibility
can help achieve load balancing.
6. In this paper, the author proposes SWIN (Sliding Window
replica strategy), a data-aware way to minimize power
consumption during data transfer. It minimizes the
amount of data transferred and the storage needed, which
also cuts cost to some extent. This can be applied not
only to data grids but also cloud
7. Over the years, distributed storage clusters have become
widely used. Replication in such clusters has been a
concern, however, since the internal bandwidth of the
cluster is sometimes low. If a replica is misplaced, it
might affect the overall performance of the cluster. To
reduce internal network traffic and improve load
balancing, the author has proposed a centralized
replication management scheme. It captures replica
locations and network traffic, and uses a 0-1
programming scheme to locate replicas.
8. In this paper, the author discusses the use of artificial
intelligence for high availability of resources. After
training, an artificial neural network (ANN) can choose
the best possible node for resource group failover. This
scheme thus helps to choose the best possible failover
node in the cluster.
9. The author says that traditional hardware firewalls had
many disadvantages due to the limitations of their
physical deployment. These problems can be mitigated
through Network Function Virtualization (NFV) by
implementing various network functions in software. The
paper provides various synchronization strategies that
allow connection states to be shared among the cluster
to maintain high availability and scalability.
10. In this paper, the author proposes an architecture that
is helpful for intensive trace analysis. The architecture
combines SolrCloud, Apache Spark and SMW, and
provides a way to develop cloud monitoring applications
with advanced algorithms for forecasting data and
identifying workload patterns.
III. Methodologies/Algorithms
Cloud CDN: Amazon CloudFront is a web service that gives
businesses and web application developers an easy and
cost-effective way to distribute content with low latency and
high data transfer speeds. It also helps to protect websites
against some common malicious attacks such as Distributed
Denial of Service (DDoS) attacks. CloudFront offers two types
of distribution, namely Web and RTMP. RTMP is mainly used
for streaming media files using Adobe Flash Media, whereas
Web is used for regular content such as .html, .css, .php and
graphic files, which are distributed over HTTP and HTTPS.
The one used in this architecture is Web.
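A CDN lowers latency largely by serving cached copies of content from an edge location near the user and contacting the origin server only on a miss. The toy model below sketches that idea. It does not call the CloudFront API; the class name `EdgeCache`, the TTL value and the origin function are all hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy edge location: serves cached objects, falls back to the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry)
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            self.hits += 1          # fresh copy at the edge, origin not contacted
            return entry[0]
        self.misses += 1            # absent or expired: go back to the origin
        body = self._fetch(path)
        self._store[path] = (body, now + self._ttl)
        return body


if __name__ == "__main__":
    def origin(path: str) -> bytes:  # stand-in for the real web server
        return f"content of {path}".encode()

    edge = EdgeCache(origin, ttl_seconds=60)
    edge.get("/index.html")
    edge.get("/index.html")
    print(f"hits={edge.hits} misses={edge.misses}")
```

In this model only the first request for a path reaches the origin until the TTL expires, which is the behaviour that reduces both latency and origin load.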
State Machine Replication: It is a method used in
distributed computing for building fault-tolerant systems.
A state machine stores the state of the system at any point.
It receives a set of commands or inputs and applies them in
sequential order using a transition function to generate an
output. An example of State Machine Replication is the
Bitcoin ledger. In fault-tolerant state machine replication,
instead of maintaining a single server, the system uses
multiple server replicas, some of which can be faulty. The
group of servers presents the same interface to the client as
a single server would. However, one main disadvantage of
this algorithm is that it doesn’t necessarily guarantee the
increase of
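The description above can be sketched in a few lines. This is a minimal illustration, not the scheme of any cited paper: it assumes the commands have already been put into a single agreed order (the consensus step is out of scope), it uses a hypothetical bank-balance state machine, and it masks a faulty replica by majority vote on the outputs.

```python
from collections import Counter
from typing import List, Tuple

Command = Tuple[str, int]  # ("deposit" | "withdraw", amount), a hypothetical command set


def transition(state: int, command: Command) -> Tuple[int, str]:
    """Deterministic transition function: (state, command) -> (new state, output)."""
    op, amount = command
    if op == "deposit":
        return state + amount, "ok"
    if op == "withdraw":
        if amount > state:
            return state, "insufficient funds"
        return state - amount, "ok"
    return state, "unknown command"


class Replica:
    def __init__(self, faulty: bool = False):
        self.state = 0
        self.faulty = faulty

    def apply(self, command: Command) -> str:
        self.state, output = transition(self.state, command)
        # A faulty replica is modelled as returning a wrong answer.
        return "garbage" if self.faulty else output


class ReplicatedService:
    """Presents the replicas to the client as if they were one server."""

    def __init__(self, replicas: List[Replica]):
        self.replicas = replicas

    def submit(self, command: Command) -> str:
        outputs = [replica.apply(command) for replica in self.replicas]
        value, votes = Counter(outputs).most_common(1)[0]
        if votes <= len(self.replicas) // 2:
            raise RuntimeError("no majority among replicas")
        return value


if __name__ == "__main__":
    service = ReplicatedService([Replica(), Replica(), Replica(faulty=True)])
    for cmd in [("deposit", 100), ("withdraw", 30), ("withdraw", 500)]:
        print(cmd, "->", service.submit(cmd))
```

Because every healthy replica starts from the same state and applies the same commands in the same order, their states stay identical, and the client sees a single consistent answer even though one replica misbehaves.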
Back Propagation Neural Network: A BPNN is a multi-layer
network and one of the most widely applied neural network
models. It can be used to store the mapping relations of
input-output models. The algorithm works by computing
the gradient of the loss function with respect to each weight.
The central idea is to obtain the smallest error by adjusting
the weights of the network, that is, to use gradient search
to minimize the squared error between the actual output of
the network and the expected output.
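The weight-adjustment idea can be shown with a very small network. The sketch below trains a network with one hidden layer on the XOR mapping using batch gradient descent on the squared error. The layer size, learning rate, epoch count and the XOR task itself are assumptions chosen for illustration; they are not taken from the cited work.

```python
import math
import random
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train_xor(hidden: int = 3, epochs: int = 5000, lr: float = 0.5,
              seed: int = 0) -> List[float]:
    """Train a 2-hidden-1 sigmoid network; return the squared-error loss per epoch."""
    rng = random.Random(seed)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    # Each hidden unit holds [weight for x1, weight for x2, bias].
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # Output unit: one weight per hidden unit, plus a bias in the last slot.
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    losses = []
    for _ in range(epochs):
        grad_h = [[0.0] * 3 for _ in range(hidden)]
        grad_o = [0.0] * (hidden + 1)
        loss = 0.0
        for x, target in data:
            # Forward pass.
            h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
            y = sigmoid(sum(w_o[j] * h[j] for j in range(hidden)) + w_o[hidden])
            loss += 0.5 * (y - target) ** 2
            # Backward pass: propagate the error derivative layer by layer.
            delta_o = (y - target) * y * (1 - y)
            for j in range(hidden):
                grad_o[j] += delta_o * h[j]
                delta_h = delta_o * w_o[j] * h[j] * (1 - h[j])
                grad_h[j][0] += delta_h * x[0]
                grad_h[j][1] += delta_h * x[1]
                grad_h[j][2] += delta_h
            grad_o[hidden] += delta_o
        # Gradient-descent update: move each weight against its gradient.
        for j in range(hidden):
            w_o[j] -= lr * grad_o[j]
            for k in range(3):
                w_h[j][k] -= lr * grad_h[j][k]
        w_o[hidden] -= lr * grad_o[hidden]
        losses.append(loss)
    return losses


if __name__ == "__main__":
    history = train_xor()
    print(f"loss at first epoch: {history[0]:.4f}, at last epoch: {history[-1]:.4f}")
```

The same procedure applies to the failover use case mentioned in the literature review: the inputs would be node metrics and the target the suitability of a node, but those features are not specified in the source and would have to be chosen by the implementer.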
Markov Decision Process: An MDP is a discrete-time control
process. It provides a mathematical framework for
modeling decision making in situations where some
outcomes are partly random while others are under the
control of a decision maker. This algorithm can be used to
determine whether or not to migrate a service in a failover
cluster during a disaster or whenever needed.
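A migrate-or-stay decision of this kind can be sketched as a small MDP solved by value iteration. Everything numeric below is a hypothetical assumption made for illustration: the three node states, the transition probabilities, the rewards and the discount factor do not come from the source.

```python
from typing import Dict, List, Tuple

Outcome = Tuple[float, str, float]  # (probability, next state, reward)

# Hypothetical model: transitions[state][action] -> possible outcomes.
TRANSITIONS: Dict[str, Dict[str, List[Outcome]]] = {
    "healthy": {
        "stay": [(0.9, "healthy", 10.0), (0.1, "degraded", 10.0)],
        "migrate": [(1.0, "healthy", 5.0)],      # migration costs some revenue
    },
    "degraded": {
        "stay": [(0.3, "healthy", 4.0), (0.4, "degraded", 4.0), (0.3, "failed", 4.0)],
        "migrate": [(1.0, "healthy", 0.0)],
    },
    "failed": {
        "stay": [(1.0, "failed", -20.0)],        # outage penalty while down
        "migrate": [(1.0, "healthy", -5.0)],
    },
}


def q_value(outcomes: List[Outcome], values: Dict[str, float], gamma: float) -> float:
    """Expected discounted return of one action given current state values."""
    return sum(p * (reward + gamma * values[nxt]) for p, nxt, reward in outcomes)


def value_iteration(transitions, gamma: float = 0.9, tol: float = 1e-6):
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    policy = {}
    for state, actions in transitions.items():
        policy[state] = max(actions, key=lambda a: q_value(actions[a], values, gamma))
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(TRANSITIONS)
    for state in TRANSITIONS:
        print(f"{state}: value={values[state]:.2f}, action={policy[state]}")
```

With these assumed numbers the computed policy keeps a healthy service in place and migrates a failed one; the choice for the degraded state depends on how the outage penalty compares with the migration cost.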
Sliding Window Protocol: The Sliding Window Protocol is a
well-known method for reliable and efficient transfer of
data over undependable channels that can lose, reorder and
duplicate messages. There are mainly two components: the
sender and the receiver. It is mostly used in cases that need
highly reliable data transmission.
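The sender/receiver interaction can be sketched as a simulation. The code below uses the Go-Back-N variant of the sliding window protocol, which is one common choice and is an assumption here, since the source does not name a variant. The channel model is simplified: it only loses frames and acknowledgements, with a hypothetical loss rate, and does not reorder them.

```python
import random
from typing import List, Tuple


def go_back_n_transfer(messages: List[str], window_size: int = 4,
                       loss_rate: float = 0.3, seed: int = 1) -> Tuple[List[str], int]:
    """Simulate Go-Back-N over a lossy channel.

    Returns the messages delivered to the receiver and the number of frames sent.
    """
    rng = random.Random(seed)
    received: List[str] = []
    base = 0            # sender: oldest sequence number not yet acknowledged
    expected = 0        # receiver: next sequence number it will accept
    transmissions = 0
    while base < len(messages):
        # Sender transmits every frame that fits in the current window.
        window_end = min(base + window_size, len(messages))
        for seq in range(base, window_end):
            transmissions += 1
            if rng.random() < loss_rate:
                continue                      # frame lost in the channel
            if seq == expected:               # receiver accepts in-order frames only
                received.append(messages[seq])
                expected += 1
            # Out-of-order or duplicate frames are silently discarded.
        # Receiver sends one cumulative acknowledgement, which may also be lost.
        if rng.random() >= loss_rate:
            base = expected                   # window slides forward
        # Otherwise the sender times out and resends the same window.
    return received, transmissions


if __name__ == "__main__":
    payload = [f"chunk-{i}" for i in range(10)]
    delivered, sent = go_back_n_transfer(payload)
    print(f"delivered {len(delivered)} of {len(payload)} messages using {sent} frames")
```

Retransmission after a lost frame or acknowledgement is what gives the reliability described above; the extra frames sent are the price paid for it.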
The objectives of the project are as follows:
> To build a scalable environment.
> To have a disaster recovery plan.
> To have an environment which is highly available.
> To enhance the trust and satisfaction of customers.
> To ensure business continuity.
> To have a backup plan available.
> To configure replication across different regions,
which will ensure fault tolerance.
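The replication objective can be sketched with an in-memory model. This is a simplified illustration under stated assumptions: the region names are hypothetical, writes are copied synchronously to every region that is up, and reads fail over to the next healthy region. It is not the configuration used in the project.

```python
from typing import Dict, List


class Region:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data: Dict[str, str] = {}


class ReplicatedStore:
    """Writes go to every healthy region; reads fail over between regions."""

    def __init__(self, regions: List[Region]):
        self.regions = regions

    def put(self, key: str, value: str) -> int:
        written = 0
        for region in self.regions:
            if region.up:
                region.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no region available for writing")
        return written

    def get(self, key: str) -> str:
        for region in self.regions:           # first healthy region holding the key wins
            if region.up and key in region.data:
                return region.data[key]
        raise LookupError(f"no healthy region holds {key!r}")


if __name__ == "__main__":
    primary, secondary = Region("region-a"), Region("region-b")  # hypothetical names
    store = ReplicatedStore([primary, secondary])
    store.put("index.html", "<h1>hello</h1>")
    primary.up = False                         # simulate a regional outage
    print("served after failover:", store.get("index.html"))
```

The model also shows the limit of the approach: if every region is down at the same time, the read raises an error, so at least one site has to remain healthy.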
The scope of the project is limited to the following points:
> In case of failure, if all the regions fail, data would be
lost. So it is always better to ensure that at least one of
the two sites (or both) is working properly.
> As the traffic grows, the cost of using CloudFront can
increase very rapidly.
From the work done so far, it is clear that a properly
planned architecture for data replication and failover
clustering is necessary, since it helps to plan and control the
flow of data while maintaining a backup system for data
safety in case of disaster. This project on building a High
Availability (HA) architecture to host websites in a reliable
manner has thus become a fitting solution for keeping
customers’ trust. The websites are thus scalable and fault
tolerant, have a disaster recovery plan, and at no point
should the customer face a problem or a connectivity issue.
References
[1] Dinca, Alexandru, Marius., Angelescu, Nicoleta.,
Dragomir, Radu., Puchianu, Constantin, Dan., Caciula,
Ion. (2019). “Digital Media Distribution Platform
Using Amazon Web Services”, International
Conference - 11th Edition Electronics, Computers and
[2] Wang, Chen., Jayaseelan, Andal., Kim, Hyong. (2018).
“Comparing Cloud Content Delivery Networks for
Adaptive Video Streaming”, 2018 IEEE 11th
International Conference on Cloud Computing.
[3] Smith, Peter., Gonzalez-Velez, Horacio., Caton, Simon.
(2018). “Social Auto-Scaling”, 26th Euromicro
International Conference on Parallel, Distributed, and
[4] Osman, Rasha., F. Perez, Juan., Casale, Giuliano.
(2016). “Quantifying the Impact of Replication on the
Quality-of-Service in Cloud Databases”, 2016 IEEE
International Conference on Software Quality,
Reliability and Security.
[5] Wu, Lihui., Wu, Weigang., Huang, Ning., Chen,
Zhiguang. (2018). “PDFE: Flexible Parallel State
Machine Replication for Cloud Computing”, 2018 IEEE
International Conference on Cluster Computing.
[6] V. Vrbsky, Susan., Lei, Ming., Smith, Karl., Byrd, Jeff.
(2010). “Data Replication and Power Consumption in
Data Grids”, 2010 2nd IEEE International Conference
on Cloud Computing Technology and Science.
[7] Huang, Kangxian., Li, Dagang., Sun, Yongyue. (2014).
“CRMS: a Centralized Replication Management
Scheme for Cloud Storage System”, IEEE/CIC ICCC
2014 Symposium on Social Networks and Big Data.
[8] R. Yerravalli, Venkateswar., Tharigonda, Aditya.
(2015). “High Availability Cluster Failover Mechanism
Using Artificial Neural Networks”, 2015 IEEE
International Conference on Cloud Computing in
[9] Gray, Nicholas., Lorenz, Claas., Mussig,
Alexander., Gebert, Steffen., Zinner, Thomas.,
Tran-Gia, Phuoc. (2017). “A Priori State Synchronization
for Fast Failover of Stateful Firewall VNFs”, 2017
International Conference on Networked Systems
[10] Singh, Samneet., Liu, Yan. (2016). “A Cloud Service
Architecture for Analyzing Big Monitoring Data”,
Tsinghua Science and Technology.