Introduction to Distributed Systems:
Distributed systems are everywhere. We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. Equivalently, it is a collection of independent computers that appears to its users as a single coherent system.
In other words, a distributed system consists of multiple components located on different networked machines that communicate and coordinate their actions by passing messages to one another in order to achieve a common goal. Although its nodes are separate and independent software and hardware components, the system as a whole appears to the end user as a single interface or computer.
Examples of Distributed Systems:
Examples of distributed systems are based on familiar and widely used computer networks: the Internet and the associated World Wide Web, web search, online gaming, email, social networks, e-commerce, and so on.
The Internet:
The Internet is the most widely known example of a distributed system. It facilitates the connection of many different computer systems in different geographical locations to share information and resources.
The Internet is a very large distributed system. It enables users, wherever they are, to make use of services such as the World Wide Web, email and file transfer. Multimedia services are also available, enabling users to access audio and video data including music, radio and TV channels, and to hold phone and video conferences.
Financial Trading:
Financial trading systems are distributed systems that process real-time market data. Events such as price updates arriving from different sources must be disseminated to large numbers of dealers and trading systems, which react to them within very short latencies.
Telecommunication Networks:
Telephone and cellular networks are examples of distributed networks. Telephone networks have been around for over a century, and they started as an early example of a peer-to-peer network. Cellular networks are distributed networks with base stations physically distributed in areas called cells. As telephone networks have evolved to VoIP (voice over IP), they continue to grow in complexity as distributed networks.
Real-time Systems:
Many industries use real-time systems distributed in various areas, locally and globally. For example, airlines use real-time flight control systems, and manufacturing plants use automation control systems. Taxi companies also employ a dispatch system, and e-commerce and logistics companies typically use real-time package tracking systems when customers order a product.
Distributed Database Systems:
A distributed database has locations across multiple servers, physical locations, or both. Users can replicate or duplicate the data across systems. Many popular applications use distributed databases and are aware of the heterogeneous or homogeneous properties of the distributed database system. A heterogeneous distributed database allows for multiple data models and different database management systems; an end user can use a gateway to translate the data between nodes, typically because of merging systems and applications. A homogeneous distributed database is a system in which every node runs the same database management system and data model; this makes it easier to scale performance and to manage the system when adding new locations and nodes.
Characteristics of Distributed Systems (Features of Distributed Systems):
Resource Sharing: The main motivating factor for constructing and using distributed systems is resource sharing. Resources such as printers, files, web pages or database records are managed by servers of the appropriate type. For example, web servers manage web pages and other web resources. Resources are accessed by clients – for example, the clients of web servers are generally called browsers.
Openness: Openness is concerned with the extension and improvement of a distributed system.
Concurrency: Multiple activities are executed at the same time.
Scalability: In principle, distributed systems should also be relatively easy to expand or scale. We scale a distributed system by adding more computers to the network.
Fault Tolerance: Fault tolerance concerns the reliability of the system: it should continue to operate correctly even when some of its components fail.
Transparency: Transparency hides the complexity of the distributed system from users and application programs. Differences between the various computers and the ways in which they communicate are mostly hidden from users. The same holds for the internal organization of the distributed system.
Heterogeneity: In distributed systems, components can vary and differ in networks, computer hardware, operating systems, programming languages and implementations by developers.
Goals of Distributed Systems:
A distributed system should make resources easily
accessible; it should reasonably hide the fact that resources are distributed
across a network; it should be open; and it should be scalable.
Making resources accessible:
The main goal of a distributed system is to make it easy
for the users (and applications) to access remote resources, and to share them
in a controlled and efficient way. Resources can be just about anything, but
typical examples include things like printers, computers, storage facilities,
data, files, Web pages, and networks, to name just a few. There are many
reasons for wanting to share resources. One obvious reason is that of
economics. For example, it is cheaper to let a printer be shared by several
users in a small office than having to buy and maintain a separate printer for
each user. Likewise, it makes economic sense to share costly resources such as
supercomputers, high-performance storage systems and other expensive
peripherals.
Connecting users and resources also makes it easier to
collaborate and exchange information, as is clearly illustrated by the success
of the Internet with its simple protocols for exchanging files, mail,
documents, audio, and video.
Distribution transparency:
An important goal of a distributed system is to hide the
fact that its processes and resources are physically distributed across
multiple computers. A distributed system that is able to present itself to
users and applications as if it were only a single computer system is said to
be transparent.
Types of Transparency:
Access transparency: Access transparency deals with hiding differences in data representation and the way that resources can be accessed by users. At a basic level, we wish to hide differences in machine architectures, but more important is that we reach agreement on how data is to be represented by different machines and operating systems.
Location transparency: Location transparency refers to the fact
that users cannot tell where a resource is physically located in the system.
Naming plays an important role in achieving location transparency. In
particular, location transparency can be achieved by assigning only logical
names to resources, that is, names in which the location of a resource is not
secretly encoded.
Migration transparency: Distributed systems in which resources can
be moved without affecting how those resources can be accessed are said to
provide migration transparency.
Relocation transparency: Even stronger is the situation in which
resources can be relocated while they are being accessed without the user or
application noticing anything. In such cases, the system is said to support
relocation transparency.
Replication transparency: Replication plays a very important role in
distributed systems. For example, resources may be replicated to increase
availability or to improve performance by placing a copy close to the place
where it is accessed. Replication transparency deals with hiding the fact that
several copies of a resource exist. To hide replication from users, it is
necessary that all replicas have the same name. Consequently, a system that
supports replication transparency should generally support location
transparency as well, because it would otherwise be impossible to refer to
replicas at different locations.
Concurrency transparency: An important goal of distributed systems is
to allow sharing of resources. In many cases, sharing resources is done in a
cooperative way, as in the case of communication. However, there are also many examples of competitive
sharing of resources. For example, two independent users may each have stored
their files on the same file server or may be accessing the same tables in a
shared database. In such cases, it is important that each user does not notice
that the other is making use of the same resource. This phenomenon is called
concurrency transparency. An important issue is that concurrent access to a
shared resource leaves that resource in a consistent state. Consistency can be
achieved through locking mechanisms, by which users are, in turn, given exclusive
access to the desired resource.
Failure transparency: Making a distributed system failure
transparent means that a user does not notice that a resource (he has possibly
never heard of) fails to work properly, and that the system subsequently
recovers from that failure.
Openness:
Another important goal of distributed systems is openness. An open distributed
system is a system that offers services according to standard rules that
describe the syntax and semantics of those services. For example, in computer
networks, standard rules govern the format, contents, and meaning of messages
sent and received.
Another important goal for
an open distributed system is that it should be easy to configure the system
out of different components (possibly from different developers). Also, it
should be easy to add new components or replace existing ones without affecting
those components that stay in place. In other words, an open distributed system
should also be extensible. For example, in an extensible system, it should be
relatively easy to add parts that run on a different operating system, or even
to replace an entire file system. As many of us know from daily practice,
attaining such flexibility is easier said than done.
Scalability:
Scalability of a system can be measured along at least three different
dimensions (Neuman, 1994). First, a system can be scalable with respect to its
size, meaning that we can easily add more users and resources to the system.
Second, a geographically scalable system is one in which the users and
resources may lie far apart. Third, a system can be administratively scalable,
meaning that it can still be easy to manage even if it spans many independent
administrative organizations.
Challenges of Distributed Systems:
The challenges arising from the construction of
distributed systems are the heterogeneity of their components, openness (which
allows components to be added or replaced), security, scalability – the ability
to work well when the load or the number of users increases – failure handling,
concurrency of components, transparency and providing quality of service.
Heterogeneity:
Distributed systems must be constructed from a variety of
different networks, operating systems, computer hardware and programming
languages. The Internet communication protocols mask the difference in
networks, and middleware can deal with the other differences.
The Internet enables users to access services and run
applications over a heterogeneous collection of computers and networks.
Although the Internet consists of many different sorts of network, their
differences are masked by the fact that all of the computers attached to them
use the Internet protocols to communicate with one another. Different
programming languages use different representations for characters and data
structures such as arrays and records. These differences must be addressed if
programs written in different languages are to be able to communicate with one
another. Programs written by different developers cannot communicate with one
another unless they use common standards.
Middleware:
In order to support heterogeneous computers and networks
while offering a single-system view, distributed systems are often organized by
means of a layer of software that is logically placed between a higher-level layer consisting of users and applications, and a layer underneath consisting of operating systems and basic communication facilities.
The term middleware applies to a software layer that
provides a programming abstraction as well as masking the heterogeneity of the
underlying networks, hardware, operating systems and programming languages.
The Common Object Request Broker Architecture (CORBA) is an example.
Some middleware, such as Java Remote Method Invocation (RMI) supports only a
single programming language. Most middleware is implemented over the Internet
protocols, which themselves mask the differences of the underlying networks,
but all middleware deals with the differences in operating systems and
hardware.
In addition to solving the problems of heterogeneity,
middleware provides a uniform computational model for use by the programmers of
servers and distributed applications. Possible models include remote object
invocation, remote event notification, remote SQL access and distributed
transaction processing. For example, CORBA provides remote object invocation,
which allows an object in a program running on one computer to invoke a method
of an object in a program running on another computer. Its implementation hides
the fact that messages are passed over a network in order to send the
invocation request and its reply.
Heterogeneity and mobile code:
The term mobile
code is used to refer to program code that can be transferred from one computer
to another and run at the destination – Java applets are an example. Code
suitable for running on one computer is not necessarily suitable for running on
another because executable programs are normally specific both to the
instruction set and to the host operating system. The virtual machine approach
provides a way of making code executable on a variety of host computers: the
compiler for a particular language generates code for a virtual machine instead
of a particular hardware order code. For example, the Java compiler produces
code for a Java virtual machine, which executes it by interpretation. The Java
virtual machine needs to be implemented once for each type of computer to
enable Java programs to run. Today, the most commonly used form of mobile code is the inclusion of JavaScript programs in web pages loaded into client browsers.
Openness:
Distributed systems should be extensible. The openness of
a computer system is the characteristic that determines whether the system can
be extended and re-implemented in various ways. The openness of distributed
systems is determined primarily by the degree to which new resource-sharing
services can be added and be made available for use by a variety of client
programs.
Security:
Many of the
information resources that are made available and maintained in distributed
systems have a high intrinsic value to their users. Their security is therefore
of considerable importance. Security for information resources has three
components: confidentiality
(protection against disclosure to unauthorized individuals), integrity (protection against
alteration or corruption), and availability
(protection against interference with the means to access the resources).
Encryption can be used to provide adequate protection of
shared resources and to keep sensitive information secret when it is
transmitted in messages over a network.
Although the Internet allows a program in one computer to
communicate with a program in another computer irrespective of its location, security
risks are associated with allowing free access to all of the resources in an
intranet. Although a firewall can be
used to form a barrier around an intranet, restricting the traffic that can
enter and leave, this does not deal with ensuring the appropriate use of
resources by users within an intranet, or with the appropriate use of resources
in the Internet, that are not protected by firewalls.
Scalability:
A distributed system is scalable if the cost of adding a
user is a constant amount in terms of the resources that must be added. Distributed
systems operate effectively and efficiently at many different scales, ranging
from a small intranet to the Internet. A system is described as scalable if it
will remain effective when there is a significant increase in the number of
resources and the number of users. The number of computers and servers in the
Internet has increased dramatically.
Failure handling:
Computer systems sometimes fail. When faults occur in
hardware or software, programs may produce incorrect results or may stop before
they have completed the intended computation.
Failures in a distributed system are partial – that is,
some components fail while others continue to function. Therefore the handling
of failures is particularly difficult. The following techniques for dealing
with failures are discussed here:
Detecting failures: Some failures can be detected. For example, checksums
can be used to detect corrupted data in a message or a file. It is difficult or
even impossible to detect some other failures, such as a remote crashed server
in the Internet. The challenge is to manage in the presence of failures that
cannot be detected but may be suspected.
Masking failures: Some failures that have been detected can be hidden or
made less severe. Two examples of hiding failures:
1. Messages can be retransmitted when they fail to arrive.
2. File data can be written to a pair of disks so that if one is corrupted, the other may still be correct.
Tolerating failures: Most of the services in the Internet do exhibit failures
– it would not be practical for them to attempt to detect and hide all of the
failures that might occur in such a large network with so many components.
Their clients can be designed to tolerate failures, which generally involves the
users tolerating them as well.
Recovery from failures: Recovery involves the design of software so
that the state of permanent data can be recovered or ‘rolled back’ after a
server has crashed. In general, the computations performed by some programs
will be incomplete when a fault occurs, and the permanent data that they update
(files and other material stored in permanent storage) may not be in a
consistent state.
Concurrency:
The presence of multiple users in a distributed system is
a source of concurrent requests to its resources. Both services and applications
provide resources that can be shared by clients in a distributed system. There
is therefore a possibility that several clients will attempt to access a shared
resource at the same time. For example, a data structure that records bids for
an auction may be accessed very frequently when it gets close to the deadline
time. The process that manages a shared resource could take one client request
at a time. But that approach limits throughput. Therefore services and
applications generally allow multiple client requests to be processed
concurrently.
Transparency:
Transparency is defined as the concealment from the user
and the application programmer of the separation of components in a distributed
system, so that the system is perceived as a whole rather than as a collection
of independent components. The implications of transparency are a major
influence on the design of the system software.
The aim is to make certain aspects of distribution
invisible to the application programmer so that they need only be concerned
with the design of their particular application. For example, they need not be
concerned with its location or the details of how its operations are accessed
by other components, or whether it will be replicated or migrated. Even
failures of networks and processes can be presented to application programmers
in the form of exceptions – but they must be handled.
Access transparency enables local and remote resources to be
accessed using identical operations.
Location transparency enables resources to be accessed without
knowledge of their physical or network location (for example, which building or
IP address).
Concurrency transparency enables several processes to operate
concurrently using shared resources without interference between them.
Replication transparency enables multiple instances of resources to
be used to increase reliability and performance without knowledge of the
replicas by users or application programmers.
Failure transparency enables the concealment of faults, allowing
users and application programs to complete their tasks despite the failure of
hardware or software components.
Mobility transparency allows the movement of resources and clients within a system without affecting the operation of users or programs.
Performance transparency allows the system to be reconfigured to improve performance as loads vary.
Scaling transparency allows the system and applications to expand in scale without change to the system structure or the application algorithms.
Distributed System Models:
Distributed system models are also called distributed system design. The common properties and design issues for distributed systems are explained by these models.
Systems that are intended for use in real-world environments
should be designed to function correctly in the widest possible range of
circumstances and in the face of many possible difficulties and threats.
Distributed systems of different types share important underlying properties
and give rise to common design problems.
Each type of model is intended to provide an abstract,
simplified but consistent description of a relevant aspect of distributed
system design:
Distributed system models are divided into two types:
1. Architectural models
2. Fundamental models
Architectural Models:
Architectural Model deals with organization of components
and their interrelationships. The architecture of a system is its structure in
terms of separately specified components and their interrelationships.
An architectural model of a distributed system is
concerned with the placement of its components and the relationship between
them. The overall goal of an architectural model is to ensure that the structure
will meet present and likely future demands on it.
Major concerns are to make the system reliable,
manageable, adaptable and cost-effective. The architectural design of a
building has similar aspects – it determines not only its appearance but also
its general structure and architectural style and provides a consistent frame
of reference for the design.
Architectural models describe a system in terms of the
computational and communication tasks performed by its computational elements;
the computational elements being individual computers or aggregates of them
supported by appropriate network interconnections.
The two most commonly used forms of architectural model for distributed systems are:
- Client-Server Model
- Peer-to-Peer Model
Client-Server Model: It is the most important and the most widely used architectural model. In the client-server model, client processes interact with individual server processes in separate host computers in order to access the shared resources that they manage.
Servers may in turn be clients of other servers. For
example, a web server is often a client of a local file server that manages the
files in which the web pages are stored.
Web servers and most other Internet services are clients
of the DNS service, which translates Internet domain names to network
addresses. Another web-related example concerns search engines, which enable
users to look up summaries of information available on web pages at sites
throughout the Internet. These summaries are made by programs called web crawlers,
which run in the background at a search engine site using HTTP requests to
access web servers throughout the Internet. Thus a search engine is both a
server and a client: it responds to queries from browser clients and it runs
web crawlers that act as clients of other web servers.
Peer-to-Peer Model: In this architecture all of the processes involved in a task or activity play similar roles, interacting cooperatively as peers without any distinction between client and server processes or the computers on which they run. In practical terms, all participating processes run the same program and offer the same set of interfaces to each other.
Fundamental Models:
Fundamental models are based on some fundamental
properties such as performance, reliability and security. All of the models
share the design requirements of achieving the performance and reliability
characteristics of processes and networks and ensuring the security of the
resources in the system.
Fundamental models are based on fundamental properties that allow us to be more specific about the characteristics of distributed systems and the failures and security risks they might exhibit.
Fundamental models are concerned with properties that are common to all of the architectural models. The fundamental model is addressed by three models:
Interaction Model:
The Interaction Model is concerned with the performance of processes and communication
channels and the absence of a global clock. It identifies a synchronous system
as one in which known bounds may be placed on process execution time, message
delivery time and clock drift. It identifies an asynchronous system as one in
which no bounds may be placed on process execution time, message delivery time
and clock drift – which is a description of the behaviour of the Internet.
Failure Model:
The Failure model
attempts to give a precise specification of the faults that can be exhibited by
processes and communication channels. It defines reliable communication and
correct processes. The failure model classifies the failures of processes and
basic communication channels in a distributed system.
Security Model:
The Security Model
identifies the possible threats to processes and communication channels. It
introduces the concept of a secure channel, which is secure against those
threats. Some of those threats relate to integrity: malicious users may tamper
with messages or replay them. Others threaten their privacy. Another security
issue is the authentication of the principal (user or server) on whose behalf a
message was sent. Secure channels use cryptographic techniques to ensure the
integrity and privacy of messages and to authenticate pairs of communicating
principals.
Variations of Client-Server Architecture:
· Proxy Server
· Thin Client
· Mobile Agent
Proxy Server:
Proxy Server refers to a server that acts as an intermediary between client and server. The proxy server is a computer on the internet that accepts the incoming requests from the client and forwards those requests to the destination server. It has its own IP address. It separates the client program and web server from the global network.
The purpose of proxy servers is to increase the
availability and performance of the service by reducing the load on the wide
area network and web servers. Proxy servers can take on other roles; for
example, they may be used to access remote web servers through a firewall.
Thin Client:
A thin client is a computer that runs from resources stored on a central server instead of a local hard drive. It is a virtual desktop computing model that runs on the resources stored on a central server.
Thin clients are small, silent devices that communicate
with a central server giving a computing experience that is largely identical
to that of a PC.
Mobile Agent:
Mobile Agents are transportable programs that can move over the network
and act on behalf of the user. A mobile agent is a running program (including both code
and data) that travels from one computer to another in a network carrying out a
task on someone’s behalf, such as collecting information, and eventually
returning with the results. A mobile agent may make many invocations to local
resources at each site it visits – for example, accessing individual database
entries.
Mobile agents might be used to install and maintain
software on the computers within an organization or to compare the prices of
products from a number of vendors by visiting each vendor’s site and performing
a series of database operations.
Mobile agents (like mobile code) are a potential security
threat to the resources in computers that they visit. The environment receiving
a mobile agent should decide which of the local resources it should be allowed
to use, based on the identity of the user on whose behalf the agent is acting –
their identity must be included in a secure way with the code and data of the
mobile agent. In addition, mobile agents can themselves be vulnerable – they
may not be able to complete their task if they are refused access to the
information they need. The tasks performed by mobile agents can be performed by
other means. For example, web crawlers that need to access resources at web
servers throughout the Internet work quite successfully by making remote
invocations to server processes. For these reasons, the applicability of mobile
agents may be limited.
PROCESS SYNCHRONIZATION
Synchronization in distributed systems is often much more difficult compared to synchronization in uniprocessor or multiprocessor systems.
Time is an important practical issue. For example, we require computers around the world to timestamp electronic commerce transactions consistently. Time is also an important theoretical construct in understanding how distributed executions unfold. But time is problematic in distributed systems.
Each computer may have its own physical clock, but the
clocks typically deviate, and we cannot synchronize them perfectly. Time is an
important and interesting issue in distributed systems, for several reasons.
First, time is a quantity we often want to measure accurately. In order to know
at what time of day a particular event occurred at a particular computer it is
necessary to synchronize its clock with an authoritative, external source of
time. For example, an eCommerce transaction involves events at a merchant’s
computer and at a bank’s computer. It is important, for auditing purposes, that
those events are timestamped accurately.
Second, algorithms that depend upon clock synchronization
have been developed for several problems in distribution [Liskov 1993]. These
include maintaining the consistency of distributed data, checking the
authenticity of a request sent to a server, and eliminating the processing of
duplicate updates.
Clock, Event and Process States:
A distributed system consists of a collection P of N processes pi, i = 1, 2, ..., N.
Each process pi in P has a state, si,
which, in general, it transforms as it executes.
The process’s state includes the values of all the
variables within it. Its state may also include the values of any objects in
its local operating system environment that it affects, such as files.
As each process pi executes it takes a series
of actions, each of which is either a message send or receive operation, or an
operation that transforms pi ’s state – one that changes one or more
of the values in si . In practice, we may choose to use a high-level
description of the actions, according to the application. For example, if the
processes in P are
engaged in an eCommerce application, then the actions may be ones such as
‘client dispatched order message’ or ‘merchant server recorded transaction to
log’.
We define an event to be the occurrence of a single
action that a process carries out as it executes – a communication action or a
state-transforming action.
The sequence of events within a single process pi can be placed in a single, total ordering, which we denote by the relation →i between the events. That is, e →i e' if and only if the event e occurs before e' at pi.
Now we can define the history of process pi to be the series of events that take place within it, ordered as we have described by the relation →i:
history(pi) = hi = <ei0, ei1, ei2, ...>
Synchronizing Physical Clocks:
In order to know at what time of day events occur at the
processes in our distributed system P – for example, for accountancy purposes – it is necessary to synchronize the
processes’ clocks, Ci ,
with an authoritative, external source of time. This is external synchronization. And if the clocks Ci are synchronized with one another to a
known degree of accuracy, then we can measure the interval between two events
occurring at different computers by appealing to their local clocks, even
though they are not necessarily synchronized to an external source of time.
This is internal synchronization. We
define these two modes of synchronization more closely as follows, over an
interval of real time I:
External synchronization:
For a synchronization bound D > 0, and for a source S of UTC time, |S(t) – Ci(t)| < D, for i = 1, 2, ..., N and for all real times t in I. Another way of saying this is that the clocks Ci are accurate to within the bound D.
Internal synchronization:
For a synchronization bound D > 0, |Ci(t) – Cj(t)| < D, for i, j = 1, 2, ..., N and for all real times t in I. Another way of saying this is that the clocks Ci agree within the bound D.
Clocks that are internally synchronized are not
necessarily externally synchronized, since they may drift collectively from an
external source of time even though they agree with one another. However, it
follows from the definitions that if the system P is externally synchronized with a bound D, then the same system
is internally synchronized with a bound of 2D.
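The 2D bound is a consequence of the triangle inequality: each clock is within D of the source S, so any two clocks are within 2D of each other:
|Ci(t) – Cj(t)| ≤ |Ci(t) – S(t)| + |S(t) – Cj(t)| < D + D = 2D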
Algorithms for Physical Clock Synchronization:
Clocks are one of the most important components of computers and other devices. However, owing to various factors, these clocks may drift from the standard frequency or degrade, and may gain or lose time with respect to the reference clock; this time difference between the two clocks is called clock skew.
This clock skew may gradually increase and eventually
cause desynchronisation of the computer clock from the reference clock, which
could affect their normal operation. Therefore the computer clock must be synchronized with the reference clock to minimize clock skew.
Several algorithms and protocols have been proposed for synchronizing physical clocks:
Cristian's Algorithm:
Cristian [1989] suggested the use of a time server,
connected to a device that receives signals from a source of UTC, to
synchronize computers externally. Upon request, the server process supplies the
time according to its clock.
Cristian observed that while there is no upper bound on
message transmission delays in an asynchronous system, the round-trip times for
messages exchanged between pairs of processes are often reasonably short – a
small fraction of a second. He describes the algorithm as probabilistic: the
method achieves synchronization only if the observed round-trip times between
client and server are sufficiently short compared with the required accuracy.
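The following is a minimal Python sketch of one round of Cristian's algorithm; request_server_time() is a hypothetical call standing in for the message exchange with the time server:

import time

def cristian_round(request_server_time):
    # request_server_time() is assumed to ask the time server
    # for the current UTC time and return its reply
    t0 = time.monotonic()            # local time when the request is sent
    server_time = request_server_time()
    t1 = time.monotonic()            # local time when the reply arrives
    t_round = t1 - t0
    # Estimate: the reply took about half the round trip to travel back,
    # so the server's clock now reads roughly server_time + t_round / 2.
    return server_time + t_round / 2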
Discussion of Cristian’s Algorithm:
As
described, Cristian’s method suffers from the problem associated with all
services implemented by a single server: that the single time server might fail
and thus render synchronization temporarily impossible. Cristian suggested, for
this reason, that time should be provided by a group of synchronized time
servers, each with a receiver for UTC time signals. For example, a client could
multicast its request to all servers and use only the first reply obtained. The
problem of dealing with faulty clocks is partially addressed by the Berkeley
algorithm.
Berkeley Algorithm:
The Berkeley algorithm, developed by Gusella and Zatti [1989], is an algorithm for internal synchronization. A coordinator computer, called the master, periodically polls the other computers, called slaves, for their clock values. The master estimates each slave's local clock time by observing the round-trip times (similarly to Cristian's technique), averages the values obtained (eliminating occasional readings from clocks that have drifted too far), and sends each slave the amount by which its clock requires adjustment.
Network Time Protocol:
The Network Time Protocol (NTP) defines an architecture for a time service and a protocol to distribute time information over the Internet. NTP servers are connected in a logical hierarchy called a synchronization subnet, whose levels are called strata: primary servers (stratum 1) are connected directly to a UTC time source, secondary servers (stratum 2) synchronize with primary servers, and so on down the hierarchy.
Logical Time and Logical Clocks:
So far, we have assumed that clock synchronization is
naturally related to real time. However, we have also seen that it may be
sufficient that every node agrees on a current time, without that time
necessarily being the same as the real time.
In a classic paper, Lamport (1978) showed that although
clock synchronization is possible, it need not be absolute. If two processes do
not interact, it is not necessary that their clocks be synchronized because the
lack of synchronization would not be observable and thus could not cause
problems. Furthermore, he pointed out that what usually matters is not that all
processes agree on exactly what time it is, but rather that they agree on the
order in which events occur.
Lamport’s Logical Clock:
To synchronize logical clocks, Lamport defined a relation called happens-before. The expression a → b is read "a happens before b" and means that all processes agree that first event a occurs, then afterward, event b occurs. The happens-before relation can be observed directly in two situations:
1. If a and b are events in the same process, and a occurs before b, then a → b is true.
2. If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a → b is also true. A message cannot be received before it is sent, or even at the same time it is sent, since it takes a finite, nonzero amount of time to arrive.
Happens-before is a transitive relation, so if a → b and b → c, then a → c. If two events, x and y, happen in different processes that do not exchange messages (not even indirectly via third parties), then x → y is not true, but neither is y → x. These events are said to be concurrent, which simply means that nothing can be said (or need be said) about when the events happened or which event happened first.
What we need is a way of measuring a notion of time such that for every event, a, we can assign it a time value C(a) on which all processes agree. These time values must have the property that if a → b, then C(a) < C(b). To rephrase the conditions we stated earlier, if a and b are two events within the same process and a occurs before b, then C(a) < C(b). Similarly, if a is the sending of a message by one process and b is the reception of that message by another process, then C(a) and C(b) must be assigned in such a way that everyone agrees on the values of C(a) and C(b) with C(a) < C(b). In addition, the clock time, C, must always go forward (increasing), never backward (decreasing). Corrections to time can be made by adding a positive value, never by subtracting one.
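A minimal Python sketch of Lamport's clock rules (an illustration, not part of the original description): each process increments its counter before each event, stamps outgoing messages with the counter, and on receipt advances its counter past the timestamp carried by the message.

class LamportClock:
    def __init__(self):
        self.time = 0

    def local_event(self):
        # Rule 1: increment the clock before each local event
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event too; the incremented value is
        # piggybacked on the outgoing message as its timestamp
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Rule 2: jump past the sender's timestamp, then count the
        # receive event itself, so C(send) < C(receive) always holds
        self.time = max(self.time, msg_time) + 1
        return self.time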
Global states: The global state of a distributed system is the set of local states of each individual process involved in the system plus the state of the communication channels.
Global state = individual state of each process + individual state of each communication channel
Cut in a Distributed System:
A cut of a distributed computation is a subset of its global history: it consists of a prefix of the history of each process. A cut is consistent if, for every receive event it contains, it also contains the corresponding send event; a consistent cut corresponds to a global state that could have occurred during the execution.
DISTRIBUTED MUTUAL EXCLUSION
Fundamental to distributed systems is the concurrency and
collaboration among multiple processes. In many cases, this also means that
processes will need to access the same resources simultaneously. To prevent such concurrent accesses from corrupting the resource or making it inconsistent, solutions are needed to grant processes mutually exclusive access.
Requirements of Mutual Exclusion:
Distributed processes often need to coordinate their activities. If a collection of processes share a resource or collection of resources, then mutual exclusion is often required to prevent interference and ensure consistency when accessing the resources. The essential requirements are safety (at most one process may execute in the critical section at a time), liveness (requests to enter and exit the critical section eventually succeed) and ordering (entry to the critical section is granted in the happened-before order of the requests).
Algorithms for Mutual Exclusion:
The Central Server Algorithm:
The simplest way to achieve mutual exclusion is to employ
a server that grants permission to enter the critical section. To enter a
critical section, a process sends a request message to the server and awaits a
reply from it. Conceptually, the reply constitutes a token signifying
permission to enter the critical section. If no other process has the token at
the time of the request, then the server replies immediately, granting the
token. If the token is currently held by another process, then the server does
not reply, but queues the request.
When a process exits the critical section, it sends a
message to the server, giving it back the token. If the queue of waiting
processes is not empty, then the server chooses the oldest entry in the queue,
removes it and replies to the corresponding process. The chosen process then
holds the token.
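A minimal Python sketch of the central server's token management (illustrative; grant() stands in for sending the reply message to the chosen process):

from collections import deque

class CentralLockServer:
    def __init__(self):
        self.token_holder = None    # process currently holding the token
        self.queue = deque()        # waiting requests, oldest first

    def on_request(self, pid):
        if self.token_holder is None:
            self.token_holder = pid
            self.grant(pid)         # token is free: reply immediately
        else:
            self.queue.append(pid)  # token busy: queue, send no reply yet

    def on_release(self, pid):
        assert pid == self.token_holder
        if self.queue:
            self.token_holder = self.queue.popleft()   # oldest waiter next
            self.grant(self.token_holder)
        else:
            self.token_holder = None

    def grant(self, pid):
        print(f"token granted to {pid}")   # stand-in for the reply message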
A Ring-Based Algorithm:
One of the simplest ways to arrange mutual exclusion
between the N processes without requiring an additional process is to arrange
them in a logical ring. The idea is that exclusion is conferred by obtaining a
token in the form of a message passed from process to process in a single direction – clockwise, say – around the ring.
If a process does not require entry to the critical section when it receives the token, then it immediately forwards the token to
its neighbour. A process that requires the token waits until it receives it,
but retains it. To exit the critical section, the process sends the token on to
its neighbour.
This algorithm continuously consumes network bandwidth
(except when a process is inside the critical section): the processes send
messages around the ring even when no process requires entry to the critical
section.
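An illustrative fragment of the per-process token-handling rules in Python (wants_entry, in_cs and send_token_to are hypothetical names for the process's state and its message-sending primitive):

def on_token_arrival(process):
    if process.wants_entry:
        process.in_cs = True                        # retain token; enter the critical section
    else:
        process.send_token_to(process.neighbour)    # forward immediately, clockwise

def on_exit(process):
    process.in_cs = False
    process.wants_entry = False
    process.send_token_to(process.neighbour)        # token resumes circulating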
Ricart-Agrawala Algorithm:
Ricart and Agrawala [1981] developed an algorithm to
implement mutual exclusion between N peer processes that is based upon
multicast. The basic idea is that processes that require entry to a critical
section multicast a request message, and can enter it only when all the other
processes have replied to this message.
Each process records its state of being outside the
critical section (RELEASED), wanting entry (WANTED) or being in the critical
section (HELD) in a variable state. The algorithm is given below:
On initialization
    state := RELEASED;
To enter the section
    state := WANTED;
    Multicast request to all processes;
    T := request’s timestamp;
    Wait until (number of replies received = (N – 1));
    state := HELD;
On receipt of a request <Ti, pi> at pj (i ≠ j)
    if (state = HELD or (state = WANTED and (T, pj) < (Ti, pi)))
    then
        queue request from pi without replying;
    else
        reply immediately to pi;
    end if
To exit the critical section
    state := RELEASED;
    reply to any queued requests;
If a process requests entry and the state of all other processes is RELEASED, then all processes will reply immediately to the request and the requester will obtain entry. If some process is in the state HELD, then that process will not reply to requests until it has finished with the critical section, and so the requester cannot gain entry in the meantime. If two or more processes request entry at the same time, then whichever process’s request bears the lowest timestamp will be the first to collect N – 1 replies, granting it entry next.
To Enter the Critical Section:
· When a process pi wants to enter the critical section, it sends a timestamped REQUEST message to all other processes.
· When a process pj receives a REQUEST message from process pi, it sends a REPLY message to process pi if and only if:
o Process pj is neither requesting nor currently executing the critical section, or
o In case process pj is requesting, the timestamp of process pi’s request is smaller than its own request. Otherwise the request is deferred by process pj.
To Execute the Critical Section:
· Process pi enters the critical section after it has received a REPLY message from all other processes.
To Release the Critical Section:
· Upon exiting the critical section, process pi sends a REPLY message to all deferred requests.
Maekawa’s Voting Algorithm:
Maekawa [1985] observed that in order for a process to enter a critical section, it is not necessary for all of its peers to grant it access. Processes need only obtain permission to enter from subsets of their peers, as long as the subsets used by any two processes overlap. We can think of processes as voting for one another to enter the critical section. A ‘candidate’ process must collect sufficient votes to enter.
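An illustrative construction of overlapping voting sets in Python (not from the source): arranging N processes in a √N × √N grid and giving each process the union of its row and column guarantees that any two voting sets intersect, which is the property Maekawa's algorithm relies on.

import math

def grid_voting_sets(n):
    """Build Maekawa-style voting sets for processes 0..n-1,
    assuming n is a perfect square."""
    k = int(math.isqrt(n))
    assert k * k == n, "sketch assumes n is a perfect square"
    sets = []
    for p in range(n):
        row, col = divmod(p, k)
        row_members = {row * k + c for c in range(k)}
        col_members = {r * k + col for r in range(k)}
        sets.append(row_members | col_members)   # row plus column: 2*sqrt(n) - 1 members
    return sets

# Any two voting sets overlap, so two processes can never both
# collect all of their votes at the same time:
sets = grid_voting_sets(16)
assert all(sets[i] & sets[j] for i in range(16) for j in range(16))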
ELECTION ALGORITHM
An algorithm for choosing a unique process to play a
particular role is called an election algorithm.
Requirements of an Election Algorithm:
Whichever process starts an election, the essential requirements are safety (a participating process is either undecided or has elected the non-crashed process with the largest identifier) and liveness (all processes participate and eventually either elect a process or crash).
Ring Algorithm:
The algorithm of Chang and Roberts [1979] is suitable for a collection of processes arranged in a logical ring. Each process pi has a communication channel to the next process in the ring, and all messages are sent clockwise around the ring. We assume that no failures occur, and that the system is asynchronous. The goal of this algorithm is to elect a single process called the coordinator, which is the process with the largest identifier.
Initially, every process is marked as a non-participant in an election. Any process can begin an election. It proceeds by marking itself as a participant, placing its identifier in an election message and sending it to its clockwise neighbour.
When a process receives an election message, it compares the identifier in the message with its own. If the arrived identifier is greater, then it forwards the message to its neighbour. If the arrived identifier is smaller and the receiver is not a participant, then it substitutes its own identifier in the message and forwards it; but it does not forward the message if it is already a participant. On forwarding an election message in any case, the process marks itself as a participant. If, however, the arrived identifier is that of the receiver itself, then this process’s identifier must be the greatest, and it becomes the coordinator.
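A minimal Python simulation of a single-initiator run of this ring election (illustrative only; message passing is reduced to a walk around the ring):

def ring_election(ids, starter):
    """ids[i] is the identifier of process i; the election message
    travels clockwise from position i to (i + 1) % len(ids)."""
    n = len(ids)
    pos, msg = starter, ids[starter]
    while True:
        pos = (pos + 1) % n          # send to clockwise neighbour
        if msg == ids[pos]:
            return msg               # own identifier came back: elected coordinator
        msg = max(msg, ids[pos])     # forward the larger of the two identifiers

print(ring_election([3, 17, 4, 9], starter=0))   # prints 17, the largest identifier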
Bully Algorithm:
Unlike the ring-based algorithm, this algorithm assumes that the system is synchronous: it uses timeouts to detect a process failure. Another difference is that the ring-based algorithm assumed that processes have minimal a priori knowledge of one another: each knows only how to communicate with its neighbour, and none knows the identifiers of the other processes. The bully algorithm, on the other hand, assumes that each process knows which processes have higher identifiers, and that it can communicate with all such processes.
There are three types of message in this algorithm: an election message is sent to announce an election; an answer message is sent in response to an election message and a coordinator message is sent to announce the identity of the elected process – the new ‘coordinator’. A process begins an election when it notices, through timeouts, that the coordinator has failed. Several processes may discover this concurrently.
The process that knows it has the highest identifier can elect itself as the coordinator simply by sending a coordinator message to all processes with lower identifiers. On the other hand, a process with a lower identifier can begin an election by sending an election message to those processes that have a higher identifier and awaiting answer messages in response. If none arrives within time T, the process considers itself the coordinator and sends a coordinator message to all processes with lower identifiers announcing this.
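A sketch of the decision logic a process runs when it detects that the coordinator has failed (send_election, await_answer and announce are hypothetical helpers abstracting the messaging and the timeout T):

def start_election(my_id, all_ids, send_election, await_answer, announce):
    """Bully algorithm as seen from one process.
    send_election(pid): send an election message to process pid.
    await_answer(timeout): True if any answer message arrives in time.
    announce(): send a coordinator message to all lower-numbered processes."""
    higher = [pid for pid in all_ids if pid > my_id]
    if not higher:
        announce()                    # highest identifier: elect self immediately
        return "coordinator"
    for pid in higher:
        send_election(pid)            # challenge every higher-numbered process
    if not await_answer(timeout=1.0):
        announce()                    # nobody answered within T: elect self
        return "coordinator"
    return "waiting"                  # a higher process has taken over the election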
TERMINATION DETECTION
Termination detection, i.e. determining whether a distributed computation being performed in a system has terminated, is a fundamental problem in distributed programming. A distributed computation is said to have terminated when all its live processes are in the passive state and no basic messages are in transit. The control computation performed by the processes for detecting termination constitutes the termination detection algorithm.
Termination occurs when all processes in the distributed system become idle (passive) and there are no computational messages in transit.
Termination detection in distributed systems has been a popular problem of study. It involves determining whether a computation running on multiple nodes has ceased (terminated) all its activities.
Huang’s Algorithm:
Huang’s algorithm is an algorithm for detecting termination in a distributed system. The algorithm was proposed by Shing-Tsaan Huang in 1989 in the Journal of Computers.
At any time, a process in a distributed system is either in an active state or in a passive (idle) state. An active process may become idle at any time, but an idle process may only become active again upon receiving a computational message.
Basic Idea:
- When a process sends a message, it sends a part of its weight in the message.
- When a process receives a message, it adds the weight received in the message to its weight.
- Thus, the sum of weights on all the processes and on all the messages in transit is always 1.
- When a process becomes passive, it sends its weight to the controlling agent in a control message, which the controlling agent adds to its weight.
- The controlling agent concludes termination if its weight becomes 1.
Assumptions of the algorithm:
- One of the co-operating processes which monitor the computation is called the controlling agent.
- The initial weight of controlling agent is 1.
- All other processes are initially idle and have weight 0.
- The computation starts when the controlling agent sends a computation message to one of the processes.
- The process becomes active on receiving a computation message.
- A computation message can be sent only by the controlling agent or an active process.
- A control message is sent to the controlling agent by an active process when it becomes idle.
- The algorithm assigns a weight W (such that 0 < W < 1) to every active process and every in-transit message.
Notations used in the algorithm:
- B(DW): Computation message with weight DW
- C(DW): Control message with weight DW
Algorithm:
Rule to send B(DW):
- Suppose process P with weight W is sending B(DW) to process Q.
- Split the weight of process P into W1 and W2 such that W = W1 + W2 and W1 > 0, W2 > 0.
- Set the weight of process P to W1 (i.e. W = W1).
- Send B(W2) to process Q; here DW = W2.
Note: Only the controlling agent or an active process can send a computation message.
On receiving B(DW) by process Q:
- Add the weight DW to the weight of process Q, i.e. for process Q, W = W + DW.
- If process Q was idle, it becomes active on receiving B(DW).
Rule to send C(DW):
- Any active process having weight W can become idle by sending a control message C(W) to the controlling agent; here DW = W.
- Set the weight of the process to 0, i.e. W = 0. (After this the process becomes idle.)
On receiving C(DW) by the controlling agent:
- Add the weight received through the control message to the weight of the controlling agent, i.e. W = W + DW.
- After adding, if the weight of the controlling agent becomes 1, then it can be concluded that the computation has terminated.
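A minimal Python simulation of these weight rules using exact fractions (illustrative; the choice of halving the weight and the immediate message delivery are assumptions of the sketch):

from fractions import Fraction

class HuangProcess:
    def __init__(self):
        self.weight = Fraction(0)
        self.active = False

def send_basic(sender, receiver):
    # B(DW): the sender gives part (here, half) of its weight to the receiver
    dw = sender.weight / 2
    sender.weight -= dw
    receiver.weight += dw
    receiver.active = True

def become_idle(process, controller):
    # C(W): the process's whole weight returns to the controlling agent
    controller.weight += process.weight
    process.weight = Fraction(0)
    process.active = False

controller = HuangProcess()
controller.weight = Fraction(1)     # initial weight of the controlling agent
p, q = HuangProcess(), HuangProcess()
send_basic(controller, p)           # computation starts
send_basic(p, q)
become_idle(p, controller)
become_idle(q, controller)
assert controller.weight == 1       # weight 1 recovered: termination detected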
Advantages of Huang’s Algorithm:
- The algorithm detects every true termination in finite time.
Limitations of Huang’s Algorithm:
- The algorithm is unable to detect termination if a message is lost in transit.
- It also does not work if a process fails while in an active state.
TRANSACTION AND CONCURRENCY CONTROL
Transactions:
A transaction defines a sequence of server operations that is guaranteed by the server to be atomic in the presence of multiple clients and server crashes.
The goal of transactions is to ensure that all of the objects managed by a server remain in a consistent state when they are accessed by multiple transactions and in the presence of server crashes.
A transaction is specified by a client as a set of operations on objects to be performed as an indivisible unit by the servers managing those objects. The servers must guarantee that either the entire transaction is carried out and the results recorded in permanent storage or, in the case that one or more of them crashes, its effects are completely erased.
A client’s transaction is also regarded as indivisible from the point of view of other clients’ transactions in the sense that the operations of one transaction cannot observe the partial effects of the operations of another.
In some situations, clients require a sequence of separate requests to a server to be atomic in the sense that:
1. They are free from interference by operations being performed on behalf of other concurrent clients.
2. Either all of the operations must be completed successfully or they must have no effect at all in the presence of server crashes.
Example:
A client’s banking transaction
Transaction T:
a.withdraw(100);
b.deposit(100);
c.withdraw(200);
b.deposit(200)
A client that performs a sequence of operations on a particular bank account on behalf of a user will first lookUp the account by name and then apply the deposit, withdraw and getBalance operations directly to the relevant account.
The above example shows a simple client transaction specifying a series of related actions involving the bank accounts A, B and C. The first two actions transfer $100 from A to B and the second two transfer $200 from C to B. A client achieves a transfer operation by doing a withdrawal followed by a deposit.
Transactions originate from database management systems. In that context, a transaction is an execution of a program that accesses a database.
Desirable Properties of Transactions
Any transaction must maintain the ACID properties, viz. Atomicity, Consistency, Isolation, and Durability.
Atomicity:
A transaction must be all or nothing. A transaction either completes successfully or has no effect at all. The atomicity property of transactions requires that the servers participating in a distributed transaction either all commit it or all abort it.
This property states that a transaction is an atomic unit of processing, that is, either it is performed in its entirety or not performed at all. No partial update should exist.
Consistency:
A transaction takes the system from one consistent state to another consistent state.
Isolation:
Each transaction must be performed without interference from other transactions; in other words, the intermediate effects of a transaction must not be visible to other transactions. There should not be any interference from the other concurrent transactions that are simultaneously running.
Durability:
After a transaction has completed successfully, all its effects are saved in permanent storage. We use the term ‘permanent storage’ to refer to files held on disk or another permanent medium. Data saved in a file will survive if the server process crashes.
Introduction to Distributed Transactions:
Distributed transactions involve more than one server. A distributed transaction is any transaction whose activity involves several different servers. Distributed transactions may be either flat or nested.
A client transaction becomes distributed if it invokes operations in several different servers. There are two different ways that distributed transactions can be structured: as flat transactions and as nested transactions.
In a flat transaction, a client makes requests to more than one server. A flat client transaction completes each of its requests before going on to the next one. Therefore, each transaction accesses servers’ objects sequentially. When servers use locking, a transaction can only be waiting for one object at a time.
In a nested transaction, the top-level transaction can open subtransactions, and each subtransaction can open further subtransactions down to any depth of nesting.
Flat and Nested Distributed Transactions:
Nested Transaction:
Nested transactions are structured from sets of other transactions. They are particularly useful in distributed systems because they allow additional concurrency.
Nested transactions extend the transaction model by allowing transactions to be composed of other transactions. Thus several transactions may be started from within a transaction, allowing transactions to be regarded as modules that can be composed as required.
Nested transactions are formed by structuring transactions from other subtransactions. Nesting is particularly useful in distributed systems because it allows concurrent execution of subtransactions in separate servers. Nesting also has the advantage of allowing independent recovery of parts of a transaction.
The outermost transaction in a set of nested transactions is called the top-level transaction. Transactions other than the top-level transaction are called subtransactions.
A subtransaction appears atomic to its parent with respect to transaction failures and to concurrent access. Each subtransaction can fail independently of its parent and of the other subtransactions. When a subtransaction aborts, the parent transaction can sometimes choose an alternative subtransaction to complete its task.
For example, a transaction to deliver a mail message to a list of recipients could be structured as a set of subtransactions, each of which delivers the message to one of the recipients. If one or more of the subtransactions fails, the parent transaction could record the fact and then commit, with the result that all the successful child transactions commit. It could then start another transaction to attempt to redeliver the messages that were not sent the first time.
Flat Transaction:
When we need to distinguish our original form of transaction from nested ones, we use the term flat transaction. It is flat because all of its work is done at the same level between an openTransaction and a commit or abort, and it is not possible to commit or abort parts of it.
Nested transactions have the following main advantages:
1. Subtransactions at one level (and their descendants) may run concurrently with other subtransactions at the same level in the hierarchy. This can allow additional concurrency in a transaction. When subtransactions run in different servers, they can work in parallel.
For example, consider the branchTotal operation in our banking example. It can be implemented by invoking getBalance at every account in the branch. Each of these invocations may be performed as a subtransaction, in which case they can be performed concurrently. Since each one applies to a different account, there will be no conflicting operations among the subtransactions. A sketch of this appears after the list below.
2. Subtransactions can commit or abort independently. In comparison with a single transaction, a set of nested subtransactions is potentially more robust. The above example of delivering mail shows that this is so – with a flat transaction, one transaction failure would cause the whole transaction to be restarted. In fact, a parent can decide on different actions according to whether a subtransaction has aborted or not.
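The sketch below illustrates advantage 1 with the branchTotal example. It is only an approximation: get_balance reads a local dictionary where a real system would invoke getBalance at a (possibly remote) server, and a thread pool stands in for subtransactions running in parallel on different servers.

```
# Illustrative sketch: branchTotal as concurrent getBalance
# subtransactions. Each one touches a different account, so the
# subtransactions have no conflicting operations.
from concurrent.futures import ThreadPoolExecutor

accounts = {"acc1": 120, "acc2": 340, "acc3": 95}

def get_balance(account_id):
    # Stand-in for a getBalance invocation at a remote server.
    return accounts[account_id]

def branch_total():
    with ThreadPoolExecutor() as pool:
        # The parent transaction sums the results of the concurrently
        # executing subtransactions.
        return sum(pool.map(get_balance, accounts))

print(branch_total())   # 555
```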
Atomic Commit Protocols:
An atomic commit protocol is a cooperative procedure used by a set of servers involved in a distributed transaction. It enables the servers to reach a joint decision as to whether a transaction can be committed or aborted.
Transaction commit protocols were devised in the early 1970s, and the two-phase commit protocol appeared in Gray [1978]. The atomicity property of transactions requires that when a distributed transaction comes to an end, either all of its operations are carried out or none of them. In the case of a distributed transaction, the client has requested operations at more than one server. A transaction comes to an end when the client requests that it be committed or aborted. A simple way to complete the transaction in an atomic manner is for the coordinator to communicate the commit or abort request to all of the participants in the transaction and to keep on repeating the request until all of them have acknowledged that they have carried it out. This is an example of a one-phase atomic commit protocol.
Two-Phase Commit Protocol:
During the progress of a transaction, there is no communication between the coordinator and the participants apart from the participants informing the coordinator when they join the transaction. A client’s request to commit (or abort) a transaction is directed to the coordinator. If the client requests abortTransaction, or if the transaction is aborted by one of the participants, the coordinator informs all participants immediately. It is when the client asks the coordinator to commit the transaction that the two-phase commit protocol comes into use. In the first phase of the two-phase commit protocol the coordinator asks all the participants if they are prepared to commit; in the second, it tells them to commit (or abort) the transaction. If a participant can commit its part of a transaction, it will agree as soon as it has recorded the changes it has made (to the objects) and its status in permanent storage and is therefore prepared to commit.
The two-phase commit protocol is designed to allow any participant to abort its part of a transaction. Due to the requirement for atomicity, if one part of a transaction is aborted, then the whole transaction must be aborted. In the first phase of the protocol, each participant votes for the transaction to be committed or aborted. Once a participant has voted to commit a transaction, it is not allowed to abort it. Therefore, before a participant votes to commit a transaction, it must ensure that it will eventually be able to carry out its part of the commit protocol, even if it fails and is replaced in the interim. A participant in a transaction is said to be in a prepared state for a transaction if it will eventually be able to commit it. To make sure of this, each participant saves in permanent storage all of the objects that it has altered in the transaction, together with its status – prepared.
In the second phase of the protocol, every participant in the transaction carries out the joint decision. If any one participant votes to abort, then the decision must be to abort the transaction. If all the participants vote to commit, then the decision is to commit the transaction.
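The following simplified simulation shows only this vote-then-decide message flow, under the assumption that participants respond immediately and never crash; a real implementation would also record objects, votes and decisions in permanent storage and handle timeouts. Class and method names are illustrative.

```
# Simplified two-phase commit: phase 1 collects votes, phase 2
# carries out the joint decision. No failures or logging modelled.

class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit

    def prepare(self):
        # Phase 1 vote. A real participant would first save its
        # altered objects and its 'prepared' status to permanent storage.
        return self.can_commit

    def commit(self):
        print(self.name, "committed")

    def abort(self):
        print(self.name, "aborted")

class Coordinator:
    def __init__(self, participants):
        self.participants = participants

    def run(self):
        votes = [p.prepare() for p in self.participants]  # phase 1
        if all(votes):                                    # phase 2
            for p in self.participants:
                p.commit()
        else:           # a single 'no' vote forces a global abort
            for p in self.participants:
                p.abort()

Coordinator([Participant("X"), Participant("Y"),
             Participant("Z", can_commit=False)]).run()
# X, Y and Z all abort because Z voted 'no'.
```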
Concurrency Control in Distributed Transactions:
Distributed Deadlocks:
Deadlocks can arise within a single server when locking is used for concurrency control. Servers must either prevent or detect and resolve deadlocks. With deadlock detection schemes, a transaction is aborted only when it is involved in a deadlock. Most deadlock detection schemes operate by finding cycles in the transaction wait-for graph.
In a distributed system involving multiple servers being accessed by multiple transactions, a global wait-for graph can in principle be constructed from the local ones. There can be a cycle in the global wait-for graph that is not in any single local one – that is, there can be a distributed deadlock. Detection of a distributed deadlock requires a cycle to be found in the global transaction wait-for graph that is distributed among the servers that were involved in the transactions.
The wait-for graph is a directed graph in which nodes represent transactions and objects, and edges represent either an object held by a transaction or a transaction waiting for an object. There is a deadlock if and only if there is a cycle in the wait-for graph.
Figure-1 shows the interleavings of the transactions U, V and W involving the objects A and B managed by servers X and Y and objects C and D managed by server Z.
The complete wait-for graph in Figure-2(a) shows that a deadlock cycle consists of alternate edges, which represent a transaction waiting for an object and an object held by a transaction. As any transaction can only be waiting for one object at a time, objects can be left out of wait-for graphs, as shown in Figure-2(b).
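The cycle test itself is straightforward on the transaction-only form of the graph: nodes are transactions and an edge T → U means T is waiting for a lock that U holds. The sketch below uses an illustrative graph (not the one in the figures) and a depth-first search that reports the first cycle it finds.

```
# Deadlock detection on a transaction wait-for graph, with objects
# omitted as in Figure-2(b). Edge T -> U: T waits for U.

def find_cycle(wait_for):
    # Depth-first search tracking the current path; reaching a
    # transaction already on the path means the graph has a cycle.
    def dfs(node, path, on_path):
        if node in on_path:
            return path[path.index(node):] + [node]
        for nxt in wait_for.get(node, []):
            found = dfs(nxt, path + [node], on_path | {node})
            if found:
                return found
        return None

    for start in wait_for:
        cycle = dfs(start, [], set())
        if cycle:
            return cycle
    return None

# U waits for V, V waits for W, W waits for U: a distributed deadlock.
print(find_cycle({"U": ["V"], "V": ["W"], "W": ["U"]}))
# ['U', 'V', 'W', 'U']
```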
Phantom Deadlocks:
A deadlock that is ‘detected’ but is not really a deadlock is called a phantom deadlock.
In distributed deadlock detection, information about wait-for relationships between transactions is transmitted from one server to another. If there is a deadlock, the necessary information will eventually be collected in one place and a cycle will be detected. As this procedure will take some time, there is a chance that one of the transactions that holds a lock will meanwhile have released it, in which case the deadlock will no longer exist.
Consider the case of a global deadlock detector that receives local wait-for graphs from servers X and Y, as shown in Figure-A. Suppose that transaction U then releases an object at server X and requests the one held by V at server Y. Suppose also that the global detector receives server Y’s local graph before server X’s. In this case, it would detect a cycle T → U → V → T, although the edge T → U no longer exists. This is an example of a phantom deadlock.
A phantom deadlock could be detected if a waiting transaction in a deadlock cycle aborts during the deadlock detection procedure. For example, if there is a cycle T → U → V → T and U aborts after the information concerning U has been collected, then the cycle has been broken already and there is no deadlock.
Edge-Chasing Algorithm:
Edge-chasing algorithms have three steps: initiation, detection and resolution.
Initiation:
When a server notes that a transaction T starts waiting for another transaction U, where U is waiting to access an object at another server, it initiates detection by sending a probe containing the edge < T → U > to the server of the object at which transaction U is blocked. If U is sharing a lock, probes are sent to all the holders of the lock. Sometimes further transactions may start sharing the lock later on, in which case probes can be sent to them too.
Detection:
When the server of an object receives a probe < T → U > (indicating that T is waiting for a transaction U that holds a local object), it checks to see whether U is also waiting. If it is, the transaction it waits for (for example, V) is added to the probe (making it < T → U → V >), and if the new transaction (V) is waiting for another object elsewhere, the probe is forwarded. In this way, paths through the global wait-for graph are built one edge at a time. Before forwarding a probe, the server checks to see whether the transaction (for example, T) it has just added has caused the probe to contain a cycle (for example, < T → U → V → T >). If this is the case, a cycle has been found in the graph and a deadlock has been detected.
Resolution:
When a cycle is detected, a transaction in the cycle is aborted to break the deadlock.
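A compact sketch of the probe mechanism follows. It assumes, for simplicity, that each transaction waits for at most one other transaction (no shared locks) and models probe forwarding as a local recursive call rather than a message between servers.

```
# Edge chasing: probes are paths of waiting transactions, extended
# one edge at a time until a cycle appears or the path ends at an
# active transaction.

# waits_for[T] = U means transaction T is blocked waiting for U.
waits_for = {"T": "U", "U": "V", "V": "T"}

def initiate(start):
    # Initiation: send the one-edge probe <start -> holder> towards
    # the server where the holder is blocked.
    return forward([start, waits_for[start]])

def forward(probe):
    # Detection: if the transaction at the end of the probe is itself
    # waiting, add that edge and forward the probe again.
    nxt = waits_for.get(probe[-1])
    if nxt is None:
        return None            # holder is active: no deadlock this way
    if nxt in probe:
        return probe + [nxt]   # the new edge closes a cycle: deadlock
    return forward(probe + [nxt])

cycle = initiate("T")
print("deadlock:", cycle)      # deadlock: ['T', 'U', 'V', 'T']
# Resolution: abort one transaction in the cycle to break the deadlock.
```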
FAQ:
1. What is a Distributed System?
2. Define Distributed System.
3. What do you mean by a distributed system?
4. Explain the concept of a distributed system.
5. Explain Distributed System in brief.
6. Describe Distributed System.
7. Explain Distributed System with examples.
8. Explain the characteristics of Distributed System.
9. Write the challenges of Distributed System and explain them in brief.
10. What is Middleware?
11. What are the main concepts of a distributed system?
12. What are the applications of bully election algorithms?