Distributed Systems Notes: What is a Distributed System?

Introduction to Distributed Systems:

Distributed systems are everywhere. We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages. It is a collection of independent computers that appears to its users as a single coherent system.

A distributed system is thus a collection of autonomous computers linked by a computer network and equipped with distributed system software. This software enables the computers to coordinate their activities and to share the resources of the system: hardware, software and data.

We can also define a distributed system as “a collection of separate and independent software and hardware components, called nodes, that are networked and work together coherently by coordinating and communicating through message passing or events, to fulfil one goal”. The components interact with one another in order to achieve this common goal, and to the end user the distributed system appears as if it were one interface or computer.



Examples of Distributed Systems:

Examples of distributed systems are based on familiar and widely used computer networks: the Internet and the associated World Wide Web, web search, online gaming, email, social networks, e-commerce and so on.


The Internet: 

The Internet is the most widely known example of a distributed system. The Internet facilitates the connection of many different computer systems in different geographical locations to share information and resources.

The Internet is a very large distributed system. It enables users, wherever they are, to make use of services such as the World Wide Web, email and file transfer. Multimedia services are available on the Internet, enabling users to access audio and video data including music, radio and TV channels, and to hold phone and video conferences.

 

Financial Trading:

As an example, consider how distributed systems support financial trading markets. The financial industry has long been at the cutting edge of distributed systems technology with its need, in particular, for real-time access to a wide range of information sources (for example, current share prices and trends, economic and political developments). The industry employs automated monitoring and trading applications.

 

Telecommunication Networks:

Telephone and cellular networks are examples of distributed networks. Telephone networks have been around for over a century, starting as an early example of a peer-to-peer network. Cellular networks are distributed networks with base stations physically distributed in areas called cells. As telephone networks have evolved to VoIP (voice over IP), they continue to grow in complexity as distributed networks.

 

Real-time Systems:


Many industries use real-time systems distributed in various areas, locally and globally. For example, airlines use real-time flight control systems, and manufacturing plants use automation control systems. Taxi companies also employ a dispatch system, and e-commerce and logistics companies typically use real-time package tracking systems when customers order a product.

Distributed Database Systems:

A distributed database has locations across multiple servers, physical locations, or both. Users can replicate or duplicate the data across systems. Many popular applications use distributed databases and are aware of the heterogeneous or homogeneous properties of the distributed database system. A heterogeneous distributed database allows for multiple data models and different database management systems. An end user can use a gateway to translate the data between nodes, typically because of merging systems and applications.

A homogeneous distributed database is a system in which every node runs the same database management system and data model. This makes it easier to scale performance and to manage the system when adding new locations and nodes.




Characteristics of Distributed System (Features of Distributed System):

 

Resource Sharing: The main motivating factor for constructing and using distributed systems is resource sharing. Resources such as printers, files, web pages or database records are managed by servers of the appropriate type. For example, web servers manage web pages and other web resources. Resources are accessed by clients – for example, the clients of web servers are generally called browsers.

Openness: Openness is concerned with the extension and improvement of a distributed system.

Concurrency: Multiple activities are executed at the same time.

Scalability: In principle, distributed systems should also be relatively easy to expand or scale. We scale a distributed system by adding more computers to the network.

Fault Tolerance: The system can continue to operate even when some of its components fail; fault tolerance is what gives the system its reliability.

Transparency: Transparency hides the complexity of the distributed system from users and application programs. Differences between the various computers and the ways in which they communicate are mostly hidden from users. The same holds for the internal organization of the distributed system.

Heterogeneity: In distributed systems, components can vary and differ in networks, computer hardware, operating systems, programming languages and implementations by developers.

 

 

 

Goals of Distributed System:

A distributed system should make resources easily accessible; it should reasonably hide the fact that resources are distributed across a network; it should be open; and it should be scalable.

 

Making resources accessible:

The main goal of a distributed system is to make it easy for the users (and applications) to access remote resources, and to share them in a controlled and efficient way. Resources can be just about anything, but typical examples include things like printers, computers, storage facilities, data, files, Web pages, and networks, to name just a few. There are many reasons for wanting to share resources. One obvious reason is that of economics. For example, it is cheaper to let a printer be shared by several users in a small office than having to buy and maintain a separate printer for each user. Likewise, it makes economic sense to share costly resources such as supercomputers, high-performance storage systems and other expensive peripherals.

Connecting users and resources also makes it easier to collaborate and exchange information, as is clearly illustrated by the success of the Internet with its simple protocols for exchanging files, mail, documents, audio, and video.

 

Distribution transparency:

An important goal of a distributed system is to hide the fact that its processes and resources are physically distributed across multiple computers. A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent.

 


Types of Transparency:

 Access transparency: Access transparency deals with hiding differences in data representation and the way that resources can be accessed by users. At a basic level, we wish to hide differences in machine architectures, but more important is that we reach agreement on how data is to be represented by different machines and operating systems.

Location transparency: Location transparency refers to the fact that users cannot tell where a resource is physically located in the system. Naming plays an important role in achieving location transparency. In particular, location transparency can be achieved by assigning only logical names to resources, that is, names in which the location of a resource is not secretly encoded.

Migration transparency: Distributed systems in which resources can be moved without affecting how those resources can be accessed are said to provide migration transparency.

Relocation transparency: Even stronger is the situation in which resources can be relocated while they are being accessed without the user or application noticing anything. In such cases, the system is said to support relocation transparency.

Replication transparency: Replication plays a very important role in distributed systems. For example, resources may be replicated to increase availability or to improve performance by placing a copy close to the place where it is accessed. Replication transparency deals with hiding the fact that several copies of a resource exist. To hide replication from users, it is necessary that all replicas have the same name. Consequently, a system that supports replication transparency should generally support location transparency as well, because it would otherwise be impossible to refer to replicas at different locations.

Concurrency transparency: An important goal of distributed systems is to allow sharing of resources. In many cases, sharing resources is done in a cooperative way, as in the case of communication. However, there are also many examples of competitive sharing of resources. For example, two independent users may each have stored their files on the same file server or may be accessing the same tables in a shared database. In such cases, it is important that each user does not notice that the other is making use of the same resource. This phenomenon is called concurrency transparency. An important issue is that concurrent access to a shared resource leaves that resource in a consistent state. Consistency can be achieved through locking mechanisms, by which users are, in turn, given exclusive access to the desired resource.

Failure transparency: Making a distributed system failure transparent means that a user does not notice that a resource (he has possibly never heard of) fails to work properly, and that the system subsequently recovers from that failure.

 

Openness: Another important goal of distributed systems is openness. An open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services. For example, in computer networks, standard rules govern the format, contents, and meaning of messages sent and received.

Another important goal for an open distributed system is that it should be easy to configure the system out of different components (possibly from different developers). Also, it should be easy to add new components or replace existing ones without affecting those components that stay in place. In other words, an open distributed system should also be extensible. For example, in an extensible system, it should be relatively easy to add parts that run on a different operating system, or even to replace an entire file system. As many of us know from daily practice, attaining such flexibility is easier said than done.

Scalability: Scalability of a system can be measured along at least three different dimensions (Neuman, 1994). First, a system can be scalable with respect to its size, meaning that we can easily add more users and resources to the system. Second, a geographically scalable system is one in which the users and resources may lie far apart. Third, a system can be administratively scalable, meaning that it can still be easy to manage even if it spans many independent administrative organizations.

 

 

Challenges of Distributed System:

The challenges arising from the construction of distributed systems are the heterogeneity of their components, openness (which allows components to be added or replaced), security, scalability – the ability to work well when the load or the number of users increases – failure handling, concurrency of components, transparency and providing quality of service.

 

Heterogeneity:

Distributed systems must be constructed from a variety of different networks, operating systems, computer hardware and programming languages. The Internet communication protocols mask the differences between networks, and middleware can deal with the other differences.

The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks. Although the Internet consists of many different sorts of network, their differences are masked by the fact that all of the computers attached to them use the Internet protocols to communicate with one another. Different programming languages use different representations for characters and data structures such as arrays and records. These differences must be addressed if programs written in different languages are to be able to communicate with one another. Programs written by different developers cannot communicate with one another unless they use common standards.

 

Middleware:

In order to support heterogeneous computers and networks while offering a single-system view, distributed systems are often organized by means of a layer of software that is logically placed between a higher-level layer consisting of users and applications, and a layer underneath consisting of operating systems and basic communication facilities.

The term middleware applies to a software layer that provides a programming abstraction as well as masking the heterogeneity of the underlying networks, hardware, operating systems and programming languages.

The Common Object Request Broker Architecture (CORBA) is an example. Some middleware, such as Java Remote Method Invocation (RMI), supports only a single programming language. Most middleware is implemented over the Internet protocols, which themselves mask the differences of the underlying networks, but all middleware deals with the differences in operating systems and hardware.

In addition to solving the problems of heterogeneity, middleware provides a uniform computational model for use by the programmers of servers and distributed applications. Possible models include remote object invocation, remote event notification, remote SQL access and distributed transaction processing. For example, CORBA provides remote object invocation, which allows an object in a program running on one computer to invoke a method of an object in a program running on another computer. Its implementation hides the fact that messages are passed over a network in order to send the invocation request and its reply.

 

Heterogeneity and mobile code:

The term mobile code is used to refer to program code that can be transferred from one computer to another and run at the destination – Java applets are an example. Code suitable for running on one computer is not necessarily suitable for running on another, because executable programs are normally specific both to the instruction set and to the host operating system. The virtual machine approach provides a way of making code executable on a variety of host computers: the compiler for a particular language generates code for a virtual machine instead of a particular hardware order code. For example, the Java compiler produces code for a Java virtual machine, which executes it by interpretation. The Java virtual machine needs to be implemented once for each type of computer to enable Java programs to run. Today, the most commonly used form of mobile code is the inclusion of JavaScript programs in web pages loaded into client browsers.

 

Openness:

Distributed systems should be extensible. The openness of a computer system is the characteristic that determines whether the system can be extended and re-implemented in various ways. The openness of distributed systems is determined primarily by the degree to which new resource-sharing services can be added and be made available for use by a variety of client programs.

 

Security:

 Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance. Security for information resources has three components: confidentiality (protection against disclosure to unauthorized individuals), integrity (protection against alteration or corruption), and availability (protection against interference with the means to access the resources).

Encryption can be used to provide adequate protection of shared resources and to keep sensitive information secret when it is transmitted in messages over a network.

Although the Internet allows a program in one computer to communicate with a program in another computer irrespective of its location, security risks are associated with allowing free access to all of the resources in an intranet. Although a firewall can be used to form a barrier around an intranet, restricting the traffic that can enter and leave, this does not deal with ensuring the appropriate use of resources by users within an intranet, or with the appropriate use of resources in the Internet, that are not protected by firewalls.

 

Scalability:

A distributed system is scalable if the cost of adding a user is a constant amount in terms of the resources that must be added. Distributed systems operate effectively and efficiently at many different scales, ranging from a small intranet to the Internet. A system is described as scalable if it will remain effective when there is a significant increase in the number of resources and the number of users. The number of computers and servers in the Internet has increased dramatically.

 

Failure handling:

Computer systems sometimes fail. When faults occur in hardware or software, programs may produce incorrect results or may stop before they have completed the intended computation.

Failures in a distributed system are partial – that is, some components fail while others continue to function. Therefore the handling of failures is particularly difficult. The following techniques for dealing with failures are discussed here:

Detecting failures: Some failures can be detected. For example, checksums can be used to detect corrupted data in a message or a file. It is difficult or even impossible to detect some other failures, such as a remote crashed server in the Internet. The challenge is to manage in the presence of failures that cannot be detected but may be suspected.


Masking failures: Some failures that have been detected can be hidden or made less severe. Two examples of hiding failures:

1.   Messages can be retransmitted when they fail to arrive.

2.   File data can be written to a pair of disks so that if one is corrupted, the other may still be correct.


Tolerating failures: Most of the services in the Internet do exhibit failures – it would not be practical for them to attempt to detect and hide all of the failures that might occur in such a large network with so many components. Their clients can be designed to tolerate failures, which generally involves the users tolerating them as well.


Recovery from failures: Recovery involves the design of software so that the state of permanent data can be recovered or ‘rolled back’ after a server has crashed. In general, the computations performed by some programs will be incomplete when a fault occurs, and the permanent data that they update (files and other material stored in permanent storage) may not be in a consistent state.

 

Concurrency:

The presence of multiple users in a distributed system is a source of concurrent requests to its resources. Both services and applications provide resources that can be shared by clients in a distributed system. There is therefore a possibility that several clients will attempt to access a shared resource at the same time. For example, a data structure that records bids for an auction may be accessed very frequently when it gets close to the deadline time. The process that manages a shared resource could take one client request at a time. But that approach limits throughput. Therefore services and applications generally allow multiple client requests to be processed concurrently.

 

Transparency:

Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system, so that the system is perceived as a whole rather than as a collection of independent components. The implications of transparency are a major influence on the design of the system software.

The aim is to make certain aspects of distribution invisible to the application programmer so that they need only be concerned with the design of their particular application. For example, they need not be concerned with its location or the details of how its operations are accessed by other components, or whether it will be replicated or migrated. Even failures of networks and processes can be presented to application programmers in the form of exceptions – but they must be handled.

Access transparency enables local and remote resources to be accessed using identical operations.

Location transparency enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).

Concurrency transparency enables several processes to operate concurrently using shared resources without interference between them.

Replication transparency enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.

Failure transparency enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.

Mobility transparency allows the movement of resources and clients within a system without affecting the operation of users or programs.

Performance transparency allows the system to be reconfigured to improve performance as loads vary.

Scaling transparency allows the system and applications to expand in scale without change to the system structure or the application algorithms.


 

Distributed System Models:

Distributed system models are also referred to as distributed system design. The common properties and design issues for distributed systems are explained by these models.

Systems that are intended for use in real-world environments should be designed to function correctly in the widest possible range of circumstances and in the face of many possible difficulties and threats. Distributed systems of different types share important underlying properties and give rise to common design problems.

Each type of model is intended to provide an abstract, simplified but consistent description of a relevant aspect of distributed system design:

Distributed System Model is divided into two types:

    1.   Architectural Model

    2.   Fundamental Model

 







Architectural Models:

Architectural Model deals with organization of components and their interrelationships. The architecture of a system is its structure in terms of separately specified components and their interrelationships.

An architectural model of a distributed system is concerned with the placement of its components and the relationship between them. The overall goal of architectural model is to ensure that the structure will meet present and likely future demands on it.

Major concerns are to make the system reliable, manageable, adaptable and cost-effective. The architectural design of a building has similar aspects – it determines not only its appearance but also its general structure and architectural style and provides a consistent frame of reference for the design.

Architectural models describe a system in terms of the computational and communication tasks performed by its computational elements; the computational elements being individual computers or aggregates of them supported by appropriate network interconnections.

The two most commonly used forms of architectural model for distributed systems are:

  • Client-Server Model
  • Peer-to-Peer Model

Client-Server Model: It is the most important and the most widely used architectural model. In the client-server model, client processes interact with individual server processes in separate host computers in order to access the shared resources that they manage.

Servers may in turn be clients of other servers. For example, a web server is often a client of a local file server that manages the files in which the web pages are stored.

Web servers and most other Internet services are clients of the DNS service, which translates Internet domain names to network addresses. Another web-related example concerns search engines, which enable users to look up summaries of information available on web pages at sites throughout the Internet. These summaries are made by programs called web crawlers, which run in the background at a search engine site using HTTP requests to access web servers throughout the Internet. Thus a search engine is both a server and a client: it responds to queries from browser clients and it runs web crawlers that act as clients of other web servers.


Peer-to-Peer Model: In this architecture all of the processes involved in a task or activity play similar roles, interacting cooperatively as peers without any distinction between client and server processes or the computers on which they run. In practical terms, all participating processes run the same program and offer the same set of interfaces to each other.

 

 

Fundamental Models:

Fundamental models are based on some fundamental properties such as performance, reliability and security. All of the models share the design requirements of achieving the performance and reliability characteristics of processes and networks and ensuring the security of the resources in the system.

Fundamental models are based on the fundamental properties that allow us to be more specific about the characteristics of a system and the failures and security risks it might exhibit.

Fundamental models are concerned with properties that are common to all of the architectural models. The fundamental model is addressed by three models:

 

Interaction Model:

The Interaction Model is concerned with the performance of processes and communication channels and the absence of a global clock. It identifies a synchronous system as one in which known bounds may be placed on process execution time, message delivery time and clock drift. It identifies an asynchronous system as one in which no bounds may be placed on process execution time, message delivery time and clock drift – which is a description of the behaviour of the Internet.

 

Failure Model:

The Failure model attempts to give a precise specification of the faults that can be exhibited by processes and communication channels. It defines reliable communication and correct processes. The failure model classifies the failures of processes and basic communication channels in a distributed system.

 

Security Model:

The Security Model identifies the possible threats to processes and communication channels. It introduces the concept of a secure channel, which is secure against those threats. Some of those threats relate to integrity: malicious users may tamper with messages or replay them. Others threaten their privacy. Another security issue is the authentication of the principal (user or server) on whose behalf a message was sent. Secure channels use cryptographic techniques to ensure the integrity and privacy of messages and to authenticate pairs of communicating principals.

 

 

Variations of Client Server Architecture:

    ·  Proxy Server

    ·  Thin Client

    ·  Mobile Agent

 

Proxy Server:

Proxy Server refers to a server that acts as an intermediary between client and server. The proxy server is a computer on the internet that accepts the incoming requests from the client and forwards those requests to the destination server. It has its own IP address. It separates the client program and web server from the global network.

The purpose of proxy servers is to increase the availability and performance of the service by reducing the load on the wide area network and web servers. Proxy servers can take on other roles; for example, they may be used to access remote web servers through a firewall.

 

 

Thin Client: 

A thin client is a computer that runs from resources stored on a central server instead of a local hard drive. It is a virtual desktop computing model that runs on resources stored on a central server.

Thin clients are small, silent devices that communicate with a central server giving a computing experience that is largely identical to that of a PC.

 

Mobile Agent:

Mobile Agents are transportable programs that can move over the network and act on behalf of the user. A mobile agent is a running program (including both code and data) that travels from one computer to another in a network carrying out a task on someone’s behalf, such as collecting information, and eventually returning with the results. A mobile agent may make many invocations to local resources at each site it visits – for example, accessing individual database entries.

Mobile agents might be used to install and maintain software on the computers within an organization or to compare the prices of products from a number of vendors by visiting each vendor’s site and performing a series of database operations.

Mobile agents (like mobile code) are a potential security threat to the resources in computers that they visit. The environment receiving a mobile agent should decide which of the local resources it should be allowed to use, based on the identity of the user on whose behalf the agent is acting – their identity must be included in a secure way with the code and data of the mobile agent. In addition, mobile agents can themselves be vulnerable – they may not be able to complete their task if they are refused access to the information they need. The tasks performed by mobile agents can be performed by other means. For example, web crawlers that need to access resources at web servers throughout the Internet work quite successfully by making remote invocations to server processes. For these reasons, the applicability of mobile agents may be limited.





PROCESS SYNCHRONIZATION

 

Synchronization in distributed systems is often much more difficult compared to synchronization in uniprocessor or multiprocessor systems.

Time is an important practical issue. For example, we require computers around the world to timestamp electronic commerce transactions consistently. Time is also an important theoretical construct in understanding how distributed executions unfold. But time is problematic in distributed systems.

Each computer may have its own physical clock, but the clocks typically deviate, and we cannot synchronize them perfectly. Time is an important and interesting issue in distributed systems, for several reasons. First, time is a quantity we often want to measure accurately. In order to know at what time of day a particular event occurred at a particular computer it is necessary to synchronize its clock with an authoritative, external source of time. For example, an eCommerce transaction involves events at a merchant’s computer and at a bank’s computer. It is important, for auditing purposes, that those events are timestamped accurately.

Second, algorithms that depend upon clock synchronization have been developed for several problems in distribution [Liskov 1993]. These include maintaining the consistency of distributed data, checking the authenticity of a request sent to a server, and eliminating the processing of duplicate updates.

 

 

Clock, Event and Process States:

A distributed system consists of a collection P of N processes pi, i = 1, 2, ..., N.

Each process pi in P has a state, si, which, in general, it transforms as it executes.

The process’s state includes the values of all the variables within it. Its state may also include the values of any objects in its local operating system environment that it affects, such as files.

As each process pi executes it takes a series of actions, each of which is either a message send or receive operation, or an operation that transforms pi’s state – one that changes one or more of the values in si. In practice, we may choose to use a high-level description of the actions, according to the application. For example, if the processes in P are engaged in an eCommerce application, then the actions may be ones such as ‘client dispatched order message’ or ‘merchant server recorded transaction to log’.

We define an event to be the occurrence of a single action that a process carries out as it executes – a communication action or a state-transforming action.

The sequence of events within a single process pi can be placed in a single, total ordering, which we denote by the relation →i between the events. That is, e →i e′ if and only if the event e occurs before e′ at pi.

Now we can define the history of process pi to be the series of events that take place within it, ordered as we have described by the relation →i:

history(pi) = hi = <ei0, ei1, ei2, … >


Synchronizing Physical Clocks:

 

In order to know at what time of day events occur at the processes in our distributed system P – for example, for accountancy purposes – it is necessary to synchronize the processes’ clocks, Ci, with an authoritative, external source of time. This is external synchronization. And if the clocks Ci are synchronized with one another to a known degree of accuracy, then we can measure the interval between two events occurring at different computers by appealing to their local clocks, even though they are not necessarily synchronized to an external source of time. This is internal synchronization. We define these two modes of synchronization more closely as follows, over an interval of real time I:


External synchronization: 

For a synchronization bound D > 0 and for a source S of UTC time, |S(t) − Ci(t)| < D for i = 1, 2, …, N and for all real times t in I. Another way of saying this is that the clocks Ci are accurate to within the bound D.


Internal synchronization: 

For a synchronization bound D > 0, |Ci(t) − Cj(t)| < D for i, j = 1, 2, …, N and for all real times t in I. Another way of saying this is that the clocks Ci agree within the bound D.

Clocks that are internally synchronized are not necessarily externally synchronized, since they may drift collectively from an external source of time even though they agree with one another. However, it follows from the definitions that if the system P is externally synchronized with a bound D, then the same system is internally synchronized with a bound of 2D: by the triangle inequality, |Ci(t) − Cj(t)| ≤ |Ci(t) − S(t)| + |S(t) − Cj(t)| < D + D = 2D.

 


Algorithms for Physical Clock Synchronization:

Clocks are one of the most important components of computers and other devices. However, owing to various factors, these clocks may drift from their standard frequency or degrade, and may gain or lose time with respect to a reference clock; the time difference between the two clocks is called clock skew.

This clock skew may gradually increase and eventually cause desynchronization of the computer clock from the reference clock, which could affect normal operation. The computer clock therefore needs to be synchronized with the reference clock to minimize the skew.

Several algorithms and protocols have been proposed for synchronizing physical clocks:

 

Cristian Algorithm:

Cristian [1989] suggested the use of a time server, connected to a device that receives signals from a source of UTC, to synchronize computers externally. Upon request, the server process supplies the time according to its clock.

Cristian observed that while there is no upper bound on message transmission delays in an asynchronous system, the round-trip times for messages exchanged between pairs of processes are often reasonably short – a small fraction of a second. He describes the algorithm as probabilistic: the method achieves synchronization only if the observed round-trip times between client and server are sufficiently short compared with the required accuracy.
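A minimal client-side sketch of this method in Python, assuming a hypothetical time server that answers a "TIME" request with its clock value packed as an 8-byte double (the port number and message format are illustrative):

import socket
import struct
import time

def cristian_sync(server_host, server_port=8037):
    t0 = time.time()                      # local clock at request time
    with socket.create_connection((server_host, server_port)) as s:
        s.sendall(b"TIME")                # ask the server for its clock value
        data = s.recv(8)                  # 8-byte double carrying the server time
    t1 = time.time()                      # local clock at reply time
    server_time = struct.unpack("!d", data)[0]
    rtt = t1 - t0
    # Cristian's estimate: assume the reply spent about RTT/2 in transit,
    # so the server's clock now reads approximately server_time + RTT/2.
    estimate = server_time + rtt / 2
    return estimate, rtt / 2              # estimate and its accuracy bound

The achievable accuracy depends on the round-trip time: the estimate can be wrong by at most about ± RTT/2, which is why the method works well only when the observed round-trip times are short.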

 

Discussion of Cristian’s Algorithm:

As described, Cristian’s method suffers from the problem associated with all services implemented by a single server: the single time server might fail and thus render synchronization temporarily impossible. Cristian suggested, for this reason, that time should be provided by a group of synchronized time servers, each with a receiver for UTC time signals. For example, a client could multicast its request to all servers and use only the first reply obtained. The problem of dealing with faulty clocks is partially addressed by the Berkeley algorithm.

 


Berkeley Algorithm:

In the Berkeley algorithm (Gusella and Zatti [1989]), a coordinator computer called the master periodically polls the other computers whose clocks are to be synchronized, called the slaves. The slaves send back their clock values; the master estimates their local clock times by observing the round-trip times (much as in Cristian’s method) and averages the values obtained, including its own reading, while eliminating any readings that have drifted too far from the others. Instead of sending the current time back, the master sends each slave the amount by which its clock should be adjusted. The Berkeley algorithm provides internal synchronization – it does not depend on an external source of UTC – and if the master fails, another computer can be elected to take its place.

Network Time Protocol:

The Network Time Protocol (NTP) defines an architecture for a time service and a protocol for distributing time information over the Internet. NTP servers are organized into a hierarchy of levels called strata: stratum 1 servers are connected directly to a source of UTC, stratum 2 servers synchronize with stratum 1 servers, and so on down the hierarchy. NTP servers synchronize with one another in one of three modes – multicast, procedure-call (similar to Cristian’s algorithm) and symmetric mode – and apply statistical filtering over many message exchanges to reduce the effects of variable network delay.


 

Logical Time and Logical Clocks:

So far, we have assumed that clock synchronization is naturally related to real time. However, we have also seen that it may be sufficient that every node agrees on a current time, without that time necessarily being the same as the real time.

In a classic paper, Lamport (1978) showed that although clock synchronization is possible, it need not be absolute. If two processes do not interact, it is not necessary that their clocks be synchronized because the lack of synchronization would not be observable and thus could not cause problems. Furthermore, he pointed out that what usually matters is not that all processes agree on exactly what time it is, but rather that they agree on the order in which events occur.

 


 

Lamport’s Logical Clock:

To synchronize logical clocks, Lamport defined a relation called happens-before. The expression a → b is read "a happens before b" and means that all processes agree that first event a occurs, then afterward, event b occurs. The happens-before relation can be observed directly in two situations:

1. If a and b are events in the same process, and a occurs before b, then a → b is true.

2. If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a → b is also true. A message cannot be received before it is sent, or even at the same time it is sent, since it takes a finite, nonzero amount of time to arrive.

Happens-before is a transitive relation, so if a → b and b → c, then a → c. If two events, x and y, happen in different processes that do not exchange messages (not even indirectly via third parties), then x → y is not true, but neither is y → x. These events are said to be concurrent, which simply means that nothing can be said (or need be said) about when the events happened or which event happened first.

What we need is a way of measuring a notion of time such that for every event a, we can assign it a time value C(a) on which all processes agree. These time values must have the property that if a → b, then C(a) < C(b). To rephrase the conditions we stated earlier, if a and b are two events within the same process and a occurs before b, then C(a) < C(b). Similarly, if a is the sending of a message by one process and b is the reception of that message by another process, then C(a) and C(b) must be assigned in such a way that everyone agrees on the values of C(a) and C(b) with C(a) < C(b). In addition, the clock time C must always go forward (increasing), never backward (decreasing). Corrections to time can be made by adding a positive value, never by subtracting one.
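These conditions are met by Lamport’s clock rules: each process increments its counter before each event, piggybacks the counter on every message it sends, and on receiving a message sets its counter to the maximum of its own value and the received timestamp, plus one. A minimal sketch in Python:

class LamportClock:
    """Logical clock satisfying: if a -> b then C(a) < C(b)."""
    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1                  # tick before each local event
        return self.time

    def send_event(self):
        self.time += 1                  # tick, then piggyback on the message
        return self.time                # timestamp to attach to the message

    def receive_event(self, msg_time):
        # jump past both our own clock and the message's timestamp
        self.time = max(self.time, msg_time) + 1
        return self.time

Note that the converse does not hold: C(x) < C(y) does not imply x → y, since concurrent events may receive any relative timestamps.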


 

Global states: The global state of a distributed system is the set of local states of each individual process involved in the system, plus the state of each communication channel.

Global state = individual state of each process + individual state of each communication channel

 

Cut in a Distributed System:

A cut of a distributed system’s execution is a subset of its global history that contains, for each process, an initial prefix of that process’s history. A cut is consistent if, for each event it contains, it also contains all the events that happened-before that event; in particular, a consistent cut cannot contain the receipt of a message without also containing its sending. A consistent cut corresponds to a global state that could actually have occurred during the execution.


 


DISTRIBUTED MUTUAL EXCLUSION

 

Fundamental to distributed systems is the concurrency and collaboration among multiple processes. In many cases, this also means that processes will need to access the same resources simultaneously. To prevent such concurrent accesses from corrupting the resource, or making it inconsistent, solutions are needed to grant processes mutually exclusive access.

 

Requirement of Mutual Exclusion:

Distributed processes often need to coordinate their activities. If a collection of processes share a resource or collection of resources, then often mutual exclusion is required to prevent interference and ensure consistency when accessing the resources.

 


Algorithms for Mutual Exclusion:

 

The Central Server Algorithm:

The simplest way to achieve mutual exclusion is to employ a server that grants permission to enter the critical section. To enter a critical section, a process sends a request message to the server and awaits a reply from it. Conceptually, the reply constitutes a token signifying permission to enter the critical section. If no other process has the token at the time of the request, then the server replies immediately, granting the token. If the token is currently held by another process, then the server does not reply, but queues the request.

When a process exits the critical section, it sends a message to the server, giving it back the token. If the queue of waiting processes is not empty, then the server chooses the oldest entry in the queue, removes it and replies to the corresponding process. The chosen process then holds the token.
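A minimal sketch of the server’s bookkeeping under these rules, with the request and release messages modelled as method calls (names are illustrative):

from collections import deque

class CentralLockServer:
    """Grants a single token; queues requests while the token is held."""
    def __init__(self):
        self.holder = None              # pid of the current token holder
        self.queue = deque()            # waiting requests, oldest first

    def request(self, pid):
        # Grant immediately if the token is free; otherwise queue the request.
        if self.holder is None:
            self.holder = pid
            return True                 # reply sent: pid may enter
        self.queue.append(pid)
        return False                    # no reply yet: pid must wait

    def release(self, pid):
        assert self.holder == pid, "only the holder may release the token"
        if self.queue:
            self.holder = self.queue.popleft()   # grant to the oldest waiter
            return self.holder          # reply to the chosen process
        self.holder = None
        return None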

 

A Ring-Based algorithm:

One of the simplest ways to arrange mutual exclusion between N processes without requiring an additional process is to arrange them in a logical ring. The idea is that exclusion is conferred by obtaining a token in the form of a message passed from process to process in a single direction – clockwise, say – around the ring.

If a process does not require entry to the critical section when it receives the token, then it immediately forwards the token to its neighbour. A process that requires the token waits until it receives it, and then retains it. To exit the critical section, the process sends the token on to its neighbour.

This algorithm continuously consumes network bandwidth (except when a process is inside the critical section): the processes send messages around the ring even when no process requires entry to the critical section.
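A toy, single-threaded simulation of the token’s circulation (in a real system the token would be a network message; names are illustrative):

class RingNode:
    """One process in the logical ring."""
    def __init__(self, pid):
        self.pid = pid
        self.wants_entry = False

    def on_token(self):
        # Called when the token arrives; returning passes it on clockwise.
        if self.wants_entry:
            self.wants_entry = False
            print(f"process {self.pid} enters the critical section")
            # ... critical section work; the token is retained until exit ...
            print(f"process {self.pid} exits the critical section")

ring = [RingNode(i) for i in range(4)]
ring[2].wants_entry = True
for step in range(8):                   # the token visits each node in turn
    ring[step % len(ring)].on_token()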

 

Ricart-Agrawala Algorithm:

Ricart and Agrawala [1981] developed an algorithm to implement mutual exclusion between N peer processes that is based upon multicast. The basic idea is that processes that require entry to a critical section multicast a request message, and can enter it only when all the other processes have replied to this message.

Each process records its state of being outside the critical section (RELEASED), wanting entry (WANTED) or being in the critical section (HELD) in a variable state. The algorithm is given in Figure below:

 

 

On initialization
    state := RELEASED;

To enter the section
    state := WANTED;
    Multicast request to all processes;
    T := request’s timestamp;
    Wait until (number of replies received = (N – 1));
    state := HELD;

On receipt of a request <Ti, pi> at pj (i ≠ j)
    if (state = HELD or (state = WANTED and (T, pj) < (Ti, pi)))
    then
        queue request from pi without replying;
    else
        reply immediately to pi;
    end if

To exit the critical section
    state := RELEASED;
    reply to any queued requests;

 

If a process requests entry and the state of all other processes is RELEASED, then all processes will reply immediately to the request and the requester will obtain entry. If some process is in the state HELD, then that process will not reply to requests until it has finished with the critical section, and so the requester cannot gain entry in the meantime. If two or more processes request entry at the same time, then whichever process’s request bears the lowest timestamp will be the first to collect N – 1 replies, granting it entry next.

 

To Enter Critical Section:

·   When a process pi wants to enter the critical section, it sends a timestamped REQUEST message to all other processes.

·   When a process pj receives a REQUEST message from process pi, it sends a REPLY message to process pi if and only if:

o   Process pj is neither requesting nor currently executing the critical section, or

o   In case process pj is requesting, the timestamp of process pi’s request is smaller than that of its own request. Otherwise the request is deferred by process pj.

 

To Exit Critical Section:

·  Process pi enters the critical section once it has received a REPLY message from all other processes.

 

To Release the Critical Section:

·  Upon exiting the critical section, process pi sends a REPLY message to all the deferred requests.
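A compact Python sketch of this bookkeeping, with message passing modelled as direct method calls and Lamport timestamps used to order requests (names are illustrative):

class RAProcess:
    RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"

    def __init__(self, pid):
        self.pid = pid
        self.peers = []                 # the other RAProcess objects
        self.state = RAProcess.RELEASED
        self.clock = 0                  # Lamport clock for timestamps
        self.request_ts = None          # (clock, pid) of our own request
        self.deferred = []              # requesters queued without a reply
        self.replies = 0

    def request_entry(self):
        self.state = RAProcess.WANTED
        self.clock += 1
        self.request_ts = (self.clock, self.pid)
        self.replies = 0
        for p in self.peers:            # multicast REQUEST to all others
            p.on_request(self.request_ts, self)

    def on_request(self, ts, sender):
        self.clock = max(self.clock, ts[0]) + 1
        defer = self.state == RAProcess.HELD or (
            self.state == RAProcess.WANTED and self.request_ts < ts)
        if defer:
            self.deferred.append(sender)    # queue without replying
        else:
            sender.on_reply()

    def on_reply(self):
        self.replies += 1
        if self.replies == len(self.peers):
            self.state = RAProcess.HELD     # all N - 1 replies in: enter

    def release(self):
        self.state = RAProcess.RELEASED
        for p in self.deferred:             # reply to the queued requests
            p.on_reply()
        self.deferred = []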

 

 

 


 

Maekawa’s Voting Algorithm:


Maekawa [1985] observed that in order for a process to enter a critical section, it is not necessary for all of its peers to grant it access. Processes need only obtain permission to enter from subsets of their peers, as long as the subsets used by any two processes overlap. We can think of processes as voting for one another to enter the critical section. A ‘candidate’ process must collect sufficient votes to enter.
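One standard way to build such overlapping subsets (voting sets) is to place the N processes in a √N × √N grid and let each process’s set be its whole row plus its whole column, so that any two sets intersect. A sketch, assuming for simplicity that N is a perfect square:

import math

def grid_voting_sets(n):
    k = math.isqrt(n)
    assert k * k == n, "this sketch assumes n is a perfect square"
    sets = []
    for i in range(n):
        row, col = divmod(i, k)
        members = {row * k + c for c in range(k)}    # the whole row
        members |= {r * k + col for r in range(k)}   # the whole column
        sets.append(sorted(members))
    return sets

# With 16 processes each voting set has 2*4 - 1 = 7 members, and the sets
# of any two processes share at least one member (a row/column crossing).
print(grid_voting_sets(16)[0])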

 



ELECTION ALGORITHM

 


An algorithm for choosing a unique process to play a particular role is called an election algorithm.

 

Requirement of Election Algorithm:

We say that a process calls the election if it takes an action that initiates a particular run of the election algorithm. An individual process does not call more than one election at a time, but in principle the N processes could call N concurrent elections. At any point in time, a process pi is either a participant – meaning that it is engaged in some run of the election algorithm – or a non-participant – meaning that it is not currently engaged in any election. An important requirement is for the choice of elected process to be unique, even if several processes call elections concurrently. For example, two processes could decide independently that a coordinator process has failed, and both call elections.


Ring Algorithm: 


The algorithm of Chang and Roberts [1979] is suitable for a collection of processes arranged in a logical ring. Each process pi has a communication channel to the next process in the ring, and all messages are sent clockwise around the ring. We assume that no failures occur, and that the system is asynchronous. The goal of this algorithm is to elect a single process called the coordinator, which is the process with the largest identifier.

Initially, every process is marked as a non-participant in an election. Any process can begin an election. It proceeds by marking itself as a participant, placing its identifier in an election message and sending it to its clockwise neighbour.

When a process receives an election message, it compares the identifier in the message with its own. If the arrived identifier is greater, then it forwards the message to its neighbour. If the arrived identifier is smaller and the receiver is not a participant, then it substitutes its own identifier in the message and forwards it; but it does not forward the message if it is already a participant. On forwarding an election message in any case, the process marks itself as a participant. If, however, the received identifier is that of the receiver itself, then this process’s identifier must be the greatest, and it becomes the coordinator: it marks itself as a non-participant once more and sends an elected message to its neighbour, announcing its election.
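A sketch of these rules, with the ring links modelled as direct method calls (names are illustrative):

class RingProcess:
    def __init__(self, pid):
        self.pid = pid
        self.neighbour = None           # next RingProcess clockwise
        self.participant = False
        self.coordinator = None

    def start_election(self):
        self.participant = True
        self.neighbour.on_election(self.pid)

    def on_election(self, candidate):
        if candidate > self.pid:
            self.participant = True
            self.neighbour.on_election(candidate)      # forward as-is
        elif candidate < self.pid and not self.participant:
            self.participant = True
            self.neighbour.on_election(self.pid)       # substitute own id
        elif candidate == self.pid:
            self.participant = False                   # we have the largest id
            self.neighbour.on_elected(self.pid)
        # else: a smaller id arrives while already a participant: swallow it

    def on_elected(self, winner):
        self.coordinator = winner
        self.participant = False
        if winner != self.pid:
            self.neighbour.on_elected(winner)          # propagate the result

procs = [RingProcess(p) for p in (3, 17, 9, 24, 1)]
for a, b in zip(procs, procs[1:] + procs[:1]):
    a.neighbour = b
procs[0].start_election()               # every process learns coordinator 24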

 

The Bully Algorithm:


Unlike the ring-based algorithm, this algorithm assumes that the system is synchronous: it uses timeouts to detect a process failure. Another difference is that the ring-based algorithm assumed that processes have minimal a priori knowledge of one another: each knows only how to communicate with its neighbour, and none knows the identifiers of the other processes. The bully algorithm, on the other hand, assumes that each process knows which processes have higher identifiers, and that it can communicate with all such processes.

There are three types of message in this algorithm: an election message is sent to announce an election; an answer message is sent in response to an election message and a coordinator message is sent to announce the identity of the elected process – the new ‘coordinator’. A process begins an election when it notices, through timeouts, that the coordinator has failed. Several processes may discover this concurrently.

The process that knows it has the highest identifier can elect itself as the coordinator simply by sending a coordinator message to all processes with lower identifiers. On the other hand, a process with a lower identifier can begin an election by sending an election message to those processes that have a higher identifier and awaiting answer messages in response. If none arrives within time T, the process considers itself the coordinator and sends a coordinator message to all processes with lower identifiers announcing this.
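A compact, synchronous sketch of the decision logic, with messages modelled as direct method calls and a None return standing in for a timeout (names are illustrative):

class BullyProcess:
    def __init__(self, pid, alive=True):
        self.pid = pid
        self.alive = alive
        self.peers = {}                 # pid -> BullyProcess, all processes
        self.coordinator = None

    def on_election(self):
        if not self.alive:
            return None                 # no answer: the sender times out
        self.start_election()           # a live higher process takes over
        return "answer"

    def start_election(self):
        higher = [p for pid, p in self.peers.items() if pid > self.pid]
        if not any(p.on_election() for p in higher):
            # No higher process answered: announce ourselves as coordinator.
            for p in self.peers.values():
                if p.alive:
                    p.coordinator = self.pid

procs = {pid: BullyProcess(pid) for pid in (1, 2, 3, 4)}
for p in procs.values():
    p.peers = procs
procs[4].alive = False                  # the old coordinator has crashed
procs[1].start_election()               # process 3 wins and informs 1, 2, 3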




TERMINATION DETECTION

 

Termination detection, i.e. determining whether a distributed computation being performed in a system has terminated, is a fundamental problem in distributed programming. A distributed computation is said to have terminated when all its live processes are in the passive state and no basic messages are in transit. The control computation performed by the processes for detecting termination constitutes the termination detection algorithm.


Termination occurs when all processes in the distributed system become idle (passive) and there are no computational messages in transit.


Objective:

Termination detection in distributed systems has been a popular problem of study. It involves determining whether a computation running on multiple nodes has ceased (terminated) all its activities.

In the termination detection problem, a particular process (or all of the processes) must infer when the underlying computation has terminated. A termination detection algorithm is used for this purpose.


Huang’s Algorithm:

Huang’s algorithm is an algorithm for detecting termination in a distributed system. The algorithm was proposed by Shing-Tsaan Huang in 1989 in the Journal of Computers.

At any time, a process in a distributed system is either in an active state or in a passive (idle) state. An active process may become idle at any time, but an idle process may only become active again upon receiving a computational message.

 

Basic Idea:

  • When a process sends a message, it sends a part of its weight in the message.
  • When a process receives a message, it adds the weight received in the message to its weight.
  • Thus, the sum of the weights on all the processes and on all the messages in transit is always 1.
  • When a process becomes passive, it sends its weight to the controlling agent in a control message, which the controlling agent adds to its weight.
  • The controlling agent concludes termination if its weight becomes 1.

 

 

Assumptions of the algorithm:

  • One of the cooperating processes, which monitors the computation, is called the controlling agent.
  • The initial weight of the controlling agent is 1.
  • All other processes are initially idle and have weight 0.
  • The computation starts when the controlling agent sends a computation message to one of the processes.
  • The process becomes active on receiving a computation message.
  • A computation message can be sent only by the controlling agent or by an active process.
  • A control message is sent to the controlling agent by an active process when it becomes idle.
  • The algorithm assigns a weight W (such that 0 < W < 1) to every active process and every in-transit message.

 

Notations used in the algorithm:

  • B(DW): Computation message with weight DW
  • C(DW): Control message with weight DW

 

 

Algorithm:

Rule to send B(DW):

  • Suppose process P with weight W is sending B(DW) to process Q.
  • Split the weight of process P into W1 and W2, such that W = W1 + W2 and W1 > 0, W2 > 0.
  • Set the weight of process P to W1 (i.e. W = W1).
  • Send B(W2) to process Q; here DW = W2.


Note: Only the controlling agent or an active process can send a computation message.

 

On receiving B(DW) by process Q:

  • Add the weight DW to the weight of process Q, i.e. for process Q, W = W + DW.
  • If process Q was idle, it becomes active on receiving B(DW).

 

Rule to send C(DW):

  • Any active process with weight W can become idle by sending a control message C(W) to the controlling agent; here DW = W.
  • Set the weight of the process to 0, i.e. W = 0. (After this the process becomes idle.)

 

On receiving C(DW) by the controlling agent:

  • Add the weight received through the control message to the weight of the controlling agent, i.e. W = W + DW.
  • After adding, if the weight of the controlling agent becomes 1, it can be concluded that the computation has terminated.
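A minimal sketch of these rules in Python, using exact fractions so the weights always sum to exactly 1 (names are illustrative):

from fractions import Fraction

class Node:
    """The controlling agent or an ordinary process."""
    def __init__(self, weight=0):
        self.weight = Fraction(weight)
        self.active = weight > 0

    def send_computation(self, target):     # B(DW)
        dw = self.weight / 2                # split W: keep W1, send W2 = W/2
        self.weight -= dw
        target.weight += dw                 # receiver adds DW to its weight
        target.active = True                # an idle receiver becomes active

class ControllingAgent(Node):
    def __init__(self):
        super().__init__(weight=1)          # initial weight of the agent is 1
        self.terminated = False

    def on_control_message(self, dw):       # C(DW) from a process going idle
        self.weight += dw
        self.terminated = (self.weight == 1)

def become_idle(process, agent):            # send C(W), then zero the weight
    agent.on_control_message(process.weight)
    process.weight = Fraction(0)
    process.active = False

agent = ControllingAgent()
p, q = Node(), Node()
agent.send_computation(p)                   # the computation starts
p.send_computation(q)
become_idle(p, agent)
become_idle(q, agent)
print(agent.terminated)                     # True: the weight is back to 1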

 


Advantages of Huang’s Algorithm:

  • The algorithm detects every true termination in finite time.

 

Limitations of Huang’s Algorithm:

  • The algorithm is unable to detect termination if a message is lost in transit.
  • It also does not work if a process fails while in an active state.









TRANSACTION AND CONCURRENCY CONTROL

 

Transactions:


A transaction defines a sequence of server operations that is guaranteed by the server to be atomic in the presence of multiple clients and server crashes.

The goal of transactions is to ensure that all of the objects managed by a server remain in a consistent state when they are accessed by multiple transactions and in the presence of server crashes.

A transaction is specified by a client as a set of operations on objects to be performed as an indivisible unit by the servers managing those objects. The servers must guarantee that either the entire transaction is carried out and the results recorded in permanent storage or, in the case that one or more of them crashes, its effects are completely erased.

A client’s transaction is also regarded as indivisible from the point of view of other clients’ transactions in the sense that the operations of one transaction cannot observe the partial effects of the operations of another.

In some situations, clients require a sequence of separate requests to a server to be atomic in the sense that:

1. They are free from interference by operations being performed on behalf of other concurrent clients.

2. Either all of the operations must be completed successfully or they must have no effect at all in the presence of server crashes.

 

 

Example:

A client’s banking transaction

 

Transaction T:

a.withdraw(100);

b.deposit(100);

c.withdraw(200);

b.deposit(200);

 

A client that performs a sequence of operations on a particular bank account on behalf of a user will first lookUp the account by name and then apply the deposit, withdraw and getBalance operations directly to the relevant account.

The example above shows a simple client transaction specifying a series of related actions involving the bank accounts A, B and C. The first two actions transfer $100 from A to B and the second two transfer $200 from C to B. A client achieves a transfer operation by doing a withdrawal followed by a deposit.


Transactions originate from database management systems. In that context, a transaction is an execution of a program that accesses a database.

 


Desirable Properties of Transactions

Any transaction must maintain the ACID properties, viz. Atomicity, Consistency, Isolation, and Durability.

Atomicity:

A transaction must be all or nothing. A transaction either completes successfully or has no effect at all. The atomicity property of transactions requires that the servers participating in a distributed transaction either all commit it or all abort it.

This property states that a transaction is an atomic unit of processing, that is, either it is performed in its entirety or not performed at all. No partial update should exist.



Consistency:

A transaction takes the system from one consistent state to another consistent state.


Isolation:

Each transaction must be performed without interference from other transactions; in other words, the intermediate effects of a transaction must not be visible to other transactions. There should not be any interference from the other concurrent transactions that are simultaneously running.



Durability:

After a transaction has completed successfully, all its effects are saved in permanent storage. We use the term ‘permanent storage’ to refer to files held on disk or another permanent medium. Data saved in a file will survive if the server process crashes.


 


Introduction to Distributed Transactions:


Distributed transactions involve more than one server. A distributed transaction is any transaction whose activity involves several different servers. Distributed transactions may be either flat or nested.

A client transaction becomes distributed if it invokes operations in several different servers. There are two different ways that distributed transactions can be structured: as flat transactions and as nested transactions.

In a flat transaction, a client makes requests to more than one server. A flat client transaction completes each of its requests before going on to the next one. Therefore, each transaction accesses servers’ objects sequentially. When servers use locking, a transaction can only be waiting for one object at a time.

In a nested transaction, the top-level transaction can open subtransactions, and each subtransaction can open further subtransactions down to any depth of nesting.

 


Flat and Nested Distributed Transactions:


Nested Transaction:


Nested transactions are structured from sets of other transactions. They are particularly useful in distributed systems because they allow additional concurrency.

Nested transactions extend the transaction model by allowing transactions to be composed of other transactions. Thus several transactions may be started from within a transaction, allowing transactions to be regarded as modules that can be composed as required.

Nested transactions are formed by structuring transactions from other subtransactions. Nesting is particularly useful in distributed systems because it allows concurrent execution of subtransactions in separate servers. Nesting also has the advantage of allowing independent recovery of parts of a transaction.

The outermost transaction in a set of nested transactions is called the top-level transaction. Transactions other than the top-level transaction are called subtransactions.

A subtransaction appears atomic to its parent with respect to transaction failures and to concurrent access. Each subtransaction can fail independently of its parent and of the other subtransactions. When a subtransaction aborts, the parent transaction can sometimes choose an alternative subtransaction to complete its task.

For example, a transaction to deliver a mail message to a list of recipients could be structured as a set of subtransactions, each of which delivers the message to one of the recipients. If one or more of the subtransactions fails, the parent transaction could record the fact and then commit, with the result that all the successful child transactions commit. It could then start another transaction to attempt to redeliver the messages that were not sent the first time.
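
The shape of that parent transaction can be sketched as below. The recipient list, the deliverTo method and its failure rule are assumptions made for this example; the point is only that each delivery plays the role of a subtransaction that commits or aborts independently, while the parent records the failures for a later redelivery transaction.

import java.util.ArrayList;
import java.util.List;

// A hypothetical sketch of the mail-delivery example: each delivery is
// one subtransaction; the parent commits despite individual failures
// and remembers which deliveries to retry later.
public class MailDelivery {

    // Simulated per-recipient delivery; assumed to fail for some hosts.
    static boolean deliverTo(String recipient) {
        return !recipient.endsWith("@unreachable.example");
    }

    public static void main(String[] args) {
        List<String> recipients = List.of(
                "alice@example.com",
                "bob@unreachable.example",
                "carol@example.com");
        List<String> failed = new ArrayList<>();

        for (String r : recipients) {
            // Each delivery stands in for a subtransaction that commits
            // or aborts independently of its siblings.
            if (!deliverTo(r)) {
                failed.add(r);  // parent records the aborted subtransaction
            }
        }

        // The parent commits and can later start a fresh transaction to
        // redeliver the messages that were not sent the first time.
        System.out.println("Redeliver later to: " + failed);
    }
}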



Flat Transaction:

When we need to distinguish our original form of transaction from nested ones, we use the term flat transaction. It is flat because all of its work is done at the same level between an openTransaction and a commit or abort, and it is not possible to commit or abort parts of it.

Nested transactions have the following main advantages:

1. Subtransactions at one level (and their descendants) may run concurrently with other subtransactions at the same level in the hierarchy. This can allow additional concurrency in a transaction. When subtransactions run in different servers, they can work in parallel.


For example, consider the branchTotal operation in our banking example. It can be implemented by invoking getBalance at every account in the branch. Now each of these invocations may be performed as a subtransaction, in which case they can be performed concurrently. Since each one applies to a different account, there will be no conflicting operations among the subtransactions (a sketch of this follows after the second advantage below).



2. Subtransactions can commit or abort independently. In comparison with a single transaction, a set of nested subtransactions is potentially more robust. The above example of delivering mail shows that this is so – with a flat transaction, one transaction failure would cause the whole transaction to be restarted. In fact, a parent can decide on different actions according to whether a subtransaction has aborted or not.
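
Returning to the branchTotal example mentioned under the first advantage, the hypothetical Java sketch below runs the getBalance invocations concurrently; the branch accounts and their balances are assumptions, and each submitted task stands in for one subtransaction that could run on a separate server.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// A hypothetical sketch of branchTotal: each getBalance call runs as a
// concurrent subtransaction. Every task reads a different account, so
// the subtransactions cannot conflict.
public class BranchTotal {

    record Account(String name, long balance) {
        long getBalance() { return balance; }
    }

    public static void main(String[] args) throws Exception {
        List<Account> branch = List.of(
                new Account("A", 200),
                new Account("B", 500),
                new Account("C", 200));

        // One task per account, each standing in for a subtransaction.
        List<Callable<Long>> tasks = new ArrayList<>();
        for (Account acc : branch) {
            tasks.add(acc::getBalance);
        }

        ExecutorService pool = Executors.newFixedThreadPool(branch.size());
        long total = 0;
        for (Future<Long> f : pool.invokeAll(tasks)) {
            total += f.get();  // waits for each subtransaction to finish
        }
        pool.shutdown();

        System.out.println("branchTotal = " + total);  // 900
    }
}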



Atomic Commit Protocols:


An atomic commit protocol is a cooperative procedure used by a set of servers involved in a distributed transaction. It enables the servers to reach a joint decision as to whether a transaction can be committed or aborted.

Transaction commit protocols were devised in the early 1970s, and the two-phase commit protocol appeared in Gray [1978]. The atomicity property of transactions requires that when a distributed transaction comes to an end, either all of its operations are carried out or none of them. In the case of a distributed transaction, the client has requested operations at more than one server. A transaction comes to an end when the client requests that it be committed or aborted. A simple way to complete the transaction in an atomic manner is for the coordinator to communicate the commit or abort request to all of the participants in the transaction and to keep on repeating the request until all of them have acknowledged that they have carried it out. This is an example of a one-phase atomic commit protocol.



Two-Phase Commit Protocol:

During the progress of a transaction, there is no communication between the coordinator and the participants apart from the participants informing the coordinator when they join the transaction. A client’s request to commit (or abort) a transaction is directed to the coordinator. If the client requests abortTransaction, or if the transaction is aborted by one of the participants, the coordinator informs all participants immediately. It is when the client asks the coordinator to commit the transaction that the two-phase commit protocol comes into use. In the first phase of the two-phase commit protocol the coordinator asks all the participants if they are prepared to commit; in the second, it tells them to commit (or abort) the transaction. If a participant can commit its part of a transaction, it will agree as soon as it has recorded the changes it has made (to the objects) and its status in permanent storage and is therefore prepared to commit.

The two-phase commit protocol is designed to allow any participant to abort its part of a transaction. Due to the requirement for atomicity, if one part of a transaction is aborted, then the whole transaction must be aborted. In the first phase of the protocol, each participant votes for the transaction to be committed or aborted. Once a participant has voted to commit a transaction, it is not allowed to abort it. Therefore, before a participant votes to commit a transaction, it must ensure that it will eventually be able to carry out its part of the commit protocol, even if it fails and is replaced in the interim. A participant in a transaction is said to be in a prepared state for a transaction if it will eventually be able to commit it. To make sure of this, each participant saves in permanent storage all of the objects that it has altered in the transaction, together with its status – prepared.

In the second phase of the protocol, every participant in the transaction carries out the joint decision. If any one participant votes to abort, then the decision must be to abort the transaction. If all the participants vote to commit, then the decision is to commit the transaction.
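
The shape of the exchange can be condensed into the in-process Java sketch below. The names canCommit, doCommit and doAbort follow the protocol's usual vocabulary, but the participants here are plain local objects standing in for remote servers, so messaging, recording to permanent storage and failure handling are all omitted.

import java.util.List;

// A hypothetical in-process sketch of the two-phase commit protocol.
// Real participants would be remote servers reached by messages and
// would record their prepared state in permanent storage first.
public class TwoPhaseCommit {

    interface Participant {
        boolean canCommit();  // phase 1: vote yes (prepared) or no
        void doCommit();      // phase 2: carry out the joint decision
        void doAbort();
    }

    // The coordinator collects the votes, then tells every participant
    // the joint decision: commit only if all of them voted yes.
    static boolean commit(List<Participant> participants) {
        boolean allYes = true;
        for (Participant p : participants) {      // phase 1
            if (!p.canCommit()) {
                allYes = false;
                break;
            }
        }
        for (Participant p : participants) {      // phase 2
            if (allYes) p.doCommit(); else p.doAbort();
        }
        return allYes;
    }

    public static void main(String[] args) {
        Participant server = new Participant() {
            public boolean canCommit() { return true; }
            public void doCommit() { System.out.println("committed"); }
            public void doAbort()  { System.out.println("aborted"); }
        };
        System.out.println("decision = " + commit(List.of(server, server)));
    }
}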


Concurrency Control in Distributed Transactions:


 

Distributed Deadlocks:

Deadlocks can arise within a single server when locking is used for concurrency control. Servers must either prevent or detect and resolve deadlocks. With deadlock detection schemes, a transaction is aborted only when it is involved in a deadlock. Most deadlock detection schemes operate by finding cycles in the transaction wait-for graph.

In a distributed system involving multiple servers being accessed by multiple transactions, a global wait-for graph can in principle be constructed from the local ones. There can be a cycle in the global wait-for graph that is not in any single local one – that is, there can be a distributed deadlock.

Detection of a distributed deadlock requires a cycle to be found in the global transaction wait-for graph that is distributed among the servers that were involved in the transactions.

The wait-for graph is a directed graph in which nodes represent transactions and objects, and edges represent either an object held by a transaction or a transaction waiting for an object. There is a deadlock if and only if there is a cycle in the wait-for graph.
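
Since a deadlock corresponds exactly to a cycle, a server that held the whole graph could detect one with an ordinary depth-first search, as in the hypothetical sketch below; the graph here maps each transaction directly to the transactions it waits for, with objects left out as described next.

import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// A hypothetical sketch of deadlock detection on a wait-for graph:
// each transaction maps to the transactions it waits for, and there is
// a deadlock if and only if the graph contains a cycle.
public class WaitForGraph {

    static boolean hasCycle(Map<String, List<String>> waitsFor) {
        Set<String> done = new HashSet<>();    // fully explored nodes
        Set<String> onPath = new HashSet<>();  // nodes on the current path
        for (String t : waitsFor.keySet()) {
            if (dfs(t, waitsFor, done, onPath)) return true;
        }
        return false;
    }

    static boolean dfs(String t, Map<String, List<String>> g,
                       Set<String> done, Set<String> onPath) {
        if (onPath.contains(t)) return true;   // back edge: cycle found
        if (done.contains(t)) return false;
        onPath.add(t);
        for (String u : g.getOrDefault(t, List.of())) {
            if (dfs(u, g, done, onPath)) return true;
        }
        onPath.remove(t);
        done.add(t);
        return false;
    }

    public static void main(String[] args) {
        // U waits for V, V waits for W, W waits for U: a deadlock cycle.
        Map<String, List<String>> g = Map.of(
                "U", List.of("V"),
                "V", List.of("W"),
                "W", List.of("U"));
        System.out.println("deadlock = " + hasCycle(g));  // true
    }
}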

Figure-1 shows the interleavings of the transactions U, V and W involving the objects A and B managed by servers X and Y and objects C and D managed by server Z.

The complete wait-for graph in Figure-2(a) shows that a deadlock cycle consists of alternate edges, which represent a transaction waiting for an object and an object held by a transaction. As any transaction can only be waiting for one object at a time, objects can be left out of wait-for graphs, as shown in Figure-2(b).


 

Phantom Deadlocks: 

A deadlock that is ‘detected’ but is not really a deadlock is called a phantom deadlock. In distributed deadlock detection, information about wait-for relationships between transactions is transmitted from one server to another. If there is a deadlock, the necessary information will eventually be collected in one place and a cycle will be detected. As this procedure will take some time, there is a chance that one of the transactions that holds a lock will meanwhile have released it, in which case the deadlock will no longer exist.

Consider the case of a global deadlock detector that receives local wait-for graphs from servers X and Y, as shown in Figure-A. Suppose that transaction U then releases an object at server X and requests the one held by V at server Y. Suppose also that the global detector receives server Y’s local graph before server X’s. In this case, it would detect a cycle T→U→V→T, although the edge T→U no longer exists. This is an example of a phantom deadlock.

A phantom deadlock could be detected if a waiting transaction in a deadlock cycle aborts during the deadlock detection procedure. For example, if there is a cycle T→U→V→T and U aborts after the information concerning U has been collected, then the cycle has been broken already and there is no deadlock.


Edge Chasing Algorithm:

A distributed approach to deadlock detection uses a technique called edge chasing or path pushing. In this approach, the global wait-for graph is not constructed, but each of the servers involved has knowledge about some of its edges. The servers attempt to find cycles by forwarding messages called probes, which follow the edges of the graph throughout the distributed system. A probe message consists of transaction wait-for relationships representing a path in the global wait-for graph.

 

Edge-Chasing Algorithms have three steps- initiation, detection and resolution:

 

Initiation:

When a server notes that a transaction T starts waiting for another transaction U, where U is waiting to access an object at another server, it initiates detection by sending a probe containing the edge < T→U > to the server of the object at which transaction U is blocked. If U is sharing a lock, probes are sent to all the holders of the lock. Sometimes further transactions may start sharing the lock later on, in which case probes can be sent to them too.

 

Detection:

Detection consists of receiving probes and deciding whether a deadlock has occurred and whether to forward the probes.

For example, when a server of an object receives a probe < T→U > (indicating that T is waiting for a transaction U that holds a local object), it checks to see whether U is also waiting. If it is, the transaction it waits for (for example, V) is added to the probe (making it < T→U→V >) and if the new transaction (V) is waiting for another object elsewhere, the probe is forwarded. In this way, paths through the global wait-for graph are built one edge at a time. Before forwarding a probe, the server checks to see whether the transaction (for example, T) it has just added has caused the probe to contain a cycle (for example, < T→U→V→T >). If this is the case, it has found a cycle in the graph and a deadlock has been detected.

 

Resolution:

When a cycle is detected, a transaction in the cycle is aborted to break the deadlock.
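
The three steps can be condensed into the single-process Java simulation below. The wait-for table and the transaction names are assumptions; in a real system each edge is known only to one server, and the growing probe travels between servers as a message rather than through a local loop.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// A hypothetical single-process simulation of edge chasing: a probe (a
// path of transactions) is extended one wait-for edge at a time, and a
// deadlock is detected when the path returns to a transaction it
// already contains.
public class EdgeChasing {

    // waitsFor.get(T) is the transaction that T is waiting for; in a
    // real system each entry would be held by a different server.
    static final Map<String, String> waitsFor = Map.of(
            "T", "U",
            "U", "V",
            "V", "T");  // the cycle T -> U -> V -> T

    // Initiation: the server that sees 'start' begin to wait sends the
    // probe <start -> ...>. Detection: each receiver extends the probe
    // and forwards it while the last transaction is itself waiting.
    static List<String> detect(String start) {
        List<String> probe = new ArrayList<>(List.of(start));
        String current = start;
        while (waitsFor.containsKey(current)) {
            String next = waitsFor.get(current);
            probe.add(next);
            if (probe.indexOf(next) < probe.size() - 1) {
                return probe;   // 'next' already appeared: cycle detected
            }
            current = next;     // forward the longer probe
        }
        return null;            // probe died out: no deadlock
    }

    public static void main(String[] args) {
        List<String> cycle = detect("T");
        // Resolution: abort one transaction in the detected cycle.
        System.out.println(cycle == null
                ? "no deadlock"
                : "deadlock, abort one of " + cycle);
    }
}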


 

FAQ:


1.   What is a Distributed System?

2.   Define Distributed System.

3.   What do you mean by distributed System?

4.   Explain the concept of distributed System.

5.   Explain Distributed System in brief.

6.   Describe Distributed System.

7.   Explain Distributed System with examples.

8.   Explain the characteristics of Distributed System.

9.   Write the challenges of Distributed System and explain in brief.

10. What is Middleware?

11. What are the main concepts of a distributed system?

12. What are the applications of bully election algorithms?






