Termination Detection in Distributed System | What is Termination Detection in Distributed system?

Introduction:

Termination detection in distributed systems has been a popular problem of study. It involves determining whether a computation running on multiple nodes has ceased all its activities.

Termination detection, i.e. determining whether a distributed computation being performed in a system has terminated, is a fundamental problem in distributed programming.  A distributed computation is said to have terminated when all its live processes are in the passive state and no basic messages are in transit. This is called the distributed termination condition (DTC).

Termination occurs when all processes in the distributed system become idle and there are no computational messages in transit.

In the termination detection problem, a particular process (or all of the processes) must infer when the underlying computation has terminated. A termination detection algorithm is used for this purpose.

A distributed computation is globally terminated if every process is locally terminated and there is no message in transit between any processes.

Locally terminated” state is a state in which a process has finished its computation and will not restart any action unless it receives a message. In the termination detection problem, a particular process (or all of the processes) must infer when the underlying computation has terminated.

A termination detection algorithm is used to determine if a distributed computation has terminated.

Consider a distributed algorithm executed by two processes p1 and p2, each of which may request values from the other. A process is either active or passive – a passive process is not engaged in any activity of its own but is prepared to respond with a value requested by the other. Suppose we discover that p1 is passive and that p2 is passive. To see that we may not conclude that the algorithm has terminated, consider the following scenario: when we tested p1 for passivity, a message was on its way from p2, which became passive immediately after sending it. On receipt of the message, p1 became active again – after we had found it to be passive. The algorithm had not terminated.




A termination detection (TD) algorithm must ensure the following:

 

    1.   Execution of a TD algorithm cannot indefinitely delay the underlying computation, i.e., execution of the termination detection algorithm must not freeze the underlying computation.

    2.   The termination detection algorithm must not require addition of new communication channels between processes.

    3.   At any given time, a process can be in only one of the two states: active, where it is doing local computation and idle, where the process has (temporarily) finished the execution of its local computation and will be reactivated only on the receipt of a message from another process.

    4.   An active process can become idle at any time.

    5.   An idle process can become active only on the receipt of a message from another process.

    6.   Only active processes can send messages.

    7.   A message can be received by a process when the process is in either of the two states, i.e., active or idle. On the receipt of a message, an idle process becomes active.

    8.   The sending of a message and the receipt of a message occur as atomic actions.


Huang’s Algorithm:

Huang’s algorithm is an algorithm for detecting termination in a distributed system. The algorithm was proposed by Shing-Tsaan Huang in 1989 in the Journal of Computers.

In a distributed system, a process is either in an active state or in an idle state at any given point of time. An active process may become idle at any time but an idle process may only become active again upon receiving a computation message. Termination occurs when all processes in the distributed system become idle and there are no computation messages in transit.

 

Basic Idea:

  • When a process sends a message, it sends a part of its weight in the message.
  • When a process receives a message, it adds the weight received in the message to it’s weight.
  • Thus, the sum of weights on all the processes and on all the messages in transit is always 1.
  • When a process becomes passive (idle), it sends its weight to the controlling agent in a control message, which the controlling agent adds to its weight.
  • The controlling agent concludes termination if its weight becomes 1.

 

 

Assumptions of the Algorithm:

 

·  One of the co-operating processes which monitor the computation is called the controlling agent.

·   The initial weight of controlling agent is 1.

·   All other processes are initially idle and have weight 0.

·   The computation starts when the controlling agent sends a computation message to one of the processes.

·   The process becomes active on receiving a computation message.

·  Computation message can be sent only by controlling agent or an active process.

·  Control message is sent to controlling agent by an active process when they are becoming idle.

·   The algorithm assigns a weight W (such that 0 < W < 1) to every active process and every in transit message.

 


Notations used in the algorithm:


·   B(DW): Computation message with weight DW

·   C(DW): Control message with weight DW

 

 

Algorithm:

 

Rule to send B(DW) –

 

·   Suppose Process P with weight W is sending B(DW) to process Q

·   Split the weight of the process P into W1 and W2.
Such that

W = W1 + W2 and W1 > 0, W2 > 0

·   Set weight of the process P as W1 ( i.e W = W1 )

·   Send B (W2) to process Q, here DW = W2.

 

·   Note: Only the Controlling agent or any active process can send Computation message.

 





On receiving B(DW) by process Q –


·   Add the weight DW to the weight of process Q i.e for process Q,

            W = W + DW


·   If process Q was idle, it will become active on receiving B(DW).

 


Rule to send C(DW) –


·  Any active process having weight W can become idle by sending C(W) to controlling agent

·  Send a control message C(W) to the controlling agent. Here DW = W.

·   Set weight of the process as 0 i.e W = 0. (After this process will become idle.)

 



On receiving C(DW) by controlling agent –


·   Add the weight received through control message to the weight of controlling agent i.e W = W + DW

·  After adding, if the weight of controlling agent becomes 1 then it can be conclude that the computation has terminated.

 



Advantages of Huang’s Algorithm:


·   Guaranteed termination: Termination Detection Algorithms ensure that a distributed computation terminates eventually, even in the presence of failures such as process crashes, message loss, or network partitioning. This guarantees that the distributed system will not be stuck in an infinite loop or deadlock.


·   Scalability: Termination Detection Algorithms are scalable, which means that they can handle large distributed systems with many processes.


·   Fault tolerance: Termination Detection Algorithms are fault-tolerant, which means that they can handle process crashes, message loss, and other types of failures that can occur in a distributed system.


·   Easy to implement: Termination Detection Algorithms are easy to implement and do not require any special hardware or software.

 


 Limitations of Huang’s Algorithm:


·   The algorithm is unable to detect termination if a message is lost in transit.

·   It also does not work if a process fails while in an active state.




Post a Comment

0 Comments