{THOMAS and JOHAN 1983} argued that tolerating software faults in the traditional way, through state restoration, can be expensive, and that concurrency introduces further difficulties, especially when timing constraints must be met. They suggest a straightforward, pragmatic approach for real-time systems. The approach exploits the structure of real-time systems to simplify error recovery, and it relies on an error classification scheme. A response is determined for each type of error so that service can be maintained.
The approach presents a classification scheme for errors and techniques for the provision of software fault tolerance in real-time systems. Errors are classified according to a set of definitions. An internal error is one that can be adequately handled by the process in which it is detected. An external error cannot be adequately handled by the process in which it is detected, but its effects are limited to that process. A pervasive error cannot be adequately handled by the process in which it is detected, and it results in errors in other processes.
Errors are also classified by incidence. An error is persistent if the frequency of occurrence of the associated fault exceeds some predetermined threshold, and it is transient if it is not persistent.
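The two classifications described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Scope`, `Incidence`, `classify_scope`, and `classify_incidence` are hypothetical, and measuring frequency as occurrences per second over an observation window is an assumed reading of "frequency of occurrence", not something the source specifies.

```python
from enum import Enum


class Scope(Enum):
    INTERNAL = "internal"    # adequately handled by the detecting process
    EXTERNAL = "external"    # not handled there, but effects stay in that process
    PERVASIVE = "pervasive"  # not handled there, and causes errors in other processes


class Incidence(Enum):
    TRANSIENT = "transient"
    PERSISTENT = "persistent"


def classify_scope(handled_locally: bool, affects_other_processes: bool) -> Scope:
    """Map the two questions in the definitions onto the three error classes."""
    if handled_locally:
        return Scope.INTERNAL
    if affects_other_processes:
        return Scope.PERVASIVE
    return Scope.EXTERNAL


def classify_incidence(occurrences: int, window_seconds: float,
                       threshold_per_second: float) -> Incidence:
    """Persistent if the observed fault frequency exceeds the threshold (assumed unit: per second)."""
    frequency = occurrences / window_seconds
    if frequency > threshold_per_second:
        return Incidence.PERSISTENT
    return Incidence.TRANSIENT
```

Under these assumptions, an error that is not handled locally but stays confined to its process is classified as external, and a fault whose frequency equals the threshold exactly is still transient, because the definition requires the threshold to be exceeded.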
Recovery and continued service for internal errors are attempted only for those errors for which explicit provision has been made in the system design. Techniques for internal recovery by a process include ad hoc repair as part of a local exception handler, or a general approach such as the systematic state restoration employed by recovery blocks. Continued service is provided in an arbitrary fashion following recovery...

... middle of paper ...

...chitecture’s properties gives assurance of reconfiguration for that system. To provide fault tolerance in this framework, the fault-tolerant action has to be redesigned. A fault-tolerant action is one that either executes correctly to completion on one computer or, when it experiences a failure that prevents the action from being performed, is restarted on another computer and completes a specified recovery protocol. The recovery protocol may complete only the original action, either by restarting the action or by some alternative means. Their framework takes a broader view of the recovery protocol: the recovery action might be a reconfiguration of the system so that the next action completes some useful, but often different, function. The framework therefore requires that the system either carries out the function requested or puts itself into a state where the next action can execute some appropriate function.
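The control flow of such a fault-tolerant action can be sketched as follows. This is a minimal illustration, not the authors' design: `ActionFailed`, `run_fault_tolerant_action`, and the `reconfigure` callback are hypothetical names, and trying the computers in a fixed order is an assumption made only to keep the example short.

```python
from typing import Callable, List, Tuple


class ActionFailed(Exception):
    """Raised by a (hypothetical) action when a failure prevents it from completing."""


def run_fault_tolerant_action(
    action: Callable[[str], str],
    computers: List[str],
    reconfigure: Callable[[], str],
) -> Tuple[str, str]:
    """Return ("completed", result) or ("reconfigured", new_state).

    The action is tried on each computer in turn (restart on another computer).
    If it cannot complete anywhere, the recovery protocol reconfigures the
    system so that the next action can execute some appropriate function.
    """
    for computer in computers:
        try:
            return ("completed", action(computer))
        except ActionFailed:
            continue  # narrow recovery: restart the same action elsewhere
    # Broader recovery: give up on the original action and reconfigure.
    return ("reconfigured", reconfigure())
```

The two return values mirror the requirement stated above: either the requested function is carried out, or the system is left in a state from which the next action can do something appropriate.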

