Remote Control of Space Robots Change-Adaptive in its External Environment

This paper presents a method for remote bilateral control of space robots operating in a non-deterministic environment with a large delay in control signal transmission. The method adapts the behavior of the space robot to possible changes in its external environment. Compared with known approaches, it reduces the influence of variations in the external environment on the control process.

Keywords: bilateral control, remote control, local sensory systems, adaptive control, stability of control process.


Introduction
One of the most promising and in-demand fields of robot application is the performance of various kinds of work in space. This yields a large economic benefit and spares people from having to stay in the dangerous space environment while the required operations are performed. That is why the problem of creating space robots controlled remotely from a ground control center is extremely important.
Unfortunately, progress on this problem is not yet sufficient to create reactive space robots capable of successfully performing the required actions in space, although research in this direction has a long history.
There are two main reasons for this. The first is the non-determinism of the external space environment in which the robot acts (as opposed to, for example, an industrial environment). The second is the nontriviality of the actions that a space robot must perform, unlike industrial robots, which in most cases must grasp objects of known shape and move them from one precisely known position to another. Space robots perform significantly more complex operations in a non-deterministic external environment, so their control system must in effect reproduce the functions of the human central nervous system and brain, which generate the signals controlling the muscles of the body and hands that perform the required action.
If such operations have to be performed by robots not in space but on Earth in a non-deterministic environment, copying bilateral control is usually used, which makes it possible to use human intellectual capabilities and the human central nervous and sensory systems for control.
In this type of control, a so-called master arm is used. Its movement is repeated by the task tool (gripper) of the controlled robot. Thus, a person controls the movement of the gripper by means of vision, making the gripper move in space in the manner required by the task being performed. If the gripper moves an object subject to mechanical constraints, interaction forces between the gripper with the moved object and the external environment may arise during the operation. These forces can be measured by a special force-moment sensor, usually mounted on the wrist of the arm. They are then transmitted to the handle of the master arm by a force-moment control system. These forces are perceived by the human hand moving the handle, which allows the human to instantly correct the movement of the handle and, accordingly, of the robotic gripper.
The operation can be completed successfully only if the human responds to the interaction forces instantaneously. Any delay in this response makes the required operation difficult to perform, and a delay of more than 0.2 seconds makes it impossible. That is why bilateral control of space robots in its present form is impossible and requires radical improvement.
By now, the following approaches to the remote control of space robots can be outlined.
The first is based on the so-called passivity-based bilateral control scheme [1,2], in which the power developed by the manipulator's task tool, controlled bilaterally by the master arm, must not exceed the power developed by the human hand moving the master arm.
Although this imposes certain restrictions on the functionality of the bilateral control system, in accordance with the theory of energy dissipation it satisfies the stability requirement, which is one of the most important requirements of bilateral control and ensures its efficiency. On the other hand, the important requirement of transparency is unfortunately not fully satisfied. Transparency means that the sensations of the operator controlling the robot remotely are identical to the sensations experienced in the absence of delay. As noted above, without transparency it would be difficult for a human to perform the required operation, namely, to move objects subject to holonomic constraints.
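The power condition of the passivity-based scheme can be illustrated with a minimal sketch. It assumes scalar (one-axis) force and velocity signals, and the helper name `clamp_slave_force` is hypothetical. It is not the scheme of [1,2], only an illustration of limiting the task tool power to the power supplied by the operator's hand.

```python
def clamp_slave_force(force_cmd, slave_velocity, master_power):
    """Scale the commanded task-tool force so that |F * v| <= master_power.

    All arguments are scalars for one motion axis (an assumption of this
    sketch); master_power is the non-negative mechanical power currently
    developed by the operator's hand on the master arm.
    """
    if master_power < 0:
        raise ValueError("master_power must be non-negative")
    slave_power = abs(force_cmd * slave_velocity)
    if slave_power <= master_power:
        return force_cmd  # already within the power budget
    # slave_power > master_power >= 0 here, so the division is safe
    return force_cmd * (master_power / slave_power)
```

For example, with a commanded force of 10 N at 2 m/s and 5 W available on the master side, the force is scaled down to 2.5 N. The restriction on functionality mentioned above shows up here as exactly this kind of scaling.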
The second approach involves so-called predictive control [3,4]. It is based on computer and semi-natural models of the space robot and its external environment. Using various sensory information about the current state of the robot and its external environment, these models predict their future state. Based on this prediction, the corresponding control signals are generated. For this purpose, special controllers, in particular the Smith predictor, can be used, as presented in [3]. These controllers generate a predictive correction for the control signals of the space robot drives. The correction is generated while the master arm is moved. This approach achieves better transparency.
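The idea of a predictive correction of the Smith type can be sketched as follows. This is a minimal illustration and not the controller of [3]. It assumes a first-order discrete plant, a pure transport delay of known length, a perfect internal model and a PI controller. The function name and all parameter values are hypothetical.

```python
from collections import deque


def simulate_smith_predictor(setpoint=1.0, delay_steps=20, n_steps=400,
                             a=0.9, b=0.1, kp=2.0, ki=0.2):
    """Simulate y[k+1] = a*y[k] + b*u[k-delay] under a Smith predictor."""
    y = 0.0                 # real plant output (driven by the delayed command)
    y_model = 0.0           # internal model output without delay
    y_model_delayed = 0.0   # internal model output with the modeled delay
    integral = 0.0
    channel = deque([0.0] * delay_steps)        # real transmission delay
    model_channel = deque([0.0] * delay_steps)  # model of that delay
    history = []
    for _ in range(n_steps):
        # Predictive correction: measured output plus the difference between
        # the delay-free and the delayed model predictions.
        feedback = y + (y_model - y_model_delayed)
        error = setpoint - feedback
        u = kp * error + ki * integral
        integral += error

        channel.append(u)
        u_delayed = channel.popleft()
        model_channel.append(u)
        u_model_delayed = model_channel.popleft()

        y = a * y + b * u_delayed
        y_model = a * y_model + b * u
        y_model_delayed = a * y_model_delayed + b * u_model_delayed
        history.append(y)
    return history
```

With a perfect model the controller effectively regulates the delay-free model, so the loop stays stable and the real output follows with a pure lag of `delay_steps` samples. When the model is inaccurate, this cancellation is only approximate.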
The third approach is based on sliding mode control [5,6]. This approach is difficult to implement, since the control equipment and the mechanical part of the robot have to operate in very demanding modes, with the control signal frequently changing sign and taking extreme values. This leads to large accelerations of the structural elements and, consequently, to large reaction forces. Other methods also exist, but they are much less common.
Theoretical and experimental studies of these approaches show that the delay problem can be solved only if the delay does not exceed 1-2 s. In addition, the external environment in which the real robot manipulator is supposed to function must be "linear", i.e. linear approximations of the "predictive" corrections must be good enough.

Feature of the Proposed Method of Remote Control
The approach proposed in [7][8][9][10][11][12][13][14][15] divides the control process into two stages. The first stage, performed at the ground control center, is the stage of training the robot to perform the required action. The second is the stage in which the real space robot implements this action.
At the first stage, it is not the robot itself that is controlled but a very good model of it, possibly a computer model, but preferably a semi-natural or, if possible, a full-scale one.
The model should operate in an environment that is a model of the robot's real external environment. In this "modeled" environment, a human must perform the required operation using the robot model. For this, in particular, bilateral control of the model using a master arm may be used. The human hand moving the master arm makes the task tool of the robot model follow the trajectory of the master arm. At the same time, the human hand feels the force of interaction between the task tool of the model and the models of environmental objects, whose movement is limited by constraints. Other methods of performing operations may also be used, for example, a so-called master glove, which is mentioned below.
While the required operation is performed, appropriate sensors record a wide range of data needed for the remote control of the space robot.
These data include the trajectory, in space and time, of the position vector of the robot model's task tool in the coordinates of the robot body, the time variation of the vector of force of interaction between the task tool of the robot model and the modeled environmental objects, and data on the position of the models of the environmental objects with which the task tool of the robot is supposed to interact.
It is important to note that the law of time variation of the vector of force of interaction between the task tool of the robot and the objects of the external environment, together with the law of variation of the position vector of the object in the coordinate system of the task tool, form the necessary invariant. This invariant is a passport of the required operation and contains all the data necessary for its execution.
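One possible way to store such a passport is sketched below. The class and field names (`OperationPassport`, `PassportSample`, `desired_at`) are hypothetical, and the zero-order hold used for lookup is an assumption made for illustration. The source does not prescribe a data format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vector3 = Tuple[float, float, float]


@dataclass
class PassportSample:
    t: float                  # time from the start of the operation
    force: Vector3            # interaction force between task tool and object
    object_position: Vector3  # object position in the task tool frame


@dataclass
class OperationPassport:
    name: str
    samples: List[PassportSample] = field(default_factory=list)

    def record(self, t, force, object_position):
        """Append one training sample; times must strictly increase."""
        if self.samples and t <= self.samples[-1].t:
            raise ValueError("samples must be recorded in increasing time order")
        self.samples.append(
            PassportSample(t, tuple(force), tuple(object_position)))

    def desired_at(self, t):
        """Return the latest sample at or before t (zero-order hold).

        For t earlier than the first sample, the first sample is returned.
        """
        if not self.samples:
            raise ValueError("passport is empty")
        current = self.samples[0]
        for sample in self.samples:
            if sample.t <= t:
                current = sample
            else:
                break
        return current
```

During training each sensor reading is appended with `record`. During execution the local control system would query `desired_at` for the reference force and relative position at the current time.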
At the second stage, the real space robot is controlled. Its local control system should execute the program trajectory formed at the first stage and transmitted to it through the communication channel.
Thus, the described method of organizing remote control of space robots with a large delay in the transmission of control signals belongs to the class of off-line control methods, which involve forming a plan and then implementing it.
The success of such methods is determined by the quality of the models of the external environment and of the robot itself used in the training process. If the program trajectories are obtained during training with an inaccurate model, they will lead to erroneous behavior of the robot in the real external environment.
The proposed approach forms a correction signal for the program trajectory of the space robot's task tool, which increases the probability of successful execution of the required operation. This distinguishes it from the class of traditional off-line remote control approaches.
The possibility of correcting the program trajectory is based on the statement that any operation in which the task tool interacts with objects of the external environment has a passport. It is an invariant of the operation containing all the data necessary for its execution.
Thus, for an operation of interaction between the task tool and environmental objects to be executed successfully, the mutual position of the task tool and the object and the forces of their interaction during the operation must be identical to the position and forces in the training process. The correction signal is generated by processing additional information.
This additional information makes it possible to determine the mutual position of the task tool of the space robot model and the models of environmental objects, as well as the interaction forces between them. To obtain it, the model of the space manipulator must be equipped with a variety of sensors. These can be location, force-moment, and tactile sensors, as well as the TV cameras needed to implement a vision system.
The formation of correction signals also requires analogous current information obtained while the space robot executes the required operation, using sensors that are identical to those of the robot model and installed in the same way as on the model.
Since this additional information is the result of the functioning of the robot's sensory system, let it be called a "sensory image". The correction signal is a function of the mismatch between the "modeled" and the real sensory images. It equals zero when there is no mismatch between them. A good example of a sensory image is the set of images of characteristic points belonging to the model of the robot's external environment.
Such characteristic points can be, for example, vertices of polyhedra. A special "recognition program" distinguishes them in the image of the external environment obtained with TV cameras located on the model of the task tool. The images of the analogous points of the real external environment are generated while the robot executes the program trajectory, using TV cameras located on the real task tool in the same way as on its model. Therefore, if the program trajectory is formed ideally and executed perfectly, the images of these points must coincide with the images of the "modeled" points. In reality, however, possible inaccuracy of the modeled external environment causes a mismatch. The mismatch between the positions of the images of the modeled characteristic points and of the corresponding real points is used to form the correction to the position of the space manipulator's task tool while its control system executes the program. Sensory images can also be "force" images obtained with the wrist force-moment sensors of the robot and of its model. Processing these signals yields the force vectors of interaction between the model of the task tool and the models of the moved bodies of the external environment, as well as the force vectors of interaction between the real task tool and the real bodies.
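The dependence of the correction on the mismatch between sensory images can be illustrated with a minimal sketch. It assumes that each sensory image is a list of 2D image coordinates of already matched characteristic points, and that the correction is simply proportional to the mean offset. The function names and the gain value are hypothetical.

```python
def image_mismatch(modeled_points, real_points):
    """Per-point offsets (modeled minus real) between two sensory images."""
    if len(modeled_points) != len(real_points):
        raise ValueError("images must contain the same characteristic points")
    return [(mu - ru, mv - rv)
            for (mu, mv), (ru, rv) in zip(modeled_points, real_points)]


def correction_signal(modeled_points, real_points, gain=0.5):
    """Correction proportional to the mean mismatch; zero for equal images."""
    offsets = image_mismatch(modeled_points, real_points)
    if not offsets:
        return (0.0, 0.0)
    n = len(offsets)
    mean_du = sum(du for du, _ in offsets) / n
    mean_dv = sum(dv for _, dv in offsets) / n
    return (gain * mean_du, gain * mean_dv)
```

When the real image coincides with the modeled one the function returns `(0.0, 0.0)`, which mirrors the requirement that the correction vanishes at zero mismatch.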
As mentioned above, the correction signal results from regulating the sensory image "by deviation" from its desired value. To improve the dynamics of this process, a more sophisticated method of regulation, for example a combined one, can be used instead of regulation "by deviation".
It is important to note that the modified off-line remote control method retains all the advantages of the unmodified method, i.e. it largely removes the effect of the time lag and its variations, and at the same time depends less on the quality of the environmental model than the traditional method.
The modified off-line remote control method is most efficient for remote control in stationary or quasi-stationary environments, where objects of the external environment do not move too fast. It remains functional both when the objects of the environment move freely and when their movement is limited by constraints. An example of such an object is a cabinet with slots into which boards must be inserted along guides. Another possible environment is a surface of arbitrary profile polished with a special tool that presses on the surface with the required force. Yet another is the mating of two parts, one with a hole and the other a pin inserted into this hole.
Operations such as those described above can be used to create an interpreter for an extensible problem-oriented language implementing supervisory control of a space robot.

The Main Types of Information Generated During Training
The process of training a remotely controlled robot to perform the desired action, carried out by a human operator using a robot model in a modeled environment, generates the following data:
• The law of time variation of the vector of the robot's generalized coordinates $g(t)$, formed by sensors measuring the generalized coordinates.
• The law of time variation of the vector of force of interaction between the task tool of the robot and the object of the external environment, $F(t)$.
The listed data are used in the robot control laws.
Firstly, these laws keep the motion of the robot's task tool in free space along a trajectory close to the trajectory of the modeled task tool moved by the operator during training.
Secondly, they reproduce the force of interaction between the task tool and the object during "constrained" motion, when mechanical constraints are imposed on the moving tool. This makes it possible to successfully perform an operation requiring interaction between the task tool and the object being moved.
These control laws implement control "by deviation", and therefore the discrepancy function is an essential element of these laws. In the case considered, it is the discrepancy between the desired vector $g_d(t)$ and the current vector $g(t)$ of generalized coordinates, as well as between the desired vector $F_d(t)$ and the current vector $F(t)$ of force of interaction between the task tool and the moved object of the external environment.
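A control law built on these two discrepancy functions might look like the sketch below. It is only an illustration under strong assumptions: the force discrepancy is taken to be already expressed in the space of generalized coordinates, the law is purely proportional, and the gains `k_g` and `k_f` are hypothetical. The actual laws are analyzed in the papers cited in the next section.

```python
def discrepancy(desired, current):
    """Component-wise discrepancy between desired and current vectors."""
    if len(desired) != len(current):
        raise ValueError("vectors must have the same length")
    return [d - c for d, c in zip(desired, current)]


def control_by_deviation(g_desired, g_current, f_desired, f_current,
                         k_g=4.0, k_f=0.1):
    """Proportional control signal from coordinate and force discrepancies."""
    e_g = discrepancy(g_desired, g_current)
    e_f = discrepancy(f_desired, f_current)
    if len(e_g) != len(e_f):
        raise ValueError("coordinate and force vectors must match in length")
    return [k_g * eg + k_f * ef for eg, ef in zip(e_g, e_f)]
```

When both discrepancies are zero the control signal is zero, so the robot simply continues along the trained program.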
However, these data are not enough to form control laws that would maintain the position of the task tool relative to objects of the external environment in the same way as during training.
The data obtained as a result of training the robot must therefore be supplemented with a new data type: the time-varying position vectors $r_i(t)$, $i = 1, 2, \ldots, N$, of the so-called characteristic points of the second type on the surface of the modeled external environment of the robot. In contrast to the characteristic points of the first type, formed by the machine vision system, the positions of the characteristic points of the second type are measured by a scanning laser or radio-wave ranging device rigidly attached to the task tool of the robot's mechanical "arm".
Each position vector of a characteristic point can be represented in the coordinate system of the device, for example, in spherical coordinates as $(\rho_i^{\mathrm{m}}, \varphi_i^{\mathrm{m}}, \theta_i^{\mathrm{m}})$. When the required operation is performed by the real robot in the real external environment, characteristic points are also formed by a scanning device similar to the "modeled" one.
The position vectors $(\rho_i^{\mathrm{r}}, \varphi_i^{\mathrm{r}}, \theta_i^{\mathrm{r}})$ obtained for the characteristic points of the real external environment differ from the corresponding position vectors of the modeled characteristic points. A modeled and a real characteristic point are considered to correspond to each other if two of the three components of their position vectors are equal. For example, if the position vectors are represented in spherical coordinates, these components may be the angular coordinates: $\varphi_i^{\mathrm{m}} = \varphi_i^{\mathrm{r}}$ and $\theta_i^{\mathrm{m}} = \theta_i^{\mathrm{r}}$.
Assume that the typical difference between the real environment and its model consists only in a relative displacement and rotation of their surfaces. Then, to bring the task tool (gripper) into a position relative to the surface of the external environment that is identical to the "modeled" relative position, it is sufficient to additionally rotate and move the gripper (Fig. 1 illustrates these considerations). It is easy to prove that if the position vectors coincide for three points in three-dimensional space, then the equalities given above also hold for any larger number of corresponding characteristic points.
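The correspondence rule for characteristic points of the second type can be sketched in code. The sketch assumes that every point is a tuple `(rho, phi, theta)` in the scanner frame, with `phi` the azimuth and `theta` the polar angle measured from the z axis. The function names and the angular tolerance are hypothetical.

```python
import math


def spherical_to_cartesian(rho, phi, theta):
    """Convert scanner coordinates (range, azimuth, polar angle) to x, y, z."""
    return (rho * math.sin(theta) * math.cos(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(theta))


def match_by_angles(modeled, real, tol=1e-6):
    """Pair modeled and real points whose angular coordinates coincide.

    Returns a list of (rho_modeled, rho_real, phi, theta) tuples.
    """
    pairs = []
    for rho_m, phi_m, theta_m in modeled:
        for rho_r, phi_r, theta_r in real:
            if abs(phi_m - phi_r) <= tol and abs(theta_m - theta_r) <= tol:
                pairs.append((rho_m, rho_r, phi_m, theta_m))
                break  # one real point per modeled point
    return pairs


def range_discrepancy(pairs):
    """Remaining discrepancy of matched points: the difference in range."""
    return [rho_m - rho_r for rho_m, rho_r, _, _ in pairs]
```

Because the angular coordinates of corresponding points are equal by definition, the whole mismatch between the model and the real environment is carried by the range component, which is what `range_discrepancy` returns.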
Taking the above into account, in order to keep the position of the task tool relative to the surface of the real external environment close to the "modeled" one while controlling the real robot, it is reasonable to use in the control law a function of the discrepancy between the modeled and real position vectors of the corresponding characteristic points.

Dynamic Analysis of the Control Process with Adaptation of the Robot to the External Environment
The papers [7][8][9][10][11][12][13] present a detailed dynamic analysis of the robot control process when the control law contains terms depending on the two types of discrepancy function described above: for the desired and current vectors of the generalized coordinates, and for the desired and current forces of interaction between the task tool and objects of the external environment.
These papers also determine the requirements on the structure and parameters of the control law, as well as on the design parameters of the robot, that maintain the efficiency and stability of control while tracking the required trajectories of the task tool and the force of its interaction with environmental objects [16][17][18][19][20][21][22][23][24][25].
The stability of the control process must be preserved when the control law is extended by a new term depending on the vector of discrepancy between the desired and current positions of the task tool relative to the objects of its external environment.
To find the form of this additional term that preserves the stability of the control process, consider a functional equal to the squared modulus of the discrepancy function, $J = |r_d - r|^2$, where $r_d$ and $r$ are the desired and current position vectors.
Let us show that the control process is asymptotically stable if the additional term in the control law is a vector proportional to the anti-gradient of this functional [26][27].
The vector $r_d(t)$ contained in $J$ is a function of time only and is independent of $g$. Therefore, the additional term of the control law proportional to the anti-gradient can be represented in a form in which the matrix $K_N$ is symmetric and positive definite (due to its structure and the scalar form of the matrices it contains).
Thus, in the resulting dynamics equation (4), which is a linear approximation of the initial nonlinear dynamic description (3) of the behavior of the remotely controlled robot, all coefficients of the variables $\Delta g$, $\Delta \dot{g}$, $\Delta \ddot{g}$ are positive definite symmetric matrices. Consequently, this equation describes an asymptotically stable process [7], which is easily proved by using the Lyapunov theorem.
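The structure of the anti-gradient term can be written out explicitly. The following is a sketch under assumed notation and does not reproduce equations (3) and (4) of the source. Here $r(g)$ is the current position vector as a function of the generalized coordinates, $A = \partial r / \partial g$ is its Jacobian, $k > 0$ is a scalar gain, and $\Delta g = g - g_d$ with $r(g_d) = r_d$. Identifying $k A^{\top} A$ with the symmetric positive definite matrix mentioned in the text is likewise an assumption.

```latex
\begin{align}
J(g) &= \lvert r_d(t) - r(g) \rvert^{2}, \\
-\frac{\partial J}{\partial g} &= 2\,A^{\top}\bigl(r_d(t) - r(g)\bigr),
  \qquad A = \frac{\partial r}{\partial g}, \\
u_{\mathrm{add}} &= k\,A^{\top}\bigl(r_d(t) - r(g)\bigr)
  \;\approx\; -\,k\,A^{\top}A\,\Delta g .
\end{align}
```

The matrix $k A^{\top} A$ is symmetric by construction and positive definite whenever $A$ has full column rank, which is the property the stability argument relies on.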
Note the following useful feature of the proposed adaptive method of remote control: the characteristic points of the robot's external environment are easy to select, and their number is not fixed and can be changed during the control process.
In comparison with other methods [4], this makes it possible to increase the reliability and quality of the control process and to smooth out or avoid jumps in the control signal caused by possible non-smoothness of the external surface.
The described process of adaptive control is continuous in time. Indeed, after the data generated during training are transferred to the local control system, they are continuously executed ("worked out") by the robot's local control system until the end of the required operation. The generated data are the laws of time variation of the generalized coordinates vector, of the interaction force vector, and of the position vectors of the characteristic points. The control process is stopped only in an emergency, in which case a corresponding message is sent to the control center.
A so-called discrete approach to adaptive control is also possible. It differs in how the robot's task tool is adapted to possible inaccuracy of the modeled external environment used to train the robot.
The adaptation process is in effect the execution of an algorithm minimizing the discrepancy functional. This can be a mathematical programming algorithm, for example, the gradient method. The discreteness of the adaptation process lies in the fact that the robot moves continuously within each step of the algorithm, at which the value of the next movement $\Delta g$ is calculated.
Note that with the gradient minimization algorithm the magnitude of the gradient decreases as the functional approaches its zero minimum, i.e. as $r(t)$ approaches $r_d(t)$, as shown in (2). The step size therefore decreases, which slows down the control process.
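The slowdown of the gradient method near the minimum can be seen in a small numerical sketch. A two-link planar arm with unit link lengths stands in for the robot here. The arm model, the fixed rate and the number of steps are assumptions made for illustration and are not taken from the source.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths of the illustrative planar arm


def tool_position(g):
    """Task tool position r(g) of a two-link planar arm."""
    g1, g2 = g
    return (L1 * math.cos(g1) + L2 * math.cos(g1 + g2),
            L1 * math.sin(g1) + L2 * math.sin(g1 + g2))


def jacobian(g):
    """Jacobian dr/dg of the planar arm, as two rows."""
    g1, g2 = g
    s1, c1 = math.sin(g1), math.cos(g1)
    s12, c12 = math.sin(g1 + g2), math.cos(g1 + g2)
    return ((-L1 * s1 - L2 * s12, -L2 * s12),
            (L1 * c1 + L2 * c12, L2 * c12))


def gradient_adaptation(g0, r_desired, rate=0.2, n_steps=1000):
    """Minimize |r_desired - r(g)|^2 by fixed-rate gradient steps."""
    g = list(g0)
    step_norms = []
    for _ in range(n_steps):
        r = tool_position(g)
        e = (r_desired[0] - r[0], r_desired[1] - r[1])
        a = jacobian(g)
        # Anti-gradient direction of |e|^2 (up to the factor 2): A^T e
        dg = (rate * (a[0][0] * e[0] + a[1][0] * e[1]),
              rate * (a[0][1] * e[0] + a[1][1] * e[1]))
        g[0] += dg[0]
        g[1] += dg[1]
        step_norms.append(math.hypot(dg[0], dg[1]))
    return tuple(g), step_norms
```

The returned `step_norms` shrink as the discrepancy decreases, so many small steps are needed close to the minimum. This is the effect that motivates replacing the gradient method.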
Therefore, to speed up the process, another method adapted to the specifics of the problem is proposed. It reduces the search for the argument of the functional corresponding to its zero minimum to an iterative solution of algebraic equations by the Newton method.
In this case, the equations are formed by equating the discrepancy function to zero. Taking into account the identity of the angular coordinates of the position vectors of the corresponding characteristic points, this gives equation (5). To find the desired vector $g$ from (5) by the Newton method, the differentiable vector function $r(g)$ is expanded in a series in powers of $\Delta g$ in a neighborhood of the current value $g = g_0$, and only the linear terms of the expansion are retained, which gives (6). If the functional (Jacobian) matrix appearing in (6) is non-singular, then $\Delta g$ can be found from (6). The value $g_1 = g_0 + \Delta g$ is the first approximation of the desired argument $g$. The second approximation is found from (6) by replacing $g_0$ with the obtained value $g_1$. The process continues until the discrepancy function reaches a predetermined small value. At each step, the robot is moved by the obtained value $\Delta g$.
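The Newton iteration described above can be sketched generically. The sketch assumes a square system (as many discrepancy components as generalized coordinates) and a forward-difference approximation of the functional matrix, and it uses a two-link planar arm as an example. All function names and numerical values are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("functional matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def numerical_jacobian(func, g, h=1e-6):
    """Forward-difference approximation of d func / d g (rows: outputs)."""
    f0 = func(g)
    columns = []
    for j in range(len(g)):
        shifted = list(g)
        shifted[j] += h
        fj = func(shifted)
        columns.append([(fj[i] - f0[i]) / h for i in range(len(f0))])
    return [[columns[j][i] for j in range(len(g))] for i in range(len(f0))]


def newton_adaptation(func, g0, r_desired, tol=1e-9, max_iter=20):
    """Iterate g <- g + dg, where dg solves the linearized equation."""
    g = list(g0)
    for iteration in range(max_iter):
        residual = [rd - ri for rd, ri in zip(r_desired, func(g))]
        if max(abs(x) for x in residual) < tol:
            return g, iteration
        dg = solve_linear(numerical_jacobian(func, g), residual)
        g = [gi + di for gi, di in zip(g, dg)]  # the robot moves by dg
    return g, max_iter


def planar_arm(g):
    """Example r(g): tool position of a two-link arm with unit links."""
    g1, g2 = g
    return [math.cos(g1) + math.cos(g1 + g2),
            math.sin(g1) + math.sin(g1 + g2)]
```

For the same arm and starting point, this iteration needs only a handful of steps where the fixed-rate gradient method needs hundreds. It does require the functional matrix to be non-singular at every iterate.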

Conclusion
This article proposes a method for remote bilateral control of space robots. It develops the approaches given in [7][8][9][10][11][12][13], increasing the likelihood of successful and stable remote control of a manipulation robot operating in environments with varying topography.
The control process consists of the following steps:
• Using a video camera and a scanning three-dimensional locator, the "topography" of the external environment in which the remotely controlled robot is to function is formed. In other words, the coordinates of points on the surface of the external environment are determined with a certain scanning step in the coordinate system of the scanning device, for example, a spherical one.
• A three-dimensional model of the external environment is created in the control center of the robot. It can be full-scale or combined, consisting of full-scale and virtual elements "joined" with each other using augmented reality technology.
• The robot is trained to perform the required operation by a human operator in this modeled environment using a robot model, preferably a physical (full-scale) one. For this purpose, the human operator performs the required operation in the bilateral control mode of the robot model.
• As a result, data are generated that serve as the program for the local control system of the real robot and are "worked out" by it so that the real robot performs the required operation. These data include the laws of time variation of the generalized coordinates vector g(t) and of the vectors of force of interaction between the task tool of the robot (gripper) and objects of the external environment, whose freedom of movement may be limited by constraints (for example, in assembly operations).
• In addition, the generated data include the laws of time variation of the position vectors of characteristic points on the surface of the external environment, used to correct the position of the robot's task tool relative to objects of the external environment.
• The robot performs the desired operation.