Tip:
Highlight text to annotate it
X
Today's lecture is on the topic adaptive neural control for affine systems - multi-input multi-output.
Last class, we talked about single-input single-output system. We will see how the techniques we
learnt in the last class can be extended to multi-input multi-output system. This is the
lecture on the module 3 which is neural control and in this course on intelligent control.
Topics that will be covered today are a revisit of adaptive neural control for single-input
single-output system. I just want to refresh your memory with what we discussed in the
last class, general form of multi-input-multi-output systems. Then, direct adaptive control of
MIMO system of the form. So, we are only considering a specific form of the multi-input multi-output
system, where this can be written in the form z1 dot is z2, z1 is z vector and z2 dot is
equal to f (z) plus g (z) u.
Where z1 and z2 are vector and of course z is z1 transpose, z2 transpose. So, if z1 is
n1 dimensional vector and z2 is n2 dimensional vector, then z is n1 plus n2 dimensional vector.
We will talk about when f (z) is unknown g (z) is known and when g (z) is unknown, we
do not know the solution.
Simulation results for two-linked manipulator system: We will present whatever solution
we will get for this kind of non-linear system, where we assume object to be unknown and we
will solve the control problem. Finally, the summary, single-input single-output affine
systems; we will be revisiting that subject again today.
A large class of single-input single-output non-linear system can be represented by the
following affine systems: an affine system can be written as x1 dot is x2 x2 dot is x3
and so on until x n dot is f x, where x is the complete vector consisting of the elements
x1 x2 x3 until xn.x is x1 x2 xn and plus g xu u is a singular input u and y are single-input
system so belongs to real line. The control problem is find u so that x t follows a desired
trajectory x desired.
Feedback linearization technique we discussed last class that, in general if I have this
particular non-linearity here which we call affine. Then if we select this control law
u to be of the form 1 upon g x minus f (x) kv r lambda 1 e n minus 1 and so on until
this is n minus 1th derivative of the error. This is the first derivative of error plus
x n desired, desired xn dot, where e is y d minus 1 is the output tracking error, r
is we said filtered tracking error which is n minus 1th derivative of error plus n minus
2, 2th derivative of error and so forth. Power d denotes respective derivatives; putting
this expression of u in the system dynamics. System dynamic is simply your x n dot is f
(x) g (x) u. If I replace this u here so I get xn dot this side and this side is in u
you have this xn d dot. When you multiply this u with g (x) you get xn d dot so bring
to this side then by definition this term is minus if you look at e is this one then
this particular term by definition becomes minus nth derivative of the error and equal
to this side if you look at this minus f (x) will cancel with this f (x).
Hence this is kv r plus lambda 1 e n minus 1 until lambda n minus 1 e first derivative.
The closed loop error dynamics becomes finally if I look at here, if I take this 1 to this
side, then this term will become r by definition here this is my r; so this will become if
I take to this side this will become r dot. r dot becomes minus kv r which is linear as
well as stable given kv and lambdas are positive parameters. These parameters are all positive.
This is speed back linearization. What you see is that, if my system is distract by this
form, xn dot is f (x) plus g (x) u and if I select control input is like this and I
am able to show that such a controller will stabilize the system. Also tracking will be
achieved and the closed error of feedback error dynamics becomes r dot equal to minus
kv r.
We extended this feedback linearization principle, for adaptive control technique. Why we are
doing adaptive control? Because, we assume that in the dynamic which is given xn dot
is f (x) plus g (x) u if this f (x) and g (x) are known, then there is no need for adaptive
control.
But most of the situations or in many situations, we do not know what is f (x) and g (x). How
do we solve this problem using the same principle, the feedback linearization concept? What we
are trying to do here is that, last class we discussed two cases, where 1 is f x is
unknown and g x is known and the other is that both are unknown. We are considering
the first case now: f x is unknown g x is known. Then if we select the control law which
is 1 upon g x, you can see that this form is exactly as that of a feedback linearization
control law, but with the exception that instead of f x which is known we have written f hat
x which is an estimate of f x. We are saying direct adaptive control because we will not
be estimating f (x) using the system identification principle, but we will be identifying f x
hat using the principle of tracking error convergence directly. Hence this control technique
is known as direct adaptive control. What I am saying here is that, where the non-linear
function f (x) is approximated as f hat x using a radial basis function network, f hat
x is W transpose phi x; W hat is the weight vector of the network. The update law for
W hat is W hat dot is minus phi F phi into r. F is a positive definite matrix, phi is
the radial basis functions easing which the unknown function F hat x is is estimated and
r is the filtered tracking error which you have already defined; r is e to the power
which is there here - e to the power n minus 1 plus lambda 1. This is our filtered tracking
error and using this filtered tracking error we have weight update rule for estimating
F hat x.
In this method the idea is not to exactly identify what is F hat x but to estimate F
hat x in such a way the tracking error is converged. We proved this theorem we are not
going to prove this theorem in this class because we have already done it; I am just
refreshing your memory.
Similarly, we also proposed a control law for when this affine system both f (x) and
g (x) are unknown. Again for your memory here I rewrite it as f (x) plus g (x) u so in this
the control law is given by u1 plus u2; where u1 is similar structure except that here we
have instead of g (x) we have g hat x and here instead of f (x) we have f hat x the
estimate of f x because we do not know f (x) and g (x) and u2 is a sliding mode term which
is the absolute value of g hat upon gl absolute value of u1 sign um sine r, sine of the filter
tracking error. gl lower bound of g (x).
What we are also assuming here g (x) is either positive or negative. g x is approximated
as g hat x using radial basis function network; g hat x is P transpose psi x. P hat is the
weight vector of the network the update law P hat dot is minus G psi u r and G is a positive
definite matrix. Similarly, the weight update law for f (x) which is f (x) hat is W transpose
phi x so this is f (x) and the W hat dot is also the same what we derived F phi r. We
have two weight update laws; one update law for the weights of the neural network that
approximates f (x) and there is another update law here, which is the weight update law for
the neural network that estimates g (x). If these are the two update laws, then the control
law given by u equal to u1 plus u2 will stabilize this non-linear system. Now adaptive control
of MIMO system using these two theorems, we have already proved in the last class today
we will extend these concepts to MIMO systems.
Let us see what a MIMO system is. We will now extend the application of proposed direct
adaptive neural control to multi-input multi-output system. A general multi-input multi-output
system can be written as x dot is f (x) plus g (x) u and y is Cx. Unlike the earlier case,
what we are writing here is, that here x dot is a vector a whole relation f (x) plus g
(x) u, where x belongs to the n dimensional vector, f (x) also is an n dimensional vector,
u is m dimensional vector, g is m into n dimensional vector, y is p dimensional vector and C is
p into m dimensional matrix. In fact, this is a matrix g (x) is a matrix and this is
also a matrix. We are talking about a multi-input multi-output system.
Now, this is a very general form and out of this form we will consider a very specific
form which is this one. This structure where this MIMO system we can write any of the system
dynamics particularly multi-link robot manipulators. All categories of multiple robot manipulators
are two-link or more than two-link robot manipulator. The dynamics can be written as x1 dot is x2
dot is f1 x plus g11 u1 until g1n um x 3 dot is x 4 x 4 dot is f 2 x plus g 2 1 x until
g2m (x) um and finally, x2n minus one dot is x 2n and x 2n dot is f (x) fn (x) plus
g n 1 x u 1 until gnm (x) um. So, what you are seeing is that here that we have written
is a generic form of a specific multi input multi output system where outputs are x1 x3
the odd number state vectors x1 x3 until x2 n minus 1. For such cases the system equation
can be rewritten as z1 dot is z2 and z2 dot is f z plus g z u.
We are clubbing these equations and representing by z1 dot is z2 and then obviously we are saying z1
is x1 odd number things x2n minus 1 transpose and z2 is x2 x 4r x2n. You see that a given
2 n state vectors can be represented as z1 dot is z2 and the next one z2 dot can be written
as f (z) plus g (z) u which you can see here, this one this one and this one. So this representation
is z2 dot is f (x) plus g x u. So this is again the n dimensional vector z2 and fx(x)
is 2 n dimensional and g x also is n into m dimensional, because u is m dimensional.
Here, this is our class of multi-input system; multi-input multi-output system that we will
be discussing today. Where z1 is x1 x3 the odd number of state vectors and z2 is even
number state elements. f (z) is f1 until fn transpose and g (z) you can easily see g (z)
is g11 so this is g11 until g1 m g2 1 until g2m gn 1 until gn m. This is the elements
gather that forms g (z) and then your final state vector z is actually 2n dimensional
where each one here is n dimensional.
Direct adaptive neural control, the output error can be defined as e is y d minus y in
the as you know that y is again n dimensional vector because we are assuming that this is z1 d
desired minus z1 since z1 is the n dimensional vector so this is valid. y d is z1 d is the
desired output vector. Let us define a variable r which is again in the form of a filtered
tracking error. r is e dot plus lambda e where lambda is a diagonal matrix with positive
diagonal element. Theorem one - suppose that the non-linear function f (z) is unknown while
the function g (z) is known, suppose also that f (z) can be approximated as f hat z
upon equal to W hat transpose phi z using a radial basis function network then the control
law given by this expression u is g transpose into g g transpose inverse whole multiplication
minus f hat z plus kvr plus this term into e dot plus z2 d dot so z2 dot. The desired
one will stabilize the system in sense of Lyapunov provided W hat is updated using the
update law minus F phi r transpose. You see that in here in-single-input single-output
case r was a scalar and in this case this is a vector n dimensional vector.
You see that when we did with using single-input single-output system we had a radial basis
function network that was estimating what is f (x). If you see the weight vector here
which is W this is a vector and how many dimensions? Dimension is l into 1 dimensional vector W.
In case of single-input single-output system the function f (x) is a scalar function the
R B F network as a single-output as shown in the figure. The network weight constitutes
a vector W hat in this case while when we do it in multi-input multi-output system.
We have n outputs here. In this case this is no more a vector W hat is a matrix. What
is the meaning of this W hat here? In case of MIMO system the function f (x) is a vector
valued function. The R B F network has multiple outputs as shown in the figure. The network
weights constitute a matrix W hat in this case. Now, we will go to the proof with this
basic idea we have already actually described.
We described that this control law, with the weight update algorithm for F hat z which
is W hat dot is minus F phi r transpose. If I have this weight update law, for estimating
F hat z then, this controller will give me the results that the actual output will follow
the desired output.
In this control law k v is a positive definite diagonal matrix W hat is the weight matrix
of appropriate dimension. Let us assume that there exists an ideal weight matrix W such
that the original vector f (z) can be represented as W transpose phi z. We can say here f (z)
is W hat transpose phi z for this W hat as to be updated. We have already given this
rule of W hat dot is minus F phi r transpose so can we say that given this as the weight
update law these controller will stabilize the given affine system. Now what I do is
that my dynamic is because f (z) is this one so z 2 z is f (z). This f z is now W transpose
phi z plus g (z) u in this u is replaced by this equation. The closed loop error dynamics
is obtained by putting u in this equation.
This is putting the control law u in the system equation and after simplification we get z2
dot is W transpose phi minus W hat transpose phi plus k v r plus lambda 1 e dot plus z2
dot. If f x would have been known, then these two terms would have cancelled. Then we would
have remained with only this term which is a stable closed loop error dynamics. Since
these two are not exact, they are different. How do we stabilize it?
We have proposed a control law for weight update. Defining the error in weight vector
is… earlier I told that in this case W is a matrix so this is very important and here
W is actually matrix; defining the weight matrix W tilde is according to this. We can
write z2 dot this particular term is W tilde transpose. This is written as W tilde transpose
phi plus this 3 terms kv r plus this term into e dot plus z dot 2 desired. We have already
defined r dot. r dot we know, we have defined r to be e dot plus so if I re-compute r dot
is e double dot plus 1 into e dot. The symbol this is simply a constant into 1 suffix 1
e dot which is z2 dot minus z2 dot plus 1e dot. This particular expression we derive
from r dot simply differentiating then replacing, in this case what is e double dot. So e double
dot is simply z2 d dot minus z2 dot. Combining this expression we get the close error dynamics
is r dot is minus kvr minus W tilde transpose phi which is the final closed loop error dynamics.
We now analyze the stability of this using Lyapunov function so what we get is that this
is our closed loop error dynamics. By replacing u in our system dynamic which is f (z2) plus
g (z) u is z2 dot. This is our dynamics we refer in this dynamics here z2 dot here, then
I get this expression.
This is our closed loop dynamics r dot is minus kv r W tilde transpose phi. Now consider
a Lyapunov function candidate V is half r transpose r plus trace half W tilde transpose
F inverse W tilde F is the positive definite matrix. Since the Lyapunov function should
be a scalar function but W hat is a matrix. In this case we have taken trace because earlier
we simply wrote this is W tilde transpose F inverse W tilde. But now we have taken trace
because of the matrix. Because this Lyapunov function has to be scalar.
Trace of a matrix those of you do not know, if I take a 3 dimensional matrix the diagonal
elements is given by a 1 1, a 2 2, a 3 3. Trace of a matrix is a 1 1, plus a 2 2 ,plus
a 3 3. Differentiating V, V dot which is r transpose r dot plus trace of W tilde transpose
F inverse W tilde dot. Substituting r dot into above equation, so this r dot which is
the closed loop dynamics this is my r dot. If I give this equation or if I replace this
equation in V dot which is r transpose r dot, we may get a nice solution for this which
we have derived now. You can see that r transpose r dot plus trace
of this thing with W tilde dot. Substituting r dot into the above equation in this equation
you get the rate of Lyapunov function to be r minus kv r minus W tilde transpose phi plus
the trace of this particular term.
Since W is a constant matrix we can always write W tilde dot to be minus W hat dot. V
dot, which we have already computed to be minus r transpose kv r minus r transpose W
tilde transpose phi plus trace minus W tilde transpose F inverse W hat dot.
By taking this example, using the properties of trace we will utilize this theorem, which
says, r transpose W tilde transpose phi is trace of W tilde transpose phi r transpose.
We can further simplify this V hat V rate derivative of the Lyapunov function to minus
r transpose kv r trace. This is our trace; this particular thing has been replaced by
this here trace W tilde transpose phi r transpose. This is trace and this is also another trace
so you can write trace is minus W transpose phi r transpose minus W tilde transpose F
inverse W tilde W hat dot. Equating the second term of the above equation to 0, this one
we want to eliminate, I can take out W tilde transpose to the left side as a common, then
I get simply phi r transpose; you see that phi r transpose plus F inverse W hat dot is
0 or this is my weight update law W ha dot is minus F phi r transpose. If this is my
update law then direct adaptive control for non-linear systems is solved.
Further the proof of the theorem again using the update law W hat dot which we just derived
minus F phi r transpose. Now let us see whether the algorithm is convergent rate derivative
Lyapunov function becomes according to our definition… this V dot becomes V dot is
minus r transpose kv r. Since V is greater than 0 and V dot is less than equal to 0 this
shows stability in the sense of Lyapunov so that r and W tilde are bounded hence the proof.
Furthermore, you can easily check this one. Again V double dot if I have found V dot then
V double dot is minus 2 r transpose kvr dot is 2 r kv square because r dot is minus kv
r. This was negative this becomes positive plus 2 r transpose k v W tilde transpose phi;
because this r dot is replaced by…
Since r and W tilde are bounded as V double dot double differential of the Lyapunov function.
Therefore, V dot is uniformly continuous. Thus according to Barbalat’s Lemma, V dot
tends to 0 and t tends to infinity; hence r vanishes with time. We showed here by saying
that this is my update law, I found out this is my rate derivative of Lyapunov function
which is negative definite which is always negative for the values r not equal to 0 when
r is equal to 0; this is 0. Then further we are showing that V dot is uniformly continuous
considering this term. Thus according to Barbalat’s Lemma if this is continuous, then this term
will go to 0 as t tends to 0. So r would vanish with time and what is r? If you look at here
r we have defined r is this particular term which is the filtered tracking error. So finally
error will converge to 0.
Now we will apply this particular application to this adaptive control theory to an actual
system in simulation and we take two-link manipulator. The dynamics of a two-link manipulator
has been taken as an example of multi-input multi-output systems. For a two-link robotic
manipulator the second and third link of a PUMA 560 robot that we have taken. The dynamical
equation which relate the joint torques tow 1 and tow 2 to the joint angles theta 1 theta
2 of the links are given as where tow 1 is a 1 plus a 2 cos theta 2 theta 1 double dot
plus a3 plus a2 by 2 cos theta 2 so this is the coefficient of theta 2 double dot plus
a 4 cos theta 1 minus a 2 sine theta 2; this whole term multiplication with this term.
The joint torque 1 is related with the joint velocities acceleration velocity which is
the co-relate force theta 1 dot theta 2 dot plus theta 2 dot whole square by 2 and the
gravity term F i cos theta 1 plus theta 2. Similarly, the joint torque in second joint
which is a 3; this is the acceleration term this is co-relation term and this is your
gravity term.
Where a1 is 3.82, a2 is 2.12, a3 is 0.71, a4 is 81.82, a5 is 24.06. The two-link manipulator
system can be re-written as theta 1 double dot is something and theta 2 double dot is
something. But you see that each equation is written in terms of theta 1 double dot
and theta 2 double dot. So I have to eliminate and then I have to find out the expression
for theta 1 double dot. Similarly, I have to find the expression of theta 2 double dot
using other terms. Doing that what you are getting theta 1 double dot theta 2 double
dot equal to 1 upon D m22 minus m12 minus m21 m11 tow 1 minus v1 tow 2 minus v2 where,
m11 is given by this expression, m12 is given by this expression, m21 is m12 and m 22 is
a3.
Where v1 is given by this expression this big expression which is a4 cos theta 1 minus
a2 sine theta 2 theta 1 dot theta 2 dot plus theta 2 dot whole square by 2 plus a5 cos
theta 1 plus theta 2. Similarly, you can see that d2 here another expression a2 sine theta
2 theta 1 dot whole square upon 2 a phi cos theta 1 plus theta 2 and D is m11 m22 minus
m12 m21. This is m11 m22 minus m12 m21 so m11 m22, diagonal element multiplication minus
the other diagonal element. Considering the state variable as x1 is theta 1 x2 is theta
1 dot and x2 is theta 2 x 4 is theta 2 dot. One can write x1 dot is x2 and x2 dot is 1
upon D and then this quantity m22 t 1 tow 1 minus v1 minus m12 tow 2 minus v2 and similarly
x3 dot is x4. Similarly x dot 4 is in this particular format 1 upon D minus m21 tow 1
minus v1 plus m11 tow 2 minus v2. You see that we said in the beginning of the class
that some of the system can be represented in this particular form. You see that this
v1 is function of the state vector x and similarly v2 also is a function of state vector x this
is also function of state vector. You see that v2 is function of theta 1; theta 2 theta
1 dot, v1 is theta 1 theta 1 dot theta 2 dot and theta 2. All the states are there here
one state is missing. But anyway in general they can be a function of this entire state
vector.
This can be represented if we define z1 to be x1 x3 and z2 is x2 and x4 then the system
can be written in our general notation that we talked which is z1 dot is z2 and z2 dot
is f (z) plus g (z) tow and going back and reformulating the problem we get f (z) is
f1 f2 which is 1 upon D and this is my first term minus m22 v1 plus m12 v2 and the second
term is m21 v1 minus m11 v2 and g z is a 2 by 2 matrix g11, g12, g21and g22 and this
is 1 upon D m22 minus, m12 minus m21 and m11.
Output y is z1 and z1 is our first link position and second link position. The reference output
trajectories are taken as which is a z1 d is x1 d and x3 d is pi, so this is a angular
position of the joint 1 and angular position of the joint 2. pi by 6 sine 2 t pi by 6 cos
2 t the control input which is ug transpose z g g transpose inverse minus W hat transpose
phi z plus k v. This term and this the desired trajectory z2 d dot so where r is our tracking
error e double dot plus lambda 1 e dot and e dot is z1 e dot minus z1 dot. This is our
control law we have already said this control law with the weight update law for W hat will
stabilize. The weight update law is we have already seen F phi r transpose. This weight
update law would surely stabilize the system.
We have selected kv is 30, 00, 30 which is a 2 by 2 matrix and lambda is 20, 20, 00.
W hat is updated using the following update law, minus f phi r transpose, where f is taken
as 20, 00, 20, the number of neurons for the radial basis network function is taken as
30, the centers of radial basis function networks are chosen randomly between 0 and 1.
Weights are initialized to very small values. If we do that the simulation results would
show the trajectory tracking force theta 1 is desired and absolutely no difference. Tracking
is so perfect so we see that we have achieved this tracking and the RMS tracking error is
found to be 0 point 0004 assuming f (x) to be unknown. Of course, in the initial period
we are not showing instead when the trajectory has settled, once the transients are gone
in their steady state that tracking is perfect because of very small value 0 point 0004.
This is a desired perfect tracking.
Similarly, here this is a trajectory tracking for theta 2. Again this is link angle theta
2 and according to time and steady state from time 3 to 8, if we compute the RMS tracking
error this is 0 point 0006 and very efficient tracking. Correspondingly, the controlled
torque tow 1 and tow 2 that were found out to be like this that you can see again in
steady state the controlled torque is very smooth without any kind of fluctuations. This
implies that the proposal algorithm is very efficient. If we go back to the control law,
we have this g, g transpose mac inverse; you know that this is a matrix. In this case the
g is two-dimensional matrix, so gg transpose is again two-dimensional matrix. But in general,
if I have n link matrix; if it is two-link then it is two dimensional. If I have a six
link matrix it is a 6 by 6 inverse matrix. Inverse means computation is more; this point
gg transpose can be computed using a recursive relation. One can work out on this that instead
of doing this inversion we can find a recursive solution for it. That this easily be computed.
Second thing is when we consider f (x) is unknown and g (x) is known, if g (x) is also
unknown or both are unknown this problem is still not solved in the control literature.
Non-linear function g is unknown the adaptive control problem becomes difficult to solve.
This is an open recursive problem.
In the summary, the following topics have been covered: adaptive control system for
single-input and single-output systems is revisited, mathematical model is provided
for general classes of multi-input multi-output system, where we could represent even practical
systems like multi-link robotic manipulators. They can also be represented in this particular
format: z1 dot z2 z2 dot zu plus g (z) u, and f (z) is unknown then this solution to
this control problem is already provided but we found that in this control law includes
an inversion which must be replaced by a recursive solution. Simulation results are provided
for a two-link manipulator system where we saw the tracking order is in the range of
10 to the power minus 4, implying tracking is very perfect. Direct adaptive control of
multi-input multi-output system, when both f (z) and g (z) are unknown is kept for the
future work. This problem is still not solved.
Those who are further interested to work on this problem, I would like that you can follow
these references in the Lewis and Jaganathan and Woodwreck, neural network control of robot
manipulators and non-linear systems. A book published by Taylor and Francis in 1999. Spoonor
and Passino they have a paper on Stable Adaptive control and Fuzzy systems and Neural Networks,
S He Konnald Reif and Rolf have published A neural approach for the control of Non-linear
systems with Feedback linearization, Choy and Farnell have published Non-linear adaptive
control using networks of linear approximates volume 11. This is our paper; Indrani Kar
and I have published Neural Network Direct adaptive control for all non-linear systems.
Thank you very much.