Modern Information Systems / 2. Computer Engineering and Programming
Kryuchin O.V., Rakitin A.A.
Tambov State University named after G.R. Derzhavin, Russia
Increasing the efficiency of information processes for function coefficient minimization using quasi-Newton methods
In optimization theory, quasi-Newton methods are algorithms for finding local maxima and minima of a function

$$f(\bar{x}) \to \min, \qquad (1)$$

where $\bar{x} = (x_1, x_2, \ldots, x_N)$ is the minimized vector.
Quasi-Newton methods are based on Newton's method for finding a stationary point of function (1), i.e. a point where the gradient is equal to 0. Newton's method assumes that function (1) can be locally approximated as a quadratic in the region around the optimum and uses the first derivative

$$\nabla f(\bar{x}) = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_N} \right) \qquad (2)$$

and the second derivative

$$H(\bar{x}) = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right]_{i,j=1}^{N} \qquad (3)$$

to find the stationary point. In formulas (2)-(3), $N$ is the size of the minimized vector. In quasi-Newton methods the Hessian matrix (3) of the minimized function does not need to be computed; the Hessian is updated by analyzing successive gradient vectors instead. Quasi-Newton methods are a generalization of the secant method, which finds the root of the first derivative, to multidimensional problems. In multiple dimensions the secant equation is under-determined, and quasi-Newton methods differ in how they constrain the solution, typically by adding a simple low-rank update to the current estimate of the Hessian [1].
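As an illustration of such a low-rank update, the following is a minimal Python sketch of the classic DFP (Davidon-Fletcher-Powell) formula for the inverse-Hessian approximation [1, 2]; the function and variable names are ours, and the paper does not state explicitly which variable metric update it uses.

```python
import numpy as np

def dfp_update(B, dx, dg):
    """One DFP rank-two update of the inverse-Hessian approximation B.

    B  -- current N x N inverse-Hessian approximation
    dx -- step in the minimized vector, x_{k+1} - x_k
    dg -- gradient change, grad f(x_{k+1}) - grad f(x_k)

    The returned matrix satisfies the secant equation B_new @ dg = dx.
    """
    u = B @ dg
    return (B
            + np.outer(dx, dx) / (dx @ dg)   # rank-one term built from the step
            - np.outer(u, u) / (dg @ u))     # rank-one term built from B @ dg
```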
The aim of this paper is to develop a quasi-Newton method that can be used with a large number of information resource elements (IR-elements) and to carry out experiments checking the efficiency of the developed algorithms.
As, in Newton's method, the stationarity condition gives

$$\nabla f(\bar{x}_k) + H(\bar{x}_k)\,\Delta\bar{x} = 0, \qquad (4)$$

we can calculate the new vector values by the formula

$$\bar{x}_{k+1} = \bar{x}_k - B_k \nabla f(\bar{x}_k), \qquad (5)$$

where $B_k \approx H^{-1}(\bar{x}_k)$ is the approximate Hessian inverse [2].
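A minimal serial sketch of iteration (5), reusing dfp_update from the sketch above and assuming a fixed iteration budget (no line search, which a practical implementation would add):

```python
import numpy as np

def quasi_newton_minimize(grad_f, x0, n_iter=100, tol=1e-8):
    """Minimize a function by iteration (5) with DFP updates of B.

    grad_f -- callable returning the gradient of the minimized function
    x0     -- initial value of the minimized vector
    """
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)              # start from the identity matrix
    g = grad_f(x)
    for _ in range(n_iter):
        if np.linalg.norm(g) < tol:
            break
        x_new = x - B @ g           # step (5)
        g_new = grad_f(x_new)
        B = dfp_update(B, x_new - x, g_new - g)
        x, g = x_new, g_new
    return x

# Example: minimizing ||x - 3||^2, whose gradient is 2 * (x - 3)
# quasi_newton_minimize(lambda x: 2.0 * (x - 3.0), np.zeros(5))
```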
If we have $n$ IR-elements (for example, a computer cluster or a computing network) then, as is done by the classic parallel gradient algorithms, one IR-element (which we will call the "lead") calculates its share of the gradient and gradient-change elements, and the other IR-elements calculate the remaining elements [3]. We can divide the matrix $B_k$ by rows in the same way, which gives the following information process steps (a code sketch of the whole scheme is given after the list):
1. calculating the parts of the gradient $\nabla f$, of the gradient change $\Delta\bar{g}$ and of the matrix $B_k$ on the different IR-elements, and passing the calculated gradient and gradient-change elements from the non-lead IR-elements to the lead;
2. passing the assembled gradient change from the lead IR-element to the others;
3. each non-lead IR-element calculates its rows of the matrix

$$A = \Delta\bar{x}\,\Delta\bar{x}^T \qquad (6)$$

and its elements of the vector

$$\bar{u} = B_k\,\Delta\bar{g}; \qquad (7)$$

the lead IR-element calculates the scalar

$$\sigma_1 = \Delta\bar{x}^T\Delta\bar{g}; \qquad (8)$$
4. passing the calculated matrix and vector elements from the non-lead IR-elements to the lead;
5. passing the assembled matrix and vector elements from the lead IR-element to the non-lead ones;
6. the lead IR-element calculates, for the rows $i$ assigned to it,

$$C^{(1)}_{ij} = \frac{\Delta x_i\,\Delta x_j}{\sigma_1}, \qquad (9)$$

$$C^{(2)}_{ij} = \frac{u_i\,u_j}{\sigma_2}, \qquad \sigma_2 = \Delta\bar{g}^T\bar{u}, \qquad (10)$$

and

$$B^{(k+1)}_{ij} = B^{(k)}_{ij} + C^{(1)}_{ij} - C^{(2)}_{ij}; \qquad (11)$$

at that time the other IR-elements calculate, for their own rows $i$, the same quantities

$$C^{(1)}_{ij} = \frac{\Delta x_i\,\Delta x_j}{\sigma_1}, \qquad (12)$$

$$C^{(2)}_{ij} = \frac{u_i\,u_j}{\sigma_2}, \qquad (13)$$

and

$$B^{(k+1)}_{ij} = B^{(k)}_{ij} + C^{(1)}_{ij} - C^{(2)}_{ij}; \qquad (14)$$
7. the non-lead IR-elements send their rows of the matrix $B_{k+1}$ to the lead;
8. the lead IR-element assembles the matrix $B_{k+1}$, calculates the new weights $\bar{x}_{k+1}$ by formula (5) and sends them to the other IR-elements.
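A minimal single-process model of steps 3-8, assuming the DFP reconstruction above; each IR-element is modelled by one block of rows (np.array_split), and the message-passing steps 4-5 and 7 are replaced by plain array assembly:

```python
import numpy as np

def parallel_dfp_update(B, dx, dg, n_elements):
    """Row-partitioned DFP update modelling n_elements IR-elements.

    Each IR-element owns one contiguous block of rows of B; the
    communication steps 4-5 and 7 are modelled by array assembly.
    """
    blocks = np.array_split(np.arange(len(dx)), n_elements)
    sigma1 = dx @ dg                          # step 3, lead: scalar (8)
    # step 3, every element: its elements of u = B @ dg, formula (7)
    u = np.concatenate([B[rows] @ dg for rows in blocks])
    sigma2 = dg @ u                           # known to all after step 5
    B_new = np.empty_like(B, dtype=float)
    for rows in blocks:                       # step 6: formulas (9)-(14)
        B_new[rows] = (B[rows]
                       + np.outer(dx[rows], dx) / sigma1
                       - np.outer(u[rows], u) / sigma2)
    return B_new                              # steps 7-8: gathered on the lead
```

On a real cluster each row block would live on its own IR-element, and the assembly steps would become gather and broadcast operations over the network.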
To check the efficiency of the developed algorithms we have carried out an experiment. The efficiency coefficient, calculated by the formula

$$E = \frac{t}{n\,t_n}, \qquad (15)$$

where $t$ is the time expense of the serial algorithm and $t_n$ is the time expense of the parallel algorithm using $n$ IR-elements, is about 80%. So we can see that parallel quasi-Newton algorithms are very efficient and can be used for function minimization.
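A hypothetical worked example of formula (15) (the timings here are illustrative and are not the paper's measurements): with a serial time $t = 10\,\text{s}$ and a parallel time $t_4 = 3.1\,\text{s}$ on $n = 4$ IR-elements,

$$E = \frac{10}{4 \cdot 3.1} \approx 0.81,$$

i.e. an efficiency of about 80%.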
References:
1. Davidon W.C. Variable Metric Method for Minimization // SIAM Journal on Optimization, 1991. – Vol. 1, Issue 1. – P. 1-17.
2. Fletcher R. A New Approach to Variable Metric Algorithms // Computer Journal, 1970. – Vol. 13, Issue 3. – P. 317-322.
3. Kryuchin O.V., Arzamastsev A.A., Troitzsch K.G. A universal simulator based on artificial neural networks for computer clusters // Arbeitsberichte aus dem Fachbereich Informatik, № 2/2011. – http://www.uni-koblenz.de/~fb4reports/2011/2011_02_Arbeitsberichte.pdf