Ñîâðåìåííûå
èíôîðìàöèîííûå òåõíîëîãèè/Âû÷èñëèòåëüíàÿ òåõíèêà è ïðîãðàììèðîâàíèå
Êðàâ÷óê Â.Ï.,Ìÿñ³ùåâ Î.À.
Õìåëüíèöüêèé íàö³îíàëüíèé
óí³âåðñèòåò
Äîñë³äæåííÿ åôåêòèâíîñò³
êîìï³ëÿòîð³â òà á³áë³îòåê Intel ó ïàðàëåëüíèõ îá÷èñëåííÿõ â ãàëóç³ ë³í³éíî¿
àëãåáðè.
Ç
ðîçâèòêîì íàóêè ç'ÿâëÿºòüñÿ âñå á³ëüøå çàäà÷, ùî ïîòðåáóþòü ³íòåíñèâíèõ
îá÷èñëåíü äëÿ ¿õ âèð³øåííÿ. Îäí³ºþ ç íàéá³ëüø øèðîêî çóñòð³÷àþ÷èõñÿ çàäà÷
ë³í³éíî¿ àëãåáðè º çàäà÷à âèð³øåííÿ ñèñòåìè ë³í³éíèõ ð³âíÿíü, ÿêèìè ìîæíà
îïèñàòè äåÿê³ ïðîöåñè. Öÿ çàäà÷à ìຠê³ëüêà ï³äõîä³â äî âèð³øåííÿ, àëå ò³ ç
ï³äõîä³â, ÿê³ ìîæóòü áóòè àâòîìàòèçîâàí³, º äîñèòü ñõîæèìè, à àëãîðèòìè ¿õ
â³äîì³. Çâè÷àéíî, îäíèì ç îñíîâíèõ ïèòàíü, ùî ïîñòຠïðè äîñë³äæåíí³ ð³çíèõ
øëÿõ³â âèð³øåííÿ çàäà÷³ º ïèòàííÿ åôåêòèâíîñò³, òîáòî øâèäêîñò³ îá÷èñëåíü.
Øâèäê³ñòü
âèð³øåííÿ çàäà÷ çàëåæèòü â³ä íàáîðó ôàêòîð³â, ÿê³ ìîæíà ðîçä³ëèòè íà äâ³ ãðóïè:
åôåêòèâí³ñòü òåõí³÷íèõ çàñîá³â òà åôåêòèâí³ñòü ïðîãðàìíîãî çàáåçïå÷åííÿ. Ö³ äâà
ôàêòîðè ò³ñíî ïîâ'ÿçàí³ îäíå ç îäíèì, àäæå íàëåæíå âèêîðèñòàííÿ òåõí³÷íèõ
çàñîá³â çàáåçïå÷óºòüñÿ ñàìå ïðîãðàìíîþ ÷àñòèíîþ.  òîé æå ÷àñ, â êîìïåòåíö³¿
ðîçðîáíèêà º ëèøå ïðîãðàìí³ ³íñòðóìåíòè.
Îñíîâí³
àñïåêòè ïðîãðàìíîãî çàáåçïå÷åííÿ, ùî âïëèâàþòü íà øâèäê³ñòü îá÷èñëåíü:
1.Êîìï³ëÿòîð, ùî
âèêîðèñòîâóºòüñÿ.
2.Âèêîðèñòàííÿ ñïåö³àëüíèõ
á³áë³îòåê, ÿê³ñòü âèõ³äíîãî êîäó ïðîãðàìè, âèêîðèñòàííÿ ñïåö³àëüíèõ á³áë³îòåê.
3.Çàñòîñóâàííÿ ïàðàëåë³çìó.
Ñàìå
äîñë³äæåííþ ìîæëèâîñò³ ïðèøâèäøåííÿ îá÷èñëåíü ñèñòåì ë³í³éíèõ ð³âíÿíü çà
äîïîìîãîþ íàéîïòèìàëüí³øîãî âèêîðèñòàííÿ ïðîãðàìíèõ çàñîá³â ³ ïðèñâÿ÷åíà äàíà
ðîáîòà.
Âèá³ð êîìï³ëÿòîðà.
ßê
â³äîìî, êîìï³ëÿòîð ïðèçíà÷åíèé, ãðóáî êàæó÷è, äëÿ ïåðåòâîðåííÿ ïðîãðàìè,
íàïèñàíî¿ íà äåÿê³é ìîâ³ ïðîãðàìóâàííÿ âèñîêîãî ð³âíÿ ó ¿¿ åêâ³âàëåíò â
ìàøèííîìó êîä³, ÿêèé ìîæå áóòè âèêîíàíèé áåçïîñåðåäíüî öåíòðàëüíèì ïðîöåñîðîì.
³äïîâ³äíî,
ïåðåòâîðèòè âèõ³äíèé êîä ó âèêîíóâàíèé ìîæíà ð³çíèìè øëÿõàìè ³ ðåçóëüòàò ç òî÷êè çîðó øâèäêî䳿 òàêîæ
áóäå ð³çíèì. Âèðîáíèêè êîìï³ëÿòîð³â íàìàãàþòüñÿ âò³ëþâàòè íîâ³ ðîçðîáêè, ùî
äîçâîëÿþòü îïòèì³çóâàòè âèêîíàííÿ, çàñòîñîâóþòü òåõíîëî㳿, ùî çäàòí³
ïðèøâèäøèòè âèêîíàííÿ ïðîãðàìè çà ðàõóíîê âèêîðèñòàííÿ ñïåö³àëüíèõ ³íñòðóêö³¿
ïðîöåñîð³â, áàãàòîïîòî÷íîñò³, òîùî.
Îïåðàö³éí³
ñèñòåìè ñ³ìåéñòâà Linux äîñèòü øèðîêî âèêîðèñòîâóþòüñÿ äëÿ âèêîíàííÿ îá÷èñëåíü
ð³çíèìè îðãàí³çàö³ÿìè, çàâäÿêè â³äêðèòîñò³ êîäó, áåçêîøòîâíîñò³ òà íàÿâíèõ
ìîæëèâîñòåé äëÿ ñòâîðåííÿ âèñîêîïðîäóêòèâíèõ îá÷èñëþâàëüíèõ êëàñòåð³â.
Â
ÿêîñò³ ìîâè ïðîãðàìóâàííÿ, ùî âèêîðèñòîâóºòüñÿ äëÿ íàïèñàííÿ ñïåöèô³÷íèõ
ïðîãðàì, ùî âèêîíóþòü îá÷èñëåííÿ â ãàëóç³ ë³í³éíî¿ àëãåáðè, íàáóëè øèðîêîãî
âæèòêó Fortran òà C++. Íå äèâëÿ÷èñü íà òå, ùî ïåðøèé êîìï³ëÿòîð Fortran áóâ
ðîçðîáëåíèé ó 1957 ðîö³, öÿ ìîâà ïðîäîâæóº ñëóæèòè ³íæåíåðíèì òà ìàòåìàòè÷íèì
çàäà÷àì á³ëüø í³æ ï’ÿòäåñÿò ðîê³â ïîòîìó. C++ º íàáàãàòî íîâ³øîþ òà á³ëüø
øèðîêî âèêîðèñòîâóâàíîþ ìîâîþ, ùî ïðîäîâæóº àêòèâíî ðîçâèâàòèñü.
Ñåðåä
êîìï³ëÿòîð³â Fortran äëÿ Linux º äâà îñíîâíèõ – GNU Fortran (gfortran) òà Intel
Fortran. Îñíîâíèìè êîìï³ëÿòîðàìè C++ òàêîæ º êîìï³ëÿòîð GNU ùî âõîäèòü äî
íàáîðó GCC òà Intel C++ Compiler.
Êîìï³ëÿòîðè
GNU º áåçêîøòîâíèìè òà ðîçïîâñþäæóºòüñÿ ï³ä ë³öåí糺þ GNU General Public
License.
Êîìï³ëÿòîðè
Intel º êîìåðö³éíèìè ïðîäóêòàìè, ïðîòå º áåçêîøòîâíà âåðñ³ÿ êîìï³ëÿòîð³â ï³ä
Linux äëÿ íåêîìåðö³éíîãî âèêîðèñòàííÿ. Êîìï³ëÿòîðè Intel º ö³êàâèìè ç ò³º¿
òî÷êè çîðó, ùî Intel îäíî÷àñíî º âèðîáíèêîì ïðîöåñîð³â òà íàìàãàºòüñÿ ï³äâèùèòè
øâèäêîä³þ ñêîìï³ëüîâàíèõ ïðîãðàì íà ïðîöåñîðàõ ñâîãî âèðîáíèöòâà çà ðàõóíîê
âèêîðèñòàííÿ ñïåö³àëüíèõ ³íñòðóêö³é.
Âèêîðèñòàííÿ ñïåö³àëüíèõ
á³áë³îòåê.
Íà
äàíèé ìîìåíò â ïðîãðàìóâàíí³ â ö³ëîìó ïðîäîâæóº ðîçâèâàòèñü òåíäåíö³ÿ äî
âèêîðèñòàííÿ á³áë³îòåê òà ôðåéìâîðê³â, ÿê³ ì³ñòÿòü íàáîðè êîäó äëÿ âèð³øåííÿ
íàéá³ëüø ðîçïîâñþäæåíèõ çàäà÷. Öåé ï³äõ³ä â ö³ëîìó º âèïðàâäàíèì, òàê ÿê
äîçâîëÿº çàîùàäèòè ÷àñ, âèòðà÷åíèé íà ðîçðîáêó ïðîãðàìè ³ ïðè öüîìó ÷àñòî äàº
ïðèð³ñò ó øâèäêî䳿 çà ðàõóíîê òîãî, ùî ôóíêö³¿ â á³áë³îòåêàõ ïîñò³éíî
âäîñêîíàëþþòüñÿ, à íàä íèìè ïðàöþþòü êîìàíäè äîñâ³ä÷åíèõ ðîçðîáíèê³â. Òîìó,
êîëè çàäà÷ó ìîæíà âèð³øèòè çà äîïîìîãîþ ôóíêö³é ç ãîòîâèõ á³áë³îòåê, òî öå
÷àñòî º á³ëüø äîö³ëüíèì àí³æ âëàñíîðó÷íå ñòâîðåííÿ ïðîãðàìè «ç íóëÿ».
Äëÿ
êîìï³ëÿòîð³â GNU, îñíîâíîþ á³áë³îòåêîþ, ùî ðåàë³çóº îïåðàö³¿ ë³í³éíî¿ àëãåáðè º
á³áë³îòåêà LAPACK ³ ¿¿ âåðñ³ÿ äëÿ ñèñòåì ç ðîçïîä³ëåíîþ ïàì'ÿòòþ ScaLAPACK.
Òàêîæ ³ñíóº á³áë³îòåêà PARDISO, ùî ïðèçíà÷åíà êîíêðåòíî äëÿ âèð³øåííÿ ñèñòåì
ë³í³éíèõ ð³âíÿíü ³ äຠíàéêðàù³ ðåçóëüòàòè ïðè âèð³øåíí³ ñèñòåì, â ÿêèõ ìàòðèöÿ
êîåô³ö³ºíò³â ì³ñòèòü äîñòàòíüî íóë³â. Êîä á³áë³îòåêè PARDISO º çàêðèòèì, ïðîòå
º ìîæëèâ³ñòü áåçêîøòîâíî îòðèìàòè ë³öåíç³þ äëÿ àêàäåì³÷íîãî êîðèñòóâàííÿ.
Äëÿ
êîìï³ëÿòîð³â Intel ³ñíóº âëàñíà ðîçðîáêà — ³íòåãðîâàíà á³áë³îòåêà MKL. Ïðè öüîìó, á³áë³îòåêà äîñòóïíà äëÿ
íåêîìåðö³éíîãî âèêîðèñòàííÿ íà òàêèõ æå óìîâàõ, ùî ³ êîìï³ëÿòîð. Äî ñêëàäó MKL
âõîäÿòü BLAS, LAPACK, ScaLAPACK, PARDISO òà äîäàòêîâ³ çàñîáè äëÿ âèêîíàííÿ
ìàòåìàòè÷íèõ îïåðàö³é, ïðè ÷îìó âèùåçãàäàí³ á³áë³îòåêè âæå ìîäèô³êîâàí³ äëÿ
íàäàííÿ îïòèìàëüíî¿ øâèäêî䳿, áàãàòî ç ôóíêö³é ó íèõ âæå âèêîðèñòîâóþòü
ïàðàëåë³çì òà îïòèì³çîâàí³ äëÿ âèêîðèñòàííÿ ç ïðîöåñîðàìè Intel. MKL ìàº
âáóäîâàíó ï³äòðèìêó OpenMP ³ âäàëî ³íòåãðóºòüñÿ ç òàêèìè IDE ÿê Eclipse,
Microsoft Visual Studio, XCode.
Çàñòîñóâàííÿ ïàðàëåë³çìó.
Áåçïåðå÷íî,
íàéêðàù³ ïîêàçíèêè øâèäêî䳿 ìîæóòü áóòè äîñÿãíóò³ ëèøå ïðè ïîâíîìó
âèêîðèñòàíí³ íàÿâíèõ â ñèñòåì³ ðåñóðñ³â, à ñàìå ÿäåð ïðîöåñîðà êîëè éäå ìîâà
ïðî îäíó îá÷èñëþâàëüíó ìàøèíó ç áàãàòîÿäåðíèì ïðîöåñîðîì.
Á³áë³îòåêà
MKL àâòîìàòè÷íî íàìàãàºòüñÿ çàñòîñîâóâàòè ïàðàëåë³çì ³ ñàìîñò³éíî çàâàíòàæóâàòè
íàÿâí³ â ñèñòåì³ ïðîöåñîðè òà ¿õ ÿäðà. Ó ðàç³ æ â³äìîâè â³ä âèêîðèñòàííÿ
ãîòîâèõ á³áë³îòåê ïàðàëåë³çì ìîæíà äîñòèãíóòè çà äîïîìîãîþ MPI òà OpenMP.
Ðîçâ'ÿçóâàííÿ ñèñòåìè
ë³í³éíèõ ð³âíÿíü ç âèêîðèñòàííÿì êîìï³ëÿòîðà Intel C++ òà á³áë³îòåêè PARDISO.
Ðîçãëÿíåìî
âàð³àíò ðîçâ'ÿçàííÿ ñèñòåìè ë³í³éíèõ àëãåáðà¿÷íèõ ð³âíÿíü çà äîïîìîãîþ
á³áë³îòåêè PARDISO òà êîìï³ëÿòîðà Intel C++.
Ó
á³áë³îòåö³ PARDISO ³ñíóº ôóíêö³ÿ pardiso ùî ³ ïðîâîäèòü âèð³øåííÿ ñèñòåìè
ð³âíÿíü, à òàêîæ êðîêè, ùî äîïîìàãàþòü îïòèì³çóâàòè ïðîöåñ âèð³øåííÿ øëÿõîì
ïîïåðåäíüîãî âïîðÿäêóâàííÿ ìàòðèöü.
Çàãàëüíèé
ñèíòàêñèñ ôóíêö³¿ ìຠíàñòóïíå ïðåäñòàâëåííÿ:
pardiso
(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, perm, nrhs, iparm, msglvl, b, x,
error).
Âàæëèâîþ
äåòàëëþ º òå, ùî çíà÷åííÿ iparm[12] çà çàìîâ÷óâàííÿì ð³âíå 1 äëÿ íåñèìåòðè÷íèõ
ìàòðèöü, àëå ÿê ñòàëî â³äîìî â ðåçóëüòàò³ äîñë³äæåíü, öåé ïàðàìåòð íàäçâè÷àéíî
íåãàòèâíî âïëèâàâ íà øâèäê³ñòü îá÷èñëåíü çà ðàõóíîê ³íòåíñèâíîãî âèêîðèñòàííÿ
ïàì'ÿò³. Çì³íà öüîãî ïàðàìåòðó íà 0 äîïîìãëà çá³ëüøèòè øâèäê³ñòü îá÷èñëåíü
ïðèáëèçíî â 10 ðàç.
Ðîçãëÿíåìî
ïðèêëàä ðîçâ'ÿçàííÿ ñèñòåìè ë³í³éíèõ ð³âíÿíü:
x1
– x2 – 3x4 = 1
-2x1
+ 5x2 = 1
4x3
+ 6x4 + 4x5 = 1
-4x1
+ 2x3 + 7x4 = 1
8x2
– 5x5 = 1
Ïðè
òàêèõ óìîâàõ íåîáõ³äíî çàäàòè íàñòóïí³ âõ³äí³ çì³íí³:
int
n = 5;
int
ia[ 6] = { 1, 4, 6, 9, 12, 14 };
int
ja[13] = { 1, 2, 4, 1, 2, 3, 4, 5, 1, 3, 4, 2, 5 };
double
a[18] = { 1.0, -1.0, -3.0, -2.0,
5.0, 4.0, 6.0, 4.0, -4.0, 2.0, 7.0,8.0, -5.0};
int
mtype = 11;
nrhs
= 1;
b[5]
= {1, 1, 1, 1, 1};
iparm[0]
= 1; /* No solver default */
iparm[1]
= 2; /* Fill-in reordering from METIS */
/*
Numbers of processors, value of OMP_NUM_THREADS */
iparm[2]
= 2;
iparm[3]
= 0; /* No iterative-direct algorithm */
iparm[4]
= 0; /* No user fill-in reducing permutation */
iparm[5]
= 0; /* Write solution into x */
iparm[6]
= 0; /* Not in use */
iparm[7]
= 2; /* Max numbers of iterative refinement steps */
iparm[8]
= 0; /* Not in use */
iparm[9]
= 13; /* Perturb the pivot elements with 1E-13 */
iparm[10]
= 1; /* Use nonsymmetric permutation and scaling MPS */
iparm[11]
= 0; /* Not in use */
iparm[12]
= 0; /* Maximum weighted matching algorithm is switched-on (default for
non-symmetric) */
iparm[13]
= 0; /* Output: Number of perturbed pivots */
iparm[14]
= 0; /* Not in use */
iparm[15]
= 0; /* Not in use */
iparm[16]
= 0; /* Not in use */
iparm[17]
= -1; /* Output: Number of nonzeros in the factor LU */
iparm[18]
= -1; /* Output: Mflops for LU factorization */
iparm[19]
= 0; /* Output: Numbers of CG Iterations */
maxfct
= 1; /* Maximum number of numerical factorizations. */
mnum
= 1; /* Which factorization to use. */
msglvl
= 1; /* Print statistical information in file */
error
= 0; /* Initialize error flag */
Âèêîíóºìî
ïåðåñòàíîâêó, ôàêòîðèçàö³þ, òà îá÷èñëåííÿ:
phase
= 13;
PARDISO
(pt, &maxfct, &mnum, &mtype, &phase,
&n,
a, ia, ja, &idum, &nrhs,
iparm,
&msglvl, b, x, &error);
Â
ê³íö³ î÷èùóºìî ïàì'ÿòü:
phase
= -1; /* Release internal memory. */
PARDISO
(pt, &maxfct, &mnum, &mtype, &phase,
&n,
&ddum, ia, ja, &idum, &nrhs,
iparm,
&msglvl, &ddum, &ddum, &error);
Â
ðåçóëüòàò³ âèêîíàííÿ îòðèìàíî:
x [0] = -0.522321
x [1] = -0.008929
x [2] = 1.220982
x [3] = -0.504464
x [4] = -0.214286
Ðîçâ'ÿçóâàííÿ ñèñòåìè
ë³í³éíèõ ð³âíÿíü ç âèêîðèñòàííÿì êîìï³ëÿòîðà Intel C++ òà á³áë³îòåêè LAPACK.
Äî ñêëàäó á³áë³îòåêè LAPACK âõîäèòü íàá³ð ôóíêö³é
?gesv(
int* n, int* nrhs, double* a, int* lda, int* ipiv, double* b, int* ldb, int*
info );
Â
çàëåæíîñò³ â³ä òèïó äàíèõ, íàä ÿêèìè ïðîâîäÿòüñÿ îá÷èñëåííÿ, ï³äñòàâëÿºòüñÿ
ïåðøà ë³òåðà â íàçâ³ ôóíêö³¿. Ó íàøîìó âèïàäêó äëÿ ÷èñåë ïîäâ³éíî¿ òî÷íîñò³
ôóíêö³ÿ íàçèâàºòüñÿ dgesv.
Âàðòî
çàóâàæèòè, ùî áåç ñïåö³àë³çîâàíèõ êëþ÷³â êîìï³ëÿòîðà, ïðîãðàìà íå ìîãëà
ïðîâîäèòè îá÷èñëåííÿ ñèñòåì ð³âíÿíü ç á³ëüøå í³æ 1000 íåâ³äîìèìè òà ïîâåðòàëà
ïîìèëêó segmentation fault. Äëÿ òîãî, ùîá çàïîá³ãòè ö³é ïîìèëö³ íåîáõ³äíî
âèêîíàòè êîìàíäó ulimit -s unlimited ïåðåä êîìï³ëÿö³ºþ òà êîìï³ëþâàòè ç êëþ÷åì
-std=c99.
Ïðîãðàìà,
ùî âèêîíóº îá÷èñëåííÿ çà äîïîìîãîþ á³áë³îòåêè LAPACK íà áàç³ çãåíåðîâàíî¿ ñèñòåìè â 1000 ð³âíÿíü ïðèâåäåíà íèæ÷å:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
/* DGESV prototype */
extern void dgesv( int* n, int* nrhs, double* a,
int* lda, int* ipiv,
double* b, int* ldb, int* info );
/* Parameters */
#define N 1000
#define NRHS 1
#define LDA N
#define LDB N
/* Main program */
int main() {
/*
Locals */
int
n = N, nrhs = NRHS, lda = LDA, ldb = LDB, info;
/*
Local arrays */
int
i,j,z;
int
*ipiv[N];
double a[N][N], b[LDB*NRHS], o;
z=0;
for(i=1; i<(n+1); i++) {
for(j=1; j<(n+1); j++) {
o = (i+j)/0.89;
if(i==j-1) o = 10;
if(i==j+1) o = 2;
a[i-1][j-1]=o;
z++;
}
}
for(i=1; i<(n+1); i++) b[i-1] = (i*21 + 101) / 1.3;
dgesv( &n, &nrhs, a, &lda, ipiv, b, &ldb, &info
);
/*
Check for the exact singularity */
if(
info > 0 ) {
printf( "The diagonal element of the triangular factor of
A,\n" );
printf( "U(%i,%i) is zero, so that A is singular;\n", info,
info );
printf( "the solution could not be computed.\n" );
exit( 1 );
}
exit( 0 );
}
Ðîçâ'ÿçóâàííÿ ñèñòåìè
ë³í³éíèõ ð³âíÿíü ç âèêîðèñòàííÿì êîìï³ëÿòîðà Intel C++ òà ïðîãðàìè, íàïèñàíî¿ ç
âèêîðèñòàííÿì OpenMP.
Êîëè
íåìຠìîæëèâîñò³ âèêîðèñòîâóâàòè á³áë³îòåêè ñòîðîíí³õ ðîçðîáíèê³â, ìîæíà
âëàñíîðó÷ ñòâîðèòè ïðîãðàìó äëÿ âèð³øåííÿ çàäà÷³. Ðîçïàðàëåëþâàííÿ áóëî
âèêîíàíå çà äîïîìîãîþ çàñîá³â ðîçïàðàëåëþâàííÿ öèêëó for â OpenMP. Òåêñò
ïðîãðàìè ùî çíàõîäèòü êîðåí³ ñèñòåìè 1000 ð³âíÿíü íàâåäåíî íèæ÷å:
#include <stdlib.h>
#include <stdio.h>
#include <omp.h>
#define ABS(x) (x < 0 ? -(x) : (x))
#define EPS 0.00001
#define TRUE
1
#define FALSE 0
#define N 1000
int GSolve(double **a,int n,double *x)
{
int
i,j,k,maxrow;
double
tmp;
for
(i=0;i<n;i++) {
/*
Find the row with the largest first value */
maxrow = i;
for
(j=i+1;j<n;j++) {
if
(ABS(a[i][j]) > ABS(a[i][maxrow]))
maxrow = j;
}
/*
Swap the maxrow and ith row */
for
(k=i;k<n+1;k++) {
tmp = a[k][i];
a[k][i] = a[k][maxrow];
a[k][maxrow] = tmp;
}
if
(ABS(a[i][i]) < EPS)
return(FALSE);
/*
Eliminate the ith element of the jth row */
#pragma omp parallel for
for
(j=i+1;j<n;j++) {
for (k=n;k>=i;k--) {
a[k][j] -= a[k][i] * a[i][j] / a[i][i];
}
}
}
/* Do
the back substitution */
for
(j=n-1;j>=0;j--) {
tmp = 0;
for
(k=j+1;k<n;k++)
tmp += a[k][j] * x[k];
x[j]
= (a[n][j] - tmp) / a[j][j];
}
return(TRUE);
}
int main(int argc,char **argv)
{
int
i,j,z,n = N;
double
x[N], o;
double
**a;
a =
(double **)malloc((n+1)*sizeof(double *));
for
(i=0;i<n+1;i++)
a[i]
= (double *)malloc(n*sizeof(double));
z =
0;
for(i=1; i<(n+1); i++) {
for(j=1; j<(n+1); j++) {
o = (i+j)/0.89;
if(i==j-1) o = 10;
if(i==j+1) o = 2;
a[i-1][j-1]=o;
z++;
}
}
for(i=0; i<n; i++) {x[i] = 0; a[n][i] = ((i+1)*21 + 101) / 1.3;}
GSolve(a,n,x);
}
Ïîð³âíÿííÿ øâèäêî䳿.
Äëÿ
îö³íêè øâèäêî䳿 îá÷èñëåíü ïðîãðàìè, çãàäàí³ âèùå áóëè çàïóùåí³ äëÿ ðîçâ'ÿçàííÿ
ñèñòåì ð³âíÿíü ç ê³ëüê³ñòþ íåâ³äîìèõ â³ä 1000 äî 7000.  êîæíîìó âèïàäêó
ð³âíÿííÿ áóëè îäíàêîâ³, òàê ÿê ãåíåðóâàëèñü çà îäíèì ³ òèì æå àëãîðèòìîì.
×àñ
âèêîíàííÿ çàì³ðÿâñÿ çà äîïîìîãîþ âáóäîâàíèõ ôóíêö³é á³áë³îòåêè PARDISO òà
ïðîãðàìè time äëÿ Ubuntu 9.10.
Äëÿ
îá÷èñëåíü âèêîðèñòîâóâàëàñü îá÷èñëþâàëüíà ìàøèíà íà áàç³ Intel(R) Core(TM)2 Duo
CPU T5550 @ 1.83GHz, 3 GB RAM, Linux
Ubuntu 9.10. Ðåçóëüòàòè íàâåäåí³ ó òàáëèö³.
PARDISO:
ʳëüê³ñòü ð³âíÿíü |
×àñ âèêîíàííÿ íà îäíîìó ÿäð³,
ñåê. |
×àñ âèêîíàííÿ íà äâîõ ÿäðàõ,
ñåê. |
Ïðèñêîðåííÿ |
1000 |
0.45 |
0.37 |
1.22 |
2000 |
2.25 |
1.7 |
1.32 |
3000 |
6.13 |
4.42 |
1.39 |
4000 |
12.84 |
8.83 |
1.45 |
5000 |
23.06 |
15.3 |
1.51 |
6000 |
37.6 |
24.89 |
1.51 |
7000 |
56.75 |
36.7 |
1.55 |
LAPACK:
ʳëüê³ñòü ð³âíÿíü |
×àñ âèêîíàííÿ íà îäíîìó ÿäð³,
ñåê. |
×àñ âèêîíàííÿ íà äâîõ ÿäðàõ,
ñåê. |
Ïðèñêîðåííÿ |
1000 |
0.24 |
0.45 |
0.53 |
2000 |
1.34 |
0.95 |
1.41 |
3000 |
3.54 |
2.57 |
1.38 |
4000 |
8.04 |
5.86 |
1.37 |
5000 |
14.99 |
10.63 |
1.41 |
6000 |
25.29 |
17.27 |
1.46 |
7000 |
44.89 |
27.79 |
1.62 |
Âëàñíà ïðîãðàìà ç OpenMP:
ʳëüê³ñòü ð³âíÿíü |
×àñ âèêîíàííÿ íà îäíîìó ÿäð³,
ñåê. |
×àñ âèêîíàííÿ íà äâîõ ÿäðàõ,
ñåê. |
Ïðèñêîðåííÿ |
1000 |
7.66 |
4.95 |
1.55 |
2000 |
61.71 |
35.12 |
1.76 |
3000 |
209.55 |
114.19 |
1.84 |
4000 |
509.1 |
279.93 |
1.82 |
5000 |
1021.32 |
608.93 |
1.68 |
Âèñíîâêè.
1.Âèêîðèñòàííÿ
ñïåö³àë³çîâàíèõ á³áë³îòåê º äîö³ëüíèì òà äຠïðèð³ñò øâèäêî䳿 äî 60 ðàç â
ïîð³âíÿíí³ ç ïðîñòîþ ïðîãðàìîþ, ùî âèêîðèñòîâóº OpenMP.
2.Á³áë³îòåêà PARDISO äåùî
ïîñòóïàºòüñÿ á³áë³îòåö³ LAPACK ó øâèäêî䳿, àëå öå ìîæíà ïîÿñíèòè îïòèì³çàö³ºþ
ö³º¿ á³áë³îòåêè ï³ä ñèñòåìè ð³âíÿíü, äå º çíà÷íà ê³ëüê³ñòü íóë³â ó ìàòðèö³
êîåô³ö³ºíò³â.
3.Ñïåö³àë³çîâàí³ á³áë³îòåêè
âñå ùå äàþòü ïðèñêîðåííÿ ïðè âèêîðèñòàíí³ ê³ëüêîõ ÿäåð. Öå ïðèñêîðåííÿ º ìåíøèì
í³æ ïðèñêîðåííÿ â ïðîñò³é ïðîãðàì³ ç âèêîðèñòàííÿì OpenMP, ùî ïîÿñíþºòüñÿ
çíà÷íîþ îïòèì³çàö³ºþ àëãîðèòì³â.
4.Ïðèñêîðåííÿ ó ïðîñò³é
ïðîãðàì³ ç âèêîðèñòàííÿì OpenMP º çíà÷íèì, íàéêðàùèé ïîêàçíèê äîñÿãàºòüñÿ ïðè
îá÷èñëåíí³ ñèñòåì ç 3000-4000 ð³âíÿíü.
˳òåðàòóðà.
1.Intel® Math Kernel
Library 10.2 Reference Manual. http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/win/mkl/refman/index.htm — 2010
2.Intel® Math Kernel
Library 10.2 User's Guide http://software.intel.com/sites/products/documentation/hpc/mkl/lin/index.htm — 2010
3.Intel® Math Kernel
Library 9.0 for Linux Technical User Notes http://www.intel.com/software/products/mkl/docs/mklusel.htm — 2006
4.Ôîðóìè Intel MKL http://software.intel.com/en-us/forums/intel-math-kernel-library/
5.PARDISO Solver Project http://www.pardiso-project.org/index.php?p=manual
6.LAPACK — Linear Algebra
Package http://www.netlib.org/lapack/