Osokina S.A.

Altai State University

Will Medvedev and Obama Find a Common Language:

Creating the Linguistic Theory of Thesaurus


Thesaurus studies are today among the most topical areas of linguistics. New thesaurus dictionaries are published almost daily, and the number of information-retrieval software thesauruses is also growing. At the same time, linguistic research into the thesaurus is becoming increasingly intensive. Scholars agree that we can speak of a new branch of the humanities, thesaurology (a term coined by Val. A. Lukov and Vl. A. Lukov, professors at Moscow Humanitarian University), which is a branch of culturology but is undoubtedly connected with the humanities in general.

The great number of works on the thesaurus, together with their seeming conceptual discord and lack of coordination, is a result of postmodernist methodology and its main product.

In fact, all conceptions of the thesaurus can be reduced to one main idea: a special device for systematizing and searching for information. However, these devices, both software and printed dictionaries, have different aims and structures, so it is hard to single out the defining property of the thesaurus. The dictionaries reveal synonymic, thematic, contextual and other types of relations between words and word collocations; it is therefore unclear which type of semantic relation is the leading one in the structure of the thesaurus.

Some studies modeling thesaurus structures have revealed parallels between the organization of the thesaurus and that of artificial intelligence, on the one hand, and between the organization of the thesaurus and that of human thinking, on the other. The similarity with artificial intelligence lies in the strict rubrication of the vocabulary and the hierarchical organization of information. The similarity with human thinking lies in the hypertextual way of searching for information, which resembles the associative “jumps” of human thought.

Thus, the humanities today face the need to systematize all existing knowledge about the thesaurus and to create a theory of the thesaurus. Since the thesaurus deals with semantic relations between words, we propose to do this within linguistics.

We believe that the thesaurus is something more than merely a dictionary. In fact, all thesaurus dictionaries and information systems are man-made models of an objective natural entity. This entity is a kind of verbal mass that presses on the organs of perception (mainly sight and hearing), penetrates human consciousness and fills it completely. It is what makes a person create understandable verbal products, that is, texts, in a certain language.

In a broad sense, the thesaurus is the information system of a culture, a means of semantically organizing the world. At the same time, as an object of scientific research this entity is given only in its individual manifestations, such as the thesaurus of W. Shakespeare or the thesaurus of A. Pushkin. Still, we believe that the individual thesaurus is formed by the language rather than by the will of the individual. These views were set out in a study published in 2007 [1].

Similar results were obtained by Val. A. Lukov and Vl. A. Lukov, who studied the thesaurus from the standpoint of culturology. They hold that individual thesauruses are structured by nodal centers, so to speak, which they call “thesaurus structures”. The nodal centers can be compared with word roots, to which different affixes are added in the process of derivation, or with set idiomatic expressions [2].

We suggest that the key to a linguistic theory of the thesaurus is the conception of the set collocations of a language. Russian linguists have achieved a great deal in the study of set collocations. The best-known works on the subject belong to V. Vinogradov, but our study of the thesaurus takes as its basis the works of I.E. Anichkov [3]. He shows that all the verbal expressions we utter every day are made of set collocations. When speaking, we cannot combine words as we please. Every word requires a very particular word before and after it, one that suits the context and the semantic structure of the given word. We do not produce word combinations; we use the set collocations provided by the language. And if we venture to use something new, a word combination never heard before, we risk being misunderstood.

The idea that the set collocation is the main structure of the thesaurus and its nodal center was discussed in our above-mentioned work [1]. That work shows that the thesaurus system consists of set collocations provided by the language and thus available for its users to choose from. We analyzed the thesaurus system of the Russian language and its development through the 19th and 20th centuries. The language material studied was “The Daemons” by Dostoyevsky, a masterpiece of Russian literature, as well as some works of modern Russian literature and political texts.

In this paper we present the results of a study conducted on material from two languages, Russian and English. We believe that the only way for a theory of language to gain credibility is to have it explain facts of different languages.

The position which we are going to defend is that two individuals are able to understand each other only if they have a common language, that is, if their individual thesauruses are completely or at least partly similar. In other words, two individuals will be able to understand each other and come to an agreement only if they use in their speech similar set collocations or collocations with the same nodal word.

The nodal word is the lexeme that forms the set collocations an individual uses most frequently. For example, one of the most important nodal words in F. Dostoyevsky’s thesaurus is the word человек (man), because set collocations with this word occur extremely often in his novels: деловой человек (businesslike man), молодой человек (young man), благородный человек (noble man), опасный человек (dangerous man), etc.
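The definition of the nodal word can be illustrated with a minimal sketch. It assumes that the set collocations have already been picked out of the text and reduced to dictionary forms (lemmas); the function name `nodal_words` and the input format are hypothetical and are not part of the method described in [4].

```python
from collections import Counter


def nodal_words(collocations, top_n=3):
    """Rank lemmas by the number of collocation occurrences that contain them."""
    counts = Counter()
    for collocation in collocations:
        # Count each lemma once per occurrence of a collocation.
        counts.update(set(collocation))
    return counts.most_common(top_n)


# The four Dostoyevsky collocations quoted above, one occurrence each
# (the real frequencies in the novels are not given in this article).
sample = [
    ("деловой", "человек"),
    ("молодой", "человек"),
    ("благородный", "человек"),
    ("опасный", "человек"),
]

print(nodal_words(sample, top_n=1))  # [('человек', 4)]
```

The lexeme shared by the largest number of collocations comes first in the ranking, which is how человек emerges as the nodal word of this small sample.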

The technique of picking set collocations out of a text, the so-called epistemological method of text analysis, is described in our work published in 2006 [4]. It has been used to analyze different types of texts: fiction, mass media publications and politicians’ speeches.

Using this technique, we analyzed texts in Russian and English. The material of the study was the speeches delivered by Dmitry Medvedev and Barack Obama during their meeting in Moscow in July 2009. The texts were taken from the official sites of the President of Russia (www.kremlin.ru) and the President of the USA (www.whitehouse.gov). The aim of the study is to find out whether the Presidents’ thesauruses agree, that is, whether they contain similar set collocations. If they do, we can assume that the Presidents are capable of understanding each other and that positive communication results are achievable.

Our analysis yielded the following results.

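The most frequently used set collocations in the actual thesaurus of Mr. Medvedev during the meeting in Moscow are those with the nodal words отношения (relations), проблема (problem), позиция (position), вопрос (question), сотрудничество (cooperation). Examples: российско-американские отношения (Russian-American relations), история российско-американских отношений (the history of Russian-American relations), наши отношения (our relations), развитие отношений (development of relations), укрепление отношений (strengthening of relations), персональные отношения (personal relations), личные отношения (personal relations), строить отношения (to build relations), межгосударственные отношения (interstate relations), отношения между странами (relations between countries); сталкиваться с проблемами (to face problems), решать проблемы (to solve problems), экономические проблемы (economic problems), проблемы межгосударственной безопасности (problems of interstate security), накопившиеся проблемы (accumulated problems), груз проблем (a burden of problems), колоссальная проблема (a colossal problem); принципиальная позиция (a principled position), стоять на позиции (to hold a position), сближение позиций (convergence of positions), излагать позицию (to state a position); остановиться на вопросе (to dwell on a question), обсуждать вопросы (to discuss questions), решать сложные вопросы (to resolve difficult questions); продолжать сотрудничество (to continue cooperation), направление сотрудничества (an area of cooperation), российско-американское сотрудничество (Russian-American cooperation). All these collocations occur several times in Mr. Medvedev’s speeches.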

The most frequently used set collocations in the actual thesaurus of Mr. Obama are those with the nodal words cooperation, security, step, relation/relationship. Examples: nuclear security cooperation, great cooperation, improved cooperation, to broaden cooperation, a foundation for cooperation, the pursuit of cooperation; to strengthen our security, global security, nuclear security, security of the country; to take steps, steps forward, concrete steps, important steps, call for strong steps; the relationship between Russia and the US, to set/reset relations, bilateral relationship.

In both Presidents’ speeches we find collocations with the word ядерный/nuclear (ядерное оружие/nuclear weapons, ядерный арсенал/nuclear arsenal, ядерные боеголовки/nuclear warheads, ядерные державы/nuclear powers, etc.). These were the key collocations at the Moscow meeting in 2009.

As we see, the actual thesauruses used by the Russian and American Presidents during the meeting in Moscow are partly similar. This increases the likelihood of agreement between the Presidents.
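The partial similarity of the two actual thesauruses can be sketched as an overlap of nodal-word sets. The sketch below uses only the nodal words named above; the gloss table `GLOSSES`, the function names and the Jaccard coefficient are assumptions introduced for illustration and are not a measure used in the study itself.

```python
# Hand-made gloss table built from the glosses given in the article.
# Treating each pair as equivalent is an assumption of this sketch.
GLOSSES = {
    "отношения": "relations",
    "проблема": "problem",
    "позиция": "position",
    "вопрос": "question",
    "сотрудничество": "cooperation",
    "ядерный": "nuclear",
    "relation": "relations",
    "relationship": "relations",
}


def normalize(nodal_words):
    """Map every nodal word to a common English label."""
    return {GLOSSES.get(word, word) for word in nodal_words}


def compare(thesaurus_a, thesaurus_b):
    """Return the shared nodal words and their Jaccard coefficient."""
    a, b = normalize(thesaurus_a), normalize(thesaurus_b)
    shared = a & b
    union = a | b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Nodal words reported in the article for each President
# (ядерный/nuclear is reported for both speakers).
medvedev = ["отношения", "проблема", "позиция", "вопрос", "сотрудничество", "ядерный"]
obama = ["cooperation", "security", "step", "relationship", "nuclear"]

shared, jaccard = compare(medvedev, obama)
print(sorted(shared), jaccard)  # ['cooperation', 'nuclear', 'relations'] 0.375
```

A coefficient between 0 and 1 corresponds to the "partly similar" case; the sketch ignores frequencies and compares only which nodal words are present.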

However, the analysis also reveals some dissimilarities. For instance, the most commonly used collocations in President Medvedev’s speech are those with the word отношения (relations), especially at the beginning of the meeting, whereas the most commonly used collocations in President Obama’s speech are those with the word cooperation. This points to a difference in the Presidents’ initial communicative aims.

Further evidence of dissimilarity is that some collocations are used only by President Medvedev or only by President Obama. For instance, only Mr. Medvedev uses collocations with the word ответственность (responsibility): нести ответственность (to bear responsibility), осознавать ответственность (to be aware of responsibility), перекладывать ответственность (to shift responsibility); and only Mr. Obama uses collocations with the word commitment (to keep commitment, to abandon commitment, international commitment). Although the words ответственность (responsibility) and commitment are close synonyms, we can conclude that the Presidents assess the situation under discussion, and see their own positions in it, in different ways.

The main result of the whole thesaurus analysis is the thesaurus shift that occurred by the end of the meeting in Moscow. We compared the Presidents’ opening statements, delivered on 6 July, with their closing speeches at the end of the meeting and found an obvious shift in the frequency with which the set collocations are used. This shift could only have resulted from the Presidents’ interaction and their mutual influence on each other.

Thus, in Mr. Medvedev’s opening statement we see mostly collocations with the word отношения (relations), especially when he speaks about Russian-American contacts. In Barack Obama’s opening statement, mentions of Russian-American contacts mostly involve collocations with the word cooperation. In their closing speeches both Presidents tend to use collocations with the word cooperation. Moreover, at the end of the meeting both frequently use collocations with the lexemes прогресс/progress, усилия/efforts, ситуация/situation.
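One way to make such a shift measurable is to compare the share of each nodal word among all collocation occurrences in the opening and in the closing speech. The sketch below is only an illustration: the function names are hypothetical, and the counts for `nodal_a` and `nodal_b` are invented placeholders, not figures from the study.

```python
from collections import Counter


def shares(counts):
    """Convert raw collocation counts per nodal word into relative shares."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def frequency_shift(opening, closing):
    """Change in the share of each nodal word from opening to closing speech."""
    before, after = shares(opening), shares(closing)
    words = set(before) | set(after)
    return {word: after.get(word, 0.0) - before.get(word, 0.0) for word in words}


# Invented placeholder counts (NOT data from the study): the number of
# collocation occurrences per nodal word in one speaker's two speeches.
opening = Counter({"nodal_a": 6, "nodal_b": 2})
closing = Counter({"nodal_a": 2, "nodal_b": 6})

shift = frequency_shift(opening, closing)
print(sorted(shift.items()))  # [('nodal_a', -0.5), ('nodal_b', 0.5)]
```

A negative value means that collocations with the nodal word became relatively rarer by the end of the meeting, a positive value that they became more frequent.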

So we can conclude that by the end of the meeting the Russian and American Presidents had literally found a common language.

Creating the linguistic theory of the thesaurus is a matter for the future.

A theory must have its own metalanguage, which defines its main notions and concepts.

A theory must be based on a proper philosophical methodology and have its own methods of research within a given discipline.

A theory must be able to explain phenomena and processes of reality and to predict both the possibility of their occurring in the future and the ways in which they will develop.

In this article we have only outlined several key ideas of the linguistic theory of the thesaurus.

The main idea is that the thesaurus is the linear explication of semantic relations between words, by analogy with the linear unfolding of speech. The language thesaurus is based on syntagmatic rather than paradigmatic relations between words and assumes successive explication, not hyperspace jumps.

The second idea is that the unit of the thesaurus is the set collocation. In the thesaurus system, set collocations with an identical word form the nodal centers of the system, and the identical word itself can be called the nodal word.

The method of thesaurus analysis consists in picking out of the text frequently used set collocations, not just frequently used words. This stresses that the thesaurus is not the same thing as the vocabulary (the lexicon): it is not a set of words but a set of reproducible semantic relations between words, and a set collocation is an explication of these relations. At the same time, the role of the nodal words themselves in an individual thesaurus is very important, and the theory of the thesaurus must explain it.
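The difference between counting words and counting collocations can be shown with a minimal sketch. It treats recurring pairs of adjacent words as a crude stand-in for set collocations; this is an assumption made for illustration and does not reproduce the epistemological method of [4]. The function name `recurring_bigrams` and the sample sentence are invented.

```python
from collections import Counter


def recurring_bigrams(tokens, min_count=2):
    """Return adjacent word pairs that occur at least `min_count` times."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return [(pair, n) for pair, n in pairs.most_common() if n >= min_count]


# Invented sample sentence (not a quotation from either President).
text = "nuclear security is global security and nuclear security needs concrete steps"
tokens = text.split()

# Word frequencies only show that "security" is frequent.
print(Counter(tokens).most_common(1))  # [('security', 3)]

# Pair frequencies show which reproducible combination it enters into.
print(recurring_bigrams(tokens))  # [(('nuclear', 'security'), 2)]
```

The word count singles out a lexeme, whereas the pair count singles out a reproducible relation between two lexemes, which is the kind of unit the thesaurus analysis is concerned with.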

The verbal thesaurus is an essentially open, developing system, and the theory of the thesaurus must explain the principles of its development.

At each moment in time there exists only the so-called actual thesaurus, i.e. the system of set collocations relevant in a particular situation.

Although some questions remain open to discussion, we can conclude that creating a theory of the thesaurus is a necessity for today’s science. We assume that the linguistic theory of the thesaurus will be a branch of a more general theory, but there is no doubt that it will offer a new explanation of some facts of language and predict their future development.



1. Осокина, С.А. Языковые механизмы воздействия на человека / С.А. Осокина. – Барнаул : Изд-во «Графикс», 2007. – 224 с.

2. Луков, Вал. А., Луков, Вл. А. Тезаурусный подход: исходные положения // Электронный журнал «Знание. Понимание. Умение». – 2008. – № 9: Комплексные исследования: тезаурусный анализ мировой культуры. – http://www.zpu-journal.ru/e-zpu/2008/9/Lukovs_Thesaurus_Approach.

3. Аничков, И.Е. Труды по языкознанию / И.Е. Аничков. – СПб. : Наука, 1997. – 512 с.

4. Осокина, С.А. Об эпистемологической методике анализа художественного текста // Художественный текст: варианты интерпретации : Труды XI Всероссийской научно-практической конференции (Бийск, 12–13 мая 2006) : В 2-х ч. – Ч. 2. – Бийск : Изд-во БПГУ им. В. М. Шукшина, 2006. – С. 57–61.