Philosophy / 4. Philosophy of culture

Boyarshinova E.B.

Moscow State University, Department of philosophy, Russia

A digital investigation of movie audiences' aesthetic preferences

One of the issues arising in the study of the aesthetic perception of contemporary cinema is the statistical comparison of the tastes of current Russian moviegoers (both the gross audience and cinephiles) with the tastes of their Western counterparts. The development of communication technologies and the Internet has given rise to cinephile-oriented websites containing vast amounts of organized information. Box office receipts indicate the gross audience's regard for a movie. The number of website visitors who evaluated the movie measures the attention paid to it by cinephiles (the advanced audience). The range of evaluations indicates how the movie was perceived from the artistic point of view. Developing a correlation method that matches the data on the Russian movie theater audience with the data on foreign audiences is crucial to studying differences in movie reception. In fact, every current indicator requires such a method, because the domestic and foreign markets differ in volume, in the number of cinephiles and their integration into the web community, and in their approaches to movie assessment.

A relatively obvious method was used here. It is best explained by example. Assume that the heights of a number of people have been measured both in inches and in centimeters. By comparing the data, a conversion coefficient between inches and centimeters can be established. Still, for some subjects the height converted into centimeters through the coefficient and the height measured with a centimeter ruler will not coincide. That indicates that something affected the result differently when the height was measured with an inch ruler and with a centimeter ruler.
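The ruler analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the measurements are invented, the function names and the 1 cm tolerance are hypothetical, and taking the median ratio is just one simple way to estimate the coefficient.

```python
from statistics import median


def conversion_coefficient(inches, centimeters):
    """Median of the per-person ratios centimeters / inches."""
    return median(cm / inch for inch, cm in zip(inches, centimeters))


def mismatched_subjects(inches, centimeters, tolerance_cm=1.0):
    """Indices of people whose converted height misses the measured one."""
    k = conversion_coefficient(inches, centimeters)
    return [
        index
        for index, (inch, cm) in enumerate(zip(inches, centimeters))
        if abs(cm - k * inch) > tolerance_cm
    ]


# Hypothetical measurements; the fourth person is deliberately inconsistent.
heights_in = [60.0, 65.0, 70.0, 68.0, 72.0]
heights_cm = [152.4, 165.1, 177.8, 176.7, 182.9]

print(conversion_coefficient(heights_in, heights_cm))
print(mismatched_subjects(heights_in, heights_cm))  # prints [3]
```

The coefficient recovered from the bulk of the data is close to 2.54, and the one subject who does not follow it is the one worth a closer look.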

The same is true when the opinions of different audiences are compared. It is possible to calculate a median conversion coefficient, yet the results for certain movies will not follow simple correspondence rules. Hence some aspect of those movies was perceived differently by domestic and foreign audiences. Moreover, there is no correlation at all, i.e. no connection, between the evaluations given to movies by domestic cinephiles and the attention of the domestic gross audience as measured by box office receipts.

Information sources. Several movies are released in Russia each week. For instance, according to a popular cinephile website [1], 339 movies were released in 2008. All kinds of information concerning those movies can be obtained via the Internet. One can watch a trailer or a film clip, view a gallery of stills, and read professional or amateur reviews as well as comments by those who have seen the movie. If that were the full extent of the information on a movie's aesthetic properties, the analysis would be limited to audience opinions, i.e. professional and amateur comments. Yet today there is a large web community made up of a vast number of cinephiles who regularly take part in polls on the relevant web resources. The statistical data gathered on the opinions they express is consolidated automatically. A movie's commercial success or failure also indirectly indicates its assessment by the viewers.

The information sources listed below form the foundation of our research.

A number on the ten-point scale in Top250 and IMDB is a straightforward expression of cinephiles' assessment of a movie. Top250 is used by Russian speakers, while the IMDB index is international. The IMDB page contains dozens more reviews of movies released worldwide than the Top250 page. The average assessments published by the two sites can be used to compare the tastes of active moviegoers, who can be called cinephiles.

Aside from a straightforward assessment expressed in points, there is an indicator reflecting active moviegoers' interest in a certain film, namely the number of people who gave their evaluation of the movie on the Top250 and IMDB pages.

A statistical overview of reviews by professional English-speaking film critics is also available at www.kinopoisk.ru. It includes the number of reviews, the share of positive reviews among them and the average evaluation of the movie on a ten-point scale.

Box office receipts are used to compare the audience's interest in various films. Russian and US box office figures are available on the same site.

The data for the indicated criteria and evaluations is available for 142 of the 339 movies released in Russia in 2008.

Research method and results. The main question is how to make use of such abundant statistics. Should we restrict ourselves to compiling yet another top-ten list of movies, such as can be found in any popular magazine? Incidentally, that is the main concept behind Top250: a continuously updated list of the 250 best movies is the core of that website. The tool we designed is slightly different. As will be demonstrated, it highlights differences in how a movie is evaluated regardless of its position in a rating list.

Let us now describe the problem in the language of mathematical statistics. Two quantitative characteristics are known for a certain number of objects (in this case movies). Here those characteristics can be selected from a broad range:

- cinephiles' evaluation on Top250;

- cinephiles' evaluation on IMDB;

- number of comments in the Top250 ranking;

- number of comments in the IMDB ranking;

- box office receipts in Russia (a measure of the Russian audience's interest in the movie);

- box office receipts in the USA (a measure of the American audience's interest in the movie);

- English-speaking critics' evaluation;

- percentage of positive English reviews;

- total number of English reviews;

- any other quantitative characteristic of viewers' attention and reception that can be found on the Internet.

Thus there are two quantitative characteristics, selected from a broad range, for a certain number of objects (movies). Let us denote them $\{x_i; y_i\}$. The index $i$ runs from 1 to $n$.

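There is a mathematical criterion that assesses how accurately a linear dependence describes the connection between the values y and x. It is called the coefficient of linear correlation [2] and is calculated using the following formula:

$$r = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\sqrt{\left(\overline{x^2} - \bar{x}^2\right)\left(\overline{y^2} - \bar{y}^2\right)}}$$

Here the horizontal lines above expressions represent averaging. Specifically:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad \overline{xy} = \frac{1}{n}\sum_{i=1}^{n} x_i y_i, \qquad \overline{x^2} = \frac{1}{n}\sum_{i=1}^{n} x_i^2, \qquad \overline{y^2} = \frac{1}{n}\sum_{i=1}^{n} y_i^2$$

It should be noted that $\overline{xy} - \bar{x}\,\bar{y} = \overline{(x - \bar{x})(y - \bar{y})}$ and $\overline{x^2} - \bar{x}^2 = \overline{(x - \bar{x})^2}$. In terms of statistics the values $\overline{xy} - \bar{x}\,\bar{y}$ and $\overline{x^2} - \bar{x}^2$ are called covariance and dispersion respectively.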
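The coefficient of linear correlation can be computed directly from averages. The sketch below assumes the standard averaged form, r = (mean(xy) - mean(x)·mean(y)) / sqrt(dispersion(x)·dispersion(y)); the function names are illustrative and the sample numbers are invented, not taken from the study.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean, i.e. the 'horizontal line' averaging."""
    return sum(values) / len(values)


def linear_correlation(xs, ys):
    """Coefficient of linear correlation between paired samples xs and ys."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = mean([x * y for x, y in zip(xs, ys)]) - x_bar * y_bar
    dispersion_x = mean([x * x for x in xs]) - x_bar ** 2
    dispersion_y = mean([y * y for y in ys]) - y_bar ** 2
    if dispersion_x == 0 or dispersion_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / sqrt(dispersion_x * dispersion_y)


# Invented example: a perfectly linear and a perfectly inverse relationship.
print(linear_correlation([1, 2, 3, 4], [2, 4, 6, 8]))
print(linear_correlation([1, 2, 3, 4], [8, 6, 4, 2]))
```

In practice x and y would be any two of the characteristics listed above, for example Top250 scores and Russian box office receipts.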

The linear correlation coefficient ranges from -1 to 1. If the coefficient exceeds zero, the correlation is positive: larger values of one variable correspond to larger values of the other. Negative correlation is the opposite case, in which smaller values of one variable correspond to larger values of the other. The absolute value of the correlation coefficient is a measure of how closely the two variables are connected: the closer it is to 1, the closer the connection. A correlation coefficient that differs from zero by no more than 0.2 to 0.3 indicates that the connection between the values is absent or extremely weak.

Much can be deduced from the value of the correlation coefficient itself. For instance, there is no correlation between cinephiles' evaluations on Top250 and box office receipts for 2008: the correlation coefficient equals 0.047. This is a very significant result: it shows that receipts are not connected, at least linearly, to the artistic or aesthetic value of a movie as judged by cinephiles. At the same time, the number of people who evaluated or reviewed a movie depends greatly on its box office receipts. The correlation coefficient between the number of Top250 evaluations and the Russian box office reaches 0.635. This is a fairly expected result, since the number of cinephiles who have watched a movie is determined by the scale of its distribution.

The correlation between Top250 and IMDB evaluations is even tighter, with a correlation coefficient of 0.794, while the numbers of reviews correlate with a coefficient of 0.743. This means that, in general, a movie's evaluation and its appeal to domestic and foreign audiences are determined by aesthetic criteria.

However, the most curious part is the distinctions.

It should be noted once again that we are dealing with two quantitative characteristics for a certain number of motion pictures. The characteristics were denoted $\{x_i; y_i\}$, where the index $i$ runs from 1 to $n$. When plotted on the XOY coordinate plane, the points make up a more or less elongated cloud. It seems proper to represent the relationship between y and x as a line, i.e. a linear function:

$\tilde{y}_i = k x_i + b$.

The larger the absolute value of the linear correlation coefficient, the more justified such an approximation is. The method for fitting such dependencies has been known since the early nineteenth century. It is called the least-squares method. In essence, it selects the values of the dependence parameters (in this case k and b) so as to minimize the sum of the squares of the errors (the deviations between theoretical and real values). The deviation of a real value ($y_i$) from the theoretical one ($\tilde{y}_i$) is:

$\varepsilon_i = y_i - \tilde{y}_i = y_i - k x_i - b$.

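The problem was solved a long time ago in the following manner:

$$k = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\overline{x^2} - \bar{x}^2}, \qquad b = \bar{y} - k\,\bar{x}$$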
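A minimal sketch of the least-squares solution follows. It assumes the usual closed form for a straight line, with the slope k equal to the covariance divided by the dispersion of x and the intercept b = mean(y) - k·mean(x); the function name `fit_line` and the sample points are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def fit_line(xs, ys):
    """Return (k, b) minimizing the sum of squared deviations y - (k*x + b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = mean([x * y for x, y in zip(xs, ys)]) - x_bar * y_bar
    dispersion_x = mean([x * x for x in xs]) - x_bar ** 2
    if dispersion_x == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    k = covariance / dispersion_x
    b = y_bar - k * x_bar
    return k, b


# Invented points lying exactly on y = 2x + 1.
k, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(k, b)  # prints 2.0 1.0
```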

The deviation of a real value from the theoretical one ($\varepsilon_i$) shows how well the theoretical dependency matches reality in each separate case. The mean square deviation of the real values from the theoretical ones (the residual variance) shows how precisely the dependency matches the actual situation in general. The quantity derived from it, its square root, is the residual mean square deviation, commonly denoted by the Greek letter σ. The values are easily calculated using the following formulae:

$$\sigma^2 = \overline{\varepsilon^2} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - k x_i - b\right)^2, \qquad \sigma = \sqrt{\sigma^2}$$
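The residual mean square deviation can be sketched as follows. One assumption should be stated plainly: the squared deviations are averaged over all n points (division by n); some textbooks divide by n - 2 for a fitted line, and which convention the study used is not known. The function names and the sample data are illustrative.

```python
from math import sqrt


def residuals(xs, ys, k, b):
    """Deviations of the real values from the line y = k*x + b."""
    return [y - (k * x + b) for x, y in zip(xs, ys)]


def residual_sigma(xs, ys, k, b):
    """Square root of the mean squared residual (division by n assumed)."""
    eps = residuals(xs, ys, k, b)
    if not eps:
        raise ValueError("no observations")
    return sqrt(sum(e * e for e in eps) / len(eps))


# Invented data scattered by exactly +/-1 around the line y = x.
xs = [0, 1, 2, 3]
ys = [1, 0, 3, 2]
print(residuals(xs, ys, k=1, b=0))       # prints [1, -1, 1, -1]
print(residual_sigma(xs, ys, k=1, b=0))  # prints 1.0
```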

The significance of these values can hardly be overestimated. Recall the so-called three-sigma rule: "very rarely does a random value deviate from its mean value by more than three mean square deviations (three sigma)". If such a deviation does occur, it can be assumed that the object is "special" and that there is an underlying reason for it. Now to illustrate the point:

If, to find the degree of dependence, we take as the values y and x the number of moviegoers (cinephiles) who provided an evaluation on the Top250 website and the box office receipts in Russia (the correlation coefficient between the values is 0.635; the data is available for 317 motion pictures), a peculiar fact emerges. For the following movies, cinephiles' attention exceeds the expected level by about three sigma or more: «Twilight» (8.5σ), «The Dark Knight» (7.0σ), «WALL-E» (5.8σ), «I am Legend» (3.7σ), «Taken» (3.3σ), «Iron Man» (2.6σ), «Sweeney Todd: The Demon Barber of Fleet Street» (2.6σ).

The burst of attention to «Twilight» is explained by its viewers, young girls for the most part, who took a fancy to the protagonist, a handsome vampire. The appeal of «The Dark Knight» is also understandable. The actor Heath Ledger played his last completed role in this movie, and his tragic death shortly before the premiere attracted increased attention from cinephiles. The remaining three movies beyond the three-sigma mark enjoyed increased attention owing to their artistic value combined with lower interest from the gross audience.

Cinephiles paid far less attention to the following movies: «Madagascar 2» (–3.4σ), «Admiral» (–2.6σ), «The Mummy: Tomb of the Dragon Emperor» (–2.6σ). Such a lag is easily explained. The movies in question attracted viewers from outside the Internet community. «Madagascar 2» and «The Mummy: Tomb of the Dragon Emperor» were successful at the box office yet aimed at children. The movie «Admiral» was apparently unpopular with cinephiles for aesthetic reasons.
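The screening step behind these lists can be sketched as below: each movie's deviation from the fitted line is expressed in units of σ, and movies near or beyond a chosen threshold are singled out. This is a hedged illustration only. The titles, receipts, vote counts and line parameters are invented, and the function names and the default threshold of 3.0 are assumptions rather than the study's actual settings.

```python
def deviations_in_sigma(xs, ys, k, b, sigma):
    """Deviation of each real y from the line k*x + b, in units of sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [(y - (k * x + b)) / sigma for x, y in zip(xs, ys)]


def special_objects(titles, xs, ys, k, b, sigma, threshold=3.0):
    """Titles whose deviation reaches the threshold in either direction."""
    scores = deviations_in_sigma(xs, ys, k, b, sigma)
    return [
        (title, score)
        for title, score in zip(titles, scores)
        if abs(score) >= threshold
    ]


# Invented data: x is box office receipts, y is the number of evaluations.
titles = ["Movie A", "Movie B", "Movie C", "Movie D"]
receipts = [10, 20, 30, 40]
votes = [100, 210, 600, 95]

# Hypothetical line parameters and residual sigma, assumed to be known.
for title, score in special_objects(titles, receipts, votes, k=10, b=0, sigma=100):
    print(title, score)
```

A positive score marks a movie that drew more cinephile attention than its receipts would suggest; a negative score marks the opposite case.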

Another peculiarity is the movie «Awake» (2008). Russian viewers appeared more interested in the story than Western cinephiles and American viewers. This may signify that the issues raised in the motion picture are more relevant to our viewers. According to the story, general anesthesia can sometimes fail in a patient undergoing surgery.

To demonstrate the relationship between the number of moviegoers (cinephiles) who evaluated a motion picture on the Top250 website and the box office receipts in Russia, 317 movies were studied. The employed method made it possible to single out 10 movies, or close to 3%. Information was processed for only one year (2008). The amount of information available for study greatly exceeds the amount that was processed, and we expect to continue this work.

 

Bibliography:

1. www.kinopoisk.ru

2. Bronstein I.N., Semendyaev K.A. A Handbook of Mathematics for Engineers and Students of Technical Colleges. Moscow, 1980 (in Russian).