Data Warehouse and OLAP solutions
within the decision support systems
Professor Adrian COJOCARIU, PhD
Assistant Cristina Ofelia STANCIU, PhD Candidate
“Tibiscus”
Faculty of Economic Science
1/A Daliei Street, 300558, Timisoara, Romania
Phone: +40-256-202931, Fax: +40-256-202930
E-mail: a_cojocariu@yahoo.com, ofelia_stanciu@yahoo.com
ABSTRACT
A Data Warehouse is a complex system which contains operational and historical
data concerning an organization, provided by sources both internal and external
to the organization. The Data Warehouse takes over data from the operational
databases; the data are then processed and analyzed in order to support decision
making. The main means of exploiting the data in the Data Warehouse are on-line
analytical processing (OLAP) solutions and Data Mining techniques.
KEY WORDS: Data Warehouse, Data Mining, OLAP, decisions
INTRODUCTION
Management activity within organizations has undergone significant changes
with the development of the information society, and new information
technologies have positively influenced its most important area, decision
making. Decision tasks become harder to fulfill if the human decision maker is
not assisted by computer-based tools, also known as decision support systems
(DSS).
Nowadays, the components of a decision support system are similar to the ones
Sprague identified in 1982: the user interface, the knowledge-based subsystems,
the data management module and the model management module.
Figure no. 1. Data management module
(From Turban, 2001)
The data management module is a subsystem of the computer-based decision
support system and has a number of subcomponents of its own (Figure no. 1):
§ the integrated decision support system database, which includes data
extracted from internal and external sources; these data can be maintained in
the database or accessed only when needed;
§ the database management system; the database can be relational or
multidimensional;
§ a data dictionary, that is, a catalog containing all the definitions of the
data in the database; it is used in the identification and definition phase of
the decision process;
§ query tools, which assume the existence of languages for querying databases.
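The last two subcomponents can be illustrated with a minimal sketch. It assumes
an in-memory SQLite database standing in for the decision support database; the
table `sales`, its columns and the helper functions are hypothetical names
introduced only for this illustration. The data dictionary is read from the
catalog of the database management system, and the query tool is a plain SQL
request.

```python
import sqlite3

# Hypothetical DSS table; name, columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("West", 2006, 120.0), ("West", 2007, 150.0), ("East", 2007, 90.0)],
)


def data_dictionary(connection, table):
    """Return {column name: declared type}, read from the DBMS catalog."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in rows}


def total_by_region(connection):
    """A query-tool style request expressed in the query language (SQL)."""
    cursor = connection.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


dictionary = data_dictionary(conn, "sales")
totals = total_by_region(conn)
```

Here `dictionary` plays the role of the catalog of data definitions, while
`totals` is the kind of answer a decision maker would request through a query
tool.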
PROBLEM DEFINITION
The Data Warehouse is a complex system that holds operational and historical
data for an organization and is a separate entity from the operational
databases. The huge quantity of data maintained in a data warehouse is
collected from sources internal as well as external to the organization. The
data warehouse fetches data from the operational databases; the data can then
be analyzed in various ways, in order to help the decision maker in the
decision process.
W. H. Inmon, one of the most prominent authors in the area of building data
warehouses, describes them as “a subject-oriented, integrated, historical
and non-volatile data collection designed to assist in the decisional process”;
hence the primary properties of data warehouses: they are subject-oriented and
integrated, and they hold historical and persistent data.
The process of building and using data warehouses is known as data warehousing;
this process involves integrating, filtering and consolidating the data.
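The three steps named above can be sketched as follows. This is only an
illustration under stated assumptions: the two sources, their field names and
the sample records are hypothetical, and a real warehouse load would involve
far more elaborate rules.

```python
from collections import defaultdict

# Hypothetical records from an internal and an external source.
internal = [
    {"customer": "ACME", "month": "2007-01", "amount": 100.0},
    {"customer": "ACME", "month": "2007-01", "amount": 50.0},
    {"customer": "BETA", "month": "2007-01", "amount": None},  # incomplete
]
external = [
    {"client_name": "acme", "period": "2007-01", "value": "25.5"},
]


def integrate(internal_rows, external_rows):
    """Integration: bring both sources to one common schema."""
    unified = [
        {"customer": r["customer"].upper(), "month": r["month"],
         "amount": r["amount"]}
        for r in internal_rows
    ]
    for r in external_rows:
        unified.append({
            "customer": r["client_name"].upper(),
            "month": r["period"],
            "amount": float(r["value"]),
        })
    return unified


def filter_valid(rows):
    """Filtering: drop records that lack the measured value."""
    return [r for r in rows if r["amount"] is not None]


def consolidate(rows):
    """Consolidation: one total per (customer, month)."""
    totals = defaultdict(float)
    for r in rows:
        totals[(r["customer"], r["month"])] += r["amount"]
    return dict(totals)


warehouse = consolidate(filter_valid(integrate(internal, external)))
```

After the three steps, `warehouse` holds a single consolidated figure per
customer and month, regardless of which source supplied the underlying records.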
The objectives of a data warehouse are the following:
Ø providing the user with persistent data access: the computer is the tool
permitting easier access to the data warehouse;
Ø providing a single version of the data: ambiguous data will not be supplied
to the user, so there will be no debates regarding the truthfulness of the data
used;
Ø recording and accurately playing back past events: historical data can be
extremely important to the user, because present data is often meaningless
unless compared to past data;
Ø allowing both high-level and detailed access to data: information can be
collected and formatted more easily using the data within data warehouses;
Ø separating operational-level processing from analytic-level processing:
maintaining an information system in which decision-related and operational
information must be kept together raises many issues.
Data warehouses can hold different types of data: detailed data, aggregated
data and metadata. Metadata specify the data structure, the data source and the
transformation rules; they are used when loading data and thus play an
important role in populating the data warehouse. The data warehouse
architecture is presented in Figure no. 2.
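The role of metadata in populating the warehouse can be illustrated with a
small sketch in which the loading step is driven entirely by a metadata table
that records, for each target column, its source field and its transformation
rule. The names `METADATA` and `load_row`, as well as the columns and the
sample record, are hypothetical and serve only as an example.

```python
# Hypothetical metadata: target column -> source field and transformation rule.
METADATA = {
    "customer": {"source": "client_name", "transform": str.strip},
    "amount": {"source": "value", "transform": float},
    "year": {"source": "date", "transform": lambda d: int(d[:4])},
}


def load_row(source_row, metadata):
    """Build one warehouse row by applying each rule to its source field."""
    return {
        target: rule["transform"](source_row[rule["source"]])
        for target, rule in metadata.items()
    }


# A raw operational record, as it might arrive before loading.
raw = {"client_name": "  ACME ", "value": "19.90", "date": "2007-03-15"}
loaded = load_row(raw, METADATA)
```

Because the structure, the source and the rules live in the metadata rather
than in the loading code, a change in a source system only requires updating
the corresponding metadata entry.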
Using data warehouses presents a series of advantages, among which are the
following:
§ decision makers can easily be provided with a series of reports assisting in
the decision process;
§ data consistency and data “productivity” increase, while computational costs
decrease;
§ users are provided with access to a large variety of data;
§ the structure of the data warehouse makes it easily adaptable to data changes
and capable of transmitting the changed data to the operational system.
Figure no. 2. General architecture of the data warehouse
An entity similar to the data warehouse, often encountered in the specialized
literature, is the Data Mart; it has generated long debates on whether or not it
is equivalent to a data warehouse. A Data Mart is not equivalent to a data
warehouse: it is a data collection specialized on an area of interest,
depending on the needs of a particular department in the organization. There
can be a financial Data Mart, a marketing one, and so on, these being almost
totally independent from one another. Each department is considered to be the
owner of the hardware and software components constituting its Data Mart.
Data Marts come in two forms: dependent and independent. A dependent Data Mart
is sourced from the data warehouse, while an independent one is sourced by its
own applications.
A dependent Data Mart is formed by loading data from the operational systems
into the organization’s data warehouse, which is then divided into smaller
units named Data Marts; their dependency consists precisely in this derivation
from the data warehouse.
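The derivation of dependent Data Marts from the warehouse can be sketched as a
simple partition of the warehouse content by department. The sample rows, the
`department` field and the function `build_dependent_marts` are assumptions
made for the sake of the example.

```python
# Hypothetical warehouse content, already loaded from operational systems.
warehouse = [
    {"department": "finance", "item": "invoice", "amount": 200.0},
    {"department": "marketing", "item": "campaign", "amount": 80.0},
    {"department": "finance", "item": "payroll", "amount": 500.0},
]


def build_dependent_marts(rows, key="department"):
    """Split the warehouse into one mart per value of the chosen key."""
    marts = {}
    for row in rows:
        marts.setdefault(row[key], []).append(row)
    return marts


marts = build_dependent_marts(warehouse)
finance_total = sum(row["amount"] for row in marts["finance"])
```

In this arrangement the operational systems are read only once, when the
warehouse is loaded; every mart is then derived from the warehouse itself.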
An independent Data Mart is less stable than a dependent one, and its
deficiencies tend to become apparent only once there are several independent
Data Marts within the organization. Because organizations develop over time,
there are situations in which many Data Marts exist, each of which has grown
large and needs to collect data from the operational databases; this can be
relatively costly and also inefficient for those operational databases, as
part of their working time is spent supplying data to the Data Marts.
Data warehouses can be very useful to various categories of decision makers,
and the most important ways to benefit from the data within the warehouses are
online analytical processing (OLAP) and Data Mining techniques. OLAP technology
refers to the ability to aggregate the data in a warehouse and to filter the
large amount of data in order to obtain information useful for the decision
process within an organization. According to specialists, an alternative term
for describing the OLAP concept is FASMI (Fast Analysis of Shared
Multidimensional Information). The core of any OLAP system is the OLAP cube,
also known as the multidimensional cube, composed of numeric facts called
measures, categorized by dimensions [8]. These measures are obtained from
records in the tables of relational databases. User requests can be answered by
dynamically traversing the dimensions of the data cube, at a high or a detailed
level.
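Traversing the cube at a high or a detailed level can be sketched with a few
lines of code. The fact records, the three dimensions and the helper functions
below are hypothetical; they only illustrate how one measure is aggregated
along a chosen set of dimensions and how a slice restricts the cube to one
dimension value.

```python
from collections import defaultdict

# Hypothetical facts: three dimensions (region, product, year), one measure.
DIMENSIONS = ("region", "product", "year")
FACTS = [
    ("West", "Laptop", 2006, 10.0),
    ("West", "Laptop", 2007, 15.0),
    ("West", "Phone", 2007, 5.0),
    ("East", "Laptop", 2007, 20.0),
]


def aggregate(facts, by):
    """Sum the measure over every dimension not listed in `by`."""
    positions = [DIMENSIONS.index(name) for name in by]
    cells = defaultdict(float)
    for fact in facts:
        key = tuple(fact[i] for i in positions)
        cells[key] += fact[-1]  # the measure is the last field
    return dict(cells)


def slice_facts(facts, dimension, value):
    """Keep only the facts with the given value on one dimension."""
    position = DIMENSIONS.index(dimension)
    return [fact for fact in facts if fact[position] == value]


# High-level view: totals per region.
by_region = aggregate(FACTS, ["region"])
# More detailed view: totals per region and year.
by_region_year = aggregate(FACTS, ["region", "year"])
# Slice on one region, then view by product.
west_by_product = aggregate(slice_facts(FACTS, "region", "West"), ["product"])
```

Moving from `by_region` to `by_region_year` corresponds to passing from a high
level to a more detailed level of the same data, while `west_by_product`
restricts the analysis to a single value of the region dimension.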
OLAP systems have the following properties [7]:
Ø multidimensional data view;
Ø intensive evaluation capabilities;
Ø timeline orientation (time intelligence).
Figure no. 3. The components of a decision support system in an organization
using information technologies
The technologies available for managing data and information must lead to a
better understanding of past events and to the prediction of future ones, thus
increasing the efficiency of the decisions made; Data Mining is also involved
here. Data Mining techniques integrated with decision support systems yield a
decision support tool that is still based on man-machine interaction
(a man-computer system); taken together, the two represent a spectrum of
computer-based analytic technologies that provides a platform for an optimal
combination of data-driven analysis controlled by humans [4].
A decision
support system in an organization using information technologies has the
components shown in Figure no. 3. Still, depending on the system, on its
complexity and functionality, the mentioned elements may or may not appear.
RESULTS
Nowadays, decision makers within an organization could hardly fulfil all their
tasks without decision support systems. Decision support systems are a great
help for the human decision maker: they analyze and process data, producing
information from which knowledge is derived. The operational and historical
data of an organization are usually held in a data warehouse. The main
techniques for benefiting from the data within data warehouses are OLAP
solutions and Data Mining techniques. Data Mining has evolved greatly, being
able to “learn” from the previous behaviour of the elements and, based on the
acquired knowledge, to set hypotheses which are then tested.
SOURCES
1. Ackoff, R. L., From Data to Wisdom, Journal of Applied Systems Analysis,
Volume 16, 1989, pages 3-9
2. Filip, F. Gh., Sisteme suport pentru decizii, Editura Tehnică, Bucureşti,
2004
3. Filip, F. Gh., Decizie asistată de calculator: decizii, decidenţi – metode
de bază şi instrumente informatice asociate, Editura Tehnică, Bucureşti, 2005
4. Ganguly, A. R., Gupta, A., Data Mining Technologies and Decision Support
Systems for Business and Scientific Applications, Encyclopedia of Data
Warehousing and Mining, Blackwell Publishing, 2005
5. Gray, P., Watson, H., Decision Support in the Data Warehouse, Prentice Hall,
Upper Saddle River, 1998
6. Turban, E., Aronson, J., Decision Support Systems and Intelligent Systems,
Prentice Hall, USA, 2001
7. Zaharie, D., Albulescu, F., Bojan, I., Ivacenco, V., Vasilescu, C., Sisteme
informatice pentru asistarea deciziei, Editura Dual Tech, 2001
8. http://en.wikipedia.org