Data Warehouses (DW) integrate data from multiple heterogeneous information sources and
transform them into a multidimensional representation for decision support applications. Apart from a
complex architecture, involving data sources, the data staging area, operational data stores, the global
data warehouse, the client data marts, etc., a data warehouse is also characterized by a complex
lifecycle. In a permanent design phase, the designer has to produce and maintain a conceptual model
and a usually voluminous logical schema, accompanied by a detailed physical design for efficiency
reasons. The designer must also deal with data warehouse administrative processes, which are complex
in structure, large in number and hard to code; deadlines must be met for the population of the data
warehouse and contingency actions taken in the case of errors. Finally, the evolution phase involves a
combination of design and administration tasks: as time passes, the business rules of an organization
change, new data are requested by the end users, new sources of information become available, and the
data warehouse architecture must evolve to efficiently support the decision-making process within the
organization that owns the data warehouse.
All the data warehouse components, processes and data should be tracked and administered via a
metadata repository. In earlier work, we presented a metadata modeling approach which enables the capturing
of the static parts of the architecture of a data warehouse. The linkage of the architecture model to
quality parameters (in the form of a quality model) and its implementation in the metadata repository
ConceptBase have been formally described elsewhere. A related study presents a methodology for the exploitation of
the information found in the metadata repository and the quality-oriented evolution of a data warehouse
based on the architecture and quality model. In this paper, we complement these results with
data warehouse processes.
How to Cite this Page
"Data Warehousing." 123HelpMe.com. 05 Apr 2020
The different viewpoints for the metadata repository of a data warehouse
In the three phases of the data warehouse lifecycle, the interested stakeholders need information on
various aspects of the examined processes: what they are supposed to do, how they are implemented,
why they are necessary and how they affect other processes in the data warehouse [68, 29]. Like the
data warehouse architecture and quality metamodels, the process metamodel clusters its entities
into logical, physical and conceptual perspectives, each assigned the task of answering
one of the aforementioned stakeholder questions. In the rest of this section we briefly present the
requirements faced in each phase, our solutions and their expected benefits.
The design and implementation of operational data warehouse processes is a labor-intensive and
lengthy procedure, covering thirty to eighty percent of the effort and expenses of the overall data warehouse
construction [55, 15]. For a metamodel to be able to efficiently support the design and implementation
tasks, it is imperative to capture at least two essential aspects of data warehouse processes: complexity
of structure and relationship with the involved data. In our proposal, the logical perspective is capable
of modeling the structure of complex activities and capturing all the entities of the widely accepted
Workflow Management Coalition Standard. The relationship of data warehouse activities with
their underlying data stores is taken care of in terms of SQL definitions.
This simple idea overturns the classical belief that data warehouses are simply collections of
materialized views. In previous data warehouse research, directly assigning a naïve view definition to a
data warehouse table has been the most common practice. Although this abstraction is elegant and
sufficient for the purpose of examining alternative strategies for view maintenance, it is incapable of
capturing real-world processes within a data warehouse environment. In our approach, we can deduce
the definition of a data warehouse table as the outcome of the combination of the processes
that populate it. This new kind of definition complements existing approaches, since our approach
provides the operational semantics for the content of a data warehouse table, whereas the existing ones
give an abstraction of its intensional semantics.
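As a rough illustration of this idea, the sketch below derives the operational definition of a table from the activities that populate it. The `Activity` class, the table names and the SQL strings are hypothetical and are not part of the metamodel; the sketch also assumes that activities writing to the same table contribute rows independently, so their SQL definitions are combined with `UNION ALL`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Hypothetical record for one data warehouse activity."""
    name: str
    output: str  # data store that the activity populates
    sql: str     # SQL definition relating the output to its inputs


def operational_definition(table: str, activities: list[Activity]) -> str:
    """Combine the SQL of every activity that populates `table`.

    Assumption: each activity adds rows independently of the others,
    so the individual definitions are joined with UNION ALL.
    """
    parts = [a.sql for a in activities if a.output == table]
    if not parts:
        raise ValueError(f"no activity populates {table!r}")
    return "\nUNION ALL\n".join(parts)


# Hypothetical activities; two of them populate the same table.
activities = [
    Activity("load_eu", "dw_sales", "SELECT id, amount FROM staging_eu"),
    Activity("load_us", "dw_sales", "SELECT id, amount FROM staging_us"),
    Activity("load_dim", "dw_customer", "SELECT id, name FROM staging_cust"),
]

definition = operational_definition("dw_sales", activities)
print(definition)
```

The derived definition changes whenever an activity is added or removed, which is the sense in which it describes how the table is populated instead of what a single view says it should contain.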
The conceptual process perspective traces the reasons behind the structure of the data warehouse.
We extend the demand-oriented concept of dependencies as in the Actor-Dependency model with
the supply-oriented notion of suitability that fits well with the redundancy often found in data
warehouses. As another extension to the Actor-Dependency model, we have generalized the notion
of role in order to uniformly trace any person, program or data store participating in the system.
By implementing the metamodel in an object logic, we can exploit the query facilities of the
repository to provide the support for consistency checking of the design. The deductive capabilities of
ConceptBase provide the facilities to avoid manually assigning all the interdependencies of activity
roles in the conceptual perspective. It is sufficient to impose rules to deduce these interdependencies
from the structure of data stores and activities.
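One such rule can be sketched as follows. This is plain Python, not the object logic of ConceptBase, and the rule itself is an assumed example: activity B depends on activity A whenever B reads a data store that A writes. The `Activity` class and all names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """Hypothetical activity with the data stores it reads and writes."""
    name: str
    reads: frozenset[str]
    writes: frozenset[str]


def deduce_dependencies(activities: list[Activity]) -> set[tuple[str, str]]:
    """Return (dependent, provider) pairs deduced from data store usage.

    Assumed rule: B depends on A when B reads a store that A writes.
    """
    deps = set()
    for provider in activities:
        for dependent in activities:
            if provider is dependent:
                continue
            # A shared store between writes and reads implies a dependency.
            if provider.writes & dependent.reads:
                deps.add((dependent.name, provider.name))
    return deps


pipeline = [
    Activity("extract", frozenset({"source"}), frozenset({"staging"})),
    Activity("clean", frozenset({"staging"}), frozenset({"clean_area"})),
    Activity("load", frozenset({"clean_area"}), frozenset({"dw_sales"})),
]

deps = deduce_dependencies(pipeline)
print(sorted(deps))
```

Only the reads and writes of each activity are declared by hand; the dependency pairs follow from the rule.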
While the design and implementation of the warehouse are performed in a rather controlled
environment, the administration of the warehouse has to deal with problems that evolve in an ad-hoc
fashion. For example, during the loading of the warehouse, contingency treatment is necessary for the
efficient administration of failures. In such events, knowledge of the structure of a process alone
is not enough; the specific traces of executed processes must also be tracked down. In an
erroneous situation, not only the causes of the failure but also the progress of the loading process at the
time of the failure must be detected, in order to efficiently resume its operation. Still, failures during the
warehouse loading are only the tip of the iceberg as far as problems in a data warehouse environment
are concerned. This brings up the discussion on data warehouse quality and the ability of a metadata
repository to trace it in an expressive and usable fashion. To face this problem, the proposed process
metamodel is explicitly linked to our earlier quality metamodel. We complement this linkage by
mentioning specific quality factors for the quality dimensions of the ISO 9126 standard for software
implementation and evaluation.
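To make the role of execution traces during a failed load more concrete, here is a minimal sketch of how a trace could be used to find the cause of a failure and the point at which loading should resume. The `TraceEntry` record, the status values and the activity names are assumptions made for illustration; the metamodel does not prescribe this layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEntry:
    """Hypothetical trace record for one executed activity."""
    activity: str
    status: str      # assumed values: "committed" or "failed"
    error: str = ""  # cause of the failure, if any


def resume_point(plan, trace):
    """Return (next activity to run, recorded cause of the failure).

    `plan` is the ordered list of activities in the loading process.
    The first planned activity without a committed trace entry is the
    place where the load has to resume.
    """
    done = {t.activity for t in trace if t.status == "committed"}
    failure = next((t for t in trace if t.status == "failed"), None)
    for step in plan:
        if step not in done:
            return step, (failure.error if failure else "")
    return None, ""  # every planned activity has committed


plan = ["extract", "clean", "load"]
trace = [
    TraceEntry("extract", "committed"),
    TraceEntry("clean", "failed", "null key in staging row"),
]

step, cause = resume_point(plan, trace)
print(step, "-", cause)
```

The structure of the process (`plan`) says what should happen; the trace says what did happen, and both are needed to restart without repeating committed work.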
Identifying erroneous situations or unsatisfactory quality in the data warehouse environment is not
sufficient. The data warehouse stakeholders should be supported in their efforts to react against these
phenomena. The above-mentioned suitability notion in the conceptual perspective of the process
metamodel allows the definition of recovery actions for potential errors or problems (e.g., alternative
paths for the population of the data warehouse) in a straightforward way, during runtime.
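A recovery action based on suitability could look like the sketch below. It assumes that suitability is recorded as an ordered list of data stores able to feed a given target, and that the set of currently available stores is known at runtime; both structures and all names are hypothetical.

```python
def choose_source(target, suitable, available):
    """Return the first available store marked suitable for `target`.

    `suitable` maps a target table to candidate source stores, in
    order of preference. Returns None when no candidate is available.
    """
    for store in suitable.get(target, []):
        if store in available:
            return store
    return None


# Hypothetical suitability links: two stores can populate dw_sales.
suitable = {"dw_sales": ["staging_primary", "staging_mirror"]}

# At runtime the primary staging store has failed.
available = {"staging_mirror"}

fallback = choose_source("dw_sales", suitable, available)
print(fallback)
```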
Data warehouse evolution is unavoidable as new sources and clients are integrated, business rules
change and user requests multiply. The effect of evolving the structure of the warehouse can be
predicted by tracing the various interdependencies among the components of the warehouse. We have
already mentioned how the conceptual perspective of the metamodel traces interdependencies between
all the participants in a data warehouse environment, whether persons, programs or data stores. The
prediction of potential impacts (whether of political, structural, or operational nature) is supported by
this feature in several ways. To mention the simplest, the sheer existence of dependency links forecasts
a potential impact on the architecture of the warehouse in the presence of any changes. More elaborate
techniques will also be provided in this paper, by taking into account the particular attributes that
participate in these interdependencies and the SQL definitions of the involved processes and data stores.
Naturally, the existence of suitability links suggests alternatives for the new structure of the warehouse.
We do not claim that our approach is suitable for every kind of process; rather, we focus our attention on the
internals of data warehouse systems.
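Returning to the simplest form of impact prediction mentioned above, the sketch below follows dependency links transitively to list every component that a change could affect. The dictionary layout and the component names are assumptions for illustration, and the sketch ignores the attribute-level and SQL-level refinements.

```python
from collections import deque


def impacted(changed, depends_on):
    """Return every component that transitively depends on `changed`.

    `depends_on` maps a component to the set of components it
    depends on (persons, programs or data stores alike).
    """
    # Invert the links so we can walk from a component to its dependents.
    dependents = {}
    for component, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, set()).add(component)

    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(changed)  # a cycle must not report the change itself
    return seen


# Hypothetical dependency links.
depends_on = {
    "staging": {"source"},
    "dw_sales": {"staging"},
    "sales_report": {"dw_sales"},
    "dw_customer": {"crm"},
}

affected = impacted("source", depends_on)
print(sorted(affected))
```

Components outside the returned set, such as `dw_customer` here, have no dependency path to the changed component and are not forecast to be affected.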