Data Warehouse Architecture
Definition of a Data Warehouse
A data warehouse is a central repository that contains a copy of the data stored in multiple corporate sources. This data is later used for analyzing business data and supporting decision-making processes. In many companies a data warehouse serves as the heart of their individual Performance Management and Business Intelligence strategies.
Good design distinguishes an efficient and useful data warehouse. It is based on three guiding principles:
- Integrating data from disparate, heterogeneous databases to create a single version of the truth and enable comprehensive analysis
- Separating operational business data from management data for reporting, decision support and analysis
- Sound data modelling and mapping
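The first principle, integrating heterogeneous sources into a single version of the truth, can be illustrated with a minimal sketch. It assumes two hypothetical extracts, one from an ERP system and one from a CRM system; all field names, keys and values are invented for illustration and do not come from any specific product.

```python
# Hypothetical extracts from two operational systems; field names are assumptions.
ERP_ROWS = [
    {"cust_no": "C001", "revenue_eur": 1200.0},
    {"cust_no": "C001", "revenue_eur": 300.0},
    {"cust_no": "C003", "revenue_eur": 80.0},
]
CRM_ROWS = [
    {"customerId": "C001", "companyName": "Acme GmbH"},
    {"customerId": "C002", "companyName": "Beta AG"},
]


def integrate(erp_rows, crm_rows):
    """Map both source schemas onto one customer record per key."""
    customers = {}
    # Customer master data comes from the CRM extract.
    for row in crm_rows:
        customers[row["customerId"]] = {
            "customer_id": row["customerId"],
            "name": row["companyName"],
            "revenue_eur": 0.0,
        }
    # Revenue comes from the ERP extract and is summed per customer.
    for row in erp_rows:
        record = customers.setdefault(
            row["cust_no"],
            {"customer_id": row["cust_no"], "name": None, "revenue_eur": 0.0},
        )
        record["revenue_eur"] += row["revenue_eur"]
    return customers


unified = integrate(ERP_ROWS, CRM_ROWS)
for customer_id in sorted(unified):
    print(unified[customer_id])
```

A customer known only to the ERP extract keeps a missing name rather than being dropped, which makes gaps in the mapping between sources visible instead of hiding them.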
Why do I need a Data Warehouse?
Excellent question. There is ample literature on this very topic – not to mention many different views on the “one true” methodology and implementation. Our experience shows that the pragmatic approach of building smaller data marts for one single purpose (e.g. enterprise planning) can be beneficial.
Sooner or later, however, most companies will expand their activities across the performance management spectrum, at which point a data warehouse approach becomes necessary. For this reason, we create well-structured architectures, from both a business and a technical point of view, for even the smallest data marts to ensure future system scalability.
To support this important specification process between business and IT, pmOne offers solutions and tools for effective modelling.
Data Lake as a supplement to the data warehouse
In the context of big data, businesses are increasingly gaining access to new information such as unstructured documents, blogs and images from the Web or semi-structured XML, HTML and sensor data. In contrast to this, the Data Warehouse, which is usually located in the finance or controlling area, is primarily designed to process structured data that originates from the business systems such as ERP.
Companies that manage to combine these different types of data for analysis purposes increase their analytical quotient, gain new insights for corporate management, and thus gain a competitive advantage. In this respect, there is no question that information that is available beyond the operational management systems must also be prepared.
Analytics Platform as an extension
This is where the Data Lake comes into play as a complementary concept that can turn the data warehouse into an analytics platform. The Data Lake combines parallel processing with very high storage capacity. Large amounts of data can be stored there first and consulted later, for example to set up prediction models if required. In contrast to the Data Warehouse, the Data Lake is therefore designed for storing large datasets and is very flexible in processing different formats. Data in the data warehouse is generally available in such a way that business users from the finance and controlling departments can evaluate it independently with the appropriate tools. The purposeful use of the heterogeneous Data Lake information, which often benefits sales and marketing, first of all requires the special skills of data scientists.
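The idea of storing data first and interpreting it later can be sketched as follows. This is a simplified in-memory illustration, not a real data lake: the `RawStore` class, the format tags and the sample payloads are all hypothetical. Payloads are kept in their original form, and a parser is applied only when a particular analysis needs them.

```python
import json
import xml.etree.ElementTree as ET


class RawStore:
    """Toy stand-in for a data lake: keeps payloads unparsed, tagged by format."""

    def __init__(self):
        self._objects = []

    def put(self, fmt, payload):
        # No schema is enforced at write time.
        self._objects.append((fmt, payload))

    def read(self, fmt, parser):
        # Structure is imposed only at read time by the supplied parser.
        return [parser(payload) for f, payload in self._objects if f == fmt]


def parse_json(payload):
    data = json.loads(payload)
    return (data["sensor"], float(data["temp"]))


def parse_xml(payload):
    element = ET.fromstring(payload)
    return (element.get("sensor"), float(element.get("temp")))


lake = RawStore()
lake.put("json", '{"sensor": "s1", "temp": 21.5}')
lake.put("xml", "<reading sensor='s2' temp='19.0'/>")
lake.put("text", "Great product, would buy again")  # unstructured, kept as-is

readings = lake.read("json", parse_json) + lake.read("xml", parse_xml)
print(readings)
```

The free-text entry stays in the store untouched until someone writes a parser or model for it, which is the flexibility, and the need for specialist skills, described above.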
Data warehouse summarized
A data warehouse is a subject-oriented, historical and autonomous database of a company that integrates and manages data from multiple independent source systems. The data is extracted from the data sources and loaded into the data warehouse, where it is stored for long-term data analysis, business decision support and data mining.
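The historical aspect of this summary can be shown with a small sketch using Python's built-in `sqlite3` module and an in-memory database. The table name `sales_history`, its columns and the sample figures are assumptions made for illustration. Each load appends rows stamped with a load date instead of overwriting earlier ones, so past states remain available for analysis.

```python
import sqlite3


def create_warehouse():
    """Create an in-memory database with one append-only history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales_history ("
        "load_date TEXT, source TEXT, product TEXT, amount REAL)"
    )
    return conn


def load_snapshot(conn, load_date, source, rows):
    """Append one extract; existing rows are never updated or deleted."""
    conn.executemany(
        "INSERT INTO sales_history VALUES (?, ?, ?, ?)",
        [(load_date, source, product, amount) for product, amount in rows],
    )
    conn.commit()


def total_by_date(conn):
    """Aggregate across all sources for each load date."""
    return conn.execute(
        "SELECT load_date, SUM(amount) FROM sales_history "
        "GROUP BY load_date ORDER BY load_date"
    ).fetchall()


conn = create_warehouse()
load_snapshot(conn, "2024-01-31", "erp", [("widget", 100.0), ("gadget", 50.0)])
load_snapshot(conn, "2024-02-29", "shop", [("widget", 120.0)])
history = total_by_date(conn)
print(history)
```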