What are you looking for ?
Home » » Introduction (1)

Introduction (1)

{[['']]}

Data Warehouse

Business Intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access and analysis of information to improve and optimize decisions and performance.

As the definition states, the main purpose of a BI system is to help decision makers to make proper decisions based on the results of data analysis provided by the BI system.

A BI system usually uses a data warehouse as a core tool. The data warehouse is an integrated dimensional data structure. Data from a variety of sources will be fed into the data warehouse and some data quality and governance would be applied on the data. The dimensional model of data warehousing is optimized for reporting and analysis, so data visualization tools can directly query against the data warehouse. Another layer of modeling might be added to the BI architecture, OnLine Analytical Processing (OLAP), or the tabular model. These models will improve data access in terms of speed and performance of queries. BI systems have one or more data visualization frontends that will be the GUI for the end user.

Integration of systems & BI solutions

Businesses use multiple operational systems to simplify, standardize, and automate their everyday jobs and requirements. Each of these systems may have their own database, some of which may work with SQL Server, some with Oracle. Some of the legacy systems may work with legacy databases or even file operations. There are also systems that work through the Web via web services and XML.

The rising number of operational systems also adds another requirement, which is the integration of systems together. Business owners and decision makers not only need integrated data but also require an analysis of the integrated data.

But some organizations and businesses use ERP systems that are integrated, so a question may appear in your mind that there won't be a requirement for integrating data.

So does that mean that these systems still require a BI solution? The answer in most cases is yes. The companies or businesses might not require a separate BI system for internal and parts of the operations that implemented it through ERP. However, they might require getting some data from outside, for example, getting some data from another vendor's web service or many other protocols and channels to send and receive information. This indicates that there would be a requirement for consolidated analysis for such information, which brings the BI requirement back to the table.

The architecture and components of a BI system

The BI architecture and components differ based on the tools, environment, and so on. The architecture shown in the preceding diagram contains components that are common in most of the BI systems.


Data Warehouse Design

The data warehouse is the core of the BI system. A data warehouse is a database built for the purpose of data analysis and reporting. This purpose changes the design of this database as well. As you know, operational databases are built on normalization standards, which are efficient for transactional systems, for example, to reduce redundancy.

A report on sales information may consume more than 10 joined conditions, which slows down the response time of the query and report. A data warehouse comes with a new design that reduces the response time and increases the performance of queries for reports and analytics.

Extract Transform Load

It is very likely that more than one system acts as the source of data required for the BI system. So there is a requirement for data consolidation that extracts data from different sources and transforms it into the shape that fits into the data warehouse, and finally, loads it into the data warehouse; this process is called Extract Transform Load (ETL).

There are some ETL tools in the market that perform the extract, transform, and load operations. The Microsoft solution for ETL is SQL Server Integration Service (SSIS), which is one of the best ETL tools in the market. SSIS can connect to multiple data sources such as Oracle, DB2, Text Files, XML, Web services, SQL Server, and so on.

Data model – BISM

A data warehouse is designed to be the source of analysis and reports, so it works much faster than operational systems for producing reports. However it is not that fast to cover all requirements because it is still a relational database, and databases have many constraints that reduce the response time of a query. The requirement for faster processing and a lower response time on one hand, and aggregated information on another hand causes the creation of another layer in BI systems. This layer is called a data model.

Microsoft's solution for the data model is split into two technologies: the OLAP cube and the In-memory tabular model. The OLAP cube is a file-based data storage that loads data from a data warehouse into a cube model.

The cube contains descriptive information as dimensions (for example, customer and product) and cells (for example, facts and measures, such as sales and discount).

Microsoft SQL Server Analysis Services 2012(SSAS) comes with two different types of modeling: multidimensional and tabular. Multidimensional modeling is based on the OLAP cube and is fitted with measures and dimensions. The tabular model is based on a new In-memory engine for tables. The In-memory engine loads all data rows from tables into the memory and responds to queries directly from the memory. This is very fast in terms of the response time.

Data visualization

The frontend of a BI system is data visualization. In other words, data visualization is a part of the BI system that users can see. There are different methods for visualizing information, such as strategic and tactical dashboards, Key Performance Indicators (KPIs), and detailed or consolidated reports. As you probably know, there are many reporting and visualizing tools on the market.

Microsoft's SQL Server Reporting Services (SSRS) is a great reporting tool for creating detailed and consolidated reports. There are also components in Excel such as Power View, which are designed to build performance dashboards.

Master Data Management (MDM)

Every organization has a part of its business that is common between different systems. That part of the data in the business can be managed and maintained as master data. For example, an organization may receive customer information from an online web application form or from a retail store's spreadsheets, or based on a web service provided by other vendors. Microsoft's solution for MDM is Master Data Services (MDS). Master data can be stored in the MDS entities and it can be maintained and changed through the MDS Web UI or Excel UI. Other systems such as CRM, AX, and even DW can be subscribers of the master data entities. Even if one or more systems are able to change the master data, they can write back their changes into MDS through the staging architecture.

Data Quality Services

The quality of data is different in each operational system, especially when we deal with legacy systems or systems that have a high dependence on user inputs. As the BI system is based on data, the better the quality of data, the better and the output of the BI solution.

As an example, Auckland might be written as "Auck land" in some Excel files or be typed as "Aukland" by the user in the input form.

As a solution to improve the quality of data, Microsoft provided users with DQS. DQS works based on Knowledge Base domains, which means a Knowledge Base can be created for different domains, and the Knowledge Base will be maintained and improved by a data steward as time passes. There are also matching policies that can be used to apply standardization on the data.

Dimensional modeling

To gain an understanding of data warehouse design and dimensional modeling, it's better to learn about the components and terminologies of a DW. A DW consists of Fact tables and dimensions. The relationship between a Fact table and dimensions are based on the foreign key and primary key.

Fact or measure

Facts are numeric and additive values in the business process. For example, in the sales business, a fact can be a sales amount, discount amount, or quantity of items sold.

Dimension

Dimension tables are tables that contain descriptive information. Descriptive information, for example, can be a customer's name, job title, company, and even geographical information of where the customer lives. Each dimension table contains a list of columns, and the columns of the dimension table are called attributes.

Each dimension has a primary key, which is called the surrogate key. The surrogate key is usually an auto increment integer value. The primary key of the source system will be stored in the dimension table as the business key.

The Fact table

The Fact table is a table that contains a list of related facts and measures with foreign keys pointing to surrogate keys of the dimension tables.

Grain

Grain is one of the most important terminologies used to design a data warehouse. Grain defines a level of detail that stores the Fact table. For example, you could build a data warehouse for sales in which Grain is the most detailed level of transactions in the retail shop, that is, one record per each transaction in the specific date and time for the customer and sales person.

The star schema

There are two different schemas for creating a relationship between fact and dimensions: the snow flake and star schema. In the start schema, a Fact table will be at the center as a hub, and dimensions will be connected to the fact through a single-level relationship. 
The snow flake schema, as you can see in the preceding diagram, contains relationships of some dimensions through intermediate dimensions to the Fact table. If you look more carefully at the snow flake schema, you may find it more similar to the normalized form, and the truth is that a fully snow flaked design of the fact and dimensions will be in the 3NF.




Share this article :

Post a Comment

Flag Counter

Social Profile

 
Copyright x 2011. By Wael Medhat - All Rights Reserved