As the definition states, the main purpose of a BI system is to help decision makers to make proper decisions based on the results of data analysis provided by the BI system.
A BI system usually uses a data warehouse as a core tool. The data warehouse is an integrated dimensional data structure. Data from a variety of sources will be fed into the data warehouse and some data quality and governance would be applied on the data. The dimensional model of data warehousing is optimized for reporting and analysis, so data visualization tools can directly query against the data warehouse. Another layer of modeling might be added to the BI architecture, OnLine Analytical Processing (OLAP), or the tabular model. These models will improve data access in terms of speed and performance of queries. BI systems have one or more data visualization frontends that will be the GUI for the end user.
Integration of systems & BI solutionsBusinesses use multiple operational systems to simplify, standardize, and automate their everyday jobs and requirements. Each of these systems may have their own database, some of which may work with SQL Server, some with Oracle. Some of the legacy systems may work with legacy databases or even file operations. There are also systems that work through the Web via web services and XML.
The rising number of operational systems also adds another requirement, which is the integration of systems together. Business owners and decision makers not only need integrated data but also require an analysis of the integrated data.
But some organizations and businesses use ERP systems that are integrated, so a question may appear in your mind that there won't be a requirement for integrating data.
So does that mean that these systems still require a BI solution? The answer in most cases is yes. The companies or businesses might not require a separate BI system for internal and parts of the operations that implemented it through ERP. However, they might require getting some data from outside, for example, getting some data from another vendor's web service or many other protocols and channels to send and receive information. This indicates that there would be a requirement for consolidated analysis for such information, which brings the BI requirement back to the table.
Data Warehouse DesignThe data warehouse is the core of the BI system. A data warehouse is a database built for the purpose of data analysis and reporting. This purpose changes the design of this database as well. As you know, operational databases are built on normalization standards, which are efficient for transactional systems, for example, to reduce redundancy.
A report on sales information may consume more than 10 joined conditions, which slows down the response time of the query and report. A data warehouse comes with a new design that reduces the response time and increases the performance of queries for reports and analytics.
Extract Transform Load
There are some ETL tools in the market that perform the extract, transform, and load operations. The Microsoft solution for ETL is SQL Server Integration Service (SSIS), which is one of the best ETL tools in the market. SSIS can connect to multiple data sources such as Oracle, DB2, Text Files, XML, Web services, SQL Server, and so on.
Data model – BISMA data warehouse is designed to be the source of analysis and reports, so it works much faster than operational systems for producing reports. However it is not that fast to cover all requirements because it is still a relational database, and databases have many constraints that reduce the response time of a query. The requirement for faster processing and a lower response time on one hand, and aggregated information on another hand causes the creation of another layer in BI systems. This layer is called a data model.
Microsoft's solution for the data model is split into two technologies: the OLAP cube and the In-memory tabular model. The OLAP cube is a file-based data storage that loads data from a data warehouse into a cube model.
The cube contains descriptive information as dimensions (for example, customer and product) and cells (for example, facts and measures, such as sales and discount).
Microsoft SQL Server Analysis Services 2012(SSAS) comes with two different types of modeling: multidimensional and tabular. Multidimensional modeling is based on the OLAP cube and is fitted with measures and dimensions. The tabular model is based on a new In-memory engine for tables. The In-memory engine loads all data rows from tables into the memory and responds to queries directly from the memory. This is very fast in terms of the response time.
Data visualizationThe frontend of a BI system is data visualization. In other words, data visualization is a part of the BI system that users can see. There are different methods for visualizing information, such as strategic and tactical dashboards, Key Performance Indicators (KPIs), and detailed or consolidated reports. As you probably know, there are many reporting and visualizing tools on the market.
Microsoft's SQL Server Reporting Services (SSRS) is a great reporting tool for creating detailed and consolidated reports. There are also components in Excel such as Power View, which are designed to build performance dashboards.
Master Data Management (MDM)Every organization has a part of its business that is common between different systems. That part of the data in the business can be managed and maintained as master data. For example, an organization may receive customer information from an online web application form or from a retail store's spreadsheets, or based on a web service provided by other vendors. Microsoft's solution for MDM is Master Data Services (MDS). Master data can be stored in the MDS entities and it can be maintained and changed through the MDS Web UI or Excel UI. Other systems such as CRM, AX, and even DW can be subscribers of the master data entities. Even if one or more systems are able to change the master data, they can write back their changes into MDS through the staging architecture.
Data Quality ServicesThe quality of data is different in each operational system, especially when we deal with legacy systems or systems that have a high dependence on user inputs. As the BI system is based on data, the better the quality of data, the better and the output of the BI solution.
As an example, Auckland might be written as "Auck land" in some Excel files or be typed as "Aukland" by the user in the input form.
As a solution to improve the quality of data, Microsoft provided users with DQS. DQS works based on Knowledge Base domains, which means a Knowledge Base can be created for different domains, and the Knowledge Base will be maintained and improved by a data steward as time passes. There are also matching policies that can be used to apply standardization on the data.
To gain an understanding of data warehouse design and dimensional modeling, it's better to learn about the components and terminologies of a DW. A DW consists of Fact tables and dimensions. The relationship between a Fact table and dimensions are based on the foreign key and primary key.
Fact or measure
Facts are numeric and additive values in the business process. For example, in the sales business, a fact can be a sales amount, discount amount, or quantity of items sold.
DimensionDimension tables are tables that contain descriptive information. Descriptive information, for example, can be a customer's name, job title, company, and even geographical information of where the customer lives. Each dimension table contains a list of columns, and the columns of the dimension table are called attributes.
Each dimension has a primary key, which is called the surrogate key. The surrogate key is usually an auto increment integer value. The primary key of the source system will be stored in the dimension table as the business key.