Monday 4 May 2015

MSBI -I


MSBI - Introduction 





Normalization: It is the process of storing the same data into multiple tables (eliminating redundant data) and each table contains only related data(ensuring dependencies make sense).
Benefits:
Eliminate data redundancy,
Improve performance,
Query optimization,
Faster update due to less number of columns in one table,
Index improvement.

OLTP Systems (Online Transaction Processing Systems): In DBA’s view these systems are directly connected to application systems and the data is inserting into these systems, modifying the data and extracting the data when it is needed.  Means it maintains daily transactional summarized data (Dynamic Data). Normalized data is available.

ODS / DW (Operational Data Store / Data Warehouse): It integrates the data from multiple sources and stores in a database. And it maintains the detailed data / historical data (Static Data). Denormalized data is available.


Generally to stores the data into a Data Warehouse/ODS we are using ETL tools (Extraction, Transformation and loading). These are of 2 types…. GUI based (Informatica, Data Stage, Oracle Warehouse Builder, SSIS etc) and CUI based tools (SAS ETL, TeraData Utility, BCP etc) available.


Reporting tools are used to fetch the reports from DW, as time based (Daily, weekly, monthly), business factors based (E.g. Total sales on particular product, Maximum reselling products etc). So that we can analyse how much progress is being made in the business for certain period of time.




MSBI (MicroSoft Business intelligence)

MSBI tools are the set of technologies (SSIS, SSRS, SSAS..), used to collect the raw data from multiple data sources, Transform it into meaningful information and finally show data to users with a presentation.


BIDS (Business Intelligence Development Studio)/SQL Server Data tools:  It’s an integrated development environment (IDE), by using this we can develop the projects for “SQL server Integration Services”, “SQL server Analysis services” and “SQL server reporting services”.


SSIS: Is an ETL tool of Microsoft, introduced in SQL Server 2005(Before this DTS – Data Transformation Services are available). By SSIS we can Extract the data from different sources like Text files, Excel files, RDBMS Tables…etc and do some Transformations on that data like aggregations, Joins, Conditions … etc and load the data to Destination (DW).

All the SSIS files are organized into Packages, Projects and Solutions. 
  




Package is basic element in SSIS design flow. In general package means collection of items, in the same way SSIS package is a collection of Connections, Control Flow items, Data Flow items, variables, Parameters and Configurations. Each package is saved with a extension “.dtsx” in a project. You can create one or more packages in a Project. And a Project is a part of Solution, you can develop one or more Projects within a Solution.

(**Collection of packages stored in a project, Collection of Projects stored in Solution)


Connection Managers : This section will stores the meta data of sources and destinition connection. For example Server name, database name, user name and password etc.

Control flow Elements : It is a collection of Tasks and Containers(in another way Collection of executables). Optionally collection of Data Flow tasks. A Container represents the structure and Task represent the functionality.  
Example: 
(i)A Data Flow Task can represents the flow of the data from source to destination.  
(ii)A Execute SQL Task Represents to execute user defined queries,
(iii)A  Send Mail Task represent to send the mails to different recipients,
(iv)A File System Task represents copy or move the files/folders to different destinations.... etc. 

Data Flow Task: Already we know that a Data Flow Task can represents the flow of the data from source to destination with required changes in the data using by different types of Transformations like, SORT,MERGE,DERIVED COLUMN,CONDITIONAL SPLIT, UNION ALL.... etc. 

Event Handlers : At the time of executing the Packages or Tasks or Containers it will raise an event like OnError,OnWarning, OnPreExecute, OnPostExecute, OnProgress … etc. 

Variables : A Variable is a named object/temporary place, that store one or more values which will be used/referenced by various components through out the package. You can assign a value to the variable when it creates, can updates values at run time. Two types of variables in SSIS ,  System and User Defined. 

Configurations : Which allows to configure properties of a package externally and can run in different environments. For example you have created a package and migrated to another server. That you need to set the source and destination information externally by config files. Which can available in XML Config file, SQL Server, Environment variable ... etc.


1 comment: