Saturday, June 11, 2011

Basics of Data Model

Relational Vs Dimensional Model :



1) Dimensional Modeling approach provides a way to improve query performance for summary reports thereby enabling faster business decision making process.
2) Dimensional Modeling is easier to understand and navigate as it represents business subject areas in an intuitive style.
3) Dimensional Data Modeling is used for storing and reporting summarized data. For example, sales data could be collected on a daily basis and then be aggregated to the week level, the week data could be aggregated to the month level, and so on.



Star Schema:

1) Star Schema is a relational database schema for representing multidimensional data.
2) It is the simplest form of data warehouse schema that contains one or more dimensions and fact tables.
3) Entity-relationship diagram between dimensions and fact tables resembles a star where one fact table is connected to multiple dimensions.
4) The center of the star schema consists of a large fact table and it points towards the dimension tables. 
5) The advantage of star schema are slicing down, performance increase and easy understanding of data.




Snowflake Schema:

1) A snowflake schema is a term that describes a star schema structure normalized through the use of outrigger tables. i.e. dimension table hierarchies are broken into simpler tables.
2) Hierarchies (category, branch, state, and month) are being broken out of the dimension tables (PRODUCT, ORGANIZATION, LOCATION, and TIME) respectively and shown separately. 
3) Snowflake schema approach increases the number of joins and poor performance in retrieval of data.
4) Normalization of dimension makes it difficult for user to understand
5) Dimension tables are normally smaller than fact tables - space may not be a major issue to warrant snowflaking.



 Important aspects of Star and Snowflake schema:

1) In a star schema every dimension will have a primary key.
2) In a star schema, a dimension table will not have any parent table.
3) In a snow flake schema, a dimension table will have one or more parent tables.
4) Hierarchies for the dimensions are stored in the dimensional table itself in star schema.
5) Whereas hierarchies are broken into separate tables in snow flake schema. These hierarchies helps to drill down the data from topmost hierarchies to the lowermost hierarchies.



Model Implementation:

1) In the Data Modeling Life cycle the last phase is the implementation of the Physical model. During the physical design process, you convert the data gathered during the logical design phase into a description of the physical database structure. Physical design decisions are mainly driven by query performance and database maintenance aspects.
2) During the physical design process, you translate the expected schemas into actual database structures. At this time, you have to map:
a) Entities to tables
b) Relationships to foreign key constraints
c) Attributes to columns
d) Primary unique identifiers to primary key constraints
e) Unique identifiers to unique key constraints





3) Once you have converted your logical design to a physical design you will need to create or use some or all of the following features while implementing you Data warehouse solution.
Partitions, Bitmap Index, Materialized Views, Analytic SQL function, Parallel Execution, etc

Regards,

Deva

No comments:

Post a Comment