Resources » Articles/Knowledge Sharing » Computer & Technology

Data Warehouse Architecture


Last Updated:   Category: Computer & Technology    
Author: Member Level: Gold    Points: 10


To know about data warehouse architecture and there three tier,OLAP services and implementation.



Data Warehouse Architecture:


Benefits:


Competitive advantage
Enhance business productivity
Customer relationship management
Cost reduction

To design a data warehouse, understand and analyze business needs and construct a business analysis framework.
The different views regarding the design are
Top-down view-selection of relevant information
Data source view-the information being captured, stored and managed by operational systems.
Data warehouse view-includes fact,dimension tables
Business query view-end user view

For building and using data warehouse needs certain skills
Business skill-how much system store, manage, hoe to build warehouse refresh software.
Technology skill-how to make assessments from quantitative information and to derive facts.
Program management skill-to interface with technologies.

A data warehouse can be built using a top-down approach, bottom-up approach and combined approach.The design and construction may consists of the following steps:
Planning
Requirements study
Problem analysis
Warehouse design
Data integration and testing
Deployment of data warehouse

Large software systems can be developed using
Waterfall method
Spiral method

The design process has the following steps
Choose a business process to model eg.,orders,sales.
Choose the grain of the business process.
Choose the dimension.
Choose the measure.

A Three-Tier Data Warehouse Architecture:


Bottom Tier-relational database is formed from operational database, external resources.
Middle Tier-using relational OLAP or multidimensional OLAP.
Top Tier-querying and reporting tools,analysis,data mining tools.


There are three data warehouse models,

Enterprise warehouse-It collects all of the information about subjects spanning the entire organisation.

Data Mart-It contains a subset of corporate wide data that is of value to a specific group of users.There are two types
Independent Data Mart-data captured from operational systems/external sources/from a particular area or department.

Dependent departments-directly from enterprise data warehouse
Virtual warehouse-It is a set views over operational databases.

Types of OLAP Services:



Relational OLAP servers(ROLAP)-intermediate server that is in between a relational back end server and client front end tools.

Multidimensional OLAP Servers(MOLAP)-It support multi-dimensional views of data array-based multidimensional storage engines.The vies are mapped to data cube array structures.

Hybrid OLAP servers-combining both i.e., the grater scalability of ROLAP and faster computation of MOLAP.

Data Warehouse Implementation:


Efficient Computation of data cubes:


The compute cube operator is used for cube computation i.e., it computes aggregates over all subsets of the dimension specified in the operation.
Eg.:Compute the sum of sales grouping by item and city.

Total number of cuboid is 2n i.e., 2^2=4.If no group-by is used then it is zero dimensional operation or O-D cuboid or apex cuboid's.If there are n group be then it is n-dimensional operation.

The cube operator is the n-dimensional generalization of the group by operator.Compute cube sales. Generates all cuboid's.
If the dimension has hierarchy, then the total number of cuboid.If there are more cuboid's which may lead to storage problem, then partial materialization is the good option.

The partial materialization should consider three factors:
Identify materialization subsets of cuboid's to materialistic.
Exploit the materialization cuboid's during the query processing.
Efficiently update the materialized cuboid's during load and refresh.

In OLAP, multi way array aggregation is used for computation of data cubes,
ROLAP uses value value based addressing using tuples and relational tables.MOLAP uses direct addressing using array index.

So, differently approach is used for array-based cube construction of MOLAP.
1.Partition the array into chunks.
2.Computer aggregate by visiting cube cells.
MOLAP is significantly faster than ROLAP>

Indexing OLAP data:


For low-cardinality domains, bitmap indexing is used.A bit vector BV, exist for value V in the domain of the attribute.
For high-cardinality domains, join indexing is used. It registers join able rows of two relations from a relational database.

OLAP Query Processing:


It proceeds as following:
Determine which operation should be performed on the available cuboid's.
Determine to which materialized cuboid's the relevant operations to be applied.

Metadata Re-positing:


Metadata are data about data i.e., it defines warehouse objects.It contains the following:
Structure description of the data warehouse.
Operational metadata-data image,currency of data,monitoring information.
Algorithms used for summarization.
Mapping from operational environment to data warehouse.
Data related to system performance.
Business metadata.

some of the data warehouse functions are
Data extraction
Data cleaning
Data transformation
Load
Refresh


Did you like this resource? Share it with your friends and show your love!




Responses to "Data Warehouse Architecture"

No responses found. Be the first to respond...

Feedbacks      

Post Comment:




  • Do not include your name, "with regards" etc in the comment. Write detailed comment, relevant to the topic.
  • No HTML formatting and links to other web sites are allowed.
  • This is a strictly moderated site. Absolutely no spam allowed.
  • Name:   Sign In to fill automatically.
    Email: (Will not be published, but required to validate comment)



    Type the numbers and letters shown on the left.


    Submit Article     Return to Article Index

    Awards & Gifts
    Active Members
    TodayLast 7 Daysmore...

    ISC Technologies, Kochi - India. Copyright © All Rights Reserved.