Introduction to data warehouse

We have many a times heard about Data Warehouse but actually what is Data Warehouse. First of all let me give you an example so that the concept of data warehouse is clear. You must have heard about big bazar, dmart etc. what are they? They are shopping mart having lots of branches in different cities.

Let’s assume they have 10 branches and the owner wanted to know how much business has been generated that means how much sales has been done.

So the manager went to all 10 branches individually and spent at least 10- 15 minutes to collect full data and handover the information about the sales to the owner.

Again the owner wanted to know about the product. Now the manger got irritated as he has to go again and again in all branches to collect required data and hence decided to quit the job.

That means going to different places and retrieving data seems very time consuming irritating and tiring hence here comes in picture the word Data Warehouse.

Every month from all places the data is collected and stored in one big place known as Data Warehouse.

Now if the owner wants the information about the sales he can retrieve it from Data Warehouse without contacting the manager or third party. Likewise he can get data related to product too.

So the first thing to know is -What is Data Warehouse?

It is the collection of big- big data from small database just like godown.

Secondly what comes in my mind is that what kind of data is stored in Data Warehouse.

There are two types of data-

  1. Operational data
  2. Strategic information

Both are very important depending upon their use.

1.Operational data – It is built for one application or subject area at a time and is used in day to day work. For example if some personal order is given than this order should be delivered at some particular place in particular time or if any employee is coming late then some part of his salary should be deducted. Hence these are called operational data.

2.Strategic information – Strategic information is built for multiple subject area at a time and contains data like how much sales has been done or which product has been sold or where no product has been sold. It also tells in which season maximum or minimum selling has been done. Therefore that information which helps in decision making to know why your business is not going smoothly or having profit is known as strategic information.

Hence in Data Warehouse all strategic information are stored and operational data remains in their own branches.

Need of Data Warehouse

As time changes business started growing and all businessmen wanted that their business should grow better than other but with operational data they cannot compete each other so then came the need of Data Warehouse where all strategic information like how much sales has been done or which product has been sold or not at all sold. Which product need more marketing or even don’t need at all are recorded.

In the word of William H. (Bill) Inmon, the father of Data Warehouse we can define Data Warehouse as-

“A Data Warehouse is – a subject oriented, non-volatile, and integrated, time variant collection of data in support of management’s decisions.”

Let’s discuss each term used in the definition in more easy way.

1.Subject oriented– Data Warehouse is a collection of big data and in that there are small rooms known as Data Mart which contain information about sales, customer, product, and employee. It is lower-cost, scaled down version of the DW .So subject oriented means all details about one kind of subject stored at one particular place and another kind of subject stored at other place in an organized way so that information can be available instantly as and when needed.

2.Integrated– India is a country of people belonging to different culture and language but still living together under one nation .In the same way we have so many branches having different database like MYSQL, ORACLE etc. coming together and collecting in the Data Warehouse having standard format. This is what we call it as integration. Different kind of data collected together in particular place in a standard format.

3.Non- Volatile– What we do in day to day operational data , we give order, make changes, delete or update data but in Data Warehouse once the data is stored you cannot edit or update what you can do is only add or delete it. That why it is called non-volatile.

4.Time variant– It means data warehouse can have data of the past years. For example if I had done lot of sales five years back and continuously I did my business for 3 years and suffer loss next two years then I will see which pattern or process I had followed 3 years back and try to collect this information which I will get from Data Warehouse.

So all these information and data is given to the employer to decide how to grow his business and earn profit.
Hence Data Warehouse is used by the management in decision making so as to grow his business an devaluate future strategy.

Valuable information is added to the DW, allowing the management to learn from the success and failure of the past and to have the ability to analyze plan and react to the changing business conditions.

Hope the concept of Data Warehouse is much clear in simple way.



  • January 24, 2018 at 10:07 am

    I remember vague something like star schema, demoralised design and terms like ETL. Nice overview about data warehouse.


Leave a Reply

Your email address will not be published. Required fields are marked *