Grids in Data Warehousing
Summary: IT organizations face many problems when data volume explodes, but grid technology offers a way to cope. A grid helps handle large data volumes and speeds up data processing. IT firms that have implemented grid technology in their database and ETL tools are reaping the benefits of faster processing.
What is grid technology? A grid is a collection of low-cost servers connected over a high-speed network, in which IT resources such as computing power, storage and network capacity are pooled into a single set of shared services that can be distributed on demand. A grid makes effective use of resources that are already available.
Why do we use grid technology? Data warehousing is basically about loading several forms of data, such as raw data, summary data and metadata, from heterogeneous sources such as operational systems, mainframes and files. The process of extracting, transforming and loading this data into the data warehouse is called ETL, and grid technology is one of the most useful technologies for carrying it out. The general problem with ETL is that loading becomes inefficient as the data volume increases. Hence, the ETL process should be designed so that data loading completes within the given load window.
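The load window constraint can be illustrated with some simple arithmetic. The sketch below is only a rough estimate: the function names and the figures are hypothetical, and it assumes that load throughput scales linearly with the number of servers, which real grids only approximate.

```python
import math


def load_hours(rows: int, rows_per_hour_per_server: int, servers: int = 1) -> float:
    """Estimated load time in hours, assuming throughput scales linearly with servers."""
    return rows / (rows_per_hour_per_server * servers)


def servers_needed(rows: int, rows_per_hour_per_server: int, window_hours: int) -> int:
    """Smallest number of servers that finishes the load inside the window."""
    return math.ceil(rows / (rows_per_hour_per_server * window_hours))


# Hypothetical nightly load: 120 million rows, 10 million rows per hour per server,
# and a 4-hour load window.
ROWS = 120_000_000
RATE = 10_000_000
WINDOW_HOURS = 4

print("Hours on one server:", load_hours(ROWS, RATE))
print("Servers needed for the window:", servers_needed(ROWS, RATE, WINDOW_HOURS))
```

With these made-up figures a single server would overrun the window, which is the situation that motivates spreading the load over a grid.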
To handle the data explosion and its related problems, grid computing is an innovative solution that provides the following benefits:
1- Scalability- By distributing tasks over a shared pool of resources, both scalability and performance can be improved.
2- Reliability- In a grid, if one server fails, another server can take over its processing tasks. Thus, grid computing is a highly reliable way of processing.
3- Cost saving- Companies can improve their return on investment and lower their cost of ownership by using the computing power of otherwise unused resources.
4- Effective utilization of resources- A number of users can access the shared pool of resources in order to obtain the best possible response time. This is achieved by maximizing the utilization of all the resources available in the pool.
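The scalability and reliability points above can be sketched in a few lines of Python. This is a toy simulation rather than any vendor's implementation: the servers are plain functions, the names (`make_server`, `process_on_grid`, `ServerDown`) are hypothetical, and "processing" a partition simply means summing its values. Work is handed out round-robin, and a partition whose server is down is retried on the next server in the pool.

```python
from itertools import cycle


class ServerDown(Exception):
    """Raised by a simulated server that is unavailable."""


def make_server(name, healthy=True):
    """Return a simulated server: a function that processes one data partition."""
    def run(partition):
        if not healthy:
            raise ServerDown(name)
        return (name, sum(partition))
    return run


def process_on_grid(partitions, servers):
    """Distribute partitions round-robin; on failure, try the next server."""
    results = []
    pool = cycle(servers)
    for partition in partitions:
        # Try each server at most once per partition.
        for _ in range(len(servers)):
            server = next(pool)
            try:
                results.append(server(partition))
                break
            except ServerDown:
                continue
        else:
            raise RuntimeError("no healthy server available")
    return results


# Three simulated servers, one of which is down.
grid = [make_server("node1"), make_server("node2", healthy=False), make_server("node3")]
data = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(process_on_grid(data, grid))
```

Every partition is still processed even though one server is down; the remaining servers simply absorb its share of the work.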
Some examples of grid implementations:
1- Informatica Corporation
Informatica PowerCenter 8 is the Informatica release that harnesses the power of grid computing. It delivers load balancing, dynamic partitioning and parallel processing to ensure optimal scalability, performance and reliability.
2- Oracle Corporation
Oracle implemented grid computing in Oracle Database 10g, where the database can rebalance the workload as new processing capacity is added. Oracle Database 11g builds on this, delivering the benefits of grid computing with more self-management and automation in order to store more data, run queries faster, and protect and audit the data.
A grid in data warehousing is useful to several groups of people: IT managers and directors, data warehouse architects and specialists and, most importantly, business analysts and decision makers.
An expert data warehousing consultant has written this article.
Tags: Data Warehouse Consulting, Data warehousing consultants