I am working on this data warehouse project from past 4.5 yr. starting from initial hardware invoice process to aix OS installation and mount point creation, DB2 installation, datastage installation, datastage configuration for Oracle, SQL and DB2 database and finally crystal report generation.
Let me explain how the DWH(Datawarehosue) process works.
General work flow.
S1(oracle)---------------------ETL(Datastage)---DWH(DB2 database)--Crystal report server--->DWH reportes
S2(DB2)-----------------------ETL(Datastage)---DWH(DB2 database)--Crystal report server--->DWH reportes
S3(SQL)------------------------ETL(Datastage)---DWH(DB2 database)--Crystal report server--->DWH reportes
S4(Access)------------------ETL(Datastage)---DWH(DB2 database)--Crystal report server--->DWH reportes
S5(Text file)------------------ETL(Datastage)---DWH(DB2 database)--Crystal report server--->DWH reportes
In the above process i have taken different kinds of source data (oracle,db2,sql,txt files) and through ETL software data stage .the data from all the source will be pulled and converted into one DB2 database formatted and put that into DWH (DB2 server).Then from DWH DB2 server the data is accessed by crystal report server and generated different kinds of reports for the finance and other department.
Hardware and software configuration
1.Hardware from IBM Pseris (570 and now Power servers)are very good server and give effective performance
a. Just install all the aix OS given in the CD and also the Linux box CD for the aix OS patches
b. make sure you create mount points for all and have good space. Always install db2 in /db2 and datastage in /dsdata only not in / or /home. so that you can avoid space problem later.
ex: for big project processing more than 20 to 40 GB data every month or even day
/db2(200 to 500GB for datafiles)
/dsdata(50GB to 100GB for projects)
/db2backup(50GB to 100GB for DB backup)
/dsdata(10 to 50GB for project backup)
/datasource(500GB to 1000GB for source file )
b. you can directly install db2 software no need any OS fix patches for this.(make sure you download directly the db2 software from IBM site and unzip it only in aix server not one local windows server and then ftp it to aix server) or can directly install it from CD.
Also configure DAS user and install tools database. So that you can install db2 client in windows and connect to UNIX DB2 server.
3.datastage software need C++ compiler installed before installing datastage.you can download it from IBm site or can install it from CD.after this you can install datastage software (7.5).No need any patches from aix OS.
If you want to connect to any oracle,db2 or SQL server. Just take the below parameter
Then go to odbc.ini file in datastage engine( ‘cat /. dshome’ ) then also edit the odbc.config file and go the specific projects and edit odbc.config file. Then the connection is done and DSN is created.
Data warehouse team
1. Datawarehouse consultant
2. Unix administrator
3. DB2 DBA
4. Crystal report developer.
5. Datastage application for ETL in DWH
1. Datastage administrator
2. Datastage developer
1. Datawarehouse consultant works as team manager and handle the team. he is responsible for
a. Monitor the data and activities inside the data warehouse
b.Administer the data warehousing environment, including security, data growth, performance, platform upgrades, support agreements, and disaster recovery
c.Administer the data warehousing services, including query services, and production services
d.Support the data warehousing user as End clients
e.Support the user in AIX IBM Pseries server,DB2 DBA,ETL tool Datastage developers and datastage admin, Crystal report developers, Backup admin team and Unix admin team
f.Provide technical support for the DWH Team for any problem
g.Understand the exact requirements from the clients and responsible to bring the new projects to the DWH team and deliver it to the clients in time
2.Unix administrator works as a team members of DWH and do all unix admin task.
3.DB2 DBA works as a team member of DWH and do all DB2 DBA task
4.Crystal report developer works as team members and do all report generation task
5.Datastage administrator works as team or team of members and do all admin task for datastage
5.Datastage developer work as a team or team of members and do all the datastage development task.
General work flow for any reports
1.Client requirement gather from DWH consultant and explaning it to team members to generate the report.
client-->DWH consultant-->unix admin-->Datastage admin-->datastage developer-->DB2 DBA-->crystal report developer
source(client)-->work structure(DWH consultant)->files loading to DWH(unix admin)-->datastage project creation(Datastage admin)->datastage jobs creation and running(datastage developer)->DB2 tables creation(DB2 DBA)->DWH report generation(crystal report developer)
What is the difference between Database and DWH
DWH is also a database. Collation of data from different database to one database is called DWH.
Why we need DWH?
Take an example of employee data
1.If employee requirement data is in SQL database(That’s is EMP NO,EMP Name,EMP salary)
2.If employee project location details in oracle(EMP Loc,Project name,DOJ)
3.If company details are in DB2 database(company name,loc,dept ,verticals)
4.if company profit details is in unix server and can get only TEXT file info
So if the company manager /finance manager/revenue manager/HR/anyone. want the report for "what is the report for employees working in India (Bangalore) who are joined in 2010 and have more than 50 K salary in department finance working in vertical health care with company profit.
Then it is very very difficult to gather information from all to one report.coz SQL can be create only EMP no, EMP name or salary report and oracle can create only EMP Loc,project,DOJ..etc etc..
So solution for this problem is DWH.DWH not only give this info for the present .it can store the data for more than 10 years and can give consolidated reports like which year how much profit from which department and which vertical to the company.
So all are moving towards the DWH solution. I am inviting you all to join me
Different kind of sources and DWH software’s as below
ETL popular tools
2. Business object
1. Business object/crystal report
The DWH server usual work flow to get more effective work and process is as below for the hardware and software as per my knowledge.But all can be mixed for any DWH implemantation.it depands on money for the project.
1.IBM Pserise(AIX) serverDatastageDB2crystal report or HyperionIBM DS4300 storage
2.Sun Solaris serverinformaticaoracle crystal report or HyperionSUN storage
Microsoft SQL server server with full pakage for windows server.