There are many ETL tools which will support for the data warehouse .The most efficient Tools which I am using from past many years is Datastage. This was Assential company product later it was occupied by IBM.As time pass the new requirements and the problem faced in the tool has to be modified to sustain in the market .so even datastage has under gone many changes and come up with new versions to handle the data effectively.
I am starting from datastage 7.5 version to the latest datastage 8.5 version in this session. I have started my work on datastage 7.1 ,7.5 and 8.0,8.1,8.2 and now datastage 8.5.
The major change from datastage 7.5 is to datastage 8.0 is as fallows
The datastage 7.5 version
The installation of the datastage was done in one server.where the datastage service,Engine(server or parallel) and metadata was all installed in one server only. The client was installed in any windows server/pc and will be allowed to work with server for designing the jobs and administrating the projects. The software was installed either by root/dsadm and are made a part of dstage group for this software. if new users want to have access to datastage client means .just need to create one Unix user in the server and add that user to dstage group/or create one more group and give necessary privileges/restricted privileges based on projects.
so other components are majorly used as fallows
1.Datastage administrator-used for datastage project administration
2.Datastage Designer-used for datastage job designing
3.Datastage director-used for datastage job schedule/lock-unlock the jobs
4.Datastage manager-used for import/export of the project jobs…etc
The datastage new version 8.1 to 8.5
The datastage installation has been changed in 8.0/8.1 version. The new component called information server was added and is used for user management in the web console.
The web console Mapping.
Once the datastage software is installed in the server. we will get 3 types of user
1.dsadm-datastage administrator user
2.wasadmin-websphere administrator user
3.isadmin-information server administrator user
So using isadmin .please login to web console and to the mapping for Unix datastage admin user by keying the password in user credentials.
Once the dsadm mapping is done. We can create n-number of user to provide the datastage client login access.
Note: if the user is not present in web console. He will not be able to connect to datastage server using datastage client.
After software installation and user dsadm mapping is completed. Please create the new project. here it is always suggested to create the project in different mount point since /opt/IBM path space will be very less.please create any floder in new mount point EX:/dsadm/projects and give 775 or 750 and make the project folder owner as (dsadm:dstage) and then create new projects in this path.
After the project and jobs creation. most of the time we will come across with new kinds of bugs/problems in 8.1 version for which IBM has released fix packs.
The below are some of the fix packs we used in our environment.
Command to see the version
# grep -i Success /opt/IBM/InformationServer/Version.xml
The Application team and APPC /DBA team work flow for this patch installation
1.App-team send mail to bring down the datastage server
2.SP-will bring down the datastage server
3.DBA will take the backup of database(xmeta,iauser or full DB backup)
4. SP-will take the backup of /opt/IBM file system
6.SP-will bring up the datastage server
7.SP- will install the patch XML +jr37776 /any patch on the servers
8.App-team-will test the job and give sign-off
Data stage WebSphere upgraded from 188.8.131.52 to 184.108.40.206
Most of the big environment will have datastage in the below formate. where datastage application,service,Engine and metadata will be stored in different different server or may be in one server . In our enviornment we have datastage service, engine in one server and metadata (application database) in another server.
Application Database-the Xmeta,iauser schema will be created in this DB and this Application Database holds the datastage binary and configuration information of the datastage software(Note: if Application database is crashed means datastage is also crashed. so always take the backup of Application Db also while take /opt/IBM datastage backup for any activity)
Country Database-It contains the actual data used for datawarehouse.
Datastage 8.5 start/stop scripts
/opt/IBM/InformationServer/Server/DSEngine/bin/uv -admin -stop
/opt/IBM/InformationServer/Server/DSEngine/bin/uv -admin –start