Monday, 24 December 2012

Creating metadata for Database Connection using Talend Open Studio

In the previous posts we have created metadata for Delimited, Excel and XML files. You can visit these posts by using the following links:

Creating Metadata for Delimited files.
Creating Metadata for XML files.
Creating Metadata for Excel files.

Today, we are going to create a metadata for Database connection. Databases are the heart of most of the modern applications and business systems. Every developer in some way must have interacted with databases. There are lot of database types available in market like Oracle, Teradata, DB2 and MySQL. In my blog post, I am going to use MYSQL database. MySQL is world most used open source relational database management systems that runs as a server providing multi-user access to a number of databases.

For more information on MySQL please visit http://www.mysql.com.        

Now, lets get straight to creating metadata for Database connection.

1. In the Repository window, expand the Metadata section and right-click on Db Connections. Click on Create Connection. The Database Connection wizard will appear as shown in the following screenshot.

Enter a name for the connection. You can also enter additional information in the Purpose and Description boxes. There are other configuration options for Version, Status,and Path, but these are not mandatory, so we'll leave them for now.

Click Next.


2. In Step 2, Enter following database details. You will find these details from you DBA.

I.  DB Type (Type of Database. e.g. oracle, MySQL, DB2 etc.
II. DB Version (Version of your database)
III. Login Username and Password.
IV. Server IP address where your database is hosted and its Port number
V.  Name of the Database to which you want to connect.


3. Once you have entered all the database details. Click on Check to test the database connection. If the details you have entered is correct then pop up will appear with Connection successful will appear.


Click on Finish. to save the connection.

4. You will find this Database connection metadata in the Repository pane, Under the Metadata Section.



We have set up the database connection metadata, Now we need to add to this by retrieving the schema of the database. This will allow us to easily query tables from the database in our jobs without having to explicitly define them each time. Perform the following steps:



1. In the Repository window, expand the Metadata section, right-click on the MySQLDB connection, and select Retrieve schema.

2. By default, the schema wizard will retrieve all objects types, tables, views, and synonyms, and all objects by name (the name filter is set to %,which is the SQL wildcard notation). We could modify these settings so that, only tables with a specific name or name pattern are returned, but let's accept the default settings for now. Click on Next.



3.  The next screenshot shows the objects returned by the previous name pattern query. We can now explicitly set which objects we want to set in our schema. In our case, we want all of the database tables to be in our schema, so expand the classicmodels line to reveal the tables and check the box next to classicmodels to select all tables

4. Click on Next. Talend will work out the schema for each table based on the table definition within the database, but on the next screenshot, we can modify the schema as necessary. Again, we'll go with the defaults for now, so select Finish to complete the schema setup.

6.  Expand the MySQLDB_CON metadata connection to show the items within the Table schema folder.

 

Now we have set up the metadata for database connection. In the coming posts we will use this metadata to extract, Join and Load data in the database tables.