Tuesday, 23 April 2013

tMap vs. tJoin - Talend Open Studio.

tJoin and tMap are quite different components, although tMap can do everything that tJoin does. tJoin provides a "Join" function that matches a main flow against a single lookup flow, which helps you check the quality of source data against a reference data source.
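To make the tJoin behaviour concrete, here is a minimal Java sketch of an exact-key lookup that splits the main flow into matched ("main") and unmatched ("reject") rows. It is only an illustration of the idea, not the code Talend generates. The class `TJoinSketch`, the method `innerJoin` and the sample data are all made up for this example, and it assumes the inner-join option with a reject output is enabled on the component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: mimics an exact-key lookup with main and reject outputs.
// This is NOT Talend's generated code; every name here is hypothetical.
final class TJoinSketch {

    // Holds the two output flows of the sketch.
    static final class Result {
        final List<String> main = new ArrayList<>();
        final List<String> reject = new ArrayList<>();
    }

    // mainRows: {key, value} pairs; lookup: key -> reference value.
    static Result innerJoin(List<String[]> mainRows, Map<String, String> lookup) {
        Result result = new Result();
        for (String[] row : mainRows) {
            String ref = lookup.get(row[0]); // exact match on the key, no expression
            if (ref != null) {
                result.main.add(row[0] + ";" + row[1] + ";" + ref);
            } else {
                result.reject.add(row[0] + ";" + row[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> states = new HashMap<>();
        states.put("CA", "California");
        states.put("NY", "New York");

        List<String[]> customers = Arrays.asList(
                new String[] {"CA", "Alice"},
                new String[] {"TX", "Bob"});

        Result r = innerJoin(customers, states);
        System.out.println("main:   " + r.main);
        System.out.println("reject: " + r.reject);
    }
}
```

Rows whose key exists in the reference data go to the main output with the lookup value appended, and everything else goes to the reject output.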

However, tMap is more powerful in terms of functionality. Below I list the differences between the tMap and tJoin components of Talend Open Studio.

1. tMap can have many output links, whereas tJoin can have only a main link and a reject link.
2. In tMap we can use an expression on the columns when defining the join condition. I think this is not possible in tJoin, where only an exact match between the keys is supported.
3. tMap has an option to store intermediate data on disk.
4. In tMap we can enable the option to reload the lookup for every record.
5. tMap supports more join models (unique match, first match and all matches), whereas tJoin supports only a unique join.
6. tMap lets you link multiple lookup flows to it and can load them in parallel, whereas tJoin accepts only one lookup flow.
7. tMap supports the 'Die on error' option.
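Points 2 and 5 are easier to see with a small example. The Java sketch below shows how the three match models differ when the lookup contains duplicate keys, and how an expression on the key (here, trimming and upper-casing) changes what counts as a match. It is a rough illustration, not Talend's generated code. The class `MatchModelSketch` and all of its methods are made up for this post, and it assumes that unique match keeps the last matching lookup row, which is my understanding of the tMap behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: shows how match models differ when the lookup has
// duplicate keys. NOT Talend's generated code; every name is hypothetical.
final class MatchModelSketch {

    enum MatchModel { UNIQUE_MATCH, FIRST_MATCH, ALL_MATCHES }

    // Stand-in for an expression on the join key.
    static String normalize(String key) {
        return key.trim().toUpperCase(Locale.ROOT);
    }

    // Groups lookup rows ({key, value}) by normalized key, keeping input order.
    static Map<String, List<String>> index(List<String[]> lookupRows) {
        Map<String, List<String>> byKey = new LinkedHashMap<>();
        for (String[] row : lookupRows) {
            byKey.computeIfAbsent(normalize(row[0]), k -> new ArrayList<>()).add(row[1]);
        }
        return byKey;
    }

    // Returns the lookup values joined to one main row under the given model.
    static List<String> match(String mainKey, Map<String, List<String>> index, MatchModel model) {
        List<String> hits = index.getOrDefault(normalize(mainKey), Collections.<String>emptyList());
        if (hits.isEmpty()) {
            return hits; // no match: the main row would be a reject candidate
        }
        switch (model) {
            case FIRST_MATCH:
                return hits.subList(0, 1);
            case UNIQUE_MATCH:
                // Assumption: the last matching lookup row wins.
                return hits.subList(hits.size() - 1, hits.size());
            default:
                return hits; // ALL_MATCHES: one output row per matching lookup row
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> idx = index(Arrays.asList(
                new String[] {"ca", "first"},
                new String[] {"CA ", "second"},
                new String[] {"NY", "only"}));

        for (MatchModel model : MatchModel.values()) {
            System.out.println(model + " -> " + match(" Ca", idx, model));
        }
    }
}
```

Without the `normalize` step, the keys "ca", "CA " and " Ca" would not match each other at all, which is the exact-match limitation described for tJoin.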

Hence tMap is quite a powerful component compared to tJoin, which is basic. Because it is more powerful, tMap generates more code, and it may take more space and time than tJoin to load into memory while compiling. So we should use tJoin if it suffices for our requirements, and tMap otherwise.

Listed below are some examples of using tJoin and tMap components:

1. Performing lookup operations
2. Using the tJoin component to output main and reject data
3. Joining Excel worksheets that have different schemas

Please add more points to the list above. Let me know if you have any.
