Why a Talend Framework?

I’ve worked on a number of data migration and data integration projects, using a number of data integration tools, including Talend. A common theme of much of this work is that the pressure to deliver user functionality has often been to the detriment of standards and frameworking. By this, I mean that we promise to include tasks such as defining development standards, writing reusable code, and building consistent error handling, predictable and reliable deployment, performance monitoring and reconciliation; however, these are often forgotten, or shoe-horned in at the last minute in an inelegant and inconsistent manner.

When you see your first demonstration of Talend, and this is true of any data integration tool, you will quickly see how easy it is to drop components from a palette, connect them together, and get some useful results. This is, however, far from the reality of building something that you can maintain and deploy to a production environment in a reliable and consistent manner.

I have built the Talend Framework with frameworking in mind, rather than letting its design be dictated by any specific functional requirement. This means that you can use the components of the framework in all of your day-to-day work, allowing you to concentrate on your user requirements, knowing that you have consistency across all of your development.

The Talend Framework is all original code, and is licensed under the Apache License, Version 2.0 (January 2004). Please use this Framework under the terms of this license.

Do I need a Framework?

Here are just some of the reasons why you might need a framework.

  • Do you work in an environment where you have tens or hundreds of Jobs that have been built by a number of developers with different levels of experience? Have some of these developers now left the organisation? Do no two Jobs work or fail in a consistent manner?
  • Do some of your Jobs send an Email if they fail, while the remainder just write an unhandled Exception to the Console Log? Do they all do it in a different manner? Are Email credentials stored in different places?
  • Has your file system run out of space because you have not been archiving and then deleting aged files?
  • Has one of your Jobs failed to run for a few days and you had no idea?
  • Has your Cloud Software vendor changed the way you connect to their service, and do you now need to modify a number of Jobs because the connection information has not been externalised?
  • Are your users telling you that some of their data is missing, but you are unable to reconcile the results from previous runs of your Job?
  • Has one of your Jobs inadvertently run against your Production database rather than Development, or the other way round, because some credentials had been hard-coded?
  • Do you need a framework?