Custom Lineage in Informatica Cloud Data Governance & Catalog

Agenda:

  • What is Custom Lineage?
  • What are the types of Association?
  • What is a Custom Links file?
  • How to get the Reference ID of an asset?
  • How to create a Custom Links file?
  • How to create a Custom Catalog Source in Metadata Command Center?

Custom Lineage:

Introduction:

  • Custom Lineage is used to view or populate the data flow from one object to another, either within a single source system or across source systems.
  • Custom Lineage accepts a .zip file as input. The zip contains a csv file with the object information and the association types.
  • We can use Custom Lineage when no supported ETL tool exists for the source system, or when we want to populate our own custom data flow across assets.

In short, Custom Lineage lets us populate our own data flow visualization across assets. The association types define the level of the linkage, so they tell the catalog what type of lineage to populate between the source and the target.

What are the types of Association?

What is a Custom Links file?

  • The Links file is a csv file. It must have the following headers: Source, Target and Association.
  • Source refers to the origin where the data flow is coming from.
  • Target refers to the destination where the data flow is going.
  • Association refers to the type of link between source and target, that is, a table-level or a column-level association.
  • The Source and Target fields must contain the Reference ID of the asset as the input value.
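The layout described above can be sketched in Python. This is a minimal illustration, not Informatica's own tooling: the Reference IDs and association names in `links` are hypothetical placeholders, and `build_links_csv` is a helper name invented for this example. Replace the placeholders with the Reference IDs copied from CDGC and the association types available in your environment.

```python
import csv
import io

# Header row required by the Links file, as described above.
HEADERS = ["Source", "Target", "Association"]

# Hypothetical placeholders (assumptions): substitute real Reference IDs
# and the association type names supported by your CDGC environment.
links = [
    ("<source-table-reference-id>", "<target-table-reference-id>", "<table-level-association>"),
    ("<source-column-reference-id>", "<target-column-reference-id>", "<column-level-association>"),
]


def build_links_csv(rows):
    """Return the Links file content as CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADERS)
    for source, target, association in rows:
        # Every row needs all three values; an empty one cannot describe a link.
        if not (source and target and association):
            raise ValueError("Source, Target and Association are all required")
        writer.writerow([source, target, association])
    return buffer.getvalue()


csv_text = build_links_csv(links)
print(csv_text)
```

Each row describes one link, so a table-level link and its column-level links go on separate rows.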

How to get the Reference ID of the Table and Column?

  • The Reference ID is a unique ID that can be copied from the Cloud Data Governance and Catalog (CDGC) UI.
  • To get the Reference ID of a particular asset, navigate to CDGC -> search for the catalog name.
  • Open the Source Catalog -> open the table -> go to Attributes -> copy the Reference ID
  • Open the Source Catalog -> open the column -> go to Attributes -> copy the Reference ID
  • Open the Target Catalog -> open the table -> go to Attributes -> copy the Reference ID
  • Open the Target Catalog -> open the column -> go to Attributes -> copy the Reference ID

How to create a Custom Links file?

  • Once we have the Reference IDs of the assets and the association types, create the csv file.
  • In the csv file, enter the relevant asset information along with the association types. Make sure the Source and Target fields contain the Reference ID of the asset.
  • Once the csv file contains all the asset information, zip it as Links.zip. We recommend using the same base name for the csv file and the zip file.
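The packaging steps above can be scripted. The following is a sketch using only the Python standard library; `package_links` and its parameters are names made up for this example, and the row values are placeholders rather than real Reference IDs. It writes the csv file and zips it under the same base name, as recommended above.

```python
import csv
import tempfile
import zipfile
from pathlib import Path


def package_links(rows, out_dir, base_name="Links"):
    """Write <base_name>.csv and zip it as <base_name>.zip in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / f"{base_name}.csv"
    zip_path = out_dir / f"{base_name}.zip"

    # newline="" lets the csv module control line endings itself.
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target", "Association"])
        writer.writerows(rows)

    # Store the csv at the root of the archive, without directory components.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.write(csv_path, arcname=csv_path.name)
    return zip_path


# Usage with placeholder values (assumptions, not real Reference IDs).
with tempfile.TemporaryDirectory() as tmp:
    rows = [("<source-reference-id>", "<target-reference-id>", "<association-type>")]
    zip_path = package_links(rows, tmp)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    print(zip_path.name, names)
```

Keeping the csv at the root of the archive is a design choice in this sketch; the article only states that the zip must contain the csv file.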

How to create a Custom Catalog Source in Metadata Command Center?

  • There is no native or predefined scanner for custom lineage, so we must create a custom catalog source to run the custom lineage job.
  • Open Metadata Command Center -> click New
  • Click Customization -> Custom Catalog Source Type -> click Create
  • Type the Custom Catalog Type Name and click Save
  • Go to Catalog Source, search for the Custom Catalog Type Name you created, and click Create
  • Once the custom catalog source is created, we can use it to upload the zip file and populate the custom lineage in CDGC.
  • Now, enter the lineage catalog name and upload the Links.zip lineage file from your local machine using the Browse option.
  • The Configuration and Association fields are not mandatory, so click Next -> Save -> Run
  • Once the job runs successfully, the job details for the catalog source appear in the monitor.
  • The source and target lineage connection is now reflected in CDGC.

 

Final View of Custom Lineage:

To view the lineage, open the Catalog -> open the Table -> go to Lineage.

To view Data Set Level:

To view Data Element Model:

Advantages and Disadvantages

Advantages:

  • Flexibility: Allows for detailed and tailored lineage tracking.
  • Enhanced Visibility: Provides a comprehensive view of data movement and transformation.
  • Compliance: Helps meet specific regulatory and governance requirements.
  • Integration: Bridges gaps between diverse systems and processes.

Disadvantages:

  • Development Effort: Requires time and resources to develop and maintain.
  • Complexity: Adds complexity to the data integration environment.
  • Ongoing Maintenance: Needs regular updates and validation.
  • Cost: Can be costly in terms of both time and resources.

Conclusion:

Custom lineage is a powerful tool for organizations that need more precise and tailored data tracking than what is offered by default lineage solutions. By defining custom rules and configurations, organizations can gain deeper insights into their data processes, enhance compliance, and better integrate complex systems. While there are challenges in terms of development and maintenance, the benefits of customized data visibility and governance often outweigh the drawbacks, making custom lineage a valuable addition to any comprehensive data management strategy.

Please reach out to us for your Informatica solution needs. We are an Informatica Platinum Partner with extensive experience with Informatica implementations and data integration.


