Monday, February 25, 2019

Several mid-process databases for the steps in an ETL process?

I really don't know much about the challenges ETL developers face, but I am getting some exposure. I had always envisioned data getting moved from a point of origin to a final destination with no other stops along the way, but it turns out that it can get more complicated than that. In a project we are working on, we have the concept of a contract, and it is represented in many different ways in many different systems. There are contracts in both SAP and JD Edwards, for example, and in JD Edwards a contract is spread across eight different tables with unintuitive names. The process we are laying out pulls records from SAP into a Contracts_SAP table in one database, and consolidates records from the eight JD Edwards tables into a Contracts_JDE table in that same database. Farther downstream there is a second, separate database that has just a single Contracts table. The very differently shaped Contracts_SAP and Contracts_JDE records get massaged into a common format on their way into that table, so that everything downstream of it sees contracts the same way regardless of which system they came from.
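To make the two hops concrete for myself, here is a minimal sketch in Python using two in-memory SQLite databases as stand-ins for the staging database and the downstream database. Only the table names Contracts_SAP, Contracts_JDE, and Contracts come from the process described above. Every column name, the sample rows, and the helper functions are made up for illustration, and the earlier step of consolidating the eight JD Edwards tables is left out entirely.

```python
import sqlite3

# Staging database: one table per source system, each keeping its own shape.
# All column names and sample values are hypothetical.
staging = sqlite3.connect(":memory:")
staging.executescript("""
    CREATE TABLE Contracts_SAP (ContractNumber TEXT, VendorName TEXT, StartDate TEXT);
    CREATE TABLE Contracts_JDE (DocNo INTEGER, Supplier TEXT,
                                StartYear INTEGER, StartMonth INTEGER, StartDay INTEGER);
    INSERT INTO Contracts_SAP VALUES ('4600001234', 'Acme Corp', '2019-01-15');
    INSERT INTO Contracts_JDE VALUES (98765, 'GLOBEX', 2018, 11, 3);
""")

# Downstream database: a single Contracts table in one common format.
downstream = sqlite3.connect(":memory:")
downstream.execute(
    "CREATE TABLE Contracts (SourceSystem TEXT, ContractId TEXT, Vendor TEXT, StartDate TEXT)"
)

def from_sap(row):
    """Map a (hypothetical) Contracts_SAP row to the common format."""
    number, vendor, start = row
    return ("SAP", number, vendor, start)

def from_jde(row):
    """Map a (hypothetical) Contracts_JDE row to the common format."""
    doc_no, supplier, year, month, day = row
    # Normalize the id to text, the vendor's casing, and the date to ISO 8601.
    return ("JDE", str(doc_no), supplier.title(), f"{year:04d}-{month:02d}-{day:02d}")

def load_contracts(staging, downstream):
    """Read both staging tables and write common-format rows downstream."""
    rows = [from_sap(r) for r in staging.execute(
        "SELECT ContractNumber, VendorName, StartDate FROM Contracts_SAP")]
    rows += [from_jde(r) for r in staging.execute(
        "SELECT DocNo, Supplier, StartYear, StartMonth, StartDay FROM Contracts_JDE")]
    downstream.executemany("INSERT INTO Contracts VALUES (?, ?, ?, ?)", rows)
    downstream.commit()
    return len(rows)

loaded = load_contracts(staging, downstream)
contracts = downstream.execute(
    "SELECT * FROM Contracts ORDER BY SourceSystem").fetchall()
```

The point of the intermediate stop, as far as I can tell, is that each source-specific table can stay close to the shape of its source system, while all of the reshaping lives in one small mapping per source (from_sap and from_jde in this sketch).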
