
4 steps in the evolution of integration architectures

Posted by Chris Chandler | Apr 8, 2021

Middleware-centric

  • Disparate systems and no centralised data warehouse
  • Reliable reporting but analytics is ad-hoc
  • ETL the primary method of modelling data

Before the cloud, this architecture evolved as the best way to make disparate systems talk to one another. In this context, integration passes certain critical data points from system to system. For example, a customer name change in the CRM is reflected in related systems. It remains necessary for companies with legacy software and proprietary systems.
At this stage of maturity, the data warehouse is a side-show for reporting and ETL is used to transform some data. For data to move between operational systems, enterprise middleware becomes a natural central point, acting like a multilane highway intersection. Normally, this technology is owned by IT, has broad use across the enterprise, and requires specialist skills to run. This approach struggles to scale as the web of dependencies grows and each integration serves a single use. Each linked system, as well as the middleware infrastructure itself, must be highly available and scalable to prevent business downtime.
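To make the pattern concrete, here is a minimal sketch of the kind of point-to-point sync middleware orchestrates, using the CRM name-change example above. The system names, endpoints and payload fields are all hypothetical; real middleware adds queuing, retries, mapping rules and monitoring.

```python
# Minimal sketch of a middleware-style, point-to-point sync.
# All system names, endpoints, and payload fields are hypothetical;
# real middleware adds queuing, retries, mapping rules, and monitoring.
import requests

# One mapping per destination: each integration has a single use.
DESTINATIONS = {
    "billing":  "https://billing.example.com/api/customers",
    "support":  "https://support.example.com/api/contacts",
    "shipping": "https://shipping.example.com/api/recipients",
}

def on_crm_customer_renamed(customer_id: str, new_name: str) -> None:
    """Fan a CRM name change out to every dependent system."""
    for system, url in DESTINATIONS.items():
        # Each destination needs its own field mapping and error handling,
        # which is why the web of dependencies grows with every new system.
        response = requests.patch(
            f"{url}/{customer_id}", json={"name": new_name}, timeout=10
        )
        response.raise_for_status()  # one failure and systems start to drift
```

Every new system added means another entry in that mapping, another schema to translate, and another failure mode to monitor, which is exactly why this approach struggles to scale.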

Cloud data warehouses began changing the natural central point of data as both sources and destinations moved to the cloud.

Two-stage data transformation

  • Data warehouse for some enterprise data
  • Source of truth for analytics and reporting
  • ETL the primary method for modelling data

As data warehouses become more capable, data teams start centralising enterprise data for analytics and reporting. Naturally, this modelled data is useful for other applications.
However, the ETL data modelling process is not agile enough to service every internal customer. Data is available, but responsibility shifts to the destination team to work out how to reshape it for their application. Integration now happens twice, and most often transformation along with it: data is transformed entering the warehouse, and again on the way out.

Given the skills required, the application team tends to turn to IT, who own the integration platforms. This time, instead of being the centre of all data, the integration is a dependency that grows in complexity until it becomes a bottleneck.
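A toy sketch of this double handling, with invented field names, shows the same customer record transformed once on its way into the warehouse and again on its way out to the application:

```python
# Toy sketch of the double handling: the same record is transformed twice.
# All field names are invented for illustration.

def etl_into_warehouse(crm_row: dict) -> dict:
    """First transformation: reshape source data into the warehouse model."""
    return {
        "customer_key": crm_row["Id"],
        "full_name": f"{crm_row['FirstName']} {crm_row['LastName']}",
        "region": crm_row["Mailing_Region__c"],
    }

def reshape_for_application(warehouse_row: dict) -> dict:
    """Second transformation: the destination team reshapes the warehouse
    output yet again to fit their application's schema."""
    first, _, last = warehouse_row["full_name"].partition(" ")
    return {
        "firstName": first,
        "lastName": last,
        "territory": warehouse_row["region"],
    }
```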

Modern warehouses change the paradigm, meaning data and IT teams can stop double-handling data.

Data-centric replication

  • Centralised data warehouse ingests all data
  • Source of truth for analytics and reporting
  • Modern ELT processes
  • Data pipelines are trusted for operational updates

In this scenario, data teams have adopted a modern cloud platform like Snowflake or BigQuery and are landing all data from across the enterprise. The warehouse becomes a highly reliable, curated and governed source of truth for all data. Using modern processes and tools, the data team can model data quickly for internal customers. The application team no longer has to reshape data, because it is efficient and repeatable for the data team to deliver data outputs as a service.
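As a hedged sketch of what this looks like in practice, modelling now happens inside the warehouse in SQL, ELT-style, and downstream teams consume the output table directly. The table and column names are invented, and the connection details depend on your account; it assumes the real snowflake-connector-python library.

```python
# Hedged ELT sketch: raw data lands in the warehouse untransformed, and the
# modelling happens inside the warehouse in SQL. Table and column names are
# invented; connection details depend on your Snowflake account.
import snowflake.connector  # real library: snowflake-connector-python

MODEL_SQL = """
CREATE OR REPLACE TABLE analytics.customer_360 AS
SELECT c.customer_key,
       c.full_name,
       SUM(o.amount) AS lifetime_value
FROM raw.crm_customers c
LEFT JOIN raw.billing_orders o ON o.customer_key = c.customer_key
GROUP BY c.customer_key, c.full_name;
"""

def rebuild_customer_model(conn: "snowflake.connector.SnowflakeConnection") -> None:
    """Re-run the in-warehouse model. Downstream teams consume the output
    table as a service instead of reshaping raw data themselves."""
    conn.cursor().execute(MODEL_SQL)
```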

From here, there are two practical approaches: a replication tool, or Omnata's unique Push.

Replication tools sit in the middle and talk to both the source and destination. Since data transformations happen in the warehouse, straight replication is far simpler than before. However, one team still needs to map the relationships and determine how to maintain an accurate replica of the data. Some tools allow a degree of transformation in this layer; however, this risks undoing the progress made in moving away from ETL.
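The replication-tool pattern looks roughly like the sketch below: a middle layer reads modelled rows from the warehouse and upserts them into the destination's API. Everything here is hypothetical, and it assumes a DB-API style warehouse cursor with format-style parameters and a modelled table that carries an updated_at column; real tools add cursoring, batching, retries and schema mapping.

```python
# Sketch of the replication-tool pattern: a middle layer reads modelled rows
# from the warehouse and upserts them into the destination's API. Names are
# hypothetical; real tools add cursoring, batching, retries and schema mapping.
import requests

def replicate_changed_rows(cursor, last_sync_ts: str) -> None:
    """Copy rows changed since the last sync into the destination system."""
    cursor.execute(
        "SELECT customer_key, full_name, lifetime_value "
        "FROM analytics.customer_360 WHERE updated_at > %s",
        (last_sync_ts,),
    )
    for key, name, ltv in cursor.fetchall():
        # Every row passes through this layer, so the destination's ingest
        # limits eventually become the bottleneck (the drawback noted below).
        requests.put(
            f"https://app.example.com/api/customers/{key}",
            json={"name": name, "lifetimeValue": ltv},
            timeout=10,
        ).raise_for_status()
```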

On the other hand, Omnata further simplifies this process with Push. Instead of a middle layer that extracts and loads into the destination, Push uses Snowflake’s external functions to move data directly to the application.
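As a conceptual illustration only, and not Omnata's actual implementation, the Snowflake mechanism Push builds on looks like this: an external function lets a SQL statement call out to an HTTPS endpoint, so rows move straight from the warehouse to the application with no extract-and-load layer in between. The integration, function and endpoint names are hypothetical.

```python
# Conceptual sketch only, not Omnata's actual implementation. It shows the
# Snowflake external-function mechanism that Push builds on: SQL that calls
# out to an HTTPS endpoint, so rows move straight from warehouse to app.
# The integration, function, and endpoint names are hypothetical.

CREATE_PUSH_FUNCTION = """
CREATE OR REPLACE EXTERNAL FUNCTION push_customer(payload VARIANT)
RETURNS VARIANT
API_INTEGRATION = app_api_integration  -- pre-created API integration object
AS 'https://app.example.com/push';
"""

PUSH_QUERY = """
SELECT push_customer(OBJECT_CONSTRUCT('key', customer_key, 'name', full_name))
FROM analytics.customer_360;
"""

def push_all(conn) -> None:
    """Send modelled rows to the application with no middle layer at all."""
    cursor = conn.cursor()
    cursor.execute(CREATE_PUSH_FUNCTION)
    cursor.execute(PUSH_QUERY)
```

Because no separate extract-and-load service sits in the path, there is one less copy of the data to secure and keep in sync.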

The main drawback of any replication method is that it will eventually hit the limits of the destination system, whether in ingest rate or storage, and those limits vary from system to system.

Data-centric live-query

  • Centralised data warehouse ingests all data
  • Source of truth for analytics and reporting
  • Modern ELT processes
  • Data pipelines are trusted for real-time operational use

To escape the limits of replication, future architectures no longer move data from system to system but query it in real time. As a result, data is always up to date, accurate and lightweight to serve. For this to happen, data pipelines need to be robust and low-latency, and the warehouse needs to be performant. Omnata uniquely offers this capability for Snowflake and BigQuery, both of which can handle these workloads.
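In sketch form, the live-query idea means the application asks the warehouse at request time instead of reading a replicated copy. This assumes a DB-API connection to a warehouse fast enough to answer inside the application's timeout; the table and fields are the same invented ones used above.

```python
# Sketch of the live-query idea: the application asks the warehouse at request
# time instead of reading a replicated copy, so the answer is always current.
# Assumes a DB-API connection to a warehouse fast enough for these workloads.

def customer_snapshot(conn, customer_key: str) -> dict:
    """Answer an application request directly from the warehouse."""
    cursor = conn.cursor()
    cursor.execute(
        "SELECT full_name, lifetime_value FROM analytics.customer_360 "
        "WHERE customer_key = %s",
        (customer_key,),
    )
    name, ltv = cursor.fetchone()
    # No copy to keep in sync and no ingest limits to hit -- but the pipeline
    # and warehouse must answer within the application's timeout.
    return {"name": name, "lifetimeValue": ltv}
```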

On the other side of the fence, the application also needs the ability to surface live queries. Salesforce has this capability in external objects, which behave as if they were on-platform data. The integrator no longer needs to operate within the confines of API load limits or worry about maintaining a silo within an application. A true customer 360 can be deployed where all relevant data is delivered natively to workflows.

Conclusion

As enterprises move increasingly into the cloud and towards centralising data, they can deliver savings and agility by moving away from architectures built around middleware. Omnata offers two simple integration approaches that reduce complexity and leverage the strengths of modern data warehouses.
