Thursday, October 11, 2007

Data Replication in SOA: The Price of Loose Coupling

When designing a corporate SOA architecture you are often faced with a tough choice: do you rely on a common database (centralized) or do you implement replication instead?

Let me explain what I mean. The idea in SOA is that you define more or less independent services that correspond (hopefully) to clearly defined and business-related activities. For instance, you could have a customer management service and a payment/invoicing service. The customer management service belongs to CRM, the invoicing to the billing department. However, both of these services might need the same customer data. Now what do you do? Basically, you have the following options:

  1. Use the same centralized customer database. This gives you the benefit of easy maintenance because there is only one copy. However, this also means that you are coupling your services into the same database schema, and updates to the schema are likely to affect more than one service.

  2. Replicate the customer database, by identifying one master (the CRM?) that regularly pushes or publishes updates (in an XML feed, for instance). While you lose the benefit of easy maintenance, this does give you loose coupling: as long as the XML format is the same, you can change DBMS schemas as much as you like - without affecting other services.

  3. Merge the customer and invoicing services into one. However, this may not always be possible or desirable, and may even defeat the purpose of service-oritentation altogether.

  4. Have the invoicing query the customer service for each payment. Thi seems to incur a lot of dependencies and network traffic.

So what do you do? My preference tends to go to the second option. However, it means that realistic SOA architectures are likely to have an event-driven nature.