Integration is a complicated business. Tying together the IT landscape means lots of connections, interface points, APIs, protocols, messaging types, data formats, etc… There are plenty of patterns and area’s of best practice that can be applied to simplify your approach to integration and reduce the complexity, however, one inevitability of being in the middle of all these connections is that when there is an issue with what has (or has not) arrived from system A to system B, the integration platform is usually the first port of call.
Recognising this natural instinct from an anxious PM or Incident manager needs to be factored into your planning assumptions when building an integration function. It’s important to ensure the integration team is able to accommodate the additional level of triage and support in an operations or in a system integration test phase. If this isn’t feasible then the upskilling of test, or operations teams to identify whether a defect is integration or not is another approach that can help to minimise the levels of incorrect assignment of defects or incidents. In our experience, the integration testing rarely talks place on a programme with time to spare, and in a production incident identifying the root cause quickly is critical.
Making sure your integration team is prepared for this scenario will help make sure issues are dealt with quickly, saving cost and reducing incident outages. With the correct monitoring this can also be a source of important metrics for the CIO highlighting areas of instability or risk across the broader IT platform. After all, if you’re finding that 87% of the issues being called out to your integration team are not integration issues, then it’s important to make sure they are being fixed in the right place.