Communication Monitoring

The A&D Solution in IFS Cloud is distributed: it coordinates multiple services through a series of complex message interactions across various frameworks and infrastructure, each with its own failure points and error-surfacing mechanisms. When a communication request or event fails, the errors are surfaced or logged in different ways, which makes it difficult to handle errors cohesively across the solution.

The monitoring service, which is centered around the Communication Log page (ADMON_COMMUNICATION_LOG), is used to monitor communication between IFS Cloud and A&D Services and to surface errors through a uniform interface.

Communication Pattern – Request

Using the communication between Mobile Maintenance (MM) for Aviation and Maintenix services as an example, the following diagram highlights the key communication stages in a Request pattern. Each stage will appear as a log record in the Communication Log page.

The following table describes the communication log type codes used to identify each stage in the Request communication.

| Number | Communication Type | Notes |
| --- | --- | --- |
| 1 | Request Initiated | A request processing flow has been started in a service. |
| 2 | Request Sent | The request has been sent by the client service. |
| 3 | Request Received | The request has been received by the target service. |
| 4 | Response Sent | The target service has sent a response. |
| 5 | Response Received | The client service has received the response. |
| 6 | Request Completed | The client has completed the processing of the response. |
| 7 | Error | This is a catch-all state that would result if any unexpected (i.e., non-business) error is encountered that prevents the message from successfully completing. An IFS Application Event will be emitted to alert administrators of these types of failures. |
| NA | Exception | An exception refers to a specific event that disrupts the normal flow of communication. In the event of an exception, six retry attempts are made to process the message. |
| NA | Ad hoc | An ad hoc entry indicates either a duplicate message that has been skipped to prevent reprocessing, or an entry added to support troubleshooting or development. |
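
When reading the Communication Log for a single request, it can help to compare the stages that were logged against the six normal-flow stages in the table above. The following is a minimal Python sketch of that comparison. The enum, the function names, and the idea of passing the stages in as a list are illustrative assumptions; they are not part of IFS Cloud or its APIs.

```python
from enum import IntEnum


class RequestStage(IntEnum):
    """Numbered stages of the Request pattern (identifiers are illustrative)."""
    REQUEST_INITIATED = 1
    REQUEST_SENT = 2
    REQUEST_RECEIVED = 3
    RESPONSE_SENT = 4
    RESPONSE_RECEIVED = 5
    REQUEST_COMPLETED = 6
    ERROR = 7


def missing_request_stages(logged):
    """Return the normal-flow stages (1 to 6) that are absent from `logged`, in order."""
    seen = set(logged)
    return [
        stage for stage in RequestStage
        if stage is not RequestStage.ERROR and stage not in seen
    ]


def request_failed(logged):
    """Return True if an Error stage was logged for the request."""
    return RequestStage.ERROR in logged
```

For a request that was initiated, sent, and received before an Error record was logged, `missing_request_stages` would report Response Sent, Response Received, and Request Completed as absent, which indicates that the failure occurred in the target service before a response was produced.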

Communication Pattern – Event

Using the communication between Mobile Maintenance (MM) for Aviation and Maintenix services as an example, the following diagram highlights the key communication stages in an Event pattern. Each stage will appear as a log record in the Communication Log.

The following table describes the communication log type codes used to identify each step in the Event communication.

| Number | Communication Type | Notes |
| --- | --- | --- |
| 1 | Event Sent | The event has been produced by a source service. |
| 2 | Event Received | The event has been received by a consumer service. |
| 3 | Event Completed | The consumer service has completed processing of the event. |
| 4 | Error | This is a catch-all state that results if any unexpected error is encountered that prevents the data synchronization communication from completing. An IFS Application Event will be emitted to alert administrators of these types of failures. |
| NA | Exception | An exception refers to a specific event that disrupts the normal flow of communication. In the event of an exception, six retry attempts are made to process the message. |
| NA | Ad hoc | An ad hoc entry indicates either a duplicate message that has been skipped to prevent reprocessing, or an entry added to support troubleshooting or development. |
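
The Exception entry in both patterns mentions six retry attempts. The following Python sketch illustrates one way such a retry loop could behave. It is an illustration only, not the actual A&D implementation: the function and class names are hypothetical, it assumes one initial attempt followed by up to six retries, and it assumes the last exception is re-raised once the retries are exhausted (the text above does not state what happens at that point).

```python
MAX_RETRIES = 6  # taken from the Exception entry: six retry attempts


class RetriesExhausted(Exception):
    """Raised when every attempt failed; carries the recorded exception entries."""

    def __init__(self, log):
        super().__init__(f"gave up after {len(log)} failed attempts")
        self.log = log


def process_with_retries(handler, message, max_retries=MAX_RETRIES):
    """Call handler(message), retrying up to `max_retries` times on an exception.

    Returns (result, log), where `log` holds one ("Exception", attempt, text)
    tuple per failed attempt. Assumption: one initial attempt plus the retries.
    """
    log = []
    last_error = None
    for attempt in range(1 + max_retries):
        try:
            result = handler(message)
        except Exception as exc:
            # Record the failure, then fall through to the next attempt.
            log.append(("Exception", attempt, str(exc)))
            last_error = exc
        else:
            return result, log
    raise RetriesExhausted(log) from last_error
```

A handler that fails twice and then succeeds would return its result together with two recorded Exception entries. A handler that never succeeds would be called seven times under these assumptions before `RetriesExhausted` is raised.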

Communication Errors

The ADMON_ERROR_ALERT event is triggered when an error occurs in the communication process (i.e., when a log record is created with communication type Error). The event contains information about the error. An event action can be created to respond to this event, for example, to send an email to administrators.
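
The rule described above (alert only on records with communication type Error) can be illustrated with a short standalone Python sketch. This is not IFS event action configuration. The record keys `communication_type`, `message_id`, and `error_text`, the sender address, and the function name are all hypothetical, since the fields carried by ADMON_ERROR_ALERT are not listed here. The sketch only builds the message and does not send it.

```python
from email.message import EmailMessage


def build_error_alert(record, recipients, sender="noreply@example.com"):
    """Return an EmailMessage for an Error log record, or None for other types.

    `record` is a plain dict with hypothetical keys; a real event payload
    may use different field names.
    """
    if record.get("communication_type") != "Error":
        return None

    msg = EmailMessage()
    msg["Subject"] = f"Communication error: {record.get('message_id', 'unknown message')}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(record.get("error_text", "No error details were provided."))
    return msg
```

Calling `build_error_alert` with a record whose type is Event Completed returns `None`, so only Error records produce an alert message.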