Process incoming HTTP requests from external ERP services for order updates

Spryker applications can receive information about orders from external systems in various ways. The nature of such requests can vary widely: it can be a request from a user’s browser, a push notification from a delivery company, or a batch update request from an ERP system. This document suggests possible solutions for processing incoming HTTP requests, describes their advantages and disadvantages, and highlights pitfalls.

Suggested solutions

You can process incoming requests synchronously or asynchronously.

Synchronous handling of incoming requests

The most popular and straightforward way of processing incoming requests is synchronous. This solution involves keeping an active HTTP connection until the application processes the request and returns a response.

Pros

Easy to implement, understand, and maintain.

Cons

  • Long-running requests can fail because of an HTTP connection timeout.
  • Heavy operations require scaling the hardware for the application, which can lead to extra costs.
  • Retry mechanism should be implemented on the caller’s side.
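
The last point can be illustrated with a minimal caller-side sketch. It assumes a hypothetical `send` callable that performs the HTTP request and raises `TimeoutError` or `ConnectionError` on failure; the function name, the attempt limit, and the backoff values are illustrative assumptions, not part of any Spryker or ERP API.

```python
import time


def call_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds or max_attempts is reached.

    send is a hypothetical zero-argument callable that performs the HTTP
    request and raises TimeoutError or ConnectionError on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise  # give up and surface the last error to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Retrying is only safe when the receiving endpoint tolerates the same update arriving more than once, so the caller and the application need to agree on that behavior.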

Asynchronous handling of incoming requests

Upon receiving an incoming request, the application stores the context and responds that the request has succeeded. In this case, the application takes full responsibility for handling this request. Various types of workers and storage engines, and their combinations, can affect quality attributes in different ways.
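
A minimal sketch of this store-and-acknowledge step follows. It is not Spryker code: the `incoming_request` table, the `order_reference` field, the in-memory SQLite storage, and the choice of the 202 status code are all assumptions made for illustration.

```python
import json
import sqlite3


def create_storage():
    """Create an in-memory table that stands in for the real request storage."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE incoming_request ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending')"
    )
    return conn


def handle_incoming_request(conn, body):
    """Validate, persist, and acknowledge. Returns (http_status, response)."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "invalid JSON"}
    # Hypothetical structural check: the payload must name the order.
    if not isinstance(data, dict) or "order_reference" not in data:
        return 400, {"error": "order_reference is required"}
    with conn:  # commits on success, rolls back on an exception
        cursor = conn.execute(
            "INSERT INTO incoming_request (payload) VALUES (?)",
            (json.dumps(data),),
        )
    # The request is only acknowledged after it has been stored.
    return 202, {"request_id": cursor.lastrowid}
```

The handler does no business processing itself. Once the row is committed, a worker is responsible for everything that happens to the request afterwards.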

Processing with the oms:check-condition worker

By default, the Spryker application has the oms:check-condition worker that can be used to process requests and run application logic. The worker relies on a state machine graph. The oms:check-condition job moves order items through the OMS graph and runs OMS plugins such as Commands, Conditions, and Timeouts. Therefore, the processing logic must be represented in OMS. An additional restriction is that the OMS plugins cannot trigger OMS events for the same order, as they cannot call the OmsFacade::triggerEvent methods.

Info

An incoming request handler must not only store the event context but also trigger an OMS event, because triggering the event is what passes control to the oms:check-condition worker. If the event must affect several orders, the handler must trigger an OMS event for each order.

Pros

  • The worker is available by default.
  • More transparency with the logic that is represented in OMS.
  • Easy to understand, maintain, and support.
  • Default fault tolerance and retry logic support.

Cons

  • Can run logic in OMS plugins only.
  • Cannot trigger OMS events for the same order.
  • Extra OMS elements can make the graph harder to understand and maintain.

Incoming request handling and passing control to the worker

The incoming request handler should validate the request structure, save the request to storage, and then trigger an OMS event, for example, start processing. OMS stops the order items in the next state. On its next run, the oms:check-condition worker moves them forward and runs the subsequent commands and logic.

Potential pitfall

The worker can process the requests only when the order is in a specific OMS state. If OMS cannot apply the start processing event to any item, it returns an error, which means that the event cannot be processed right now. The request handler should decide what to do: store the request for further processing, ignore it, or return an error response to the caller.

Triggering an OMS event and checking whether it affected any items is one way to determine whether the order was in a proper state. Another way is to read the current state of all the order's items from the database, in the spy_sales_order_item_state table. This approach can be a bit faster, but it is prone to race conditions when requests arrive concurrently and is therefore not recommended.
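
The trigger-and-check approach can be sketched with a toy model. Nothing here is the OMS API: `trigger_event`, the transition table, and the state names are hypothetical stand-ins that only show how a handler can branch on the number of affected items.

```python
# Hypothetical transition table: event name -> (required state, next state).
TRANSITIONS = {"start processing": ("waiting for update", "update received")}

# The three options the handler has when no item accepted the event.
FALLBACKS = {
    "store": "stored for later",
    "ignore": "ignored",
    "reject": "error returned",
}


def trigger_event(event, items):
    """Move every item that is in the required state; return how many moved."""
    source, target = TRANSITIONS[event]
    affected = 0
    for item in items:
        if item["state"] == source:
            item["state"] = target
            affected += 1
    return affected


def handle_update(items, on_not_applicable="store"):
    """Trigger the event and decide what to do if it affected no items."""
    if trigger_event("start processing", items) > 0:
        return "accepted"
    # No item was in the required state, so the event cannot be processed now.
    return FALLBACKS[on_not_applicable]
```

Because the check and the state change happen in one step, two concurrent requests cannot both observe the order as ready, which is the weakness of reading the state table first.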

Processing with a dedicated Jenkins worker

The Jenkins worker listens to the storage and begins processing when an event appears. This is only a high-level description: depending on the requirements, the worker logic can differ significantly and cover quality attributes differently.

| Quality attribute | Suggestions |
| --- | --- |
| Availability / Degradability | How often and when the job runs can be configured in Jenkins. |
| Performance | The batch size of events per run can be made configurable. |
| Testability / Maintainability / Monitoring / Operability | Background jobs should log not only errors but also normal work progress. Good logs help to create efficient monitoring and alerts, which can save a lot of time when investigating problems. |
| Fault-tolerance / Correctness / Recoverability | The worker should be designed so that it can be re-run, retry failed requests, and continue the process after a restart. It should not lose data in exceptional cases. If possible, the logic should run inside a DB transaction. |
| Scalability | The worker should support running in multiple instances. |
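
Several of these suggestions can be combined in one small sketch of a single worker run. It is an assumption-laden illustration rather than a Spryker or Jenkins implementation: the `event` table, the `process` callback, and the `batch_size` and `max_attempts` parameters are hypothetical, and SQLite stands in for the real storage.

```python
import logging
import sqlite3

logger = logging.getLogger("order_update_worker")


def create_queue():
    """Create an in-memory table that stands in for the event storage."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " attempts INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def run_once(conn, process, batch_size=10, max_attempts=3):
    """Process up to batch_size pending events; return the number completed."""
    rows = conn.execute(
        "SELECT id, payload FROM event"
        " WHERE status = 'pending' ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    logger.info("picked up %d pending event(s)", len(rows))
    done = 0
    for event_id, payload in rows:
        try:
            with conn:  # one transaction per event: all or nothing
                process(payload)
                conn.execute(
                    "UPDATE event SET status = 'done' WHERE id = ?",
                    (event_id,),
                )
            done += 1
            logger.info("event %d processed", event_id)
        except Exception:
            logger.exception("event %d failed", event_id)
            with conn:
                # Keep the event pending for a later run until the
                # attempt limit is reached, then mark it as failed.
                conn.execute(
                    "UPDATE event SET attempts = attempts + 1,"
                    " status = CASE WHEN attempts + 1 >= ?"
                    " THEN 'failed' ELSE 'pending' END"
                    " WHERE id = ?",
                    (max_attempts, event_id),
                )
    return done
```

A scheduled job would call `run_once` on every run. The sketch deliberately omits locking, so running it in multiple instances would additionally require a mechanism that prevents two workers from picking up the same event.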

Pros

  • Can handle any kind of logic, such as triggering an OMS event, working with multiple orders, and processing non-OMS-related logic.

Cons

  • Requires custom implementation on the project level.
  • Requires custom monitoring, operating tools, and procedures.

If the worker works with OMS, the workflow is as follows:

Solutions quality attributes comparison

The following table compares the quality attributes of synchronous request handling and of asynchronous request handling with the OMS check-condition worker and with a Jenkins job.

Quality attribute Synchronous handling OMS check-condition worker Jenkins job
Availability / Degradability ❓ Depends on load balancers and availability of all sub-components and applications ❓ Depends on OMS worker load
Modifiability / Flexibility ❌ Order must be in a specific OMS state
❌ OMS worker cannot trigger OMS events
❌ Cannot run logic that is not represented in OMS
Performance ❓ Depends on the application performance and hardware
Testability / Maintainability / Monitoring / Operability ❗ A limited number of tools is available OOTB ❗ No OOTB tools
Usability / Understandability / Simplicity ❗ Experience with OMS engine is required ❗ Experience with Jenkins and CLI is required
Fault-tolerance / Correctness / Recoverability ❌ HTTP connection usually has timeout.
❌ Extra efforts to make an application recoverable are required (may be impossible).
❌ Extra efforts to implement retry mechanism are required (may be impossible).
❗ Lock mechanism is required
✅ Put transition command into DB transaction to be able to retry (if possible)
✅ Retry transitions can trigger retry logic manually or by timeout
✅ Retry transitions can trigger retry logic manually or by timeout
✅ Event handler logic should be inside of DB transaction to be able to retry (if possible)
❗ Lock mechanism is required
Scalability ✅ Static scaling of OMS check-condition job ✅ Static scaling of Jenkins job
Upgradability ✅ No upgradability issues because of no OOTB functionality usage

✅ good or bonus

❓ unknown or depends on a project

❗ moderate, requires attention, but can be handled

❌ absent or impossible

Storage engines quality attributes comparison

Quality attribute DB Storage RabbitMQ Storage Redis
Modifiability / Flexibility - ✅ Supports TTL if needed ✅ Supports TTL if needed
Performance ❗ Can lead to performance issues with a huge amount of data or huge message sizes
Usability / Understandability / Simplicity ❗ Needs experience ❗ Needs experience
Fault-tolerance / Correctness / Recoverability ❗ DLQ can be used to implement retry logic, but there is no OOTB solution ❌ Cannot store messages for a long period of time

Conclusion

If the processing logic is simple and fast, or the caller can handle errors and supports the retry logic, then the synchronous processing solution is more suitable.

If the process can take a long time or is sensitive to errors (for example, the caller has no retry logic, data could be lost, or the resulting UX would be unacceptable), then we recommend representing the logic in OMS and using the oms:check-condition worker if possible.

In other cases, we recommend implementing a dedicated worker to process the requests asynchronously.