Recommendations around async history

Hello Everyone,

Our team recently started exploring the async history feature to push data via SQS into MongoDB. Our initial impression was that we could put all of the events into a single collection and run our subsequent pipelines on it. But looking at the structure of the JSON objects, it is apparent that they should not be stored together.

I have searched the web and the forums, and I have not found an established pattern for storing the async history data in NoSQL stores. Are there any considerations that we should take into account? Should the NoSQL collections mirror the relational history tables, like

  • ACT_HI_PROCINST
  • ACT_HI_TASKINST
  • ACT_HI_VARINST, etc

Thanks in advance!

It depends on your use case and what you want to do with your data. If your goal is to run queries similar to the historical queries from the HistoryService in Flowable, it makes sense to group them as you said: process instances / task instances / activities / variables / etc.
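As a rough illustration of that grouping, here is a minimal Python sketch that routes events into one bucket per collection. It assumes each async history event is a JSON object with a "type" string and a "data" payload; the type prefixes and the collection names (hi_procinst and so on) are hypothetical and should be adapted to what your messages actually contain.

```python
from collections import defaultdict

# Hypothetical mapping from event-type prefix to target collection name.
# Check the "type" values in your own messages before relying on these.
COLLECTION_BY_PREFIX = {
    "process-instance": "hi_procinst",
    "task": "hi_taskinst",
    "activity": "hi_actinst",
    "variable": "hi_varinst",
}


def collection_for(event_type):
    """Return the collection name for an event type, or a catch-all."""
    for prefix, collection in COLLECTION_BY_PREFIX.items():
        if event_type.startswith(prefix):
            return collection
    return "hi_other"


def route_events(events):
    """Group event payloads by the collection they should be written to."""
    routed = defaultdict(list)
    for event in events:
        name = collection_for(event.get("type", ""))
        routed[name].append(event.get("data", {}))
    return dict(routed)
```

Each list in the result could then be written to the collection of the same name, for example with a bulk insert from your SQS consumer.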

However, it’s also possible to combine or enrich the data in a pipeline to serve different use cases (while keeping the regular history relational), for example to prepare the data so it is readily available for some dedicated UI screens you’ve built.
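A combined view of that kind could look like the sketch below, which folds task and variable events into one document per process instance so a UI screen can read it in a single lookup. The field names "processInstanceId", "name" and "value", and the shape of the resulting document, are assumptions for illustration only.

```python
def build_process_views(events):
    """Fold events into one denormalized document per process instance."""
    views = {}
    for event in events:
        data = event.get("data", {})
        # Assumed field: the owning process instance id on each payload.
        pid = data.get("processInstanceId")
        if pid is None:
            continue  # skip events that cannot be tied to a process instance
        view = views.setdefault(pid, {"_id": pid, "tasks": [], "variables": {}})
        event_type = event.get("type", "")
        if event_type.startswith("task"):
            view["tasks"].append(data.get("name"))
        elif event_type.startswith("variable"):
            # Later variable events overwrite earlier values for the same name.
            view["variables"][data.get("name")] = data.get("value")
    return views
```

The trade-off is that such a view is cheap to read but has to be rebuilt or updated as new events arrive, so it suits read-heavy screens better than ad hoc history queries.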