Flowable deployed as a Spring Boot service in a Docker container - Performance issues


#1

We are having performance problems when querying for tasks with process variables included. Right now it takes about 40 seconds to pull 1K to 2K tasks; I can post more detail on those timings. If we don't include the process variables, the same query takes milliseconds. I am trying to find a way to solve this issue.

Based on digging into the performance issue, is Flowable fit to be deployed in a Docker container? Are there any suggestions on what we can do to alleviate this issue? We have a few use cases that are hit by the performance problem, which makes the product almost unusable for us:

  1. Getting all tasks a user is eligible to claim (pulling process variables in this query)
  2. Doing searches based on a process variable for the appropriate tasks

#2

Hey @akeiker,

Is the performance problem absent when you are not using Docker? From what you explained, I don't see how Docker can be the cause of this.

In general when including process variables into queries it can take some time.

Can you please tell us which database you are using?

Cheers,
Filip


#3

I didn't think Docker would be the issue either. I just wanted to confirm that running Flowable in Docker is OK and that we aren't using it in a way it is not meant to be used.

This is what we are using for a database: MySQL 5.7.21 on a t2.medium Aurora database.

I am also going to try doing some Native queries to see if this helps. I also notice we are trying to get process variables along with task variables and there might be some potential to split this up?

TaskQuery query = taskService.createTaskQuery().includeProcessVariables()
.includeTaskLocalVariables().orderByTaskCreateTime().asc();

I just thought I would check with the community to see if there are solutions that have worked for others, etc. Or things we should possibly try? I have a couple I am going to try: Get timings, compare this with Native Queries, try possibly splitting up the query above. These are the first ones I have thought of so far.


#4

After digging into this more, I logged this issue: https://github.com/flowable/flowable-engine/issues/1656

Passing in offset = 0, maxResults = 5,000:

taskService.createTaskQuery().includeProcessVariables()
.includeTaskLocalVariables().orderByTaskCreateTime().asc().taskCandidateGroupIn(profiles).listPage(offset, maxResults)

The API call above generates the query below, which is very inefficient. Why not use a union when process variables and task variables are both included? Also, why is the result limited to 20K rows? This seems like a bug: the limit applies to the joined rows, so with, say, 20 variables per task the call can never return 5,000 tasks. In that scenario it would likely return only 1,000.
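To make the arithmetic behind the 20K concern concrete, here is a small sketch. It assumes each task occupies one joined row per variable (or a single row when it has no variables, because of the left outer join) and that the 20,000 cap counts joined rows, not tasks. The class and method names are made up for illustration and are not part of Flowable.

```java
// Hypothetical helper, not part of Flowable: estimates how many tasks fit
// under a cap on joined rows when each task row repeats once per variable.
final class RowLimitEstimate {

    // Rows one task occupies in the joined result: one per variable,
    // or one row with null VAR_* columns if the task has no variables.
    static long rowsPerTask(int variablesPerTask) {
        return Math.max(1, variablesPerTask);
    }

    // Upper bound on complete tasks that fit under the row cap,
    // further limited by the page size the caller asked for.
    static long tasksReturned(long rowLimit, int variablesPerTask, long requestedTasks) {
        long fit = rowLimit / rowsPerTask(variablesPerTask);
        return Math.min(fit, requestedTasks);
    }

    public static void main(String[] args) {
        // 20,000 rows / 20 variables per task = 1,000 tasks, not the 5,000 requested.
        System.out.println(tasksReturned(20_000, 20, 5_000)); // prints 1000
    }
}
```

Under these assumptions, a 5,000-task page only comes back complete when tasks average four variables or fewer.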

For now I got around the performance issue by querying process variables and task variables separately. I still see the 20K limit as a huge bug, though.
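A rough sketch of the in-memory step that this workaround implies: fetch the tasks, the process variables, and the task-local variables in three separate queries, then stitch them together by id. The code below does not call Flowable at all; `TaskRow` and `VarRow` are hypothetical stand-ins for whatever the three queries return, and letting a task-local variable override a process variable of the same name is a choice made for this sketch only.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Arrays;

// Hypothetical stand-ins for query results; these are not Flowable classes.
final class SplitVariableMerge {

    // Minimal task row: its id and the process instance it belongs to.
    static final class TaskRow {
        final String id;
        final String processInstanceId;
        TaskRow(String id, String processInstanceId) {
            this.id = id;
            this.processInstanceId = processInstanceId;
        }
    }

    // Minimal variable row: ownerId is a task id for task-local variables
    // and a process instance id for process variables.
    static final class VarRow {
        final String ownerId;
        final String name;
        final Object value;
        VarRow(String ownerId, String name, Object value) {
            this.ownerId = ownerId;
            this.name = name;
            this.value = value;
        }
    }

    // Groups variable rows by owner id, keeping the order in which rows arrive.
    static Map<String, Map<String, Object>> byOwner(List<VarRow> rows) {
        Map<String, Map<String, Object>> grouped = new LinkedHashMap<>();
        for (VarRow row : rows) {
            grouped.computeIfAbsent(row.ownerId, k -> new LinkedHashMap<>())
                   .put(row.name, row.value);
        }
        return grouped;
    }

    // Builds one variable map per task, in task order. Process variables are
    // applied first so a task-local variable with the same name wins.
    static Map<String, Map<String, Object>> merge(List<TaskRow> tasks,
                                                  List<VarRow> processVars,
                                                  List<VarRow> taskVars) {
        Map<String, Map<String, Object>> procByInstance = byOwner(processVars);
        Map<String, Map<String, Object>> localByTask = byOwner(taskVars);
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (TaskRow task : tasks) {
            Map<String, Object> vars = new LinkedHashMap<>();
            vars.putAll(procByInstance.getOrDefault(task.processInstanceId,
                    Collections.<String, Object>emptyMap()));
            vars.putAll(localByTask.getOrDefault(task.id,
                    Collections.<String, Object>emptyMap()));
            result.put(task.id, vars);
        }
        return result;
    }

    public static void main(String[] args) {
        List<TaskRow> tasks = Arrays.asList(new TaskRow("t1", "p1"), new TaskRow("t2", "p1"));
        List<VarRow> processVars = Arrays.asList(
                new VarRow("p1", "amount", 10), new VarRow("p1", "status", "open"));
        List<VarRow> taskVars = Arrays.asList(new VarRow("t1", "status", "claimed"));
        System.out.println(merge(tasks, processVars, taskVars));
    }
}
```

Because the task query no longer joins against the variable table, its page size counts tasks rather than joined rows, and each variable row is fetched once per process instance instead of once per task.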

select RES.*, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
  VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
  VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_,
  VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR
  ON RES.ID_ = VAR.TASK_ID_ or RES.PROC_INST_ID_ = VAR.EXECUTION_ID_
WHERE RES.ASSIGNEE_ is null
  and exists(select LINK.ID_ from ACT_RU_IDENTITYLINK LINK
             where LINK.TYPE_ = 'candidate' and LINK.TASK_ID_ = RES.ID_
               and ( LINK.GROUP_ID_ IN ( "processor" ) ) )
order by RES.CREATE_TIME_ asc
LIMIT 20000 OFFSET 0;

Why not use a union to separate process variables from task variables?

select RES.*, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
  VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
  VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_,
  VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR
  ON RES.PROC_INST_ID_ = VAR.EXECUTION_ID_ and VAR.TASK_ID_ is null
WHERE RES.ASSIGNEE_ is null
  and exists(select LINK.ID_ from ACT_RU_IDENTITYLINK LINK
             where LINK.TYPE_ = 'candidate' and LINK.TASK_ID_ = RES.ID_
               and ( LINK.GROUP_ID_ IN ( "processor" ) ) )
union
select RES.*, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_,
  VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_,
  VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_,
  VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR
  ON RES.ID_ = VAR.TASK_ID_
WHERE RES.ASSIGNEE_ is null
  and exists(select LINK.ID_ from ACT_RU_IDENTITYLINK LINK
             where LINK.TYPE_ = 'candidate' and LINK.TASK_ID_ = RES.ID_
               and ( LINK.GROUP_ID_ IN ( "processor" ) ) )
order by CREATE_TIME_ asc;