Flowable deployed as Spring boot service in Docker Container - Performance issues

akeiker · February 21, 2019, 4:44pm

We are having performance problems when querying for tasks with process variables included. Right now it takes about 40 seconds to pull 1K to 2K tasks. I can post more details about those timings. If we don’t include the process variables, it takes milliseconds. I am trying to find a way to solve this issue.

Based on what I found while digging into the performance issue: is Flowable fit to be deployed in a Docker container? Are there any suggestions on what we can do to alleviate this issue? We have a few use cases that are hit by the performance problem, which makes the product almost unusable for us:

Getting all tasks a user is eligible to claim (pulling process variables in this query)
Doing searches based on a process variable for the appropriate tasks

filiphr · February 22, 2019, 8:01am

Hey @akeiker,

Does the performance problem go away when you are not using Docker? From what you explained, I don’t see how Docker can be the cause of this.

In general when including process variables into queries it can take some time.

Can you please tell us which database you are using?

Cheers,
Filip

akeiker · February 22, 2019, 6:53pm

I didn’t think Docker would be the issue either. I just wanted to confirm that running Flowable in Docker is OK and that we aren’t using it in a way it is not meant to be used.

As for the database, we are using MySQL 5.7.21 on a t2.medium Aurora database.

I am also going to try some native queries to see if that helps. I also noticed that we are fetching process variables along with task variables, so there might be some potential to split this up:

TaskQuery query = taskService.createTaskQuery().includeProcessVariables()
.includeTaskLocalVariables().orderByTaskCreateTime().asc();

I just thought I would check with the community to see whether there are solutions that have worked for others, or things we should try. I have a few ideas of my own so far: get timings, compare them with native queries, and possibly split up the query above.

akeiker · March 11, 2019, 10:15pm

After digging into this more, I logged this issue: https://github.com/flowable/flowable-engine/issues/1656

Passing in offset = 0, maxResults = 5,000:

taskService.createTaskQuery().includeProcessVariables()
.includeTaskLocalVariables().orderByTaskCreateTime().asc().taskCandidateGroupIn(profiles).listPage(offset, maxResults);

The API call above generates the query below, which is very inefficient. Why not use a union when process variables and task variables are both included? Also, why does it limit the result to 20K rows? This seems like a bug: the limit applies to the joined task/variable rows rather than to tasks, so the call can never return 5,000 tasks if each task has, say, 20 variables. In that scenario it would likely return only 1,000 tasks.

For now I got around the performance issue by querying process variables and task variables separately. I still consider the 20K limit a serious bug, though.

select RES.*, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_, VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_, VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR ON RES.ID_ = VAR.TASK_ID_ or RES.PROC_INST_ID_ = VAR.EXECUTION_ID_
WHERE RES.ASSIGNEE_ is null and exists(select LINK.ID_ from ACT_RU_IDENTITYLINK LINK where LINK.TYPE_ = 'candidate' and LINK.TASK_ID_ = RES.ID_ and ( LINK.GROUP_ID_ IN ( "processor" ) ) )
order by RES.CREATE_TIME_ asc LIMIT 20000 OFFSET 0;
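To make the effect of the row limit concrete, here is a small back-of-the-envelope sketch. It assumes the limit is applied to the joined task/variable rows and that every task carries the same number of variables; the class and method names are made up for illustration and are not part of Flowable.

```java
public class RowLimitEstimate {

    /**
     * Rough upper bound on how many complete tasks fit under a row limit
     * when each task is joined to its variables. A task with no variables
     * still produces one row because of the left outer join.
     */
    public static int maxTasksReturned(int rowLimit, int variablesPerTask) {
        int rowsPerTask = Math.max(1, variablesPerTask);
        return rowLimit / rowsPerTask;
    }

    public static void main(String[] args) {
        // 20 variables per task: the 20K row limit caps the result at 1,000 tasks.
        System.out.println(maxTasksReturned(20_000, 20));
        // Only at 4 variables per task or fewer could 5,000 tasks fit.
        System.out.println(maxTasksReturned(20_000, 4));
    }
}
```

Under these assumptions, a page size of 5,000 is only reachable when tasks have four variables or fewer.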

Why not use a union to separate process variables from task variables?

select RES.*, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_, VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_, VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR ON RES.PROC_INST_ID_ = VAR.EXECUTION_ID_ and VAR.TASK_ID_ is null
WHERE RES.ASSIGNEE_ is null and exists(select LINK.ID_ from ACT_RU_IDENTITYLINK LINK where LINK.TYPE_ = 'candidate' and LINK.TASK_ID_ = RES.ID_ and ( LINK.GROUP_ID_ IN ( "processor" ) ) )
union
select RES.*, VAR.ID_ as VAR_ID_, VAR.NAME_ as VAR_NAME_, VAR.TYPE_ as VAR_TYPE_, VAR.REV_ as VAR_REV_, VAR.PROC_INST_ID_ as VAR_PROC_INST_ID_, VAR.EXECUTION_ID_ as VAR_EXECUTION_ID_, VAR.TASK_ID_ as VAR_TASK_ID_, VAR.BYTEARRAY_ID_ as VAR_BYTEARRAY_ID_, VAR.DOUBLE_ as VAR_DOUBLE_, VAR.TEXT_ as VAR_TEXT_, VAR.TEXT2_ as VAR_TEXT2_, VAR.LONG_ as VAR_LONG_
from ACT_RU_TASK RES
left outer join ACT_RU_VARIABLE VAR ON RES.ID_ = VAR.TASK_ID_
WHERE RES.ASSIGNEE_ is null and exists(select LINK.ID_ from ACT_RU_IDENTITYLINK LINK where LINK.TYPE_ = 'candidate' and LINK.TASK_ID_ = RES.ID_ and ( LINK.GROUP_ID_ IN ( "processor" ) ) )
order by CREATE_TIME_ asc;
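
For anyone wanting to try the workaround of querying process variables and task variables separately, the merge step in application code might look roughly like the sketch below. It only shows how separately fetched rows could be combined per task. TaskRow, VariableRow, and mergeVariables are hypothetical stand-ins rather than Flowable classes, and letting a task-local variable override a process variable of the same name is an assumption.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitVariableMerge {

    /** Hypothetical stand-in for a task row (not a Flowable class). */
    public static final class TaskRow {
        public final String taskId;
        public final String processInstanceId;

        public TaskRow(String taskId, String processInstanceId) {
            this.taskId = taskId;
            this.processInstanceId = processInstanceId;
        }
    }

    /** Hypothetical stand-in for a variable row; taskId is null for process variables. */
    public static final class VariableRow {
        public final String processInstanceId;
        public final String taskId;
        public final String name;
        public final Object value;

        public VariableRow(String processInstanceId, String taskId, String name, Object value) {
            this.processInstanceId = processInstanceId;
            this.taskId = taskId;
            this.name = name;
            this.value = value;
        }
    }

    /** Combines separately fetched rows into one variable map per task id, keeping task order. */
    public static Map<String, Map<String, Object>> mergeVariables(
            List<TaskRow> tasks, List<VariableRow> processVars, List<VariableRow> taskVars) {
        // Index process variables by process instance id.
        Map<String, Map<String, Object>> byProcess = new HashMap<>();
        for (VariableRow v : processVars) {
            byProcess.computeIfAbsent(v.processInstanceId, k -> new HashMap<>()).put(v.name, v.value);
        }
        // Index task-local variables by task id.
        Map<String, Map<String, Object>> byTask = new HashMap<>();
        for (VariableRow v : taskVars) {
            byTask.computeIfAbsent(v.taskId, k -> new HashMap<>()).put(v.name, v.value);
        }
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (TaskRow t : tasks) {
            Map<String, Object> merged = new HashMap<>(
                    byProcess.getOrDefault(t.processInstanceId, Collections.emptyMap()));
            // Assumption: a task-local value wins over a process value with the same name.
            merged.putAll(byTask.getOrDefault(t.taskId, Collections.emptyMap()));
            result.put(t.taskId, merged);
        }
        return result;
    }

    public static void main(String[] args) {
        List<TaskRow> tasks = Arrays.asList(new TaskRow("t1", "p1"), new TaskRow("t2", "p1"));
        List<VariableRow> processVars = Arrays.asList(new VariableRow("p1", null, "amount", 100));
        List<VariableRow> taskVars = Arrays.asList(new VariableRow("p1", "t1", "note", "urgent"));
        System.out.println(mergeVariables(tasks, processVars, taskVars));
    }
}
```

With this shape, the page size applies to tasks only, and the two variable lookups can each be restricted to the task and process instance ids on the current page.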
