Advice on custom HistoryCleanup logic

Hi there,

In my project, we want a distinct retention period for the history cleanup of each case or process model, and we always want the history cleanup to cascade down from the root case or process only (e.g. no deletion of a completed process that belongs to a case which is still open).

As a result, we came up with the following implementation (on version 6.5.1 of the engines):

application.properties

com.example.history-cleaning.retention-periods-days.cmmn.ABC_CAS_001=3650
com.example.history-cleaning.retention-periods-days.cmmn.ABC_CAS_002=730
com.example.history-cleaning.retention-periods-days.bpmn.ABC_PRC_001=30

CustomHistoryCleanupProperties

@ConfigurationProperties(prefix = CustomHistoryCleanupProperties.CUSTOM_HISTORY_CLEANING_PROPERTIES)
public class CustomHistoryCleanupProperties {

    public static final String CUSTOM_HISTORY_CLEANING_PROPERTIES = "com.example.history-cleaning";

    //number of days per definition key, for bpmn and cmmn models
    private RetentionPeriodsDays retentionPeriodsDays = new RetentionPeriodsDays();

    public RetentionPeriodsDays getRetentionPeriodsDays() {
        return retentionPeriodsDays;
    }

    public void setRetentionPeriodsDays(RetentionPeriodsDays retentionPeriodsDays) {
        this.retentionPeriodsDays = retentionPeriodsDays;
    }

    /**
     * number of days per definition key, for bpmn and cmmn models
     */
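    // static nested class: the holder does not need a reference to the enclosing properties instance
    public static class RetentionPeriodsDays {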
        private Map<String, Integer> cmmn = new HashMap<>();
        private Map<String, Integer> bpmn = new HashMap<>();

        public Map<String, Integer> getCmmn() {
            return cmmn;
        }

        public void setCmmn(Map<String, Integer> cmmn) {
            this.cmmn = cmmn;
        }

        public Map<String, Integer> getBpmn() {
            return bpmn;
        }

        public void setBpmn(Map<String, Integer> bpmn) {
            this.bpmn = bpmn;
        }
    }
}

FlowableEnginePostConfiguration

@EnableConfigurationProperties(CustomHistoryCleanupProperties.class)
@Configuration
public class FlowableEnginePostConfiguration {

    @Bean
    public EngineConfigurationConfigurer<SpringProcessEngineConfiguration> processHousekeepingConfigurer(CustomHistoryCleanupProperties properties) {
        return engineConfiguration -> {
            engineConfiguration.setHistoryCleaningManager(new CustomHistoryCleaningManager(engineConfiguration,
                    properties.getRetentionPeriodsDays().getBpmn()));
            engineConfiguration.setEnableHistoryCleaning(true);
            engineConfiguration.setCleanInstancesEndedAfterNumberOfDays(365);
            engineConfiguration.setHistoryCleaningTimeCycleConfig("0 0 1 * * ?");
        };
    }

    @Bean
    public EngineConfigurationConfigurer<SpringCmmnEngineConfiguration> cmmnHousekeepingConfigurer(CustomHistoryCleanupProperties properties) {
        return engineConfiguration -> {
            engineConfiguration.setCmmnHistoryCleaningManager(new CustomCmmnHistoryCleaningManager(engineConfiguration,
                    properties.getRetentionPeriodsDays().getCmmn()));
            engineConfiguration.setEnableHistoryCleaning(true);
            engineConfiguration.setCleanInstancesEndedAfterNumberOfDays(365);
            engineConfiguration.setHistoryCleaningTimeCycleConfig("0 0 1 * * ?");
        };
    }
}

CustomHistoryCleaningManager

/**
 * inspired by DefaultHistoryCleaningManager
 * triggered by BpmnHistoryCleanupJobHandler
 */
public class CustomHistoryCleaningManager implements HistoryCleaningManager {
    private static final Logger LOGGER = LoggerFactory.getLogger(CustomHistoryCleaningManager.class);

    protected ProcessEngineConfigurationImpl processEngineConfiguration;
    protected Map<String, Integer> retentionPeriodsPerProcessDefinitionKey = new HashMap<>();

    public CustomHistoryCleaningManager(ProcessEngineConfigurationImpl processEngineConfiguration, Map<String, Integer> bpmn) {
        this.processEngineConfiguration = processEngineConfiguration;
        this.retentionPeriodsPerProcessDefinitionKey = bpmn;
    }

    @Override
    public HistoricProcessInstanceQuery createHistoricProcessInstanceCleaningQuery() {
        Set<String> processesToClean = new HashSet<>();
        processesToClean.addAll(getSpecialProcessesForHistoryCleaning());
        processesToClean.addAll(getOtherProcessIdsForHistoryCleaning());

        LOGGER.info("About to prune '{}' root processes and their descendants: '{}'", processesToClean.size(), processesToClean);
        return processEngineConfiguration.getHistoryService().createHistoricProcessInstanceQuery().processInstanceIds(processesToClean);
    }

    private Collection<String> getOtherProcessIdsForHistoryCleaning() {
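        // Default retention period: this query is not restricted by definition key,
        // so it matches finished instances of every process definition.
        int retentionPeriodInDays = processEngineConfiguration.getCleanInstancesEndedAfterNumberOfDays();
        Calendar cal = new GregorianCalendar();
        cal.add(Calendar.DAY_OF_YEAR, -retentionPeriodInDays);
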
        HistoryService historyService = processEngineConfiguration.getHistoryService();

        HistoricProcessInstanceQuery historicProcessInstanceQuery = historyService.createHistoricProcessInstanceQuery();
        historicProcessInstanceQuery.finishedBefore(cal.getTime());
        List<HistoricProcessInstance> matchingProcesses = historicProcessInstanceQuery.list();

        return getIdsOfRootProcessesOnly(matchingProcesses);
    }

    private Collection<String> getSpecialProcessesForHistoryCleaning() {
        List<HistoricProcessInstance> matchingProcesses = new ArrayList<>();

        for (Map.Entry<String, Integer> retentionEntry : retentionPeriodsPerProcessDefinitionKey.entrySet()) {
            Calendar cal = new GregorianCalendar();
            cal.add(Calendar.DAY_OF_YEAR, -retentionEntry.getValue());

            HistoryService historyService = processEngineConfiguration.getHistoryService();
            matchingProcesses.addAll(historyService.createHistoricProcessInstanceQuery()
                    .finishedBefore(cal.getTime())
                    .processDefinitionKey(retentionEntry.getKey())
                    .list());
        }
        return getIdsOfRootProcessesOnly(matchingProcesses);
    }



    private Set<String> getIdsOfRootProcessesOnly(List<HistoricProcessInstance> processes) {
        // Of the returned processes, we only want to cascade-delete the ones that have no parent (parent case or parent process),
        // because child processes need to be pruned only when their parent is cleaned.
        Set<String> processesToClean = new HashSet<>();
        HistoricEntityLinkService historicEntityLinkService = processEngineConfiguration.getEntityLinkServiceConfiguration().getHistoricEntityLinkService();

        for (HistoricProcessInstance processInstance : processes) {
            List<HistoricEntityLink> entityLinksToParentProcesses = historicEntityLinkService.findHistoricEntityLinksByReferenceScopeIdAndType(processInstance.getId(), ScopeTypes.BPMN, EntityLinkType.CHILD);
            List<HistoricEntityLink> entityLinksToParentCases = historicEntityLinkService.findHistoricEntityLinksByReferenceScopeIdAndType(processInstance.getId(), ScopeTypes.CMMN, EntityLinkType.CHILD);
            if (entityLinksToParentProcesses.isEmpty() && entityLinksToParentCases.isEmpty()) {
                processesToClean.add(processInstance.getId());
            }
        }
        return processesToClean;
    }
}

CustomCmmnHistoryCleaningManager is similar to CustomHistoryCleaningManager.

Does the implementation above seem correct? Is there a simpler way to achieve the same goal? If you have a similar use case in your project, how did you solve it?

I am currently stuck on finding a way to test it, and will report back if my tests require amending the proposed implementation above, so that we can start a broader discussion about similar history cleanup patterns.

Best regards,
Tiffany

Hi Tiffany,

From my point of view, there is a high probability that the first run will fail because of the number of process/case instances. That’s why I would recommend splitting the cleanup into batches.
In fact, I did not dig deeper into the implementation.
Why don’t you use a custom clock implementation to simulate “old” processes?

Regards
Martin
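
To illustrate the batching suggestion, here is a minimal sketch in plain Java. It is not Flowable code: the class name IdBatches and the method partition are hypothetical helpers, and it assumes the ids of the instances to clean have already been collected.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical helper (not part of Flowable): splits ids into fixed-size batches. */
public final class IdBatches {

    private IdBatches() {
    }

    /**
     * Returns consecutive batches of at most batchSize ids each,
     * preserving the iteration order of the input collection.
     */
    public static List<List<String>> partition(Collection<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> ordered = new ArrayList<>(ids);
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < ordered.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ordered.size());
            // copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(ordered.subList(from, to)));
        }
        return batches;
    }
}
```

Each batch could then be deleted in its own transaction, so that a very large first run does not have to succeed or fail as a whole. How the deletion itself is done, and whether it cascades the way you need, would have to be checked against your Flowable version.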

Hi Martin,
Thank you for your reply. This is indeed a valid point for already running applications, which I will keep in mind.
Regarding my proposed implementation above, I unfortunately discovered that the query returned by createHistoricProcessInstanceCleaningQuery supports only some of the standard historic process instance query restrictions. In particular, processInstanceIds is not supported in this situation (by the deleteWithRelatedData method of HistoricProcessInstanceQuery).
In the end, I will take inspiration from the history cleanup default implementation but build my custom mechanism.
Kind regards,
Tiffany
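
For readers considering a similar custom mechanism, the selection rule discussed in this thread (a per-key retention period with a default fallback, root instances only) can be sketched in plain Java. This is only an illustration under stated assumptions: RetentionSelector and FinishedInstance are hypothetical types, not Flowable classes, and the parent lookup is reduced to a nullable parentId field.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch (not Flowable API): picks root instances whose retention period has expired. */
public final class RetentionSelector {

    /** Simplified, hypothetical view of a finished process or case instance. */
    public static final class FinishedInstance {
        final String id;
        final String definitionKey;
        final Instant endTime;
        final String parentId; // null for a root instance

        public FinishedInstance(String id, String definitionKey, Instant endTime, String parentId) {
            this.id = id;
            this.definitionKey = definitionKey;
            this.endTime = endTime;
            this.parentId = parentId;
        }
    }

    private final Map<String, Integer> retentionDaysPerKey;
    private final int defaultRetentionDays;

    public RetentionSelector(Map<String, Integer> retentionDaysPerKey, int defaultRetentionDays) {
        this.retentionDaysPerKey = new HashMap<>(retentionDaysPerKey);
        this.defaultRetentionDays = defaultRetentionDays;
    }

    /**
     * Returns the ids of root instances that ended before their cutoff.
     * The per-key retention period wins over the default, so a key with a
     * longer custom period is not cleaned early by the default rule.
     */
    public Set<String> selectRootsToClean(Collection<FinishedInstance> finished, Instant now) {
        Set<String> toClean = new LinkedHashSet<>();
        for (FinishedInstance instance : finished) {
            if (instance.parentId != null) {
                continue; // children are only removed together with their root
            }
            int days = retentionDaysPerKey.getOrDefault(instance.definitionKey, defaultRetentionDays);
            Instant cutoff = now.minus(Duration.ofDays(days));
            if (instance.endTime.isBefore(cutoff)) {
                toClean.add(instance.id);
            }
        }
        return toClean;
    }
}
```

Passing the current time in as a parameter is a deliberate choice here: a test can simulate "old" instances simply by supplying a later value for now, which is one way to get the effect of the custom clock mentioned earlier in the thread.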