Flowable scheduler no longer scheduling jobs

Hi,
In our production system we managed to get into a situation where the Flowable scheduler no longer schedules any scheduled processes. More concretely, the scheduler still queries the ACT_RU_TIMER_JOB table, but timer jobs are no longer moved to the job table, so nothing that was scheduled gets executed. Because nothing is moved, there is no progress, and we end up in an endless loop of querying for timers without ever doing anything with them.
This situation occurs after the system has been running for some time (typically a number of days). Once the system is in that state, it cannot recover on its own. Restarting the service fixes the problem (scheduled jobs are executed again), but only temporarily: after some days we end up in the same situation, with nothing being scheduled. Typically after a restart, you’ll see it execute all the timer jobs that became due between the moment it got stuck and now.
While in this situation, the rest of Flowable still works: BPMN processes are executed, new ones can be created, etc. Only the timers that are past their due date are not executed.
One symptom we observe when the system is in this state is that the thread pool executing the tasks (the one that names its threads task-1, task-2, etc.) seems to be at its maximum capacity, yet these threads use hardly any CPU at all.
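To find out what those idle threads are actually waiting on, a thread dump taken while the system is stuck would probably help. Below is a minimal sketch of how the state and stack of each pool thread could be captured from inside the JVM using only the standard JDK. It assumes the pool threads keep the `task-` name prefix mentioned above; the class and method names (`TaskThreadDump`, `dump`) are made up for illustration and are not part of Flowable.

```java
import java.util.Map;
import java.util.TreeMap;

public class TaskThreadDump {

    // Returns, for every live thread whose name starts with namePrefix,
    // its current state followed by its stack trace (one frame per line).
    public static Map<String, String> dump(String namePrefix) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().startsWith(namePrefix)) {
                continue;
            }
            StringBuilder text = new StringBuilder(thread.getState().toString());
            for (StackTraceElement frame : entry.getValue()) {
                text.append(System.lineSeparator()).append("    at ").append(frame);
            }
            result.put(thread.getName(), text.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // "task-" is the assumed prefix of the executor threads described in the post.
        dump("task-").forEach((name, trace) -> System.out.println(name + ": " + trace));
    }
}
```

If most of the task-N threads turn out to be in WAITING or BLOCKED state at the same frame, that frame should show which lock, connection, or queue they are stuck on. Running `jstack` against the process gives the same information without any code changes.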
My guess is that this is related to some combination of process instances ending while they still have timers, and exceptions being thrown that cause jobs to end up as dead letter jobs.

Does anybody have a suggestion on what we can do to get out of this?

Thanks,

Bert