I currently have a situation with the users who design and implement the BPMN processes: they can create tasks that have a person’s data (name, VAT number, address, …) in the task name. It gets worse when you look at task/process variables: they can have forms where all the data of a person can be collected and stored in the Flowable DB.
According to the legislation, sometimes we have to erase someone’s information, or we have to store data that can identify someone in a different DB (I’m not a legal guy, and there are lots of rules relating to this GDPR issue).
I was thinking of having some metadata associated with the fields that can store this information (variables, task/process names), where we can mark whether that information is, or can be, GDPR-sensitive. The next step is harder, because knowing that information, I would have to create extension tables to store it (with an ID in the original) and customize the CRUD operations to take that into account… But this is only a draft of an idea…
I searched here and didn’t see anything about this issue… Has anyone else run into this problem?
Are the processes so detailed that they contain the actual name/VAT number/etc. in the task name? I’d assume the model is quite generic (like ‘handle customer’), and the actual data is in the runtime variables?
I’m more worried about the variables:
We created a service that generates and deploys custom BPMN models that are passed to it from the UI. Since the service is generic, neither the names nor the variables are known in advance.
So a user can model a process in the UI, call the service, and a new workflow becomes available with the tasks he drew and the variables he defined.
We have no control over these variables.
Unless we pass the responsibility to the user who is modelling the process to indicate, somehow, that those variables can contain sensitive personal data.
With that indication, we can do something with that information in the future: encrypt, delete, or move the marked variables to another database. But I am not seeing anything native in Flowable for this use case.
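To make the "marked by the modeller" idea concrete, here is a minimal sketch of a registry that remembers which variables of which process definition were flagged as personal data. It is plain Java and not part of Flowable; the class `PersonalDataRegistry`, its methods, and the example keys are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical registry (not a Flowable class): records which variables of
// which process definition the modeller marked as personal data.
final class PersonalDataRegistry {
    // process definition key -> names of variables marked as personal data
    private final Map<String, Set<String>> marked = new HashMap<>();

    // Called when a model is deployed and a field carries the "personal data" flag.
    void mark(String processDefinitionKey, String variableName) {
        marked.computeIfAbsent(processDefinitionKey, k -> new TreeSet<>()).add(variableName);
    }

    // True only if this exact variable was marked for this exact definition.
    boolean isPersonal(String processDefinitionKey, String variableName) {
        return marked.getOrDefault(processDefinitionKey, Collections.emptySet())
                     .contains(variableName);
    }

    // Read-only view of every marked variable name for one definition.
    Set<String> markedVariables(String processDefinitionKey) {
        return Collections.unmodifiableSet(
                marked.getOrDefault(processDefinitionKey, Collections.emptySet()));
    }
}
```

Whatever later encrypts, deletes, or relocates data could then ask this registry which variable names to touch, instead of guessing from the variable contents.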
Doesn’t an identity provider (IDP) solve your issue?
If your Flowable uses it as an IDM source, then your users’ details live in your IDP; in your task/form you would store only the user ID and can then control what is rendered in your UI.
We have an IDM in the gateway to our services, which call the Flowable services. The access is secure.
We are attaching information (task detail/form) to the process definition, in which we can categorize the information stored in each variable. With that we can know, in the frontend, which variables can hold “sensitive” data.
But the other (main) issues are not resolved:
I’m depending on the callers of the services to show or hide that information: if I want to provide the services decoupled from the UI, I can’t, because in that case we would depend on the “good will” of external entities;
If the person the data belongs to wants to be “forgotten”, I need a service that reads all process definitions where that person may appear, finds variables with potentially “sensitive” data, and deletes/obfuscates them (I don’t want to delete the processes/tasks);
Is there some function that obfuscates the “sensitive” data?
Can I have decoupled tables that hold the “sensitive” information? E.g. an ACT_RU_VARIABLE in another DB with only that data? Or some other strategy where that information is (physically) separated from the normal data?
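As a rough illustration of the "forget" service described above, the sketch below overwrites only the marked variables of one process instance and leaves everything else (and the instance itself) alone. It works on plain maps and assumes the set of marked names is already known; `VariableEraser` and the `[erased]` placeholder are hypothetical, and reading the variables from the engine, writing them back, and cleaning history are deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a Flowable class): blanks out marked variables
// while keeping the process data structure intact.
final class VariableEraser {
    static final String ERASED = "[erased]";

    // Returns a copy of the variables in which every marked variable has been
    // overwritten with a placeholder; unmarked variables are copied unchanged.
    static Map<String, Object> erase(Map<String, Object> variables, Set<String> markedNames) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : variables.entrySet()) {
            boolean personal = markedNames.contains(e.getKey());
            result.put(e.getKey(), personal ? ERASED : e.getValue());
        }
        return result;
    }
}
```

Keeping the variable names and overwriting only the values means the process and task records survive, which matches the requirement of not deleting the processes/tasks.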
This sounds like you need to work out policies with the external consumers. If they require data that crosses the GDPR line and they are not compliant, then it’s a simple matter of denying them access to the data in general. Implementing this is solvable with additional checks on whether the user in question has granted permission, measured against the policies your external consumer has agreed to.
Again, if you only store user IDs in your process variables, the right to be forgotten is as simple as returning some placeholder data for that user ID from your IDM/IDP source (e.g. return bogus user details like “John Doe” for that particular ID), which would be easier than having to pore through your in-flight processes and process history to obfuscate or delete data. (How does this even conflict with your government’s mandated data retention policies?)
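A minimal sketch of that idea, assuming the process variables hold nothing but an opaque user ID and the real details live in a separate directory. `UserDirectory` here is a hypothetical in-memory stand-in for the IDM/IDP, not an actual Flowable or IDP API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an IDM/IDP lookup: process variables keep only
// the user ID, and this directory decides what a given ID resolves to.
final class UserDirectory {
    static final String PLACEHOLDER = "John Doe";

    private final Map<String, String> displayNames = new HashMap<>();
    private final Set<String> forgotten = new HashSet<>();

    void register(String userId, String displayName) {
        displayNames.put(userId, displayName);
    }

    // Right to be forgotten: drop the real details, keep the ID resolvable.
    void forget(String userId) {
        displayNames.remove(userId);
        forgotten.add(userId);
    }

    // Forgotten users resolve to the placeholder, so processes that still
    // reference the ID keep rendering without exposing the person.
    String displayName(String userId) {
        if (forgotten.contains(userId)) {
            return PLACEHOLDER;
        }
        return displayNames.getOrDefault(userId, "unknown user");
    }
}
```

With this shape, forgetting a person is a single operation on the directory, and neither running nor historic process data needs to be rewritten.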
I would recommend not modifying or adding to Flowable’s database tables.
In general I believe your issues can be resolved by design (software architecture) and practised policies.
I hope this helps clarify. Or perhaps I’ve misunderstood your issue.
One is SaaS, where we provide a way for our clients to classify that information, and we have to guarantee that the data they classify this way is stored in compliance with the GDPR. If possible, we would also like to store the data marked as sensitive in a detached location.
Please note that the issue is not who accesses the data, but the data itself: our clients will create BPMN models where some task fields will hold the names, addresses and other data of their own clients. They also have to assure others that the software they are using is GDPR compliant.
In the other model we don’t host the software and databases, but we must guarantee that it is GDPR compliant and give the client the configuration he has to apply.
I only store the user IDs of whoever is making the changes. But as I said before, our clients can create models where the tasks have forms in which, for example, the names and addresses of their clients are stored.
Thanks for your input. It is helping to clarify my goals.
The data will be marked in the model.
But is there any way of excluding or masking that data when accessing the engine’s services?
Or of splitting databases? Or, when querying, of saying that data carrying the special mark should not be returned without being masked?
No, the engine hasn’t got such things built in. I mean, there’s pluggability in many places, but this feels like a really low-level thing that needs to be added in many places. In theory you could try to solve this at the lowest level (persistence) and avoid exposing anything further if it doesn’t match some criteria.
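Since nothing is built in, one way to picture "avoid exposing anything further" is a thin wrapper that sits between the stored variables and every caller, and masks marked values on the way out. The sketch below is a plain decorator; `VariableReader` and `MaskingVariableReader` are hypothetical types, not Flowable extension points, and where such a wrapper would actually be wired into the engine or the service layer is an open question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical read-side abstraction; not a Flowable interface.
interface VariableReader {
    Map<String, Object> variables(String processInstanceId);
}

// Decorator that masks marked variables before they leave the read layer,
// so callers never receive the raw values.
final class MaskingVariableReader implements VariableReader {
    static final String MASK = "***";

    private final VariableReader delegate;
    private final Set<String> markedNames;

    MaskingVariableReader(VariableReader delegate, Set<String> markedNames) {
        this.delegate = delegate;
        this.markedNames = markedNames;
    }

    @Override
    public Map<String, Object> variables(String processInstanceId) {
        Map<String, Object> masked = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : delegate.variables(processInstanceId).entrySet()) {
            boolean personal = markedNames.contains(e.getKey());
            masked.put(e.getKey(), personal ? MASK : e.getValue());
        }
        return masked;
    }
}
```

Unlike erasing, this leaves the stored data untouched and only changes what is returned, so a trusted internal caller could still be given the undecorated reader.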