How to follow the regs without sacrificing performance

Event sourcing and the GDPR


Thanks to Article 17 of the GDPR, organizations using event sourcing need to double-check that they are following the rules. JAX London speaker Michiel Rook explains a few simple fixes to make sure performance isn’t impacted by compliance.

Recently, the EU General Data Protection Regulation (GDPR) came into effect. You’ve probably heard all about it or at least seen the absurd amount of ‘update privacy policy’ emails in your inbox. In any case, the GDPR attempts to regulate data protection for EU citizens. It is applicable to any organization that deals with EU citizens.

The GDPR has many implications for any software or organization that processes data. However, if you are considering implementing event sourcing in your application or have already done so, there are a few provisions in the regulation that have specific implications for event sourced applications.

Consent

One of the requirements of the GDPR is that an organization should be able to prove it has consent to process someone’s personal data. The consent must be very specific, and it must be possible to withdraw it at any time.

For example, if you use the same personal data to send a newsletter, perform data analysis, and do re-targeting, you must have consent for each of those actions individually and support individual withdrawals of consent.

Demonstrating consent was given is easy when that consent was recorded as an event.
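As a minimal sketch of what recording consent as events could look like: the event names `ConsentGiven` and `ConsentWithdrawn`, the `purpose` field and the helper `active_consents` are assumptions made for illustration, not part of any particular framework.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentGiven:
    subject_id: str
    purpose: str            # e.g. "newsletter", "analytics", "retargeting"
    occurred_at: datetime


@dataclass(frozen=True)
class ConsentWithdrawn:
    subject_id: str
    purpose: str
    occurred_at: datetime


def active_consents(events, subject_id):
    """Fold one subject's consent events into the set of purposes that
    currently have consent. Events are assumed to be in stream order."""
    purposes = set()
    for event in events:
        if event.subject_id != subject_id:
            continue
        if isinstance(event, ConsentGiven):
            purposes.add(event.purpose)
        elif isinstance(event, ConsentWithdrawn):
            purposes.discard(event.purpose)
    return purposes


def at(hour):
    # Hypothetical timestamps, only used to build the example stream.
    return datetime(2018, 5, 25, hour, tzinfo=timezone.utc)


stream = [
    ConsentGiven("user-1", "newsletter", at(9)),
    ConsentGiven("user-1", "analytics", at(9)),
    ConsentWithdrawn("user-1", "analytics", at(15)),
]

consents = active_consents(stream, "user-1")
```

Because every grant and withdrawal is a separate event per purpose, the stream itself is the proof: it shows when consent for each purpose was given and when it was taken back.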

Right to erasure

Without a doubt, the most interesting article in the regulation concerning event sourced applications is Article 17, the “Right to Erasure”.

“… the data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay and the controller shall have the obligation to erase personal data without undue delay…”

Whenever an Article 17 request is received, we first have to identify all the events that contain personally identifiable information about the requester. Those events must either be sufficiently anonymized or removed altogether.

Challenge: immutability

Individual events are generally considered immutable. After all, events are a reflection of history, records of something that happened. Multiple events form event streams, persisted in event stores that are append-only. In fact, some implementations are even backed by immutable storage such as Kafka or a WORM drive.

Append-only event stores with immutable events have their own special advantages. Events can be cached ad infinitum and form the basis of a stable audit log. Any mistakes, errors or missing information in previously persisted events are typically dealt with by applying corrective events, similar to entries in an accountant’s ledger, or by using upcasters.

However, when you need to erase or anonymize personal information, those strategies are no longer an option, as they’ll both leave the original data intact and you non-compliant!
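A small sketch makes the problem concrete. It assumes a plain list as the append-only log and hypothetical event types `EmailRegistered` and `EmailCorrected`; the corrective event changes what the application sees, but the original value is still sitting in the log.

```python
log = []  # stand-in for an append-only event store


def append(event):
    log.append(event)


def current_email(events, subject_id):
    """Replay the log; the most recent email event wins."""
    email = None
    for event in events:
        if event["subject_id"] == subject_id and "email" in event:
            email = event["email"]
    return email


append({"type": "EmailRegistered", "subject_id": "user-1",
        "email": "jane@example.com"})
# A corrective event: the old event is not touched.
append({"type": "EmailCorrected", "subject_id": "user-1",
        "email": "[erased]"})

visible = current_email(log, "user-1")
still_stored = any(event.get("email") == "jane@example.com" for event in log)
```

After the replay, `visible` is the placeholder, yet `still_stored` is true: the personal data never left the store.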

One option is to create a copy of the original event stream, filtering out the affected events, or including anonymized versions of those events. When that process is completed, the original stream should of course be discarded.
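The copy-and-filter approach could be sketched roughly as follows. The `Event` shape, the `PII_FIELDS` set and the function `rewrite_stream` are all assumptions for illustration; a real event store would need its own way to write the new stream and delete the old one.

```python
from dataclasses import dataclass, replace

PII_FIELDS = frozenset({"name", "email"})  # hypothetical field names


@dataclass(frozen=True)
class Event:
    type: str
    subject_id: str   # opaque identifier of the person the event is about
    payload: dict


def rewrite_stream(events, subject_id, drop_types=frozenset()):
    """Yield a copy of the stream in which the subject's events are either
    dropped (if their type is in drop_types) or anonymized."""
    for event in events:
        if event.subject_id != subject_id:
            yield event                       # unaffected: copy as-is
        elif event.type in drop_types:
            continue                          # remove altogether
        else:
            scrubbed = {
                key: "[erased]" if key in PII_FIELDS else value
                for key, value in event.payload.items()
            }
            yield replace(event, payload=scrubbed)


original = [
    Event("CustomerRegistered", "user-1", {"name": "Jane", "plan": "pro"}),
    Event("NewsletterSubscribed", "user-1", {"email": "jane@example.com"}),
    Event("CustomerRegistered", "user-2", {"name": "Bob", "plan": "free"}),
]

new_stream = list(
    rewrite_stream(original, "user-1", drop_types={"NewsletterSubscribed"})
)
```

Note that `original` still holds the personal data after the copy is made, which is exactly why the old stream has to be discarded once the new one is in place.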

Don’t store personal information in events

Another way of dealing with this is using a mix of event sourcing and regular database tables. The idea is that personal information is no longer stored inside the events themselves, but in another database or storage solution.

Whenever an event is read by the system, the associated personal information is then retrieved from the secondary database and merged with the event.

Dealing with an Article 17 request is then reduced to finding the right entry in the secondary database and removing it. Any subsequent reads of the event will leave that event essentially anonymized.
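A rough sketch of this split, assuming an in-memory dictionary in place of the secondary database; `PersonalDataStore` and `hydrate` are hypothetical names, and events are plain dictionaries that carry only a reference to the person.

```python
class PersonalDataStore:
    """In-memory stand-in for the secondary (mutable) database."""

    def __init__(self):
        self._rows = {}

    def put(self, subject_id, data):
        self._rows[subject_id] = dict(data)

    def get(self, subject_id):
        return self._rows.get(subject_id)

    def erase(self, subject_id):
        # Handling an Article 17 request: delete the single row.
        self._rows.pop(subject_id, None)


def hydrate(event, store):
    """Merge the personal data (if it still exists) into an event on read."""
    return {**event, "personal": store.get(event["subject_id"])}


store = PersonalDataStore()
store.put("user-1", {"name": "Jane", "email": "jane@example.com"})

# The persisted event holds a reference only, no personal data.
event = {"type": "OrderPlaced", "subject_id": "user-1", "order_id": "o-42"}

before = hydrate(event, store)
store.erase("user-1")
after = hydrate(event, store)
```

The event itself is never modified; after the erase, reading it simply yields `None` for the personal part, so domain code has to be prepared for that case.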

Crypto-trashing/shedding

The last technique I want to discuss keeps the personal information inside events, but encrypts that data using a unique key that is associated with either the event or an aggregate. The encryption key is stored in and retrieved from a centralized key management system. Events are decrypted automatically before they are handled by domain code.

Whenever an Article 17 request is received, the appropriate key is looked up and promptly forgotten, i.e. removed. This renders the personal information unreadable and effectively removed.
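The mechanics can be sketched as below, under some loud assumptions: `KeyStore` is an in-memory stand-in for a key management system, keys are per person rather than per event, and the cipher is a toy HMAC-based keystream used only so the sketch runs on the standard library. A real system would use an authenticated cipher such as AES-GCM from a vetted library instead.

```python
import hashlib
import hmac
import json
import secrets


class KeyStore:
    """In-memory stand-in for a centralized key management system."""

    def __init__(self):
        self._keys = {}

    def get_or_create(self, subject_id):
        if subject_id not in self._keys:
            self._keys[subject_id] = secrets.token_bytes(32)
        return self._keys[subject_id]

    def get(self, subject_id):
        return self._keys.get(subject_id)

    def forget(self, subject_id):
        # Handling an Article 17 request: throw the key away.
        self._keys.pop(subject_id, None)


def _keystream(key, nonce, length):
    # Toy keystream (HMAC-SHA256 in counter mode). Illustration only,
    # NOT a substitute for a vetted authenticated cipher.
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_personal(data, key):
    nonce = secrets.token_bytes(16)
    plaintext = json.dumps(data).encode("utf-8")
    ciphertext = _xor(plaintext, _keystream(key, nonce, len(plaintext)))
    return {"nonce": nonce.hex(), "ciphertext": ciphertext.hex()}


def decrypt_personal(blob, key):
    nonce = bytes.fromhex(blob["nonce"])
    ciphertext = bytes.fromhex(blob["ciphertext"])
    plaintext = _xor(ciphertext, _keystream(key, nonce, len(ciphertext)))
    return json.loads(plaintext.decode("utf-8"))


def read_event(event, keys):
    """Decrypt the personal part before domain code sees the event.
    If the key has been forgotten, the personal part comes back as None."""
    key = keys.get(event["subject_id"])
    if key is None:
        return {**event, "personal": None}
    return {**event, "personal": decrypt_personal(event["personal"], key)}


keys = KeyStore()
stored = {
    "type": "CustomerRegistered",
    "subject_id": "user-1",
    "personal": encrypt_personal({"email": "jane@example.com"},
                                 keys.get_or_create("user-1")),
}

before = read_event(stored, keys)
keys.forget("user-1")
after = read_event(stored, keys)
```

The stored event is byte-for-byte the same before and after the request, so the append-only store stays untouched; only the key disappears. That also means every copy of the key, including backups of the key store, has to be removed for the data to be truly unreadable.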

Conclusion

This was just a quick overview of some of the ways organizations can continue to use event sourcing while remaining compliant with the GDPR. If you’re interested in learning more about this topic, come join me for my talk “Forget me, please? Event sourcing and the GDPR” at JAX London this fall!

Michiel Rook will be delivering a talk at JAX London 2018 on Wednesday, October 10 as part of the Software Architecture & Design track. His talk goes more into detail about the effects of the GDPR and how it interacts with enterprise development.

Michiel Rook

Michiel Rook is a very experienced, passionate and pragmatic freelance IT consultant from the Netherlands. Working as a coach, software developer & architect, and a strong leader, he considers it his mission to help companies significantly improve their software quality and delivery process. Currently, he focuses on adopting Continuous Delivery & DevOps principles, culture and tooling, legacy software transformations, and cloud migrations. Michiel is a regular speaker at (international) conferences and events. In his free time he enjoys racing bicycles.

