How the GDPR will affect the future of machine learning companies

April 2016 was a major turning point for machine learning companies. That month, the European Parliament and the Council of the European Union adopted the General Data Protection Regulation (GDPR). The regulation sent the online world into a frenzy, with everyone wondering what needed to be done to achieve GDPR compliance. Any company doing business with a European Union (EU) customer, regardless of where the company is located, must comply with the GDPR’s rules on the collection, storage and processing of personal data, which have applied since May 25, 2018.

Understanding how the GDPR affects machine learning companies requires appreciating why EU lawmakers deemed it necessary to create this regulation. One word sums this up perfectly: trust.


Customers must trust that their rights are protected when they submit data to any company that operates digitally.

However, this hasn’t always been the case, since data appropriation - the theft of data for financial gain - is rampant. This theft isn’t limited to hackers who steal people’s identities. It extends to machine learning companies, which should be using people’s data responsibly.

Data is a valuable commodity to many machine learning companies, such as Google, Uber and Amazon. They use the data they collect to support advertising initiatives, track the location of users and do a host of other things that can infringe upon a user’s privacy.

For instance, Facebook faced its biggest scandal to date with the Cambridge Analytica files. Facebook was fined £500,000 for a lack of transparency and for failing to protect users’ information. It was a clear breach of the stipulations of the old Data Protection Act, which the GDPR has since replaced.

As Kamarinou, D., Millard, C., and Singh, J. highlight in their research paper, cloud computing and artificial intelligence are the main branches of machine learning, with the most common applications being “speech recognition, natural language processing (NLP) and deep learning.” These applications impinge heavily on an individual’s privacy and data protection; it is as if machines are given free rein to use people’s data at will.

The question that the GDPR forces machine learning companies to answer is, “How can we guarantee that we’ll uphold customers’ trust when all the data we collect is essential to creating a functional product?”

Answering this question raises four primary concerns for machine learning companies:

  1. Algorithm accountability

  2. Managing the right to explanation

  3. Improving cyber security

  4. Right to erasure

1. Algorithm accountability

Goodman, B. (2016) states that the need for algorithm accountability arises as a result of algorithm discrimination where “an individual or group receives unfair (discriminatory) treatment as a result of algorithmic decision making.”  

For instance, an algorithm may deny access to someone solely because that person is African American. Goodman also argues that the GDPR’s data sanitization and algorithm transparency requirements may be inadequate to address algorithm accountability, which creates a need for algorithm audits by third parties.

The MIT Technology Review outlines solutions, including algorithm audits, that machine learning companies can use to address the issue of algorithm accountability.

It suggests:

  1. Designating an upper-level management employee to deal quickly with the algorithm’s individual and societal effects. Access to the data should also be limited to one or two people, since allowing too many people to access the data compromises its integrity and increases the chances of misuse.

  2. Providing a clear explanation to those impacted by the algorithm’s decisions

  3. Logging and benchmarking the sources of error and uncertainty that caused problems with the algorithm

  4. Consistently evaluating the algorithm to identify discriminatory elements and prevent them from recurring
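
The last suggestion can be made concrete with a simple audit metric. The sketch below is one possible check, not something the MIT Technology Review or the GDPR prescribes: it compares approval rates across groups and flags any group whose rate falls below a chosen fraction of the best-served group’s rate. The 0.8 threshold (borrowed from the "four-fifths" rule of thumb), the function names and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict

def selection_rates(decisions):
    """Return the share of positive decisions per group.

    `decisions` is an iterable of (group, approved) pairs, where
    `approved` is True or False.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}

def flag_disparate_impact(decisions, threshold=0.8):
    """Return groups whose rate is below `threshold` times the highest rate.

    The 0.8 default is an assumed rule of thumb, not a GDPR requirement.
    """
    rates = selection_rates(decisions)
    best = max(rates.values())
    if best == 0:
        return {}  # nobody was approved, so there is nothing to compare
    return {g: r / best for g, r in rates.items() if r / best < threshold}

# Hypothetical audit sample: (group label, was the request approved?)
sample = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
flagged = flag_disparate_impact(sample)
print(flagged)
```

A check like this only surfaces a gap in outcomes. Deciding whether the gap is unjustified still requires the human review described in the list above.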

2. Managing the right to explanation

Customers have the right to understand why algorithms have discriminated against them. They also have a right to understand how their data is used. To be more specific, Articles 13 to 15 of the GDPR require data controllers to tell customers how their data will be used, including “the existence of automated decision-making… meaningful information about the logic involved, [and] the significance and the envisaged consequences of such processing for the data subject.” Recital 71 adds that a person subject to an automated decision should be able to obtain an explanation of it.

The simplest solution for this stipulation is for machine learning companies to improve their customer service. There must be someone designated to respond to customers’ inquiries about challenges they’re having with an algorithm. For instance, a bank may use an algorithm to determine someone’s credit score. A user may use the app and receive a questionable result. There must be an employee who can answer the user’s questions and provide a viable solution.

3. Improving cyber security

Zerlang, J. (2017) argues that most companies will strive to meet the minimum GDPR cyber security compliance requirements. However, cyber security issues are constantly evolving, and companies that are serious about data protection must keep abreast of the changes. Cyber security must become a strategic priority.

Two major issues arise when considering cyber security: consent and a user’s anonymity. All users should be given the opportunity to consent to the collection and use of their data.

Consent, however, needs to be more than clicking “I Agree” at the end of a long Terms of Service agreement. Few users, if any, take the time to read these terms, let alone understand them. This puts the user at a severe disadvantage.

Additionally, a user should be allowed to remain anonymous. Anonymous data masks the identity of a user. Some companies use pseudonymization to tackle this challenge. Pseudonymization replaces directly identifying fields with pseudonyms, so that a record can no longer be attributed to a specific person without additional information that is kept separately. It is a good way to add another level of protection to a user’s identity.
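
As a rough illustration of pseudonymization, the sketch below replaces an identifying field with a keyed hash (HMAC-SHA256) using Python’s standard library. This is only one of several possible techniques, and the field names, the sample record and the key are hypothetical. It also assumes the key is stored separately from the data set.

```python
import hashlib
import hmac

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map a direct identifier to a stable pseudonym with a keyed hash."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

def pseudonymize_record(record: dict, fields: set, secret_key: bytes) -> dict:
    """Return a copy of `record` with the named fields replaced by pseudonyms."""
    return {
        key: pseudonymize(str(value), secret_key) if key in fields else value
        for key, value in record.items()
    }

# Hypothetical key; the assumption is that it lives apart from the data set.
KEY = b"example-key-kept-in-a-separate-store"

user = {"email": "jane@example.com", "country": "DE", "purchases": 3}
safe = pseudonymize_record(user, {"email"}, KEY)
print(safe)
```

The same key always produces the same pseudonym, so records stay linkable for model training. Because anyone holding the key can re-identify a user, pseudonymized data still counts as personal data under the GDPR.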

Some ways that machine learning companies can deal with the constantly changing face of cyber security are:

  1. Hiring a cyber security expert who constantly researches how attackers enter systems and ensures that the company’s systems are protected against such attacks.

  2. Responding immediately once threats are detected, so that attackers are removed from the system.

  3. Investing in regular training opportunities for all staff to learn how to deal with and prevent cyber security threats.

4. Right to erasure

One of the biggest trust issues customers have with machine learning companies concerns their ability to exercise the right to have their data erased. The GDPR dictates that customers should be allowed to request erasure either verbally or in writing. Companies should then respond within one month.

Therefore, it’s important for machine learning companies to have a strong internal structure to deal with these requests.

This means having:

  • The right processes in place

  • The right tools to effectively erase data

  • A policy for recording requests

  • A clear understanding of the instances where requests can be rejected and where the time frame for responding can be extended
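
A policy for recording requests could be as simple as the sketch below, which logs each erasure request and computes its response deadline. The class, its field names and the sample log are hypothetical. The deadline logic is an assumption based on a one-month response window that can be extended by two further months in some cases, so it should be checked against legal advice before being relied on.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from typing import Optional

def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping to the last day of a shorter month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))

@dataclass
class ErasureRequest:
    customer_id: str
    received: date
    channel: str                   # e.g. "verbal" or "written"
    extended: bool = False         # set when an extension applies
    closed: Optional[date] = None  # date the request was resolved, if any

    def due_date(self) -> date:
        # Assumption: one month to respond, three in total when extended.
        return add_months(self.received, 3 if self.extended else 1)

    def is_overdue(self, today: date) -> bool:
        return self.closed is None and today > self.due_date()

# Hypothetical request log
log = [
    ErasureRequest("c-001", date(2018, 5, 31), "written"),
    ErasureRequest("c-002", date(2018, 6, 1), "verbal", extended=True),
]
today = date(2018, 7, 15)
overdue = [r.customer_id for r in log if r.is_overdue(today)]
print(overdue)
```

Keeping the channel and the extension flag on every record gives the company an audit trail showing when each request arrived and why a longer deadline was applied.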

GDPR compliance puts pressure on machine learning companies

Meeting the GDPR demands outlined above means radical changes for machine learning companies. There is less room for these companies to use customer data at will. The regulation aims to reduce data theft and puts a legal framework in place so that appropriate penalties can be imposed.

Consider how Google collects customer data when people create email accounts and then uses that data to help digital advertisers target their ad campaigns. GDPR implementation means that this business model must change.

Some argue that the GDPR stifles innovation and adds a layer of bureaucracy within organizations. Nevertheless, it’s here to stay, and one way machine learning companies can deal with it is to incorporate data trust technology and hire a Data Protection Officer (DPO).

Data trust technology essentially holds data so that decisions can be made about its use. NS Tech reports that “it is a legal structure that provides stewardship of some data for the benefit of a group or organizations of people.” The individuals or organizations that own the data grant some of their rights to control it to a group of trustees, who are legally bound to make decisions that protect the owners’ interests. Transparency is key to the success of this data protection tool.

A DPO is responsible for data protection beyond what is stipulated by the GDPR. This role is all-encompassing and ensures that both the organization and the customer are protected. The DPO must ensure that the company has everything in place to address the aforementioned concerns.

Using the GDPR to create a more responsible digital world

Trust is pivotal for positive business-consumer relationships. The GDPR was implemented to offer better protection for customers’ data and, ultimately, improve business-customer trust. Data theft must stop!

Machine learning companies will lose some autonomy in how they can retrieve and use customers’ data. However, proactively addressing this lost autonomy by using data trust technologies and hiring a DPO will help these companies continue to thrive in the foreseeable future.
