Different types of personal data may be needed for different purposes at the different stages of AI system development and use. In some cases, the same personal data is used for the same purposes both in the development and use of the system.
Personal data may not later be used for purposes that are incompatible with the original purpose of collection. Processing the data for another purpose is allowed only if the new purpose is compatible with the original one.
If an organisation plans to use personal data that it has collected for other purposes in the development or use of an AI system, it must assess the new purpose based on the following criteria:
- What is the connection between the original purpose and the new purpose?
- In what context was the personal data collected?
- What is the data subjects’ relationship to the party that is responsible for processing their personal data?
- What kind of processing can data subjects reasonably expect?
- What types of personal data will be processed?
- Will sensitive personal data be processed?
- What consequences could the personal data processing have for the data subjects?
- What safeguards does the organisation have in place?
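The criteria above could be captured as a structured, documented record so that each assessment is repeatable and auditable. The sketch below is purely illustrative; the field names and the simple red-flag checks are assumptions for this example, and the legal compatibility assessment itself must always be made case by case.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: a structured record of the compatibility
# assessment criteria listed above. Field names are assumptions for this
# example; they do not replace the case-by-case legal assessment.
@dataclass
class CompatibilityAssessment:
    original_purpose: str
    new_purpose: str
    collection_context: str
    relationship_to_controller: str
    reasonably_expected: bool          # would data subjects expect this processing?
    data_categories: list[str] = field(default_factory=list)
    involves_sensitive_data: bool = False
    possible_consequences: str = ""
    safeguards: list[str] = field(default_factory=list)

    def red_flags(self) -> list[str]:
        """Collect indicators that the new purpose may be incompatible."""
        flags = []
        if not self.reasonably_expected:
            flags.append("processing would be unexpected for data subjects")
        if self.involves_sensitive_data and not self.safeguards:
            flags.append("sensitive data processed without documented safeguards")
        return flags

assessment = CompatibilityAssessment(
    original_purpose="customer service",
    new_purpose="training a support chatbot",
    collection_context="support tickets",
    relationship_to_controller="existing customers",
    reasonably_expected=False,
)
print(assessment.red_flags())
```

A record like this also supports accountability: the documented answers show how the organisation reached its conclusion.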
If the new purpose is in line with the original purpose, the personal data can be processed based on the original processing basis. If the new purpose is not in line with the original purpose, a new processing basis must be determined.
Generally, a new purpose is not compatible if it is materially different, if the new personal data processing would be unexpected for the data subjects, or if the processing would result in unjust consequences for the data subjects. Archiving for public interest, scientific or historical research, and the creation of statistics are usually compatible with original purposes, as long as sufficient safeguards are in place.
Read more about the principle of purpose limitation
Minimise the data processed and ensure its accuracy
According to the GDPR, the personal data processed must be adequate, relevant and limited to what is necessary in relation to the purposes for which the data is processed. This means that personal data may only be collected and processed to the extent that is necessary.
This also applies to the development and use of AI systems: only the personal data required for the pre-determined purposes may be processed. An organisation developing or using an AI system must always carefully assess what personal data is actually required.
The development of an AI system often requires extensive, high-quality datasets to ensure that the model is statistically accurate and does not discriminate against certain groups of people. Processing personal data may therefore be necessary to avoid bias and errors. This need must also be identified when the purpose of the data processing is determined.
If the volume of data required at each stage is difficult to assess, start with limited datasets and increase the volume gradually according to justified needs. The relevance of the data must also be monitored and re-assessed at all stages. The assessment must also consider whether the same objective could be achieved with synthetic or anonymised data, for example.
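In practice, minimisation can mean both dropping fields that are not needed for the purpose and starting from a small sample that is grown only on justified need. A minimal sketch, assuming hypothetical record fields and sample sizes:

```python
import random

# Illustrative sketch: keep only the fields identified as necessary for the
# stated purpose, and optionally start from a small sample. Field names and
# the sample size are assumptions for this example.
def minimise(records, necessary_fields, sample_size=None):
    """Drop fields not needed for the purpose; optionally subsample."""
    reduced = [{k: r[k] for k in necessary_fields if k in r} for r in records]
    if sample_size is not None and sample_size < len(reduced):
        reduced = random.sample(reduced, sample_size)
    return reduced

records = [
    {"name": "A", "email": "a@example.com", "age": 34, "purchase": 12.0},
    {"name": "B", "email": "b@example.com", "age": 51, "purchase": 7.5},
]
# Only age and purchase behaviour are needed for the (hypothetical) model:
print(minimise(records, ["age", "purchase"]))
```

Re-running such a step whenever the purpose is re-assessed helps keep the dataset limited to what is necessary at each stage.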
The personal data must be accurate and updated when necessary. It is especially vital to ensure accuracy if the AI system uses personal data to make decisions or conclusions related to the data subjects.
Read more about data minimisation
Read more about data accuracy
Storage limitation and determining the storage period
According to the GDPR, the storage period of personal data must be limited. Personal data may be stored in a format that allows identifying the data subjects only for as long as is necessary for the purposes of the processing.
A time limit must be determined for the data after which the data is erased, archived or anonymised. When the personal data is no longer needed for the development or use of an AI system, the data must be anonymised or erased.
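A retention routine like the one described above can be sketched as follows. The one-year retention period and the idea of blanking identifying fields are assumptions for this example; whether the result actually qualifies as anonymised data must be assessed separately.

```python
from datetime import datetime, timedelta, timezone

# Illustrative sketch: once a record's retention period has elapsed, it is
# either erased or anonymised. The period below is an assumed example value.
RETENTION = timedelta(days=365)

def apply_retention(record, now, anonymise=True):
    if now - record["collected_at"] <= RETENTION:
        return record                      # still within its storage period
    if not anonymise:
        return None                        # erase the record entirely
    anon = dict(record)
    anon["name"] = None                    # remove direct identifiers
    anon["collected_at"] = None
    return anon

now = datetime(2025, 6, 1, tzinfo=timezone.utc)
old = {"name": "A", "value": 3, "collected_at": datetime(2023, 1, 1, tzinfo=timezone.utc)}
print(apply_retention(old, now))
```

Running such a routine on a schedule makes the time limit enforceable rather than merely documented.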
Personal data may be stored for longer than originally planned only if the safeguards required in the GDPR are appropriately in place and the data is processed for one of the following purposes:
- Archiving in the public interest
- Scientific or historical research
- Statistical purposes
Read more about storage limitation
What must be taken into account in automated decision making?
AI systems can be used for automated decision making. Decision making is automated if it is based solely on automated processing of personal data and the decisions have legal or similarly significant effects on people.
Data subjects have the right not to be subjected to such decision making. However, there are exceptions to this rule. Automated decision making is allowed if appropriate safeguards are in place and if the decision is
- necessary for the conclusion or performance of a contract between the data subject and the controller;
- authorised by legislation to which the controller is subject;
- based on explicit consent given by the data subject.
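The rule and its three exceptions amount to a simple permissibility check. A minimal sketch, with the basis labels chosen for this example:

```python
# Illustrative sketch of the exceptions listed above: solely automated
# decisions with legal or similarly significant effects are allowed only on
# one of three grounds, and appropriate safeguards must also be in place.
# The string labels are assumptions for this example.
ALLOWED_BASES = {"contract", "legislation", "explicit_consent"}

def adm_permitted(basis: str, safeguards_in_place: bool) -> bool:
    return safeguards_in_place and basis in ALLOWED_BASES

print(adm_permitted("explicit_consent", safeguards_in_place=True))    # True
print(adm_permitted("legitimate_interest", safeguards_in_place=True)) # False
```

Note that the safeguards condition is conjunctive: a valid basis alone is not enough.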
Sensitive data belonging to special categories of personal data, such as health data, can be processed in connection with automated decision making only if the data subject has consented to the processing or if the processing is necessary on the basis of a substantial public interest laid down in law.
Profiling can produce data that belongs to special categories of personal data even if the original data alone was not sensitive. A processing basis must always exist for the processing of such data as well.
In addition, the data subjects must be informed of the logic of the processing and the consequences the processing has for them. Data subjects must be clearly, transparently and plainly informed what practices and principles are used in the processing of their personal data. The organisation must also be able to demonstrate to what extent the AI system made the decision relating to them.
Read more about automated decision making and profiling
Process personal data securely
Both the GDPR and the AI Act highlight the importance of protecting personal data throughout the data’s lifecycle against processing that violates data protection legislation. Organisations must implement appropriate technical and organisational measures to ensure, and to be able to demonstrate, that personal data is processed in line with data protection legislation.
Organisations must have the ability to ensure the ongoing confidentiality, integrity, availability and resilience of processing systems and services, and the ability to restore availability and access in the event these are prevented. In addition, organisations must have a process for regularly testing, assessing and evaluating the effectiveness of the measures in order to ensure the security of the processing.
Such measures may include the following:
- Encryption: Encrypting the personal data during both storage and transfers ensures confidentiality even in the event of a data breach.
- Access right management: Restricted access rights limit who can use and edit personal data.
- Vulnerability testing: Regular vulnerability testing helps identify and remedy any vulnerabilities in the system’s security.
- Log data and auditing: Maintaining detailed log data on the system’s operation enables detecting and investigating any suspicious activity.
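Two of the measures above, restricted access rights and log data, naturally combine: every access attempt is checked against the user's role and recorded. A minimal sketch, with hypothetical roles and resource names:

```python
import logging

# Illustrative sketch: restricted access rights plus an audit log.
# The roles, permissions and resource names are assumptions for this example.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger("audit")

PERMISSIONS = {"analyst": {"read"}, "admin": {"read", "write"}}

def access(role, action, resource):
    allowed = action in PERMISSIONS.get(role, set())
    # Every attempt is logged so suspicious activity can be investigated later.
    log.info("role=%s action=%s resource=%s allowed=%s", role, action, resource, allowed)
    return allowed

access("analyst", "write", "training_data")  # denied, and the attempt is logged
```

Because denied attempts are logged as well, the log supports the detection and investigation of suspicious activity described above.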
AI systems involve specific risks that require safeguards supplementing traditional data protection practices.
Such supplementary safeguards include:
- Input management: Text, video, audio or images can be used as input for an AI system. Before input data is processed by the AI system, it must be examined for deviating or harmful content, and any such content must be erased if necessary to ensure that it does not conflict with the purpose of the AI system. Other safeguards include controlling and limiting the amount of input data.
- Decision control: The accuracy, justness and traceability of the outputs of the AI system must be ensured and any biases must be identified.
Respect people’s data protection rights
Data subjects’ data protection rights are laid down in the GDPR. A data subject is a natural person whose personal data is processed in connection with the development or use of an AI system. Data subjects’ rights are the right to access the data collected on them, the right to have the data rectified and erased, the right to restrict the processing and object to it, and the right to have the data transmitted from one system to another.
Organisations developing or using an AI system must ensure that the system and its mechanisms allow all rights related to the personal data processed in the system to be exercised effectively.
Read more about data subjects’ rights
Demonstrate compliance with data protection legislation
Organisations developing or using an AI system are responsible for complying with the requirements of data protection legislation and for being able to demonstrate their compliance (‘accountability’). Accountability requires implementing certain measures and documenting them.
Read more about accountability and the measures and documentation required
Read more:
Artificial Intelligence Act (EUR-Lex)
General Data Protection Regulation (EUR-Lex)
Information on the AI Act on the Commission’s website