
Handley Gill Limited


Right my wrongs

When generative AI model developers fail to implement the principles of privacy by design and default and data minimisation when carrying out processing for the purposes of training AI models, the implications flow down the chain, posing challenges for data subjects seeking to secure their rights. Ensuring that data subjects can exercise their rights at an early stage requires generative AI developers to do more to highlight their collection of personal data, particularly when this is gathered through web scraping.
— Handley Gill Limited

In its fourth and most recent consultation on generative AI and data protection, ‘Engineering individual rights into generative AI models’, the Information Commissioner’s Office is seeking views on the enforcement of data subject rights in the context of generative AI models.

The draft guidance emphasises that data subject rights apply to all phases of the generative AI model lifecycle, from training to fine-tuning to inputs and outputs. Despite its title, however, the draft guidance gives no guidance on how data protection by design and default can be engineered into generative AI models, whether in relation to data subject rights or more widely.

In connection with the right to be informed under Article 14 UK GDPR / GDPR, which applies where personal data are collected from third-party sources (as is likely to be the case for personal data web scraped for AI training purposes), the guidance proceeds on the basis that it is impossible, or would require disproportionate effort, to provide privacy information to each individual whose data has been collected. Where Article 14(5)(b) UK GDPR applies, the data controller is still obliged to “take appropriate measures to protect the data subject's rights and freedoms and legitimate interests, including making the information publicly available”. The ICO suggests that this includes publishing: specific, accessible information on the sources, types and categories of personal data used to develop the model; specific, accessible explanations of the purposes for which personal data is being processed and the lawful basis for the processing; and prominent, accessible mechanisms for individuals to exercise their rights. As we highlighted in our previous post ‘Time to regenerate’, the ICO indicates that its preliminary view is that “Vague statements around data sources (eg just ‘publicly accessible information’) are unlikely to help individuals understand whether their personal data may be part of the training dataset or who the initial controller is likely to be”.

In circumstances where, as the ICO concedes, it is possible for generative AI developers to web scrape datasets pertaining to millions or billions of individuals, we suggest that it is insufficient merely to publish a privacy notice on a developer’s website, and that the ICO ought to consider whether developers could feasibly contact the data controllers/web hosts of the sites from which data, including personal data, is scraped, thus enabling them to make their users aware and/or to take measures to protect their content. In this regard we are also mindful of the recognition by the European Data Protection Board (EDPB) in its ‘Report of the work undertaken by the ChatGPT Taskforce’ that the obligation under Article 12(2) GDPR to facilitate data subject rights includes an obligation to “continue improving the modalities” for doing so. We therefore oppose the issuing of guidance which effectively sets standards below those that would be accepted in other contexts, and which will be difficult to raise later, if technological solutions could support compliance either now or in the future.

The ICO also recognises that the web scraping and further processing of personal data is unlikely to have fallen within individuals’ reasonable expectations.

The ICO queries whether imposing filters on generative AI model inputs and outputs is sufficient to comply with data subject rights to restrict, rectify and/or erase personal data. Output filters effectively suppress the inclusion of personal data, similar to the approach taken by search engines following the judgment in C-131/12 Google Spain SL and Google Inc. v Agencia Española de Protección de Datos (AEPD) and Mario Costeja González. In our response we highlight the limitations of even these measures, which do not guarantee that personal data will not be included in outputs, as well as potential alternatives for effective compliance with data subject rights.

We consider that many of the issues arising from the difficulties in complying with data subject rights point to the need for improvements in securing privacy by design and data minimisation when conducting web scraping. We addressed the lawful basis for the processing of personal data through web scraping for the purposes of training generative AI models in our previous post ‘Scraping together a lawful basis’.

There are areas the draft guidance currently doesn’t address that we believe should be incorporated. The guidance doesn’t address the role of generative AI developers in notifying deployers of generative AI models about data subject rights requests in accordance with Article 19 UK GDPR, nor does it address how the exception to the exercise of data subject rights where requests are manifestly unfounded or excessive under Article 12(5) UK GDPR should apply in the context of generative AI models, where the personal data processed may constantly evolve.

We submitted a response to the Information Commissioner’s call for evidence on ‘Engineering individual rights into generative AI models’, which can be accessed here:

The Information Commissioner indicates that the next phase of its generative AI and data protection consultation series will focus on the concepts of controllership in the context of generative AI, addressing the role of AI developers and AI deployers.

Both generative AI developers and deployers need to establish processes for handling data subject rights requests and may require support in responding to specific requests. The ability to comply with such requests should also form part of your Data Protection Impact Assessment. If Handley Gill can support you, please contact us.

Download our Helping Hand Data Subject Access Request Compliance checklist.

Access Handley Gill Limited’s proprietary AI CAN (Artificial Intelligence Capability & Needs) Tool, to understand and monitor your organisation’s level of maturity on its AI journey.

Download our Helping Hand checklist on using AI responsibly, safely and ethically.

Check out our dedicated AI Resources page.

Follow our dedicated AI Regulation Twitter / X account.

Follow Handley Gill Limited on Twitter / X.