LEGAL REGULATION OF THE FORMATION AND UTILIZATION OF REGIONAL DATA SETS IN THE PROCESS OF ARTIFICIAL INTELLIGENCE MODEL TRAINING

Year: 
2026

Issue: 
1

UDC: 
340
DOI: 
10.34076/22196838_2026_1_87
Author(s): 

Olifirenko Artem

Master’s student, Saratov State Law Academy (Saratov), Master’s student, Yuri Gagarin State Technical University of Saratov (Saratov), Data Protection Specialist and AI Systems Security Officer, Ecosystem of Real Estate «Square Meter» LLC (Moscow), ORCID: 0000-0002-2186-281X, e-mail: panolifer@lamirtech.su.

Abstract: 

The article presents a comprehensive legal and scholarly interpretation of the institution of regional data sets, formalized within the experimental legal regime established by Federal Laws No. 123-FZ and No. 233-FZ, which regulate the use of anonymized information arrays for training artificial intelligence models. It examines in detail the substantive legal mechanism governing the formation, processing and subsequent use of regional data sets within the state information infrastructure, which operates under the model of technological sovereignty. The study analyzes the differentiated requirements imposed on data access subjects, including criteria of institutional affiliation, national jurisdiction and compliance with the imperatives of information security and regulatory oversight. It substantiates the shift of the legal regime governing AI model training into the domain of public law regulation, with emphasis on ensuring the lawfulness and transparency of algorithmic processes in strategically significant sectors. The article argues for establishing a legal presumption of operator liability for the implementation and operation of regional data sets, including the introduction of monitoring and audit mechanisms over the functioning of the trained models. Finally, it proposes conceptual directions for harmonizing existing legislation on the categorization of regional data sets used in AI model training, aimed at reinforcing the digital sovereignty of the Russian Federation.

Key words: 

regional data sets, artificial intelligence systems, legal regime, personal data processing, compliance control, technological sovereignty, information security

For citation: 

Olifirenko A. (2026) Legal regulation of the formation and utilization of regional data sets in the process of artificial intelligence model training. In Elektronnoe prilozhenie k «Rossiiskomu yuridicheskomu zhurnalu», no. 1, pp. 87–97, DOI: http://doi.org/10.34076/22196838_2026_1_87.

Publication date: 
Friday, 06.03.2026
