Security Aspects of Industrial Data Management
As an ever-growing number of data sources generates tremendous amounts of information, the need for security and privacy techniques, methods and solutions has grown accordingly. This information should be handled, stored and processed in a secure, trustworthy and privacy-preserving manner.
Privacy and security have therefore become very important research topics, and extensive work on them has produced a large, constantly evolving list of state-of-the-art technologies that are combined to provide a secure and privacy-preserving environment. Increasing the trust of users in technology platforms and services is likewise considered among the most important topics.
Within the context of XMANAI, a holistic approach has been employed to cover the security and privacy needs of industrial data management. The approach addresses the required security and privacy along the following five (5) core axes: a) the security of data in motion, b) data access control, c) data anonymization, d) identity management and e) accountability management.
Novel platforms and applications require the transfer of large volumes of data between resources that usually reside at different network locations. Hence, the need for high-performance data transfer arises together with the hard requirement for secure data transfer and data sharing, both fundamental aspects of data-intensive applications. For this reason, considerable effort has been invested in the design and implementation of protocols that support these operations effectively and efficiently at various layers of the TCP/IP model, with high throughput, fairness and stability as their main characteristics. Within the context of XMANAI, state-of-the-art protocols, including IPsec, TLS, GridFTP, GridCopy and UDT, are leveraged depending on the nature of the data in motion and the requirements of the XMANAI platform’s components.
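As an illustration of securing data in motion with TLS, the following minimal sketch builds a client-side TLS configuration with Python's standard ssl module. It is not taken from the XMANAI platform: the function name make_client_context and the choice of TLS 1.2 as the minimum version are assumptions made for this example.

```python
import ssl


def make_client_context(
    min_version: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2,
) -> ssl.SSLContext:
    """Build a client-side TLS context that verifies the server certificate.

    Hypothetical helper: the minimum protocol version is an assumption,
    not a documented XMANAI setting.
    """
    # create_default_context() loads the system CA certificates and
    # enables certificate verification and hostname checking.
    context = ssl.create_default_context()
    # Refuse to negotiate protocol versions older than min_version.
    context.minimum_version = min_version
    return context


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

A socket wrapped with such a context (via context.wrap_socket with a server_hostname argument) would refuse connections to servers that present an untrusted certificate or a certificate for a different host name.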
Data access control is considered a cornerstone of any software solution. It refers to the methodologies that guarantee the selective restriction of access to critical or valuable resources in order to assure data confidentiality and data integrity. To eliminate unauthorized data access and data misuse, data access control covers authentication, authorization, access regulation and access auditing. Hence, data access control should regulate the access and the legitimate actions (discovering, reading, creating, editing, deleting, reserving and executing) that the subjects can perform over the objects, i.e., the system or its resources (data, services, network, etc.). To achieve this, a variety of Access Control Mechanisms (ACM) are available, the most dominant being the Access Control List (ACL), Discretionary Access Control (DAC), Mandatory Access Control (MAC), Identity-Based Access Control (IBAC), Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC). Within the context of XMANAI, a hybrid approach is followed: the designed access control model combines the strengths of three of these models (ABAC, RBAC and ACL). The rationale behind this approach lies in the nature of the different assets in XMANAI, since any valuable resource incorporated into the platform, such as a project, a dataset, a trained model or an analytics result, is considered an asset.
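The idea of a hybrid model can be sketched as follows. The text does not specify how XMANAI combines ABAC, RBAC and ACL, so the evaluation order below (explicit ACL grant first, then an attribute condition on the organization, then role permissions) is purely an assumption, as are all names (Subject, Asset, ROLE_PERMISSIONS, is_allowed) and the role definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Subject:
    user_id: str
    organization: str
    roles: frozenset = frozenset()


@dataclass
class Asset:
    asset_id: str
    owner_organization: str
    # ACL part: explicit per-user grants, user_id -> set of actions.
    acl: dict = field(default_factory=dict)


# RBAC part: hypothetical roles mapped to the actions they permit.
ROLE_PERMISSIONS = {
    "viewer": {"discover", "read"},
    "editor": {"discover", "read", "edit"},
    "admin": {"discover", "read", "create", "edit", "delete", "execute"},
}


def is_allowed(subject: Subject, asset: Asset, action: str) -> bool:
    """Decide whether subject may perform action on asset (illustrative only)."""
    # 1) ACL: an explicit grant on this asset for this user wins.
    if action in asset.acl.get(subject.user_id, set()):
        return True
    # 2) ABAC: attribute condition, here that the subject's organization
    #    matches the organization owning the asset.
    if subject.organization != asset.owner_organization:
        return False
    # 3) RBAC: any of the subject's roles must permit the action.
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in subject.roles)


alice = Subject("alice", "acme", frozenset({"editor"}))
bob = Subject("bob", "other", frozenset({"admin"}))
dataset = Asset("dataset-1", "acme", acl={"bob": {"read"}})

print(is_allowed(alice, dataset, "edit"))    # role permits it
print(is_allowed(alice, dataset, "delete"))  # no role permits it
print(is_allowed(bob, dataset, "read"))      # explicit ACL grant
print(is_allowed(bob, dataset, "edit"))      # different organization, no grant
```

In this sketch the ACL serves as a per-asset exception mechanism, for example to share a single dataset with a user of another organization, while roles and attributes express the general policy.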
A significant amount of the collected and processed data also incorporates personal and sensitive data in various forms, such as personal information, financial information and personal activity information. This personal and sensitive information is strictly regulated by various national and international data protection laws, e.g., the European General Data Protection Regulation (GDPR), the US Health Insurance Portability and Accountability Act (HIPAA) and the California Consumer Privacy Act (CCPA). Hence, the need for data anonymization arises, both to ensure compliance with the underlying regulations and to preserve privacy. Data anonymization is a challenging process that preserves privacy by applying techniques that generalize, conceal or mask the personal information. The applied techniques range from generalization, suppression and bucketization to slicing and randomization. Within the context of XMANAI, a comprehensive anonymization toolset is employed. It offers a range of methods that go beyond pseudonymization, including k-anonymity and k^m-anonymity, different automation options, and a graphical interface in which users guide the algorithm and decide trade-offs through simple visual choices.
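To make the notion of k-anonymity concrete: a dataset is k-anonymous with respect to a set of quasi-identifiers if every combination of quasi-identifier values occurs in at least k records. The sketch below measures k before and after a simple generalization of ages into ranges. The records, field names and helper functions (k_of, generalize_age) are invented for illustration and do not reflect the toolset used in XMANAI.

```python
from collections import Counter


def k_of(records, quasi_identifiers):
    """Return the size of the smallest group of records sharing the same
    quasi-identifier values, i.e. the k for which the data is k-anonymous."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def generalize_age(record, width=10):
    """Replace the exact age with a range of the given width (generalization)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Invented sample data: age and department act as quasi-identifiers.
records = [
    {"age": 31, "dept": "assembly"},
    {"age": 34, "dept": "assembly"},
    {"age": 38, "dept": "assembly"},
    {"age": 42, "dept": "quality"},
    {"age": 45, "dept": "quality"},
    {"age": 47, "dept": "quality"},
]

generalized = [generalize_age(r) for r in records]

print(k_of(records, ["age", "dept"]))      # 1: every record is unique
print(k_of(generalized, ["age", "dept"]))  # 3: two groups of three records
```

The example also shows the trade-off that such tools expose to the user: wider ranges raise k and thus privacy, at the cost of less precise data.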
Identity management incorporates all the processes and policies related to the management of digital identities. It covers the complete user account lifecycle, from the creation and maintenance to the de-provisioning of user accounts. The identity management research area has evolved significantly over the past years. The most dominant identity management models are the Isolated (or Silo) Model, the Central Model, the User-Centric Model and the Federated Model, and several technologies are leveraged to implement them, such as OAuth 2.0, OpenID Connect and SAML 2.0. Within the context of XMANAI, a single core identity provider has been adopted that undertakes the registration, verification and authentication of all users of the platform. This identity provider is based on the Isolated Model and groups the users of the platform into organizations: each user belongs to exactly one organization, which is established through a robust organization registration process followed by a user invitation process.
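The organization-centric user model described above can be sketched as a small in-memory registry. This is a simplified illustration under stated assumptions, not the XMANAI identity provider: the class IdentityProvider, its methods and the token-based invitation flow are hypothetical, and verification and authentication are omitted.

```python
import secrets


class IdentityProvider:
    """Toy single identity provider in which each user belongs to one organization."""

    def __init__(self):
        self.organizations = set()
        self.users = {}        # email -> organization
        self.invitations = {}  # invitation token -> (email, organization)

    def register_organization(self, name):
        if name in self.organizations:
            raise ValueError(f"organization {name!r} is already registered")
        self.organizations.add(name)

    def invite_user(self, organization, email):
        """Create an invitation and return its single-use token."""
        if organization not in self.organizations:
            raise ValueError(f"unknown organization {organization!r}")
        if email in self.users:
            raise ValueError(f"{email!r} already belongs to an organization")
        token = secrets.token_urlsafe(16)
        self.invitations[token] = (email, organization)
        return token

    def accept_invitation(self, token):
        """Consume the token and attach the user to the inviting organization."""
        email, organization = self.invitations.pop(token)  # KeyError if unknown
        if email in self.users:
            raise ValueError(f"{email!r} already belongs to an organization")
        self.users[email] = organization
        return email


idp = IdentityProvider()
idp.register_organization("acme")
token = idp.invite_user("acme", "alice@example.org")
idp.accept_invitation(token)
print(idp.users)
```

Because membership is checked both when inviting and when accepting, a user who has joined one organization cannot subsequently be attached to a second one.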
Accountability management refers to accountability across the data provenance and data management lifecycle, in which several stakeholders, such as service providers, data consumers and data providers, are involved. It covers how information is managed, how actions are verified, and how discrepancies between the actions that occurred and the actions that were expected are remedied with proper explanation and verification. In this sense, accountability management is considered an important prerequisite for increasing trust in any ICT platform. The accountability lifecycle includes policy planning, sensing and tracing, logging and safe-keeping of logs, reporting and replaying, auditing, and finally optimization and rectification. Within the context of XMANAI, a mechanism has been adopted that enables the determination of the complete lineage of the data back to its creation. It also tracks the transformation or integration of original data into processed data or new datasets, creating transparency with respect to data usage, data manipulation, the underlying manipulation methods and access privileges, so that the provenance of any given data can be understood.
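Lineage tracking of this kind can be pictured as a graph in which every derived dataset points to the datasets it was produced from. The following sketch records such derivations and walks them back to the original data. The class ProvenanceLog, its methods and the dataset names are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceLog:
    parents: dict = field(default_factory=dict)     # dataset -> parent datasets
    operations: dict = field(default_factory=dict)  # dataset -> producing operation

    def record(self, dataset, derived_from=(), operation="create"):
        """Log that dataset was produced by operation from derived_from."""
        self.parents[dataset] = list(derived_from)
        self.operations[dataset] = operation

    def lineage(self, dataset):
        """Return every ancestor of dataset, each listed once."""
        seen, order = set(), []
        stack = list(self.parents.get(dataset, []))
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            order.append(current)
            stack.extend(self.parents.get(current, []))
        return order

    def origins(self, dataset):
        """Return the datasets without parents that dataset goes back to."""
        candidates = [dataset] + self.lineage(dataset)
        return sorted(d for d in candidates if not self.parents.get(d))


log = ProvenanceLog()
log.record("raw_a")
log.record("raw_b")
log.record("cleaned_a", derived_from=["raw_a"], operation="clean")
log.record("merged", derived_from=["cleaned_a", "raw_b"], operation="join")

print(log.origins("merged"))  # ['raw_a', 'raw_b']
```

Storing the operation alongside each derivation is what allows an auditor to see not only where data came from but also how it was manipulated along the way.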
The XMANAI holistic approach addresses the identified requirements of industrial data management operations by covering the security and privacy aspects described above. Nevertheless, as the project evolves, several optimizations and enhancements are foreseen based on the feedback that will be collected from the XMANAI stakeholders until the project’s completion.