How to Manage AI Big-Data Risks
Establishing a taxonomy for AI risks would enable researchers, policymakers, and industries to communicate effectively and coordinate their efforts.
Artificial intelligence (AI) is no longer a science fiction fantasy, but AI systems are only as good as the code, training data, and algorithms used to create them. As AI continues transforming industries, understanding and addressing its inherent risks is paramount. What is needed now is a robust framework to manage AI vulnerabilities. Cybersecurity is decades ahead of AI in cataloging and managing such weaknesses, and by applying its lessons, effective strategies can be developed to ensure the responsible and trustworthy advancement of AI technologies.
That AI systems are only as good as their inputs—and, consequently, can do great damage—is indisputable. Consider, for instance, a scenario where an AI system designed to monitor water quality inadvertently underreports contaminants due to flawed training data. This could lead to entire communities consuming unsafe water, resulting in public health crises and loss of trust in technology and government.
The recent introduction of MIT's AI risk repository offers a promising tool for categorizing and analyzing these threats. By aggregating hundreds of AI-associated threats across various environments and categorizing them by cause and nature, whether related to privacy, security, disinformation, or other concerns, the repository makes those risks far easier to identify and mitigate.
This comprehensive perspective is crucial for organizations aiming to understand and mitigate the complex challenges posed by AI. The implications of AI failures extend beyond water quality monitoring. In healthcare, faulty algorithms could lead to misdiagnoses. In finance, erroneous models might enable fraud or cause significant financial losses. Given AI's immense potential, ensuring that these systems are trustworthy, accountable, and transparent is essential.
Cybersecurity—which draws from decades of experience in developing effective frameworks and tools to tackle complex threats—offers valuable insights for managing AI risks. A key tool in the cybersecurity domain is the National Vulnerability Database (NVD), which maintains a comprehensive catalog of known software vulnerabilities. Each vulnerability is assigned a Common Vulnerabilities and Exposures (CVE) ID, a unique identifier that standardizes the way vulnerabilities are tracked and reported. The CVE system is significant because it provides a common language for cybersecurity professionals, researchers, and vendors to discuss and address security flaws consistently and efficiently.
This system helps prioritize risk management efforts, allowing developers and security teams to focus on the most critical vulnerabilities. The CVE system also fosters transparency and accountability in software development, ensuring that known flaws are not only tracked but also mitigated as part of an ongoing effort to secure the ever-evolving software landscape.
This structured approach in cybersecurity, exemplified by the NVD and CVE systems, can be a model for managing AI risks. As AI systems become more widespread and complex, creating a similar framework to catalog and address AI-specific vulnerabilities—whether in the code, data, or algorithms—would help ensure AI technologies remain secure, transparent, and accountable.
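To make the analogy concrete, below is a minimal sketch of what a CVE-style record for an AI-specific vulnerability might look like. The record identifier, field names, and example values are purely illustrative assumptions, not part of any existing standard.

```python
from dataclasses import dataclass, field

@dataclass
class AIVulnerabilityRecord:
    """Hypothetical CVE-style entry for an AI-specific vulnerability (illustrative only)."""
    record_id: str          # e.g., "AIV-2025-0001" -- an invented identifier scheme
    component: str          # "code", "training data", or "model/algorithm"
    description: str
    severity: float         # 0.0-10.0, mirroring CVSS-style scoring
    affected_systems: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)

# Example entry based on the water-quality scenario described earlier.
example = AIVulnerabilityRecord(
    record_id="AIV-2025-0001",
    component="training data",
    description=("Contaminant readings under-represented in the training set, "
                 "causing systematic under-reporting of unsafe samples."),
    severity=8.5,
    affected_systems=["water-quality-monitor v2"],
    mitigations=["Rebalance training data", "Add out-of-distribution checks"],
)
print(example.record_id, "-", example.component)
```

Like a CVE entry, such a record would give researchers, vendors, and regulators a shared reference point for the same flaw.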
Currently, AI lacks a comparable framework. AI-specific risks remain largely untracked and unstructured. Even if an AI system's code is flawless, the real risks often lie within the data. For example, an AI system trained exclusively on data from North America might incorrectly assume that July is the hottest month worldwide because it was never exposed to data showing that July falls in the middle of the Southern Hemisphere's winter.
Such biases can have significant consequences, whether unintentional or the result of deliberate adversarial interference. Additionally, the non-deterministic nature of AI—where outputs depend on probabilities, training data, and real-world conditions—adds further unpredictability. Unlike traditional software, which produces consistent outputs given the same inputs, AI systems can exhibit varying behavior based on their data and probabilistic models. This unpredictability increases the risk of unintended consequences if the training data is flawed or biased.
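As a toy illustration of both failure modes, the sketch below uses entirely made-up data and a deliberately naive "model": fit only on Northern Hemisphere records, it projects July as the hottest month everywhere, and its sampling step shows how identical inputs can still produce different outputs.

```python
import random
from collections import Counter

# Toy "training data": average temperatures (degrees C) from Northern Hemisphere cities only.
northern_hemisphere_data = {
    "New York": {"Jan": 0, "Jul": 25},
    "Madrid":   {"Jan": 6, "Jul": 26},
    "Tokyo":    {"Jan": 5, "Jul": 27},
}

# "Training": learn the hottest month from the geographically biased sample.
month_totals = Counter()
for city_temps in northern_hemisphere_data.values():
    for month, temp in city_temps.items():
        month_totals[month] += temp
hottest_month = month_totals.most_common(1)[0][0]

# The biased model generalizes incorrectly to the Southern Hemisphere, where July is midwinter.
print(f"Predicted hottest month in Sydney: {hottest_month}")  # prints "Jul" -- wrong

# Non-determinism: a sampling-based model can return different answers to the
# same question unless its randomness is pinned to a fixed seed.
answers = ["Jul", "Jan", "Dec"]
weights = [0.7, 0.2, 0.1]
print([random.choices(answers, weights)[0] for _ in range(5)])
```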
To address these challenges, an AI framework analogous to the NVD is necessary. This would involve creating a structured system to systematically identify and address vulnerabilities arising from both the code and the data used by AI systems. Such a framework would help avoid the black-box problem, where AI decisions are opaque and difficult to understand or correct.
The MIT risk repository represents a significant advance in addressing AI risks. However, standardization in how AI risks are categorized and responded to is required to manage these risks fully. Establishing a taxonomy for AI risks, similar to the CVE system in cybersecurity, would enable researchers, policymakers, and industries to communicate effectively and coordinate their efforts. This taxonomy would cover intentional and unintentional risks, providing a framework for understanding how various factors undermine AI trustworthiness.
By leveraging the MIT risk repository, we can develop a knowledge base similar to the MITRE ATT&CK framework specifically for AI trustworthiness. The MITRE ATT&CK framework is a comprehensive and widely recognized knowledge base used in cybersecurity to catalog and describe various tactics, techniques, and procedures employed by threat actors. It provides a structured way to understand and analyze how cyber adversaries operate, allowing security professionals to detect, respond to, and mitigate attacks more effectively.
Similarly, a structured framework for AI trustworthiness would analyze intentional and unintentional risks associated with AI systems. This would involve identifying tactics and techniques that impact AI reliability, such as deliberate attacks like data poisoning and unintentional issues like data drift and model bias. The framework would enhance understanding of how various factors compromise AI systems by categorizing these risks and providing detailed descriptions, mitigation strategies, and detection methods.
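The sketch below illustrates how such catalog entries might be structured, modeled loosely on the ATT&CK layout. The identifiers (e.g., "AI-T0001"), tactic names, and details are hypothetical placeholders, not part of any published framework.

```python
# Illustrative catalog entries modeled loosely on the MITRE ATT&CK layout.
# All identifiers, tactic names, and details are hypothetical placeholders.
ai_trust_catalog = {
    "AI-T0001": {
        "name": "Data Poisoning",
        "intentionality": "intentional",
        "tactic": "Training Data Compromise",
        "description": "An adversary injects crafted records into the training set "
                       "to bias or backdoor the resulting model.",
        "detections": ["Dataset provenance audits", "Anomaly scans on incoming data"],
        "mitigations": ["Signed, versioned datasets", "Robust training techniques"],
    },
    "AI-T0002": {
        "name": "Data Drift",
        "intentionality": "unintentional",
        "tactic": "Operational Degradation",
        "description": "Real-world inputs shift away from the training distribution, "
                       "silently degrading model accuracy over time.",
        "detections": ["Monitoring input statistics in production"],
        "mitigations": ["Scheduled retraining", "Drift-triggered alerts"],
    },
}

# The catalog can be sliced along the key distinction drawn above: intentional vs. unintentional risk.
unintentional = [v["name"] for v in ai_trust_catalog.values()
                 if v["intentionality"] == "unintentional"]
print(unintentional)  # ['Data Drift']
```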
This approach would empower researchers, developers, and policymakers to systematically address vulnerabilities in AI systems, ensuring their development is more reliable, transparent, and accountable.
Another relevant cybersecurity practice is the Software Bill of Materials (SBOM). An SBOM lists all the components of a software product, improving transparency and allowing users to identify vulnerabilities in specific parts of the code. For AI, an analogous AI Bill of Materials (AI BOM) could be developed to detail how an AI system was created, including data sources, algorithms, and testing processes. An AI BOM would enhance transparency, facilitating the tracing of errors or biases and holding developers accountable for their systems.
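As a rough sketch of the idea, an AI BOM could be published as a machine-readable manifest alongside the system it describes. The schema and values below are assumptions for illustration, inspired by SBOM practice rather than drawn from any existing standard.

```python
import json

# Hypothetical AI Bill of Materials (AI BOM) manifest. Every field name and value
# here is illustrative; no standard schema is implied.
ai_bom = {
    "system": "water-quality-monitor",
    "version": "2.1.0",
    "model": {
        "architecture": "gradient-boosted trees",
        "training_date": "2025-01-15",
    },
    "data_sources": [
        {"name": "municipal-sensor-feed", "records": 1200000, "license": "internal"},
        {"name": "historical-lab-samples", "records": 85000, "license": "public domain"},
    ],
    "testing": {
        "holdout_accuracy": 0.94,
        "bias_audit": "regional coverage check, 2025-02-01",
        "red_team_review": True,
    },
    "known_limitations": ["Sparse coverage of rural sensor sites"],
}

# Publishing the manifest lets downstream users trace errors or biases to specific inputs.
print(json.dumps(ai_bom, indent=2))
```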
The principles of supply chain security in cybersecurity also offer valuable insights for AI. By incorporating the same rigorous testing and transparency practices applied to software supply chains, developers can subject AI technologies to comparable scrutiny, reducing the risk of vulnerabilities and improving overall trustworthiness.
As AI continues to evolve and become more embedded in daily life, addressing its risks is crucial. The lessons learned from cybersecurity provide a clear roadmap for managing these risks. We can build more reliable and transparent AI systems by adopting frameworks like the NVD and SBOM and by developing a structured taxonomy for AI-specific vulnerabilities. Collaboration across sectors is essential, as AI risks span various industries, and a piecemeal approach will not suffice. Cooperation between governments, academia, and the private sector must be fostered to create a unified approach to AI safety. Investing in ongoing research into AI ethics and risk management will further ensure that the benefits of AI are realized while minimizing its potential for harm.
Ultimately, the future of AI depends on the ability to manage its risks effectively. By drawing on successful cybersecurity strategies, we can develop a resilient framework that promotes the responsible advancement of AI technologies, ensuring they serve society positively while safeguarding against their inherent dangers.
Dr. Georgianna Shea is the chief technologist of the Center on Cyber and Technology Innovation (CCTI) at the Foundation for Defense of Democracies (FDD).
Zachary Daher served as a summer 2024 intern at FDD. Mr. Daher is a cyber science student at the United States Military Academy at West Point.
Image: Anggalih Prasetya / Shutterstock.com.