If 2025 was the year that many organizations bet on AI, 2026 will be the year the focus shifts to implementing and evaluating AI systems. As organizations transition from testing AI to deploying it at scale, the most important question is no longer whether AI poses risks, but whether its benefits meaningfully justify those risks. This shift is happening against a rapidly evolving legal and compliance landscape, in which AI impact assessments are increasingly required under state-level laws [1] and international frameworks such as the EU AI Act [2].
As the science of AI measurement continues to evolve and federal guidance in this domain remains nascent [3], a significant question arises: how should organizations evaluate the positive and negative impacts of an AI system? To address this, the Responsible AI Lab published the first edition of its AI risk/benefit framework. This framework looks beyond traditional technical and economic considerations and integrates a range of downstream effects of AI on users and adopters of the technology as well as the community where the infrastructure powering it is situated.
What Are AI Impact Assessments?
A traditional AI Impact Assessment (AIIA) is a structured process to identify and manage risks resulting from AI systems, with most emphasizing safety, privacy, and accountability at the design or early development stages. However, the Lab takes a comprehensive approach by expanding the scope of an impact assessment to include identification and management of benefits to users and society.
As AI impact is increasingly understood as an outcome of sociotechnical systems rather than isolated models [4], many frameworks combine quantitative performance metrics with qualitative process checks to estimate and mitigate the potential risk profile of an AI system once it is deployed.
We reviewed a number of impact assessments, including ISO 42001 [5], Ontario’s Human Rights AI Impact Assessment [6], and Canada’s Algorithmic Impact Assessment tool [7], and concluded that while these frameworks reflect important progress, they share recurring limitations when applied to housing, lending, and other regulated sectors.
Despite their growing adoption, current AI impact assessments struggle to consistently measure outcomes and exhibit several persistent gaps that limit their effectiveness. These shortcomings include:
- Many frameworks lack quantitative fairness testing and do not clearly operationalize disparate impact analysis, relying instead on high-level qualitative prompts or self-reported compliance claims.
- Existing approaches frequently exclude meaningful input from affected stakeholders, limiting their ability to reflect lived experiences and community-level consequences.
- AIIAs are often biased toward pre-deployment evaluation, with limited emphasis on post-deployment monitoring or tracking real-world outcomes over time. As a result, assessments tend to focus on hypothetical or design-stage risks without establishing a clear link to tangible impacts such as environmental and community harms.
- Most frameworks prioritize the identification of risks while giving insufficient attention to measuring benefits, making it difficult to assess whether an AI system meaningfully improves outcomes relative to the status quo.
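To make the first gap concrete, one widely used quantitative fairness test that many assessments leave unoperationalized is the adverse impact ratio from US employment-law practice (the "four-fifths rule"): the favorable-outcome rate for a protected group divided by that of the reference group, flagged when it falls below 0.8. A minimal sketch, with illustrative numbers rather than data from any real system:

```python
# Minimal sketch of one quantitative fairness test that many AIIAs omit:
# the adverse impact ratio (the "four-fifths rule"). The group labels and
# counts below are illustrative, not drawn from the framework itself.

def adverse_impact_ratio(favorable: int, total: int,
                         ref_favorable: int, ref_total: int) -> float:
    """Favorable-outcome rate of a protected group divided by the reference group's."""
    rate = favorable / total
    ref_rate = ref_favorable / ref_total
    return rate / ref_rate

# 30/100 approvals for the protected group vs. 50/100 for the reference group
ratio = adverse_impact_ratio(30, 100, 50, 100)
print(round(ratio, 2))  # 0.6 — below the 0.8 threshold, signaling potential disparate impact
```

A ratio below 0.8 is a screening signal, not a legal conclusion; the point is that a framework can ask for this number explicitly rather than relying on self-reported compliance claims.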
What is NFHA’s Risk/Benefit Assessment?
The Risk/Benefit Assessment is a structured framework consisting of 60 scoreable questions, organized into 8 steps: AI Model Assessment, Stakeholder Assessment, Impact at Data Level, Impact at Algorithmic Level, Impact at Output Level, Fairness Metrics and Mitigation, Advancing Fairness and Consumer Opportunity, and Legal & Compliance. The framework is designed for organizations deploying or procuring AI in high-stakes domains—particularly housing, lending, insurance, and other areas governed by civil rights and consumer protection laws.
The framework offers the following differentiating factors:
- A scoring approach that produces a risk/benefit profile, allowing organizations to assess the proportionality between an AI system’s anticipated benefits and its residual risks;
- Explicit integration of antidiscrimination considerations, translating US legal and policy concepts into practical assessment criteria; and
- An overall design that supports real-world outcomes, transparency, comparability, and future adaptation for different use cases or regulatory contexts.
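The scoring mechanics are not reproduced here, but the basic aggregation can be sketched. Assuming (hypothetically) that each of the 60 questions is scored on a fixed scale and tagged as measuring either risk or benefit within one of the 8 steps, a risk/benefit profile could be computed like this:

```python
from dataclasses import dataclass

# Hypothetical sketch: the framework's actual question wording, weights, and
# scoring scale are not specified here. Assumes each question is scored 0-4
# and tagged as measuring either "risk" or "benefit" within one of the 8 steps.

@dataclass
class Question:
    step: str    # one of the framework's 8 steps
    kind: str    # "risk" or "benefit"
    score: int   # assumed scale: 0 (low) to 4 (high)

def risk_benefit_profile(questions):
    """Aggregate per-question scores into step-level totals and a net score."""
    profile: dict[str, dict[str, int]] = {}
    for q in questions:
        bucket = profile.setdefault(q.step, {"risk": 0, "benefit": 0})
        bucket[q.kind] += q.score
    total_risk = sum(b["risk"] for b in profile.values())
    total_benefit = sum(b["benefit"] for b in profile.values())
    return profile, total_benefit - total_risk  # positive net: benefits outweigh residual risks

# Illustrative example with two of the 60 questions
qs = [
    Question("Impact at Data Level", "risk", 3),
    Question("Advancing Fairness and Consumer Opportunity", "benefit", 4),
]
profile, net = risk_benefit_profile(qs)
print(net)  # 1
```

The step-level breakdown, rather than the single net number, is what supports the proportionality judgment described above: an organization can see which of the 8 steps drives the risk side of the profile.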
Conclusion
As organizations move from experimenting with AI to relying on it, the need for benchmarking and rigorous evaluation will grow. RAIL’s Risk/Benefit Framework is designed as a practical first step: it helps organizations evaluate their AI systems’ risks relative to their benefits and make more evidence-based AI governance decisions.
[1] https://leg.colorado.gov/bills/sb24-205
[2] https://artificialintelligenceact.eu/
[3] https://www.nist.gov/news-events/news/2026/01/towards-best-practices-automated-benchmark-evaluations
[4] https://link.springer.com/article/10.1007/s10462-023-10420-8
[5] https://www.iso.org/standard/42001
[6] https://www3.ohrc.on.ca/en/human-rights-ai-impact-assessment
[7] https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html

