Developmental Test, Evaluation and Assessment

Developmental Test & Evaluation of Artificial Intelligence Enabled Systems

Sufficiency Assessments

Developmental Test and Evaluation (T&E) of Artificial Intelligence-enabled systems (AIES) poses significant challenges for the Department of Defense. The technology requires assessment methodologies and tools that are still emerging, while AI models and applications continue to evolve and expand in their uses for the Soldier.

The defining challenge in testing Artificial Intelligence (AI) is that comprehensive testing is impractical for many AI components and AI-enabled systems. Machine learning (ML) systems often exhibit chaotic changes in output in response to small input variations. Comprehensive testing of a system with a large state space is feasible only when its performance envelope can be described, allowing interpolation between test points and extrapolation outside the tested region. The absence of an underlying theory for ML systems makes describing that envelope difficult. In addition, these systems are designed to evolve continuously, so a fixed configuration for testing can no longer be assumed. Under these conditions, traditional test design and strategy may be insufficient, which calls for an iterative approach to testing and assessment that continues even after fielding. ML models are also usually based on large, complex data sets, which themselves require careful evaluation.
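The point about small input variations and interpolation between test points can be illustrated with a toy sketch. Everything in it is hypothetical and chosen only for illustration: `toy_model` is a stand-in for an ML component whose output flips rapidly with its input, `smooth_model` is a stand-in for a system with a simple, describable performance envelope, and `flip_rate` is an assumed metric, not an established DT&E measure.

```python
import math


def toy_model(x: float) -> int:
    """Hypothetical stand-in for an ML classifier whose label changes rapidly with x."""
    return 1 if math.sin(50.0 * x) > 0 else 0


def smooth_model(x: float) -> int:
    """Hypothetical stand-in for a system with a single, well-understood threshold."""
    return 1 if x > 0.5 else 0


def flip_rate(model, points, eps):
    """Fraction of test points whose label changes when the input moves by +/- eps."""
    flips = 0
    for x in points:
        base = model(x)
        if model(x + eps) != base or model(x - eps) != base:
            flips += 1
    return flips / len(points)


# A coarse grid of 20 test points on [0, 1), as a fixed test design might use.
grid = [i / 20 for i in range(20)]

toy_rate = flip_rate(toy_model, grid, eps=0.02)
smooth_rate = flip_rate(smooth_model, grid, eps=0.02)
print(f"toy model flip rate:    {toy_rate:.2f}")
print(f"smooth model flip rate: {smooth_rate:.2f}")
```

In this sketch the smooth model changes its answer only near its one threshold, so results at the grid points say a good deal about the inputs between them. The toy model changes its answer near many of the grid points, so a passing result at one test point says little about nearby inputs, which is the situation that makes interpolation unreliable.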

This shift in testing dynamics affects the relationship between 'test' (measurements) and 'evaluation' (assessments). While comprehensive testing and fixed configurations are unrealistic, the need for evaluation remains crucial. This requires a broader approach that gathers independent T&E evidence at the most effective points in the development process to provide the greatest insight. It also requires the T&E professional to interpret the applicability of results beyond the specific test conditions.

To address these challenges, DTE&A is exploring and developing new methods in collaboration with the DoD, academia, and the industrial base to advance T&E practices for AIES. These include increased reliance on Modeling and Simulation (M&S), collaborating with contractors to streamline testing processes while protecting intellectual property, engaging with requirements communities to ensure feasibility, directly involving users in design considerations, and working closely with system designers to balance testing requirements with cost and schedule considerations.

These AIES advancements must be pursued while balancing risk and fiscal responsibility. Expanding testing scope may lead to higher costs and longer schedules, so it is critical to research and mature DT&E practices in collaboration with DoD stakeholders to ensure that timely and effective T&E approaches are applied.

Policy, Guidance and Emerging Guidance/Best Practices

DTE&A is collaborating with stakeholders, within and outside the DoD, to develop policy and guidance to support the T&E of AIES.

DTE&A has begun to address the challenge of formalizing policy for DT&E of AI. This structure is necessary to give service-level T&E organizations clear and attributable expectations, ensuring that the most critical processes and resources can be put in place to enable effective T&E of AIES. Formally approved DTE&A policy is not yet available; in the meantime, please see the guidebook and emerging guidance.

Office of the Under Secretary of Defense,
Research and Engineering (OUSD(R&E))
3030 Defense Pentagon, Washington, DC 20301-3030