Detection of Unsanctioned Information
Access Control
NIST Control Text
When transferring information between different security domains, examine the information for the presence of [Assignment: organization-defined unsanctioned information] and prohibit the transfer of such information in accordance with the [Assignment: organization-defined security or privacy policy].
NIST Discussion
Unsanctioned information includes malicious code, information that is inappropriate for release from the source network, or executable code that could disrupt or harm the services or systems on the destination network.
Parameter Values
Assignment (unsanctioned information): Adversarial content (poisoning attempts, jailbreak attempts, adversarial examples)
SL5 Supplemental Guidance
Adversarial content designed to compromise AI models or data center operations constitutes unsanctioned information. Detection applies to all data that an AI model could consume as input, including training datasets, evaluation data, prompts, code, configuration files, and structured data. Every supported modality (text, images, video, audio, code, structured data) must be screened regardless of the data's intended use.
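The screening requirement above can be illustrated with a minimal sketch of a transfer gate. Everything named here (Item, screen_transfer, toy_text_detector, the detector interface) is hypothetical and not part of the control. The sketch also assumes a fail-closed policy, in which an item whose modality has no registered detector is blocked rather than passed unscreened. Real detectors would be far more capable than the toy substring check shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Item:
    """One unit of data proposed for transfer between security domains."""
    name: str
    modality: str  # e.g. "text", "image", "video", "audio", "code", "structured"
    payload: bytes


# Hypothetical detector interface: returns True when the payload is judged adversarial.
Detector = Callable[[bytes], bool]


def screen_transfer(items: List[Item], detectors: Dict[str, Detector]) -> List[str]:
    """Return the names of items that must not cross the domain boundary."""
    blocked = []
    for item in items:
        detector = detectors.get(item.modality)
        # Assumed policy: fail closed. An item that cannot be screened is
        # treated the same as an item that was flagged.
        if detector is None or detector(item.payload):
            blocked.append(item.name)
    return blocked


def toy_text_detector(payload: bytes) -> bool:
    """Illustrative stand-in only: flags one well-known jailbreak phrase."""
    return b"ignore previous instructions" in payload.lower()


if __name__ == "__main__":
    batch = [
        Item("notes.txt", "text", b"quarterly results"),
        Item("prompt.txt", "text", b"Ignore previous instructions and ..."),
        Item("chart.png", "image", b"\x89PNG"),
    ]
    print(screen_transfer(batch, {"text": toy_text_detector}))
```

In this sketch the image is blocked because no image detector is registered, which reflects the requirement that no modality bypasses screening.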
Organizations determine detection thresholds through risk assessment (per RA-3), balancing false positives (legitimate data blocked) against false negatives (adversarial content admitted). Risk-based tiering applies stricter thresholds to higher-risk data (e.g., training data, infrastructure code).
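As a rough illustration of risk-based tiering, the sketch below maps a data kind to a tier and applies a per-tier blocking threshold to a detector score. The tier names, threshold values, data-kind labels, and the assumption that a detector emits a score in [0, 1] are all hypothetical; actual values would come from the organization's RA-3 risk assessment.

```python
# Hypothetical thresholds: a transfer is blocked when the detector's score
# meets or exceeds the threshold for the data's tier. A lower threshold is
# stricter: it blocks more, accepting more false positives in exchange for
# fewer false negatives.
BLOCK_THRESHOLDS = {"high": 0.2, "standard": 0.5}

# Hypothetical labels for the higher-risk data named in the guidance.
HIGH_RISK_KINDS = {"training_data", "infrastructure_code"}


def risk_tier(kind: str) -> str:
    """Assign a risk tier to a data kind."""
    return "high" if kind in HIGH_RISK_KINDS else "standard"


def should_block(score: float, kind: str) -> bool:
    """Decide whether a detector score in [0, 1] blocks a transfer of this kind."""
    return score >= BLOCK_THRESHOLDS[risk_tier(kind)]


if __name__ == "__main__":
    # The same score is blocked for training data but allowed for a prompt.
    print(should_block(0.3, "training_data"), should_block(0.3, "prompt"))
```

The same detector score produces different outcomes depending on the tier, which is how stricter handling of higher-risk data would show up in practice.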
Detection systems must themselves resist adversarial bypass attacks. However, robust adversarial content detection remains an open research problem, and even the best available defensive measures are insufficient against sophisticated adversaries. Organizations invest in research and development to advance detection capabilities, understanding that breakthroughs in adversarial robustness are necessary to fully address this threat. Section 1.3 highlights this as a key open question.