SOURCES SOUGHT
R -- DEVELOP EVALUATIONS AND BENCHMARKS FOR ASSESSING CYBER CAPABILITIES OF AI MODELS
- Notice Date
- 1/10/2025 12:20:05 PM
- Notice Type
- Sources Sought
- NAICS
- 541690 - Other Scientific and Technical Consulting Services
- Contracting Office
- DEPT OF COMMERCE NIST GAITHERSBURG MD 20899 USA
- ZIP Code
- 20899
- Solicitation Number
- CAW-AISI-0002
- Response Due
- 1/31/2025 12:30:00 PM
- Archive Date
- 02/15/2025
- Point of Contact
- Carol A. Wood, Phone: 3019758172, Fax: 3019756273
- E-Mail Address
- carol.wood@nist.gov
- Description
- This is a SOURCES SOUGHT NOTICE for market research purposes. THIS IS NOT A REQUEST FOR PROPOSALS OR A REQUEST FOR QUOTATIONS.
Artificial Intelligence (AI) is one of the defining technologies of our era. Its emergence, together with its multiplying contexts of use and increasing capabilities, presents enormous opportunities as well as significant present and future harms. To help encourage innovation while protecting against risks from AI, President Biden signed Executive Order 14110 on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence. The Executive Order tasked the National Institute of Standards and Technology (NIST) with establishing guidelines and best practices for developing and deploying safe, secure, and trustworthy AI systems, including "launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm."1 In particular, the Executive Order focuses on models that might pose a risk to "security, national economic security, national public health or safety," among other things, by "enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks."2
The U.S. AI Safety Institute (AISI), housed within the National Institute of Standards and Technology, was created to respond to the priorities assigned to NIST under the Executive Order. AISI is advancing the science, practice, and adoption of AI safety across the spectrum of risks.
As part of this mandate, AISI is conducting pre-deployment testing, evaluation, validation, and verification (TEVV) of frontier models' cyber capabilities and risks.3 In doing so, AISI seeks to ensure that its projects, evaluations, and tools reflect the best available science, and to coordinate closely with a diverse set of AI stakeholders who are developing and conducting evaluations to assess cyber capabilities and risks. NIST is performing market research to identify potential sources for an anticipated contract to assist in developing evaluations and benchmarks of AI models' relevant cyber capabilities and risks.
POTENTIAL CONTRACT REQUIREMENTS
The following requirements are provided to describe the Government's minimum needs. As the acquisition planning process proceeds, requirements may be modified. The Contractor must provide or develop resources for various aspects of assessing frontier AI model cyber capabilities and risks. The Contractor would be responsible for conducting one or more of the tasks in List A in order to assess one or more of the capabilities in List B.
Contractor Tasks:
LIST A - Contractors must provide or develop resources for one or more of the following:
Developing benchmarks and scoring mechanisms for automated evaluation of AI models' relevant cyber capabilities based on real or realistic offensive cyber tasks or workflows;
Developing tasks for automated evaluation of AI models' relevant cyber capabilities with accompanying data on human baseline performance (e.g., how long the tasks take human experts to complete);
Creation of synthetic environments such as cyber ranges that emulate features of realistic networks and support automated grading or scoring for evaluation of AI models' relevant cyber capabilities;
Design and implementation of protocols or methods for evaluating AI models' relevant cyber capabilities;
Development of technical resources or infrastructure for generating tasks and environments and associated scoring mechanisms for evaluating AI models' relevant cyber capabilities;
Development of technical resources or infrastructure for converting existing assets (e.g., publicly available codebases) into evaluations and scoring mechanisms for AI models' relevant cyber capabilities;
Design and implementation of research projects to identify or measure the use of AI models' relevant cyber capabilities in real-world cyber attacks and offensive cyber operations.
LIST B - Relevant frontier model capabilities to elicit, evaluate, and benchmark include:
Capabilities that enable a model to discover vulnerabilities in real or realistic code bases, web resources, or networks;
Capabilities that enable a model to develop working exploits for discovered or known vulnerabilities in real or realistic code bases, web resources, or networks;
Capabilities that enable a model to automate social engineering workflows, such as the generation of targeted phishing content;
Capabilities that enable a model to automate open-source research and/or pre- or post-compromise information gathering for planning and executing cyber attacks;
Capabilities that could significantly uplift or assist threat actors with the development or customization of malware or other offensive cyber tools;
Capabilities that could significantly uplift or assist threat actors in obtaining or operating infrastructure for offensive cyber operations;
Capabilities that could significantly uplift or assist cyber threat actors in evading detection by defensive systems;
Capabilities that could enable models to autonomously perform cyber operations, such as the ability to autonomously complete multiple steps of a cyber attack from gaining initial access to establishing persistence, moving laterally, and evading detection.
INFORMATION FOR SUBMISSION
NIST is seeking responses from responsible sources. Responses are being sought from all business sizes. After the results of this market research are obtained and analyzed, NIST may conduct a competitive procurement and subsequently award a contract. The NAICS code for this effort is 541690, Scientific & Technical Consulting Services. The small business size standard is $19.0M. If a small business set-aside results, a Limitation on Subcontracting clause will be applicable to the anticipated procurement action.
A business meeting the classification of the set-aside cannot pay more than 50% of the amount paid to it by the Government to firms that are not similarly situated.
Responses to this sources sought shall not exceed 15 pages in length and must demonstrate the respondent's capabilities related to the potential government requirements; responses must provide detail beyond a standard capability statement.
The following information must be provided in response to this sources sought notice:
Name of company(ies), their addresses, Unique Entity Identifier (UEI) for the company's active System for Award Management (SAM.gov) website registration, and a point of contact for the company (name, phone number, fax number and email address) that provide the services for which specifications are provided. Interested parties that do not have an active SAM.gov registration are strongly encouraged to immediately begin the registration process. This process can take several weeks to complete. Parties responding to Government solicitations must have an active registration at SAM.gov in order for the proposal to be considered for award.
1. Respondents must possess knowledge and capacity in some or all of the listed tasks A(i)-(vi) included in this sources sought, specifically as they pertain to assessing the relevant AI model capabilities listed in (B) above that may pose a risk of enabling the automation of components of offensive cyber attacks. For each applicable task area A(i)-(vi), Respondents must provide discussion of their knowledge and capacity. Knowledge and capacity in assessing AI-enabled cybersecurity systems for procurement (e.g., detection and monitoring tools that use AI), or in assessing the security of AI models, is not considered applicable.
2. In addition to having knowledge and capacity in some or all of the areas above, Respondents must have:
-Previous experience executing one or more of the actions and activities described above, specifically as they pertain to assessing the cyber capabilities and/or risks of frontier AI models;
-Technical staff with 3 to 5 years of experience working in cybersecurity.
3. Potential labor categories that may be used to accomplish the work. If the Contractor has performed similar work, please provide price information for the work performed. All price information will be utilized for budgeting purposes. All price information will be held by NIST as confidential.
4. Any other relevant information which the Government should consider in developing its minimum specifications and finalizing its market research.
The above information and any other information considered pertinent to this notification must be submitted to Carol Wood, Contracting Officer, no later than January 31, 2025, at 3:30 PM Eastern Time. Email submission is required.
1 https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
2 Id.
3 https://www.nist.gov/system/files/documents/2024/05/21/AISI-vision-21May2024.pdf
4 https://www.federalregister.gov/documents/2023/11/01/2023-24283/safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence
- Web Link
- SAM.gov Permalink (https://sam.gov/opp/b94348fbd44b4e02b293feb244b10ee6/view)
- Record
- SN07310096-F 20250112/250110230108 (samdaily.us)