SOURCES SOUGHT
D -- BTRIS Text De-Identification Software
- Notice Date
- 2/15/2018
- Notice Type
- Sources Sought
- NAICS
- 511210: Software Publishers
- Contracting Office
- Department of Health and Human Services, National Institutes of Health, Clinical Center/Office of Purchasing & Contracts, 6707 Democracy Blvd, Suite 106, MSC 5480, Bethesda, Maryland, 20892-5480
- ZIP Code
- 20892-5480
- Solicitation Number
- NIHCCOPC-18004399
- Archive Date
- 3/17/2018
- Point of Contact
- Kathy D Waugh
- E-Mail Address
- waughk@cc.nih.gov
- Small Business Set-Aside
- N/A
- Description
- Request for Information
Text De-Identification Software
National Institutes of Health
NIH Clinical Center
Biomedical Translational Research Information Systems (BTRIS)

Solicitation Number: NIHCCOPC-18004399
Notice Type: Sources Sought

Synopsis:

THIS IS A SOURCES SOUGHT NOTICE to determine the availability and capability of software vendors to provide currently functioning software to meet programmatic needs. This notice is for planning purposes only, and does not constitute an Invitation for Bids, a Request for Proposals, Solicitation, Request for Quotes, or an indication the Government will contract for the items contained herein. This notice is not to be construed as a commitment on the part of the Government to award a contract, nor does the Government intend to pay for any information submitted as a result of this notice. The Government does not reimburse respondents for any cost associated with submission of the information being requested or reimburse interested parties for expenses incurred in responding to this sources sought notice. Any responses received will not be used as a proposal.

The National Institutes of Health (NIH), Clinical Center (CC), Office of Purchasing and Contracts (OPC) is seeking to identify any sources with capabilities or prior experience that can provide the NIH Clinical Center's Biomedical Translational Research Information System (BTRIS) with the following DRAFT Scope of Work.

The Government anticipates the award of a single software licensing contract with software support for one 12-month base period and four 12-month option periods to extend the effective Period of Performance. The award shall not exceed 60 months if all options are exercised.

Background

The mission of the National Institutes of Health (NIH) is to uncover new knowledge that will lead to better health for everyone.
The NIH accomplishes that mission by conducting research in its own laboratories; supporting the research of non-Federal scientists in universities, medical schools, hospitals, and research institutions throughout the country and abroad; helping in the training of research investigators; and fostering communication of biomedical information.

As an organization, the NIH is a unique medical research facility. Management is intentionally decentralized into 24 Institutes and Centers, yet the individual organizations cooperate and collaborate in a useful and necessary partnership. This controlled decentralization is expected to continue as most feel it provides a fertile environment for individual research and supports the incredible scientific diversity represented at NIH.

The NIH Clinical Center (CC) is a 234-bed federally funded, biomedical research hospital located on the NIH campus in Bethesda, Maryland. The Clinical Center is the delivery setting for NIH intramural clinical research protocols. The Clinical Center accounts for about half of all NIH-funded clinical research beds in the United States and accommodates about 7,000 inpatient and 70,000 outpatient visits a year. Patients are admitted to the NIH Clinical Center from all over the world for the sole purpose of participating in a clinical research protocol. See http://www.nih.gov for additional information on NIH and the various disease-specific institutes that comprise the NIH.

The Biomedical Translational Research Information System (BTRIS) is the National Institutes of Health (NIH) intramural clinical research data repository. BTRIS allows researchers access to data, text and images for over 1800 active clinical research protocols and over 7000 terminated protocols from 1976 to the present. The repository contains over 50 data sources including data from the NIH Clinical Center hospital information systems as well as from institute repositories and clinical trial management systems.
Data from contributing systems are received as discrete data values, as well as text and various image document types. Researchers at the NIH use these data, text and images to manage data from active clinical trials, as well as use data in a de-identified format for hypothesis generation and cohort development through the BTRIS user interface.

Objective

The BTRIS Program is seeking software solutions to enable the automated removal of personally identifiable information (PII) from electronic health care data in unstructured text format. The program seeks currently functional software applications that can redact all or a specified subset of personal identifiers.

Scope

The NIH BTRIS project seeks text de-identification software for the purpose of removing personally identifiable data from clinical research unstructured text data. Text data come in a variety of formats including RTF, XML, plain text, HTML and PDF, and are stored either in databases or as files on the BTRIS file server. These files are from the NIH Clinical Center electronic health record (EHR) or from NIH institute research systems. Examples of text files are clinical documentation summaries of care, discharge summaries, consultation reports, adverse event reports, radiology reports, pathology reports, progress notes, surgical summaries and electronic research case report forms. These documents often have different types of PII data embedded in them such as medical record numbers or patient contact information.

Ad hoc access to the data is a part of the BTRIS feature set provided to authorized individual users as well as to NIH institutes and centers. Internally developed applications can also be given access to extracted data from BTRIS for use by other repositories.
The BTRIS team envisions using an appropriate extraction methodology for taking original text-based data from the respective repository, transferring it to an application server running the de-identification software, and then loading the redacted records back into a different repository containing only de-identified data. That repository will be accessible to users wishing to run de-identified queries, decreasing the risk of PII exposure compared to on-the-fly redaction routines. There will be a series of batch transfers and de-identification of data to populate the new data repository, with anticipated ongoing work to de-identify new inbound text documents on a daily basis as well as to add new sources of unstructured or semi-structured information. The de-identified (redacted) data repository will reside on the same servers as the production storage of identified data.

Requirements

Functional Requirements:

1. The system shall have the ability to remove the following identifying information from text:
   1. Names (First, Middle, Last including hyphenated and multi-part last names). Must be able to demonstrate ability to redact names from all ethnicities.
   2. All geographical subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code, if according to the current publicly available data from the Bureau of the Census: (1) The geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and (2) The initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000.
   3. All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
   4. Phone numbers;
   5. Fax numbers;
   6. Electronic mail addresses;
   7. Social Security numbers;
   8. Medical record numbers;
   9. Health plan beneficiary numbers;
   10. Account numbers;
   11. Certificate/license numbers;
   12. Vehicle identifiers and serial numbers, including license plate numbers;
   13. Device identifiers and serial numbers;
   14. Web Universal Resource Locators (URLs);
   15. Internet Protocol (IP) address numbers;
   16. Biometric identifiers, including finger and voice prints;
   17. Full face photographic images and any comparable images; and
   18. Any other unique identifying number, characteristic, or code (note this does not mean the unique code assigned by the investigator to code the data);
   19. Specific identifiers to the NIH environment such as provider names, nursing units, or protocol names that may be deemed to identify a subject (e.g. exemption or compassionate use protocols).
2. Data shall be coded such that the de-identified data can be linked back to the original identified data set - therefore the data in the BTRIS de-identified database shall be considered "coded". The code will not be exposed to the user but shall be used to ensure the correct association and terms of use associated with the original data source. Any code used to replace the identifiers in datasets cannot be derived from any information related to the individual and the master codes. The respondent shall detail how the application would provide a coding system for data elements.
3. The system shall have the ability to de-identify data based on RDBMS lists of PII, such as patient names and medical record numbers, currently stored in the BTRIS database.
4. The system shall preserve medical terms (eponymy) based on UMLS metathesaurus terms and additional medical dictionaries if needed.
5. The system shall replace identifiers such that they can be tracked and are consistent within runs. For example, SP20-4567D could be changed to [ID 1] as long as ID 1 always refers to the same identifier within a single run.
6. The system shall maintain a history of applied replacements for testing and quality assurance purposes.
7. The system shall maintain consistency of subject replacement across queries run in the same batch and/or job. For example, a document batch may consist of 10 various text files. Subject "Robert Smith" shall be assigned a redaction of [SUBJECT 1] and [SUBJECT 1] shall be the same subject across all 10 documents.
8. The system shall be able to utilize two different redaction modalities:
   a. Redaction replacing a subject name such as "Robert Smith" with [SUBJECT 1], an address of "21 Spruce Street" with [STREET ADDRESS] and so on.
   b. OR -- Redaction replacing a subject name such as "Robert Smith" with [Anthony Jones], an address of "21 Spruce Street" with [10 Cedar Street] and so on.
9. The system shall use advanced heuristics, rule sets and lexical and syntactic context to de-identify data.
10. Dates shall have the ability to be replaced based on context.
   a. Some dates will require offsets to maintain narrative integrity. This should be based on a parameter set by BTRIS such as a patient's ID.
   b. Other dates (such as exam dates) shall have the capability to not be offset to ensure the ability to analyze the temporal nature of events within the scope of limited data sets.
   c. The ability to control dates as actual or consistently offset shall be determined by the use case.
11. The efficacy of system performance shall demonstrate at least 90% recall and 90% precision statistics for de-identification of unstructured text.
12. The software shall have the ability to run daily on new documents on the file server, based on filters defining new data sets.

Technical Requirements:

1. The software shall have the demonstrated ability to work in the Windows or Linux environments.
2. The software should be able to function on physical servers or in the cloud using VMs.
3. The system shall have auditing/logging capabilities via a user interface.
4. BTRIS stores a number of tables of subject names, addresses, medical record numbers and other forms of PII. The software proposed shall be able to access and utilize these lists in an automated fashion. These lists may be stored in a relational database (e.g. RDBMS tables) or in a non-relational database such as MongoDB.
5. The software should have the demonstrated capability of using API calls to or from other application platforms such as Elasticsearch, Apelon DTS terminology system or a FHIR terminology server.
6. Software bug fixes shall be made available on a routine basis.
7. Software enhancements shall be provided as a part of the maintenance agreement.
8. System documentation shall be made available in electronic format.
9. The software shall have the ability to process data provided through an encrypted pipeline.
10. If data are stored on the application server, the software application shall be able to process data that are encrypted at rest.

User Support Requirements:

1. The user support help desk shall be available during normal business hours (8-5 ET) Monday through Friday via toll-free number or e-mail.
2. A tiered-response system shall be used to address issues. When the software is not functional, a response and plan for remediation shall be available within four hours.
3. User support shall commence upon delivery of the software product.

Software Use:

The software may be used by any subdivision of the licensing agency (service, bureau, division, command, etc.) that has access to the site the software is placed at, even if the subdivision did not participate in the acquisition of the software, so long as the number of licensed users is appropriately accounted for in the license (as applicable). The Government shall have the right to use the computer software and documentation with the computer for which it is acquired at any facility to which that computer may be transferred. The Government shall also have the right to use the computer software and documentation with a back-up computer when the primary computer is inoperative, and to copy computer programs for safekeeping (archives) or back-up purposes.

508 Requirements

Section 508 applies to this requirement. All Electronic and Information Technology (EIT) procured through this procurement must meet the applicable accessibility standards at 36 CFR 1194, unless an agency exception to this requirement exists. 36 CFR 1194 implements Section 508 of the Rehabilitation Act of 1973, as amended, and is viewable at http://accessboard.gov/sec508/508standards.htm Part 1194. Contractors are now responsible for indicating on each line item in the procurement whether products or services are compliant or noncompliant with the accessibility standards at 36 CFR 1194.

Section 508 Program Need - Requirements for accessibility based on Section 508 of the Rehabilitation Act of 1973 (29 U.S.C. 794d) are determined to be relevant for the following program need: "Commercial off the Shelf Software (COTS)."

Section 508 Deliverable Requirements - Technical standards from 36 CFR part 1194 Subpart B have been determined to apply to this acquisition. Solicitation respondents must describe how their proposed Electronic and Information Technology (EIT) deliverables meet at least those technical provisions identified as applicable in the attached Government Product/Service Accessibility Template (GPAT).
Functional performance criteria from 36 CFR part 1194 Subpart C have been determined to apply to this acquisition. Solicitation respondents must describe how their proposed Electronic and Information Technology (EIT) deliverables meet at least those functional performance criteria identified as applicable in the attached Government Product/Service Accessibility Template (GPAT).

Information, documentation, and support requirements from 36 CFR part 1194 Subpart D have been determined to apply to this acquisition. Solicitation respondents must describe how the information, documentation, and support proposed for Computer Server deliverables meet at least those information, documentation, and support requirements identified as applicable in the attached Government Product/Service Accessibility Template (GPAT).

Research on De-identification Methodologies

For the avoidance of doubt, the BTRIS project will be entitled to utilize the software application for, among other things, the conduct of comparative research on various de-identification methodologies.

RFI Responses:

PLACE OF PERFORMANCE:
NIH Clinical Center
10 Center Drive
Bethesda, MD 20892

Companies are encouraged to respond if they have the capability and capacity to provide the identified products and services with little or no disruption of services to the current users of the BTRIS system. Interested small business potential Offerors are encouraged to respond to this notice. However, be advised that generic capability statements are not sufficient for effective evaluation of respondents' capacity and capability to perform the specific work as required. Pricing information is encouraged, but not required.
Responses must directly demonstrate the company's capability, experience, and/or ability to marshal resources to effectively and efficiently perform each of the tasks described above at a sufficient level of detail to allow definitive numerical evaluation; and evidence that the contractor can satisfy the minimum requirements listed above while in compliance with FAR 52.219-14 ("Limitations on Subcontracting"). Failure to definitively address each of these factors will result in a finding that respondent lacks capability to perform the work.

Responses to this notice shall be limited to 20 pages, and must include:

1. Company name, mailing address, e-mail address, telephone and fax numbers, website address (if available), and the name, telephone number, and e-mail address of a point of contact having the authority and knowledge to clarify responses with Government representatives.
2. Name, title, telephone number, and e-mail addresses of individuals who can verify the demonstrated capabilities identified in the responses.
3. Business size for a NAICS: 541512 with size limitation standards of the employees and status, if qualified as an 8(a) firm (must be certified by the Small Business Administration (SBA)), Small Disadvantaged Business (must be certified by SBA), Woman-Owned Small Business, HUBZone firm (must be certified by SBA), and/or Service-Disabled Veteran Owned Small Business (must be listed in the VetBiz Vendor Information Pages).
4. DUNS number, CAGE Code, Tax Identification Number (TIN), and company structure (Corporation, LLC, partnership, joint venture, etc.). Companies also must be registered in the System for Award Management (SAM) at www.sam.gov to be considered as potential sources.
5. Identification of the firm's government-wide acquisition contract (GWAC) by Schedule number and contract number and/or SINs that are applicable to this potential requirement are also requested (e.g. GSA, NITAAC, NASA SEWP).
Please submit copies of any documentation, such as letters or certificates to indicate the firm's status (see item #3 above). Teaming arrangements are acceptable, and the information required above on the company responding to this announcement shall also be provided for each entity expected to be teammates of the respondent for performance of this work.

To the maximum extent possible, please submit non-proprietary information. Any proprietary information submitted should be identified as such and will be properly protected from disclosure.

Interested Offerors should submit their capability statement not exceeding twenty (20) pages in length, excluding standard brochures.

SUBMISSIONS ARE DUE no later than 9:00 a.m., Eastern Time, March 2, 2018. The capabilities response shall be e-mailed to: Contract Specialist, Kathy Waugh (waughk@cc.nih.gov).

All information received in response to this notice that is marked Proprietary will be handled accordingly. Information provided in response to this notice will be used to assess alternatives available for determining how to proceed in the acquisition process. This notice is part of Government Market Research, a continuous process for obtaining the latest information on the commercial status of the industry with respect to their current and near-term abilities. The information provided herein is subject to change and in no way binds the Government to solicit for or award a competitive contract. The NIH/CC will use the information submitted in response to this notice at its discretion and will not provide comments to any submission; however, the content of any responses to this notice may be reflected in subsequent solicitation. NIH/CC/OPC reserves the right to contact any respondent to this notice for the sole purpose of enhancing NIH/CC/OPC's understanding of the notice submission.
This announcement is Government market research, and may result in revisions in both its requirements and its acquisition strategy based on industry responses. It is emphasized that this is a notice for planning and information purposes only and is not to be construed as a commitment by the Government to enter into a contractual agreement, nor will the Government pay for information solicited.

Responses to this Sources Sought Notice shall be submitted before the closing date and time to the attention of the Contract Specialist listed below.

Contracting Office Address:
6707 Democracy Blvd, Suite 106, MSC 5480
Bethesda, Maryland 20892-5480

Place of Performance:
NIH Clinical Center
10 Center Drive
Bethesda, Maryland 20892
United States

Primary Point of Contact:
Name: Kathy Waugh
Title: Contract Specialist
Email: waughk@cc.nih.gov
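
Illustrative note on the functional requirements: requirements 5 through 7 call for placeholders that stay consistent within a run and for a history of applied replacements, and requirement 10a calls for date offsets keyed to a BTRIS-set parameter such as a patient's ID. The Python sketch below is one possible reading of that behavior, offered only as a minimal illustration. It does not describe any vendor product or the BTRIS system. The names `BatchRedactor`, `date_offset_days`, and `shift_date`, the secret key, the numbering of every placeholder kind, and the exact-match lookup against a known PII list are all assumptions made for the example; a real system would also need the heuristics and contextual rules named in requirement 9.

```python
import hashlib
import hmac
import re
from datetime import date, timedelta


class BatchRedactor:
    """Hypothetical helper: stable numbered placeholders within one batch run."""

    def __init__(self):
        self._placeholders = {}  # (kind, lowercased value) -> placeholder text
        self._counters = {}      # kind -> highest number issued so far

    def placeholder(self, kind, value):
        """Return the placeholder for a value, issuing a new one on first sight."""
        key = (kind, value.lower())
        if key not in self._placeholders:
            self._counters[kind] = self._counters.get(kind, 0) + 1
            self._placeholders[key] = f"[{kind} {self._counters[kind]}]"
        return self._placeholders[key]

    def redact(self, text, known_values):
        """Replace known PII values; known_values is a list of (kind, value) pairs."""
        lookup = {value.lower(): kind for kind, value in known_values}
        if not lookup:
            return text
        # Longest values first so a full name wins over any shorter overlap.
        longest_first = sorted(lookup, key=len, reverse=True)
        pattern = re.compile(
            r"(?<!\w)(?:" + "|".join(re.escape(v) for v in longest_first) + r")(?!\w)",
            re.IGNORECASE,
        )
        return pattern.sub(
            lambda m: self.placeholder(lookup[m.group(0).lower()], m.group(0)), text
        )

    def history(self):
        """Copy of the replacements applied so far, for QA review."""
        return dict(self._placeholders)


def date_offset_days(patient_id, secret_key, max_days=365):
    """Assumed scheme: a keyed hash of the patient ID picks a fixed backward shift."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).digest()
    return -(1 + int.from_bytes(digest[:4], "big") % max_days)


def shift_date(original, patient_id, secret_key):
    """Shift a date by the patient's fixed offset, preserving intervals."""
    return original + timedelta(days=date_offset_days(patient_id, secret_key))


if __name__ == "__main__":
    known = [("SUBJECT", "Robert Smith"), ("ID", "SP20-4567D")]
    redactor = BatchRedactor()
    for doc in ["Robert Smith was seen today.",
                "Specimen SP20-4567D from robert smith."]:
        print(redactor.redact(doc, known))
    print(shift_date(date(2017, 3, 14), "patient-001", b"demo-key"))
```

In this sketch, both documents in the batch map "Robert Smith" to [SUBJECT 1] because the same `BatchRedactor` instance is reused, and every date for a given patient moves by the same number of days, so the intervals between events are preserved.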
- Web Link
- FBO.gov Permalink (https://www.fbo.gov/spg/HHS/NIH/CCOPC/NIHCCOPC-18004399/listing.html)
- Place of Performance
- Address: NIH Clinical Center, 10 Center Drive, Bethesda, Maryland, 20892, United States
- Zip Code: 20892
- Record
- SN04824413-W 20180217/180215231457-958ce5d546349b74fb3daaf2017d7368 (fbodaily.com)
- Source
- FedBizOpps Link to This Notice (may not be valid after Archive Date)