AWARD
A -- Low Resource Languages for Emergent Incidents (LORELEI)
- Notice Date
- 8/26/2015
- Notice Type
- Award Notice
- NAICS
- 541712 -- Research and Development in the Physical, Engineering, and Life Sciences (except Biotechnology)
- Contracting Office
- Other Defense Agencies, Defense Advanced Research Projects Agency, Contracts Management Office, 675 North Randolph Street, Arlington, Virginia, 22203-2114, United States
- ZIP Code
- 22203-2114
- Solicitation Number
- DARPA-BAA-15-04
- Archive Date
- 9/9/2015
- Point of Contact
- Kelly Widmaier
- E-Mail Address
-
kellywid@cs.cmu.edu
- Small Business Set-Aside
- N/A
- Award Number
- HR0011-15-C-0114
- Award Date
- 8/25/2015
- Awardee
- Carnegie Mellon University
- Award Amount
- $4,874,583
- Description
- The United States Government operates globally and frequently encounters low-resource languages for which no automated human language technology capability exists. The goal of the Low Resource Languages for Emergent Incidents (LORELEI) Program is to dramatically advance the state of computational linguistics and human language technology to enable rapid, low-cost development of capabilities for low-resource languages. These capabilities will be exercised to provide situational awareness based on information from any language, in support of emergent missions such as humanitarian assistance/disaster relief, peacekeeping, or infectious disease response. Historically, development of technology for automated exploitation of foreign language materials has required protracted effort and a large data investment. Current methods can require multiple years and tens of millions of dollars per language (mostly to construct translated or transcribed corpora). As a result, human language technology systems exist primarily for languages in widespread use or in high demand. With more than 7,000 languages in the world and the difficulty of predicting the next language for which technology will be needed, universal human language technology coverage by existing means is an unattainable goal. The LORELEI Program aims to change this state of affairs by targeting research and development of human language technology that eliminates the current reliance on huge, manually-translated, manually-transcribed, or manually-annotated corpora and turns instead to leveraging language-universal resources, projecting from related-language resources, and fully exploiting a broad range of language-specific resources. The technologies resulting from LORELEI research will be capable of supporting situational awareness based on low-resource foreign language sources within an extremely short time frame - starting as soon as 24 hours after a new language requirement emerges.
With the understanding that even with perfect translation, there would still be too much material for analysts to use effectively, LORELEI research will not be focused solely on Machine Translation. While LORELEI technologies may include partial or full Automated Speech Recognition and/or Machine Translation, the overall goal will not be translating foreign language material into English, but providing situational awareness by identifying elements of information in foreign language and English sources, such as topics, names, events, sentiment, and relationships.
Program Structure
It is anticipated that the LORELEI Program will consist of three phases. Phase 1 will be 24 months long, while Phases 2 and 3 will be 12 months long each. The structure of the LORELEI Program will include the following Technical Areas (TAs) of interest:
TA1 - Algorithm Research and Development Environment
TA2 - Run-time Framework Development
TA3 - Linguistic Resource Creation
Carnegie Mellon University, teaming with the University of Washington, Hong Kong University, Leidos and the University of Melbourne, will develop the "Analysis of Rare Incident-Event Languages (ARIEL)" framework for TA1.4 of the DARPA LORELEI program. The ARIEL framework will provide an effective suite of algorithms and methods to cope with any low-resource incident language, starting with rudimentary but useful functionality in 24 hours, and increasingly useful and sophisticated capabilities in a week, a month and beyond. Key capabilities to be developed include: topic identification in written and spoken language; identification of key entities; machine translation, starting from rough gisting in 1 to 7 days with increasing capability over time; event detection; and, when possible, entity and event coreference.
These capabilities will be delivered through a flexible, omnivorous architecture and interfaced to the analyst in collaboration with the TA2 performer. The architecture will also be able to incorporate external TA1.1, TA1.2 and TA1.3 modules from other performers.
- Web Link
-
FBO.gov Permalink
(https://www.fbo.gov/spg/ODA/DARPA/CMO/Awards/HR0011-15-C-0114.html)
- Record
- SN03858269-W 20150828/150827001055-341db69554f6d4cc3533aec38634eacb (fbodaily.com)