Introducing NLP Architect by Intel AI Lab

Many recent advances in Natural Language Processing (NLP) and Natural Language Understanding (NLU) have been driven by progress in deep learning: more powerful compute resources, greater access to useful data sets, and new neural network topologies and training paradigms. At Intel AI Lab, our team of NLP researchers and developers has been exploring state-of-the-art deep learning topologies and techniques for NLP and NLU. Today, we are introducing NLP Architect, an open source library that we are sharing with the community as a platform for future research and collaboration.

NLP Architect Overview

The current version of NLP Architect collects features that we found interesting from both research and practical application perspectives, including:

  • NLP core models that allow robust extraction of linguistic features for NLP workflow: for example, dependency parser (BIST) and NP chunker
  • NLU modules that provide best-in-class performance: for example, intent extraction (IE) and named entity recognition (NER)
  • Modules that address semantic understanding: for example, collocations, most common word sense, and NP embedding representation (e.g. NP2V)
  • Components instrumental for conversational AI: for example, ChatBot applications, including dialog system, sequence chunking, and IE
  • End-to-end DL applications using new topologies: for example, Q&A, machine reading comprehension

All of the above models are provided with end-to-end examples of training and inference. In addition, we’ve included functionality that is often needed when deploying these models, such as data pipelines, common function calls, and NLP-related utilities. The library is modularized for easy integration. We see these components as the building blocks needed to implement NLP use cases, chosen based on our practical research experience.

This open and flexible library of NLP components provides the foundation for us to enable NLP solutions with our partners and customers. We are actively incorporating new results from our research and data science work into this stack so that everyone can re-use what we’ve built and optimized. The library also gives us a platform for analyzing and optimizing Intel software and hardware on NLP workloads.

Some of the components that ship with pre-trained models are exposed as REST service APIs through the NLP Architect server. The server is designed to serve predictions from different models in NLP Architect. It also includes a web front-end that visualizes the model annotations. Currently, we provide two services: BIST dependency parsing and NER annotation. We also provide a template for developers to add a new service.
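As a rough illustration of what calling such a REST service could look like from a client, here is a minimal Python sketch using only the standard library. The server address, the /inference route, and the payload fields (model_name, docs, id, doc) are all assumptions made for this example and are not taken from NLP Architect itself, so please check the documentation for the actual interface. The sketch builds the request and prints it without contacting a server.

```python
import json
from urllib import request

# Hypothetical address and route: the real server's host, port, and path
# are not given in this post.
SERVER_URL = "http://localhost:8080/inference"


def build_request(model_name, docs, url=SERVER_URL):
    """Build (but do not send) a JSON POST request for an annotation service."""
    payload = {
        "model_name": model_name,  # assumed field name
        # Assumed document schema: one entry per input text.
        "docs": [{"id": i, "doc": text} for i, text in enumerate(docs)],
    }
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def annotate(model_name, docs, url=SERVER_URL, timeout=10):
    """Send the request to a running server and decode its JSON response."""
    req = build_request(model_name, docs, url)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Inspect the request locally; no network access is needed for this part.
req = build_request("ner", ["Intel AI Lab released NLP Architect."])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

With a server actually running, a call such as annotate("ner", ["some text"]) would send the request; the shape of the response depends on the service.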

Getting Started

Developers can start by downloading the code from our GitHub repository and following the instructions to install NLP Architect. Comprehensive documentation for all the core modules and end-to-end examples can be found here. We look forward to receiving feedback, feature requests, and pull request contributions from all users.

Next steps

In our previous blog, we discussed how building a stack of NLP components based on the latest DL technologies gives us a foundation for tackling many applications for our partners and customers. It also enables us to continuously incorporate new results from our research and data science work into the stack. In future releases, we plan to demonstrate these advantages with solutions including sentiment extraction, topic and trend analysis, term set expansion, and relation extraction. We are also researching unsupervised and semi-supervised methods that will be introduced into interpretable NLU models and domain-adaptive NLP solutions.

Acknowledgments

Credit goes to our team of NLP researchers and developers at Intel AI Lab: Peter Izsak, Anna Bethke, Daniel Korat, Amit Yaccobi, Andy Keller, Jonathan Mamou, Shira Guskin, Sharath Nittur Sridhar, Oren Pereg, Alon Eirew, Sapir Tsabari, Yael Green, and Chinnikrishna Kothapalli.

Notices and Disclaimers

Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© Intel Corporation