NLP Flow Processors

NLP Flow contains several custom processors for creating NLP pipelines. The processors are described below.

NLP Building Blocks Processors

| Processor | Description | Added in Version |
| --- | --- | --- |
| RenkuLanguageDetectionEngine | Provides language detection capabilities via Renku Language Detection Engine. | 1.0.0 |
| ProseSentenceExtractionEngine | Provides sentence extraction capabilities via Prose Sentence Extraction Engine. | 1.0.0 |
| SonnetTokenizationEngine | Provides string tokenization capabilities via Sonnet Tokenization Engine. | 1.0.0 |
| IdylE3EntityExtractionEngine | Provides named-entity extraction capabilities via Idyl E3 Entity Extraction Engine. | 1.0.0 |

Entity Store and Persistence Processors

| Processor | Description | Added in Version |
| --- | --- | --- |
| RyaIngest | Provides entity storage into Apache Rya. | 1.2.0 |
| AllegroGraphIngest | Provides entity storage into AllegroGraph. | 1.3.0 |
| AmazonNeptuneIngest | Provides entity storage into Amazon Neptune. | 1.3.0 |

Entity Filtering and Querying Processors

| Processor | Description | Added in Version |
| --- | --- | --- |
| EntityQueryLanguage | Provides entity filtering via EQL (Entity Query Language). | 1.0.0 |

Entity Transformation and Utility Processors

| Processor | Description | Added in Version |
| --- | --- | --- |
| ConvertEntityResponseToNQuads | Converts an Idyl E3 entity extraction response into N-quads for ingestion into a triple store. | 1.3.0 |

Processor Documentation

NLP Building Blocks Processors

RenkuLanguageDetectionEngine Processor

The RenkuLanguageDetectionEngine processor provides language detection capabilities via Renku Language Detection Engine. The processor accepts natural language text as input, sends the text to the instance of Renku Language Detection Engine defined in the processor’s settings, and outputs a JSON array of candidate languages along with the probability of each language.

This processor is often followed by a RouteOnAttribute processor to direct the flow based on the input text’s language.
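
As a rough illustration of how a downstream step might choose a language before routing, the sketch below parses a detection result and picks the most probable entry. The field names `language` and `probability`, the language codes, and the sample payload are assumptions made for this example; the actual response shape may differ.

```python
import json
from typing import Optional


def most_probable_language(detection_json: str) -> Optional[str]:
    """Return the language with the highest probability, or None if the array is empty."""
    candidates = json.loads(detection_json)
    if not candidates:
        return None
    # Assumed shape: each element has "language" and "probability" keys.
    best = max(candidates, key=lambda c: c["probability"])
    return best["language"]


# Hypothetical processor output; the real field names may differ.
sample = '[{"language": "eng", "probability": 0.92}, {"language": "deu", "probability": 0.05}]'
print(most_probable_language(sample))  # prints eng
```

A value chosen this way could then be stored as a flow file attribute for a RouteOnAttribute processor to match on.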

ProseSentenceExtractionEngine Processor

The ProseSentenceExtractionEngine processor provides sentence extraction capabilities via Prose Sentence Extraction Engine. The processor accepts as input natural language text, sends the text to an instance of Prose Sentence Extraction Engine as defined in the processor’s settings, and outputs the individual sentences as a JSON array.

SonnetTokenizationEngine Processor

The SonnetTokenizationEngine processor provides string tokenization capabilities via Sonnet Tokenization Engine. The processor accepts as input natural language text (ideally a single sentence), sends the text to an instance of Sonnet Tokenization Engine as defined in the processor’s settings, and outputs the individual tokens as a JSON array.

IdylE3EntityExtractionEngine Processor

The IdylE3EntityExtractionEngine processor provides named-entity extraction capabilities via Idyl E3 Entity Extraction Engine. The processor accepts as input tokenized text (ideally tokens for a single sentence), sends the text to an instance of Idyl E3 Entity Extraction Engine as defined in the processor’s settings, and outputs a JSON entity extraction response that contains the extracted entities.
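
The notes about "a single sentence" and "tokens for a single sentence" suggest a sentence, then token, then entity ordering. The sketch below shows only the shape of the data passed between those stages. The two functions are naive local stand-ins written for this example, not the Prose or Sonnet engines, and their names are hypothetical.

```python
import json
import re


def extract_sentences(text: str) -> str:
    """Stand-in for sentence extraction: returns a JSON array of sentences."""
    # Naive split on whitespace that follows sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text)
    return json.dumps([p.strip() for p in parts if p.strip()])


def tokenize(sentence: str) -> str:
    """Stand-in for tokenization: returns a JSON array of tokens."""
    # Naive whitespace split; a real tokenizer would also separate punctuation.
    return json.dumps(sentence.split())


text = "George Washington was president. He lived in Virginia."

# One sentence at a time is tokenized; each token array would be the
# input handed to the entity extraction step.
for sentence in json.loads(extract_sentences(text)):
    tokens = json.loads(tokenize(sentence))
    print(tokens)
```

Processing one sentence per flow file keeps each entity extraction request small and matches the "ideally a single sentence" guidance above.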

RyaIngest Processor

The RyaIngest processor facilitates the ingest of entities into an Apache Rya database. Required configuration settings include the Apache Rya API endpoint. The ConvertEntityResponseToNQuads processor often precedes this processor in an NLP pipeline.

AllegroGraphIngest Processor

The AllegroGraphIngest processor facilitates the ingest of entities into an AllegroGraph database. The ConvertEntityResponseToNQuads processor often precedes this processor in an NLP pipeline.

AmazonNeptuneIngest Processor

The AmazonNeptuneIngest processor facilitates the ingest of entities into an Amazon Neptune database. The ConvertEntityResponseToNQuads processor often precedes this processor in an NLP pipeline.

EntityQueryLanguage Processor

The EntityQueryLanguage processor provides entity filtering capabilities through the use of the Entity Query Language (EQL).

ConvertEntityResponseToNQuads Processor

The ConvertEntityResponseToNQuads processor provides the ability to convert an entity extraction response to N-quads suitable for ingest into a triple store.
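
To make the conversion concrete, here is a minimal sketch of turning an entity response into N-quads lines, where each line has the form subject, predicate, object, graph, followed by a period. The response shape (`entities`, `text`, `type`), the `http://example.org/` IRIs, and the function names are all assumptions for illustration; the processor's actual vocabulary and output are not documented here.

```python
import json


def escape_literal(value: str) -> str:
    """Escape characters that cannot appear raw inside a quoted literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_nquads(response_json: str,
              graph: str = "http://example.org/graph/entities",
              base: str = "http://example.org/") -> str:
    """Build one statement per entity field; all IRIs here are hypothetical."""
    response = json.loads(response_json)
    lines = []
    for i, entity in enumerate(response.get("entities", [])):
        subject = f"<{base}entity/{i}>"
        for field in ("text", "type"):
            literal = escape_literal(str(entity[field]))
            lines.append(f'{subject} <{base}{field}> "{literal}" <{graph}> .')
    return "\n".join(lines)


# Hypothetical entity extraction response.
sample = '{"entities": [{"text": "George Washington", "type": "person"}]}'
print(to_nquads(sample))
```

Putting every statement in a named graph, as the fourth term does here, is what distinguishes quads from plain triples and lets a store group the entities from one ingest.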