Idyl E3 Entity Extraction Engine is an all-in-one solution for performing entity extraction from natural language text. It takes in unmodified natural language text and runs it through a pipeline that identifies the language of the text, finds the sentences in the text, tokenizes those sentences, and extracts entities from those tokens. It’s not exactly what you would call a microservice: the archives for version 2.6.0 are nearly 1 GB in size.
With the introduction of the NLP Building Blocks earlier this year, we began breaking up Idyl E3 into a set of smaller services that each perform one of its individual functions. Renku identifies languages, Prose extracts sentences, and Sonnet performs tokenization. Joining the mix soon with its first release will be Lacuna, which classifies documents. Lacuna can be used to route documents through your NLP pipelines based on their content. Each of these applications is small (less than 30 MB), stateless, and horizontally scalable. Using these building blocks instead of the all-in-one Idyl E3 gives you much more flexibility, because you can now assemble a custom NLP pipeline from loosely connected microservices.
With that said, Idyl E3 3.0 will become a microservice whose only function is to perform entity extraction. This will dramatically cut Idyl E3’s deployment size, making it easier to deploy and manage. Like the other building blocks, Idyl E3 3.0 will be available as a Docker container. Because Idyl E3’s functionality will be trimmed down, its pricing will also be reduced. Stay tuned for the updated pricing.
To help bring the NLP building blocks together in a pipeline, we have made the nlp-building-blocks-java-sdk available on GitHub. It includes clients for each product’s API. The project, which is licensed under the Apache License 2.0, also includes the ability to tie the clients together in a pipeline. This is a Java project, but we hope to eventually have similar projects available for other languages.
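To give a rough idea of what tying the building blocks together looks like, here is a minimal sketch of the pipeline flow in Java. It is not the SDK’s actual API: the interface names (`LanguageDetector`, `SentenceExtractor`, `Tokenizer`, `EntityExtractor`), the `NlpPipelineSketch` class, and the local stand-in implementations are all hypothetical and exist only for illustration. In a real pipeline each stage would presumably be backed by a client that calls Renku, Prose, Sonnet, or Idyl E3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: these types are NOT taken from nlp-building-blocks-java-sdk.
final class NlpPipelineSketch {

    // One small interface per building block (all names are assumptions).
    interface LanguageDetector { String detect(String text); }
    interface SentenceExtractor { List<String> sentences(String text, String language); }
    interface Tokenizer { List<String> tokenize(String sentence, String language); }
    interface EntityExtractor { List<String> extract(List<String> tokens, String language); }

    private final LanguageDetector languageDetector;
    private final SentenceExtractor sentenceExtractor;
    private final Tokenizer tokenizer;
    private final EntityExtractor entityExtractor;

    NlpPipelineSketch(LanguageDetector languageDetector, SentenceExtractor sentenceExtractor,
                      Tokenizer tokenizer, EntityExtractor entityExtractor) {
        this.languageDetector = languageDetector;
        this.sentenceExtractor = sentenceExtractor;
        this.tokenizer = tokenizer;
        this.entityExtractor = entityExtractor;
    }

    // Language -> sentences -> tokens -> entities, in the order the post describes.
    List<String> run(String text) {
        String language = languageDetector.detect(text);
        List<String> entities = new ArrayList<>();
        for (String sentence : sentenceExtractor.sentences(text, language)) {
            List<String> tokens = tokenizer.tokenize(sentence, language);
            entities.addAll(entityExtractor.extract(tokens, language));
        }
        return entities;
    }

    // Naive local stand-ins so the sketch runs without any services.
    static NlpPipelineSketch withLocalStandIns() {
        Set<String> knownEntities = new HashSet<>(Arrays.asList("Washington", "Virginia"));
        return new NlpPipelineSketch(
            text -> "eng",                                                  // always reports English
            (text, lang) -> Arrays.asList(text.split("(?<=[.!?])\\s+")),    // split after . ! ?
            (sentence, lang) -> Arrays.asList(
                sentence.replaceAll("[.!?,]", "").split("\\s+")),           // strip punctuation, split on whitespace
            (tokens, lang) -> tokens.stream()
                .filter(knownEntities::contains)                            // lookup against a tiny fixed list
                .collect(Collectors.toList()));
    }

    public static void main(String[] args) {
        String text = "George Washington was president. He lived in Virginia.";
        System.out.println(withLocalStandIns().run(text));
    }
}
```

Because every stage sits behind its own interface, any one of them can be swapped for a different implementation (or a different service) without touching the rest of the pipeline, which is the kind of flexibility the building blocks are meant to provide.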
We are very excited to take this path of making NLP building block microservices. We believe it provides awesome flexibility and control over your NLP pipelines.
Also published on Medium.