We have open-sourced our NLP library and its associated projects on GitHub. The library, Idyl NLP, is a Java natural language processing library. It is licensed under the Apache License, version 2.0.
Idyl NLP stands on the shoulders of giants to provide a capable and flexible NLP library. Utilizing components such as OpenNLP and DeepLearning4j under the hood, Idyl NLP offers various implementations for NLP tasks such as language detection, sentence extraction, tokenization, named-entity extraction, and document classification.
Idyl NLP has its own webpage at http://idylnlp.ai and is available in Maven Central under the group ai.idylnlp.
Here are the GitHub project links:
Idyl NLP powers our NLP building block microservices and they are also open source on GitHub:
NLP Models and Model Zoo
Idyl NLP can automatically download NLP models when they are needed. The Idyl NLP Models repository contains model manifests for various NLP models. Through a manifest file, Idyl NLP can download the model file the manifest references and use it. The service powering this capability is the Idyl NLP Model Zoo, which will soon be hosted at zoo.idylnlp.ai. It is a Spring Boot application that provides a REST interface for querying and downloading models, so you can also run your own model zoo for internal use. See these two repositories on GitHub for more information about the available models and the model zoo. Models will become available through the repository in the coming days.
There are some sample projects available for Idyl NLP. The samples illustrate how to use some of Idyl NLP’s core capabilities and hopefully provide starting points for using Idyl NLP in your projects.
We are committed to further developing Idyl NLP and its ecosystem, and we welcome the community’s contributions to help it grow. We hope that the business-friendly Apache license helps Idyl NLP’s adoption. Like most software engineers, we are a bit behind on documentation. In the near term we will be focusing on the wiki, javadocs, and the sample projects. Our NLP Building Blocks will continue to be powered by Idyl NLP.
For questions or more information please contact firstname.lastname@example.org.
The Idyl E3 Java and .NET Client SDKs have been updated for Idyl E3 2.5.0. Check them out on GitHub:
They both include support for Idyl E3’s new text TCP streaming endpoint.
We are happy to let you know how Idyl E3 Entity Extraction Engine can be used with Apache NiFi. First, what is Apache NiFi? From the NiFi homepage: “Apache NiFi supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic.” Idyl E3 extracts entities (persons, places, things) from natural language text.
That’s a very short description of NiFi, but it is very accurate. Apache NiFi allows you to configure simple or complex data processing pipelines. For example, you can configure a pipeline to consume files from a file system and upload them to S3. (See Example Dataflow Templates.) There are many operations available, and they are performed by components called Processors. There are many excellent guides available about NiFi, such as:
There are many processors available for NiFi out of the box. One in particular is the InvokeHttp processor that lets your pipeline send an HTTP request. You can use this processor to send text to Idyl E3 for entity extraction from within your pipeline. However, to make things a bit simpler and more flexible we have created a custom NiFi processor just for Idyl E3. This processor is available on GitHub and its binaries will be included with all editions of Idyl E3 starting with version 2.3.0.
Instructions for how to use the Idyl E3 processor will be added to the Idyl E3 documentation, but they are simple. Here’s a rundown. Copy the idyl-e3-nifi-processor.jar from Idyl E3’s home directory to NiFi’s lib directory. Restart NiFi. Once NiFi is available you will see the Idyl E3 processor in the list of processors when adding a processor:
There are a few properties you can set but the only required property is the Idyl E3 endpoint. By default, the processor extracts entities from the input text but this can be changed using the action property. The available actions are:
- extract (the default) to get a JSON response containing the entities.
- annotate to return the input text with the entities annotated.
- sanitize to return the input text with the entities removed.
- ingest to extract entities from the input text but provide no response. (This is useful if you are letting Idyl E3 plugins handle the publishing of entities to a database or other service outside of the NiFi data flow.)
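As a rough illustration of the action list above, here is a minimal Go sketch of how a client might validate an action value and fall back to the default. The names `DefaultAction` and `NormalizeAction` are hypothetical and are not part of the Idyl E3 processor or SDKs. Treating the value as case-insensitive is also an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// DefaultAction mirrors the documented default action of the processor.
const DefaultAction = "extract"

// validActions lists the four actions described above.
var validActions = map[string]bool{
	"extract":  true,
	"annotate": true,
	"sanitize": true,
	"ingest":   true,
}

// NormalizeAction (hypothetical helper) returns the action to use for a raw
// property value. An empty value falls back to the default; an unknown value
// is reported as an error. Case-insensitive matching is an assumption.
func NormalizeAction(raw string) (string, error) {
	action := strings.ToLower(strings.TrimSpace(raw))
	if action == "" {
		return DefaultAction, nil
	}
	if !validActions[action] {
		return "", fmt.Errorf("unknown action %q", raw)
	}
	return action, nil
}
```

For example, `NormalizeAction("")` yields `extract`, while `NormalizeAction("delete")` returns an error.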
The available properties are shown in the screen capture below:
And that is it. The processor will send text to Idyl E3 for entity extraction via Idyl E3’s /api/v2/extract endpoint. The response from Idyl E3 containing the entities will be placed in a new idyl-e3-response attribute.
The Idyl E3 NiFi processor is licensed under the Apache Software License, version 2.0. Under the hood, the processor uses the Idyl E3 client SDK for Java which is also licensed under the Apache license.
The Idyl E3 SDK for Go is now available on GitHub. This SDK allows you to integrate Idyl E3’s entity extraction capabilities into your Go projects.
Like the other Idyl E3 SDKs, the project is licensed under the Apache Software License, version 2.0.
It’s easy to use:
endpoint := "http://localhost:9000"
s := "George Washington was president."
confidence := 0
context := "context"
documentID := "documentID"
language := "en"
key := "your-api-key"
response := Extract(endpoint, s, confidence, context, documentID, language, key)
There is a new project on our GitHub: an EC2 metadata simulator. The project allows for testing applications that depend on EC2 instance metadata in non-AWS environments. It doesn’t (yet) provide a complete simulation of all EC2 metadata endpoints, but in time it will, and in the meantime it should be simple enough to modify to fit your needs.
One important note: the EC2 instance metadata service listens on 169.254.169.254 port 80, while the simulator runs on your localhost at port 8080 by default. The rule created by the following iptables command will not persist across system restarts, but it redirects traffic from 169.254.169.254:80 to localhost:8080:
iptables -t nat -A OUTPUT -p tcp -d 169.254.169.254 --dport 80 -j DNAT --to-destination 127.0.0.1:8080
Now, all requests to http://169.254.169.254:80 will get redirected to the simulator running at http://localhost:8080.
EC2 instance metadata provides useful information to instances running in EC2. With a simple curl command, such as one that requests the instance’s ID, you can retrieve information about the running instance.
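The same lookup can be made from application code. The Go sketch below requests the instance ID from a metadata service at a given base URL. It uses the standard EC2 path `/latest/meta-data/instance-id` and assumes, without having checked the project, that the simulator serves that path to a plain unauthenticated GET. The names `Fetcher`, `HTTPFetch`, and `InstanceID` are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// Fetcher retrieves the body at url as a string. It is a hypothetical seam
// that lets the lookup be exercised without a real metadata service.
type Fetcher func(url string) (string, error)

// HTTPFetch is a Fetcher backed by net/http with a short timeout.
func HTTPFetch(url string) (string, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s from %s", resp.Status, url)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// InstanceID asks the metadata service at baseURL for the instance ID.
// Assumption: the service answers a plain GET on the standard path.
func InstanceID(baseURL string, fetch Fetcher) (string, error) {
	url := strings.TrimRight(baseURL, "/") + "/latest/meta-data/instance-id"
	body, err := fetch(url)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(body), nil
}
```

Calling `InstanceID("http://169.254.169.254", HTTPFetch)` would query the real service on EC2, or the simulator if traffic to that address is redirected to it. Passing `"http://localhost:8080"` would target the simulator's default port directly.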
Idyl E3 2.2.0 added support for publishing metrics to a Graphite server. To help make it easier to deploy a Graphite server we have added a new project on our GitHub that contains a Packer script for creating a Graphite AMI. Usage instructions are available in the project’s readme file.
Today we are announcing the release of Idyl E3 2.2.0. (See the full Release Notes.) This version brings exciting new features such as heuristic confidence filtering, support for all UTF-8 languages, and statistics reporting.
Idyl E3 2.2.0 can be downloaded from our website today. Look for it to be available on the AWS Marketplace in the upcoming week.
In related news:
We have pushed a new open source project to our GitHub called Idyl Talk. The goal of Idyl Talk is to replace traditional interface-defined software communication with natural language text.
When software communicates with other software, whether internal or external, the communication is defined by interfaces. These interfaces tell each side how to communicate. Interfaces are an essential piece of good design. But what happens when two components have to communicate and, for whatever reason, it is difficult (or impossible) to define the interface? Idyl Talk addresses this problem by letting software components communicate using natural language English text.
Imagine your refrigerator talking to your smartphone app to update your shopping list. The communication might look a bit like this:
Your smartphone receives the message and an app notifies you that you need milk. For this to be possible the developers of the refrigerator and the smartphone app have to agree on some interface that dictates the communication between the devices. This requires collaboration, and of course, time and money.
Now, imagine that when you are running low on milk your refrigerator sends the following message to your smartphone app:
You are low on milk.
The agreed-to interface here is the English language. With Idyl Talk, you can now create devices that are able to communicate with other devices even if those devices do not exist yet! The app processes the received message and alerts you that you are low on milk.
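To make the receiving side concrete, here is a toy Go sketch of an app pulling the item out of a message like the one above. It deliberately swaps in a simple regular expression for real natural language processing, so it is not how Idyl Talk works internally. The name `LowItem` and the accepted phrasings are assumptions made for illustration.

```go
package main

import (
	"regexp"
	"strings"
)

// lowOnPattern matches messages such as "You are low on milk." or
// "You are running low on eggs". This is a stand-in for real NLP.
var lowOnPattern = regexp.MustCompile(`(?i)^\s*you are (?:running )?low on ([^.!]+)[.!]?\s*$`)

// LowItem (hypothetical helper) returns the item named in a "low on" message
// and whether the message matched at all.
func LowItem(message string) (string, bool) {
	m := lowOnPattern.FindStringSubmatch(message)
	if m == nil {
		return "", false
	}
	return strings.ToLower(strings.TrimSpace(m[1])), true
}
```

A pattern like this only covers phrasings it was written for, which is the gap a natural language approach is meant to close.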
Sound interesting? We think so! We welcome your contributions to the project as it matures and grows. Check out Idyl Talk on GitHub.
See a listing of all our open source projects.