Verso Text Preprocessing Engine FAQ

What is Verso Text Preprocessing Engine?

Verso Text Preprocessing Engine is an application that performs text preprocessing functions common in natural language processing (NLP). One of the first steps in an NLP pipeline is often preprocessing the input text to get it into the appropriate format. This can require operations such as lowercasing text, removing punctuation, and removing stop words. Verso Text Preprocessing Engine provides this functionality as a microservice with a REST API. Text submitted through the API is processed according to the parameters supplied with the request.

Why would I use Verso Text Preprocessing Engine?

The functionality provided by Verso Text Preprocessing Engine can be implemented in most programming languages in just a few lines, and in some use cases this is an adequate solution. However, implementing text preprocessing this way has a few potential shortcomings.

First, you may find the same preprocessing code implemented in multiple places, or even by multiple developers. Second, these implementations are limited by the resources of the machine on which the code runs. In contrast, Verso Text Preprocessing Engine is implemented as a stateless microservice, so additional instances can be added to scale horizontally and increase throughput. Lastly, the components of an NLP pipeline are often written in different programming languages, and getting Python code to work with a Java application may be difficult. Because Verso Text Preprocessing Engine exposes a REST API, any application can use it regardless of the programming language it is written in.
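To illustrate the language-independence point, here is a minimal sketch of how a Python client might build a request to a REST preprocessing service using only the standard library. The base URL, the /api/preprocess path, and the parameter names (lowercase, removePunctuation) are placeholders invented for this example, not the actual Verso Text Preprocessing Engine API; refer to the documentation for the real endpoint and parameters.

```python
import urllib.parse
import urllib.request

# ASSUMPTION: placeholder endpoint, not the documented Verso API.
HYPOTHETICAL_URL = "http://localhost:8080/api/preprocess"


def build_request(text, base_url=HYPOTHETICAL_URL, **params):
    """Build (but do not send) a POST request carrying text and options.

    The keyword arguments become query parameters; their names here are
    hypothetical and must be replaced with the documented ones.
    """
    # Render booleans as "true"/"false", everything else as a string.
    query = urllib.parse.urlencode(
        {k: str(v).lower() if isinstance(v, bool) else str(v)
         for k, v in params.items()}
    )
    url = f"{base_url}?{query}" if query else base_url
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


req = build_request("Hello, World!", lowercase=True, removePunctuation=True)
# Sending it would be: urllib.request.urlopen(req).read().decode("utf-8")
```

The same request could be issued from Java, Go, or any other language with an HTTP client, which is the point of exposing the functionality as a service.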

What types of preprocessing operations are available in Verso Text Preprocessing Engine?

Verso Text Preprocessing Engine supports the following operations. If this list is out of date, refer to the documentation for the current set.

  • Uppercasing of text.
  • Lowercasing of text.
  • Removing words shorter than a minimum length.
  • Removing words longer than a maximum length.
  • Removing words containing digits.
  • Removing punctuation.
  • Removing common stop words.
  • Removing custom stop words.

The operations performed by Verso Text Preprocessing Engine are controlled by the parameters provided to its API.
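As a rough illustration of what these operations do and how parameters might select them, the following is a small local sketch in Python. It is not the Verso Text Preprocessing Engine implementation: the function name, the parameter names, the order in which operations are applied, and the tiny stop word list are all assumptions made for this example. It also assumes the length options drop words shorter than the minimum or longer than the maximum.

```python
import string

# ASSUMPTION: tiny illustrative list; a real stop word list is much longer.
COMMON_STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def preprocess(text, lowercase=False, uppercase=False,
               min_length=None, max_length=None,
               remove_digit_words=False, remove_punctuation=False,
               remove_stop_words=False, custom_stop_words=None):
    """Apply the selected preprocessing operations to text (hypothetical API)."""
    if remove_punctuation:
        # Delete every ASCII punctuation character.
        text = text.translate(str.maketrans("", "", string.punctuation))
    if lowercase:
        text = text.lower()
    elif uppercase:
        text = text.upper()

    # Combine the common and custom stop words into one lookup set.
    stop_words = set()
    if remove_stop_words:
        stop_words |= COMMON_STOP_WORDS
    if custom_stop_words:
        stop_words |= {w.lower() for w in custom_stop_words}

    kept = []
    for word in text.split():
        if min_length is not None and len(word) < min_length:
            continue  # shorter than the minimum length
        if max_length is not None and len(word) > max_length:
            continue  # longer than the maximum length
        if remove_digit_words and any(ch.isdigit() for ch in word):
            continue  # contains at least one digit
        if word.lower() in stop_words:
            continue  # common or custom stop word
        kept.append(word)
    return " ".join(kept)


result = preprocess("The quick, brown fox has 4legs!",
                    lowercase=True, remove_punctuation=True,
                    remove_stop_words=True, remove_digit_words=True)
print(result)
```

Each keyword argument plays the role of one request parameter: only the operations that are switched on are applied, and text passed with no options is returned with its words unchanged.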

Is Verso Text Preprocessing Engine free?

Yes. Verso Text Preprocessing Engine is licensed under the Apache License, version 2.0. The application’s source code is available.

What do I do if I have a problem with Verso Text Preprocessing Engine or have a question?

Please let us know! We will be happy to discuss the problem you are experiencing. Click here to submit a support request.