Chunker (English)
Description
Segments the given tokens into chunks (e.g. noun groups, verb groups) and appends the chunks it finds to the stream.
Required input
Needs a stream with two string list properties:
- A list of tokens
- A list of part-of-speech tags (the Part-of-Speech processing element can be used for that)
Configuration
Assign the tokens and the part-of-speech tags to their corresponding stream properties. To use this component, you have to download or train an OpenNLP model: https://opennlp.apache.org/models.html
Output
Appends two string list properties to the stream: the chunks found and the type of each chunk.
Example:
Input:
tokens: ["John", "is", "a", "Person"]
tags: ["NNP", "VBZ", "DT", "NN"]
Output:
tokens: ["John", "is", "a", "Person"]
tags: ["NNP", "VBZ", "DT", "NN"]
chunks: ["John", "is", "a Person"]
chunkType: ["NP", "VP", "NP"]
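
To illustrate how chunks and chunk types such as those in the example could be derived, here is a minimal Python sketch. It assumes the chunker model labels each token with a BIO-style tag ("B-NP" starts a noun phrase, "I-NP" continues it, "O" is outside any chunk), which is the usual output of OpenNLP chunker models. The function name group_bio_chunks and the tag sequence below are hypothetical and chosen for illustration; this is not the component's actual implementation, and dropping "O" tokens is an assumption.

```python
from typing import List, Optional, Tuple


def group_bio_chunks(tokens: List[str], bio_tags: List[str]) -> Tuple[List[str], List[str]]:
    """Group tokens into chunks using BIO tags such as "B-NP", "I-NP" or "O"."""
    if len(tokens) != len(bio_tags):
        raise ValueError("tokens and bio_tags must have the same length")

    chunks: List[str] = []
    chunk_types: List[str] = []
    current: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the chunk that is currently open, if there is one.
        nonlocal current, current_type
        if current:
            chunks.append(" ".join(current))
            chunk_types.append(current_type)
        current, current_type = [], None

    for token, tag in zip(tokens, bio_tags):
        if tag == "O":
            # Assumption: tokens outside any chunk are not emitted.
            flush()
            continue
        prefix, _, kind = tag.partition("-")
        # A "B-" tag or a change of type starts a new chunk.
        if prefix == "B" or kind != current_type:
            flush()
            current_type = kind
        current.append(token)

    flush()
    return chunks, chunk_types


tokens = ["John", "is", "a", "Person"]
# Hypothetical BIO tags a chunker model might assign to these tokens.
bio_tags = ["B-NP", "B-VP", "B-NP", "I-NP"]

chunks, chunk_types = group_bio_chunks(tokens, bio_tags)
print(chunks)       # ['John', 'is', 'a Person']
print(chunk_types)  # ['NP', 'VP', 'NP']
```

The two returned lists are parallel: each entry in the second list is the type of the chunk at the same position in the first, matching the chunks and chunkType properties shown in the example.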