MLSS Chunker

ID:

21.11115/0000-000B-D345-9

The MLSS (Maltese Language Software Services) Chunker is an online tool that searches a part-of-speech (POS) tagged text for sequences of tags. The user supplies one or more patterns that capture the typical members of a phrase type, such as a noun phrase, and the tool returns a list of the chunks in the text that match those patterns.

The download for this resource contains only the narrative description, as a Word file. The tool itself is delivered as a GUI, which features:
- a panel of buttons representing the POS tags, each showing the corresponding description on mouseover,
- a text box in which the user enters the pattern of the chunks, and
- a text area in which the user enters the text.

The user types or pastes text into the large text box at the bottom of the page and then enters a search pattern into the smaller text box above it.

The user can specify patterns of:
- POS tags, each preceded by '_'. Example: _DDC _NN
- words, i.e. strings containing no '_'. Example: 'il- ?' (where '?' stands for anything that follows that word)
- tagged words, the most specific form. Example: 'il-_DDC ?'
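
The three pattern types can be illustrated with a minimal Python sketch. It is not the MLSS implementation: it assumes the tagged text is written as space-separated word_TAG items, that '?' matches exactly one token, and that a pattern must match a contiguous run of tokens. The function names parse_tagged, element_matches and find_chunks, the sample sentence and the placeholder tag X are all hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag)


def parse_tagged(text: str) -> List[Token]:
    """Split 'word_TAG word_TAG ...' into (word, tag) pairs (assumed format)."""
    tokens = []
    for item in text.split():
        if "_" in item:
            word, _, tag = item.rpartition("_")
        else:
            word, tag = item, ""  # untagged item: keep the word, leave the tag empty
        tokens.append((word, tag))
    return tokens


def element_matches(element: str, token: Token) -> bool:
    """Check one pattern element against one token."""
    word, tag = token
    if element == "?":            # wildcard: assumed to match any single token
        return True
    if element.startswith("_"):   # POS pattern, e.g. _NN
        return tag == element[1:]
    if "_" in element:            # tagged-word pattern, e.g. il-_DDC
        pattern_word, _, pattern_tag = element.rpartition("_")
        return word == pattern_word and tag == pattern_tag
    return word == element        # word pattern, e.g. il-


def find_chunks(pattern: str, tokens: List[Token]) -> List[str]:
    """Return every contiguous token run that matches the pattern."""
    elements = pattern.split()
    if not elements:
        return []
    size = len(elements)
    chunks = []
    for start in range(len(tokens) - size + 1):
        window = tokens[start:start + size]
        if all(element_matches(e, t) for e, t in zip(elements, window)):
            chunks.append(" ".join(f"{w}_{t}" if t else w for w, t in window))
    return chunks


if __name__ == "__main__":
    # Hypothetical sample; X is a placeholder tag, not taken from the MLSS tagset.
    tokens = parse_tagged("il-_DDC kelb_NN kiel_X il-_DDC ikel_NN")
    for pattern in ("_DDC _NN", "il- ?", "il-_DDC ?"):
        print(pattern)
        print("\n".join(find_chunks(pattern, tokens)))  # one chunk per line
```

On this sample, all three patterns select the same two chunks, 'il-_DDC kelb_NN' and 'il-_DDC ikel_NN'. They differ only in how tightly they constrain the first token, which is the sense in which the tagged-word pattern is the most specific.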

Instead of typing the search patterns, the user can click the POS tag buttons in the panel on the left-hand side of the page.

The user then clicks “Process” to start chunking and is directed to the “Chunker Result” page, which shows the extracted chunks, with their tags, in a text box.

The input consists of a text string typed or pasted into the input text box, and one or more search patterns typed or pasted into the search box.
The output consists of substrings of the input string (i.e. tagged text chunks), with one chunk per line.

This is the first version of the chunker, so it may contain bugs and leaves room for improvement.
