A System for Constituent and Dependency Tree Linearization

5 pages. Published: February 16, 2023

Abstract

In this work, we introduce a framework that unifies existing implementations for the tasks of constituent and dependency parsing as sequence labeling problems. The system provides a way to encode both formalisms as sequences of one label per word, so they can be used with any existing general-purpose sequence labeling architecture. In particular, we implement three linearizations to encode constituent trees and four linearizations for dependency trees. All encoding functions ensure completeness and injectivity. We will also train a sequence labeling neural system to learn such encodings, and compare their effectiveness on standard constituent (PTB and SPMRL treebanks) and dependency parsing (a subset of treebanks from the UD collection) evaluation frameworks.
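
As an illustration of what "one label per word" can look like for dependency trees, the following is a minimal sketch of a relative-offset head encoding, where each word's label stores the distance to its head together with the relation name. It is not taken from the system described here and is not necessarily one of its four dependency linearizations; the function names `encode` and `decode`, the `|` separator, and the convention that head index 0 denotes the root are assumptions made for this example.

```python
from typing import List, Tuple


def encode(heads: List[int], rels: List[str]) -> List[str]:
    """Turn a dependency tree into one label per word.

    heads[i] is the 1-based position of the head of word i+1
    (0 is assumed to mean the artificial root). Each label is
    "<head position minus word position>|<relation>".
    """
    labels = []
    for position, (head, rel) in enumerate(zip(heads, rels), start=1):
        labels.append(f"{head - position}|{rel}")
    return labels


def decode(labels: List[str]) -> Tuple[List[int], List[str]]:
    """Invert encode(): recover head positions and relations."""
    heads, rels = [], []
    for position, label in enumerate(labels, start=1):
        # Split only on the first separator; the offset never contains "|".
        offset, rel = label.split("|", 1)
        heads.append(position + int(offset))
        rels.append(rel)
    return heads, rels


# "She reads books": "reads" is the root, the other two words depend on it.
heads = [2, 0, 2]
rels = ["nsubj", "root", "obj"]
labels = encode(heads, rels)
# Distinct trees give distinct label sequences, so decoding is exact.
assert decode(labels) == (heads, rels)
```

Because every label can be mapped back to exactly one head position, two different trees over the same sentence never share a label sequence, which is the injectivity property the abstract refers to. The resulting labels can then be treated as ordinary tags by a generic sequence labeling model.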

Keyphrases: constituent parsing, dependency parsing, Natural Language Processing, NLP, sequence labeling, tree linearization

In: Alvaro Leitao and Lucía Ramos (editors). Proceedings of V XoveTIC Conference. XoveTIC 2022, vol 14, pages 83--87

BibTeX entry
@inproceedings{XoveTIC2022:System_for_Constituent_and,
  author    = {Diego Roca and David Vilares and Carlos G\'omez-Rodr\'iguez},
  title     = {A System for Constituent and Dependency Tree Linearization},
  booktitle = {Proceedings of V XoveTIC Conference. XoveTIC 2022},
  editor    = {Alvaro Leitao and Luc\'ia Ramos},
  series    = {Kalpa Publications in Computing},
  volume    = {14},
  pages     = {83--87},
  year      = {2023},
  publisher = {EasyChair},
  bibsource = {EasyChair, https://easychair.org},
  issn      = {2515-1762},
  url       = {https://easychair.org/publications/paper/kBBd},
  doi       = {10.29007/9m3p}}