Corpora of Romance languages

  • Manuel BARBERA (Turin, Italy)
    Complex lexical units and their morphosyntactic treatment in the Corpus Taurinense
    2000, Vol. V-2, pp. 57-70

    Corpus Taurinense (CT) is the POS-tagged version of the ItalAnt Corpus, an electronic corpus of Old Italian texts (written between 1251 and 1300). In this article we describe the approach followed in CT for the annotation of multiword units (MWUs). In our work, an MWU is a set of two or more graphic words that (also) receives an overall POS tag because the set stands in a paradigmatic relation with a one-word lexical unit of the same POS. Our POS tagging confirms that most Modern Italian compound conjunctions were not yet lexicalised at that time. The order of their components is already the Modern Italian order, but the components can still be interrupted by occasional elements.


  • Mireille BILGER (Perpignan)
    Corpus de portugais et d'espagnol
    1996, Vol. I-2, pp. 124-130
  • Maria de Lourdes CRISPIM (Lisbon, Portugal)
    Building and using a corpus of medieval Portuguese
    1999, Vol. IV-1, pp. 41-45

    In this article, the authors first describe how the Corpus of Medieval Portuguese was constituted, and in particular how it was coded; they then attempt to demonstrate how it can be used to construct a dictionary of medieval Portuguese, more specifically of its verbs and its proper and common nouns.


  • Claus D. PUSCH (Freiburg, Germany)
    Les corpus de linguistique romane en pays germanophones. Bilan et perspectives
    2007, Vol. XII-1, pp. 111-124
  • Miriam VOGHERA (Naples, Italy)
    Corpora of Italian
    1996, Vol. I-2, pp. 131-134