FGV Digital Repository

Text mining for history: first steps on building a large dataset

View/Open
Artigo_lrec_2018_dhbb.pdf (106.7Kb)
Date
2018
Author
Higuchi, Suemi
Freitas, Cláudia
Claro, Bruno Cuconato
Rademaker, Alexandre
Abstract
This paper presents the initial efforts towards the creation of a new corpus in the history domain. Motivated by historians’ need to interrogate a vast body of material (almost 9 million words) in a non-linear way, our approach favours deep linguistic analysis of encyclopaedic-style data. In this context, the work presented here focuses on the preparation of the corpus, which precedes the mining activity: the morphosyntactic annotation and the definition of semantic types for named entities (NEs) and for the named-entity relations relevant to the history domain. Taking advantage of the semantic nature of appositive structures, we manually analysed a sample of 1,049 sentences to assess their potential as additional semantic clues. The results show that we are on the right track.
URI
https://hdl.handle.net/10438/29143
Collections
  • Rede de Pesquisa - Preprints [10]
  • Tecnologia aplicada à pesquisa com fontes primárias / RP [5]
Knowledge Areas
Social sciences
Subject
Data mining (Computing)
History
Keyword
Digital humanities
Text mining
Corpus annotation
Appositives

DSpace software copyright © 2002-2016  DuraSpace
Contact Us | Send Feedback
Theme by 
@mire NV
 

 

