Jordi Atserias Batalla
Yahoo! Semantically Annotated Snapshot of the English Wikipedia

Yahoo! Semantically Annotated Snapshot of the English Wikipedia, version 1.0 (SW1). The SW1 dataset contains a snapshot of the English Wikipedia dated 2006-11-04, processed with a number of publicly available NLP tools. To build SW1, we started from the XML-ized Wikipedia dump distributed by the University of Amsterdam. This snapshot of the English Wikipedia contains 1,490,688 entries (excluding redirects). First, the text is extracted from each XML entry and split into sentences using simple heuristics.
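The extraction and sentence-splitting step described above could look roughly like the following. This is only a minimal sketch and not the code used to build SW1: the entry layout, the abbreviation list, and the function names `extract_text` and `split_sentences` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical abbreviation list; the heuristics actually used for SW1 are not documented here.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "e.g.", "i.e.", "etc."}


def extract_text(xml_entry: str) -> str:
    """Collect all text nodes of one XML entry and normalize whitespace."""
    root = ET.fromstring(xml_entry)
    return " ".join(" ".join(root.itertext()).split())


def split_sentences(text: str) -> list[str]:
    """Split on tokens ending in '.', '!' or '?', unless the token is a known abbreviation."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token[-1] in ".!?" and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    # Assumed entry layout, for illustration only.
    entry = "<entry><p>Wikipedia is an encyclopedia. Dr. Smith edits it!</p></entry>"
    for sentence in split_sentences(extract_text(entry)):
        print(sentence)
```

A token-based rule like this is easy to read but will mis-split on abbreviations missing from the list, which is the usual trade-off of simple heuristics.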

June 8, 2020
Catalan & Spanish Wordnets

I was part of the team developing the Spanish Wordnet within the European-funded EuroWordNet project from 1996 to 1999. The original EuroWordNet project dealt with Dutch, Italian, Spanish, German, French, Czech, and Estonian. EuroWordNet is a system of semantic networks for European languages, based on WordNet. Each language develops its own wordnet, but the wordnets are interconnected through interlingual links stored in the Interlingual Index (ILI). The Spanish and Catalan Wordnets follow the EuroWordNet framework and are structured in the same way as the American wordnet for English (Princeton WordNet): through synsets (sets of synonymous words) with basic semantic relations between them.
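To make the structure concrete, here is a small sketch of synsets linked across languages through a shared ILI record. It is an illustration only: the class names, the identifiers such as `ili-0001`, and the example synset contents are hypothetical and do not reflect the actual EuroWordNet data format.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    synset_id: str   # hypothetical identifier
    language: str    # e.g. "es", "ca", "en"
    words: list[str]  # the synonymous words forming the synset
    ili_id: str      # record in the Interlingual Index this synset links to
    # Semantic relations to other synsets of the same wordnet, keyed by relation name.
    relations: dict[str, list[str]] = field(default_factory=dict)


class WordnetCollection:
    """Several monolingual wordnets interconnected through ILI records."""

    def __init__(self) -> None:
        self._by_ili: dict[str, dict[str, Synset]] = {}

    def add(self, synset: Synset) -> None:
        self._by_ili.setdefault(synset.ili_id, {})[synset.language] = synset

    def equivalents(self, synset: Synset, target_language: str) -> list[str]:
        """Words of the target-language synset sharing the same ILI record."""
        target = self._by_ili.get(synset.ili_id, {}).get(target_language)
        return list(target.words) if target else []


if __name__ == "__main__":
    # Illustrative entries, not taken from the real wordnets.
    es_dog = Synset("es-1", "es", ["perro", "can"], "ili-0001", {"hypernym": ["es-2"]})
    ca_dog = Synset("ca-1", "ca", ["gos", "ca"], "ili-0001")
    wordnets = WordnetCollection()
    wordnets.add(es_dog)
    wordnets.add(ca_dog)
    print(wordnets.equivalents(es_dog, "ca"))
```

Routing every cross-language lookup through the ILI means each wordnet only needs links to the index rather than to every other language.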

June 8, 1999
© Copyright 2022 Jordi Atserias Batalla. All Rights Reserved.