Primož Jakopin

From Wikipedia, the free encyclopedia

Primož Jakopin (born 30 June 1949 in Ljubljana) is a Slovenian computer scientist and linguist.

In 1972, he completed a degree in technical mathematics at the University of Ljubljana, and in 1999 he received his Ph.D. with the thesis Upper Bound of Entropy in Slovenian Literary Texts (Entropija v slovenskih leposlovnih besedilih).

He is a senior lecturer at the Department of Comparative and General Linguistics, Faculty of Arts, University of Ljubljana, where he teaches language technologies. As of 2001, he is the Head of the Corpus Laboratory at the Fran Ramovš Institute of Slovenian Language (within the Scientific Research Centre of the Slovenian Academy of Sciences and Arts). He participates in a number of European projects on language resources.

His major pieces of software are IBIS (Digital DEC 10, 1981), INES (Sinclair ZX Spectrum, 1985), STEVE (ATARI ST, 1987-1992), EVA (for DOS, from 1992, and for Windows 9X/2000/XP, 1996-2005), and NEVA (a Windows server search engine, 1999-2005). From 1992 to 1994 he supervised the conversion of the Dictionary of the Slovenian Literary Language from its printed to an electronic version (EVA OCR, DOS version). In 1997 he wrote the first part-of-speech tagger for Slovenian texts. In 1999 he started an Internet text corpus with a concordance service and linked frequency dictionaries of wordforms and reversed wordforms. It is now available as Nova beseda (New word).

His father was the Slovenian linguist Franc Jakopin, while his mother was the poet and translator Gitica Jakopin.

Publications

  • CORTES - a text corpus of Slovenian. In publication: Digital resources for the humanities: Conference abstracts (University of Sheffield, 10-13 September 2000). - Sheffield: University of Sheffield, 2000. - p. 70-72. (COBISS)
  • EVA - an internet tool for textual and lexical resources. In publication: Linguistics and language studies / 32nd Annual Meeting, Ljubljana, 8-11 July 1999. - Ljubljana: University, Faculty of Arts: Societas Linguistica Europaea, 1999. - p. 98. (COBISS)
  • The feasibility of a complete text corpus. LREC 2002: proceedings. (COBISS)
  • On text corpora, word lengths, and word frequencies in Slovenian. In publication: Contributions to the science of text and language / edited by Peter Grzybek. - Dordrecht: Springer, 2006. (Text, speech and language technology ; vol. 31). - ISBN 1402040679. - p. 171-185. (COBISS)
  • Query-driven dictionary enhancement. Co-author: Birte Lönneker. In publication: Proceedings of the Eleventh EURALEX International Congress, EURALEX 2004, Lorient, France, July 6-10, 2004 / Geoffrey Williams and Sandra Vessier (eds.). - Lorient : Université de Bretagne-Sud, cop. 2004-. - p. 273-284. (COBISS)
  • Slovenian texts on the internet. In publication: Zapiski: Chronicle of the American Slovene Congress. Issue 7 (May 2000), p. 4-7. (COBISS)
  • Words and nonwords as basic units of a newspaper text corpus. In publication: COMPLEX 2001 / 6th Conference on Computational Lexicography and Corpus Research "Computational Lexicography and New EU Languages", Mason Hall, Birmingham, 28 June-1 July, 2001. - Birmingham: Centre for Corpus Linguistics, Department of English, University of Birmingham, 2001. - p. 49-65. (COBISS)
  • Entropija v slovenskih leposlovnih besedilih (Upper Bound of Entropy in Slovenian Literary Texts), Založba ZRC, Ljubljana 2002. (COBISS)
  • O oblikoslovnem označevanju slovenskega besedila (Morphological tagging of Slovenian texts) (co-author A. Bizjak), Slavistična revija 1997.
  • Odzadnji slovar slovenskega jezika (Inverse Dictionary of Slovenian language) (co-author M. Hajnšek-Holz), Ljubljana 1996. (COBISS)
