Maciej Janicki's website

Science

So far, I have worked in Computational Linguistics / Natural Language Processing, a field that combines my primary area of expertise (computer science) with one of my main hobbies (linguistics and languages). The topics I am especially interested in include:

  • probabilistic models of grammatical structures and their application in unsupervised learning and modeling human language acquisition,
  • grammar theories treating words as basic units of language (Whole Word Morphology, dependency grammars),
  • string similarity, string transformations and finite-state technology,
  • applications of language technology in humanities and social sciences.

Given good research opportunities, I am open to other topics as well.

PhD thesis

  • Maciej Janicki:
    Statistical and Computational Models for Whole Word Morphology.
    PhD thesis, University of Leipzig, 2019. pdf

Conference Papers

  • Maciej Janicki:
    Finite State Transducer Calculus for Whole Word Morphology.
    Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing (FSMNLP), Dresden, Germany, September 2019. pdf
  • Maciej Janicki:
    Semi-Supervised Induction of POS-Tag Lexicons with Tree Models.
    Proceedings of RANLP 2019, Varna, Bulgaria, September 2019. pdf
  • Lydia Müller, Uwe Quasthoff and Maciej Sumalvico:
    Corpora of Typical Sentences.
    In: LREC 2018, Eleventh International Conference on Language Resources and Evaluation, Miyazaki, Japan, 2018.
  • Maciej Sumalvico:
    Unsupervised Learning of Morphology with Graph Sampling.
    In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2017), Varna, Bulgaria, 2017. pdf
  • Dirk Goldhahn, Maciej Sumalvico and Uwe Quasthoff:
    Corpus Collection for under-resourced languages with more than one million speakers.
    In: Workshop on Collaboration and Computing for Under-Resourced Languages (CCURL), LREC, Portorož, Slovenia, 2016.
  • Maciej Janicki:
    A Multi-purpose Bayesian Model for Word-Based Morphology.
    In: Systems and Frameworks for Computational Morphology – Fourth International Workshop, SFCM 2015, Springer, 2015.
  • Maciej Janicki:
    Unsupervised Learning of A-Morphous Inflection with Graph Clustering.
    In: Proceedings of the RANLP 2013 Student Workshop, 2013.
  • Michał Marcińczuk, Jan Kocoń and Maciej Janicki:
    Liner2 – a customizable framework for proper names recognition for Polish.
    In: Intelligent Tools for Building a Scientific Information Platform, Springer Berlin Heidelberg, 2013.
  • Maciej Janicki:
    A Lexeme-Clustering Algorithm for Unsupervised Learning of Morphology.
    In: Dritte Studentenkonferenz Informatik Leipzig – SKIL 2012, 2012. Best paper award.
  • Michał Marcińczuk and Maciej Janicki:
    Optimizing CRF-based model for proper name recognition in Polish texts.
    In: Computational Linguistics and Intelligent Text Processing, Springer Berlin Heidelberg, 2012.

(Papers published between January 2016 and May 2019 appeared under the name Maciej Sumalvico.)

Unpublished papers

  • Maciej Janicki:
    Unsupervised Vocabulary Expansion with Whole Word Morphology.
    Rejected from both ACL and EMNLP, 2019. pdf