A preliminary list:
- AlterPhono is an idea on quantitative methods for historical linguistics which I started developing when applying to the QMSS16 at the Max Planck Institute for the Science of Human History. I am currently developing a model for generating a distance matrix between phonemes based on a set of binary distinctive features. My short presentation on the topic is available here and the current model (a work in progress) is available here.
- Tzara Engine is a toy language for automatic text generation from a user-defined grammar, with both Python and JavaScript implementations. It is inspired by the famous Dada Engine, which powers the Postmodernism Generator.
- Beanish: I have somehow become recognized as an expert in Beanish, the fictional language used in XKCD Time. Most of the work is documented, in a chaotic manner, in the blog Deciphering Beanish ~ ᖉ, ᖆᐣᖚᔭ,ᐦ, which is the reason for my only mention (so far) on BoingBoing! Certainly the geekiest thing I’ve done so far in my life.
- While I am not as active as before, I am still part of the group that develops Acopost (source here, pure ANSI C), a collection of part-of-speech taggers, originally written by Ingo Schröder, which implements and extends well-known machine learning techniques and provides a uniform environment for testing new tagging strategies.
Some old contributions of mine can occasionally be found in other open source projects, such as NLTK, Moses and the “Hermes Computational Linguistics Project” at the Universidade Federal do Rio Grande (FURG, Brazil).
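To give a rough idea of what a phoneme distance matrix built from binary distinctive features (as mentioned for AlterPhono above) can look like, here is a minimal sketch. It is not the AlterPhono model itself: the feature set, the feature values and the function names are all hypothetical and heavily simplified, and the distance used is a plain normalized Hamming distance, chosen only because it is the simplest option.

```python
from itertools import product

# Hypothetical, heavily simplified feature inventory (illustration only).
FEATURES = ("voiced", "nasal", "continuant", "labial")

# Each phoneme is a tuple of binary values, in the order given by FEATURES.
# These values are assumptions made for the example, not a real analysis.
PHONEMES = {
    "p": (0, 0, 0, 1),
    "b": (1, 0, 0, 1),
    "m": (1, 1, 0, 1),
    "s": (0, 0, 1, 0),
    "z": (1, 0, 1, 0),
}


def feature_distance(a, b):
    """Normalized Hamming distance: share of features on which a and b differ."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(phonemes):
    """Return a dict mapping every ordered phoneme pair to its distance."""
    return {
        (p, q): feature_distance(phonemes[p], phonemes[q])
        for p, q in product(phonemes, repeat=2)
    }


if __name__ == "__main__":
    matrix = distance_matrix(PHONEMES)
    symbols = list(PHONEMES)
    # Print the matrix as a small table, one row per phoneme.
    print("   " + "  ".join(f"{s:>4}" for s in symbols))
    for p in symbols:
        row = "  ".join(f"{matrix[(p, q)]:.2f}" for q in symbols)
        print(f"{p:>2} {row}")
```

A weighted variant, where some features count more than others, would only require changing `feature_distance`; whether and how to weight features is exactly the kind of decision a real model has to justify.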