The seventh international conference on Language Resources and Evaluation (LREC) was held in Malta last week. Tommi Vatanen presented the paper "Language Identification of Short Text Segments with N-gram Models", written together with Jaakko J. Väyrynen and Sami Virpioja. The paper studies a language identification task in which the test samples are only 5 to 21 characters long.
Before leaving to work at CERN for the summer of 2010, Tommi Vatanen shared some of his experiences from the LREC conference with us. Topics included Jaime Carbonell's keynote presentation, which covered, among other things, the Context-Based Machine Translation (CBMT) method, and Ralf Steinberger's keynote on NewsExplorer, as well as the use of Wikipedia and active learning, in which a supervised learning algorithm can actively query the teacher for labels.