Friday, June 01, 2012

Many faces of linguistics and language technology

A seminar entitled "Many Faces of Linguistics and Language Technology" was held on Friday, 1st of June, 2012, at the University of Helsinki. The seminar was dedicated to professors Fred Karlsson and Kimmo Koskenniemi, who are retiring after very successful academic careers.

The seminar consisted of invited talks and opening words by the organizers Ekaterina Gruzdeva, Seppo Kittilä and Krister Lindén. Invited talks were given by Urho Määttä, Lauri Carlson, Arvi Hurskainen, Andrew Chesterman, Arto Mustajoki, Kaius Sinnemäki, Maria Vilkuna, Trond Trosterud, Kalervo Järvelin, and Kaisa Häkkinen.

In his presentation, Kimmo Koskenniemi first described the early steps of his career. In 1978, he met Martin Kay at a conference, which led to a one-day discussion on shared interests. Based on this experience, Koskenniemi encouraged young researchers to seek out established scholars, both to see whether they have shared interests and to learn from them. In 1981, Koskenniemi joined a project led by Fred Karlsson at the University of Helsinki. In 1983, he received an invitation from Lauri Karttunen to visit Texas, where he met Martin Kay and Ron Kaplan to discuss finite-state transducers. In February 1982, Koskenniemi developed the first version of the theory and implementation of his later famous two-level model of morphology (TWOL), and in 1983 he defended his PhD thesis on the model. The model has become established as a basic approach to morphological modeling, and TWOL models exist for a large number of languages. Demonstrations are available for, among others, Finnish, English and German. English is morphologically so simple that it is almost uninteresting, but non-Finnish speakers may try to analyze the Finnish word forms "alusta" (a homograph with five different lexical and seven morphological interpretations) or "peikonjalkasateenvarjoteline" (a five-component compound from a Finnish translation of a Harry Potter book; 3-component compounds are frequent in Finnish, and even 5-component compounds are not very rare).

Koskenniemi also presented a view on the historical development of astrology into the science of astronomy and discussed parallels with research on language.

Fred Karlsson started his career at the University of Helsinki in 1972, lecturing on transformational syntax and generative semantics. On Sunday, 20th of March 1977, the University of Helsinki opened a position in general linguistics, for which there were 9 applicants. Fred Karlsson was appointed in March 1980. The letter of appointment was signed by the President of Finland, Urho Kekkonen. In 1984, Karlsson formed a center for computational linguistics at the University of Helsinki, funded by the Academy of Finland. The work also proved to be practically useful. In 1982-85, the department made collaboration contracts with many companies, including Imatran Voima, IBM, Nokia, Papyrus, Posti, Xerox, Sanoma, Tietojyvä and VTKK. Karlsson said that thanks to these economic opportunities they were able to hire dozens of researchers, which would not otherwise have been possible.

Karlsson mentioned that he has made two main contributions during his scientific career: constraint grammar, and the recognition that multiple center embeddings are absent, especially in speech. On 6th of March, 1989, Fred Karlsson was flying from Amsterdam to Dublin. At about 9:30 am, he realized that one has to create a "negative grammar", later named constraint grammar. Karlsson talked about the gatekeeper problem, which means that new inventions do not necessarily reach the "scientific market". Karlsson has been pleased to note that Hurford's "The Origins of Grammar" takes his work on center embeddings into serious consideration.

Karlsson also talked about how scientific research and science-based business should be combined. As background, he mentioned an EU project in which the department was involved, with Nokia as another Finnish participant. The tools developed at the department could be considered to provide the best analysis of English. Concrete evidence for this was gained when HarperCollins Publishers announced a competition in which an English corpus of 200 million words needed to be parsed. The department won the contract, and a team of Atro Voutilainen, Timo Järvinen and Arto Anttila worked on the project. They had to deliver 10 million words every month for 20 months. In the end, this corpus of 200 million words was re-analyzed. This formed the basis for the famous COBUILD dictionary, which is acknowledged for its high quality. After a number of such successful projects, the administrators at the University of Helsinki became frustrated with the contracts that needed to be formulated and signed. This motivated the founding of Lingsoft, Inc., which became an internationally successful language technology company. At its peak, Lingsoft has employed 55 people, speaking 13 different languages.

Timo Honkela, Mikko Kurimo, Krista Lagus and the younger language researchers at the Department of Information and Computer Science at Aalto University School of Science warmly thank professors Koskenniemi and Karlsson for all the collaboration over the years and wish them happy and productive years in retirement as well.
