(More software is under development; these applications will be published here soon.)


GenI is a surface realiser: given a flat semantic representation, it generates the sentence(s) that verbalise this semantics. To compute these sentences, GenI uses a Feature-Based Lexicalised Tree Adjoining Grammar equipped with a unification-based compositional semantics.

GenI was developed by Carlos Areces and Eric Kow under the supervision of Claire Gardent. It is now maintained by Eric Kow.

To download and install GenI, please visit this site.

Contact: Claire Gardent


Jeni is an ongoing port of GenI to Java (current repository). Contact: Alexandre Denis


JTrans is software primarily designed to bring speech alignment (and by extension speech recognition) technology directly to the user, ready to use, in a GUI-rich, 100% Java application.

More details here: jtrans page

How to cite:

  author =   {Cerisara, C. and Mella, O. and Fohr, D.},
  title =    {JTrans, an open-source software for semi-automatic text-to-speech alignment},
  booktitle =    {Proc. of INTERSPEECH},
  year =     {2009},
  address =      {Brighton, UK},
  month = sep,


SATI API (Sentiment Analysis from Textual Information) is a multilingual Web API for analyzing the emotions and sentiments conveyed by text. Given a text, it returns the sentiment (positive, negative, neutral) or the emotion formatted in EmotionML. See the specification and the Web API access at this URL.

Contact: Alexandre Denis, Samuel Cruz-Lara.


J-Safran (Java Syntaxico-semantic French Analyser) is a free, open-source, 100% Java software for manual, semi-automatic and automatic syntactic dependency parsing in French. It supports other languages as well, but only includes French models.

Contact: Christophe Cerisara

Human-machine dialogues

Dialogue systems are complex, distributed, asynchronous architectures that bring together specialized components. Broadly, these components handle modality-based recognition and synthesis, understanding, dialogue management, generation, fission and fusion, and they can be either symbolic or stochastic. The lack of domain-specific and linguistic resources is the major difficulty when incorporating dialogue in different domains and languages.

Within the Emospeech project, we developed the Emospeech Dialogue Toolkit to support human-machine dialogues and data collection. To support data collection, a human (the Wizard of Oz) can be plugged into or out of the dialogue architecture. The Emospeech Dialogue Toolkit is a multi-agent architecture for developing human-machine dialogue systems in the context of a video game. It includes the following agents:

  • MIDIKI Dialogue Manager: We extended and improved the open source MIDIKI (MITRE Dialogue Toolkit) software to support the multi-agent architecture and the configuration from a relational database.
  • Wizard of Oz: Two Wizard of Oz interfaces were built, which allow a human to interact with other agents in the dialogue architecture. The free wizard acts as a dialogue manager and permits a chat between two humans, the player and the Wizard, while simultaneously storing all interactions in a database. In contrast, the semi-automatic wizard connects the Wizard with MIDIKI, whereby the Wizard interprets and adjusts the output generated by MIDIKI.
  • Interpretation: We trained an SVM classifier and a Logistic Regression classifier that assign a user move to a player sentence.
  • Question Answer: We trained a Conditional Random Fields classifier and a Logistic Regression classifier that choose the most plausible response to a player sentence.
  • Generation: We implemented a generation-by-selection strategy. Given the dialog move output by the dialog manager, the generator selects any utterance in the corpus that is labeled with this dialog move for the current subdialog.
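
The generation-by-selection strategy described in the last item can be illustrated with a minimal Python sketch. This is not the toolkit's code: the class name `SelectionGenerator`, the triple layout of the corpus, the random choice among candidates, and the toy utterances are all assumptions made for illustration.

```python
import random
from collections import defaultdict


class SelectionGenerator:
    """Hypothetical generator: picks a stored utterance for a (subdialog, move) pair."""

    def __init__(self, corpus, seed=None):
        # corpus: iterable of (subdialog, dialog_move, utterance) triples.
        # This layout is an assumption, not the Emospeech corpus format.
        self._index = defaultdict(list)
        for subdialog, move, utterance in corpus:
            self._index[(subdialog, move)].append(utterance)
        self._rng = random.Random(seed)

    def generate(self, subdialog, move):
        # Any utterance labeled with this move in this subdialog is acceptable.
        candidates = self._index.get((subdialog, move))
        if not candidates:
            return None  # nothing in the corpus carries this label
        return self._rng.choice(candidates)


# Toy corpus with invented labels and utterances.
corpus = [
    ("intro", "greet", "Hello, welcome to the factory."),
    ("intro", "greet", "Hi there!"),
    ("intro", "ask_goal", "What are you looking for?"),
    ("workshop", "greet", "Welcome to the workshop."),
]

generator = SelectionGenerator(corpus, seed=0)
reply = generator.generate("intro", "greet")
print(reply)
```

Indexing the corpus by the pair (subdialog, dialog move) keeps selection a single dictionary lookup. Returning `None` for an unseen pair leaves the fallback decision to the caller, for instance handing over to a human Wizard.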

Additional Tools and Linguistic Resources:

  • Dialogue Configuration: A web tool for configuring the different dialogs in a game: the speakers (players and non-player characters), the game goals, and, for each dialog, its speakers and context goals.
  • Annotation Tools: A web tool for annotating both player utterances with dialogue moves and system propositional questions with the related context goals (i.e., the goals to be discussed in the sub-dialog).
  • The Emospeech Corpus: A case study for the Serious Game Mission Plastechnology. Emospeech Corpus comprises 1249 dialogs, 10454 utterances and 168509 words. It contains 3609 player utterances consisting of 31613 word tokens and 2969 word types, with approximately 100 conversations for each sub-dialog in the game. Dialog length varies from 78 to 142 with an average length of 106 utterances per dialog.
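
Two derived statistics for the player side of the corpus follow directly from the figures above. The short Python sketch below only divides the quoted counts; the variable names are illustrative.

```python
# Counts quoted from the corpus description above.
player_utterances = 3609
player_tokens = 31613
player_types = 2969

# Average length of a player utterance, in word tokens.
tokens_per_utterance = player_tokens / player_utterances

# Type-token ratio: distinct word types divided by word tokens.
type_token_ratio = player_types / player_tokens

print(f"{tokens_per_utterance:.2f} tokens per player utterance")
print(f"{type_token_ratio:.3f} type-token ratio")
```

This gives roughly 8.8 word tokens per player utterance and a type-token ratio of about 0.09.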

Contact: Claire Gardent, Lina Rojas

Contents © 2017 Christophe Cerisara - Powered by Nikola