/*
 * Copyright 1999-2002 Carnegie Mellon University.
 * Portions Copyright 2002 Sun Microsystems, Inc.
 * Portions Copyright 2002 Mitsubishi Electric Research Laboratories.
 * All Rights Reserved. Use is subject to license terms.
 *
 * See the file "license.terms" for information on usage and
 * redistribution of this file, and for a DISCLAIMER OF ALL
 * WARRANTIES.
 *
 */

package edu.cmu.sphinx.linguist;

import edu.cmu.sphinx.util.props.Configurable;
import edu.cmu.sphinx.util.props.S4Double;

import java.io.IOException;

/**
 * The linguist is responsible for representing and managing the search space for the decoder. The role of the linguist
 * is to provide, upon request, the search graph that is to be used by the decoder. The linguist is a generic interface
 * that provides language model services.
 * <p>
 * The main role of any linguist is to represent the search space for the decoder. The search space can be retrieved by
 * a SearchManager via a call to <code>getSearchGraph</code>. This method returns a SearchGraph. The initial state in
 * the search graph can be retrieved via a call to <code>getInitialState</code>. Successor states can be retrieved via
 * calls to <code>SearchState.getSuccessors()</code>. There are a number of search state subinterfaces that are used to
 * indicate different types of states in the search space:
 * <ul>
 * <li><b>WordSearchState</b> - represents a word in the search space.
 * <li><b>UnitSearchState</b> - represents a unit in the search space.
 * <li><b>HMMSearchState</b> - represents an HMM state in the search space.
 * </ul>
 * A linguist has a great deal of latitude about the order in which it returns states. For instance, a 'flat' linguist
 * may return a WordState at the beginning of a word, while a 'tree' linguist may return WordStates at the ending of a
 * word. Likewise, a linguist may omit certain state types completely (such as a unit state).
 * Some SearchManagers may want to know a priori the order in which different state types will be generated by the
 * linguist. The method <code>SearchGraph.getNumStateOrder()</code> can be used to retrieve the number of state types
 * that will be returned by the linguist. The method <code>SearchState.getOrder()</code> returns the ranking for a
 * particular state.
 * <p>
 * Depending on the vocabulary size and topology, the search space represented by the linguist may include a very large
 * number of states. Some linguists will generate the search states dynamically, that is, the object representing a
 * particular state in the search space is not created until it is needed by the SearchManager. SearchManagers often
 * need to determine whether a particular state has been entered before by comparing states. Because SearchStates
 * may be generated dynamically, the <code>SearchState.equals()</code> call (as opposed to the reference-equality
 * operator <code>==</code>) should be used to determine if states are equal. The states returned by the linguist will
 * generally provide very efficient implementations of <code>equals</code> and <code>hashCode</code>. This will allow
 * a SearchManager to maintain collections of states in HashMaps efficiently.
 * <p>
 * The lifecycle of a linguist is as follows:
 * <ul>
 * <li> The linguist is created by the configuration manager.
 * <li> The linguist is given an opportunity to register its properties via a call to its <code>register</code> method.
 * <li> The linguist is given a new set of properties via the <code>newProperties</code> call. A well-written linguist
 * should be prepared to respond to a <code>newProperties</code> call at any time.
 * <li> The <code>allocate</code> method is called. During this call the linguist generally allocates resources such as
 * acoustic and language models. This can often take a significant amount of time. A well-written linguist will be able
 * to deal with multiple calls to <code>allocate</code>.
 * This can happen if a linguist is shared by multiple search managers.
 * <li> The <code>getSearchGraph</code> method is called by the search to retrieve the search graph that is used to
 * guide the decoding/search. This method is typically called at the beginning of each recognition. The linguist should
 * endeavor to return the search graph as quickly as possible to reduce any recognition latency. Some linguists will
 * pre-generate the search graph in the <code>allocate</code> method, and only need to return a reference to the search
 * graph, while other linguists may dynamically generate the search graph on each call. Also note that some linguists
 * may change the search graph between calls, so a search manager should always get a new search graph before the start
 * of each recognition.
 * <li> The <code>startRecognition</code> method is called just before recognition starts. This gives the linguist the
 * opportunity to prepare for the recognition task. Some linguists may keep caches of search states that need to be
 * primed or flushed. Note, however, that a linguist that depends on <code>startRecognition</code> or
 * <code>stopRecognition</code> is unlikely to be reentrant, which could limit its usefulness in some
 * multi-threaded environments.
 * <li> The <code>stopRecognition</code> method is called just after recognition completes. This gives the linguist the
 * opportunity to clean up after the recognition task. Some linguists may keep caches of search states that need to be
 * primed or flushed. Note, however, that a linguist that depends on <code>startRecognition</code> or
 * <code>stopRecognition</code> is unlikely to be reentrant, which could limit its usefulness in some
 * multi-threaded environments.
 * </ul>
 */
public interface Linguist extends Configurable {

    /** Word insertion probability property */
    @S4Double(defaultValue = 1.0)
    public final static String PROP_WORD_INSERTION_PROBABILITY = "wordInsertionProbability";

    /** Unit insertion probability property */
    @S4Double(defaultValue = 1.0)
    public final static String PROP_UNIT_INSERTION_PROBABILITY = "unitInsertionProbability";

    /** Silence insertion probability property */
    @S4Double(defaultValue = 1.0)
    public final static String PROP_SILENCE_INSERTION_PROBABILITY = "silenceInsertionProbability";

    /** Filler insertion probability property */
    @S4Double(defaultValue = 1.0)
    public final static String PROP_FILLER_INSERTION_PROBABILITY = "fillerInsertionProbability";

    /** The property that defines the language weight for the search */
    @S4Double(defaultValue = 1.0)
    public final static String PROP_LANGUAGE_WEIGHT = "languageWeight";

    /**
     * Retrieves the search graph. The search graph represents the search space to be used to guide the search.
     * <p>
     * Implementor's Note - This method is typically called at the beginning of each recognition and therefore should
     * return as quickly as possible to reduce recognition latency.
     *
     * @return the search graph
     */
    public SearchGraph getSearchGraph();

    /**
     * Called before a recognition. This method gives a linguist the opportunity to prepare itself before a recognition
     * begins.
     * <p>
     * Implementor's Note - Some linguists (or underlying language or acoustic models) may keep caches or pools that
     * need to be initialized before a recognition. A linguist may implement this method to perform such initialization.
     * Note, however, that an ideal linguist will, once allocated, be stateless. This will allow the linguist to be
     * shared by multiple simultaneous searches. Reliance on <code>startRecognition</code> may prevent a linguist from
     * being used in a multi-threaded search.
     */
    public void startRecognition();

    /**
     * Called after a recognition. This method gives a linguist the opportunity to clean up after a recognition has been
     * completed.
     * <p>
     * Implementor's Note - Some linguists (or underlying language or acoustic models) may keep caches or pools that
     * need to be flushed after a recognition. A linguist may implement this method to perform such flushing. Note,
     * however, that an ideal linguist will, once allocated, be stateless. This will allow the linguist to be shared by
     * multiple simultaneous searches. Reliance on <code>stopRecognition</code> may prevent a linguist from being used
     * in a multi-threaded search.
     */
    public void stopRecognition();

    /**
     * Allocates the linguist. Resources allocated by the linguist are allocated here. This method may take many seconds
     * to complete depending upon the linguist.
     * <p>
     * Implementor's Note - A well-written linguist will allow allocate to be called multiple times without harm. This
     * will allow a linguist to be shared by multiple search managers.
     *
     * @throws IOException if an IO error occurs
     */
    public void allocate() throws IOException;

    /**
     * Deallocates the linguist. Any resources allocated by this linguist are released.
     * <p>
     * Implementor's Note - If the linguist is being shared by multiple searches, deallocate should only actually
     * release resources when the last call to deallocate is made. Two approaches for dealing with this:
     * <p>
     * (1) Keep an allocation counter that is incremented during allocate and decremented during deallocate. Only when
     * the counter reaches zero should the actual deallocation be performed.
     * <p>
     * (2) Do nothing in deallocate - just let the GC take care of things.
     *
     * @throws IOException if an IO error occurs
     */
    public void deallocate() throws IOException;
}
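
/*
Illustrative sketch only, kept inside a comment so that it does not affect compilation of this file.
Approach (1) from the deallocate note above, an allocation counter, might look roughly like the
following. RefCountedResource, isLoaded and loadCount are hypothetical names invented for this
sketch and are not part of Sphinx-4; the boolean flag stands in for the acoustic and language
models that a real linguist would load and release.

```java
// Hypothetical stand-in for a shared linguist; not part of the Sphinx-4 API.
class RefCountedResource {
    private int allocations; // outstanding allocate() calls
    private boolean loaded;  // true while the (simulated) models are held
    private int loadCount;   // number of times the expensive load actually ran

    // The first caller triggers the load; later callers only bump the counter.
    public synchronized void allocate() {
        if (allocations == 0) {
            loaded = true;   // a real linguist would load its models here
            loadCount++;
        }
        allocations++;
    }

    // Releases the resources only when the last user deallocates.
    public synchronized void deallocate() {
        if (allocations == 0) {
            return;          // unmatched call: nothing to release
        }
        allocations--;
        if (allocations == 0) {
            loaded = false;  // a real linguist would release its models here
        }
    }

    public synchronized boolean isLoaded() {
        return loaded;
    }

    public synchronized int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        RefCountedResource shared = new RefCountedResource();
        shared.allocate();   // first search manager
        shared.allocate();   // second search manager, no second load
        shared.deallocate(); // one user left, resources still held
        System.out.println(shared.isLoaded() + " " + shared.loadCount());
        shared.deallocate(); // last user gone, resources released
        System.out.println(shared.isLoaded());
    }
}
```

The methods are synchronized because the note above assumes several search managers may share one
linguist; whether a given linguist needs that depends on how it is actually shared.
*/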