public class Detector
extends java.lang.Object
The Detector class detects the language of a given text.
Instances are created through the factory class DetectorFactory.
After appending the target text to a Detector instance with append(Reader) or append(String),
call detect() or getProbabilities() to obtain the detection results for that text.
detect() returns the name of the single language with the highest probability.
getProbabilities() returns a list of candidate languages together with their probabilities.
The detector has several parameters that tune language detection.
See setAlpha(double), setMaxTextLength(int) and setPriorMap(HashMap).
import java.util.ArrayList;
import com.cybozu.labs.langdetect.Detector;
import com.cybozu.labs.langdetect.DetectorFactory;
import com.cybozu.labs.langdetect.LangDetectException;
import com.cybozu.labs.langdetect.Language;

class LangDetectSample {
    // Load the language profiles once, before creating any detector.
    public void init(String profileDirectory) throws LangDetectException {
        DetectorFactory.loadProfile(profileDirectory);
    }

    // Return the name of the single most probable language.
    public String detect(String text) throws LangDetectException {
        Detector detector = DetectorFactory.create();
        detector.append(text);
        return detector.detect();
    }

    // Return the candidate languages with their probabilities.
    public ArrayList<Language> detectLangs(String text) throws LangDetectException {
        Detector detector = DetectorFactory.create();
        detector.append(text);
        return detector.getProbabilities();
    }
}
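
The relationship between the two result methods described above (a list of candidates versus the single most probable language) can be sketched without the library. This is a minimal illustration, not the library's implementation: `Candidate` is a hypothetical stand-in for a (language, probability) pair, and `top` is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.List;

public class TopCandidateSketch {
    /** Hypothetical stand-in for a (language, probability) pair; not the library's Language class. */
    static final class Candidate {
        final String lang;
        final double prob;

        Candidate(String lang, double prob) {
            this.lang = lang;
            this.prob = prob;
        }
    }

    /** Returns the language name with the highest probability, or null for an empty list. */
    static String top(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (best == null || c.prob > best.prob) {
                best = c;
            }
        }
        return best == null ? null : best.lang;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate("fr", 0.29));
        candidates.add(new Candidate("en", 0.71));
        System.out.println(top(candidates)); // prints "en"
    }
}
```

Callers that only need one answer can use detect(); callers that want to apply their own threshold or inspect runners-up would work with the list from getProbabilities() instead.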
See also: DetectorFactory

| Constructor and Description |
|---|
| Detector(DetectorFactory factory): Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| void | append(java.io.Reader reader): Append the target text for language detection. |
| void | append(java.lang.String text): Append the target text for language detection. |
| java.lang.String | detect(): Detect the language of the target text and return the name of the language with the highest probability. |
| java.util.ArrayList<Language> | getProbabilities(): Get the language candidates that have high probabilities. |
| void | setAlpha(double alpha): Set the smoothing parameter. |
| void | setMaxTextLength(int max_text_length): Specify the maximum size of target text to use for language detection. |
| void | setPriorMap(java.util.HashMap<java.lang.String,java.lang.Double> priorMap): Set prior information about language probabilities. |
| void | setVerbose(): Set verbose mode (for debugging). |
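
As a rough illustration of the argument that setPriorMap expects, the sketch below builds a `HashMap<String, Double>` of per-language weights. The language codes "en" and "ja", the weights, and the helper name `buildPrior` are assumptions for illustration only; valid keys depend on which profiles were loaded. The library calls appear only in comments so that the block runs without the library on the classpath.

```java
import java.util.HashMap;

public class PriorMapSketch {
    /** Builds a map in the shape accepted by setPriorMap(HashMap<String, Double>). */
    static HashMap<String, Double> buildPrior() {
        HashMap<String, Double> prior = new HashMap<>();
        prior.put("en", 0.6); // illustrative weight, not a recommended value
        prior.put("ja", 0.4); // illustrative weight, not a recommended value
        return prior;
    }

    public static void main(String[] args) {
        HashMap<String, Double> prior = buildPrior();
        System.out.println(prior.size()); // prints 2

        // With the library available, the map would be passed to a detector
        // before appending text, for example:
        //   Detector detector = DetectorFactory.create();
        //   detector.setPriorMap(prior); // declared to throw LangDetectException
    }
}
```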
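
public Detector(DetectorFactory factory)
Constructor. Instances are constructed via DetectorFactory.create().
Parameters: factory - DetectorFactory instance (only DetectorFactory inside)

public void setVerbose()
Set verbose mode (for debugging).

public void setAlpha(double alpha)
Set the smoothing parameter.
Parameters: alpha - the smoothing parameter

public void setPriorMap(java.util.HashMap<java.lang.String,java.lang.Double> priorMap) throws LangDetectException
Set prior information about language probabilities.
Parameters: priorMap - the priorMap to set
Throws: LangDetectException

public void setMaxTextLength(int max_text_length)
Specify the maximum size of target text to use for language detection.
Parameters: max_text_length - the max_text_length to set

public void append(java.io.Reader reader) throws java.io.IOException
Append the target text for language detection. If the total size of the target text exceeds the limit set by setMaxTextLength(int), the rest is cut down.
Parameters: reader - the input reader (BufferedReader as usual)
Throws: java.io.IOException - Can't read the reader.

public void append(java.lang.String text)
Append the target text for language detection. If the total size of the target text exceeds the limit set by setMaxTextLength(int), the rest is cut down.
Parameters: text - the target text to append

public java.lang.String detect() throws LangDetectException
Detect the language of the target text and return the name of the language with the highest probability.
Throws: LangDetectException - code = ErrorCode.CantDetectError : Can't detect because of no valid features in text

public java.util.ArrayList<Language> getProbabilities() throws LangDetectException
Get the language candidates that have high probabilities.
Throws: LangDetectException - code = ErrorCode.CantDetectError : Can't detect because of no valid features in text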
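
The length limit described for append (text beyond the size set by setMaxTextLength(int) is cut down) can be pictured with a small standalone sketch. This is an assumption-laden illustration of the documented behavior, not the library's code: `readUpTo` is a hypothetical helper that reads at most `maxLength` characters from a reader and ignores the remainder.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class TruncateSketch {
    /** Reads at most maxLength characters from the reader; anything beyond that is left unread. */
    static String readUpTo(Reader reader, int maxLength) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        while (sb.length() < maxLength) {
            // Never request more characters than the remaining budget.
            int want = Math.min(buf.length, maxLength - sb.length());
            int n = reader.read(buf, 0, want);
            if (n < 0) {
                break; // end of input
            }
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String kept = readUpTo(new StringReader("language detection"), 8);
        System.out.println(kept); // prints "language"
    }
}
```

A practical consequence of such a limit is that only the beginning of a long input influences the result, so very long documents whose language changes partway through may be classified by their opening section alone.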