Diffstat (limited to 'tesseract/doc/dawg2wordlist.1.asc')
-rw-r--r-- | tesseract/doc/dawg2wordlist.1.asc | 45 |
1 files changed, 45 insertions, 0 deletions
diff --git a/tesseract/doc/dawg2wordlist.1.asc b/tesseract/doc/dawg2wordlist.1.asc
new file mode 100644
index 00000000..cbe18d89
--- /dev/null
+++ b/tesseract/doc/dawg2wordlist.1.asc
@@ -0,0 +1,45 @@
+DAWG2WORDLIST(1)
+================
+:doctype: manpage
+
+NAME
+----
+dawg2wordlist - convert a Tesseract DAWG to a wordlist
+
+SYNOPSIS
+--------
+*dawg2wordlist* 'UNICHARSET' 'DAWG' 'WORDLIST'
+
+DESCRIPTION
+-----------
+dawg2wordlist(1) converts a Tesseract Directed Acyclic Word
+Graph (DAWG) to a list of words using a unicharset as key.
+
+OPTIONS
+-------
+'UNICHARSET'
+    The unicharset of the language. This is the unicharset
+    generated by mftraining(1).
+
+'DAWG'
+    The input DAWG, created by wordlist2dawg(1)
+
+'WORDLIST'
+    Plain text (output) file in UTF-8, one word per line
+
+SEE ALSO
+--------
+tesseract(1), mftraining(1), wordlist2dawg(1), unicharset(5),
+combine_tessdata(1)
+
+<https://tesseract-ocr.github.io/tessdoc/Training-Tesseract.html>
+
+COPYING
+-------
+Copyright \(C) 2012 Google, Inc.
+Licensed under the Apache License, Version 2.0
+
+AUTHOR
+------
+The Tesseract OCR engine was written by Ray Smith and his research groups
+at Hewlett Packard (1985-1995) and Google (2006-present).
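The SYNOPSIS in the man page added by this diff takes three positional arguments in a fixed order: unicharset, input DAWG, output wordlist. As a minimal sketch of how that invocation could be scripted, the Python below builds the argument list in that order, runs the tool, and reads back the UTF-8 output, one word per line. The helper names (`build_command`, `read_wordlist`, `dawg_to_words`) and the file names in the usage note are hypothetical, and the sketch assumes `dawg2wordlist` is installed and on `PATH`.

```python
import subprocess
from pathlib import Path


def build_command(unicharset, dawg, wordlist):
    """Return the argv list in the order given by the SYNOPSIS:
    dawg2wordlist UNICHARSET DAWG WORDLIST."""
    return ["dawg2wordlist", str(unicharset), str(dawg), str(wordlist)]


def read_wordlist(path):
    """Read the UTF-8 output file (one word per line), skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def dawg_to_words(unicharset, dawg, wordlist):
    """Run dawg2wordlist (assumed to be on PATH) and return the words it wrote.

    Raises subprocess.CalledProcessError if the tool exits with a non-zero status.
    """
    subprocess.run(build_command(unicharset, dawg, wordlist), check=True)
    return read_wordlist(wordlist)
```

A call might look like `dawg_to_words("eng.unicharset", "eng.word-dawg", "words.txt")`, where all three file names are placeholders for files you already have. Keeping command construction separate from execution makes the argument order checkable without the binary present.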