intl/hyphenation/src/README

Thu, 22 Jan 2015 13:21:57 +0100

author
Michael Schloh von Bennewitz <michael@schloh.com>
branch
TOR_BUG_9701
changeset 15
b8a032363ba2

Incorporate requested changes from Mozilla in review:
https://bugzilla.mozilla.org/show_bug.cgi?id=1123480#c6

Hyphen - hyphenation library to use converted TeX hyphenation patterns

(C) 1998 Raph Levien
(C) 2001 ALTLinux, Moscow
(C) 2006, 2007, 2008, 2010, 2011 László Németh

This was part of the libHnj library by Raph Levien.

Peter Novodvorsky from ALTLinux extracted the hyphenation part of libHnj
for use in OpenOffice.org.

Compound word and non-standard hyphenation support by László Németh.

License is the original LibHnj license:
LibHnj is dual licensed under LGPL and MPL (see also README.libhnj).

Because LGPL allows GPL relicensing, COPYING now contains an
LGPL/GPL/MPL tri-license for explicit Mozilla source compatibility.

The original Libhnj source with OOo's patches is maintained by Rene Engelhard
and Chris Halls at Debian:

http://packages.debian.org/stable/libdevel/libhnj-dev
and http://packages.debian.org/unstable/source/libhnj

OTHER FILES

This distribution is also the source of the en_US hyphenation patterns
"hyph_en_US.dic". See README_hyph_en_US.txt.

Source files of hyph_en_US.dic in the distribution:

hyphen.tex (en_US hyphenation patterns from plain TeX)

  Source: http://tug.ctan.org/text-archive/macros/plain/base/hyphen.tex

tbhyphext.tex: hyphenation exception log from the TUGboat archive

  Source of the hyphenation exception list:
  http://www.ctan.org/tex-archive/info/digests/tugboat/tb0hyf.tex

  Generated with the hyphenex script
  (http://www.ctan.org/tex-archive/info/digests/tugboat/hyphenex.sh)

  sh hyphenex.sh <tb0hyf.tex >tbhyphext.tex

INSTALLATION

./configure
make
make install

UNIT TESTS (WITH VALGRIND DEBUGGER)

make check
VALGRIND=memcheck make check

USAGE

./example hyph_en_US.dic mywords.txt

or (under Linux)

echo example | ./example hyph_en_US.dic /dev/stdin

NOTE: In the case of Unicode-encoded input, convert your words
to lowercase before hyphenation (in a UTF-8 console environment):

awk '{print tolower($0)}' mywords.txt >mywordslow.txt

DEVELOPMENT

See README.hyphen for the hyphenation algorithm, README.nonstandard
and doc/tb87nemeth.pdf for non-standard hyphenation,
README.compound for compound word hyphenation, and tests/*.

Description of the dictionary format:

The first line contains the character encoding (ISO8859-x, UTF-8).

Possible options in the following lines:

LEFTHYPHENMIN num          minimal hyphenation distance from the left word end
RIGHTHYPHENMIN num         minimal hyphenation distance from the right word end
COMPOUNDLEFTHYPHENMIN num  min. hyph. dist. from the left compound word boundary
COMPOUNDRIGHTHYPHENMIN num min. hyph. dist. from the right comp. word boundary

hyphenation patterns       see README.* files

NEXTWORD                   separate the two compound sets (see README.compound)

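The numeric options listed above can be recognized with a few lines of C.
The following is only a minimal sketch, not the parser used by the library:
the struct hyph_options, its field names and the function parse_option_line
are hypothetical names chosen for this illustration. It assumes each option
is written as a keyword followed by a decimal number on one line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the four numeric options (not the library's struct). */
struct hyph_options {
    int lefthyphenmin;    /* LEFTHYPHENMIN */
    int righthyphenmin;   /* RIGHTHYPHENMIN */
    int clefthyphenmin;   /* COMPOUNDLEFTHYPHENMIN */
    int crighthyphenmin;  /* COMPOUNDRIGHTHYPHENMIN */
};

/*
 * Try to read one option line such as "LEFTHYPHENMIN 2".
 * Returns 1 if the line set an option, 0 otherwise
 * (for example a pattern line, a comment line or NEXTWORD).
 */
int parse_option_line(const char *line, struct hyph_options *opts)
{
    char key[32];
    int value;

    assert(line != NULL && opts != NULL);

    /* %31s reads the keyword, %d the number that must follow it. */
    if (sscanf(line, "%31s %d", key, &value) != 2)
        return 0;

    if (strcmp(key, "LEFTHYPHENMIN") == 0)
        opts->lefthyphenmin = value;
    else if (strcmp(key, "RIGHTHYPHENMIN") == 0)
        opts->righthyphenmin = value;
    else if (strcmp(key, "COMPOUNDLEFTHYPHENMIN") == 0)
        opts->clefthyphenmin = value;
    else if (strcmp(key, "COMPOUNDRIGHTHYPHENMIN") == 0)
        opts->crighthyphenmin = value;
    else
        return 0;  /* keyword plus number, but not a known option */

    return 1;
}
```

A caller would zero-initialize a struct hyph_options, skip the encoding line,
and pass each following line to parse_option_line; lines for which it returns
0 are left for the pattern reader.
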
Default values:
Without explicit declarations, the hyphenmin fields of the dict struct
are zero; in that case lefthyphenmin and righthyphenmin default to 2
during hyphenation (for backward compatibility).

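The fallback rule described above can be written as a one-line helper. This
is a sketch of the rule only, not code taken from the library; the names
effective_hyphenmin and DEFAULT_HYPHENMIN are hypothetical.

```c
#include <assert.h>

/* Default used when a dictionary declares no value (field left at zero). */
enum { DEFAULT_HYPHENMIN = 2 };

/*
 * Return the minimum to apply during hyphenation: the declared value
 * if one was given, otherwise the backward-compatible default of 2.
 */
int effective_hyphenmin(int declared)
{
    assert(declared >= 0);
    return declared > 0 ? declared : DEFAULT_HYPHENMIN;
}
```

With this helper, a dictionary that contains no LEFTHYPHENMIN line and one
that contains "LEFTHYPHENMIN 2" are treated the same way.
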
Comments

Use a percent sign at the beginning of a line to add comments to your
hyphenation patterns (after the character encoding in the first line):

% comment

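A reader of the dictionary file can detect such lines by checking the first
character. The helper below is a small sketch under that assumption; the
name is_comment_line is hypothetical and the check is deliberately strict,
so a percent sign preceded by whitespace does not count.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 when the line is a comment, i.e. its first character is '%'. */
int is_comment_line(const char *line)
{
    assert(line != NULL);
    return line[0] == '%';
}
```
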
*****************************************************************************
* Warning! Correct working of Libhnj *needs* prepared hyphenation patterns. *

For example, generating hyph_en_US.dic from "hyphen.us" TeX patterns:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1

or, to set default LEFTHYPHENMIN and RIGHTHYPHENMIN values:

perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1 2 3
perl substrings.pl hyphen.gb hyph_en_GB.dic ISO8859-1 3 3
****************************************************************************

OTHERS

Java hyphenation: Peter B. West (Folio project) implements a hyphenator with
non-standard hyphenation facilities based on extended Libhnj. The HyFo module
is released in binary form as jar files and in source form as zip files.
See http://sourceforge.net/project/showfiles.php?group_id=119136

László Németh
<nemeth (at) numbertext (dot) org>
