Hyphen - hyphenation library to use converted TeX hyphenation patterns

(C) 1998 Raph Levien
(C) 2001 ALTLinux, Moscow
(C) 2006, 2007, 2008, 2010, 2011 László Németh

This was part of the libHnj library by Raph Levien.

Peter Novodvorsky from ALTLinux cut the hyphenation part from libHnj
to use it in OpenOffice.org.

Compound word and non-standard hyphenation support by László Németh.

License is the original LibHnj license:
LibHnj is dual licensed under LGPL and MPL (see also README.libhnj).

Because LGPL allows GPL relicensing, COPYING now contains an
LGPL/GPL/MPL tri-license for explicit Mozilla source compatibility.

The original Libhnj source with OOo's patches is managed by Rene Engelhard
and Chris Halls at Debian:

http://packages.debian.org/stable/libdevel/libhnj-dev
and http://packages.debian.org/unstable/source/libhnj


OTHER FILES

This distribution is also the source of the en_US hyphenation patterns
"hyph_en_US.dic". See README_hyph_en_US.txt.

Source files of hyph_en_US.dic in the distribution:

hyphen.tex (en_US hyphenation patterns from plain TeX)

  Source: http://tug.ctan.org/text-archive/macros/plain/base/hyphen.tex

tbhyphext.tex: hyphenation exception log from the TUGboat archive

  Source of the hyphenation exception list:
  http://www.ctan.org/tex-archive/info/digests/tugboat/tb0hyf.tex

  Generated with the hyphenex script
  (http://www.ctan.org/tex-archive/info/digests/tugboat/hyphenex.sh)

  sh hyphenex.sh <tb0hyf.tex >tbhyphext.tex


INSTALLATION

  ./configure
  make
  make install

UNIT TESTS (WITH VALGRIND DEBUGGER)

  make check
  VALGRIND=memcheck make check

USAGE

  ./example hyph_en_US.dic mywords.txt

or (under Linux)

  echo example | ./example hyph_en_US.dic /dev/stdin

NOTE: For Unicode-encoded input, convert your words to lowercase
before hyphenation (in a UTF-8 console environment):

  cat mywords.txt | awk '{print tolower($0)}' >mywordslow.txt
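
Whether awk's tolower() handles non-ASCII letters can depend on the awk
implementation and the locale. As an alternative, the same lowercasing step
can be sketched in Python, whose str.lower() works on Unicode text
regardless of the console locale. The function name lowercase_words and the
file names in the comment are assumptions made for this illustration; they
are not part of the distribution.

```python
import io


def lowercase_words(src, dst):
    """Copy the text stream src to dst, lowercasing every line."""
    for line in src:
        dst.write(line.lower())


# Small in-memory demonstration (the words are arbitrary samples).
demo_in = io.StringIO("Árvíztűrő\nEXAMPLE\n")
demo_out = io.StringIO()
lowercase_words(demo_in, demo_out)
print(demo_out.getvalue(), end="")

# With real files, open both explicitly as UTF-8, for example:
#   with open("mywords.txt", encoding="utf-8") as src, \
#        open("mywordslow.txt", "w", encoding="utf-8") as dst:
#       lowercase_words(src, dst)
```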

DEVELOPMENT

See README.hyphen for the hyphenation algorithm, README.nonstandard
and doc/tb87nemeth.pdf for non-standard hyphenation,
README.compound for compound word hyphenation, and tests/*.

Description of the dictionary format:

The first line contains the character encoding (ISO8859-x, UTF-8).

Possible options in the following lines:

LEFTHYPHENMIN num           minimal hyphenation distance from the left word end
RIGHTHYPHENMIN num          minimal hyphenation distance from the right word end
COMPOUNDLEFTHYPHENMIN num   min. hyph. dist. from the left compound word boundary
COMPOUNDRIGHTHYPHENMIN num  min. hyph. dist. from the right compound word boundary

hyphenation patterns        see README.* files

NEXTWORD                    separates the two compound sets (see README.compound)

Default values:
Without explicit declarations, the hyphenmin fields of the dict struct
are zero, but in this case lefthyphenmin and righthyphenmin
default to 2 during hyphenation (for backward compatibility).
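
To illustrate how the minimal distances and their defaults interact, the
following Python sketch filters a list of candidate hyphenation points. It
is not taken from the library source: the function names and the
representation of a break point (the number of characters before the
hyphen) are assumptions made for this example, and the candidate list is
invented rather than produced by real patterns.

```python
DEFAULT_HYPHENMIN = 2  # fallback used when no value was declared


def effective_hyphenmin(value):
    """Return the declared value, or the default 2 when it is zero."""
    return value if value > 0 else DEFAULT_HYPHENMIN


def allowed_breaks(word, breaks, lefthyphenmin=0, righthyphenmin=0):
    """Keep only the break points far enough from both word ends.

    A break point i means "hyphen after the first i characters"
    (an assumed representation, chosen for this sketch).
    """
    left = effective_hyphenmin(lefthyphenmin)
    right = effective_hyphenmin(righthyphenmin)
    return [i for i in breaks if i >= left and len(word) - i >= right]


# Hypothetical candidate break points for "example".
candidates = [1, 2, 4, 6]
print(allowed_breaks("example", candidates))        # nothing declared: 2 and 2
print(allowed_breaks("example", candidates, 2, 4))  # RIGHTHYPHENMIN 4
```

With nothing declared, the points 1 and 6 are dropped because they would
leave a single character at one end of the word.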

Comments

Use a percent sign at the beginning of a line to add comments to your
hyphenation patterns (after the character encoding in the first line):

% comment
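
The layout of a dictionary file (encoding in the first line, then option
lines, comments and patterns) can be read with a few lines of code. The
Python sketch below is only an illustration: parse_dic_header is a
hypothetical helper, not part of the library, and the sample pattern line
is invented. Undeclared options are left at zero, and NEXTWORD and pattern
lines are returned unchanged.

```python
OPTION_NAMES = (
    "LEFTHYPHENMIN",
    "RIGHTHYPHENMIN",
    "COMPOUNDLEFTHYPHENMIN",
    "COMPOUNDRIGHTHYPHENMIN",
)


def parse_dic_header(text):
    """Split dictionary text into (encoding, options, remaining lines)."""
    lines = text.splitlines()
    if not lines:
        raise ValueError("the first line must name the character encoding")
    encoding = lines[0].strip()
    options = dict.fromkeys(OPTION_NAMES, 0)  # zero means "not declared"
    rest = []
    for line in lines[1:]:
        if not line.strip() or line.startswith("%"):
            continue  # skip blank lines and percent-sign comments
        fields = line.split()
        if len(fields) == 2 and fields[0] in options and fields[1].isdigit():
            options[fields[0]] = int(fields[1])
        else:
            rest.append(line.strip())  # pattern or NEXTWORD line
    return encoding, options, rest


# Invented sample content, for demonstration only.
sample = "UTF-8\n% comment\nLEFTHYPHENMIN 2\nRIGHTHYPHENMIN 3\n.ex1am\nNEXTWORD\n"
encoding, options, rest = parse_dic_header(sample)
print(encoding, options["LEFTHYPHENMIN"], options["RIGHTHYPHENMIN"], rest)
```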

*****************************************************************************
* Warning! Correct working of Libhnj *needs* prepared hyphenation patterns. *

For example, generating hyph_en_US.dic from "hyphen.us" TeX patterns:

  perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1

or with default LEFTHYPHENMIN and RIGHTHYPHENMIN values:

  perl substrings.pl hyphen.us hyph_en_US.dic ISO8859-1 2 3
  perl substrings.pl hyphen.gb hyph_en_GB.dic ISO8859-1 3 3
*****************************************************************************

OTHERS

Java hyphenation: Peter B. West (Folio project) implements a hyphenator with
non-standard hyphenation facilities based on extended Libhnj. The HyFo module
is released in binary form as jar files and in source form as zip files.
See http://sourceforge.net/project/showfiles.php?group_id=119136

László Németh
<nemeth (at) numbertext (dot) org>