
[debian-devel:12518] Re: multibyte character support on groff (Re: [fpolacco@debian.org: Installed groff 1.16-0 (source i386)])



At Thu, 22 Jun 2000 17:42:08 +0900,
Taketoshi Sano <kgh12351@xxxxxxxxxxx> wrote:

> I think I can check his code to see if it can show Japanese characters
> correctly. Some of our members also can do it, I hope.

I looked at utf-8 support in groff 1.16, but it doesn't seem to work.
For example,

 zcat < /usr/share/man/ja/man1/groff.1.gz| iconv -fEUC-JP -tUTF-8 | groff -Tutf8 -man | lv -Iu8

produces many warnings such as
<standard input>:18: warning: can't find special character `wchare5e7'

and broken output.  I think we need more work to support utf-8.

I just read the groff-1.16 source code, and it has no code to handle
utf-8 input.  In src/roff/troff/input.cc, void token::next() handles
single-byte characters (and 2-byte characters under #ifdef NIPPON), but
this code can't handle utf-8.  I think it's not so difficult to add utf-8
input support, but I wonder how to determine which encoding is used for
the input file.  It seems the device selected by the -T option is used
for output, so we need another option to specify the input encoding,
don't we?  Or can we assume the same encoding is used for both the input
file and the output?  In that case, what encoding should be used for
non-tty devices such as ps, dvi, X75, X100 ... ?
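
To give an idea of the decoding step itself, here is a minimal sketch of
reading one utf-8 sequence from a byte buffer.  This is not groff code:
utf8_decode() is a hypothetical helper, and how it would be wired into
token::next() (which reads input differently) is left open.  It only
shows the byte-level logic, with overlong forms, surrogates and truncated
sequences rejected.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of groff): decode one UTF-8 sequence
// starting at s[0], with n bytes available.
// Returns the number of bytes consumed (1..4) and stores the code point
// in *out, or returns 0 if the input is malformed or truncated.
std::size_t utf8_decode(const unsigned char *s, std::size_t n,
                        std::uint32_t *out)
{
  if (n == 0)
    return 0;

  unsigned char c = s[0];
  std::size_t len;     // total length of the sequence in bytes
  std::uint32_t cp;    // code point being assembled
  std::uint32_t min;   // smallest code point allowed for this length

  if (c < 0x80) {                      // 0xxxxxxx: plain ASCII
    *out = c;
    return 1;
  } else if ((c & 0xE0) == 0xC0) {     // 110xxxxx: 2-byte sequence
    len = 2; cp = c & 0x1F; min = 0x80;
  } else if ((c & 0xF0) == 0xE0) {     // 1110xxxx: 3-byte sequence
    len = 3; cp = c & 0x0F; min = 0x800;
  } else if ((c & 0xF8) == 0xF0) {     // 11110xxx: 4-byte sequence
    len = 4; cp = c & 0x07; min = 0x10000;
  } else {
    return 0;                          // stray continuation or bad lead byte
  }

  if (n < len)
    return 0;                          // truncated sequence

  for (std::size_t i = 1; i < len; i++) {
    if ((s[i] & 0xC0) != 0x80)         // continuation must be 10xxxxxx
      return 0;
    cp = (cp << 6) | (s[i] & 0x3F);
  }

  // Reject overlong encodings, surrogates and values beyond U+10FFFF.
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  *out = cp;
  return len;
}
```

For example, the three bytes E6 97 A5 decode to U+65E5.  The resulting
code point would still have to be mapped to a glyph name that the
selected device's font description knows about.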

In addition, we need more font definitions for multibyte characters
in font/devutf8.

Regards,
Fumitoshi UKAI