[debian-devel:10635] Re: man-db locale support



Hi, Fabrizio! Thank you for your work. Debian's standard man-db
now works completely on my system. We no longer need man-db-ja from
the JP packages for the coming potato release or later.

# man-db-ja has been removed from the unstable-jp tree on the Debian JP ftp server.

 At Fri, 15 Oct 1999 00:52:18 +0900,
   Fumitoshi UKAI <ukai@debian.or.jp> writes:

> At Thu, 14 Oct 1999 18:31:33 +0300,
> Fabrizio Polacco <fab@xxxxxxxx> wrote:
> 
> > > > # LANG=ja_JP.ujis is needed for jless, and LANG=ja does not work 
> > > > # for us currently,,,

Sorry, this was my misunderstanding. jless now displays Japanese characters
correctly on kterm, krxvt (in rxvt-ml), and kon (in kon2) with LANG=ja.

> > > This can be fixed by jless, see charset.c.
> > > Or use JLESSCHARSET instead of LESSCHARSET that is overridden by man-db.

I don't like the idea of having JLESSCHARSET. It was required because the
old man-db set LESSCHARSET=latin1, but the current man-db has changed to
match our requirements, so we should not need this extra setting.

But jless works with LANG=ja as well as with LANG=ja_JP.ujis,
so this is in fact not a problem.

> > ah, I always miss a part of the cake :-)
> > What is needed by jless?
> > Do you need that LESSCHARSET is set to something different when LANG is
> > set to ja_JP.ujis or ja or what?

Excuse me, Fabrizio. I misunderstood. LESSCHARSET=ja works well when
LANG is set to ja as well as to ja_JP.ujis. Moreover, even if LANG is set
to any of ja_JP, ja_JP.eucJP, ja_JP.jis, and ja_JP.sjis, jless shows
Japanese text correctly on terminals that are capable of displaying
Japanese characters in the eucJP or JIS encoding (char-set).

# btw, ukai, do you know if krxvt can display SJIS text? I checked it
# using nkf -s (some Japanese text), and it seems that krxvt can't.

> I think LESSCHARSET=ja for LANG=ja* works no problem.
> check struct charset charsets[] and search_charset() in jless-332iso242.
> strncmp(name="ja", p->name="japanese", namelen=strlen("ja")) == 0, so
> "ja" will match one of japanese* in charsets[].

Yes. Thank you, ukai, for clearing things up.
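
To make the prefix match concrete, here is a minimal sketch in Python of
what ukai describes. It is not the code from jless: the table below is only
an illustrative stand-in for charsets[] in charset.c, and the function
merely imitates the strncmp() comparison quoted above.

```python
# Illustrative stand-in for charsets[]; the real table is in jless's charset.c.
CHARSETS = ["ascii", "latin1", "japanese"]


def search_charset(name):
    """Return the first table entry that begins with `name`, or None.

    Imitates strncmp(name, p->name, strlen(name)) == 0: only the first
    len(name) characters of each entry are compared.
    """
    for entry in CHARSETS:
        if entry[:len(name)] == name:
            return entry
    return None


print(search_charset("ja"))    # japanese
print(search_charset("ujis"))  # None
```

Because only the length of the given name is compared, any value that is a
prefix of a table entry matches it, which is why LESSCHARSET=ja is enough.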
 
> To summarize, in order to read Japanese manual pages:
>  * install recent version of man-db, jgroff, jless, manpages-ja
> 
>  * set LANG=ja, because manpages-ja installs manpages to 
>    /usr/share/man/ja/man[1-8]
>    If LANG=ja_JP.ujis, then man-db fails to find Japanese manpages in
>    /usr/share/man/ja/man[1-8].
> 
>    We'd be happy if man-db found /usr/share/man/ja/man[1-8] as well as
>    /usr/share/man/ja_JP.ujis/man[1-8] when LANG=ja_JP.ujis.

I don't know if this is the proper solution. Alternatively, I wonder if we
should file reports in the BTS against the related packages, asking to move
the eucJP-encoded manpages from the ja/ tree to the ja_JP.ujis/ tree.

The ja_JP.ujis directory name is given as an example of a manpage
directory in FHS 2.0, so it seems to be the standard-compliant name.

There are discussions in Debian JP about the preferred place for manpages
written in Japanese. The proposal there is to move manpages written in
eucJP from the ja/ tree to the ja_JP.ujis/ tree, and to make a link with

   dh_link usr/share/man/ja_JP.ujis/man1/foo.1.gz \
             usr/share/man/ja/man1/foo.1.gz

I think this is a reasonable solution for potato. Isn't it?
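
For comparison, the lookup that ukai wishes for above (also trying ja/ when
LANG=ja_JP.ujis) could be sketched in Python as below. This is not man-db's
actual code: man_locale_dirs is a hypothetical name, and the fallback order
(full name, then without the character set, then the language alone) is my
assumption.

```python
def man_locale_dirs(lang, base="/usr/share/man"):
    """List locale-specific manpage directories to try, most specific first.

    Hypothetical helper, not taken from man-db.
    """
    candidates = [lang]
    without_charset = lang.split(".", 1)[0]            # ja_JP.ujis -> ja_JP
    language_only = without_charset.split("_", 1)[0]   # ja_JP -> ja
    for name in (without_charset, language_only):
        if name not in candidates:
            candidates.append(name)
    return [f"{base}/{name}" for name in candidates]


for directory in man_locale_dirs("ja_JP.ujis"):
    print(directory)
```

With a lookup like this, manpages could stay in either tree. With the
dh_link approach, an exact-match lookup is already enough, because both
trees contain the page.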

> > Please can you make a complete table of the needs of the various
> > programs involved, and not simply tell me the minimum changes that make
> > things work on your machine?

Yes, I will try to do so. But the encoding (char set) problem is difficult
even for me, a native Japanese speaker, and I may make mistakes sometimes.
Sorry for that.

> Anyway, I'm happy with lv instead of jless for PAGER.
> (However, lv can't recognize prompt string for now.  
>  I requested to support it to lv maintainer.)

Hmm, lv is not a Debian package yet, is it? Or have you uploaded it already?

P.S.

Though FHS 2.0 shows ja_JP.ujis as the locale name for the Japanese EUC-JP
char-set, in X11 (/usr/X11R6/lib/X11/locale/locale.alias) japanese is
an alias for ja_JP.eucJP, while glibc2 (/usr/share/locale/locale.alias)
has the following:

  japanese        ja_JP.SJIS
  japanese.euc    ja_JP.eucJP

I know neither the reason why ja_JP.ujis is used in FHS 2.0 nor
the reason why japanese maps to ja_JP.SJIS in glibc2.
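
To show how such entries resolve, here is a small Python sketch that parses
the two glibc2 lines quoted above. The parser is my own illustration, not
glibc code; I assume whitespace-separated "alias locale" pairs and that "#"
starts a comment.

```python
# The two glibc2 entries quoted above.
ALIAS_TEXT = """\
japanese        ja_JP.SJIS
japanese.euc    ja_JP.eucJP
"""


def parse_locale_alias(text):
    """Map each alias to its locale name, skipping blanks and comments."""
    aliases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumed comment syntax
        fields = line.split()
        if len(fields) >= 2:
            aliases[fields[0]] = fields[1]
    return aliases


aliases = parse_locale_alias(ALIAS_TEXT)
print(aliases["japanese"])      # ja_JP.SJIS
print(aliases["japanese.euc"])  # ja_JP.eucJP
```

So with these entries the plain name japanese selects SJIS, and EUC-JP has
to be asked for explicitly as japanese.euc.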

But I wonder whether we should switch from ja_JP.ujis to ja_JP.eucJP
in order to conform to the glibc2 and X11 definitions. This would require
a change to the examples in FHS 2.0.

FHS 2.0 says:

   The <character-set> field should represent the standard describing the
   character set.  If the <character-set> field is just a numeric
   specification, the number represents the number of the international
   standard describing the character set.  It is recommended that this be a
   numeric representation if possible (ISO standards, especially), not
   include additional punctuation symbols, and that any letters be in
   lowercase.

So ujis is all lowercase letters, and eucJP is not.
But I can't understand at all why this lowercase preference is important.
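
The recommendation quoted above can be turned into a rough check, sketched
here in Python. The function name is hypothetical and the rule is only my
reading of the FHS text (letters in lowercase, no punctuation in the
<character-set> field); it is not an official validator.

```python
def follows_fhs_charset_advice(locale_name):
    """Check the <character-set> field against my reading of the FHS advice.

    Hypothetical helper: True when the field is absent, or when it has
    no punctuation and all of its letters are lowercase.
    """
    name = locale_name.split(",", 1)[0]   # drop an optional ",<version>"
    _, dot, charset = name.partition(".")
    if not dot:
        return True                       # no <character-set> field at all
    return charset.isalnum() and charset == charset.lower()


print(follows_fhs_charset_advice("ja_JP.ujis"))   # True
print(follows_fhs_charset_advice("ja_JP.eucJP"))  # False
```

This only restates the difference noted above: ujis passes and eucJP fails
purely because of the uppercase letters.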

Of course, this is a big change, and potato is not the release for it,
since it would be time-consuming work. I am writing this for long-term
consideration.

Thanks.

-- 
  Taketoshi Sano: <sano@debian.org>,<sano@debian.or.jp>,<kgh12351@xxxxxxxxxxx>