[debian-devel:15129] Could someone proofread my English translation?
- From: Tatsuki Sugiura <sugi@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
- Subject: [debian-devel:15129] Could someone proofread my English translation?
- Date: Thu, 23 May 2002 08:47:50 +0900
- X-gpg-fingerprint: C4BC EDCC 50B2 2D7B 4A85 4A13 6CAD 85CE 4502 FDC2
- X-gpg-keyid: 4502FDC2
- X-ml-info: If you have a question, send e-mail with the body "help" (without quotes) to the address debian-devel-ctl@debian.or.jp; help=<mailto:debian-devel-ctl@debian.or.jp?body=help>
- X-ml-name: debian-devel
- X-mlserver: fml [fml 3.0pl#17]; post only (only members can post)
- X-moe: Vampire/lilith
- X-public-key: http://pgp.nic.ad.jp:11371/pks/lookup?op=get&search=0x4502FDC2
- Message-id: <87wutvg2xn.wl@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
- X-mail-count: 15129
- User-agent: Wanderlust/2.9.11 (Unchained Melody) SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (畝傍御陵前) APEL/10.3 Emacs/21.2 (i386-debian-linux-gnu) MULE/5.0 (賢木)
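Hello, this is Sugiura.
Partly with a view to taking it to org, I agreed to revise the English
translation of the Unicode::Japanese pod and have done what I could.
However, there are many parts I cannot translate well, and my English is
not good enough, so I have not managed to produce a decent translation.
Could someone please proofread the text below?
Any comments at all would be very welcome: places where the meaning has
drifted, places where the English is simply odd, and so on.
The old Japanese original (its content has changed somewhat) is here:
http://sugi.nemui.org/tmp/Japanese.pm.old.html
# pod2text garbles the characters, so it is in HTML.
# For reference, the English translation before my changes and additions
# is at
# http://sugi.nemui.org/tmp/Japanese.pm.orig.txt
Thank you in advance.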
--
Tatsuki Sugiura mailto:sugi@xxxxxxxxxxxxxxxxxxxxxxxxxxx
-----------------------------------------------------------------
NAME
Unicode::Japanese - Japanese Charset Converter
SYNOPSIS
use Unicode::Japanese;
# convert utf8 -> sjis
print Unicode::Japanese->new($str)->sjis;
# convert sjis -> utf8
print Unicode::Japanese->new($str,'sjis')->get;
# convert sjis (imode_EMOJI) -> utf8
print Unicode::Japanese->new($str,'sjis-imode')->get;
# convert ZENKAKU (utf8) -> HANKAKU (utf8)
print Unicode::Japanese->new($str)->z2h->get;
DESCRIPTION
A module for converting between Japanese character encodings.
FEATURES
* Each instance stores its string internally as UTF-8.
* Supports both XS and non-XS modes: XS for high performance, and non-XS
for ease of use (just copy Japanese.pm).
* Supports conversion between ZENKAKU and HANKAKU.
* Safely handles the "EMOJI" of mobile phones (DoCoMo i-mode, ASTEL dot-i,
and J-PHONE J-Sky) by mapping them to the Unicode Private Use Area.
* Supports mutual conversion of EMOJI that depict the same image between
the standards of different mobile phone carriers.
* Treats SJIS as MS-CP932. (Shift_JIS on MS-Windows (MS-SJIS/MS-CP932)
differs from the generic Shift_JIS charset.)
* When converting Unicode to SJIS (EUC/JIS), any character that cannot be
converted to SJIS is escaped in "&#xxxx;" format (except "EMOJI").
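The escaping rule in the last item can be illustrated outside Perl. The following is a minimal Python sketch, not the module's own code. It assumes that "&#xxxx;" means a decimal numeric character reference, and the helper name to_sjis_with_escapes is hypothetical. It relies on Python's cp932 codec and the standard "xmlcharrefreplace" error handler.

```python
def to_sjis_with_escapes(text: str) -> bytes:
    """Encode text as CP932, escaping unencodable characters as &#NNNN;."""
    # "xmlcharrefreplace" substitutes a decimal numeric character reference
    # for every character the target codec cannot represent.
    return text.encode("cp932", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Hiragana A is in CP932; the Hangul syllable U+D55C is not, so only
    # the latter is escaped.
    print(to_sjis_with_escapes("\u3042\ud55c"))
```

Whether the real module emits decimal or hexadecimal digits is not stated in the text above, so treat the decimal form here as an assumption.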
METHODS
$s = Unicode::Japanese->new($str [, $icode [, $encode]])
Create a new instance of Unicode::Japanese.
If arguments are given, they are passed through to the set method.
$s->set($str [, $icode [, $encode]])
$str: string
$icode: charset, can be omitted (default = 'utf8')
$encode: encoding, can be omitted.
Set a string in the instance. If '$icode' is omitted, the string is
treated as UTF-8.
To specify a charset, choose one of the following:
'jis', 'sjis', 'euc', 'utf8', 'ucs2', 'ucs4', 'utf16', 'utf16-be',
'utf16-le', 'utf32', 'utf32-be', 'utf32-le', 'ascii', 'binary',
'sjis-imode', 'sjis-doti', 'sjis-jsky'.
"&#xxxx;" is converted to "EMOJI" when 'sjis-imode' or 'sjis-doti' is
specified.
To detect the charset automatically, you MUST specify 'auto'. (The
getcode method is then called automatically.)
For the encoding, only 'base64' can be specified. If it is specified,
the string is decoded before it is stored.
When decoding binary data, specify 'binary' as the charset.
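The order of operations described for set (undo the transfer encoding first, then interpret the bytes in the given charset) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the module's implementation: the function name load_string and the CODECS table that maps charset names to Python codec names are both hypothetical, and only four charsets are covered.

```python
import base64
from typing import Optional

# Hypothetical mapping from the charset names in the text to Python codecs.
CODECS = {
    "utf8": "utf-8",
    "sjis": "cp932",
    "euc": "euc_jp",
    "jis": "iso2022_jp",
}


def load_string(data: bytes, icode: str = "utf8",
                encode: Optional[str] = None) -> str:
    """Decode the optional base64 layer, then decode the charset."""
    if encode == "base64":
        data = base64.b64decode(data)
    elif encode is not None:
        raise ValueError("only 'base64' is supported as an encoding")
    # The result plays the role of the internal (Unicode) string.
    return data.decode(CODECS[icode])


if __name__ == "__main__":
    payload = base64.b64encode("\u65e5\u672c\u8a9e".encode("cp932"))
    print(load_string(payload, "sjis", "base64"))
```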
$str = $s->get
$str: a string (UTF-8)
Get the string as UTF-8.
$code = $s->getcode($str)
$str: string
$code: character set name
Detect the charset of *$str*.
Note: this does not examine the string held by the instance!
The charset is determined by the following algorithm:
1 If a UTF-32 BOM is found, the charset is utf32.
2 If a UTF-16 BOM is found, the charset is utf16.
3 If the string is valid UTF-32BE, the charset is utf32-be.
4 If the string is valid UTF-32LE, the charset is utf32-le.
5 If the string contains no non-ASCII characters, the charset is ascii.
(Control codes other than escape sequences are counted as ASCII.)
6 If the string includes JIS escape sequences, the charset is jis.
7 If the string includes "J-PHONE EMOJI", the charset is sjis-jsky.
8 If the string is valid EUC, the charset is euc.
9 If the string is valid SJIS, the charset is sjis.
10 If the string is valid SJIS plus i-mode "EMOJI", the charset is
sjis-imode.
11 If the string is valid SJIS plus dot-i "EMOJI", the charset is
sjis-doti.
12 If the string is valid UTF-8, the charset is utf8.
13 If none of the above applies, the charset is unknown.
Because of this algorithm, please note the following:
* UTF-8 may be mistaken for SJIS.
* UCS2 cannot be detected automatically.
* UTF-16 can be detected only when a BOM is present.
* "EMOJI" can be detected only when they are stored as binary, not in
"&#xxxx;" format. (If they are stored only in &#xxxxx; format, getcode()
will return an incorrect result, and the "EMOJI" will be corrupted when
you convert the string.)
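The ordered, first-match-wins structure of the detection steps above can be sketched as below. This is a simplified Python illustration, not the module's code: the function name guess_charset is hypothetical, the BOM-less UTF-32 checks and all "EMOJI" steps are omitted, JIS detection only looks for two common escape prefixes, and validity is approximated by whether Python's own codecs can decode the bytes.

```python
import codecs


def _decodes(data: bytes, codec: str) -> bool:
    """Return True if data is decodable with the given Python codec."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


def guess_charset(data: bytes) -> str:
    # UTF-32 must be tested before UTF-16: the UTF-32LE BOM (FF FE 00 00)
    # begins with the UTF-16LE BOM (FF FE).
    if data.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf32"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf16"
    # Pure 7-bit data without ESC is treated as ASCII.
    if b"\x1b" not in data and all(byte < 0x80 for byte in data):
        return "ascii"
    # Simplification: only the ESC $ and ESC ( prefixes are recognized.
    if b"\x1b$" in data or b"\x1b(" in data:
        return "jis"
    # EUC is tried before SJIS, and SJIS before UTF-8, as in the list above.
    for name, codec in (("euc", "euc_jp"), ("sjis", "cp932"),
                        ("utf8", "utf-8")):
        if _decodes(data, codec):
            return name
    return "unknown"


if __name__ == "__main__":
    for sample in (b"hello", "\u3042".encode("euc_jp"),
                   "\u3042".encode("cp932"), "\u3042".encode("iso2022_jp")):
        print(sample, guess_charset(sample))
```

Because SJIS is tried before UTF-8, some UTF-8 byte sequences that happen to form valid SJIS pairs are reported as sjis, which is the first caveat in the list.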
$str = $s->conv($ocode, $encode)
$ocode: output charset (Choose from 'jis', 'sjis', 'euc', 'utf8',
'ucs2', 'ucs4', 'utf16', 'binary')
$encode: encoding, can be omitted.
$str: string
Get the string converted to *$ocode*.
For the encoding, only 'base64' can be specified. In that case, the
base64-encoded string is returned.
$s->tag2bin
Replace each "&#xxxxx;" in the string with the character it refers to.
(dereferencing &#xxxxx;)
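What dereferencing means here can be shown with a short Python sketch. It is not the module's code: the function name dereference is hypothetical, and the digits inside "&#...;" are assumed to be decimal.

```python
import re

# Assumption: references are decimal, e.g. "&#12354;" for U+3042.
_REFERENCE = re.compile(r"&#(\d+);")


def dereference(text: str) -> str:
    """Replace decimal numeric character references with their characters."""
    def _replace(match: "re.Match[str]") -> str:
        code_point = int(match.group(1))
        # Leave references beyond the Unicode range untouched.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return _REFERENCE.sub(_replace, text)


if __name__ == "__main__":
    print(dereference("A&#12354;B"))
```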
$s->z2h
Convert ZENKAKU to HANKAKU.
$s->h2z
Convert HANKAKU to ZENKAKU.
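For readers unfamiliar with ZENKAKU (full-width) and HANKAKU (half-width) forms, here is a minimal Python sketch restricted to the ASCII range, where the full-width forms U+FF01 to U+FF5E sit at a fixed offset of 0xFEE0 from U+0021 to U+007E. The function names are hypothetical, and the sketch deliberately ignores katakana and other characters that the module's z2h and h2z may also handle.

```python
# Offset between full-width ASCII variants and their ASCII counterparts.
_OFFSET = 0xFEE0

# Translation tables keyed by code point, usable with str.translate().
_Z2H = {cp: cp - _OFFSET for cp in range(0xFF01, 0xFF5F)}
_Z2H[0x3000] = 0x20  # ideographic space -> ASCII space
_H2Z = {half: full for full, half in _Z2H.items()}


def z2h_ascii(text: str) -> str:
    """Convert full-width ASCII variants to ordinary ASCII."""
    return text.translate(_Z2H)


def h2z_ascii(text: str) -> str:
    """Convert printable ASCII to its full-width variants."""
    return text.translate(_H2Z)


if __name__ == "__main__":
    print(z2h_ascii("\uff21\uff22\uff23\u3000\uff11\uff12\uff13"))
    print(h2z_ascii("ABC 123"))
```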
$s->hira2kata
Convert HIRAGANA to KATAKANA.
$s->kata2hira
Convert KATAKANA to HIRAGANA.
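The hiragana and katakana blocks in Unicode are laid out in parallel, so a basic conversion is a fixed code point shift of 0x60. The Python sketch below shows this for U+3041 to U+3096 only. The function names mirror the methods above for readability, but this is an independent illustration and says nothing about how the module treats characters outside that range.

```python
# Hiragana U+3041..U+3096 correspond to katakana U+30A1..U+30F6.
_SHIFT = 0x60
_HIRA_TO_KATA = {cp: cp + _SHIFT for cp in range(0x3041, 0x3097)}
_KATA_TO_HIRA = {kata: hira for hira, kata in _HIRA_TO_KATA.items()}


def hira2kata(text: str) -> str:
    """Shift hiragana into the katakana block; leave other text as-is."""
    return text.translate(_HIRA_TO_KATA)


def kata2hira(text: str) -> str:
    """Shift katakana into the hiragana block; leave other text as-is."""
    return text.translate(_KATA_TO_HIRA)


if __name__ == "__main__":
    print(hira2kata("\u3072\u3089\u304c\u306a"))
    print(kata2hira("\u30ab\u30bf\u30ab\u30ca"))
```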
$str = $s->jis
$str: string (JIS)
Get the string converted to JIS (ISO-2022-JP).
$str = $s->euc
$str: string (EUC)
Get the string converted to EUC.
$str = $s->utf8
$str: string (UTF-8)
Get the string converted to UTF-8.
$str = $s->ucs2
$str: string (UCS2)
Get the string converted to UCS2.
$str = $s->ucs4
$str: string (UCS4)
Get the string converted to UCS4.
$str = $s->utf16
$str: string (UTF-16)
Get the string converted to UTF-16 (big-endian), without a BOM.
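The difference between big-endian UTF-16 without a BOM and UTF-16 with a BOM can be seen in a few lines of Python. This only illustrates the byte layout using Python's standard codecs; it does not call the module.

```python
text = "\u3042"  # HIRAGANA LETTER A

# Explicit byte order: no BOM is written, most significant byte first.
big_endian = text.encode("utf-16-be")

# Generic "utf-16": Python prepends a BOM and uses the platform byte order.
with_bom = text.encode("utf-16")

if __name__ == "__main__":
    print(big_endian.hex(), with_bom.hex())
```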
$str = $s->sjis
$str: string (SJIS)
Get the string converted to SJIS (MS-CP932).
$str = $s->sjis_imode
$str: string (SJIS/imode_EMOJI)
Get the string converted to SJIS for i-mode.
$str = $s->sjis_doti
$str: string (SJIS/dot-i_EMOJI)
Get the string converted to SJIS for dot-i.
$str = $s->sjis_sky
$str: string (SJIS/J-SKY_EMOJI)
Get the string converted to SJIS for J-SKY.
@str = $s->strcut($len)
$len: number of characters
@str: array of strings
Split the string into pieces of length *$len*.
$len = $s->strlen
$len: width of a string
Get the length of the string stored in $s. This is offered as a
substitute for Perl's built-in length(). This method counts each
ZENKAKU character as 2.
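Counting ZENKAKU characters as 2 amounts to measuring display width rather than character count. A rough Python sketch of the idea, using the East Asian Width property from the standard unicodedata module, is shown below. The function name display_width is hypothetical, and the module may classify some characters differently from this property.

```python
import unicodedata


def display_width(text: str) -> int:
    """Count wide (W) and full-width (F) characters as 2, others as 1."""
    width = 0
    for char in text:
        if unicodedata.east_asian_width(char) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


if __name__ == "__main__":
    for sample in ("abc", "\u3042\u3044", "a\u3042"):
        print(repr(sample), len(sample), display_width(sample))
```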
$s->join_csv(@values);
@values: data array
Convert the array to a string in CSV format and store it in the
instance. A newline ("\n") is appended to the end of the string.
@values = $s->split_csv;
@values: data array
Split the string stored in $s as CSV. Newlines ("\n") are removed
before splitting.
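The round trip between an array and one CSV line, with the trailing newline added on join and dropped on split, can be sketched in Python with the standard csv module. The function names mirror the methods above for readability only. The quoting rules are those of Python's csv module and are an assumption with respect to the Perl module.

```python
import csv
import io
from typing import List


def join_csv(values: List[str]) -> str:
    """Return one CSV line for values, terminated by a newline."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(values)
    return buffer.getvalue()


def split_csv(line: str) -> List[str]:
    """Drop trailing newlines, then parse the line as CSV."""
    return next(csv.reader([line.rstrip("\n")]))


if __name__ == "__main__":
    line = join_csv(["a", "b,c", 'd"e'])
    print(repr(line))
    print(split_csv(line))
```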
DESCRIPTION OF UNICODE MAPPING
SJIS
Mapped as MS-CP932. The mapping table at the following URL is used.
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT
If a character cannot be mapped from Unicode to SJIS, it is
converted to &#xxxxx; format.
However, every unmappable character is converted to "?" when converting
to SJIS for mobile phones.
EUC/JIS
The string is first converted to SJIS and then mapped to Unicode. If the
string includes characters outside SJIS, they cannot be mapped correctly.
DoCoMo i-mode
The part of F800 - F9FF that contains "EMOJI" is mapped to
U+0FF800 - U+0FF9FF.
ASTEL dot-i
The part of F000 - F4FF that contains "EMOJI" is mapped to
U+0FF000 - U+0FF4FF.
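Reading the two ranges above as F800 - F9FF (i-mode) and F000 - F4FF (dot-i), both mappings add the same offset, 0x0F0000, which places the characters in the plane 15 Private Use Area. A small Python sketch of that arithmetic follows. The function name and the carrier keys are hypothetical, and the sketch checks only the range, not whether a given code is actually an assigned "EMOJI".

```python
# Two-byte SJIS code ranges as given in the text (inclusive).
_RANGES = {
    "imode": (0xF800, 0xF9FF),
    "doti": (0xF000, 0xF4FF),
}

# Adding this offset moves F000..F9FF to U+0FF000..U+0FF9FF.
_PLANE15_OFFSET = 0x0F0000


def emoji_to_private_use(carrier: str, sjis_code: int) -> int:
    """Map a two-byte carrier code to its Private Use Area code point."""
    low, high = _RANGES[carrier]
    if not low <= sjis_code <= high:
        raise ValueError("code is outside the range for " + carrier)
    return _PLANE15_OFFSET + sjis_code


if __name__ == "__main__":
    print(hex(emoji_to_private_use("imode", 0xF89F)))
    print(hex(emoji_to_private_use("doti", 0xF000)))
```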
J-PHONE J-SKY
"J-SKY EMOJI" are encoded as follows: the escape sequence "\e\$"
(\x1b\x24), the first byte, the second byte, and "\x0f". When consecutive
"EMOJI" share the same first byte, they can be compressed by repeating
only the second byte.
Treating the first byte and the second byte together as one character,
4500 - 47FF is mapped to U+0FFB00 - U+0FFDFF.
Unicode::Japanese compresses "J-SKY_EMOJI" automatically when consecutive
"EMOJI" share the same first byte.
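As read from the description above, a J-SKY sequence is ESC "$", one first byte, one or more second bytes, and a closing 0x0F, with 4500 - 47FF shifted to U+0FFB00 - U+0FFDFF. The Python sketch below illustrates that layout and the compression of runs that share a first byte. It is an interpretation of the text, not the module's code, and both function names are hypothetical.

```python
from itertools import groupby
from typing import Iterable

_ESCAPE = b"\x1b$"   # ESC followed by "$"
_CLOSE = b"\x0f"     # terminator of each sequence
_LOW, _HIGH = 0x4500, 0x47FF


def jsky_to_private_use(code: int) -> int:
    """Map a two-byte J-SKY code to U+0FFB00..U+0FFDFF."""
    if not _LOW <= code <= _HIGH:
        raise ValueError("code is outside 4500 - 47FF")
    return code - _LOW + 0x0FFB00


def pack_jsky(codes: Iterable[int]) -> bytes:
    """Serialize codes, merging consecutive codes that share a first byte."""
    codes = list(codes)
    for code in codes:
        if not _LOW <= code <= _HIGH:
            raise ValueError("code is outside 4500 - 47FF")
    packed = bytearray()
    for first, run in groupby(codes, key=lambda code: code >> 8):
        seconds = bytes(code & 0xFF for code in run)
        packed += _ESCAPE + bytes([first]) + seconds + _CLOSE
    return bytes(packed)


if __name__ == "__main__":
    print(pack_jsky([0x4521, 0x4522, 0x4621]))
    print(hex(jsky_to_private_use(0x4521)))
```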
BUGS
* EUC and JIS strings cannot be converted correctly when they include
characters that cannot be represented in SJIS, because they are first
converted to SJIS and only then to UTF-8.
* If Japanese.pm is transferred via FTP in ASCII mode, the file will be
corrupted, because it contains binary data.
AUTHOR INFORMATION
Copyright 2001, SANO Taku (SAWATARI Mikage) All rights reserved.
This library is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.
Bug reports and comments to: mikage@xxxxxxxxx Thank you.
CREDITS
Thanks very much to:
Nao NAKAYAMA