How do I detect Japanese text in a Java string?
Published: 2020-12-15 04:51:16 | Category: Java | Source: compiled from the web
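I need to be able to detect Japanese characters in a Java string.
At the moment I get the UnicodeBlock of each character and check whether it equals Character.UnicodeBlock.KATAKANA or Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS, but I am not 100% sure that this covers everything. Any suggestions?

Solution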
I use the following Java method. It may not fully meet your requirements.
<!-- language: lang-java -->

    /**
     * Returns whether a character is a Chinese-Japanese-Korean (CJK) character.
     *
     * @param c the character to be tested
     * @return true if CJK, false otherwise
     */
    private boolean isCharCJK(final char c) {
        // Look the block up once instead of once per comparison.
        final Character.UnicodeBlock block = Character.UnicodeBlock.of(c);
        // Note: Extension B lies outside the Basic Multilingual Plane, so a
        // single char can never match it; that check only takes effect if the
        // method is changed to accept an int code point.
        return block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || block == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
                || block == Character.UnicodeBlock.CJK_COMPATIBILITY_FORMS
                || block == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || block == Character.UnicodeBlock.CJK_RADICALS_SUPPLEMENT
                || block == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || block == Character.UnicodeBlock.ENCLOSED_CJK_LETTERS_AND_MONTHS;
    }

In addition, these should work for hiragana and katakana characters:

    private boolean isHiragana(final char c) {
        return Character.UnicodeBlock.of(c) == Character.UnicodeBlock.HIRAGANA;
    }

    private boolean isKatakana(final char c) {
        return Character.UnicodeBlock.of(c) == Character.UnicodeBlock.KATAKANA;
    }

(Editor: 李大同)
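The methods above test one `char` at a time, while the question asks about whole strings. As a rough sketch only, the following scans a string code point by code point. It uses `Character.UnicodeScript` instead of the `UnicodeBlock` comparisons from the answer, which is a swapped-in technique and not part of the original answer. The class name `JapaneseTextSketch` and the method names `containsKana` and `containsHan` are made up for illustration. It also assumes that kana is the signal you care about: Han characters are shared with Chinese, so kanji alone cannot prove that text is Japanese.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical helper class; the names are illustrative, not from the original answer.
final class JapaneseTextSketch {

    private JapaneseTextSketch() {
    }

    /** Returns true if the text contains at least one hiragana or katakana code point. */
    static boolean containsKana(final String text) {
        // codePoints() joins surrogate pairs, so supplementary characters are seen whole.
        return text.codePoints().anyMatch(cp -> {
            final UnicodeScript script = UnicodeScript.of(cp);
            return script == UnicodeScript.HIRAGANA || script == UnicodeScript.KATAKANA;
        });
    }

    /** Returns true if the text contains at least one Han character (kanji, hanzi or hanja). */
    static boolean containsHan(final String text) {
        return text.codePoints().anyMatch(cp -> UnicodeScript.of(cp) == UnicodeScript.HAN);
    }

    public static void main(final String[] args) {
        System.out.println(containsKana("こんにちは")); // prints "true"
        System.out.println(containsKana("漢字"));       // prints "false"
        System.out.println(containsHan("漢字"));        // prints "true"
    }
}
```

Halfwidth katakana letters such as ｶ carry the Katakana script property, so `containsKana` picks them up without matching the fullwidth Latin letters that share the HALFWIDTH_AND_FULLWIDTH_FORMS block. Whether a string with Han characters but no kana should count as Japanese is a policy decision this sketch leaves to the caller.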