Unable to decode Cyrillic text with Java


Question:

I have the following text:

Анна Меркулова

With help of the following online decoder https://2cyr.com/decode/?lang=en I was able to decode mentioned string to the correct one:

Анна Меркулова


The decoder settings were: source encoding UTF-8, target WINDOWS-1251.

However, I am still unable to do the same conversion programmatically in Java:

String utf8String = new String("Анна Меркулова".getBytes(), "UTF-8");
String ansiString = new String(utf8String.getBytes("UTF-8"), "windows-1251");
System.out.println(ansiString);

returns

Анна Меркулова

What am I doing wrong, and how do I properly convert the string?


Answer 1:

You're trying to assign a Charset to the String itself, but a Java String has no charset of its own. What you really need to do is extract the bytes with one specific Charset and then decode those bytes with another:

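final byte[] bytes = "Анна Меркулова".getBytes("UTF-8");
final String utf8String = new String(bytes, "UTF-8"); // round trip: the text is unchanged
final byte[] bytes1 = utf8String.getBytes("windows-1251"); // recovers the original UTF-8 bytes
final String ansiString = new String(bytes1, "UTF-8"); // decode explicitly, not with the platform default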

And by the way, you don't need all of that. The first two lines change nothing, so this is enough:

final byte[] bytes = "Анна Меркулова".getBytes("windows-1251"); // the original UTF-8 bytes
final String result = new String(bytes, "UTF-8"); // "Анна Меркулова"
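
To check the idea end to end, here is a minimal, self-contained sketch. The class name MojibakeDemo and the helpers garble and repair are made up for illustration and are not part of the original answer. It first reproduces the damage (UTF-8 bytes misread as windows-1251) and then reverses it, always passing the charset explicitly instead of relying on the platform default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // windows-1251 is not in StandardCharsets, so it is looked up by name.
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Hypothetical helper: reproduces the damage (UTF-8 bytes misread as windows-1251).
    public static String garble(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), CP1251);
    }

    // Hypothetical helper: recovers the original bytes, then decodes them as UTF-8.
    public static String repair(String garbled) {
        return new String(garbled.getBytes(CP1251), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Анна Меркулова", written with escapes so the source file encoding does not matter.
        String original =
            "\u0410\u043D\u043D\u0430 \u041C\u0435\u0440\u043A\u0443\u043B\u043E\u0432\u0430";
        String garbled = garble(original);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(original)); // true for this input
    }
}
```

One caveat, as far as I know: byte 0x98 has no character assigned in windows-1251, so a letter whose UTF-8 form contains that byte (for example "И", bytes D0 98) would likely not survive this round trip.
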
  • Posted on 2019-03-05 01:05