How to encode cp-1252 as utf-8?
jayprakash sharma
Posted on    October-16-2014 12:51 AM

I am trying to write a Java app that will run on a Linux server but will process files generated on legacy Windows machines using cp-1252 as the character set. Is there any way to encode these files as utf-8 instead of the cp-1252 they are generated in?



Tanuj Kumar

Posted on    October-16-2014 12:53 AM

You can read and write text data in any encoding that your JVM supports. Here's a quick code example:

import java.io.*;
import java.nio.charset.Charset;

public class ConvertEncoding
{
  public static void main(String[] args) throws Exception
  {
    // List all supported encodings
    for (String cs : Charset.availableCharsets().keySet())
      System.out.println(cs);

    File file = new File("SomeWindowsFile.txt");
    StringBuilder builder = new StringBuilder();

    // Construct a reader for a specific encoding
    Reader reader = new InputStreamReader(new FileInputStream(file), "windows-1252");

    // read() returns the next character as an int, or -1 at end of stream.
    // Cast to char before appending, otherwise the numeric code is appended.
    int c;
    while ((c = reader.read()) != -1)
    {
      builder.append((char) c);
    }
    reader.close();

    String string = builder.toString();

    // Construct a writer for a specific encoding.
    // Note: this overwrites the original file in place.
    Writer writer = new OutputStreamWriter(new FileOutputStream(file), "UTF-8");
    writer.write(string);
    writer.flush();
    writer.close();
  }
}
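If the files are large, or you would rather not overwrite the input, the same read-as-windows-1252, write-as-UTF-8 idea can be done as a stream into a separate output file. This is only a minimal sketch: the class name `Transcoder`, the method `transcode`, and the command-line arguments are names I made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original answer.
class Transcoder
{
  // Copies source to target, decoding as windows-1252 and encoding as UTF-8.
  static void transcode(Path source, Path target) throws IOException
  {
    Charset cp1252 = Charset.forName("windows-1252");

    // try-with-resources closes both streams even if an exception is thrown
    try (BufferedReader in = new BufferedReader(
             new InputStreamReader(Files.newInputStream(source), cp1252));
         BufferedWriter out = new BufferedWriter(
             new OutputStreamWriter(Files.newOutputStream(target), StandardCharsets.UTF_8)))
    {
      char[] buffer = new char[8192];
      int n;
      // read(char[]) returns the number of chars read, or -1 at end of stream
      while ((n = in.read(buffer)) != -1)
      {
        out.write(buffer, 0, n);
      }
    }
  }

  public static void main(String[] args) throws IOException
  {
    // args[0] = input file (cp-1252), args[1] = output file (UTF-8)
    transcode(Paths.get(args[0]), Paths.get(args[1]));
  }
}
```

Writing to a second file means a failure halfway through does not destroy the only copy of the data; you can rename the output over the original once the conversion has finished.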

If this still 'chokes' on read, check that the original encoding is what you think it is. In this case I've specified windows-1252, which is the Java charset name for cp-1252.
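One way to check the encoding, sketched below, is to decode the raw bytes with a strict decoder that reports bad input instead of silently replacing it. The class `EncodingCheck` and its methods are hypothetical names for this example. Treat the result as a hint rather than proof: a file that decodes cleanly as UTF-8 may simply be plain ASCII, and most byte sequences are acceptable to a single-byte charset such as windows-1252.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original answer.
class EncodingCheck
{
  // Decodes bytes with the given charset, throwing instead of substituting
  // a replacement character when the input is malformed or unmappable.
  static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException
  {
    return charset.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
        .decode(ByteBuffer.wrap(bytes))
        .toString();
  }

  // True if the bytes decode without any error in the given charset.
  static boolean decodesCleanly(byte[] bytes, Charset charset)
  {
    try
    {
      decodeStrict(bytes, charset);
      return true;
    }
    catch (CharacterCodingException e)
    {
      return false;
    }
  }

  public static void main(String[] args) throws Exception
  {
    // args[0] = file to inspect
    byte[] data = Files.readAllBytes(Paths.get(args[0]));
    System.out.println("Valid UTF-8:        "
        + decodesCleanly(data, StandardCharsets.UTF_8));
    System.out.println("Valid windows-1252: "
        + decodesCleanly(data, Charset.forName("windows-1252")));
  }
}
```

If the file turns out to be valid UTF-8 and contains non-ASCII characters, it was probably never cp-1252 in the first place, and re-encoding it as above would garble it.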

