
Monday, June 24, 2013

opencsv and Japanese characters


The problem I faced:

opencsv was corrupting the Japanese characters in my data.

Myths
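- CSV cannot hold all types of Unicode characters. (It can. Even Notepad can.)
- FileWriter is not good for handling all types of Unicode characters. (It writes in the platform default encoding, so it is fine whenever that encoding can represent your characters.)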
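
To convince yourself that a CSV file can carry Japanese text, here is a minimal sketch that writes a line through an OutputStreamWriter with an explicit UTF-8 charset and reads it back. The class name Utf8CsvRoundTrip and the method roundTrip are made up for illustration, and the Japanese text is written as Unicode escapes so the source file's own encoding does not matter.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of opencsv.
class Utf8CsvRoundTrip {
    // Writes one line to a temp file as UTF-8, reads it back as UTF-8,
    // deletes the file and returns what was read.
    static String roundTrip(String line) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".csv");
            try (Writer out = new OutputStreamWriter(
                    Files.newOutputStream(file), StandardCharsets.UTF_8)) {
                out.write(line);
            }
            String readBack = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = "1,\u6771\u4EAC"; // "1," followed by Tokyo in kanji
        System.out.println(roundTrip(line).equals(line)); // true
    }
}
```

Since opencsv's CSVWriter accepts any java.io.Writer, a writer built this way could be passed to it instead of a FileWriter when you do not want to depend on the platform default encoding.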

What was failing?

- The ResultSetHelperService class of opencsv was corrupting the data at the point where it calls rs.getString().

How?

I still need to figure this out :( (But of course it must be that the bytes are not being decoded with the correct character set.)

What was the solution?

I derived a child class of ResultSetHelperService and overrode getColumnValues. I copied the original method body and made one small change.

instead of

value =  rs.getString(colIndex)

I replaced it with

value =  new String(rs.getBytes(colIndex), "UTF-8")

and it worked!
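
Here is a small sketch of why reading the bytes and decoding them yourself can help. It assumes the column really holds UTF-8 bytes and that the corruption came from those bytes being decoded with some other charset (ISO-8859-1 is used here purely as an example, I have not confirmed which charset the driver picked). The class Utf8ColumnDemo and its methods are hypothetical names, not part of opencsv.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical demo class, not part of opencsv.
class Utf8ColumnDemo {
    // Fetches the raw column bytes and decodes them as UTF-8.
    // Assumes the column really stores UTF-8 encoded text.
    static String getUtf8String(ResultSet rs, int colIndex) throws SQLException {
        byte[] raw = rs.getBytes(colIndex); // null when the column is SQL NULL
        return raw == null ? "" : new String(raw, StandardCharsets.UTF_8);
    }

    // Decodes the same bytes with whichever charset the caller picks.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        String original = "\u6771\u4EAC"; // Tokyo in kanji, written as escapes
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        String wrong = decode(raw, StandardCharsets.ISO_8859_1); // one char per byte
        String right = decode(raw, StandardCharsets.UTF_8);

        System.out.println(wrong.equals(original)); // false
        System.out.println(right.equals(original)); // true
    }
}
```

The null check matters because rs.getBytes() returns null for SQL NULL, and new String(null, ...) would throw. Using StandardCharsets.UTF_8 (Java 7 and later) instead of the string "UTF-8" also avoids the checked UnsupportedEncodingException.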

I also read that with newer versions of Java and Oracle it just works. But for MySQL 3.0 and JDBC 4 it didn't work.

References:

- The classes java.io.InputStreamReader, java.io.OutputStreamWriter, java.lang.String, and classes in the java.nio.charset package can convert between Unicode and a number of other character encodings. (http://docs.oracle.com/javase/6/docs/technotes/guides/intl/encoding.doc.html)

- http://stackoverflow.com/questions/5892163/should-i-be-using-jdbc-getnstring-instead-of-getstring

- http://www.joelonsoftware.com/printerFriendly/articles/Unicode.html

- http://stackoverflow.com/questions/496321/utf8-utf16-and-utf32