Friday, June 6, 2014

OpenAM Policy Agent HTTP HEADER UTF-8 issue

May passed by quickly… I managed to run a night marathon on 31st May 2014. Yes, the Sundown Marathon, so named because it only starts at 11:30 pm.


It was a nightmare for me. The weather has been super humid in Singapore recently and there were a lot of participants. I did not have a good run after all. Will try again next year.


Meanwhile, back at work, I was helping a customer with an internationalization issue with their OpenAM Policy Agent.




In the JSP page that displays the Chinese characters from the CHINESENAME HTTP header, the output was always garbled. e.g. CHINESENAME: 中国|领英.


There are a lot of articles on the Internet that talk about setting the encoding for application containers. We were only interested in JBoss. One example is UTF-8 Encoding for JBoss 5.1 AS.




As one of my colleagues pointed out, those settings are for decoding content and I/O streams, like uploaded files, within JSP pages. They have no effect on HTTP headers.

This thread from Stack Overflow (Do HTTP request headers have to be UTF-8 encoded?) has the best explanation.



There is a section in the HTTP RFC on Message Headers. Hmm… as usual, RFCs are not meant for the layman. I have a friend who loves reading RFCs. Ha!



I then found an article on RFC 5987. This is the best answer I got.

By default, message header field parameters in Hypertext Transfer Protocol (HTTP) messages cannot carry characters outside the ISO-8859-1 character set.

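The container has therefore already decoded the header bytes as ISO-8859-1. So the proper way to decode HTTP headers that are passed from the OpenAM Policy Agent is to get those bytes back and decode them as UTF-8, as follows: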

// get bytes in iso-8859-1. 
byte[] data = request.getHeader("CHINESENAME").getBytes("iso-8859-1"); 
// now decode in UTF-8 to get the Chinese characters 
String cnName = new String(data, "UTF-8"); 
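
To convince myself that this round trip is lossless, here is a small standalone sketch. It assumes the agent puts UTF-8 bytes on the wire and the container decodes them as ISO-8859-1, which is what the symptoms suggested. The class and method names (HeaderRoundTrip, asSeenByServlet, decodeUtf8Header) are my own for illustration and are not part of OpenAM or JBoss. It also uses StandardCharsets, which needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class HeaderRoundTrip {

    // Simulates the assumed container behaviour: the header arrives as
    // UTF-8 bytes, but each byte is decoded as one ISO-8859-1 character.
    static String asSeenByServlet(String original) {
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Reverses the wrong decoding: recover the raw bytes via ISO-8859-1,
    // then decode them as UTF-8. Returns null if the header is absent.
    static String decodeUtf8Header(String rawHeader) {
        if (rawHeader == null) {
            return null;
        }
        byte[] data = rawHeader.getBytes(StandardCharsets.ISO_8859_1);
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4e2d\u56fd" is the two-character Chinese string from the example,
        // written as escapes so the source file encoding does not matter.
        String garbled = asSeenByServlet("\u4e2d\u56fd");
        String fixed = decodeUtf8Header(garbled);
        System.out.println(fixed.equals("\u4e2d\u56fd"));
    }
}
```

In the JSP, the equivalent call would be decodeUtf8Header(request.getHeader("CHINESENAME")). The null check matters there because getHeader returns null when the header is missing. Note that this only works if the bytes really were UTF-8 to begin with; if the agent is configured to send another encoding, the second step has to match it.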


Does anyone have a better solution? I would love to know and learn!



