Categories
English World Wide Web

Thijs’ guide to UTF8

The Thijs way!
This might not be the best way, but it does work like a charm!

First of all: make sure your document is written in UTF8. This will make sure that all forms you submit (yes, even in xmlHttpRequest calls) will be encoded in UTF8.

Second: make sure that your database connection is using UTF8 aswell! Using UTF8 as charset in your fields and tables is one thing, you have to make sure that your connection is using UTF8 aswell! You can change the default charset in your mysql configuration files, but the easiest way to make sure you are using UTF8 is actually telling your connection to use UTF8.
mysql_connect (DB_SERVER, DB_USER, DB_PASS);
mysql_select_db (DB_DATABASE);
mysql_query ("SET NAMES utf8");

This way you will lose those ugly non-utf8 characters in your database.

Now let’s take a look at the xmlHttpRequest. This is fairly simple: what do you want to do? Submit a form ofcourse. What do we have to take in mind? & and = are special characters, so we can’t use those in the form. A simple escape in javascript should take care of that.

The html:
<textarea id="myfield">Some data</textarea>
<button onclick="submitForm();">Submit</button>

The javascript (function submitForm()):
// Only escape the form field
var myfield = encodeURIComponent(document.getElementById('myfield').value);
var req= getXmlHttpRequest (); // Use your own function here..
req.open ('post', 'index.php', true);
req.setRequestHeader("Method", "POST index.php HTTP/1.1");
req.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
req.send ('myfield='+myfield);

How to receive it in PHP:
// Just remove the special url chars
$myfield = rawurldecode ($_POST['myfield']);

And that’s it really. Since your document is in UTF8 and your form fields are encoded in UTF8, the only thing you have to keep in mind is the “split symbols”. No need to encode or decode anything.