My name in the modern world
My last name is a technical hurdle. At least, for many people. In this high-tech, modern, globalized society, one would expect that a simple accented character shouldn’t be an issue. After all, even a small child can write Stüvel correctly with a stick in the mud. So why is it such an issue in the world of computers?
To demonstrate the issue, here is a list of ways that I have seen my name appear:
- Stüvel
- St??vel
- St�vel
- St++vel
- Stüvel
- St\xc3\xbcvel
- St vel
- St☐☐vel
- St³vel
- StĂźvel
- StÃ╝vel
- St? <Input Name=
- St~vel
- StÃ☐üvel
- StÃOvel
- St?üvel
- St��vel
- St?vel
- “your last name contains illegal characters”
- “your last name should contain at least one alphabet”, “only enter alphabetical characters”, etc.
- STÃœVE
- Stãœvel
- “you did not fill in a last name” and variants thereof
- St
- “your last name should contain only letters”
- St_f_»vel
- Stüvel
The root cause? Often software is only tested with what the developers and testers can easily type on their keyboards. As a result, they use only characters from the ASCII set, which happens to be the common denominator between many character encodings. If you want to ensure your system correctly handles character data, test with non-ASCII characters. Go to Wikipedia, pick a random article in a language you don’t know, and just copy-paste some text into your application. Your system should be set up to handle this, preferably by using UTF-8. In the worst case, it should at least let the user know that non-ASCII characters aren’t supported.
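Several of the variants in the list above can be reproduced by encoding the name as UTF-8 and then decoding the bytes with the wrong encoding. The following is a minimal Python sketch of that mechanism, not a reconstruction of what any particular system did. The helper `survives_round_trip` is a hypothetical name introduced here for illustration.

```python
name = "Stüvel"

# "ü" (U+00FC) becomes two bytes in UTF-8: 0xC3 0xBC.
utf8_bytes = name.encode("utf-8")

# Decoding those two bytes as a single-byte encoding yields two characters.
as_latin1 = utf8_bytes.decode("latin-1")
as_cp1252 = utf8_bytes.decode("cp1252")

# An ASCII decoder cannot interpret either byte, so each one is replaced.
as_ascii = utf8_bytes.decode("ascii", errors="replace")

# Encoding straight to ASCII replaces the single character with "?".
ascii_question = name.encode("ascii", errors="replace").decode("ascii")

# The upper-case form is mangled differently: "Ü" is 0xC3 0x9C in UTF-8.
upper_mangled = name.upper().encode("utf-8").decode("cp1252")


def survives_round_trip(text: str, encoding: str = "utf-8") -> bool:
    """Return True if text can be encoded and decoded without loss."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        return False


if __name__ == "__main__":
    for label, value in [
        ("UTF-8 bytes", utf8_bytes),
        ("read as Latin-1", as_latin1),
        ("read as CP-1252", as_cp1252),
        ("read as ASCII", as_ascii),
        ("forced to ASCII", ascii_question),
        ("upper-case, read as CP-1252", upper_mangled),
    ]:
        print(f"{label}: {value!r}")
```

A check like `survives_round_trip(sample, "ascii")` in a test suite is one cheap way to find out whether a storage or transport layer is limited to ASCII before a user does.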
And please, don’t call those “illegal” or “special” characters. It’s not a property of the characters, but a limitation of your system. Those limitations are usually simple to lift, as long as testing with non-ASCII characters is part of the development process.
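As an illustration of how such a limitation ends up in a system, compare an ASCII-only name check with a more permissive one. This is a sketch under assumptions: both function names are hypothetical, and the permissive rule (non-empty, no control characters) is just one possible choice, not a recommended standard.

```python
import re
import unicodedata

# A pattern like this is what produces "only enter alphabetical characters".
ASCII_LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def ascii_only_check(name: str) -> bool:
    """Hypothetical restrictive check: accepts only unaccented Latin letters."""
    return ASCII_LETTERS_ONLY.fullmatch(name) is not None


def permissive_check(name: str) -> bool:
    """Hypothetical permissive check: non-empty and free of control characters."""
    stripped = name.strip()
    if not stripped:
        return False
    # Unicode general category "Cc" covers control characters such as NUL.
    return not any(unicodedata.category(ch) == "Cc" for ch in stripped)


if __name__ == "__main__":
    for candidate in ["Stüvel", "Stuvel", "O'Brien", ""]:
        print(
            repr(candidate),
            "ascii-only:", ascii_only_check(candidate),
            "permissive:", permissive_check(candidate),
        )
```

The restrictive version rejects the name not because anything is wrong with the name, but because of the character class the developer happened to write.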
Let’s move into the modern world, and accept that there are more names than English ones. The best thing to read after this is one of my favourite pages on the web: Falsehoods Programmers Believe About Names.