Wt::WLineEdit converting £ to Â£

Hi,

I have a Wt::WLineEdit and I get the content using this:

Wt::WLineEdit m_newUserName;
...
Wt::WString str( m_newUserName.text() );

Whenever I enter £, it comes back converted to Â£.

Any suggestions as to where I'm going wrong?

Replies (3)

RE: Wt::WLineEdit converting £ to Â£ - Added by Wim Dumon over 11 years ago

Hello Russ,

Can you verify that you have this problem with the hello world example? If not, how do you need to modify hello.C so that it demonstrates your problem?

BR,

Wim.

RE: Wt::WLineEdit converting £ to Â£ - Added by Russ Freeman over 11 years ago

Here are the results:

    Wt::WString str( nameEdit_->text() );       //  utf8=Â£
    std::string str1( str.toUTF8() );           //  =Â£
    std::string str2( str.narrow() );           //  =£
    greeting_->setText("Hello there, " + str);  //  =£

Which is and isn't a problem. If I convert everything using .narrow() it's fine. I was using .toUTF8().

Should I just use .narrow() ?

RE: Wt::WLineEdit converting £ to Â£ - Added by Wim Dumon over 11 years ago

It depends on an important decision you have to make: what character encoding do you want to use in your application?

As long as you're working with WString, Wt will properly handle Unicode characters (internally, WString is UTF-8, but that's an implementation detail). As soon as you convert a WString to a regular C string, or a regular C string (std::string or char *) to a WString, you have to worry about what encoding that C string is in.

Option: UTF-8 everywhere:

WString str(nameEdit_->text());
std::string str1(str.toUTF8());
greeting_->setText(WString::fromUTF8("Hello there, " + str1));

This converts a WString to a UTF-8 character sequence, and then a UTF-8 character sequence into a WString. No loss of characters can occur, but str1.length() will not necessarily return the number of characters in the string (characters may be encoded in more than one byte).
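To illustrate the point about length(), here is a minimal sketch in plain C++ (no Wt required). The helper name utf8CodePointCount is hypothetical and not part of Wt; it assumes the input is valid UTF-8 and simply skips continuation bytes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Wt): counts Unicode code points in a
// UTF-8 encoded std::string. Continuation bytes have the bit pattern
// 10xxxxxx, so every byte that does NOT match it starts a new code point.
// Assumes the input is valid UTF-8.
std::size_t utf8CodePointCount(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the pound sign, which is the two bytes 0xC2 0xA3 in UTF-8, length() reports 2 while this helper reports 1.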

Option: I want to use the global locale (by default the "C" locale):

WString str(nameEdit_->text());
std::string str1(str.narrow());
greeting_->setText("Hello there, " + str);

Note that when narrowing a Unicode string, not all Unicode characters can be represented in the target character set; characters that cannot be represented will be replaced by '?'. This is a sure way to lose character information (Greek, Russian, Chinese, and much more).
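The '?' substitution can be sketched in plain C++. This is not Wt's implementation of narrow(); narrowToLatin1 is a hypothetical helper, and Latin-1 is assumed as the target character set purely for illustration.

```cpp
#include <string>

// Hypothetical helper (not Wt's narrow()): converts Unicode code points to
// a single-byte Latin-1 string. Code points above U+00FF have no Latin-1
// representation and are replaced by '?', so the information is lost.
std::string narrowToLatin1(const std::u32string& text)
{
    std::string out;
    for (char32_t cp : text) {
        if (cp <= 0xFF) {
            out += static_cast<char>(static_cast<unsigned char>(cp));
        } else {
            out += '?';
        }
    }
    return out;
}
```

With this target set, the pound sign (U+00A3) survives as the single byte 0xA3, while a Greek alpha (U+03B1) becomes '?'.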

Option: I want to use a specific locale

std::locale mylocale = ...;
WString str(nameEdit_->text());
std::string str1(str.narrow(mylocale));
greeting_->setText(WString("Hello there, " + str1, mylocale));

This is also a sure way to lose characters, but you can choose which ones you want to support.

Option: I want to use wstring in my application

WString str(nameEdit_->text());
std::wstring str1(str.widen());
greeting_->setText(L"Hello there, " + str);

Wt assumes that std::wstring can represent all Unicode characters, which is not guaranteed by the standard and is not the case on Windows, where wchar_t is only 16 bits wide.
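Why a 16-bit wide character cannot hold every Unicode character can be shown with a small sketch. The function toUtf16 is a hypothetical helper, not part of Wt; it uses char16_t to stand in for a 16-bit wchar_t so that it behaves the same on every platform.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Wt): encodes one Unicode code point as
// UTF-16. Code points below U+10000 fit in one 16-bit unit; anything above
// needs a surrogate pair, i.e. two units for a single character.
std::u16string toUtf16(char32_t cp)
{
    // Only valid scalar values are accepted: no surrogates, max U+10FFFF.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string out;
    if (cp < 0x10000) {
        out += static_cast<char16_t>(cp);
    } else {
        cp -= 0x10000;
        out += static_cast<char16_t>(0xD800 + (cp >> 10));   // high surrogate
        out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF)); // low surrogate
    }
    return out;
}
```

The pound sign (U+00A3) comes out as one unit, but a character such as U+1F600 comes out as the two units 0xD83D 0xDE00, so one element of a 16-bit wide string no longer corresponds to one character.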

What you did was create a UTF-8 string and then interpret its bytes as a C-locale (single-byte) string. The two bytes that encode £ in UTF-8 (0xC2 0xA3) were therefore read as two separate characters, Â and £, instead of one Unicode character.
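The effect can be reproduced without Wt. The sketch below assumes the single-byte interpretation is Latin-1 (ISO-8859-1), which matches the Â£ output reported above; latin1ToUtf8 is a hypothetical helper, not a Wt function.

```cpp
#include <string>

// Hypothetical helper (not part of Wt): treats every input byte as one
// Latin-1 character and re-encodes it as UTF-8. Feeding it bytes that are
// ALREADY UTF-8 reproduces the double-encoding seen in this thread.
std::string latin1ToUtf8(const std::string& latin1)
{
    std::string out;
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            out += static_cast<char>(byte);                 // plain ASCII
        } else {
            out += static_cast<char>(0xC0 | (byte >> 6));   // lead byte
            out += static_cast<char>(0x80 | (byte & 0x3F)); // continuation
        }
    }
    return out;
}
```

Passing the UTF-8 bytes of £ (0xC2 0xA3) yields 0xC3 0x82 0xC2 0xA3, which is the UTF-8 encoding of the two characters Â and £.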

Bottom line: whenever you convert from C string types (std::string, char *) to/from WString, think about what character set is used in the C string and use the appropriate WString method to convert. My personal favorite is to use UTF8 for std::string, but you don't always have the choice (e.g. when reading files, communicating with a database, ...). See also the WString documentation for all constructors and conversion possibilities.

BR,

Wim.
