I’ve been struggling with a pretty nasty problem in Visual FoxPro: a Web application that must support multiple double-byte/high Unicode languages from a single application. Visual FoxPro, of course, is not Unicode aware and treats all strings as ANSI. Worse, VFP has no string format that can hold Unicode data without potentially losing the encoding information.
I can make my app work with one language at a time, but not with mixed languages (Chinese, Korean, Russian and a few others).
Here’s the scenario:
Data is stored in SQL Server, and if I run some simple tests in ASP.NET to enter and store Unicode text in all of these languages, I can store the data easily into an NTEXT field. To make this easy I created 4 records in a few languages. With ASP.NET all of this is completely automatic: you simply do a Request.Form, read the data and insert it into SQL Server using standard ADO.NET SQL syntax or DataAdapter.Update commands.
Now in VFP I pull this data back out using SQLExec. If I run VFP in its default English mode doing:
loSQL = CREATEOBJECT("wwSQL")
loSQL.Connect("driver={sql server};database=westwindadmin;server=(local)")
loSQL.Execute([select nShapeValueId,sLongDescription from tbShapeValues],"TShapes")
I get the Unicode data back as nothing but ‘???? ???’ strings.
Now I can mess around with Windows’ language settings to recover at least one of these. Windows lets you configure exactly one Unicode-to-double-byte ANSI translation for non-Unicode programs. This can be done in Regional Settings:
When I do this I can get the Korean data translated properly into VFP. Some of the Chinese and Russian text also gets translated, but everything except the Korean text has problems.
The last two entries are actual content captured from VFP and then stored into the database.
So, the short point here is that I can easily capture a single language, but how do I capture multiple languages in this scenario? I can’t figure out how to do this…
The issue (I think) is that VFP cannot properly translate strings unless the matching regional setting is in place. This causes problems all the way through the application. Not only can you not display the data, but the data permanently loses its encoding: once a string comes down to VFP with ‘?’ substituted for a character, that character is lost for good.
While you can use STRCONV() to handle conversions, STRCONV() relies on the string being valid in the first place, so you can’t pick up one of the ‘????’ field values and run it through STRCONV() to get a valid value back. It would have been useful to get at the byte representation of these characters, but even using ASC() to return the values only returns 63 (‘?’) for the ‘missing’ characters.
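VFP itself can’t be run here, but the underlying Windows behavior – an unmappable Unicode character being substituted with a default ‘?’ during the Unicode-to-ANSI conversion – can be sketched in Python, with codepage 1252 standing in for the English-locale ANSI codepage:

```python
# Sketch of the lossy Unicode -> ANSI conversion that produces the '?'
# strings. cp1252 stands in for the Western ANSI codepage an English
# locale uses; errors="replace" mirrors the default-char substitution.
korean = "한국어"  # Korean sample text

ansi = korean.encode("cp1252", errors="replace")
print(ansi)      # b'???' -- every Hangul character is unmappable
print(ansi[0])   # 63, the same value ASC() reports in VFP

# The round trip cannot recover the original characters:
print(ansi.decode("cp1252"))  # '???' -- the data is gone for good
```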
It gets worse when you move beyond individual values to whole strings of output. The data above needs to be read and written out by a Web application, and for that to work the output must be encoded as UTF-8 or another encoding a Web browser can understand.
So with the result above I can write out the data in West Wind Web Connection with something like this:
loSQL = CREATEOBJECT("wwSQL")
loSQL.Connect("driver={sql server};database=westwindadmin;server=(local)")
loSQL.Execute([select nShapeValueId,CAST( sLongDescription as nVarChar(174) ) as sLongDescription from tbShapeValues],"TShapes")
LOCAL loSC as wwShowCursor
loSC = CREATEOBJECT("wwShowCursor")
loSC.ShowCursor()
pcCursorText = loSC.GetOutput()
*** Run ExpandTemplate to a String so we can encode it
lcResult = Response.ExpandTemplate(Request.GetPhysicalPath(),"NONE",,.t.)
*** UTF-8 Encode
lcEncodedResult = STRCONV(lcResult,9)
*** Create a custom header
LOCAL loHeader as wwHttpHeader
loHeader = CREATEOBJECT("wwHTTPHeader")
loHeader.SetProtocol()
loHeader.Setcontenttype("text/html; charset=utf-8")
loHeader.AddHeader("Content-Length",TRANSFORM(LEN(lcEncodedResult)))
Response.Write( loHeader.Getoutput() )
*** Write it out
Response.Write(lcEncodedResult)
This works to generate:
The UTF-8 encoding of the text makes it possible for the foreign text to display correctly and the browser to automatically show in generic Unicode (UTF-8) display mode.
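What makes this work is that UTF-8 – unlike any single ANSI codepage – can represent all of these scripts in one string. A quick Python illustration of the round trip that the STRCONV(lcResult,9) call above performs:

```python
# UTF-8 can hold all of the languages in one string, which is why the
# browser renders the page correctly once the output is UTF-8 encoded.
mixed = "Korean: 한국어, Russian: Русский, Chinese: 中文"

utf8_bytes = mixed.encode("utf-8")   # analogous to STRCONV(lcResult, 9)
restored = utf8_bytes.decode("utf-8")

assert restored == mixed             # lossless, unlike any ANSI codepage
```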
But there’s still the problem of the other languages not displaying.
The other end of this is the data entry. Since the page was generated with UTF-8 encoding, the result of a POST operation (the textbox filled in and the Save button clicked) is also UTF-8 encoded and URL-encoded. To get the data out you can use:
IF Request.IsPostBack()
*** Read the raw input data
pcSavedDescription = Request.Form("txtDescription")
*** Convert the data from UTF-8 into ANSI string
pcSavedDescription = STRCONV(pcSavedDescription,11)
*** Insert Captured value back to SQL Server
loSql.Execute(;
[insert into tbShapeValues (sLongDescription,nLanguageId ,;
nCorpId,nShapeId) values ] +;
[(cast( N'] + STRTRAN(pcSavedDescription,"'","''") + [' as nText)] )
…
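What the browser actually posts from a UTF-8 page is percent-encoded UTF-8 bytes, so the Request.Form() plus STRCONV(...,11) steps above correspond to the following decode sequence – sketched here in Python with a hypothetical Korean field value:

```python
from urllib.parse import unquote_to_bytes

# A form posted from a UTF-8 page sends percent-encoded UTF-8 bytes.
# This value is "한국어" ("Korean language") as a browser would send it.
raw_field = "%ED%95%9C%EA%B5%AD%EC%96%B4"

utf8_bytes = unquote_to_bytes(raw_field)  # undo the URL encoding
text = utf8_bytes.decode("utf-8")         # undo the UTF-8 encoding

print(text)  # 한국어 -- still intact at this point; the loss only occurs
             # when the string is folded into a single ANSI codepage
```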
This too works for Korean – but only with the Korean Unicode-to-ANSI locale installed. For the other locales some things work and others don’t: for some reason the Russian actually works, while the Chinese has a number of characters that convert and others that don’t (there's apparently some overlap). Polish and Spanish miss a few accented characters. For Polish and other European character sets, overriding the page’s Content-Type with a specific charset (Windows-1252, for example) makes it work, but I suspect this can cause problems with languages that don’t use the same encoding.
What sucks about this is that STRCONV() supports LOCALEID and CODEPAGE parameters, but they have no real effect on the data coming back when converting from UTF-8. The reason, I think, is that once you convert from UTF-8 into ANSI or Unicode, VFP stores the string and immediately loses any locale-specific Unicode encoding other than the one that is configured. At that point your text is lost, and you get the ‘????’ data shown in the browser and Browse-window screenshots above.
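The pattern of “Korean works, Russian partly, Chinese partly” fits how codepages carve up Unicode: each ANSI codepage covers only its own slice, and whatever falls outside the one active codepage gets replaced. A Python sketch of that, using cp949 (the Korean ANSI codepage), cp1251 (Russian) and cp1252 (Western European):

```python
# Each ANSI codepage covers a different slice of Unicode, so only text
# matching the active locale survives the Unicode -> ANSI conversion.
korean, russian = "한국어", "Русский"

# Under the Korean codepage (949) the Hangul round-trips losslessly...
assert korean.encode("cp949").decode("cp949") == korean
# ...but under the Russian codepage (1251) it is destroyed:
assert korean.encode("cp1251", errors="replace") == b"???"

# The Cyrillic is the mirror image: fine in 1251, lost in 1252.
assert russian.encode("cp1251").decode("cp1251") == russian
assert russian.encode("cp1252", errors="replace") == b"???????"
```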
I’ve been messing with this stuff for a couple of days now and I cannot find a way to do this, so if anybody has any ideas on how to deal with multiple Ansi/Unicode Locales in a single application, heck even a single machine I would love some feedback.