CryptoSys PKI Toolkit Manual

UTF-8 and Latin-1

Deprecated UTF-8 functions

As of v3.6 these UTF-8-related functions and methods are deprecated and are replaced by new ones.

Deprecated                      Replaced by
CNV_UTF8FromLatin1              CNV_UTF8BytesFromLatin1
CNV_Latin1FromUTF8              CNV_Latin1FromUTF8Bytes
CNV_CheckUTF8                   CNV_CheckUTF8Bytes
Cnv.CheckUTF8(String) method    Cnv.CheckUTF8(Byte[]) method

The change is subtle. Strictly speaking, the concept of "converting" a string of characters from one character encoding scheme to another is meaningless. What we really mean is that we want to change the byte array that represents the string in Latin-1 encoding to a new byte array that represents the same string using UTF-8 encoding.
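To make the distinction concrete, here is a minimal sketch in Python. It uses only Python's standard codecs, not any Toolkit function, and the sample word is our own choice. It shows one string represented by two different byte arrays.

```python
# Illustration only: standard Python codecs, not Toolkit functions.
text = "México"  # one string of six characters

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # e-acute (U+00E9) becomes two bytes: C3 A9

print(latin1_bytes.hex(" "))  # 4d e9 78 69 63 6f
print(utf8_bytes.hex(" "))    # 4d c3 a9 78 69 63 6f

# Both byte arrays represent the same string.
assert latin1_bytes.decode("latin-1") == utf8_bytes.decode("utf-8") == text
```

The string itself never changes; only the bytes used to represent it do.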

The deprecated string-based functions simply "forced" the UTF-8-encoded bytes back into a string type. This trick works in VB6 and C but not in .NET (it can be done, but it takes a lot of effort and is pointless). The resulting "UTF-8 strings" print "funny", but you could still pass them to HASH_HexFromString and obtain the correct hash digest.
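The effect of the "forcing" trick can be imitated outside the Toolkit. The following Python sketch is an analogy under our own assumptions (one character per byte, standard library only), not the code the deprecated functions use. It shows why the forced string looks wrong yet still hashes correctly.

```python
import hashlib

text = "México"
utf8_bytes = text.encode("utf-8")

# "Force" the UTF-8 bytes into a string, one character per byte.
forced = utf8_bytes.decode("latin-1")

# The two bytes C3 A9 now show up as two separate characters, so the
# forced string displays as "MÃ©xico" and is one character longer.
assert forced == "M\u00c3\u00a9xico"
assert len(forced) == len(text) + 1

# Hashing the forced string byte-for-byte gives the same digest as
# hashing the UTF-8 byte array directly.
digest_from_forced = hashlib.md5(forced.encode("latin-1")).hexdigest()
digest_from_bytes = hashlib.md5(utf8_bytes).hexdigest()
assert digest_from_forced == digest_from_bytes
```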

If you need to "convert" a string to UTF-8 while using this cryptography toolkit, you probably intend to pass the result to a message digest (hash) or signature function. The underlying hash functions work with byte arrays anyway, so you should go directly from the Latin-1 string to a byte array containing the correct UTF-8-encoded bytes. You can then check that you have the correct bytes and pass this array directly to the "Bytes" version of the hash function.
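The same flow (string to UTF-8 bytes, check the bytes, hash the bytes) can be sketched with Python's standard library. The helper names is_valid_utf8 and md5_hex_of_utf8 are hypothetical and exist only for this illustration; they are not Toolkit functions.

```python
import hashlib

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")  # strict decoding rejects malformed input
        return True
    except UnicodeDecodeError:
        return False

def md5_hex_of_utf8(text: str) -> str:
    """Go straight from a string to UTF-8 bytes, check them, then hash them."""
    data = text.encode("utf-8")
    if not is_valid_utf8(data):
        raise ValueError("not valid UTF-8")
    return hashlib.md5(data).hexdigest()

print(md5_hex_of_utf8("abc"))  # 900150983cd24fb0d6963f7d28e17f72
```

A lone Latin-1 byte such as E9 fails the check, which is a quick way to catch data that was never re-encoded.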

We have also added the new function CNV_ByteEncoding and the equivalent method Cnv.ByteEncoding to convert the encoding of a byte array between UTF-8 and Latin-1. Again, working with byte arrays is the right way to do it.
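As a rough picture of what a byte-array-to-byte-array conversion involves, here is a Python sketch. The function reencode and its to_utf8 parameter are hypothetical names for this illustration; they do not reflect the signature or error handling of CNV_ByteEncoding.

```python
def reencode(data: bytes, to_utf8: bool) -> bytes:
    """Hypothetical helper: re-encode a byte array between Latin-1 and UTF-8."""
    if to_utf8:
        # Every byte value is a valid Latin-1 character, so this direction
        # cannot fail.
        return data.decode("latin-1").encode("utf-8")
    # This direction raises UnicodeDecodeError if data is not valid UTF-8,
    # and UnicodeEncodeError if a character has no Latin-1 equivalent.
    return data.decode("utf-8").encode("latin-1")

latin1 = b"M\xe9xico"
utf8 = reencode(latin1, to_utf8=True)
print(utf8.hex(" "))  # 4d c3 a9 78 69 63 6f
assert reencode(utf8, to_utf8=False) == latin1
```

The conversion from UTF-8 to Latin-1 can fail because UTF-8 can represent characters that Latin-1 cannot.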

Both VB6 and .NET store strings internally in "Unicode" encoding (UTF-16 in .NET; possibly UCS-2 in VB6), but when passed to this Toolkit they are automatically converted to strings of "ANSI" characters. For more information, see Converting strings to bytes and vice versa.

Sample code

Here's how to change the code in VB6 to obtain the MD5 digest of a UTF-8-encoded string using the new functions.

Dim strData As String
Dim strDataUTF8 As String   ' Needed only by the old string-based calls
Dim abDataUTF8() As Byte
Dim strDigest As String
Dim nRet As Long
Dim nLen As Long

' Our original string data
strData = "||069204|2004-09-27T11:29:56|11111|4 MENSUALIDADES|AME9806027F9|Asociación Mexicana de Estándares para el Comercio Electrónico A.C.|Blvd Manuel Avila Camacho|138|Lomas de Chapultepec|Ciudad de México|Periferico|Miguel Hidalgo|Distrito Federal|México|11000|PAT980610G6|Empresa Prueba SA de CV|Homero|18|Nueva Anzures|Ciudad de México|Mariano Escobedo|Miguel Hidalgo|Distrito Federal|México|11570|10|Articulo Prueba 1|20|200|15|Articulo Prueba 2|10|150|30|Articulo Prueba 3|10|300|IEPS|64||"
' "Convert" to UTF-8
' Old: nLen = CNV_UTF8FromLatin1("", 0, strData)
' New: find the required length in bytes
nLen = CNV_UTF8BytesFromLatin1(vbNull, 0, strData)
If nLen <= 0 Then
    Debug.Print "Failed to convert to UTF-8: " & nLen
    Exit Function
End If
' Old: strDataUTF8 = String(nLen, " ")
ReDim abDataUTF8(nLen - 1)
' Old: nLen = CNV_UTF8FromLatin1(strDataUTF8, nLen, strData)
nLen = CNV_UTF8BytesFromLatin1(abDataUTF8(0), nLen, strData)

' Create a hash but first dimension the string to receive it
strDigest = String(PKI_MD5_CHARS, " ")
' Old: nRet = HASH_HexFromString(strDigest, Len(strDigest), strDataUTF8, Len(strDataUTF8), PKI_HASH_MD5)
nRet = HASH_HexFromBytes(strDigest, Len(strDigest), abDataUTF8(0), nLen, PKI_HASH_MD5)
Debug.Print "Digest=" & strDigest

This should result in the output:
Digest=6e93bd823e92da03d75c2f1a3966c8d0

Here is code to do the same thing in VB.NET.

Dim strData As String
Dim abDataUTF8() As Byte
Dim strDigest As String

' Our original string data
strData = "||069204|2004-09-27T11:29:56|11111|4 MENSUALIDADES|AME9806027F9|Asociación Mexicana de Estándares para el Comercio Electrónico A.C.|Blvd Manuel Avila Camacho|138|Lomas de Chapultepec|Ciudad de México|Periferico|Miguel Hidalgo|Distrito Federal|México|11000|PAT980610G6|Empresa Prueba SA de CV|Homero|18|Nueva Anzures|Ciudad de México|Mariano Escobedo|Miguel Hidalgo|Distrito Federal|México|11570|10|Articulo Prueba 1|20|200|15|Articulo Prueba 2|10|150|30|Articulo Prueba 3|10|300|IEPS|64||"
' "Convert" to UTF-8
abDataUTF8 = System.Text.Encoding.UTF8.GetBytes(strData)
' Create a hash 
strDigest = Hash.HexFromBytes(abDataUTF8, HashAlgorithm.Md5)
Console.WriteLine("Digest=" & strDigest)


Copyright © 2004-12 D.I. Management Services Pty Ltd. All rights reserved.