Checks if a byte array contains valid UTF-8 characters.
Public Declare Function CNV_CheckUTF8Bytes Lib "diCrPKI.dll" (ByRef lpInput As Byte, ByVal nBytes As Long) As Long
nLen = CNV_CheckUTF8Bytes(lpInput(0), nBytes)
long __stdcall CNV_CheckUTF8Bytes(const unsigned char *lpInput, long nBytes);
Returns zero if the byte array contains invalid UTF-8, or a positive number if it contains valid UTF-8. The positive value indicates the nature of the encoded characters (see the return values below).
Public Function cnvCheckUTF8Bytes(lpInput() As Byte) As Long
Cnv.CheckUTF8 Method (Byte[])
static Cnv.utf8_check(data)
Return values:
| Constant | Value | Result |
|---|---|---|
| PKI_CHRS_NOT_UTF8 | 0 | Not valid UTF-8 |
| PKI_CHRS_ALL_ASCII | 1 | Valid UTF-8, all characters are 7-bit ASCII |
| PKI_CHRS_ANSI8 | 2 | Valid UTF-8, contains at least one multi-byte character equivalent to 8-bit ANSI |
| PKI_CHRS_MULTIBYTE | 3 | Valid UTF-8, contains at least one multi-byte character that cannot be represented in a single-byte character set |
Overlong UTF-8 sequences and illegal surrogates are rejected as invalid. Strings that return PKI_CHRS_ANSI8 (2) can be converted to Latin-1 format using the CNV_Latin1FromUTF8Bytes() function. Strings that return PKI_CHRS_MULTIBYTE (3) cannot be converted to Latin-1. Strings that return PKI_CHRS_ALL_ASCII (1) need no conversion because they consist only of 7-bit ASCII characters.
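The four return values can be illustrated with a minimal pure-Python sketch. It does not call diCrPKI.dll and is not the library's implementation. The function name `classify_utf8` is hypothetical, and the sketch assumes that "8-bit ANSI" means code points U+0080 to U+00FF (Latin-1), which the source implies but does not state. It relies on Python's strict UTF-8 decoder, which also rejects overlong sequences and encoded surrogates.

```python
# Hypothetical sketch of the classification; NOT the diCrPKI.dll code.
# Constant names and values are taken from the return values table.
PKI_CHRS_NOT_UTF8 = 0
PKI_CHRS_ALL_ASCII = 1
PKI_CHRS_ANSI8 = 2
PKI_CHRS_MULTIBYTE = 3


def classify_utf8(data: bytes) -> int:
    """Classify a byte string in the manner described for CNV_CheckUTF8Bytes."""
    try:
        # Strict decoding fails on overlong sequences and encoded surrogates.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return PKI_CHRS_NOT_UTF8

    # Highest code point present; 0 for empty input (an assumption here,
    # the source does not say what the library returns for empty input).
    highest = max(map(ord, text), default=0)

    if highest < 0x80:
        return PKI_CHRS_ALL_ASCII   # nothing to convert
    if highest < 0x100:
        return PKI_CHRS_ANSI8       # assumed: fits in Latin-1
    return PKI_CHRS_MULTIBYTE       # cannot be represented in Latin-1


if __name__ == "__main__":
    samples = [b"abc", "abc\u00f3".encode("utf-8"),
               "\u20ac".encode("utf-8"), b"\xc0\x80", b"\xed\xa0\x80"]
    for sample in samples:
        print(sample.hex(), classify_utf8(sample))
```

A caller would typically branch on the result: convert to Latin-1 only for 2, use the bytes unchanged for 1, and treat 0 and 3 as not convertible.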
See the example in CNV_UTF8BytesFromLatin1.
Dim strData As String
Dim lpDataUTF8() As Byte
strData = "abcóéÍáñ"
Debug.Print "Latin-1 string='" & strData & "'"
Debug.Print " (" & Len(strData) & " characters)"
lpDataUTF8 = cnvUTF8BytesFromLatin1(strData)
Debug.Print "UTF-8=(0x)" & cnvHexStrFromBytes(lpDataUTF8)
Debug.Print " (" & cnvBytesLen(lpDataUTF8) & " bytes)"
Debug.Print "cnvCheckUTF8Bytes returns " & cnvCheckUTF8Bytes(lpDataUTF8) & " (expected 2)"
' And back to a string
Dim strLatin1 As String
strLatin1 = cnvLatin1FromUTF8Bytes(lpDataUTF8)
Debug.Print "Back to string='" & strLatin1 & "'"
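To cross-check the byte counts in the VBA example without the DLL, the same round trip can be sketched with Python's standard codecs. This is an independent illustration, not output captured from the library. It assumes the sample string is the eight characters `abcóéÍáñ` and that Latin-1 here means ISO-8859-1.

```python
# Independent cross-check using Python's built-in codecs; does not call diCrPKI.dll.
text = "abc\u00f3\u00e9\u00cd\u00e1\u00f1"   # "abcóéÍáñ" written with escapes

# Latin-1 text encoded as UTF-8: each of the five accented letters takes 2 bytes.
utf8 = text.encode("utf-8")
print("characters:", len(text))             # 8
print("UTF-8 hex :", utf8.hex().upper())    # 616263C3B3C3A9C38DC3A1C3B1
print("bytes     :", len(utf8))             # 13

# And back again: decode the UTF-8, then confirm it fits in a single-byte set.
back = utf8.decode("utf-8")
latin1 = back.encode("latin-1")             # would raise if any char > U+00FF
print("round trip ok:", back == text, "Latin-1 bytes:", len(latin1))
```

Every character lies in the range U+0080 to U+00FF or below, which is consistent with the return value 2 that the VBA example expects.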
See also: CNV_CheckUTF8File, CNV_Latin1FromUTF8Bytes, CNV_UTF8BytesFromLatin1, CNV_ByteEncoding