Accented characters and UTF-8 in XML-DSIG signatures

On this page we work through a simple example of creating an XML-DSIG signature for an XML document that contains accented characters like áéíóúñ.

The problem

Question: I am trying to create the signature for the <Book> element in this XML document, but I cannot compute the correct SHA-1 digest value.

<?xml version="1.0" encoding="ISO-8859-1"?>
<References>
<Book xml:id="F01">
<FirstName>Bruceñ</FirstName>
</Book>
</References>

I can do it for the first name "Bruce" but it fails when I add the ñ character.

The answer

Answer: The usual cause is that the canonicalized (c14n'd) form must be encoded in UTF-8, so the ñ character, a single byte (0x)F1 in ISO-8859-1, has to be hashed as the two bytes (0x)C3 B1.
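
To see the difference concretely, here is a minimal sketch in Python (not the VB6 used on the rest of this page, and not part of the original sample code). It only shows that the same text gives different bytes, and therefore a different SHA-1 digest, depending on the encoding.

```python
import hashlib

text = "Bruce\u00f1"  # "Bruce" followed by LATIN SMALL LETTER N WITH TILDE

latin1_bytes = text.encode("latin-1")  # how the ISO-8859-1 file stores it
utf8_bytes = text.encode("utf-8")      # what the canonicalized form requires

print(latin1_bytes.hex())  # 4272756365f1
print(utf8_bytes.hex())    # 4272756365c3b1

# Different input bytes give a different digest, which is why hashing
# the raw Latin-1 bytes produces the "wrong" DigestValue.
latin1_digest = hashlib.sha1(latin1_bytes).hexdigest()
utf8_digest = hashlib.sha1(utf8_bytes).hexdigest()
print(latin1_digest != utf8_digest)  # True
```

For plain "Bruce" the two encodings are byte-for-byte identical, which is why the digest only goes wrong once the ñ is added.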

Here is some VB6 code that takes the input as it would be read from the original file in ISO-8859-1 (Latin-1) encoding and computes the SHA-1 digest value of the canonicalized form.

Dim strData As String
Dim abData() As Byte
Dim strDigest As String
Dim nRet As Long
Dim nDataLen As Long
Dim strSig64 As String

' Input as (would be) read from original file (with LATIN SMALL LETTER N WITH TILDE)
strData = "<Book xml:id=""F01"">" & vbCrLf & _
    "<FirstName>Bruceñ</FirstName>" & vbCrLf & _
    "</Book>"
Debug.Print strData
Debug.Print "ORIG= " & cnvHexStrFromString(strData)
' Line breaks are normalized to "#xA": i.e. convert CR-LF pairs to LF
strData = Replace(strData, vbCrLf, vbLf)
' Convert to UTF-8
nDataLen = CNV_UTF8BytesFromLatin1(vbNull, 0, strData)
ReDim abData(nDataLen - 1)
nDataLen = CNV_UTF8BytesFromLatin1(abData(0), nDataLen, strData)
' Display input as sequence of bytes in hex form
Debug.Print "INPUT=" & cnvHexStrFromBytes(abData)
' Form SHA-1 digest of input
strDigest = String(PKI_SHA1_CHARS, " ")
nRet = HASH_HexFromBytes(strDigest, Len(strDigest), abData(0), nDataLen, PKI_HASH_SHA1)
Debug.Print "DIGEST(hex)=" & strDigest
' Encode in base64
strSig64 = cnvB64StrFromHexStr(strDigest)
' Return base64-encoded digest...
Debug.Print "DIGEST(base64)=" & strSig64

This produces the following output:

<Book xml:id="F01">
<FirstName>Bruceñ</FirstName>
</Book>
ORIG= 3C426F6F6B20786D6C3A69643D22463031223E0D0A3C46697273744E616D653E4272756365F13C2F46697273744E616D653E0D0A3C2F426F6F6B3E
INPUT=3C426F6F6B20786D6C3A69643D22463031223E0A3C46697273744E616D653E4272756365C3B13C2F46697273744E616D653E0A3C2F426F6F6B3E
DIGEST(hex)=ff1ae390056f0aea04dc8e6db9d19d2325391a1d
DIGEST(base64)=/xrjkAVvCuoE3I5tudGdIyU5Gh0=

The Digest Value

The short answer is that the "DigestValue" we require for the XML signature is /xrjkAVvCuoE3I5tudGdIyU5Gh0=.

The longer answer for those of you who want to debug your own programs is that we convert the sequence of bytes in the original XML fragment <Book>...</Book>

3C426F6F6B20786D6C3A69643D22463031223E0D0A3C46697273744E616D653E4272756365F13C2F46697273744E616D653E0D0A3C2F426F6F6B3E

to another sequence of bytes, the canonicalized form of exactly 58 bytes:

3C426F6F6B20786D6C3A69643D22463031223E0A3C46697273744E616D653E4272756365C3B13C2F46697273744E616D653E0A3C2F426F6F6B3E

where all CR-LF pairs (0x)0D 0A are converted to a single 0A, and (in this case) the only non-ASCII character ñ (F1) is represented by two bytes (0x)C3 B1 in UTF-8 encoding.

Note that this sequence should always begin with a "<" character (3C) and always end with a ">" (3E). Be aware, too, that general c14n involves many other changes (for example, ordering attributes, propagating namespace declarations and expanding empty elements into start-end tag pairs), but these two conversions are all we need for our simple example.

This sequence of 58 bytes is input to the SHA-1 digest algorithm to yield the result (0x)FF1AE390056F0AEA04DC8E6DB9D19D2325391A1D or /xrjkAVvCuoE3I5tudGdIyU5Gh0= in base64. Only this exact sequence as input will give the correct digest value.
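
As an independent cross-check, the same steps can be sketched in Python using only the standard library. This is an illustrative sketch for this one fragment, not a general c14n implementation; the function name `canonical_bytes` is made up for the example.

```python
import base64
import hashlib

# The <Book> fragment exactly as stored in the ISO-8859-1 file:
# CR-LF line endings and the n-with-tilde as the single byte F1.
raw = b'<Book xml:id="F01">\r\n<FirstName>Bruce\xf1</FirstName>\r\n</Book>'


def canonical_bytes(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Apply the two conversions this example needs (hypothetical helper)."""
    text = data.decode(source_encoding)  # bytes -> characters
    text = text.replace("\r\n", "\n")    # CR-LF pairs -> single LF
    return text.encode("utf-8")          # characters -> UTF-8 bytes


canon = canonical_bytes(raw)
digest = hashlib.sha1(canon).digest()

print(len(canon))           # 58
print(canon.hex().upper())  # compare with the INPUT= line above
print(digest.hex())         # compare with the DIGEST(hex)= line above
print(base64.b64encode(digest).decode("ascii"))
```

If your own program prints a different hex string for the canonical bytes, compare it with the 58-byte sequence above before looking at the hashing step.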

The Signature Value

Having got this digest value, we need to compute the signature value. The input to this process is the canonicalized form of the "SignedInfo" element. Our exact input in this case is

<SignedInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
<CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"></CanonicalizationMethod>
<SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"></SignatureMethod>
<Reference URI="#F01">
<Transforms>
<Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"></Transform>
</Transforms>
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"></DigestMethod>
<DigestValue>/xrjkAVvCuoE3I5tudGdIyU5Gh0=</DigestValue>
</Reference>
</SignedInfo>

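That is, the 554 bytes shown in this hexdump. Note again that the first and last characters in this sequence are "<" and ">", respectively, and that the line endings are single LF characters 0x0A. Note also that the canonicalized <SignedInfo> is not identical to the element as written in the final document: it carries the xmlns declaration inherited from the enclosing <Signature> element, and every empty element is written as a start-end tag pair.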
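
The following Python sketch starts from the <SignedInfo> element as it appears in the final signed document and rebuilds the canonical input shown above. It is a shortcut that only works for this particular element: the regular expression and the helper name `canonical_signed_info` are assumptions made for illustration, and real code should use a proper c14n implementation.

```python
import hashlib
import re

C14N = "http://www.w3.org/TR/2001/REC-xml-c14n-20010315"
DSIG = "http://www.w3.org/2000/09/xmldsig#"

# <SignedInfo> as it is written inside <Signature> in the signed document
signed_info_in_document = "\n".join([
    "<SignedInfo>",
    f'<CanonicalizationMethod Algorithm="{C14N}"/>',
    f'<SignatureMethod Algorithm="{DSIG}rsa-sha1"/>',
    '<Reference URI="#F01">',
    "<Transforms>",
    f'<Transform Algorithm="{C14N}"/>',
    "</Transforms>",
    f'<DigestMethod Algorithm="{DSIG}sha1"/>',
    "<DigestValue>/xrjkAVvCuoE3I5tudGdIyU5Gh0=</DigestValue>",
    "</Reference>",
    "</SignedInfo>",
])


def canonical_signed_info(text: str) -> bytes:
    """Hand-rolled canonical form for this one element (hypothetical helper)."""
    # Empty elements <X .../> become start-end tag pairs <X ...></X>
    text = re.sub(r"<([A-Za-z][\w:.-]*)([^<>]*?)/>", r"<\1\2></\1>", text)
    # The namespace declared on <Signature> is written on <SignedInfo>
    text = text.replace("<SignedInfo>", f'<SignedInfo xmlns="{DSIG}">', 1)
    # Line endings are single LF characters, and the bytes are UTF-8
    return text.replace("\r\n", "\n").encode("utf-8")


canon = canonical_signed_info(signed_info_in_document)
print(len(canon))  # 554
print(canon[:1], canon[-1:])  # b'<' b'>'
# Compare with the SHA-1 digest value quoted on this page
print(hashlib.sha1(canon).hexdigest())
```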

This input has the SHA-1 digest value (0x)3910b5e6a669140c374978e55fe08d87ddd81e74, which is incorporated in the actual signature value. We use Alice's 1024-bit RSA private key, which is stored in encrypted form (password="password"), to create the signature. See the sample code below.

You can see the XML form of Alice's public key in the <RSAKeyValue> in the final XML document below. There are example procedures in the sample code showing how to form this using both Alice's certificate and private key file.

The final signed XML document

<?xml version="1.0" encoding="ISO-8859-1"?>
<References>
<Book xml:id="F01">
<FirstName>Bruceñ</FirstName>
</Book>
<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
<SignedInfo>
<CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
<SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<Reference URI="#F01">
<Transforms>
<Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
</Transforms>
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>/xrjkAVvCuoE3I5tudGdIyU5Gh0=</DigestValue>
</Reference>
</SignedInfo>
<SignatureValue>
PZab37+BAm8XXXLOL4CxF0M0Ep9okIl2IDfvOEaejCv68lrRv0zCF3zXOkl6x09e
prbrWK1adS3bNzqK0KxDiUBuOAKgX0wF1MrTPCJmlwU+HSsVmbFj49jlx+9q5YGi
oEf3bdOFO3Mj9+1snhwEAIPVVe8n+jGvbTD/d4CyPIk=
</SignatureValue>
<KeyInfo>
<KeyValue>
<RSAKeyValue>
<Modulus>
4IlzOY3Y9fXoh3Y5f06wBbtTg94Pt6vcfcd1KQ0FLm0S36aGJtTSb6pYKfyX7PqC
UQ8wgL6xUJ5GRPEsu9gyz8ZobwfZsGCsvu40CWoT9fcFBZPfXro1Vtlh/xl/yYHm
+Gzqh0Bw76xtLHSfLfpVOrmZdwKmSFKMTvNXOFd0V18=
</Modulus>
<Exponent>AQAB</Exponent>
</RSAKeyValue>
</KeyValue>
</KeyInfo>
</Signature>
</References>

You can check this at the XML Security Library Online XML Digital Signature Verifier.
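
For a local check without any XML library, the RSA part can be sketched in Python with plain integer arithmetic. This assumes the RSASSA-PKCS1-v1_5 encoding that the rsa-sha1 SignatureMethod specifies, and it takes the SignedInfo digest quoted earlier on this page as given. The helper names `expected_block` and `signature_matches` are made up for the example, and the base64 values are transcribed from the document above, so the final result depends on that transcription being intact.

```python
import base64

# ASN.1 DigestInfo header for SHA-1, as used in PKCS#1 v1.5 signatures
SHA1_DIGESTINFO_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")


def expected_block(digest: bytes, k: int) -> bytes:
    """Build the k-byte block 00 01 FF..FF 00 || DigestInfo || digest."""
    t = SHA1_DIGESTINFO_PREFIX + digest
    return b"\x00\x01" + b"\xff" * (k - len(t) - 3) + b"\x00" + t


def signature_matches(sig: bytes, n: int, e: int, digest: bytes) -> bool:
    """Apply the public key to the signature and compare with the expected block."""
    k = (n.bit_length() + 7) // 8  # modulus length in bytes
    if len(sig) != k:
        return False
    recovered = pow(int.from_bytes(sig, "big"), e, n).to_bytes(k, "big")
    return recovered == expected_block(digest, k)


# Values copied from <Modulus>, <Exponent> and <SignatureValue> above
modulus = base64.b64decode("""
4IlzOY3Y9fXoh3Y5f06wBbtTg94Pt6vcfcd1KQ0FLm0S36aGJtTSb6pYKfyX7PqC
UQ8wgL6xUJ5GRPEsu9gyz8ZobwfZsGCsvu40CWoT9fcFBZPfXro1Vtlh/xl/yYHm
+Gzqh0Bw76xtLHSfLfpVOrmZdwKmSFKMTvNXOFd0V18=
""")
signature = base64.b64decode("""
PZab37+BAm8XXXLOL4CxF0M0Ep9okIl2IDfvOEaejCv68lrRv0zCF3zXOkl6x09e
prbrWK1adS3bNzqK0KxDiUBuOAKgX0wF1MrTPCJmlwU+HSsVmbFj49jlx+9q5YGi
oEf3bdOFO3Mj9+1snhwEAIPVVe8n+jGvbTD/d4CyPIk=
""")
n = int.from_bytes(modulus, "big")
e = int.from_bytes(base64.b64decode("AQAB"), "big")

# SHA-1 digest of the canonicalized SignedInfo, as quoted on this page
signed_info_digest = bytes.fromhex("3910b5e6a669140c374978e55fe08d87ddd81e74")

print(n.bit_length())  # 1024
print(e)               # 65537
print(signature_matches(signature, n, e, signed_info_digest))
```

Only the public key is needed here, so Alice's private key file and its password play no part in checking the signature.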

The Code

Here is our sample code in VB6/VBA and in VB.NET/VB200x. This zipped file (7.2 kB) includes the source code as well as Alice's private key file (password="password"), X.509 certificate and the final XML document.

See also

  1. Signing an XML document using XMLDSIG: a simple example signing a straightforward text string and storing the result in an XML document.
  2. XML-Dsig and the Chile SII: creating digital signatures in XML documents (XML-Dsig) using the standards for electronic invoices set by the Servicio de Impuestos Internos (SII) of Chile.

This page first published by DI Management 4 December 2010. Last updated 27 October 2011.