Datamatrix Code

“How is it possible that mathematics, a product of human thought independent of experience, adapts so admirably to the objects of reality.”

Albert Einstein

You can download the next files that are complementary for the explanation:

The common solution to generate a datamatrix code in a machining center is to buy a macro (SW has a very good but only work in sinumerik) or do the engraving after the machining process. But sometimes could be better to have a CNC program to engrave a code on the piece. A datamatrix code encodes a string of characters in a symbol. The symbol consists of a set of modules, each module represents a bit, a "full" module represents a 1, while an empty one is 0.

The symbol is limited by two lines that form an "L" that serve to align the symbol. On the opposite side it consists of two lines of full and empty modules placed alternately, this side determines the size of the module.

Símbolo Datamatrix mostrando únicamente sus bordes

Fi. 1 Datamatrix Code The code modules represent ones and zeros.

The string of characters to be encoded become "codewords," each codeword has an 8-bit length. The advantage of the datamatrix code is its redundancy or parity, as codewords are added that allow the character string to be recovered even though the symbol is incomplete or not properly scanned. This error correction process is achieved through the “Reed Solomon algorithm”

Reed Solomon's algorithm is based on “finite field algebra”. Understanding the algebra of fields is the most difficult part for a manufacturing engineer, but its implementation does not require such deep knowledge (although it would be desirable). The finite field algebra is not equal to the common algebra, which operates in an infinite field of numbers. The finite field algebra is based on a field with a limited number of elements. For example if we have a "field" of only two elements, 0 and 1, the sum of 0 and 1 is equal to 1, as in the algebra we all know. In the "normal" algebra the sum of 1+1=2, but it will be equal to 1 in the algebra of finite fields (for a two-element field). For example, suppose a field of 12 elements(1,2,3,4,5,6,7,8,9,10,11), as on a clock. If we add 12 to 12, the result would be 24 for the "normal" algebra, but it will be zero in the finite field. The explanation of the clock tells us that the result would be the residual of the operation, this can be expressed as a "modulo" operation (which can be performed in ms excel). The clock analogy can serve when we want to express a number on a finite element field. For example, the number 4 in a three-element field (0,1,2) would be equal to 1, which can also be expressed as 4 mod 1. The figure shows how a three-element clock (0,1,2) would express the number 4. The first three elements would complete a lap of three elements and only one element will be the remainder.

Modulo Operation

Fig. 2 Modulo Operation The results of performing a modulo operation using different values are shown. On the right, the analogy of a clock to show the result of 4 in modulo 3.

In general terms the idea behind Reed Solomon's algorithm is that according to Galois field theory, if we have a field of n elements we can reconstruct the entire field if we know the value of a lower-grade polynomial. The explanation is deeper but for our purposes it is enough. Thus, we can encode the character string as a polynomial of degree N which in turn is divided by a known polynomial. With the division remainder the error correction values can be constructed. All these operations will take place within a field of 256-element (0,1,...255).

The number of data codewords that can be stored in a symbol is determined by the number of rows and columns. It is important to clarify that the number of "codewords" is not necessarily equal to the number of characters we want to encode, since there are ways in which more than one character can be grouped in a codeword

Symbol size

Fig. 3 Number of data in a symbol. The number of rows and columns of the symbol determines the number of data codewords that can be encoded in a symbol.

There are various encodation schemes, the ASCII scheme is used by default. Others have certain advantages such as being able to encode more characters into a single codeword, although they are more limited in the characters they can encode. The attached table shows how the ASCII encodation may require up to 16 bits to encode a character, but C40 mode only requires 5 in some cases, with the disadvantage that C40 mode is limited to fewer characters. Each mode has different rules for encoding characters.

Encodation schemes

Fig. 4 Encodation scheme. The Encodation schemes have different capablilities to store characters in a codeword, so the expense in bit per character can be lower

The steps to encode a string of characters would the following:

Encode data according to the selected scheme
Divide the obtained polynomial using the polynomial divisor according to the size of the symbol.
Get the error correction codewords from division remainder
Convert the codewords to binary mode (8 bits).
Transfer the binary values to the symbol modules.

Example

The phrase "ANITA LAVA LA TINA" will be encoded in ASCII scheme. The number of characters is 18, so we will choose an 18-row symbol, which can store up to 22 data codewords and 18 error correction codewords. Figure 5 shows the location of each bit of the codewords.

symbol size 18

Fig. 5 Symbol size 18. The 18-row symbol can store up to 22 data codewords and 18 error correction codewords.

The encoding is always based on the decimal value of the character according to ASCII table (Figure 6 and 7). and follows the next rules.

For ASCII characters with decimal values from 0 to 127 CODEWORD="ASCII VALUE" + 1
For ASCII characters with decimal values from 128 to 255 (ASCII extended) CODEWORD="ASCII VALUE" - 127
For pairs of digits ("00," "10," "05," etc) CODEWORD="NUMERICAL VALUE OF PAR OF DIGITS" + 130.

ASCII chart

Fig. 6 ASCII Chart. The encoding of a character is based on its decimal value according to ASCII table. (Source:wikimedia).

ASCII codewords

Fig. 7 Creating codewords for characters. In ASCII scheme for each character a codeword is generated, which is equal to its decimal value plus one.

A 18-row symbol can store up to 22 data codewords, but the character string has only 18. When there are extra codewords, they will have to be filled using the following rules:

The first extra codeword will have the value 129
The following will use algorithm 253 where a pseudo-random number R is required:

R = ((149 * P) MOD 253) 1 where P is the position number.
CODEWORD = (129 + R) MOD 254

Figure 8 shows the complete codewords for the data, due the the string is only text (no numbers) each character corresponds to a codeword, in addition 4 codewords for empty spaces are added.

ASCII codewords with emtpy spaces

Fig. 8 Creating codewords for empty characters and spaces. he first value always has the value of 129, the subsequent ones make use of the formulas shown before.

The codewords found will be used as the data polynomial coefficients in Reed Solomon's algorithm. In a very general way the objective is to build a final polynomial c(x). According to the formula of Figure 9.

Reed Solomon formula

Fig. 9 Coding Formula Reed Solomon. eed Solomon's encoding is the sum of a data polynomial and a control polynomial (obtained from a modulo operation).

The degree of the data polynomial is established according to the number of data codewords k, in this case the number of codewords is k=22, the degree is k-1=21 since the value of the powers begins at zero (0,1,...21). This polynomial multiplies by x IS raised to n-k. The purpose of this operation is to leave room for control polynomials.

¨Data Polynomial

Fig. 10 Data polynomials. Data polynomials, the degree is determined by the number of data codewords k. The polynomial i(x) is multiplied by x-(n-k) to expand it and add the elements of the control polynomial.

Figure 10 shows the first part that composes the final polynomial c(x) The control polynomial is a modulo operation, that is, the remainder of the division of the polynomial i(x)x^(n-k) between the polynomial generator g(x). The polynomial g(x) can be calculated using algorithm provided in the standard IEC 16022, however the standard also provides them completely developed according to the number of data correction codewords. For our example with 18 error correction codewords, the polynomial is shown in Figure 11. Figure 12 shows a visualization of the module operation, which shows that the result would be the remainder of the division of the polynomial data between the polynomial generator.

¨Polinomios generador

Fig. 11 Generator polynomials for 18 error correction codewords..

¨Modulo Operation

Fig. 12 Modulo Operation. The g(x) modulo i(x)x^(n-k) is the remainder value of its division.

The polynomial division must be carried out in the finite field of 256. To make it easier, special logarith tables are used for that field.

The process consists of obtain the anti-logarithms (in the finite field 256) of the dividend and divisor coefficients. The process is repetitive for each coefficient of the dividend. At each stage the antilogarithm of the dividend coefficient is added to the antilogarithm of each divisor coefficient, that sum is modulo 255, and for that result the logarithm is obtained. When the sequence is completed for each divisor coefficient, an XOR operation is performed to obtain the result and the process is repeated.

¨Polynomial division operation

Fig. 13 Polynomial division operation. t the beginning the antilogarithm of the first dividend coefficient should be added with the antilogarithm of the divisor coefficient, this sum is modulo 255, and the result logarithm is obtained. Subsequently, an XOR operation is carried out to obtain the new coefficients of the polynomial.

¨Operation polynomial division

Fig. 14 Operation polynomial division. The process of obtaining new coefficients is repeated at the following levels..

When completing operations for all dividend coefficients, a remainder will be obtained, which is the modulo. the remainder values are the coefficients of the polynomial control (Figure 9). The coefficients of the polynomial data are shown in yellow in Figure 16, and in blue the coefficients of the polynomial control.

¨Final polynomial division result

Fig. 15 Final polynomial division result. The final remainder coefficients are the codewords to correcting errors

Each codeword obtained from the previous process should be expressed in the form of 8 bits. Each of the individual bit values are accommodated according to the symbol mapping of Figure 5. At the end the values of 1 are expressed as a full module and zeros as an empty module, the process is now complete (Figure 17)

Coefficient bitwise values

Fig. 16 The coefficient values should be expressed in bits and accommodating those values in their corresponding places in the symbol

Full symbol

Fig. 17 Full symbol. Each full module is a 1, an empty one a 0.

Creation of a CNC macro

We will encode a serial number consisting of 13 alphanumeric characters ("02AD23MH06001"), formed only by capitals and numbers. Due to it is a machining operation we are interested in using the fewest characters, so instead of ASCII scheme we will use C40 mode. This scheme allows to group 3 characters into two codewords, allowing to save space and time.

Regarding the symbol, the most suitable for our purposes is one of 16 X 16, which is capable of storing 12 data codewords and 12 error correction.

In C40 scheme each character has a decimal value, each 3 consecutive characters (C1, C2, C3) are encoded with the formula (1600*C1)+(40*C2)+C3+1.

This formula generates a value from 1 to 64000

Two codewords are obtained

The first codeword is the integer part of division of the value generated by 256
The second codeword is the remainder of the same division.

C40 values chart

Fig. 18 C40 values chart CWhen using the C40 scheme each character has the value corresponding to this chart.

As the default schemes ASCII, we should indicate the change to C40 scheme using the codeword 230 (latch code) before the rest of data codewords. If go back to ASCII scheme is needed, codeword 254 is added and the coding in ASCII scheme is used (equal to the decimal value plus one).

Figure 18 shows the calculation of the 16-bit value for the first three characters. Each character has a value according to the value of the table for the code C40.

C40 16 bits calculation

Fig. 19 Calculation of 16 bit values and their two codewords for the first three characters. C40 mode allows to group three data (characters) into two codewords. These codewords are generated from a value of 16 bits. The first codeword is the integer value of dividing the 16-bit value by 256. The second codeword is operation modulo 256.

When calculating the 16-bit value we see that there are only enough characters to complete 4 16-bit values. From each of these values two codewords are obtained, that is, we have 8 codewords. The first codeword to the data should be the 230 to indicate that we are changing to C40 mode, then put the data codewords, which are 8 in this case, so we have 9 codewords of 12 available. We see that we have to encode a one character, which for simplicity is encoded in ASCII, so the tenth codeword will be the value 254 to return to that scheme. The eleventh codeword would be the value of the character in ASCII scheme (decimal value plus 1), and the twelfth codeword would be a padding value (129).

Calculation of 165-bit values and their two codewords for all characters.

Fig. 20 Calculation of 165-bit values and their two codewords for all characters. The first codeword must be the 230 value to indicate that C40 scheme will be used. Since not all values can be encoded in C40 mode, is necessary to return to ASCII scheme using the 254 value. The remaining character is then encoded using ASCII scheme, the remaining space is filled with a padding value (129).

The error correction codewords would also be 12 and are obtained as shown before: from the expanded polynomial division using as dividend coefficients the data codewords previously obtained. The values of the divisor polynomial g(x) are obtained from the IEC standard.

The division process must be carried out in a recursive way until the remainder is obtained. Each stage has to use the antilogarithms (base 256) performing an XOR operation.

Obtaining error correction values.

Fig. 21 LObtaining the error correction codewords is the result of dividing the expanded polynomial by the g(x) polynomial corresponding to 12 error codewords. The codeword values will be the coefficient values of the remainder polynomial.

To create a program in CNC it is only necessary to convert each module of the symbol into a coordinate and to program a marking if the module is 1 and omit it if it is 0..

The edges of the symbol have values already defined, so their programming is simple. By dividing the entire symbol into columns we only require to determine the individual value of each point. We will identify the position of the lower left corner by the coordinates (LSX, LSY), the value of the DSP variable will be the space between points.

Symbol coordinates

Fig. 22 Variables used in Macro The lower left corner of the symbol has coordinates (LSX, LSY), the distance between modules will be DSP.

Each codeword consists of 8 bits and each bit has a predefined location in symbol. Figure 21 shows how the value of codeword 2 has a value of 25 that is 00011001.

Location of codeword values in the symbol

Fig. 23 Location of codeword values in the symbol. Each bit of the codewords has a predefined location on the symbol.

When the marking is programmed, the value of each bit should be isolated according to its position in the symbol. The way to achieve this is through a "mask" with an AND operation with a binary value that only has the value "1" in the same position of the bit we want to isolate.

Bit mask

Fig. 24 Bit mask. To isolate a specific bit, an AND operation is performed with a binary value that has only that bit of interes equal to 1.

Figure 25 shows how AND operations would be performed for each bit of codeword 2 whose value is 25 (00011001).

Applied bit mask

Fig. 26 Applied bit mask. Application of bit mask to codeword 2 values, each bit is isolated with an AND operation with the right mask.

The macro structure is as follows:

Get the values of each character according to Chart C40.
Calculate the 16-bit value.
Get the values of the codewords.
Get the polynomial coefficients for 12 error correction codewords (the operation requires a table of logarithms and anti-logarithms).
Perform the division (iterate until obtaining the remainder values).
Update the error correction codeword values.
Mark the values of each column of the symbol (using AND operation).

Figure 25 shows how the values are declared in the macro for the characters of our example according to Table C40, in this case they are being declared directly.

Declare variables in C40 mode

Fig. 26 Declare variables in C40 mode. In this example the character values are declared in C40 scheme directly.

The 16-bit values and the codewords are calculated right away. The values of the dividend coefficients are stored in "temporary" variables CI. In this case the symbol can accumulate 12 data codewords and 12 error correction codewords. The data codewords (12) are stored in the variables CW23 to CW12 (the number corresponds to the power of the variable). The variables CW0 to CW11 (the error correction codewords) will be obtained from the polynomial division.

CData codewords values

Fig. 27 Data codewords values. he data codewords are calculated according to the formulas for C40 mode, notice that the 230 value is at the start to use C40, the 254 value is to return to ASCII scheme

The value of the polynomial for generating the error codes g(x) has to be declared directly (Figure 28)

Declaration of values of polynomial g(x)

Fig. 28 Declaration of values of polynomial g(x). The values of the polynomial coefficients for the generation of error codes are declared directly.

The the division operation is done iteratively using the logarithms and antilogarithms tables. The operation requires XOR operation.

realización de la división de polinomios.

Fig. 29 The the division operation is done iteratively using the logarithms and antilogarithms tables. The operation requires XOR operation.

The values resulting from the division remainder update the error correction variables (Figure 29)

Update of error correction codewords

Fig. 30 Updating of error correction codewords according to the remainder values of polynomial division

Finally, the marking is made. The code is arranged in columns so that an AND operation determines whether the corresponding bit is 1 or 0..

marking operation

Fig. 31 The marking is done with a G01 code and an AND operation to determine whether the module value is 1 or 0.

You can make the macro more sophisticated in a way that automatically calculates the size of the symbol and the operations for the marking, but the purpose was to keep it as simple as possible.