IT Zone: How to Encode and Decode Information
Learn to convert between the various codes used for digital data transmission. There are different kinds of binary codes, such as weighted and non-weighted codes, reflected codes, sequential codes, alphanumeric codes, and error detection and correction codes.

C.T. Bhunia

Digital data is represented, stored and exchanged in the form of codes written as strings of 0’s and 1’s, known as bits. Encoding and decoding are the art of transforming one representation into another and back again. In the accounts section of an office, for example, employee codes are used for salary disbursement. The code is a unique number assigned to the employee’s name (encoding). Decoding is getting the employee details back from the code.

In computer engineering, encoding and decoding are used for transformation of numerical data from one system to another for specific purposes. Converting decimal data into binary and vice versa is essential for computing, as the computer recognises only binary data, derived from the ‘on’ and ‘off’ states of the transistors built into the system.

Binary codes are codes represented in the binary system, with modifications from the original representation. There are two such kinds of codes: weighted binary codes and non-weighted codes.

Weighted code

In addition to conversion of the number system, it is often desirable to retain the decimal character of a number even when it is encoded in binary digits, or bits (1 or 0). For example, consider the decimal number 56, where we represent the individual decimal digits 5 and 6 as 0101 and 0110, respectively:

56 = 0101 0110

Such a representation is known as binary-coded decimal (BCD). It is a weighted code (see Table I). From Table I, you can find the code of 783 as 0111 1000 0011.
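The digit-by-digit BCD rule above can be sketched in a few lines of Python. This is a minimal illustration only; the function names bcd_encode and bcd_decode are hypothetical and do not come from the article.

```python
def bcd_encode(number: int) -> str:
    """Encode a non-negative decimal integer as 8421 BCD, one 4-bit group per digit."""
    if number < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    # format(d, "04b") gives the 4-bit binary form of a single decimal digit
    return " ".join(format(int(digit), "04b") for digit in str(number))


def bcd_decode(code: str) -> int:
    """Decode a string of 4-bit 8421 groups (spaces optional) back to an integer."""
    bits = code.replace(" ", "")
    if not bits or len(bits) % 4 != 0:
        raise ValueError("BCD input must be a non-empty multiple of 4 bits")
    digits = []
    for i in range(0, len(bits), 4):
        value = int(bits[i:i + 4], 2)
        if value > 9:  # 1010 to 1111 are not used in 8421 BCD
            raise ValueError(f"{bits[i:i + 4]} is not a valid BCD digit")
        digits.append(str(value))
    return int("".join(digits))


print(bcd_encode(56))                 # 0101 0110
print(bcd_encode(783))                # 0111 1000 0011
print(bcd_decode("0111 1000 0011"))   # 783
```

The decoder rejects the six 4-bit patterns above 1001, which is one practical difference between BCD and plain binary.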
In fact, each digit of the decimal number is coded as a 4-tuple of binary data, say, a3a2a1a0, as follows:

First decimal digit    Second decimal digit    Last decimal digit
a3a2a1a0               a3a2a1a0                a3a2a1a0

BCD is also known as 8421 code, where 8, 4, 2 and 1, respectively, represent the weights of the fourth, third, second and first positions of the tuple.

Table I
Weighted BCD Codes for Decimal Digits

Decimal digit    BCD code (weights 8 4 2 1)
0                0000
1                0001
2                0010
3                0011
4                0100
5                0101
6                0110
7                0111
8                1000
9                1001

(Electronics For You, September 2010, www.efymag.com)

As examples of other weighted codes, Table II shows the following codes: 2421, 74-2-1 and 4221. In fact, weighted binary codes are those which obey the positional weighting principle (each position of the number represents a specific weight).

Some weighted codes are known as reflective codes. A code is said to be reflective when the code for 9 is the complement of the code for 0, the code for 8 is the complement of the code for 1, and so on. Codes 2421, 5211 and excess-3 are reflective, whereas the 8421 code is not reflective.

Similarly, some weighted codes are sequential. A code is said to be sequential when two subsequent codes, seen as numbers in binary representation, differ by one. This greatly aids in mathematical manipulation of data. The 8421 and excess-3 codes are sequential, whereas the 2421 and 5211 codes are not sequential.

Some features of weighted codes

Except for the 8421 and 74-2-1 codes, in all other weighted codes some of the decimal digits may be coded in more than one form. For example, decimal 4 in 2421 code may be represented as 1010 or 0100. Codes 2421 and 4221 are known as self-complementing codes: a property known as self-complementing is applied to select one code out of the different options.
The property is stated as follows: If ‘D’ is the given decimal digit, you may code it using any of the options, but then decimal digit 9-D must be coded using the option that is the bit-wise complement of the code of D. For example, if decimal 4 in 2421 is coded as 0100, then 9-4=5 must be coded as 1011. Note that 1011 is the bit-wise complement of 0100. The codes shown in Table II have been chosen by applying the self-complementing property. A necessary condition for the self-complementing property is that the sum of the weights in the code is 9. The codes 8421 and 74-2-1 are not self-complementing.

Table II
Weighted Codes

Decimal digit    2421 code            74-2-1 code            4221 code
                 (weights 2 4 2 1)    (weights 7 4 -2 -1)    (weights 4 2 2 1)
0                0000                 0000                   0000
1                0001                 0111                   0001
2                0010                 0110                   0010
3                0011                 0101                   0011
4                0100                 0100                   0110
5                1011                 1010                   1001
6                1100                 1001                   1100
7                1101                 1000                   1101
8                1110                 1111                   1110
9                1111                 1110                   1111

Non-weighted code

In many cases, there may be some requirement for non-weighted codes (see Table III), particularly from the design point of view. Non-weighted codes are codes that are not positionally weighted. That is, each position within the binary number is not assigned a fixed value. Examples of non-weighted codes are excess-3 code, 1-to-2 code and BCDP (BCD with parity). Excess-3 is a non-weighted code used to express decimal numbers. The code derives its name from the fact that each binary code is the corresponding 8421 code plus 0011 (3).

Table III
Non-weighted Codes

Decimal digit    Excess-3 code (XS-3 code)    1-to-2 code    BCDP (with odd parity)
0                0011                         0001           00001
1                0100                         0010           00010
2                0101                         0011           00100
3                0110                         0100           00111
4                0111                         0101           01000
5                1000                         0110           01011
6                1001                         1000           01101
7                1010                         1001           01110
8                1011                         1010           10000
9                1100                         1100           10011

Some features of non-weighted code

Excess-3 code is just BCD with 3 added. If ‘d’ is the decimal digit, d+3 is coded in 8421 to get the excess-3 code. This code is the most important non-weighted code in digital logic design. The 1-to-2 code has a unique feature: no code has fewer than one or more than two 1’s. BCDP (with odd parity) is designed so that every code has an odd number of 1’s. This is done through the selection of the parity bit, which is the right-most bit of the code.

Gray code/binary reflected code

The gray code is a very important non-weighted code. An N-bit gray code is a sequence of all the N-bit binary numbers, ordered in such a way that each binary number differs from its predecessor and successor by exactly one bit (and the first and last differ by one bit also). For example, 0, 1 is a one-bit gray code, and 000, 001, 011, 010, 110, 111, 101, 100 is a three-bit gray code (the last element, 100, differs from 000 by one bit). The sequence 00, 11, 01, 10 is not a two-bit gray code, as the first and second elements differ by two bits.

Table IV shows a 4-bit gray code in comparison with the binary number.

Table IV
Gray/Reflected Code

Decimal    Reflected decimal    Binary b3b2b1b0    Reflected binary/gray code g3g2g1g0
00         00                   0000               0000
01         01                   0001               0001
02         02                   0010               0011
03         03                   0011               0010
04         04                   0100               0110
05         05                   0101               0111
06         06                   0110               0101
07         07                   0111               0100
08         08                   1000               1100
09         09                   1001               1101
10         19                   1010               1111
11         18                   1011               1110
12         17                   1100               1010
13         16                   1101               1011
14         15                   1110               1001
15         14                   1111               1000
16         13
17         12
18         11
19         10

In the column of decimal numbers, it is seen that from 00 to 09, if you move from number N to number N+1, there is a change in only one digit. But when you move from 09 to 10, there are changes in two digit positions. Look at the column of reflected decimal numbers: as you move from N to N+1, there is always a change in only one digit position.

In binary representation, as we move from N to N+1, there may be many changes in bits. For example, as you go from 3 to 4, there are as many as three changes in bit position. But look at the reflected binary or gray code: as you move from any N to N+1, there is always only one change in bit position. This is the property of gray code or reflected binary code. Gray code is also known as a variable weighted code and is cyclic.

The gray code is called ‘reflected binary’ because the first eight values compare with those of the last eight values, but in reverse order. The gray code belongs to a class of codes called ‘minimum change codes,’ in which only one bit in the code changes when moving from one code to the next. The gray code is a reflective digital code which has the special property that the codes of any two subsequent numbers differ by only one bit. It is also called ‘unit-distance code.’

Gray code has an important application in digital control systems. Let us take the example of a fan regulator which is digitally controlled. You want to switch from position 3 to 4. If you follow the normal binary system, you need to change three bit positions to switch from 3 to 4. But how? Change the three positions one after another, as shown in Table V. In gray code, there is a change in only one position, so there will be no oscillation as such.

Table V
Illustration of Property of Gray Code

Present number                   Action                                       Next number
0011 (=3) Starting position      Change in the first position from right      0010 (=2)
0010 (=2)                        Change in the second position from right     0000 (=0)
0000 (=0)                        Change in the third position from right      0100 (=4) Final position

Comment: So for a change from regulating position 3 to 4, the fan will go a round of 3 to 2 to 0 to 4. This will cause huge oscillation in the circuit, which is not a good design.

The code is called reflected because it can be generated in the following manner: Take the gray code 0, 1. Write it forwards, then backwards: 0, 1, 1, 0. Then prepend 0’s to the first half and 1’s to the second half: 00, 01, 11, 10. Continuing, write 00, 01, 11, 10, 10, 11, 01, 00 and prepend 0’s and 1’s in the same way to obtain 000, 001, 011, 010, 110, 111, 101, 100. Each iteration therefore doubles the number of codes.

Algorithm for binary-to-gray conversion

This is very simple:
1. The most significant bit of the binary number is the most significant bit of the gray code.
2. Add (using modulo 2, i.e., ignoring carry) each bit of the binary number to the next less significant bit of the binary number to obtain the next gray code bit.
3. Repeat step 2 until all bits of the binary number have been added modulo 2. The resultant number is the gray code equivalent of the binary number.

Assume binary number b3b2b1b0 = 0011. Conversion into gray is done as follows:
g3 = b3 = 0;
g2 = b3+b2 = 0+0 = 0;
g1 = b2+b1 = 0+1 = 1; and
g0 = b1+b0 = 1+1 = 0 (ignore ‘carry’).
The gray code is therefore 0010.

Gray-to-binary conversion

Assume gray code g3g2g1g0 = 0010. It can be converted into binary by taking the remainder after dividing each running sum by 2:
b3 = g3 mod 2 = 0;
b2 = (g3+g2) mod 2 = 0;
b1 = (g3+g2+g1) mod 2 = 1;
b0 = (g3+g2+g1+g0) mod 2 = 1.
Thus the equivalent binary is 0011.

Excess-3 gray code

In many applications, it is desirable to use a code that is BCD as well as unit-distance. Excess-3 gray code is such a code. The values for 0 and 9 differ in only one bit, and so do all values for successive numbers. Outputs from linear devices or angular encoders may be coded in excess-3 gray code to obtain multi-digit BCD numbers. The code obtained from excess-3 code by applying the binary-to-gray conversion rule is shown below:

Decimal    Excess-3 gray code
0          0010
1          0110
2          0111
3          0101
4          0100
5          1100
6          1101
7          1111
8          1110
9          1010

Error correction and detection codes

When a binary message made of strings of 0’s and 1’s is transmitted from the source to the destination, the message may be corrupted by noise during transmission. The corrupted message becomes erroneous through conversion of a transmitted 0 into a received 1, or of a transmitted 1 into a received 0.
The error-correction code (ECC) and the error-detection code (EDC) are used to deal with these errors. In EDCs and ECCs, redundant check bits are appended to the original message bits to design the codes. EDCs detect the presence of errors. ECCs detect and correct the errors.

Typically, transmission errors are of two types: random and burst. An error is called random if the bits in error are randomly distributed over the code. A burst error occurs when the bits in error are clustered together over the code. For transmitted byte 01010101, examples of random error and burst error may be as below (bit locations are counted from the left):

01110111 ---- Random error (errors are distributed, in the third and seventh bit locations), and
01101101 ---- Burst error (errors are clustered in the third, fourth and fifth bit locations).

Several EDCs and ECCs are used to address both random and burst errors. R.W. Hamming introduced one-bit error-correcting codes in 1950. (7,4) and (13,8) are examples of one-bit ECCs. P. Elias developed convolution codes in 1955. In 1959, R.C. Bose and D.K. Chaudhuri proposed multiple error-correcting codes. These are very powerful codes, known as generalised Hamming codes. A. Hocquenghem independently designed the codes proposed by Bose and Chaudhuri. That is why these codes are known as BCH codes. In 1960, I.S. Reed and G. Solomon designed powerful block codes particularly for burst errors, known as Reed-Solomon codes. In 1960, G.D. Forney introduced the concept of concatenated codes. In 1967, A.J. Viterbi introduced an important convolution code known as Viterbi code. Turbo code, low-density parity code, combined turbo code, punctured turbo code, cyclic redundancy code (CRC) and Golay code are the other important codes.

Error detection and correction codes begin with parity codes. Parity codes are EDCs. The parity bit is used for error detection.
In parity codes, a parity bit (either odd or even) is appended to the original message bits; the parity bit is the redundant check bit. An even parity bit ensures an even number of 1’s in the code (message plus parity bit). An odd parity bit ensures an odd number of 1’s in the code. If an original message of seven bits is 1110001, its codes with parity bit are: even parity [11100010] and odd parity [11100011], where the last bit is the parity bit.

Hamming code is an ECC. Richard Hamming, a theorist with Bell Telephone Laboratories in the 1940s, developed the Hamming code method of error correction in 1949. The key to the Hamming code is the use of extra parity bits to allow identification of the errors. Hamming code (7,4) can detect and correct a one-bit error, whereas (13,8) code can detect up to two simultaneous bit errors, and correct single-bit errors. A code with this ability to reconstruct the original message in the presence of errors is known as an error-correcting code. By contrast, the simple parity discussed above cannot correct errors, and can detect only an odd number of errors.

An ECC always has more check bits than an EDC and hence requires more bandwidth. In simple parity there is just one check bit, whereas in (7,4) and (13,8) codes there are 3 and 5 check bits, respectively. Code capability and complexity in system design are the other parameters for selection of a code for particular applications.

Encoding of (7,4)

Hamming code (7,4) encodes four bits of data into seven-bit blocks called ‘code words.’ The extra three bits are parity bits. Each of the three parity bits maintains even parity for three of the four data bits, and no two parity bits cover the same three data bits. If the four data bits in (7,4) are d1, d2, d3 and d4, Hamming code parity bits p1, p2 and p3 are calculated as:

p1 = d2 + d3 + d4
p2 = d1 + d3 + d4
p3 = d1 + d2 + d4

where ‘+’ means bit-wise exclusive OR operation, i.e., sum ignoring carry. For example, for data 1010, p1 = 0+1+0 = 1, p2 = 1+1+0 = 0 and p3 = 1+0+0 = 1. Writing the parity bits 101 in front of the data bits, you can encode data 1010 using the Hamming code as 1011010.
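The parity rule and the (7,4) parity equations above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the code word layout (parity bits p1 p2 p3 first, then data bits d1 to d4) is an assumption inferred from the article's example 1010 → 1011010.

```python
def parity_bit(bits: str, odd: bool = False) -> str:
    """Return the parity bit that makes the count of 1's even (or odd if odd=True)."""
    even_bit = bits.count("1") % 2      # 1 only when the message has an odd number of 1's
    return str(even_bit ^ 1) if odd else str(even_bit)


def add_parity(bits: str, odd: bool = False) -> str:
    """Append the parity bit to the right of the message, as in the article's example."""
    return bits + parity_bit(bits, odd)


def hamming74_encode(data: str) -> str:
    """Encode 4 data bits d1 d2 d3 d4 as p1 p2 p3 d1 d2 d3 d4 (layout assumed)."""
    if len(data) != 4 or set(data) - {"0", "1"}:
        raise ValueError("expected exactly four bits, e.g. '1010'")
    d1, d2, d3, d4 = (int(b) for b in data)
    p1 = d2 ^ d3 ^ d4                   # '+' in the article is exclusive OR
    p2 = d1 ^ d3 ^ d4
    p3 = d1 ^ d2 ^ d4
    return f"{p1}{p2}{p3}{data}"


print(add_parity("1110001"))            # 11100010 (even parity)
print(add_parity("1110001", odd=True))  # 11100011 (odd parity)
print(hamming74_encode("1010"))         # 1011010
```

Because each parity bit is the exclusive OR of its three data bits, every group of one parity bit plus its three data bits contains an even number of 1's.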
Decoding of (7,4)

In a world without errors, decoding a Hamming code word would be very easy: just leave out the parity bits. In the example code word 1011010, the parity bits are 101, and when you leave these out, you receive the data bits as 1010. But what if you receive a code word with an error, and one or more of the parity bits are wrong? Suppose the received code word is 1011011. The first step is to check the parity bits to determine whether there is an error. Calculate the parity bits from the received data bits 1011 as:

p1 = d2 + d3 + d4 = 0 + 1 + 1 = 0
p2 = d1 + d3 + d4 = 1 + 1 + 1 = 1
p3 = d1 + d2 + d4 = 1 + 0 + 1 = 0

In this case, every parity bit is wrong. p1, p2 and p3 should have been 010, but you received 101. Compare the received parity bits with the calculated parity bits (bit-wise exclusive OR of 101 and 010) to get the bit pattern 111. This bit pattern has a decimal value of 7. Now reverse the bit at the seventh position of the received code to get 1011010, and then leave out the parity bits 101 to obtain the corrected data as 1010. In this example the pattern points directly at the position in error; the generalised code described next places the parity bits at positions that are powers of 2 so that this holds for every position.

Generalised Hamming code for single bit-error correction

The illustration of (7,4) Hamming code paves the way for explaining the generalised Hamming code. In the generalised code, coding is done as below:
1. All bit positions that are powers of 2 are locations for parity bits in code words. Thus in code words, the locations of parity bits are 1, 2, 4, 8, 16, 32, 64, etc.
2. All other bit positions are for the given original data to be encoded. That means data locations are 3, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 17, etc.
3. Each parity bit calculates the parity for some of the bits in the code word.
The position of the parity bit determines the sequence of bits that it alternately checks and skips as follows:
(i) Location 1: check 1 bit, skip 1 bit, check 1 bit, skip 1 bit, etc (1, 3, 5, 7, 9, 11, 13, 15,...),
(ii) Location 2: check 2 bits, skip 2 bits, check 2 bits, skip 2 bits, etc (2, 3, 6, 7, 10, 11, 14, 15,...),
(iii) Location 4: check 4 bits, skip 4 bits, check 4 bits, skip 4 bits, etc (4, 5, 6, 7, 12, 13, 14, 15, 20, 21, 22, 23,...),
(iv) Location 8: check 8 bits, skip 8 bits, check 8 bits, skip 8 bits, etc (8-15, 24-31, 40-47,...),
(v) Location 16: check 16 bits, skip 16 bits, check 16 bits, skip 16 bits, etc (16-31, 48-63, 80-95,...),
(vi) Location 32: check 32 bits, skip 32 bits, check 32 bits, skip 32 bits, etc (32-63, 96-127, 160-191,...).
4. Set a parity bit to 1 if the total number of 1’s in the other positions that it checks is odd. Set a parity bit to 0 if the total number of 1’s in the other positions that it checks is even. This means even parity is ensured in the code word.

We illustrate with an example. Say, the original given byte of data is 10011010. As per Hamming code, the code word will be:

_ _ 1 _ 0 0 1 _ 1 0 1 0

where ‘_’ marks the locations reserved for the parity bits. Each parity bit is obtained as follows:
• The location-1 check bit is 0, obtained by applying the even parity rule to locations 1, 3, 5, 7, 9, 11.
• The location-2 check bit is 1, obtained by applying the even parity rule to locations 2, 3, 6, 7, 10, 11.
• The location-4 check bit is 1, obtained from locations 4, 5, 6, 7, 12.
• The location-8 check bit is 0, obtained from locations 8, 9, 10, 11, 12.
• Thus the code word becomes: 011100101010.

Suppose the received code is 011100101110. Here the tenth bit from the left is in error. The receiver calculates the parity bits from the received data, and compares the calculated parity bits with the received parity bits to find out which bit is in error, in order to correct it.
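The encoding steps and the receiver's parity comparison described above can be sketched in Python. This is a hedged illustration, not the author's implementation: the function names are hypothetical, and the sketch relies on the fact that the check/skip pattern for the parity bit at location p selects exactly those positions whose 1-based index has bit p set.

```python
def hamming_encode(data: str) -> str:
    """Place data bits at non-power-of-2 positions and fill in even-parity bits."""
    data_bits = [int(b) for b in data]
    code = {}                      # 1-based position -> bit value
    pos, idx = 1, 0
    while idx < len(data_bits):
        if pos & (pos - 1) == 0:   # 1, 2, 4, 8, ... are parity locations
            code[pos] = 0          # placeholder, filled in below
        else:
            code[pos] = data_bits[idx]
            idx += 1
        pos += 1
    n = pos - 1                    # total code word length
    p = 1
    while p <= n:
        # Count 1's in the positions this parity bit checks, excluding itself
        ones = sum(code[i] for i in range(1, n + 1) if i & p and i != p)
        code[p] = ones % 2         # 1 if the count is odd, so the total becomes even
        p <<= 1
    return "".join(str(code[i]) for i in range(1, n + 1))


def failing_parity_positions(received: str) -> list:
    """Return the parity locations whose even-parity check fails at the receiver."""
    n = len(received)
    failing = []
    p = 1
    while p <= n:
        ones = sum(int(received[i - 1]) for i in range(1, n + 1) if i & p)
        if ones % 2:               # odd count means this check is violated
            failing.append(p)
        p <<= 1
    return failing


print(hamming_encode("10011010"))                # 011100101010
print(failing_parity_positions("011100101110"))  # [2, 8]
print(failing_parity_positions("011100101010"))  # []
```

For an error-free code word every check passes and the returned list is empty.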
The method is to identify all the incorrect parity bits through this comparison. In the example, parity bits 2 and 8 are incorrect. Adding their positions gives 2 + 8 = 10, so bit position 10 is the location of the incorrect bit. We complement the tenth bit to get back the correct code word.

Repetition code

Coding theory is the study of how to add redundancy, or additional bits, to the original given data so as to use them to detect and correct errors induced by the communication channel. In a repetition code, the bit to be transmitted is transmitted more than once: the transmitter sends each data bit several times, an odd number of times in fact. In general, if each bit is repeated 2K + 1 times (K is a positive integer), the code can tolerate up to K errors.

In triple repetition code, ‘0’ and ‘1’ are coded, respectively, as ‘000’ and ‘111’. At the receiver, the majority rule is applied to decide about the bit. If the three bits received are not identical, an error has occurred. If the channel is nearly clean, most likely only one bit will change in each triple. Therefore 001, 010 and 100 each correspond to a 0 bit, while 110, 101 and 011 correspond to a 1 bit.

Such codes cannot correct all errors. For example, if the channel introduces a two-bit error into ‘111’ and the receiver gets ‘001,’ the system detects the error, but concludes that the original bit was 0, which is incorrect. If we increase the number of times we duplicate each bit to four, we can detect all two-bit errors but cannot correct them; at five, we can correct all two-bit errors, but not detect all three-bit errors.

The author is a regular contributor to EFY
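As an illustration of the repetition code and majority rule from the Repetition code section, here is a minimal Python sketch. The function names and the default of three copies are assumptions made for this example.

```python
def repeat_encode(bits: str, copies: int = 3) -> str:
    """Send every bit 'copies' times, e.g. '1' -> '111' for the triple code."""
    return "".join(b * copies for b in bits)


def majority_decode(received: str, copies: int = 3) -> str:
    """Decode each block of 'copies' bits by majority vote."""
    if copies % 2 == 0:
        raise ValueError("use an odd number of copies so a majority always exists")
    if len(received) % copies != 0:
        raise ValueError("received length must be a multiple of 'copies'")
    decoded = []
    for i in range(0, len(received), copies):
        block = received[i:i + copies]
        decoded.append("1" if block.count("1") > copies // 2 else "0")
    return "".join(decoded)


print(repeat_encode("10"))        # 111000
print(majority_decode("110001"))  # 10  (one error per triple is corrected)
print(majority_decode("001"))     # 0   (two errors in '111' are decoded wrongly)
```

The last call reproduces the failure case from the text: two flipped bits in a triple outvote the single correct one.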