Sat. Jan 21st, 2023

Error-detection methods in data transmission

Parity checking

A parity checking system works on the basis of there being either an even or an odd number of 1s in a message (remember all data is sent as binary).

The following example uses even parity:

  1. The sender prepares the message to be sent: the number of 1s in the message is counted so that a check-bit can be added to the start of the message. If there is already an even number of 1s in the message, the check-bit is 0. If there is currently an odd number of 1s, the check-bit is 1, so that there are now an even number of 1s.
  2. The message is sent.
  3. The receiver (knowing that the message is expected in even parity) counts how many 1s there are in the entire message, check-bit included. If the count is even, then hopefully the message is correct. *
    If the count is odd, then an error must have occurred at some point, and the data can’t be accepted.

* The fact that the message is in even parity can’t guarantee that no error has occurred; all it guarantees is that either no errors occurred, or an even number of errors occurred. However, assuming a normally-reliable connection with a very low likelihood of error, it may be a good enough assurance.

To use odd parity, exactly the same steps occur, except that the sender must ensure there is an odd number of 1s in the message, and the receiver should treat any message that does not contain an odd number of 1s as definitely containing an error.
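The parity steps above can be sketched in a few lines of Python. This is only an illustration: the function names `add_parity_bit` and `parity_ok` are made up for this example, and the message is assumed to be a string of '0' and '1' characters with the check-bit placed at the start.

```python
def add_parity_bit(bits: str, even: bool = True) -> str:
    """Sender side: prefix a check-bit so the total number of 1s is even (or odd)."""
    ones = bits.count("1")
    # Even parity: the check-bit is 1 only when the count of 1s is currently odd.
    # Odd parity: the check-bit is 1 only when the count of 1s is currently even.
    check = ones % 2 if even else (ones + 1) % 2
    return str(check) + bits


def parity_ok(message: str, even: bool = True) -> bool:
    """Receiver side: count the 1s in the whole message, check-bit included."""
    ones = message.count("1")
    return ones % 2 == (0 if even else 1)


sent = add_parity_bit("1011001")       # four 1s already, so the check-bit is 0
one_error = "01011000"                 # last bit flipped in transit
two_errors = "01011010"                # last two bits flipped in transit

print(sent, parity_ok(sent))           # accepted
print(one_error, parity_ok(one_error))     # rejected: odd number of 1s
print(two_errors, parity_ok(two_errors))   # accepted, although it is wrong
```

The last line shows the weakness described in the footnote: two errors cancel each other out and the message still arrives in even parity.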

Checksum

A checksum is a value calculated mathematically from a message and sent along with it. When the receiver repeats the calculation, the result is only likely to match if the data was received in its original form.

Take a bank card for example: these use an algorithm called Luhn’s algorithm. Rather than copy it, read about it here.

This is a very important algorithm: not only do people key in their card details on a regular basis on websites, but magnetic card readers are also prone to interference.

If the checksum re-calculated from the data that was read does not match the checksum that came with it, then an error must have occurred, and so the data must be requested again.
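As a rough illustration of the re-calculation step, here is a minimal Python sketch of the Luhn check as it is commonly described: working from the right-hand end of the number, every second digit is doubled (subtracting 9 if the result is more than 9), and the total of all the digits must be a multiple of 10. The function name `luhn_valid` is made up for this example, and the numbers used are well-known test values, not real cards.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    digits = [int(ch) for ch in number if ch.isdigit()]  # ignore spaces
    if not digits:
        return False
    total = 0
    # Position 0 is the rightmost digit (the check digit itself).
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:      # every second digit, counting from the right
            digit *= 2
            if digit > 9:          # same as adding the two digits of the product
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))   # True: a common test number
print(luhn_valid("4111 1111 1111 1112"))   # False: one digit mistyped
```

A single mistyped digit always changes the total, so the check fails and the number can be asked for again.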

Repetition

A repetition system works on the basis of a ‘best of three’ scenario: a majority vote. If a piece of data is sent multiple times then, in an ideal situation, identical data is received each time. If the copies differ, a majority vote at each bit position is used to work out what should have been received. (Bracketed values are where errors have occurred.)

                    Bit 0  Bit 1  Bit 2  Bit 3  Bit 4  Bit 5  Bit 6  Bit 7
Original message      0      0      1      1      0      0      1      1
Received (1)          0     [1]     1      1      0      0      1      1
Received (2)          0      0     [0]     1      0      0      1     [0]
Received (3)         [1]     0      1      1     [1]     0      1      1
Reconstructed:        0      0      1      1      0      0      1      1

As you can see, despite a few errors, it is easy to reconstruct the original data by picking the most common value at each position. This only works as long as no more than one of the three copies is wrong at any given position.
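The majority vote can be sketched in Python as follows, using the three received copies from the example above. The function name `majority_vote` is made up for this illustration, and the copies are assumed to be equal-length strings of '0' and '1' characters.

```python
from typing import List


def majority_vote(copies: List[str]) -> str:
    """Rebuild a message by taking the most common bit at each position."""
    result = []
    # zip(*copies) groups together the bits that share a position.
    for bits in zip(*copies):
        result.append("1" if bits.count("1") > len(bits) / 2 else "0")
    return "".join(result)


received = ["01110011", "00010010", "10111011"]
print(majority_vote(received))   # 00110011, the original message
```

An odd number of copies is assumed here so that the vote at each position can never be tied.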

The drawback of this technique is the obvious increase in the amount of data that must be processed and transferred, as everything is sent multiple times.

Cyclic Redundancy Check (CRC)

CRC is an error-checking method that appends a check value to the original data. The value is calculated by treating the data as one long binary number and dividing it by a fixed divisor (the ‘generator polynomial’) that sender and receiver have agreed on, using binary arithmetic without carries; the remainder is the check value. The receiver repeats the calculation and compares the results. Like a hashing algorithm, a CRC processes a large amount of data and returns a single, fixed-size value, and a well-chosen CRC makes it very unlikely that accidental errors will leave that value unchanged: all single-bit errors are detected, as are all bursts of errors no longer than the check value itself. Unlike a cryptographic hash, however, a CRC is not designed for security. It is straightforward to work out what changes can be made to the data to force the check value to a certain value, so a CRC protects against accidental corruption rather than malicious alteration.

The CRC algorithm is often implemented in hardware or network equipment.

Compared to the Luhn algorithm, where it is easy to stumble on new values that are accepted (adding 8 to the last four digits of a VISA number will often result in a valid yet potentially non-existent card number), a correct CRC value received alongside a file makes it very likely, though not absolutely certain, that the data arrived intact.
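To give a feel for how a CRC is calculated and checked, here is a minimal Python sketch. It assumes the widely used CRC-32 variant (the one computed by Python's built-in `zlib.crc32`), which is only one of many CRCs; the text above does not say which one is meant. The function name `crc32_bitwise` and the sample data are made up for this illustration, and real implementations are usually table-driven or done in hardware rather than bit by bit.

```python
import zlib


def crc32_bitwise(data: bytes) -> int:
    """Compute CRC-32 one bit at a time (slow, but shows the idea)."""
    crc = 0xFFFFFFFF                 # conventional starting value for CRC-32
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # If the lowest bit is set, 'subtract' (XOR) the divisor.
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF          # conventional final inversion


data = b"hello, world"
sent_crc = crc32_bitwise(data)       # sender appends this to the data

# The receiver re-calculates the value and compares it with the one received.
damaged = b"hello, worle"            # one character changed in transit
print(crc32_bitwise(data) == sent_crc)       # True: data accepted
print(crc32_bitwise(damaged) == sent_crc)    # False: data rejected

# Sanity check against the library routine.
print(crc32_bitwise(data) == zlib.crc32(data))   # True
```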