Encoding
Now that we know what a byte is and what it looks like, let us see how it is interpreted, mainly in strings. A character encoding is a scheme that assigns each character a value stored as a byte or a sequence of bytes. Some encodings are ASCII (one of the oldest still in use), Latin-1, and UTF-8 (the most widely used as of today). In a sense, encodings are a way for computers to represent, send, and interpret human-readable characters. This means that a sentence written in one encoding might become completely incomprehensible when read in another encoding.
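A minimal sketch of that last point in Python. The sample word "café" and the variable names are chosen here for illustration only. It encodes the same text with two encodings and then decodes the UTF-8 bytes with the wrong one:

```python
text = "café"

# The same character can map to different bytes in different encodings.
utf8_bytes = text.encode("utf-8")      # 'é' becomes two bytes: 0xC3 0xA9
latin1_bytes = text.encode("latin-1")  # 'é' becomes one byte: 0xE9

print(utf8_bytes)    # b'caf\xc3\xa9'
print(latin1_bytes)  # b'caf\xe9'

# Decoding with the wrong encoding garbles the text.
garbled = utf8_bytes.decode("latin-1")
print(garbled)       # cafÃ©

# Decoding with the encoding that was used to write the bytes round-trips.
restored = utf8_bytes.decode("utf-8")
print(restored)      # café
```

The plain letters "caf" survive in both cases because UTF-8 and Latin-1 agree with ASCII on those values; only the non-ASCII character is affected.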
Working with Binary Data in Python
Alright, let's get this out of the way! The basics are pretty standard:
- There are 8 bits in a byte
- Each bit is either a 0 or a 1
- A byte can be written in different number systems, like binary, octal, or hexadecimal
Note: These are not character encodings; those come later. This is just a way to look at a set of 1s and 0s and write it in different number systems.
Examples:
Input : 10011011
Output :
1001 1011 ---- 9B (in hex)
1001 1011 ---- 155 (in decimal)
1001 1011 ---- 233 (in octal)
This shows that the same string of bits can be written differently in different number systems. We often use the hex representation of a byte instead of the binary one because it is shorter to write. This is just a representation and not an interpretation: the underlying bits stay the same.
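As a rough sketch, the conversions in the example can be reproduced with Python's built-in functions. The variable names `bits`, `value`, and `raw` are made up for this illustration:

```python
bits = "10011011"

# Parse the bit string as a base-2 number.
value = int(bits, 2)

print(value)                 # 155   (decimal)
print(hex(value))            # 0x9b  (hexadecimal)
print(oct(value))            # 0o233 (octal)
print(format(value, "08b"))  # 10011011 (back to 8 binary digits)
print(format(value, "02X"))  # 9B    (two uppercase hex digits)

# The same value stored as a single byte, shown in hex.
raw = bytes([value])
print(raw.hex())             # 9b
print(bytes.fromhex("9B") == raw)  # True
```

Every line prints the same underlying value; only the notation changes.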