The Representation of Data

An Overview

Computer Science Department

bernstdh@jmu.edu

A Quiz

What Is This?

The Point of the Quiz

It could be many things
- Bits (i.e., binary values) can be and are used to represent a wide variety of things
We need contextual information
- To interpret a bunch of bits we need to know the representation scheme being used
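As an illustration of this point, the sketch below reads the same 32 bits under three different representation schemes. The byte values and the variable names are chosen only for this example and are not from the quiz.

```python
import struct

# Four arbitrary bytes (32 bits), chosen only for illustration.
bits = bytes([0x42, 0x28, 0x00, 0x00])

# Scheme 1: an unsigned integer, most significant byte first.
as_unsigned = int.from_bytes(bits, byteorder="big")

# Scheme 2: an IEEE single-precision real number.
as_float = struct.unpack(">f", bits)[0]

# Scheme 3: the first two bytes as ASCII characters.
as_text = bits[:2].decode("ascii")

print(as_unsigned)  # 1109917696
print(as_float)     # 42.0
print(as_text)      # B(
```

The bits never change; only the scheme used to interpret them does.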

Why 0/1?

Electronic/Magnetic Systems:
- Positive/Negative
- On/Off
- Clockwise/Counterclockwise
Mechanical Systems:
- Up/Down
- Pits/Lands
- Hole/Solid
- Bump/Flat

An Easy First Example - The Counting Numbers

An Interesting Question and Answer

The Question:
- How many bits do we need to represent all of the counting numbers less than $N$ ?
Getting to the Answer:
- With $B$ bits, we can represent all of the counting numbers less than $2^B$
- So, we need the smallest $B$ for which $2^B \geq N$ (i.e., $B = \lceil \log_2 N \rceil$ )
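The relationship between $N$ and $B$ can be checked with a short sketch. The function name `bits_needed` is hypothetical, and the sketch assumes the numbers being represented are 0, 1, ..., $N - 1$.

```python
def bits_needed(n: int) -> int:
    """Return the smallest B such that 2**B >= n.

    With B bits we can represent 0, 1, ..., 2**B - 1, so this is the
    number of bits needed for every number less than n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # The largest value to represent is n - 1; bit_length() gives the
    # number of binary digits needed to write it.
    return (n - 1).bit_length()


print(bits_needed(256))   # 8
print(bits_needed(257))   # 9
print(bits_needed(1000))  # 10
```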

What About Negative Integers?

An Obvious Place to Start:
- Since there are two signs, use one bit (e.g., the left-most) to represent the sign
A Shortcoming of this Approach:
- It results in both a +0 and a -0
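The shortcoming can be seen in a minimal sketch of this sign-bit scheme. The function names and the 8-bit width are assumptions made for the example.

```python
def to_sign_magnitude(value: int, width: int = 8) -> str:
    """Encode value with a left-most sign bit and width - 1 magnitude bits."""
    magnitude_bits = width - 1
    if abs(value) >= 2 ** magnitude_bits:
        raise ValueError("value does not fit in the given width")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{magnitude_bits}b")


def from_sign_magnitude(bits: str) -> int:
    """Decode a bit string whose left-most bit is the sign."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


print(to_sign_magnitude(5))    # 00000101
print(to_sign_magnitude(-5))   # 10000101

# Two different bit patterns both decode to zero.
print(from_sign_magnitude("00000000"))  # 0
print(from_sign_magnitude("10000000"))  # 0
```

Because two patterns are spent on zero, 8 bits represent only 255 distinct values under this scheme rather than 256.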

Going Further: Negative Numbers (cont.)

(Courtesy of xkcd)

What About Real Numbers?

Think About Base 10:
- The positions to the left of the decimal point are powers of 10 and the positions to the right of the decimal point are powers of 1/10
An Obvious Place to Start in Binary:
- The positions to the left of the binary point are powers of 2 and the positions to the right of the binary point are powers of 1/2
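The positional idea can be sketched as follows. The function name `binary_to_decimal` and the use of a string such as "101.101" as input are assumptions made for the example.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary numeral such as '101.101' position by position."""
    whole, _, fraction = bits.partition(".")
    value = 0.0
    # Positions to the left of the point: 2**0, 2**1, 2**2, ...
    for position, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** position
    # Positions to the right of the point: (1/2)**1, (1/2)**2, ...
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -position
    return value


print(binary_to_decimal("101.101"))  # 5.625  (4 + 1 + 1/2 + 1/8)
print(binary_to_decimal("0.01"))     # 0.25
```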

Going Further: What About Real Numbers? (cont.)

IEEE Short Real (Single Precision):
- 1 bit for the sign
- 8 bits for the exponent
- 23 bits for the mantissa
IEEE Long Real (Double Precision):
- 1 bit for the sign
- 11 bits for the exponent
- 52 bits for the mantissa
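The single-precision layout can be inspected with a short sketch that splits a value into its three fields. The function name is hypothetical. The exponent bias of 127 and the implicit leading 1 mentioned in the comments are part of the IEEE format but are not stated above.

```python
import struct


def single_precision_fields(x: float) -> tuple:
    """Return the (sign, exponent, mantissa) fields of x as integers."""
    # Pack x as a 32-bit real, then reread the same bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    mantissa = raw & 0x7FFFFF        # 23 bits, the leading 1 is implicit
    return sign, exponent, mantissa


# -6.25 is -1.1001 (binary) times 2**2, so the stored exponent is 2 + 127.
sign, exponent, mantissa = single_precision_fields(-6.25)
print(sign)                      # 1
print(exponent)                  # 129
print(format(mantissa, "023b"))  # 10010000000000000000000
```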

What About Characters?

An Obvious Place to Start:
- Count the number of characters
- Determine the number of bits needed
- Assign a binary number to each character
An Example:
- There are 26 letters in the alphabet
- 5 bits can represent $2^5$ (i.e., 32) different things
- Assign 00001 to A, 00010 to B, 00011 to C, ..., 11010 to Z
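The example scheme can be sketched in a few lines. The function names are hypothetical, and the sketch assumes the 26 upper-case letters A through Z.

```python
def encode_letter(letter: str) -> str:
    """Map A..Z to the 5-bit codes 00001..11010."""
    index = ord(letter.upper()) - ord("A") + 1
    if not 1 <= index <= 26:
        raise ValueError("only the letters A-Z are supported")
    return format(index, "05b")


def decode_letter(bits: str) -> str:
    """Map a 5-bit code back to its letter."""
    index = int(bits, 2)
    if not 1 <= index <= 26:
        raise ValueError("no letter is assigned to this code")
    return chr(index - 1 + ord("A"))


print(encode_letter("A"))      # 00001
print(encode_letter("Z"))      # 11010
print(decode_letter("00011"))  # C
```

Six of the 32 possible codes (00000 and 11011 through 11111) are left unassigned in this scheme.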

What About Characters? (cont.)

The ASCII (American Standard Code for Information Interchange) Encoding:
Unicode:
- A mapping that aims to cover every character in every written language (including many dead languages)
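Both mappings can be explored from Python, whose strings are sequences of Unicode code points. The sketch below is only illustrative; the UTF-8 encoding it uses is one common way of turning code points into bytes and is not discussed above.

```python
# ASCII is a 7-bit code, so each character is a number below 128.
ascii_code = ord("A")
ascii_bits = format(ascii_code, "07b")
print(ascii_code)  # 65
print(ascii_bits)  # 1000001

# Unicode assigns a number (a code point) to characters beyond ASCII.
e_acute = "\u00e9"          # the letter e with an acute accent
code_point = ord(e_acute)
print(code_point)           # 233

# An encoding such as UTF-8 (an assumption here) turns code points into bytes.
utf8_bytes = e_acute.encode("utf-8")
print(utf8_bytes)           # b'\xc3\xa9'
```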

What About Other Things?

Discrete Sets:
- We can use the same approach as for characters
Continuous Sets:
- We either have to sample the set (to create a discrete approximation) or describe the elements

What About Colors?

A Sampling Scheme:
- Use the fact that we have red, green and blue cones to think of colors as having a red, green and blue component
- Think of each color as having a discrete number of levels (e.g., $2^8 = 256$ )
- With three components of 8 bits each, this gives a palette of $2^{24} = 16,777,216$ colors
A Description Scheme:
- Use the wavelength
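The sampling scheme can be sketched by packing three 8-bit levels into one 24-bit value. The function names and the ordering (red in the high-order bits) are assumptions made for the example.

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Combine three levels in 0..255 into a single 24-bit color value."""
    for level in (red, green, blue):
        if not 0 <= level <= 255:
            raise ValueError("each level must be in the range 0..255")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(color: int) -> tuple:
    """Split a 24-bit color value back into its three 8-bit levels."""
    return (color >> 16) & 0xFF, (color >> 8) & 0xFF, color & 0xFF


orange = pack_rgb(255, 165, 0)
print(format(orange, "024b"))  # 111111111010010100000000
print(unpack_rgb(orange))      # (255, 165, 0)
print(2 ** 24)                 # 16777216
```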

What About Pictures?

A Sampling (Raster) Scheme:
- Create a finite grid with equal-sized cells (called picture elements or pixels)
A Description (Vector) Scheme:
- Use geometric shapes (e.g., points, lines, curves, rectangles, polygons, ellipses)
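The two schemes can be related with a small sketch that starts from a description of a rectangle and samples it onto a grid of pixels. The function name, the parameter names, and the use of 1 and 0 for filled and empty pixels are assumptions made for the example.

```python
def rasterize_rectangle(width, height, left, top, right, bottom):
    """Sample a rectangle description onto a width-by-height pixel grid.

    A pixel at column x and row y is 1 if it lies inside the rectangle
    (left <= x < right and top <= y < bottom) and 0 otherwise.
    """
    return [
        [1 if left <= x < right and top <= y < bottom else 0
         for x in range(width)]
        for y in range(height)
    ]


# Vector description: four numbers. Raster result: one value per pixel.
grid = rasterize_rectangle(4, 3, left=1, top=0, right=3, bottom=2)
for row in grid:
    print("".join(str(pixel) for pixel in row))
# 0110
# 0110
# 0000
```

The description needs four numbers no matter how large the picture is, while the raster needs one value for every pixel.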

What About Audio?

A Sampling Scheme:
- Need to use both temporal sampling and amplitude sampling (called quantization)
A Description Scheme:
- Use something like standard musical notation
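The sampling scheme can be sketched for a pure tone. The function name, the parameter names, and the choice of a sine wave with amplitudes in [-1, 1] are assumptions made for the example.

```python
import math


def sample_and_quantize(frequency_hz, sample_rate_hz, num_samples, bits):
    """Sample a sine wave in time and quantize each amplitude to 2**bits levels."""
    levels = 2 ** bits
    samples = []
    for n in range(num_samples):
        # Temporal sampling: only look at the signal every 1/sample_rate seconds.
        t = n / sample_rate_hz
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # in [-1, 1]
        # Quantization: map [-1, 1] onto the integer levels 0 .. levels - 1.
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples


# One cycle of a 1 Hz tone, 8 samples per second, 3 bits per sample.
samples = sample_and_quantize(1, 8, 8, 3)
print([format(level, "03b") for level in samples])
```

Raising the sample rate improves the temporal resolution and raising the number of bits improves the amplitude resolution; both increase the amount of data.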

What About Programs?

Getting Started:
- Each processor is capable of executing a discrete set of operations
- Each operation is given a code
The Next Step:
- Each operation has a discrete number of operands, each of which is represented in binary
The Final Step:
- A program is just a sequence of operation codes and operand values
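The three steps can be sketched with a made-up machine. The operation names, the 4-bit operation codes, the 4-bit operands, and the single accumulator are all hypothetical and do not describe any real processor.

```python
# Hypothetical operation codes for a tiny accumulator machine.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "SUB": 0b0011}


def assemble(program):
    """Turn (operation, operand) pairs into one string of bits."""
    return "".join(
        format(OPCODES[operation], "04b") + format(operand, "04b")
        for operation, operand in program
    )


def run(bits):
    """Interpret the bits, 8 at a time, as operation code plus operand."""
    names = {code: name for name, code in OPCODES.items()}
    accumulator = 0
    for i in range(0, len(bits), 8):
        operation = names[int(bits[i:i + 4], 2)]
        operand = int(bits[i + 4:i + 8], 2)
        if operation == "LOAD":
            accumulator = operand
        elif operation == "ADD":
            accumulator += operand
        elif operation == "SUB":
            accumulator -= operand
    return accumulator


machine_code = assemble([("LOAD", 5), ("ADD", 3), ("SUB", 2)])
print(machine_code)       # 000101010010001100110010
print(run(machine_code))  # 6
```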

The Quiz Revisited

There's Always More to Learn