Wednesday 17 October 2012

Chapter 2: Floating point

Floating point
1. Represents real numbers in a way that can support a wide range of values.
2. Numbers are written in scientific notation (normalized & not normalized).
3. Works in binary as well as decimal.
4. In general, numbers are represented approximately to a fixed number of significant digits and scaled using an exponent.
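As a quick illustration of the scientific-notation idea in binary (a sketch of my own; Python's standard `math.frexp` splits a float into mantissa and exponent, using the convention 0.5 <= m < 1 rather than the 1.M form used below):

```python
import math

# math.frexp returns (m, e) such that x == m * 2**e, with 0.5 <= m < 1.
# 0.4375 = 0.875 * 2**-1, which is the same value as 1.11 (binary) * 2**-2.
m, e = math.frexp(0.4375)
print(m, e)
```

The same value can thus be written unnormalized (0.875 * 2^-1) or normalized with a leading 1 bit (1.11₂ * 2^-2); IEEE 754 stores the normalized form.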


Floating point standard
1. Defined by IEEE Std 754-1985
   - IEEE 754-1985 was the technical standard for floating-point computation.
2. Developed in response to divergence of representations.
3. Now almost universally adopted.
4. The standard provides definitions for single-precision and double-precision representations.

IEEE Floating-point format
1-bit Sign | 8-bit Exponent | 23-bit Mantissa/Significand
Single Precision Range:
• 32-bit: 1-bit sign + 8-bit exponent + 23-bit significand
• Range: 2.0 * 10^-38 < N < 2.0 * 10^38
• Precision: ~7 significant (decimal) digits
• Used when exact precision is less important



1-bit Sign | 11-bit Exponent | 52-bit Mantissa/Significand
Double Precision Range:
• 64-bit: 1-bit sign + 11-bit exponent + 52-bit significand
• Range: 2.0 * 10^-308 < N < 2.0 * 10^308
• Precision: ~15 significant (decimal) digits
• Used for scientific computations

The sign (S) of a binary floating-point number is represented by a single bit. A 1 bit indicates a negative number, and a 0 bit indicates a positive number.
M is the mantissa (000...000 to 111...111) and E is the biased exponent.
Single precision, bias = 127.
Double precision, bias = 1023.
Value = (-1)^S x 1.M x 2^(E - bias)
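The formula can be checked with a short Python sketch (`decode_single` is a hypothetical helper name of my own; it unpacks the raw 32-bit pattern into the three fields and reassembles the value):

```python
import struct

def decode_single(x):
    """Split a float into IEEE 754 single-precision fields (S, E, M, value)."""
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    s = bits >> 31               # 1-bit sign
    e = (bits >> 23) & 0xFF      # 8-bit biased exponent
    m = bits & 0x7FFFFF          # 23-bit fraction
    # Value = (-1)^S x 1.M x 2^(E - 127) for normalized numbers
    value = (-1) ** s * (1 + m / 2 ** 23) * 2 ** (e - 127)
    return s, e, m, value

fields = decode_single(-0.4375)
print(fields)
```

Note this sketch only handles normalized numbers; zero, denormals, infinities, and NaN have special encodings not covered here.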

For example:
Represent -0.4375
-0.4375 = (-1)^1 x 1.11₂ x 2^(-2)
S = 1
Fraction = 1100...00₂
Exponent = -2 + bias
Single: -2 + 127 = 125 = 01111101₂
Double: -2 + 1023 = 1021 = 01111111101₂

Single precision:
1 | 01111101 | 1100000...00

Double precision:
1 | 01111111101 | 11000000000...00
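The encoding of -0.4375 can be verified with Python's standard `struct` module, which exposes the raw IEEE 754 bit patterns ('<f' and '<d' pack single and double precision respectively):

```python
import struct

# Single precision: sign | 8-bit exponent | 23-bit fraction
bits32 = struct.unpack('<I', struct.pack('<f', -0.4375))[0]
print(format(bits32, '032b'))  # sign 1, exponent 01111101, fraction 1100...0

# Double precision: sign | 11-bit exponent | 52-bit fraction
bits64 = struct.unpack('<Q', struct.pack('<d', -0.4375))[0]
print(format(bits64, '064b'))  # sign 1, exponent 01111111101, fraction 1100...0
```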
