Lecture Note 4
Amdahl's Law
Fallacy: computers use little power at low load and high power only at maximum load
- 100% load : 100% power consumption
- 50% load : 68% power consumption
- 10% load : 47% power consumption
Most data centers operate at 10% ~ 50% load
- consider designing processors to make power proportional to load
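To see why this matters, the figures above can be turned into power spent per unit of useful work. This is only a rough sketch: it assumes that useful work scales linearly with load, and the helper name `power_per_unit_work` is made up for illustration.

```python
# Load/power pairs taken from the list above, as fractions of the maximum.
measurements = [(1.00, 1.00), (0.50, 0.68), (0.10, 0.47)]


def power_per_unit_work(load, power):
    """Relative power spent per unit of useful work (1.0 means 'as at full load').

    Assumption: useful work is proportional to load.
    """
    return power / load


for load, power in measurements:
    ratio = power_per_unit_work(load, power)
    print(f"load {load:.0%}: power {power:.0%}, power per unit of work x{ratio:.2f}")
```

Under that assumption, a server at 10% load spends several times more power per unit of work than at full load, which is the motivation for power-proportional designs.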
MIPS (millions of instructions per second)
$$ \text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \cdot 10^6} $$ -> not a good performance metric!
- differences in ISAs between computers
- differences in complexity between instructions
- CPI varies between programs on a given CPU
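A small numeric sketch of why the MIPS rating can mislead. The clock rate, instruction counts, and CPI values below are invented for illustration: they describe two hypothetical compiled versions of the same task on the same CPU.

```python
def execution_time(instruction_count, cpi, clock_rate_hz):
    """Execution time in seconds = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


def mips(instruction_count, execution_time_s):
    """MIPS rating as defined above."""
    return instruction_count / (execution_time_s * 1e6)


CLOCK_HZ = 2e9  # hypothetical 2 GHz CPU

# Hypothetical versions of the same task: (instruction count, average CPI).
version_x = (10e9, 1.0)  # many simple instructions
version_y = (4e9, 2.0)   # fewer, more complex instructions

time_x = execution_time(*version_x, CLOCK_HZ)
time_y = execution_time(*version_y, CLOCK_HZ)
mips_x = mips(version_x[0], time_x)
mips_y = mips(version_y[0], time_y)

print(f"X: {time_x:.1f} s, {mips_x:.0f} MIPS")
print(f"Y: {time_y:.1f} s, {mips_y:.0f} MIPS")
```

With these made-up numbers, version X has the higher MIPS rating but version Y finishes first, so ranking by MIPS would pick the slower program.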
Summary
- Cost per performance is decreasing (Moore's law, though it no longer holds)
- Hierarchical layers (abstraction): hiding hardware details
- ISA
- Execution time to measure performance (better than the MIPS metric)
- Power as the limiting factor: use parallelism to keep improving performance
Instructions: language of the Computer
Instruction set
- collection of instructions a computer understands (defined by its ISA)
- different computers have different instruction sets
MIPS Instruction Set
- Arithmetic operations
add a, b, c // a <- b + c
- e.g.
f = (g+h) - (i+j)
// MIPS-like pseudo-code
add t0, g, h // t0 <- g + h
add t1, i, j // t1 <- i + j
sub f, t0, t1 // f <- t0 - t1
Register Operands
Register file: 32 registers x 32 bits each (each register holds one word)
- holds frequently accessed data (registers #0 ~ #31)
- $t0 ~ $t9 for temporary values
- $s0 ~ $s7 for saved variables
Smaller is faster: use registers (instead of slow main memory)
- e.g.
f = (g+h) - (i+j)
// f(s0), g(s1), h(s2), i(s3), j(s4)
add $t0, $s1, $s2 // t0 <- g + h
add $t1, $s3, $s4 // t1 <- i + j
sub $s0, $t0, $t1 // f <- t0 - t1
Memory operands (MIPS32)
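- memory is used for composite data: arrays, data structures
- LOAD : memory -> register
- STORE : register -> memory
- memory is byte-addressed (each address identifies one byte)
- words (4 bytes) are aligned: a word address must be a multiple of 4
- Big-endian architecture: the most significant byte is at the lowest address of a word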
memory operands e.g.
g(s1) = h(s2) + A[8](s3);
lw $t0, 32($s3) // load word: A[8] is at byte offset 8 * 4 = 32 from the base address in $s3
add $s1, $s2, $t0 // g <- h + A[8]
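The address arithmetic in this example can be sketched with a toy byte-addressed, big-endian memory. This is only an illustrative model: the base address 64, the array contents, and the helper names `store_word` and `load_word` are all assumptions, not part of MIPS itself.

```python
WORD_BYTES = 4  # one MIPS32 word


def store_word(memory, address, value):
    """Store a 32-bit signed word at a word-aligned byte address (big-endian)."""
    if address % WORD_BYTES != 0:
        raise ValueError("unaligned word address")
    memory[address:address + WORD_BYTES] = value.to_bytes(WORD_BYTES, "big", signed=True)


def load_word(memory, base, offset):
    """Model of 'lw reg, offset(base)': read the word at byte address base + offset."""
    address = base + offset
    if address % WORD_BYTES != 0:
        raise ValueError("unaligned word address")
    return int.from_bytes(memory[address:address + WORD_BYTES], "big", signed=True)


memory = bytearray(256)   # toy main memory, one entry per byte
base_a = 64               # hypothetical base address of A (the value held in $s3)
for i in range(12):       # hypothetical contents: A[i] = 100 + i
    store_word(memory, base_a + i * WORD_BYTES, 100 + i)

h = 5                                            # hypothetical value of h ($s2)
t0 = load_word(memory, base_a, 8 * WORD_BYTES)   # load A[8]: byte offset 8 * 4 = 32
g = h + t0                                       # g = h + A[8]
print(t0, g)
```

The point of the sketch is that the offset in the load instruction counts bytes, not array elements, so index 8 becomes offset 32.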
Registers vs memory
- Registers: fast / Memory : slow
- operating on data in memory requires loads and stores (through registers)
- compilers : should keep variables in registers as much as possible for faster programs
Immediate Operands
addi $s3, $s3, 4 // s3 = s3 + 4
// no subi: use addi with a negative constant, e.g. addi $s2, $s1, -1
Constant zero register
Register $0 ($zero) always holds the constant 0 and cannot be overwritten:
add $t2, $s1, $0 // t2 = s1 + 0 (copies s1 into t2)
Binary integers
- Unsigned: $$ x = \sum_{i=0}^{n-1} x_i \cdot 2^{i} $$
- Signed (two's complement): $$ x = -x_{n-1} \cdot 2^{n-1} + \sum_{i=0}^{n-2} x_i \cdot 2^{i} $$
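The two formulas can be checked directly on bit strings. A minimal sketch, assuming bits are written most significant first (so `bits[0]` is x_{n-1}); the function names are chosen here for illustration.

```python
def unsigned_value(bits):
    """Value of a bit string under the unsigned formula."""
    n = len(bits)
    # bits[n - 1 - i] is x_i, the bit with weight 2^i
    return sum(int(bits[n - 1 - i]) << i for i in range(n))


def signed_value(bits):
    """Value of a bit string under the signed (two's complement) formula."""
    n = len(bits)
    msb = int(bits[0])  # x_{n-1}, which carries weight -2^(n-1)
    lower = sum(int(bits[n - 1 - i]) << i for i in range(n - 1))
    return -msb * (1 << (n - 1)) + lower


for pattern in ["0111", "1000", "1111", "1" * 30 + "00"]:
    print(pattern, unsigned_value(pattern), signed_value(pattern))
```

The same pattern gives different values under the two interpretations whenever the top bit is 1: for example `1111` is 15 unsigned but -1 signed.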
Sign extension / negation in MIPS
addi // sign-extends the 16-bit immediate value
lb, lh // sign-extend the loaded byte/halfword
beq, bne // sign-extend the displacement (branch offset)
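Sign extension copies the sign bit into all the new upper bits, so the value is preserved when a narrow field is widened to 32 bits. A minimal sketch on raw bit patterns follows; `sign_extend` and `zero_extend` are illustrative helper names, and the zero-extending variant is included only for contrast (it is what the unsigned loads lbu/lhu do).

```python
def sign_extend(value, from_bits, to_bits=32):
    """Widen a from_bits-wide bit pattern to to_bits by replicating its sign bit."""
    value &= (1 << from_bits) - 1          # keep only the low from_bits bits
    sign_bit = 1 << (from_bits - 1)
    if value & sign_bit:
        # set every bit from position from_bits up to to_bits - 1
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def zero_extend(value, from_bits):
    """Widen a bit pattern by filling the upper bits with zeros."""
    return value & ((1 << from_bits) - 1)


print(hex(sign_extend(0x0002, 16)))  # 16-bit +2
print(hex(sign_extend(0xFFFE, 16)))  # 16-bit -2
print(hex(sign_extend(0x80, 8)))     # 8-bit -128, as loaded by lb
print(hex(zero_extend(0x80, 8)))     # same byte, zero-extended
```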
MIPS R format instruction
32 = op(6) + rs(5) + rt(5) + rd(5) + shamt(5) + funct(6)
- op (opcode)
- rs (first src register)
- rt (second src register)
- rd (dst register)
- shamt (shift amount)
- funct (function code)
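As a sketch of how the six fields pack into one 32-bit word, the helpers below encode and decode an R-format instruction. The function names are made up for illustration; the example uses the standard MIPS encodings for `add $t0, $s1, $s2` (op 0, funct 32, and register numbers $s1 = 17, $s2 = 18, $t0 = 8).

```python
# Field widths in order from the most significant bits: op, rs, rt, rd, shamt, funct
R_FIELDS = [("op", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6)]


def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields into a 32-bit instruction word."""
    word = 0
    for (name, width), value in zip(R_FIELDS, (op, rs, rt, rd, shamt, funct)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r(word):
    """Split a 32-bit instruction word back into its R-format fields."""
    fields = {}
    shift = 32
    for name, width in R_FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# add $t0, $s1, $s2  ->  rd = $t0 (8), rs = $s1 (17), rt = $s2 (18)
word = encode_r(op=0, rs=17, rt=18, rd=8, shamt=0, funct=32)
print(f"{word:032b}")
print(hex(word))
print(decode_r(word))
```

Note that the destination register rd comes third in the encoding even though it is written first in the assembly syntax.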