GAP8 (Modified)
The original GAP8 design has limited memory capacity, which makes it difficult to tile memory blocks efficiently for transformer architectures, particularly when implementing KV caches.
To address this, we adjusted several memory configuration parameters in the simulated environment before composing the real RTL design for testing.
Notable changes
- Increased Cluster L1 size: 64 KiB -> 128 KiB
  - Address space: 0x10000000 - 0x1001FFFF
- Increased FC L1 size: 16 KiB -> 64 KiB
  - Address space: 0x1B000000 - 0x1B00FFFF
- Modified L2
  - The L2 interleaver now has five 128 KiB blocks instead of four; each block works independently depending on its power mode.
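
As a quick sanity check, the new L1 sizes can be recomputed from the address ranges listed above. The sketch below assumes the ranges are inclusive on both ends; the helper name `range_size` is purely illustrative and not part of any GAP8 tooling.

```python
KIB = 1024


def range_size(start: int, end: int) -> int:
    """Size in bytes of an inclusive address range [start, end]."""
    if end < start:
        raise ValueError("end must not be below start")
    return end - start + 1


# Address ranges copied from the list of changes.
CLUSTER_L1 = (0x10000000, 0x1001FFFF)
FC_L1 = (0x1B000000, 0x1B00FFFF)

cluster_l1_kib = range_size(*CLUSTER_L1) // KIB
fc_l1_kib = range_size(*FC_L1) // KIB

print(f"Cluster L1: {cluster_l1_kib} KiB, FC L1: {fc_l1_kib} KiB")
```

Both values should agree with the sizes stated in the list (128 KiB and 64 KiB).
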
Memory Map
Aliased Memory Area
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| Aliased memory map area | 0x00000000 - 0x003FFFFF | | 4 MiB |
Aliased Memory Area (FC)
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| ✅ FC L1 RAM (64 KiB) | 0x00000000 - 0x0000FFFF | 0x1B000000 - 0x1B00FFFF | 64 KiB |
| FC control unit registers | 0x00200000 - 0x002003FF | 0x1B200000 - 0x1B2003FF | 1 KiB |
| FC timer registers | 0x00200400 - 0x002007FF | 0x1B200400 - 0x1B2007FF | 1 KiB |
| FC event unit registers (FC private) | 0x00204000 - 0x002043FF | | 1 KiB |
| FC event unit registers (Global) | | 0x1B200800 - 0x1B200FFF | 2 KiB |
| ⚠️ FC memory protection unit registers | 0x00204400 - 0x002047FF | | 1 KiB |
| FC instruction cache control unit registers | 0x00201400 - 0x002017FF | 0x1B201400 - 0x1B2017FF | 1 KiB |
| FC Core 0 (Debug Unit) | 0x00300000 - 0x00307FFF | 0x1B300000 - 0x1B307FFF | 32 KiB |
Aliased Memory Area (Cluster)
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| ✅ Cluster L1 RAM (128 KiB) | 0x00000000 - 0x0001FFFF | 0x10000000 - 0x1001FFFF | 128 KiB |
| Cluster L1 memory test and set unit | 0x00100000 - 0x0010FFFF | 0x10100000 - 0x1010FFFF | 64 KiB |
| Cluster control unit registers | 0x00200000 - 0x002003FF | 0x10200000 - 0x102003FF | 1 KiB |
| Cluster timer registers | 0x00200400 - 0x002007FF | 0x10200400 - 0x102007FF | 1 KiB |
| Cluster event unit registers (Global) | | 0x10200800 - 0x10200FFF | 2 KiB |
| ⚠️ Cluster event unit registers (Cluster private) | 0x00204000 - 0x002047FF | | 2 KiB |
| Cluster instruction cache control unit | 0x00201400 - 0x002017FF | 0x10201400 - 0x102017FF | 1 KiB |
| ⚠️ Hardware convolution engine | 0x00201800 - 0x00201BFF | 0x10201800 - 0x10201BFF | 1 KiB |
| ⚠️ DMA CMD | 0x00204400 | | - |
| ⚠️ DMA Status | 0x00204404 | | - |
| Cluster Core 0 (Debug Unit) | 0x00300000 - 0x00307FFF | 0x10300000 - 0x10307FFF | 32 KiB |
| Cluster Core 1 (Debug Unit) | 0x00308000 - 0x0030FFFF | 0x10308000 - 0x1030FFFF | 32 KiB |
| Cluster Core 2 (Debug Unit) | 0x00310000 - 0x00317FFF | 0x10310000 - 0x10317FFF | 32 KiB |
| Cluster Core 3 (Debug Unit) | 0x00318000 - 0x0031FFFF | 0x10318000 - 0x1031FFFF | 32 KiB |
| Cluster Core 4 (Debug Unit) | 0x00320000 - 0x00327FFF | 0x10320000 - 0x10327FFF | 32 KiB |
| Cluster Core 5 (Debug Unit) | 0x00328000 - 0x0032FFFF | 0x10328000 - 0x1032FFFF | 32 KiB |
| Cluster Core 6 (Debug Unit) | 0x00330000 - 0x00337FFF | 0x10330000 - 0x10337FFF | 32 KiB |
| Cluster Core 7 (Debug Unit) | 0x00338000 - 0x0033FFFF | 0x10338000 - 0x1033FFFF | 32 KiB |
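
For rows that list both an aliased address and an alternative, the two differ only by a fixed base: 0x1B000000 when seen from the FC and 0x10000000 when seen from the cluster. The following is a minimal sketch of that translation, assuming the offset rule holds across the whole 4 MiB window apart from the private ranges that have no alternative address. The function name, the domain keys, and the `PRIVATE_RANGES` table are illustrative and do not come from the GAP8 SDK.

```python
from typing import Optional

ALIAS_LIMIT = 0x003FFFFF  # top of the 4 MiB aliased window

# Global base address that the aliased window maps onto, per domain.
GLOBAL_BASE = {"fc": 0x1B000000, "cluster": 0x10000000}

# Alias-only ranges: rows in the tables with an empty "Alternative" column.
PRIVATE_RANGES = {
    "fc": [(0x00204000, 0x002047FF)],       # private event unit + MPU
    "cluster": [(0x00204000, 0x002047FF)],  # private event unit, DMA CMD/Status
}


def resolve_alias(addr: int, domain: str) -> Optional[int]:
    """Translate an aliased address to its global alternative.

    Returns None for private ranges that have no global alternative,
    and returns the address unchanged if it is outside the alias window.
    Does not check whether a device actually backs the address.
    """
    if not 0 <= addr <= ALIAS_LIMIT:
        return addr
    for lo, hi in PRIVATE_RANGES[domain]:
        if lo <= addr <= hi:
            return None
    return GLOBAL_BASE[domain] + addr


print(hex(resolve_alias(0x00200400, "fc")))       # FC timer registers
print(hex(resolve_alias(0x00200400, "cluster")))  # Cluster timer registers
print(resolve_alias(0x00204400, "cluster"))       # DMA CMD has no alternative
```

The same aliased address resolves to a different device depending on which core issues the access, which is why the domain has to be passed explicitly.
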
Cluster Subsystem
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| ✅ Cluster L1 RAM (128 KiB) | 0x10000000 - 0x1001FFFF | 0x00000000 - 0x0001FFFF | 128 KiB |
| ✅ Cluster L1 memory test and set unit | 0x10100000 - 0x1010FFFF | 0x00100000 - 0x0010FFFF | 64 KiB |
| Cluster control unit | 0x10200000 - 0x102003FF | 0x00200000 - 0x002003FF | 1 KiB |
| Cluster timer registers | 0x10200400 - 0x102007FF | 0x00200400 - 0x002007FF | 1 KiB |
| Cluster event unit registers (Global) | 0x10200800 - 0x10200FFF | | 2 KiB |
| Cluster event unit registers (Cluster private) | | 0x00204000 - 0x002047FF | 2 KiB |
| Cluster instruction cache control unit | 0x10201400 - 0x102017FF | 0x00201400 - 0x002017FF | 1 KiB |
| Hardware convolution engine | 0x10201800 - 0x10201BFF | 0x00201800 - 0x00201BFF | 1 KiB |
| Cluster Core 0 (Debug Unit) | 0x10300000 - 0x10307FFF | 0x00300000 - 0x00307FFF | 32 KiB |
| Cluster Core 1 (Debug Unit) | 0x10308000 - 0x1030FFFF | 0x00308000 - 0x0030FFFF | 32 KiB |
| Cluster Core 2 (Debug Unit) | 0x10310000 - 0x10317FFF | 0x00310000 - 0x00317FFF | 32 KiB |
| Cluster Core 3 (Debug Unit) | 0x10318000 - 0x1031FFFF | 0x00318000 - 0x0031FFFF | 32 KiB |
| Cluster Core 4 (Debug Unit) | 0x10320000 - 0x10327FFF | 0x00320000 - 0x00327FFF | 32 KiB |
| Cluster Core 5 (Debug Unit) | 0x10328000 - 0x1032FFFF | 0x00328000 - 0x0032FFFF | 32 KiB |
| Cluster Core 6 (Debug Unit) | 0x10330000 - 0x10337FFF | 0x00330000 - 0x00337FFF | 32 KiB |
| Cluster Core 7 (Debug Unit) | 0x10338000 - 0x1033FFFF | 0x00338000 - 0x0033FFFF | 32 KiB |
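
The eight cluster debug units sit back to back in 32 KiB slots starting at 0x10300000, so a core's debug range follows from its index. The sketch below derives the range that way; the constant and function names are hypothetical and chosen only for illustration.

```python
CLUSTER_DEBUG_BASE = 0x10300000  # Cluster Core 0 (Debug Unit)
DEBUG_UNIT_STRIDE = 0x8000       # 32 KiB per core
NUM_CLUSTER_CORES = 8


def cluster_debug_unit_range(core_id: int) -> tuple:
    """Return the inclusive (start, end) global range of a core's debug unit."""
    if not 0 <= core_id < NUM_CLUSTER_CORES:
        raise ValueError(f"core_id must be in 0..{NUM_CLUSTER_CORES - 1}")
    start = CLUSTER_DEBUG_BASE + core_id * DEBUG_UNIT_STRIDE
    return start, start + DEBUG_UNIT_STRIDE - 1


for core in range(NUM_CLUSTER_CORES):
    start, end = cluster_debug_unit_range(core)
    print(f"Core {core}: 0x{start:08X} - 0x{end:08X}")
```

The printed ranges should match the Debug Unit rows of the Cluster Subsystem table.
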
ROM Memory
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| ✅ ROM (8 KiB) | 0x1A000000 - 0x1A001FFF | | 8 KiB |
SoC Peripherals Subsystem
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| SoC FLL | 0x1A100000 - 0x1A1007FF | | 2 KiB |
| Cluster FLL | 0x1A100800 - 0x1A100FFF | | 2 KiB |
| GPIO | 0x1A101000 - 0x1A101FFF | | 4 KiB |
| SoC control unit | 0x1A104000 - 0x1A104FFF | | 4 KiB |
| Advanced timer | 0x1A105000 - 0x1A105FFF | | 4 KiB |
| SoC event generator | 0x1A106000 - 0x1A106FFF | | 4 KiB |
| PMU DLC bridge | 0x1A107000 - 0x1A107FFF | | 4 KiB |
| RealTime Counter | 0x1A108000 - 0x1A108FFF | | 4 KiB |
| Efuse | 0x1A109000 - 0x1A109FFF | | 4 KiB |
uDMA Subsystem
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| uDMA LVDS interface | 0x1A102000 - 0x1A10207F | | 128 B |
| SPI Master Channel 0 | 0x1A102080 - 0x1A1020FF | | 128 B |
| SPI Master Channel 1 | 0x1A102100 - 0x1A10217F | | 128 B |
| uDMA Hyperbus interface | 0x1A102180 - 0x1A1021FF | | 128 B |
| uDMA UART interface | 0x1A102200 - 0x1A10227F | | 128 B |
| I2C Channel 0 | 0x1A102280 - 0x1A1022FF | | 128 B |
| I2C Channel 1 | 0x1A102300 - 0x1A10237F | | 128 B |
| uDMA MEMCPY interface | 0x1A102380 - 0x1A1023FF | | 128 B |
| uDMA I2S interface | 0x1A102400 - 0x1A10247F | | 128 B |
| uDMA CPI interface | 0x1A102480 - 0x1A1024FF | | 128 B |
| uDMA control unit | 0x1A102780 - 0x1A1027FF | | 128 B |
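
Each uDMA interface occupies a 128-byte slot above 0x1A102000, so its register range can be derived from a slot index. In the sketch below the slot indices are inferred from the addresses in the table, and the short peripheral keys (`"uart"`, `"spim0"`, and so on) are made-up labels rather than SDK identifiers.

```python
UDMA_BASE = 0x1A102000
UDMA_SLOT_SIZE = 0x80  # 128 B per interface

# Slot index = (start address - UDMA_BASE) / 128, inferred from the table.
UDMA_SLOTS = {
    "lvds": 0,
    "spim0": 1,
    "spim1": 2,
    "hyperbus": 3,
    "uart": 4,
    "i2c0": 5,
    "i2c1": 6,
    "memcpy": 7,
    "i2s": 8,
    "cpi": 9,
    "ctrl": 15,  # uDMA control unit; slots 10-14 are not listed in the table
}


def udma_slot_range(name: str) -> tuple:
    """Return the inclusive (start, end) register range of a uDMA interface."""
    start = UDMA_BASE + UDMA_SLOTS[name] * UDMA_SLOT_SIZE
    return start, start + UDMA_SLOT_SIZE - 1


start, end = udma_slot_range("uart")
print(f"UART: 0x{start:08X} - 0x{end:08X}")
```

Keeping the slot indices in one table makes it easy to cross-check the map row by row when a peripheral is added or moved.
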
Fabric Controller Subsystem
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| ✅ FC L1 RAM (64 KiB) | 0x1B000000 - 0x1B00FFFF | 0x00000000 - 0x0000FFFF | 64 KiB |
| FC control unit registers | 0x1B200000 - 0x1B2003FF | 0x00200000 - 0x002003FF | 1 KiB |
| FC timer | 0x1B200400 - 0x1B2007FF | 0x00200400 - 0x002007FF | 1 KiB |
| FC event unit (Global) | 0x1B200800 - 0x1B200FFF | | 2 KiB |
| FC event unit (FC private) | | 0x00204000 - 0x002043FF | 1 KiB |
| FC instruction cache control unit | 0x1B201400 - 0x1B2017FF | 0x00201400 - 0x002017FF | 1 KiB |
| FC Core 0 (Debug Unit) | 0x1B300000 - 0x1B307FFF | 0x00300000 - 0x00307FFF | 32 KiB |
L2 Memory
| Area | Address space | Alternative | Size |
| --- | --- | --- | --- |
| ✅ L2 RAM (512 KiB) | 0x1C000000 - 0x1C07FFFF | | 512 KiB |
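
When debugging a simulated access, it can help to map a global address back to the memory it belongs to. The sketch below does this for the four RAM/ROM regions of the map only. It assumes inclusive ranges and takes the L2 end address as 0x1C07FFFF to match the stated 512 KiB size. The `REGIONS` table and `find_region` are illustrative names, not part of any existing tool.

```python
from typing import Optional, Tuple

# (name, start, inclusive end) for the memories in the map.
REGIONS = [
    ("Cluster L1 RAM", 0x10000000, 0x1001FFFF),
    ("ROM", 0x1A000000, 0x1A001FFF),
    ("FC L1 RAM", 0x1B000000, 0x1B00FFFF),
    ("L2 RAM", 0x1C000000, 0x1C07FFFF),  # assumed inclusive end for 512 KiB
]


def find_region(addr: int) -> Optional[Tuple[str, int]]:
    """Return (region name, byte offset) for a global address, or None."""
    for name, start, end in REGIONS:
        if start <= addr <= end:
            return name, addr - start
    return None


for addr in (0x1001FFFF, 0x1B000040, 0x1C000010, 0x1A002000):
    print(f"0x{addr:08X} -> {find_region(addr)}")
```

Addresses that fall outside every listed region, such as the first byte past the ROM, return `None` rather than raising, so the helper can be used to filter access traces.
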