When two PCIe devices are connected (e.g., GPU ↔ motherboard slot), they don’t immediately start blasting data.
Instead, they go through an automatic negotiation and calibration process to:
- Agree on how many lanes to use (x1, x4, x8, x16, up to x32).
- Agree on speed (Gen1 2.5 GT/s, Gen2 5 GT/s, Gen3 8 GT/s, etc.).
- Ensure signal integrity (bits aren’t flipped/misaligned).
- Handle real-world imperfections like reversed or inverted connections.
This is done via training sequences (TS1, TS2 ordered sets) exchanged between devices.
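To make the idea of a training sequence concrete, here is a minimal sketch of how a TS1 or TS2 ordered set is laid out at Gen1/Gen2 rates (8b/10b encoding): 16 symbols, starting with a COM and ending with ten identifier symbols. The function name `build_training_set` and the `(is_control, value)` tuple representation are assumptions made for illustration. The layout at 8 GT/s and above differs and is not covered here.

```python
from typing import List, Optional, Tuple

# A symbol is modelled as (is_control_character, 8-bit value).
# This representation is an assumption for illustration only.
Symbol = Tuple[bool, int]

COM: Symbol = (True, 0xBC)      # K28.5, first symbol of every 8b/10b ordered set
PAD: Symbol = (True, 0xF7)      # K23.7, placeholder while link/lane numbers are unassigned
TS1_ID: Symbol = (False, 0x4A)  # D10.2, identifies a TS1
TS2_ID: Symbol = (False, 0x45)  # D5.2, identifies a TS2


def build_training_set(kind: str,
                       link: Optional[int] = None,
                       lane: Optional[int] = None,
                       n_fts: int = 0,
                       rate_id: int = 0x02,
                       control: int = 0x00) -> List[Symbol]:
    """Return the 16 symbols of a simplified Gen1/Gen2 TS1 or TS2."""
    if kind not in ("TS1", "TS2"):
        raise ValueError("kind must be 'TS1' or 'TS2'")
    identifier = TS1_ID if kind == "TS1" else TS2_ID
    header = [
        COM,                                         # symbol 0
        PAD if link is None else (False, link),      # symbol 1: link number
        PAD if lane is None else (False, lane),      # symbol 2: lane number
        (False, n_fts),                              # symbol 3: N_FTS
        (False, rate_id),                            # symbol 4: data rate identifier
        (False, control),                            # symbol 5: training control
    ]
    return header + [identifier] * 10                # symbols 6-15
```

Early in training both sides send TS1s with PAD in the link and lane fields; the real numbers are filled in as negotiation progresses, for example `build_training_set("TS2", link=0, lane=3)`.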
Key Steps in Training & Initialization
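Detect link partner
- Physical presence is detected: each transmitter checks its lanes for a receiver termination on the far side.
- Once a partner is found, both sides exit electrical idle and start sending training sequences.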
Establish Link Width
- Both sides determine how many lanes are active (x1, x4, x8, …).
- If some lanes fail, the width may be reduced (e.g., a device supports x16, but only x8 trains successfully).
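The width fallback can be sketched as picking the widest width that both sides support and that fits into the lanes that actually trained. This is a simplified model: the helper names are hypothetical, and it assumes the usable lanes must form a contiguous group starting at lane 0 (real hardware may also use lane reversal to keep the other end of the link).

```python
from typing import Iterable, Sequence


def usable_lane_count(lane_ok: Sequence[bool]) -> int:
    """Count consecutive working lanes starting at lane 0."""
    count = 0
    for ok in lane_ok:
        if not ok:
            break
        count += 1
    return count


def negotiate_width(widths_a: Iterable[int],
                    widths_b: Iterable[int],
                    lane_ok: Sequence[bool]) -> int:
    """Return the widest width both sides support that fits the working lanes."""
    usable = usable_lane_count(lane_ok)
    common = set(widths_a) & set(widths_b)
    candidates = [w for w in common if w <= usable]
    if not candidates:
        raise RuntimeError("no common link width can be formed")
    return max(candidates)
```

For example, if both sides support x1, x4, x8, and x16 but lane 9 fails to train, only lanes 0 to 8 are usable and the sketch settles on x8.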
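Negotiate Data Rate
- Every link first trains at Gen1 (2.5 GT/s), regardless of what either side supports.
- Once the link is up, it retrains to the highest rate both sides advertise (Gen2, Gen3, …).
- If errors are excessive at the higher rate, the link can fall back to a lower one.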
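As a rough sketch of the rate decision: take the highest generation both sides support, and step down until the link is reliable, with Gen1 as the floor. The `is_reliable` callback is a hypothetical stand-in for whatever error monitoring the hardware performs; real devices also run equalization at 8 GT/s and above, which this model ignores.

```python
from typing import Callable

# Per-lane raw signalling rate for each generation, in GT/s.
RATES_GT_S = {1: 2.5, 2: 5.0, 3: 8.0, 4: 16.0, 5: 32.0}


def settle_rate(max_gen_a: int,
                max_gen_b: int,
                is_reliable: Callable[[int], bool]) -> int:
    """Return the generation the link ends up running at.

    is_reliable(gen) is a hypothetical callback that reports whether the
    link runs with an acceptable error rate at that generation.
    """
    target = min(max_gen_a, max_gen_b)   # highest rate both sides support
    for gen in range(target, 1, -1):     # try the target first, then step down
        if is_reliable(gen):
            return gen
    return 1                             # Gen1 (2.5 GT/s) is the baseline rate
```

A Gen4 device paired with a Gen3 slot would target Gen3 (`RATES_GT_S[3]` is 8.0 GT/s) and only drop lower if that rate proves unreliable.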
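Lane Reversal
- If lanes are connected in reverse order (e.g., on an x8 link, lane 0 is wired to lane 7, lane 1 to lane 6, and so on), the receiver can logically remap them.
- This avoids the need for perfect lane routing in hardware. Support for lane reversal is optional in the PCIe specification, so not every device offers it.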
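Lane reversal is a mirror of the whole lane order, not an arbitrary swap, which makes the remapping a one-line formula. The sketch below uses hypothetical helper names and assumes the receiver can read the lane number the partner advertises on each physical lane.

```python
from typing import List, Sequence


def reversed_lane(lane: int, width: int) -> int:
    """Map a physical lane index to its logical index on a reversed link."""
    return width - 1 - lane


def detect_reversal(received_lane_numbers: Sequence[int]) -> bool:
    """received_lane_numbers[i] is the lane number seen on physical lane i."""
    width = len(received_lane_numbers)
    seen = list(received_lane_numbers)
    if seen == list(range(width)):
        return False                      # wired straight through
    if seen == list(range(width - 1, -1, -1)):
        return True                       # fully mirrored
    raise ValueError("lane order is neither straight nor fully reversed")


def logical_order(per_lane_data: Sequence, is_reversed: bool) -> List:
    """Return per-lane data in logical lane order."""
    return list(reversed(per_lane_data)) if is_reversed else list(per_lane_data)
```

On an x8 link, `reversed_lane(0, 8)` gives 7 and `reversed_lane(1, 8)` gives 6, matching the mirrored wiring described above.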
Polarity Inversion
- PCIe uses differential pairs (positive + negative signals).
- If the pair is swapped (P ↔ N), the receiver detects this during training and inverts the received bits on that lane automatically.
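One way to picture the detection at Gen1/Gen2 rates: with swapped polarity, the identifier symbols of a TS1 decode as D21.5 instead of D10.2, and those of a TS2 as D26.5 instead of D5.2. The sketch below checks for that signature and then flips the bits. Function names are hypothetical, and symbols are shown as already-decoded 8-bit values for simplicity.

```python
from typing import List, Sequence

TS1_ID, TS1_ID_INVERTED = 0x4A, 0xB5   # D10.2 and D21.5
TS2_ID, TS2_ID_INVERTED = 0x45, 0xBA   # D5.2 and D26.5


def polarity_inverted(identifier_symbols: Sequence[int]) -> bool:
    """Decide from TS identifier symbols whether a lane's polarity is swapped."""
    if not identifier_symbols:
        raise ValueError("no identifier symbols received")
    if all(s in (TS1_ID, TS2_ID) for s in identifier_symbols):
        return False
    if all(s in (TS1_ID_INVERTED, TS2_ID_INVERTED) for s in identifier_symbols):
        return True
    raise ValueError("identifier symbols are not a valid TS1/TS2 pattern")


def correct_bits(bits: Sequence[int], inverted: bool) -> List[int]:
    """Undo a P/N swap by flipping every received bit on the lane."""
    return [b ^ 1 for b in bits] if inverted else list(bits)
```

Because the check is per lane, one lane of a link can be inverted while its neighbours are not.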
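Bit Lock (per lane)
- The receiver's clock recovery circuit locks onto the transitions in the incoming bit stream; the data clock is embedded in the stream rather than sent on a separate wire.
- This ensures bits are sampled at the right time, near the middle of each bit period.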
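Real clock recovery is done by analog and mixed-signal circuitry, but the principle can be shown with a software analogy. The sketch below assumes a hypothetical oversampled capture with `osr` samples per bit: it finds the sample phase where transitions cluster and then samples half a bit later.

```python
from collections import Counter
from typing import List, Sequence


def find_sampling_phase(samples: Sequence[int], osr: int) -> int:
    """Return the sample phase (0..osr-1) closest to the middle of each bit.

    osr is the number of samples captured per bit (an assumption of this model).
    """
    # Record at which phase each 0->1 or 1->0 transition occurs.
    edges = Counter(i % osr
                    for i in range(1, len(samples))
                    if samples[i] != samples[i - 1])
    if not edges:
        raise ValueError("no transitions found, cannot lock")
    edge_phase = edges.most_common(1)[0][0]
    return (edge_phase + osr // 2) % osr   # half a bit after the edge


def recover_bits(samples: Sequence[int], osr: int) -> List[int]:
    """Sample once per bit at the recovered phase."""
    phase = find_sampling_phase(samples, osr)
    return [samples[i] for i in range(phase, len(samples), osr)]
```

The sketch also shows why the encoding must guarantee frequent transitions: with no edges in the stream there is nothing to lock onto.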
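Symbol Lock (per lane)
- Finds alignment within the serial stream so that 10-bit symbols (8b/10b, Gen1/Gen2) or 130-bit blocks (128b/130b, Gen3 to Gen5) are grouped correctly.
- Example: “Where does the 10-bit symbol boundary start?” With 8b/10b, the receiver answers this by looking for the distinctive bit pattern of the COM symbol.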
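For 8b/10b links, the boundary search can be sketched as a scan for the COM symbol (K28.5), whose two 10-bit encodings are `0011111010` and `1100000101`. The sketch below works on a string of '0'/'1' characters and uses hypothetical function names; it assumes a valid 8b/10b stream, where this pattern does not appear across the boundary of two ordinary symbols.

```python
from typing import List

# The two running-disparity variants of K28.5 (COM).
K28_5_CODES = ("0011111010", "1100000101")


def find_symbol_boundary(bits: str) -> int:
    """Return the offset (0..9) at which 10-bit symbols start."""
    hits = []
    for code in K28_5_CODES:
        index = bits.find(code)
        if index != -1:
            hits.append(index)
    if not hits:
        raise ValueError("no COM symbol found, symbol lock not achieved")
    return min(hits) % 10


def split_symbols(bits: str) -> List[str]:
    """Cut the stream into whole 10-bit symbols once the boundary is known."""
    start = find_symbol_boundary(bits)
    return [bits[i:i + 10] for i in range(start, len(bits) - 9, 10)]
```

Bits before the first boundary and any incomplete symbol at the end are simply discarded by `split_symbols`.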
Lane-to-Lane Deskew
- In multi-lane links (x4, x8, x16…), signals don’t all arrive at the same time due to PCB trace length differences.
- Deskew aligns them so bytes across all lanes reassemble correctly.
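Deskew can be sketched as lining up every lane on a marker that the transmitter sent on all lanes at the same time, such as the COM that starts an ordered set. In this simplified model each lane is a list of already-decoded symbols and the marker is the string `"COM"`; both are assumptions for illustration, and real hardware delays the early lanes with small buffers rather than discarding symbols.

```python
from typing import List, Sequence

COM = "COM"   # stand-in for the marker symbol sent simultaneously on all lanes


def deskew(lanes: Sequence[Sequence[str]]) -> List[List[str]]:
    """Align all lanes so that their first COM lands at index 0."""
    offsets = [list(lane).index(COM) for lane in lanes]       # skew per lane
    aligned = [list(lane[off:]) for lane, off in zip(lanes, offsets)]
    common_length = min(len(lane) for lane in aligned)
    return [lane[:common_length] for lane in aligned]


def reassemble(aligned: Sequence[Sequence[str]]) -> List[str]:
    """Interleave aligned lanes back into the original symbol order."""
    return [symbol for column in zip(*aligned) for symbol in column]
```

After `deskew`, index 1 of every lane holds symbols that were transmitted together, so `reassemble` can interleave them in lane order.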
Result
When training completes successfully:
- The link is established at the highest width and data rate that both sides support and the channel can sustain.
- Packets (TLPs and DLLPs) can now flow reliably.
- If conditions change (errors, power management, resets), retraining can occur.