The PCI Express protocol can be used as a data interface to flash memory devices, such as memory cards and solid-state drives (SSDs). Certain data-center applications, such as large computer clusters, require the use of fiber-optic interconnects due to the distance limitations inherent in copper cabling.

Typically, a network-oriented standard such as Ethernet or Fibre Channel suffices for these applications, but in some cases the overhead introduced by routable protocols is undesirable and a lower-level interconnect, such as InfiniBand, RapidIO, or NUMAlink, is needed.

Local-bus standards such as PCIe and HyperTransport can in principle be used for this purpose, [92] but solutions are only available from niche vendors such as Dolphin ICS.

The differences are based on the trade-offs between flexibility and extensibility vs latency and overhead.

The additional overhead reduces the effective bandwidth of the interface and complicates bus discovery and initialization software. Also, making the system hot-pluggable requires that software track network topology changes.

InfiniBand is such a technology. Another example is making the packets shorter to decrease latency, as is required if a bus must operate as a memory interface.

Smaller packets mean packet headers consume a higher percentage of the packet, thus decreasing the effective bandwidth.
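
The effect of header size on effective bandwidth can be illustrated with a short calculation. The sketch below is only an illustration: the function name, the fixed 24-byte per-packet overhead, and the payload sizes are hypothetical values chosen for the example, not figures taken from any particular protocol.

```python
def payload_efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of each packet that carries payload rather than header or framing."""
    if payload_bytes <= 0 or overhead_bytes < 0:
        raise ValueError("payload must be positive and overhead non-negative")
    return payload_bytes / (payload_bytes + overhead_bytes)


# Hypothetical fixed per-packet overhead, used only for illustration.
OVERHEAD_BYTES = 24

for payload in (64, 256, 4096):
    eff = payload_efficiency(payload, OVERHEAD_BYTES)
    print(f"{payload:5d}-byte payload: {eff:.1%} of raw bandwidth carries data")
```

With a fixed overhead, the efficiency falls as the payload shrinks, which is the trade-off between latency and bandwidth described above.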

PCI Express falls somewhere in the middle, targeted by design as a system interconnect local bus rather than a device interconnect or routed network protocol.

Additionally, its design goal of software transparency constrains the protocol and raises its latency somewhat.

PCI devices are therefore generally designed to avoid using the all-ones value in important status registers, so that such an error can be easily detected by software.

Targets latch the address and begin decoding it. Subtractive decode devices, seeing no other response by clock 4, may respond on clock 5.

If the master does not see a response by clock 5, it will terminate the transaction and remove FRAME on clock 6.
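
The response window described above can be summarized as a small lookup. This is a rough sketch rather than a bus model: the names are hypothetical, and the "fast", "medium", and "slow" labels for clocks 2, 3, and 4 are assumed from the usual PCI terminology.

```python
from typing import Optional

# Clock on which each class of target asserts DEVSEL (assumed labels).
DEVSEL_CLOCK = {"fast": 2, "medium": 3, "slow": 4, "subtractive": 5}

# Last clock on which the master will accept a response.
LAST_RESPONSE_CLOCK = 5
# Clock on which the master removes FRAME if nobody responded.
ABORT_CLOCK = 6


def devsel_clock(speed: Optional[str]) -> Optional[int]:
    """Return the clock on which DEVSEL is seen, or None if no target responds."""
    if speed is None:
        return None  # nobody claims the address; the master terminates the transaction
    return DEVSEL_CLOCK[speed]


for speed in ("fast", "medium", "slow", "subtractive", None):
    clock = devsel_clock(speed)
    if clock is None:
        print(f"no target: master removes FRAME on clock {ABORT_CLOCK}")
    else:
        print(f"{speed} target: DEVSEL seen on clock {clock}")
```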

The initiator may assert IRDY as soon as it is ready to transfer data, which could theoretically be as soon as clock 2.

To allow 64-bit addressing, a master will present the address over two consecutive cycles. In the first cycle, it sends the low-order address bits together with a special dual-address-cycle command. On the following cycle, it sends the high-order address bits and the actual command.

Dual-address cycles are forbidden if the high-order address bits are zero, so devices which do not support 64-bit addressing can simply not respond to dual-cycle commands.
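
As a sketch of the rule above, the following function splits a 64-bit address into the values driven during the address phase. The function name is hypothetical, and the command codes (1101 for a dual-address cycle, 0110 for a memory read) are quoted from the PCI command encoding as an assumption that should be checked against the specification.

```python
from typing import List, Tuple

DUAL_ADDRESS_CYCLE = 0b1101  # assumed PCI command code for a dual-address cycle
MEMORY_READ = 0b0110         # assumed PCI command code for a memory read


def address_phases(address: int, command: int) -> List[Tuple[int, int]]:
    """Return the (AD[31:0], command) pairs driven during the address phase."""
    if not 0 <= address < 1 << 64:
        raise ValueError("address must fit in 64 bits")
    low = address & 0xFFFF_FFFF
    high = address >> 32
    if high == 0:
        # Dual-address cycles are forbidden when the high-order bits are zero.
        return [(low, command)]
    # First cycle: low bits with the dual-address command.
    # Second cycle: high bits with the actual command.
    return [(low, DUAL_ADDRESS_CYCLE), (high, command)]


print(address_phases(0x0000_1000, MEMORY_READ))
print(address_phases(0x1_2345_6780, MEMORY_READ))
```

A device that only understands 32-bit addresses never needs to decode the second form, since it can ignore any transaction that starts with the dual-address command.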

Addresses for PCI configuration space access are decoded specially. For these, the low-order address lines specify the offset of the desired PCI configuration register, and the high-order address lines are ignored.

Each slot connects a different high-order address line to the IDSEL pin, and is selected using one-hot encoding on the upper address lines.
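
The one-hot selection can be sketched as follows. The wiring constant is hypothetical (real boards choose which address line feeds each slot's IDSEL pin), as are the function names.

```python
IDSEL_BASE_LINE = 16  # hypothetical: slot 0 is wired to AD[16]; actual wiring varies by board


def config_address(slot: int, register_offset: int) -> int:
    """Build the AD value for a configuration access to one slot."""
    if not 0 <= slot < 32 - IDSEL_BASE_LINE:
        raise ValueError("slot has no address line available")
    if not 0 <= register_offset < 256 or register_offset % 4:
        raise ValueError("offset must be a 32-bit-aligned value below 256")
    # Exactly one upper address line is high; the low bits carry the register offset.
    return (1 << (IDSEL_BASE_LINE + slot)) | register_offset


def selected_slot(ad: int) -> int:
    """Recover the slot whose IDSEL line is driven, checking the one-hot property."""
    upper = ad >> IDSEL_BASE_LINE
    if upper == 0 or upper & (upper - 1):
        raise ValueError("upper address lines are not one-hot")
    return upper.bit_length() - 1


ad = config_address(slot=2, register_offset=0x10)
print(hex(ad), selected_slot(ad))
```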

After the address phase (specifically, beginning with the cycle in which DEVSEL goes low) comes a burst of one or more data phases.

In the case of a write, the asserted byte enable signals indicate which of the four bytes on the AD bus are to be written to the addressed location. In the case of a read, they indicate which bytes the initiator is interested in.

For reads, it is always legal to ignore the byte enable signals and simply return all 32 bits; cacheable memory resources are required to always return 32 valid bits.
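
The role of the byte enables during a write can be sketched as a masked merge. The function name is hypothetical, and the sketch assumes the byte enables are active-low (a 0 bit enables the corresponding byte lane), as on the PCI C/BE lines.

```python
def apply_write(old_word: int, ad_data: int, byte_enables_n: int) -> int:
    """Merge AD bus data into a 32-bit word under active-low byte enables."""
    result = old_word & 0xFFFF_FFFF
    for byte in range(4):
        if not (byte_enables_n >> byte) & 1:  # 0 means this byte lane is enabled
            mask = 0xFF << (8 * byte)
            result = (result & ~mask) | (ad_data & mask)
    return result


# Only the two low-order bytes are enabled, so the upper half is left untouched.
print(hex(apply_write(0xAABBCCDD, 0x11223344, 0b1100)))
```

For a read, a target may skip this masking entirely and return all 32 bits, as noted above.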

The data phase continues until both parties are ready to complete the transfer and continue to the next data phase. Whichever side is providing the data must drive it on the AD bus before asserting its ready signal.

Once one of the participants asserts its ready signal, it may not become un-ready or otherwise alter its control signals until the end of the data phase.

The data recipient must latch the AD bus each cycle until it sees both IRDY and TRDY asserted, which marks the end of the current data phase and indicates that the just-latched data is the word to be transferred.

This continues the address cycle illustrated above, assuming a single address cycle with medium DEVSEL, so the target responds in time for clock 3.

However, at that time, neither side is ready to transfer data. For clock 4, the initiator is ready, but the target is not. On clock 5, both are ready, and a data transfer takes place as indicated by the vertical lines.

For clock 6, the target is ready to transfer, but the initiator is not. On clock 7, the initiator becomes ready, and data is transferred.

For clocks 8 and 9, both sides remain ready to transfer data, and data is transferred at the maximum possible rate (32 bits per clock cycle).
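
The clock-by-clock walkthrough above can be replayed in a few lines. This is a simplified sketch with hypothetical names: it only records on which clocks both ready signals are asserted and does not model the rest of the bus.

```python
from typing import Dict, List, Tuple

# (initiator ready, target ready) on each clock, following the walkthrough.
READINESS: Dict[int, Tuple[bool, bool]] = {
    3: (False, False),
    4: (True, False),
    5: (True, True),
    6: (False, True),
    7: (True, True),
    8: (True, True),
    9: (True, True),
}


def transfer_clocks(readiness: Dict[int, Tuple[bool, bool]]) -> List[int]:
    """Return the clocks on which a data word is transferred."""
    return [
        clock
        for clock, (initiator_ready, target_ready) in sorted(readiness.items())
        if initiator_ready and target_ready
    ]


clocks = transfer_clocks(READINESS)
print(f"transfers on clocks {clocks}: {4 * len(clocks)} bytes in {len(READINESS)} clocks")
```

A word moves only on clocks where both sides are ready, so wait states inserted by either side lower the achieved rate below 32 bits per clock.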

In case of a read, clock 2 is reserved for turning around the AD bus, so the target is not permitted to drive data on the bus even if it is capable of fast DEVSEL.

A target that supports fast DEVSEL could in theory begin responding to a read the cycle after the address is presented.

This cycle is, however, reserved for AD bus turnaround. Note that most targets will not be this fast and will not need any special logic to enforce this condition.

Either side may request that a burst end after the current data phase. Simple PCI devices that do not support multi-word bursts will always request this immediately.

Even devices that do support bursts will have some limit on the maximum length they can support, such as the end of their addressable memory.

The cycle after the target asserts TRDY, the final data transfer is complete, both sides deassert their respective RDY signals, and the bus is idle again.

Obviously, it is pointless to wait for TRDY in such a case. The target requests the initiator end a burst by asserting STOP.

The initiator will then end the transaction by deasserting FRAME at the next legal opportunity; if it wishes to transfer more data, it will continue in a separate transaction.

There are several ways for the target to do this. There will always be at least one more cycle after a target-initiated disconnection, to allow the master to deassert FRAME.

There are two sub-cases, which take the same amount of time, but one requires an additional data phase. If the initiator ends the burst at the same time as the target requests disconnection, there is no additional bus cycle.

For memory space accesses, the words in a burst may be accessed in several orders. The unnecessary low-order address bits AD[1:0] are used to convey the initiator's requested order. A target which does not support a particular order must terminate the burst after the first word.

Some of these orders depend on the cache line size, which is configurable on all PCI devices. If the starting offset within the cache line is zero, all of these modes reduce to the same order.

Cache line toggle and cache line wrap modes are two forms of critical-word-first cache line fetching. Toggle mode XORs the supplied address with an incrementing counter.

This is the native order for Intel 486 and Pentium processors. It has the advantage that it is not necessary to know the cache line size to implement it.

When one cache line is completely fetched, fetching jumps to the starting offset in the next cache line. Note that most PCI devices only support a limited range of typical cache line sizes; if the cache line size is programmed to an unexpected value, they force single-word access.
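
The two critical-word-first orders can be compared with a short sketch. The function names are hypothetical, and addresses are expressed as 32-bit word indices rather than byte addresses to keep the example small.

```python
from typing import List


def toggle_order(start: int, count: int) -> List[int]:
    """Toggle mode: XOR the starting word index with an incrementing counter."""
    return [start ^ i for i in range(count)]


def wrap_order(start: int, count: int, line_words: int) -> List[int]:
    """Wrap mode: increment within the cache line, wrapping at its end, then
    continue at the same starting offset in the next cache line."""
    line_base = start - start % line_words
    offset = start % line_words
    order = []
    for i in range(count):
        line, pos = divmod(i, line_words)
        order.append(line_base + line * line_words + (offset + pos) % line_words)
    return order


# Burst starting at word 1 of a four-word cache line.
print(toggle_order(1, 4))
print(wrap_order(1, 8, line_words=4))
# With a starting offset of zero both modes give plain ascending order.
print(toggle_order(0, 4), wrap_order(0, 4, line_words=4))
```

Note that `toggle_order` takes no cache line size at all, which reflects the advantage mentioned above, while `wrap_order` cannot be computed without it.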

This is rarely used, and may be buggy in some devices; they may not support it, but not properly force single-word access either.

That might be their turnaround cycle. As the initiator is also ready, a data transfer occurs. This repeats for three more cycles, but before the last one (clock edge 5), the master deasserts FRAME, indicating that this is the end.

On clock edge 7, another initiator can start a different transaction. This is also the turnaround cycle for the other control lines.

The equivalent read burst takes one more cycle, because the target must wait one cycle for the AD bus to turn around before it may assert TRDY.

On clock edge 6, the target indicates that it wants to stop (with data), but the initiator is already holding IRDY low, so there is a fifth data phase (clock edge 7), during which no data is transferred.

The PCI bus detects parity errors, but does not attempt to correct them by retrying operations; it is purely a failure indication.

Due to this, there is no need to detect the parity error before it has happened, and the PCI bus actually detects it a few cycles later.

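During a data phase, whichever device is driving the AD[31:0] lines computes even parity over them and the C/BE[3:0] lines, and sends it out on the PAR line one cycle later. The device listening on the AD bus checks the received parity and asserts the PERR (parity error) line one cycle after that.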
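
A minimal sketch of the parity check follows. It assumes, as in the PCI specification, even parity computed over AD[31:0] and C/BE[3:0], so that the number of 1 bits across those lines plus PAR is even; the function names are hypothetical and the one-cycle delays are not modeled.

```python
def par_bit(ad: int, cbe: int) -> int:
    """PAR value that makes the count of 1 bits across AD, C/BE and PAR even."""
    bits = ((cbe & 0xF) << 32) | (ad & 0xFFFF_FFFF)
    return bin(bits).count("1") & 1


def parity_error(ad: int, cbe: int, par: int) -> bool:
    """True if the receiver should report a parity error for this data phase."""
    return par_bit(ad, cbe) != (par & 1)


ad, cbe = 0x0000_0003, 0b0001
par = par_bit(ad, cbe)
print(par, parity_error(ad, cbe, par))
# A single flipped AD bit is detected, although it cannot be corrected.
print(parity_error(ad ^ 0x1, cbe, par))
```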

This generally generates a processor interrupt, and the processor can search the PCI bus for the device which detected the error.

The PERR line is only used during data phases, once a target has been selected. If a parity error is detected during an address phase (or the data phase of a Special Cycle), the devices which observe it assert the SERR (system error) line.

Due to the need for a turnaround cycle between different devices driving PCI bus signals, in general it is necessary to have an idle cycle between PCI bus transactions.

Additional timing constraints may come from the need to turn around the target control lines, particularly DEVSEL. The target deasserts DEVSEL, driving it high, in the cycle following the final data phase, which in the case of back-to-back transactions is the first cycle of the address phase.

One case where this problem cannot arise is if the initiator knows somehow (presumably because the addresses share sufficient high-order bits) that the second transfer is addressed to the same target as the previous one.

In that case, it may perform back-to-back transactions. All PCI targets must support this. It is also possible for the target to keep track of the requirements.

Targets which have this capability indicate it by a special bit in a PCI configuration register, and if all targets on a bus have it, all initiators may use back-to-back transfers freely.

A subtractive decoding bus bridge must know to expect this extra delay in the event of back-to-back cycles in order to advertise back-to-back support.

Starting from a 2.x revision, the PCI specification includes optional 64-bit support. This is provided via an extended connector which provides the 64-bit bus extensions, including AD[63:32]. The 64-bit PCI connector can be distinguished from a 32-bit connector by the additional 64-bit segment.

During a 64-bit burst, burst addressing works just as in a 32-bit transfer, but the address is incremented twice per data phase. The starting address must be 64-bit aligned; i.e., AD2 must be 0.

Note that a target may decide on a per-transaction basis whether to allow a 64-bit transfer. If REQ64 is asserted during the address phase, the initiator also drives the high 32 bits of the address and a copy of the bus command on the high half of the bus.

If the address requires 64 bits, a dual address cycle is still required, but the high half of the bus carries the upper half of the address and the final command code during both address phase cycles; this allows a 64-bit target to see the entire address and begin responding earlier.

The data which would have been transferred on the upper half of the bus during the first data phase is instead transferred during the second data phase.
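
The fallback from 64-bit to 32-bit data phases can be sketched as follows. The function name and the `ack64` flag are hypothetical; the sketch only shows how each 64-bit word is split when the target does not acknowledge the wide transfer.

```python
from typing import List


def data_phases(qwords: List[int], ack64: bool) -> List[int]:
    """Return the value carried in each data phase of a burst."""
    if ack64:
        # The target accepted the 64-bit transfer: one phase per 64-bit word.
        return list(qwords)
    phases = []
    for qword in qwords:
        phases.append(qword & 0xFFFF_FFFF)  # low half goes first
        phases.append(qword >> 32)          # upper half moves to the next data phase
    return phases


burst = [0x1111_2222_3333_4444]
print([hex(p) for p in data_phases(burst, ack64=True)])
print([hex(p) for p in data_phases(burst, ack64=False)])
```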

If ACK64 is missing, the initiator may cease driving the upper half of the data bus. It is only valid for address phases if REQ64 is asserted.

PCI originally included optional support for write-back cache coherence.

Because this was rarely implemented in practice, it was deleted from a later 2.x revision of the specification.

For example, many motherboards have x16 slots that are connected to x8, x4, or even x1 lanes.

With bigger slots it is important to know if their physical sizes really correspond to their speeds. Moreover, some slots may downgrade their speeds when their lanes are shared.

The most common scenario is on motherboards with two or more x16 slots. With several motherboards, there are only 16 lanes connecting the first two x16 slots to the PCI Express controller.

This means that when you install a single video card, it will have the full x16 bandwidth available, but when two video cards are installed, each will have only x8 bandwidth.

But a practical tip is to look inside the slot to see how many contacts it has. If you see that the contacts on a PCI Express x16 slot are reduced to half of what they should be, this means that even though this slot is physically an x16 slot, it actually has eight lanes (x8).

