The PCI Express interface in a computer: types and purpose

If you ask which interface should be used for an NVMe-enabled SSD, anyone who knows what NVMe is will answer: PCIe 3.0 x4, of course! But they will probably struggle to justify the answer. At best we will hear that such drives support PCIe 3.0 x4 and that interface bandwidth matters. There is something to that, but the whole discussion only began once some drives, in some operations, started to feel cramped within "regular" SATA. Yet between its 600 MB/s and the (equally theoretical) 4 GB/s of PCIe 3.0 x4 lies an abyss filled with intermediate options. What if a single PCIe 3.0 lane is enough, given that it is already one and a half times faster than SATA 600? Fuel is added to the fire by controller manufacturers, who threaten to move budget products to PCIe 3.0 x2, and by the fact that many users simply do not have four free lanes. More precisely, the lanes exist in theory, but freeing them means reconfiguring the system or even changing hardware, which nobody wants to do. At the same time, people want to buy a top solid-state drive, and fear there will be no benefit from it at all, not even the moral satisfaction of good numbers in test utilities.

But is that really the case? In other words, do you need to focus exclusively on the drive's supported mode of operation, or can you compromise in practice? That is what we decided to check today. The check will be quick and makes no claim to being exhaustive, but the information obtained should be enough, it seems to us, at least to give you something to think about. In the meantime, let's briefly review the theory.

PCI Express: existing standards and their bandwidth

Let's start with what PCIe is and how fast this interface works. It is often called a "bus", which is somewhat misleading: there is no shared bus to which all devices are connected. In fact, there is a set of point-to-point connections (as in many other serial interfaces) with a controller in the middle and devices attached to it, each of which can itself be a hub of the next level.

The first version of PCI Express appeared almost 15 years ago. Since it was intended for use inside the computer (often within a single board), the standard could be made fast: 2.5 gigatransfers per second. Because the interface is serial and full duplex, a single PCIe lane (x1, effectively the atomic unit) provides data transfer at up to 5 Gbit/s. In each direction, however, only half of that is available, i.e. 2.5 Gbit/s, and this is the raw speed of the interface rather than the "useful" one: to improve reliability, each byte is encoded with 10 bits, so the theoretical bandwidth of one PCIe 1.x lane is approximately 250 MB/s in each direction. In practice, service information must also be transferred, so it is more accurate to speak of ≈200 MB/s of user data. At the time this not only covered the needs of most devices but also left a solid reserve: recall that PCIe's predecessor among mass system interfaces, the PCI bus, provided 133 MB/s. Even if we count not only the mass implementation but all PCI variants, the maximum was 533 MB/s, and that was for the whole bus, i.e. shared among all devices connected to it. Here, 250 MB/s per lane (PCI figures are usually quoted as raw rather than useful bandwidth) belongs to one device exclusively. For devices that need more, the standard provided from the start for aggregating several lanes into a single interface, in powers of two from 2 to 32, so the x32 variant defined by the standard could transfer up to 8 GB/s in each direction. In personal computers x32 was never used because of the complexity of designing and routing the corresponding controllers and devices, so 16 lanes became the practical maximum. That width was used (and still is) mainly by video cards, since most devices do not need so much. Many get by with a single lane, but some make good use of x4 and x8; staying on the storage topic, these include RAID controllers and SSDs.
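As an illustration (our own sketch, not part of the test methodology), the per-lane and aggregated figures above can be derived in a few lines of Python from the transfer rate and the 8b/10b encoding:

```python
# Rough PCIe 1.x bandwidth arithmetic (illustrative sketch, not test code).
TRANSFERS_PER_S = 2.5e9        # 2.5 gigatransfers per second, per lane, per direction
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b: every 8 data bits travel as 10 bits on the wire

def lane_mb_s() -> float:
    """Theoretical payload bandwidth of one PCIe 1.x lane, one direction, in MB/s."""
    return TRANSFERS_PER_S * ENCODING_EFFICIENCY / 8 / 1e6   # bits -> bytes -> MB

per_lane = lane_mb_s()
print(f"x1 : {per_lane:.0f} MB/s per direction")              # ~250 MB/s
for lanes in (2, 4, 8, 16, 32):                               # aggregation in powers of two
    print(f"x{lanes:<2}: {per_lane * lanes / 1000:.1f} GB/s per direction")  # x32 -> 8 GB/s
```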

Time did not stand still, and about 10 years ago the second version of PCIe appeared. The improvements were not limited to speed, but a step forward was taken there too: the interface moved to 5 gigatransfers per second while keeping the same encoding scheme, i.e. throughput doubled. It doubled again in 2010: PCIe 3.0 provides 8 (rather than 10) gigatransfers per second, but redundancy has decreased, since 128 bits are now encoded with 130 rather than 160 as before. On paper, PCIe 4.0 with the next doubling of speed is almost ready, but we are unlikely to see it widely in hardware any time soon. In fact, PCIe 3.0 is still used in many platforms alongside PCIe 2.0, because the extra performance is simply not needed for many applications. And where it is needed, the good old method of lane aggregation works. Only each lane has become four times faster over the years, so PCIe 3.0 x4 equals PCIe 1.0 x16, the fastest slot in a mid-2000s computer. This mode is supported by top SSD controllers, and it is the recommended one: if the opportunity exists, more headroom does not hurt. But what if it does not exist? Will there be problems, and if so, which ones? That is the question we have to answer.
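The generation-for-lanes equivalences mentioned here, which the test methodology below also relies on, follow from the same arithmetic; the Python sketch below is again our own illustration and assumes the nominal rates and encodings of the three generations:

```python
# Effective one-direction payload bandwidth per PCIe generation (illustrative sketch).
GENERATIONS = {
    "1.x": (2.5e9, 8 / 10),      # 2.5 GT/s, 8b/10b encoding
    "2.0": (5.0e9, 8 / 10),      # 5 GT/s, 8b/10b encoding
    "3.0": (8.0e9, 128 / 130),   # 8 GT/s, 128b/130b encoding
}

def link_gb_s(gen: str, lanes: int) -> float:
    rate, efficiency = GENERATIONS[gen]
    return rate * efficiency * lanes / 8 / 1e9   # bits -> bytes -> GB/s

# PCIe 3.0 x4 is roughly the PCIe 1.x x16 slot of a mid-2000s computer...
print(f"3.0 x4  ~ {link_gb_s('3.0', 4):.2f} GB/s")
print(f"1.x x16 ~ {link_gb_s('1.x', 16):.2f} GB/s")
# ...and 3.0 x1 ~ 2.0 x2 ~ 1.x x4, the equivalence used in the methodology below.
print(f"3.0 x1 ~ {link_gb_s('3.0', 1):.2f} GB/s, 2.0 x2 ~ {link_gb_s('2.0', 2):.2f} GB/s, "
      f"1.x x4 ~ {link_gb_s('1.x', 4):.2f} GB/s")
```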

Test Methodology

Testing with different versions of the PCIe standard is easy: almost all controllers allow you to use not only the version they support but all earlier ones as well. The number of lanes is trickier: we wanted to test configurations with one or two PCIe lanes directly. The Asus H97-Pro Gamer board we usually use, based on the Intel H97 chipset, does not support the full set, but in addition to the x16 "processor" slot (the one we normally use), it has another that works in PCIe 2.0 x2 or x4 modes. We took advantage of this trio, adding the PCIe 2.0 mode of the "processor" slot to see whether there is a difference. After all, in that case there are no intermediaries between the processor and the SSD, whereas with the "chipset" slot there is one: the chipset itself, which is in turn connected to the processor by what is effectively PCIe 2.0 x4. We could have added a few more modes, but the main part of the study was planned for a different system anyway.

The fact is that we decided to take this opportunity to also check one "urban legend": the belief that top-end processors are useful for testing drives. So we took the eight-core Core i7-5960X, a relative of the Core i3-4170 we usually use in these tests (Haswell-E and Haswell, respectively) but with four times as many cores. In addition, the Asus Sabertooth X99 board we had lying around is useful today because it has a PCIe x4 slot that can in fact also work as x1 or x2. On this system we tested three x4 modes (PCIe 1.0/2.0/3.0) from the processor, plus chipset PCIe 1.0 x1, PCIe 1.0 x2, PCIe 2.0 x1 and PCIe 2.0 x2 (in all cases chipset configurations are marked on the diagrams with "(c)"). Does it make sense to revisit the first version of PCIe now, considering that there is hardly a single board that supports only that version of the standard and can boot from an NVMe device? From a practical standpoint, no, but it is useful for verifying the a priori expected equivalence of PCIe 1.1 x4 = PCIe 2.0 x2 and the like. If the test shows that bus scalability matches theory, then it does not matter that we were unable to obtain the practically more relevant PCIe 3.0 x1/x2 connections: the first would be identical to plain PCIe 1.1 x4 or PCIe 2.0 x2, and the second to PCIe 2.0 x4. And those we do have.

As for software, we limited ourselves to Anvil's Storage Utilities 1.1.0: it measures the various low-level characteristics of drives quite well, and we need nothing else. On the contrary, any influence from other system components is highly undesirable, so for our purposes low-level synthetics have no alternative.

As a "working body" we used a 240 GB Patriot Hellfire. As it was found during testing, this is not a performance record holder, but its speed characteristics are quite consistent with the results. best SSD same class and same capacity. Yes, and there are already slower devices on the market, and there will be more of them. In principle, it will be possible to repeat the tests with something faster, however, as it seems to us, there is no need for this - the results are predictable. But let's not get ahead of ourselves, but let's see what we got.

Test results

While testing the Hellfire we noticed that its maximum speed in sequential operations can only be "squeezed out" by a multi-threaded load, which is itself worth keeping in mind: theoretical throughput is theoretical precisely because "real" figures, obtained by different programs in different scenarios, depend less on it than on those very programs and scenarios - when, of course, force majeure does not interfere. And force majeure is exactly what we are observing now: as noted above, PCIe 1.x x1 is ≈200 MB/s, and that is exactly what we see. Two PCIe 1.x lanes or one PCIe 2.0 lane is twice as fast, and that is what we see. Four PCIe 1.x lanes, two PCIe 2.0 lanes or one PCIe 3.0 lane should be twice as fast again, which was confirmed for the first two options, so the third is unlikely to differ. In principle, then, scalability is ideal, as expected: the operations are sequential and the flash handles them well, so the interface matters. For writing, the flash keeps up through PCIe 2.0 x4 (so PCIe 3.0 x2 will also do). Reading "can" go higher, but the last step already gives a factor of one and a half rather than the potential two. We also note that there is no noticeable difference between the chipset and processor controllers, nor between the platforms. LGA2011-3 is slightly ahead, but only slightly.

Everything is smooth and pretty. But it breaks no molds: the maximum in these tests is only a little over 500 MB/s, a level within reach even of SATA 600 or (in terms of today's testing) PCIe 1.0 x4 / PCIe 2.0 x2 / PCIe 3.0 x1. That's right: do not be afraid of budget controllers limited to PCIe x2, or of M.2 slots on some boards that offer only that many lanes (and only version 2.0 of the standard), when more is not needed. Sometimes even that much is not needed: the maximum results are achieved at a queue depth of 16 commands, which is not typical of mass software. More often the queue holds 1 to 4 commands, for which a single lane of the very first PCIe, or even the very first SATA, would suffice. There are, however, various overheads, so a fast interface is useful. An excessively fast one is, at most, simply not harmful.

In this test the platforms behave differently, and with a queue depth of one they behave fundamentally differently. The "trouble" is not that many cores are bad: they are barely used here, one at most, and not heavily enough for boost mode to kick in fully. So we are left with a difference of about 20% in core frequency and one and a half times in cache speed: in Haswell-E the cache runs at a lower frequency, not synchronously with the cores. In general, the top platform is useful only for squeezing out maximum IOPS in the most multi-threaded mode with a large queue depth. The pity is that, from the standpoint of practical work, this is a very spherical synthetic in a vacuum :)

For writes, the state of affairs does not fundamentally change, in every sense. But, amusingly, on both systems the fastest mode turned out to be PCIe 2.0 x4 in the "processor" slot. On both! And after multiple checks and re-checks. At this point you might wonder whether you really need all these new standards, or whether it is better not to hurry anywhere at all...

When working with blocks of different sizes, the theoretical idyll in which raising the interface speed always makes sense breaks down. The resulting numbers suggest a couple of PCIe 2.0 lanes would be enough, yet in reality performance in that case is lower than with PCIe 3.0 x4, though not by multiples. And here the budget platform beats the top one to a much greater extent. But it is precisely such operations that dominate application software, i.e. this diagram is the closest to reality. So there is nothing surprising in the fact that wide interfaces and fashionable protocols produce no "wow effect". More precisely, those migrating from mechanical drives will get one, but it is exactly the same effect that any solid-state drive with any interface would provide.

Total

To make the overall picture easier to take in, we used the overall score reported by the program (combined read and write), normalized to the PCIe 2.0 x4 "chipset" mode: at the moment it is the most widely available, found even on LGA1155 or AMD platforms without the need to "offend" the video card. Besides, it is equivalent to PCIe 3.0 x2, which budget controllers are preparing to adopt. And on the new AMD AM4 platform it is, again, exactly this mode that can be obtained without touching the discrete video card.

So what do we see? Using PCIe 3.0 x4 when possible is certainly preferable, but not mandatory: for a mid-range NVMe drive (in what is, for now, a top-tier segment) it adds literally 10% of performance. And even that comes from operations that are, on the whole, rarely encountered in practice. Why implement this mode, then? First, because the opportunity was there, and headroom costs nothing. Second, there are drives faster than our test Patriot Hellfire. Third, there are fields of activity where loads "atypical" for a desktop system are perfectly typical, and it is there that storage performance, or at least the ability to make part of the storage very fast, matters most. But to ordinary personal computers all of this is irrelevant.

In those, as we can see, using PCIe 2.0 x2 (or, equivalently, PCIe 3.0 x1) does not lead to a dramatic drop in performance: only 15-20%. And that is despite the fact that we thereby limited the controller's potential bandwidth by a factor of four! For many operations this throughput is sufficient. A single PCIe 2.0 lane, however, is no longer enough, so it makes sense for controllers to support PCIe 3.0: given the severe shortage of lanes in modern systems, that will serve well. An x4 width is also useful: even if the system lacks support for recent PCIe versions, it will still let the drive work at a reasonable speed (albeit slower than it potentially could), provided there is a sufficiently wide slot.

In principle, the large number of scenarios in which the flash memory itself is the bottleneck (yes, this is possible, and not only for mechanical drives) means that four lanes of the third PCIe version are only about 3.5 times faster on this drive than a single lane of the first, even though the theoretical throughput of the two configurations differs by a factor of 16. It does not follow, of course, that one should rush back to very slow interfaces; their time is gone forever. It is just that many capabilities of fast interfaces can only be realized in the future, or under conditions the ordinary user of an ordinary computer will never directly encounter (except for those who like to benchmark for bragging rights). And that is really all.

Introduction

Moore's law states that the number of transistors on a silicon chip that can be produced economically doubles roughly every two years. But do not assume that processor speed also doubles every couple of years. Many people hold this misconception, and users often expect exponential scaling of PC performance.

However, as you have probably noticed, the top processors on the market have been stuck between 3 and 4 GHz for six years now, and the computer industry has had to look for other ways to increase computing performance. One of the most important is maintaining a balance between platform components over the PCI Express bus, an open standard that lets high-speed video cards, expansion cards and other components exchange data. The PCI Express interface is just as important for performance scaling as multi-core processors: while dual-, quad- and six-core processors can only be fully loaded by thread-optimized applications, any program installed on your computer interacts in one way or another with components connected via PCI Express.


Many journalists and experts expected next-generation PCI Express 3.0 motherboards and chipsets to appear in the first quarter of 2010. Unfortunately, backward-compatibility issues delayed the release of PCI Express 3.0; half a year on, we are still waiting for official information about the publication of the new standard.

However, we spoke with the PCI-SIG (the Special Interest Group responsible for the PCI and PCI Express standards), which gave us some answers.

PCI Express 3.0: plans

Al Yanes, President and Chairman of PCI-SIG, and Ramin Neshati, Chairman of the PCI-SIG Serial Communications Workgroup, shared their current plans for the rollout of PCI Express 3.0.




On June 23, 2010, version 0.71 of the PCI Express 3.0 specification was released. Yanes stated that version 0.71 should fix the backward-compatibility issues that caused the initial delay. Neshati noted that the main compatibility problem was "DC wander", explaining that PCI Express 2.0 and earlier devices "did not provide the necessary zeros and ones" to comply with the PCI Express 3.0 interface.

Today, with the backward-compatibility issues resolved, PCI-SIG is ready to release baseline revision 0.9 "later this summer", and version 1.0 is expected to follow in the fourth quarter of this year.

Of course, the most intriguing question is when PCI Express 3.0 motherboards will hit store shelves. Neshati said he expects the first products to appear in the first quarter of 2011 (the "FYI" triangle in the roadmap picture).

Neshati added that there should be no silicon-level changes between versions 0.9 and 1.0 (that is, all changes will affect only software and firmware), so some products should reach the market before the final 1.0 specification. Products can already be certified for the PCI-SIG "Integrator's List" (the "IL" triangle), a variant of the PCI-SIG compliance logo.

Neshati jokingly named the third quarter of 2011 as the "Fry's and Buy" date (probably referring to Frys.com, Buy.com or Best Buy). That is, during this period we should expect a large number of products with PCI Express 3.0 support to appear in retail and online stores.

PCI Express 3.0: Designed for Speed

For end users, the main difference between PCI Express 2.0 and PCI Express 3.0 will be a significant increase in maximum bandwidth. PCI Express 2.0 has a signaling rate of 5 GT/s, which gives a throughput of 500 MB/s per lane. Thus the main PCI Express 2.0 graphics slot, which typically uses 16 lanes, provides up to 8 GB/s of bandwidth in each direction.

With PCI Express 3.0, these figures double. PCI Express 3.0 uses a signaling rate of 8 GT/s, which gives a throughput of 1 GB/s per lane. Thus the main video card slot will have up to 16 GB/s of bandwidth.

At first glance, increasing the signaling rate from 5 GT/s to 8 GT/s does not look like a doubling. However, the PCI Express 2.0 standard uses an 8b/10b encoding scheme, in which 8 bits of data are transmitted as 10-bit symbols so the receiver can recover the signal reliably. The result is 20% redundancy, i.e. a reduction in useful throughput.

PCI Express 3.0 moves to a much more efficient 128b/130b encoding scheme, eliminating the 20% redundancy. So 8 GT/s is no longer a "theoretical" speed; it is an actual rate comparable in useful throughput to a 10 GT/s signaling rate under 8b/10b encoding.
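To illustrate the point, here is a small Python sketch of our own that works out the per-lane payload rates under the two encoding schemes:

```python
# Per-lane payload rate under the two encoding schemes (illustrative sketch).
def payload_gb_s(gt_per_s: float, data_bits: int, coded_bits: int) -> float:
    """Useful payload rate in GB/s for one lane, one direction."""
    return gt_per_s * (data_bits / coded_bits) / 8   # bits -> bytes

pcie2 = payload_gb_s(5.0, 8, 10)          # PCIe 2.0: 5 GT/s, 8b/10b    -> ~0.5 GB/s
pcie3 = payload_gb_s(8.0, 128, 130)       # PCIe 3.0: 8 GT/s, 128b/130b -> ~0.985 GB/s
hypothetical = payload_gb_s(10.0, 8, 10)  # 10 GT/s under the old encoding -> 1.0 GB/s

print(f"PCIe 2.0 lane: {pcie2:.3f} GB/s")
print(f"PCIe 3.0 lane: {pcie3:.3f} GB/s (vs {hypothetical:.3f} GB/s for 10 GT/s @ 8b/10b)")
```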




We asked Yanes about devices that would need the speed boost. He replied that they would include "PLX switches, 40 Gb/s Ethernet controllers, InfiniBand, solid-state devices that are becoming more and more popular, and of course graphics cards." He added, "We haven't run out of innovation, it's not static, it's a continuous stream," paving the way for further improvements in future versions of the PCI Express interface.

Analysis: where will we use PCI Express 3.0?

Drives

AMD has already integrated SATA 6 Gb/s support into its 8-series chipset line, and motherboard manufacturers are adding USB 3.0 controllers. Intel is somewhat behind in this area, as its chipsets support neither USB 3.0 nor SATA 6 Gb/s (we have pre-production P67 motherboards in our lab, and they do have SATA 6 Gb/s support, but USB 3.0 will not arrive in this generation). However, as we have seen many times in the AMD vs. Intel rivalry, AMD innovation often spurs Intel on. Given the speeds of next-generation storage interfaces and peripherals, there is no need yet to migrate either technology to PCI Express 3.0: for both USB 3.0 (5 Gb/s) and SATA 6 Gb/s (no drives yet come close to the limits of this interface), a single second-generation PCI Express lane suffices.

Of course, when it comes to drives, the link between drives and controllers is only part of the story. Imagine an array of several SATA 6 Gb/s SSDs on the chipset: a RAID 0 array can potentially saturate the single second-generation PCI Express lane that most motherboard manufacturers use to connect the controller. So a few simple calculations will tell you whether USB 3.0 and SATA 6 Gb/s interfaces can really demand PCI Express 3.0 support.




As already mentioned, the USB 3.0 interface has a maximum speed of 5 Gb/s. But like PCI Express 2.1, USB 3.0 uses 8b/10b encoding, so the actual peak rate is 4 Gb/s. Divide by eight to convert bits to bytes and you get a peak throughput of 500 MB/s - exactly what a single lane of the current PCI Express 2.1 standard provides. SATA 6 Gb/s runs at 6 Gb/s, but it too uses 8b/10b encoding, which turns the theoretical 6 Gb/s into an actual 4.8 Gb/s. Convert that to bytes and you get 600 MB/s, or 20% more than a PCI Express 2.0 lane can handle.
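Here is that arithmetic as a small Python sketch (our own illustration), using the 8b/10b efficiency factor described above:

```python
# USB 3.0 and SATA 6 Gb/s payload throughput vs one PCIe 2.0 lane (illustrative sketch).
def effective_mb_s(line_rate_gbit: float) -> float:
    """Payload throughput in MB/s for an 8b/10b-encoded serial link."""
    return line_rate_gbit * (8 / 10) / 8 * 1000   # Gbit/s -> payload bits -> MB/s

usb3   = effective_mb_s(5.0)   # USB 3.0:           5 Gb/s -> 500 MB/s
sata6  = effective_mb_s(6.0)   # SATA 6 Gb/s:       6 Gb/s -> 600 MB/s
pcie20 = effective_mb_s(5.0)   # one PCIe 2.0 lane: 5 GT/s -> 500 MB/s

print(f"USB 3.0           : {usb3:.0f} MB/s")
print(f"SATA 6 Gb/s       : {sata6:.0f} MB/s "
      f"({(sata6 / pcie20 - 1) * 100:.0f}% above one PCIe 2.0 lane)")
print(f"PCIe 2.0, one lane: {pcie20:.0f} MB/s")
```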

The problem, however, is that even the fastest SSDs today cannot fully load a SATA 3 Gb/s connection. Peripherals do not come close to saturating USB 3.0, and the same can be said of the latest generation of SATA 6 Gb/s devices. So, at least today, the PCI Express 3.0 interface does not need these technologies to drive its adoption in the platform market. But let us hope that as Intel moves to next-generation NAND flash, speeds will rise and we will get devices that exceed what second-generation 3 Gb/s SATA ports can offer.

Video cards

We have conducted our own research into the impact of PCI Express bandwidth on graphics card performance: after PCI Express 2.0 entered the market, at the beginning of 2010, and again recently. As we found, it is very hard to load the x16 bandwidth currently available on PCI Express 2.1 motherboards: you need a multi-GPU configuration or an extreme high-end single-GPU graphics card to tell the difference between x8 and x16 connections.

We asked AMD and Nvidia to comment on the need for PCI Express 3.0: will this fast bus be needed to unlock the full performance potential of next-generation graphics cards? An AMD representative told us that the company cannot comment yet.



An Nvidia spokesman was more forthcoming: "Nvidia played a key role in the industry's development of PCI Express 3.0, which should double the throughput of the current-generation standard (2.0). When such significant increases in bandwidth occur, applications appear that can take advantage of them. Consumers and professionals alike will benefit from the new standard through increased graphics and computing performance in laptops, desktops, workstations and GPU-enabled servers."

Perhaps the key phrase is "applications appear that can take advantage of them." Nothing in the graphics world seems to be shrinking: displays are getting bigger, high definition is replacing standard definition, and game textures are getting more detailed and intriguing. Today we do not believe that even the latest high-end video cards need a 16-lane PCI Express 3.0 interface. But enthusiasts have seen history repeat itself year after year: advances in technology pave the way for new uses of "thicker pipes". Perhaps we will see an explosion of applications that make GPU computing mainstream. Or perhaps the performance hit that occurs when a video card runs out of onboard memory and pages to system memory will no longer be so noticeable in mass-market and low-end products. In any case, we will have to wait and see what innovations PCI Express 3.0 allows AMD and Nvidia to implement.

Motherboard Component Connections

AMD and Intel are always very reluctant to share information about the interfaces they use to connect chipset components or the logical "building blocks" in the north and south bridges. We know the speeds at which these interfaces operate, and that they are designed to avoid becoming bottlenecks as far as possible. Sometimes we know who produced a particular part of the system logic; for example, the SATA controller in AMD's SB600 was based on a Silicon Image design. But the technology used to build the bridges between components is often a blind spot. PCI Express 3.0 certainly looks like a very attractive solution here, much like the A-Link interface that AMD uses.

The recent appearance of USB 3.0 and SATA 6 Gb/s controllers on a large number of motherboards is also telling. Because the Intel X58 chipset does not natively support either technology, companies like Gigabyte have to integrate third-party controllers onto their motherboards using the available PCI Express lanes to connect them.

Gigabyte's EX58-UD5 motherboard supports neither USB 3.0 nor SATA 6 Gb/s. However, it does have an x4 PCI Express slot.




Gigabyte has replaced the EX58-UD5 with the new X58A-UD5, which supports two USB 3.0 ports and two SATA 6 Gb/s ports. Where did Gigabyte find the bandwidth for these two technologies? The company allocated a single PCI Express 2.0 lane to each controller, cutting back on expansion-card capacity but enriching the motherboard's functionality.

Aside from the addition of USB 3.0 and SATA 6 Gb/s, the only notable difference between the two motherboards is the removal of the x4 slot.




Will the PCI Express 3.0 interface, like the standards before it, allow future technologies and controllers to be added to motherboards before they appear in integrated form in the current generation of chipsets? We think it will.

CUDA and Parallel Computing

We are entering the era of desktop supercomputing. Our systems run GPUs built for massively parallel data processing, along with power supplies and motherboards that can host up to four video cards at once. Nvidia's CUDA technology turns a video card into a programmer's tool for computation not only in games but also in science and engineering applications. The programming interface has already proven itself in solutions for the corporate sector, including medical imaging, mathematics, and oil and gas exploration.




We asked OpenGL programmer Terry Welsh of Really Slick Screensavers for his opinion on PCI Express 3.0 and GPU computing. Terry told us: "PCI Express has taken a good leap, and I love that developers double the bandwidth whenever they need to, as with version 3.0. However, in the projects I work on I don't expect to see any difference. Most of my work involves flight simulators, and they tend to be limited by memory and hard drive I/O performance; the graphics bus is not a bottleneck at all. But I can easily foresee PCI Express 3.0 leading to significant advances in GPU computing for people doing scientific work with large amounts of data."




The ability to double data transfer rates for math-intensive workloads is certainly a motivation for developing CUDA and Fusion, and therein lies one of the most promising areas for the upcoming PCI Express 3.0 interface.

Any gamer with an Intel P55 chipset can list the advantages and disadvantages of the P55 compared with the Intel X58. Advantage: most P55-based motherboards are more reasonably priced than X58-based models (in general, of course). Disadvantage: the P55 offers minimal PCI Express connectivity; the main burden falls on the Intel Clarkdale and Lynnfield processors, which have 16 second-generation PCIe lanes in the CPU itself. The X58, meanwhile, boasts 36 PCI Express 2.0 lanes.

P55 buyers who wish to use two graphics cards will have to connect them via x8 lanes each. If you want to add a third video card to an Intel P55 platform, you will have to use the chipset's lanes, which are unfortunately limited to first-generation speed, and the chipset can allocate at most four lanes to an expansion slot.

When we asked PCI-SIG's Al Yanes how many lanes we can expect in PCI Express 3.0-enabled chipsets from AMD and Intel, he replied that this is "private information" that he "cannot disclose". Of course, we did not expect an answer, but the question was worth asking anyway. Still, it is unlikely that AMD and Intel, both of which sit on the PCI-SIG Board of Directors, would invest time and money in PCI Express 3.0 if they planned to use the new standard simply as a means of reducing lane counts. It seems to us that future AMD and Intel chipsets will remain segmented as they are today: high-end platforms will have enough connectivity for a pair of video cards with full x16 interfaces, while mainstream chipsets will have their lane counts cut.

Imagine a chipset like the Intel P55 but with 16 PCI Express 3.0 lanes available. Since those 16 lanes are twice as fast as PCI Express 2.0, we get the equivalent of 32 lanes of the old standard. In that situation it would be up to Intel whether to make the chipset compatible with 3-way and 4-way GPU configurations. Unfortunately, as we already know, the next-generation Intel P67 and X68 chipsets will be limited to PCIe 2.0 support (and Sandy Bridge will likewise be limited to 16 lanes per chip).

Apart from CUDA/Fusion parallel computing, we also see potential for the mainstream market in the improved component-interconnect speeds of PCI Express 3.0. Without a doubt, PCI Express 3.0 will bring to low-cost motherboards capabilities that in the previous generation were available only on high-end platforms. And high-end platforms with PCI Express 3.0 at their disposal will allow new performance records to be set with innovations in graphics, storage and networking technologies that can use the available bus bandwidth.

In this article, we will explain the reasons for the success of the PCI bus and describe the high-performance technology that is replacing it: the PCI Express bus. We will also look at its history, the hardware and software layers of PCI Express, the features of its implementation, and its advantages.

When the PCI bus appeared in the early 1990s, its technical specifications significantly outperformed all the buses that existed up to that point, such as ISA, EISA, MCA and VL-bus. At that time, the PCI bus (Peripheral Component Interconnect), running at 33 MHz, was well suited to most peripherals. But today the situation has changed in many ways. First of all, processor and memory clock speeds have increased significantly: processor frequencies have grown from 33 MHz to several GHz, while PCI's operating frequency has only risen to 66 MHz. The emergence of technologies such as Gigabit Ethernet and IEEE 1394b threatened that the entire bandwidth of the PCI bus could be consumed by a single device based on these technologies.

At the same time, the PCI architecture has a number of advantages over its predecessors, so it was not rational to discard it completely. First of all, it does not depend on the processor type, and it fully supports buffer isolation, bus mastering and Plug and Play. Buffer isolation means that the PCI bus operates independently of the internal processor bus, allowing the processor bus to function regardless of the speed and load of the system bus. Thanks to bus mastering, peripheral devices can directly control data transfers on the bus rather than waiting for help from the CPU, which would affect system performance. Finally, Plug and Play support allows devices to be configured automatically, avoiding the fuss with jumpers and switches that made life miserable for owners of ISA devices.

Despite PCI's undoubted success, it now faces serious problems, among them limited bandwidth, the lack of real-time data transfer features, and the lack of support for next-generation networking technologies.

Comparative characteristics of various PCI standards

It should be noted that actual throughput may be lower than theoretical due to the nature of the protocol and the bus topology. In addition, the total bandwidth is shared among all connected devices, so the more devices sit on the bus, the less bandwidth each of them gets.
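As a rough illustration of this sharing effect, here is a small Python sketch of our own, with hypothetical device counts: on a classic shared PCI bus the per-device share shrinks as devices are added, whereas each PCI Express link keeps its bandwidth to itself.

```python
# Shared PCI bus vs dedicated PCI Express links (illustrative sketch, hypothetical device counts).
PCI_TOTAL_MB_S = 133     # classic 32-bit / 33 MHz PCI: one pool shared by every device on the bus
PCIE1_LANE_MB_S = 250    # one PCIe 1.x lane: dedicated to a single device

for devices in (1, 2, 4, 8):
    shared = PCI_TOTAL_MB_S / devices    # the PCI pie gets divided
    print(f"{devices} device(s): PCI ~{shared:.0f} MB/s each, PCIe x1 {PCIE1_LANE_MB_S} MB/s each")
```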

Improvements to the standard such as PCI-X and AGP were designed to eliminate its main drawback: low clock speed. However, the higher clock frequencies of these implementations reduced the effective bus length and the number of connectors.

The new generation of the bus, PCI Express (PCI-E for short), was first introduced in 2004 and was designed to solve all the problems its predecessor faced. Today most new computers are equipped with a PCI Express bus. Although they still have standard PCI slots as well, the time is not far off when that bus will be history.

PCI Express Architecture

The bus architecture has a layered structure, as shown in the figure.

The bus supports the PCI addressing model, which allows all existing drivers and applications to work with it. In addition, PCI Express uses the standard Plug and Play mechanism provided by the previous standard.

Consider the purpose of the various layers of PCI-E. At the software layer, read/write requests are generated and then transmitted at the transport layer using a special packet protocol. The data link layer is responsible for error-correcting coding and ensures data integrity. The basic hardware layer consists of a dual-simplex channel made up of a transmit pair and a receive pair, collectively called a link. The overall rate of 2.5 Gbit/s means that the throughput of each PCI Express lane is 250 MB/s in each direction. Taking protocol overhead into account, about 200 MB/s is available to each device. This is two to four times the bandwidth available to PCI devices. And, unlike PCI, the bandwidth is not divided among all devices: each device receives it in full.

To date, there are several versions of the PCI Express standard, which differ in their bandwidth.

PCI Express x16 bandwidth for different versions of PCI-E, in Gbit/s (one direction / both directions), is listed below; the sketch after the list recomputes these figures:

  • PCI-E 1.0: 32/64
  • PCI-E 2.0: 64/128
  • PCI-E 3.0: 128/256
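A minimal Python sketch of our own, assuming the nominal per-lane rates and encodings given earlier in the article, reproduces these numbers (the 3.0 row comes out at roughly 126/252 Gbit/s of payload, which the list above rounds to 128/256):

```python
# PCI Express x16 payload bandwidth per version, Gbit/s (one direction / both directions).
VERSIONS = {
    "1.0": (2.5, 8 / 10),      # GT/s per lane, encoding efficiency
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),
}

for version, (rate, efficiency) in VERSIONS.items():
    one_way = rate * efficiency * 16    # payload Gbit/s over 16 lanes, one direction
    print(f"PCI-E {version} x16: {one_way:.0f} / {2 * one_way:.0f} Gbit/s")
```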

PCI-E bus formats

Various PCI Express form factors are currently available, depending on the target platform: desktop computer, laptop or server. Servers, which need more bandwidth, have more PCI-E slots, and those slots have more lanes. By contrast, laptops may have just one lane for medium-speed devices.

Video card with PCI Express x16 interface.

PCI Express expansion cards are very similar to PCI cards, but PCI-E connectors grip more tightly, so a card will not slip out of its slot due to vibration or during shipping. There are several form factors of PCI Express slots, whose size depends on the number of lanes used. For example, a bus with 16 lanes is designated PCI Express x16. Although the total number of lanes can reach 32, in practice most motherboards today are equipped with a PCI Express x16 bus.

Smaller form-factor cards can be plugged into larger form-factor slots without loss of performance: for example, a PCI Express x1 card can be plugged into a PCI Express x16 slot. As with the PCI bus, a PCI Express extender can be used to connect devices if necessary.

The appearance of various connector types on a motherboard. From top to bottom: PCI-X slot, PCI Express x8 slot, PCI slot, PCI Express x16 slot.

ExpressCard

The ExpressCard standard offers a very simple way to add hardware to a system. The target market for ExpressCard modules is laptops and small PCs. Unlike traditional desktop expansion cards, an ExpressCard can be connected to the system at any time while the computer is running.

One popular related format is the PCI Express Mini Card, designed as a replacement for Mini PCI form-factor cards. A card in this format supports both PCI Express and USB 2.0. Its dimensions are 30×56 mm, and it connects to a PCI Express x1 link.

Benefits of PCI-E

PCI Express technology offers advantages over PCI in the following five areas:

  1. Higher performance. With just one lane, PCI Express throughput is twice that of PCI, and throughput grows in proportion to the number of lanes, of which there can be up to 32. An added advantage is that data can be transmitted in both directions simultaneously.
  2. Simplified I/O. PCI Express takes advantage of buses such as AGP and PCI-X while offering a less complex architecture and a comparatively simple implementation.
  3. Layered architecture. PCI Express offers an architecture that can adapt to new technologies without requiring significant software upgrades.
  4. Next-generation I/O technologies. PCI Express provides new capabilities for data delivery through isochronous data transfer, which ensures information arrives in a timely manner.
  5. Ease of use. PCI-E greatly simplifies system upgrades and expansion by the user. Additional formats such as ExpressCard greatly increase the ability to add high-speed peripherals to servers and laptops.

Conclusion

PCI Express is a bus technology for connecting peripherals that replaces technologies such as ISA, AGP and PCI. Its use significantly increases computer performance as well as the user's ability to expand and upgrade the system.

Brief history...

The first separate interface designed to replace the PCI bus for video cards was introduced in 1997. AGP (Accelerated Graphics Port) is what Intel called its new development, presented simultaneously with the official announcement of a chipset for Intel Pentium II processors.

The claimed advantages of AGP over its predecessor PCI were significant:

  • a higher operating frequency (66 MHz);
  • increased bandwidth between the video card and the system bus;
  • direct transfer of data between the video card and RAM, bypassing the processor;
  • an improved power delivery system;
  • high-speed access to shared memory.

The AGP 1x standard (AGP 1.0 specification) did not gain traction because of its low memory-access speed and was almost immediately improved, its speed being doubled; this is how the AGP 2x interface appeared. Transmitting 32 bits (4 bytes) per clock, an AGP 2x port could deliver a peak throughput of 66.6 × 4 × 2 = 533 MB/s, unprecedented at the time.

In 1998, the AGP 4x standard (AGP 2.0 specification) was released, providing transfer of up to 4 blocks of data per clock cycle. At the same time, the port's signaling voltage was reduced from 3.3 to 1.5 V. The maximum throughput of AGP 4x was about 1 GB/s. After that, development of the specifications slowed: the reason was the low speed of the video accelerators of the time, as well as the low speed of exchange with RAM.

As soon as technical progress ran up against a bus that had become too narrow for the huge data flows of modern video cards, a new standard was approved: AGP 8x (AGP 3.0 specification). As you may have guessed, it can transfer up to 8 blocks of data per clock and has a peak bandwidth of 2 GB/s. The AGP 8x bus is backward compatible with AGP 4x.
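The arithmetic behind these AGP figures can be condensed into a short Python sketch (our own illustration; the 66.6 MHz clock and 4-byte transfers are taken from the text above):

```python
# Peak AGP bandwidth per mode: 66.6 MHz clock, 32-bit (4-byte) transfers (illustrative sketch).
CLOCK_MHZ = 66.6
BYTES_PER_TRANSFER = 4

for mode, transfers_per_clock in (("AGP 1x", 1), ("AGP 2x", 2), ("AGP 4x", 4), ("AGP 8x", 8)):
    peak_mb_s = CLOCK_MHZ * BYTES_PER_TRANSFER * transfers_per_clock
    print(f"{mode}: ~{peak_mb_s:.0f} MB/s")   # 2x ~533 MB/s, 4x ~1 GB/s, 8x ~2 GB/s
```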

The high-tech industry always moves forward. The volumes of data to be transferred keep growing, textures and their quality keep improving, and all this forces every manufacturer to shake things up and release something new (a standard, a specification, a protocol, an interface) that opens a new round of development in the field.

Officially, the first base PCI Express specification appeared in July 2002, marking the start of the gradual departure of AGP 8x...

Introduction

At the moment, the modern Intel P45/X48 chipsets officially support the PCI Express 2.0 specification, which the very common Intel P35 could not boast of. For those only now planning to buy a modern board for the Intel platform, the choice is quite obvious (the P45/X48), and the dilemma of whether PCI Express 1.1 is "enough or not enough" for a current high-end or mid-range video card does not arise. But what about owners of P35 boards? Should they run back to the store?

In today's material we will try to dot the i's regarding the advantages of PCI-E 2.0 over PCI-E 1.1 for modern accelerators. We will also experimentally analyze the performance of video cards with the different interfaces, and on that basis draw a conclusion about the practical value of PCI-E 2.0.

Before proceeding to any objective tests, let's delve a little into the theory, namely, how it all works.

PCI Express: briefly about the main points

As mentioned above, the base PCI Express specification appeared in July 2002. Thanks to its high speed and peak throughput, the PCI Express bus leaves its predecessor AGP no chance. In its programming model the new PCI-E interface is in many ways similar to PCI, which makes it easy to adapt the existing fleet of devices to the new interface without significant software changes.

PCI Express operates on the principle of serial data transfer. The bus is a packet-switched network with a star topology. Devices communicate over a bidirectional point-to-point connection called a link. Each PCI Express link can consist of one lane (x1) or multiple lanes (x4, x16, etc.).

For a basic PCI Express x1 configuration, the theoretical throughput is 250 MB/s in each direction (transmit/receive). Accordingly, for PCI-E x16 this figure is 250 MB/s × 16 = 4 GB/s.

Notably, on the physical side the interface allows, for example, any board with a PCI-E x1 interface to work reliably not only in its own slot but in any wider PCI Express slot (x4, x16, etc.). The number of lanes actually used depends only on the device's capabilities.

As with all high-speed protocols, noise immunity is a pressing issue. PCI Express addresses it with the well-known 8b/10b scheme, i.e. redundant traffic: every 8 bits of data are transmitted over the channel as 10 bits, generating additional overhead of about 20% of the total "flow".

PCI Express 2.0

The standard was officially approved on January 15, 2007. The second revision of PCI Express significantly increased the throughput of a single lane, up to 5 Gbit/s (versus 2.5 Gbit/s for PCI Express 1.x). This means that for an x16 link the maximum data transfer rate can now reach 8 GB/s in each direction, versus 4 GB/s for the old PCI Express 1.x.

A notable fact is that PCI Express 2.0 is fully compatible with PCI Express 1.1. In practice this means that old video cards will work without problems in motherboards with new connectors, and new video adapters will work without problems in old PCI Express 1.x connectors.

Perhaps that is enough of the theory and main features of PCI Express; it is time to move on to the tests, which we will do a little further down. For now, let's get acquainted with the test participants in detail.

About test participants

Unfortunately, we were not able to cover a larger set of graphics accelerators at the time of testing, which we will certainly remedy in the future. Low-end video cards were deliberately excluded from the tests, since they are of little use in high-resolution modes (above 1280x1024) with maximum image detail, where the advantages of PCI-E 2.0 over the earlier PCI-E 1.1 can be revealed.

The video cards tested:

  • Point Of View GeForce GTX 280
  • POV GeForce 9600 GT 512 MB Extreme Overclock
  • Palit HD 4850 Sonic

I have been asked this question more than once, so now I will try to answer it as clearly and briefly as possible. To do so, I will show pictures of the PCI Express and PCI expansion slots on a motherboard for better understanding and, of course, point out the main differences in their characteristics. In other words, very soon you will find out what these interfaces are and how they look.

So, to begin with, let's briefly answer the question of what PCI Express and PCI actually are.

What is PCI Express and PCI?

PCI is a parallel computer I/O bus for connecting peripherals to the motherboard. PCI is used to connect video cards, sound cards, network cards, TV tuners and other devices. The PCI interface is outdated, so you are unlikely to find, for example, a modern video card that connects via PCI.

PCI Express (PCIe or PCI-E) is a serial computer I/O bus for connecting peripherals to the motherboard. It uses a bidirectional serial connection that can have several lanes (x1, x2, x4, x8, x12, x16 and x32); the more lanes, the higher the throughput of the PCI-E bus. The PCI Express interface is used to connect devices such as video cards, sound cards, network cards, SSDs and others.

There are several versions of the PCI-E interface: 1.0, 2.0 and 3.0 (version 4.0 will be released soon). The interface is usually designated, for example, as PCI-E 3.0 x16, which stands for PCI Express version 3.0 with 16 lanes.

As for whether, for example, a video card with a PCI-E 3.0 interface will work in a motherboard that supports only PCI-E 2.0 or 1.0: the developers say everything will work, but keep in mind that throughput will be limited by the motherboard's capabilities. So in this case I don't think it is worth overpaying for a video card with a newer PCI Express version (unless you are buying for the future, i.e. planning to get a new motherboard with PCI-E 3.0). The reverse also works: suppose your motherboard supports PCI Express 3.0 and the video card is, say, version 1.0; this configuration should also work, but only with PCI-E 1.0 capabilities, and in that case there is no real limitation, since the video card will be working at the limit of its own capabilities anyway.

Differences between PCI Express and PCI

The main difference in characteristics is, of course, bandwidth: for PCI Express it is much higher. For example, PCI at 66 MHz has a bandwidth of 266 MB/s, while PCI-E 3.0 x16 offers roughly 32 GB/s in total across both directions.

The interfaces also differ externally, so you cannot, for example, plug a PCI Express video card into a PCI expansion slot. PCI Express connectors with different numbers of lanes also differ; I will show all this in the pictures below.

PCI Express and PCI expansion slots on motherboards

PCI and AGP slots

PCI-E x1, PCI-E x16 and PCI slots

PCI Express interfaces on video cards

That's all I have for now!


