Intel is unveiling details about the new graphics chip with which it plans to join the battle for the graphics card market against NVIDIA and ATI. The details describe the Larrabee architecture, which will be available as a product at the end of 2009 or the beginning of 2010.
It has long been known that Intel is planning its entry into the graphics card market, with the aim of breaking the hegemony of NVIDIA and ATI. The plan was preceded by the development and acquisition of technologies, such as the acquisition of the companies Havok and Neoptica during 2007, as well as the recruitment of professionals specializing in the field. At present, Intel holds the largest share of the graphics chip market thanks to its chipsets, which contain an integrated graphics core. Intel's integrated graphics, however, are weak and very limited compared to discrete video cards, and even compared to ATI's integrated chips. Intel does not intend to settle for that, and plans to make its mark on the big players' field as well.
This week at SIGGRAPH (short for Special Interest Group on GRAPHics and Interactive Techniques), the main event of the year in the field of graphics, held this year in Los Angeles, Intel officially presented a document describing the features and capabilities of a new multi-core architecture called Larrabee. Why an architecture and not a GPU? Because Larrabee cannot be called a GPU in the conventional sense; it is in fact a multi-core processor (CPU) that has been extensively optimized for parallel data processing. And if you're not convinced yet, note that Larrabee has very little hardware designed for a particular purpose and a lot of hardware designed to run general-purpose code - x86 code, no less.
Intel intends to introduce a video card based on the new architecture towards the end of 2009, or in 2010 at the latest. The card will have to compete with the NVIDIA GeForce and ATI Radeon models available then, that is, two generations from now. In addition, Intel is targeting Larrabee at the HPC market, to accelerate complex calculations on workstations, where it will compete with NVIDIA's Tesla and AMD's FireStream cards.
The competitors were quick to respond to the new threat. NVIDIA made sure to emphasize that it already supports running general code on its graphics cards, starting with the G8X series, through CUDA: a C compiler and a set of development tools that allow algorithms to be coded for execution on the GPU. AMD set its own direction earlier, with the acquisition of ATI, in the form of Fusion, and has now announced that a chip called Shrike (designed for laptops) will implement the Fusion vision at the beginning of 2010, around the time Larrabee arrives, and will be manufactured using 32 nanometer technology.
Some basic facts - Larrabee as a GPU
Intel first noted that the first product based on the Larrabee architecture will be a card designed to accelerate 3D graphics in games - confirmation that this is a standard video card that competes directly with ATI and NVIDIA products. The Larrabee chip will contain many cores: unlike the company's current processors with 4 cores, or the upcoming server processors with 8 cores, Larrabee will contain dozens of cores from the start and probably hundreds later on.
A software-based solution
Each basic processing unit (a single core) is derived from the original Pentium processor, extended with 64-bit processing capabilities and a 256-kilobyte second-level (L2) cache. The core is also equipped with an ultra-wide vector processing unit capable of handling 16 vector elements in parallel. Intel calculated how a given core area could be used most efficiently to obtain the maximum number of vector calculations in a given time, and concluded that many small, simple processors (Pentium, remember?) will yield 20 times more throughput than modern, complex processors with SSE and out-of-order execution.
Intel has decided to take a different route from ATI and NVIDIA, whose designs are based on hundreds of simple processing units capable of performing floating-point calculations, alongside dedicated, purpose-built hardware. Like the 3DLabs P10 card, which offered a similar approach years ago, Intel chose a software-based solution in which all the computational functions are performed by processors running standard code, thereby achieving maximum flexibility. The card can implement the existing DX10 or OpenGL interfaces in software, or alternatively run a completely different program written specifically for it.
A new architecture on an old core
Modern processors are very different from the first processors that powered personal computers, such as the 8088 that ran the original IBM XT at 4.77MHz. The new processors contain many complex features such as large caches, complex out-of-order execution pipelines, complex SIMD / MIMD instructions and, of course, clock speeds of 3GHz and more. Intel's decision not to develop an entirely new technology for Larrabee was surprising, but more surprising still is that the company chose to dig through its archives for a foundation and pull out the original Pentium, launched in 1993 and initially running at 60MHz. It is a processor that can execute two instructions at the same time (superscalar), contains a modest amount of cache in a single level only (L1 cache) and includes only 3.1 million transistors.
This old-fashioned core served as the basis for extensive changes that adapt it to the intensive floating-point calculations needed to create the 3D graphics the card produces. Intel has added a vector processing unit capable of performing 16 floating-point operations in parallel when each element is 32 bits wide. The unit can handle 32-bit floating-point and integer instructions, or 64-bit floating-point instructions. This gives each core significant processing power, along with a flexibility in allocating the unit that does not exist in the dedicated stream processors of ATI and NVIDIA.
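The lane arithmetic of such a vector unit can be modeled in a few lines. This is only an illustrative sketch, not Intel's instruction set: the names `VECTOR_WIDTH_BITS`, `lanes_for` and `vector_multiply_add` are hypothetical, the 512-bit register width is inferred from 16 lanes of 32 bits, and the resulting 8 lanes for 64-bit elements is an inference as well rather than a figure stated above.

```python
# Assumption: register width inferred from 16 lanes x 32 bits = 512 bits.
VECTOR_WIDTH_BITS = 16 * 32


def lanes_for(element_bits):
    """Return how many elements of the given width fit in one vector register."""
    if element_bits <= 0 or VECTOR_WIDTH_BITS % element_bits != 0:
        raise ValueError("element width must evenly divide the register width")
    return VECTOR_WIDTH_BITS // element_bits


def vector_multiply_add(a, b, c):
    """Model one vector instruction: every lane computes a*b + c in lockstep."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("all operands must have the same number of lanes")
    return [x * y + z for x, y, z in zip(a, b, c)]


if __name__ == "__main__":
    n = lanes_for(32)
    print(n, "lanes of 32 bits;", lanes_for(64), "lanes of 64 bits")
    print(vector_multiply_add([1.0] * n, [2.0] * n, [3.0] * n))
```

In this model a single call to `vector_multiply_add` stands for one instruction that a scalar core would need 16 separate multiply-add steps to match.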
The ring bus
Texture processing is a function that requires a lot of processing resources, which is why Intel chose to implement the Texture Logic unit in hardware and thus streamline the task. It is in fact one of the few units in the architecture that play a fixed, non-programmable role. Since very large amounts of data need to pass between the various units of the processor, a special bus was built that connects all the units and allows fast communication with relatively low latency. This is a 512-bit bidirectional ring bus - that is, a total width of 1024 bits. The bus is also used to maintain cache coherency, which helps keep processing as efficient as possible so that cores do not sit idle for lack of data in their caches.
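One reason a bidirectional ring keeps latency low is that a message can travel in whichever direction is shorter. The sketch below illustrates that idea only; the number of stops on the ring and the function names `ring_hops` and `worst_case_hops` are assumptions for illustration, not details published by Intel.

```python
RING_WIDTH_BITS_PER_DIRECTION = 512
TOTAL_RING_WIDTH_BITS = 2 * RING_WIDTH_BITS_PER_DIRECTION  # 1024 bits


def ring_hops(src, dst, num_stops):
    """Return the fewest hops between two stops on a bidirectional ring."""
    clockwise = (dst - src) % num_stops
    counter_clockwise = (src - dst) % num_stops
    return min(clockwise, counter_clockwise)


def worst_case_hops(num_stops):
    """With two directions, no destination is more than half the ring away."""
    return num_stops // 2


if __name__ == "__main__":
    stops = 8  # hypothetical ring size
    print(ring_hops(0, 6, stops))   # travels the short way round
    print(worst_case_hops(stops))
```

On a one-way ring the same worst case would be `num_stops - 1` hops, so the second direction roughly halves the longest path.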
One of the most interesting features of the new architecture is its scalability: performance increases as processing units are added. Intel claims that performance scales almost linearly with the number of cores, so doubling the number of cores in the chip will very nearly double its performance. This will, of course, become possible as manufacturing technologies advance, so that with the transition to 32 nanometer production Intel will be able to double the number of processing units per chip. If the claim holds, Larrabee has a clear advantage over the existing technologies of ATI and NVIDIA, which do not come close to a linear increase in performance relative to the number of processing units.
The key to high performance in parallel rendering algorithms is task assignment: breaking the work into as many tasks as possible and dividing them between the processors so that all are loaded equally, while minimizing synchronization points between tasks. Larrabee allows more flexibility in task allocation and execution than any graphics processor, thanks to its flexible memory model and the software's ability to control what is executed and how.
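The load-balancing goal described above can be illustrated with a generic greedy heuristic: hand each task, largest first, to whichever core currently has the least work. This is a textbook technique chosen for illustration, not Intel's scheduler; the function name `assign_tasks` and the task costs are hypothetical.

```python
import heapq


def assign_tasks(task_costs, num_cores):
    """Greedily assign tasks (largest first) to the least-loaded core.

    Returns (assignment, loads): task ids per core and total cost per core.
    """
    # Min-heap of (current_load, core_id): the least-loaded core is on top.
    heap = [(0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {core: [] for core in range(num_cores)}

    # Placing the most expensive tasks first tends to even out the loads.
    for task_id, cost in sorted(enumerate(task_costs), key=lambda t: -t[1]):
        load, core = heapq.heappop(heap)
        assignment[core].append(task_id)
        heapq.heappush(heap, (load + cost, core))

    loads = {core: sum(task_costs[t] for t in tasks)
             for core, tasks in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    assignment, loads = assign_tasks([8, 7, 6, 5, 4, 3, 2, 1], num_cores=4)
    print(assignment)
    print(loads)
```

Each task is placed exactly once and no core waits on another during assignment, which mirrors the aim of equal load with few synchronization points.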
A concept that often comes up in the graphics field whenever Larrabee is mentioned is ray tracing. It is not meant to replace the traditional method of building the image by computing and drawing polygons, but to complement it and add a realistic touch. Without going into the complex details, the basis of the method is to follow a ray of light leaving a light source in the scene and to calculate the path it travels and its reflections, which makes it possible to draw the accurate shadows and reflections that give the picture a realistic look. According to Intel, Larrabee will make ray-traced graphics practical thanks to the efficiency of its architecture: it is 4.7 times more efficient per clock cycle than Core 2 Duo technology in the type of calculations ray tracing requires.
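The two basic calculations behind following a ray, finding where it hits an object and computing the direction in which it bounces, can be sketched as follows. This is a minimal illustration using a sphere as the only object; the function names `intersect_sphere` and `reflect` are hypothetical, and a real renderer involves far more than this.

```python
import math


def intersect_sphere(origin, direction, center, radius):
    """Return the distance along the ray to the nearest hit, or None.

    Assumes `direction` is a unit-length vector.
    """
    oc = [o - c for o, c in zip(origin, center)]
    # Solve t^2 + b*t + c = 0 for the ray parameter t.
    b = 2.0 * sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None  # the ray misses the sphere
    sqrt_disc = math.sqrt(disc)
    for t in ((-b - sqrt_disc) / 2.0, (-b + sqrt_disc) / 2.0):
        if t > 1e-9:  # ignore hits behind the ray origin
            return t
    return None


def reflect(direction, normal):
    """Mirror a direction about a unit surface normal: d - 2(d.n)n."""
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    return [d - 2.0 * d_dot_n * n for d, n in zip(direction, normal)]


if __name__ == "__main__":
    t = intersect_sphere([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 5.0], 1.0)
    print("hit distance:", t)
    print("bounce:", reflect([0.0, 0.0, 1.0], [0.0, 0.0, -1.0]))
```

Every ray repeats these steps independently of the others, which is why the workload suits a chip with many cores and wide vector units.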
Software developers have two options for harnessing the power of Larrabee. The first, traditional option is to write code against the DirectX or OpenGL interfaces, which are supported for compatibility with existing applications. Intel must develop software that implements these interfaces so that the card is ready to run existing applications. The second option is to write an application in C / C++ that runs directly on the Larrabee hardware. The difference from NVIDIA's CUDA is that you can write any code with any compiler that targets x86, without having to rely on dedicated development tools and features developed by NVIDIA. The code simply runs as it would on a standard x86 machine of the kind every programmer is used to.
Because the DirectX / OpenGL interfaces are implemented in software, which also has to translate the execution instructions for the processing units that emulate the logical stages found in graphics cards, running such code involves a certain loss of performance. The most effective way to utilize Larrabee's resources will of course be to write software that runs directly on it.
Larrabee runs a micro operating system (microOS), which is in effect the software that ships with the card and is responsible for balancing the workload between the cores and deciding which mechanism is applied on each core. This software is loaded from disk into the card's memory at startup and can be updated to add features and improve performance, just like a graphics card driver. In addition, it will be possible to supply a microOS tailored to a particular application or game and load it as needed, enabling optimal performance for that software - a kind of application-specific work profile.
What to expect?
The current trend is that general-purpose processors are becoming more and more powerful while, on the other side, graphics processors are growing more sophisticated and acquiring considerable computing capabilities. In the long term, the two trends are expected to converge into a single powerful integrated product. Larrabee's software model clearly sets it apart from its competitors: while other graphics processors are slowly gaining general-purpose processing capabilities, Larrabee is already a general-purpose processor in every respect, running the x86 instruction set. The Larrabee drivers are supposed to hide this fact from the operating system, but there is no reason why, with minor changes, such a card could not in the future take part in all the tasks the computer performs.
The future is already here
Although Intel introduced only an architecture and not a specific chip, the first Larrabee, due in 2009, is expected to contain 32 cores and to be manufactured with the currently available 45 nanometer technology. During 2010, with the move to 32 nm production, the number of cores is expected to increase to 48. Intel did not disclose specific details about the products expected to appear, and it is not known exactly how many cores the different versions of the cards will have, how much memory they will contain, how much power they will consume, and so on. Nor, of course, is it known whether Larrabee, at least in its first version, will be strong enough to take on the ATI and NVIDIA graphics cards that will exist when it is released. But this is undoubtedly a refreshing change of direction from which the entire market can only benefit.