What is deep learning?
Deep learning, a branch of machine learning (ML) and the wider field of artificial intelligence (AI), allows data scientists and hobbyists alike to use a computer to analyse data, recognise patterns and make predictions or adjustments, without having to write extra code for every new case.
Deep learning is growing in popularity and is most commonly found in the academic, fintech and industrial sectors, with major players like Google, Microsoft and similarly sized corporations taking an ever greater interest. These industries have been creating innovative uses for deep learning, ML and AI with the aid of custom-designed PCs or cloud computing. Though used mostly in enterprise settings, deep learning is now very popular with PC enthusiasts too, thanks to its increased accessibility and low barrier to entry. With a mid-range to high-end PC and a powerful graphics card (or two), deep learning is available to anyone who wants to explore this fascinating world of algorithms and data science.
Understanding the build
If you are building a PC for deep learning (AI or ML), you need to understand how the workload is handled by the PC and its components. You may have realised by now that GPUs can bear incredible compute loads, performing massively parallel calculations far faster than a CPU can. For deep learning builds, artificial intelligence and machine learning projects, however, the CPU still plays a significant part in the workflow, which we will also discuss below.
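To see that difference for yourself, here is a minimal Python sketch, assuming TensorFlow with GPU support is installed, that times the same matrix multiplication on the CPU and on the GPU. The exact figures will vary with your hardware, but the gap is typically an order of magnitude or more.

```python
import time
import tensorflow as tf

def time_matmul(device, size=4096, repeats=10):
    """Time a size x size matrix multiplication on the given device."""
    with tf.device(device):
        a = tf.random.normal((size, size))
        b = tf.random.normal((size, size))
        tf.matmul(a, b)  # warm-up run, excluded from the timing
        start = time.perf_counter()
        for _ in range(repeats):
            c = tf.matmul(a, b)
        _ = c.numpy()  # force execution to complete before stopping the clock
    return (time.perf_counter() - start) / repeats

print(f"CPU: {time_matmul('/CPU:0'):.4f} s per matmul")
if tf.config.list_physical_devices('GPU'):
    print(f"GPU: {time_matmul('/GPU:0'):.4f} s per matmul")
```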
Cloud or desktop workstation?
There are many situations where cloud compute performs better for training AI, but if you intend to save money in the long run, or prefer local compute for its data security and privacy, a deep learning PC might be the best solution for you.
If your AI project involves the government, civil or military defence, law enforcement or the medical sector, then local compute provides a level of data security that you control. Cost also favours local hardware: cloud computing often runs at around 2x to 4x the cost of a dedicated local system, when compared with Amazon Web Services P2 and P3 instances, for example.
The general consensus is that cloud compute is cost effective if you will be using it for less than a year; anything longer, and a desktop workstation becomes the more economical option.
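As a rough illustration, the break-even point is simple arithmetic. Every figure in this sketch is an assumption for the sake of example, not a quote; substitute your own build cost, instance pricing and workload.

```python
# Illustrative break-even estimate: cloud rental vs a one-off workstation.
workstation_cost = 3200.0   # assumed one-off build cost in GBP (this guide's build)
cloud_rate = 2.50           # assumed cloud cost per GPU-hour in GBP
hours_per_week = 30.0       # assumed average training hours per week

weekly_cloud_cost = cloud_rate * hours_per_week
break_even_weeks = workstation_cost / weekly_cloud_cost
print(f"Cloud spend matches the build cost after ~{break_even_weeks:.0f} weeks "
      f"(~{break_even_weeks / 52:.1f} years)")
```

At these assumed rates the workstation pays for itself in under a year; a lighter workload pushes the break-even point out, which is exactly the calculation worth doing before you buy.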
There are reasons you would opt for cloud computing without giving a deep learning PC a second thought, however. If you cannot accurately estimate the compute power you will need, and suspect it may exceed the capabilities of a local machine, then cloud compute is the safer choice. If you can quantify the size and scope of the project, then a purpose-built deep learning PC can be specified against those criteria.
A great source of information on deciding whether you need cloud or desktop compute is Tim Dettmers, a widely regarded expert in deep learning and natural language processing.
Read Tim's explanation of deep learning GPU choice here: Tim Dettmers' blog
AI training and inference - CPU or GPU?
Where the rubber meets the road for AI is the difference between training and inference. These are key points in the AI workflow and are the main considerations when building a workstation and/or estimating cloud compute needs.
Training on local machines is usually handled by GPUs due to the sheer compute power they offer, and the speed at which NVIDIA's CUDA cores and Tensor Cores can churn through the matrix operations behind frameworks such as the ubiquitous TensorFlow. Whilst the AMD camp has GPU compute capabilities of its own, NVIDIA is still unmatched in the AI and neural network department, and provides the best compatibility with deep learning libraries.
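As a minimal sketch of what GPU training looks like in practice (the model and data below are stand-ins, and TensorFlow with GPU support is assumed), note that on RTX cards it is mixed precision that engages the Tensor Cores:

```python
import numpy as np
import tensorflow as tf

print("GPUs visible to TensorFlow:", tf.config.list_physical_devices('GPU'))

# Mixed precision runs most maths in float16, engaging the Tensor Cores.
tf.keras.mixed_precision.set_global_policy('mixed_float16')

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    # Keep the output layer in float32 for numerical stability.
    tf.keras.layers.Dense(10, activation='softmax', dtype='float32'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Random stand-in data; swap in a real dataset for genuine training.
x = np.random.rand(1024, 784).astype('float32')
y = np.random.randint(0, 10, size=(1024,))
model.fit(x, y, epochs=1, batch_size=64)  # Keras places the work on the GPU automatically
```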
Inference is the use of the trained model in practice: presenting it with new data so it can make accurate predictions. For inference, a CPU can often handle the workload, and with Alder Lake's Deep Learning Boost, Intel have shown that predominantly consumer chips can offer measurable performance in data science.
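If you want to test CPU-only inference on a machine that also has a GPU, you can simply hide the GPU from TensorFlow. A minimal sketch, assuming a trained Keras model saved to an illustrative model/ directory:

```python
import numpy as np
import tensorflow as tf

# Hide any GPUs so everything below runs on the CPU.
tf.config.set_visible_devices([], 'GPU')

model = tf.keras.models.load_model('model/')       # illustrative path
sample = np.random.rand(1, 784).astype('float32')  # stand-in input
prediction = model.predict(sample)
print("Predicted class:", prediction.argmax(axis=-1))
```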
Inference takes place at device level and is far less compute-heavy, and further optimisation at this stage can reduce the compute requirements even more. Pragmatically speaking, this is good news for a student or hobbyist data scientist on a modest budget: any modern CPU should be able to handle this kind of workload, largely because of the intended application. For example, an AI project trained at massive scale on GPUs or cloud compute may ultimately do something as simple as telling apart images of car manufacturers, cat breeds or tree species. Applications like these are often intended to run on smartphone cameras or tablets, which means the inference stage should be capable of running on a correspondingly modest system.
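One common route to that kind of device-level deployment is post-training quantisation with TensorFlow Lite, which shrinks a trained model for phones and tablets. A sketch, again assuming an illustrative model/ SavedModel directory:

```python
import tensorflow as tf

# Convert a trained SavedModel to TensorFlow Lite with default optimisations,
# which quantise the weights and typically cut the model size by ~4x.
converter = tf.lite.TFLiteConverter.from_saved_model('model/')  # illustrative path
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('model_quantised.tflite', 'wb') as f:
    f.write(tflite_model)
```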
How you get from training to inference is key, then. Whilst your deep learning project will need extremely powerful compute (GPU or cloud) at the neural network modelling and training stage, the inference stage will need far less (CPU).
Hardware requirements for deep learning
For the rest of this article we will assume you have settled on a desktop workstation for your deep learning research or project. To give you the flexibility and elasticity you would expect in the cloud, and to allow for further expansion, we have included a motherboard that is somewhat "overkill" for a single GPU. This ensures you can upgrade to a dual-GPU setup if required, and that adequate cooling is accounted for. As your computational requirements increase, so too can your hardware.
CPU
AMD Ryzen 9 5950X 3.4 GHz 16-Core Processor
This really came down to two processors in terms of value and performance: the Intel Core i9-12900KS and the Ryzen 9 5950X. In deep learning circles, AMD has long been the champion of the student and the professor. Combining affordability with incredible productivity performance, the Ryzen 9 is virtually impossible to beat; AMD's pursuit of leadership in both gaming and productivity has produced one of the most highly regarded consumer workstation processors around.
TechPowerUp Ryzen 9 5950X Deep Learning Comparisons
The Core i9-12900KS, on the other hand, should be an ideal candidate for deep learning, as it houses Intel's Deep Learning Boost technology. TechPowerUp's tests show an interesting result: the CPU is a genuinely strong performer in training as well as inference. Even with this new technology, however, the Ryzen 9 5950X comes out ahead in two of the three benchmarks:
TechPowerUp Core i9-12900KS vs Ryzen 9 5950X Deep Learning Comparisons
GPU
Palit GeForce RTX 3090 24 GB GameRock Video Card
Ultimately, the graphics card you choose for a deep learning PC should have as high a CUDA core count, and as much memory, as possible to handle the computations you will be imposing on it. Alongside the RTX 2080, the RTX 3090 is one of the most recommended graphics cards for deep learning, and it is easy to see why.
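How much memory is "as much as possible"? A back-of-envelope estimate helps; the multipliers below are rule-of-thumb assumptions rather than exact figures, since real usage depends on batch size, precision and architecture.

```python
def training_vram_gb(params_millions, bytes_per_param=4,
                     optimizer_factor=3, activation_overhead=1.5):
    """Rough VRAM for training: weights + gradients + Adam moments
    come to ~4x the weights alone; activations add overhead on top."""
    weights_gb = params_millions * 1e6 * bytes_per_param / 1e9
    return weights_gb * (1 + optimizer_factor) * activation_overhead

# e.g. a 350-million-parameter model: roughly 8-9 GB before batch-size effects
print(f"~{training_vram_gb(350):.1f} GB")
```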
The RTX 2080, released in 2018, represented good value for those not using it predominantly for gaming; creators and data scientists were heavy users of this GPU, running dual or triple RTX 2080s in deep learning systems. Roll forward to 2021, and the RTX 3090 has become the poster child for the deep learning student and hobbyist, with a massive 10,496 CUDA cores versus the RTX 2080's 2,944.
RTX 3090 for deep learning applications
The RTX 3090 is also capable of scaling: add a second card and join the pair with an NVLink bridge, and your workstation effectively has 48 GB of memory, doubling the resources available when training large neural network models.
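One point worth noting: the most common way to use a second card is data-parallel training, which replicates the model on each GPU rather than pooling memory across them; the NVLink bridge mainly speeds up the gradient exchange between the cards. A minimal sketch using TensorFlow's MirroredStrategy, with a stand-in model and data:

```python
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # uses all visible GPUs by default
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created here are mirrored onto each GPU.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

x = np.random.rand(2048, 784).astype('float32')  # stand-in data
y = np.random.randint(0, 10, size=(2048,))
model.fit(x, y, epochs=1, batch_size=128)  # each batch is split across the GPUs
```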
A comprehensive compute review of the RTX 3090 by ServeTheHome.com said "This is the type of GPU launch that can yield transformational speedups in workflows", recording incredible performance against all other GPUs with recognised deep learning training capability.
"The new GeForce RTX 3090 setup smashes our single GPU results and approaches NVLink configurations"
- ServeTheHome.com
To further highlight the relative value of the RTX 3090 in deep learning situations, Lambda Labs recently tested a number of GPUs against the following test criteria:
- Computer Vision: ResNet50, SSD, Mask R-CNN
- Language: GNMT, Transformer-XL base, Transformer-XL large, BERT-base, BERT-large
- Speech Synthesis: Tacotron2, WaveGlow
- Recommendation System: NCF
The only GPUs that performed better than the RTX 3090 were:
- A100 80GB SXM4 - £10,000
- A100 80GB PCIe - £10,000
- A100 40GB SXM4 - £9,000
- RTX A6000 - £4,750
- A40 - £4,750
Of course, these represent the higher echelons of GPU, oriented predominantly towards visual computing, workstation and data science use. Compared with the cards above, the RTX 3090 is far closer to consumer pricing (and availability).
Motherboard
Asus ROG Crosshair VIII Dark Hero ATX AM4
The choice of motherboard came down to memory compatibility, multi-GPU support (it supports NVIDIA 2-Way SLI® technology) and expansion slots: 2 x PCIe 4.0 x16 (x16, or dual x8). The board also makes room for excellent cooling, should you ever expand beyond the single RTX 3090, and carries a VRM heatsink, an M.2 heatsink and a passive chipset heatsink.
Memory
G.Skill Trident Z RGB 32 GB (2 x 16 GB) DDR4-3600 CL17 Memory
Designed for gaming reliability, G.Skill memory provides dependable performance, and 32GB is more than enough for most data-modelling situations. That said, with a motherboard capable of taking up to 128GB of DDR4 RAM, there is plenty of scope to increase.
Storage
- Samsung 970 Evo Plus 1 TB M.2-2280 NVME Solid State Drive
- Seagate Barracuda Compute 8 TB 3.5" 5400RPM Internal Hard Drive
With fast solid-state storage for your most-used applications and a large-capacity 8TB drive for storing datasets, this is an ideal arrangement, but it can be swapped for something more modest if desired.
Cooling
Corsair iCUE H115i ELITE CAPELLIX 97 CFM Liquid CPU Cooler
Corsair's Capellix liquid coolers set a new standard, with industry magazines and reviewers applauding their cooling capability and noise management. Short of the more expensive H170i Elite 420mm arrangement, this cooler is perfect for the extended heavy loads a deep learning workstation will see.
Case
Corsair 4000D Airflow ATX Mid Tower Case
With virtually unequalled airflow performance, Corsair's aptly named Airflow cases offer complementary cooling throughout the chassis for the H115i Elite, and a huge amount of space for components.
PSU (Power Supply)
be quiet! Straight Power 11 1200 W 80+ Platinum Certified Fully Modular ATX Power Supply
With the build's power draw topping out at over 600W, a 1200W power supply unit may seem like overkill, but the be quiet! Straight Power 11 is made for performance and efficiency. We chose this PSU for absolute reliability, and the knowledge that even when worked hard and under pressure, your workstation will perform at its peak.
Deep Learning PC Cost: ~£3200
Full specification:
The spec comes in at under £3500, nicely under the £4000 budget. A deep learning PC with these core specs, and with outstanding power and cooling efficiency, is able to compete with cloud compute costs, and that is, of course, the overall objective.