If you could buy a very small supercomputer that uses AI, how much would you be willing to spend to have one of those sitting right on your desk?
If your answer is something close to $6,999, then you're in luck! We're going to take a look at the NVIDIA DGX-SPARK, one of the most efficient supercomputers ever designed, all in a very small form factor. When I first saw this little supercomputer and looked closely at the price tag, I thought to myself, "Wow that seems expensive, but for $6,999 this must do some amazing things!" So let's just say my expectations were extremely high, while at the same time very, very low.
And yes, it does have the ability to perform many, many things depending on the software that you load onto it.
Recently, I was able to get an early look at the NVIDIA DGX-SPARK with the NVIDIA team during a Unpacking Event. There are multiple applications of this small but powerful supercomputer; it can be used for everything from developing enterprise business applications to machine learning and data mining to running your business for you, building your own virtual assistants and much more. If you're tired of engaging in the monthly fee cycle associated with Cloud AI or need to keep your data confidential in an enterprise setting or both, then you will want to read this article carefully to find out why you should consider purchasing the NVIDIA DGX-SPARK Supercomputer for your day-to-day use.
An Excessive Amount of Power Delivered to Your Work Bench
We will first present the enormous, unrefined figures, as they are truly monstrous. The NVIDIA DGX Spark is capable of delivering approximately 1,000 trillion AI operations each second. Yes, that is the correct way to write it. We are speaking about a total of one petaflop of AI computing power that has been fitted into a form that can sit on your desk.
However, the true magic takes place within the memory structure of the machine. The DGX Spark contains 128 GB of unified memory which is shared between the CPU and GPU thereby providing a method to transfer information quickly and efficiently between the CPU and GPU. If you have ever attempted to run large generative AI models on a standard workstation it is very likely you will have noticed that memory bottlenecks are your number one enemy. This system solves this problem completely by employing a unified memory system.
So what does this mean for your daily work/life? It means you can execute Massive Large Language Models (LLMs) (up to Two hundred (200) billion parameters) 100% Locally (without requiring an internet connection or a ping from an outside server) and without worry about having your proprietary information escaping into the internet.
Having Zero Friction to Run a Massive AI Model
My first question during unboxing was about the practical applications for businesses. Why would companies want something like this? Where could we use it?
To start with, we have automation and productivity. By running a local LLM, you can securely automate a huge amount of your email, generate your daily documents, and speed up your coding work extremely quickly. When we asked the NVIDIA team what the best LLM's would be in the locational environment, they gave us very reassuring answers.
Although a 30B (30 billion parameter) AI Model is super easy to run, they confirmed running a 70B (70 billion parameter) AI Model is "absolutely seamless". Just think about the reasoning capabilities of some of the best cloud-based AIs, but they are in your home or office. You get instantaneous answers, total control of your own data, and unlimited possibilities for creativity.
The Stacking Superpower
Things become crazy here. Let’s say for your particular, many business Enterprise-level workflows you need more than 128GB of unified memory. Perhaps you’re doing complicated agentic AI computations, possible large-scale collaborative video generation, etc.
That’s where the NVIDIA DGX Spark has a special trick. It’s stackable.
When you connect two of these units together, the amount of unified memory in the pool gets doubled to 256GB. If you need more than that, you can continue to stack DGX Sparks and connect up to four together. Meaning you can have an insane 512GB of unified memory, using multiple DGX Sparks together.
Right now the availability of high-level Enterprise-level hardware such as NVIDIA H100 and A100 GPCs is practically impossible to find. They just aren’t easily purchased for most developers. The DGX Spark is the solution for developers to have an AI computer on their desk and for business enterprise customers to have to access serious computing capability immediately without having wait in line to purchase server-grade data center chips.
Personalized API Server for Your Office
The DGX Spark is capable of much more than simply being used by one individual. One of the best ways to extract value from this $4699 device is by utilizing it as a central server for your whole business and allowing all your employees to connect and interact with it.
You can put the DGX Spark anywhere in your office, and all of your staff members will have the ability to connect to it at the same time. By using SSH access and APIs, multiple members of your staff will be able to utilize the processing power of the DGX Spark. In lieu of having to pay for API token calls whenever your staff uses an offsite AI company’s services; your employees can use the DGX Spark for all their local AI processing needs. Over the course of a year, the DGX Spark could save a busy development department a significant amount of money by providing local AI processing.
Regarding Audio, Video and Gaming
Due to its resemblance to high-end desktop computers, many people have wondered if this unit could also take voice commands and play games.
For audio, it does not have built-in speakers or microphones; however, the rear of the unit has a number of fast USB Type-C ports available to connect your favourite studio-quality microphones and great-sounding speakers so that you can run locally or use voice interactive AI.
As for gaming, the situation is even more complicated because inside the machine's architecture is heavily geared towards AI workloads and does not contain 'traditional' gaming GPU cores (e.g., real-time ray tracing); however, all of the same functions that were traditionally performed on a GPU can still be performed here and it is still an extremely powerful machine to act as a fantastic, universal PC for many different types of applications.
You’re one step closer to preparing for future upgrades. If you think you’ll need new replacement parts every couple of years, don’t be concerned. This hardware’s many terabytes of unified RAM and extensive AI compute capacity ensure that it will continue to function well into the foreseeable future; possibly for as long as a decade without requiring any replacements.