Traditional SANs stack up arrays, controllers, and fabric gear that inflate budgets and slow projects. A Virtual SAN (VSAN) aggregates local NVMe, SSD, and HDD across hosts and exposes a shared datastore to the hypervisor. You keep standard x86 servers, add high-speed NICs, and run clustered storage without a separate array or Fibre Channel fabric.
The savings show up fast. No external array means fewer support contracts, less rack space, and lower power draw. Commodity servers and off-the-shelf drives cut acquisition costs. Scaling is granular: add a node when you need capacity or IOPS instead of planning a risky forklift refresh.
Operations get simpler as well.
Compute and storage live in one cluster, which shrinks the surface area you patch, monitor, and troubleshoot. Policies such as replica count, striping, and QoS sit at the VM level, and the platform handles placement and self-healing. When a disk or host fails, rebuilds fan out in parallel from multiple peers, shortening the recovery window.
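To see why parallel rebuilds shorten the recovery window, here is a rough back-of-the-envelope sketch. It assumes rebuild work splits evenly across peers and that each peer sustains a fixed rebuild throughput; both are simplifications, and the function name and the numbers are hypothetical rather than taken from any product.

```python
def rebuild_hours(data_tb: float, peers: int, per_peer_mb_s: float) -> float:
    """Estimate hours to re-replicate `data_tb` terabytes of data.

    Assumes (hypothetically) that the work is split evenly across `peers`
    hosts and that each sustains `per_peer_mb_s` MB/s of rebuild traffic.
    """
    if peers < 1 or per_peer_mb_s <= 0:
        raise ValueError("peers must be >= 1 and throughput must be positive")
    total_mb = data_tb * 1_000_000           # decimal TB -> MB
    seconds = total_mb / (peers * per_peer_mb_s)
    return seconds / 3600


if __name__ == "__main__":
    # Illustrative numbers only: 8 TB to rebuild at 500 MB/s per peer.
    for peers in (1, 2, 4):
        print(f"{peers} peer(s): {rebuild_hours(8, peers, 500):.2f} h")
```

In this simple model the window shrinks linearly with the number of peers. Real clusters usually throttle rebuild traffic to protect foreground I/O, so treat the result as a lower bound.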
Here’s what this looks like in the field. A retail chain runs two compact nodes per store with a lightweight witness back at HQ. Domain services, POS, and video stay local; if one node drops mid-day, checkout continues and the store barely notices.
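The two-node-plus-witness layout works because of majority voting. The sketch below is a simplified model, not any vendor's algorithm: the member names are hypothetical, each member gets one vote, and a partition may serve I/O only if it sees a strict majority of votes and contains at least one data node.

```python
from typing import Iterable

# Hypothetical membership for one store: two data nodes plus a remote witness.
VOTERS = frozenset({"node-a", "node-b", "witness"})
DATA_NODES = frozenset({"node-a", "node-b"})


def can_serve(partition: Iterable[str]) -> bool:
    """Return True if the members in `partition` may keep serving I/O."""
    visible = VOTERS & set(partition)
    has_majority = len(visible) > len(VOTERS) // 2   # strict majority of votes
    has_data = bool(visible & DATA_NODES)            # the witness holds no data
    return has_majority and has_data


if __name__ == "__main__":
    print(can_serve({"node-a", "witness"}))   # node-b is down
    print(can_serve({"node-b"}))              # node-b is isolated
```

With this rule, a surviving node that still reaches the witness keeps running, while an isolated node stops. That is what prevents both halves from accepting writes at once.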
An SMB replaces an aging array with a three-node pod that starts around 30–40 TB usable; growth to five nodes is a routine rack-and-stack when new databases arrive. For VDI or GPU inference, NVMe cache on each host keeps hot blocks close to CPU or GPU, so adding a node boosts both storage throughput and session capacity in one move.
Getting the design right matters.
Start with the network. For mixed workloads, 25 GbE per host is a solid baseline, with dedicated ports for storage traffic; thin pipes turn into noisy neighbors under load. On the write path, prefer NVMe write logging with end-to-end checksums; pair small, durable cache devices with larger QLC capacity tiers to avoid write amplification pain.
Model fault domains across racks and power feeds, and use a witness for two-node sites to avoid split-brain. Size for 12–18 months of growth, keep a few empty bays per chassis, and plan to add nodes before the cluster sits at 75–80% utilization.
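The utilization guidance can be turned into a simple runway estimate. The sketch below assumes linear growth, which is a simplification, and uses hypothetical names and numbers. It reports how many whole months remain before the cluster crosses the chosen threshold.

```python
import math


def months_until_threshold(used_tb: float, usable_tb: float,
                           monthly_growth_tb: float,
                           threshold: float = 0.75) -> int:
    """Whole months before utilization exceeds `threshold`.

    Assumes linear growth of `monthly_growth_tb` per month (a simplification).
    """
    if usable_tb <= 0 or monthly_growth_tb <= 0:
        raise ValueError("capacity and growth must be positive")
    headroom_tb = usable_tb * threshold - used_tb
    if headroom_tb <= 0:
        return 0                               # already at or past the threshold
    return math.floor(headroom_tb / monthly_growth_tb)


if __name__ == "__main__":
    # Illustrative: 40 TB usable, 20 TB used, growing 0.5 TB per month.
    print(months_until_threshold(20, 40, 0.5))
```

If the result falls below your procurement lead time, plus a margin for racking and rebalancing, it is time to order the next node.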
A few trends are shaping modern builds.
NVMe/TCP is gaining ground as a lower-latency alternative to iSCSI with simpler day-2 ops. 25/100 GbE—and RDMA where it fits—tighten east–west latency inside the cluster. QLC capacity tiers paired with small NVMe caches are pushing $/TB down for general virtualization.
Many teams run Kubernetes on top of hyperconverged infrastructure (HCI) for stateful services at the edge, managing VMs and containers under the same roof. Early DPU/SmartNIC pilots are offloading encryption, compression, and storage datapaths to free CPU cycles for workloads.
There are still cases where a VSAN isn’t the right tool. Databases that depend on sub-millisecond tail latency often demand isolated media and bespoke tuning. Massive sequential pipelines, such as large media transcoding farms, can favor scale-up arrays with deep write caches. Sites that can’t support stable 10/25 GbE or can’t host at least two nodes won’t get the availability a cluster is designed to deliver.
A practical path is to run a small proof of concept: two or three nodes, clear SLOs, and scripted failover tests. Measure latency under failure, examine rebuild behavior, and validate noisy-neighbor isolation. Once the model fits, scaling out is predictable. If you’re evaluating architectures, look for hyperconverged VSAN designs that match your network and growth plan instead of forcing new hardware patterns on your team.
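One way to make "clear SLOs" checkable is to compare tail latency captured during a failover test against a limit. The sketch below uses the nearest-rank percentile method on latency samples in milliseconds. The function names, the sample data, and the 5 ms limit are all hypothetical; in practice the samples would come from your benchmark tool of choice.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)   # 1-based rank
    return ordered[rank - 1]


def meets_slo(samples_ms: Sequence[float], p99_limit_ms: float) -> bool:
    """True if the p99 latency of the run stays within the limit."""
    return percentile(samples_ms, 99) <= p99_limit_ms


if __name__ == "__main__":
    # Made-up samples standing in for a run captured while a node was down.
    during_failure = [1.2] * 98 + [9.0, 12.0]
    print(meets_slo(during_failure, p99_limit_ms=5.0))
```

Running the same check on a healthy baseline and on each failure scenario gives a pass/fail record that is easy to repeat after firmware or configuration changes.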