Software company cuts storage array costs with Nutanix hyperconvergence
Banking software vendor Vialink needed a high-performance storage array, but instead deployed Nutanix hyperconverged infrastructure (HCI). The company did not set out looking for the server functionality that comes with HCI, but it was cheaper than alternatives that only offered storage. Vialink also realized that choosing hyperconverged would radically simplify the work of its technical teams.
“Before making this choice, there were only three of us managing the infrastructure. Our days were quite stressful, faced with a storage array that could no longer support our activity peaks after barely two years,” explains the systems and networks manager, Emmanuel Helfenstein. “When we face this type of situation, we can say that it becomes possible to radically change the infrastructure.”
Vialink’s software as a service digitizes regulatory processes, mainly those of banking customers (BPCE is a notable client), but also in real estate, where it provides an electronic signature solution to Villea group.
The company’s flagship product is KYC, which handles document scans for new banking customers via OCR and connects to third-party services to verify them and assign a trust rating.
Due to the regulated nature of its work, Vialink does not use the cloud except to train data models with Google Cloud Platform virtual GPUs. Everything else is managed in its own data centers.
“Under normal conditions, we don’t have a huge need for bandwidth,” Helfenstein said. “So in 2016, when we virtualized all of our servers, we chose a storage infrastructure that was suitable for that. It was a NetApp array with 48 SAS drives on each of our two sites.
“At the time, it worked well, but since we started handling more demanding operations, like batch executions on databases or receiving large requests from clients, performance has plummeted.”
Helfenstein said 7,000 IOPS was the limit beyond which the NetApp arrays became unresponsive. “I think the problem was not the drives but the processor in the array, which was not up to par.”
Initially, the IT team tried to shut down services that required a lot of processing, Helfenstein said. They disabled data deduplication and compression, and recovered some IOPS. “After a short time, we realized that this hardware would not produce a miracle,” he said. “It would have been pointless to add more disk shelves.”
Helfenstein and his colleagues were “suffering,” he said. They contacted their suppliers to look for an alternative: NetApp, Pure Storage and Dell EMC, which had already offered to switch them to its hyperconverged VxRail, then Nutanix.
Nutanix: For the price, and its global console
“The big thing that set Nutanix apart for us was that it offered the same compute functionality as Dell EMC’s VxRail,” Helfenstein said. “But at the same price as the solutions from NetApp and Pure Storage, which lacked the server part.”
All of these products contain processors. For NetApp and Pure, this only provides storage functionality. With VxRail, you can also use it to run virtual machines, but with an additional ESXi license. Meanwhile, Nutanix’s AHV hypervisor is free.
“The VMware license cost would be $50,000 in the first year, to manage 16 cores and deploy two vCenter consoles,” Helfenstein said. “Plus €20,000 per year for maintenance. That is the saving we made by choosing Nutanix.”
The idea of putting virtual machines and storage in a single box goes beyond the simple savings that Vialink could realize by not having to buy servers to go with its disk arrays. “When you’re a small IT team, you don’t want to manage 50 consoles — Nutanix puts everything on one screen,” Helfenstein said.
“Nutanix’s management software handles, for example, firmware updates on the motherboard, controller cards, and SSDs,” he said. “And it does it transparently, without any human intervention. Previously, this required complex additional operations on the Dell servers that accessed our NetApp arrays.”
A seamless migration
In 2019, Vialink acquired two Nutanix clusters. Each includes four SuperMicro nodes with two sockets, 512 GB of RAM and 38 TB of storage on 12 x 3.84 TB SSDs. The company opted for two options: file sharing via SMB with other servers, and encryption, which is mandatory for banking industry providers.
Each cluster is located in one of Vialink’s data centers. “Each cluster is the disaster recovery mirror of the other,” Helfenstein said. “In other words, they’re running different apps but all their data is in sync. In this way, we share the daily load between our two sites, but if one fails, the other can handle 100% of production.” Synchronization of stored data is provided by Nutanix via 1Gbps dark fiber.
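The sizing constraint behind this setup (two active sites, either of which must be able to carry all of production alone) comes down to simple arithmetic. The following is a minimal sketch in Python; the function name and the load figures are hypothetical illustrations, not Vialink's numbers.

```python
def survives_site_loss(site_loads, site_capacity):
    """Return True if one surviving site can absorb the combined load.

    site_loads: normal load at each site, as a fraction of one cluster's capacity.
    site_capacity: capacity of a single cluster in the same unit.
    """
    return sum(site_loads) <= site_capacity


# Hypothetical figures: the two sites normally run at 40% and 45%.
print(survives_site_loss([0.40, 0.45], site_capacity=1.0))
```

In practice this means each site has to run well below full capacity day to day, so that the combined load still fits on one cluster after a failure.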
To migrate data and virtual machines from the existing system to the new one, Vialink used a tool called Move, also provided by Nutanix, which converted VMware virtual machines from ESXi to AHV format on the fly.
“The migration was seamless, with applications continuing to work during the copy,” Helfenstein said. “But at some point you have to restart them on the new clusters, so we did all that over the weekend.
“We migrated 300 VMs like that in three months, 10 different VMs at a time, so as not to block everything in the event of a problem.”
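The wave-based approach Helfenstein describes can be sketched as a simple batching plan. This Python example only splits an inventory into groups of 10. It does not call Nutanix Move, and the VM names and the `waves` helper are hypothetical.

```python
from typing import Iterator, List


def waves(vms: List[str], size: int = 10) -> Iterator[List[str]]:
    """Yield successive migration waves of at most `size` VMs."""
    for start in range(0, len(vms), size):
        yield vms[start:start + size]


# Hypothetical inventory of 300 VMs named vm-001 .. vm-300.
vm_names = [f"vm-{i:03d}" for i in range(1, 301)]
plan = list(waves(vm_names))
print(len(plan))  # 30 waves of 10 VMs each
```

Keeping each wave small limits how much is affected if a conversion fails: only the current group has to be checked or rolled back.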
It turns out that there was a problem with four virtual machines. “Converting our applications from VMware format to AHV format was not a problem for us because all of our virtual machines run on a Debian Linux system, which includes all the necessary drivers for either platform,” said Helfenstein. “On the other hand, we had virtual network appliances that we purchased pre-configured for the older Dell server clusters, and we had to adapt them by hand for the new Nutanix clusters.”
Benefits: Nodes that handle the workload, and more
The Nutanix clusters had not been running long when Helfenstein and his team noticed the number of IOPS “climb to 10,000, 20,000… then 30,000 IOPS. It held without fail, and we had proven that it could withstand a million IOPS. We have achieved a kind of serenity.”
Support was another area of satisfaction. “Nutanix had encouraged contact with their helpdesk if there were any concerns,” Helfenstein said. “We took them at their word. We opened a ticket whenever we needed to make an update or change settings. They were very responsive and always responded helpfully.”
In 2020, an update went wrong, resulting in one of the nodes disappearing. “We opened a priority ticket,” Helfenstein said. “Someone from Nutanix contacted us quickly via Zoom. She typed three commands and the system was restarted within an hour, with no effect felt by users.”
That said, the Prism console is easy enough to use for the Vialink team to handle most incidents. On one occasion, the 1Gbps link between data centers was saturated, cutting off normal communications between applications. A simple intervention via Prism to configure the synchronization bandwidth between the two clusters was enough to solve the problem in a few seconds.
After a year without further incident, Vialink decided to migrate its Kubernetes containers to Nutanix.
“Nutanix suggested its Kubernetes orchestrator, Karbon,” Helfenstein said. “It comes free with the product anyway. And not only that, we also had the benefit of being able to manage our containers from the same Prism console we use to administer everything else. Previously, we used a dedicated Kubernetes console.”
The additional workload carried by these containers, especially Java applications, meant adding memory to each node to bring it up to 768 GB. Helfenstein said the team bought the memory modules and installed them themselves, which wasn’t a problem for Nutanix. Now the clusters run around 600 virtual instances.
Administer databases without a DBA
In terms of simplifying administration, the IT team was about to have another nice surprise.
“In 2021, our developers asked us to support MongoDB and Postgres databases on the Nutanix clusters,” Helfenstein said. “The problem was that we didn’t have a DBA on the team. So, Nutanix suggested we deploy Era, which is a database management automation tool that manages availability and allows the deployment of test and working copies in one click.”
Vialink has also invested in Hycu backup software, which specializes in protecting Nutanix clusters. This, however, is not built into Prism. Access is through its own console with backups stored on 300TB of Caringo object storage.
“Nutanix also offers object storage, but we didn’t go there because we’re dealing with data protection and didn’t want to put all our eggs in one basket,” he said.
Helfenstein also plans to invest in the optional Nutanix module that will enable real-time synchronization between clusters, which only happens at intervals with the base system.