Veeam File Share backups and knowledge worker data

Introduction

Today I focus on Veeam File Share backups and knowledge worker data testing. In Veeam NAS and File Share Backups I did my first testing with the RTM bits of the Veeam Backup & Replication V10 File Share backup options. Those tests focused on a pain point I often encounter in environments with lots of large files: being able to back them up at all! Some examples are medical imaging, insurance, GIS, and remote imaging (satellite images, aerial photography, LIDAR, mobile mapping, …).

The amount of data created has skyrocketed, driven not only by need but also by advances in technology. These technologies deliver ever-better quality images, are more and more affordable, and are applicable in an expanding variety of business cases. This makes such data an important use case. Anyway, for those use cases, things are looking good.

But what about Veeam File Share backups and knowledge worker data? Those millions of files in hundreds of thousands of folders. Well, in this blog post I share some results of testing with that data.

Veeam File Share backups and knowledge worker data

For this test we use a 2 TB volume with 1.87 TB of knowledge worker data. The volume resides on a 2 TB LUN, formatted with NTFS with an allocation unit size of 4K.

1.87 TB of knowledge worker data on NTFS

The data consists of 2,637,652 files in 196,420 folders. The content is real-life data accumulated over many years. It contains a wide variety of file types of various sizes, such as office, text, zip, image, .pdf, .dbf, and movie files. This data was not generated artificially. All servers are Windows Server 2019. The backup repository was formatted with ReFS (64K allocation unit size).
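To get a feel for the shape of this data set, here is a quick back-of-the-envelope calculation in Python. It is only a sketch: I assume decimal terabytes (1 TB = 10^12 bytes), and the variable names are mine.

```python
# Rough shape of the data set, using the figures quoted above.
# Assumption: 1 TB = 10**12 bytes (decimal); your tools may report binary units.
TOTAL_BYTES = 1.87e12
FILE_COUNT = 2_637_652
FOLDER_COUNT = 196_420

avg_file_size_bytes = TOTAL_BYTES / FILE_COUNT
files_per_folder = FILE_COUNT / FOLDER_COUNT

print(f"Average file size: {avg_file_size_bytes / 1e3:.0f} KB")
print(f"Average files per folder: {files_per_folder:.1f}")
```

Keep in mind that an average says little about the spread. The data contains files of various sizes.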

Backup test

We back it up with the file server object from an all-flash source to an all-flash target. There is a dedicated 10Gbps backup network in the lab. As we did not have a separate spare lab node we configured the cache on local SSD on the repository. I set the backup I/O control for faster backup. We wanted to see what we could get out of this setup.

Below are the results.

45:28 minutes to back up 1.87 TB of knowledge worker data. I like it.
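To put that number in perspective, the sketch below converts the job duration into an average throughput and an average number of files per second. It assumes decimal units (1 TB = 10^12 bytes) and that the 45:28 covers the complete data set. The helper name is my own.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '45:28' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Figures from the test; the decimal TB interpretation is an assumption.
TOTAL_BYTES = 1.87e12
FILE_COUNT = 2_637_652

duration_s = to_seconds("45:28")
throughput_mb_s = TOTAL_BYTES / duration_s / 1e6
throughput_gbps = TOTAL_BYTES * 8 / duration_s / 1e9
files_per_second = FILE_COUNT / duration_s

print(f"Duration: {duration_s} s")
print(f"Average throughput: {throughput_mb_s:.0f} MB/s ({throughput_gbps:.2f} Gbps)")
print(f"Average file rate: {files_per_second:.0f} files/s")
```

Treat these as job-wide averages, not as sustained line rates.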

If you look at the backup image above, you see that the source was the bottleneck. As we are going for maximum speed, we are hammering the CPU cores quite a bit. The screenshot below makes this crystal clear.

We have plenty of CPU cores in the lab on our backup source and we put them to work.
The CPU core load on the backup target is far less.

This raises the question of whether the file share option would be a better choice. We can then leverage SMB Direct, which could help save CPU cycles. With SMB Multichannel we can leverage two 10Gbps NICs. So we will repeat this test with the same data: once with a file share on a stand-alone file server and once with a highly available general-purpose file share with continuous availability turned on. This will allow us to compare the File Server versus File Share approach. Continuous availability has an impact on performance and I would also like to see how profound that is with backups. But all that will be for a future blog post.

Restore test

The ability to restore data fast is paramount. It is even mission-critical in certain scenarios. Medical images needed for consultations and (surgical) procedures for example.

So we also put this to the test. Note that we chose to restore all data to a new LUN. This is to mimic the catastrophic loss of the original LUN and a recovery to a new one.

The restore takes longer than the backup for the same amount of data. Restore speed is typically slower for large amounts of smaller files.

Below you will find a screenshot from the task manager on both the repository as well as the file server during the restore.

The repository server from where we are restoring the data
The file server where the backup is being restored completely on a new LUN. Note the peak throughput of 6.4 Gbps.

Mind you, this varies a lot. When the restore hits small files, the throughput slows down while the load on the cores rises.

During the restore, the file server has to work hardest when it deals with the least efficient files.

Conclusion

For now, with variable data and lots of small files, it looks like restores take 2.5 to 3 times as long as backups with office worker data. We’ll do more testing with different data. With large image files, the difference is a lot smaller in our early testing. For now, this gives you a first look at our results with Veeam File Share backups and knowledge worker data. As always, test for yourself and test multiple scenarios. Your mileage will vary and you have to do your own due diligence. These lab tests are the beginning of mine, just to get a feel for what I can achieve. If you want to learn more about Veeam Backup & Replication go here. Thank you for reading.

Hyper-V Amigos Showcast Episode 20 and 21

Introduction

This is just a quick blog post to let you know the Hyper-V Amigos have released 2 webcasts recently. These are Hyper-V Amigos Showcast Episode 20 and 21. You will find a link to the videos and a description of the content below.

Hyper-V Amigos Showcast – Episode 20

In episode 20 of the Hyper-V Amigos Showcast, we continue our journey through the different ways in which we can use Storage Spaces in backup targets. In our previous “Hyper-V Amigos ShowCast (Episode 19)– Windows Server 2019 as Veeam Backup Target Part I” we looked at stand-alone or member servers with Storage Spaces, with both direct-attached storage and SMB file shares as backup targets. We also played with Multi Resilient Volumes.

For this webcast, we have one 2-node S2D cluster set up for the Hyper-V workload (Azure Stack HCI). On a second 2-node S2D cluster, we host 2 SOFS file shares, each on its own CSV LUN. SOFS on S2D is supported for backup and archival workloads. And as it is SMB3 and we have RDMA-capable NICs, we can leverage RDMA (RoCE, Mellanox ConnectX-5) to benefit from CPU offloading and superb throughput at ultra-low latency.

Hyper-V Amigos Show Cast Episode 20

Some extra information

The General Purpose File Server (GPFS role) is not supported on S2D for now. You can use GPFS with shared storage and in combination with continuous availability. This also performs well as a highly available backup target. The benefit here is that this is cost-effective (Windows Server Standard licenses will do) and you get to use the shared storage of your choice. But in this showcast, we focus on the S2D scenario and we didn’t build a non-supported scenario.

You would normally expect to notice the performance impact of continuous availability when you compare the speeds with the previous episode, where we used a file share that was not highly available (no continuous availability possible). But we have better storage in the lab for this test and the source system is usually the bottleneck, so our results were pretty awesome.

The lab has 4 Tarox server nodes with a mix of Intel Optane DC Memory (Persistent Memory or Storage Class Memory), Intel NVMe and Intel SSD disks. For the networking, we leverage Mellanox ConnectX-5 100Gbps NICs and SN2100 100Gbps switches. Hence we both had a grin on our face just prepping this lab.

As a side note, the performance impact of continuous availability and write-through is expected. I have written about it before here. The reason why you might contemplate using it, next to a requirement for high availability, is the small but realistic data corruption risk you have with SMB shares that are not continuously available. The reason is that they do not provide write-through for guaranteed data persistence.

We also demonstrate the “Instant Recovery” capability of Veeam to make workloads available fast and point out the benefits.

Hyper-V Amigos Showcast – Episode 21

In episode 21 we are diving into leveraging the Veeam Agent for Windows integrated with Veeam Backup & Replication (v10 RC1) to protect our physical S2D nodes. For shops that don’t have an automated cluster node build process set up, or that rely on external help to come in and do it, this can be a huge time saver.

We walk through the entire process and end up doing a bare metal recovery of one of the S2D nodes. The steps include:

  • Setting up an Active Directory protection group for our S2D cluster.
  • Creating a backup job for a Windows Server, where we select failover cluster as type (which has only “Managed by Backup Server” as the mode).
  • Running a backup.
  • Creating the Veeam Agent Recovery Media (the most finicky part).
  • Finally, restoring one of the S2D hosts completely using the bare metal recovery option.

Some more information

Now, we had some issues in the lab. One of them was a BSOD on the laptop used to make the recording. Another was us being a bit too impatient when booting from the ISO over a BMC virtual CD/DVD. Hence we had to glue some parts together and fast forward through the boring bits. We do appreciate that watching a system boot for 10 minutes doesn’t make for good infotainment. Other than that, it went fine and we were able to demonstrate the process from the beginning to the end.

As is the case with any process, you should test and experiment to make sure you are familiar with it. That makes it all a little easier and makes it hurt a little less when the day comes that you have to do it for real.

We hope the showcast helps you look into some of the capabilities and options you have with Veeam with regard to protecting any workload. Long gone are the days that Veeam was only about protecting virtual machines. Veeam is about protecting data wherever it lives: in VMs, physical servers, workstations, PCs, laptops, on-prem, in the cloud and in Office 365. On top of that, you can restore it wherever you want, which avoids lock-in and costly migration projects and tools. Check it out.

Conclusion

We will be doing more webcasts on Veeam Backup & Replication v10 in 2020, as it will be generally available in Q1 as far as I can guess.

But with Hyper-V Amigos Showcast Episode 20 and 21, that’s it for 2019. Enjoy the holidays during this festive season. The Hyper-V Amigos wish you a Merry X-Mas and a very happy New Year in 2020!

Veeam Vanguard Renewals and Nominations 2020

Introduction

Are you working with Veeam software solutions? Are you passionate about sharing your experiences, knowledge, and insights? If so, you might want to consider a nomination for the Veeam Vanguard program. If you are already a Veeam Vanguard, I’m pretty sure you already know that submissions for Veeam Vanguard Renewals and Nominations 2020 are open.

Veeam Vanguard Renewals and Nominations

As we are nearing the end of 2019, Veeam has opened the Veeam Vanguard Renewals and Nominations for 2020.

Describing the Veeam Vanguard program is not easily done. But Nikola Pejková has done a great job of exactly that in Join the Veeam Vanguard 2020 class! She also explains how to nominate someone or yourself. Read the blog post and find out if this is something for you. I enjoy being a part of it because I get to learn with and from some of the best minds in the industry. This allows me to help others better while also keeping up with the changing IT landscape.

My fellow Veeam Vanguards and me in a Q&A session with the Veeam R&D and PM teams at the Veeam Vanguard Summit.

I would like to emphasize that the diversity of the Veeam Vanguard is paramount to me. It works because we have people in there from around the globe, from all kinds of backgrounds and job roles. This helps open up discussions with different points of view and experiences. Customers, consultants, and partners look at needs and solutions from their perspectives. Having us together in the Vanguard benefits us all and prevents tunnel vision.

Nominate someone, yourself or be nominated

Nikola explains how to do this in her blog, so read Join the Veeam Vanguard 2020 class! and apply to become a Vanguard! It is quite an experience. Quality people who are active in the community and help by sharing their knowledge are welcomed and appreciated. Maybe you’ll find yourself to be a Veeam Vanguard in 2020!

Optimize the Veeam preferred networks backup initialization speed

When Veeam preferred networks cause slow backup initialization speeds

When using preferred networks in Veeam, you choose to use a network other than the default host network for backups and restores. In this post, we’ll discuss how to optimize the Veeam preferred networks backup initialization speed, because we aim for optimal performance. TL;DR: You need to provide connectivity to the preferred networks for the Veeam Backup & Replication server. Forgetting this is a common mistake I run into every now and then. Ultimately it makes people think Veeam is slow. No, it is just a configuration mistake.

Why use a preferred network?

Backups can fill up a 1Gbps pipe very fast. Many people still use 1Gbps networking as default connectivity to the hosts. Even when they leverage 10Gbps or better it is often in a converged network setup. This means that only part of the bandwidth goes to host connectivity. Few have 10Gbps for “just” host connectivity. This means it makes sense to select a different higher bandwidth network for backup and restore traffic.

Hence, for high-volume, high-performance backups and restores it is smart to look for a bigger pipe to leverage. Some environments have dedicated backup networks at 10Gbps or better. But we find way more high bandwidth networks for other purposes. In Hyper-V environments, you’ll have those for SMB networking like CSV, Live Migration variants and storage replication. Hyper-Converged Infrastructure deployments use these networks for storage as well. With S2D you’ll find more and more 25/50/100Gbps. All these can be leveraged as a preferred backup network in Veeam.

Setting up a preferred network

Setting up a preferred network is easy. First of all, you figure out which network or networks to use. You then add those to the preferred networks as follows:

In the file menu, select “Network Traffic Rules”.

Click “Add” and specify the source IP range as well as the target IP range. You can opt to encrypt the traffic and/or set a bandwidth limit.

There is no need to have the preferred network registered in DNS. It will work fine without.
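If you want to sanity check which addresses fall inside a range before you add a rule, Python's standard ipaddress module can do that outside of Veeam. This is a small sketch: the function name is mine, and the 10.10.110.0/24 range and both addresses are placeholders for illustration only.

```python
import ipaddress
from typing import Iterable


def in_preferred_network(address: str, networks: Iterable[str]) -> bool:
    """Return True if the address belongs to any of the given networks (CIDR notation)."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(net) for net in networks)


# Hypothetical preferred network; replace with your own range(s).
PREFERRED_NETWORKS = ["10.10.110.0/24"]

print(in_preferred_network("10.10.110.2", PREFERRED_NETWORKS))
print(in_preferred_network("192.168.1.10", PREFERRED_NETWORKS))
```

Running this against the addresses of every server role involved gives you a quick overview of which ones actually have an interface in the preferred range.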

I hope it is clear that the source (Hyper-V hosts), the target (backup repository or the extents in a Scale-Out Backup Repository) and any Off Host Proxies need connectivity to the preferred network(s). If you leverage WAN accelerators, Gateway Servers, or log shipping servers, then these also need access. Last but not least, you should also make sure that the Veeam Backup Server (VBR) has access to the preferred networks. This is one that a lot of people seem to forget, maybe because the VBR server is most often a VM (if it is not a shared role on the repository server or such) and things do work without it.

When the VBR server has no access to the preferred networks things still work but initialization of the backup and restore jobs is a lot slower. Let’s test this.
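A quick way to check this from the VBR server, outside of Veeam, is to try a plain TCP connection to an address on the preferred network. The sketch below is a generic helper, not a Veeam tool: the function name is mine, and which host and port to probe is an assumption you have to fill in for your own environment.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable networks.
        return False
```

Run it on the VBR server with the preferred network IP address of a host or repository, for example can_connect("10.10.110.2", 2509) with placeholder values. A refused connection also returns False, so probe a port where you know a service is listening.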

Slow Initialization of backup and restore jobs

When you use preferred networks without giving the VBR server access to them, you will probably notice the following:

  • First of all, we notice a slowdown in the overall initialization of the backup and restore job.
  • This manifests itself in a slow start of the actual VM backup/restore and in a reduced number of simultaneous backups/restores of VMs within a job.

Without the VBR server having connectivity to the preferred networks

23:54 to complete the backup job (no connectivity to the preferred network)

With the VBR server having connectivity to the preferred networks. Notice how smooth and continuous the throughput is.

07:55 to complete the backup job (with connectivity to the preferred network), which is about 3 times as fast.
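For those who like to see the arithmetic, the two job durations can be compared with a few lines of Python. The helper name is my own. The durations are the ones from the two test runs.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '23:54' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


without_connectivity_s = to_seconds("23:54")
with_connectivity_s = to_seconds("07:55")

speedup = without_connectivity_s / with_connectivity_s
saved_s = without_connectivity_s - with_connectivity_s

print(f"Speedup: {speedup:.2f}x")
print(f"Time saved: {saved_s // 60}:{saved_s % 60:02d} (mm:ss)")
```

This is a single pair of runs in my lab, so treat the ratio as an indication and not as a guarantee.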

When you look into the Veeam backup logs for this job you will find at various stages attempts by the VBR server to connect to the preferred networks. If it can’t it has to wait until it times out. You see entries like:

A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.10.110.2:2509 (System.Net.Sockets.SocketException)

Just a small part of all the socket timeouts you will find for every single VM in the job. Here VBR is trying to connect to one of the extents in the SOBR.

This happens for every file in the backups (config files and disks) and for every extent in the Scale-Out Backup Repository (per VM backup chain). This slows down the entire backup job tremendously.
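To get an idea of how often this happens in a job log, you could count the timeout entries per endpoint. The sketch below only relies on the message text quoted above. The rest of the log line format, the function name, and how you feed it the lines of your own log file are assumptions.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the tail of the timeout message quoted above: "... failed to respond <ip>:<port>".
ENDPOINT_RE = re.compile(r"failed to respond (\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def count_timeouts(lines: Iterable[str]) -> Counter:
    """Count connection timeout entries per (ip, port) endpoint."""
    counts: Counter = Counter()
    for line in lines:
        match = ENDPOINT_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Sample input: the message from this post plus an unrelated, made-up line.
sample_lines = [
    "A connection attempt failed because the connected party did not properly "
    "respond after a period of time, or established connection failed because "
    "connected host has failed to respond 10.10.110.2:2509 "
    "(System.Net.Sockets.SocketException)",
    "Some unrelated log line",
]

for (ip, port), hits in count_timeouts(sample_lines).most_common():
    print(f"{ip}:{port} timed out {hits} time(s)")
```

Endpoints on the preferred network that show up here are the ones the VBR server cannot reach.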

Conclusion

I always make sure that the VBR servers in my environments have preferred network connectivity. Consequently, initialization is faster for both backups and restores. Test it out for yourself! It is the first thing I check when people complain about really slow backups. Do they have preferred networks set up? Then check if the VBR server has connectivity to them!