ReFS Supported Deployment Scenarios Updated

Introduction

Some support statements for ReFS have been updated recently. These reflect well over a year of me, fellow MVPs and others testing and providing feedback to Microsoft. For all practical purposes I’m talking about ReFSv3, which was introduced with Windows Server 2016. Read up on this because that’s what I’m discussing here: Resilient File System (ReFS) overview

As many of you know, the supported storage deployment options for ReFS have “fluctuated” a bit. For a while ReFS was limited to Storage Spaces and standalone disks only. That meant no RAID controllers and no FC or iSCSI LUNs via a SAN, whether that was a high-end one or an entry-level one that you normally only use for backup purposes.

I was never really satisfied with the reasons why, and I kept being a passionate advocate for a decent explanation. Tying a file system with the capabilities and potential of ReFS to almost a single storage solution (S2D, and yes, that’s a very good HCI offering) isn’t going to help proliferate the goodness of ReFS around the globe.

I was not alone. Many others, amongst them fellow MVPs Anton Gostev (Senior Vice President, Product Management at Veeam and an industry heavyweight when it comes to credibility and technical skill), Carsten Rachfahl and Jan Kappen (both at Rachfahl IT-Solutions), were arguing the case for broader ReFS support. Last week we got the news that the ReFS deployment documentation had been revised. Guess what? Progress! A big thank you to Andrew Hansen for taking the time to hear us plead our case and listen to our testing results and passionate feedback. He picked up the ball, ran with it and delivered! Let’s take a look.

ReFS Storage Deployment Options

Storage Spaces Direct

Deploying ReFS on Storage Spaces Direct is recommended for virtualized workloads or network-attached storage. This is well known and is used for Hyper-Converged Infrastructure and Converged (SOFS) solutions (Hyper-V, IIS, SQL, User Profile Disks and even archival or backup targets). You can deploy it with simple, mirrored (2-way or 3-way), parity or mirror accelerated parity volumes.

Storage Spaces

Storage Spaces supports local non-removable direct-attached disks via BusTypes SATA, SAS, NVMe, or disks attached via an HBA (aka a RAID controller in pass-through mode). You can deploy it with simple, mirrored (2-way or 3-way) or parity volumes. Do note that this can be both non-shared and shared Storage Spaces (shared SAS enclosures). The latter is the highly available solution with Storage Spaces we had before Windows Server 2016 added S2D.

Basic disks

Deploying ReFS on basic disks is best suited for applications that implement their own software resiliency and availability solutions. Such applications can leverage integrity streams, block cloning, and the ability to scale and support large data sets. A poster child for this use case is an Exchange DAG.

Now it is important to note that basic disks with ReFS are supported with local non-removable direct-attached disks via BusTypes SATA, SAS, NVMe, or RAID. So yes, you can have RAID 1, 5, 6 or 10 and make the storage redundant. Now, be smart, ReFS is great but it is not magic. If your workload requires redundancy and high availability, you should provide it. This is no different when you use NTFS. When you have shared PCI RAID controllers (which can be redundant, like in a DELL VRTX), these can be used as well to create highly available deployments with shared storage.

SAN Storage

You can also use ReFS with a SAN over FC or iSCSI. Those are normally always configured with some form of storage redundancy. You can consume the ReFS SAN storage on standalone, member or clustered servers for high availability, as long as you use that storage for supported use cases. For example, it is and remains unsupported to put knowledge worker data on SOFS shares, no matter what the underlying storage for the ReFS or NTFS volumes is. For backups this can be leveraged to build some very capable solutions.

What were the concerns that made ReFS Support so limited at a given point in time?

Well, one of them was confusion and concern around how data gets flushed and persisted with non-Storage Spaces storage and simple disks. A valid concern, but one you have with any file system, so any storage array or controller needs to handle this well. As it turns out, any decent piece of storage hardware or controller that’s on the Microsoft Hardware Compatibility List and is certified does its job well enough to guarantee this happens correctly. So, any certified OEM SAN, from entry-level ones to high-end enterprise-grade gear, is supported. Just like any good (certified) RAID controller. Those come with battery-backed caches that can survive downtime for days to many weeks. You just pick the one that fits your needs, use case and budget from the options you have. That can be S2D, a SAN, a RAID controller, or even basic directly attached disks.

My take on things

Why do I like the new supported options? Well, because I have been testing them for backup targets, both highly available ones and non-highly available ones. I can have the benefits of ReFS that can be leveraged by backup software (Veeam Backup & Replication 9.5 for example) and have better performance and data protection with more types of storage than just S2D. I like to have options and choices when designing a solution.

It is important to note one thing when you do not use ReFS in combination with Storage Spaces (S2D, shared Storage Spaces or “stand alone” Storage Spaces) with some form of data redundancy (2-way or 3-way mirror, parity, mirror accelerated parity). You will not have the built-in capability to repair data corruption that can occur while data sits on disk (bit rot) by leveraging the redundant copies in Storage Spaces. That only comes when ReFS is combined with redundant Storage Spaces. Not with simple Storage Spaces or any other storage array, redundant or not. The combination of ReFS with Storage Spaces offers this capability and is one of its selling points.

Other than that, the above ReFS storage deployment options let you leverage the benefits ReFS has to offer and yes, for some use cases that will be preferred over NTFS. But don’t think NTFS should now only be used for the OS and such. That’s not the case. It is and remains very much the dominant file system for Windows. It’s just that now we get to leverage the goodness of ReFS for suitable scenarios with a lot more storage deployment options. And there are reasons for that. For example, if you are going to do Hyper-V with a SAN, the supported file system is NTFS, not ReFS. Mind you, ReFS works but it’s not supported. I have tested this and while it works, one of the concerns is the redirected IO traffic this incurs. With S2D the network fabric to deal with this is there by design: SMB Direct (RDMA) over 10Gbps or better. With a SAN that’s not necessarily so, and as a result the network leveraged by CSV traffic might take a beating. The network traffic patterns with ReFS on SAN-based CSVs also differ from what you are used to with NTFS when it comes to owner and non-owner nodes. While I can make things work, I must weigh the benefits against the risk of being unsupported. On a good SAN with ODX support that’s not worth the risk. Might this ever change? Maybe, but for now that’s it.

That said, when I design my ReFS LUNs and fabric well with a SAN and use them for a supported use case like backup targets, I am supported and I get to leverage the benefits of ReFS as it fits the use case very well (DPM, Veeam).
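To make that rule of thumb concrete, here is a minimal Python sketch of how I reason about it. It is purely illustrative: the `Deployment` class, the resiliency labels and the `can_auto_repair` function are names I made up for this example, not any Microsoft API. It only encodes the statement above, namely that automatic bit rot repair needs ReFS on Storage Spaces with a redundant layout.

```python
from dataclasses import dataclass

# Storage Spaces layouts that keep a redundant copy (or parity) to repair from.
# These labels are illustrative assumptions, not official identifiers.
REDUNDANT_RESILIENCY = {
    "2-way mirror",
    "3-way mirror",
    "parity",
    "mirror accelerated parity",
}


@dataclass(frozen=True)
class Deployment:
    """Hypothetical description of where a ReFS volume lives."""
    name: str
    uses_storage_spaces: bool
    resiliency: str = "none"


def can_auto_repair(deployment: Deployment) -> bool:
    """True only for ReFS on Storage Spaces with a redundant layout."""
    return (
        deployment.uses_storage_spaces
        and deployment.resiliency in REDUNDANT_RESILIENCY
    )


examples = [
    Deployment("S2D, 3-way mirror", True, "3-way mirror"),
    Deployment("Storage Spaces, simple", True, "simple"),
    Deployment("SAN LUN on RAID 6", False),
]

for deployment in examples:
    print(deployment.name, can_auto_repair(deployment))
```

Note that the SAN LUN comes out as not repairable even though the array itself is redundant: the redundancy lives below ReFS, where it cannot be used to fetch a good copy.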

A side note on mirror accelerated parity

Mirror accelerated parity is only supported with S2D. In regards to backup and archive targets, that’s the one thing I want to keep testing (see Hyper-V Amigos Showcast Episode 12 – ReFS and Backup) and keep asking Microsoft to support, at least on non-shared Storage Spaces. I know shared Storage Spaces is being deprecated, no worries. That would make for some great budget archival and backup targets, because you get bit rot protection from the combination of ReFS with redundant Storage Spaces. I even have some ideas on how to add tuning capabilities to the mirror / parity movement of data based on data age etc. I can dream, right?

Conclusion

To all the naysayers, the ones that bashed me when I discussed options for and the potential for ReFSv3 outside of S2D, take note, this is where we are today.

And I like it. I like the options ReFSv3 offers with a variety of storage solutions to design and implement backup targets for many different needs and budgets. That’s what I like, as I’m convinced that one-size-fits-all solutions are an illusion. Even at economies of scale and with commodity materials, understanding the context in which to design and implement a solution matters, as it allows you to choose the proper methods for the given needs when you genuinely understand the challenge.

If you need help with this, there are quite a number of highly skilled, experienced people with the right mindset to help you maximize your ROI and minimize your TCO in an effective and efficient way. Many of these are MVPs and have their own business or work for IT firms where customers are not milked like cattle but are really provided with high-value services. Just reach out.

Continuous available general purpose file shares & ReFSv3 provide high available backup targets

Introduction

In our previous two blog posts on Veeam and SMB 3 we’ve seen how and when Veeam Backup & Replication can leverage SMB Multichannel and SMB Direct. See Veeam Backup & Replication leverages SMB Multichannel and Veeam Backup & Replication Preferred Subnet & SMB Multichannel. The benefits of this are more bandwidth, high availability, better throughput and, with RDMA, low latency and CPU offload. What’s not to like, right? In a world where compute and network needs keep rising because storage capabilities (flash storage) are pushing the limits, this is all very welcome.

We have also seen earlier that Veeam B & R 9.5 leverages ReFSv3 in Windows Server 2016. This provides clear and present benefits in regards to space efficiencies and speed with many backup file related operations. Read Veeam Leads the way by leveraging ReFSv3 capabilities

When it comes to ReFSv3 in Windows Server 2016 most of the focus has gone to solutions based around Storage Spaces Direct (S2D). That’s a great solution and it is the poster child use case of these technologies.

But what other options besides S2D do you have out there to creatively build efficient and effective highly available backup targets? What if you would like to repurpose existing hardware to build those? Let’s take a look together at how continuously available general purpose file shares & ReFSv3 provide highly available backup targets.

CSV, S2D, ReFSv3 & Archival Data

In Windows Server 2016, traditional shared storage (iSCSI, FC, Shared SAS, Shared RAID) with CSV is not recommended for use with ReFSv3. Why isn’t exactly clear. The biggest impact you’ll see is the performance difference when not writing to the owner node of the CSV in this use case. Even with a well configured RDMA network that difference is significant. But that doesn’t mean that the performance is bad. It’s just that many of the super-fast metadata operations are significantly slower on a non-owner node than on the owner node in relative terms; neither of the two is slow.

Microsoft does state that S2D with ReFSv3 and SOFS shares can be used for archival data. Storage Spaces and ReFSv3 also have the benefit of offering automatic, on-the-fly repair of corrupt data from a redundant copy when needed. So yes, the best-known supported scenario is this one.

Continuous available general purpose file shares and ReFSv3 provide high available backup targets

But what if we need a highly available backup target and would love to leverage ReFSv3 with Veeam Backup & Replication 9.5? Well, you can have 95% of your cookie and eat it too. All this without ignoring the cautions offered.

We could set up SOFS shares on a Windows Server 2016 Cluster with ReFSv3 with traditional shared storage. Some storage vendors do state this is supported actually.

That only means you don’t have the auto repair functionality ReFSv3 combined with Storage Spaces offers. But perhaps you want to avoid the risk of using ReFSv3 with CSV in a non-S2D scenario altogether. What you could do is forgo ReFSv3 and use NTFS. How well this holds up for archival data or backups is something you’ll need to test. There is not much info out there, only other cautions and warnings that might keep you up at night.

There is another scenario however and that is using Windows Server 2016 failover clustering to set up continuously available general purpose file shares that leverage SMB3 transparent failover.

The good news is that general purpose file shares (no CSV) do work consistently with ReFSv3 because such a share/LUN is only exposed on one cluster node at a time, the owner. By having multiple shares and setting preferred owners we can load balance the workload across all cluster nodes.

Thanks to continuous availability for general purpose file shares and SMB 3 transparent failover we can still get a highly available backup target this way. The failover is fast enough to make this happen, and all we see with Veeam Backup & Replication is a short pause in throughput before it resumes after failover. To put the icing on the cake, you can leverage SMB Multichannel and SMB Direct for both backups and restores.

It would take a sizeable whitepaper to walk through the setup, so instead I’ll show you a quick video of a POC we did in the lab here https://vimeo.com/212886392.
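The load balancing idea is simple enough to sketch in a few lines of Python. This is only an illustration of the planning step, under the assumption that you spread the shares round-robin over the nodes; the share and node names are hypothetical and nothing here talks to a real cluster. Setting the preferred owners is still done with the failover clustering tools.

```python
from collections import Counter
from itertools import cycle


def assign_preferred_owners(shares, nodes):
    """Spread file share roles round-robin over the cluster nodes.

    Returns a dict mapping each share to the node that should be its
    preferred owner. Each share is active on one node at a time.
    """
    if not nodes:
        raise ValueError("at least one cluster node is required")
    owners = cycle(nodes)
    return {share: next(owners) for share in shares}


# Hypothetical names, for illustration only.
shares = ["Backup01", "Backup02", "Backup03", "Backup04"]
nodes = ["NodeA", "NodeB"]

plan = assign_preferred_owners(shares, nodes)
for share, node in plan.items():
    print(f"{share} -> preferred owner {node}")

# How many shares each node owns when every share sits on its preferred owner.
print(Counter(plan.values()))
```

With an even number of shares per node, every node carries a similar part of the backup load during normal operations, and a failed node only moves its own shares to a surviving node.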

If you want to learn more, come to the community events & other conferences I’m speaking at, where I will also be around for Ask The Experts sessions. I’ll be at the German Hyper-V community meet up, The Cloud & Datacenter Conference in Germany 2017, Dell EMC World 2017 and last but not least VeeamON 2017 (see May 2017 will be a travelling month).

Conclusion

What do you lose?

Potentially there is one big loss in regards to the capabilities of ReFSv3 with this solution when you are not using Storage Spaces: you lose the automatic repair of corrupt data. The ability of ReFSv3 to do so is tied to the redundant copies of Storage Spaces (parity/mirror).

What do you get?

That’s fine. The strength of this design is that you get the speed and space efficiencies of ReFSv3 and highly available backup targets in way more scenarios than “just” S2D. After all, not everyone is in a position to choose their storage fabric for backup targets green field or at will. But they might be able to leverage existing storage and opt to use SMB 3 for their data transport.

So even if you can’t have it all, you can still build very good solutions. It offers ReFSv3 benefits and high availability for your backup target via SMB transparent failover on continuously available general purpose file shares. This also only requires Windows Server 2016 Standard Edition, which is a cost saving. You get to leverage SMB Multichannel and SMB Direct. All this while not ignoring the cautions of using ReFSv3 in certain scenarios.

On top of that, if you use NTFS with this approach it will also work for Windows Server 2012 (R2) as the OS for the backup target cluster hosts.

Disclaimer

I do not work for or at Microsoft, nor am I perfect or infallible just because I’m an MVP. You’ll have to do your own testing and validation. From our testing and without ReFSv3 bugs ruining the show, to me this is a very valid and cost effective approach.

Hyper-V Amigos Showcast Episode 12–ReFS v3.1 and Backup

In this episode Carsten and I look at a single host deployment with Storage Spaces on Windows Server 2016. We create a “hybrid” disk just like in Storage Spaces Direct by combining SSD & HDD in a storage tier. We were very happy to discover that ReFSv3.1 does real time tiering.

We’re very excited about this because we want to leverage the benefits Veeam Backup & Replication 9.5 brings by leveraging ReFSv3.1 (Block Cloning) in regards to backup transformation actions and Grandfather-Father-Son (GFS) space savings. To do so we’re looking at our options to get these benefits and capabilities while leveraging affordable yet performant storage for our backup targets. S2D is one such option but might be cost prohibitive or overkill in certain environments.

ReFS v3.1 on non-clustered Windows Server 2016 hosts brings us integrity streams, file corruption repair with instant recovery as protection against bit rot, the performance of tiered storage and SMB3 as a backup target at a great price point.

We encourage you to watch the video and see for yourself. As always, we had fun and hope you can learn something together with us, the Hyper-V Amigos.

Veeam Leads the way by leveraging ReFS v3 capabilities

Introduction

You might have noticed that I’m pretty impressed by what Microsoft is doing with ReFS v3 in Windows Server 2016. You can read some of my musings on it in ReFS vNext Block Cloning and ODX and take a look at a comparison between ReFS & ODX speeds when creating VHDX files in Lightning Fast Fixed VHDX File Creation Speed With ReFS on Windows Server 2016.

Note that this is also leveraged for accelerated checkpoint merges, VHDX resizing etc.

Now it goes without saying that Hyper-V (they’re the tip of the spear at MSFT) and other Microsoft products would take advantage of the capabilities of ReFS. But now we know that Veeam Backup & Replication 9.5 has made use of ReFS to help with the resilience of their backups, the speed of their Synthetic Full backups and the space required.

To a Hyper-V MVP and a Veeam Vanguard it was obvious these two combined just had to lead the way for others to follow.

Veeam Leads the way by leveraging ReFS v3 capabilities

Veeam Backup & Replication 9.5 will leverage ReFS v3 and by doing so they deliver the following benefits:

  • Shorter backup windows and a reduced backup storage load on the repository.
  • Reduced backup target storage capacity, which reduces or eliminates the need for deduplication in many scenarios.
  • Better backup data protection by leveraging the ReFS native capabilities to protect against bit rot which was one of the prime goals for which Microsoft designed ReFS.

How is this done?

ReFS v3 has “fast cloning” technology which Veeam is leveraging. This results in up to 10 times faster creation and transformation of synthetic full backup files! ReFS fast cloning allows for creating new files without physically moving data blocks between files. This is what delivers even shorter backup windows and a lower backup storage load on the repository or repositories.

They use what they call “Spaceless full backup technology” which allows multiple full backup files to reside on the same ReFS volume that share the same physical data blocks. As a result they need less storage capacity which can reduce or eliminate the need (and cost) of deduplication appliances whilst leveraging commodity storage.
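To get a feel for what “sharing the same physical data blocks” means, here is a toy Python model. It is a sketch under my own assumptions and not how ReFS or Veeam implement this: the `ToyVolume` class, its methods and the file names are invented for illustration. Files are just lists of references to physical blocks, and cloning adds references instead of copying data.

```python
class ToyVolume:
    """Toy model of a volume where files reference physical blocks."""

    def __init__(self):
        self.files = {}      # file name -> list of physical block ids
        self.refcount = {}   # physical block id -> number of references
        self._next_block = 0

    def write_file(self, name, n_blocks):
        """Write new data: every block is physically allocated."""
        blocks = list(range(self._next_block, self._next_block + n_blocks))
        self._next_block += n_blocks
        for block in blocks:
            self.refcount[block] = 1
        self.files[name] = blocks

    def clone_blocks(self, name, block_ids):
        """Create a file that references existing blocks; nothing is copied."""
        for block in block_ids:
            self.refcount[block] += 1
        self.files[name] = list(block_ids)

    def physical_blocks(self):
        return len(self.refcount)

    def logical_blocks(self):
        return sum(len(blocks) for blocks in self.files.values())


volume = ToyVolume()
volume.write_file("full", 30)
volume.write_file("incremental_1", 10)

# Assumption for this toy: the incremental replaces the first 10 blocks
# of the full backup, so the synthetic full is built from both files.
synthetic = volume.files["incremental_1"] + volume.files["full"][10:]
volume.clone_blocks("synthetic_full", synthetic)

print("physical blocks:", volume.physical_blocks())
print("logical blocks:", volume.logical_blocks())
```

The synthetic full shows up as a complete file, yet the number of physical blocks on the volume does not grow when it is created. That is also why the integrity of those shared blocks matters so much.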

Let’s see how this is done. A “legacy” full backup is created and consumes 30% of the storage capacity. Then we make incremental backups.

3 incremental backups add 3 * 10% of delta to the needed backup storage capacity which adds up to 60%.

We create a synthetic full backup and the copies of the data require another 30% of space (90%). 

Now let’s compare this to v9.5 leveraging a Windows Server 2016 ReFS formatted backup target repository. Instead of copying data, ReFS references already existing data blocks for a new file. This saves on IO, space and time!
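The arithmetic of this example fits in a small Python function. It is a back-of-the-envelope sketch with a function name and parameters of my own choosing, and it assumes the block-cloned synthetic full consumes no extra capacity (metadata overhead is ignored).

```python
def repository_usage(full_pct, incremental_pct, incrementals, block_cloning):
    """Repository capacity used (in %) after one synthetic full backup.

    Without block cloning the synthetic full is a physical copy and costs
    another full backup worth of space. With block cloning it only
    references existing blocks (metadata overhead is ignored here).
    """
    used = full_pct + incrementals * incremental_pct
    if not block_cloning:
        used += full_pct
    return used


legacy = repository_usage(30, 10, 3, block_cloning=False)
refs = repository_usage(30, 10, 3, block_cloning=True)

print(f"legacy synthetic full: {legacy}%")
print(f"ReFS block cloning:    {refs}%")
```

With the numbers used above this gives 90% for the legacy approach versus 60% with ReFS, and the gap widens with every additional synthetic full you retain.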

Is this safe? What if those data blocks that are referenced multiple times are corrupted? Well, Veeam does have protection against that in place already! But it goes the extra mile, as ReFS has the capabilities to protect against that itself, or its power would also become its biggest weakness.

Veeam’s data integrity streams integration leverages ReFS data integrity scanner and even proactive error correction when used in combination with Storage Spaces to protect backup files from bit rot and allows for more reliable forever-incremental archiving. This helps make the spaceless full backup technology trustworthy & safe alongside the health checking & error fixing capabilities already available in Veeam Backup & Replication.

Conclusion

I’m impressed by the forward looking and fast adoption of the capabilities of ReFS v3 by Veeam and I’m testing Backup & Replication v9.5 Beta today in the lab. They have more up their sleeve by the way as they have some interesting work with PowerShell Direct to make backups ever more resilient in ever more scenarios. More on that later.

Anyone who said Veeam would lose its edge in the world of Hyper-V backups when Microsoft introduced their own native changed block tracking (resilient change tracking) has clearly never dealt with Veeam seriously and professionally. I have, and I’m always happy to chat with them as they have serious technical skills combined with vision and business acumen that make sure they’re leaders in the business of backup. It makes me proud to be a Veeam Vanguard and an MVP with a specialization in Hyper-V.