Presenting at Experts Live 2019 Europe in Prague

Presenting at Experts Live 2019 Europe in Prague

I am happy to announce that I will be presenting at Experts Live 2019 Europe in Prague. The conference is held between 20-22 November 2019. This is my first time speaking there and I am really looking forward to it.

I am speaking at Experts Live Europe 2019
My session is: Hyper-V backups – The good, the bad and the ugly

I will be talking about Hyper-V backups, the good, the bad and the ugly. Many people are still on older Windows Server versions and the improvements in backup alone should make for a strong use case to upgrade. I’ll show you why. I’ll share details improvements in speed, reliability, and scalability no matter what storage technology you use. Local, HCI, SAN, … they all benefit. We’ll share tips on how to leverage the improvements for the best results and make you backups shine. Finally, we’ll provide some feedback on what is still needs improvement. Remember as long as you run workloads on virtual machines you have to maintain it, keep it up to date and protected! If you know all this already, no worries, come for the over 40 other experts presenting. Take a look at the session catalog and see for your self.

Join us!

The conference focusses on Microsoft technologies at large and deals with Cloud, Datacenter, Security, Identity Management and the Modern Workplace. As such it realizes there is a lot of variety out there in building blocks used to make a company run. This means it offers content that reflects that reality and helps people succeed in their digital efforts to help their businesses run smoothly and securely on-premises as well as in the hybrid and public cloud.

I encourage you to attend if you have the opportunity. The content of other Experts Live Conferences I have done in the past was always excellent and the speakers were very knowledgeable. The same goes for Experts Live Europe I hear from my fellow MVP and colleagues who have attended before. Note that It is right after Microsoft Ignite 2019 and there are quite a lot of Microsoft attending as well. This means you’ll be getting some new information and insights hot of the press.

Life long learning is fun and doesn’t only happen at your desk or in a course. Get out of the office and into the world. It helps to get rid of the blinders and widen your view and vision on what is possible. It helps to learn from others, from your peers. So don’t delay and register here

Network, socialize, share and learn

With so many colleagues, experts, Microsoft Cloud Advocates, Program Managers and technologists at the event it offers excellent networking opportunities.

I will be around at the VIP Cloud Party, which provides plenty of networking opportunities and the chance to chat to the presenters and your peers. On top of that, I will be available between sessions and the “Ask The Experts” (ATE) speaker booth.

If you have questions about Hyper-V, Backup, Storage, Networking and best practices do come and find me. I don’t know it all, far from, but I have been able to help out many people before at conferences. Whether you work in small, medium or enterprise-sized organizations it’s free to ask and the worst that can happen is that I don’t know. I have a sweet spot for RDMA and PMEM, so if you ‘d like to chat, come find me!

Join us in Prague at Ask The Experts Europe

I hope to see you in wonderful Prague at a great conference! I am looking forward to meeting you there and presenting at Experts Live 2019 Europe in Prague. You can make new friends and catch up with others while you educate yourself. That’s a great deal.

The lure of having a Ransomware Fund

Introduction

What is the the lure of having a ransomware fund all about? It’s the idea that just paying is the best way to deal with a ransomware incident.While preventing as many ransomware attacks as possible is great, it is not something that will be 100% effective. Detecting an incident as early as possible is key to minimizing the effects. This even in the event of successful and early detection some data has been compromised (encrypted). The nature and function of that data will determine the blast radius and the fall out. To recover from that the attack needs to be stopped by finding and eliminating the points of infection.Next to that, the proven ability to restore data and do so fast is a key capability when it comes to recovering form a ransomware attack. If you don’t you’ll either need to eat the loss or try to pay up.

Dealing with Ransomware step by step

  • Prevention is not 100% effective. Don’t bank on it.
  • Early detection
  • Swift & adequate response
  • Quarantine, wipe (nuke from orbit) of contaminated systems & data
  • See if a free decryption solution is available via the security community or your police services cyber crime department
  • Restore your data. You must have multiple options. You must have implemented the 3-2-1 rule. But beware, your off site, air gapped copy cannot be too old. You need to have fairly recent backups in there to have a decent RPO that is meaningful to the business.
  • Bring data, systems and services back into production.

Now make sure you can do this for end user files, server data (images, VMs, Databases, configuration files,  backups) regardless of where it is (on-premises, private, hybrid & public cloud) what delivery model it comes in (Physical, virtual, IAAS, PAAS, SAAS, Serverless).

The lure of having a Ransomware Fund (Isn’t it cheaper to pay?)

Now some bean counter might come up with the idea that paying is cheaper (and easier) than prevention, let alone backup & restore capabilities.

The lure of having a Ransomware Fund

Some would even consider it a “cost of doing business”. This is the the lure of having a ransomware Fund. Ouch, well I know many parts of the world are a lot less save than mine but this is a path down a slippery slope so dangerous you will fall down sooner or later. Let’s look at why that is.

petya ransomware

The lure of having a Ransomware Fund

First, let’s not forget about the down time caused no matter how you resolve it. So prevention and early detection are key. You might not even survive if you pay and get your data back.

Secondly, while I love the idea of prevention and early detection this doesn’t mean that you can get rid of your backup and restore capabilities. Prevention is an mitigation strategy, it doesn’t eradicate the issue. Early detection minimizes the immediate and secondary damage in many cases. But not in all cases and it is also not perfect.

Third, when you pay your ransom how sure are you you’ll get your decryption key and be able to access your data? Well it seems only in 50% of the cases. Now, some ransomware “businesses’’ have a better customer service than many commercial companies and governments. But that doesn’t mean all of them do and by definition they are not honest people. Unless you consider ransomware “Encryption As A Service” that helps you with GDPR. I think not. You might think that a smart ransomware player delivers not to ruin future revenue streams by acquiring a bad reputation. Probably true, but they to can make mistakes, you can make mistakes, you can become road kill of vandals or of criminals who desire or are hired to incur havoc on a certain industry.

Finally, you might end up being a repeat victim as you have shown the willingness & ability to pay. Don’t forget that ransomware is not like mobster protection money. It will not protect you from others or the same ones doing it again.

Conclusion

Banking on having an emergency stash of Bitcoin (ransomware fund) just to pay ransomware isn’t your best option. It might be a last resort faced with the alternative of bankruptcy but even then it remains a costly and risky gamble.

I know that for some people in IT, backups seem outdated and from a gone by era, a solution to a problem form yesterday. I kid you not. Well, I advise you to think again and act upon what you concluded.

 

It’s not as simple as renaming the avhdx to vhdx

This arrives in via the feedback option on my blog

Hi. I see through your website that you are an expert in vhdx / avhdx file. I had a system crash with data loss. I think this data is in an avhdx file. When I rename this file in vhdx, I can mount it but I have an error: the file is corrupted. Do you know a procedure to repair this type of file? I thank you in advance for your support!

Oh dear! An expert? While flattery can get you a long way in life with certain people virtual disks are impervious to that sort of thing. Look, MVP, Veeam Vanguard, Dell Rockstar … tip of the spear, edge of the sword, it’s all fine and well but it’s no good to split a granite piece of rock and virtual disks don’t care about titles, jut about how they are designed to work.

Before we dive into some more details please use the comments sections under the relevant blog post to ask questions. That way everyone can benefit form the answer. It’s all quite anonymous if you want it to be. Secondly vendors like Microsoft have great public support forums with many thousand pairs of eyes reading. That might also work better and faster for your needs.

Some details

When you have avhdx your data is stored in the avhdx and in the parent disks (more avhdx but at least always one vhdx). While you can throw away what’s in a avhdx under certain conditions (and lose that data) and mount the vhdx you cannot throw away the vhdx and hope to be able to access the data in the avhdx you rename to vhdx.

clip_image002

For a case of real data corruption, not just phantom or mixed up VHDX/AVHDX chain, where you can try to intervene, even manually if needed – and if you have the skills – you’ll have to recover or restore data.

If the storage on which the vhdx/avhdx reside is corrupted a good but time-consuming run of chksdk /f /r can do the job. I have done that before with success. But there are no guarantees in this game.

Other than that, or when the storage is gone, it is restore time. This can be leveraging whatever backup solution you use or VSS snapshots on the storage side of things. Those options are your best bet. You can find some more info on manually manipulating vhdx/avhdx files here but that’s not what you’re facing here it seems.

If you don’t have recovery options in place, what can I say?

Stop what you’re doing and contact a good data recovery company. Only damage can come from trying if you don’t know what you’re doing. You can hope trial and error will fix it but that would be the triumph of hope over experience. You’re usually not that lucky. Trust me.

The snarky bit

I’ll fight like hell if I’m in a pickle and the data is valuable. But it’s near to impossible to do it for someone else as it’s hard, time consuming and often it’s a case were the files have been worked on before, so they tend to be messed up. If the data is not that valuable, just eat the loss.

In reality my time always seems less valuable then peoples their data . Now if you say you can help me retire early by trying anyway and are OK with a best effort, no guarantees given deal I might do it. But I’m pretty sure investing in backups and restores is way cheaper and will lead to better results. Your data is important and valuable, even when my time is not. Just saying

Frustrations about host level backups of Hyper-V guest clusters with Windows Server 2016

Introduction

With Windows Server 2016 came the hope and promise of improved backups for Hyper-V environments. And indeed Microsoft delivered on that and has given us faster, more scalable and more reliable backups. With VHD sets also came the promise of host based backups for guest clusters.

The problem is that this promise or, as it is perhaps better to be mild and careful, that expectation has not been met. Decent, robust host based backups of guest clusters in Windows Server 2016 are still not a reality. For me this means it blocked a few scenarios and we’re working on alternatives. This is a missed opportunity I think for MSFT to excel at virtualization.

The problem

Doing host based backup of guest clusters with VHD Set disks is supported in Windows Server 2016 under certain conditions.

At RTM it became clear that CSV inside the guest cluster was not supported.

You need a healthy cluster with all disks one line

These requirements are reflected in Errors discovered during backup of VHDS in guest clusters

Error code: ‘32768’. Failed to create checkpoint on collection ‘Hyper-V Collection’

Reason: We failed to query the cluster service inside the Guest VM. Check that cluster feature is installed and running.

Error code: ‘32770’. Active-active access is not supported for the shared VHDX in VM group

Reason: The VHD Set disk is used as a Cluster Shared Volume. This cannot be checkpointed

Error code: ‘32775’. More than one VM claimed to be the owner of shared VHDX in VM group ‘Hyper-V Collection’

Reason: Actually we test if the VHDS is used by exactly one owner. So having 0 owner also creates this error. The reason was that the shared drive was offline in the guest cluster

Unfortunately, this is not the only problems people are facing. Quite often the backup software doesn’t support backing up VHD Sets or when it does they fail. Some of those failings like being unable to checkpoint the VHD Set have been addressed via Windows Updates. But there are others issues.

Let’s look at the two most common ones.

Issue 1

You can make one backup an all subsequent backups fail. This is due to the avhdx files being in used and locked. This means that as long as the cluster is up and running the recovery checkpoint chain keeps growing. This can be “cleaned” or merged but only by taking down the cluster.

At the first backup live seems good.

image

The recovery checkpoint as a collection is indeed working.

clip_image003

All attempts at another backup fail.

clip_image005

Shutting down all cluster VMs and starting them up again does merge the recovery checkpoints.

Issue 2

You can make backups, successfully but the recovery checkpoints never get merged. clip_image007

This sounds “better” but it isn’t. There is no way to merge the checkpoint. Manually merging the checkpoints of a VHD Set is bad voodoo.

Both situations get you into problems and I have found no solution so far. At the time or writing I’m back at the “never ending” recovery checkpoint chain situation. But that can change back to the 1st issue I guess. Sigh.

I have found no solution so far

For now I have been unable to solve these problem. There is no fix or even a workaround. The only to get out if this stale mate is to shut down every node of the guest clusters and then restart them all. Just a restart of the guest nodes of the cluster doesn’t do the trick of releasing the checkpoints files and merging them. While this allows you to take one backup successfully again, the problem returns immediately. For you reference that was my issue with the October 2017 CU (KB)

The other scenario we run into is that the backups do work but the recovery checkpoints never ever merge. Not even when you shut down the all the guest VM cluster nodes and start them. With frequent backup that turns into a disaster of a never ending chain of recovery checkpoints. This is actually the situation I was in again after the November 2017 updates on both guests & hosts (KB4049065: Update for Windows Server 2016 for x64-based Systems and KB4048953: 2017-11 Cumulative Update for Windows Server 2016 for x64-based Systems).

To me this situation is blocking the use of guest clustering with VHD Sets where a backup is required. For many reasons we do not wish to go the route of iSCSI or vFC to the guest. That doesn’t cut it for us.

Conclusion

Host level backups of guest clusters in Windows Server 2016 are still a no go. This despite the good hopes we had with VHD Sets to address this limitation and which we were eagerly awaiting. For many of us this is a show stopper for the successful virtualization guest clusters. Every month we try again and we’re not getting anywhere. Hence the frustration and the disappointment.

More than 1 year after Windows Server 2016 RTM we still cannot do consistent host level backup a Hyper-V guest cluster, not even those without CSV, but also not those with standard clustered disks. Trust me on the fact that many of us have given this feedback to Microsoft. They know and I suggest you keep voicing your concerns to them in order to keep it on their radar screen and higher on the priority list. You can do this by opening support calls and by asking for it on user voice. Please Microsoft, we need these workloads to be first class citizens. I’m clearly not the only unhappy camper out there as noticeable in various support forums: Cannot create checkpoint when shared vhdset (.vhds) is used by VM – ‘not part of a checkpoint collection’ error and Backing up a Windows Failover Cluster with Shared vhdx?