Quick Fix Publish : VM won’t boot after October 2017 Updates for Windows Server 2016 and Windows 10 (KB4041691)

If you had WSUS (or SCCM) running last night with auto-approval on, you might have woken up this morning to virtual machines that can’t boot anymore.

image

Great, another update gone wrong. Time to restore from backup, as that can be the fastest way to restore services when you’re in a pickle, provided you have a good solution for that in place. For the others, what I did is described below. Actually, a couple of us MVPs were on this issue at a number of sites as our first task this morning. But first, the root cause.

Well, read the linked article, Express update delivery ISV support, and you have all you need. Basically, both the delta and the full cumulative update for October (KB4041691, https://support.microsoft.com/en-us/help/4041691) ended up in WSUS without you explicitly putting them there. That should not happen: normally the delta is not published to WSUS to be downloaded and, heaven forbid, auto-approved. You could also have manually approved everything without really knowing what and why. Not a great idea at all.

image

So your VM gets offered both of them and that is BAD!

image

Normally you get into this pickle if you somehow managed to install both of these yourself or via other tools (see the link above), which you shouldn’t do.
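To see whether any delta updates are currently approved on your own WSUS server, something along these lines might help. It is a minimal sketch, assuming it runs on the WSUS server with the UpdateServices PowerShell module available; the '*Delta*' title filter is my assumption about how these updates are named, so review the list before declining anything.

```powershell
# Sketch only: run on the WSUS server (UpdateServices module).
# The '*Delta*' title filter is an assumption about the update naming.
$deltas = Get-WsusUpdate -Approval Approved -Status Any |
    Where-Object { $_.Update.Title -like '*Delta*' }

# Review what would be touched before changing anything.
$deltas | ForEach-Object { $_.Update.Title }

# -WhatIf only reports what would be declined; remove it once you are sure.
$deltas | Deny-WsusUpdate -WhatIf
```

Keeping -WhatIf on for the first run is deliberate: you want to read the titles before you decline anything.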

Now, if you don’t have decent restore capabilities from backups or snapshots, there is another way out: removing the updates.

Boot the problematic VM and select Troubleshoot.

image

Select to open the command prompt and stay away from any other auto repair options.

image

Microsoft advises to get rid of the SessionsPending reg key. To do so load the software registry hive as follows:

reg load hklm\temp c:\windows\system32\config\software

Delete the Exclusive value under the SessionsPending registry key, if it exists, by running:

reg delete "HKLM\temp\Microsoft\Windows\CurrentVersion\Component Based Servicing\SessionsPending" /v Exclusive

Unload the software registry hive:

reg unload HKLM\temp

Run dism /image:c:\ /get-packages to find the updates installed that caused the issue

image

The ones marked in yellow are the ones of interest, and you can see the first one never even got an install time.

We now use DISM to remove these updates. First create the C:\Temp folder with md c:\temp if it doesn’t exist yet!

dism /image:c:\ /remove-package /packagename:myproblematicpackagetoremove /scratchdir:c:\temp

image

When done, close the command prompt, shut down the VM and then start it.

image

It will take a while but it will succeed and you’ll be greeted by a logon screen. Good luck!

Important: Do not try any other repair options, or removing the updates with DISM might fail. We chose to remove all 3 updates from last night to make sure. It might suffice to remove the delta one alone, but we wanted to have a VM back as it was last night so more testing can be done before it is deployed again.

So, basically, don’t auto-approve updates blindly, but test, validate and roll out in phases. Have great backups and TESTED restores. All in all we were only bitten in the lab, on a couple of test/dev VMs and on some of our infra VMs. Most of these are redundant and are patched in a staggered fashion, so our services were never badly affected. That gave us time to troubleshoot, investigate and warn our colleagues. As you can see here, the issue was a delta update that made it into WSUS and was installed together with the full CU. Just manually downloading the CU and testing it would not have given you the heads-up about an issue. This is a reminder that you need to test your real-life situation and processes as realistically as possible. When you’re done with testing and cleaning up any fallout of this issue, make sure to patch your systems again!

Update: this also goes for Windows 10 Updates

Also see fellow MVP Mikael Nystrom’s blog post: https://deploymentbunny.com/2017/10/11/the-october-2017-update-inaccessible-boot-device/

Update: we now also have the official MSFT response & fix for each and every scenario right here https://support.microsoft.com/en-us/help/4049094/windows-devices-may-fail-to-boot-after-installing-october-10-version-o

Changes in RDP over UDP behavior in Windows 10 and Windows 2016

Introduction

With Windows Server 2012 and Windows 8 (and the RDP 8.0 client for Windows 7), plus some updates, we got support for RDP to use UDP for data transport. This gave us a great experience over less reliable and even rather bad networks.

Anecdote: I was in an area of the world where there was no internet access available bar a very bad and lousy Wi-Fi connection at the shop/cafeteria. That was just fine, I wasn’t there for the great Wi-Fi access at all. But I needed to check e-mail and that wasn’t succeeding in any way, the network reliability was just too bad. I got the job done by using RDP to connect to a workstation back home (across the ocean on another continent) and check my e-mail there. Not a super great experience but UDP made it possible where nothing else worked. I was impressed.

Changes in RDP over UDP behavior in Windows 10 and Windows 2016

When connecting to Windows Server 2016 or Windows 10 over an RD Gateway we see 1 HTTP and only 1 UDP connection being established for a session. We used to see 1 HTTP and 2 UDP connections per session with Windows 8/8.1 and Windows Server 2012 (R2).

It doesn’t matter if your client is running RDP 8.0 or RDP 10.0 or whether the RD Gateway itself is running Windows Server 2012 R2 or Windows Server 2016. The only thing that does matter is the target that you are connecting to.

Also, this has nothing to do with a firewall or the like acting up: we tested both with and without one, with the same IP, etc. Let’s take a quick look at some examples and compare.

When connecting to Windows 10 or Windows Server 2016 we see that 1 UDP connection is established.

In total, there are 8 events logged for a successful connection over the RD Gateway.

clip_image002

You’ll find 2 event ID 302 events (1 for the HTTP connection and 1 for the UDP connection) as well as 1 event ID 205 event for the UDP proxy usage.

clip_image003

clip_image004

In the RD Gateway Manager, under Monitoring, we can see the 1 HTTP and the 1 UDP connection for one RDP session to a Windows Server 2016 machine.

clip_image006

When connecting to Windows 8/8.1 or Windows Server 2012 (R2) we see that 2 UDP connections are established.

In total, there are 10 events logged for a successful connection over the RD Gateway:

clip_image008

You’ll find 3 event ID 302 events (1 for the HTTP connection and 2 for the UDP connections) as well as 2 event ID 205 events for the UDP proxy usage.

In the RD Gateway Manager, under Monitoring, we can see the 1 HTTP and the 2 UDP connections for one RDP session to a Windows Server 2012 R2 machine.

clip_image010
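If you’d rather count these events than eyeball them, a sketch like the following could do it on the RD Gateway server. The log name is my assumption about where these events land, and the 10 minute window is an arbitrary choice; adjust both to your environment.

```powershell
# Sketch: count event IDs 302 and 205 logged on the RD Gateway recently.
# Assumption: the events live in this operational log on your gateway.
$filter = @{
    LogName   = 'Microsoft-Windows-TerminalServices-Gateway/Operational'
    Id        = 302, 205
    StartTime = (Get-Date).AddMinutes(-10)   # arbitrary look-back window
}

Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Group-Object -Property Id |
    Select-Object -Property Name, Count
```

Make one test connection, run this right after, and compare the counts per event ID for the different target operating systems.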

So, RDP-wise, something seems to have changed. But I do not know the story behind it or the reason why.

Live Export a Running Virtual Machine or a Checkpoint

A remarkably little-known feature in Windows Server 2012 R2 (and Windows 8.1) is the ability to export one or multiple running virtual machines.

image

You just select the VM, right-click it in Hyper-V Manager, select Export from the context menu and follow the wizard to select an export location. Easy. This is also possible via PowerShell, so you can automate it. The result is a VM you can import, which gives you a copy of the original virtual machine in a saved state, at the point in time that you exported it.

More people seem to know about the capability to export a checkpoint of a running virtual machine than about the capability to export a running VM itself. I noticed this because some people figured the latter was a new feature in Windows Server 2016. No, it’s not. We’ve had this option since Windows 8.1 and Windows Server 2012 R2.
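As a minimal sketch of the PowerShell route: the VM name and export folder below are hypothetical placeholders, and this assumes you run it on the Hyper-V host with the Hyper-V module available.

```powershell
# Hypothetical VM name and export folder; adjust to your environment.
$vmName     = 'DemoVM01'
$exportPath = 'D:\Exports'

# Export a single VM; this works while the VM is running.
Export-VM -Name $vmName -Path $exportPath

# Or export every VM that is currently running on this host.
Get-VM |
    Where-Object { $_.State -eq 'Running' } |
    Export-VM -Path $exportPath
```

Each VM ends up in its own subfolder under the export path, ready to be imported elsewhere.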

image

So why even have the option of exporting a checkpoint of a running VM? Because this enables you to have exports from various points in time, which is pretty cool and handy during test and development, troubleshooting or lab work. As a standard checkpoint includes state in Windows Server 2012 R2, I prefer to shut down the VM, create a checkpoint and start the VM again. When I then export that checkpoint I don’t have to worry about the state in the VM at that point in time, as it was shut down.

For some workloads this isn’t a big deal, but for others it is not a great experience, hence the fact that checkpoints are “not supported” in production, only for test and dev.

In Windows Server 2016 we now have production checkpoints. That means that when we apply such checkpoints we have a consistent state, just like when we restore a VM from a backup. You’ll have to boot it up after applying the checkpoint; the VM does not come up running with the state it had at the time the snapshot was taken. Well, not unless you opt to create standard checkpoints. This reduces the need for me to shut down a VM before I create a checkpoint to export in many cases.

When you export a running VM in Windows Server 2016 you’ll have a copy of it in saved state. Just like you did in Windows Server 2012 R2, no change there. When you import that you’ll have a VM in saved state that you need to start up. If you want an application consistent copy, create a production checkpoint first and export that one.
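The production checkpoint route could look roughly like this on a Windows Server 2016 host. It is a sketch with hypothetical names for the VM, the checkpoint and the export folder.

```powershell
# Hypothetical names; adjust to your environment.
$vmName         = 'DemoVM01'
$checkpointName = 'BeforeChange'
$exportPath     = 'D:\Exports'

# Ask for production checkpoints. With 'Production', Hyper-V falls back to a
# standard checkpoint if a production one cannot be taken; 'ProductionOnly'
# would fail instead.
Set-VM -Name $vmName -CheckpointType Production

# Take the checkpoint while the VM keeps running.
Checkpoint-VM -Name $vmName -SnapshotName $checkpointName

# Export just that checkpoint as an importable VM.
Export-VMSnapshot -VMName $vmName -Name $checkpointName -Path $exportPath
```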

So there you go. The feature to live export a running virtual machine was here before and it’s still here. The real extra capability with live exports comes from leveraging the live export of a checkpoint of a running virtual machine and the fact that we now have production checkpoints.

Issues to watch out for when configuring Discrete Device Assignment

When you’re discovering how to get discrete device assignment to work you have some potential bumps that might trip you up. So what are the issues to watch out for when configuring Discrete Device Assignment? We’ll share some here but note this is from testing with Windows Server 2016 Technical Preview 4. Changes can and probably will happen before RTM.

Make sure your VMs are running on the latest configuration version. That means 7.x at the time of writing. Many of the new features require this as discussed in Windows Server 2016 TPv4 Hyper-V brings virtual machine configuration version 7

Check the configuration version of the VM

When you try to add a GPU to a VM via Discrete Device Assignment you’ll get an error when the VM has version 5.0 instead of 7.x. This can easily happen when you move VMs from older versions to a shiny new Windows Server 2016 environment, as in the example below:

image

Naturally all of this is logged in the Hyper-V-VMMS Admin logs as well

‘W2K12R2’ cannot add device ‘Virtual Pci Express Port’ until the virtual machine is upgraded. (Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)
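A quick way to spot such VMs before you run into the error above is to list the configuration versions. This is a sketch, assuming the Update-VMVersion cmdlet name as it exists in Windows Server 2016; preview builds may name it differently.

```powershell
# List the configuration version of every VM on this host.
Get-VM | Select-Object -Property Name, Version, State

# Upgrading is one-way and requires the VM to be off.
# 'W2K12R2' is the VM name taken from the error message above.
Stop-VM -Name 'W2K12R2'
Update-VMVersion -Name 'W2K12R2'
```

Remember that an upgraded VM can no longer be moved back to a host running the older Hyper-V version.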

Mind your Dynamic Memory settings

Another thing you need to watch out for is that when you use Dynamic Memory, the startup memory and the minimum memory values have to match. So the minimum memory cannot be lower than the startup memory. Do note that this is TPv4 and things might change.

image

Cannot add the device to ‘W2K12R2’ as that virtual machine has Dynamic Memory configured with different startup memory and minimum memory values. When adding a device, the virtual machine must be configured with equal startup memory and minimum memory values.(Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)

If you try to change this on a VM with discrete device assignment enabled you’ll also find that this isn’t allowed.

image

Cannot perform the operation for ‘W2K12R2’ as the specified memory settings are not compatible for device assignment. The startup memory size and minimum memory size must be equal when Dynamic Memory is enabled and devices are also assigned.(Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)
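Checking and fixing this up front might look like the sketch below. It assumes the VM is shut down and that raising the minimum to the startup value is acceptable for your workload.

```powershell
# 'W2K12R2' is the VM name from the error messages above.
$vmName = 'W2K12R2'
$memory = Get-VMMemory -VMName $vmName

if ($memory.DynamicMemoryEnabled -and ($memory.Startup -ne $memory.Minimum)) {
    # Make the minimum equal to the startup value so DDA accepts the VM.
    Set-VMMemory -VMName $vmName -MinimumBytes $memory.Startup
}

# Verify the result.
Get-VMMemory -VMName $vmName |
    Select-Object -Property VMName, DynamicMemoryEnabled, Startup, Minimum, Maximum
```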

Set the automatic stop action to “Turn off the virtual machine”

I already mentioned this in the blog, but you need to make sure that the automatic stop action for the virtual machine is set to “Turn off the virtual machine” and not to the default of “Save the virtual machine state”. You cannot use DDA unless you do so.

image

Cannot add the device to ‘W2K12R2’ as that virtual machine is configured to go to saved state on host shutdown. (Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)

Again, changing this on a VM that has DDA assigned will not work.

image
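In PowerShell this setting can be checked and changed as sketched here, before any device is assigned to the VM.

```powershell
# 'W2K12R2' is the VM name from the error message above.
$vmName = 'W2K12R2'

# Show the current automatic stop action (the default is Save).
Get-VM -Name $vmName | Select-Object -Property Name, AutomaticStopAction

# DDA requires TurnOff; set it before assigning a device.
Set-VM -Name $vmName -AutomaticStopAction TurnOff
```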

Discrete means one on one

Remember that you cannot assign a device to more than one VM. The thing here is that it won’t block you when both VMs are shut down, at least not in TPv4. But the device is dedicated, so when you do this and try to start any of those VMs it won’t work.

image

An error occurred while attempting to start the selected virtual machine(s).

‘RFX-WIN10ENT’ failed to start.

Virtual Pci Express Port (Instance ID 9B15DD32-5F94-46EF-8524-501007830322): Failed to Power on with Error ‘The device is in use by an active process and cannot be disconnected.’.

When you try to assign a GPU that is already assigned to a running VM, it will block you!

The error clearly states that the device is already in use by another VM.

Add-VMAssignableDevice -LocationPath $LocationPathOfDismountedDA -VMName RFX-WIN10ENT
Add-VMAssignableDevice : ‘RFX-WIN10ENT’ failed to add resources to ‘RFX-WIN10ENT’.
Virtual Pci Express Port (Instance ID EA7CB907-C38A-4396-97E0-A9A8F3C2D1B0): Failed to Power on with Error ‘The device
is in use by an active process and cannot be disconnected.’.
‘RFX-WIN10ENT’ failed to add resources. (Virtual machine ID 425A366E-E380-4D8C-AADE-DE16EAC0A104)
‘RFX-WIN10ENT’ Virtual Pci Express Port (Instance ID EA7CB907-C38A-4396-97E0-A9A8F3C2D1B0): Failed to Power on with
Error ‘The device is in use by an active process and cannot be disconnected.’ (0x80070964). (Virtual machine ID
425A366E-E380-4D8C-AADE-DE16EAC0A104)
Could not allocate the PCI Express device with the Plug and Play Device Instance path
‘PCIP\VEN_10DE&DEV_0FF2&SUBSYS_101210DE&REV_A1\6&17F903&0&00400010’ because it is already in use by another VM.
At line:1 char:1
+ Add-VMAssignableDevice -LocationPath $LocationPathOfDismountedDA -VMN …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : NotSpecified: (:) [Add-VMAssignableDevice], VirtualizationException
+ FullyQualifiedErrorId : OperationFailed,Microsoft.HyperV.PowerShell.Commands.AddVmAssignableDevice
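To find out which VM already holds a device, you can list the assigned devices per VM and compare the location paths. This is a minimal sketch to run on the Hyper-V host.

```powershell
# List every assigned device together with the VM that holds it.
foreach ($vm in Get-VM) {
    Get-VMAssignableDevice -VMName $vm.Name |
        Select-Object -Property @{ Name = 'VMName'; Expression = { $vm.Name } },
                                LocationPath
}
```

Match the LocationPath in the output against the one you are trying to assign to see which VM is the current owner.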

Shut down your VM to make changes to DDA

Last but not least, to use DDA (assign, configure) with a VM you have to shut it down. Removing devices whilst the VM is running isn’t blocked, but the results can be quite “harsh”. This is me removing a DDA GPU from a Windows 2012 R2 VM whilst it’s running.

image

The fun part is that you can add it again while the VM is running and it will work, but it’s not a healthy thing to do.

As stated above, these notes are from testing with Windows Server 2016 Technical Preview 4, so things can still change. Happy testing!
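The healthier sequence is to shut down first and only then touch the device, roughly as sketched here. Mounting the device back on the host at the end is optional and depends on what you want to do with it next.

```powershell
# 'RFX-WIN10ENT' is the VM name used in the example above.
$vmName = 'RFX-WIN10ENT'

# Shut down the guest cleanly before changing anything.
Stop-VM -Name $vmName

foreach ($device in Get-VMAssignableDevice -VMName $vmName) {
    # Detach the device from the VM.
    Remove-VMAssignableDevice -LocationPath $device.LocationPath -VMName $vmName

    # Optional: hand the device back to the host.
    Mount-VMHostAssignableDevice -LocationPath $device.LocationPath
}
```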