Join our team, we're recruiting Contact us on: +44 (0)121 3145 374

The Site Doctor Blog

Footprints in the snow of a warped mind

< Back to Blog

How to: Restore nodes from a SQL backup

We had an odd issue the other day where a bunch of nodes disappeared from one of our Umbraco installs. There didn't appear to be any reason behind why they were gone and the customer assured me that they didn't delete them but they were gone for sure. No worries I thought, they'll be in the recycle bin but after a quick check I found they weren't… I had a feeling things were about to get tricky -and I was right.

Thankfully all our databases are backed up over night so I reached out for the latest backup to see if I could find the missing data. I had to go back a week or two but I finally I found it -but now what? It's an active CMS so we couldn't just blitz one database with the other so I had to turn to SQL.

Those of you who are familiar with Umbraco's database will know it's a little tricky to get your head around at the best of times let alone when you're trying to fix an issue. There wasn't much on Google so we had to write up some steps to transfer the content ourselves. After a little battling I think the resulting script should be usable by others however this is the first time I've seen this in about 8+ years of working with Umbraco so I hope no-one else ever needs it!

Anyway, once you have your databases restored you can use this script. It's quite long so I've split it out with a little explanation into each step. In short though the main tables you need to restore are: umbracoNode, cmsContent, cmsContentVersion, cmsPropertyData and cmsContentXml

Find the missing nodes

This basically does a quick comparison between the two databases to find any missing nodes. This will include any content, media as well as document types etc so may be a little too inclusive but you can filter out the ones you don't want. The "INTO #ToRestore" pops the records into a temporary table which is dropped later as we'll be referencing them a few times throughout the script and if there are a lot of nodes that could get quite costly.

-- Identify missing nodes between the two databases
SELECT r.id, r.text AS [NodeName], r.[path], (SELECT TOP 1 cv.VersionId FROM [BackupDB].dbo.cmsContentVersion cv WHERE cv.ContentId=r.id) AS [LatestVersion]
INTO #ToRestore
FROM [BackupDB].dbo.umbracoNode r LEFT JOIN [RestoreTo].dbo.umbracoNode m ON r.id = m.id
WHERE m.id IS NULL
ORDER BY r.id

-- The list of nodes to restore
SELECT * 
FROM #ToRestore r
ORDER BY r.[path]

umbracoNode

The table we're probably all most familiar with, this one contains the relationships between the different nodes. Without the records here nothing will work. It firstly changes the table so we can insert the ids (normally these are created by SQL) and it then inserts all missing nodes from the BackupDB.

-- Restore the nodes
SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode ON

INSERT INTO [RestoreTo].dbo.umbracoNode (id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate)
SELECT id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate
FROM [BackupDB].dbo.umbracoNode iin
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id)

SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode OFF

cmsContent

I wasn't too familiar with this table when this started but it's basically an important relationship table which contains information on the type of node each node is.

-- Restore the node to content joins
INSERT INTO [RestoreTo].dbo.cmsContent (nodeId, contentType)
SELECT DISTINCT c.nodeId, c.contentType
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
        LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id)

cmsContentVersion

This table is the version record for the node. This script only restores the most recent version of the content, you may need more but we didn't.

-- Restore the content versions
INSERT INTO [RestoreTo].dbo.cmsContentVersion (ContentId, VersionId, VersionDate, LanguageLocale)
SELECT DISTINCT cv.ContentId, cv.VersionId, cv.VersionDate, cv.LanguageLocale
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
        LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId
        LEFT JOIN [BackupDB].dbo.cmsContentVersion cv ON c.nodeId = cv.ContentId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentVersion ic WHERE ic.ContentId=iin.id)

cmsPropertyData

This will restore the actual values for each of the nodes.

-- Restore the property data
INSERT INTO [RestoreTo].dbo.cmsPropertyData (contentNodeId, versionId, propertytypeid, dataInt, dataDate, dataNvarchar, dataNtext)
SELECT DISTINCT ipd.contentNodeId, ipd.versionId, ipd.propertytypeid, ipd.dataInt, ipd.dataDate, ipd.dataNvarchar, ipd.dataNtext
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsPropertyData ic WHERE ic.contentNodeId=iin.id)

cmsContentXml

These are the snippets of XML that are used to build the larger umbraco.config file, we're not sure whether they were needed in the end but it saves re-saving everything!

-- Restore the XML
INSERT INTO [RestoreTo].dbo.cmsContentXml (nodeId, [xml])
SELECT cx.nodeId, cx.[xml]
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
        LEFT JOIN [BackupDB].dbo.cmsContentXml cx ON iin.id=cx.nodeId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentXml ic WHERE ic.nodeId=iin.id)
    AND cx.nodeId is not null

The entire script

To use it, replace BackupDB with the name of the restored database i.e. the one with the missing nodes in. Then replace RestoreTo with the one you want to add the missing nodes to. Once you're happy it's working, you can change the ROLLBACK TRAN to COMMIT TRAN and run one list time...

Note: You won't be able to see the nodes in Umbraco until you have run it with COMMIT in place. Also, don't forget to take a backup of your live database before running scripts from the interwebs... You know... just in case little Bobby Drop Tables is around ;)/p>

BEGIN TRAN

-- Identify missing nodes between the two databases
SELECT r.id, r.text AS [NodeName], r.[path], (SELECT TOP 1 cv.VersionId FROM [BackupDB].dbo.cmsContentVersion cv WHERE cv.ContentId=r.id) AS [LatestVersion]
INTO #ToRestore
FROM [BackupDB].dbo.umbracoNode r LEFT JOIN [RestoreTo].dbo.umbracoNode m ON r.id = m.id
WHERE m.id IS NULL
ORDER BY r.id

-- The list of nodes to restore
SELECT * 
FROM #ToRestore r
ORDER BY r.[path]

-- Restore the nodes
SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode ON

INSERT INTO [RestoreTo].dbo.umbracoNode (id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate)
SELECT id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate
FROM [BackupDB].dbo.umbracoNode iin
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id)

SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode OFF

-- Restore the node to content joins
INSERT INTO [RestoreTo].dbo.cmsContent (nodeId, contentType)
SELECT DISTINCT c.nodeId, c.contentType
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
        LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id)

-- Restore the content versions
INSERT INTO [RestoreTo].dbo.cmsContentVersion (ContentId, VersionId, VersionDate, LanguageLocale)
SELECT DISTINCT cv.ContentId, cv.VersionId, cv.VersionDate, cv.LanguageLocale
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
        LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId
        LEFT JOIN [BackupDB].dbo.cmsContentVersion cv ON c.nodeId = cv.ContentId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentVersion ic WHERE ic.ContentId=iin.id)

-- Restore the property data
INSERT INTO [RestoreTo].dbo.cmsPropertyData (contentNodeId, versionId, propertytypeid, dataInt, dataDate, dataNvarchar, dataNtext)
SELECT DISTINCT ipd.contentNodeId, ipd.versionId, ipd.propertytypeid, ipd.dataInt, ipd.dataDate, ipd.dataNvarchar, ipd.dataNtext
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsPropertyData ic WHERE ic.contentNodeId=iin.id)

-- Restore the XML
INSERT INTO [RestoreTo].dbo.cmsContentXml (nodeId, [xml])
SELECT cx.nodeId, cx.[xml]
FROM
    [BackupDB].dbo.umbracoNode iin
        LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId
        LEFT JOIN [BackupDB].dbo.cmsContentXml cx ON iin.id=cx.nodeId
WHERE
    iin.id IN (SELECT tr.id FROM #ToRestore tr)
    AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentXml ic WHERE ic.nodeId=iin.id)
    AND cx.nodeId is not null

-- Clean up
DROP TABLE #ToRestore

ROLLBACK TRAN
Author: Tim on

Liked this post? Got a suggestion? Leave a comment