Please note: This blog is no longer active. My new blog is located at http://blog.timwheeler.io

Wednesday, November 24, 2010

The Heading “Libraries” automatically added through content import

Problem

When exporting and importing a single Document Library using the Export-SPWeb and Import-SPWeb commands, the heading "Libraries" is automatically added to the site navigation if it is not already there. Exporting and then re-importing to the same site or another site always adds the heading. There seems to be no option to stop this behaviour.

Solution

As I don’t want the Libraries heading to be added, I guess I can either delete it manually from each site or build a PowerShell script that moves through all the SPWebs and deletes it. Not a great solution, and this really should be controlled via parameters on Import-SPWeb. Hey Microsoft, can you sort that for me please? Thanks.
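
A rough sketch of what that clean-up script could look like is below. It is untested against a real farm and makes a few assumptions: the function name Remove-LibrariesHeading is made up for illustration, the heading is matched purely on its title (so a localised site would need a different title), and deleting the heading is assumed to be acceptable even on sites where it was there before the import.

```powershell
# Sketch only: assumes the SharePoint 2010 snap-in is available on this server.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Hypothetical helper: removes Quick Launch headings with the given title
# from every web in the site collection.
function Remove-LibrariesHeading($siteCollectionUrl, $headingTitle = "Libraries")
{
    $site = Get-SPSite -Identity $siteCollectionUrl -ErrorAction Stop
    try
    {
        Get-SPWeb -Site $site -Limit All | ForEach-Object {
            $web = $_
            # Copy the matching nodes into an array first so that deleting
            # a node doesn't disturb the enumeration of the Quick Launch
            $headings = @($web.Navigation.QuickLaunch | Where-Object { $_.Title -eq $headingTitle })
            foreach($node in $headings)
            {
                Write-Host "Removing heading '$headingTitle' from" $web.Url
                $node.Delete()
            }
            $web.Dispose()
        }
    }
    finally
    {
        $site.Dispose()
    }
}
```

It would be called with the site collection URL, for example Remove-LibrariesHeading "http://server/sites/example" (a placeholder URL). Matching on the title is the weak point here, so check what would be deleted on a test site first.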

Wednesday, November 17, 2010

Content Migration and Taxonomy Fields

I have been using Import-SPWeb and Export-SPWeb to migrate content between different environments and found an issue if the sites use Taxonomy Fields.

The problem in a nutshell

The content is exported from a site using the PowerShell command Export-SPWeb or with code. The site contains taxonomy fields included in content types, and those fields are wired to a Managed Metadata Service term store. Each taxonomy field holds IDs that point both to a TaxonomyHiddenList at the root web level and to the term store itself. These IDs are migrated along with the list. However, in the environment you import the content into, the Managed Metadata Service has a different ID for the term store, so the taxonomy fields fail to work properly.

Note: I'm not sure if this is an issue if you use the default Managed Metadata Service. I am using a new one created from a PowerShell script.
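
For reference, the migration described above boils down to something like the following pair of commands. The URLs and the package path are placeholders, not the real environments, and any extra switches you need (versions, security and so on) are left out.

```powershell
# Placeholder URLs and path; substitute your own environments.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Export the source web to a content migration package
Export-SPWeb -Identity "http://source-server/sites/publishing" -Path "C:\Migration\publishing.cmp"

# Import the package into the matching web in the target environment
Import-SPWeb -Identity "http://target-server/sites/publishing" -Path "C:\Migration\publishing.cmp"
```

The package carries the field definitions exactly as they were in the source, which is how the source term store ID ends up in the target environment.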

Visible Issues

If the site was accessed as an anonymous user, some pages would error with:

[Image: screenshot of the error message]

The error message might as well just say “Something bad’s happened, but you have to figure out what. Good luck!”  (Well, I could turn off custom errors but that would be too easy and harder to make fun of!)

Slightly more helpful are the trace logs, which showed:

"Hidden list not found, creating new one"

"Creating taxonomy hidden list"

"Unknown SPRequest error occurred. More information: 0x80070005"

"System.ArgumentException: Value does not fall within the expected range."

Strangely, the first page would load correctly for an anonymous user and subsequent requests to other pages would fail. I have a theory about why, but wild speculation is probably best left unsaid.

Other Symptoms

The publishing root web contains a list called "TaxonomyHiddenList". This list is used by the taxonomy fields to store the terms in use for the site. After using Import-SPWeb, many extra entries were added to this list with an empty title.

If you use SharePoint Manager (very useful BTW), you will see a bunch of entries with “(no title)”. As far as I can see this shouldn’t happen. Normally, when the taxonomy fields create these entries, they include the label as the title.

[Image: SharePoint Manager screenshot showing TaxonomyHiddenList entries with “(no title)”]

The above screenshot of SharePoint Manager shows the bad entries. Notice as well that there are lots, 64 in this example. If this was working properly you should see the following:

[Image: SharePoint Manager screenshot showing TaxonomyHiddenList entries with valid titles]

In this screenshot the entries have a valid title. These values originally come from the terms in the term store. These terms were created by a PowerShell script and added to the term store; they are some of the ones required for a Dublin Core and AGLS metadata implementation. (Which is probably about as interesting as XML namespaces or writing documentation, but I felt I had to explain them anyway.)

The cause

If you look at the definition of Microsoft.SharePoint.Taxonomy.TaxonomyField you will see a property called SspId. This property is the Guid of the term store that the field is connected to. After importing, this value still points to the old term store. That has the side effect of creating bad entries in the /Lists/TaxonomyHiddenList list and is likely the cause of the entries in the trace logs. The data in the field also points to the bad entry within that list rather than the entry it should point to. There is a good write-up on how the entries are created here so I won't go into that part too much.
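
One way to see the mismatch for yourself is to compare each taxonomy field's SspId with the term stores that are actually available to the site. The following is a minimal sketch rather than a tested tool: the URL is a placeholder, it only looks at the site columns on the root web, and the OK/STALE labels are my own wording.

```powershell
# Sketch only: the URL below is a placeholder for your site collection.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$site = Get-SPSite -Identity "http://server/sites/example" -ErrorAction Stop
try
{
    # Collect the IDs of the term stores this site collection can actually reach
    $session = New-Object Microsoft.SharePoint.Taxonomy.TaxonomySession -ArgumentList $site
    $knownIds = @($session.TermStores | ForEach-Object { $_.Id })

    # Report every taxonomy site column and whether its SspId matches a known store
    $site.RootWeb.Fields |
        Where-Object { $_ -is [Microsoft.SharePoint.Taxonomy.TaxonomyField] } |
        ForEach-Object {
            $status = if ($knownIds -contains $_.SspId) { "OK" } else { "STALE" }
            Write-Host ("{0,-6} {1} (SspId {2})" -f $status, $_.InternalName, $_.SspId)
        }
}
finally
{
    $site.Dispose()
}
```

Any field reported as STALE is still pointing at a term store from the environment the content was exported from.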

The fix

The fix for this is much simpler than tracking it down, I assure you. After many frustrating hours and some stern words aimed directly at SharePoint, I managed to cobble together a script that suitably cleans my site collections after content migration.

The following is a PowerShell script that will clean up your publishing site. Use at your own risk. Back up your data, etc, etc! Wish you the best of luck! Like a Microsoft error message :)

Note: This script will utilise the field definitions in the root web.  If you are migrating an entire web then you should fix your content types prior to running this script.

Code Snippet
#Script: CleanTaxonomyFields.ps1
#Author: Tim Wheeler (http://tjwheeler.blogspot.com/)
#Created: 17/11/2010
#Notes:
#When content is imported into a new environment the Term Store ID is kept on the
#taxonomy fields.  This causes a problem as the new term store has a different ID.
#These functions clean up the taxonomy hidden list, which has bad entries following
#an import, then visit every publishing page and reset the correct values.
param
    (    $siteCollectionUrl = (Read-Host "Please enter a site collection url") )

#region SharePoint Snapin Setup
$snapin = "Microsoft.SharePoint.PowerShell"
if (Get-PSSnapin $snapin -ea "SilentlyContinue") {
    Write-Host -f Green "PSSnapin $snapin is loaded"
}
else {
    if (Get-PSSnapin $snapin -Registered -ea "SilentlyContinue") {
        Write-Host -f Green "PSSnapin $snapin is registered"
        Add-PSSnapin $snapin
        Write-Host -f Green "PSSnapin $snapin is loaded"
    }
    else {
        Write-Host -f Red "PSSnapin $snapin not found"
        #Nothing below can work without the snap-in, so stop here
        throw "PSSnapin $snapin not found"
    }
}
#endregion

#region Data Clean Up

#Deletes the entries with an empty title that the import left in the hidden list.
function CleanTaxonomyList($site)
{
    Write-Host "Cleaning /Lists/TaxonomyHiddenList"
    $web = $site.RootWeb
    #GetList expects a server-relative URL, so prefix the root web's own path
    $taxlist = $web.GetList($web.ServerRelativeUrl.TrimEnd('/') + "/Lists/TaxonomyHiddenList")
    #Read Items once; every call to the property queries the list again
    $items = $taxlist.Items
    #Walk backwards so deleting an item doesn't shift the ones still to be checked
    for($count = $items.Count - 1; $count -ge 0; $count--)
    {
        $item = $items[$count]
        if([string]::IsNullOrEmpty($item.Title))
        {
            try
            {
                $item.Delete()
            }
            catch
            {
                Write-Host -ForegroundColor Red "Failed to delete item with bad title"
            }
        }
    }
}

#Returns the term GUID held in a taxonomy field value, or $null if there isn't one.
#Multi-value fields hand back a collection, in which case the first value is used.
function GetTermGuid($fieldValue)
{
    if($null -eq $fieldValue) { return $null }
    if($fieldValue.GetType().Name -ne "TaxonomyFieldValue")
    {
        if($fieldValue.Count -eq 0) { return $null }
        $fieldValue = $fieldValue[0]
    }
    return $fieldValue.TermGuid
}

#This function will reset the term store id if it is wrong, and will fix the WssId.
#If the correct term cannot be located, the default one will be used.
function ResetTaxonomyDefaults($site, [Microsoft.SharePoint.Publishing.PublishingWeb] $web)
{
    $txs = New-Object "Microsoft.SharePoint.Taxonomy.TaxonomySession" -ArgumentList $site
    $pages = $web.GetPublishingPages()
    foreach($page in $pages)
    {
        Write-Host -ForegroundColor Cyan "Checking publishing page " $page.Title
        if($page.ListItem.File.CheckOutStatus -ne "None")
        {
            $page.CheckIn("Checked in by data clean process")
        }
        $page.CheckOut()
        #Snapshot the taxonomy fields first so updating a field can't disturb the loop
        $taxFields = @($page.ListItem.Fields | Where-Object { $_.GetType().Name -eq "TaxonomyField" })
        foreach ($field in $taxFields)
        {
            $taxField = [Microsoft.SharePoint.Taxonomy.TaxonomyField] $field
            Write-Host "Found field to update:" $taxField.Title
            $currentValue = $page.ListItem.Properties[$taxField.InternalName]
            Write-Host "Current Value is" $currentValue
            #The root web's copy of the field is treated as the correct definition
            $templateField = $site.RootWeb.Fields[$field.Id]
            if($null -eq $templateField)
            {
                Write-Host -ForegroundColor Red ("No root web field found for {0}, skipping" -f $taxField.InternalName)
                continue
            }
            $defaultValue = $templateField.DefaultValue
            $termStore = $txs.TermStores[$templateField.SspId]
            $termSet = $termStore.GetTermSet($templateField.TermSetId)
            if($taxField.SspId -ne $templateField.SspId)
            {
                Write-Host "TaxField SspId is not correct, updating"
                $taxField.SspId = $templateField.SspId
                $taxField.Update()
            }
            if($taxField.TermSetId -ne $templateField.TermSetId)
            {
                Write-Host "TaxField TermSetId is not correct, updating"
                $taxField.TermSetId = $templateField.TermSetId
                $taxField.Update()
            }
            $termGuid = GetTermGuid ($templateField.GetFieldValue($currentValue))
            if([string]::IsNullOrEmpty($termGuid))
            {
                #Nothing usable on the page, fall back to the field's default value
                $termGuid = GetTermGuid ($templateField.GetFieldValue($defaultValue))
            }
            #Start from $null so a failed lookup can't reuse the previous field's term
            $term = $null
            if(-not [string]::IsNullOrEmpty($termGuid))
            {
                try
                {
                    $term = $termSet.GetTerm([Guid] $termGuid)
                }
                catch
                {
                    #Reported below, where a missing term is handled
                }
            }
            if($null -eq $term)
            {
                Write-Host -ForegroundColor Red ("Failed to update field {0} for page {1} in web {2}" -f $taxField.InternalName, $page.Title, $web.Url)
                continue
            }
            $taxField.SetFieldValue($page.ListItem, $term)
        }
        $page.ListItem.Update()
        $page.CheckIn("Data clean process")
    }
}

function CleanSiteCollection ($siteColUrl)
{
    [Microsoft.SharePoint.SPSite] $site = Get-SPSite -Identity $siteColUrl -ErrorAction SilentlyContinue
    if($null -eq $site)
    {
        Write-Host -ForegroundColor Red "Unable to find site collection"
        throw "Unable to find site collection"
    }
    try
    {
        CleanTaxonomyList $site
        Get-SPWeb -Site $site -Limit All | ForEach-Object {
            #Check to see if site is a publishing site
            if ([Microsoft.SharePoint.Publishing.PublishingWeb]::IsPublishingWeb($_))
            {
                Write-Host "Cleaning pages in `"$($_.Title)`" site."
                #Get the Publishing Web and pages within it
                $publishingWeb = [Microsoft.SharePoint.Publishing.PublishingWeb]::GetPublishingWeb($_)
                ResetTaxonomyDefaults $site $publishingWeb
            }
            $_.Dispose()
        }
    }
    finally
    {
        #Release the site collection even if one of the steps above failed
        $site.Dispose()
    }
}
#endregion
CleanSiteCollection $siteCollectionUrl