I have inherited a site that previously had a couple of thousand nodes imported from an old Umbraco site. My task is now to re-run the import, but only import the missing keywords for each node. I only want to update existing nodes (matched by node name) and never want new nodes created. Is this possible with CMSImport? My experience is that if a matching node is not found then a new node is created, which is not what I want.
I've had a look at the documentation and I'm struggling to work out how to decide if a matching node has been found or not. You say to "Cancel the import when Id = 0" but what Id?
I've tried this - but I suspect that articleId contains the id from the source node rather than the destination node:
In any case the articleId is never zero. Should I be trying to check the import action and cancelling if it equals ImportAction.ImportAsNew or something like that?
I've found the id of the matched node by doing this in my handler:
var document = sender as Content;
var id = document.Id;
But this is never a match for an existing node.
What I've noticed is that if I run my import (allowing new nodes to be created if no match is found), then new nodes are created for each item in the source database (because no matches are found in the existing data).
If I then run my import again, however, then it will find these new nodes as matches. But the code never seems to match with existing data from the original import (run weeks/months ago by a previous developer). Am I missing something here? If I specify @nodeName as the primary key, does it then try to match nodes in my destination site by nodeName? Or is it matched in some other way?
So to illustrate. Right now I have a node named "Node A" in both databases. I want Node A in the destination to get the keywords from Node A in the source. On first run, no match is found in my destination, so a new node is created called "Node A (1)". If I re-run my import, Node A is matched with "Node A (1)".
I have the same 'problem'. The import does not look at the document name of existing items to match with your primary key; it checks the relation table instead. That relation table is only filled after an import run by CMSImport using the same data source. So in your case, there are no relations in the DB yet.
Richard told me this:
There has to be a relation for existing items already in the DB. These relations can be found in the DB table: CMSImportRelation
I don't have a solution yet (that I can implement) because my umbraco skills are way too low for this kind of stuff :D
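To make the behaviour described above concrete, here is a minimal in-memory sketch. It is not CMSImport's code: the class name, the dictionary standing in for the CMSImportRelation table, and the starting id of 2000 are all made up for illustration. It only models the point that matching goes through stored relations, not through node names, which is why the first run creates a new node and the second run finds it.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model only. This is NOT how CMSImport is implemented internally;
// it just mimics the "match via relation table" behaviour described in the thread.
public static class RelationMatchingDemo
{
    // Hypothetical stand-in for the CMSImportRelation table: DataSourceKey -> UmbracoId.
    private static readonly Dictionary<string, int> Relations = new Dictionary<string, int>();
    private static int _nextId = 2000; // arbitrary starting id for "new" nodes

    // Returns the id the record ends up in; 'created' tells whether a new node was made.
    public static int Import(string dataSourceKey, out bool created)
    {
        int existingId;
        if (Relations.TryGetValue(dataSourceKey, out existingId))
        {
            created = false;   // relation found: the existing node is updated
            return existingId;
        }

        created = true;        // no relation: a new node is created and remembered
        int newId = _nextId++;
        Relations[dataSourceKey] = newId;
        return newId;
    }

    public static void Main()
    {
        bool created;
        int first = Import("Umbraco content@nodeNameNode A", out created);
        Console.WriteLine("Run 1: id={0}, created={1}", first, created);

        int second = Import("Umbraco content@nodeNameNode A", out created);
        Console.WriteLine("Run 2: id={0}, created={1}", second, created);
    }
}
```

In this model, an existing "Node A" that was never recorded in the dictionary is invisible to the import, which matches what David sees.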
Yes I realised this yesterday after getting more info from Richard. My solution is to loop over all of the existing items in my destination Umbraco site and create an entry for each in the CmsImportRelation table. Then when I run the import using the CMSImport UI, it already 'knows' how to match the node name (or id, or whatever primary key you use - I'm using node name) with the id in the destination database.
using System.Data.SqlClient;
using System.Text;
using Umbraco.Core.Services;

public void InsertRelations()
{
    // Loop over every node of type Article Resource and create an entry in the CmsImportRelation table
    var cs = new ContentService();
    var script = new StringBuilder();

    // Get all Article Resources (1115 is the content type id on this site)
    var allArticleResources = cs.GetContentOfContentType(1115);

    // Loop over all resources, adding a line of SQL for each one
    foreach (var r in allArticleResources)
    {
        // Double up single quotes so the node name is safe inside a SQL string literal
        var dataSourceKey = string.Format("Umbraco content@nodeName{0}", r.Name.Replace("'", "''"));
        script.AppendLine(
            string.Format(
                "INSERT INTO CmsImportRelation(UmbracoId, DataSourceKey, ImportProvider, Updated) VALUES({0}, '{1}', 'Content', GETDATE())",
                r.Id,
                dataSourceKey));
    }

    // Nothing to insert: ExecuteNonQuery throws if CommandText is empty
    if (script.Length == 0) return;

    // Now run the script against the database; 'using' closes the connection even on errors
    using (var connection = new SqlConnection(@"YOUR_CONNECTION_STRING_HERE"))
    using (var cmd = new SqlCommand(script.ToString(), connection))
    {
        connection.Open();
        cmd.ExecuteNonQuery();
    }
}
Note that the format of the DataSourceKey is a bit strange, e.g. "Umbraco content@nodeNameName of Node" for a node with name "Name of Node".
Now I can run my import and CMSImport knows how to find a match for each node name in the table.
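Since the key format is easy to get wrong, a small helper can keep it in one place. This is only a sketch: the class and method names are hypothetical, and the prefix is simply the one seen in this thread for an "Umbraco content" data source keyed on @nodeName. Other data sources or primary keys will presumably produce a different prefix, so check a row that CMSImport wrote itself before relying on it.

```csharp
using System;

// Hypothetical helper; the names are made up for this example.
public static class DataSourceKeys
{
    // Prefix as observed in this thread (data source "Umbraco content", primary key "@nodeName").
    private const string Prefix = "Umbraco content@nodeName";

    // Builds the DataSourceKey value for a given node name.
    public static string ForNodeName(string nodeName)
    {
        if (nodeName == null) throw new ArgumentNullException("nodeName");
        return Prefix + nodeName;
    }

    // Wraps a value in single quotes and doubles embedded quotes,
    // so it can be placed inside a T-SQL string literal.
    public static string ToSqlLiteral(string value)
    {
        if (value == null) throw new ArgumentNullException("value");
        return "'" + value.Replace("'", "''") + "'";
    }

    public static void Main()
    {
        // prints: Umbraco content@nodeNameName of Node
        Console.WriteLine(ForNodeName("Name of Node"));

        // prints: 'Umbraco content@nodeNameDavid''s Node'
        Console.WriteLine(ToSqlLiteral(ForNodeName("David's Node")));
    }
}
```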
And to answer my original question, to prevent the creation of new nodes in my destination database, i.e. only ever update existing nodes matched by node name, I use the following event handler:
class CmsImportEvents : IApplicationEventHandler
{
    public void OnApplicationInitialized(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        ImportProvider.RecordImporting += ImportProvider_RecordImporting;
    }

    /// <summary>
    /// If we haven't found a matching node for the imported data (this is for importing keywords after the original import)
    /// then do not create a new node
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    void ImportProvider_RecordImporting(object sender, CMSImport.Extensions.Providers.ImportProviders.EventArgs.RecordImportingEventArgs e)
    {
        var document = sender as Content;

        // Only update content if the node name matches and we're not attempting to create a new document
        var doUpdate = document != null && document.Id != 0 && document.Name == e.Items["@nodeName"].ToString();
        e.Cancel = !doUpdate;
    }

    public void OnApplicationStarting(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        //throw new NotImplementedException();
    }

    public void OnApplicationStarted(UmbracoApplicationBase umbracoApplication, ApplicationContext applicationContext)
    {
        //throw new NotImplementedException();
    }
}
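The decision inside the handler can also be pulled out into a plain function, which makes it easy to check without running an import. The sketch below is an assumption-laden restatement of the rule above, not part of CMSImport: the class and method names are hypothetical, and it additionally treats a missing @nodeName value as "cancel" instead of throwing.

```csharp
using System;

// Hypothetical helper restating the handler's rule; not a CMSImport API.
public static class UpdateOnlyRule
{
    // existingId: id of the document passed as sender (null if sender is not a document).
    // existingName: name of that document.
    // incomingNodeName: the value read from e.Items["@nodeName"], which may be null.
    public static bool ShouldCancel(int? existingId, string existingName, object incomingNodeName)
    {
        // Id 0 means CMSImport is about to create a new document.
        bool isExistingNode = existingId.HasValue && existingId.Value != 0;

        bool nameMatches = incomingNodeName != null
            && string.Equals(existingName, incomingNodeName.ToString(), StringComparison.Ordinal);

        return !(isExistingNode && nameMatches);
    }

    public static void Main()
    {
        Console.WriteLine(ShouldCancel(0, "Node A", "Node A"));    // True: would be a new node
        Console.WriteLine(ShouldCancel(1234, "Node A", "Node A")); // False: existing node, same name
        Console.WriteLine(ShouldCancel(1234, "Node A", null));     // True: no node name in the record
    }
}
```

In the handler this would be used as e.Cancel = UpdateOnlyRule.ShouldCancel(...), with the arguments taken from sender and e.Items.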
In the new version you can inject CMSImport.Extensions.Import.IImportRelationRepository in your component and use that to retrieve and save relations. For the rest it's pretty much the same.
Update existing nodes only
Hi Richard,
I have inherited a site that previously had a couple of thousand nodes imported from an old Umbraco site. My task is now to re-run the import, but only import the missing keywords for each node. I only want to update existing nodes (matched by node name) and never want new nodes created. Is this possible with CMSImport? My experience is that if a matching node is not found then a new node is created, which is not what I want.
Thanks!
David
Hi David,
You could use the RecordImporting event. Cancel the import when Id = 0? Works best on V3 I'm afraid.
Cheers,
Richard
Hi Richard,
Thanks, I'll look into it. We're using version 3 so hopefully that'll work.
Cheers,
David
Hi Richard,
I've had a look at the documentation and I'm struggling to work out how to decide if a matching node has been found or not. You say to "Cancel the import when Id = 0" but what Id?
I've tried this - but I suspect that articleId contains the id from the source node rather than the destination node:
In any case the articleId is never zero. Should I be trying to check the import action and cancelling if it equals ImportAction.ImportAsNew or something like that?
Thanks,
David
Hi David,
I meant the id of the document which is sent in via sender. I think that should work?
Cheers,
Richard
Hi Richard,
I've found the id of the matched node by doing this in my handler:
But this is never a match for an existing node.
What I've noticed is that if I run my import (allowing new nodes to be created if no match is found), then new nodes are created for each item in the source database (because no matches are found in the existing data).
If I then run my import again, however, then it will find these new nodes as matches. But the code never seems to match with existing data from the original import (run weeks/months ago by a previous developer). Am I missing something here? If I specify @nodeName as the primary key, does it then try to match nodes in my destination site by nodeName? Or is it matched in some other way?
So to illustrate. Right now I have a node named "Node A" in both databases. I want Node A in the destination to get the keywords from Node A in the source. On first run, no match is found in my destination, so a new node is created called "Node A (1)". If I re-run my import, Node A is matched with "Node A (1)".
I hope I'm being clear. Where am I going wrong?
Thanks,
David
Hi David,
I have the same 'problem'. The import does not look at the document name of existing items to match with your primary key; it checks the relation table instead. That relation table is only filled after an import run by CMSImport using the same data source. So in your case, there are no relations in the DB yet.
Richard told me this:
There has to be a relation for existing items already in the DB. These relations can be found in the DB table: CMSImportRelation
I don't have a solution yet (that I can implement) because my umbraco skills are way too low for this kind of stuff :D
Thanks Gerjan,
Yes I realised this yesterday after getting more info from Richard. My solution is to loop over all of the existing items in my destination Umbraco site and create an entry for each in the CmsImportRelation table. Then when I run the import using the CMSImport UI, it already 'knows' how to match the node name (or id, or whatever primary key you use - I'm using node name) with the id in the destination database.
Note that the format of the DataSourceKey is a bit strange, e.g. "Umbraco content@nodeNameName of Node" for a node with name "Name of Node".
Now I can run my import and CMSImport knows how to find a match for each node name in the table.
Cheers,
David
Hi all,
And to answer my original question, to prevent the creation of new nodes in my destination database, i.e. only ever update existing nodes matched by node name, I use the following event handler:
Thanks,
David
Wow David, that's great! I cannot give you credit yet (not enough karma), but it's very kind of you to share your code.
Awesome David,
Sorry I had to leave yesterday.
No worries, Richard, thanks for the hints to send me in the right direction :)
Will @David's solution work for Umb V8.12.2 with CMSImport V4.1.1?
Thanks.
Hi Craig,
In the new version you can inject CMSImport.Extensions.Import.IImportRelationRepository in your component and use that to retrieve and save relations. For the rest it's pretty much the same.
Hope this helps,
Richard
Hi Richard,
What I was looking for was some way of not creating new nodes when importing. So CMSImport just updates existing nodes.
I'm in a similar position to the OP but 6 years and a few versions later ;)
Craig