performance issues when using d.getProperty()
Hello,
I'm creating a function that imports several thousand items from an external provider.
I'm having two performance issues that make my function unusable, because it takes an average of 10 seconds per item to import.
The first bottleneck is the document.getProperty() method. I create a new document and fill 11 properties using code like this: doc.getProperty("link").Value = link;
This takes about 3 to 4 seconds in total for each item.
The second bottleneck is the publish method: myDocument.Publish(new User(0));
This takes about 6 seconds for each item.
Is there a way to improve the performance of both of these methods to an acceptable time?
Update: the second bottleneck (myDocument.Publish()) is already resolved. I changed the import to publish only once it is done, publishing the root item and all of its subitems.
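The update above does not show the changed code. A rough sketch of what "publish once, after the import" could look like with the legacy Document API follows. It assumes the umbraco.cms.businesslogic.web.Document class from Umbraco 4; the doctype alias "ImportedItem", the class and method names, and the choice of PublishWithSubs plus library.RefreshContent are assumptions, not something the thread confirms. It needs the Umbraco assemblies to compile.

```csharp
using System.Collections.Generic;
using umbraco.BusinessLogic;            // User
using umbraco.cms.businesslogic.web;    // Document, DocumentType

public static class ImportSketch
{
    // Hypothetical helper: creates one document per link under rootId,
    // then publishes the whole branch in a single pass at the end.
    public static void Import(int rootId, IEnumerable<string> links)
    {
        User admin = new User(0);
        // "ImportedItem" is a made-up doctype alias for illustration.
        DocumentType type = DocumentType.GetByAlias("ImportedItem");

        foreach (string link in links)
        {
            Document doc = Document.MakeNew(link, type, admin, rootId);
            doc.getProperty("link").Value = link;
            // No Publish() call here: publishing per item was the 6-second cost.
        }

        // Publish the root item and all of its subitems once.
        Document root = new Document(rootId);
        root.PublishWithSubs(admin);

        // Assumption: rebuild the XML cache once so the new items become visible.
        umbraco.library.RefreshContent();
    }
}
```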
Every time you access a property via getProperty it does a database lookup, so you're actually accessing the database to get the property reference, and then you access the database again when you assign the Value to it.
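To check how the 3 to 4 seconds per item split between the lookup and the write described above, each step can be timed on its own. A minimal sketch using System.Diagnostics.Stopwatch; StepTimer and Measure are hypothetical names, not part of Umbraco, and the commented usage assumes the doc and link variables from the question.

```csharp
using System;
using System.Diagnostics;

public static class StepTimer
{
    // Runs the given step once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action step)
    {
        Stopwatch watch = Stopwatch.StartNew();
        step();
        watch.Stop();
        return watch.Elapsed;
    }
}

// Possible usage inside the import loop (requires the Umbraco assemblies):
//
//   umbraco.cms.businesslogic.property.Property p = null;
//   TimeSpan lookup = StepTimer.Measure(() => p = doc.getProperty("link"));
//   TimeSpan write  = StepTimer.Measure(() => p.Value = link);
```

If the write dominates, reusing the property reference will not help much, because the value is still saved on every assignment.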
That was what I already thought :(
Is there a way to work around this to increase performance? For example, by updating the properties in a different way, or by updating them all and only writing to the database after the last update?
Hi Martijn,
When developing CMSImport I ran into the same issues, and I managed to improve things a little. There are several things that get in the way when importing more than 1000 items.
1. The document cache. You can disable this by setting the ContinouslyUpdateXmlDiskCache setting in the UmbracoSettings.config file to false.
2. Lucene. You need to disable indexing, otherwise Lucene will throw a bunch of errors. Check out this post by Morten
With this you get better performance. I'm not saying it's perfect, but it's a lot better. Your suggested workarounds will not work in the current version of Umbraco, since it updates the database directly when a property gets set.
Cheers,
Richard
Richard, thanks for the tips. I implemented both modifications and performance has improved, although not as much as I want: I'm now down to 3 to 4 seconds per imported item. For an import of, for example, 1500 items that is still not acceptable, because it takes about an hour and a half.
I'm now thinking of creating a custom method to update all the properties of an item at once, but that would probably involve overriding the Umbraco data layer, and that might not be wise...
Hi Martijn,
I've imported 10000 items faster than that. How many properties do you have on your doctype? Maybe disable the whole cache by setting XmlCacheEnabled to false, also in the UmbracoSettings.config file.
I wouldn't go for the custom method if I were you. Also, it will not matter if you change the data layer, because a higher-level object triggers the save, not the data layer itself.
Cheers,
Richard
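For reference, the two cache settings mentioned in this thread would look roughly like this in UmbracoSettings.config. This is a sketch that assumes the Umbraco 4 layout, where both elements sit under the content section; the other settings in that section are left out here. The spelling ContinouslyUpdateXmlDiskCache is the setting's actual name as used above. Both settings would normally be switched back on after the import.

```xml
<settings>
  <content>
    <!-- other content settings omitted -->
    <XmlCacheEnabled>False</XmlCacheEnabled>
    <ContinouslyUpdateXmlDiskCache>False</ContinouslyUpdateXmlDiskCache>
  </content>
</settings>
```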
My doctype has only 11 properties. That's not a lot.
I will try disabling the whole cache and see if it makes any difference.
What would be a normal time for an import of about 1500 items with 11 properties per item?
I am using MySQL for Umbraco, so maybe the database also contributes to the performance issue.
Hi Martijn,
That should take less than 5 minutes, I think. I didn't test it on MySQL though.
Cheers,
Richard
Richard,
thanks for your tips. Unfortunately, disabling the whole cache didn't help. I'm going to test with a different MySQL server, or maybe MS SQL if the performance doesn't improve, and see if I can resolve this issue.