Everything is working fine, but I have a problem with documents being created and not showing in the front end. I can only guess it's due to Examine not indexing the content quickly enough.
Basically, if I create a new forum topic or post, it's created fine, but when the page is refreshed it's not shown in the front end. It obviously shows fine in the Umbraco admin, though.
So is there a way to make Examine see the recently created node/content immediately? Sort of a hard refresh?
In the Examine settings, under the providers section, there is an interval setting. I think its default is 10, which I think is in ms; possibly try lowering that value? Also, after creating a post, take a look at what's going on in the index directory under Queue, and look into the index using Luke to see if anything eventually ends up in there.
I was thinking about lowering that value, but didn't know if it would affect anything else, like more overhead on the server? If I post, then post again immediately after, the previous post shows, so it's working fine, just not showing the content quickly enough...
I might sleep the method for half a second after publishing the node... Will post back in a mo :)
1 is the minimum interval that it will run at. I wouldn't recommend it because of a possible performance decrease, but it should work. If you set it to 1, it will do a file check every second for the lifetime of your app.
Examine does find new content as soon as you post it. The way Examine puts data into the index is different though. When a post is published, Examine creates a 'queue' file and persists it to disk (you can see these in the App_Data/TEMP/ExamineIndexes/IndexName/Queue folders when you publish something). Examine has a monitoring thread running in the background that scans this folder for files on a timer, based on the interval you specify in your config. When it finds files, it ingests them into your index.
It does it this way for a few reasons:
Indexing is a slightly expensive operation, and if you ingest into the index on the same thread as your running app, you might get timeouts, especially if the operation causes the index to be rebuilt or optimized. This is what happens when you turn off Async (DO NOT DO THIS THOUGH).
This ensures ONLY 1 thread is ever writing to the index. If you turn off Async, multiple threads will inevitably try to write to the index at once at some point. When this happens you will get YSODs. You will also corrupt your index and need to rebuild it.
The indexing operation runs on a background thread so as not to disrupt the web experience.
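To make the queue mechanism described above a bit more concrete, here is a rough sketch in Python. This is not Examine's actual code (Examine is .NET): the class name `FileQueueIndexer`, the `.add` file extension, and the dictionary standing in for the Lucene index are all made up for illustration. It only shows the shape of the idea: publishing writes a queue file, and one background thread picks the files up on a timer.

```python
import threading
from pathlib import Path


class FileQueueIndexer:
    """Toy stand-in for an indexer that ingests queue files on a timer."""

    def __init__(self, queue_dir: Path, interval_seconds: float) -> None:
        self.queue_dir = queue_dir
        self.interval_seconds = interval_seconds
        self.index = {}  # hypothetical in-memory "index": node id -> body
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def enqueue(self, node_id: str, body: str) -> None:
        # Publishing only writes a queue file; it never touches the index.
        # Write to a temp name first so the poller never sees a partial file.
        tmp = self.queue_dir / f"{node_id}.tmp"
        tmp.write_text(body, encoding="utf-8")
        tmp.rename(self.queue_dir / f"{node_id}.add")

    def process_queue(self) -> int:
        # Meant to be called from the single background thread, so there is
        # only ever one writer to the index.
        processed = 0
        for path in sorted(self.queue_dir.glob("*.add")):
            self.index[path.stem] = path.read_text(encoding="utf-8")
            path.unlink()
            processed += 1
        return processed

    def _run(self) -> None:
        # Event.wait returns False on timeout, True once stop() is called.
        while not self._stop.wait(self.interval_seconds):
            self.process_queue()

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()
```

With this shape, anything published sits in the queue folder until the next timer tick, which is the delay being described in this thread.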
I've not really come across a situation where you require the data to be included in search results in the same second that you publish content... why is this a requirement?
I was hoping to swap out the default data provider of Linq2Umbraco with Examine, and then use this to power my nForum package. Linq2Umbraco is great, but seems very slow on some queries (on a very small site too, so I'm worried about bigger sites).
When someone posts or creates a new topic, it redirects them back to their thread with the new posts - This is something which is not happening with the Examine provider as the post doesn't appear because it hasn't been indexed yet.
Well, you would just have 2 providers, right? Your normal LinqToUmbraco context and your LinqToUmbracoExamine context.
First you query your LinqToUmbracoExamine context ('fast read'), and if it returns no data (because the data hasn't gone into the index yet), you then query your standard LinqToUmbraco context to get the data directly from Umbraco.
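As a rough illustration of that two-provider idea, here is a minimal sketch in Python. It is only the pattern, not the LinqToUmbraco API: `read_with_fallback` and the two callables are hypothetical names, and in a real site they would wrap the Examine-backed context and the standard context.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def read_with_fallback(
    fast_read: Callable[[], Iterable[T]],
    slow_read: Callable[[], Iterable[T]],
) -> List[T]:
    """Try the index-backed read first; fall back to the source of truth."""
    results = list(fast_read())
    if results:
        return results
    # Nothing in the index (possibly not indexed yet), so ask the slower store.
    return list(slow_read())


# Hypothetical data: topic 7 has just been published and is not indexed yet.
indexed_posts = {5: ["old post"]}
all_posts = {5: ["old post"], 7: ["brand new post"]}

posts_for_new_topic = read_with_fallback(
    lambda: indexed_posts.get(7, []),
    lambda: all_posts.get(7, []),
)
```

One thing to watch: this only catches the case where the index returns nothing at all. A thread whose older posts are already indexed, but whose newest post is still in the queue, would still come back from the index without the new post.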
Ideally (in the next version of Examine) this is how it could work and this could be config based as well to support either in memory or file based:
Spawning a worker thread which does the index processing on startup
Telling the worker thread to index some specific data when required, the worker thread would need to keep an in-memory stack of the data to process (just like the file queue)
That way only one thread is still doing the processing, and it should be fairly instant (depending on whether or not the index is rebuilding or optimizing).
It's just a bit trickier to manage this kind of setup vs just having a file system based queue. The other reason we have file system based queues is to support web farms with multiple servers on a SAN that can all write to the same queue while only one server does the processing.
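For anyone wondering what the in-memory variant might look like, here is a small sketch in Python. It is an assumption about the general shape, not a preview of any Examine release: `InMemoryIndexWorker` is a made-up name, the dictionary stands in for the real index, and it uses a FIFO queue rather than a literal stack so items are indexed in publish order.

```python
import queue
import threading


class InMemoryIndexWorker:
    """One worker thread that owns all writes to a toy in-memory index."""

    def __init__(self) -> None:
        self.index = {}  # hypothetical "index": node id -> body
        self._pending = queue.Queue()
        # The worker is spawned once, on startup.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, node_id: str, body: str) -> None:
        # Callers only hand data over; they never write to the index.
        self._pending.put((node_id, body))

    def _run(self) -> None:
        while True:
            item = self._pending.get()  # blocks until there is work
            try:
                if item is None:  # sentinel: shut down
                    return
                node_id, body = item
                self.index[node_id] = body
            finally:
                self._pending.task_done()

    def wait_until_idle(self) -> None:
        # Returns once everything submitted so far has been processed.
        self._pending.join()

    def stop(self) -> None:
        self._pending.put(None)
        self._thread.join()
```

Because the worker wakes up as soon as something is submitted, there is no polling interval to wait for. The trade-off is the one mentioned above: an in-memory queue lives inside one process, so it does not give you the shared queue that several servers in a farm can write to.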
One last thing... when Examine was first written, Lucene didn't support concurrent index writing across threads. Apparently newer versions (like the one currently shipped with Examine) do support this, as long as the same (static) IndexWriter is used across threads. I haven't tested this, but if it did work, it would be simpler than having a worker thread.
Will have a new service release by the end of the week (weekend) hopefully.
It fixes a bug with file encoding (UTF8) which can cause the queue to not be processed if irregular chars are found. It also adds a new Multi-Index searcher provider that allows you to search across multiple indexes at the same time while keeping the returned 'Score' consistent.
It also updates the codebase a little and gives the ability to instantiate Indexers and Searchers at runtime really easily (not that you would generally do that :) which makes unit testing much nicer.
Also, just another thought on what you are doing. You should definitely have a read/write and a read-only provider set up, as I would hate for you to rely solely on the Lucene index as your data store for the front end. Having the setup I proposed before will mitigate any errors that could happen (sort of like a failover plan). So first check Linq2Examine, and if no data is returned, check Linq2Umbraco.
Refreshing Examine?
I am trying to use Examine as a data provider for Linq2Umbraco based on Morton's post:
http://blog.sitereactor.dk/2011/02/28/examine-provider-for-linq2umbraco/
Lee,
Regards
Ismail
Cheers
I suspected as much, thanks for commenting Shannon. Is there no other way to force Examine to find the new content as soon as I have posted it?
I would create a repository reader and 'fast read' data access layer.
Your data access layer will check if your Linq2UmbracoExamine provider returns data, if not, then get the data from the repository reader.
Going a little over my head... But I'll have a google around and see if I can figure out your idea :) Thanks for replying :)
Ok I'm with you now... I'll have a play :)
Sounds like you are going to have your work cut out for the next version of Examine then!
Thanks again for the help