_systemUmbracoIndexDontDelete Lucene index massive filesize
Hi all,
There was an issue posted on the old forum about the back-end Lucene index getting massive in file size:
http://forum.umbraco.org/yaf_postst9142_datasystemUmbracoIndexDontDelete-folder-1000-MB.aspx
I'm experiencing this on one of my Umbraco (v4) installs ... but the index has got up to 6 GB!
The umbracoLog table is littered with Lucene errors ... but they don't make much sense to me.
Examples of the errors:
Error indexing node: System.IO.IOException: read past EOF
   at Lucene.Net.Store.BufferedIndexInput.Refill()
   at Lucene.Net.Store.BufferedIndexInput.ReadByte()
   at Lucene.Net.Store.IndexInput.ReadInt()
   at Lucene.Net.Index.SegmentTermEnum..ctor(IndexInput i, FieldInfos fis, Boolean isi)
   at Lucene.Net.Index.TermInfosReader..ctor(Directory dir, String seg, FieldInfos fis)
   at Lucene.Net.Index.SegmentReader.Initialize(SegmentInfo si)
   at Lucene.Net.Index.SegmentReader.Get(Directory dir, SegmentInfo si, SegmentInfos sis, Boolean closeDir, Boolean ownDir)
   at Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment, Int32 end)
   at Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment)
   at Lucene.Net.Index.IndexWriter.FlushRamSegments()
   at Lucene.Net.Index.IndexWriter.Close()
   at umbraco.cms.businesslogic.index.Indexer.IndexNode(Guid ObjectType, Int32 Id, String Text, String UserName, DateTime CreateDate, Hashtable Fields, Boolean Optimize)
   at umbraco.cms.businesslogic.web.Document.Index(Boolean Optimze)
and
Error indexing node: (System.NullReferenceException: Object reference not set to an instance of an object.
   at Lucene.Net.Index.TermVectorsWriter.AddAllDocVectors(TermFreqVector[] vectors)
   at Lucene.Net.Index.SegmentMerger.MergeVectors()
   at Lucene.Net.Index.SegmentMerger.Merge()
   at Lucene.Net.Index.IndexWriter.MergeSegments(Int32 minSegment, Int32 end)
   at Lucene.Net.Index.IndexWriter.Optimize()
   at umbraco.cms.businesslogic.index.Indexer.IndexNode(Guid ObjectType, Int32 Id, String Text, String UserName, DateTime CreateDate, Hashtable Fields, Boolean Optimize))
Any ideas?
Thanks in advance, Lee.
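For anyone facing a log table full of entries like the ones quoted in this post, a rough way to see which exception types dominate is to tally them. The following is only a sketch in Python (not part of Umbraco): it assumes the log comments have already been exported as plain strings, and the function name `count_exception_types` is made up for illustration.

```python
import re
from collections import Counter

# Captures the .NET exception type that follows "Error indexing node:".
# Some log entries wrap the exception in "(", so that is made optional.
_EXCEPTION_TYPE = re.compile(
    r"Error indexing node:\s*\(?\s*([A-Za-z_][\w.]*Exception)"
)


def count_exception_types(messages):
    """Return a Counter mapping exception type name -> number of log entries."""
    counts = Counter()
    for message in messages:
        match = _EXCEPTION_TYPE.search(message)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `count_exception_types(rows).most_common()` on the exported comments would list the most frequent exception types first, which may help decide which stack trace to investigate.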
Lee,
Move the files to another location (move rather than delete, because it would be great if you could investigate the contents of the Lucene files using Luke)
Reindex the site using /umbraco/reindex.aspx
Cheers,
/Dirk
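The "move rather than delete" step suggested above could be scripted roughly as follows. This is a minimal Python sketch, not an Umbraco tool: `move_index_files` is a hypothetical helper, both paths are placeholders to be supplied by the caller, and it is probably wise to run it only while the site is not writing to the index.

```python
import shutil
from pathlib import Path


def move_index_files(index_dir, backup_dir):
    """Move every entry inside index_dir into backup_dir.

    The index folder itself is left in place (and empty), so only its
    contents end up in the backup location. Returns the number of
    entries moved.
    """
    index_dir = Path(index_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    moved = 0
    # Materialise the listing first so the directory is not being
    # iterated while entries are moved out of it.
    for entry in list(index_dir.iterdir()):
        shutil.move(str(entry), str(backup_dir / entry.name))
        moved += 1
    return moved
```

The backed-up files can then be opened in Luke for inspection, while the emptied folder is ready for a reindex.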
More info... From Ismail's previous comment, Lucene isn't merging the index ... there were over 70,000 files in the "_systemUmbracoIndexDontDelete" folder.
We have deleted the files now (not the "_systemUmbracoIndexDontDelete" folder itself) ... but are wary of re-running the "/umbraco/reindex.aspx" page.
This started happening last Friday (when we deleted the files and re-indexed), and today it's grown massive again!
Could there be some kind of doc-type/property data/value that is causing Lucene to throw the error/exception? (Taking a wild guess)
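Since the folder grew back within days, it may help to record the file count and total size at intervals and see how fast it grows after a reindex. A small Python sketch under that assumption (the helper name `index_folder_stats` is invented here, and the folder path is whatever applies to the install in question):

```python
from pathlib import Path


def index_folder_stats(index_dir):
    """Return (file_count, total_bytes) for all files under index_dir."""
    files = [p for p in Path(index_dir).rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes
```

Running this on a schedule and logging the two numbers would show whether the file count climbs steadily (which would fit the unmerged-segments explanation above) or jumps after particular publishes.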
Only semi on topic, just so you know there's a new indexer in v4.1, Umbraco Examine. Hopefully it doesn't have this problem ;)
I've also got an old v4 website where the _systemUmbracoIndexDontDelete folder is over 6 GB. So is it safe to just delete the files inside the _systemUmbracoIndexDontDelete folder?
Jeroen