I need to move several thousand PDF articles into an Umbraco site. Articles are grouped by the month they appeared.
Should I import all of these into my Umbraco media, which has the added benefit of being able to assign additional meta data? Or, should I keep these as standard files on the file system with a .NET usercontrol to handle listing and displaying them?
Yes, I need to be able to search the text within them. It sounds like it will be a bit of work initially to get them uploaded, but it is a one-time operation. The problem I have always had with PDFs in Umbraco is that there does not seem to be a way to add extra meta data to them. For instance, I would like to associate a friendly title with the file and perhaps a publication date.
If you create a new Media type for PDF, can you not add extra meta data as properties? I'm not sure how that would get indexed in Examine, but it should be available with the XML returned by the umbraco.library:GetMedia method.
As JC suggested, add the extra properties to the media type file. In Examine you can tap into the indexing events, e.g.
using Examine;
using umbraco.BusinessLogic;
using umbraco.cms.businesslogic.media;
using UmbracoExamine;

public class ExamineEvents : ApplicationBase
{
    private const string Indexer = "AcmeIndexer";

    /// <summary>
    /// Constructor: subscribes to the indexer's GatheringNodeData event.
    /// </summary>
    public ExamineEvents()
    {
        ExamineManager.Instance.IndexProviderCollection[Indexer].GatheringNodeData
            += ExamineEvents_GatheringNodeData;
    }

    #region examine event handlers
    /// <summary>
    /// Event handler for GatheringNodeData.
    /// This will fire every time Examine creates or updates an index entry for an item.
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    void ExamineEvents_GatheringNodeData(object sender, IndexingNodeDataEventArgs e)
    {
        if (e.IndexType == IndexTypes.Media)
        {
            AddMediaMetaToContentIndex(e);
        }
    }
    #endregion

    private void AddMediaMetaToContentIndex(IndexingNodeDataEventArgs e)
    {
        Media m = new Media(e.NodeId);
        // Get the property and add it to the index; repeat for all fields.
        // GetProperty returns null when the media type lacks the alias, so guard before reading.
        var prop = m.GetProperty("someprop");
        if (prop != null && prop.Value != null)
        {
            e.Fields.Add("someprop", prop.Value.ToString());
        }
    }
}
I remember now. Adding extra meta data to the new media type was not a problem. I was unable to get XSLT search to use the new meta data. It looks like we need to look at changing to Examine sooner than planned.
PDF Library Suggestions
Do the PDFs need to be searchable, namely using Examine? If so, I would import them.
Regards
Ismail
Connie,
You will need to tweak it to your situation.
Regards
Ismail