sql reporting over umbraco content including propeerty data

Press Ctrl / CMD + C to copy this to your clipboard.

Copied to clipboard

Flag this post as spam?

This post will be reported to the moderators as potential spam to be looked at

overflew 87 posts 110 karma points

Dec 12, 2010 @ 11:38

0

SQL reporting over Umbraco content (including Propeerty Data)

Hi,

I've been asked to help a co-worker write some reports for the content we've now migrated into Umbraco. I've done a moderate amount of work with the Umbraco API, and even Linq2SQL against the Umbraco DB (which I'm now fairly familiar with), but relatively less with SQL itself. The reports must be written independent of .NET code. The reports will mostly be fed into graphs, and the occasional CSV file.

The 2 sample queries I would be looking to write, would be:

- To pull out all published instances of a particular document type, including key Property Data fields.

- E.g., Pull out all current Product documents, including the Price, Weight and Manufacturer fields

- To summarise all published instances of the above.

- E.g. Pull out all counts of all Product documents, grouped by Manufacturer, which is sourced from a Prevalue

With multiple Property Data items being stored as rows, not columns, I understand my options would be to look into 'Cross tab' SQL scripting, or XSLT parsing in SQL? Has anyone had any experience with either?

I've previously seen some of the SQL in the source of umbraco.cms.businesslogic.web.Document, which is a little frightening, but I may need to get my head around...

Any help with getting a foothold on this would be appreciated. Thanks.

Copy Link
Aaron Powell 1708 posts 3046 karma points c-trib

Dec 12, 2010 @ 23:14

0

Trying to do this in the database itself is going to be just unpleasant, the database isn't designed to be queried like that.

cmsContentType table is where you'll find the alias for a doc type

umbracoNode will give you the Text (name) for a doc type as well as a doc in the tree

cmsPropertyType is where you can find a property alias (and name)

cmsPropertyData is the data for a property on a document, it also contains all versions

That should get you started

Copy Link
Vince 1 post 71 karma points notactivated

Mar 16, 2017 @ 23:13

0

How are we expected to query the database without writing C# code and spinning up the entire site just to get some information out?

Sorry for my ignorance with umbraco.

Copy Link
overflew 87 posts 110 karma points

Dec 12, 2010 @ 23:25

0

Thanks Slace,

I'm confident I can pull out all cmsDocuments of a given content type - it's mostly around pulling out (just a few) of their associated property datas - the unpleasant part you mention...

This would be an offline process that's just run once a week, so it can be a little hairy, but I'm just trying to get to grips with how I'd pull out, say 2x columns of property data.

Copy Link
Aaron Powell 1708 posts 3046 karma points c-trib

Dec 12, 2010 @ 23:34

0

You can only do it by using sub queries, since the property data is all in the cmsPropertyData table you can't just do a SELECT (all properties) WHERE docId == 1234;

You have to have something like SELECT ( SELECT data WHERE alias == "myProperty" ) WHERE docId ==1234

It'll end up being slow and brittle too

Copy Link
overflew 87 posts 110 karma points

Jan 27, 2011 @ 08:24
2
Wow. So I ended up writing a few similar to those above - Mostly where I just needed to pull out one property value per document.

I also had a play with SQL pivot tables - which can work if all your data is in the same data type (e.g. dataNText, or dataDate)

The most recent discovery is XML querying in SQL!
```
-- Create temporary table that uses the XML data type
create table #XmlTable (
NodeId int,
NodeXml xml
)

-- Move all XML for documents of type '1121' into there
insert into #XmlTable(NodeId, NodeXml)
select cx.* from cmsContentXml cx
join cmsContent cc on cc.nodeId = cx.nodeId
where cc.contentType = 1121

select un.text [Document Name], 
xt.NodeXml.query('//node/data [@alias=''fieldOne'']').value('.', 'nvarchar(max)') [Field One],
xt.NodeXml.query('//node/data [@alias=''fieldTwo'']').value('.', 'nvarchar(max)') [Field Two],
xt.NodeXml.query('//node/data [@alias=''fieldThree'']').value('.', 'nvarchar(max)') [Field Three]

 from #XmlTable xt
 join umbracoNode un on un.id = xt.NodeId

-- Drop temporary table when finished
drop table #XmlTable
```
So there you go - a slightly more sustainable way to pull query across values.

If I understood it better, I guess it could be made more efficient. I have yet to figure out if there's a way to skip the temporary table, or if I can shorten the lines that pull out values.

Of all the tutorials that explained XML querying, this one helped the most - http://blogs.msdn.com/b/mrorke/archive/2005/07/21/441554.aspx.

Anyone taking a punt on this may also come across the error:
```
XQuery [value()]: 'value()' requires a singleton (or empty sequence),
found operand of type 'xdt:untypedAtomic *'
```
Which is described in this post - http://milambda.blogspot.com/2006/11/sql-server-2005-xml-methods-part-one.html

I assume the reason why this field is NText, not xml datatype in the Umbraco is that

- It consumes extra SQL resource,

- The code's unlikely to fire off these XML queries directly.

- I'm unsure if it works in MySql, etc...

Hope this helps anyone else! Any if you're a SQL whiz, let me know if it can be done any more efficiently!

Thanks.
Copy Link
is working on a reply...

This forum is in read-only mode while we transition to the new forum.

You can continue this topic on the new forum by tapping the "Continue discussion" link below.

Please Sign in or register to post replies

Flag this post as spam?

SQL reporting over Umbraco content (including Propeerty Data)