This is the forum post to discuss the whys and hows of getting GraphQL up and running within Umbraco CMS (and ideally Headless).
There is an issue for this on the issue tracker http://issues.umbraco.org/issue/U4-11389 for any actual work that we might do, and I envisage this thread on Our being used for any discussions and ideas while we flesh out actionable items that can be done/tried.
I think that before starting to implement the spike, we need to at least consider what the capabilities and design of the endpoint should be.
One thing is how the query should work. Pete mentions the Doctypes as a way to go, and they do need to be in the schema somehow to make the API discoverable. But how should it work when querying child nodes? If I fetch a node by id and then want to get the children, how can the schema describe those? Either it needs to be a sort of generic children field with only the core properties, or there needs to be a children_doctypealias property that only gets the children of a certain type. I can't seem to figure out how generics fit into the GraphQL concept?
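To make the two options concrete, here is a minimal sketch in plain Python (no GraphQL library involved) that generates schema text offering both side by side. GraphQL has no generics, but it does have interfaces, so the generic list can be typed as a shared interface. The `DocType` class, the `Content` interface name and the `children_<alias>` naming are all assumptions for illustration, not anything Umbraco or graphql-dotnet defines.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DocType:
    """Hypothetical stand-in for an Umbraco doctype definition."""
    alias: str                                   # e.g. "eventFolder"
    allowed_children: list[str] = field(default_factory=list)


# Shared interface carrying only the core properties (an assumed shape).
CONTENT_INTERFACE = (
    "interface Content {\n  id: ID!\n  name: String!\n  children: [Content!]!\n}"
)


def type_name(alias: str) -> str:
    """Turn a doctype alias such as 'eventFolder' into a type name."""
    return alias[0].upper() + alias[1:]


def children_fields(doc_type: DocType) -> list[str]:
    """One generic `children` field plus one typed field per allowed child."""
    fields = ["children: [Content!]!"]           # generic: core properties only
    for child in doc_type.allowed_children:
        fields.append(f"children_{child}: [{type_name(child)}!]!")
    return fields


def to_sdl(doc_type: DocType) -> str:
    """Render one doctype as a GraphQL object type implementing Content."""
    lines = ["id: ID!", "name: String!", *children_fields(doc_type)]
    body = "\n".join(f"  {line}" for line in lines)
    return f"type {type_name(doc_type.alias)} implements Content {{\n{body}\n}}"
```

With this shape a client could ask the generic `children` list for core properties only, or pick `children_event` when it wants the full `Event` fields.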
I'm not sure that endpoints just exposing what is basically a //myDocType is the way to go. They might give you some performance issues if they are not scoped somehow.
The other thing to consider is whether this API is supposed to be used directly from the client's browser. If so, then Pete's points about ensuring not all properties are exposed etc. are valid. But that really complicates the setup from an Umbraco perspective. If it's server-to-server queries with authentication, then it's a different story, and the client can be assumed "friendly".
GraphQL was built for frontend queries, so the permissions thing is important. For a proof of concept I don't think we need to worry about it just yet, as there are bigger problems to solve first. When we do get to it, we could get around it for now with a quick and dirty config file to manage permissions, just to get us over the hump.
As for the doctypes and nesting etc. Umbraco data is a graph of sorts. Just as on Facebook you can have likes, groups and friends; your friends have likes and groups too. The rules of what is allowed or not are set via the back office with allowed DocTypes etc., which makes the discoverability of them "doable", and GraphQL allows for asking what is getting returned via Meta Fields (https://graphql.org/learn/queries/#meta-fields)... see, I told you GraphQL can do loads of stuff like this, these are problems they have already solved :)
I'm not saying it can't be done, I just want to figure out which moving parts of GraphQL match what we would need. I may sound like a "nay sayer", but I just always attack problems by finding issues first :)
When I said "schema", I meant the Introspection feature, and I just can't find out how to specify covariance there? The Meta Fields solve the issue of telling the client which type they just got in the response, but would we not also need to tell them in the introspection which types might appear in the children list? I just can't seem to find a sample of that?
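One way this could be expressed, as far as I can tell from the GraphQL spec, is with union (or interface) types: introspection reports their members through `possibleTypes`. Below is a small Python sketch that derives one union per doctype from its allowed child doctypes. The `ALLOWED_CHILDREN` data and the `<Type>Child` naming are made up for illustration; only `__type`, `kind` and `possibleTypes` are standard introspection fields.

```python
from typing import Dict, List, Optional

# Hypothetical "allowed child node types" as configured in the back office.
ALLOWED_CHILDREN: Dict[str, List[str]] = {
    "home": ["eventFolder", "newsFolder"],
    "eventFolder": ["event"],
    "event": ["participant"],
    "participant": [],
    "newsFolder": [],
}


def type_name(alias: str) -> str:
    """Turn a doctype alias such as 'eventFolder' into a type name."""
    return alias[0].upper() + alias[1:]


def child_union_sdl(alias: str, allowed: List[str]) -> Optional[str]:
    """Schema text for the union of doctypes allowed below `alias`.

    Returns None for leaf doctypes, which have no possible children.
    """
    if not allowed:
        return None
    members = " | ".join(type_name(a) for a in sorted(allowed))
    return f"union {type_name(alias)}Child = {members}"


# Standard introspection query a client could send to list the members
# of the (assumed) HomeChild union.
INTROSPECTION_QUERY = """
{
  __type(name: "HomeChild") {
    kind
    possibleTypes { name }
  }
}
"""
```

A `children: [HomeChild!]!` field on `Home` would then tell clients up front which types can show up in the list, and `__typename` tells them which one each item actually is.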
The difference from Facebook's graph is that they probably only have one type in the friends list, for example.
I'm also interested in maybe compiling a list of "queries" that we would like to support. For example it might make sense to actually execute an xpath query to get back a set of nodes, with some specific properties and the children or something?
Regarding security, I think a nice and simple approach would be to have a simple UI/config for whitelisting doctypes/properties. It might even make sense to be able to configure multiple GraphQL endpoints, with different settings, so you could have one for "public" clients, and one for "trusted" clients, making it possible to restrict different clients from different info.
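A rough Python sketch of what such a per-endpoint whitelist could look like. The endpoint names, doctype aliases and property names below are invented for illustration, and a real setup would presumably load them from a config file or the back office rather than hard-code them.

```python
from typing import Dict, Optional, Set

# endpoint name -> doctype alias -> whitelisted property aliases (all hypothetical)
ENDPOINTS: Dict[str, Dict[str, Set[str]]] = {
    "public": {
        "event": {"name", "eventDate"},
    },
    "trusted": {
        "event": {"name", "eventDate", "internalNotes"},
        "participant": {"name", "email"},
    },
}


def visible_properties(endpoint: str, doc_type: str, properties: dict) -> Optional[dict]:
    """Return only the whitelisted properties of a node for this endpoint.

    Returns None when the doctype is not exposed on the endpoint at all,
    so callers can treat the node as if it did not exist.
    """
    allowed = ENDPOINTS.get(endpoint, {}).get(doc_type)
    if allowed is None:
        return None
    return {key: value for key, value in properties.items() if key in allowed}
```

Anything not listed is hidden by default, which feels like the safer direction for a "public" endpoint.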
I think a list of possible queries is a great idea and I love your approach btw, you're not a "nay sayer" at all, just doing the due diligence that should be done.
I've been playing around with https://github.com/graphql-dotnet/graphql-dotnet and Umbraco a couple of times before and might have some POC code somewhere I can put up on Github if anyone is interested.
I guess I've been looking at it a little more than just a couple of times, found an old repository from 2016 when I first started playing around with the idea of having a GraphQL endpoint for Umbraco. So when I saw this topic I thought I'd share what I had.
I'd love to help with the development of getting something like this into Core.
I've also added links to this post and to the issue tracker in the repository in case anyone just finds that.
As far as I understood GraphQL and the dummy implementations I have seen, it gets the data from an endpoint and then filters for the stuff you actually queried for. So the permission thing should be handled before giving the results to GraphQL, I guess?
I am pretty new to GraphQL, but I'm trying to formulate a few sample queries that I think such an API should be able to answer.
If any of them are "non-graphql-ish" feel free to let me know. Also please suggest more queries that you think would be nice to be able to perform. The end goal could be to make a small Postman suite of HTTP requests you could fire against a common Umbraco starterkit, and see if the endpoint gives the expected results.
Also, I think it would be a good starting point for discussing how the queries should be formed, before actually implementing an endpoint for them :)
Use cases:
Scoping:
- Get the root node for domain 'example.com'
Navigation:
- Get the children for node {ID}
- Filter them by 'umbNaviHide=false'
- Include 2nd level children (filtered by 'umbNaviHide=false')
Traversing:
- Get the first child node of type 'EventFolder', then get the first 'Event' child sorted by 'EventDate', then get the first ten 'Participant' children.
- Get node {ID}, then get parent
- Get node {ID}, then get ancestors of type 'Home'
Filtering:
- Get children with composite type 'SEOFields'
- Get 'Event' nodes, where 'Event' has a 'Participant' child where 'Name' is 'Pete'
- Get 'Participant' nodes where ancestor 'EventFolder' has 'Name = Codegarden'
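To have something to check expected results against, a few of the use cases above can be written as plain functions over an in-memory tree. This is only a Python sketch: the `Node` class and the helper names are assumptions and say nothing about how Umbraco stores content. The `SAMPLE_QUERY` string shows one possible query shape for the navigation case, with an invented `content` field and `where` argument.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """Hypothetical content node; `props` holds the doctype properties."""
    id: int
    name: str
    doc_type: str
    props: dict = field(default_factory=dict)
    parent: Optional[Node] = None
    children: list[Node] = field(default_factory=list)

    def add(self, child: Node) -> Node:
        """Attach `child` below this node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def visible_children(node: Node) -> list[Node]:
    """Navigation: children that are not hidden via umbNaviHide."""
    return [c for c in node.children if not c.props.get("umbNaviHide", False)]


def first_child(node: Node, doc_type: str, sort_key: Optional[str] = None) -> Optional[Node]:
    """Traversing: first child of a doctype, optionally sorted by a property."""
    matches = [c for c in node.children if c.doc_type == doc_type]
    if sort_key is not None:
        matches.sort(key=lambda c: c.props[sort_key])
    return matches[0] if matches else None


def ancestors(node: Node, doc_type: Optional[str] = None) -> list[Node]:
    """Traversing: walk up the parents, optionally keeping one doctype only."""
    found = []
    current = node.parent
    while current is not None:
        if doc_type is None or current.doc_type == doc_type:
            found.append(current)
        current = current.parent
    return found


# One possible (invented) query shape for "children filtered by umbNaviHide".
SAMPLE_QUERY = """
{
  content(id: 1) {
    children(where: { umbNaviHide: false }) { name }
  }
}
"""
```

The same fixtures could later back the Postman suite, so the endpoint and the plain functions are expected to agree.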
There is a UI for testing against any GraphQL endpoint called GraphiQL, which is rather awesome. It runs on node.js and is a Facebook tool:
https://github.com/graphql/graphiql
If we get the data set up and running then we could easily use that to test how well it's all going. It's also really handy to point it at a known GraphQL endpoint and have a play to get a feel for it. The GitHub API is a GraphQL endpoint now, so if you have a GitHub account you can play around with that:
https://developer.github.com/v4/
Finally, if you want to talk with Joe McBride, the developer of GraphQL.net, he has a Gitter channel that is good for asking questions direct to the developer here:
https://gitter.im/graphql-dotnet/graphql-dotnet
Man, every time I read 'node.js' my blood just starts boiling. But that is a whole separate topic :-)
I'm a bit scared of the Github endpoint, because it loudly states that it runs on your real data, and I don't really like that I can't seem to scope the permissions to a single repo. I have too many important ones that I don't want to mess around with :-)
Latest from Joe McBride the dev of GraphQL.net via his Gitter chat room mentioned above:
Joe McBride @joemcbride 15:32
@PeteDuncanson You can dynamically build types. This used to be harder than it is today. I don’t know your exact setup, though I’m confident what you’re trying to do is possible (though may not be super easy).
@PeteDuncanson This SchemaBuilder dynamically builds a schema based on the Schema Language. You could dynamically build the schema based on a SQL query to the DocumentType. The Schema does need to be pre-built before the request execution, but it can be built dynamically.
So we now know it's possible to build stuff "on the fly" dynamically, which removes the need for ModelsBuilder I think. Something to play with, and saving it here so we don't lose it. Home time!
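For anyone wondering what "pre-built before the request, but built dynamically" could mean in practice, here is a library-free Python sketch of the idea: generate the schema text from doctype definitions once, cache it, and rebuild only when doctypes change. This is not graphql-dotnet's SchemaBuilder; the `SCALARS` mapping, the `load_doc_types` callback and the simplified property kinds are all assumptions.

```python
from typing import Callable, Dict, Optional

# Assumed mapping from a simplified property kind to a GraphQL scalar.
SCALARS = {"text": "String", "integer": "Int", "boolean": "Boolean"}


def build_schema(doc_types: Dict[str, Dict[str, str]]) -> str:
    """Render one GraphQL object type per doctype (alias -> {property: kind})."""
    blocks = []
    for alias, props in doc_types.items():
        lines = [f"  {name}: {SCALARS.get(kind, 'String')}" for name, kind in props.items()]
        type_name = alias[0].upper() + alias[1:]
        blocks.append("type " + type_name + " {\n" + "\n".join(lines) + "\n}")
    return "\n\n".join(blocks)


class SchemaCache:
    """Builds the schema lazily and keeps it until doctypes change."""

    def __init__(self, load_doc_types: Callable[[], Dict[str, Dict[str, str]]]):
        self._load = load_doc_types       # e.g. a query against the doctype tables
        self._sdl: Optional[str] = None

    def get(self) -> str:
        """Return the cached schema text, building it on first use."""
        if self._sdl is None:
            self._sdl = build_schema(self._load())
        return self._sdl

    def invalidate(self) -> None:
        """Call when a doctype is saved or deleted so the next get() rebuilds."""
        self._sdl = None
```

The generated text is the kind of Schema Language input Joe describes SchemaBuilder taking, so the caching part is probably the bit worth keeping whatever builds the actual schema.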
I've been thinking about this project a bit the last few weeks.
Normally when we build a solution for a customer, the structure is something like:
site 1
- frontpage
- news
- about
- employees
site 2
- frontpage
- news
- about
Self Service Site
- Profile
- Subscriptions
- Etc.
configuration
- news
- sections
- employees
- types
- etc.
- more etc.
The configuration is made up of doctypes that do not have a template, but we often query the configuration for a list of employee types, news sections and so on. There may also be pages that are behind a login (self service), or pages that have information that is shown to specific users etc.
Umbraco with-out-a-head will also have the same issues, but since that is only targeted as a content-only solution, it doesn't have to take the same precautions on what data is being published.
Do these concerns also apply here? Should they be addressed somehow?
Also, if we need to support multisite-solutions, how can that be done?
Trouble is, that wouldn't give you access to the shared folder of data on the root out of the box. You'd need a way to get to that, and you could end up with permission issues if not careful. You could have an option to mark a folder on the root as "accessible via GraphQL" or similar, and in those cases you would be able to access it via your query on any of the above domains. For ease of permissions, for now anything marked as "accessible" would be public on all endpoints.
There are plenty of options for adding middleware into the stack before GraphQL if we want to do more fine-grained permissions etc. But that requires custom code at that point (or a kick ass package to be developed in the future).
I am definitely up to help on this one.
I think this would be an awesome addition to Umbraco and to work to get it in as standard for a version of Umbraco 8.
GraphCMS has a model builder (Doctypes. And on that, I think if Umbraco started calling them Models as well it would make them more meaningful and indicate scalability and object relations better) and uses GraphQL to pull them together.
I think being able to build a GraphQL solution using the interface that it comes with would be great, and integrating it into the back office, reading doctypes, could be a first step.
From that being able to declare access to that request and then just being able to access directly would be very powerful.
A lot of the custom class work, multiple for loops, SQL access, and various node abstract and dynamic access would all be massively reduced.
I think it would make vast improvements to performance and ease of use for both experienced and beginner developers, and it just feels like a perfect fit for Umbraco.
With that and the announcement of Umbraco Headless I also feel it is even more of an ideal fit for that as well.
Getting GraphQL into Umbraco
This is the forum post to discuss the whys and hows of getting GraphQL up and running within Umbraco CMS (and ideally Headless).
There is an issue for this on the issue tracker http://issues.umbraco.org/issue/U4-11389 for any actual work that we might do, and I envisage this thread on Our being used for any discussions and ideas while we flesh out actionable items that can be done/tried.
Feel free to chip in :)
Probably need to read up on GraphQL. Has been on my radar for a while, so will get back later on to join the discussion
Dave
Yes, let's get this... But please not headless only.
We could do with read-only for starters...
I think that before starting to implement the spike, we need to at least consider what the capabilities and design of the endpoint should be.
One thing is how the query should work. Pete mentions the Doctypes as a way to go, and they do need to be in the schema somehow to make the API discoverable. But how should it work when querying child nodes? If I fetch a node by id and then want to get the children, how can the schema describe those? Either it needs to be a sort of generic children field with only the core properties, or there needs to be a children_doctypealias property that only gets the children of a certain type. I can't seem to figure out how generics fit into the GraphQL concept?
I'm not sure that endpoints just exposing what is basically a //myDocType is the way to go. They might give you some performance issues if they are not scoped somehow.
The other thing to consider is whether this API is supposed to be used directly from the client's browser. If so, then Pete's points about ensuring not all properties are exposed etc. are valid. But that really complicates the setup from an Umbraco perspective. If it's server-to-server queries with authentication, then it's a different story, and the client can be assumed "friendly".
GraphQL was built for frontend queries, so the permissions thing is important. For a proof of concept I don't think we need to worry about it just yet, as there are bigger problems to solve first. When we do get to it, we could get around it for now with a quick and dirty config file to manage permissions, just to get us over the hump.
As for the doctypes and nesting etc. Umbraco data is a graph of sorts. Just as on Facebook you can have likes, groups and friends; your friends have likes and groups too. The rules of what is allowed or not are set via the back office with allowed DocTypes etc., which makes the discoverability of them "doable", and GraphQL allows for asking what is getting returned via Meta Fields (https://graphql.org/learn/queries/#meta-fields)... see, I told you GraphQL can do loads of stuff like this, these are problems they have already solved :)
I'm not saying it can't be done, I just want to figure out which moving parts of GraphQL match what we would need. I may sound like a "nay sayer", but I just always attack problems by finding issues first :)
When I said "schema", I meant the Introspection feature, and I just can't find out how to specify covariance there? The Meta Fields solve the issue of telling the client which type they just got in the response, but would we not also need to tell them in the introspection which types might appear in the children list? I just can't seem to find a sample of that?
The difference from Facebook's graph is that they probably only have one type in the friends list, for example.
I'm also interested in maybe compiling a list of "queries" that we would like to support. For example it might make sense to actually execute an xpath query to get back a set of nodes, with some specific properties and the children or something?
Regarding security, I think a nice and simple approach would be to have a simple UI/config for whitelisting doctypes/properties. It might even make sense to be able to configure multiple GraphQL endpoints, with different settings, so you could have one for "public" clients, and one for "trusted" clients, making it possible to restrict different clients from different info.
I think a list of possible queries is a great idea and I love your approach btw, you're not a "nay sayer" at all, just doing the due diligence that should be done.
Going to join Dave on reading up on GraphQL but certainly interested in following on from CG18 on this!
Super, thanks!
I'm not sure how relevant this is but this github project by christofur exists.
Although it's 2 years old, some work is out there from a while ago.
Looks like that is a node.js implementation. Seems to be reading directly from the DB.
I've been playing around with https://github.com/graphql-dotnet/graphql-dotnet and Umbraco a couple of times before and might have some POC code somewhere I can put up on Github if anyone is interested.
do it ! :)
would make lots of sense, since we are all just eager to see something
I found the code and it's now up on Github.
rockstar - very nice to see all the work you've already done.
Makes it look like the test I've done, is only about 3-5% of your work :)
Rasmus, this looks amazing, light years ahead of my version. Going to download it and have a play :)
Thanks Pete.
I guess I've been looking at it a little more than just a couple of times, found an old repository from 2016 when I first started playing around with the idea of having a GraphQL endpoint for Umbraco. So when I saw this topic I thought I'd share what I had.
I'd love to help with the development of getting something like this into Core.
I've also added links to this post and to the issue tracker in the repository in case anyone just finds that.
I don't think it should be in core, like headless is not in core.
I would suggest it being an easy install, like modelsbuilder. Modelsbuilder is just a dependency of the umbracocms nuget package.
Doing it this way, it can be "turned on" by just adding a nuget package.
Hi,
as far as I understood GraphQL and the dummy implementations I have seen it gets the data from an endpoint and then filters for the stuff you actually queried for. So the permission thing should be handled before giving the results to GraphQL I guess?
should / could :)
As Pete suggested at CG, it would be an API that would let the frontend get the data it needs, without a backender needing to create another endpoint.
I am pretty new to GraphQL, but I'm trying to formulate a few sample queries that I think such an API should be able to answer.
If any of them are "non-graphql-ish" feel free to let me know. Also please suggest more queries that you think would be nice to be able to perform. The end goal could be to make a small Postman suite of HTTP requests you could fire against a common Umbraco starterkit, and see if the endpoint gives the expected results.
Also, I think it would be a good starting point for discussing how the queries should be formed, before actually implementing an endpoint for them :)
Use cases:
Scoping:
Navigation:
Traversing
Filtering
There is a UI for testing against any GraphQL endpoint called GraphiQL, which is rather awesome. It runs on node.js and is a Facebook tool.
https://github.com/graphql/graphiql
If we get the data set up and running then we could easily use that to test how well it's all going. It's also really handy to point it at a known GraphQL endpoint and have a play to get a feel for it. The GitHub API is a GraphQL endpoint now, so if you have a GitHub account you can play around with that:
https://developer.github.com/v4/
Finally, if you want to talk with Joe McBride, the developer of GraphQL.net, he has a Gitter channel that is good for asking questions direct to the developer here:
https://gitter.im/graphql-dotnet/graphql-dotnet
Man, every time I read 'node.js' my blood just starts boiling. But that is a whole separate topic :-)
I'm a bit scared of the Github endpoint, because it loudly states that it runs on your real data, and I don't really like that I can't seem to scope the permissions to a single repo. I have too many important ones that I don't want to mess around with :-)
Latest from Joe McBride the dev of GraphQL.net via his Gitter chat room mentioned above:
Joe McBride @joemcbride 15:32 @PeteDuncanson You can dynamically build types. This used to be harder than it is today. I don’t know your exact setup, though I’m confident what you’re trying to do is possible (though may not be super easy). @PeteDuncanson This SchemaBuilder dynamically builds a schema based on the Schema Language. You could dynamically build the schema based on a SQL query to the DocumentType. The Schema does need to be pre-built before the request execution, but it can be built dynamically.
https://github.com/graphql-dotnet/graphql-dotnet/blob/master/src/GraphQL/Utilities/SchemaBuilder.cs https://github.com/graphql-dotnet/graphql-dotnet/blob/master/src/GraphQL.Tests/Utilities/SchemaBuilderTests.cs
So we now know it's possible to build stuff "on the fly" dynamically, which removes the need for ModelsBuilder I think. Something to play with, and saving it here so we don't lose it. Home time!
I've been thinking about this project a bit the last few weeks.
Normally when we build a solution for a customer, the structure is something like:
The configuration is made up of doctypes that do not have a template, but we often query the configuration for a list of employee types, news sections and so on. There may also be pages that are behind a login (self service), or pages that have information that is shown to specific users etc.
Umbraco with-out-a-head will also have the same issues, but since that is only targeted as a content-only solution, it doesn't have to take the same precautions on what data is being published.
Do these concerns also apply here? Should they be addressed somehow?
Also, if we need to support multisite-solutions, how can that be done?
Multi site is a good one. I've not thought much about it, but we probably should.
We could manage it similarly to how Umbraco does it currently, via the domain you set on the site, and use that as the endpoint. So given three sites:
www.example.com, example.ie and example.co.uk
You could access each site's data via three different endpoints:
www.example.com/graphql/ www.example.ie/graphql/ www.example.co.uk/graphql/
Trouble is, that wouldn't give you access to the shared folder of data on the root out of the box. You'd need a way to get to that, and you could end up with permission issues if not careful. You could have an option to mark a folder on the root as "accessible via GraphQL" or similar, and in those cases you would be able to access it via your query on any of the above domains. For ease of permissions, for now anything marked as "accessible" would be public on all endpoints.
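As a rough illustration of that idea, here is a small Python sketch that works out which root nodes a query may start from, given the host the /graphql/ endpoint was called on. The node ids, the `DOMAINS` mapping and the `SHARED_ROOTS` set are invented for the example; in Umbraco the domain lookup would presumably come from the hostnames assigned to each site root.

```python
from typing import Dict, Set

# host -> id of that site's root node (ids are hypothetical)
DOMAINS: Dict[str, int] = {
    "www.example.com": 1001,
    "www.example.ie": 1002,
    "www.example.co.uk": 1003,
}

# Root folders flagged "accessible via GraphQL"; public on every endpoint for now.
SHARED_ROOTS: Set[int] = {2000}


def accessible_roots(host: str) -> Set[int]:
    """Root node ids a query sent to `host`/graphql/ is allowed to start from."""
    site_root = DOMAINS.get(host.lower())
    if site_root is None:
        return set()                      # unknown host: expose nothing
    return {site_root} | SHARED_ROOTS
```

So a query on www.example.ie/graphql/ would see its own site plus the shared configuration folder, but nothing from the other two sites.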
There are plenty of options for adding middleware into the stack before GraphQL if we want to do more fine-grained permissions etc. But that requires custom code at that point (or a kick ass package to be developed in the future).
I am definitely up to help on this one. I think this would be an awesome addition to Umbraco and to work to get it in as standard for a version of Umbraco 8.
For reference, these are the headless CMSs I know/like that use it: http://documentation.near-me.com/reference/graphql/
https://graphcms.com/
GraphCMS has a model builder (Doctypes. And on that, I think if Umbraco started calling them Models as well it would make them more meaningful and indicate scalability and object relations better) and uses GraphQL to pull them together.
I think being able to build a GraphQL solution using the interface that it comes with would be great, and integrating it into the back office, reading doctypes, could be a first step.
From that, being able to declare access to that request and then just being able to access it directly would be very powerful. A lot of the custom class work, multiple for loops, SQL access, and various node abstract and dynamic access would all be massively reduced.
I think it would make vast improvements to performance and ease of use for both experienced and beginner developers, and it just feels like a perfect fit for Umbraco.
With that and the announcement of Umbraco Headless I also feel it is even more of an ideal fit for that as well.
We need to make this happen basically :)