  • Naveed Ali 157 posts 422 karma points
    Jan 01, 2021 @ 15:21

    sitemap google search console crawl

    I have submitted a sitemap to Google Search Console for my website a few times now, and every time a very large number of pages are not indexed.

    I do not know why. This is a standard sitemap generated within Umbraco.

    I have no robots.txt file, so that is not what is blocking it.

    I have 155 pages that are "Crawled - currently not indexed"!

    Only 68 of my pages are indexed.

    I am using the Merchello back office, which creates a URL slug for my product pages, so I am not sure whether this is the issue.

    Does anybody have any ideas on this from their own Umbraco website and Google Search Console? Is there something I have missed in my sitemap?

    Here is my sitemap code:

    @inherits Merchello.Web.Mvc.MerchelloTemplatePage
    @using Merchello.Core.Models
    @using Merchello.FastTrack.Ui
    @using Merchello.Web
    @using Merchello.Web.Models.ContentEditing;
    @using System.Configuration;
    @{
        Layout = null;
    }<?xml version="1.0" encoding="UTF-8" ?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
            xmlns:image="http://www.google.com/schemas/sitemap-image/1.1"
            xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
        <url>
            <loc>@(Model.Content.AncestorOrSelf(1).UrlWithDomain())</loc>
            <lastmod>@Model.Content.AncestorOrSelf(1).UpdateDate.ToString("s")+00:00</lastmod>
        </url>
        @ListChildNodes(Model.Content.AncestorOrSelf(1))
        @ListProductNodes()
    </urlset>
    
    @helper ListChildNodes(IPublishedContent startNode)
    {
        Response.ContentType = "text/xml";
        const int maxLevelForSiteMap = 100;
        foreach (var node in startNode.Children
            .Where(n =>
                //n.IsVisible() &&
                n.TemplateId > 0 &&
                !Umbraco.IsProtected(n.Path) &&
                (!n.HasProperty("searchEngineSiteMapHide") || !n.GetPropertyValue<bool>("searchEngineSiteMapHide")))
            .Select(n => n.AsDynamic()))
        {
            <url>
                <loc>@(((IPublishedContent)node).UrlWithDomain())</loc>
                <lastmod>@node.UpdateDate.ToString("s")+00:00</lastmod>
                @if (node.SearchEngineSitemapChangeFreq.ToString() != "")
                {<changefreq>@node.SearchEngineSitemapChangeFreq</changefreq>}
                @if (node.SearchEngineSitemapPriority.ToString() != "")
                {<priority>@node.SearchEngineSitemapPriority</priority>}
            </url>
            if (node.Level <= maxLevelForSiteMap)
            {
                @ListChildNodes(node)
            }
        }
    }
    
    @helper ListProductNodes()
    {
        var merchelloHelper = new MerchelloHelper();
        var products = merchelloHelper.Query.Product.Search(1, 1000);
    
        var queryProducts = products.Items.Select(x => (ProductDisplay)x).Where(x => x.Available == true);
    
        foreach (var product in queryProducts)
        {
            var node = merchelloHelper.Query.Product.TypedProductContent(product.Key);
            <url>
                <loc>@String.Format("{0}{1}", ConfigurationManager.AppSettings["SiteUrl"], node.Url)</loc>
                <lastmod>@node.UpdateDate.ToString("s")+00:00</lastmod>
            </url>
        }
    }
    
  • Naveed Ali 157 posts 422 karma points
    Jan 05, 2021 @ 17:08

    Any ideas, anybody?

  • Nik 1391 posts 6128 karma points MVP 3x c-trib
    Jan 05, 2021 @ 17:26

    Have you checked that the generated sitemap contains all of the links you are expecting to be indexed, and that they are browsable links (i.e. by copying them from the sitemap and pasting them into the browser)?

    (I don't want to assume anything, sorry if it seems like a basic check.)

    :-)
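
The check described above can be scripted rather than done by hand. The following is a minimal Python sketch, not part of the site's code: the helper names `extract_locs`, `find_problems` and `fetch_status` are made up for illustration, and it assumes the sitemap has already been downloaded as text. It pulls every `<loc>` out of the sitemap, flags entries that are not absolute http(s) URLs (which could happen if the `SiteUrl` app setting used for the product links were empty) as well as duplicates, and can optionally request each URL to see the status code the server returns.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace declared on <urlset> by the sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_locs(sitemap_xml):
    """Return the text of every <url>/<loc> element in a sitemap document."""
    # Leading whitespace before the XML declaration makes the parser fail,
    # so strip it first.
    root = ET.fromstring(sitemap_xml.strip())
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def find_problems(urls):
    """Return (url, reason) pairs for entries that look suspicious."""
    problems = []
    seen = set()
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append((url, "not an absolute http(s) URL"))
        if url in seen:
            problems.append((url, "duplicate entry"))
        seen.add(url)
    return problems


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url (performs a real network request).

    Assumes the server answers HEAD requests; some servers do not.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `find_problems(extract_locs(xml_text))` on the downloaded sitemap lists the entries worth a closer look. `fetch_status(url)` follows redirects and reports the final status, so a page that redirects elsewhere still needs a manual look in the browser.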

  • Naveed Ali 157 posts 422 karma points
    Jan 05, 2021 @ 17:36

    Thanks for the reply. Yup, I checked that and the URLs work. I am wondering whether the Merchello URL slug format is something the Google crawler does not like. The slugs do have hyphens in them. Should I replace these with maybe a backslash, or take them out completely?

    What format are your URLs in for Google Search Console to pick them up?
