
  • Dennis Milandt 190 posts 517 karma points
    Feb 10, 2010 @ 12:32
    Dennis Milandt
    0

    Auto-login as a member for Google Mini Search

    I am working on a website for a customer who has implemented Google Mini Search. They want Google Mini Search to index some pages protected with the out-of-the-box public access protection in Umbraco.

    Google Search Appliance supports Forms Authentication, but Google Mini Search doesn't. That means that we will have to somehow authenticate the Google Mini Search crawler so it can see the protected pages.

    So far I have tried this in the codebehind in a masterpage which is used for all templates:

    protected void Page_Init(object sender, EventArgs e)
    {
        // Google Mini auto login
        // Request.UserAgent is null when the header is missing, so check that first
        if (Request.UserAgent != null && Request.UserAgent.Contains("gsa-crawler"))
        {
            if (Request.UserAgent.Contains("[email protected]"))
            {
                // Auto Login
                var m = Member.GetMemberFromEmail("[email protected]");
                Member.AddMemberToCache(m, true, TimeSpan.FromHours(1));
            }
        }
    }

    The UserAgent should look something like this:

    gsa-crawler (Enterprise; GID01065; [email protected])

    We check that the UserAgent contains "gsa-crawler" and a specific e-mail address entered in the Google Mini Search backend. If both match, we attempt to auto-login a member we have created for Google Mini Search, which has access to the protected pages.
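The two-part user-agent check described above can be pulled into a small helper. This is a minimal sketch, not the code from the site: the class name CrawlerCheck, the method name IsTrustedCrawler, and the address crawler@example.com are hypothetical. The comparison here is case-insensitive, which the original Contains calls are not, and a missing header is treated as "not the crawler".

```csharp
using System;

public static class CrawlerCheck
{
    // Returns true when the User-Agent value names the crawler and also carries
    // the contact address configured in the Google Mini backend.
    // Either argument may be null or empty; in that case the check fails.
    public static bool IsTrustedCrawler(string userAgent, string contactEmail)
    {
        if (string.IsNullOrEmpty(userAgent) || string.IsNullOrEmpty(contactEmail))
        {
            return false;
        }

        return userAgent.IndexOf("gsa-crawler", StringComparison.OrdinalIgnoreCase) >= 0
            && userAgent.IndexOf(contactEmail, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

In a page it would be called as CrawlerCheck.IsTrustedCrawler(Request.UserAgent, "crawler@example.com"). Keep in mind that any client can send this User-Agent string, so the check identifies the crawler only by what it claims to be.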

    I know that Google Mini will ignore cookies, so we have tried to use this to log Google Mini in:

    Member.AddMemberToCache(m, true, TimeSpan.FromHours(1));

    passing true as the second argument to use the session.

    With the above code being executed for every request that Google Mini Search makes, shouldn't Google Mini Search be allowed to see the protected pages, even though it doesn't store any cookies?

    The result is that only the login page is indexed when we try to index a protected page.

    Any other suggestions on how to give Google Mini Search access to the protected pages are most welcome.

    Kind regards
    Dennis Milandt

  • Richard Soeteman 4045 posts 12898 karma points MVP 2x
    Feb 10, 2010 @ 12:51
    Richard Soeteman
    2

    Hi Dennis,

    It feels a bit strange to open up your closed content to Google, but to answer your question: while building the Membershwitcher package I recently found out that .NET supports hacks like this. Once you set the authentication cookie with the username, it's all OK. So this little snippet should do the job:

    FormsAuthentication.SetAuthCookie(your username here, false);

    Hope this helps you,

    Richard

  • Dennis Milandt 190 posts 517 karma points
    Feb 10, 2010 @ 13:10
    Dennis Milandt
    0

    Thank you very much for answering!

    We are not opening up to Google, just to the in-house Google Mini Search solution. The search results for the protected pages will only be available to users who are logged in.

    So what you are saying is that if I modify the code like this, it should work?

    // Auto Login
    var m = Member.GetMemberFromEmail("[email protected]");
    Member.AddMemberToCache(m, true, TimeSpan.FromHours(1));
    FormsAuthentication.SetAuthCookie(m.LoginName, false);

    Kind regards
    Dennis Milandt

  • Richard Soeteman 4045 posts 12898 karma points MVP 2x
    Feb 10, 2010 @ 13:23
    Richard Soeteman
    0

    Hi Dennis,

    Yes, it should work. You can even remove the Member.AddMemberToCache line.

    Cheers,

    Richard

  • Dennis Milandt 190 posts 517 karma points
    Feb 10, 2010 @ 14:26
    Dennis Milandt
    1

    Using SetAuthCookie didn't seem to work, as it still requires the client (in this case Google Mini Search) to be able to store cookies.

    The following, however, did work:

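    protected void Page_Init(object sender, EventArgs e)
    {
        // Google Mini auto login
        // Request.UserAgent is null when the header is missing, so check that first
        if (Request.UserAgent != null && Request.UserAgent.Contains("gsa-crawler"))
        {
            if (Request.UserAgent.Contains("[email protected]"))
            {
                // Auto Login
                var m = Member.GetMemberFromEmail("[email protected]");
                Member.AddMemberToCache(m, true, TimeSpan.FromHours(1));
                // Reload the same URL so the request is processed again after the login
                Response.Redirect(Request.RawUrl, true);
            }
        }
    }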

    In Web.config we added cookieless="AutoDetect" to system.web/authentication/forms.

    It allows Google Mini Search to log in without using cookies, without affecting regular users browsing the site with a browser.
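For reference, the setting mentioned above sits on the forms element in Web.config. The fragment below is a sketch of only that attribute; mode="Forms" is assumed, and any other attributes the site already has on the element (such as name or loginUrl) are omitted here and would stay as they are.

```xml
<configuration>
  <system.web>
    <authentication mode="Forms">
      <!-- AutoDetect: use a cookie when the client accepts cookies,
           otherwise carry the authentication ticket in the URL -->
      <forms cookieless="AutoDetect" />
    </authentication>
  </system.web>
</configuration>
```

The other values ASP.NET accepts for this attribute are UseCookies, UseUri, and UseDeviceProfile.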

    Thank you for your feedback.

    Kind regards
    Dennis Milandt

  • Lee Kelleher 4026 posts 15836 karma points MVP 13x admin c-trib
    Jun 07, 2010 @ 17:37
    Lee Kelleher
    0

    Hi Dennis,

    Curious if you got this working correctly, or if you ran into any other issues ... I need to do something similar, not with Google Mini Search, but to allow a spider/web crawler access to protected pages, again based on the UserAgent string.

    From looking at your code snippet, wouldn't you get caught in a redirect loop?

    Curious if anyone else has done this successfully? I'm starting to hit a brick wall at the moment.

    Thanks, Lee.

  • Lee Kelleher 4026 posts 15836 karma points MVP 13x admin c-trib
    Jun 07, 2010 @ 18:24
    Lee Kelleher
    0

    Dennis, quick question... which mode are you using to store the session state?

    I'm currently using InProc, but that relies on cookies, so thinking I need to use SQL Server Mode?

    Thanks, Lee.

  • Dennis Milandt 190 posts 517 karma points
    Jun 08, 2010 @ 17:31
    Dennis Milandt
    0

    I believe that cookieless="AutoDetect" did the trick for us. Our session state is stored InProc on the web server as well.

    /Dennis

  • Lee Kelleher 4026 posts 15836 karma points MVP 13x admin c-trib
    Jun 09, 2010 @ 09:29
    Lee Kelleher
    0

    Thanks Dennis.  I couldn't get it to work from the MasterPage's Page_Init event ... I had to override the /default.aspx code-behind (inheriting from umbraco.UmbracoDefault) and do it in the Page_PreInit event.  Still can't get the spider to login as a member ... I'll keep looking into it.

    Cheers, Lee.
