Why, when a first-time visitor (without cookies) comes to the site, are the links the long URLs that contain the SID?
Then, after they go to another page, the links switch to the SEO URLs?
How can we make the first page a first-time visitor lands on contain the SEO URLs instead of the long ones?
It's annoying and I hate it with the passion of a mighty hurricane, but it's the way CCP works and I've been told it's normal (although I've never seen another site that needed to do that).
This question comes up often enough I should really create a Wiki entry for it.
The first time a person visits your site, CCP attempts to set a cookie containing the SID. When setting a cookie, an application cannot determine whether setting the cookie actually worked until the next time something is sent back to the server (submitting a form or navigating to another page). The second time CCP sees a request, it knows whether the cookie setting worked and adjusts the URLs accordingly.
Other sites that don't exhibit this usually carry the SID in the URL, which isn't very SEO friendly.
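To make the two-request cookie test concrete, here's a minimal Python sketch of the decision logic (illustrative only, not CCP's actual code; the function name, cookie name, and parameters are made up):

```python
def build_url(path, request_cookies, session_id):
    """Decide how to render a link for this response.

    On a visitor's first request no cookie has round-tripped yet, so the
    app cannot know whether cookies work; it plays safe and embeds the
    session ID (SID) in every URL. Once the browser sends the cookie
    back on a later request, clean SEO URLs are known to be safe.
    """
    if "sid" in request_cookies:       # cookie round-tripped: cookies work
        return path                    # emit the clean SEO URL
    return f"{path}?sid={session_id}"  # first visit: fall back to SID URLs

# First request: no cookies yet, so links carry the SID.
print(build_url("/widgets.html", {}, "abc123"))
# Second request: the cookie came back, so links are clean.
print(build_url("/widgets.html", {"sid": "abc123"}, "abc123"))
```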
I managed to resolve this with CCP5 and forgot what I did. It had to do with cookies.
Can we fool the program into thinking the cookie test passed and circumvent this?
Anyway, my CCP6 store is in a subdirectory (e.g., www.mystore.com/buy/index.php). When a person puts something in their cart, then goes to the home page in the root directory (www.mystore.com/, one directory up from the CCP6 store directory), their cart empties.
Do you have any suggestions for how to get around the cart emptying?
Last edited by Blitzen (11-14-2008 19:49:20)
I'm a little confused by this.
What links will a search engine crawler see?
Will it see the long ones or the SEO ones?
The SE will index both. The days when long URLs were a problem are over.
You can go to G and enter site:www.mysite.com to see what G will index. That can answer one question.
The concern is that if you have two identical pages with different URLs, SEs see this as "duplicate pages" and will demote one. It's uncertain whether the other duplicate page will also suffer a demotion - no one has good research data on that.
Last edited by Blitzen (02-02-2009 16:11:51)
If Google indexes both the long and the SEO URLs produced by CCP, that means the destination page is the same for both URLs and will be seen as duplicate content. Therefore every page in CCP is a duplicate if SEO is turned on - someone please tell me this is not correct, otherwise this is a complete disaster.
According to the , duplicate content links are NOT the problem that everyone seems to think they are.
Why not place these in your robots.txt file
Disallow: /index.php?app=
Disallow: /ccp0-emailfriend/
and sit back and relax knowing only the SEO links will be indexed?
Last edited by theblade24 (02-08-2009 19:07:41)
Blade, wouldn't your suggestion in the robots.txt file effectively ban search engines completely from a website?
These long URLs are what is presented on a first visit, and it's not until one of them is followed that the SEO links become visible. If the command is to disallow the long URLs, the search engines have nowhere to go.
Don't forget that what you see as a real user with a real browser is very different from what is presented to bots that are crawling your site.
Exactly! And if you're submitting sitemaps to Google, Yahoo, and MSN with the short URLs, then all is good.
Hi All,
In a conversation with Brett Yount at MSN Live Search about the site not being spidered, he pointed out this:
I think we may have hit on something. It is possible your robots.txt is blocking due to your site being accessible using /index.php?
You have that blocked in your REP to disallow /index.php?=App
Problem is, ? is a wildcard in the REP.
As borrowed from janeandrobot.com:
Selectively allow access to a URL that matches a blocked pattern - Use the Allow directive in conjunction with pattern matching for more complex implementations.
# Block access to URLs that contain ?
# Allow access to URLs that end in ?
User-agent: *
Disallow: /*?
Allow: /*?$
That directive blocks all URLs that contain ? except those that end in ?.
In this example, the default version of the page will be indexable:
* http://www.example.com/productlisting.aspx?
Variations of the page will be blocked:
* http://www.example.com/productlisting.aspx?nav=price
* http://www.example.com/productlisting.aspx?sort=alpha
Maybe adding the allow statement will help in your case...
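For anyone who wants to sanity-check such patterns offline: Python's standard urllib.robotparser does not understand the * and $ extensions, but the matching behavior described above can be approximated in a few lines (a sketch of the documented longest-match resolution, not any engine's actual implementation):

```python
import re

def rule_to_regex(rule):
    # Translate a robots.txt path rule: '*' matches any run of
    # characters, and a trailing '$' anchors the end of the path.
    pattern = re.escape(rule).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.compile("^" + pattern)

def is_allowed(path, disallow, allow):
    # Longest matching rule wins; Allow wins over Disallow on a tie.
    best_dis = max((len(r) for r in disallow
                    if rule_to_regex(r).match(path)), default=-1)
    best_all = max((len(r) for r in allow
                    if rule_to_regex(r).match(path)), default=-1)
    return best_all >= best_dis

disallow, allow = ["/*?"], ["/*?$"]
print(is_allowed("/productlisting.aspx?", disallow, allow))           # True
print(is_allowed("/productlisting.aspx?nav=price", disallow, allow))  # False
```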
--------------
So I added this to my robots.txt, but so far it hasn't helped at all:
User-agent: *
Disallow: /*?
Allow: /*?$
------------------
What do you think guys?
Cheers,
Bruce.
west4 wrote:
What do you think guys?
Speaking only for myself I think people spend far too much time, effort and $$ on something that is really pretty straightforward. Have a sitemap and feed it to the search engines. End of story.
Yes, that's simplistic and easy to do but that is really all that should be needed. You tell the bots exactly what you want them to look at and tell them exactly what URL it is you want them to associate with what they find.
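For concreteness, a sitemap is just an XML file listing the clean SEO URLs you want the engines to associate with your content (the domain, path, and dates below are placeholders, not values from this thread):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/widgets.html</loc>
    <lastmod>2009-02-08</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
```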
Standing back waiting for the net rocks
I'm confused on what you are asking. I see your categories and products spidered with nice SEO urls in Google.
What problem are you having? Not showing up in Yahoo or MSN?
Both of those are waaaaayyyyyy slower to add sites to their index than google is. Is that the issue?
As far as index.php? I can't think of any url that I would want it in and allow it to be spidered. I don't want or need that appearing in any urls picked up by search engines. Am I missing something?
west4 wrote:
What do you think guys?
Cheers,
Bruce.
I avoid duplicate pages with my own SEO URL mod and denying absolutely everything in cgi-bin in robots.txt.
There are no links in cgi-bin that I care to have indexed.
Before denying cgi-bin, I saw both URLs to the same page being indexed by SE's.
Bear in mind that not every SE obeys robots.txt.
I'm disappointed to see G Webmaster Tools (Sitemaps) listing the URLs in the cgi-bin as restricted by robots.txt. For some reason, G can and does look at those pages, even though robots.txt tells it to ignore them. What a waste of resources for G.
In my experience, the sitemap doesn't trump the links in the website itself. Not all SE's read the sitemap.
I would think [opinion] that the links in the website itself are weighted by SE's more than external pages.
What cgi-bin directory are you referring to in CCP6?
Google Webmaster Tools is doing exactly what it states: it's letting you know which URLs are restricted from being indexed by robots.txt. Sure, it's going to read them all, since it has no idea what not to read unless something tells it. It still at least reads restricted URLs because it may find a link to an unrestricted page on a restricted one.
Hi,
Well, I think Brett (administrator on the MSN forum) was saying that having ? blocked was hindering my attempt to get MSN to spider my site. In fact, it fell from only the sitemap.xml being indexed to nothing being indexed in the three weeks since adding the allow statement. So do I want spiders to pick up index.php? or not? Do I want any links ending in ? spidered? And has anyone else seen an issue with MSN not spidering and cured it with a robots.txt entry?
Cheers,
Bruce.
How long has your site been live?
I'm seeing MSN completely ignore my robots.txt file anyway. I see both good and bad urls spidered.
Hi theblade24,
The site has been live since Sept 2008, so 5 months. Loads of pages are in Google and some pages in Yahoo, but zero in MSN. The company web site has been going for over 5 years, but this is the new design and URL.
Google is really good; it changed the pages within weeks of me adding the Meta Title hack to the site and re-spidered with all the new names. Cool.
Yahoo seems to have given up after doing about 30 random pages and won't do any more.
MSN just won't show any pages.
Cheers,
Bruce.
I have seen the same behavior. I wouldn't worry too much about it. I think time has a lot to do with it regarding Yahoo and MSN. I'll bet if you look at server logs from when you launched, you'll find MSNbot not even coming around until long after the others.
MSN is a very, very, very small fraction of traffic. I wouldn't lose sleep over them.
The SiteMap XMOD for the US version of CCP includes an option to have your site map automatically submitted to ask.com, Yahoo, MSN and Google. Makes it a "no brainer" to get the major engines to index your site.