Hello,
I use a script on my Linux review site, ReviewLinux.Com, that creates a sitemap of all my news and reviews. The database there stores a timestamp for every article, and the script was meant to build the sitemap from those articles in DESCENDING order of that timestamp. I did not see anything in ccp0_prod that relates to a timestamp, so the version I made for http://linuxcdshop.com is heavily modified. If anyone wants to add to it, please do and share. Use the script at your own risk, but all it does is read the database. Like other scripts online, it simply stamps each entry with the date of execution, so there is nothing to edit for that part. It only sitemaps products; if anyone wants to share a revised version that indexes other important links, please do. It is also written to create only the SEO URLs that relate to my site, so you will need to edit the URLs in the script.
Here it goes:
Create a config.php file off webroot and place your database info in it.
<?php $s = "localhost"; $u = "user"; $p = "password"; $d = "ccp6_dbname"; ?>
Now create sitemap.php and place this script in webroot directory. ie: http://www.domain.com/sitemap.php
<?php
include '/somewhere/off/webroot/config.php';
header("Content-Type: text/xml;charset=iso-8859-1");
//connect to the database
mysql_connect("$s","$u","$p");
@mysql_select_db("$d") or die("Unable to select DB");
//count all the products
$num_rows = mysql_num_rows(mysql_query("select * from ccp0_prod"));
//select them and put them into a dataset called $result
$query = "select * from ccp0_prod ORDER BY id DESC" ;
$result = mysql_query($query) or die("Query failed");
//this is the normal header applied to any Google sitemap.xml file
echo '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
//loop through the entire resultset
for($i=0;$i<$num_rows; $i++)
{
$url_product = 'http://www.linuxcdshop.com/linux/'.mysql_result($result,$i,"id");
/*you need to assign a date to the entity. if you don't
store a timestamp in the Database then you need slapping*/
$date = gmdate("Y m d");
$year = substr($date,0,4); //work out the year
$mon = substr($date,5,2); //work out the month
$day = substr($date,8,2); //work out the day
/*display the date in the format Google expects:
2006-01-29 for example*/
$displaydate = ''.$year.'-'.$mon.'-'.$day.'';
//you can assign whatever changefreq and priority you like
echo
'
<url>
<loc>'.$url_product.'.html</loc>
<lastmod>'.$displaydate.'</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
';
}
mysql_close(); //close connection
//close the urlset element
echo '</urlset>';
?>
You will also need to add this code in .htaccess. Well, at least I did, for the redirect.
RewriteRule (.*)\.xml.gz $1.php [nocase]
RewriteRule (.*)\.txt.gz $1.php [nocase]
What the code does is redirect a call from Google site map from http://www.linuxcdshop.com/sitemap.xml.gz to the sitemap.php file. Not sure if you need this but I use it...
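One thing worth noting about this redirect: the URL ends in .gz, but sitemap.php as posted sends plain, uncompressed XML. That evidently worked here, but if you would rather have the bytes match the extension, here is a minimal sketch. The function name sitemap_body() and its parameters are made up for illustration, and it assumes PHP's zlib extension (gzencode/gzdecode) is enabled on your host.

```php
<?php
// Hypothetical helper: return the body to send for a given request path.
// Assumes the zlib extension is loaded (it provides gzencode()).
function sitemap_body(string $xml, string $requestPath): string
{
    // Compress only when the requested name really ends in ".gz".
    if (substr($requestPath, -3) === '.gz') {
        return gzencode($xml, 9);
    }
    return $xml;
}

// Example: after the rewrite, $_SERVER['REQUEST_URI'] still holds the
// original path, so that is what you would pass in a real script.
$xml  = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"></urlset>';
$body = sitemap_body($xml, '/sitemap.xml.gz');
```

In a real script you would build the whole sitemap into a string first (instead of echoing as you go) and then echo the returned body.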
What this script needs to be a true sitemap for Google is an actual timestamp on each product, so that it could write the sitemap in descending order of when you added products to the database. If anyone knows how to add a timestamp every time we add a product, please tell me.. :)
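If a timestamp column did exist, turning it into a <lastmod> value is the easy part. Here is a minimal sketch, assuming a hypothetical DATETIME column called date_added stored in UTC (ccp0_prod has no such column as far as this thread knows); the function name and the sample rows are made up for illustration.

```php
<?php
// Hypothetical helper: turn a MySQL DATETIME string (assumed to be stored
// in UTC) into the W3C datetime format used by <lastmod>.
function lastmod_from_datetime(string $mysqlDatetime): string
{
    $dt = new DateTimeImmutable($mysqlDatetime, new DateTimeZone('UTC'));
    return $dt->format('Y-m-d\TH:i:sP');
}

// Made-up rows, as they might come back from a query ending in
// "ORDER BY date_added DESC" (date_added is an assumed column name).
$rows = [
    ['id' => 'product-b', 'date_added' => '2007-12-10 12:32:57'],
    ['id' => 'product-a', 'date_added' => '2007-12-09 13:55:36'],
];
foreach ($rows as $row) {
    echo $row['id'], ' ', lastmod_from_datetime($row['date_added']), "\n";
}
```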
Anyhow, this is what I am using at http://www.linuxcdshop.com/sitemap.xml.gz, and it creates the file automatically every time someone such as Google or Yahoo requests it. I have the redirect code for Yahoo, but I think they now accept normal Google-like sitemaps.
Let me know what you think..
Last edited by Perkster (12-10-2007 12:32:57)
Offline
UPDATE:
I was reading in the Google sitemap docs that they like a timezone, so I changed part of the code. Edit where necessary.
/*you need to assign a date to the entity. if you don't
store a timestamp in the Database then you need slapping*/
$date = gmdate("Y m d H:i:s");
// $date = gmdate("Y m d");
$year = substr($date,0,4); //work out the year
$mon = substr($date,5,2); //work out the month
$day = substr($date,8,2); //work out the day
$hrs = substr($date,11,2);
$min = substr($date,14,2);
$sec = substr($date,17,2);
/*display the date in the format Google expects:
2006-01-29T19:00:01 for example*/
$displaydate = ''.$year.'-'.$mon.'-'.$day.'T'.$hrs.':'.$min.':'.$sec.'';
//you can assign whatever changefreq and priority you like
// $pubdate = date("Y m d", $date);
echo
'
<url>
<loc>'.$url_product.'.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
';
I'm in BC, Canada, so I think we are -8 off GMT, and I added that offset in the <lastmod> tag.
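A note on the offset: gmdate() always returns the time in GMT/UTC, so the offset that matches its output is +00:00. Appending -08:00 to a GMT clock reading labels the moment as eight hours later than it really is. Here is a minimal sketch of two consistent alternatives; the function names are made up, and America/Vancouver is my assumption for the BC timezone.

```php
<?php
// Option 1: keep gmdate() and label the result as UTC.
function lastmod_utc(int $timestamp): string
{
    return gmdate('Y-m-d\TH:i:s', $timestamp) . '+00:00';
}

// Option 2: convert to a named zone and let PHP work out the offset,
// which also takes care of daylight saving time.
function lastmod_local(int $timestamp, string $zone): string
{
    $dt = (new DateTimeImmutable('@' . $timestamp))
        ->setTimezone(new DateTimeZone($zone));
    return $dt->format('Y-m-d\TH:i:sP');
}

echo lastmod_utc(time()), "\n";
echo lastmod_local(time(), 'America/Vancouver'), "\n";
```

Either form describes the same instant; the only thing to avoid is mixing a UTC clock reading with a local offset.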
Last edited by Perkster (12-09-2007 13:55:36)
Offline
Bump.. This script works outside of CCP60, which seems to be the only way to get a mod unless you pay for a professional service. This sitemap generator at least is supported by Google Sitemaps and works, and guess what, it is free.. and awaiting anyone to improve on it..
Offline
In the above post where the .htaccess code is, you can also add the code below to get a sitemap.xml
RewriteRule (.*)\.xml $1.php [nocase]
That might have seemed straightforward to some..
Offline
UPDATE:
Well, I made a little improvement to the script: I was able to add the URLs of all categories to the sitemap.xml.gz. Again, this is a script that works outside of the framework of CCP6, so please adjust it for your needs and use it at your own risk. It works with my site and does pass Google Sitemap Status as OK. You will need to adjust all URLs, but the basic calls into the CCP6 tables should work.
Below is the new sitemap.php; the prior posts show how to get here.
<?php
include '/somewhere/off/webroot/config.php';
header("Content-Type: text/xml;charset=iso-8859-1");
//connect to the database
mysql_connect("$s","$u","$p");
@mysql_select_db("$d") or die("Unable to select DB");
//count all the products
$num_rows = mysql_num_rows(mysql_query("select * from ccp0_prod"));
//select them and put them into a dataset called $result
$query = "select * from ccp0_prod ORDER BY id DESC" ;
$result = mysql_query($query) or die("Query failed");
//count all the categories
$num_rows1 = mysql_num_rows(mysql_query("select * from ccp0_cat"));
//select them and put them into a dataset called $result1
$query = "select * from ccp0_cat ORDER BY id DESC" ;
$result1 = mysql_query($query) or die("Query failed");
//this is the normal header applied to any Google sitemap.xml file
echo '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
//loop through the product resultset
for($i=0;$i<$num_rows; $i++)
{
//the url of this product
$url_product = 'http://www.linuxcdshop.com/linux/'.mysql_result($result,$i,"id");
/*you need to assign a date to the entity. if you don't
store a timestamp in the Database then you need slapping*/
$date = gmdate("Y m d H:i:s");
$year = substr($date,0,4); //work out the year
$mon = substr($date,5,2); //work out the month
$day = substr($date,8,2); //work out the day
$hrs = substr($date,11,2);
$min = substr($date,14,2);
$sec = substr($date,17,2);
/*display the date in the format Google expects:
2006-01-29T18:00:00 for example*/
$displaydate = ''.$year.'-'.$mon.'-'.$day.'T'.$hrs.':'.$min.':'.$sec.'';
//you can assign whatever changefreq and priority you like
echo
'
<url>
<loc>'.$url_product.'.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
';
}
//loop through the category resultset
for($i=0;$i<$num_rows1; $i++)
{
//the url of this category
$url_cat = 'http://www.linuxcdshop.com/distro/'.mysql_result($result1,$i,"id");
/*you need to assign a date to the entity. if you don't
store a timestamp in the Database then you need slapping*/
$date = gmdate("Y m d H:i:s");
$year = substr($date,0,4); //work out the year
$mon = substr($date,5,2); //work out the month
$day = substr($date,8,2); //work out the day
$hrs = substr($date,11,2);
$min = substr($date,14,2);
$sec = substr($date,17,2);
/*display the date in the format Google expects:
2006-01-29T18:00:00 for example*/
$displaydate = ''.$year.'-'.$mon.'-'.$day.'T'.$hrs.':'.$min.':'.$sec.'';
//you can assign whatever changefreq and priority you like
echo
'
<url>
<loc>'.$url_cat.'.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
';
}
mysql_close(); //close connection
//close the urlset element that we opened above
echo
'</urlset>';
// ob_end_flush();
?>
Let me know if you are using this, and if you have ways to improve it. Next I am going to work on adding Specials and Best Seller URLs, and I guess that may be all it needs.
Forgot to mention: this script is totally automatic and will include all new cats and products as soon as you add them in CCP6 admin. The next time Google goes to http://www.linuxcdshop.com/sitemap.xml.gz it will automatically pick up the changes...
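The product loop and the category loop in the script above differ only in the URL prefix, so the repeated <url> block could be factored into one function. Here is a minimal sketch of that idea; url_entry() is a made-up name, not part of CCP6, and it also escapes its values so that a character such as & in a URL cannot break the XML.

```php
<?php
// Hypothetical helper: build one <url> element with escaped values.
function url_entry(string $loc, string $lastmod, string $changefreq = 'daily', string $priority = '0.8'): string
{
    $esc = static function (string $s): string {
        return htmlspecialchars($s, ENT_QUOTES | ENT_XML1, 'UTF-8');
    };
    return " <url>\n"
        . '  <loc>' . $esc($loc) . "</loc>\n"
        . '  <lastmod>' . $esc($lastmod) . "</lastmod>\n"
        . '  <changefreq>' . $esc($changefreq) . "</changefreq>\n"
        . '  <priority>' . $esc($priority) . "</priority>\n"
        . " </url>\n";
}

// Example with made-up ids; in sitemap.php these would come from the
// ccp0_prod and ccp0_cat queries.
$lastmod = gmdate('Y-m-d\TH:i:s') . '+00:00';
foreach (['101', '102'] as $id) {
    echo url_entry('http://www.linuxcdshop.com/linux/' . $id . '.html', $lastmod);
}
```

The date is also computed once outside the loop, since it is the same for every entry.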
Last edited by Perkster (12-13-2007 21:27:48)
Offline
Final UPDATE:
I thought it best to add static URLs that never change at the bottom of the sitemap. I suppose one could add store policies in here, and whatever else you have for URLs that are not dynamic like the categories and products. Basically I just moved the mysql_close call to the end of the file and hardcoded some of the <url> entries I want in the sitemap.
//Enter static url below here and close tag
echo '<url>
<loc>http://www.linuxcdshop.com/index/index.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url><url>
<loc>http://www.linuxcdshop.com/all/index.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url><url>
<loc>http://www.linuxcdshop.com/bestsellers/index.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url><url>
<loc>http://www.linuxcdshop.com/new/index.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url><url>
<loc>http://www.linuxcdshop.com/specials/index.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
</urlset>';
mysql_close(); //close connection
?>
This script does everything I need right now until we get a US version from CCP6 developers for a sitemap that will probably have way more features than this script..
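If the list of static pages grows, the hardcoded entries above could be generated from an array instead. Here is a minimal sketch of that; static_locs() is a made-up helper, and the section names are the ones from the post above.

```php
<?php
// Hypothetical helper: build the static page URLs from a list of sections.
function static_locs(string $base, array $sections): array
{
    $locs = [];
    foreach ($sections as $section) {
        $locs[] = rtrim($base, '/') . '/' . $section . '/index.html';
    }
    return $locs;
}

$sections = ['index', 'all', 'bestsellers', 'new', 'specials'];
$lastmod  = gmdate('Y-m-d\TH:i:s') . '+00:00';
foreach (static_locs('http://www.linuxcdshop.com', $sections) as $loc) {
    echo "<url>\n<loc>$loc</loc>\n<lastmod>$lastmod</lastmod>\n"
       . "<changefreq>daily</changefreq>\n<priority>0.8</priority>\n</url>\n";
}
```

Adding a store policy page then means adding one string to $sections, provided it follows the same /name/index.html pattern.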
Offline
What do you put in for localhost? Your site's domain?
Offline
No. For MySQL, at least the way I use it, localhost is simply the address of your db server, as long as the db resides on the same server as the domain..
Offline
When you say: Create a config.php file off webroot and place your database info in it.
So I am not to place this in my www folder or public_html folder, but right off the root, where these other folders are?
Offline
That's what I would do.. The same area where you see the public_html directory, but not in it. Where you placed the private dir for CCP6.
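To make "same area as public_html, but not in it" concrete: the include needs a filesystem path, not a domain name. Here is a minimal sketch, assuming a typical shared-hosting layout; the /home/youruser path and the config_path() helper are made up for illustration, so check your own host's directory structure.

```php
<?php
// Hypothetical helper: given the webroot directory, return the path of a
// config.php that sits one level above it.
function config_path(string $webroot): string
{
    return dirname(rtrim($webroot, '/')) . '/config.php';
}

// Example with a made-up account path:
echo config_path('/home/youruser/public_html'), "\n";

// Inside sitemap.php itself (which lives in the webroot), the same idea
// can be written without hardcoding anything:
// include dirname(__DIR__) . '/config.php';
```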
Offline
include '/somewhere/off/webroot/config.php';
would be:
include'/cellular-concepts.biz/config.php'; is that correct?
Offline
This is what I have now. Is it correct? I have replaced your url w/mine. But I think something may be wrong. Also, when I see: http://www.linuxcdshop.com/linux/ is linux the database? I don't have anything after my url and I probably should, correct?
---------------------------------------------
<?
include'/cellular-concepts.biz/config.php';
header("Content-Type: text/xml;charset=iso-8859-1");
//connect to the database
mysql_connect("$s","$u","$p");
@mysql_select_db("$d") or die("Unable to select DB");
//count all the articles that are current and published
$num_rows = mysql_num_rows(mysql_query("select * from ccp0_prod"));
//select them and put them into a dataset called $result
$query = "select * from ccp0_prod ORDER BY id DESC" ;
$result = mysql_query($query) or die("Query failed");
//count all the articles that are current and published
$num_rows1 = mysql_num_rows(mysql_query("select * from ccp0_cat"));
//select them and put them into a dataset called $result
$query = "select * from ccp0_cat ORDER BY id DESC" ;
$result1 = mysql_query($query) or die("Query failed");
//this is the normal header applied to any Google sitemap.xml file
echo '<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
http://www.sitemaps.org/schemas/sitemap … .xsd"
xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">';
//don't worry this code makes more sense in step #5
$url_product = 'http://www.cellular-concepts.biz'.mysql_result($result,$i,"id");
//loop through the entire resultset
for($i=0;$i<$num_rows; $i++)
{
//your url-product as we worked out in #4
$url_product = 'http://www.cellular-concepts.biz'.mysql_result($result,$i,"id");
/*you need to assign a date to the entity. if you don't
store a timestamp in the Database then you need slapping*/
$date = gmdate("Y m d H:i:s");
$year = substr($date,0,4); //work out the year
$mon = substr($date,5,2); //work out the month
$day = substr($date,8,2); //work out the day
$hrs = substr($date,11,2);
$min = substr($date,14,2);
$sec = substr($date,17,2);
/*display the date in the format Google expects:
2006-01-29T18:00:00 for example*/
$displaydate = ''.$year.'-'.$mon.'-'.$day.'T'.$hrs.':'.$min.':'.$sec.'';
//you can assign whatever changefreq and priority you like
echo
'
<url>
<loc>'.$url_product.'.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
';
}
//loop through the entire resultset
for($i=0;$i<$num_rows1; $i++)
{
//your url-product as we worked out in #4
$url_cat = 'http://www.cellular-concepts.biz'.mysql_result($result1,$i,"id");
/*you need to assign a date to the entity. if you don't
store a timestamp in the Database then you need slapping*/
$date = gmdate("Y m d H:i:s");
$year = substr($date,0,4); //work out the year
$mon = substr($date,5,2); //work out the month
$day = substr($date,8,2); //work out the day
$hrs = substr($date,11,2);
$min = substr($date,14,2);
$sec = substr($date,17,2);
/*display the date in the format Google expects:
2006-01-29T18:00:00 for example*/
$displaydate = ''.$year.'-'.$mon.'-'.$day.'T'.$hrs.':'.$min.':'.$sec.'';
//you can assign whatever changefreq and priority you like
echo
'
<url>
<loc>'.$url_cat.'.html</loc>
<lastmod>'.$displaydate.'-08:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.8</priority>
</url>
';
}
mysql_close(); //close connection
//close the XML attribute that we opened in #3
echo
'</urlset>';
// ob_end_flush();
?>
Thanks for all the help.
Offline
The code works best with SEO on.. I do not think you have it on
Offline
Hi,
Thanks for the code. I added all that to my website and it works great. The only thing I'm having trouble with now is the .htaccess file. I have other files on my website that really are .xml files. It won't read them, apparently because it's looking for a .php document/script instead.
Is there some adjustment I can make to the .htaccess file to enable both .php and .xml?
Thanks,
Tom
Offline
You could just place the actual name of file you want to rewrite in .htaccess.
RewriteRule ^sitemap.xml sitemap.php [L,PT]
or
RewriteRule ^sitemap.xml.gz sitemap.php [L,PT]
Yes the code in my original post covers anything xml so the above way may be best for your situation.
Also, I think Google will even accept the sitemap in their webmaster tools area as a .php file. Not sure, but I think they do..
Offline
Hi There,
I inserted that code as RewriteRule ^swissgear.xml swissgear.php [L,PT] and it stays the same. I know it's the .htaccess file because if I send up a different one, I can access the .xml document.
The document is in a different directory, does that matter? www/blogs/swissgear.xml
Thanks,
Tom
Offline
If it is not in your webroot directory then you need to adjust the code: either ^blogs/swissgear.xml or ^/blogs/swissgear.xml. Experiment until it works.. but it does need the subdirectory.
Offline
Just as an FYI the location of your site map is critical to make sure it will get read and processed. From sitemaps.org:
Sitemap file location
The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/.
If you have the permission to change http://example.org/path/sitemap.xml, it is assumed that you also have permission to provide information for URLs with the prefix http://example.org/path/. Examples of URLs considered valid in http://example.com/catalog/sitemap.xml include:
http://example.com/catalog/show?item=23
http://example.com/catalog/show?item=233&user=3453
URLs not considered valid in http://example.com/catalog/sitemap.xml include:
http://example.com/image/show?item=23
http://example.com/image/show?item=233&user=3453
https://example.com/catalog/page1.php
Note that this means that all URLs listed in the Sitemap must use the same protocol (http, in this example) and reside on the same host as the Sitemap. For instance, if the Sitemap is located at http://www.example.com/sitemap.xml, it can't include URLs from http://subdomain.example.com.
URLs that are not considered valid are dropped from further consideration. It is strongly recommended that you place your Sitemap at the root directory of your web server. For example, if your web server is at example.com, then your Sitemap index file would be at http://example.com/sitemap.xml. In certain cases, you may need to produce different Sitemaps for different paths (e.g., if security permissions in your organization compartmentalize write access to different directories).
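The location rule quoted above can be turned into a quick check. This is only a sketch of the rule as stated in that text (same protocol, same host, path under the sitemap's directory); the function name is made up, and it ignores ports and any cross-site submission exceptions.

```php
<?php
// Hypothetical checker for the sitemaps.org location rule quoted above.
function allowed_in_sitemap(string $sitemapUrl, string $pageUrl): bool
{
    $s = parse_url($sitemapUrl);
    $p = parse_url($pageUrl);
    if ($s === false || $p === false) {
        return false;
    }
    // Protocol and host must match exactly (compared case-insensitively).
    foreach (['scheme', 'host'] as $part) {
        if (strtolower($s[$part] ?? '') !== strtolower($p[$part] ?? '')) {
            return false;
        }
    }
    // The page path must start with the directory that holds the sitemap.
    $sitemapPath = $s['path'] ?? '/';
    $dir = substr($sitemapPath, 0, strrpos($sitemapPath, '/') + 1);
    return strpos($p['path'] ?? '/', $dir) === 0;
}

$sitemap = 'http://example.com/catalog/sitemap.xml';
var_dump(allowed_in_sitemap($sitemap, 'http://example.com/catalog/show?item=23'));
var_dump(allowed_in_sitemap($sitemap, 'http://example.com/image/show?item=23'));
```

A sitemap at the root, such as http://example.com/sitemap.xml, has the directory "/", so every path on that host and protocol passes.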
Offline
This sitemap is not validating for Google, because some of my pages have " - " and spaces in the name, and the code does not rewrite them so that they will validate.
For instance, this script writes many of our URLs as:
www.mysite.com/ccp0-catshow/High Power Connector – 650mm.html
- which does not validate.
But the correct URL is this:
www.mysite.com/ccp0-catshow/High+Power+Connector+%96+650mm.html
- which does validate.
I'd love to use this script, but only if code can be added that writes the URLs so that they will validate. Any ideas?
I realize that we have unconventional product IDs, so this is an unusual issue. Otherwise, this is a great script, and I really hope to be able to use it.
Last edited by Lisaweb (05-28-2008 12:37:13)
Offline
In sitemap.php change this line from:
$url_product = 'http://www.linuxcdshop.com/linux/'.mysql_result($result,$i,"id");
to
$url_product = urlencode('http://www.linuxcdshop.com/linux/'.mysql_result($result,$i,"id"));
and see if your URLs get emitted correctly.
Offline
Hi Dave, thanks for the help!
But now the URLs are turning out:
http%3A%2F%2Fwww.mysite.com%2Fccp0-prodshow%2FConnector+Boot.html
and this is not validating either.
Offline
OK, try this then (encoding just the name of the page instead of the entire URL):
$url_product = 'http://www.linuxcdshop.com/linux/'.urlencode(mysql_result($result,$i,"id"));
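A few details about this fix, as a minimal sketch. urlencode() writes spaces as +, which matches the working URL Lisaweb posted; rawurlencode() writes %20 instead, which is the alternative if a server treats + literally. Both encode the bytes exactly as stored, so a Windows-1252 dash comes out as %96 while the same dash stored as UTF-8 would come out as %E2%80%93. The product_loc() helper and its flag are made up for illustration, and the final htmlspecialchars() call is there because a sitemap is XML.

```php
<?php
// Hypothetical helper: build an XML-safe <loc> value for one product id.
function product_loc(string $base, string $id, bool $plusForSpaces = true): string
{
    // Encode only the page name, never the whole URL.
    $name = $plusForSpaces ? urlencode($id) : rawurlencode($id);
    // Escape for XML in case the base URL contains & or similar.
    return htmlspecialchars($base . $name . '.html', ENT_QUOTES | ENT_XML1, 'UTF-8');
}

$base = 'http://www.mysite.com/ccp0-prodshow/';
echo product_loc($base, 'Connector Boot'), "\n";        // spaces as +
echo product_loc($base, 'Connector Boot', false), "\n"; // spaces as %20
```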
Offline